Operations efficiency – Memory vs. CPU speed

Ttelmah · Joined: 11 Mar 2010 Posts: 19221

However all of that is pointless if you are sending as serial. It is going to take you 174uSec to send the value. Since if you use the buffering, you can be calculating the next value while sending the last, this becomes the limiting factor on the speed.
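The 174uSec figure can be sanity-checked with a small sketch. The thread does not state the baud rate or how many bytes are sent, so the values used here (115200 baud, two bytes, 8N1 framing with 10 bits per byte) are assumptions that happen to give about the same number; the function name is hypothetical.

```c
#include <stdint.h>

/* Time in microseconds to shift n_bytes out of an asynchronous serial port.
   bits_per_frame = 10 corresponds to 8N1 (start + 8 data + stop); this
   framing is an assumption, not something stated in the thread. */
double uart_tx_time_us(uint32_t baud, uint32_t n_bytes, uint32_t bits_per_frame)
{
    return (1e6 * (double)n_bytes * (double)bits_per_frame) / (double)baud;
}

/* Upper bound on values per second when the serial link is the bottleneck. */
double uart_max_values_per_sec(uint32_t baud, uint32_t n_bytes, uint32_t bits_per_frame)
{
    return 1e6 / uart_tx_time_us(baud, n_bytes, bits_per_frame);
}
```

With the assumed settings, uart_tx_time_us(115200, 2, 10) comes out at roughly 173.6uSec, so any calculation faster than that is hidden behind the transmission when buffering is used.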

viki2000 · Joined: 08 May 2013 Posts: 233

As I said in previous post:

temtronic · Posted: Fri Jun 16, 2017 4:57 pm

I don't know what the limits are for a 'new' PC but the PIC18F46K22 can go 2MBaud, the 'math' says 4M is possible, though I've not tried it.

Ttelmah · Joined: 11 Mar 2010 Posts: 19221

I'm running SPI on a PIC24FJ256GB610 at 16Mbps. Using DMA.

viki2000 · Joined: 08 May 2013 Posts: 233

Do you refer to 16-bit/Asynchronous with the formula Baud Rate = Fosc/[4 (n + 1)] and FOSC=64MHz PLL? How do you pick the minimum value for n? If I set it to 0, then I get 16Mbps.
Page 280 Table 16-3
http://ww1.microchip.com/downloads/en/DeviceDoc/41412F.pdf

viki2000 · Joined: 08 May 2013 Posts: 233

I was just curious how temtronic arrived at 2Mbaud and 4M for PIC18F46K22.

My intention is not to use EUSART high speed to communicate with a PC. I did it only to check the generated sin() values.
I was thinking of a DAC with SPI or parallel interface for high speed.
Here are some calculations that I have in mind.
I have some cheap SPI DAC 12 bit from Microchip MCP4921. It says it can work with SPI frequencies up to 20MHz.
The PIC24HJ64GP202 that I used can handle, theoretically according to the datasheet, SPI speed = FOSC/4. Then in my case with FOSC=80MHz, I can have SPI speed 20MHz to match the max. MCP4921 speed.
So, I expect PIC24HJ64GP202 to be able to handle 20Mbps with FOSC=80MHz internal clock with PLL.
When you use SPI on a PIC24FJ256GB610 at 16Mbps, what is the FOSC?
Then, to get 1 sample out of the MCP4921, I must send 2 x 8 bits of data, so 16 bits in total, probably with some sync/delay between these 2 bytes. For simplicity, and thinking of the max. possible speed, let’s assume that we send exactly 16 bits per sample. Then 20MHz/16bit=1250K samples per second.
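A minimal sketch of how the 16 bits for one MCP4921 sample could be put together: 4 configuration bits followed by the 12-bit value, sent high byte first. The bit assignment in the comments is my reading of the MCP4921 write command and should be checked against the datasheet; the function names are hypothetical, and the actual SPI transfer (CCS spi_write or DMA) is left out.

```c
#include <stdint.h>

/* Build the 16-bit MCP4921 command word from a 12-bit sample.
   Assumed configuration bits (verify against the datasheet):
     bit 15 = 0  write to the DAC register
     bit 14 = 0  VREF input unbuffered
     bit 13 = 1  output gain 1x
     bit 12 = 1  output active (not shut down) */
uint16_t mcp4921_word(uint16_t value12)
{
    const uint16_t cfg = 0x3000u;
    return (uint16_t)(cfg | (value12 & 0x0FFFu));
}

/* Split the word into the two bytes sent over SPI, high byte first. */
void mcp4921_split(uint16_t word, uint8_t *hi, uint8_t *lo)
{
    *hi = (uint8_t)(word >> 8);
    *lo = (uint8_t)(word & 0xFFu);
}
```

For example, the sample 0x0ABC becomes the word 0x3ABC, sent as 0x3A then 0xBC while chip select is held low.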
If I want to generate a 100Hz signal, the period is 10ms, which is 100 times shorter than 1s, so the LUT can hold 100 times fewer entries than the number of samples per second.
In other words, for fast speed using a LUT, I could use a table with 1250K/100=12500 entries. As I expect additional delays in the subroutines accessing the LUT and in the SPI transmission, probably half of that (about 5K) or maybe one quarter (2.5K), as suggested several times in the thread, will do the job in reality. Then, taking 2.5K for 1 quadrant only, I get 5K for the rectified sine, which should be more than enough.
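The arithmetic above, as a small sketch. It only reproduces the best-case numbers from this post (no chip-select gaps, no LUT access overhead); the function names are hypothetical.

```c
#include <stdint.h>

/* Best-case DAC update rate: one sample needs bits_per_sample SPI clocks. */
uint32_t dac_samples_per_sec(uint32_t spi_hz, uint32_t bits_per_sample)
{
    return spi_hz / bits_per_sample;
}

/* Number of samples (LUT entries) that fit into one period of the signal. */
uint32_t lut_entries_per_period(uint32_t samples_per_sec, uint32_t signal_hz)
{
    return samples_per_sec / signal_hz;
}
```

dac_samples_per_sec(20000000, 16) gives 1250000, and lut_entries_per_period(1250000, 100) gives 12500, matching the figures above.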

temtronic · Posted: Sat Jun 17, 2017 5:09 am

re:
formula Baud Rate = Fosc/[4 (n + 1)] and FOSC=64MHz

n is the value in the 16 bit baud rate generator registers, so you can set it to zero...

64,000,000 / 4( 0+1)

64,000,000 / 4 = 16MBaud
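The same formula as a small sketch, assuming the 16-bit/asynchronous mode quoted above. The function names are hypothetical, and the inverse helper just rounds to the nearest divider; it does not check the resulting baud error.

```c
#include <stdint.h>

/* Baud Rate = Fosc / [4 (n + 1)], with n the 16-bit baud rate generator value. */
uint32_t eusart_baud(uint32_t fosc_hz, uint16_t n)
{
    return fosc_hz / (4u * ((uint32_t)n + 1u));
}

/* Nearest n for a wanted baud rate; returns 0 if the request is too fast. */
uint32_t eusart_n_for_baud(uint32_t fosc_hz, uint32_t baud)
{
    uint32_t divider = (fosc_hz + 2u * baud) / (4u * baud); /* rounded (n + 1) */
    return (divider == 0u) ? 0u : divider - 1u;
}
```

With FOSC=64MHz, n=0 gives 16MBaud, n=3 gives 4MBaud and n=7 gives 2MBaud.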

I never set the PICs to go that fast. I do recall some problems at 2MB due to layout so 'slowed down' to 1MB for the bench test of a month or so. You have to be aware that the faster you go, the more you MUST pay attention to the 'details' like wiring, shielding, levels, distance, etc. Same thing applies to a much lesser extent when you run 24 Baud, my other standard.

Since 'modern' PCs don't have true RS232 ports I suspect you'd have to use a USB<>TTL module to have the PIC talk to the PC. Now how fast a USB port can be is anybody's guess. I know there's a LOT of 'overhead' aka programs that have their hooks into USB..

Jay

viki2000 · Joined: 08 May 2013 Posts: 233

Speaking about speed and optimization, I guess the fixed point math library from Microchip with Q15sinPI is implemented using a polynomial approximation in ASM.
As I am not a daily programmer and not knowledgeable in ASM, it is hard for me to reverse engineer the ASM of Q15sinPI and see what mechanisms and methods are applied, but if I see it right, some constants are moved into working registers and then multiplied, which looks more like a polynomial approach than CORDIC.
It would be interesting to know what kind of polynomial equation was used by Microchip in the Q15sinPI fixed point math library.
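To illustrate what "constants in registers followed by multiplications" can look like in fixed point, here is a sketch of a third-order polynomial, z(3 - z²)/2, evaluated in Q15. This is NOT Microchip's Q15sinPI; which polynomial that routine uses is unknown here. The function name, the input scaling (0..32768 mapped to one quarter period) and the choice of polynomial are all assumptions for illustration.

```c
#include <stdint.h>

/* Third-order sine approximation in Q15 fixed point.
   z_q15: 0..32768 represents 0..1 of a quarter period (0..90 degrees).
   Returns sin in Q15, where 32768 represents 1.0.
   Illustration only; not the Microchip library implementation. */
uint32_t q15_sin_s3(uint32_t z_q15)
{
    uint32_t z2 = (z_q15 * z_q15) >> 15;   /* z^2 in Q15 */
    uint32_t t  = (3u << 15) - z2;         /* (3 - z^2) in Q15 */
    return (z_q15 * t) >> 16;              /* z * (3 - z^2) / 2, back in Q15 */
}
```

For z_q15 = 16384 (45 degrees) this returns 22528, i.e. 0.6875 against the true 0.7071; the whole evaluation is two multiplications, one subtraction and two shifts, and the products stay within 32 bits for inputs up to 32768.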

Reading various sites on the internet, I noticed that most people apply one of these approaches: LUT, CORDIC or polynomial.
The analysis and comparison that I tried to do in a simple manner was done in a more complex form and is presented at the end of the next file, this time for FPGAs:
https://hal-ens-lyon.archives-ouvertes.fr/ensl-00802777/document
Regarding fixed point math libraries for microcontroller C compilers, I found a better presentation and analysis by Texas Instruments; they compare the LUT and polynomial approaches using Taylor series, page 7-16 of the next pdf:
http://www.ece.uidaho.edu/hydrofly/OLD/Code/math/doc/math_mdl.pdf
The code proposed by Ttelmah is basically a Taylor polynomial approximation of the 2nd order, and it is also explained on the next pages:
http://forum.devmaster.net/t/fast-and-accurate-sine-cosine/9648/6
http://lab.polygonal.de/2007/07/18/fast-and-accurate-sinecosine-approximation/
Here is another similar approach:
https://dspguru.com/dsp/tricks/parabolic-approximation-of-sin-and-cos/
The next link works only once in a while:
http://lolengine.net/blog/2011/12/21/better-function-approximations

I provided these links and a few more below as reference for future readers.
Basically a polynomial is used to approximate the sine in the range 0-90°(PI/2).
To recap, the idea is to use a faster function that approximates the normal float sin() very well.
Usually, the higher the order of the polynomial, the higher the accuracy, but the downside for microcontrollers is that calculating the polynomial/sine gets slower.
For example here is a 6th order one for high accuracy, implemented on a 32 bit microcontroller:
http://www.olliw.eu/2014/fast-functions/#sincos

Another ancient approximation with good accuracy (a ratio of two polynomials) was found in the 7th century by the Indian mathematician Bhaskara I:
https://en.wikipedia.org/wiki/Bhaskara_I%27s_sine_approximation_formula
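A minimal sketch of Bhaskara I's formula in radians, sin(x) ≈ 16x(PI - x) / (5PI² - 4x(PI - x)) for x between 0 and PI. It is written with doubles for readability only; a PIC version would more likely use scaled integers. The function name is hypothetical.

```c
#define SKETCH_PI 3.14159265358979323846

/* Bhaskara I's sine approximation, valid for 0 <= x <= PI (radians). */
double bhaskara_sin(double x)
{
    double p = x * (SKETCH_PI - x);
    return (16.0 * p) / (5.0 * SKETCH_PI * SKETCH_PI - 4.0 * p);
}
```

The formula is exact at 0, PI/6, PI/2, 5PI/6 and PI, and costs one division plus a few multiplications, which is why it keeps being rediscovered for small micros.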

viki2000 · Joined: 08 May 2013 Posts: 233

I am looking now at the next approach, written for 32bit microcontrollers ARM9:
http://www.coranac.com/2009/07/sines/

It starts with 2nd order polynomial implemented by Ttelmah:
S2(x)=4x/PI – 4x²/PI²
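As a sketch, S2 can be written with a single multiplication after scaling, because 4x/PI – 4x²/PI² = 4t(1 - t) with t = x/PI. This is a float version for checking values on a PC, not Ttelmah's actual code; the function name is hypothetical and the valid range is assumed to be 0 to PI.

```c
#define SKETCH_PI 3.14159265358979323846

/* 2nd order (parabolic) sine approximation, valid for 0 <= x <= PI. */
double sin_s2(double x)
{
    double t = x / SKETCH_PI;      /* 0..1 over the half wave */
    return 4.0 * t * (1.0 - t);    /* same as 4x/PI - 4x^2/PI^2 */
}
```

It is exact at 0, PI/2 and PI, but at PI/4 it gives 0.75 instead of 0.7071, so the error is a few percent of full scale.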

And then it analyzes the 3rd, 4th and 5th order polynomial:
S3(x)=3x/PI – 4x³/PI³
S5(z)=z/2(PI-z²[(2PI-5)-z²(PI-3)])
C4(z)=1-z²[(2-PI/4)-z²(1-PI/4)]
With z=x/(PI/2) and sin(x)=cos(x-PI/2)
Then extended is:
S5(x)=x/PI(PI-(2x/PI)²((2PI-5)-(2x/PI)²(PI-3)))
S4(x-PI/2)=1-(2x/PI)²[(2-PI/4)-( 2x/PI)²(1-PI/4)]
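A float sketch of the S3, S5 and C4 formulas above, written in terms of z=x/(PI/2) so the nesting is easy to compare with the equations. The function names are hypothetical, the intended range is the first quadrant (0 to PI/2), and the linked article's integer implementations are not reproduced here.

```c
#define SKETCH_PI 3.14159265358979323846

/* z = x / (PI/2): 0..1 over the first quadrant. */
static double to_z(double x) { return x / (SKETCH_PI / 2.0); }

/* S3(x) = 3x/PI - 4x^3/PI^3 = z(3 - z^2)/2 */
double sin_s3(double x)
{
    double z = to_z(x);
    return 0.5 * z * (3.0 - z * z);
}

/* S5(z) = z/2 (PI - z^2[(2PI - 5) - z^2(PI - 3)]) */
double sin_s5(double x)
{
    double z = to_z(x), z2 = z * z;
    return 0.5 * z * (SKETCH_PI - z2 * ((2.0 * SKETCH_PI - 5.0) - z2 * (SKETCH_PI - 3.0)));
}

/* C4(z) = 1 - z^2[(2 - PI/4) - z^2(1 - PI/4)], a cosine approximation */
double cos_c4(double x)
{
    double z = to_z(x), z2 = z * z;
    return 1.0 - z2 * ((2.0 - SKETCH_PI / 4.0) - z2 * (1.0 - SKETCH_PI / 4.0));
}
```

At 45 degrees, sin_s3 gives 0.6875 and sin_s5 gives about 0.7074 against the true 0.7071, which fits the observation that the 5th order curve overlays the sine.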

It is nice that he compares the speed and accuracy.
I plotted some of these functions and the 5th order overlays “perfect” over the sine:
https://www.desmos.com/calculator/xxkkb0gmvw

I can follow the math involved, but there is something that I do not quite understand.
When the 3rd order function S3(x)=3x/PI – 4x³/PI³ is calculated, there are 5 constants A, n, p, r, s with:
r = 2n−p and s = n+p+1−A
I do not understand how A, n, p are chosen based on the next statements:

temtronic · Posted: Sun Jun 18, 2017 5:27 am

Since I don't punish PICs by forcing them to do 'fancy' math, I'd suggest contacting the author of the code to explain the variables used..

One thing to remember about PICs, CCS C and ASM is that CCS does allow you to cut C code that can run an assembler 'subroutine'. They provide a couple of examples and if you think about it, it's similar to using say one of the math functions. You do need to know the names of the variables and such but it is feasible to 'port' that qsin... program you talk about and just have your C code 'call' it.

For the ultimate (IE 'best') in speed and accuracy you will need to cut code in Assembler though. No C programmer can optimize a micro's performance in C. You need to KNOW how the micro works and its instruction set.

Jay

viki2000 · Joined: 08 May 2013 Posts: 233

If I understand right:

Ttelmah · Joined: 11 Mar 2010 Posts: 19221

You are wasting so much time on this it is painful.

First thing is I think you 'overestimate' your actual errors. Even the worst algorithm you have used to synthesise a waveform, will give THD errors only a tiny fraction of a percent.
Then if you have code that synthesises an acceptable sin in only just over 6uSec, then use it. If you want to code in CCS, just include the assembler as a routine.
Then you are going to get far worse problems from just about every other point of the hardware. Digital crosstalk. Physical non linearity of the DAC (this is going to be far worse than the maths errors you have). Actual timing etc. etc.. You also have the huge problem of the frequency responses (have alluded to this already). If you actually try to synthesise the half wave, you need to have really good HF response, however this will emphasise sampling noise.
The phrase involving a sledgehammer and a nut seems to apply here. I'd suspect that any of the experienced electronic designers here would have rejected this approach in a few seconds, and had a design working in a few hours, giving better results than you are ever going to achieve...

viki2000 · Joined: 08 May 2013 Posts: 233

Probably you are right, but I am trying to understand how one approach or another would have been done.

temtronic · Posted: Sun Jun 18, 2017 11:01 am

re:...

viki2000 · Joined: 08 May 2013 Posts: 233

I have checked, but there is not enough info.
PCD reference manual page 103:
https://www.ccsinfo.com/downloads/PCDReferenceManual.pdf
PCB/PCM/PCH user manual page 97:
https://www.ccsinfo.com/downloads/ccs_c_manual.pdf
The example file is FFT.c in Examples folder. I found ex_fft.c. There is nothing inside related with my question above, so no additional help.
I understand how the variable from C is passed/used in ASM.
What I do not understand from the given example is how the C int function find_parity gets the value calculated in ASM. Is it only due to MOV W0, _RETURN_? Does that mean the result is always in the W0 working register and is passed to find_parity using _RETURN_?
Then do I always need to use _RETURN_ when I want to pass a value calculated in an ASM register to a function declared in C?