Multiple clock sources and clock switching
gaugeguy

Joined: 05 Apr 2011
Posts: 156
Posted: Tue May 14, 2019 3:14 pm

Adding a decimal point in any position can be done by building a string of characters from the packed BCD, with the '.' inserted as needed between digits. Each step is very quick, and by dividing the work into several steps you avoid printf trying to do all of the conversion and formatting at once.

blowtorch

Joined: 11 Jun 2013
Posts: 31
Location: Cape Town
Posted: Tue May 14, 2019 5:34 pm

So, results are very encouraging. Thank you for the idea of using BCD... I wrote some test code in the sim, then ported it into my main code. This involved quite a few changes, so more than likely I have screwed something up somewhere; it's 01:23 AM here. Initial results show a big improvement. I have some ideas to optimise it further, specifically for the GLCD I am using: write a dedicated "number" display function, where I convert to BCD and then use each BCD digit as an index into a "numbers only" lookup table for the 5x7 font. The cost is only 5 x 10 bytes, and I will be able to get rid of printf entirely...

Ttelmah

Joined: 11 Mar 2010
Posts: 15907
Posted: Tue May 14, 2019 11:28 pm

temtronic wrote: hmm, wonder if /2,/2,/2,/2,/2 is faster than /10 ? My PIC PC is 'down for service'....

Yes. /32 by shifting is a lot faster than /10.

Ttelmah

Joined: 11 Mar 2010
Posts: 15907
Posted: Wed May 15, 2019 2:49 am

As a comment, it is worth realising that although the /10 is well written, it won't be using code designed specifically to perform /10. It will just be the standard integer division code, so it will loop through all the bits involved to perform the division. With this in mind I decided to 'try my hand' at writing a more efficient division that only performs /10. Now 'no guarantees', just a first attempt at this!...

Code:
typedef struct {
   unsigned int16 quot;
   unsigned int8 rem;
} div_vals;

typedef union {
   unsigned int16 whole;
   unsigned int8 b;
} access;

div_vals div_10(access source) {
   div_vals temp;
   //build an estimate of source*0.8 using shifts and adds only
   temp.quot=(source.whole>>1)+(source.whole>>2);
   temp.quot+=temp.quot>>4;
   temp.quot+=temp.quot>>8;
   temp.quot>>=3; //divide by 8, giving roughly source/10
   //remainder = source - quot*10, with quot*10 done as ((quot*4)+quot)*2
   temp.rem=source.whole-(((temp.quot<<2)+temp.quot)<<1);
   if (temp.rem>9)
   {
      //estimate was one too low, so correct it
      temp.quot+=1;
      temp.rem-=10;
   }
   return temp;
}

//Called like this:
   int16 test=12345;
   div_vals result;

   result=div_10(test);
//gives 1234 in result.quot, and 5 in result.rem

It looks to be about 3 to 4 times faster than the CCS division using /10.
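
For anyone wanting to sanity-check the shift-and-add divide before trusting it on a PIC, here is a minimal sketch in standard C (not CCS C) that mirrors the same steps with `uint16_t`. The names `div10_result`, `div10_u16` and `div10_matches_everywhere` are made up for this example. A desktop compiler says nothing about PIC timing; this only checks the arithmetic against the ordinary `/` and `%` operators.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for this sketch; not part of the CCS library. */
typedef struct {
    uint16_t quot;
    uint8_t  rem;
} div10_result;

/* Divide by 10 using shifts and adds only, mirroring the steps above. */
static div10_result div10_u16(uint16_t n)
{
    div10_result r;
    uint16_t q;

    q = (uint16_t)((n >> 1) + (n >> 2));   /* about n * 0.75            */
    q = (uint16_t)(q + (q >> 4));          /* about n * 0.796875        */
    q = (uint16_t)(q + (q >> 8));          /* about n * 0.79999         */
    q >>= 3;                               /* about n / 10, may be 1 low */

    /* remainder = n - q * 10, with q * 10 computed as ((q << 2) + q) << 1 */
    r.rem = (uint8_t)(n - (uint16_t)(((q << 2) + q) << 1));
    assert(r.rem < 20);                    /* estimate is at most one low */
    if (r.rem > 9) {
        q++;
        r.rem = (uint8_t)(r.rem - 10);
    }
    r.quot = q;
    return r;
}

/* Returns 1 if the routine agrees with / and % for every 16-bit input. */
static int div10_matches_everywhere(void)
{
    uint32_t n;

    for (n = 0; n <= 0xFFFFu; n++) {
        div10_result r = div10_u16((uint16_t)n);
        if (r.quot != n / 10u || r.rem != n % 10u)
            return 0;
    }
    return 1;
}
```

Calling `div10_matches_everywhere()` once on a PC walks all 65536 inputs, which is cheap there and gives some confidence before porting back to the target.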

Might be worth a play!...

blowtorch

Joined: 11 Jun 2013
Posts: 31
Location: Cape Town
Posted: Wed May 15, 2019 6:11 am

OK, here is some updated code, designed for the sim so that timings are easy to get without external events (think interrupts) messing with the numbers...

Last night a Google search found some nice code on a Microchip forum that did a BCD conversion using simple subtraction as opposed to division. I adapted this and changed it so that it outputs ASCII. I named the variables such that they should be self-explanatory.

The first function, 'uint16_to_ascii_4', takes a 16-bit unsigned int and writes the digits to a string. No bounds checking is done; the limit is 4 characters, i.e. a value of 9999. The second function, 'uint16_to_ascii_3', does the same for values up to 999.

Code:
#include <16LF18345.h>

#pin_select U1TX=PIN_B7
#pin_select U1RX=PIN_B6

#use delay(clock=4MHZ)
#use rs232(UART1,baud=111111,parity=N,bits=8,stream=jn1out) // RS232 available

#include // (header name not preserved)
#include // (header name not preserved)

typedef unsigned int8 uint8;
typedef unsigned int16 uint16;
typedef unsigned int32 uint32;
typedef signed int8 sint8;
typedef signed int16 sint16;
typedef signed int32 sint32;

void uint16_to_ascii_4(uint16 num_16, char* dest_ptr) {
    // thousands: any of bits 10..13 set means num_16 >= 1024
    *dest_ptr = 0;
    while (num_16 & 0x3C00) {
        num_16 -= 1000;
        *dest_ptr += 1;
    }
    if (num_16 >= 1000) {
        num_16 -= 1000;
        *dest_ptr += 1;
    }
    *dest_ptr |= 48; // convert count to ASCII ('0' is 48)
    dest_ptr++;

    // hundreds: any of bits 7..10 set means num_16 >= 128
    *dest_ptr = 0;
    while (num_16 & 0x0780)
    {
        num_16 -= 100;
        *dest_ptr += 1;
    }
    if (num_16 >= 100) {
        num_16 -= 100;
        *dest_ptr += 1;
    }
    *dest_ptr |= 48;
    dest_ptr++;

    // tens: any of bits 4..6 set means num_16 >= 16
    *dest_ptr = 0;
    while (num_16 & 0x70)
    {
        num_16 -= 10;
        *dest_ptr += 1;
    }
    if (num_16 >= 10) {
        num_16 -= 10;
        *dest_ptr += 1;
    }
    *dest_ptr |= 48;
    dest_ptr++;

    // units: whatever is left
    *dest_ptr = (unsigned char) num_16 | 48;
}

void uint16_to_ascii_3(uint16 num_16, char* dest_ptr) {
    *dest_ptr = 0;
    while (num_16 & 0x0780) // ((int)num_16 > 0)
    {
        num_16 -= 100;
        *dest_ptr += 1;
    }
    if (num_16 >= 100) {
        num_16 -= 100;
        *dest_ptr += 1;
    }
    *dest_ptr |= 48;
    dest_ptr++;

    *dest_ptr = 0;
    while (num_16 & 0x70) // (num_16 > 0)
    {
        num_16 -= 10;
        *dest_ptr += 1;
    }
    if (num_16 >= 10) {
        num_16 -= 10;
        *dest_ptr += 1;
    }
    *dest_ptr |= 48;
    dest_ptr++;

    *dest_ptr = (unsigned char) num_16 | 48;
}

void main() {
    char str_secs[5]; // field length of 4 + 1 for null
    char str_millis[4]; // field length of 3 + 1 for null
    char whole_field[9]; // 4 digits + '.' + 3 digits + 1 for null
    uint32 big_millis = 9999999;
    uint16 secs = 9999;
    uint16 millis = 999;

    printf("\r\nBCD vs printf Test\r\n");
    str_secs[4] = '\0';
    str_millis[3] = '\0';
    whole_field[4]='.';
    whole_field[8]='\0';

    delay_cycles(1); // dummy instruction - 1st break point
    printf("%08.3w", big_millis);
    delay_cycles(1); // dummy instruction - 2nd break point
    // above printf takes 22.166ms for 123456
    // above printf takes 22.237ms for 999999

    printf("\r\n");
    delay_cycles(1); // dummy instruction - 1st break point
    uint16_to_ascii_4(secs, str_secs);
    uint16_to_ascii_3(millis, str_millis);
    printf("%s.%s", str_secs, str_millis);
    delay_cycles(1); // dummy instruction - 2nd break point
    //above 2 bcd conversions and printf takes 2.316ms for 123 456
    //above 2 bcd conversions and printf takes 3.052ms for 9999 and 999

    printf("\r\n");
    delay_cycles(1); // dummy instruction - 1st break point
    uint16_to_ascii_4(secs, &whole_field[0]);
    uint16_to_ascii_3(millis, &whole_field[5]);
    printf("%s", whole_field);
    delay_cycles(1); // dummy instruction - 2nd break point
    //above 2 bcd conversions and printf takes 2.318ms for 123 and 456
    //above 2 bcd conversions and printf takes 3.054ms for 9999 and 999

    printf("\r\n");

    sleep();
}

Note the timings in the comments! For the first number, printf took 22.2 ms and the custom code did the equivalent in 2.3 ms, almost 10 times faster. The worst case is the biggest number that will fit; there printf took 22.2 ms and the custom functions took 3.05 ms. Still a sevenfold improvement...
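
The idea behind the subtraction method, stripped of the bit-mask shortcuts, can be sketched in standard C as below. This is not the code above, only a simplified illustration for a PC compiler. The name `u16_to_ascii4` and the returned step count are hypothetical additions, there to make the data-dependent cost visible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Convert 0..9999 to four ASCII digits by repeated subtraction.
 * Writes four characters plus a terminating '\0' to dest (5 bytes).
 * Returns the number of subtractions performed, which grows with the
 * digit values: this is why 9999 costs more than 0123.
 */
static unsigned u16_to_ascii4(uint16_t num, char *dest)
{
    static const uint16_t weights[3] = { 1000, 100, 10 };
    unsigned steps = 0;
    size_t i;

    assert(dest != NULL);
    assert(num <= 9999);

    for (i = 0; i < 3; i++) {
        char digit = '0';
        while (num >= weights[i]) {   /* count how often the weight fits */
            num = (uint16_t)(num - weights[i]);
            digit++;
            steps++;
        }
        dest[i] = digit;
    }
    dest[3] = (char)('0' + num);      /* what is left is the units digit */
    dest[4] = '\0';
    return steps;
}
```

With this sketch, 123 needs 3 subtractions and 9999 needs 27, which matches the direction of the timings above: the larger the digits, the longer the conversion.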

Yay! I think the improvement will be slightly better when driving a graphics LCD, because one can have a dedicated number display function that does a streamlined convert and directly indexes the byte array (font) used for display...
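
As a rough illustration of that "digits only" font idea, here is a sketch in standard C. Everything in it is an assumption for the example: the names `digit_font` and `bcd4_to_columns`, the column-major layout with bit 0 as the top row, and the glyph bytes themselves, which should be replaced by whatever font the display code already uses. It only fills a buffer; sending the bytes to the GLCD is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GLYPH_COLS 5

/* Digits only: 10 glyphs x 5 column bytes = 50 bytes.
 * Illustrative 5x7 glyph data, one byte per column, bit 0 = top row. */
static const uint8_t digit_font[10][GLYPH_COLS] = {
    { 0x3E, 0x51, 0x49, 0x45, 0x3E },  /* 0 */
    { 0x00, 0x42, 0x7F, 0x40, 0x00 },  /* 1 */
    { 0x42, 0x61, 0x51, 0x49, 0x46 },  /* 2 */
    { 0x21, 0x41, 0x45, 0x4B, 0x31 },  /* 3 */
    { 0x18, 0x14, 0x12, 0x7F, 0x10 },  /* 4 */
    { 0x27, 0x45, 0x45, 0x45, 0x39 },  /* 5 */
    { 0x3C, 0x4A, 0x49, 0x49, 0x30 },  /* 6 */
    { 0x01, 0x71, 0x09, 0x05, 0x03 },  /* 7 */
    { 0x36, 0x49, 0x49, 0x49, 0x36 },  /* 8 */
    { 0x06, 0x49, 0x49, 0x29, 0x1E }   /* 9 */
};

/* Copy the column bytes for each of the four packed BCD digits
 * (most significant first) into out, which must hold 4 * 5 = 20 bytes. */
static void bcd4_to_columns(uint16_t bcd, uint8_t *out)
{
    int d, c;

    assert(out != NULL);
    for (d = 0; d < 4; d++) {
        uint8_t digit = (uint8_t)(bcd >> 12);  /* top nibble, constant shift */
        assert(digit <= 9);                    /* input must be valid BCD    */
        bcd = (uint16_t)(bcd << 4);            /* bring the next digit up    */
        for (c = 0; c < GLYPH_COLS; c++)
            *out++ = digit_font[digit][c];
    }
}
```

The BCD digit is used directly as the table index, so no ASCII conversion and no printf are involved on the way to the display.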
Even more yay!

gaugeguy

Joined: 05 Apr 2011
Posts: 156
Posted: Wed May 15, 2019 6:58 am

Here is a BCD conversion routine that may help. This can be expanded to more digits. It can be done slightly more efficiently in assembly but this isn't too bad.

Code:
// 16 bit 4 digit BCD conversion routine
unsigned int16 Int16toBCD4(unsigned int16 local_convert)
{
   //converts 16bit value, to four BCD digits. Tries to do it fairly
   //efficiently, both in size, and speed.
   unsigned int16 bit_cnt = 16;
   unsigned int16 BCD;
   BCD=0;
   {
      do
      {
         if ((BCD & 0x000F)>=0x0005) BCD+=0x0003;
         if ((BCD & 0x00F0)>=0x0050) BCD+=0x0030;
         if ((BCD & 0x0F00)>=0x0500) BCD+=0x0300;
         if ((BCD & 0xF000)>=0x5000) BCD+=0x3000;
         shift_left(&BCD,2,shift_left(&local_convert,2,0));
      }
      while (--bit_cnt != 0);
   }
   return BCD;
}

Ttelmah

Joined: 11 Mar 2010
Posts: 15907
Posted: Wed May 15, 2019 7:25 am

Problem with the subtraction approach is it'll be slower on larger numbers. Test with 49999, and you may find it is not as good as you think.... Gaugeguy's approach (shift, and if a digit is 5 or more, add 3) is normally considered the most efficient algorithm that is still relatively easy to code.

blowtorch

Joined: 11 Jun 2013
Posts: 31
Location: Cape Town
Posted: Wed May 15, 2019 8:38 am

Ttelmah wrote: Problem with the subtraction approach is it'll be slower on larger numbers. Test with 49999, and you may find it is not as good as you think....

Agreed, it is measurably slower. The numbers are in the code comments in my previous post. The total time to convert 0123 and 456 was 2.3 ms, whereas converting 9999 and 999 took 3 ms, about 30% longer.
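
By contrast, the shift-and-add-3 approach does the same amount of work for every input. The sketch below is an assumed portable equivalent in standard C, where the CCS `shift_left()` built-in is not available; it is not gaugeguy's code, and `u16_to_bcd4` is a hypothetical name. It always runs 16 iterations, one per input bit, whatever the digit values are.

```c
#include <assert.h>
#include <stdint.h>

/* Convert 0..9999 to four packed BCD digits using shift-and-add-3.
 * The loop count is fixed at 16, independent of the value. */
static uint16_t u16_to_bcd4(uint16_t value)
{
    uint16_t bcd = 0;
    int bit;

    assert(value <= 9999);

    for (bit = 0; bit < 16; bit++) {
        /* A digit of 5 or more would exceed 9 once doubled: add 3 first. */
        if ((bcd & 0x000Fu) >= 0x0005u) bcd = (uint16_t)(bcd + 0x0003u);
        if ((bcd & 0x00F0u) >= 0x0050u) bcd = (uint16_t)(bcd + 0x0030u);
        if ((bcd & 0x0F00u) >= 0x0500u) bcd = (uint16_t)(bcd + 0x0300u);
        if ((bcd & 0xF000u) >= 0x5000u) bcd = (uint16_t)(bcd + 0x3000u);

        /* Shift the next input bit (MSB first) into the BCD register. */
        bcd = (uint16_t)((bcd << 1) | (value >> 15));
        value = (uint16_t)(value << 1);
    }
    return bcd;
}
```

So 9999 (giving 0x9999) takes the same 16 passes as 123 (giving 0x0123), whereas the subtraction method needs more steps as the digits get larger.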

Thanks Gaugeguy, I will code and test your method next, then report back with a comparison.

dluu13

Joined: 28 Sep 2018
Posts: 288
Location: Toronto, ON
Posted: Wed May 15, 2019 8:39 am

Thanks for these posts, everyone. I've been noticing some lag myself when using %lw, as seen on my logic analyzer. We'll see how this goes :D I'm gonna have to play with these myself to test it out!

blowtorch

Joined: 11 Jun 2013
Posts: 31
Location: Cape Town
Posted: Wed May 15, 2019 9:14 am

Loosely related: how can one calculate the time taken to service a timer-based ISR? I put the ISR code into the sim, and used the stopwatch feature to time the 2 different paths through the code. This came out to 9 and 17 µs respectively. But what should be added to get the total ISR service time?

dluu13

Joined: 28 Sep 2018
Posts: 288
Location: Toronto, ON
Posted: Wed May 15, 2019 10:01 am

I just tried the BCD stuff using gaugeguy's converter (extended to five digits). Here's the code I tested with. I tested printing the ints directly, scaling them with lw, scaling with float, and then BCD. The ints and BCD were not scaled, but I added a decimal point at the end of the number just to have the same number of chars printed.

As expected, floats were the slowest, coming in at 60ms to print everything. lw was next, coming in at 5.9ms. Straight ints and BCD came in at a tie at 5.4ms. Now, if I were to scale the BCD and make it add the decimal point where I want I don't know how much more time that will take. However, lw is pretty fast...

Code:
/*
 * File:   CuriosityPrint.c
 * Author: dluu
 *
 * Created on Apr 5, 2019
 */

#include<24FJ128GA204.h>

#FUSES NOWDT, NODEBUG, NOWRT, NOPROTECT, NOJTAG, ICSP1
#FUSES NOLVR, NOBROWNOUT, NOIOL1WAY, NODSBOR, NODSWDT
#FUSES NOALTCMPI, FRC_PLL, PLL_FROM_FRC, PLL8X

#PIN_SELECT U3RX=PIN_B5
#PIN_SELECT U3TX=PIN_B6

#USE DELAY(clock=32MHZ)
#USE RS232(BAUD=115200, UART3, BITS=8, PARITY=N, STOP=1, STREAM=PC, ERRORS, RECEIVE_BUFFER=128)

#include // (header name not preserved)
#include // (header name not preserved)
#include // (header name not preserved)

uint32_t Int16toBCD8(uint16_t local_convert)
{
    //converts 16bit value, to five BCD digits. Tries to do it fairly
    //efficiently, both in size, and speed.
    uint16_t bit_cnt = 16;
    uint32_t BCD;
    BCD = 0;
    {
        do
        {
            if ((BCD & 0x0000000F) >= 0x00000005) BCD += 0x00000003;
            if ((BCD & 0x000000F0) >= 0x00000050) BCD += 0x00000030;
            if ((BCD & 0x00000F00) >= 0x00000500) BCD += 0x00000300;
            if ((BCD & 0x0000F000) >= 0x00005000) BCD += 0x00003000;
            if ((BCD & 0x000F0000) >= 0x00050000) BCD += 0x00030000;
//            if ((BCD & 0x00F00000) >= 0x00500000) BCD += 0x00300000;
//            if ((BCD & 0x0F000000) >= 0x05000000) BCD += 0x03000000;
//            if ((BCD & 0xF0000000) >= 0x50000000) BCD += 0x30000000;
            shift_left(&BCD, 3, shift_left(&local_convert, 2, 0));
        }
        while (--bit_cnt != 0);
    }
    return BCD;
}

int main(void)
{
    uint16_t test[] = {11111, 22222, 33333, 44444, 55555, 12222, 23333, 34444, 45555};

    delay_ms(100);
    fprintf(PC, "\r\n\r\n");

    fprintf(PC, "test lu: ");
    output_high(PIN_B13);
    for (int i = 0; i < 9; ++i)
    {
        fprintf(PC, "%lu.,", test[i]);
    }
    output_low(PIN_B13); // 5.4 ms
    fprintf(PC, "\r\n");

    fprintf(PC, "test lw: ");
    output_high(PIN_A9);
    for (int i = 0; i < 9; ++i)
    {
        fprintf(PC, "%1.3lw,", test[i]);
    }
    output_low(PIN_A9); // 5.9 ms
    fprintf(PC, "\r\n");

    fprintf(PC, "test float: ");
    output_high(PIN_A10);
    for (int i = 0; i < 9; ++i)
    {
        fprintf(PC, "%1.3f,", (float) test[i] / 1000);
    }
    output_low(PIN_A10); // 60 ms
    fprintf(PC, "\r\n");

    fprintf(PC, "test bcd: ");
    output_high(PIN_C3);
    for (int i = 0; i < 9; ++i)
    {
        fprintf(PC, "%lx.,", Int16toBCD8(test[i]));
    }
    output_low(PIN_C3); // 5.4 ms
    fprintf(PC, "\r\n");

    while (1)
    {
    }
    return 0;
}

Now to figure out how to insert a decimal at the desired nibble

EDIT:
Code:
#define BCDNIBBLES 5

void printScaledBCD(uint16_t num, uint8_t decimalPlaces)
{
    uint32_t BCD5 = Int16toBCD5(num);
    if (decimalPlaces == BCDNIBBLES) fprintf(PC, "0");
    for (int i = 0; i < BCDNIBBLES; ++i)
    {
        if (BCDNIBBLES - i == decimalPlaces) fprintf(PC, ".");
        fprintf(PC, "%x", (BCD5 >> ((BCDNIBBLES - 1 - i) << 2))&0x0F);
    }
}

I think I can use this in my code to gain about 7% speed over lw when printing numbers.

Code:
fprintf(PC, "test BCD dec: ");
output_high(PIN_B8);
for (int i = 0; i < 9; ++i)
{
    printBCD(Int16toBCD5(test[i]), 3);
    fprintf(PC, ",");
}
output_low(PIN_B8); // 5.5 ms
fprintf(PC, "\r\n");

Code:
void ScaledBCDtoStr(uint16_t num, uint8_t decimalPlaces, char * buf) // very slow...
{
    uint32_t BCD5 = Int16toBCD5(num);
    uint8_t decimal = 0;
    uint8_t j = 0;
    if (decimalPlaces > 0) decimal = 1;
    if (decimalPlaces == BCDNIBBLES) fprintf(PC, "0");

    for (int i = 0; i < BCDNIBBLES+decimal; ++i)
    {
        if (BCDNIBBLES - i == decimalPlaces)
        {
            buf[j] = '.';
            ++j;
        }
        buf[j] = ((BCD5 >> ((BCDNIBBLES - 1 - i) << 2))&0x0F) + 0x30;
        ++j;
    }
    buf[BCDNIBBLES+decimal] = '\0';
}

fprintf(PC, "test BCD str: ");
char bcdstr[BCDNIBBLES + 2]; // size not preserved in the post; 5 digits + '.' + '\0' assumed
output_high(PIN_B9);
for (int i = 0; i < 9; ++i)
{
    ScaledBCDtoStr(test[i], 3, bcdstr);
    fprintf(PC, "%s,", bcdstr);
}
output_low(PIN_B9); // over 200 ms...
fprintf(PC, "\r\n");

Puzzlingly, this takes over 200 ms... My ScaledBCDtoStr function is very slow... Are array accesses slow?

gaugeguy

Joined: 05 Apr 2011
Posts: 156
Posted: Thu May 16, 2019 8:22 am

I have not looked at the listing for this, but here is what I think is happening. The array access does the index calculation every time through the loop, and this takes time. If you switch to using a pointer instead of an array index inside the loop, I think it will stop recalculating the offset each time, which should save a significant amount of time.

Ttelmah

Joined: 11 Mar 2010
Posts: 15907
Posted: Thu May 16, 2019 8:53 am

The real killer is this:

((BCD5 >> ((BCDNIBBLES - 1 - i) << 2))&0x0F) + 0x30;

Rotation by a variable is done with a one-bit rotation, looping and counting until the required number of bits has been shifted. As a result, this is going to involve hundreds of instruction times....
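
Putting the two suggestions together (walk a pointer through the buffer, and only ever shift by a constant), one possible shape for the conversion is sketched below in standard C. It has not been timed on a PIC. The names `bcd5_to_str` and `BCD_DIGITS` are hypothetical, and it assumes the five BCD digits sit in the low 20 bits of a 32-bit value as in the posts above. One further observation from the code as posted: the loop in ScaledBCDtoStr runs BCDNIBBLES+decimal times, so when a decimal point is requested the final pass evaluates a negative shift count. Whether that contributes to the 200 ms is not established here, but this sketch always loops exactly five times.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BCD_DIGITS 5   /* hypothetical name: digits held in the low 20 bits */

/*
 * Write the five BCD digits in bcd to buf as ASCII, inserting '.' so that
 * decimal_places digits follow it (0 means no '.'). buf needs
 * BCD_DIGITS + 2 bytes. Only constant shift counts are used.
 */
static void bcd5_to_str(uint32_t bcd, uint8_t decimal_places, char *buf)
{
    uint8_t remaining;

    assert(buf != NULL);
    assert(decimal_places <= BCD_DIGITS);

    bcd <<= 32 - 4 * BCD_DIGITS;               /* first digit to the top nibble */
    for (remaining = BCD_DIGITS; remaining != 0; remaining--) {
        if (remaining == decimal_places)
            *buf++ = '.';
        *buf++ = (char)('0' + (bcd >> 28));    /* take the top nibble     */
        bcd <<= 4;                             /* bring the next digit up */
    }
    *buf = '\0';
}
```

For example, `bcd5_to_str(0x12345, 3, buf)` leaves "12.345" in `buf`. On the PIC the constant 4-bit and 28-bit shifts would still cost something, so it would be worth checking the listing and timing it before relying on this.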