hi Øyvind
Yes the idea was to do 4 computations with int64 routines and to compare the results with what the compiler would do if you were using float (or double as in XC double are treated as float).
now I m not sure what you mean by direct convert or not.
I have provided a basic set of s31, s28(),s56() and f28(),f56() which convert respectively float to int and int to float and this is done at compile time. this is useful to help you writing the integers numbers directly as float instead of using hexadecimal !
just print s28(0.5) in hexadecimal and you will get then "direct" if this is what you meant.
about shift, yes you have to play with the "less expensive" ones:
when you have a unsigned 32 bit, "shr dest, source,n" is the way forward to do a shift right and it can be combined with another instruction when using dual issue.
when you have a signed 32 bits, "ashr dest, source,n" is the way forward to keep and propagate the sign bit, but it take a full instruction.
when you have a 64bits, then lextract is very good to get 32 bits at any position so you can get same effect than a shit left or right.
but if you want to shift right a 64 bits into another 64 bits then the best optimization is to use both : lextract to retrieve the lsb and ashr to reduce the msb.
here we go for this morning :)
dsp_math_multiply_sat explanation
-
- XCore Addict
- Posts: 246
- Joined: Mon Jan 08, 2018 4:14 pm
-
- Respected Member
- Posts: 510
- Joined: Wed Apr 25, 2012 8:52 pm
With "direct" I meant any need when going from like s1_31 (which I also do from my mic) to the AGC (or just downscaling) to the biquad filters q4_24 to s2_30 for the FFT etc. I don't have any particular functions which do that. I guess that's what got me around to starting this thread. My system works (and can pick up and alarm when it hears the recorded door bell's sound through a second door, but best after AGC), and I do check overflows, and don't get too large components from any mirror frequencies even without any (Hanning, Hamming or Blackman) window to go to zero at the ends.
Off Topic
Being retired is in many ways great! Nobody tells me what to do. But since I cant' just drop over to the neighbour and trouble him any more (I couldn't at work, either, we were only C and ATmega for the small embedded controllers (so I had to make my own CSP based runtimes) (but as retired, yess, I can do as much XMOS as I want!), being part of such a good forum as this is and has been a must! I do all the "stupid" questions only to get tons of experience right on my desk! Thanks a lot, Fabrice and others!
--
Øyvind Teig
Trondheim (Norway)
https://www.teigfam.net/oyvind/home/
Øyvind Teig
Trondheim (Norway)
https://www.teigfam.net/oyvind/home/
-
- Respected Member
- Posts: 510
- Joined: Wed Apr 25, 2012 8:52 pm
We have 6 dB per bit, so q4-q1=3 bits "more". 3*6=18 dB "head room"? I understand this I think, but not why you in the same sentence mention the range.
q8-q1=7 bits more. 7*6=42 dB. OK!
Both dsp_design_biquad_lowpass and dsp_filters_biquads would have a q_format param, so I assume that both the coefs and the input_sample adhere to the same q_format(?)
Aren't only the internal there 128 bits? Where are the 128 bits in mul64_int64_int64 (since sizeof a long long is 8 = 64 bits)?
To help myself:
// https://stackoverflow.com/questions/189 ... sizeoflong
// The C and C++ specifications only state that long must be greater than or equal to 32 bits
//
Code: Select all
debug_print ("sizeof(short) = %d\n", (int)sizeof(short));
debug_print ("sizeof(int) = %d\n", (int)sizeof(int));
debug_print ("sizeof(long) = %d\n", (int)sizeof(long));
debug_print ("sizeof(long long) = %d\n", (int)sizeof(long long));
debug_print ("sizeof(float) = %d\n", (int)sizeof(float));
debug_print ("sizeof(double) = %d\n", (int)sizeof(double));
debug_print ("sizeof(long double) = %d\n", (int)sizeof(long double));
// sizeof(int) = 4
// sizeof(long) = 4
// sizeof(long long) = 8
// sizeof(float) = 4
// sizeof(double) = 8
// sizeof(long double) = 8
--
Øyvind Teig
Trondheim (Norway)
https://www.teigfam.net/oyvind/home/
Øyvind Teig
Trondheim (Norway)
https://www.teigfam.net/oyvind/home/
-
- XCore Addict
- Posts: 246
- Joined: Mon Jan 08, 2018 4:14 pm
Hum, I see a bug in the transcription of my assembly code for int64 by 64 and will come back with new code.
otherwise
The sample s31 which has been scaled before by a q4.28 gain is now q4.28 so this gives a 18db overhead security, and when going through the biquad, we get a q8.56 which is great to hold the extra gain or overshoot that the biquad may generate.
otherwise
just because sometime you prefer to consider a gain of 4 instead of writing 12db. or you want to negate the result and need to write -4. So I consider important to give the "range" representation also.aclassifier wrote: ↑Wed May 15, 2024 6:12 pm We have 6 dB per bit, so q4-q1=3 bits "more". 3*6=18 dB "head room"? I understand this I think, but not why you in the same sentence mention the range.
yes . we organize the flow so that the input of the biquad and the coefficient have the same q_format. This makes things easier :aclassifier wrote: ↑Wed May 15, 2024 6:12 pm Both dsp_design_biquad_lowpass and dsp_filters_biquads would have a q_format param, so I assume that both the coefs and the input_sample adhere to the same q_format(?)
The sample s31 which has been scaled before by a q4.28 gain is now q4.28 so this gives a 18db overhead security, and when going through the biquad, we get a q8.56 which is great to hold the extra gain or overshoot that the biquad may generate.
-
- XCore Addict
- Posts: 246
- Joined: Mon Jan 08, 2018 4:14 pm
Hello
as mentioned there is bug in the int64 x int64 which I realized when documenting it...
here is the corrected files showing accuracy after multiple treatments.
sorry for those who have downloaded the file and will scratch there head.
also I realized during the text that the Compiler (and execution library) is happy with double and not only float as per documentation 14.4.1 figure 83
The source code is documented and should give enough information to understand how the optimization works.
hope this is useful to someone
as mentioned there is bug in the int64 x int64 which I realized when documenting it...
here is the corrected files showing accuracy after multiple treatments.
sorry for those who have downloaded the file and will scratch there head.
also I realized during the text that the Compiler (and execution library) is happy with double and not only float as per documentation 14.4.1 figure 83
The source code is documented and should give enough information to understand how the optimization works.
hope this is useful to someone
You do not have the required permissions to view the files attached to this post.
-
- Respected Member
- Posts: 510
- Joined: Wed Apr 25, 2012 8:52 pm
Great! Seems like you have done quite some more. I do notice that the last 4 to 6 numbers are not equal, which I think you show with error. Here is the log:
I added this to #define VER_STR "v102"
And this:
Code: Select all
Hello, test with XC compiler (v102):
acc1f = 1.813799364234
acc2f = 3.141592653590
acc3f = 5.698218757764
outputf = 0.949703126294
sample = 0.577350269072, 0x49E69D16
gain = 0.392699081451, 0x3243F6A8
acc1 = 1.813799362718, 0x 1D05527B061874E
acc2 = 3.141592648634, 0x 3243F6A733C4A1B
acc3 = 5.698218744012, 0x 5B2BE76AEFCBF2A
output = 0.949703109451, 0x798FDF1A
error = 0.017735407ppm, -155.0dB
accuracy on 25 bits, -150.5dB
Code: Select all
#define VER_STR "v102"
// v100 13May2024 Fabriceo orig with -O2
// v101 13May2024 Fabriceo orig with -O3 and name _ends (cannnot see any difference in log)
// v102 18May2024 Fabrice new versions of both the .h and this file. See "Sat May 18, 2024 2:49 pm" there
Code: Select all
printf("Hello, test with XC compiler (%s):\n", VER_STR);
Last edited by aclassifier on Sun May 19, 2024 11:33 am, edited 2 times in total.
--
Øyvind Teig
Trondheim (Norway)
https://www.teigfam.net/oyvind/home/
Øyvind Teig
Trondheim (Norway)
https://www.teigfam.net/oyvind/home/
-
- Respected Member
- Posts: 510
- Joined: Wed Apr 25, 2012 8:52 pm
As I guess my log from Wed May 15, 2024 7:12 pm also would confirm.
--
Øyvind Teig
Trondheim (Norway)
https://www.teigfam.net/oyvind/home/
Øyvind Teig
Trondheim (Norway)
https://www.teigfam.net/oyvind/home/
-
- Respected Member
- Posts: 510
- Joined: Wed Apr 25, 2012 8:52 pm
Hi again,
you call number of fraction bits for "mantissa". I was wondering about that naming, but you are right!
My head was on (2) below, but you use it as (1). See https://www.mathsisfun.com/definitions/mantissa.html
Two definitions (somewhat rewritten)
(1) The part of a number after the "."
Example: in 2.71828 the mantissa is 0.71828
(2) In scientific notation the mantissa is the digits without the ×10n part.
Example: in 5,3266 × 103 the mantissa is 5,3266
you call number of fraction bits for "mantissa". I was wondering about that naming, but you are right!
My head was on (2) below, but you use it as (1). See https://www.mathsisfun.com/definitions/mantissa.html
Two definitions (somewhat rewritten)
(1) The part of a number after the "."
Example: in 2.71828 the mantissa is 0.71828
(2) In scientific notation the mantissa is the digits without the ×10n part.
Example: in 5,3266 × 103 the mantissa is 5,3266
--
Øyvind Teig
Trondheim (Norway)
https://www.teigfam.net/oyvind/home/
Øyvind Teig
Trondheim (Norway)
https://www.teigfam.net/oyvind/home/
-
- Respected Member
- Posts: 510
- Joined: Wed Apr 25, 2012 8:52 pm
Is there a way to pick out from the saturation lsats asm instruction whether it saturated or not? Like in #fabriceo's mymath v110, kind of..?
For my code I have such functionality, basically increasing a number if saturation happens. Since this is done in a loop, and saturation happens so seldom (with my data), instead of testing the returned value for the saturated values (INT_MIN and INT_MAX) every time, I can do it after the loop. Even if that formally is not correct if the value in were legally INT_MIN or INT_MAX. Of course, if that's what a function like sat64 would need to do anyhow, then there may not be any cycles saved.
Now I do it like this:
So does lsats leave any traces of this?
Øyvind
For my code I have such functionality, basically increasing a number if saturation happens. Since this is done in a loop, and saturation happens so seldom (with my data), instead of testing the returned value for the saturated values (INT_MIN and INT_MAX) every time, I can do it after the loop. Even if that formally is not correct if the value in were legally INT_MIN or INT_MAX. Of course, if that's what a function like sat64 would need to do anyhow, then there may not be any cycles saved.
Now I do it like this:
Code: Select all
int32_t product_32_sat = dsp_math_multiply_sat (input1_value, input2_value, q_format);
int32_t product_32 = dsp_math_multiply (input1_value, input2_value, q_format);
if (product_32_sat != product_32) {
err_cnt++;
} else {}
return product_32_sat;
Øyvind
--
Øyvind Teig
Trondheim (Norway)
https://www.teigfam.net/oyvind/home/
Øyvind Teig
Trondheim (Norway)
https://www.teigfam.net/oyvind/home/
-
- XCore Addict
- Posts: 246
- Joined: Mon Jan 08, 2018 4:14 pm
Hello Øyvind
the Lsat only modify the target registers. If you want to know if the result was saturated to min or max then you just need to compare each of the 2 target registers with their previous value. I m doing that also to raise a flag when the dsp engine clips.
fabriceo
the Lsat only modify the target registers. If you want to know if the result was saturated to min or max then you just need to compare each of the 2 target registers with their previous value. I m doing that also to raise a flag when the dsp engine clips.
fabriceo