dsp_math_multiply_sat explanation

fabriceo · Post by **fabriceo** » Tue May 14, 2024 8:03 am

hi Øyvind
Yes the idea was to do 4 computations with int64 routines and to compare the results with what the compiler would do if you were using float (or double as in XC double are treated as float).
now I m not sure what you mean by direct convert or not.

I have provided a basic set of s31, s28(),s56() and f28(),f56() which convert respectively float to int and int to float and this is done at compile time. this is useful to help you writing the integers numbers directly as float instead of using hexadecimal !
just print s28(0.5) in hexadecimal and you will get then "direct" if this is what you meant.

about shift, yes you have to play with the "less expensive" ones:

when you have a unsigned 32 bit, "shr dest, source,n" is the way forward to do a shift right and it can be combined with another instruction when using dual issue.
when you have a signed 32 bits, "ashr dest, source,n" is the way forward to keep and propagate the sign bit, but it take a full instruction.
when you have a 64bits, then lextract is very good to get 32 bits at any position so you can get same effect than a shit left or right.
but if you want to shift right a 64 bits into another 64 bits then the best optimization is to use both : lextract to retrieve the lsb and ashr to reduce the msb.

here we go for this morning :)

aclassifier · Post by **aclassifier** » Tue May 14, 2024 10:40 am

With "direct" I meant any need when going from like s1_31 (which I also do from my mic) to the AGC (or just downscaling) to the biquad filters q4_24 to s2_30 for the FFT etc. I don't have any particular functions which do that. I guess that's what got me around to starting this thread. My system works (and can pick up and alarm when it hears the recorded door bell's sound through a second door, but best after AGC), and I do check overflows, and don't get too large components from any mirror frequencies even without any (Hanning, Hamming or Blackman) window to go to zero at the ends.

Off Topic

Being retired is in many ways great! Nobody tells me what to do. But since I cant' just drop over to the neighbour and trouble him any more (I couldn't at work, either, we were only C and ATmega for the small embedded controllers (so I had to make my own CSP based runtimes) (but as retired, yess, I can do as much XMOS as I want!), being part of such a good forum as this is and has been a must! I do all the "stupid" questions only to get tons of experience right on my desk! Thanks a lot, Fabrice and others!

aclassifier · Post by **aclassifier** » Wed May 15, 2024 6:12 pm

fabriceo wrote: ↑Fri May 10, 2024 8:57 pm the (music) samples are coded q1.31
any gain, factor or biquad coefficient is coded q4.28 which gives a 18db head room as we can represent number -8.0..7.999

We have 6 dB per bit, so q4-q1=3 bits "more". 3*6=18 dB "head room"? I understand this I think, but not why you in the same sentence mention the range.

fabriceo wrote: ↑Fri May 10, 2024 8:57 pm q8.56 which gives a 42db

q8-q1=7 bits more. 7*6=42 dB. OK!

fabriceo wrote: ↑Fri May 10, 2024 8:57 pm when doing a biquad, the q8.56 is converted as q4.28 and multiplied by all coefs q4.28 so the end result is again q8.56

Both dsp_design_biquad_lowpass and dsp_filters_biquads would have a q_format param, so I assume that both the coefs and the input_sample adhere to the same q_format(?)

fabriceo wrote: ↑Fri May 10, 2024 8:57 pm sometime it is needed to multiply q8.56 by q8.56 (128bits result) and then again reduced as q8.56
q8.56*q8.56->q8.56

Aren't only the internal there 128 bits? Where are the 128 bits in mul64_int64_int64 (since sizeof a long long is 8 = 64 bits)?

To help myself:

// https://stackoverflow.com/questions/189 ... sizeoflong
// The C and C++ specifications only state that long must be greater than or equal to 32 bits
//

Code: Select all

debug_print ("sizeof(short) = %d\n", (int)sizeof(short));
debug_print ("sizeof(int) = %d\n", (int)sizeof(int));
debug_print ("sizeof(long) = %d\n", (int)sizeof(long));
debug_print ("sizeof(long long) = %d\n", (int)sizeof(long long));
debug_print ("sizeof(float) = %d\n", (int)sizeof(float));
debug_print ("sizeof(double) = %d\n", (int)sizeof(double));
debug_print ("sizeof(long double) = %d\n", (int)sizeof(long double));

// sizeof(short) = 2
// sizeof(int) = 4
// sizeof(long) = 4
// sizeof(long long) = 8
// sizeof(float) = 4
// sizeof(double) = 8
// sizeof(long double) = 8

fabriceo · Post by **fabriceo** » Wed May 15, 2024 9:20 pm

Hum, I see a bug in the transcription of my assembly code for int64 by 64 and will come back with new code.

otherwise

aclassifier wrote: ↑Wed May 15, 2024 6:12 pm We have 6 dB per bit, so q4-q1=3 bits "more". 3*6=18 dB "head room"? I understand this I think, but not why you in the same sentence mention the range.

just because sometime you prefer to consider a gain of 4 instead of writing 12db. or you want to negate the result and need to write -4. So I consider important to give the "range" representation also.

aclassifier wrote: ↑Wed May 15, 2024 6:12 pm Both dsp_design_biquad_lowpass and dsp_filters_biquads would have a q_format param, so I assume that both the coefs and the input_sample adhere to the same q_format(?)

yes . we organize the flow so that the input of the biquad and the coefficient have the same q_format. This makes things easier :
The sample s31 which has been scaled before by a q4.28 gain is now q4.28 so this gives a 18db overhead security, and when going through the biquad, we get a q8.56 which is great to hold the extra gain or overshoot that the biquad may generate.

fabriceo · Post by **fabriceo** » Sat May 18, 2024 1:49 pm

Hello
as mentioned there is bug in the int64 x int64 which I realized when documenting it...
here is the corrected files showing accuracy after multiple treatments.

sorry for those who have downloaded the file and will scratch there head.
also I realized during the text that the Compiler (and execution library) is happy with double and not only float as per documentation 14.4.1 figure 83

The source code is documented and should give enough information to understand how the optimization works.
hope this is useful to someone

aclassifier · Post by **aclassifier** » Sun May 19, 2024 11:18 am

Great! Seems like you have done quite some more. I do notice that the last 4 to 6 numbers are not equal, which I think you show with error. Here is the log:

Code: Select all

Hello, test with XC compiler (v102):
acc1f   = 1.813799364234
acc2f   = 3.141592653590
acc3f   = 5.698218757764
outputf = 0.949703126294

sample  = 0.577350269072, 0x49E69D16
gain    = 0.392699081451, 0x3243F6A8
acc1    = 1.813799362718, 0x 1D05527B061874E
acc2    = 3.141592648634, 0x 3243F6A733C4A1B
acc3    = 5.698218744012, 0x 5B2BE76AEFCBF2A
output  = 0.949703109451, 0x798FDF1A
error   = 0.017735407ppm, -155.0dB
accuracy on 25 bits, -150.5dB

I added this to #define VER_STR "v102"

Code: Select all

#define VER_STR "v102"
// v100 13May2024 Fabriceo orig with -O2
// v101 13May2024 Fabriceo orig with -O3 and name _ends (cannnot see any difference in log)
// v102 18May2024 Fabrice new versions of both the .h and this file. See "Sat May 18, 2024 2:49 pm" there

And this:

Code: Select all

printf("Hello, test with XC compiler (%s):\n", VER_STR);

aclassifier · Post by **aclassifier** » Sun May 19, 2024 11:25 am

fabriceo wrote: ↑Sat May 18, 2024 1:49 pm I realized during the text that the Compiler (and execution library) is happy with double and not only float as per documentation 14.4.1 figure 83

As I guess my log from Wed May 15, 2024 7:12 pm also would confirm.

aclassifier · Post by **aclassifier** » Tue May 21, 2024 12:09 pm

Hi again,

you call number of fraction bits for "mantissa". I was wondering about that naming, but you are right!

My head was on (2) below, but you use it as (1). See https://www.mathsisfun.com/definitions/mantissa.html

Two definitions (somewhat rewritten)
(1) The part of a number after the "."
Example: in 2.71828 the mantissa is 0.71828
(2) In scientific notation the mantissa is the digits without the ×10ⁿ part.
Example: in 5,3266 × 10³ the mantissa is 5,3266

aclassifier · Post by **aclassifier** » Sun May 26, 2024 10:19 am

Is there a way to pick out from the saturation lsats asm instruction whether it saturated or not? Like in #fabriceo's mymath v110, kind of..?

For my code I have such functionality, basically increasing a number if saturation happens. Since this is done in a loop, and saturation happens so seldom (with my data), instead of testing the returned value for the saturated values (INT_MIN and INT_MAX) every time, I can do it after the loop. Even if that formally is not correct if the value in were legally INT_MIN or INT_MAX. Of course, if that's what a function like sat64 would need to do anyhow, then there may not be any cycles saved.

Now I do it like this:

Code: Select all

int32_t product_32_sat = dsp_math_multiply_sat (input1_value, input2_value, q_format);
int32_t product_32     = dsp_math_multiply     (input1_value, input2_value, q_format);

if (product_32_sat != product_32) {
    err_cnt++;
} else {}

return product_32_sat;

So does lsats leave any traces of this?
Øyvind

fabriceo · Post by **fabriceo** » Mon May 27, 2024 7:51 am

Hello Øyvind
the Lsat only modify the target registers. If you want to know if the result was saturated to min or max then you just need to compare each of the 2 target registers with their previous value. I m doing that also to raise a flag when the dsp engine clips.

fabriceo

dsp_math_multiply_sat explanation

Re: dsp_math_multiply_sat explanation

Re: dsp_math_multiply_sat explanation

Re: dsp_math_multiply_sat explanation

Re: dsp_math_multiply_sat explanation

Re: dsp_math_multiply_sat explanation

Re: dsp_math_multiply_sat explanation

Re: dsp_math_multiply_sat explanation

Re: dsp_math_multiply_sat explanation

Re: dsp_math_multiply_sat explanation

Re: dsp_math_multiply_sat explanation