Re: Problem when realizing DSP algorithm:LMS on xCORE-200 MC Audio
Posted: Wed May 09, 2018 1:38 pm
The problem with using Q31 is that you can still overflow when you multiply-accumulate into a 64-bit register. Here is some explanatory text from ARM for their similar function (arm_fir_q31), which takes 32-bit input data, uses a 64-bit accumulator, and produces a 32-bit output:
Scaling and Overflow Behavior:
The function is implemented using an internal 64-bit accumulator. The accumulator has a 2.62 format and maintains full precision of the intermediate multiplication results but provides only a single guard bit. Thus, if the accumulator result overflows it wraps around rather than clip. In order to avoid overflows completely the input signal must be scaled down by log2(numTaps) bits. After all multiply-accumulates are performed, the 2.62 accumulator is right shifted by 31 bits and saturated to 1.31 format to yield the final result.
So it appears you must scale down your input by log2(num_taps) bits to ensure you don't get overflow. For example, with 128 taps you would need to shift all your data right by log2(128) = 7 bits. That's why I said @CousinItt's suggestion of using Q7.24 to avoid overflow is so useful.
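To make the scaling rule concrete, here is a rough C sketch of a single Q31 FIR output computed with a 64-bit accumulator, where each input sample is shifted right by ceil(log2(num_taps)) bits before the multiply-accumulate. This is not ARM's or XMOS's implementation; the names guard_shift and fir_q31_scaled are made up for illustration, and it assumes the compiler does an arithmetic right shift on negative values (which is implementation-defined in C, though usual in practice).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: smallest s with (1 << s) >= num_taps,
 * i.e. ceil(log2(num_taps)). For 128 taps this gives 7. */
static unsigned guard_shift(unsigned num_taps)
{
    unsigned s = 0;
    assert(num_taps > 0);
    while (s < 31 && (1u << s) < num_taps) {
        s++;
    }
    return s;
}

/* Hypothetical sketch of one FIR output sample.
 * x and h are Q31 (1.31) arrays of length num_taps.
 * Each Q31 x Q31 product is a 2.62 value; pre-shifting the inputs by
 * guard_shift(num_taps) bits keeps the sum of num_taps products within
 * the 64-bit accumulator. The result is therefore scaled down by
 * 2^guard_shift(num_taps) compared with the unscaled filter output. */
static int32_t fir_q31_scaled(const int32_t *x, const int32_t *h,
                              unsigned num_taps)
{
    unsigned shift = guard_shift(num_taps);
    int64_t acc = 0;
    unsigned i;

    for (i = 0; i < num_taps; i++) {
        /* Assumes arithmetic right shift for negative samples. */
        int32_t xs = x[i] >> shift;
        acc += (int64_t)xs * h[i];
    }

    /* Convert the 2.62 accumulator back to 1.31 and saturate. */
    acc >>= 31;
    if (acc > INT32_MAX) acc = INT32_MAX;
    if (acc < INT32_MIN) acc = INT32_MIN;
    return (int32_t)acc;
}
```

The cost is that you lose guard_shift(num_taps) bits of input precision, and the output has to be interpreted with the same scaling. A format with integer headroom built in, like the one @CousinItt suggested, avoids having to do this shift by hand.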