iir filter implementation

stdefeber · Post by **stdefeber** » Tue Oct 07, 2014 7:49 pm

I was struggling with the IIR implementation of lilltroll.
Modified the code to fit my implementation.

I tried to run an impulse and step response and got rubbish.
Large oscillations. So I expected that the feedback loop was not right.
Probably the quantization step of Y0.
To figure out what went wrong I coded a fixed point implementation in plain C and got that running.
32-bit integer multiplication, result is a 64 bit integer (signed long long) and shifted back to fit into an 32-bit integer again.
Next I recoded the xcore implementation. and got that one running too.

In the xcore implementation I noticed that shifting back the macs result by 1 or 2 makes a difference but does not make the filter unreliable. Observed from the step and impulse response plot which have similar shapes but different scales.
Next I noticed that using coefficients scaled to 30 bit (2^30) or 31-bit ((2^31)-1), again observed from the plots, make a difference but do not make the filter unreliable. Overflowing of the MAC is, of course, less of a problem due to smaller scaled coefficients. The smaller coefs will result in a smaller end result.

My question is; How can I match up the C and xc implementation as accurate as possible ?
And how do I make maximum use of the precision the {h,l} macs function gives me ?

best regards

Simon

Post by **lilltroll** » Wed Oct 08, 2014 11:20 am

Hi

As you probably know the B and A coef. doesn't affect noise , or limited cycles in the filter. But it affect the accuracy.
If you calculate a peaking filter with double float precision -10 dB @ 37.0 Hz Q = 10 , you will end up with something else when rounding it to fixed numbers (like -9.68 dB @ 36.56 Hz with Q = 9.4.) For very high Q values the rounding can give you an unstable solution if you not round towards the point (complex number) 0 + i0.

You will need to handle normalized B, A coef. in the range | -1.999999999999999999 1.9999999999999999| for filters with a maximum of unity gain. For gain above unity you need an other range for B.

Which (old) version of my code are you using?
I have a XC kit now that calculates the coefs to my IIR filters.

You can also check out the sc_dspfilters on Github. Henk has an asm version of a standard filter implementation.

stdefeber · Post by **stdefeber** » Wed Oct 08, 2014 12:03 pm

Thanks for the reply.

I am using a modified version of the Filter toolbox version 0.1.
Modifications I made :
1. Unrolled the channel "for loop", to test just 1 channel for debug purposes only
2. Instead of using streaming channels for the in-/output I use arguments
3. Unrolled the "for loop" wrt feed forward calculations, for debug purposes only

grtz

Simon

Post by **lilltroll** » Wed Oct 08, 2014 7:06 pm

Something more up to date here.

First some typedef

Code: Select all

void EQdAsm(streaming chanend c_audio, chanend c_coef);

enum FilterType{LP1, LP2 , HP1 , HP2 , LowShelf , HighShelf , PeakingEQ , Notch , AllPass , BandPass , Mute};

typedef struct{
        int active;
        enum FilterType type;
        float Fc;
        int link;
        float Q;
        float Gain;
        float MasterGain;
    }EQ_section_t;

Code to calculate filters
Check out the Shift feature!

Code: Select all

#include <math.h>
#include "filters.h"
#include <print.h>

double norm(double a,double b,double c){
	double max;
	max=fabs(a);
	if(fabs(b)>max)
		max=fabs(b);
	if(fabs(c)>max)
		max=fabs(c);
	return max;
}


void calcFilt(EQ_section_t EQ , unsigned fs , int Bcoef[3] , int Acoef[2] , int &Shift){
	/*Calculate parametric EQ coef
    %f0 is the filter frequency
    %Q is the Q value
    %GaindB is the gain in dB
    %fs is the sampling frequency
    %Filtertype is one of
    % 1 LPF (Lowpass Filter)
    % 2 HPF (High pass filter)
    % 3 BPFs (Band pass filter constant skirt gain, peak gain = Q)
    % 4 BPFp (constant 0 dB peak gain)
    % 5 notch
    % 6 APF AllPass filter
    % 7 peakingEQ
    % 8 lowShelf
    % 9 highSelf*/

	double SCALE,max;
	const double pi=3.14159265358979323846;
    double w0 = 2 * pi * (double)EQ.Fc/ (double)fs;
    double cos_w0 = cos(w0);
    double alpha=sin(w0)/(2*(double)EQ.Q);
    double A = pow(10,((double)EQ.Gain/40));
    double sqA;
    double Gain=pow(10,((double)EQ.Gain/20));
    double p; //prewarp factor
    double b0, b1, b2 ,a0 ,a1 ,a2;
    switch(EQ.type){
        case LP1:
            p = tan(0.5*w0);
            a0 = 1;
            a1 = (p-1)/(p+1);
            a2 = 0;
            b0 = Gain*p/(p+1);
            b1 = b0;
            b2 = 0;
            break;
        case LP2:
    		b0 = Gain *  (1 - cos_w0)/2;
            b1 =  2*b0;
            b2 =  b0;
            a0 =   1 + alpha;
            a1 =  -2*cos_w0;
            a2 =   1 - alpha;
            break;
        case HP1:
             p = tan(0.5*w0);
             a0 = 1;
             a1 = (p-1)/(p+1);
             a2 = 0;
             b0 = Gain*(a0-a1)/2;
             b1 = -b0;
             b2 = 0;
             break;
        case HP2:
            b0 = Gain *  (1 + cos_w0)/2;
            b1 = -2*b0;
            b2 = b0;
            a0 =   1 + alpha;
            a1 =  -2*cos_w0;
            a2 =   1 - alpha;
            break;
    /*case BPFs :
            b0 =   sin(w0)/2 ;
            b1 =   0 ;
            b2 =  -b0;
            a0 =   1 + alpha;
            a1 =  -2*cos_w0;
            a2 =   1 - alpha;
            break;*/
    case BandPass:
            b0 = Gain *  alpha;
            b1 =   0;
            b2 = Gain * -alpha;
            a0 =   1 + alpha;
            a1 =  -2*cos_w0;
            a2 =   1 - alpha;
            break;
     case Notch:
            b0 =   1;
            b1 =  -2*cos_w0;
            b2 =   1;
            a0 =   1 + alpha;
            a1 =  -2*cos_w0;
            a2 =   1 - alpha;
            break;
     case AllPass:
            b0 =   1 - alpha;
            b1 =  -2*cos_w0;
            b2 =   1 + alpha;
            a0 =   1 + alpha;
            a1 =  -2*cos_w0;
            a2 =   1 - alpha;
            break;
     case PeakingEQ:
            b0 =   1 + alpha*A;
            b1 =  -2*cos_w0;
            b2 =   1 - alpha*A;
            a0 =   1 + alpha/A;
            a1 =  -2*cos_w0;
            a2 =   1 - alpha/A;
            break;
      case LowShelf:
            sqA=sqrt(A);
    	    b0 =    A*( (A+1) - (A-1)*cos_w0 + 2*sqA*alpha );
            b1 =  2*A*( (A-1) - (A+1)*cos_w0                   );
            b2 =    A*( (A+1) - (A-1)*cos_w0 - 2*sqA*alpha );
            a0 =        (A+1) + (A-1)*cos_w0 + 2*sqA*alpha;
            a1 =   -2*( (A-1) + (A+1)*cos_w0                   );
            a2 =        (A+1) + (A-1)*cos_w0 - 2*sqA*alpha;
      break;
      case HighShelf:
    	    sqA=sqrt(A);
    	  	b0 =    A*( (A+1) + (A-1)*cos_w0 + 2*sqA*alpha );
            b1 = -2*A*( (A-1) + (A+1)*cos_w0                   );
            b2 =    A*( (A+1) + (A-1)*cos_w0 - 2*sqA*alpha );
            a0 =        (A+1) - (A-1)*cos_w0 + 2*sqA*alpha;
            a1 =    2*( (A-1) - (A+1)*cos_w0                   );
            a2 =        (A+1) - (A-1)*cos_w0 - 2*sqA*alpha;
      break;
      case Mute:
          Bcoef[0]=0;
          Bcoef[1]=0;
          Bcoef[2]=0;
          Acoef[0]=0;
          Acoef[1]=0;
     return;
    }
    SCALE=1<<30;
    Acoef[0]=(int) (SCALE* -a1/a0);
    Acoef[1]=(int) (SCALE* -a2/a0);

max = norm(b0/a0,b1/a0,b2/a0);
if(max<2){
	Shift=2;
	}
else{
	Shift=2+floor(log(max)/log(2));
	SCALE=1<<(32-Shift);
	//printstrln("Shift !=2");
}
Bcoef[0]=(int) ((double)EQ.MasterGain*SCALE* b0/a0);
Bcoef[1]=(int) ((double)EQ.MasterGain*SCALE* b1/a0);
Bcoef[2]=(int) ((double)EQ.MasterGain*SCALE* b2/a0);

}

Code to update filters coef

Code: Select all

void sendCoefTr(int active, int bank, int reset , int Bcoef[3],int Acoef[2],int Shift, chanend c ){
    if(bank>=BANKS)
        printstr("ERROR in sendCoefTr(");
    else{
        master{
            c<: bank;//
            c<: reset;
            c<: Bcoef[0];
            c<: Bcoef[1];
            c<: Bcoef[2];
            c<: Acoef[0];
            c<: Acoef[1];
            c<: Shift; //shiftout
            c<: active;
        }
    }
}

Code to update the delay

Code: Select all

void sendDelayTr(int samples , chanend c){
    master{
      c<: -1;
      c<: samples;
    }
}

And the asm code for the filter

Code: Select all

#define useEvents
//#define useNonStreamingData
//#define BITREVOUT

#define BANKS 8
#define BUFSIZE 1024
#define NWORDS (1+12+(7+6)*BANKS+BUFSIZE)

    .globl EQdAsm
    .globl EQdAsm.nstackwords
    .linkset EQdAsm.nstackwords, NWORDS
    .linkset EQdAsm.maxthreads, 1

#define FRACTIONALBITS 30
#define CT_END 0x1

#define s1  r0
#define s2  r1
#define s3  r2
#define s4  r3
#define s5 	r4
#define C31  r5
#define cptr r6
#define i 	r7
#define zero r8
#define hi r9
#define mi r10
#define lo r11

#define DELAY 		sp[8]
#define WRITEPTR 	sp[9]
#define Caudio		sp[10]
#define Ccoef		sp[11]

#define COEF  sp[12]
#define STATES sp[12+7*BANKS]
#define BUFF sp[12+(7+6)*BANKS]

#define b0 cptr[0]
#define b1 cptr[1]
#define b2 cptr[2]
#define a1 cptr[3]
#define a2 cptr[4]
#define ShiftLeft cptr[5]
#define active cptr[6]

#define	x1 		dp[0]
#define	x2 		dp[1]
#define	y1hi 	dp[2]
#define	y1lo 	dp[3]
#define	y2hi	dp[4]
#define	y2lo 	dp[5]


EQdAsm:
	entsp NWORDS
    stw r0,Caudio
    stw r1,Ccoef
    // Store r4 to r10 on the stack
    stw r4, sp[1]
    stw r5, sp[2]
    stw r6, sp[3]
    stw r7, sp[4]
    stw r8, sp[5]
    stw r9, sp[6]
    stw r10, sp[7]

    ldc zero, 0
    ldc	C31,31
	stw zero,DELAY
	stw zero,WRITEPTR //reset writePtr

    	// Write 0 to all state vectors and coeff vectors
	 	ldc  	i,12    	//startPtr
	 	ldc		r10,NWORDS-1  //stopPtr
		ldaw	s5,sp[0]
writeZEROS:
	stw     zero,s5[i]  //Write zeros in range sp[startPtr]-sp[stopPtr]
    add		i,i,1
    eq		r11,r10,i
    bf		r11, writeZEROS

    clre	//Clear event vector
    ldap r11,   UPDATEFILTERCOEF
    setv res[r1], r11
    eeu  res[r1] //enable event uncond. on Ccoef

#ifdef useEvents
	ldap r11,   AUDIOSAMPLE1
	setv res[r0], r11
	eeu  res[r0]
	waiteu
#endif

SETSTATUSREG:
    setsr 0x1  //Check if an event is waiting
    clrsr 0x1
AUDIOSAMPLE1:
    ldw 	s1,Caudio
    in  	s2, res[s1]
    ashr		s2,s2,1
    ldaw 	cptr, BUFF
    ldw 	i, WRITEPTR
    stw		s2, cptr[i]  // store audio sample at &WRITEPTR

	ldc  	hi,BUFSIZE-1
	ldw 	s3, DELAY
	add 	s4,i,s3 //s4=readPtr = WRITEPTR+DELAY
    and		s4,s4,hi // to be optimized
	//lss		s5,s4,hi
    ldw  	mi, cptr[s4]  //load audio sample at BUFF[WRITEPTR+DELAY]
	sub 	i,i,1 //update writeptr
	and  	i,i,hi
	stw 	i, WRITEPTR

FILTER1:
	ldc i,BANKS 		//Filter sections
	ldaw dp,STATES 		//reset statesPtr
	ldaw cptr,COEF		//reset coeffsPtr
    ldw s1,active
	bu TEST1

UPDATE1:
	sub i,i,1
	ldaw dp,dp[6]  		//add 6 words to the states
	ldaw cptr,cptr[7]	//add 7 words to the coeffs
	ldw s1,active
TEST1:
	bf i,send1			//i==0 => send sample
	//bf s1,UPDATE1       //BANK not active
	bt s1,LOOP1
	stw zero,x1			//False
	stw zero,x2
	stw zero,y1lo
	stw zero,y1hi
	stw zero,y2lo
	stw zero,y2hi
	bu UPDATE1
LOOP1:
	mov s3,mi
	ldw s1,b0 //B0
	lsub hi,mi,zero,zero,zero
	maccs 	hi, mi, s1, s3 	// B0*X0
	ldw s1,b2 //B2
	ldw s2,x1 //!X2
	maccs 	hi, mi, s1, s2	// B2*!X2
	stw s3,x1  //X-> !X2 OVERWRITE
	ldw s1,b1 //B1
	ldw s2,x2 //!X1
	//FNOP
	maccs 	hi, mi, s1, s2	// B1*!X1

//Calc (int96) ACC+ =|2-complement|= Ysign:Yhi:Ylo * Asign:Ahi:Alo
   ldw s4, a1   //Alo
   ldw s1, y2hi
   ldw s2, y2lo
.align 4
   ashr s3, s4, C31               //extract sign from A
   lmul   s5, lo, s2, s4, zero, zero    //Ylo*Alo   -- LO --
   lmul   s5, mi, s2, s3, s5, mi       	//Ylo*Ahi   -- MID --
   maccu  s5, mi, s1, s4              	//Yhi*Alo   -- MID --
   lmul   s5, hi, s1, s3, s5, hi     	//Yhi*Ahi   -- HI --
   maccu  s5, hi, s2, s3             	//Ylo*Asign   -- HI --
   ashr   s1, s1, C31               	//extract sign from Yhi
   maccu  s5, hi, s1, s4             	//Ysign*Alo   -- HI --


//Calc (int96) ACC+ =|2-complement|= Ysign:Yhi:Ylo * Asign:Ahi:Alo
   ldw s4, a2   //Alo
   ldw s1, y1hi
   ldw s2, y1lo

.align 4
   ashr s3, s4, C31               //extract sign from A
   lmul   s5, lo, s2, s4, zero, lo     //Ylo*Alo   -- LO --
   lmul   s5, mi, s2, s3, s5, mi       //Ylo*Ahi   -- MID --
   maccu  s5, mi, s1, s4              //Yhi*Alo   -- MID --
   lmul   s5, hi, s1, s3, s5, hi     //Yhi*Ahi   -- HI --
   maccu  s5, hi, s2, s3             //Ylo*Asign   -- HI --
   ashr   s1, s1, C31               //extract sign from Yhi
   maccu  s5, hi, s1, s4             //Ysign*Alo   -- HI --

	//Y_int64=(ACC>>FRACTIONALBITS)
	ldc     s5,FRACTIONALBITS
	shr 	lo,lo,s5				//lo>>=FRACTIONALBITS
	shl 	s4,mi,32-FRACTIONALBITS	//s4=mi<<(32-FRACTIONALBITS)
	or  	s3,s4,lo				//s3 = mi<<(32-FRACTIONALBITS)  || lo>>FRACTIONALBITS
	stw		s3,y1lo					//s3-> y1lo update y1lo
	shr 	mi,mi,s5				//mi>>=FRACTIONALBITS
	shl 	s1,hi,32-FRACTIONALBITS	//s1=hi<<(32-FRACTIONALBITS)
	or  	s2,s1,mi				//y1hi = hi<<(32-FRACTIONALBITS)  || mi>>FRACTIONALBITS
	stw		s2,y1hi					//mi-> y1hi  update y1hi

	//y_out=(Y_int64+0x80000000)>>32
	shr 	s1,s3, C31
	add  	mi,s2,s1 			//if reminder >.5 round upwards

	//Calc next BANK or send out answer if finished
	bt i,UPDATE1

send1:
	ldw s1,Caudio
#ifdef	useNonStreamingData
	outct  res[s1], CT_END
    chkct  res[s1], CT_END
#endif
	shl mi,mi,1
#ifdef BITREVOUT
	bitrev mi , mi
#endif
	out res[s1],mi

#ifdef	useNonStreamingData
	outct  res[s1], CT_END
    chkct  res[s1], CT_END
#endif

AUDIOSAMPLE2:
    ldw 	s1,Caudio
    in  	s2, res[s1]
    ashr		s2,s2,1
    ldaw 	cptr, BUFF
    ldw 	i, WRITEPTR
    stw		s2, cptr[i]  // store audio sample at &WRITEPTR

	ldc  	hi,BUFSIZE-1
	ldw 	s3, DELAY
	add 	s4,i,s3 //s4=readPtr = WRITEPTR+DELAY
    and		s4,s4,hi // to be optimized
	//lss		s5,s4,hi
    ldw  	mi, cptr[s4]  //load audio sample at BUFF[WRITEPTR+DELAY]
	sub 	i,i,1 //update writeptr
	and  	i,i,hi
	stw 	i, WRITEPTR

FILTER2:
	ldc i,BANKS 		//Filter sections
	ldaw dp,STATES 		//reset statesPtr
	ldaw cptr,COEF		//reset coeffsPtr
    ldw s1,active
	bu TEST2

UPDATE2:
	sub i,i,1
	ldaw dp,dp[6]  		//add 6 words to the states
	ldaw cptr,cptr[7]	//add 7 words to the coeffs
	ldw s1,active
TEST2:
	bf i,send2			//i==0 => send sample
	//bf s1,UPDATE2       //BANK not active
	bt s1,LOOP2
	stw zero,x1			//False
	stw zero,x2
	stw zero,y1lo
	stw zero,y1hi
	stw zero,y2lo
	stw zero,y2hi
	bu UPDATE2

LOOP2:
	mov s3,mi
	ldw s1,b0 //B0
	lsub hi,mi,zero,zero,zero
	maccs 	hi, mi, s1, s3 	// B0*X0
	//B2*X2
	ldw s1,b2 //B2
	ldw s2,x2 //X2
	maccs 	hi, mi, s1, s2	// B2*X2
	stw s3,x2 //X-> X2

	ldw s1,b1 //B1
	ldw s2,x1 //X1
	//FNOP
	maccs 	hi, mi, s1, s2	// B1*X1
//Calc (int96) ACC+ =|2-complement|= Ysign:Yhi:Ylo * Asign:Ahi:Alo
   ldw s4, a1   //Alo
   ldw s1, y1hi
   ldw s2, y1lo

.align 4
   ashr s3, s4, C31               //extract sign from A
   lmul   s5, lo, s2, s4, zero, zero     //Ylo*Alo   -- LO --
   lmul   s5, mi, s2, s3, s5, mi       //Ylo*Ahi   -- MID --
   maccu  s5, mi, s1, s4              //Yhi*Alo   -- MID --
   lmul   s5, hi, s1, s3, s5, hi     //Yhi*Ahi   -- HI --
   maccu  s5, hi, s2, s3             //Ylo*Asign   -- HI --
   ashr   s1, s1, C31               //extract sign from Yhi
   maccu  s5, hi, s1, s4             //Ysign*Alo   -- HI --

   //Calc (int96) ACC+ =|2-complement|= Ysign:Yhi:Ylo * Asign:Ahi:Alo
   ldw s4, a2   //Alo
   ldw s1, y2hi
   ldw s2, y2lo

.align 4
   ashr s3, s4, C31               //extract sign from A
   lmul   s5, lo, s2, s4, zero, lo     //Ylo*Alo   -- LO --
   lmul   s5, mi, s2, s3, s5, mi       //Ylo*Ahi   -- MID --
   maccu  s5, mi, s1, s4              //Yhi*Alo   -- MID --
   lmul   s5, hi, s1, s3, s5, hi     //Yhi*Ahi   -- HI --
   maccu  s5, hi, s2, s3             //Ylo*Asign   -- HI --
   ashr   s1, s1, C31               //extract sign from Yhi
   maccu  s5, hi, s1, s4             //Ysign*Alo   -- HI --

	//shl		hi,hi,2
	//stw		hi,y2hi					//mi-> y1hi  update y1hi
	//stw		mi,y2lo					//mi-> y1hi  update y1hi

	//Y_int64=(ACC>>30)
	ldc		s5,FRACTIONALBITS
	shr 	lo,lo,s5				//lo>>=FRACTIONALBITS
	shl 	s4,mi,32-FRACTIONALBITS	//s4=mi<<(32-FRACTIONALBITS)
	or  	s3,s4,lo				//s3 = mi<<(32-FRACTIONALBITS)  || lo>>FRACTIONALBITS
	stw		s3,y2lo					//s3-> y1lo update y1lo
	shr 	mi,mi,s5				//mi>>=FRACTIONALBITS
	shl 	s1,hi,32-FRACTIONALBITS	//s1=hi<<(32-FRACTIONALBITS)
	or  	s2,s1,mi				//y1hi = hi<<(32-FRACTIONALBITS)  || mi>>FRACTIONALBITS
	stw		s2,y2hi					//mi-> y1hi  update y1hi

	//y_out=(Y_int64+0x80000000)>>32
	shr 	s1,s3, C31
	add  	mi,s2, s1 			//if reminder >.5 round upwards

	//Calc next BANK or send out answer if finished
	bt i, UPDATE2

send2:
	ldw 	s1,Caudio
 #ifdef	useNonStreamingData
    outct  	res[s1], CT_END
    chkct  	res[s1], CT_END
 #endif
    shl mi,mi,1
#ifdef BITREVOUT
	bitrev mi , mi
#endif
    out  	res[s1], mi
 #ifdef	useNonStreamingData
    outct  	res[s1], CT_END
    chkct  	res[s1], CT_END
 #endif

#ifdef useEvents
	waiteu
#endif
	bu SETSTATUSREG

UPDATEFILTERCOEF:
	ldw s1,Ccoef
    chkct  res[s1], CT_END
    outct  res[s1], CT_END
    in s2, res[s1]
    //Can be optmized
    mkmsk  s5, 0x20 //r5=-1
    eq s5, s2, s5
    bt s5, READDELAY

READRESET:
           in	r11, res[s1] // RESET or dynamic
           bf   r11, CALCPRT2
CALCPTR1:
    	   ldc i, 6  // i=6;
           mul  s3, s2, i   //s3 = bank *7
		   ldaw	s4, STATES
		   ldaw s3, s4[s3]  //s4=baseadress to STATES
RESETCOEF:
            stw zero , s3[0]
			stw zero , s3[1]
			stw zero , s3[2]
			stw zero , s3[3]
			stw zero , s3[4]
			stw zero , s3[5]
CALCPRT2:
      		ldc i, 7  //7
    		mul s3, s2, i   //bank *7
    		ldaw	s4, COEF
    		ldaw s3, s4[s3]  //r3=baseadress to COEF
READCOEF:
           in   r11, res[s1]
           stw  r11, s3[0]
           in   r11, res[s1]
           stw  r11, s3[1]
           in   r11, res[s1]
           stw  r11, s3[2]
           in   r11, res[s1]
           stw  r11, s3[3]
           in   r11, res[s1]
           stw  r11, s3[4]
           in   r11, res[s1]
           stw  r11, s3[5]
           in   r11, res[s1]
           stw  r11, s3[6]

RX_CT:
           chkct res[s1], CT_END
           outct res[s1], CT_END

          	#ifdef useEvents
				waiteu
			#endif
			bu SETSTATUSREG // if not event branch

READDELAY:
           in r11, res[s1]
           ldc r10, BUFSIZE
           lss r10,r11,r10  //DELAY<BUFSIZE
           ecallf r10 // ERROR: DELAY value too large
           stw r11, DELAY
           //ecallf zero
           bu RX_CT

Epilogue:
		    ldw r4, sp[1]
   	 		ldw r5, sp[2]
    		ldw r6, sp[3]
    		ldw r7, sp[4]
    		ldw r8, sp[5]
    		ldw r9, sp[6]
    		ldw r10, sp[7]
		    retsp NWORDS

An example of a filter setup.

Code: Select all

                EQ.Linked_Fc=150;
                for(int bank=0 ; bank<2 ; bank++){
                    EQ.channel[ch].section[bank].active=1;
                    EQ.channel[ch].section[bank].type=HP2;
                    EQ.channel[ch].section[bank].link=1;
                    EQ.channel[ch].section[bank].Fc = EQ.Linked_Fc;
                    EQ.channel[ch].section[bank].Q=0.707;
                    EQ.channel[ch].section[bank].Gain=0;
                    calcFilt( EQ.channel[ch].section[bank] , fs, B, A, Shift);
                    sendCoef(active , bank , reset, B ,A , Shift , ch  , c_LR , c_CB);
                }

Using reset = 0 is intended for dynamic filter updates.
reset = 1 will also reset the internal states, which is a must when changing from LP filter to ex. peaking filter on the fly.

The trick with Shift is that it will automatically fix the scaling of B coef.
But I think it is yet to be implemented in the asm code. That creates something floating-point alike to B.
B_real = B * 2^Shift;

stdefeber · Post by **stdefeber** » Wed Oct 08, 2014 8:54 pm

Thank you. I will have a look at the code.
I have had a look before at it.
It is a very nice handcrafted high res piece of code !

You just posted before I was ready to show my results.

Code: Select all

signed int iir(signed int input, struct IirStruct *IirBank)
{
  int h = 0;
  unsigned int l = 0;

  #pragma unsafe arrays
  //********** START DSP CODE ****************
  IirBank->X[0]=input;
  h=0;
  l=0;
  {h, l} = macs(IirBank->B[0], IirBank->X[0], h, l);
  {h, l} = macs(IirBank->B[1], IirBank->X[1], h, l);
  {h, l} = macs(IirBank->B[2], IirBank->X[2], h, l);
  {h, l} = macs(-IirBank->A[0], IirBank->Y[0], h, l);
  {h, l} = macs(-IirBank->A[1], IirBank->Y[1], h, l);

  IirBank->X[2]=IirBank->X[1];
  IirBank->X[1]=IirBank->X[0];
  IirBank->Y[1]=IirBank->Y[0];
  IirBank->Y[0]=h<<2;
  return (h<<2);
}

You probably recognize your code ;)

The picture below shows the results from octave and my C code implementation.

and the picture below as the result from my xcore simulation

Except for the scaling result I am quite pleased.

Post by **lilltroll** » Wed Oct 08, 2014 10:02 pm

That is some good start because it is easy to understand and follow!

You should use the same stimuli in Octave as in XC so the scaling becomes the same.
Use the exact same fixed point coef. as input in Octave as you use on the XMOS.
The reason is that you will run the same filter in both simulations.

With correct scaling you should then subtract the 2 curves and plot the error!

Next thing is to use a full scale chirp as a stimuli.

Post by **lilltroll** » Wed Oct 08, 2014 10:25 pm

Try the direct form II implementation as well.

Below is an example with 16 bit x 16 bit but you can change it to 32bit x 32 bit

Instead of swapping both x and y in the end, you only need to swap w

And some math about the coef.

and more scaling

stdefeber · Post by **stdefeber** » Thu Oct 09, 2014 6:13 am

Thanks for the slides.

Why a direct form II ?
I thought these wee goed for floating point implementation ?

Post by **lilltroll** » Thu Oct 09, 2014 8:30 am

stdefeber wrote:Thanks for the slides.

Why a direct form II ?
I thought these wee goed for floating point implementation ?

:) That is the correct answer to that exercise !

The internal states of DF I can never overflow. If the input and true output signal is within the full scale range, all internal states must be so as well since we store the actual old values of input and output signal.
The DF-II needs guard bits (the scaling with alfa in the slides) to prevent overflow for a fixed point implementation.

The asm code above is of the form DF I, since it is safe.

Maybe Transposed DF-II could be interesting!?

We would need 1 ?? guard bits to prevent overflow in the Z1 state and in total 2 ?? guard bits for the adder above.

I have not analysed if there is any performance advantage in code speed if a T-DFII filter would be written in XMOS asm compared to the DFI. The memory usage of the internal states is maybe not so important.

stdefeber · Post by **stdefeber** » Thu Oct 09, 2014 11:11 am

IIR filters become unstable at lower frequencies. For 48000KHz sampling rate I read that should be around 300Hz.
Since I intend to filter in that range I was thinking of down sampling first, filter and then up-sample again.

Have you experienced stability issues at lower frequencies ?
Do you have experience with IIR half-band filters or up/down sampling ?

iir filter implementation

iir filter implementation

Re: iir filter implementation

Re: iir filter implementation

Re: iir filter implementation

Re: iir filter implementation

Re: iir filter implementation

Re: iir filter implementation

Re: iir filter implementation

Re: iir filter implementation

Re: iir filter implementation