Optimization by xC on for loops

Technical questions regarding the XTC tools and programming with XMOS.
tonetechnician
Member
Posts: 14
Joined: Mon Jan 30, 2017 11:09 am

Post by tonetechnician »

Hey everyone,

I have been running into an interesting issue with xC compiler optimization. I am implementing a matrix mixer on an xcore-200 processor, and it uses a for loop to mix the signals. The loop does not behave as expected: if I write a plain for loop with "#pragma loop unroll", it doesn't seem to iterate through the loop. Adding a counter and a printintln at the end of the loop seems to fix this; the printintln appears to force the compiler to iterate through the loop as expected at runtime.

This works:

Code:

char n = 0; // counter for number of channels active
/* Do Mixing */
#pragma loop unroll
for (int i = 0; i < MAX_SPEAKERS; i++){
    for (int j = 0; j < MAX_CHANNELS; j++){
        n++;
        /* mix samples */
        samples_mixed[i][j] = sample_out_buf[j]*fifoMixVals[i][j];
        samples_mixed[i][j] = samples_mixed[i][j] >> 8;     // Shift right to prevent overflow
        samples_out[i] += samples_mixed[i][j];
    }
    printintln(n);
}
printintln(n); // Force compiler to go through each iteration

This doesn't:

Code:

/* Do Mixing */
#pragma loop unroll
for (int i = 0; i < MAX_SPEAKERS; i++){
    for (int j = 0; j < MAX_CHANNELS; j++){
        /* mix samples */
        samples_mixed[i][j] = sample_out_buf[j]*fifoMixVals[i][j];
        samples_mixed[i][j] = samples_mixed[i][j] >> 8;     // Shift right to prevent overflow
        samples_out[i] += samples_mixed[i][j];
    }
}
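
For readers who want to experiment off-target, here is a minimal plain C sketch of the nested mixing loop (not xC, and not the original code). The array sizes, the `int32_t` types and the function name `mix_matrix` are assumptions, since the thread does not show the declarations. Two details are changed on purpose and are worth ruling out as causes, though neither is confirmed here: the product is widened to 64 bits before the shift (the shift cannot prevent an overflow that already happened in a 32-bit multiply), and each output starts from a fresh accumulator instead of adding onto whatever `samples_out[i]` held before.

```c
#include <stdint.h>

#define MAX_SPEAKERS 2   /* assumed size, not given in the thread */
#define MAX_CHANNELS 4   /* assumed size, not given in the thread */

/* Hypothetical helper: mixes MAX_CHANNELS inputs into MAX_SPEAKERS outputs.
 * Gains are assumed to be fixed point with 8 fractional bits (256 == unity). */
void mix_matrix(const int32_t in[MAX_CHANNELS],
                int32_t gains[MAX_SPEAKERS][MAX_CHANNELS],
                int32_t out[MAX_SPEAKERS])
{
    for (int i = 0; i < MAX_SPEAKERS; i++) {
        int64_t acc = 0;   /* fresh accumulator for every output */
        for (int j = 0; j < MAX_CHANNELS; j++) {
            /* Widen before multiplying so the product cannot overflow 32 bits. */
            int64_t product = (int64_t)in[j] * gains[i][j];
            /* Right-shifting a negative value is implementation-defined in C;
             * common compilers perform an arithmetic shift. */
            acc += product >> 8;
        }
        out[i] = (int32_t)acc;   /* assumes the mixed sum fits in 32 bits */
    }
}
```

Calling `mix_matrix` with unity gains (256) on every channel should return the plain sum of the inputs, which makes a quick sanity check at any optimization level.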

Now, when trying a similar thing on another for loop in my code, I don't get the expected behaviour at all, even after adding the printintln fix.

Code:

#pragma loop unroll
char count = 0;
for (int j = 0; j < 4; j++){
    count++;
    ambi_dec_mix[0][j] = sample_out_buf[j]*decoderVals[0][j];
    ambi_dec_mix[1][j] = sample_out_buf[j]*decoderVals[1][j];
    samples_out[0] += (ambi_dec_mix[0][j]);
    samples_out[1] += (ambi_dec_mix[1][j]);
}
printintln(count);

So I have to resort to this, which is definitely not a flexible solution:

Code:

samples_out[0] = (sample_out_buf[0]*decoderVals[0][0]) +
                 (sample_out_buf[1]*decoderVals[0][1]) +
                 (sample_out_buf[2]*decoderVals[0][2]) +
                 (sample_out_buf[3]*decoderVals[0][3]);

samples_out[1] = (sample_out_buf[0]*decoderVals[1][0]) +
                 (sample_out_buf[1]*decoderVals[1][1]) +
                 (sample_out_buf[2]*decoderVals[1][2]) +
                 (sample_out_buf[3]*decoderVals[1][3]);

Does anyone have any idea why this keeps happening, or how I could make it handle a for loop as expected?
Last edited by tonetechnician on Fri May 26, 2017 12:43 pm, edited 2 times in total.
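
One difference between the two forms posted above: the loop accumulates with `+=`, while the hand-written version assigns with `=`. They only produce the same result if `samples_out` is cleared before the loop runs, and the posted fragment does not show that. Whether this explains the behaviour in the thread is not confirmed. The plain C sketch below (not xC) puts both forms side by side so they can be compared at different optimization levels; the function names, the `int32_t` types and the sizes are assumptions for illustration.

```c
#include <stdint.h>

#define NUM_CH  4   /* loop bound used in the post */
#define NUM_OUT 2   /* two outputs, as in the post */

/* Loop form. The outputs are cleared first so that repeated calls
 * do not build on the previous result. */
void decode_loop(const int32_t in[NUM_CH],
                 int32_t vals[NUM_OUT][NUM_CH],
                 int32_t out[NUM_OUT])
{
    for (int k = 0; k < NUM_OUT; k++) {
        out[k] = 0;
        for (int j = 0; j < NUM_CH; j++) {
            out[k] += in[j] * vals[k][j];
        }
    }
}

/* Hand-unrolled form, mirroring the workaround in the post. */
void decode_unrolled(const int32_t in[NUM_CH],
                     int32_t vals[NUM_OUT][NUM_CH],
                     int32_t out[NUM_OUT])
{
    out[0] = (in[0] * vals[0][0]) + (in[1] * vals[0][1]) +
             (in[2] * vals[0][2]) + (in[3] * vals[0][3]);
    out[1] = (in[0] * vals[1][0]) + (in[1] * vals[1][1]) +
             (in[2] * vals[1][2]) + (in[3] * vals[1][3]);
}
```

Running both functions on the same inputs, more than once on the same output array, should give identical results. If the clearing step is removed from `decode_loop`, the second call no longer matches.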


mon2
XCore Legend
Posts: 1913
Joined: Thu Jun 10, 2010 11:43 am

Post by mon2 »

Very difficult to read the posted code.

https://www.xmos.com/support/tools/docu ... nent=14787

Try the following:

1) Change the compiler optimization flags, then recompile and check the behaviour again.

2) Here is how you can alter the optimization flags:

Image

Please post the results of this testing for future readers. Try flags such as -O0, -O2 and -O3.

Post by tonetechnician »

Thanks so much for the reply.

I've tried to increase the size of these images, but I'm not sure why they are so small. I will try to fix them again.

I have tried the different compiler flags, -O0 in particular, and it doesn't seem to help :/

My flags are

XCC_FLAGS = -O0 -save-temps -g -report -fxscope -DRGMII=1 -lquadflash

I see there is also an XCC_FLAGS_main.xc which is set as:

XCC_FLAGS_main.xc = $(XCC_FLAGS) -falways-inline -O3

I have changed the XCC_FLAGS_main.xc to

XCC_FLAGS_main.xc = $(XCC_FLAGS) -falways-inline -O0

I'm recompiling now, and will let you know.

Post by tonetechnician »

Alright!

It seems the -O3 in XCC_FLAGS_main.xc was causing my issue.

Code:

XCC_FLAGS = -O0 -save-temps -g -report -fxscope -DRGMII=1 -lquadflash
XCC_FLAGS_main.xc = $(XCC_FLAGS) -falways-inline -O0

These settings seem to make the for loop work. Unfortunately, this makes the code larger, but that is fine for my application as it stands.

Hope this helps any future readers
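
A possible reading of why the printintln changed things (an assumption, not something confirmed in this thread): an optimizing compiler is allowed to drop work whose result is never observed, and a print makes the result observable. In plain C, a store to a `volatile` object is also observable behaviour, so it can stand in for the print when checking a loop at higher optimization levels. The sketch below is plain C, not xC; `debug_sink`, `mix4` and `mix_and_publish` are hypothetical names, and the 8-bit fixed-point scaling is assumed from the `>> 8` in the first post.

```c
#include <stdint.h>

#define NUM_CH 4

/* Hypothetical debug sink. Stores to a volatile object are observable
 * behaviour in C, so the compiler has to perform them. */
volatile int32_t debug_sink;

/* Mixes NUM_CH samples with per-channel gains and scales the sum by 1/256. */
int32_t mix4(const int32_t in[NUM_CH], const int32_t gain[NUM_CH])
{
    int64_t acc = 0;
    for (int j = 0; j < NUM_CH; j++) {
        acc += (int64_t)in[j] * gain[j];   /* 64-bit product and sum */
    }
    return (int32_t)(acc >> 8);            /* assumes the result fits in 32 bits */
}

/* Publishes the result through the volatile sink instead of printing it. */
void mix_and_publish(const int32_t in[NUM_CH], const int32_t gain[NUM_CH])
{
    debug_sink = mix4(in, gain);
}
```

If the mixed samples are genuinely consumed elsewhere (for example, sent to an output), the compiler cannot remove the loop anyway, and this kind of sink is only useful for debugging.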

Post by mon2 »

Excellent! Perhaps someone else will have better advice, but for now it moves you forward :)