Which Program Flow?

Technical questions regarding the XTC tools and programming with XMOS.
Folknology
XCore Legend
Posts: 1274
Joined: Thu Dec 10, 2009 10:20 pm

Post by Folknology »

Good news! So what are you getting frequency-wise now with the replicator and 4 concurrent threads?

regards
Al


rp181
Respected Member
Posts: 395
Joined: Tue May 18, 2010 12:25 am

Post by rp181 »

I get the same rate (> 900 kHz) with and without a replicator. Without it, I just wrote out all of the lines manually, which I would think does the same thing.

Some other questions I thought of:
1) Does optimizing with the compiler have a downside? If not, why isn't the highest level always on?
2) What is a streaming channel's buffer size?
Folknology
XCore Legend
Posts: 1274
Joined: Thu Dec 10, 2009 10:20 pm

Post by Folknology »

Sorry, no, my bad. I thought you were using the older sequential code to call ProcessorThread, as I didn't see a new main. The replicator gives the same results as manual par entries; it's just neater and more scalable, as you can very simply make the number of ProcessorThreads variable.

I will leave the optimisation question for XMOS, but I would imagine a compile takes longer with higher optimisation levels.
Streaming channels are limited resources, and underneath is a credit system for parts of those resources, so I'm not sure that has an easy answer. It may depend on the number of streaming channels in operation, but XMOS will probably answer that better.

Glad you're getting good results.

By the way, a macro would just replace the function call, or you could do the calculation directly to lose the function overhead:

Code:

for (counter1 = 0; counter1 < 14; counter1 += 2) {
    ret = (1000000 * (currentData[counter1] - currentData[counter1+1]) / 1000000 * (currentData[counter1] + currentData[counter1+1]));
}
regards
Al
rp181
Respected Member
Posts: 395
Joined: Tue May 18, 2010 12:25 am

Post by rp181 »

I didn't realize function calls were so bad! I made it into a macro, and removed extraneous timer statements, and it is now 3 MHz!

Code:

void ProcessorThread(streaming chanend usbOut, int quadrant) {
	short counter1;
	short ret;
	short a;
	short b;

	timer t;
	unsigned long time;
	unsigned long time5;


	short cycles;
	t :> time;
	for (cycles = 0; cycles < 10; cycles++) {
		for (counter1 = 0; counter1 < 14; counter1+=2) {
			a = ((readADC(counter1)+readADC(counter1)+readADC(counter1)+readADC(counter1))/4);
			b = ((readADC(counter1+1)+readADC(counter1+1)+readADC(counter1+1)+readADC(counter1+1))/4);
			//ret = processPair(readNormalizedADC(counter1),readNormalizedADC(counter1+1));
			ret = (1000000 * (a - b) / 1000000 * (a + b));
		}
	}
	t :> time5;

	printf("Time Quadrant %i: %ld\n",(int)quadrant,((time5-time)));
}
This makes me think I am doing something wrong... :shock:

If this is correct, I may, depending on how fast the channel is to the USB thread, have to slow it down to give the ADC time to refresh...

EDIT: With a streaming channel to a USB thread, it is 1.6 MHz.
Folknology
XCore Legend
Posts: 1274
Joined: Thu Dec 10, 2009 10:20 pm

Post by Folknology »

Also, as you get nearer to the real thing you will lose the readNormalizedADC() overheads, as you will run the ProcessorThreads in response to inputs from a select, which would be a case on the ADC input port. This can all effectively be inlined by converting it to a select function, which will likely whistle through the data faster than the ADC can supply it. You need not worry about adding delays etc., as it will become event driven and thus the threads will pause whilst waiting for data (this is a good thing). Actually, this could leave some thread capacity for your actuation thread(s).

regards
Al
segher
XCore Expert
Posts: 844
Joined: Sun Jul 11, 2010 1:31 am

Post by segher »

return (1000000 * (a - b) / 1000000 * (a + b));

That's not doing what you want, you multiply by a+b instead of dividing by it.
The division by a constant will be optimised to a multiply by the compiler (and
completely optimised away in this case). You shouldn't divide by the scale
factor anyway. The code you want is:

return 1000000 * (a-b) / (a+b);

(it might help a little if you used a power of two instead of the 1000000, fwiw).
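
A minimal C sketch of the difference between the two expressions, assuming a 32-bit int; the function names are hypothetical and not from the thread:

```c
#include <stdint.h>

/* Original expression. In C, * and / have equal precedence and associate
   left to right, so this evaluates as
   ((1000000 * (a - b)) / 1000000) * (a + b), which is (a - b) * (a + b). */
static int32_t ratio_original(int32_t a, int32_t b) {
    return (1000000 * (a - b) / 1000000 * (a + b));
}

/* Corrected expression: scale the difference, then divide by the sum.
   Assumes a + b != 0 and that 1000000 * (a - b) fits in 32 bits. */
static int32_t ratio_scaled(int32_t a, int32_t b) {
    return 1000000 * (a - b) / (a + b);
}
```

With the made-up sample values a = 300 and b = 100, ratio_original gives 80000 (that is, 200 * 400), while ratio_scaled gives 500000, which is the ratio 0.5 scaled by one million.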
rp181
Respected Member
Posts: 395
Joined: Tue May 18, 2010 12:25 am

Post by rp181 »

So will the intermediary value still retain the decimal points? I thought I had to multiply both first, as the processor isn't floating point. I wouldn't have realized that using a power of two would help, but I will use 1048576 (2^20).
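
On the question of keeping the fractional part: a rough C sketch of the integer (fixed-point) idea, assuming 16-bit samples and 32-bit arithmetic. The names SCALE_SHIFT, ratio_q20, ratio_divide_first and q20_to_permille are made up for illustration:

```c
#include <stdint.h>

#define SCALE_SHIFT 20                         /* hypothetical name: scale = 2^20 = 1048576 */
#define SCALE ((int32_t)1 << SCALE_SHIFT)

/* (a - b) / (a + b) as an integer with 20 fractional bits.
   Multiplying before dividing keeps the fraction in the result.
   Assumes a + b != 0 and |a - b| < 2048 so the product fits in 32 bits. */
static int32_t ratio_q20(int16_t a, int16_t b) {
    return SCALE * ((int32_t)a - b) / ((int32_t)a + b);
}

/* Dividing first truncates the fraction to zero before it is scaled. */
static int32_t ratio_divide_first(int16_t a, int16_t b) {
    return ((int32_t)a - b) / ((int32_t)a + b) * SCALE;
}

/* Convert a 20-fractional-bit value to parts per thousand, e.g. for printing. */
static int32_t q20_to_permille(int32_t q) {
    return (int32_t)(((int64_t)q * 1000) / SCALE);
}
```

For the made-up inputs a = 300 and b = 100, ratio_q20 returns 524288 (0.5 * 2^20), whereas ratio_divide_first returns 0 because 200 / 400 truncates to zero in integer arithmetic.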
Folknology
XCore Legend
Posts: 1274
Joined: Thu Dec 10, 2009 10:20 pm

Post by Folknology »

Intent would be clearer with:

ret = 0x100000u * (a-b) / (a+b);

regards
Al
Interactive_Matter
XCore Addict
Posts: 216
Joined: Wed Feb 10, 2010 10:26 am

Post by Interactive_Matter »

rp181 wrote:I didn't realize function calls were so bad! I made it into a macro, and removed extraneous timer statements
Completely unrelated question but still on topic:

Is a function call that bad?
Even if I say it is 'static inline'?
Or is a macro the only real guarantee for inlining?

Thanks

Marcus
segher
XCore Expert
Posts: 844
Joined: Sun Jul 11, 2010 1:31 am

Post by segher »

Folknology wrote:Intent would be clearer with:

ret = 0x100000u * (a-b) / (a+b);
That doesn't work. The division has to be signed; making the constant unsigned like this
makes the multiplication unsigned (which is fine), and then the division unsigned as well
(which is not fine).

Don't use the U.
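
A short C sketch of the signed versus unsigned behaviour, assuming int is 32 bits; the function names are hypothetical:

```c
#include <stdint.h>

/* Unsigned constant: the usual arithmetic conversions convert (a - b) and
   (a + b) to unsigned, so a negative difference wraps around to a large
   positive value and the division is carried out unsigned. */
static int32_t ratio_unsigned_const(int32_t a, int32_t b) {
    return (int32_t)(0x100000u * (a - b) / (a + b));
}

/* Signed constant: multiplication and division both stay signed.
   Assumes a + b != 0 and |a - b| < 2048 so the product fits in 32 bits. */
static int32_t ratio_signed_const(int32_t a, int32_t b) {
    return 0x100000 * (a - b) / (a + b);
}
```

With the made-up values a = 100 and b = 300, the signed version returns -524288 (-0.5 * 2^20), while the unsigned version returns 10213130. For a non-negative difference the two agree.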