Re: XUF224 channel comms

Posted: Thu Feb 15, 2018 5:21 pm
by peter
Have you tried calling:

Code: Select all

set_core_fast_mode_on();

on each of the tiles, so that a core going to sleep and waking up adds no extra latency? I would expect that to give the first and subsequent runs the same performance.

Re: XUF224 channel comms

Posted: Thu Feb 15, 2018 6:32 pm
by MyKeys
Hi Peter,

I've tried adding set_core_fast_mode_on() to the two tasks in the example from the original post.
Unfortunately it doesn't seem to make a difference.

Thanks,
Mike.

Re: XUF224 channel comms

Posted: Thu Feb 15, 2018 6:38 pm
by MyKeys
Perhaps someone with access to a quad tile device could run something similar to the original post?

I imagine there is a test example somewhere that demonstrates the best way to achieve maximum bandwidth in this configuration?

Re: XUF224 channel comms

Posted: Fri Feb 16, 2018 8:22 am
by cl-b
I have almost the same problem with the XU232-1024 chip (http://www.xcore.com/viewtopic.php?f=7&t=6414) and so far I have not found a solution.
I also contacted an XMOS FAE, but without success.
There is no technical documentation about inter-tile communication (latency, bandwidth, ...) and no test example.

Claire

Re: XUF224 channel comms

Posted: Fri Feb 16, 2018 12:44 pm
by MyKeys
Hi Claire,

I believe the problem I have is down to xlink bandwidth and buffering.
The following link provides some useful info, though it needs a little deciphering:

https://www.xmos.com/download/private/X ... 2.0%29.pdf

I believe the first-iteration issue is due to the link only being established when the first message is sent.
You would think declaring the channel as streaming would have this done up front, but alas that doesn't seem to be the case.
To mitigate this in the code from the original post, a sync message can be exchanged in both directions before the main loop of each task:

Code: Select all

// Establish channel connection
c <: 0;
c :> int _;

On my board I am using an XUF224 at 500 MHz with 5-wire xlinks at a 3-clock delay.
I believe the 3-clock delay is the transition time, which I think is the same as the symbol time.
There are 4 symbols per token in 5-wire mode, with 1 token being a byte of data in my case. Every 16 tokens sent require an extra credit token, so we have 17 tokens per 16 bytes.
If the 3 clocks are referenced to the 500 MHz clock then a transition is 6 ns.

Time to send 1 byte:

transition time (delay) * 4 transitions * 17 tokens / 16

So with the typical 7.5 ns delay (3 clocks at 400 MHz) the above gives a data rate of 31.372 MBytes/s, which agrees with the table in the document.
I wish they had clarified the terminology in the performance document.
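
For anyone who wants to plug in their own link settings, the per-byte formula above can be sketched in Python. The function name and parameters are made up for illustration, and the constants (4 symbols per token, 1 credit token per 16 data tokens, 1 byte per token) are just the assumptions stated in this post, not datasheet values.

```python
def xlink_bytes_per_second(transition_ns, symbols_per_token=4,
                           data_tokens_per_credit=16):
    """Estimate sustained xlink throughput in bytes/s.

    Assumes one byte per token and one extra credit token for every
    `data_tokens_per_credit` data tokens, as described in the post.
    """
    tokens_per_byte = (data_tokens_per_credit + 1) / data_tokens_per_credit
    ns_per_byte = transition_ns * symbols_per_token * tokens_per_byte
    return 1e9 / ns_per_byte


# 3 clocks at 400 MHz (7.5 ns) and 3 clocks at 500 MHz (6 ns)
for transition_ns in (7.5, 6.0):
    rate = xlink_bytes_per_second(transition_ns)
    print(f"{transition_ns} ns -> {rate / 1e6:.2f} MBytes/s")
```

With a 7.5 ns transition this comes out at about 31.37 MBytes/s, and with a 6 ns transition at about 39.2 MBytes/s, provided the assumptions above hold.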

I think the confusion in my original post is down to buffering: each xSwitch adds an additional 48-byte buffer.
The buffers in the tile can fill at instruction speed, whereas the buffers in the xSwitch after the xlink are limited by the xlink speed.
So when a channel blocks depends on the conditions above, which complicates timings, especially if you're sending small bursts of messages rather than continuous data.
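
To get a feel for why short bursts and continuous data behave differently, here is a rough Python sketch of a single buffer that fills at one rate and drains at another. It is only a fluid approximation of the guess described above, not a model of the real xSwitch: the function name is made up, the 250 MBytes/s fill rate is a hypothetical placeholder, and the 48-byte buffer and 31.37 MBytes/s drain rate are just the figures quoted in this post.

```python
def burst_send_time(n_bytes, fill_rate, drain_rate, buffer_bytes):
    """Time (s) until the sender has handed over the last byte of a burst.

    Fluid approximation: the sender writes at `fill_rate` bytes/s into a
    buffer of `buffer_bytes` that drains at `drain_rate` bytes/s. Once the
    buffer is full the sender is throttled to the drain rate.
    """
    if fill_rate <= drain_rate:
        return n_bytes / fill_rate            # buffer never fills
    time_to_fill = buffer_bytes / (fill_rate - drain_rate)
    bytes_before_full = fill_rate * time_to_fill
    if n_bytes <= bytes_before_full:
        return n_bytes / fill_rate            # burst fits, sender never blocks
    return time_to_fill + (n_bytes - bytes_before_full) / drain_rate


LINK_RATE = 31.37e6   # bytes/s, link figure from the calculation in the post
CORE_RATE = 250e6     # bytes/s, HYPOTHETICAL instruction-limited fill rate
BUFFER = 48           # bytes, one xSwitch buffer as quoted in the post

for burst in (16, 48, 64, 256):
    t = burst_send_time(burst, CORE_RATE, LINK_RATE, BUFFER)
    print(f"{burst:4d} bytes: {t * 1e9:8.1f} ns")
```

In this model small bursts complete at the fill rate, while anything larger than the buffer can absorb is dominated by the drain rate, which is the behaviour I think I am seeing.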

Of course I could be entirely wrong, but so far it's my best guess and it kind of agrees with my test measurements.

Mike.

Re: XUF224 channel comms

Posted: Fri Feb 16, 2018 1:04 pm
by MyKeys
One tip I forgot to mention: if you're sending bursts of data, as in the audio reference design, then, hardware permitting, you can use multiple channels to take advantage of the extra buffering the hardware provides. In the code from the original post you can halve the send time by adding a second streaming channel and splitting the data across the two. This works because printuintln takes time, which lets the buffers drain ready for the next iteration.
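
As a back-of-envelope illustration of the tip, the Python sketch below counts how many bytes of a burst would not fit in the available buffering and so would have to wait on the link. It assumes that each channel gets its own 48 bytes of buffering, which is my reading of the behaviour and not something I have seen documented. The function name and the 80-byte burst are made up for the example.

```python
def bytes_left_waiting(burst_bytes, channels, buffer_bytes_per_channel=48):
    """Bytes of a burst that do not fit in the combined channel buffering.

    ASSUMPTION: every channel has its own independent buffer of
    `buffer_bytes_per_channel` bytes and the burst is split across them.
    """
    total_buffer = channels * buffer_bytes_per_channel
    return max(0, burst_bytes - total_buffer)


for channels in (1, 2):
    waiting = bytes_left_waiting(80, channels)
    print(f"{channels} channel(s): {waiting} bytes must wait for the link")
```

If the assumption holds, a burst that overflows one channel's buffering may fit entirely once it is spread over two, as long as the buffers get time to drain between bursts.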

Hope that helps,
Mike.

Re: XUF224 channel comms

Posted: Fri Feb 16, 2018 1:51 pm
by cl-b
Thanks for all these explanations, but I still cannot understand why there is such a difference when the mixer thread is put on core 2 instead of 0 (with the same hardware). Is there additional overhead from using the xConnect switch between the two nodes?

I have to transfer 10*32 bits (10 audio samples). If the sample frequency is set to 192 kHz, the bandwidth I need is at most 7.68 MByte/s, which is much less than the theoretical throughput.
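
For reference, this is how I get that figure, written as a small Python sketch. The function name is made up for the example, and the link rate is simply the 31.37 MBytes/s value calculated earlier in the thread, not a measured number.

```python
def audio_bandwidth_bytes_per_second(channels, bits_per_sample, sample_rate_hz):
    """Payload bandwidth needed to move one sample per channel per frame."""
    return channels * (bits_per_sample // 8) * sample_rate_hz


required = audio_bandwidth_bytes_per_second(10, 32, 192_000)
link = 31.37e6   # bytes/s, link figure quoted earlier in the thread

print(f"required: {required / 1e6:.2f} MByte/s")
print(f"share of link rate: {100 * required / link:.1f} %")
```

So on paper the audio stream needs only about a quarter of the link rate.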

I think I missed something.

Claire

Re: XUF224 channel comms

Posted: Fri Feb 16, 2018 2:17 pm
by MyKeys
Using the xConnect switch means your sending will keep pausing while data is drained from the buffer (at a different and much slower rate than your instruction timing). When running on the same node (tiles 0 & 1) you never have to wait for the data to be drained from the buffer. The audio ref design sends samples in tight loops with #pragma loop unroll, so it's always going to pause in this situation if the channel count is greater than the local buffer size (before the xlink).

Whilst the xlink has way more bandwidth than you need, it's still far slower than consecutive instructions pumping in data. The audio ref design really isn't optimised for sending audio samples over the xlinks.

Hope that makes sense.
Mike.

Re: XUF224 channel comms

Posted: Fri Feb 16, 2018 2:42 pm
by cl-b
Where can we find information about the local buffer size and the rate at which the buffers drain?