How to use wider port and linking tile together on startKIT

quyenhuynh
Member
Posts: 12
Joined: Thu Dec 22, 2016 9:36 am

Post by quyenhuynh »

I'm working on a project involving ultrasound waves. My aim is to generate 40 kHz square waves with a separate phase delay for each transducer, using the code below. At the moment this code drives 8 transducers on a startKIT (it is based on code from @henk), but I want to increase the number of transducers as much as possible. Currently I am using all 8 cores of tile[0].
Could someone show me how to use wider ports, and tell me the maximum number of pins I can use? (I want to use this code for 50 transducers.)
Thanks all

Code: Select all

#include <xs1.h>

port p0 = XS1_PORT_1E;
port p1 = XS1_PORT_1F;
port p2 = XS1_PORT_1G;
port p3 = XS1_PORT_1H;
port p4 = XS1_PORT_1I;
port p5 = XS1_PORT_1J;
port p6 = XS1_PORT_1K;
port p7 = XS1_PORT_1D;

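// Emit 101 periods of a 40 kHz square wave on port p.
// With the 100 MHz reference clock one period is 2500 ticks,
// so each half-period is 1250 ticks.
void forty(port p, unsigned short edge_clk_cnt) {
    for (int i = 0; i <= 100; i++) {
        p @ edge_clk_cnt <: 0;
        edge_clk_cnt += 1250;
        p @ edge_clk_cnt <: 1;
        edge_clk_cnt += 1250;
    }
}

int main(void) {
    while (1) {
        // Phase offsets in port-clock ticks: (ticks per period * degrees) / 360.
        unsigned short phase0 = (2500 * 0)/360;
        unsigned short phase1 = (2500 * 10)/360;
        unsigned short phase2 = (2500 * 20)/360;
        unsigned short phase3 = (2500 * 30)/360;
        unsigned short phase4 = (2500 * 40)/360;
        unsigned short phase5 = (2500 * 50)/360;
        unsigned short phase6 = (2500 * 60)/360;
        // Assumed to be 70 to continue the 10-degree steps (was 60, same as phase6).
        unsigned short phase7 = (2500 * 70)/360;

        unsigned int current;

        // Read the current port counter, then start 5000 ticks in the future.
        p0 <: 0 @ current;
        current += 5000;

        par {
            forty(p0, phase0 + current);
            forty(p1, phase1 + current);
            forty(p2, phase2 + current);
            forty(p3, phase3 + current);
            forty(p4, phase4 + current);
            forty(p5, phase5 + current);
            forty(p6, phase6 + current);
            forty(p7, phase7 + current);  // was p6, which used p6 twice and left p7 idle
        }
    }

    return 0;
}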






infiniteimprobability
XCore Legend
Posts: 1126
Joined: Thu May 27, 2010 10:08 am

Post by infiniteimprobability »

What sort of resolution do you need? If you use wider ports, you lose the luxury of having a timer per port/pin. However, you can do some neat tricks with wider ports to support playing back bitstreams. The key is to use the port buffers. For example:

outputting on:

Code: Select all

out buffered port:32 out_port = XS1_PORT_8A;
out_port <: 0xFFF0F1FE;

will result in the 8b port outputting the values 0xFE, 0xF1, 0xF0, 0xFF, one per port clock, least significant byte first. The buffer size is specified by the :32 part of the declaration. So you can pre-compute the bitstream and play it back on a wider port quite efficiently, with a single out instruction for 32b worth of output data (in the above example that's 4 x 8b).
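To make the output order concrete, here is a minimal host-side model in plain C (not XC, and not an XMOS API). It only mimics how a 32-bit buffered word is split into port-width values, lowest bits first; the function name `unpack_word` is made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Host-side model (hypothetical helper, not an XMOS API): split one
 * 32-bit buffered word into the values an n-bit port would drive,
 * least significant bits first. Returns the number of port clocks used. */
static int unpack_word(uint32_t word, int port_width, uint32_t out[32])
{
    assert(port_width == 1 || port_width == 4 || port_width == 8 ||
           port_width == 16 || port_width == 32);
    uint32_t mask = (port_width == 32) ? 0xFFFFFFFFu
                                       : ((1u << port_width) - 1u);
    int cycles = 32 / port_width;
    for (int i = 0; i < cycles; i++) {
        out[i] = word & mask;
        /* A shift by 32 is undefined in C, so handle that case explicitly. */
        word = (port_width == 32) ? 0u : (word >> port_width);
    }
    return cycles;
}
```

Calling `unpack_word(0xFFF0F1FE, 8, out)` fills `out` with 0xFE, 0xF1, 0xF0, 0xFF, in that order, over four port clocks.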

There is an example of using a different clock input for the port here:

https://www.xmos.com/published/xc-port-buffering

You can use the XS2 zip/unzip instructions (accessed via C built-ins) to pack the words efficiently:
https://www.xmos.com/download/private/A ... rc1%29.pdf

Have a read of this too, which I found helpful:
https://github.com/xcore/doc_tips_and_t ... treams.rst
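As a rough sketch of what "pre-compute the bitstream" could look like for this application, the plain C below (host-side, not XC) builds one 40 kHz period for eight channels on an 8-bit port. It assumes a 100 MHz port clock (2500 ticks per period, as in the original code), one transducer per port bit, and the same low-then-high waveform as `forty()`. The names `level_at` and `build_period` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PERIOD_TICKS 2500u                 /* 100 MHz port clock / 40 kHz (assumed) */
#define HALF_TICKS   1250u
#define CHANNELS     8u                    /* one channel per bit of an 8-bit port */
#define WORDS        (PERIOD_TICKS / 4u)   /* four 8-bit samples per 32-bit word */

/* Level (0 or 1) of a channel at tick t when its waveform is delayed by
 * phase_ticks. As in forty(), each period starts low, then goes high. */
static unsigned level_at(unsigned t, unsigned phase_ticks)
{
    unsigned pos = (t + PERIOD_TICKS - (phase_ticks % PERIOD_TICKS)) % PERIOD_TICKS;
    return pos >= HALF_TICKS;
}

/* Build one full period as 32-bit words for a buffered 8-bit port. */
static void build_period(const unsigned phase_ticks[CHANNELS], uint32_t words[WORDS])
{
    for (unsigned w = 0; w < WORDS; w++) {
        uint32_t word = 0;
        for (unsigned s = 0; s < 4u; s++) {
            unsigned t = w * 4u + s;
            uint32_t sample = 0;
            for (unsigned ch = 0; ch < CHANNELS; ch++)
                sample |= (uint32_t)level_at(t, phase_ticks[ch]) << ch;
            word |= sample << (8u * s);    /* earliest sample goes in the low byte */
        }
        words[w] = word;
    }
}
```

The resulting 625 words could then be played back repeatedly, one word per output statement. Changing a phase means rebuilding the table, so how quickly phases must be updated decides whether this is practical.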
quyenhuynh
Member
Posts: 12
Joined: Thu Dec 22, 2016 9:36 am

Post by quyenhuynh »

Thanks @infiniteimprobability. I only need a resolution of a few ns. As in the code above, I am using the 100 MHz reference clock. I want to generate 40 kHz for many transducers at the same time, but the startKIT only supports 8 cores on tile[0]. Can XC port buffering do this?
mon2
XCore Legend
Posts: 1913
Joined: Thu Jun 10, 2010 11:43 am

Post by mon2 »

Here is another suggestion.

1) continue to map your transducers as you have working now with 8 per StartKit.

2) daisy chain connect additional StartKits to expand your chain of transducers.

3) communicate between the chained StartKits using LVDS signalling over the xlink interface.

We have completed a design with LVDS links and can share more details on request. This is a slice board we have designed.
infiniteimprobability
XCore Legend
Posts: 1126
Joined: Thu May 27, 2010 10:08 am

Post by infiniteimprobability »

You will definitely need more than a startKIT to hit around 50 outputs. There are probably enough I/O pins, but not the right hardware behind them to sustain the data. Either you can go mon2's route and chain multiple kits together (by xlink or just plain old SPI) and use the code you have, or you can go to an XS2 (xCORE-200) based approach and choose a device with 2 tiles and use wide, buffered ports: say 24 outputs per tile (48 in total) using 6 x 4b ports, or 32 outputs per tile (64 in total) using 4 x 8b ports. A single explorer kit would work for this.

However, assuming you want to produce phase controlled square waves (so they are not reset synchronised), then using buffered, wide, ports may introduce timing compromises.

What are your design specs?
  • Resolution would typically be 8, 10 or 12 ns (depending on whether the port clock was /4, /5 or /6), which gives 480 ppm phase granularity or better. Is this good enough?
  • Output square wave frequency could be exactly 40.00000 kHz (using 8b ports and a 100 MHz clock), but most other combinations give around 1600 ppm error (39.94 kHz or 40.064 kHz). What deviation can you tolerate?
  • Do you really need 50 outputs, or is 48 enough?
  • How often do the outputs need to be updated (modulated)?
  • Do outputs need to be changed synchronously? (I.e. do all 50 need to be changed within the same 40 kHz period?)
  • How much work are you prepared to do on this module?
The answers to the above will help guide the advice that others and I can provide!
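For reference, the phase granularity figures can be checked with a few lines of plain C. This is only arithmetic on the numbers quoted above: one 40 kHz period is 25000 ns, and a tick of 8, 10 or 12 ns is expressed as a fraction of that period. The helper names are made up for this sketch.

```c
#include <assert.h>

#define PERIOD_NS 25000L   /* one 40 kHz period in nanoseconds */

/* Phase granularity, in parts per million of one 40 kHz period,
 * for a port-clock tick of tick_ns nanoseconds (integer division). */
static long phase_step_ppm(long tick_ns)
{
    assert(tick_ns > 0);
    return tick_ns * 1000000L / PERIOD_NS;
}

/* Number of whole ticks, i.e. distinct phase settings, in one period. */
static long phase_steps(long tick_ns)
{
    assert(tick_ns > 0);
    return PERIOD_NS / tick_ns;
}
```

With a 10 ns tick this gives 2500 phase steps per period (0.144 degrees each) and 400 ppm; the 12 ns tick gives the 480 ppm worst case.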
quyenhuynh
Member
Posts: 12
Joined: Thu Dec 22, 2016 9:36 am

Post by quyenhuynh »

Thanks @mon2. I think expanding the number of pins by linking multiple startKITs is the most suitable solution for me right now. Can you share more details about the LVDS links? (I want to link about 8 startKITs using the xlink interface.)
quyenhuynh
Member
Posts: 12
Joined: Thu Dec 22, 2016 9:36 am

Post by quyenhuynh »

Thanks @infiniteimprobability.
My research uses ultrasound waves generated by many transducers to levitate an object in a medium.
- I want to apply ultrasound in the medical field, so I need very high accuracy. For that reason the resolution should be as high as possible, but I think about 10 ns is good enough for my project.
- The transducers I'm using operate at 40 kHz. The tolerance they can accept is not mentioned in the datasheet, but I think the frequency should be as close to 40 kHz as possible.
- I want as many outputs as possible, so that the force created is large enough for me to measure. In practice, many researchers have used 64 or 128 transducers.
- The outputs are square waves with a different phase delay for each transducer, and the phase delays will be updated continuously so that I can change the position at which the object levitates. The program on the XMOS device will interface with MATLAB (MATLAB's task is to calculate the phase delay for each transducer) to receive the phase delays.
- At present I have built a transducer array with 25 sensors, all controlled by a Raspberry Pi, but it does not seem to produce enough force to levitate an object or to observe the effect. The 25 square waves generated by the Raspberry Pi are also not stable, and I cannot expand the number of pins on that board. So I have been studying XMOS as an improved approach for my research.
quyenhuynh
Member
Posts: 12
Joined: Thu Dec 22, 2016 9:36 am

Post by quyenhuynh »

Dear @infiniteimprobability, do you know how to use the combinable functions on the startKIT? They "can be combined to have several tasks running on the same logical core. The core swaps context based on cooperative multitasking between the tasks driven by the compiler".
I intend to use combinable functions to generate many waves on many ports with only one core, but it seems I still don't fully understand them. Please give me some advice about this. This is what I have in mind:
int main () {
    par {
        on tile[0].core[0]: forty(p0, phase0 + current);
        on tile[0].core[0]: forty(p1, phase0 + current);
    }
    return 0;
}
infiniteimprobability
XCore Legend
Posts: 1126
Joined: Thu May 27, 2010 10:08 am

Post by infiniteimprobability »

Do you know how to use combinable functions of startKIT? " can be combined to have several tasks running on the same logical core. The core swaps context based on cooperative multitasking between the tasks driven by the compiler".
Combinable only works for a while(1) { select { ... } } construct, which allows multiple tasks to be flattened into one big select. You can only select on a port input (not an output), and the current code that Henk provided relies on timed outputs, which are inherently blocking operations. So that doesn't work, I'm afraid. It might be possible, with some fast sorting code (to allow the transitions to be output in time order), to have a single core handle more than one 1-bit port, but this will be a manual task (the compiler won't do it for you). At best, this approach will allow you to generate 16 timed outputs per tile using a 1-bit port per output (the startKIT has one user tile), but it shouldn't take too much work.
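The sorting step could look roughly like the plain C sketch below (host-side, not XC). It assumes 2500 ticks per period as in the original code, and it only builds the time-ordered list of edges; a single core would then walk this list and issue one timed output per entry. The `struct edge` layout and the `schedule` function are hypothetical names for this illustration.

```c
#include <assert.h>
#include <stdlib.h>

#define PERIOD_TICKS 2500u   /* 100 MHz reference clock / 40 kHz (assumed) */
#define HALF_TICKS   1250u

/* One output transition: when it happens, on which port, and to what level. */
struct edge {
    unsigned tick;    /* time within the period, in port-clock ticks */
    unsigned port;    /* index of the 1-bit port */
    unsigned level;   /* value to drive: 0 or 1 */
};

/* qsort comparator: order edges by time. */
static int by_tick(const void *a, const void *b)
{
    const struct edge *ea = a;
    const struct edge *eb = b;
    return (ea->tick > eb->tick) - (ea->tick < eb->tick);
}

/* Fill edges[] with two transitions per port (falling at the phase
 * offset, rising half a period later) and sort them into time order.
 * edges[] must have room for 2 * n entries. */
static void schedule(const unsigned phase[], unsigned n, struct edge edges[])
{
    for (unsigned i = 0; i < n; i++) {
        edges[2u * i]      = (struct edge){ phase[i] % PERIOD_TICKS, i, 0u };
        edges[2u * i + 1u] = (struct edge){ (phase[i] + HALF_TICKS) % PERIOD_TICKS, i, 1u };
    }
    qsort(edges, 2u * n, sizeof edges[0], by_tick);
}
```

Edges that fall on the same tick, or only a few ticks apart, are the hard part on real hardware, because the core needs time between consecutive timed outputs. This sketch does not address that.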
infiniteimprobability
XCore Legend
Posts: 1126
Joined: Thu May 27, 2010 10:08 am

Post by infiniteimprobability »

- I want to apply ultrasound in the medical field, so I need very high accuracy. For that reason the resolution should be as high as possible, but I think about 10 ns is good enough for my project.
Thanks. The reason for my questions is that, using an xCORE-200, I think you can achieve 24 outputs per tile using 6 x 4b ports and 6 tasks. This would likely rely on dual issue, so it would require an xCORE-200.
- The transducers I'm using operate at 40 kHz. The tolerance they can accept is not mentioned in the datasheet, but I think the frequency should be as close to 40 kHz as possible.
I did some calculations, and I think that using a 480 MHz core clock with a 96 MHz ref clock, an exact 40 kHz can be achieved using the buffered port approach.
- I want as many outputs as possible, so that the force created is large enough for me to measure. In practice, many researchers have used 64 or 128 transducers.
OK. My best guess is that, with your ~10 ns accuracy target, you can get at best 32 outputs per tile on an XS2 (xCORE-200). There is definitely some work needed to get to this number.
- The outputs are square waves with a different phase delay for each transducer, and the phase delays will be updated continuously so that I can change the position at which the object levitates. The program on the XMOS device will interface with MATLAB (MATLAB's task is to calculate the phase delay for each transducer) to receive the phase delays.
OK, fine.
- At present I have built a transducer array with 25 sensors, all controlled by a Raspberry Pi, but it does not seem to produce enough force to levitate an object or to observe the effect. The 25 square waves generated by the Raspberry Pi are also not stable, and I cannot expand the number of pins on that board. So I have been studying XMOS as an improved approach for my research.
On balance, if you don't want to spend a fair chunk of time developing custom bit pattern generator code, then mon2's approach sounds good, especially given the cost of startKITs vs anything else.
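On the exact 40 kHz point, here is a small plain C sketch of the arithmetic. It rests on one assumption that the thread does not state explicitly: with a 32-bit buffered port the waveform period has to be a whole number of 32-bit words, so the period in ticks must be a multiple of 32 divided by the port width. Under that assumption the 100 MHz and 96 MHz figures mentioned in this thread fall out directly. All function names are hypothetical.

```c
#include <assert.h>

/* Port-clock ticks consumed by one 32-bit buffered word. */
static long long ticks_per_word(int port_width)
{
    assert(port_width == 4 || port_width == 8);
    return 32 / port_width;
}

/* Period, in ticks, closest to clk_hz / target_hz that is a whole number
 * of 32-bit words (assumed constraint). Assumes clk_hz is an exact
 * multiple of target_hz; exact halves round up. */
static long long nearest_period_ticks(long long clk_hz, long long target_hz,
                                      int port_width)
{
    long long tpw = ticks_per_word(port_width);
    long long ideal = clk_hz / target_hz;
    long long words = (ideal + tpw / 2) / tpw;
    return words * tpw;
}

/* Frequency error of the achieved period in ppm, truncated toward zero.
 * Positive means the output is faster than the target. */
static long long error_ppm(long long clk_hz, long long target_hz,
                           long long period_ticks)
{
    long long denom = period_ticks * target_hz;
    return (clk_hz - denom) * 1000000LL / denom;
}
```

With a 100 MHz clock and 8b ports the period is 2500 ticks, which is exact. With 100 MHz and 4b ports it is 2504 ticks (about 39.936 kHz, roughly -1597 ppm), and with 96 MHz and 4b ports it is 2400 ticks, which is exact again.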