synchronize Outputs

grosdodde
Member
Posts: 14
Joined: Tue Nov 24, 2015 4:33 pm

Post by grosdodde »

Hello everybody,
I have to output more than 20 signals (100 kHz) with exact phase shifts between them. To do this I need to synchronize the tasks. I read some threads and found three ways:

1. Using channels to trigger the tasks. Is that fast enough to do the job with more than 20 tasks at 200 kHz?

2. Using the same clock. But can I only share a clock within the same core / tile?

3. Dedicating one task to generating an output on a pin and reading that pin from the 20+ tasks. But XMOS does not allow ports to be shared, does it?

Any other ideas?
hkr87
Member
Posts: 12
Joined: Tue Oct 13, 2015 11:36 am

Post by hkr87 »

I would use port counters: connect all ports to one clock block, set that clock block to 100 kHz (or the frequency that gives you the right phase shift), start the clock block, and then all ports can be driven by independent tasks using port counters.

All port counters will start at 0 and will increment in sync.

Post by grosdodde »

I'm not sure if I have understood correctly. Can you describe it more in detail?

The signal frequency is around 100 kHz, but the phase shift is only 40 ns (25 MHz).
Can I connect all ports to one clock across several tiles and tasks?

Post by hkr87 »

The method below works on a single tile with multiple tasks and ports.

1. Set one clock block to a 25 MHz rate

2. Attach all ports to this clock block

3. Start the clock block

Now you can use this in one task:

Code: Select all

  count1 = 12345;
  port1 @ count1 <: 1;
and in a different task:

Code: Select all

  count2 = 12344;
  port2 @ count2 <: 0;
And port2 will go low exactly 40 ns before port1 goes high, because count 12344 is one 25 MHz tick earlier than count 12345.

The port time is only a 16-bit value (a short), so you cannot schedule things more than 65535*40 ns = 2.6 ms ahead. But all counters wrap around at the same time, so as long as you never add more than 65535 at a time it should all work out. You can use buffered ports if you want to drive a pre-defined pattern to the ports.
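
The wrap-around arithmetic can be sketched in plain C on a host machine. This is only a model of the counting, not XC port code; the function names are hypothetical, while the 40 ns tick and the 16-bit counter width are taken from the post above.

```c
#include <assert.h>
#include <stdint.h>

#define TICK_NS 40u /* one tick of the 25 MHz port clock */

/* Port count reached `ticks` ticks after `now`.
   The cast to uint16_t wraps modulo 65536, like the 16-bit port counter. */
static uint16_t count_after(uint16_t now, uint32_t ticks)
{
    assert(ticks < 65536u); /* a full wrap or more cannot be expressed */
    return (uint16_t)(now + ticks);
}

/* Ticks from `now` forwards to `target`, taking wrap-around into account. */
static uint16_t ticks_until(uint16_t now, uint16_t target)
{
    return (uint16_t)(target - now);
}

/* Convert a tick distance to nanoseconds. */
static uint32_t ticks_to_ns(uint16_t ticks)
{
    return (uint32_t)ticks * TICK_NS;
}
```

For example, 10 ticks after count 65530 the counter reads 4, and the furthest schedulable distance of 65535 ticks corresponds to 2621400 ns, which is the 2.6 ms quoted above.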

Running this on multiple tiles is (much) harder, I think. Your best bet is to clock all tiles from a single source, use the above scheme, and then somehow work out how many 2 ns clocks of skew there are between the tiles. You can possibly do that by using one tile to input one signal from each of the tiles and measuring the skew. Then use that skew to correct the counts on each tile.

Post by grosdodde »

1. Set one clock block to a 25 MHz rate: check

2. Attach all ports to this clock block: check

3. Start the clock block: check

Now you can use this in one task: fail

I can't pass the same value (port) to different tasks.

Post by hkr87 »

True - you will need to do the operations as follows:

Code: Select all

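int main(void) {
    // Configure the shared clock block and attach every port to it
    // before the tasks start, so that all port counters run in step.
    set_up_clk_blk_and_ports();
    par {
        // Each task drives its own port(s); no port is shared.
        task1();
        task2();
        task3();
        task4();
        task5();
        task6();
        task7();
        task8();
    }
    return 0;
}
Make sure that task1 to task8 each operate on different ports, and all should be fine.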

Post by grosdodde »

Yes, but I need 60 or more outputs. Copying the function 60 times and adjusting the ports can't be the only way to deal with this.
ahenshaw
Experienced Member
Posts: 96
Joined: Mon Mar 22, 2010 8:55 pm

Post by ahenshaw »

What about four processes each controlling a single 16-bit buffered port, each clocked-out at 40 ns intervals (tied to the same clock block)? Update the in-memory bit pattern from the 100 kHz signals and then send to the port. The phase shift would just be a function of the way that you manage the bit patterns.
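
As a rough illustration of managing the bit patterns, the plain C sketch below (a host-side model, not XC) computes the 16-bit value a port would carry at a given 40 ns tick. The function name, the 50 % duty cycle and the fixed per-pin delay are assumptions made for the example; a 100 kHz signal sampled at 25 MHz has a period of 250 ticks.

```c
#include <assert.h>
#include <stdint.h>

/* 16-bit port value at tick `t`. Bit i carries a square wave with a period
   of `period_ticks`, delayed by i * shift_ticks ticks relative to bit 0.
   Assumes a 50 % duty cycle. */
static uint16_t port_word(uint32_t t, uint32_t period_ticks, uint32_t shift_ticks)
{
    assert(period_ticks >= 2);
    uint32_t high_ticks = period_ticks / 2;
    uint16_t word = 0;
    for (unsigned bit = 0; bit < 16; bit++) {
        uint32_t delay = (bit * shift_ticks) % period_ticks;
        /* Add one period before subtracting so the result stays non-negative. */
        uint32_t phase = (t % period_ticks + period_ticks - delay) % period_ticks;
        if (phase < high_ticks)
            word |= (uint16_t)(1u << bit);
    }
    return word;
}
```

With a period of 250 ticks and a shift of 1 tick, tick 0 gives 0x0001, tick 1 gives 0x0003, and by tick 15 all sixteen outputs are high, each one having risen 40 ns after its neighbour.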

Post by hkr87 »

Indeed, 60 signals can be achieved with fewer ports. As ahenshaw points out, a single 16-bit port can drive 16 signals. Four 16-bit ports would be spread over two tiles (which is undesirable if all 60 signals have to be absolutely precisely synchronized), but a single tile has 64 IO pins, so you just may be able to drive them all using all ports, except for the ones used for booting.

A task can drive more than one port, provided there is enough time to work out in advance what needs to happen on the various signals. If you make the ports buffered, then a single task can control multiple ports as follows, provided that count10 and count20 are both later than count1 and count2.

Code: Select all

  // compute count1, count2
  port1 @ count1 <: 1;
  port2 @ count2 <: 1;
  // compute count10, count20
  port1 @ count10 <: 0;
  port2 @ count20 <: 0;
Where you replace '1' and '0' with more complex bit patterns, appropriate to the port-width.

If you ever want to scale beyond 60 pins, then it is much cleaner to use a few 8- or 16-bit ports per tile, and bite the bullet and synchronise two or more tiles.
infiniteimprobability
Verified
XCore Legend
Posts: 1142
Joined: Thu May 27, 2010 10:08 am

Post by infiniteimprobability »

Interesting discussion. There are certainly a whole load of options here. My vote would be to use 16-bit or 8-bit ports, with 2 or 4 on each tile, on the XL210-512-TQ128, which keeps the software homogeneous too. 1-bit ports are powerful in that each contains an individual timer and compare-match mechanism (which makes the software very simple), but you only get 16 per tile, so you'd need 3 tiles to do it this way.

If device cost is the biggest factor, you will of course be able to squeeze it into a single tile, but only by using a mix of port types. The XS1-L8-64-TQ128 has the most fully pinned-out tile (all 64 pins brought out), so it would give the absolute lowest chip cost, with the tradeoff of additional software complexity from different low-level drivers for the different port types.

One question worth asking is how often the phase needs to be updated. If not so frequently, then that allows a client-side pre-calculation (sorting) of the 8/16 transition events, which can then be loaded into a bitmap / timing lookup in the port server task (the bit doing the waggling), keeping that part slim and fast and able to handle transitions as close as 40 ns to each other.
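
One possible shape for that pre-calculation, sketched in plain C as a host-side model rather than XC. The `edge_t` and `step_t` types and the `build_table` function are hypothetical names for this example, and all ticks are assumed to fall within one counter period, so wrap-around is not handled.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct { uint16_t tick; uint8_t pin; uint8_t level; } edge_t;  /* one transition */
typedef struct { uint16_t tick; uint16_t value; } step_t;             /* port value from `tick` on */

/* qsort comparator: order edges by ascending tick. */
static int by_tick(const void *a, const void *b)
{
    const edge_t *x = a, *y = b;
    return (x->tick > y->tick) - (x->tick < y->tick);
}

/* Sort the edges by time and fold them into (tick, port value) steps.
   Edges that share a tick are merged into a single step.
   `out` must have room for n entries. Returns the number of steps written. */
static size_t build_table(edge_t *edges, size_t n, uint16_t initial, step_t *out)
{
    size_t steps = 0;
    uint16_t value = initial;
    qsort(edges, n, sizeof edges[0], by_tick);
    for (size_t i = 0; i < n; i++) {
        assert(edges[i].pin < 16);
        if (edges[i].level)
            value |= (uint16_t)(1u << edges[i].pin);
        else
            value &= (uint16_t)~(1u << edges[i].pin);
        if (steps > 0 && out[steps - 1].tick == edges[i].tick) {
            out[steps - 1].value = value;  /* same tick: update the existing step */
        } else {
            out[steps].tick = edges[i].tick;
            out[steps].value = value;
            steps++;
        }
    }
    return steps;
}
```

The port server would then only need to walk the resulting table and issue one timed output per step, which keeps the time-critical loop free of sorting and bit manipulation.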

Regarding the actual port mechanism to use (which hardware in the port to take advantage of): I am currently undecided whether it would be best to use the buffer (port serialiser) with a 40 ns clocked port, loading a 32-bit pattern every 80 ns/160 ns, or to set up the next transition at a given time using the port timer. Both should be possible, I think.
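
The 80 ns/160 ns figures follow from how many port-width samples fit into one 32-bit buffer word. The plain C sketch below (a host-side model, not XC) shows the arithmetic and one way to pack the samples. The function names are hypothetical, and the packing order, with the first sample in the least significant bits, is an assumption that should be checked against the port documentation.

```c
#include <assert.h>
#include <stdint.h>

#define TICK_NS 40u /* one tick of the 25 MHz port clock */

/* Time covered by one 32-bit buffer word on a port of the given width. */
static unsigned word_span_ns(unsigned port_width_bits)
{
    assert(port_width_bits > 0 && 32u % port_width_bits == 0);
    return (32u / port_width_bits) * TICK_NS;
}

/* Two 16-bit samples per word. ASSUMPTION: `first` is output first
   because it sits in the least significant half. */
static uint32_t pack2x16(uint16_t first, uint16_t second)
{
    return (uint32_t)first | ((uint32_t)second << 16);
}

/* Four 8-bit samples per word, same ordering assumption (s[0] first). */
static uint32_t pack4x8(const uint8_t s[4])
{
    return (uint32_t)s[0] | ((uint32_t)s[1] << 8) |
           ((uint32_t)s[2] << 16) | ((uint32_t)s[3] << 24);
}
```

A 16-bit port therefore needs a new word every 80 ns and an 8-bit port every 160 ns, which sets the time budget for whichever task refills the buffer.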