Hello everybody,
I have to output more than 20 signals (100 kHz) with exact phase shifts between them. To do this I need to synchronize the tasks. I read some threads and found three ways:
1. Use channels to trigger the tasks. Is that fast enough for more than 20 tasks at 200 kHz?
2. Use the same clock. But can I only use the same clock on the same core / tile?
3. Dedicate one task to generating an output on a pin and read that pin from the 20+ tasks. But XMOS does not allow ports to be shared, does it?
Any other ideas?
synchronize Outputs
-
- Member
- Posts: 14
- Joined: Tue Nov 24, 2015 4:33 pm
-
- Member
- Posts: 12
- Joined: Tue Oct 13, 2015 11:36 am
I would use port counters: connect all ports to one clock block, set that clock block to 100 kHz (or the frequency that gives you the right phase shift), start the clock block, and then all ports can be driven by independent tasks using port counters.
All port counters will start at 0 and will increment in sync.
-
- Member
- Posts: 14
- Joined: Tue Nov 24, 2015 4:33 pm
I'm not sure if I have understood correctly. Can you describe it in more detail?
The signal frequency is around 100 kHz but the phase shift is only 40 ns (25 MHz).
Can I connect all ports to one clock across several tiles and tasks?
-
- Member
- Posts: 12
- Joined: Tue Oct 13, 2015 11:36 am
The method below works on a single tile with multiple tasks and ports.
1. Set one clock block to a 25 MHz rate
2. Attach all ports to this clock block
3. Start the clock block
Now you can use this in one task:
Code: Select all
count1 = 12345;
port1 @ count1 <: 1;
and in a different task:
Code: Select all
count2 = 12344;
port2 @ count2 <: 0;
And port2 will go low exactly 40 ns before port1 goes high, because count2 is one 40 ns tick earlier than count1.
Time is only a short (16 bits), so you cannot make things happen more than 65535*40 ns = 2.6 ms ahead. But all will wrap around at the same time, so as long as you do not add more than 65536 at a time it should all work out. You can use buffered ports if you want to drive a pre-defined pattern to the ports.
Running this on multiple tiles is (much) harder I think. Your best bet is to clock all tiles from a single source, use the above scheme, and then somehow work out how many 2 ns clocks of skew there are between the tiles. You can possibly do that by using one tile to input one signal from each of the tiles and work out what the skew is. Then use that skew to correct the counts on each tile.
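The wrap-around remark above can be checked off-target with a few lines of plain C (not XC). This is only a sketch of the arithmetic, assuming a 16-bit port counter that ticks every 40 ns; the names TICK_NS, next_count, ticks_until and max_lookahead_ns are made up for illustration and are not part of any XMOS API.

```c
#include <stdint.h>

/* Assumption: one port-counter tick is 40 ns (clock block at 25 MHz). */
#define TICK_NS 40u

/* Count for an event delta_ticks ahead of 'now'.
   The cast wraps modulo 65536, like a 16-bit counter. */
uint16_t next_count(uint16_t now, uint16_t delta_ticks) {
    return (uint16_t)(now + delta_ticks);
}

/* Ticks from 'from' forward to 'to'; still correct across a wrap. */
uint16_t ticks_until(uint16_t from, uint16_t to) {
    return (uint16_t)(to - from);
}

/* Longest look-ahead in nanoseconds: 65535 ticks of 40 ns each. */
uint32_t max_lookahead_ns(void) {
    return 65535u * TICK_NS;
}
```

For example, scheduling 10 ticks ahead of count 65530 gives count 4, and the distance from 65530 to 4 is still 10 ticks, which is why tasks that all use the same counter stay aligned through the wrap.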
-
- Member
- Posts: 14
- Joined: Tue Nov 24, 2015 4:33 pm
1. Set one clock block to a 25 MHz rate: check
2. Attach all ports to this clock block: check
3. Start the clock block: check
Now you can use this in one task: fail
I can't pass the same port to different tasks.
-
- Member
- Posts: 12
- Joined: Tue Oct 13, 2015 11:36 am
True - you will need to do the operations as follows:
Code: Select all
int main(void) {
    set_up_clk_blk_and_ports();   // configure and start the shared clock block once
    par {                         // then run the tasks in parallel
        task1();
        task2();
        task3();
        task4();
        task5();
        task6();
        task7();
        task8();
    }
    return 0;
}
Make sure that task1..8 operate on different ports and all should be fine.
-
- Member
- Posts: 14
- Joined: Tue Nov 24, 2015 4:33 pm
Yes, but I need 60 or more outputs. Copying the function 60 times and adjusting the ports can't be the only way to deal with this.
-
- Experienced Member
- Posts: 96
- Joined: Mon Mar 22, 2010 8:55 pm
What about four processes each controlling a single 16-bit buffered port, each clocked-out at 40 ns intervals (tied to the same clock block)? Update the in-memory bit pattern from the 100 kHz signals and then send to the port. The phase shift would just be a function of the way that you manage the bit patterns.
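As a rough illustration of "managing the bit patterns", the word to drive on one 16-bit port at each 40 ns tick can be computed as below. This is plain C (not XC) and rests on assumptions not stated in the thread: a 50% duty square wave, 250 ticks per 100 kHz period (10 us / 40 ns), and pin n lagging pin 0 by n ticks. port_word and the constants are hypothetical names.

```c
#include <stdint.h>

#define TICKS_PER_PERIOD 250u  /* 100 kHz period (10 us) in 40 ns ticks */
#define HIGH_TICKS       125u  /* assumption: 50% duty cycle */
#define NUM_PINS          16u  /* one 16-bit port */

/* Word to drive at a given tick.
   Assumption: pin n lags pin 0 by n ticks (n * 40 ns). */
uint16_t port_word(uint32_t tick) {
    uint16_t word = 0;
    for (uint32_t pin = 0; pin < NUM_PINS; pin++) {
        /* Position of this pin within its own period, kept non-negative. */
        uint32_t phase = (tick % TICKS_PER_PERIOD + TICKS_PER_PERIOD - pin)
                         % TICKS_PER_PERIOD;
        if (phase < HIGH_TICKS) {
            word |= (uint16_t)(1u << pin);
        }
    }
    return word;
}
```

With these assumptions the pins switch on one after another: the word is 0x0001 at tick 0, 0x0003 at tick 1, and 0xFFFF from tick 15 until pin 0 goes low again at tick 125. A different phase relationship only changes how phase is derived for each pin.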
-
- Member
- Posts: 12
- Joined: Tue Oct 13, 2015 11:36 am
Indeed, 60 signals can be achieved with fewer ports. As ahenshaw points out, a single 16-bit port can drive 16 signals. Four 16-bit ports would be spread over two tiles (which is undesirable if all 60 signals have to be absolutely precisely synchronized), but a single tile has 64 IO pins, so you just may be able to drive them all using all ports, except for the ones used for booting.
A task can drive more than one port, if there is enough room to predict what you need to do to the various signals. If you make the ports buffered, then a single task can control multiple ports as follows, provided that count10 and count20 are ahead of both count1 and count2.
Code: Select all
// compute count1, count2
port1 @ count1 <: 1;
port2 @ count2 <: 1;
// compute count10, count20
port1 @ count10 <: 0;
port2 @ count20 <: 0;
Where you replace '1' and '0' with more complex bit patterns, appropriate to the port-width.
If you ever want to scale beyond 60 pins, then it is much cleaner to use a few 8- or 16-bit ports per tile, and bite the bullet and synchronise two or more tiles.
-
Verified
- XCore Legend
- Posts: 1142
- Joined: Thu May 27, 2010 10:08 am
Interesting discussion. There are certainly a whole load of options here. My vote would be using 16b or 8b ports with 2 or 4 on each tile using the XL210-512-TQ128 - keeps the software homogeneous too. 1b ports are powerful in that they each contain an individual timer and compare match mechanism (makes software very simple), but you only get 16 per tile so you'd need 3 tiles to do it this way.
If device cost was the biggest factor, you will of course be able to squeeze it in to a single tile but using a mix of port types. The XS1-L8-64-TQ128 has the most fully pinned out tile (all 64 pins brought out) so would be the absolute lowest chip cost, with the tradeoff of additional software complexity with different low level drivers for different port types.
One question worth asking is how often the phase updates need to be? If not so frequent, then it allows for a client side pre-calculation (sorting) of the 8/16 transition events which can then be loaded into bitmap / timing lookup of the port server task (the bit doing the waggling), keeping that part slim and fast and able to handle transitions as close as 40ns to each other.
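One possible shape for that client-side pre-calculation, sketched in plain C (not XC): sort the per-pin transition events by tick and fold them into a table of (tick, port word) steps for the port server to replay. The types edge_t and step_t and the function build_table are hypothetical, and the sketch assumes all ticks fall inside one counter window, so wrap-around is not handled.

```c
#include <stdint.h>
#include <stdlib.h>

typedef struct { uint16_t tick; uint8_t pin; uint8_t level; } edge_t;  /* one transition */
typedef struct { uint16_t tick; uint16_t value; } step_t;              /* one port write */

/* qsort comparator: order edges by tick. */
static int by_tick(const void *a, const void *b) {
    const edge_t *x = a, *y = b;
    return (int)x->tick - (int)y->tick;
}

/* Sorts 'edges' in place and fills 'out' (room for n entries) with one
   step per distinct tick. Returns the number of steps written. */
size_t build_table(edge_t *edges, size_t n, uint16_t initial, step_t *out) {
    size_t steps = 0;
    uint16_t value = initial;
    qsort(edges, n, sizeof edges[0], by_tick);
    for (size_t i = 0; i < n; i++) {
        if (edges[i].level) value |= (uint16_t)(1u << edges[i].pin);
        else                value &= (uint16_t)~(1u << edges[i].pin);
        if (steps > 0 && out[steps - 1].tick == edges[i].tick) {
            out[steps - 1].value = value;   /* same tick: merge into one write */
        } else {
            out[steps].tick = edges[i].tick;
            out[steps].value = value;
            steps++;
        }
    }
    return steps;
}
```

The server side then only has to walk the table and issue one timed write per step, which keeps the time-critical loop small.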
Regarding the actual port mechanism to use (which hardware in the port to take advantage of) - currently undecided whether it would be best to use the buffer (port serialiser) and a 40ns clocked port to load a 32 bit bit pattern in every 80ns/160ns or setting up the next transition at a given time using the port timer. Both should be possible I think.
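The 80 ns / 160 ns figures follow from how many port-width samples fit in one 32-bit word. A small plain C sketch (not XC) of that packing is given below; it assumes the serialiser shifts the least significant sample out first, which should be checked against the port documentation, and pack16, pack8 and reload_period_ns are made-up names.

```c
#include <stdint.h>

/* Assumption: the port is clocked at 25 MHz, one sample per 40 ns. */
#define TICK_NS 40u

/* Two 16-bit samples per 32-bit word.
   Assumption: the least significant sample is shifted out first. */
uint32_t pack16(uint16_t first, uint16_t second) {
    return (uint32_t)first | ((uint32_t)second << 16);
}

/* Four 8-bit samples per 32-bit word, same ordering assumption. */
uint32_t pack8(uint8_t s0, uint8_t s1, uint8_t s2, uint8_t s3) {
    return (uint32_t)s0 | ((uint32_t)s1 << 8)
         | ((uint32_t)s2 << 16) | ((uint32_t)s3 << 24);
}

/* How often a new 32-bit word is needed for a given port width. */
unsigned reload_period_ns(unsigned port_width_bits) {
    return (32u / port_width_bits) * TICK_NS;
}
```

So a 16-bit port needs a new word every 80 ns and an 8-bit port every 160 ns, which is the time budget the port task has to prepare the next pattern.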