Multi-node simulations?

Jamie
Experienced Member
Posts: 99
Joined: Mon Dec 14, 2009 1:01 pm

Post by Jamie »

I'm using xsim to run simulations of the XMP-64 (which I assume it can do), but a couple of things seem to be wrong. Firstly, the node ids are all multiplied by a factor of 16, so 0, 16, 32, 48 etc. rather than 0, 1, 2, 3 as I'd expect. Also, the node ids are incorrect in the returned channel resource ids: they all appear to be 0. Here's some example trace, where each core on each node is getting a channel:

Code:

0@0@0   A-.----0001034a (initSystem          + 5a) : getr    r0(0x2), 0x2 @341
0@1@0   A-.----0001019a (initSystem          + 5a) : getr    r0(0x10002), 0x2 @341
0@2@0   A-.----0001019a (initSystem          + 5a) : getr    r0(0x20002), 0x2 @341
0@3@0   A-.----0001019a (initSystem          + 5a) : getr    r0(0x30002), 0x2 @341
128@0@0   A-.----0001019a (initSystem          + 5a) : getr    r0(0x2), 0x2 @341
128@1@0   A-.----0001019a (initSystem          + 5a) : getr    r0(0x10002), 0x2 @341
128@2@0   A-.----0001019a (initSystem          + 5a) : getr    r0(0x20002), 0x2 @341
128@3@0   A-.----0001019a (initSystem          + 5a) : getr    r0(0x30002), 0x2 @341
192@0@0   A-.----0001019a (initSystem          + 5a) : getr    r0(0x2), 0x2 @341
192@1@0   A-.----0001019a (initSystem          + 5a) : getr    r0(0x10002), 0x2 @341
192@2@0   A-.----0001019a (initSystem          + 5a) : getr    r0(0x20002), 0x2 @341
192@3@0   A-.----0001019a (initSystem          + 5a) : getr    r0(0x30002), 0x2 @341
64@0@0   A-.----0001019a (initSystem          + 5a) : getr    r0(0x2), 0x2 @341
64@1@0   A-.----0001019a (initSystem          + 5a) : getr    r0(0x10002), 0x2 @341
64@2@0   A-.----0001019a (initSystem          + 5a) : getr    r0(0x20002), 0x2 @341
64@3@0   A-.----0001019a (initSystem          + 5a) : getr    r0(0x30002), 0x2 @341
96@0@0   A-.----0001019a (initSystem          + 5a) : getr    r0(0x2), 0x2 @341
96@1@0   A-.----0001019a (initSystem          + 5a) : getr    r0(0x10002), 0x2 @341
96@2@0   A-.----0001019a (initSystem          + 5a) : getr    r0(0x20002), 0x2 @341
96@3@0   A-.----0001019a (initSystem          + 5a) : getr    r0(0x30002), 0x2 @341
224@0@0   A-.----0001019a (initSystem          + 5a) : getr    r0(0x2), 0x2 @341
224@1@0   A-.----0001019a (initSystem          + 5a) : getr    r0(0x10002), 0x2 @341
224@2@0   A-.----0001019a (initSystem          + 5a) : getr    r0(0x20002), 0x2 @341
224@3@0   A-.----0001019a (initSystem          + 5a) : getr    r0(0x30002), 0x2 @341
Am I missing something, or does xsim just not support this?


m_y
Experienced Member
Posts: 69
Joined: Mon May 17, 2010 10:19 am

Post by m_y »

The node ids are populated from the MSB downwards, so with two nodes you'd get 0 and 128. With four you'd get 0, 128, 64 and 192. With the XMP you have 16 nodes, so the top 4 bits are fully used. This can make it appear that they're multiplied by 16.
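
As a rough illustration of the MSB-first numbering described above, the sketch below reverses the bits of a node's index within an 8-bit field. The function names and the assumption that ids are handed out in exactly this bit-reversed order are illustrative, inferred only from the 0, 128, 64, 192 sequence quoted here; the tools may assign them differently.

```python
def msb_first_node_id(index, width=8):
    """Reverse the low `width` bits of `index` (assumed numbering scheme)."""
    if not 0 <= index < (1 << width):
        raise ValueError("index does not fit in the id field")
    node_id = 0
    for bit in range(width):
        if index & (1 << bit):
            # Bit 0 of the index lands in the MSB, bit 1 in the next bit down, etc.
            node_id |= 1 << (width - 1 - bit)
    return node_id


def node_ids(count, width=8):
    """Ids for `count` nodes, listed in index order."""
    return [msb_first_node_id(i, width) for i in range(count)]


if __name__ == "__main__":
    print(node_ids(2))           # [0, 128]
    print(node_ids(4))           # [0, 128, 64, 192]
    print(sorted(node_ids(16)))  # every multiple of 16 from 0 to 240
```

Under this assumption, 16 nodes only ever set bits 4..7, which is why every id in an XMP-64 trace is a multiple of 16.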

The simulator certainly can simulate XMP systems (I've done it), but there are some complexities to do with how the node ids are set (programmatically or by the simulator) and how they're reported in the trace. How are you building and running your code?
Jamie
Experienced Member
Posts: 99
Joined: Mon Dec 14, 2009 1:01 pm

Post by Jamie »

m_y wrote:
The node ids are populated from the MSB downwards, so with two nodes you'd get 0 and 128. With four you'd get 0, 128, 64 and 192. With the XMP you have 16 nodes, so the top 4 bits are fully used. This can make it appear that they're multiplied by 16.
Wouldn't it make sense, though, to display the true node id in the trace output?
m_y wrote:
The simulator certainly can simulate XMP systems (I've done it), but there are some complexities to do with how the node ids are set (programmatically or by the simulator) and how they're reported in the trace. How are you building and running your code?
I'm linking the objects with

Code:

-nostdlib -Xmapper --nochaninit
and building the binary in the way you suggested a while ago, by replacing each ELF in the XE manually:

Code:

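# Extract the images from slave.xe (the loop below uses image_n0c0.elf)
xobjdump --split slave.xe

# Put the slave image on every core except node 0, core 0.
# NUM_CORES must be set beforehand; each node has 4 cores.
for (( i = 1; i < NUM_CORES; i++ ))
do
    node=$(( i / 4 ))
    core=$(( i % 4 ))
    xobjdump master.xe -a "$node,$core,image_n0c0.elf"
done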
m_y
Experienced Member
Posts: 69
Joined: Mon May 17, 2010 10:19 am

Post by m_y »

Jamie wrote:I'm linking the objects with

Code:

-nostdlib -Xmapper --nochaninit
Okay, so the numbers you see on the left-hand side of the trace come from the config file embedded in the XE; they're not the actual node numbers (we do this so that the end-to-end trace for a node/core/thread has a consistent id). Initially all the actual node ids are zero, just like on hardware.

One of the things you miss out on with --nochaninit is programming the node ids. You should do this by programming register 5 (called NODE_ID, IIRC) of each switch. Once you've done that, you'll start getting resource ids with the same node number in bits 24..31.
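
To make the bit layout concrete, here is a minimal sketch that splits a resource id into fields. Only the node field (bits 24..31) is stated above. The core field in bits 16..23 is inferred from the trace values 0x10002, 0x20002 and 0x30002, and treating bits 8..15 as the resource number and the low byte as the resource type is an assumption about the encoding. All names are hypothetical.

```python
from typing import NamedTuple


class ResourceId(NamedTuple):
    node: int    # bits 24..31 (stated in the post above)
    core: int    # bits 16..23 (inferred from the trace values)
    number: int  # bits 8..15  (assumed: resource number)
    kind: int    # bits 0..7   (assumed: resource type)


def decode_resource_id(value):
    """Split a 32-bit resource id into its assumed fields."""
    return ResourceId(
        node=(value >> 24) & 0xFF,
        core=(value >> 16) & 0xFF,
        number=(value >> 8) & 0xFF,
        kind=value & 0xFF,
    )


def with_node(value, node):
    """Return `value` with its node field (bits 24..31) replaced by `node`."""
    return (value & 0x00FFFFFF) | ((node & 0xFF) << 24)


if __name__ == "__main__":
    print(decode_resource_id(0x30002))
    print(hex(with_node(0x30002, 128)))
```

With the node id still at its reset value of zero, every value in the trace decodes to node 0, which matches what the trace shows. Under these assumptions, once a switch's node id has been programmed to, say, 128, the same channel end on core 3 would be expected to read 0x80030002.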
Jamie
Experienced Member
Posts: 99
Joined: Mon Dec 14, 2009 1:01 pm

Post by Jamie »

m_y wrote:
One of the things you miss out on with --nochaninit is programming the node ids. You should do this by programming register 5 (called NODE_ID, IIRC) of each switch. Once you've done that, you'll start getting resource ids with the same node number in bits 24..31.
But in another post (http://www.xcore.com/forum/viewtopic.php?f=25&t=573) you said:
1: Network bringup. This gets the system into a state where each node has the correct node id, xlinks and routing tables are set up, and thus a message from one node is correctly routed to its destination. The correct application binaries are loaded onto each core.
...
Part 3 (minus the code needed to arrange for the right threads to run on the right cores) can be removed by passing "--nochaninit" to the linker.
...
There isn't a flag to turn off generation of part 1 however xobjdump can be used to separately extract part 1 from the rest.
So I would have thought that setting the node id would have taken place in the network bringup and wouldn't be affected by --nochaninit?!
m_y
Experienced Member
Posts: 69
Joined: Mon May 17, 2010 10:19 am

Post by m_y »

If you're constructing the XE using xobjdump then you won't have the network bringup stage (depending on what master.xe has in it, I'm assuming it has the network bringup stage for node 0 but not for the other nodes?).
nieuwhzn
Member++
Posts: 26
Joined: Sat Dec 12, 2009 6:45 am

Post by nieuwhzn »

Is there a particular reason that you are trying to make your own life miserable by using xobjdump instead of just relying on the standard tools to set up the XMP-64 network?
Jamie
Experienced Member
Posts: 99
Joined: Mon Dec 14, 2009 1:01 pm

Post by Jamie »

m_y wrote:
If you're constructing the XE using xobjdump then you won't have the network bringup stage (depending on what master.xe has in it, I'm assuming it has the network bringup stage for node 0 but not for the other nodes?).
Yeah, that's a good point, and it's why the approach I described in my earlier post wasn't working. The best way to do it, then, seems to be to create an XE with all the network bringup binaries in it by linking against a multicore main, and then to swap out the images for individual cores using xobjdump with a separate ELF binary. As far as I can still see, --nochaninit will just disable the inclusion of the channel initialisation code in the main loadable image. Does this seem about right?
nieuwhzn wrote:
Is there a particular reason that you are trying to make your own life miserable by using xobjdump instead of just relying on the standard tools to set up the XMP-64 network?
I'm working on a software mechanism in support of a programming language to allow run-time migration of processes between cores. Most of this messing about with XEs and ELFs is for two reasons: firstly, so that I can disable the standard library and use my own initialisation routines, and secondly to be able to load different elf binaries onto different cores; i.e. a master process on node 0, core 0 and slave procedures on the others.
m_y
Experienced Member
Posts: 69
Joined: Mon May 17, 2010 10:19 am

Post by m_y »

Jamie wrote:
If you're constructing the XE using xobjdump then you won't have the network bringup stage (depending on what master.xe has in it, I'm assuming it has the network bringup stage for node 0 but not for the other nodes?).
Yeah, that's a good point, and it's why the approach I described in my earlier post wasn't working. The best way to do it, then, seems to be to create an XE with all the network bringup binaries in it by linking against a multicore main, and then to swap out the images for individual cores using xobjdump with a separate ELF binary. As far as I can still see, --nochaninit will just disable the inclusion of the channel initialisation code in the main loadable image. Does this seem about right?
That's a tentative "yes". I'm unwilling to say yes for sure because I haven't actually tried it myself.
Jamie
Experienced Member
Posts: 99
Joined: Mon Dec 14, 2009 1:01 pm

Post by Jamie »

m_y wrote:
That's a tentative "yes". I'm unwilling to say yes for sure because I haven't actually tried it myself.
Okay, well from the xsim traces it all seems to work okay :)

Just a couple more simulator-related issues:

Does it mean anything when the simulator spits out a '^@' character mid-trace:

Code:

^@0@0@0   A-.----00010336 (initSystem          +  6) : ldw     r1(0x1), sp[0x0] L[0x1eff4] @4315
And, I'm not sure what to make of this:

Code:

ERROR: Unimplemented OS call '-9'
as it's not clear what's causing it, unlike a TRAP or ECALL which give a location and time.

Thanks for all your help by the way, m_y, it's been really useful!

Jamie