Preventing deadlocks with shared channels

ers35
Active Member
Posts: 62
Joined: Mon Jun 10, 2013 2:14 pm


Post by ers35 »

The XMOS XS1 Architecture manual states:
A destination channel end can be shared by any number of outputting threads;
they are served in a round-robin manner. Once a connection has been established it
will persist until an END is received; any other thread attempting to establish a
connection will be queued. In the case of a shared channel end, the outputting thread
will usually transmit the identifier of its channel end so that the inputting thread
can use it to reply.
and:
Note that when sending long messages to a shared channel, the sender should
send a short request and then wait for a reply before proceeding as this will minimise
interconnect congestion caused by delays in accepting the message.
Can the following two properties be satisfied at the same time?

1) Multiple client channels on multiple tiles sending to one shared server channel,
where the client knows the server's channel id ahead of time, but the client must send
its channel id to the server for each stream.

2) Clients send a short request and then wait for a reply before proceeding. The link
needs to be free while the clients are waiting for a reply.

Consider the following topology: Two XS1-L1s (an L2) with all but one link disabled. There
is one bidirectional link between tile[0] and tile[1]. The reason for disabling the other
three links is to make deadlocks easier to spot.

Code:

tile[0] <-> tile[1]
See the following trace from the XMOS simulator. I have removed all instructions
not related to channel operations.

client_0 is waiting on a reply from server_0, which has not yet begun execution.
server_0's channel end id is 0x40000102; it was allocated with getr before any
threads started executing. client_0 sends its channel end id as three data tokens
followed by a PAUSE control token, then waits for an END from server_0.

client_1 sends a request to server_1, but the reply from server_1 is never delivered,
so client_1 waits forever on chkct for the END.

Code:

tile[0]@1- -p-A-p-p-.----.00010202 (client_0         + 10) : getr    r5(0x2), 0x2 @16346
tile[0]@2- -p-p-A-p-.----..00010202 (client_0         + 10) : getr    r5(0x1102), 0x2 @16347
tile[0]@3- -p-p-p-A-.----...00010202 (client_0         + 10) : getr    r5(0x1502), 0x2 @16348
tile[0]@1- -p-A-p-p-.----.00010220 (client_0         + 2e) : setd    res[r5(0x2)], r0(0x40000102) @16402
tile[0]@2- -p-p-A-p-.----..00010220 (client_0         + 2e) : setd    res[r5(0x1102)], r0(0x40000102) @16403
tile[0]@3- -p-p-p-A-.----...00010220 (client_0         + 2e) : setd    res[r5(0x1502)], r0(0x40000102) @16404
tile[0]@1- -p-A-p-p-.----.00010222 (client_0         + 30) : outt    res[r5(0x2)], r8(0x0) @16406
tile[0]@2- -p-p-A-p-.----..00010222 (client_0         + 30) : outt    res[r5(0x1102)], r8(0x11) @16407
tile[0]@3- -p-p-p-A-.----...00010222 (client_0         + 30) : outt    res[r5(0x1502)], r8(0x15) @16408
tile[0]@1- -p-A-p-p-.----.00010224 (client_0         + 32) : outt    res[r5(0x2)], r7(0x0) @16410
tile[0]@2- -p-p-A-p-.----..00010224 (client_0         + 32) : outt    res[r5(0x1102)], r7(0x0) @16411
tile[0]@3- -p-p-p-A-.----...00010224 (client_0         + 32) : outt    res[r5(0x1502)], r7(0x0) @16412
tile[0]@1- -p-A-p-p-.----.00010226 (client_0         + 34) : outt    res[r5(0x2)], r6(0x0) @16414
tile[0]@2- -p-p-A-p-.----..00010226 (client_0         + 34) : outt    res[r5(0x1102)], r6(0x0) @16415
tile[0]@3- -p-p-p-A-.----...00010226 (client_0         + 34) : outt    res[r5(0x1502)], r6(0x0) @16416
tile[0]@1- -p-A-p-p-.----.00010228 (client_0         + 36) : outct   res[r5(0x2)], 0x2 @16418
tile[0]@2- -p-p-A-p-.----..00010228 (client_0         + 36) : outct   res[r5(0x1102)], 0x2 @16419
tile[0]@3- -p-p-p-A-.----...00010228 (client_0         + 36) : outct   res[r5(0x1502)], 0x2 @16420
tile[0]@1-P-p-A-p-p-.----.0001022a (client_0         + 38) : chkct   res[r5(0x2)], 0x1 @16422
tile[0]@2-P-p-w-A-p-.----..0001022a (client_0         + 38) : chkct   res[r5(0x1102)], 0x1 @16423
tile[0]@3-P-p-w-w-A-.----...0001022a (client_0         + 38) : chkct   res[r5(0x1502)], 0x1 @16424

tile[0]@0- -A-w-w-w-.----00010468 (server_1 + d2) : getr    r0(0x103), 0x3 @30365
tile[0]@0-P-A-w-w-w-p-.----00010532 (server_1 + 50) : testct  r0(0x0), res[r5(0x402)] @30569

tile[1]@0- -A-.----00010156 (client_1        +  e) : getr    r4(0x40000002), 0x2 @78804
tile[1]@0- -A-.----00010162 (client_1        + 1a) : setd    res[r4(0x40000002)], r0(0x402) @78820
tile[1]@0- -A-.----00010168 (client_1        + 20) : outt    res[r4(0x40000002)], r0(0x0) @78832
tile[1]@0- -A-.----0001016e (client_1        + 26) : outt    res[r4(0x40000002)], r11(0x0) @78844
tile[1]@0- -A-.----00010172 (client_1        + 2a) : outt    res[r4(0x40000002)], r6(0x40) @78852
tile[1]@0- -A-.----00010174 (client_1        + 2c) : outct   res[r4(0x40000002)], 0x2 @78856
tile[1]@0-P-A-.----00010176 (client_1        + 2e) : chkct   res[r4(0x40000002)], 0x1 @78860

tile[0]@0- -A-w-w-w-we.----00010532 (server_1 + 50) : testct  r0(0x0), res[r5(0x402)] @78861
tile[0]@0- -A-w-w-w-we.----00010536 (server_1 + 54) : int     r0(0x0), res[r5(0x402)] @78869
tile[0]@0- -A-w-w-w-we.----0001053a (server_1 + 58) : int     r1(0x0), res[r5(0x402)] @78877
tile[0]@0- -A-w-w-w-we.----00010540 (server_1 + 5e) : int     r1(0x40), res[r5(0x402)] @78889
tile[0]@0- -A-w-w-w-we.----00010548 (server_1 + 66) : setd    res[r5(0x402)], r0(0x40000002) @78905
tile[0]@0- -A-w-w-w-we.----0001054a (server_1 + 68) : outct   res[r5(0x402)], 0x1 @78909
tile[0]@0-P-A-w-w-w-we.----0001054c (server_1 + 6a) : testct  r0(0x0), res[r5(0x402)] @78913
I want to be able to allocate a client channel end at any time with getr, set its
destination with setd to a pre-defined server channel id, and have the server
reply using the channel id that the client sent with its request. I also want the
client to make sure that the server is ready to receive the request before
proceeding, so as not to deadlock threads sharing the single link.
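
A minimal sketch of that sequence, reconstructed from the trace above (untested;
the helper name, the SERVER_ID value, and the xcc inline-assembly idiom are my
assumptions):

Code:

// Sketch only: client-side request to a pre-defined shared server
// channel end, mirroring the instruction sequence in the trace.
#define SERVER_ID 0x40000102  // server channel end id, known ahead of time

unsigned client_request(void)
{
  unsigned c;
  // Allocate a channel end at any time (resource type 2 = chanend).
  asm("getr %0, 2" : "=r"(c));
  // Route it to the well-known server channel end.
  asm("setd res[%0], %1" :: "r"(c), "r"(SERVER_ID));
  // Send our own channel end id as three data tokens (the low byte
  // 0x02 is implied by the resource type) so the server can setd
  // back to us and reply.
  asm("outt res[%0], %1" :: "r"(c), "r"((c >> 8) & 0xff));
  asm("outt res[%0], %1" :: "r"(c), "r"((c >> 16) & 0xff));
  asm("outt res[%0], %1" :: "r"(c), "r"((c >> 24) & 0xff));
  asm("outct res[%0], 0x2" :: "r"(c));  // PAUSE: free the route while waiting
  // Wait for the server's END before sending the long message.
  asm("chkct res[%0], 0x1" :: "r"(c));
  return c;  // server is ready; stream the full request on c
}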

I can use a lock on each client to make sure that only one request goes out at a
time, but this does not work when there are multiple clients on multiple tiles
making requests to one shared server channel, because a hardware lock cannot be
shared across tiles.
ers35
Active Member
Posts: 62
Joined: Mon Jun 10, 2013 2:14 pm

Post by ers35 »

Attached is a program that demonstrates the problem. The included XN has all but one link disabled. The program runs successfully if all four links are enabled.

An example xsim trace similar to the one above:

Code:

tile[0]@0- -A-.----00010480 (main                + bc) : chkct   res[r1(0x2)], 0x3 @9262
tile[0]@0- -A-.----00010482 (main                + be) : chkct   res[r1(0x2)], 0x1 @9266
tile[0]@0- -A-.----000104a8 (main                + e4) : set     sp, r11(0x1fef0) @9350
tile[0]@0- -A-.----000104aa (main                + e6) : freer   res[r1(0x2)] @9354
tile[1]@0-P-A-.----000100d4 (sync_other          +  e) : chkct   res[r0(0x40000102)], 0x1 @10261
tile[0]@0- -A-.----000100d0 (sync_0              +  a) : getr    r0(0x2), 0x2 @10430
tile[0]@0- -A-.----000100d2 (sync_0              +  c) : getr    r1(0xb02), 0x2 @10434
tile[0]@0- -A-.----000100d4 (sync_0              +  e) : setd    res[r0(0x2)], r1(0xb02) @10438
tile[0]@0- -A-.----000100d6 (sync_0              + 10) : setd    res[r1(0xb02)], r0(0x2) @10442
tile[0]@0- -A-.----000100c2 (setd                +  e) : setd    res[r0(0x102)], r1(0x40000102) @10530
tile[0]@0- -A-.----000100ec (sync_0              + 26) : outct   res[r0(0x102)], 0x1 @10550
tile[0]@0- -A-.----000100f2 (sync_0              + 2c) : freer   res[r0(0x102)] @10562
tile[0]@0- -A-.----000100f4 (sync_0              + 2e) : freer   res[r1(0xb02)] @10566
tile[1]@0- -A-.----000100d4 (sync_other          +  e) : chkct   res[r0(0x40000102)], 0x1 @10579
tile[0]@0- -A-.----00010282 (__main_xm_0         + ee) : getr    r0(0x3), 0x3 @10622
tile[1]@0- -A-.----000100e2 (client              +  a) : getr    r0(0x40000002), 0x2 @10655
tile[1]@0- -A-.----000100e4 (client              +  c) : getr    r1(0x40000b02), 0x2 @10659
tile[1]@0- -A-.----000100e6 (client              +  e) : setd    res[r0(0x40000002)], r1(0x40000b02) @10663
tile[1]@0- -A-.----000100e8 (client              + 10) : setd    res[r1(0x40000b02)], r0(0x40000002) @10667
tile[1]@0- -A-.----000100c2 (setd                +  e) : setd    res[r0(0x40000002)], r1(0x402) @10759

tile[1]@0- -A-.----00010102 (client              + 2a) : outt    res[r0(0x40000002)], r1(0x400000) @10783
tile[1]@0- -A-.----00010108 (client              + 30) : outt    res[r0(0x40000002)], r1(0x4000) @10795
tile[1]@0- -A-.----0001010e (client              + 36) : outt    res[r0(0x40000002)], r1(0x40) @10807
tile[1]@0- -A-.----00010112 (client              + 3a) : outct   res[r0(0x40000002)], 0x2 @10815
# waits forever
tile[1]@0-P-A-.----00010116 (client              + 3e) : chkct   res[r0(0x40000002)], 0x1 @10823

tile[0]@1- -p-A-p-p-.----.00010102 (client              +  a) : getr    r0(0x102), 0x2 @10836
tile[0]@2- -p-p-A-p-.----..00010102 (client              +  a) : getr    r0(0xb02), 0x2 @10837
tile[0]@1- -p-A-p-p-.----.00010104 (client              +  c) : getr    r1(0xc02), 0x2 @10840
tile[0]@2- -p-p-A-p-.----..00010104 (client              +  c) : getr    r1(0xd02), 0x2 @10841
tile[0]@3- -p-p-p-A-.----...00010102 (client              +  a) : getr    r0(0xe02), 0x2 @10842
tile[0]@1- -p-A-p-p-.----.00010106 (client              +  e) : setd    res[r0(0x102)], r1(0xc02) @10844
tile[0]@2- -p-p-A-p-.----..00010106 (client              +  e) : setd    res[r0(0xb02)], r1(0xd02) @10845
tile[0]@3- -p-p-p-A-.----...00010104 (client              +  c) : getr    r1(0xf02), 0x2 @10846
tile[0]@1- -p-A-p-p-.----.00010108 (client              + 10) : setd    res[r1(0xc02)], r0(0x102) @10848
tile[0]@2- -p-p-A-p-.----..00010108 (client              + 10) : setd    res[r1(0xd02)], r0(0xb02) @10849
tile[0]@3- -p-p-p-A-.----...00010106 (client              +  e) : setd    res[r0(0xe02)], r1(0xf02) @10850
tile[0]@3- -p-p-p-A-.----...00010108 (client              + 10) : setd    res[r1(0xf02)], r0(0xe02) @10854
tile[0]@1- -p-A-p-p-.----.000100c2 (setd                +  e) : setd    res[r0(0x102)], r1(0x40000202) @10940
tile[0]@2- -p-p-A-p-.----..000100c2 (setd                +  e) : setd    res[r0(0xb02)], r1(0x40000202) @10941
tile[0]@3- -p-p-p-A-.----...000100c2 (setd                +  e) : setd    res[r0(0xe02)], r1(0x40000202) @10946

tile[0]@1- -p-A-p-p-.----.00010122 (client              + 2a) : outt    res[r0(0x102)], r1(0x1) @10964
tile[0]@2- -p-p-A-p-.----..00010122 (client              + 2a) : outt    res[r0(0xb02)], r1(0xb) @10965
tile[0]@3- -p-p-p-A-.----...00010122 (client              + 2a) : outt    res[r0(0xe02)], r1(0xe) @10970

tile[0]@0- -A-p-p-p-.----0001015c (server              + 16) : int     r0(0x0), res[r0(0x402)] @10975

tile[0]@1- -p-A-p-p-.----.00010128 (client              + 30) : outt    res[r0(0x102)], r1(0x0) @10976
tile[0]@2- -p-p-A-p-.----..00010128 (client              + 30) : outt    res[r0(0xb02)], r1(0x0) @10977
tile[0]@3- -p-p-p-A-.----...00010128 (client              + 30) : outt    res[r0(0xe02)], r1(0x0) @10982
tile[0]@1- -p-A-p-p-.----.0001012e (client              + 36) : outt    res[r0(0x102)], r1(0x0) @10988
tile[0]@2- -p-p-A-p-.----..0001012e (client              + 36) : outt    res[r0(0xb02)], r1(0x0) @10989
tile[0]@3- -p-p-p-A-.----...0001012e (client              + 36) : outt    res[r0(0xe02)], r1(0x0) @10994

tile[0]@1- -p-A-p-p-.----.00010132 (client              + 3a) : outct   res[r0(0x102)], 0x2 @10996
tile[0]@2- -p-p-A-p-.----..00010132 (client              + 3a) : outct   res[r0(0xb02)], 0x2 @10997
tile[0]@3- -p-p-p-A-.----...00010132 (client              + 3a) : outct   res[r0(0xe02)], 0x2 @11002

tile[0]@0- -A-p-p-p-.----00010168 (server              + 22) : int     r0(0x0), res[r0(0x402)] @11003

tile[0]@1-P-p-A-p-p-.----.00010136 (client              + 3e) : chkct   res[r0(0x102)], 0x1 @11004
tile[0]@2-P-p-w-A-p-.----..00010136 (client              + 3e) : chkct   res[r0(0xb02)], 0x1 @11005
tile[0]@3-P-p-w-w-A-.----...00010136 (client              + 3e) : chkct   res[r0(0xe02)], 0x1 @11010

tile[0]@0- -A-w-w-w-.----00010174 (server              + 2e) : int     r0(0x40), res[r0(0x402)] @11031
tile[0]@0- -A-w-w-w-.----000100c2 (setd                +  e) : setd    res[r0(0x402)], r1(0x40000002) @11115
tile[0]@0- -A-w-w-w-.----0001018a (server              + 44) : outct   res[r0(0x402)], 0x1 @11131
tile[0]@0-P-A-w-w-w-.----0001015c (server              + 16) : int     r0(0x0), res[r0(0x402)] @11159
segher
XCore Expert
Posts: 844
Joined: Sun Jul 11, 2010 1:31 am

Post by segher »

Yes, if nothing is reading from the channel ends you are sending
to, the buffers along the way will all clog up. To still be able to
send data to other channel ends, the simplest solution is to use
multiple networks -- but you have only one link, so you cannot do
that.

Is this merely a startup problem, or do you want to be able to
have non-running servers? If it's only a startup problem you
should solve _that_: do not start the clients until all servers are
up, for example. The more general problem is much harder.
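
A minimal sketch of that startup ordering in plain XC (names are mine;
one channel per client, used only as a barrier):

Code:

// Sketch: hold each client back until its server has signalled ready.
void server(chanend ready)
{
  // ... set up the shared server channel end first ...
  ready <: 1;          // signal: safe to send requests now
  // ... serve requests forever ...
}

void client(chanend ready)
{
  ready :> int go;     // block here until the server is up
  // ... only now send the first request ...
}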
ers35
Active Member
Posts: 62
Joined: Mon Jun 10, 2013 2:14 pm

Post by ers35 »

segher wrote:Yes, if nothing is reading from the channel ends you are sending
to, the buffers along the way will all clog up.
Is there a separate buffer for the sending switch and receiving switch? Why is the server not able to respond if the client only took the link in one direction?
segher wrote:Is this merely a startup problem,
Startup is okay because the system is in a known state and clients can be initialized in a specific order if needed.
segher wrote:or do you want to be able to
have non-running servers?
The servers can be running. The example shows a case where the server might be busy processing a previous request.
segher wrote: To still be able to send data to other channel ends the simplest solution is to use multiple networks -- but you have only one link, so you cannot do that.
The more general problem is much harder.
I am interested in a solution to the general problem.

Even with multiple networks, if there exist multiple servers, there is a possibility of deadlock if clients sending to different servers output at the same time. If possible, I want to eliminate deadlocks and have clients be queued in round-robin order. I think having more links only reduces the probability of a deadlock, but does not eliminate it.

For example, consider the case of a single database server and multiple database clients spread across different tiles. The clients have no way to know whether outputting their channel id to the server to start a request will cause them to deadlock because they do not know the status of the other clients. Also, because the server channel destination is shared, the client must output its channel id first. An END token cannot be used to start the request because the server does not know on which channel the request originated.
segher
XCore Expert
Posts: 844
Joined: Sun Jul 11, 2010 1:31 am

Post by segher »

ers35 wrote:
segher wrote:Yes, if nothing is reading from the channel ends you are sending
to the buffers along the way will all clog up.
Is there a separate buffer for each direction on the link?
It doesn't matter. I don't think they are physically separate; what
matters is how much credit is given out, and no switch will give
out more credits than it can handle. Send and receive on a channel
end (on a CPU) are also independent.
ers35 wrote:Why is the server not able to respond if the client only took the link in one direction?
The clients tried to send data while they could not; the client threads
are all blocked until one can send data. Nothing else can send data
in that direction either. The server needs to input data to unblock
things; nothing else will do.
segher wrote: To still be able to send data to other channel ends the simplest solution is to use multiple networks -- but you have only one link, so you cannot do that.
The more general problem is much harder.
ers35 wrote:I am interested in a solution to the general problem.
The most general solution is to have an "arbiter" thread and channel
end on each side. To get permission to use the single link, you first
ask the arbiter _on your own side_. The arbiters can communicate
between themselves just fine, they know everything that goes on after
all.

You can send all data through the arbiters, if you prefer.
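
A minimal sketch of such an arbiter in plain XC (my reconstruction of the
idea, shown for two clients; it generalizes to arrays of chanends):

Code:

// Sketch: one arbiter core per tile. A client asks its local arbiter
// before using the link and releases it afterwards.
void arbiter(chanend c0, chanend c1)
{
  while (1) {
    select {
      case c0 :> int req:
        c0 <: 1;         // grant exclusive use of the link
        c0 :> int done;  // wait for release before the next grant
        break;
      case c1 :> int req:
        c1 <: 1;
        c1 :> int done;
        break;
    }
  }
}

void client_use_link(chanend arb)
{
  arb <: 1;              // ask for the link
  arb :> int go;         // blocked until granted
  // ... do the cross-tile request/reply here ...
  arb <: 1;              // release
}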
ers35 wrote:Even with multiple networks, if there exist multiple servers, there is a possibility of deadlock if clients sending to different servers output at the same time. If possible, I want to eliminate deadlocks and have clients be queued in round-robin order. I think having more links only reduces the probability of a deadlock, but does not eliminate it.
Having as many links as servers does eliminate it -- if your communication
is deadlock-free to start with (i.e., if it would not deadlock on a network
with infinite buffers).
edschulz
Junior Member
Posts: 4
Joined: Sun May 13, 2012 7:00 pm

Post by edschulz »

segher wrote: The most general solution is to have an "arbiter" thread and channel
end on each side. To get permission to use the single link, you first
ask the arbiter _on your own side_. The arbiters can communicate
between themselves just fine, they know everything that goes on after
all.
I thought the hardware was designed to handle the arbitration automatically. See the Architecture manual quote at the start of this topic. It seems that in this example the two directions on the link are not truly independent. That is the confusing part.

If we had a single-tile device with a single off-chip link, we could use a hardware lock to acquire exclusive access to the link. But with two tiles on a device, hardware locks would not prevent a channel on the tile upstream of the single link from contending with a channel on the tile closest to the off-chip link. We can reduce the probability of deadlock, but not to zero. Is that accurate?

It seems a shame if a shared destination channel end cannot be used in a practical system. The hardware almost supports it, but not quite. Can someone at XMOS confirm our conclusion? Is there a reliable workaround other than Segher's arbiter logical core + channel end + software on each side?
segher
XCore Expert
Posts: 844
Joined: Sun Jul 11, 2010 1:31 am

Post by segher »

edschulz wrote:I thought the hardware was designed to handle the arbitration automatically. See the Architecture manual quote at the start of this topic.
That quote says (paraphrased) that when multiple source channel
ends all want to send to the same destination channel end, all of
them will be served. The paragraph right before this says you can
tie up network resources for as long as you like. Two different
things.
Jamie
Experienced Member
Posts: 99
Joined: Mon Dec 14, 2009 1:01 pm

Post by Jamie »

Hi ers35,

Back to your original post, the deadlock is due to the client's use of the 'pause' control token, and I think xsim isn't freeing the channel end properly when the pause is received. Running the same program on AXE (https://github.com/rlsosborne/tool_axe), it looks to be okay. Incidentally, changing the 'pause' to an 'end' also fixes things.

In general with shared channel ends, deadlock can occur when client requests queue up in the network. If the server needs to communicate to service a particular request but there is no available route in the network, it will be blocked indefinitely by its clients. To avoid this, a server must guarantee that every client request will always complete, so that all requests are eventually taken off the network and serviced.

Either a server must not engage in any communication when servicing a request, or, if it must do so, client requests have to be dealt with (i.e. taken off the network) as they are generated. There are a number of ways in which this can be done; one is sketched below.
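
As a rough sketch of the second option (my reconstruction, reusing the
three-token id encoding from the traces above; only the drain-and-queue
step is shown, and a real server would interleave it with servicing,
e.g. using events):

Code:

// Sketch: take every request off the network the moment it arrives and
// queue the client ids locally; reply (setd + outct END) only when ready.
#define MAXQ 8

void server_loop(unsigned c)   // c = shared server channel end id
{
  unsigned queue[MAXQ];
  unsigned count = 0;
  while (1) {
    unsigned b0, b1, b2;
    // Drain the whole request immediately so no client can block the
    // link behind it.
    asm("int %0, res[%1]" : "=r"(b0) : "r"(c));
    asm("int %0, res[%1]" : "=r"(b1) : "r"(c));
    asm("int %0, res[%1]" : "=r"(b2) : "r"(c));
    asm("chkct res[%0], 0x1" :: "r"(c));   // client's closing END
    if (count < MAXQ)
      queue[count++] = (b2 << 24) | (b1 << 16) | (b0 << 8) | 0x02;
    // ... service queued ids here: setd res[c], id; outct res[c], END ...
  }
}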

Jamie