How does the xCONNECT Link Layer work?

erlingrj
Member++
Posts: 16
Joined: Tue Aug 10, 2021 1:21 pm

Post by erlingrj »

Hi guys.

I am trying to really understand how xCONNECT works at the link and application layers. I am studying the xCONNECT Architecture document here: https://www.xmos.ai/download/xCONNECT-A ... e(1.0).pdf
However, I am not getting the complete picture. Hopefully some of you experts can help me out.

1. What steps are involved in sending a single END token from one chanend A to a remote chanend B? (outct res[r0], 0x1)
Is my understanding correct here?
- When an output instruction (outct) is executed, the hardware first creates a header, which propagates through the network and opens a link from one chanend to the other.
- The END output is still stalled at chanend A, since it has no credits. So the hardware first transmits a HELLO token to request credits, so that the END can be transmitted.
- The link layer at chanend B receives the HELLO and sends a CREDIT token back. This happens as long as there is space in the input buffer, i.e. chanend B doesn't have to execute an "inct" or a "chkct" for the CREDIT to be sent.
- Chanend A receives the CREDIT token and increments its credit counter, and finally the END token can be sent.
- The END token closes the circuit behind itself, so other headers waiting to propagate through the network can use the links.
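
The steps above can be sketched as a toy model in plain C. This only illustrates the credit idea as described in this post, not how the hardware is actually implemented: the struct, the function names and the buffer size of 8 tokens are all made up for the example.

```c
/* Toy model of credit-based flow control. All names and sizes are
   hypothetical; nothing here is taken from the xCONNECT specification. */
enum { RX_BUF_TOKENS = 8 };   /* assumed receive buffer size */

typedef struct {
    int credits;      /* sender side: tokens it may still transmit */
    int rx_free;      /* receiver side: free buffer slots not yet granted */
    int rx_buffered;  /* receiver side: tokens waiting to be read */
} link_model;

/* Receiver answers a HELLO by granting credit for all free buffer space.
   Note that no read (inct/chkct) is needed for this to happen. */
static int on_hello(link_model *l) {
    int grant = l->rx_free;
    l->rx_free = 0;
    return grant;             /* stands in for the CREDIT token(s) */
}

/* Sender receives CREDIT and increments its credit counter. */
static void on_credit(link_model *l, int n) {
    l->credits += n;
}

/* Sender tries to transmit one token: returns 1 if sent, 0 if stalled. */
static int try_send(link_model *l) {
    if (l->credits == 0)
        return 0;             /* no credits: the output stalls */
    l->credits--;
    l->rx_buffered++;
    return 1;
}

/* Receiver reads one token, which frees a buffer slot again. */
static int consume(link_model *l) {
    if (l->rx_buffered == 0)
        return 0;
    l->rx_buffered--;
    l->rx_free++;
    return 1;
}
```

With a link initialised as `{ 0, RX_BUF_TOKENS, 0 }`, `try_send` fails until `on_credit(&l, on_hello(&l))` has run, which is the stall-then-HELLO-then-CREDIT sequence I am asking about.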

2. How does the link layer handle the transmission of a single word between chanend TX and chanend RX?
Consider the two functions:

Code:

void tx(chanend chan_out)  {
	chan_out <: 0x13;
}

void rx(chanend chan_in) {
	int recv;
	chan_in :> recv;
}

This compiles to roughly this:

Code:

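/* TX */
outct res[r0], 0x1   /* send END (opening handshake) */
chkct res[r0], 0x1   /* wait for END from RX */
ldc r1, 0x13
out res[r0], r1      /* send the word */
outct res[r0], 0x1   /* send END (closing handshake) */
chkct res[r0], 0x1   /* wait for END from RX */

/* RX */
chkct res[r0], 0x1   /* wait for END from TX */
outct res[r0], 0x1   /* reply with END */
in r1, res[r0]       /* receive the word */
chkct res[r0], 0x1   /* wait for END from TX */
outct res[r0], 0x1   /* reply with END */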

What are the actual tokens being sent back and forth between TX and RX? If I understand the documentation correctly, then we have 2 "transactions" from TX->RX and 2 transactions from RX->TX.

- Transaction 1 TX->RX is the sending of a single END token. (This would then be preceded by header and then HELLO token and then a CREDIT token from RX)
- Transaction 1 RX->TX is also sending a single END token
- Transaction 2 TX->RX is sending the word (0x13) followed by an END token. (This would also be preceded by a header and a HELLO token and a received CREDIT token)
- Transaction 2 RX->TX is the sending of a single END token. Also preceded by header+HELLO+CREDIT

This seems quite inefficient, as it looks like the link is opened and closed twice (in both directions) for a single transfer.

3. How reliable is xCONNECT?
- What happens if a link-layer token is lost? E.g. the first CREDIT token sent upon receiving a HELLO token?
- In general, what happens if we get into a state where the credit counter and the credits-issued counter for two chanends don't match due to a packet loss?
- After reading the following document: https://www.xmos.ai/download/AN01024:-x ... .1rc1).pdf I have the following propositions:
3a: Link-layer token loss/corruption can be mitigated by both RX and TX using a timer in a select clause, with a timeout that can get us out of a deadlock (like a mismatch between the credit and credits-issued counters).
3b: Application-layer token loss/corruption can be mitigated by building a protocol on top of xCONNECT. Token loss can be addressed by sending synchronization tokens at evenly spaced intervals. Token corruption can be addressed with checksums.
Any other thoughts or references here?
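
For 3b, a minimal sketch in plain C of what such a protocol frame could look like. The frame layout, the 16-byte payload limit and the choice of a Fletcher-16 checksum are assumptions made for illustration; none of this is defined by xCONNECT or taken from the app note. A sequence number is used here to detect lost frames, as a stand-in for the periodic synchronization tokens mentioned above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical frame layout, for illustration only. */
typedef struct {
    uint8_t  seq;          /* incremented per frame, exposes lost frames */
    uint8_t  len;          /* number of valid payload bytes */
    uint8_t  payload[16];
    uint16_t checksum;     /* Fletcher-16 over seq, len and payload */
} frame;

typedef enum { FRAME_OK, FRAME_CORRUPT, FRAME_LOST } frame_status;

static uint16_t fletcher16(const uint8_t *data, size_t n) {
    uint16_t a = 0, b = 0;
    for (size_t i = 0; i < n; i++) {
        a = (uint16_t)((a + data[i]) % 255);
        b = (uint16_t)((b + a) % 255);
    }
    return (uint16_t)((b << 8) | a);
}

/* Checksum over the header bytes and the valid part of the payload.
   The caller must ensure f->len <= sizeof f->payload. */
static uint16_t frame_sum(const frame *f) {
    uint8_t buf[2 + sizeof f->payload];
    buf[0] = f->seq;
    buf[1] = f->len;
    memcpy(buf + 2, f->payload, f->len);
    return fletcher16(buf, 2u + f->len);
}

/* Sender side: fill in a frame. Returns -1 if the data does not fit. */
static int frame_build(frame *f, uint8_t seq, const uint8_t *data, uint8_t len) {
    if (len > sizeof f->payload)
        return -1;
    f->seq = seq;
    f->len = len;
    memcpy(f->payload, data, len);
    f->checksum = frame_sum(f);
    return 0;
}

/* Receiver side: check integrity first, then the expected sequence number. */
static frame_status frame_check(const frame *f, uint8_t expected_seq) {
    if (f->len > sizeof f->payload || f->checksum != frame_sum(f))
        return FRAME_CORRUPT;
    if (f->seq != expected_seq)
        return FRAME_LOST;     /* at least one frame went missing */
    return FRAME_OK;
}
```

The sender would push the frame fields through the channel and the receiver would call `frame_check` with the sequence number it expects next. What to do on `FRAME_CORRUPT` or `FRAME_LOST` (retransmit, resync, give up) is the part I am unsure about.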


4. In the xCONNECT Architecture Document Section 2.2 the following is stated:
5. If the header is blocked, then no more incoming traffic behind the header is
processed. This can deadlock the network in cases where large messages are
sent without knowing whether the inputting side is ready to accept the message.
To create a deadlock-free environment, sender and receiver should agree that
they are ready to exchange a large message by synchronizing using small,
empty, messages; these can always be buffered in the channel-ends concerned
and will hence not block the network. Channel ends can hold at least a word
and one token enabling, for example, an identifier and an end token to be
output unsolicited

There are several things I don't understand here.

4a) How can a large message be sent without the "inputting" side being ready? (I assume that inputting = reader; otherwise it doesn't make any sense.) Doesn't the receiving chanend have to issue CREDITs to the transmitting chanend before it can send any data?
4b) How does synchronizing with small messages help? If you send a header + END token, then the circuit is closed behind you and another chanend could get the opportunity to transmit data, maybe even to the same chanend you first contacted. So if you then send a big message, it will also be blocked. But I still don't understand how you can push a big message into the network without having CREDITs.

5. In the xCONNECT Architecture document, Section 3.2, we have the following paragraph:

The HELLO token is triggered by a write to the control register of the link. This
write can be triggered locally, or by a remote node. In the latter case, the link has
to be enabled and have credits. So one way to set up a bi-directional link is for the
local node to first trigger a HELLO locally, and then send a message to the remote
switch forcing it to say HELLO back. This initializes the credit counters on both
transmitters and both receivers in order to establish a bi-directional link

Again, there are several things I don't understand:
5a) I assume a "write to the control register of the link" is something like executing an "outct" or "out" instruction? Then how can this be triggered remotely?
5b) How can a local node send a message to the remote switch that forces it to send a HELLO back? (And why are we talking about switches? I assume the remote node is meant.)

Thanks a lot for any pointers here. My end goal is to achieve fault-tolerant and deterministic communication using xCONNECT.


CousinItt
Respected Member
Posts: 360
Joined: Wed May 31, 2017 6:55 pm

Post by CousinItt »

I'll try to answer a couple of your questions. I've been using XMOS devices for years and never had to bother with the low-level details of xCONNECT. Using interfaces and channels is easy enough, but maybe I should get my hands dirty sometime.

For question 3, generally the xCONNECT links have to be absolutely reliable. As far as I know there is zero fault tolerance or error checking. These things work more like on-chip interconnections than general bus systems. Accordingly, the signal integrity requirements are very high. I guess that if a token is lost, the whole system will eventually hang (as the blockages cascade back through the connecting processes). This is the default case.

It's possible to add some higher-level management to deal with a less reliable link, as I think the dynamic configuration app note allows, but that would involve some overhead and shouldn't be needed.

Post by CousinItt »

For Q4, I think maybe the idea could be expressed better. If there are various layers of management in the xCONNECT protocol, something could be sent at a higher level (which may result in something being sent physically over the link) if there's nothing in hardware to stop it. If the hardware supporting the protocol is as simple (=fast, small, inexpensive) as it can be, there may need to be some low-level software - presumably relying on the tiny amount of buffering available in the hardware - to manage whether more data can be exchanged. I'm not sure the documentation provides enough information to understand fully what's going on.