Many Client channel ends -> 1 server channel end

lilltroll
XCore Expert
Posts: 956
Joined: Fri Dec 11, 2009 3:53 am
Location: Sweden, Eskilstuna

Post by lilltroll »

"I'm a bit confused as to what you are trying to do with that code. The server shouldn't need to know the resource IDs of the clients in advance. The server shouldn't need to use testct().

The server needs to get hold of a channel end with a known resource ID. When you want to communicate with the server a client should allocate a channel end, set the destination to the known resource ID of the server and send a packet with some data.

If communication is one way (i.e. the client doesn't need a response) then this is the minimum that is required. The server should be a while(1) loop which receives and handles one packet from a client in each iteration of the loop. If the client needs a response from the server then the resource ID of the client channel end should be contained in the original request. The server should set the destination of its channel end to the resource ID specified in the request before sending the response."
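As a rough illustration of the request/response scheme described above, here is a small host-side C model (not XC, and not tested on hardware). It assumes the resource ID layout implied by `(ServerCore<<16)+0x102` and `getr %0,2` in the code later in this thread: core in the upper 16 bits, resource number in bits 15..8, and type 2 for a channel end. The names `chanend_id`, `request_t`, `make_request` and `reply_destination` are made up for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout (inferred from the thread's code, not from the manual):
 * bits 31..16 = core ID, bits 15..8 = resource number, bits 7..0 = type. */
#define RES_TYPE_CHANEND 0x2u

/* Build the resource ID of channel end `number` on core `core`. */
uint32_t chanend_id(uint32_t core, uint32_t number) {
    assert(core <= 0xFFFFu && number <= 0xFFu);
    return (core << 16) | (number << 8) | RES_TYPE_CHANEND;
}

/* A request that carries the client's own channel end ID so that the
 * server knows where to send the response. */
typedef struct {
    uint32_t reply_to; /* resource ID of the client channel end */
    uint32_t payload;  /* request data */
} request_t;

request_t make_request(uint32_t client_chanend, uint32_t payload) {
    request_t req;
    req.reply_to = client_chanend;
    req.payload = payload;
    return req;
}

/* The value the server would pass to setd before sending its response. */
uint32_t reply_destination(const request_t *req) {
    return req->reply_to;
}
```

With this layout, `chanend_id(3, 1)` gives `0x00030102`, which is the destination the clients in the listings below set for a server on core 3.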

You didn't say how to end the packet so that other packets can use the server. PAUSE or END? Sent by the client or by the server?
Do you need to send an END both ways to make the switch tear down the connection?


Probably not the most confused programmer anymore on the XCORE forum.
Jamie
Experienced Member
Posts: 99
Joined: Mon Dec 14, 2009 1:01 pm

Post by Jamie »

lilltroll wrote: Jamie, do you mean like this?

The console output is:
0, 2, 4, 6, 8, 10, 12, 14, END

i.e. it locked up after the first client.
I guess the crux of it is what happens to the two control tokens from the other clients? I would assume they would be delivered and would sit in the server's chanend buffer, but that doesn't seem to be what's happening.

Richard: but aren't you assuming that the token isn't consumed after the TESTCT?
lilltroll
XCore Expert
Posts: 956
Joined: Fri Dec 11, 2009 3:53 am
Location: Sweden, Eskilstuna

Post by lilltroll »

Hmm, a small delay helps,

plus a PAUSE at the end of each client packet, i.e. the server is not told.

But it doesn't work without the delay. Does the switch need some time to configure the 3 incoming routing requests before tokens start to stream?

Code: Select all

#define CT_END 1
#define CT_PAUSE 2
#define CT_ACK 3
#define CT_NACK 4
#define len 4

static inline
int TESTCT(unsigned Chanend) {
	int ret;
	asm("testct %0,res[%1]":"=r"(ret) : "r"(Chanend));
	return ret;
}
static inline
void OUTCT_END(unsigned Chanend) {
	asm("outct res[%0],%1" ::"r"(Chanend),"r"(CT_END));
}

void OUTCT_PAUSE(unsigned Chanend) {
	asm("outct res[%0],%1" ::"r"(Chanend),"r"(CT_PAUSE));
}

static inline
void OUTCHK_PAUSE(unsigned Chanend) {
	asm("chkct res[%0],%1" ::"r"(Chanend),"r"(CT_PAUSE));
}

static inline
void OUTCHK_END(unsigned Chanend) {
	asm("chkct res[%0],%1" ::"r"(Chanend),"r"(CT_END));
}
static inline
void SETD_CLIENT(unsigned Chanend, unsigned ServerCore) {
	asm("setd res[%0],%1" ::"r"(Chanend),"r"((ServerCore<<16)+0x102));
}
static inline
void SETD_SERVER(unsigned Chanend, unsigned ClientChannelend) {
	asm("setd res[%0],%1" ::"r"(Chanend),"r"(ClientChannelend));
}

static inline
unsigned GETR_CHANEND() {
	unsigned Chanend;
	asm("getr %0,2":"=r"(Chanend));
	return Chanend;
}

void client(unsigned sh, unsigned ServerCore, unsigned timedelay) {
	timer t;
	int time;
	unsigned Chanend;
	Chanend = GETR_CHANEND();
	// Delay before the first send (without it, packets go missing).
	t:>time;t when timerafter(time+100*timedelay):>time;
	SETD_CLIENT(Chanend, ServerCore);
	// Send 10 packets of len words; each packet is terminated with a PAUSE token.
	for(int k=0; k<10;k++){
		for (int i = 0; i < len; i++)
			asm("out res[%0],%1" ::"r"(Chanend),"r"(i<<sh));
		OUTCT_PAUSE(Chanend);
	}
}

void server() {
	unsigned Chanend;
	unsigned ClientChannelEnd;
	int data;
	Chanend = GETR_CHANEND();
	while (1) {
		for (int i = 0; i < len; i++) {
			asm("in %0,res[%1] ":"=r"(data):"r"(Chanend));
			printint(data);
			printstr(", ");
		}
		printstrln("");
	}
}

int main() {
	par {
		on stdcore[0]:client(1, 3, 10);
		on stdcore[1]:client(2, 3, 10);
		on stdcore[2]:client(3, 3, 10);
		on stdcore[3]:server();
	}
}
Console output:
0, 4, 8, 12,
0, 8, 16, 24,
0, 2, 4, 6,
0, 4, 8, 12,
0, 8, 16, 24,
0, 2, 4, 6,
0, 4, 8, 12,
0, 8, 16, 24,
0, 2, 4, 6,
0, 4, 8, 12,
0, 8, 16, 24,
0, 2, 4, 6,
0, 4, 8, 12,
0, 8, 16, 24,
0, 2, 4, 6,
0, 4, 8, 12,
0,
Bianco
XCore Expert
Posts: 754
Joined: Thu Dec 10, 2009 6:56 pm

Post by Bianco »

lilltroll wrote: You didn't say how to end the packet so that other packets can use the server. PAUSE or END? Sent by the client or by the server?
Do you need to send an END both ways to make the switch tear down the connection?

I think both PAUSE and END should work. With PAUSE the routing entry is kept.
You do not need to think of a channel end as a bidirectional line. When a typical "channel" is set up between threads in XC there are two physical channels, one for each direction, and they are independent. One could, for example, also set up a ring bus that runs in a single direction. So the server only needs to terminate the connection if it sends something back to the client, and it should use an END token because the destination will probably be changed when serving another client. The client will always need to terminate the connection.
lilltroll
XCore Expert
Posts: 956
Joined: Fri Dec 11, 2009 3:53 am
Location: Sweden, Eskilstuna

Post by lilltroll »

Yes, I use the PAUSE to keep it simple, but:

Code: Select all

/*
 * main.xc
 *
 *  Created on: 5 jan 2011
 *      Author: mikael
 */

#include <xs1.h>
#include <xclib.h>
#include <platform.h>
#include <print.h>

#define CT_END 1
#define CT_PAUSE 2
#define CT_ACK 3
#define CT_NACK 4
#define len 4

static inline
int TESTCT(unsigned Chanend) {
	int ret;
	asm("testct %0,res[%1]":"=r"(ret) : "r"(Chanend));
	return ret;
}
static inline
void OUTCT_END(unsigned Chanend) {
	asm("outct res[%0],%1" ::"r"(Chanend),"r"(CT_END));
}

void OUTCT_PAUSE(unsigned Chanend) {
	asm("outct res[%0],%1" ::"r"(Chanend),"r"(CT_PAUSE));
}

static inline
void OUTCHK_PAUSE(unsigned Chanend) {
	asm("chkct res[%0],%1" ::"r"(Chanend),"r"(CT_PAUSE));
}

static inline
void OUTCHK_END(unsigned Chanend) {
	asm("chkct res[%0],%1" ::"r"(Chanend),"r"(CT_END));
}
static inline
void SETD_CLIENT(unsigned Chanend, unsigned ServerCore) {
	asm("setd res[%0],%1" ::"r"(Chanend),"r"((ServerCore<<16)+0x102));
}
static inline
void SETD_SERVER(unsigned Chanend, unsigned ClientChannelend) {
	asm("setd res[%0],%1" ::"r"(Chanend),"r"(ClientChannelend));
}

static inline
unsigned GETR_CHANEND() {
	unsigned Chanend;
	asm("getr %0,2":"=r"(Chanend));
	return Chanend;
}

void client(unsigned sh, unsigned ServerCore) {
	timer t;
	int time;
	unsigned Chanend;
	Chanend = GETR_CHANEND();
	t:>time;t when timerafter (time+100):>time;
	SETD_CLIENT(Chanend, ServerCore);
	for(int k=0; k<3;k++){
		for (int i = 0; i < len; i++)
			asm("out res[%0],%1" ::"r"(Chanend),"r"((100*k)+(i*sh)));
		OUTCT_PAUSE(Chanend);
	}
}

void server() {
	unsigned Chanend;
	unsigned ClientChannelEnd;
	int data;
	Chanend = GETR_CHANEND();
	while (1) {
		for (int i = 0; i < len; i++) {
			asm("in %0,res[%1] ":"=r"(data):"r"(Chanend));
			printint(data);
			printstr(", ");
		}
		printstrln("");
	}
}

int main() {
	par {
		on stdcore[0]:client(1, 3);
		on stdcore[1]:client(2, 3);
		on stdcore[2]:client(3, 3);
		on stdcore[3]:server();
	}
}
This program produces:
0, 2, 4, 6,
0, 3, 6, 9,
0, 1, 2, 3,
100, 102, 104, 106,
100, 103, 106, 109,
100, 101, 102, 103,
200, 202, 204, 206,
200, 203, 206, 209,
200, 201, 202, 203,

That is 3x3 sets, which is what I want.
Now try decreasing "(time+100)" and you will get fewer messages :shock:
Does programming the switch take some time before it is ready to deliver?
That would not seem strange. If 4 cores deliver setup information to the switch at the same time, the switch will need a few instructions before the lookup table is set!? But how long must I wait to always be safe?
richard
Respected Member
Posts: 318
Joined: Tue Dec 15, 2009 12:46 am

Post by richard »

There is a race condition. Packets sent to a channel end that is not in use (not allocated by any thread) will be silently discarded. Therefore, if the server channel end is allocated after the clients start sending messages to it, the first few messages will be lost. Each core in your program will start executing code at a slightly different time, and it just so happens in this case that stdcore[3] is the last to start. Adding a pause before the clients start sending delays the first messages until after the server channel end is allocated, fixing the problem.

A pause is probably the easiest fix but it is somewhat fragile (the timing might change with different optimisation levels and new toolchain releases etc.). You could alternatively try adding some code to synchronise between cores after the server has allocated its channel end but before the clients start sending.
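To make the race concrete, here is a minimal host-side C model (a sketch only, not XC and not a model of the real switch timing). It assumes time advances in ticks, each client sends one packet per tick, and a packet arriving at an unallocated channel end is silently dropped. The names `chanend_model_t`, `deliver` and `simulate` are invented for this example.

```c
#include <assert.h>

/* Model of the server channel end: counts what arrives and what is lost. */
typedef struct {
    int allocated;     /* 0 until the server has allocated the chanend */
    unsigned received; /* packets delivered to the server */
    unsigned dropped;  /* packets that arrived before allocation */
} chanend_model_t;

/* A packet arriving at an unallocated channel end is discarded. */
void deliver(chanend_model_t *ce) {
    if (ce->allocated)
        ce->received++;
    else
        ce->dropped++;
}

/* Every client sends one packet per tick, starting at client_start_tick.
 * The server chanend becomes allocated at server_alloc_tick.
 * Returns the number of packets the server actually receives. */
unsigned simulate(unsigned server_alloc_tick, unsigned client_start_tick,
                  unsigned clients, unsigned packets_per_client) {
    chanend_model_t ce = {0, 0, 0};
    for (unsigned p = 0; p < packets_per_client; p++) {
        unsigned tick = client_start_tick + p;
        ce.allocated = (tick >= server_alloc_tick);
        for (unsigned c = 0; c < clients; c++)
            deliver(&ce);
    }
    assert(ce.received + ce.dropped == clients * packets_per_client);
    return ce.received;
}
```

In this model, three clients sending three packets each get all 9 through only if they start at or after the allocation tick; starting one tick early loses the first round of packets, which matches the "fewer messages" symptom described above.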
lilltroll
XCore Expert
Posts: 956
Joined: Fri Dec 11, 2009 3:53 am
Location: Sweden, Eskilstuna

Post by lilltroll »

Thanks R.

I will look into the sync instructions in ASM and try to learn and update the code. I have of course seen the compiler use them while debugging.
lilltroll
XCore Expert
Posts: 956
Joined: Fri Dec 11, 2009 3:53 am
Location: Sweden, Eskilstuna

Post by lilltroll »

@Richard

I need a little help with the sync.

I changed the server so that it "scans" to find the last Chanend (0x1F), so it will not interfere with other channels that the compiler adds.
I do not know the IDs of the clients' channel ends. I would like all of them to pause until the server is ready.

There is a lot of candy in chapters
10 Resources and the Thread Scheduler and 11 Concurrency and Thread Synchronisation in The XMOS XS1 Architecture.

Should I go for MSYNC and SSYNC?
richard
Respected Member
Posts: 318
Joined: Tue Dec 15, 2009 12:46 am

Post by richard »

lilltroll wrote: Should I go for MSYNC and SSYNC?

The MSYNC and SSYNC instructions are for synchronisation between threads on the same core. To synchronise between cores you need to use channel ends or something all cores have access to, like a pswitch register.

The following (untested!) code should work, but it is a bit wasteful in terms of channel ends allocated, which is presumably what you are trying to avoid with the use of many-to-one channels:

Code: Select all

int main()
{
  // Channels for synchronising between cores.
  chan c[3];
  par {
    on stdcore[0]:
    {
      c[0] :> int;
      client();
    }
    on stdcore[1]:
    {
      c[1] :> int;
      c[0] <: 1;
      client();
    }
    on stdcore[2]:
    {
      c[2] :> int;
      c[1] <: 1;
      client();
    }
    on stdcore[3]:
    {
      unsigned server_chan = allocate_server_chanend();
      c[2] <: 1;
      server(server_chan);
    }
  }
}
Alternatively, you could extend the server to support a ping request that sends a packet back to the sender. At startup each client could repeatedly send ping requests with a timeout until it gets a response.
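The ping idea can be sketched as a plain retry loop. The following is a host-side C model only (untested on hardware, and not XC): `ping_fn` stands for "send a ping and wait for a reply until a timeout expires", and `fake_server_t` and `fake_ping` are stand-ins invented here so the loop can be exercised without a real server.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback: send one ping and wait for a reply.
 * Returns 1 if a reply arrived before the timeout, 0 otherwise. */
typedef int (*ping_fn)(void *ctx);

/* Ping until the server answers or max_attempts is reached.
 * Returns the attempt number that succeeded, or -1 if none did. */
int wait_for_server(ping_fn ping, void *ctx, int max_attempts) {
    assert(ping != NULL);
    for (int attempt = 1; attempt <= max_attempts; attempt++) {
        if (ping(ctx))
            return attempt;
    }
    return -1;
}

/* Stand-in server: ignores the first pings_until_ready pings, modelling
 * pings that are discarded before the chanend is allocated. */
typedef struct {
    int pings_until_ready;
} fake_server_t;

int fake_ping(void *ctx) {
    fake_server_t *s = (fake_server_t *)ctx;
    if (s->pings_until_ready > 0) {
        s->pings_until_ready--;
        return 0; /* ping lost, the timeout expires */
    }
    return 1; /* server replied */
}
```

On the real device the callback would presumably send a request containing the client's channel end ID and select on the reply versus a timer. Bounding the number of attempts keeps a client from spinning forever if the server never starts.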
lilltroll
XCore Expert
Posts: 956
Joined: Fri Dec 11, 2009 3:53 am
Location: Sweden, Eskilstuna

Post by lilltroll »

First a question about the:

read_pswitch_reg
read_sswitch_reg

I single-stepped through the instructions to understand what is happening. They allocate a channel end, send some hardware tokens and privileged tokens, and then get some tokens back. After that a data token is returned with the value, and the chanend is freed.

Using pswitch I get, for example, this readback (from core 2):

Code: Select all

Val=0x33554944
Val=0x537134856
Val=0x1546
Val=0x0
Val=0x0
Val=0x0
Val=0x0
Val=0x0

Using sswitch I get:

Code: Select all

Val=0x66978304
Val=0x1049604
Val=0x0
Val=0x4
Val=0x0
Val=0x0
Val=0x16787200
Val=0x0
Val=0x3
Val=0x0
Val=0x0
...
What am I playing with here, and what can I use for my own purposes without causing a total run failure?

Also, in this example:

Code: Select all

int main() {
streaming chan c_console;
chan test[8];
...
Will c_console always get the thread ID==1 since it was declared first? I.e., does the compiler allocate chanends in the order in which I have declared them?