Synchronizing tasks with MSYNC SSYNC

fabriceo
XCore Addict
Posts: 213
Joined: Mon Jan 08, 2018 4:14 pm

Post by fabriceo »

Hello,
I could not find a simple example or explanation of how to create tasks and synchronize their behavior, apart from an assembly source file in XMOS's TensorFlow library on GitHub:
https://github.com/xmos/lib_tflite_micr ... all.S#L615

According to the architecture documentation, the MSYNC and SSYNC instructions do this by referring to a "synchronizer", which is allocated when a master task is created. But how do we get access to the synchronizer created by the "par" statement?

Googling a bit, I found a patent that is probably related to the XMOS architecture:
https://patentimages.storage.googleapis ... 7169A1.pdf
but reading it will give you a big headache!

My use case is the USB Audio app. On tile 0 (xu216) we have the audio-i2s task and one master DSP task, which is triggered by audio-i2s when a new sample is received from decouple. I'd like this "master" DSP task to then start three other "slave" DSP tasks at the same time.

I guess I need to create the four DSP tasks, have the three slaves simply execute SSYNC, and have the master execute MSYNC on the synchronizer, but it is not clear how to get access to this damn synchronizer.


Thanks in advance.
Last edited by fabriceo on Sat May 11, 2024 1:03 pm, edited 1 time in total.



Post by fabriceo »

Okay, I've got the solution, after reverse engineering the assembly code.

Assuming a declaration like this:

Code:

void task1(){
    par {
        a();    //slave tasks
        b();    //slave tasks
        c();    //master task with synchronizer in R5
    }
    printf("all finished\n");
}

int main(){
    task1();
    return 0;
}

The compiler creates a "par descriptor" containing the address of a() and its stack size, the address of b() and its stack size, and then a 0 followed by the address of c(), which is treated as the master of a() and b().
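
To make that layout easier to picture, here is a minimal host-side sketch in plain C. It only models the description above and is not the real XTC data structure: the type names `slave_entry` and `par_desc_model`, the helper `count_slaves`, the stack size values and the choice to model the terminating 0 as an entry with a NULL address are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*task_fn)(void);

/* Hypothetical model of one slave entry: entry point plus stack size. */
typedef struct {
    task_fn  entry;        /* NULL models the 0 that ends the slave list */
    unsigned stack_words;  /* unit is an assumption; the post only says "stack size" */
} slave_entry;

/* Hypothetical model of the whole descriptor. */
typedef struct {
    slave_entry slaves[3]; /* a(), b(), then the terminating 0 */
    task_fn     master;    /* c(), run by the thread that executes the par */
} par_desc_model;

/* Empty stand-ins for the three tasks of the par statement. */
void a(void) {}
void b(void) {}
void c(void) {}

/* Stack sizes are made-up placeholder values. */
const par_desc_model example_desc = {
    { { a, 64 }, { b, 64 }, { NULL, 0 } },
    c
};

/* Count the slave entries that precede the terminating 0. */
size_t count_slaves(const par_desc_model *d)
{
    size_t n = 0;
    assert(d != NULL);
    while (d->slaves[n].entry != NULL)
        n++;
    return n;
}
```

For the three-task par above, `count_slaves(&example_desc)` gives 2 and `example_desc.master` is `c`, which matches the split between slave tasks and master task described in the post.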

Then, at the beginning of task1, it calls a library function:

Code:

	ldaw r1, dp[par.desc.1]
	ldc r0, 0
	bl __start_other_cores

Working out what happens inside __start_other_cores is not too difficult:
a synchronizer is requested and stored in register r5,
then the two tasks, a() and b(), are created against this synchronizer,
then the instruction msync res[r5] is executed, which starts a() and b(),
and immediately afterwards c() is called.
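
The handshake behind these steps can be modelled on a host in plain C. This is a single-threaded sketch under assumptions, not xcore code: it encodes the reading that newly created slaves start out paused, that msync releases the slaves once all of them are paused (and otherwise makes the master wait), and that ssync pauses the calling slave. The names `sync_model`, `sync_create`, `model_msync` and `model_ssync` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Single-threaded model of one synchronizer; not real xcore code. */
typedef struct {
    int  slaves;          /* slave threads bound to the synchronizer */
    int  paused;          /* slaves currently paused (new, or in ssync) */
    bool master_waiting;  /* master is blocked inside msync */
} sync_model;

/* Models allocation plus slave creation: new slaves start out paused. */
sync_model sync_create(int slaves)
{
    sync_model s = { slaves, slaves, false };
    return s;
}

/* Master executes msync. Returns true if the master continues at once. */
bool model_msync(sync_model *s)
{
    assert(!s->master_waiting);
    if (s->paused == s->slaves) {   /* every slave is waiting: release all */
        s->paused = 0;
        return true;
    }
    s->master_waiting = true;       /* otherwise the master pauses */
    return false;
}

/* One running slave executes ssync. Returns true if this call released
   everybody (the master was already waiting and this was the last slave). */
bool model_ssync(sync_model *s)
{
    assert(s->paused < s->slaves);
    s->paused++;
    if (s->master_waiting && s->paused == s->slaves) {
        s->paused = 0;
        s->master_waiting = false;
        return true;
    }
    return false;                   /* slave stays paused until msync */
}
```

With two slaves, the sequence in this thread corresponds to `sync_create(2)`, then a first `model_msync` that returns true (a() and b() start), then one `model_ssync` per slave, and finally a second `model_msync` issued from c() that releases both slaves again.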

The trick is simply to read r5 right at the beginning of c(), and then we can use the synchronizer ourselves.
Here is a complete working example:

Code:

#include <platform.h>
#include <stdio.h>

#ifdef XSCOPE
#include <xscope.h>
void xscope_user_init()
{   xscope_register(0, 0, "", 0, "");
    xscope_config_io(XSCOPE_IO_BASIC);  }   // or XSCOPE_IO_TIMED
#endif

void a(){
    printf("a before sync\n");
    asm volatile("ssync":::"memory");
    printf("a after sync\n");
}

void b(){
    printf("b before sync\n");
    asm volatile("ssync":::"memory");
    printf("b after sync\n");
}

void c(){
    int sync;
    // read r5 first: __start_other_cores leaves the synchronizer there
    asm volatile("mov %0,r5 ":"=r"(sync));
    printf("c sync = 0x%x\n",sync);
    delay_ticks(10000);
    printf("c after delay 100us\n");
    // release a() and b(), which are paused on ssync
    asm volatile("msync res[%0]"::"r"(sync));
    delay_ticks(10000);
    printf("c after delay 100us and msync\n");
}
void task1(){
    par {
        a();    //slave tasks
        b();    //slave tasks
        c();    //master task owning synchronizer
    }
    printf("all finished\n");
}

int main(){
    task1();
    return 0;
}

Obviously, readers should double-check that the library shipped with their XTC version matches this approach. This example runs perfectly with XTC 14.4.1 and was tested on an xu216 with the compilation flag -O1.

Hope this helps; at least it resolves an old topic:
https://www.xcore.com/viewtopic.php?t=7999

cheers
Last edited by fabriceo on Sat May 11, 2024 1:06 pm, edited 2 times in total.

Post by fabriceo »

Well, it is important to use the "memory" clobber with ssync; otherwise the compiler's optimizer reorders instructions and may move some of them above the ssync :) (tested!)

Post by fabriceo »

Also, again because of compiler optimization, register r5 has to be saved at the very beginning, and sometimes the compiler schedules other instructions first, thus losing the original value of r5...

One way I solved this was to declare the synchronizer as a global variable and to combine the following assembly with the master task inside the par statement, like this:

Code:

int     dspSynchronizer;

void task1(){
    par {
        a();    //slave tasks
        b();    //slave tasks
        { asm volatile("stw r5,dp[dspSynchronizer]":::"memory"); 
          c();  }  //master task owning synchronizer
    }
    printf("all finished\n");
}
Ross
XCore Expert
Posts: 972
Joined: Thu Dec 10, 2009 9:20 pm
Location: Bristol, UK

Post by Ross »

This is some nice hacking. Is the sync always stored in r5? I've not checked.

Post by fabriceo »

Hi Ross,
yes, in the assembly code of the library function "__start_other_cores", r5 is always the synchronizer, no problem. The only thing to be careful about is the prologue of the master task, which must leave r5 at its original value.
After carefully checking what the compiler does at -O3, I'm now using this in a commercial product.

You might pass this on to the dev team so that they can offer a more convenient way to do this.

thanks