a Cooperative Scheduler Arduino style

fabriceo
XCore Addict
Posts: 168
Joined: Mon Jan 08, 2018 4:14 pm

Post by fabriceo »

Hi

I developed the SCoop library for the Arduino platform some years ago, described here:
https://forum.arduino.cc/index.php?topic=137801.0
and I was missing an easy cooperative scheduler to run multiple cpp tasks on XMOS.
My intent is to manage some sort of front panel and I/O within the USB application, in an Arduino style.

Here is an xc demo application that uses the XCScheduler.cpp library file to provide a cooperative scheduler. It works in addition to, and is compatible with, the inherent multitasking capability of the XMOS architecture. It uses setjmp/longjmp and is 80% based on the original code from Mikael Patel here: https://github.com/mikaelpatel/Arduino-Scheduler
All credit to him for this very smart approach.

The main.xc program declares 2 tasks in a par statement, one per core.
One core just prints a message every 5 seconds using a blocking delay function.
The other core:
- initializes the XCScheduler with a certain amount of stack space for all the cpp tasks
- launches the initialization of multiple tasks written in cpp
- allocates a specific stack space for each of them
- orchestrates and monitors a simple context switch between the tasks with a round-robin approach; this is done in the default branch of a select statement.
Each cpp task is written as blocking code and simply yields to the other tasks when it does not need the cpu.
A sleepMs function provides a simple delay(ms) with an embedded yield call.
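
The structure described above (one private stack per task, blocking task bodies that call yield, a round-robin loop that resumes each task in turn) can be sketched on a host PC as follows. This is only an illustration under assumptions: it uses the POSIX ucontext calls instead of the setjmp/longjmp mechanism used by XCScheduler, it is not XMOS code, and the names task_create, scheduler_cycle, run_demo and the stack size are hypothetical, not part of the library.

```c
#define _XOPEN_SOURCE 600   /* needed on some systems to expose ucontext */
#include <assert.h>
#include <string.h>
#include <ucontext.h>

#define MAX_TASKS   4
#define STACK_BYTES (64 * 1024)

typedef void (*task_fn)(void);

static ucontext_t main_ctx;                          /* the scheduler loop      */
static ucontext_t task_ctx[MAX_TASKS];               /* saved state of each task */
static char       task_stack[MAX_TASKS][STACK_BYTES];/* one private stack each   */
static task_fn    task_entry[MAX_TASKS];
static int        task_done[MAX_TASKS];
static int        task_count = 0;
static int        current = -1;                      /* task being executed      */

/* Called by a task: save its state and go back to the scheduler loop.
   The task continues right after this call on its next turn. */
static void yield(void)
{
    swapcontext(&task_ctx[current], &main_ctx);
}

/* First function executed on a new task stack. */
static void trampoline(void)
{
    task_entry[current]();
    task_done[current] = 1;   /* returning resumes main_ctx through uc_link */
}

static int task_create(task_fn fn)
{
    assert(task_count < MAX_TASKS);
    int id = task_count++;
    getcontext(&task_ctx[id]);
    task_ctx[id].uc_stack.ss_sp   = task_stack[id];
    task_ctx[id].uc_stack.ss_size = STACK_BYTES;
    task_ctx[id].uc_link          = &main_ctx;
    task_entry[id] = fn;
    task_done[id]  = 0;
    makecontext(&task_ctx[id], trampoline, 0);
    return id;
}

/* One round-robin cycle: every live task runs until its next yield().
   Returns the number of tasks that are still alive afterwards. */
static int scheduler_cycle(void)
{
    int alive = 0;
    for (int i = 0; i < task_count; i++) {
        if (task_done[i]) continue;
        current = i;
        swapcontext(&main_ctx, &task_ctx[i]);
        if (!task_done[i]) alive++;
    }
    current = -1;
    return alive;
}

/* Demo: two "blocking" tasks that record their turns in a trace string. */
static char trace[32];
static int  trace_len;

static void emit(char c) { if (trace_len < 31) trace[trace_len++] = c; }

static void task_a(void) { for (int i = 0; i < 3; i++) { emit('A'); yield(); } }
static void task_b(void) { for (int i = 0; i < 2; i++) { emit('B'); yield(); } }

static const char *run_demo(void)
{
    memset(trace, 0, sizeof trace);
    trace_len = 0;
    task_create(task_a);
    task_create(task_b);
    while (scheduler_cycle() > 0) { /* in the xc demo this is the select default branch */ }
    return trace;
}
```

Calling run_demo() once returns the trace "ABABA": the two tasks alternate at each yield, and task_a gets the last turn alone once task_b has returned.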

A more robust and cleaner library will most probably be published on my GitHub and announced in a later post here, but I wanted to share this interesting outcome now. I hope it is worth your reading.

Remarks:
- A #pragma stackfunction is needed, and the amount of memory reserved should be greater than the sum of all task stacks (2500 words in this demo).
- With this implementation spread between xc and cpp, there are no linker errors due to missing stack information or to name mangling. All seems OK.
- The xcore task1 can either just yield or execute a loop() in the cpp code, depending on whether we want a circular or a sequential approach. The second one makes it possible to measure the time of one complete round-robin cycle, which is 2 us in this example. Any hard blocking timer within a task will increase the cycle time.
- The demo only uses printf to stdio, rerouted through xscope, so an xtag is needed.

Don’t forget to change the platform .xn file to suit your dev board. This app is running on a DXIO board bought at diyinhk.com
app_demoScheduler.zip

Post by fabriceo »

Hi,
There were 19 downloads as of October 8th. Any feedback or suggestions?
Cheers

Post by fabriceo »

Hi,
The scheduler is hosted on my XMOS GitHub project, but I haven't had time to make examples. You're welcome to fork it and propose a merge:
https://github.com/fabriceo/XMOS

Post by fabriceo »

Hello,
Some years ago I proposed a cooperative scheduler based on a cpp library made by a peer.
But I was not fully satisfied with it, so I used some spare time to write another one (it is like sudoku, sometimes you just need that).
It is very simple (2 functions) and very easy to use.

Just include "xcscheduler.h",
then create a task with a macro, and call yield() as much as possible from the main XC task and from the created tasks.

The example given creates 100 tasks. Each one requests a random number and increments its own entry in an array if the number corresponds to the task number. The process stops after 5000 counts per task, so after around 50 million random numbers.
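
To see where the 50 million figure comes from, here is a small host-side model of that workload in plain C. It is a sketch under assumptions, not the example shipped with the library: each "turn" stands for the work a task does between two yields, the random source is a simple xorshift32 generator, and the names next_random, task_turn and run_workload are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_TASKS   100u
#define TARGET_HITS 5000u

static uint32_t rng_state = 2463534242u;   /* any non-zero seed works */

/* xorshift32 pseudo-random generator (stand-in for the real random source). */
static uint32_t next_random(void)
{
    assert(rng_state != 0);
    uint32_t x = rng_state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    rng_state = x;
    return x;
}

static unsigned hits[NUM_TASKS];           /* one counter per task */

/* Work done by task `id` between two yields: draw one number and count a
   hit if it matches the task number. Returns 1 while the task must go on. */
static int task_turn(unsigned id)
{
    if (next_random() % NUM_TASKS == id)
        hits[id]++;
    return hits[id] < TARGET_HITS;
}

/* Round-robin over all tasks until each has reached TARGET_HITS.
   Returns the total number of random numbers drawn. */
static uint64_t run_workload(void)
{
    int      running[NUM_TASKS];
    unsigned active = NUM_TASKS;
    uint64_t draws  = 0;

    for (unsigned id = 0; id < NUM_TASKS; id++) {
        hits[id]    = 0;
        running[id] = 1;
    }
    while (active > 0) {
        for (unsigned id = 0; id < NUM_TASKS; id++) {
            if (!running[id]) continue;
            draws++;
            if (!task_turn(id)) {
                running[id] = 0;
                active--;
            }
        }
    }
    return draws;
}
```

A draw matches a given task with probability 1/100, so each hit costs about 100 draws on average, and 100 tasks x 5000 hits x 100 draws gives roughly 50 million draws in total.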

In this version, tasks are allocated with malloc() and are released with free() if they end or return, but a task should preferably be an infinite loop, to avoid heap fragmentation of course. The stack size needed is automatically extracted from the compiler output (nstackwords).

The yield() code is written in assembly and is very easy to read if you want to look at it.
There are 2 options: either the task switch always comes back to the main XC thread before moving to the next task, or it goes through the whole list, round-robin style (the default, which can be changed by commenting out line 49 in the .s file).

The library maintains a list of cooperative tasks per logical core, so each core can yield and share its time among its own tasks without impacting other tasks.

The pros and cons of a cooperative scheduler both come from the fact that the program switches (yields) to the other tasks under program control, and not on a priority-based timer slice as in a preemptive RTOS. This is an easy way to create multiple independent tasks in a program to handle multiple contexts, for example a serial protocol, a rotary encoder and front-panel switches, an IR remote input or menu logic on a display, instead of having multiple re-entrant functions launched from a single select. It might make things easier than a complex state machine. Also, concurrent access is no longer a problem: you almost never need to "lock" a piece of code, as no other task will enter it unless a yield is present.
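
The point about locking can be checked with a small host-side experiment. The sketch below is an assumption-laden illustration, not library code: it uses POSIX ucontext to switch stacks, and the names safe_task, unsafe_task and run_pair are hypothetical. Two tasks increment a shared counter; the update is only lost when a yield sits between the read and the write.

```c
#define _XOPEN_SOURCE 600   /* needed on some systems to expose ucontext */
#include <assert.h>
#include <ucontext.h>

#define NUM_TASKS   2
#define ITERATIONS  1000
#define STACK_BYTES (64 * 1024)

static ucontext_t sched_ctx, task_ctx[NUM_TASKS];
static char       stacks[NUM_TASKS][STACK_BYTES];
static int        finished[NUM_TASKS];
static int        current;
static long       shared;          /* data touched by both tasks */

static void yield(void) { swapcontext(&task_ctx[current], &sched_ctx); }

/* Read-modify-write with no yield inside: nobody else can run in between. */
static void safe_task(void)
{
    for (int i = 0; i < ITERATIONS; i++) {
        long tmp = shared;
        shared = tmp + 1;
        yield();
    }
    finished[current] = 1;
}

/* Same update, but a yield splits it: the other task runs in the middle
   and both tasks then write back the same stale value. */
static void unsafe_task(void)
{
    for (int i = 0; i < ITERATIONS; i++) {
        long tmp = shared;
        yield();
        shared = tmp + 1;
        yield();
    }
    finished[current] = 1;
}

/* Run NUM_TASKS copies of `body` round-robin and return the final counter. */
static long run_pair(void (*body)(void))
{
    shared = 0;
    for (int i = 0; i < NUM_TASKS; i++) {
        finished[i] = 0;
        getcontext(&task_ctx[i]);
        task_ctx[i].uc_stack.ss_sp   = stacks[i];
        task_ctx[i].uc_stack.ss_size = STACK_BYTES;
        task_ctx[i].uc_link          = &sched_ctx;  /* where to go on return */
        makecontext(&task_ctx[i], body, 0);
    }
    int alive = NUM_TASKS;
    while (alive > 0) {
        alive = 0;
        for (current = 0; current < NUM_TASKS; current++) {
            if (finished[current]) continue;
            swapcontext(&sched_ctx, &task_ctx[current]);
            if (!finished[current]) alive++;
        }
    }
    return shared;
}
```

With these settings run_pair(safe_task) returns 2000, while run_pair(unsafe_task) returns 1000, because every increment of one task is overwritten by the other. So the rule of thumb is to keep a read-modify-write of shared data on one side of a yield.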

Some helpers are also provided to manage timers and to test for channel data presence.

Essentially for fun and experimentation with an augmented XMOS :)

fabriceo
Ross
XCore Expert
Posts: 962
Joined: Thu Dec 10, 2009 9:20 pm
Location: Bristol, UK

Post by Ross »

Thanks for sharing! Is this on GitHub somewhere?

Post by fabriceo »

susnak
Member
Posts: 13
Joined: Fri Apr 12, 2019 1:01 pm

Post by susnak »

Hi fabriceo,
I'll give your scheduler a try in my next project.
Will it integrate well with the xc select block? Right now, I have a task/core using [[notification]] and timer cases, and I am doing "big loop multitasking" in the timer case. One of the tasks uses a pretty awful state machine, which I'd like to simplify.

Post by fabriceo »

Hi
You're welcome to try it and share feedback :)
It's easy to integrate, as it uses the compiler's stack calculator and doesn't require setting priorities.
Your cooperative tasks can also be written in C++ if you like (they just need an extern "C" to make them visible to the linker).

For sure, the goal is to simplify state machines by spreading algorithms across multiple cooperative tasks, which is especially interesting for UI or low-speed protocols (like Modbus or MIDI).
Cooperative tasks should still be infinite loops, but they can absolutely block by using yieldDelay(xxx), for example to check for timeout situations without creating a new state in an FSM, and thus with no need to come back to their main loop.
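
As an illustration of why such a blocking delay stays cooperative, here is a minimal host-side sketch of a delay built on top of yield. It is not the library's implementation: the timer is simulated, the yield stub simply lets 7 ticks pass per call, the name yield_delay and the tick unit are assumptions, and only the pattern (poll a free-running 32-bit timer, yield while waiting) is the point.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t fake_timer;    /* simulated free-running 32-bit tick counter */
static unsigned yield_calls;   /* instrumentation for the demo only          */

static uint32_t timer_now(void) { return fake_timer; }

/* Stand-in for the real yield(): the other tasks run, so time passes.
   Here each call is assumed to cost 7 ticks. */
static void yield(void)
{
    yield_calls++;
    fake_timer += 7;
}

/* Block the calling task for at least `ticks`, giving the cpu away meanwhile.
   The unsigned subtraction keeps the comparison valid across timer wrap. */
static void yield_delay(uint32_t ticks)
{
    uint32_t start = timer_now();
    while ((uint32_t)(timer_now() - start) < ticks)
        yield();
}
```

Starting with fake_timer at 0xFFFFFFF0, yield_delay(100) performs 15 yields and returns with the timer at 89, having crossed the 32-bit wrap without any special case. A timeout check in a task body is then just a straight-line call, with no extra FSM state.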

I recommend that you keep your XC select statement as it is today and just add a "default" at the end that calls yield().
This will pass all unused MIPS to the cooperative tasks without impacting anything in your original code.
You can then progressively migrate the functions from your "big loop multitasking" to specific cooperative tasks, which will live by themselves by means of yield().
You just need to be aware that the "default: yield(); break;" may take a few tens of microseconds, which will inherently delay the response time of your other cases.
This strictly depends on how often your cooperative tasks themselves call yield(), which is why I recommend using yield() or yieldDelay(10) as much as possible.
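
A rough way to reason about that added delay is to budget the longest run of each cooperative task between two yields. The sketch below is only a model with made-up numbers and hypothetical function names: in round-robin mode a yield() from the select default comes back after every task has had one turn, so the worst case is the sum of the slices; if the switch always returns to the main XC thread, it is the largest single slice.

```c
#include <assert.h>
#include <stddef.h>

/* slice_us[i] = longest time task i runs between two yield() calls. */

/* Round-robin mode: main regains control after all tasks ran once. */
static unsigned worst_pass_round_robin_us(const unsigned *slice_us, size_t n)
{
    unsigned total = 0;
    for (size_t i = 0; i < n; i++)
        total += slice_us[i];
    return total;
}

/* Return-to-main mode: main regains control after a single task ran. */
static unsigned worst_pass_return_to_main_us(const unsigned *slice_us, size_t n)
{
    unsigned longest = 0;
    for (size_t i = 0; i < n; i++)
        if (slice_us[i] > longest)
            longest = slice_us[i];
    return longest;
}
```

For example, with four tasks whose slices are 5, 12, 3 and 8 us, an event arriving just after the default branch was taken waits up to 28 us in round-robin mode and up to 12 us in return-to-main mode, before the switching overhead is counted.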

cheers

Post by fabriceo »

Hello
Today I succeeded in integrating this cooperative scheduler into the Endpoint0 task used in the USB Audio software (tested on v7.3.1 with xmake, XTC 15).

XUA_Endpoint0() is in fact always waiting for a setup packet from the host, as we can see in the source code at the end of the file xua_endpoint0.c:

Code:

    while(1)
    {
        /* Returns XUD_RES_OKAY for success, XUD_RES_RST for bus reset */
        XUD_Result_t result = USB_GetSetupPacket(ep0_out, ep0_in, &sp);
        XUA_Endpoint0_loop(result, sp, c_ep0_out, c_ep0_in, c_audioControl, c_mix_ctl, c_clk_ctl, c_EANativeTransport_ctrl, dfuInterface VENDOR_REQUESTS_PARAMS_);
    }

The function USB_GetSetupPacket calls XUD_GetSetupBuffer, which blocks while waiting for data or a token on the channel associated with ep0_out.
Putting just one line of code before the asm volatile("testct %0, res[%1]") statement was enough to give this waiting time to the cooperative tasks:

Code:

XUD_Result_t XUD_GetSetupBuffer(XUD_ep e, unsigned char buffer[], unsigned *datalength)
{
...
    /* Mark EP as ready for SETUP data */
    unsigned * array_ptr_setup = (unsigned *)ep->array_ptr_setup;
    *array_ptr_setup = (unsigned) ep;

    /* This line reroutes the cpu to the cooperative scheduler until data is present on the channel */
    while ( ! XCStestChan( ep->client_chanend ) )  yield();

    /* Wait for XUD response */
    asm volatile("testct %0, res[%1]" : "=r"(isReset) : "r"(ep->client_chanend));

    if(isReset)
    {
        return XUD_RES_RST;
    }
...
}

To enable this more easily, without changing XUD_EpFunctions.c ourselves, I will suggest an enhancement for lib_xud on the XMOS GitHub: insert a call to a weak function named XUD_UserYieldChanend just before the asm volatile testct. It could be part of XUD_user.c, for example.
This function could also be called from XUD_GetBuffer_Finish and XUD_SetBuffer_Finish, which also wait for channel presence.

Then we just have to create a .c file somewhere in the app folder with our cooperative tasks and declare:

Code:

/* Non-static, with an explicit return type, so the linker can use it in place of the weak default */
void XUD_UserYieldChanend(unsigned ch) { yieldChannend(ch); }

The cooperative tasks will then use the spare time of endpoint0.
We still have to be careful not to block XUD_manager(), by keeping response times short, but we can also count on the inherent channel buffering. More tests will be done on this later.

Here is an updated zip file containing a demo_ep0_tasks.c in the examples folder, tested on the XCore.ai dev kit.
LED3 blinks, and pressing buttons 0 and 1 lights the corresponding LEDs, all from the Endpoint0 core (on tile 0).
You need to patch XUD_EpFunctions.c with XUD_UserYieldChanend as explained; the new file is provided here as an example. The same goes for XUD_user.c.

I will create a specific repo on GitHub tomorrow.
Last edited by fabriceo on Sun Dec 03, 2023 12:29 pm, edited 1 time in total.

Post by fabriceo »

Dedicated repo now available:
https://github.com/fabriceo/lib_XCScheduler