FreeRTOS port

akp
XCore Expert
Posts: 580
Joined: Thu Nov 26, 2015 11:47 pm

Post by akp »

Changing the kernel assembly to dual issue on the XS2A seems to reduce the context switch time by roughly 15%.
akp
XCore Expert
Posts: 580
Joined: Thu Nov 26, 2015 11:47 pm

Post by akp »

Not surprisingly, rewriting the kernel assembly in dual-issue yields an improvement.
fabriceo
XCore Addict
Posts: 222
Joined: Mon Jan 08, 2018 4:14 pm

Post by fabriceo »

Hi, see this RTOS stack implementation for FreeRTOS for VocalFusion. Very active GitHub branch:
https://github.com/xmos/lib_rtos_support
akp
XCore Expert
Posts: 580
Joined: Thu Nov 26, 2015 11:47 pm

Post by akp »

Thanks, I will take a look.
akp
XCore Expert
Posts: 580
Joined: Thu Nov 26, 2015 11:47 pm

Post by akp »

I took a look at the FreeRTOS port and I wonder whether it supports dual-issue mode. It doesn't look like it would work if the FreeRTOS core were running dual-issue code; it might generate an exception when it returns from the kcall. With respect, the kernel assembly I wrote is faster (it takes advantage of dual-issue features to perform the context switch more quickly) and it supports context switching a FreeRTOS core running dual-issue code. The examples are compiled with -Os, which means they're tested in single-issue mode.

EDIT: Obviously the big advantage of the XMOS FreeRTOS port is that it supports SMP whereas I can run only one FreeRTOS core per tile. So I am not pooh-poohing it. I just don't have a need for SMP FreeRTOS at present so optimizing the single core FreeRTOS for speed seemed to be a better option for me, leaving more MIPS and cores for xc tasks which is where most of my stuff gets done (e.g. time critical stuff or co-operative multitasking using combinable tasks).
mbruno
Member
Posts: 11
Joined: Thu Aug 24, 2017 2:48 pm

Post by mbruno »

Hi akp,

Thanks for these insights. I have updated our kernel assembly code to support yields from either single or dual issue code. I will update the task context switch code to utilize dual issue as well. Hopefully I will have this released publicly within a few days. I will post here again when it is ready.

Note that we have a single core port without the SMP kernel modifications here:
https://github.com/xmos/FreeRTOS/tree/release/xcore

This single core port will likely soon be integrated into the official FreeRTOS repository, so if you have any more suggestions let me know.

Thanks,
Mike
mbruno
Member
Posts: 11
Joined: Thu Aug 24, 2017 2:48 pm

Post by mbruno »

Hi akp,

I am wondering what exactly you rewrote in dual issue mode to achieve the 15% context switch speed up. The majority of the context switch assembly is a series of stw/ldw instructions which cannot be dual issued. The rest is primarily the call to vTaskSwitchContext which is written in C in the FreeRTOS file tasks.c, so this can be dual issued by compiling with -mdual-issue, but will not be hand optimized. So I have been able to set it up so that dual issue mode is enabled upon kernel entry, and I have everything compiled and assembled and running successfully with dual issue mode enabled everywhere. It just doesn't look like much, if any, of the kernel port assembly code can be sped up by rewriting it to take advantage of dual issue mode.

Are you using the configUSE_PORT_OPTIMISED_TASK_SELECTION option? We do have this on in our single core port (though not in SMP). This should reduce context switch time as well.
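
For readers unfamiliar with configUSE_PORT_OPTIMISED_TASK_SELECTION: when it is enabled, FreeRTOS lets the port pick the highest-priority ready task itself, typically by keeping one bit per priority level and using a count-leading-zeros operation instead of walking the ready lists. A minimal sketch of that idea follows. The function and type names are hypothetical (they are not the port's actual macros), and `__builtin_clz` is assumed to be available, as it is in GCC and Clang.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for illustration; this is not the XMOS port's code. */

/* One bit per priority level: bit n set means priority n has a ready task. */
typedef uint32_t ready_mask_t;

/* Mark a priority level as having at least one ready task. */
static inline void record_ready_priority(ready_mask_t *mask, unsigned priority)
{
    assert(priority < 32u);
    *mask |= (UINT32_C(1) << priority);
}

/* Mark a priority level as having no ready tasks. */
static inline void reset_ready_priority(ready_mask_t *mask, unsigned priority)
{
    assert(priority < 32u);
    *mask &= ~(UINT32_C(1) << priority);
}

/* Find the highest ready priority with a single count-leading-zeros.
   The mask must be nonzero; in FreeRTOS the idle task is always ready. */
static inline unsigned highest_ready_priority(ready_mask_t mask)
{
    assert(mask != 0u);
    return 31u - (unsigned)__builtin_clz(mask);
}
```

The lookup cost no longer depends on how many priority levels are configured, which is why the option tends to shorten the scheduler's part of a context switch. The trade-off in this sketch is a fixed limit of 32 priority levels.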

The context switch time could be reduced by replacing the stw/ldw instructions with std/ldd instructions, but this requires that the stack pointer be at an 8 byte boundary upon kernel entry which cannot be guaranteed. I have been thinking about how I could force the alignment, but it seems like this will likely waste more cycles than it will save.
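
To make the alignment constraint concrete, here is a small C sketch of the check involved and of the instruction count at stake. It is only an illustration under stated assumptions: 4-byte words, a downward-growing stack, and helper names that are hypothetical rather than taken from either port.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DOUBLE_WORD_BYTES 8u  /* alignment that std/ldd require */

/* True if the address can be used with double-word loads and stores. */
static inline bool is_double_word_aligned(uintptr_t sp)
{
    return (sp & (DOUBLE_WORD_BYTES - 1u)) == 0u;
}

/* Round a stack pointer down to the previous 8-byte boundary
   (assumes the stack grows toward lower addresses). */
static inline uintptr_t align_down_double_word(uintptr_t sp)
{
    uintptr_t aligned = sp & ~(uintptr_t)(DOUBLE_WORD_BYTES - 1u);
    assert(is_double_word_aligned(aligned));
    return aligned;
}

/* Number of store instructions needed to save `words` registers:
   one per word with stw, one per pair of words with std. */
static inline unsigned stores_needed(unsigned words, bool use_double_word)
{
    return use_double_word ? (words + 1u) / 2u : words;
}
```

For the twelve registers r0-r11, `stores_needed(12, false)` is 12 and `stores_needed(12, true)` is 6. Forcing the alignment on kernel entry also means remembering the original stack pointer so it can be restored on exit, and that extra bookkeeping is where the saved cycles could be lost again.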

When comparing the assembly code in our XS2 port with the one I believe you have based yours on, I do note a couple of significant differences. I'm comparing the following two files:

https://github.com/xmos/FreeRTOS/blob/r ... /portasm.S
https://github.com/BiancoZandbergen/XMO ... port_asm.S

The context save and restore code in ours is shared between interrupts and kcalls rather than duplicated, so this should reduce code size. And the bit of code that adds the context size to the stack pointer at the end of the restore just before the kret is done in only 1 instruction in our port rather than in 4, saving a small amount of both time and space.

Mike
akp
XCore Expert
Posts: 580
Joined: Thu Nov 26, 2015 11:47 pm

Post by akp »

Hi Mike

Right. I meant that I used the ldd / std instructions rather than the ldw / stw instructions. So that's not dual issue; there are only a few instructions that are truly dual issued. But ldd / std does move two words per instruction rather than one.

I will try to get my port together and post it for you to look at. I don't think I implemented your optimization of adding the context size in a single instruction. I will look at it.

Thanks
Akp
mbruno
Member
Posts: 11
Joined: Thu Aug 24, 2017 2:48 pm

Post by mbruno »

Great, thanks. I'm curious to see what you did. I did actually try using the ldd/std instructions for the r0-r11 registers a while back but quickly realized that it would occasionally crash with an ET_LOAD_STORE exception whenever a task entered the kernel while its SP was not at a double word boundary.

Mike
mbruno
Member
Posts: 11
Joined: Thu Aug 24, 2017 2:48 pm

Post by mbruno »

Please see the latest single core update here:
https://github.com/xmos/FreeRTOS/tree/6 ... af0028ba86

FreeRTOS 10.3.0 has been merged in and support for dual issue has been added. Compiling the application and/or kernel with -O2 or -mdual-issue works now without issue.

Entering the kernel via a yield (kcall) or interrupt now automatically enables dual issue mode. I was able to modify the context switch assembly code to dual issue 12 instructions which should shave off 6 cycles total. Not much, but better than nothing.

Mike