-fschedule Instruction Scheduling: faster code on 10.4

Woody
XCore Addict
Posts: 165
Joined: Wed Feb 10, 2010 2:32 pm

Post by Woody »

Instruction scheduling is a really smart new feature of the 10.4 tools. Simply add the -fschedule switch to your build and your code will run faster, over and above what the optimization level (-O3 etc.) already gives you. -fschedule works for XC, C, C++ and assembler.

What does instruction scheduling do? The XCore has a very smart way of fetching code: it performs instruction fetches while executing instructions that do not require a memory access. The fetched code is then held in the instruction buffer until it needs to be executed. When too many memory operations occur back to back, no instruction fetches can occur, which can ultimately lead to a dummy instruction (a fetch nop, or fnop) being inserted whose only job is to fetch the next instruction.

To increase execution speed, instruction scheduling reorders independent instructions to minimize the number of fnops that are inserted.
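
To make the effect concrete, here is a rough toy model in Python. It is only a sketch of the idea described above, not the real XCore pipeline: the buffer capacity, the number of instructions brought in per fetch and the starting fill level are made-up illustrative values, and every instruction is treated as independent (the real scheduler can only reorder instructions that do not depend on each other).

```python
def count_fnops(program, buffer_capacity=4, fetch_size=2, initial=2):
    """Count fetch nops in a toy instruction-buffer model.

    program is a string of 'M' (instruction that accesses memory) and
    'A' (instruction that does not). All numeric parameters are
    hypothetical values chosen for illustration only.
    """
    buffered = initial  # instructions currently waiting in the buffer
    fnops = 0
    for op in program:
        if buffered == 0:
            # Buffer is empty: a fetch nop is inserted to refill it.
            fnops += 1
            buffered = min(buffer_capacity, buffered + fetch_size)
        buffered -= 1  # issue this instruction
        if op == "A":
            # No data access this cycle, so a fetch can happen for free.
            buffered = min(buffer_capacity, buffered + fetch_size)
    return fnops


# Same eight instructions, two different orders.
print(count_fnops("MMMMAAAA"))  # 2 (memory operations back to back)
print(count_fnops("MAMAMAMA"))  # 0 (memory operations interleaved)
```

In this model the interleaved order never lets the buffer run dry, which is the kind of ordering the scheduler tries to find.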

What's the downside? As with the higher optimization levels, debugging can give unexpected results when instruction scheduling is enabled. However, if you do not need to debug C or XC, then adding -fschedule should always give you better results with no downside.

To enable instruction scheduling within XDE add -fschedule to:
Project/Properties/XMOS XC Compiler/Optimization/Other optimization flags
Project/Properties/XMOS C Compiler/Optimization/Other optimization flags
Project/Properties/XMOS C++ Compiler/Optimization/Other optimization flags
Project/Properties/XMOS Assembler/General/Assembler flags

Tell us how you get on!


Folknology
XCore Legend
Posts: 1274
Joined: Thu Dec 10, 2009 10:20 pm

Post by Folknology »

Given the sensitivity of the Ethernet code, where previously only -O3 would work, does -fschedule work with that code or interfere with it?

I am assuming that -fschedule is added as an also here.. rather than a replace.

regards
Al
Woody
XCore Addict
Posts: 165
Joined: Wed Feb 10, 2010 2:32 pm

Post by Woody »

Folknology wrote:I am assuming that -fschedule is added as an also here.. rather than a replace.
Yes, -fschedule is an also rather than a replace. -fschedule's instruction reordering can only improve the performance.
f_petrini
Active Member
Posts: 43
Joined: Fri Dec 11, 2009 8:20 am

Post by f_petrini »

Sounds like a nice feature.
However, when I tried the -fschedule flag on my largest project it stopped working.
I haven't investigated what actually breaks but I will do that when I have time.
Woody wrote:Yes, -fschedule is an also rather than a replace. -fschedule's instruction reordering can only improve the performance.
Or in my case, degrade the performance to none at all... ;) :lol:
richard
Respected Member
Posts: 318
Joined: Tue Dec 15, 2009 12:46 am

Post by richard »

Woody wrote:-fschedule's instruction reordering can only improve the performance.
I'd like to qualify this. Currently the scheduler's goal is to reduce the number of fetch nops that occur by reordering instructions. Usually fewer fnops translates directly to increased performance. However, moving instructions after a resource instruction like an input or output might cause that resource instruction to execute earlier and pause for longer, wasting time that could be spent doing computation. Alternatively, moving instructions before a resource instruction might cause it to execute later relative to a previous operation on that resource, causing it to fail timing.

Currently the scheduler doesn't take into account the timing properties of resource operations and assumes they execute the same as any other instruction. So while in most cases the scheduler will improve performance, it isn't guaranteed to always do so. This is an obvious area for improvement in future.
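
As a rough illustration of both effects, here is a toy timing model in Python. It is a sketch under simplifying assumptions, not a description of real XCore timing: every instruction is assumed to take one cycle, 'C' stands for a computation, 'I' for an input that cannot complete until the data arrives, and the function names and cycle numbers are made up for the example.

```python
def finish_time(program, data_ready_at):
    """Cycle at which a toy thread finishes running program.

    'C' is one cycle of computation; 'I' is an input that pauses the
    thread until cycle data_ready_at if it is issued before then.
    """
    t = 0
    for op in program:
        if op == "I":
            t = max(t, data_ready_at)  # pause until the data is available
        t += 1  # every instruction takes one cycle in this model
    return t


def issue_cycle(program):
    """Cycle at which the input is first issued (one cycle per instruction)."""
    return program.index("I")


# Input data arrives at cycle 5 in both cases.
print(finish_time("CCCI", 5))  # 6: the computation fills the wait
print(finish_time("ICCC", 5))  # 9: input issued early, thread idles, then computes

# Moving computation in front of the input delays when it is issued.
print(issue_cycle("ICCC"))  # 0
print(issue_cycle("CCCI"))  # 3
```

The first pair shows the wasted time when computation is moved after the input. The second pair shows the opposite risk: if the input had to be issued within, say, two cycles of a previous operation on the same resource, the "CCCI" ordering would miss that deadline.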
f_petrini wrote:Sounds like a nice feature. However, when I tried the -fschedule flag on my largest project it stopped working. I haven't investigated what actually breaks but I will do that when I have time.
It would be good to get to the bottom of this. Are you able to selectively enable the flag for each language (C / C++ / XC / asm) to get a better idea of which part of the project has an issue? Are you able to identify whether it is a timing / performance issue or a miscompilation?
f_petrini
Active Member
Posts: 43
Joined: Fri Dec 11, 2009 8:20 am

Post by f_petrini »

richard wrote:It would be good to get to the bottom of this. Are you able to selectively enable the flag for each language (C / C++ / XC / asm) to get a better idea of which part of the project has an issue? Are you able to identify whether it is a timing / performance issue or a miscompilation?
I tried it on C only, XC only, and both C and XC at the same time. When enabled on XC, my application failed to even get past the startup stage, and when enabled only on C, it failed when I started to send network traffic to it. So I clearly have at least two points of failure.
I will try to dig a bit deeper later today...