Performance of C++ on xCore-200

aneves
Experienced Member
Posts: 93
Joined: Wed Sep 16, 2015 2:38 pm

Post by aneves »

I've done some research on the pros and cons of using C++ object-oriented features versus straight C when developing firmware. The consensus seems to be that if the embedded platform is beefy enough (and the xCore-200s seem more than capable), then C++ object-oriented design is fine as long as you follow some guidelines.

In summary, these guidelines encourage developers to keep things as simple as possible when working with C++ and object-oriented programming on an embedded platform: keep your inheritance hierarchies short and sweet, avoid late binding or keep it minimal, avoid virtual inheritance, try not to use the STL, etc.

I'm curious what the experience has been in the community for those of you who have successfully used C++ and object oriented design in your projects. Did you find you had to pretty much follow the same basic guidelines? Was there anything you had to optimize because of some C++ language-specific feature that was too expensive for your platform? If you had to do it again, would you have implemented things in pure C or everything in xC for that matter?

Interested in hearing everyone's thoughts!


robertxmos
XCore Addict
Posts: 169
Joined: Fri Oct 23, 2015 10:23 am

Post by robertxmos »

Hi aneves,

The best thing about the xcore - the memory usage is statically calculated and allocated - can also be a problem. Whilst you can dynamically allocate from the heap, via new or malloc(), you are risking heap fragmentation.

For C++ applications, I would advise that you implement operator new to use statically allocated memory arenas (each arena having particular behaviour e.g. temporal or allocation size) and not use malloc - or just don't dynamically allocate anything!
You should also not use exceptions as stack usage can't currently be calculated.
I would expect there to be a problem with virtual function stack usage too - not checked.
Smart pointers etc. are also out.
Indeed anything that uses function pointers - see below.
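
As an illustration of the "operator new on a static arena" advice above, here is a minimal sketch. The Message type, the slot count and the helper names are all hypothetical, and it assumes a toolchain that accepts C++11. The class-specific operator new is declared noexcept so that an exhausted pool yields a null pointer instead of throwing, which fits the advice to avoid exceptions.

```cpp
#include <cstddef>

// Hypothetical message type; every instance lives in a fixed static pool.
struct Message {
    int id;
    int payload[4];

    // Class-specific allocation: never touches malloc or the global heap.
    // noexcept means a null return makes 'new Message()' evaluate to nullptr.
    static void* operator new(std::size_t size) noexcept;
    static void operator delete(void* p) noexcept;
};

namespace {
constexpr std::size_t kMessageSlots = 4;  // assumed pool size
alignas(Message) unsigned char g_storage[sizeof(Message) * kMessageSlots];
bool g_used[kMessageSlots] = {};
}  // namespace

void* Message::operator new(std::size_t size) noexcept {
    if (size != sizeof(Message)) {
        return nullptr;  // this pool only serves exact-size requests
    }
    for (std::size_t i = 0; i < kMessageSlots; ++i) {
        if (!g_used[i]) {
            g_used[i] = true;
            return g_storage + i * sizeof(Message);
        }
    }
    return nullptr;  // pool exhausted; the caller must check for nullptr
}

void Message::operator delete(void* p) noexcept {
    if (p == nullptr) {
        return;
    }
    const std::size_t index =
        static_cast<std::size_t>(static_cast<unsigned char*>(p) - g_storage) /
        sizeof(Message);
    g_used[index] = false;
}

// Helper for inspection: number of slots currently handed out.
std::size_t slots_in_use() {
    std::size_t n = 0;
    for (std::size_t i = 0; i < kMessageSlots; ++i) {
        if (g_used[i]) ++n;
    }
    return n;
}
```

Because every slot has the same size, this pool cannot fragment; the trade-off is that the worst-case number of live objects has to be chosen at build time.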

In summary, use C++ as C structs with class viz: private data, member functions, implicit this.
It might not be the whole OO experience but well worth doing.
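
To make the "C structs with class" style concrete, here is a small sketch of what such a type might look like: private data, plain member functions, fixed-size storage, and no virtual functions, heap use or exceptions. The SampleFifo name and its capacity are made up for illustration, and it assumes C++11 is available.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical fixed-capacity FIFO: all storage is inside the object,
// so its memory footprint is known at compile time.
class SampleFifo {
public:
    // Returns false when the FIFO is full instead of throwing.
    bool push(std::int32_t sample) {
        if (count_ == kCapacity) {
            return false;
        }
        data_[(head_ + count_) % kCapacity] = sample;
        ++count_;
        return true;
    }

    // Returns false when the FIFO is empty; 'out' is left untouched.
    bool pop(std::int32_t& out) {
        if (count_ == 0) {
            return false;
        }
        out = data_[head_];
        head_ = (head_ + 1) % kCapacity;
        --count_;
        return true;
    }

    std::size_t size() const { return count_; }

private:
    static constexpr std::size_t kCapacity = 8;  // assumed size
    std::int32_t data_[kCapacity] = {};
    std::size_t head_ = 0;
    std::size_t count_ = 0;
};
```

Every call here is a direct call to a known function, so nothing in this class should stop the tools from working out stack usage.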

In future tools, there is the ability to tell the compiler what functions a function pointer may point to, thus allowing it to calculate the stack required.
From 14.2 you can check where function pointers are called by adding '-Wxcore-fptrgroup' to your C/C++ flags.
To check if a C++ feature uses the heap, build an example and look for the 'malloc' symbol:
xobjdump app.xe -t | grep malloc
(printf uses malloc so don't print in your example)

Post by aneves »

Hi robertxmos,

Thanks for the tips! Very helpful!
robertxmos wrote: For C++ applications, I would advise that you implement operator new to use statically allocated memory arenas (each arena having particular behaviour e.g. temporal or allocation size) and not use malloc - or just don't dynamically allocate anything!
Yes, very good point I forgot to mention myself.
robertxmos wrote: I would expect there to be a problem with virtual function stack usage too - not checked.
What do you mean by virtual function stack usage? Do you mean stay away from virtual functions and late-binding altogether? What about inheritance? Can you clarify this?
robertxmos wrote: Smart pointer etc etc are also out.
Indeed anything that uses function pointers - see below.

In future tools, there is the ability to tell the compiler what functions a function pointer may point to, thus allowing it to calculate the stack required.
From 14.2 you can check where function pointers are called by adding '-Wxcore-fptrgroup' to your C/C++ flags.
I know that in xC function pointers are not supported, but in C and C++ you can use them in the xmos toolchain as long as you decorate the caller with "#pragma stackfunction 200" (for example). Are you just saying stay away from them if you can as opposed to they flat out won't work?
Thanks for the tip on the "-Wxcore-fptrgroup" flag. Will keep that in mind.
robertxmos wrote: To check if a C++ feature uses the heap, build an example and look for the 'malloc' symbol:
xobjdump app.xe -t | grep malloc
(printf uses malloc so don't print in your example)
Another neat tip to keep in my back pocket. Thank you!
robertxmos wrote: In summary, use C++ as C structs with class viz: private data, member functions, implicit this.
It might not be the whole OO experience but well worth doing.
Sounds like you're harmonizing my previous sentiment - utilize the very core C++ advantages and stay away from the "advanced" features.

Thanks again for your thoughts!

Post by robertxmos »

aneves wrote: What do you mean by virtual function stack usage? Do you mean stay away from virtual functions and late-binding altogether? What about inheritance? Can you clarify this?
They will use function pointers (unless the compiler can optimise them away), and hence the core's stack usage will not be calculated viz:

Code: Select all

struct Foo {
  virtual int fee(void);
};
int Bar(Foo &foo) {
  return foo.fee(); // virtual method, called through a function pointer
}

Code: Select all

<tools>/libexec/xcc2clang test.cpp -S -Os -o -
The output will not contain stack information for Bar, viz: _Z3BarR3Foo.nstackwords

However, make fee non-virtual and you get stack usage information:

Code: Select all

.set    _Z3BarR3Foo.nstackwords,(_ZN3Foo3feeEv.nstackwords + 1)
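
If you still want interchangeable implementations without virtual calls, one option (a different technique from the one shown above, and not something checked against the XMOS tools here) is compile-time dispatch through a template. The FixedGain and Bypass types below are hypothetical; the point is only that each instantiation of Bar calls one known function directly, so no function pointer is involved.

```cpp
// Two hypothetical implementations sharing the same member signature.
struct FixedGain {
    int fee() const { return 2; }
};

struct Bypass {
    int fee() const { return 1; }
};

// The call target is fixed per instantiation at compile time,
// so this is an ordinary direct call rather than a virtual one.
template <typename FooLike>
int Bar(const FooLike& foo) {
    return foo.fee();
}
```

The cost is one copy of Bar per type used, and the type has to be known at the call site, so this only replaces virtual functions where the choice of implementation is made at build time.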

Post by robertxmos »

aneves wrote: Are you just saying stay away from them if you can as opposed to they flat out won't work?
They work, but the tools won't be able to calculate the stack usage - see previous example.

14.2 tools have a hidden feature under test - 'fptrgroups'.
This allows you to tell the compiler which set of functions a function pointer may point to.
It has changed slightly in the upcoming 14.3 release (a runtime check flag was added).
If you look in stdlib you will see they are being used:

Code: Select all

// xcore requires arg '_compar' to have its fptrgroup attribute set viz:
//    __attribute__((fptrgroup("stdlib_qsort"))) int myComparFunc(void*,void*) {...}
_VOID _EXFUN(qsort,(_PTR __base, size_t __nmemb, size_t __size, int(*_compar)(const _PTR, const _PTR)));
caveat emptor - Here is some incomplete, unfinalised & unpublished documentation :-)

The fptrgroup attribute is attributed to either:
  1. A (member) function pointer definition/declaration
    • Function pointer definition:

      Code: Select all

      __attribute__((fptrgroup("G1")))
      void (*fp1)(); // we create a variable (we could initialise it too).
    • Function pointer declaration:

      Code: Select all

      __attribute__((fptrgroup("G1")))
      extern void (*fp2)();         // The variable is defined elsewhere.
    This tells the compiler that when the pointer is dereferenced, the resource usage will be the maximum found in the list of functions that are members of "G1".

    Code: Select all

    fp1();  // the linker can now deduce the worst-case stack usage.
  2. A function definition.

    Code: Select all

    __attribute__((fptrgroup("G1")))
    void func() {/*implementation*/}  // this is NOT a declaration - we know the stack usage!
    This will add the function's resource usage to the "G1" list - unless it is eliminated!
A function may be added to multiple groups "G1, G2" viz: add to multiple tables.
A function pointer may be a member in any of several groups "G1, G3" viz check all tables for a match.

Runtime checking (not in 14.2):
Function pointer definition/declaration attributes have an implicit 'check' boolean set to true, or you may explicitly set it:

Code: Select all

__attribute__((fptrgroup("G1",0)))    // disable runtime checking
void (*fp3)();
__attribute__((fptrgroup("G1",1)))    // enable runtime checking
void (*fp4)();
__attribute__((fptrgroup("G1")))    // default is enable runtime checking
void (*fp5)();
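
Putting the pieces above together, a sketch of a small handler dispatch might look like the following. It only uses the attribute forms quoted from the unfinalised documentation above, so treat it as untested against any released tools. XCORE_FPTRGROUP is a hypothetical helper macro, and the __xcore__ check is an assumption about what the XMOS compiler predefines; on other compilers the macro expands to nothing so the code still builds.

```cpp
// Hypothetical helper: apply the attribute only when building for xcore.
// Assumption: the XMOS compiler predefines __xcore__.
#if defined(__xcore__)
#define XCORE_FPTRGROUP(name) __attribute__((fptrgroup(name)))
#else
#define XCORE_FPTRGROUP(name)
#endif

// Function definitions: each adds its resource usage to the "handlers" group.
XCORE_FPTRGROUP("handlers")
int on_start(int value) { return value + 1; }

XCORE_FPTRGROUP("handlers")
int on_stop(int value) { return value - 1; }

// Function pointer definition: calls through it are costed as the
// worst case over all members of "handlers".
XCORE_FPTRGROUP("handlers")
int (*g_handler)(int) = on_start;

int dispatch(int value) {
    return g_handler(value);  // indirect call with a known set of targets
}
```

With runtime checking enabled (the stated default from 14.3), pointing g_handler at a function outside the "handlers" group would presumably be caught when the call is made, but that behaviour is not described in detail above.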