Measuring 'CPU Usage'

hybridalienrob
Member
Posts: 11
Joined: Thu Nov 20, 2014 9:19 pm

Measuring 'CPU Usage'

Post by hybridalienrob »

Is there a way, using the simulator in xTime (e.g. observing signals in the waves view) or otherwise, to measure the effective busyness of a core / tile?

So for example, if I have a task which wakes up on regular timer events and does some processing, how can I tell how much spare capacity is left in the core and/or tile?

I would have thought plotting CoreN_inuse and CoreN_waiting would be what I'm looking for, but they don't seem to do much.

What I'm after is a signal for each core which shows when it is idle, waiting on events. Is there such a thing ?

 

Thanks

Rob



infiniteimprobability
XCore Legend
Posts: 1126
Joined: Thu May 27, 2010 10:08 am

Post by infiniteimprobability »

Hi, there are a lot of ways you can do this, and the best one depends on the situation and your preferences. The first question is whether you are measuring in hardware or not. Some options include:

- Use gprof. This gives you a histogram of where the PC was. The time spent on the instruction that blocks (e.g. out or in for a single event, or waiteu for select statements) tells you how many spare cycles you have. This works on the simulator, and can also work in hardware using PC sampling (which gives an average based on sampling, so it will be representative for loops run over a long period). You get an output file per core, and in the example below you can see how much time the out instruction is taking. Look for the following examples in the GUI: "How to profile an executable on the hardware" and "How to profile an executable on the simulator".



- Use xscope print with timing. This gives you fast printing, with an additional timestamp of when the print was called. Look for "xscope timed example" in the GUI. (A minimal sketch of this appears after the list below.)

- Place timers in your code and calculate the time between sections of code, e.g. (inside a task, with "timer tmr; unsigned start_time, end_time;" declared, and print.h included for printintln):

tmr :> start_time;

// <some stuff>

tmr :> end_time;

printintln(end_time - start_time);   // elapsed time in 10ns reference-timer ticks

- Simulate using the tile[x]_core[y]_waiting signal (as you suggested). Measure the ratio between waiting (blocked on an event) and executing instructions. See the attached image for an example showing the core waiting for the port most of the time. This is a good method and does work fine - please share the specifics of what you are doing if it is not working for you.



- Toggle an I/O pin and observe it using a 'scope for hardware, or in the simulator waveform view. (A minimal sketch of this follows the list.)
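
For the I/O toggle option, here is a minimal sketch - the port, period and do_processing() are all just placeholders for your own code:

#include <xs1.h>

out port p_busy = XS1_PORT_1A;             // any spare 1-bit pin

static void do_processing(void) { /* placeholder for the real work */ }

void periodic_task(void) {
    timer tmr;
    unsigned t;
    tmr :> t;
    while (1) {
        select {
            case tmr when timerafter(t) :> void:   // regular timer event
                p_busy <: 1;                       // pin high = busy
                do_processing();
                p_busy <: 0;                       // pin low = waiting
                t += 100000;                       // next event in 1ms (10ns ticks)
                break;
        }
    }
}

The duty cycle of p_busy on the 'scope (or in the simulator waveform) is the fraction of the period that the core spends working.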

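And a small illustration of the xscope print option above - this is only a sketch, assuming xscope is enabled in the build with a config.xscope file; xscope_config_io() and XSCOPE_IO_TIMED come from xscope.h as far as I recall:

#include <xscope.h>
#include <print.h>

void init_reporting(void) {
    // Redirect prints over the xscope link; the TIMED mode attaches a
    // timestamp to each print so the host shows when it was emitted.
    xscope_config_io(XSCOPE_IO_TIMED);
}

void report_elapsed(unsigned elapsed_ticks) {
    printintln(elapsed_ticks);   // appears on the host with a timestamp
}
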
All of the above methods are dynamic, data-dependent ways of measuring, i.e. the amount of time taken may depend on input data, if statements, whether the other cores are busy or not, etc. So they are not always 100% reliable and may miss corner cases, unless you apply large margins or can make some reliable assumptions.

A thorough and data-independent way of obtaining the WCET (worst-case execution time) is via XTA (the XMOS Timing Analyzer). This is a static tool (i.e. not simulation based) that analyzes the structure of your code and provides a WCET for it. There are loads of examples showing how to do this in the tools - see "Timing and XTA" in the how-to section. For example, in this case you can see that the loop time is 24ns. Knowing that the port will block for 10ns * 32 cycles = 320ns, the used CPU time is 24/320 = 7.5%.
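
To make that concrete, here is a sketch of how code can be marked up for XTA - the pragma form is as I remember it from the "Timing and XTA" how-tos mentioned above, and the endpoint name, port and next_sample() are just placeholders:

#include <xs1.h>

out port p_data = XS1_PORT_1A;                               // placeholder port

static unsigned next_sample(unsigned x) { return x + 1; }   // stand-in for real work

void tx_loop(void) {
    unsigned data = 0;
    while (1) {
#pragma xta endpoint "tx"
        p_data <: data;              // XTA can report the worst-case path from
        data = next_sample(data);    // one pass of "tx" to the next (the loop time)
    }
}

You then ask XTA for the worst-case time around the loop (between successive passes of the "tx" endpoint) and compare it with the time the port transfer takes, as in the 24ns/320ns example above.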



 

Sorry - this may be more than you were looking for, but it's an interesting subject, one which can be approached in many ways and often comes down to user preference. I expect there are even more ways you can get the info you want!

fchin
Member++
Posts: 16
Joined: Fri Jul 15, 2016 6:35 am

Post by fchin »

Hi infiniteimprobability,

I'm also interested in monitoring how much real-time spare capacity is available in a core. I enabled gprof collection when running the AN00203 Gigabit Ethernet AVB endpoint example on an xCORE-200 MC audio board.

Please refer to the attached profile for one of the cores (Gprof_AVB.png). There are three combined tasks running on this core, i.e. avb_manager, the application task and avb_1722_1_maap_srp_task. I noticed the "__wait_nonlocal" node in the profile was taking 84.01% of the execution time. Can I say that this node essentially indicates that the core was idle 84.01% of the time?

Thanks,
Frankie
henk
Respected Member
Posts: 347
Joined: Wed Jan 27, 2016 5:21 pm

Post by henk »

That is a very interesting question. There are two parts to it.

First, gprof and friends give you an average over some period of time, whereas real-time is about meeting a deadline, which is not an average. So even if 84% of the CPU was free, that does not automatically mean the thread is safely inside its real-time budget; it might only just be making its deadline.

Second, there is the question of how to measure it. I don't know exactly how gprof measures its CPU time; it may be using real time (in which case 100% would include the time that threads spend waiting for I/O, comms, etc.). In that case the __wait_nonlocal call may be where the core is waiting for an input, and on average you have a lot of spare time around that call.
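
To illustrate the first point, a sketch of tracking the worst case rather than the average (the names here are just placeholders) - keep the maximum observed time for the work between events and compare it against your deadline:

#include <print.h>

static void wait_for_next_event(void) { /* placeholder: block on your event */ }
static void do_work(void)             { /* placeholder: the processing to time */ }

void monitored_task(void) {
    timer tmr;
    unsigned t_start, t_end, elapsed;
    unsigned worst = 0;

    while (1) {
        wait_for_next_event();
        tmr :> t_start;
        do_work();
        tmr :> t_end;

        elapsed = t_end - t_start;
        if (elapsed > worst) {
            worst = elapsed;
            printintln(worst);       // report each new worst case, in 10ns ticks
        }
    }
}

If the worst case stays well below the deadline you have real headroom; an 84% average idle figure on its own does not tell you that.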