How to analyze a code portion with XTA (Defining routes)

mozcelikors
Experienced Member
Posts: 75
Joined: Sat May 07, 2016 11:47 am

Postby mozcelikors » Mon Jan 09, 2017 4:26 pm

Hello guys,
I would like to analyze one select case in my code and find out how many instructions it executes between two points.
To do this, I am trying to use XTA. However, I can't seem to get it working.
A few questions:
1) How should I analyze such a select case: with loop points or with endpoints? What is the difference, and how do I define them correctly?
2) The code portion is usually reported as Unresolved. How do I resolve that?

Example code,

Code: Select all

select
        {
            case tmr3 when timerafter(time3) :> void :
                //START ANALYSIS HERE!
                //Measure start time
                ////debug_timer :> start_time;

                //Initialize messaging
                InitializeMessaging(i2c_interface);

                // For Left Sensor
                // Read from high and low byte respectively
                high_byte = i2c_interface.read_reg(getDistanceSensorAddr(LEFT_DISTANCE_SENSOR_ID), 0x02, result);
                low_byte = i2c_interface.read_reg(getDistanceSensorAddr(LEFT_DISTANCE_SENSOR_ID),  0x03, result);
                // Construct the distance information in centimeters
                acc = (high_byte * 256) + low_byte;
                if ((acc < 600)  && (acc > 0)) // Distance should be in between 600cm and 0cm
                {
                    left = acc;
                }
                else
                {
                    left = 0;
                }
                //printf("y\n");

                // For Right Sensor
                // Read from high and low byte respectively
                high_byte = i2c_interface.read_reg(getDistanceSensorAddr(RIGHT_DISTANCE_SENSOR_ID), 0x02, result);
                low_byte = i2c_interface.read_reg(getDistanceSensorAddr(RIGHT_DISTANCE_SENSOR_ID),  0x03, result);
                // Construct the distance information in centimeters
                acc = (high_byte * 256) + low_byte;
                if ((acc < 600)  && (acc > 0)) // Distance should be in between 600cm and 0cm
                {
                    right = acc;
                }
                else
                {
                    right = 0;
                }
                //printf("z\n");

                // For Front Sensor
                // Read from high and low byte respectively
                high_byte = i2c_interface.read_reg(getDistanceSensorAddr(FRONT_DISTANCE_SENSOR_ID), 0x02, result);
                low_byte = i2c_interface.read_reg(getDistanceSensorAddr(FRONT_DISTANCE_SENSOR_ID),  0x03, result);
                // Construct the distance information in centimeters
                acc = (high_byte * 256) + low_byte;
                if ((acc < 600)  && (acc > 0)) // Distance should be in between 600cm and 0cm
                {
                    front = acc;
                }
                else
                {
                    front = 0;
                }


                // For Rear Sensor
                // Read from high and low byte respectively
                high_byte = i2c_interface.read_reg(getDistanceSensorAddr(REAR_DISTANCE_SENSOR_ID), 0x02, result);
                low_byte = i2c_interface.read_reg(getDistanceSensorAddr(REAR_DISTANCE_SENSOR_ID),  0x03, result);
                // Construct the distance information in centimeters
                acc = (high_byte * 256) + low_byte;
                if ((acc < 600)  && (acc > 0)) // Distance should be in between 600cm and 0cm
                {
                    rear = acc;
                }
                else
                {
                    rear = 0;
                }

                // Send sensor values all together
                sensors_interface.ShareDistanceSensorValues (left, right, front, rear);

                // Delay
                time3 += delay3;

                //Measure end time
                ////debug_timer :> end_time;
                ////printf("SONAR t: %u", end_time - start_time);
                //STOP ANALYSIS HERE!
                break;
        }
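
The byte-combining and range check that the code above repeats for each of the four sensors can be captured in one helper. What follows is only a minimal sketch in Python to make the arithmetic explicit (the thread's code is XC); the name `decode_distance_cm` and the `max_cm` parameter are hypothetical and do not appear in the original code.

```python
def decode_distance_cm(high_byte, low_byte, max_cm=600):
    """Combine two register bytes into a distance in cm; 0 means no valid reading."""
    # Same arithmetic as the XC code: the high byte is the upper 8 bits.
    acc = (high_byte * 256) + low_byte
    # Accept only 0 < acc < max_cm, mirroring the original if/else.
    return acc if 0 < acc < max_cm else 0


# Example: high byte 0x01 and low byte 0x2C decode to 300 cm.
print(decode_distance_cm(0x01, 0x2C))
```

Note that a reading of exactly 600 is rejected, because the original comparison is strict (`acc < 600`).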

Thanks, best regards
peter
XCore Addict
Posts: 230
Joined: Wed Mar 10, 2010 12:46 pm

Postby peter » Wed Jan 11, 2017 10:03 am

I have created a basic test case which should hopefully help you. Please let me know if the following isn't clear. This program has two tasks, one using a select and the other using just a loop:

Code: Select all

#include <platform.h>
#include <xs1.h>

out port p = XS1_PORT_32A;

#define PERIOD1 1000
#define PERIOD2 4003

void a(chanend c)
{
  timer tmr1, tmr2;
  int time1, time2;

  tmr1 :> time1;
  tmr2 :> time2;

  time1 += PERIOD1;
  time2 += PERIOD2;

  while (1) {
    select {
      case tmr1 when timerafter(time1) :> int time:
        #pragma xta endpoint "a_case1"
        c <: time;
        time1 += PERIOD1;
        break;

      case tmr2 when timerafter(time2) :> int time:
        #pragma xta endpoint "a_case2"
        c <: time;
        time2 += PERIOD2;
        break;
    }
  }
}

void b(chanend c)
{
  int time;
  while (1) {
    #pragma xta endpoint "b_loop"
    c :> time;
    p <: time;
  }
}

int main()
{
  chan c;
  par {
    a(c);
    b(c);
  }
  return 0;
}
I've added the XTA pragmas to each function to allow me to do the timing afterwards. Note that XTA endpoints have to be placed on lines that will not be optimised away - usually I/O operations on ports/channels or function calls.

I then compile it:

Code: Select all

xcc -target=XCORE-200-EXPLORER -O2 main.xc -o test.xe
And load it into XTA:

Code: Select all

xta load test.xe
xta 2>list endpoints 
tile[0], label: a_case1, pc: 0x40108, compilation dir: ., filename: main.xc, line number: 25, exists: true
tile[0], label: a_case2, pc: 0x40158, compilation dir: ., filename: main.xc, line number: 31, exists: true
tile[0], label: b_loop, pc: 0x40178, compilation dir: ., filename: main.xc, line number: 43, exists: true
Then it is possible to see the timing of the loop in function b():

Code: Select all

xta 3>analyze loop b_loop
xta 4>print trace 0
*        0.0: ( 10.0ns) 0x40170 b + 8        { chkct (rus)  res[r0], 0x1 ; nop (0r)                  } (P)
*       10.0: ( 10.0ns) 0x40174 b + 12       { outct (rus)  res[r0], 0x1 ; nop (0r)                  } (P)
*       20.0: ( 10.0ns) 0x40178 b + 16       { in (2r)      r2, res[r0]  ; nop (0r)                  } (P)
*       30.0: ( 10.0ns) 0x4017c b + 20       { chkct (rus)  res[r0], 0x1 ; nop (0r)                  } (P)
*       40.0: ( 10.0ns) 0x40180 b + 24       { outct (rus)  res[r0], 0x1 ; nop (0r)                  } (P)
*       50.0: ( 10.0ns) 0x40184 b + 28       { out (r2r)    res[r1], r2  ; nop (0r)                  } (P)
        60.0: ( 10.0ns) 0x40188 b + 32       bu (lu6)     -0x7        
Or the timing of each path through the select in function a(), by analyzing a route from an endpoint back to itself:

Code: Select all

xta 5> analyze endpoints a_case1 a_case1
xta 6>print structure -
seq(59)        : 210.0 ns / 170.0 ns 
...

xta 7> analyze endpoints a_case2 a_case2
xta 8>print structure 1
seq(181)       : 220.0 ns / 180.0 ns 
...
Hopefully that gives you some idea of how to time these loops.
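
To turn a trace such as the `b_loop` one above into a count, one option is to paste the `print trace` output into a small script and count its rows. The sketch below is in Python and assumes every trace row has the shape `time: ( duration ns) address ...` seen in the output above; `summarize_trace` is a hypothetical helper, not part of the XMOS tools. Each row is one issue slot, so a dual-issue bundle in braces counts once even though it holds two instructions.

```python
import re

# One trace row looks like:
# "*       20.0: ( 10.0ns) 0x40178 b + 16   { in (2r) r2, res[r0] ; nop (0r) } (P)"
ROW = re.compile(r"([\d.]+):\s*\(\s*([\d.]+)ns\)\s+(0x[0-9a-fA-F]+)")


def summarize_trace(text):
    """Return (row_count, fnop_count, total_ns) for pasted `print trace` output."""
    rows = 0
    fnops = 0
    total_ns = 0.0
    for line in text.splitlines():
        match = ROW.search(line)
        if match is None:
            continue  # skip prompts, blank lines, anything that is not a trace row
        rows += 1
        total_ns += float(match.group(2))
        if "--FNOP--" in line:
            fnops += 1  # row marked --FNOP-- in the trace
    return rows, fnops, total_ns


# The b_loop trace, copied from the XTA session above.
B_LOOP_TRACE = """\
*        0.0: ( 10.0ns) 0x40170 b + 8        { chkct (rus)  res[r0], 0x1 ; nop (0r)                  } (P)
*       10.0: ( 10.0ns) 0x40174 b + 12       { outct (rus)  res[r0], 0x1 ; nop (0r)                  } (P)
*       20.0: ( 10.0ns) 0x40178 b + 16       { in (2r)      r2, res[r0]  ; nop (0r)                  } (P)
*       30.0: ( 10.0ns) 0x4017c b + 20       { chkct (rus)  res[r0], 0x1 ; nop (0r)                  } (P)
*       40.0: ( 10.0ns) 0x40180 b + 24       { outct (rus)  res[r0], 0x1 ; nop (0r)                  } (P)
*       50.0: ( 10.0ns) 0x40184 b + 28       { out (r2r)    res[r1], r2  ; nop (0r)                  } (P)
        60.0: ( 10.0ns) 0x40188 b + 32       bu (lu6)     -0x7
"""

rows, fnops, total_ns = summarize_trace(B_LOOP_TRACE)
print(rows, fnops, total_ns)
```

For the `b_loop` trace this counts 7 rows and no `--FNOP--` rows, for 70 ns in total.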

One thing to note is that the instruction rate is determined by the number of active cores. In this case it is assumed to be 10ns per instruction, but if you run this code with more active cores it will go slower. A true worst case can be determined by telling the tool that there will be 8 active cores before running any other commands:

Code: Select all

config tasks tile[0] 8
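
As a rough illustration of why the core count matters, the sketch below estimates the time per instruction for one logical core. It rests on two assumptions that are not stated in this thread: a 500 MHz tile clock and a five-stage pipeline, so that a core can issue at most once every five tile cycles and, with more than five active cores, once every N cycles. The function name `ns_per_instruction` is hypothetical. Under these assumptions the result matches both the 10.0ns rows in the trace above and the 16.0ns rows in the traces posted later in this thread.

```python
def ns_per_instruction(active_cores, tile_mhz=500, pipeline_stages=5):
    """Estimated issue interval for one logical core, in nanoseconds.

    tile_mhz and pipeline_stages are assumptions about the target,
    not values reported by XTA.
    """
    if active_cores < 1:
        raise ValueError("active_cores must be at least 1")
    # A core gets one issue slot per round; a round is never shorter
    # than the pipeline depth.
    cycles_per_slot = max(active_cores, pipeline_stages)
    return cycles_per_slot * 1000.0 / tile_mhz


for cores in (2, 5, 8):
    print(cores, ns_per_instruction(cores))
```

With two active cores, as in the test program, the estimate is 10 ns; with 8 it is 16 ns.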
Regards,

Peter

Postby mozcelikors » Wed Jan 11, 2017 3:42 pm

Hello, first of all thank you for your guidance.
I've got quite a different problem.
I've followed the instructions you gave to aybarskizilay.
Up to a point, adding branches works; but then there comes an instance where adding a branch does not work and produces the following error:

Code: Select all

xta: warning: adding branch from instruction which is not a branch: (_SInitializeMessaging_0+72) 0x458dc add (2rus) r4, r3, 0x0
xta: error: References not resolved on any active tile
Please let me explain how I came to this point:
I have discovered an easier way of analyzing, via the functions.

First I analyze the function via:

Code: Select all

analyze function _STask_ReadSonarSensors_0
Then I print the trace:

Code: Select all

print trace -
I have added a couple of branches and it worked out fine until this point:

Code: Select all

     16880.0: ( 16.0ns) 0x458c4 _SInitializeMessaging_0 + 48 { nop (0r)                  ; ldw (2rus)   r1, r5[0x1]  }
     16896.0: ( 16.0ns) 0x458c8 _SInitializeMessaging_0 + 52 { nop (0r)                  ; ldw (2rus)   r11, r1[0x0] }
     16912.0: ( 16.0ns) 0x458cc _SInitializeMessaging_0 + 56 { ldaw (ru6)   r1, sp[0x5]  ; stw (ru6)    r8, sp[0x2]  }
     16928.0: ( 16.0ns) 0x458d0                        --FNOP--                 
     16944.0: ( 16.0ns) 0x458d0 _SInitializeMessaging_0 + 60 { add (2rus)   r7, r1, 0x0  ; stw (ru6)    r1, sp[0x1]  }
     16960.0: ( 16.0ns) 0x458d4 _SInitializeMessaging_0 + 64 ldc (lru6)   r1, 0x7a    
     16976.0: ( 16.0ns) 0x458d8 _SInitializeMessaging_0 + 68 { ldc (ru6)    r3, 0x2      ; add (2rus)   r2, r9, 0x0  }
     16992.0: ( 16.0ns) 0x458dc _SInitializeMessaging_0 + 72 { add (2rus)   r4, r3, 0x0  ; bla (1r)     r11          } [UNRESOLVED]

xta 19>add branch 0x458dc [0x4594e]+
which gives the following problem:

Code: Select all

xta: warning: adding branch from instruction which is not a branch: (_SInitializeMessaging_0+72) 0x458dc add (2rus) r4, r3, 0x0
xta: error: References not resolved on any active tile
Could you help us get through this error?
Thank you very much.

Postby peter » Wed Jan 11, 2017 3:46 pm

I think:

Code: Select all

add branch 0x458dc [0x4594e]+
should be:

Code: Select all

add branch 0x458dc 0x4594e
That will fix the second error, but not the first warning about not being a branch instruction. It might need to be

Code: Select all

add branch 0x458de 0x4594e
because the branch instruction (bla) in that bundle is in the second half-word, 2 bytes after the bundle address 0x458dc.
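
The address arithmetic behind that suggestion can be written out as follows. This is only a sketch in Python, assuming that a dual-issue bundle occupies 4 bytes made of two 16-bit half-words, which is consistent with the 4-byte steps between bundles in the trace; `second_halfword_addr` is a hypothetical name, not an XTA command.

```python
HALFWORD_BYTES = 2  # assumed size of one instruction slot inside a bundle


def second_halfword_addr(bundle_addr):
    """Address of the second instruction slot of a dual-issue bundle."""
    return bundle_addr + HALFWORD_BYTES


# Bundle at 0x458dc: { add r4, r3, 0x0 ; bla r11 }
print(hex(second_halfword_addr(0x458dc)))
```

This gives 0x458de, the source address used in the last `add branch` command.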

Postby mozcelikors » Wed Jan 11, 2017 4:12 pm

Hello,

Unfortunately,

Code: Select all

add branch 0x458dc [0x4594e]+
or

Code: Select all

add branch 0x458dc 0x4594e
made no difference and the error stays as it is.

The second solution unfortunately resulted in:

Code: Select all

     16928.0: ( 16.0ns) 0x458d0                        --FNOP--                 
     16944.0: ( 16.0ns) 0x458d0 _SInitializeMessaging_0 + 60 { add (2rus)   r7, r1, 0x0  ; stw (ru6)    r1, sp[0x1]  }
     16960.0: ( 16.0ns) 0x458d4 _SInitializeMessaging_0 + 64 ldc (lru6)   r1, 0x7a    
     16976.0: ( 16.0ns) 0x458d8 _SInitializeMessaging_0 + 68 { ldc (ru6)    r3, 0x2      ; add (2rus)   r2, r9, 0x0  }
     16992.0: ( 16.0ns) 0x458dc _SInitializeMessaging_0 + 72 { add (2rus)   r4, r3, 0x0  ; bla (1r)     r11          } [UNRESOLVED]

xta 48>add branch 0x458de 0x4594e
xta: error: References not resolved on any active tile
xta 49>
I just want to find the instruction count of a function or case section. Isn't there a simpler solution? If not, how do I get past this error?

Thanks.

Postby peter » Wed Jan 11, 2017 4:40 pm

You can easily analyse the timing of a function with something like:

Code: Select all

analyze function f
However, this is a function call through a function pointer. The XTA tool doesn't know the pointer's value, so you need to help it with the added branch annotation.

Postby mozcelikors » Wed Jan 11, 2017 7:23 pm

Do you have any more suggestions for adding the branch? Unfortunately, we are still getting the error that I mentioned.
Thanks in advance

Postby mozcelikors » Wed Jan 11, 2017 9:42 pm

Hello,
A quick update:
Trying out some hex addresses worked; I am now somehow able to add some branches.
The question is, will it go on forever?
It really feels like I'm dealing with an infinite loop.
I should remind you that I'm analyzing with:

Code: Select all

analyze function _STask_MyTask
Final point:

Code: Select all

     19424.0: ( 16.0ns) 0x459b6 _SInitializeMessaging_0 + 290 { stw (ru6)    r2, sp[0x2]  ; nop (0r)                  }
     19440.0: ( 16.0ns) 0x459bc _SInitializeMessaging_0 + 296 ldc (lru6)   r1, 0x79    
     19456.0: ( 16.0ns) 0x459ba _SInitializeMessaging_0 + 294 { stw (ru6)    r1, sp[0x1]  ; add (2rus)   r5, r2, 0x0  }
     19472.0: ( 16.0ns) 0x459c2 _SInitializeMessaging_0 + 302 { ldc (ru6)    r3, 0x2      ; add (2rus)   r2, r4, 0x0  }
     19488.0: ( 16.0ns) 0x459c6 _SInitializeMessaging_0 + 306 { nop (0r)                  ; nop (0r)                  }
     19504.0: ( 16.0ns) 0x459ca _SInitializeMessaging_0 + 310 bla (1r)     r11          [UNRESOLVED]
and it keeps going like this...

Postby peter » Thu Jan 12, 2017 9:12 am

If your task is an infinite loop then it doesn't make sense to just analyse it as a single function - XTA will just report that it is an infinite loop.

Unfortunately, XTA was designed for timing XC which didn't contain function pointers. As a result, timing code with function pointers is painful. We will try to improve its ease of use in time, but that won't happen in the near future.

If, however, your code is calling memset/memcpy this can cause issues in the tools. Let me know if that is the case as there is a way to time functions which use memset/memcpy.

Regards,

Peter

Postby mozcelikors » Thu Jan 12, 2017 10:46 am

Hello,
Thanks for the reply.
There is no infinite loop detected, however, yes, my functions use lots of memset/memcpy. (Not this one, though)

Please take a look at the images, which also show the code causing the "Unresolved" issue.

Image

Image

However, I also want to get the instruction count for a task like this (with the infinite loop iterated once):

Image

Also, if you could explain what to do about memset/memcpy, I will apply it when I run into them.

There are over 15 tasks whose instruction count I want to find, and I have run into memset/memcpy in many of them.

Thanks,
