Getting more accurate core utilization in percentage in XMOS (Topic is solved)

mozcelikors
Experienced Member
Posts: 75
Joined: Sat May 07, 2016 11:47 am

Post by mozcelikors »

Hello guys,

I asked a similar question earlier, and everybody helped greatly. The original thread is here: https://www.xcore.com/forum/viewtopic.php?f=26&t=4983

What I need is information on how heavily the core is being used at that instant, as a percentage (e.g. on a Raspberry Pi, the system tells you that 63% of the CPU is in use).
Can anybody suggest a method to extend the function in the thread above for this purpose? Any ideas are appreciated; I am not necessarily asking for complete code.

Thanks in advance,
Best regards

Original final code:

Code:

[[combinable]]
void monitorCores(client core_stats_if core_stats_interface)
{
  short int t;
  timer poll_tmr, print_tmr;
  int poll_time, print_time;
  poll_tmr :> poll_time;

  print_time = poll_time + PRINT_MS;

  float core_busy[8];
  float core_idle[8];
  short int core_usage[8];

  for (t = 0; t <= 7; t++) {
    core_busy[t] = 0;
    core_idle[t] = 0;
  }

  int tile_id = get_local_tile_id();

  while(1)
  {
    select {
      case print_tmr when timerafter(print_time) :> void:
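        for (t = 0; t <= 7; t++) {
            float total = core_busy[t] + core_idle[t];
            // Guard against an empty window to avoid dividing by zero
            core_usage[t] = (total > 0) ? (short int)((core_busy[t] / total) * 100) : 0;
        }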
        core_stats_interface.ShareCoreUsage (core_usage[0],
                                             core_usage[1],
                                             core_usage[2],
                                             core_usage[3],
                                             core_usage[4],
                                             core_usage[5],
                                             core_usage[6],
                                             core_usage[7]);
        /*printf("%d %d %d %d %d %d %d %d\n",core_usage[0],
                                        core_usage[1],
                                        core_usage[2],
                                        core_usage[3],
                                        core_usage[4],
                                        core_usage[5],
                                        core_usage[6],
                                        core_usage[7]);*/

        for (t = 0; t <= 7; t++) {
          core_busy[t] = 0;
          core_idle[t] = 0;
        }
        print_time += PRINT_MS;
        break;

      case poll_tmr when timerafter(poll_time) :> void:
        for (t = 0; t <= 7; t++) {
          // Read the processor state
          int ps_value = getps(0x100*t+4);

          // Read the status register
          unsigned int sr_value;
          read_pswitch_reg(tile_id, XS1_PSWITCH_T0_SR_NUM+t, sr_value);

          const int in_use = (ps_value & 0x1);
          const int waiting = (sr_value >> 6) & 0x1;
          // A core counts as busy only when it is in use and not waiting
          if (in_use && !waiting) {
            core_busy[t] += 1;
          } else {
            core_idle[t] += 1;
          }
        }
        poll_time += POLLING_MS;
        break;
    }
  }
}


Gothmag
XCore Addict
Posts: 129
Joined: Wed May 11, 2016 3:50 pm

Post by Gothmag »

I think at any point in time it's only going to be 100% or 0% load, you have to specify a time period to get what you want.
infiniteimprobability
XCore Legend
Posts: 1126
Joined: Thu May 27, 2010 10:08 am

Post by infiniteimprobability »

mozcelikors wrote:
What I need to get as an info is how the core is being used at that instant
It's an interesting question because, as Gothmag says, the real answer is 0 or 100%. The core is either in the scheduler's run set or it's not.

That is because the granularity with which the core runs or waits is a single clock cycle (2 ns at 500 MHz), as opposed to an OS-based system, where time slices are very much longer, such as 750 µs in Linux.

So to get the number you are looking for, you will have to pick a period and calculate an average. That period should be either exactly equal to, or much longer than, your typical task period.
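As a minimal sketch of that averaging (plain C rather than XC; the names core_window_t, window_add_sample, window_percent and window_reset are hypothetical and not from this thread), each poll contributes one sample that is effectively 0% or 100%, and the percentage only appears once the samples are reduced over the chosen window:

```c
#include <stdint.h>

/* Hypothetical accumulator for one logical core over one reporting window. */
typedef struct {
    uint32_t busy;  /* samples in which the core was running */
    uint32_t idle;  /* samples in which the core was waiting */
} core_window_t;

/* Record one instantaneous sample; a single sample is either busy or idle. */
static void window_add_sample(core_window_t *w, int is_busy)
{
    if (is_busy) {
        w->busy++;
    } else {
        w->idle++;
    }
}

/* Reduce the window to an integer percentage.
 * An empty window returns 0 so that there is no division by zero. */
static uint32_t window_percent(const core_window_t *w)
{
    uint32_t total = w->busy + w->idle;
    if (total == 0) {
        return 0;
    }
    return (uint32_t)(((uint64_t)w->busy * 100u) / total);
}

/* Start a new window after the percentage has been reported. */
static void window_reset(core_window_t *w)
{
    w->busy = 0;
    w->idle = 0;
}
```

With 63 busy samples and 37 idle samples in a window, window_percent returns 63. The longer the window is relative to the task period, the less the result depends on where the window happens to start.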

It's actually questionable how useful the number is in most event-based real-time systems. What matters is your response time and worst-case execution time (do you always meet your deadline?). It may well be that the core sits idle for much of the time but is very busy on a regular basis (e.g. an I2S task which needs to do lots of I/O twice each frame).

Can I ask what the overall system/goal is?

Post by mozcelikors »

Hello,
Thanks for the helpful comment.
The system is a robot with several capabilities, such as Wi-Fi and Bluetooth connectivity, TCP apps, image-processing algorithms, and a local webcam server.
It is being built as a demonstrator/test environment for a multi-core parallelization and development platform.
That's why we need accurate core utilization information.
Your answer makes sense; I'll give averaging a try.

Post by mozcelikors »

The thing is, the values are always the same.

I allocated different cores to different tasks, and I deactivated and activated some tasks.

Code:

tile[8002]: 0 0 0 0 0 0 100 0
tile[8003]: 100 0 0 100 100 0 0 0

Sometimes it becomes something like the following:

Code:

tile[8002]: 0 3 0 0 0 0 100 0
tile[8003]: 100 0 0 100 100 0 0 0

What do you think might be the problem?

Code:

[[combinable]]
void Task_MonitorCoresInATile(client core_stats_if core_stats_interface)
{
      short int t;
      timer poll_tmr, print_tmr;
      int poll_time, print_time;
      poll_tmr :> poll_time;

      print_time = poll_time + PRINT_MS;

      short int core_busy[8];
      short int core_idle[8];
      short int core_usage[8];

      for (t = 0; t <= 7; t++) {
            core_busy[t] = 0;
            core_idle[t] = 0;
      }

      int tile_id = get_local_tile_id();

      while(1)
      {
            select {
                  case print_tmr when timerafter(print_time) :> void:
                        /*printf("tile[%x]: %d/%d %d/%d %d/%d %d/%d %d/%d %d/%d %d/%d %d/%d\n",
                            tile_id,
                            core_busy[0], core_idle[0],
                            core_busy[1], core_idle[1],
                            core_busy[2], core_idle[2],
                            core_busy[3], core_idle[3],
                            core_busy[4], core_idle[4],
                            core_busy[5], core_idle[5],
                            core_busy[6], core_idle[6],
                            core_busy[7], core_idle[7]);*/
                        for (t = 0; t <= 7; t++) {
                              if (core_idle[t] + core_busy[t]) {
                                    core_usage[t] = (100 * core_busy[t]) / (core_busy[t] + core_idle[t]);
                              } else {
                                    core_usage[t] = 0;
                              }
                        }
                        core_stats_interface.ShareCoreUsage (core_usage[0],
                                                             core_usage[1],
                                                             core_usage[2],
                                                             core_usage[3],
                                                             core_usage[4],
                                                             core_usage[5],
                                                             core_usage[6],
                                                             core_usage[7]);
                        printf("tile[%x]: %d %d %d %d %d %d %d %d\n",tile_id, core_usage[0],
                                                        core_usage[1],
                                                        core_usage[2],
                                                        core_usage[3],
                                                        core_usage[4],
                                                        core_usage[5],
                                                        core_usage[6],
                                                        core_usage[7]);

                        for (t = 0; t <= 7; t++) {
                            core_busy[t] = 0;
                            core_idle[t] = 0;
                        }
                        print_time += PRINT_MS;
                        break;

                  case poll_tmr when timerafter(poll_time) :> void:
                        for (t = 0; t <= 7; t++) {
                              // Read the processor state
                              int ps_value = getps(0x100*t+4);

                              // Read the status register
                              unsigned int sr_value;
                              read_pswitch_reg(tile_id, XS1_PSWITCH_T0_SR_NUM+t, sr_value);

                              const int in_use = (ps_value & 0x1);
                              const int waiting = (sr_value >> 6) & 0x1;
                              if (in_use) {
                                    if (waiting) {
                                          core_idle[t] += 1;
                                    } else {
                                          core_busy[t] += 1;
                                    }
                              }
                        }
                        poll_time += POLLING_MS;
                        break;
            }
      }
}

Code:

      // Core Monitoring Tasks
     on tile[0].core[6]:           Task_MonitorCoresInATile (core_stats_interface_tile0);
     on tile[1].core[7]:           Task_MonitorCoresInATile (core_stats_interface_tile1);

Post by infiniteimprobability »

mozcelikors wrote:
What do you think might be the problem?
Difficult to say without knowing what code you're running on the other cores.

By the way, there's no real advantage in using short ints on XMOS, unless you are defining big arrays where space is an issue. XMOS is naturally a 32-bit machine...
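One practical consequence of counter width, sketched in plain C under stated assumptions: the values of PRINT_MS and POLLING_MS are not shown in this thread, so the periods below are purely illustrative, and the helper names samples_per_window and fits_in_int16 are hypothetical. The check is whether the number of polls in one reporting window still fits in a signed 16-bit counter:

```c
#include <stdint.h>

/* Number of polls in one reporting window.
 * Both periods must be given in the same unit (for example microseconds). */
static uint32_t samples_per_window(uint32_t print_period, uint32_t poll_period)
{
    return poll_period ? print_period / poll_period : 0;
}

/* Returns 1 if a signed 16-bit counter can hold that many samples, else 0. */
static int fits_in_int16(uint32_t samples)
{
    return samples <= (uint32_t)INT16_MAX;
}
```

For example, a 1 s window polled every 10 µs gives 100,000 samples, which does not fit in a 16-bit signed counter, whereas the same window polled every 1 ms gives 1,000 samples, which does. A 32-bit counter removes the question for any realistic window.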

Post by mozcelikors »

infiniteimprobability wrote:
What do you think might be the problem?
Difficult to say without knowing what code you're running on the other cores.

By the way, there's no real advantage in using short ints on XMOS, unless you are defining big arrays where space is an issue. XMOS is naturally a 32-bit machine...
Hello,
Thanks for the reply.
For example, consider the following task (a simple PWM task with two channels):

Code:

[[combinable]]
void Task_ControlLightSystem (port p_TH, port p_ST, server lightstate_if lightstate_interface)
{
    uint32_t overall_pwm_period = LIGHTSYSTEM_PWM_PERIOD ;
    uint32_t on_period_TH,  on_period_ST;
    uint32_t off_period_TH, off_period_ST;

    uint32_t    time_TH, time_ST;
    int         port_state_TH = 0;
    int         port_state_ST  = 0;
    int         toggle_port_TH = 0;
    int         toggle_port_ST = 0;
    timer       tmr_TH, tmr_ST;

    short int lightstate_val;

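    //Initialization for some variables
    lightstate_val = 1;
    {on_period_TH, on_period_ST} = GetLightSystemPeriodsFromLightState (lightstate_val);
    //Start both PWM deadlines from the current time so the first timerafter() uses a defined value
    tmr_TH :> time_TH;
    tmr_ST :> time_ST;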
    while(1)
    {
        select
        {
            //Wait for the lightstate value (Event)
            case lightstate_interface.ShareLightSystemState (short int state):
                lightstate_val = state;
                //printf("lst = %d\n",lightstate_val);

                {on_period_TH, on_period_ST} = GetLightSystemPeriodsFromLightState (lightstate_val);
                //calculations
                off_period_TH = overall_pwm_period - on_period_TH;
                off_period_ST = overall_pwm_period - on_period_ST;
                break;

            //Port p_ST PWM Timer Event
            case tmr_ST when timerafter(time_ST) :> void :

                tmr_ST :> time_ST;

                //calculations

                off_period_ST = overall_pwm_period - on_period_ST;


                //PWM Port Toggling
                if(port_state_ST == 0)
                {
                    p_ST <: 1;
                    port_state_ST = 1;
                    time_ST += on_period_ST; //Extend timer deadline
                }
                else if(port_state_ST == 1)
                {
                    p_ST <: 0;
                    port_state_ST = 0;
                    time_ST += off_period_ST; //Extend timer deadline
                }

                break;


            //Port p_TH PWM Timer Event
            case tmr_TH when timerafter(time_TH) :> void :

                tmr_TH :> time_TH;

                //calculations

                off_period_TH = overall_pwm_period - on_period_TH;



                //PWM Port Toggling
                if(port_state_TH == 0)
                {
                    p_TH <: 1;
                    port_state_TH = 1;
                    time_TH += on_period_TH; //Extend timer deadline
                }
                else if(port_state_TH == 1)
                {
                    p_TH <: 0;
                    port_state_TH = 0;
                    time_TH += off_period_TH; //Extend timer deadline
                }

                break;



        }
    }
}

main.xc:
//Light System Task
     on tile[0].core[6]:              Task_ControlLightSystem (PortLightSystem_TH, PortLightSystem_ST, lightstate_interface);

Console output:
tile[8002]: 100 100 100 100 100 100 0 100
tile[8003]: 0 100 100 0 0 100 100 0

I cannot see any effect from this task: nothing at all changes in the output when I move it to a different core, even if I put it on a dedicated core.

Post by infiniteimprobability »

That could be right. It depends on what on_period_TH is set to. There are very few instructions in your select case and events are pretty efficient, so you could be using less than 1% of the core's instructions. The core will be in and out of the event handler very quickly.
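A rough back-of-the-envelope version of that estimate, sketched in plain C. Every number used with it here is an illustrative assumption rather than a measurement from this system, and handler_load_centipercent is a hypothetical helper name:

```c
#include <stdint.h>

/* Estimated share of a logical core's instruction budget used by an event
 * handler, in hundredths of a percent (so 16 means 0.16%).
 * All three inputs are assumptions supplied by the caller. */
static uint32_t handler_load_centipercent(uint32_t instr_per_event,
                                          uint32_t events_per_sec,
                                          uint32_t core_instr_per_sec)
{
    if (core_instr_per_sec == 0) {
        return 0;
    }
    /* Instructions spent in the handler per second, widened to avoid overflow. */
    uint64_t used = (uint64_t)instr_per_event * events_per_sec;
    return (uint32_t)((used * 10000u) / core_instr_per_sec);
}
```

Assuming about 50 instructions per timer event, 2,000 events per second (a 1 kHz PWM with two edges per period), and 62,500,000 instructions per second for one logical core (a 500 MHz tile shared evenly by eight logical cores), the function returns 16, i.e. about 0.16%. A figure that small would round down to 0 in an integer percentage.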

Why not add some burn code to the select case that will not be optimised out (a for loop toggling an I/O, for example), and see if the number increases?
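A minimal sketch of such burn code in plain C. It swaps the suggested I/O toggling for a volatile variable, which is a different technique chosen only because it needs no port; burn_sink and burn_cycles are hypothetical names:

```c
#include <stdint.h>

/* volatile forces every read and write of burn_sink to really happen,
 * so the compiler cannot remove the loop as dead code. */
static volatile uint32_t burn_sink;

/* Spend time doing throwaway work; returns the final value of burn_sink. */
static uint32_t burn_cycles(uint32_t iterations)
{
    for (uint32_t i = 0; i < iterations; i++) {
        burn_sink ^= i;
    }
    return burn_sink;
}
```

Calling burn_cycles with a progressively larger iteration count from inside a timer case should make the reported usage for that core climb if the measurement is working. If the number stays at 0 regardless, the problem is more likely in the monitoring than in the load.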

Post by mozcelikors »

Thanks for the suggestion. I will try that and let you know.

Post by mozcelikors »

The thing is, even if I change which core a task runs on, the core usage information stays the same. Therefore, I don't think it's a load-related problem.

For example
For
lightsystem_task on core 6
ethernet_task on core 7

Output: .... 0% 100%

---
For
lightsystem_task on core 7
ethernet_task on core 6

Output: .... 0% 100%

As if that were not weird enough, I have a task that reads over UART, which is placed on core 1. When I send some commands over UART, the usage shown for core 3 fluctuates between 0% and 100%.

Please take a look at my overall core distribution:

Code:

par {
     // UART TX Related Tasks
     on tile[0]:          output_gpio (i_gpio_tx, 1, PortUART_TX, null); //Previously had no core assigned
     on tile[0]:          uart_tx(i_tx, null, BAUD_RATE, UART_PARITY_NONE, 8, 1, i_gpio_tx[0]);//Previously had no core assigned

     // UART RX Related Tasks
     on tile[0].core[0] : input_gpio_1bit_with_events (i_gpio_rx, PortUART_RX);
     on tile[0].core[0] : uart_rx(i_rx, null, RX_BUFFER_SIZE, BAUD_RATE, UART_PARITY_NONE, 8, 1, i_gpio_rx);

     // I2C Task
     on tile[0] :         Task_MaintainI2CConnection(i2c_client_device_instances, 1, PortSCL, PortSDA, I2C_SPEED_KBITPERSEC);

     // Motor Speed Controller (PWM) Tasks
     on tile[0].core[4] :         Task_DriveTBLE02S_MotorController(PortMotorSpeedController, control_interface, sensors_interface);//Combinable; previously had no core assigned

     // Steering Servo (PWM) Tasks
     on tile[0].core[4] :         Task_SteeringServo_MotorController (PortSteeringServo, steering_interface);//Combinable; previously had no core assigned

     //Light System Task
     on tile[0].core[6]:              Task_ControlLightSystem (PortLightSystem_TH, PortLightSystem_ST, lightstate_interface);

     // Other Tasks
     on tile[0] :           Task_ReadSonarSensors(i2c_client_device_instances[0], sensors_interface);
     on tile[0].core[1] :   Task_GetRemoteCommandsViaBluetooth(i_tx, i_rx, control_interface, steering_interface, i_cmd_from_ethernet_to_override, lightstate_interface);
     //on tile[0] :         Task_ProduceMotorControlOutputs (sensors_interface);

     // Core Monitoring Tasks
     on tile[0].core[7]:           Task_MonitorCoresInATile (core_stats_interface_tile0);
     on tile[1].core[1]:           Task_MonitorCoresInATile (core_stats_interface_tile1);


     // Ethernet App Tasks
     on tile[1]: rgmii_ethernet_mac(i_eth_rx, NUM_ETH_CLIENTS, i_eth_tx, NUM_ETH_CLIENTS,
             null, null,
             c_rgmii_cfg, rgmii_ports,
             ETHERNET_DISABLE_SHAPER);
     on tile[1].core[0]: rgmii_ethernet_mac_config(i_eth_cfg, NUM_CFG_CLIENTS, c_rgmii_cfg);
     on tile[1].core[0]: ar8035_phy_driver(i_smi, i_eth_cfg[CFG_TO_PHY_DRIVER]);
     on tile[1]: smi(i_smi, p_smi_mdio, p_smi_mdc);

     on tile[0]: xtcp(c_xtcp,
             1,
             null,
             i_eth_cfg[0],
             i_eth_rx[0],
             i_eth_tx[0],
             null,
             ETHERNET_SMI_PHY_ADDRESS,
             null,
             otp_ports,
             ipconfig);

     on tile[0]: Task_EthernetAppTCPServer(c_xtcp[0], i_cmd_from_ethernet_to_override, core_stats_interface_tile0, core_stats_interface_tile1);



  }
