Xcore assembler code -> problem with event handling

Eliwood
Member
Posts: 10
Joined: Fri Apr 29, 2016 7:35 am


Post by Eliwood »

Hi all,

when I use this assembler code:

Code: Select all


.text
.align 2
.globl DoPingAsm0

.type DoPingAsm0, @function
.cc_top DoPingAsm0.function

#*********************************************************************************
# Func.: DoPingAsm0()
#        Outputs 16-bit values from an array to the buffered 16-bit port pPing.
#        The base address lies just past the array ValTable[]. Using a negative
#        index in r1 reduces the loop to 4 instructions.
#        With clk = 25 MHz, values from ValTable[] can be output every 40 ns.
# Par.:  r0  - address just past the array, 2-byte address
#        r1  - Anz, number of 16-bit values to output, as a negative value.
#              Must be <= table length. MUST NEVER BE 0!
#        r2  - port
#        r3  -
# used:  r4  - increment value, always 1
#        r5  - Val, 16-bit value from ValTable0[]
#*********************************************************************************

DoPingAsm0:

  entsp 10            # 10 seems excessive
  stw r4, sp[1]
  stw r5, sp[2]
  #--- Init ------------------------------
  ldc   r4, 1         # increment value
  #---------------------------------------
Loop0:
  ld16s r5, r0[r1]    # Val = ValTable0[Anz]; Anz is a negative value
  out  res[r2], r5    # pPing <: Val
  #nop
  add r1, r1, r4      # Anz = Anz + 1 (counts up towards 0)
  bt r1, Loop0

  #--------------------------------------
  ldw r4, sp[1]
  ldw r5, sp[2]

  retsp 10


#.size DoPingAsm0, -DoPingAsm0
.cc_bottom DoPingAsm0.function
  .globl DoPingAsm0.nstackwords
  .linkset DoPingAsm0.nstackwords, 10



I always end up in the event that calls the assembler function, and the program never jumps into the other event. The second event receives information over an interface. The timing should work. When I comment out the assembler function, everything works fine, but I need this function for the time-critical output on ports. (The source code is from another person who has since retired.)

Best regards :)
henk
Verified
Respected Member
Posts: 347
Joined: Wed Jan 27, 2016 5:21 pm

Post by henk »

Hi Eliwood,

That code does not contain any events as such; do you have some context, e.g., the place that this is called from?

Cheers,
Henk

Post by Eliwood »

Code: Select all


     case i0.fKey(unsigned char Key):  // INNO: from gettingkeys.xc on tile 1
                                        // previously: from gettingkeys.xc on tile 0
           NewKey = Key;                   // 520 ns BEFORE the falling edge of ADDR_SCHWENKEN (byte is sent first)



           //--- rewrite ValTable0[] ----------------------------------------
           if((NewKey != ActKey) && (NewKey != KEY_NOT_VALID) )
           {

                ActKey = NewKey;
                InitValTable(ActKey, ANGLE_ID, 0 , ValTable0);
                // Caution: bits 7/6 may be set

                //NewKey &= ~(3 << 6);
           }
           disable = 0;

            break;


//=== ADDR_SCHWENKEN =================================================================

   /*   case s0.fKey(unsigned char Key):
          T0.NewKeySchwenken = Key;   // 520 ns BEFORE the falling edge of ADDR_TX_MUSTER!
          //--- initialise ValTable0[] --------------------------------


          break;*/



//=== Pinging =========================================================================
      case pInStart0 when pinsneq(y0) :> y0:  // when Start0 is detected --> y0 = 1, i.e. rising edge     !disable =>


         while(y0) {
             pInStart0 :> y0;  // wait for the falling edge of the start pulse --> y0 = 0
             sync(pInStart0);
         }

          pOutStart0 <: 1;   // set high for pStart0 and pStart1
          sync(pOutStart0);  // wait for the falling clock edge (sync to neg. edge)
                             // --> no longer near the rising edge of Start0 (= TX_START1 on the Digi board);
                             // pOutStart0 no longer goes to pStartPinging and pStartPing0;
                             // it is now only used to trigger the oscilloscope, to check that both port outputs are synchronous
          start_clock(clk);  // pStartPinging and pStartPing0 are now clocked

          trigger1 <: 1;      // needed for synchronous output; several NOP instructions alone did not work; the trigger does not affect the important output signal
          asm ("nop");        // time shift for compact system
          asm ("nop");
          asm ("nop");
          asm ("nop");
          asm ("nop");
          asm ("nop");
          asm ("nop");


          DoPing0(iLen);
          pOutStart0 <: 0;  // limits the passed-through start pulse (TX_START1 on the Digi board)
          y0 = 0;           // The start pulse must be shorter than the shortest ping duration.


          stop_clock(clk);


          break;

      }
  }
}

DoPing0 calls the assembler function:

Code: Select all

void DoPing0(int iLen) {
  //--- output the array ---------------------
  DoPingAsm0((unsigned int)ValTable0 + 2*iLen, -iLen, XS1_PORT_16B );
}
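
The pointer arithmetic in this call can be modelled in plain C. This is only a sketch for illustration: `ping_model` and the `sink` buffer (standing in for the port) are hypothetical names, not part of the project, and it assumes the index is scaled by 2 bytes as `ld16s` does.

```c
#include <stddef.h>
#include <stdint.h>

/* Model of the DoPingAsm0 loop.
 * end       - pointer just past the table (r0)
 * neg_count - negative number of values to emit (r1), must not be 0
 * sink      - hypothetical buffer standing in for the port (r2)
 * Returns the number of values written to sink. */
static size_t ping_model(const int16_t *end, int neg_count, int16_t *sink)
{
    size_t n = 0;
    int i = neg_count;          /* r1: negative index */
    do {
        sink[n++] = end[i];     /* ld16s r5, r0[r1]; out res[r2], r5 */
        i += 1;                 /* add r1, r1, r4 */
    } while (i != 0);           /* bt r1, Loop0 */
    return n;
}
```

For a 4-entry table, `ping_model(table + 4, -4, sink)` emits `table[0]` to `table[3]` in order. A count of 0 would run past the end of the table, which is why the header comment forbids it.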

Post by henk »

Hi,

What processor do you run this on, XS1 architecture or XS2 architecture?

Cheers,
Henk

Post by Eliwood »

It is the XE216-512-TQ128.

Cheers,
Eli

Post by henk »

Hi Eli,

It is possible that the code is called from dual-issue mode; the ABI for XS2 devices requires all functions to be executable in both single-issue and dual-issue mode.

There are a couple of things that don't seem right in this respect:
a) the function should be 4-byte aligned
b) the ENTSP instruction should be paired with an instruction that can execute in either dual or single issue mode.

The assembly code is aligned on 2 bytes only, I think, and the ENTSP is paired with another memory instruction, which is not legal.

It is worth replacing the alignment with a 4-byte alignment and putting a NOP in front of the ENTSP just to clean that up first.

Cheers,
Henk

Post by Eliwood »

I changed the code:

Code: Select all



.text
.align 4
.globl DoPingAsm0

.type DoPingAsm0, @function
.cc_top DoPingAsm0.function

#*********************************************************************************
# Func.: DoPingAsm0()
#        Outputs 16-bit values from an array to the buffered 16-bit port pPing.
#        The base address lies just past the array ValTable[]. Using a negative
#        index in r1 reduces the loop to 4 instructions.
#        With clk = 25 MHz, values from ValTable[] can be output every 40 ns.
# Par.:  r0  - address just past the array, 2-byte address
#        r1  - Anz, number of 16-bit values to output, as a negative value.
#              Must be <= table length. MUST NEVER BE 0!
#        r2  - port
#        r3  -
# used:  r4  - increment value, always 1
#        r5  - Val, 16-bit value from ValTable0[]
#*********************************************************************************

DoPingAsm0:
  nop                 # entsp 10
  stw r4, sp[1]
  stw r5, sp[2]
  #--- Init ------------------------------
  ldc   r4, 1         # increment value
  #---------------------------------------
Loop0:
  ld16s r5, r0[r1]    # Val = ValTable0[Anz]; Anz is a negative value
  out  res[r2], r5    # pPing <: Val
  #nop
  add r1, r1, r4      # Anz = Anz + 1 (counts up towards 0)
  bt r1, Loop0

  #--------------------------------------
  ldw r4, sp[1]
  ldw r5, sp[2]

  retsp 10


#.size DoPingAsm0, -DoPingAsm0
.cc_bottom DoPingAsm0.function
  .globl DoPingAsm0.nstackwords
  .linkset DoPingAsm0.nstackwords, 10

and got this error message when running it:

xrun: Program received signal ET_ILLEGAL_INSTRUCTION, Unable to decode instruction.
[Switching to tile[0] core[2] (dual issue)]
0x00047bc0 in ValTable0 ()

I forgot to tell you that the assembler function exists for both tiles (DoPing0 and DoPing1); they are separate functions with the same source code. The complete code is fairly similar between tile 1 and tile 0.

Code of DoPing1:

Code: Select all



# pingasm.s
# Outputs 16-bit values from the array ValTable[] to the buffered 16-bit port pPing

.text
.align 4
.globl DoPingAsm1

.type DoPingAsm1, @function
.cc_top DoPingAsm1.function

#*********************************************************************************
# Func.: DoPingAsm1()
#        Outputs 16-bit values from an array to the buffered 16-bit port pPing.
#        The base address lies just past the array ValTable[]. Using a negative
#        index in r1 reduces the loop to 4 instructions.
#        With clk = 25 MHz, values from ValTable[] can be output every 40 ns.
# Par.:  r0  - address just past the array, 2-byte address
#        r1  - Anz, number of 16-bit values to output, as a negative value.
#              Must be <= table length. MUST NEVER BE 0!
#        r2  - port
#        r3  -
# used:  r4  - increment value, always 1
#        r5  - Val, 16-bit value from ValTable[]
#*********************************************************************************

DoPingAsm1:

  nop                 # entsp 10
  stw r4, sp[1]
  stw r5, sp[2]
  #--- Init ------------------------------
  ldc   r4, 1         # increment value
  #---------------------------------------
Loop1:
  ld16s r5, r0[r1]    # Val = ValTable[Anz]; Anz is a negative value
  out res[r2], r5     # pPing <: Val
  #nop
  add r1, r1, r4      # Anz = Anz + 1 (counts up towards 0)
  bt r1, Loop1

  #--------------------------------------
  ldw r4, sp[1]
  ldw r5, sp[2]
  retsp 10


#.size DoPingAsm1, -DoPingAsm1
.cc_bottom DoPingAsm1.function
  .globl DoPingAsm1.nstackwords
  .linkset DoPingAsm1.nstackwords, 10


Oops, sorry, I made a mistake in the code. I have now written:

nop
entsp 10

in both functions, so "entsp 10" is no longer commented out. Now I no longer get the error message, but the xcore still ends up in this one event every time.
Last edited by Eliwood on Tue May 03, 2016 12:42 pm, edited 2 times in total.

Post by henk »

You need the entsp too, i.e.

Code: Select all

label:
    nop                     // Maybe in dual issue.
    entsp 10
    stw    r4, sp[1]     // Guaranteed single issue here

Post by Eliwood »

Sorry, I edited my last post because I made a mistake ;)

Post by henk »

Ok - you may want to edit the code in the other posts too - they still have an entsp in the comment!

Anyhow - is XS1_PORT_16B different from all the ports in the select statement? I presume so (otherwise there would be some interesting aliasing)

My best guess is that something has changed in the select (e.g., you added an interface case) that has pushed the setup time of the select beyond what you need to keep up with the case that always fires. That is, the case that you are in all the time keeps firing simply because it is always ready to go, and so it will always be taken.

If you are just over the limit, then you could find out if this is the case by speeding the thread that runs this code up a bit. If you are running a system with more than 5 threads, then you can try and put this thread in high priority mode. That will guarantee this thread to get 100 MIPS, and slow the other threads down proportionally.

A few possibly unrelated points that could help you simplify the code a bit

1) you shouldn't need a DoPingAsm per core - the mapper should replicate it as required.

2) The code:

Code: Select all

         while(y0) {
             pInStart0 :> y0;  // wait for the falling edge of the start pulse --> y0 = 0
             sync(pInStart0);
         }
can be replaced with a single conditional input that waits until the port reads 0:

Code: Select all

             pInStart0 when pinseq(0) :> y0;  // wait for the falling edge of the start pulse --> y0 = 0
3) Finally, the use of a series of NOPs adds uncertainty, e.g.:

Code: Select all

          asm ("nop");        // time shift for compact system
          asm ("nop");
          asm ("nop");
          asm ("nop");
          asm ("nop");
          asm ("nop");
          asm ("nop");
takes between 35 and 56 core clock cycles, depending on how many other threads are running. So if you have added another thread to the system, you may have to remove a NOP or two here.
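
To make the 35 and 56 figures concrete: they correspond to 7 NOPs issued once every 5 core clock cycles (up to five active threads) and once every 8 cycles (eight active threads). A minimal C sketch of that arithmetic, assuming round-robin issue on a 5-stage pipeline and ignoring dual-issue pairing; the function names are hypothetical:

```c
/* Core clock cycles between consecutive instructions of one thread,
 * assuming at most one issue every 5 cycles and round-robin sharing
 * once more than 5 threads are active. */
static unsigned cycles_per_instruction(unsigned active_threads)
{
    return active_threads > 5 ? active_threads : 5;
}

/* Delay, in core clock cycles, of a run of NOPs in one thread. */
static unsigned nop_delay_cycles(unsigned nops, unsigned active_threads)
{
    return nops * cycles_per_instruction(active_threads);
}
```

Under these assumptions, going from five to six active threads already stretches the 7-NOP delay from 35 to 42 cycles, which is why the delay has to be re-tuned whenever the thread count changes.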