Shared memory and busy waiting

kean
Member++
Posts: 19
Joined: Tue Sep 27, 2011 11:49 am

Post by kean »

Hello everyone,

I have been trying to do something like the code below. The goal is to have one shared variable whose value is updated by thread1 and then read by thread2. In my real application this would be a circular array queue, and the values written and read would be the tail and head indexes.

Code: Select all

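#include <print.h>

int var=0;	// shared between both threads

extern inline int get_var();
extern inline void set_var(int x);

int main()
{
	par
	{
		{
			// consumer: spin until the other thread sets var
			while(get_var()!=1)
				;
			printstr("var read is 1\n");
		}
		{
			// producer
			set_var(1);
			printstr("var set to 1\n");
		}
	}
	return 0;
}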
To trick the compiler into not reporting parallel usage rule violations, the functions get_var() and set_var() are defined with a macro, as shown below.

Code: Select all

#define create_getsetvar \
	inline int get_var() { \
		int x=0; \
		asm("add %0, %1, 0": "=r"(x) : "r"(var)); \
		return x; \
	} \
	inline void set_var(int x) { \
		asm("add %0, %1, 0": "=r"(var) : "r"(x)); \
	}

create_getsetvar
Everything works fine without optimization, but when I turn on -O3 optimization (which is imperative in this project) the program logic breaks.

Do you know of any workaround for this? It doesn't seem possible to disable optimization for just one code block or function...

Of course I could simply drop the inline specifier and define the two functions in a C module with var declared volatile. But my goal is to achieve maximum performance.
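
The volatile-based alternative mentioned in the last paragraph could look roughly like the sketch below. It is plain C, meant to live in its own C module. The names get_var and set_var come from the post; the file name and the static storage are assumptions made for illustration. Every access goes through a real function call, which is exactly the overhead the post wants to avoid, and volatile only forces the loads and stores to happen; it does not by itself order other memory accesses around them.

```c
/* shared_var.c: hypothetical C module holding the shared flag. */

/* volatile tells the compiler that every read and write must really
 * touch memory, so a polling loop cannot keep the value in a register. */
static volatile int var = 0;

/* Read the current value; the load is performed on every call. */
int get_var(void)
{
    return var;
}

/* Store a new value; the store is not deferred or removed. */
void set_var(int x)
{
    var = x;
}
```

The XC side would then declare both functions as ordinary external functions and call them from the two threads as in the first listing.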


Folknology
XCore Legend
Posts: 1274
Joined: Thu Dec 10, 2009 10:20 pm

Post by Folknology »

You could use a channel to pass the variable's value between threads; then you wouldn't need to trick the compiler. Unfortunately, although you could do this with the head and tail indexes, you would still have to share the array itself, which would trip you up again. However, one trick is to pass C pointers over a channel; I have seen this used a number of times in mixed XC/C programs.

regards
Al
XMatt
XCore Addict
Posts: 147
Joined: Tue Feb 23, 2010 6:55 pm

Post by XMatt »

You can do something like this: create two references to the same memory buffer in assembler. Obviously using channels is safer, but if you want to do it this way, it works.

main.xc:

Code: Select all

#include <print.h>

extern int data_buffer[1024];
extern int data_buffer_[1024];

int main()
{
   par
   {
      {
         while(data_buffer[0]!=1);
         printstr("var read is 1\n");
      }
      {
         data_buffer_[0] = 1;
         printstr("var set to 1\n");
      }
   }

   return 0;
}
buffer.s:

Code: Select all

.section .dp.bss, "awd", @nobits

.align 4

.globl data_buffer
.globl data_buffer_
data_buffer:
data_buffer_:
  .space 4096
PPavlov
Junior Member
Posts: 4
Joined: Fri Apr 22, 2016 12:12 pm

Post by PPavlov »

XMatt wrote: You can do something like this: create two references to the same memory buffer in assembler. Obviously using channels is safer, but if you want to do it this way, it works.
buffer.s:

Code: Select all

.section .dp.bss, "awd", @nobits

.align 4

.globl data_buffer
.globl data_buffer_
data_buffer:
data_buffer_:
  .space 4096
Hi XMatt,

This example works when everything is on the same tile. I ran into issues when I tried to exchange data between different tiles this way. Is it possible to adapt this example to work in my case?
P.S. I have tried many ways (interfaces, channels) to solve my task. It now looks like the only way is to use some asm-level approach.

Best regards,
Peter
kean
Member++
Posts: 19
Joined: Tue Sep 27, 2011 11:49 am

Post by kean »

Thank you XMatt, this alternative simplifies my code a lot.

A last question about this line of code:

Code: Select all

while(data_buffer[0]!=1);
Are array accesses not optimized out by the compiler? Is this the reason you are using arrays instead of a plain variable, as in my previous example?

I will probably also consider Folknology's solution; without all these workarounds everything would be much cleaner!
Folknology
XCore Legend
Posts: 1274
Joined: Thu Dec 10, 2009 10:20 pm

Post by Folknology »

It's worth taking a look at this thread. I still think an 'I really do know what I'm doing with this array' pragma would be really handy, rather than these ugly hacks. More recently, however, I have been moving away from XC back to C in order to reclaim its power. I just wish XMOS would release the XS1 intrinsics for C to make things easier; if they don't, I may end up realising my own libs next year.

regards
Al
Lele
Active Member
Posts: 52
Joined: Mon Oct 31, 2011 4:08 pm

Post by Lele »

Hi Kean,
I had the same problem:
A FIFO buffer managed as a circular buffer, where an .xc thread writes (inserts) data and a .c thread reads (extracts) it when available.
The buffer and the write and read positions are stored in shared global variables in the .xc module.
The write position is only modified by the inserting thread and the read position only by the reading thread (so no parallel usage rule violations).
The writing thread needs to check when the FIFO is full, and the reader when it is empty.
To check for an empty FIFO, the reader compares the read and write positions: if they are equal, the FIFO is empty and the reader has to wait for new data pushed by the writer (i.e. wait for WritePosition to change value).
It was done this way:

Code: Select all

while(WritePosition==ReadPosition)
;
Everything worked fine without optimization, but when I enabled C optimization it didn't work any more.
Looking at the compiled code, I saw that registers were used in the comparison and understood my error.
However, casting to 'volatile' didn't solve the problem.
Moving the global variables into the C module (XC doesn't have volatile) and declaring them volatile didn't work either; the compiler continued to use registers.
It only worked when I used a function to read the global variable:

Code: Select all

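/* accessor renamed here (hypothetical name): a function cannot share its name with the variable it returns */
unsigned GetWritePosition(void) {return WritePosition;}
...
while(GetWritePosition()==ReadPosition)
;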
However, I think this is not a good solution. We have an event-driven processor, and polling should never be done (for power and MIPS reasons).
The best way is to synchronize with channels.
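
The empty and full checks described in this post can be sketched in standard C as below. This is only an illustration under assumptions: the post does not say how "full" is detected, so the sketch uses the common convention of leaving one slot unused; FIFO_SIZE, Buffer and the function names are hypothetical. It also assumes a compiler that honours volatile as standard C requires, which the post reports was not the case with the toolchain used at the time, and it says nothing about hardware memory ordering.

```c
#include <stdbool.h>

#define FIFO_SIZE 8u  /* hypothetical capacity; one slot always stays unused */

/* All shared objects are volatile so the compiler neither caches them in
 * registers nor reorders these accesses relative to each other. */
static volatile unsigned Buffer[FIFO_SIZE];
static volatile unsigned WritePosition = 0; /* modified only by the writer */
static volatile unsigned ReadPosition  = 0; /* modified only by the reader */

/* Empty when both positions are equal, as described in the post. */
bool fifo_is_empty(void)
{
    return WritePosition == ReadPosition;
}

/* Full when advancing the write position would reach the read position
 * (assumed convention, not taken from the post). */
bool fifo_is_full(void)
{
    return (WritePosition + 1u) % FIFO_SIZE == ReadPosition;
}

/* Writer side: returns false instead of overwriting unread data. */
bool fifo_insert(unsigned value)
{
    if (fifo_is_full())
        return false;
    Buffer[WritePosition] = value;
    WritePosition = (WritePosition + 1u) % FIFO_SIZE; /* publish after the data */
    return true;
}

/* Reader side: returns false when there is nothing to extract. */
bool fifo_extract(unsigned *value)
{
    if (fifo_is_empty())
        return false;
    *value = Buffer[ReadPosition];
    ReadPosition = (ReadPosition + 1u) % FIFO_SIZE;
    return true;
}
```

A polling reader would loop on fifo_extract() until it returns true, which is the same busy wait as above, only with the volatile reads hidden behind a function.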
kean
Member++
Posts: 19
Joined: Tue Sep 27, 2011 11:49 am

Post by kean »

Hi Lele,
I took my cue from XMatt's post and, referring to your example, modified the busy-waiting part as follows:

Code: Select all

unsigned int WritePosition[1], ReadPosition[1];
...
while(WritePosition[0]==ReadPosition[0])
;
Apparently the XCC compiler is not able to optimize out array accesses, so even with -O3 optimization the program logic does not break. This solution is ugly of course, and my main concern is whether XCC will stay "this stupid" in new releases.

This is the best solution I have found to my problem so far. In my producer/consumer scenario, by the time the consumer has processed its data, the producer may need to enqueue up to 20 items. If the data address is passed through a channel, the producer would block once the limited channel buffer has been filled. In my case this is not acceptable.
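
For comparison, here is a minimal sketch of a producer that never blocks and does not depend on the compiler failing to optimize array accesses. It assumes a toolchain with C11 <stdatomic.h>, which is not claimed for the XCC release discussed in this thread, and all names (QUEUE_SLOTS, Items, try_enqueue, try_dequeue) are hypothetical. The positions are free-running counters, so the slot count must be a power of two; 32 is chosen only because it exceeds the 20 pending items mentioned above.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define QUEUE_SLOTS 32u  /* hypothetical; power of two, larger than 20 */

static int Items[QUEUE_SLOTS];
static atomic_uint WritePosition; /* total items ever enqueued (producer only) */
static atomic_uint ReadPosition;  /* total items ever dequeued (consumer only) */

/* Producer side: reports a full queue instead of blocking. */
bool try_enqueue(int item)
{
    unsigned w = atomic_load_explicit(&WritePosition, memory_order_relaxed);
    unsigned r = atomic_load_explicit(&ReadPosition, memory_order_acquire);
    if (w - r == QUEUE_SLOTS)
        return false;                 /* full: caller decides what to do */
    Items[w % QUEUE_SLOTS] = item;
    /* release: the item store above becomes visible before the new index */
    atomic_store_explicit(&WritePosition, w + 1u, memory_order_release);
    return true;
}

/* Consumer side: reports an empty queue instead of spinning. */
bool try_dequeue(int *item)
{
    unsigned r = atomic_load_explicit(&ReadPosition, memory_order_relaxed);
    unsigned w = atomic_load_explicit(&WritePosition, memory_order_acquire);
    if (w == r)
        return false;                 /* empty */
    *item = Items[r % QUEUE_SLOTS];
    /* release: the slot is handed back only after it has been read */
    atomic_store_explicit(&ReadPosition, r + 1u, memory_order_release);
    return true;
}
```

This only works for exactly one producer thread and one consumer thread. The consumer can still poll try_dequeue() in a loop, so the power and MIPS concern raised earlier in the thread applies here as well.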
kean
Member++
Posts: 19
Joined: Tue Sep 27, 2011 11:49 am

Post by kean »

Hi Al,
I totally agree with you. An XS1 C library is absolutely necessary. I lost lots of time just to end up with something that runs, but that I am almost ashamed to run ;)