Hi guys,
I've been experimenting a little with the lib_locks swlock on XS2. The implementation is not fair when high-priority cores contend with low-priority cores for the lock, i.e. in swlock_acquire(): for some reason the low-priority cores get the lock more often than the high-priority cores. Of course I don't expect it to be as good as hwlock, but it should be reasonably comparable. I am testing a fix that does the following:
- Saves the core priority
- Sets the core priority to low
- Obtains the lock by executing an XS2A-optimized version of swlock_try_acquire
--> dual issue to improve speed
--> remove excess nops since the core is known to be low priority
- Restores the core priority before returning
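The steps above can be sketched in portable C roughly as follows. This is only a host-side sketch under stated assumptions, not the XS2 implementation: get_core_prio(), set_core_prio(), core_prio_t and the sketch_* names are hypothetical stand-ins, the priority is modelled as a plain variable, and the try-acquire step uses a C11 compare-exchange, which is not how lib_locks claims the lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins: on a real XS2 these would read and write the
 * core's priority setting; here a plain variable lets the sketch run on a host. */
typedef enum { PRIO_LOW = 0, PRIO_HIGH = 1 } core_prio_t;
static core_prio_t current_prio = PRIO_HIGH;

static core_prio_t get_core_prio(void) { return current_prio; }
static void set_core_prio(core_prio_t p) { current_prio = p; }

/* Stand-in lock word: 0 means free, nonzero means held. */
typedef atomic_uint sketch_lock_t;

/* Stand-in for an optimized swlock_try_acquire. Assumption: a C11
 * compare-exchange is swapped in for the real claim sequence. */
static bool sketch_try_acquire(sketch_lock_t *lock) {
    unsigned expected = 0;
    return atomic_compare_exchange_strong(lock, &expected, 1u);
}

static void sketch_acquire(sketch_lock_t *lock) {
    core_prio_t saved = get_core_prio();   /* 1. save the caller's priority   */
    set_core_prio(PRIO_LOW);               /* 2. contend as a low-priority core */
    while (!sketch_try_acquire(lock)) {    /* 3. retry until the lock is ours   */
    }
    set_core_prio(saved);                  /* 4. restore before returning       */
}

static void sketch_release(sketch_lock_t *lock) {
    assert(atomic_load(lock) != 0);        /* releasing a free lock is a bug */
    atomic_store(lock, 0u);
}
```

The point of the structure is that the priority change is scoped to the acquire call, so the caller's priority is the same on return as it was on entry.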
lib_locks swlock on XS2
-
- XCore Expert
- Posts: 580
- Joined: Thu Nov 26, 2015 11:47 pm
I am getting the feeling that my "fix" doesn't really improve the fairness of swlock compared to hwlock. However, I can improve the swlock speed anyway, so that's something. I also changed my implementation to use inline assembly, so I removed the dual-issue portion of the code (in case the code is compiled with -mno-dual-issue). Maybe there's a way to determine whether the assembler is in dual-issue or single-issue mode at the point where the inline assembly is emitted; I don't know. In any case, dual issue doesn't speed things up much because there's not much that can be dual issued.
As part of my optimization I implemented a spin wait for when the lock is already held. I found an interesting problem: I need to insert a nop in my spin loop, or the code doesn't lock properly in my testing. Is this a fetch no-op? Or is there some other reason this nop seems necessary? It would be nice to know, since I can't prove the lock works through testing, only through analysis. I guess I start my analysis by assuming the XMOS library is good, and then argue that my changes don't break it.
Bad code:
".Lspin%=:\n"
"ldw %0, %1[0]\n" // Get the current mutex value.
"bt %0, .Lspin%=\n" // Check if it is already claimed.
Apparently functional:
".Lspin%=:\n"
"nop\n"
"ldw %0, %1[0]\n" // Get the current mutex value.
"bt %0, .Lspin%=\n" // Check if it is already claimed.
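For comparison, the same spin-then-claim shape can be written as a host-side reference model in C11. This is a sketch under assumptions, not a translation of the XS2 code: ttas_acquire, ttas_release and my_id are hypothetical names, my_id is assumed to be nonzero, and the claim uses a compare-exchange where the inline assembly above shows only the ldw/bt wait loop.

```c
#include <assert.h>
#include <stdatomic.h>

/* Spin on plain loads while the mutex is claimed, then try to claim it.
 * mutex: 0 means free, any nonzero value identifies the holder. */
static void ttas_acquire(atomic_uint *mutex, unsigned my_id) {
    assert(my_id != 0);                    /* 0 is reserved for "free" */
    for (;;) {
        /* Counterpart of the ldw/bt loop: wait until the word reads 0. */
        while (atomic_load_explicit(mutex, memory_order_relaxed) != 0) {
        }
        /* The word was 0 a moment ago; claim it only if it still is. */
        unsigned expected = 0;
        if (atomic_compare_exchange_weak_explicit(
                mutex, &expected, my_id,
                memory_order_acquire, memory_order_relaxed)) {
            return;
        }
        /* Another contender won the race (or the weak CAS failed
         * spuriously); go back to spinning. */
    }
}

static void ttas_release(atomic_uint *mutex) {
    atomic_store_explicit(mutex, 0u, memory_order_release);
}
```

A model like this can be handy when arguing about the real code by analysis: the wait loop only decides when to attempt the claim, so correctness has to come from the claim step itself.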