Game: optimised machine code

Technical questions regarding the XTC tools and programming with XMOS.
yzoer
XCore Addict
Posts: 133
Joined: Tue Dec 15, 2009 10:23 pm

Post by yzoer »

Here's another one for anyone who likes bit-twiddling:

Add two 8-bit numbers and clamp the result between 0 and 255, but without any compares and/or branches.

Yvo


User avatar
ahenshaw
Experienced Member
Posts: 96
Joined: Mon Mar 22, 2010 8:55 pm

Post by ahenshaw »

You must be talking about a signed add, otherwise that would just be an ADD and an AND. Correct?
User avatar
ahenshaw
Experienced Member
Posts: 96
Joined: Mon Mar 22, 2010 8:55 pm

Post by ahenshaw »

Oh. Clamp not wrap-around. Sorry, now I see the difficulty.
yzoer
XCore Addict
Posts: 133
Joined: Tue Dec 15, 2009 10:23 pm

Post by yzoer »

Think carry's, shift's, and's and or's :-)
User avatar
segher
XCore Expert
Posts: 844
Joined: Sun Jul 11, 2010 1:31 am

Post by segher »

Five (short) insns is easy:

Code: Select all

add t1,a,b
shr t2,t1,8
neg t3,t2
or d,t1,t3
zext d,8
but I suspect you have a shorter sequence in mind?
yzoer
XCore Addict
Posts: 133
Joined: Tue Dec 15, 2009 10:23 pm

Post by yzoer »

It's a bit more complicated than that :-)

Yours only clamps to 255 but doesn't clamp to zero. So when you add a two's complement number to it ( for example 2 and -3 ) the result should become zero, not 255. Maybe I should have been clearer on that part..

It's the same principle though. Rather than looking at the 8th bit, you look at the 9th bit to see if it underflowed.
User avatar
segher
XCore Expert
Posts: 844
Joined: Sun Jul 11, 2010 1:31 am

Post by segher »

Not sure what you're asking for then...

Adding two 8-bit two's complement numbers cannot reach 255.
Do you want 9-bit?
yzoer
XCore Addict
Posts: 133
Joined: Tue Dec 15, 2009 10:23 pm

Post by yzoer »

Alright, let me rephrase that. Add two SIGNED n-bit numbers and clamp them to either 0 or the highest positive value.

Best I can come up with is 9 instructions...
User avatar
segher
XCore Expert
Posts: 844
Joined: Sun Jul 11, 2010 1:31 am

Post by segher »

Seven:

Code: Select all

add t1,a,b
shr t2,t1,7
neg t3,t2
or d,t1,t3
shr t4,t1,8
andnot d,t4
zext d,7
but I think this can be done shorter :-)

Any particular reason you don't want branches, or just for
the challenge?
User avatar
segher
XCore Expert
Posts: 844
Joined: Sun Jul 11, 2010 1:31 am

Post by segher »

Five...

Code: Select all

.globl electrickery
electrickery:
ldap r11,x+8
add r0,r0,r1
stw r0,r11[0]
ashr r0,r0,7
ldw r0,r11[r0]
retsp 0

.align 4
x: .long 0, 0, 42, 127
Caveats: only works from one thread concurrently; requires
the inputs to be in range (as a signed char).