Impressive! No reason to not use branches other than to make it more interested :-)
Yvo
Game: optimised machine code
-
- XCore Addict
- Posts: 133
- Joined: Tue Dec 15, 2009 10:23 pm
-
- Experienced Member
- Posts: 114
- Joined: Fri Dec 11, 2009 10:22 am
I've got one for you all: generate the sequence 0, 1, 2, 0, 1, 2, ... i.e.:
can anyone beat the naive approach for speed and fewest registers:
worst case: 4 instruction(no divs), 1 register clobber
Code: Select all
unsigned foo(unsigned x){
x = x+1;
x = x%3;
return x;
}
Code: Select all
add r0, r0, 1
sub r11, r0, 3
bt r11, return
lcd r0, 0
return:
retsp 0
-
- XCore Expert
- Posts: 844
- Joined: Sun Jul 11, 2010 1:31 am
Hi Andrew,
Your example code requires 0 <= x < 3; in that case, it can
be done in two instructions.
Your example code requires 0 <= x < 3; in that case, it can
be done in two instructions.
-
- XCore Expert
- Posts: 844
- Joined: Sun Jul 11, 2010 1:31 am
Without table lookup or branches I'm stuck at four insns.
It _feels_ like three should be possible...
It _feels_ like three should be possible...
-
- Respected Member
- Posts: 318
- Joined: Tue Dec 15, 2009 12:46 am
The next in the sequence is given by x + 1 - 3 * (x >> 1) so you can use the following:
This is four instructions, but if you are doing this in a loop you'll can hoist the ldc instructions, leaving only two instructions.
Code: Select all
ldc t1, 3
ldc t2, 1
shr t3, x, 1
lmul t4, x, t1, t3, x, t2
-
- Respected Member
- Posts: 318
- Joined: Tue Dec 15, 2009 12:46 am
Three instructions:
or
Edit: Added second code sequence
Code: Select all
ldc tmp, 0xd
crc32 x, tmp, tmp
zext x, 2
Code: Select all
ldc tmp, 1
shl x, tmp1, x
zext x, 2
-
- XCore Expert
- Posts: 844
- Joined: Sun Jul 11, 2010 1:31 am
Very nice, esp. that last one :-)
Another three-insn one, using my favourite instruction:
A related question: still using 0,1,2, but this time count backwards?
Another three-insn one, using my favourite instruction:
Code: Select all
mkmsk x,x
add x,x,1
zext x,2
-
- XCore Expert
- Posts: 844
- Joined: Sun Jul 11, 2010 1:31 am
Suppose we want to compute the "ne0" function, that is,
then the standard way is
which is optimal in the general case. But now suppose we
already have a register (say z) that contains 0, how can we
improve on this?
Code: Select all
d = (a != 0);
Code: Select all
eq t,a,0
eq d,t,0
already have a register (say z) that contains 0, how can we
improve on this?
-
- Respected Member
- Posts: 318
- Joined: Tue Dec 15, 2009 12:46 am
a != 0 is equivalent to (unsigned)a > 0 so you can implement ne0 as follows:You don't need the ldc instruction if you already have 0 in a register.
Code: Select all
ldc t, 0
lsu d, t, a
-
- XCore Expert
- Posts: 844
- Joined: Sun Jul 11, 2010 1:31 am
Ha yes, too easy! I was thinking of doing it with lsub (have
to do _something_ tricky with that ;-) ), but this of course
is better.
Your turn for a challenge?
to do _something_ tricky with that ;-) ), but this of course
is better.
Your turn for a challenge?