Game: optimised machine code

yzoer · Post by **yzoer** » Mon Aug 25, 2014 7:10 pm

Impressive! No reason to not use branches other than to make it more interested :-)

Yvo

andrew · Post by **andrew** » Wed Aug 27, 2014 12:01 pm

I've got one for you all: generate the sequence 0, 1, 2, 0, 1, 2, ... i.e.:

unsigned foo(unsigned x){
    x = x+1;
    x = x%3;
    return x;
}

can anyone beat the naive approach for speed and fewest registers:

Code: Select all

add r0, r0, 1
sub r11, r0, 3
bt r11, return
  lcd r0, 0
return:
retsp 0

worst case: 4 instruction(no divs), 1 register clobber

segher · Post by **segher** » Wed Aug 27, 2014 12:45 pm

Hi Andrew,

Your example code requires 0 <= x < 3; in that case, it can
be done in two instructions.

segher · Post by **segher** » Wed Aug 27, 2014 3:37 pm

Without table lookup or branches I'm stuck at four insns.
It _feels_ like three should be possible...

richard · Post by **richard** » Wed Aug 27, 2014 9:21 pm

The next in the sequence is given by x + 1 - 3 * (x >> 1) so you can use the following:

Code: Select all

ldc t1, 3
ldc t2, 1
shr t3, x, 1
lmul t4, x, t1, t3, x, t2

This is four instructions, but if you are doing this in a loop you'll can hoist the ldc instructions, leaving only two instructions.

richard · Post by **richard** » Wed Aug 27, 2014 9:56 pm

Three instructions:

Code: Select all

ldc tmp, 0xd
crc32 x, tmp, tmp
zext x, 2

or

Code: Select all

ldc tmp, 1
shl x, tmp1, x
zext x, 2

Edit: Added second code sequence

segher · Post by **segher** » Wed Aug 27, 2014 11:56 pm

Very nice, esp. that last one :-)

Another three-insn one, using my favourite instruction:

Code: Select all

mkmsk x,x
add x,x,1
zext x,2

A related question: still using 0,1,2, but this time count backwards?

segher · Post by **segher** » Thu Oct 02, 2014 1:25 am

Suppose we want to compute the "ne0" function, that is,

Code: Select all

d = (a != 0);

then the standard way is

Code: Select all

eq t,a,0
eq d,t,0

which is optimal in the general case. But now suppose we
already have a register (say z) that contains 0, how can we
improve on this?

richard · Post by **richard** » Tue Oct 07, 2014 9:06 pm

a != 0 is equivalent to (unsigned)a > 0 so you can implement ne0 as follows:

Code: Select all

ldc t, 0
lsu d, t, a

You don't need the ldc instruction if you already have 0 in a register.

segher · Post by **segher** » Tue Oct 07, 2014 9:24 pm

Ha yes, too easy! I was thinking of doing it with lsub (have
to do _something_ tricky with that ;-) ), but this of course
is better.

Your turn for a challenge?

Game: optimised machine code

Re: Game: optimised machine code

Re: Game: optimised machine code

Re: Game: optimised machine code

Re: Game: optimised machine code

Re: Game: optimised machine code

Re: Game: optimised machine code

Re: Game: optimised machine code

Re: Game: optimised machine code

Re: Game: optimised machine code

Re: Game: optimised machine code