GNU superoptimiser (xcore port)

XCore Project reviews, ideas, videos and proposals.
richard
Respected Member
Posts: 318
Joined: Tue Dec 15, 2009 12:46 am

GNU superoptimiser (xcore port)

Post by richard »

Version: 2010
Status: Public release
License: GPL
Download: /files/project_builds/superopt.zip

Port of GNU Superoptimiser to the XCore ISA. The GNU superoptimiser is a function sequence generator that uses an exhaustive generate-and-test approach to finding the shortest instruction sequence for a given function. The superoptimizer can't generate very long sequences, unless you have a very fast computer or lots of spare time. The time complexity of the used algorithm is approximately
2n

O(m n )
where m is the number of available instructions on the architecture and n is the shortest sequence for the goal function.


User avatar
segher
XCore Expert
Posts: 844
Joined: Sun Jul 11, 2010 1:31 am

Post by segher »

Nice stuff, this will save me some work :-)

I notice it misses some things though... e.g.:

$ ./superopt-xcore -fneq0 -assembly
Searching for { r = -(v0 == 0); }
Superoptimizing at cost 1 2
1: sub r1,r0,1
shr r2,r1,r0
2: eq r1,r0,0
sext r1,1
3: eq r1,r0,0
neg r2,r1
[3 sequences found]

Another way to do neq0 is
mkmsk r1,r0 ; sext r1,1
and it can be done with the long arithmetic insns as well, also in two insns if i'm not mistaken.

Would it be hard to make superopt take the insn length into account as well? Or just display it.

Oh, and look at alternative 1 for neq0 above, that's pretty awesome :-)

Finally, can you put the results on all the standard ops somewhere on the web please?
For lazy people like myself ;-)


Segher
richard
Respected Member
Posts: 318
Joined: Tue Dec 15, 2009 12:46 am

Post by richard »

Thanks for your comments, it's great to see people trying it out.

> Another way to do neq0 is
> mkmsk r1,r0 ; sext r1,1

If you plug 0 into r = -(v0 == 0) you get r = -1.
If you plug 0 into mkmsk r1,r0 ; sext r1,1 you get 0, so I think the tool is right. Those instructions instead compute the goal nne0: { r = -(v0 != 0); }:

Searching for goal nne0: { r = -(v0 != 0); }
Superoptimizing at cost 1 2
1: eq r1,r0,0
sub r2,r1,1
2: mkmsk r1,r0
sext r1,1
[2 sequences found]

> and it can be done with the long arithmetic insns as well
It doesn't support instructions with multiple results. It would be nice to extend it to handle these, but I haven't looked into what that would require.

> Would it be hard to make superopt take the insn length into account as well?
This shouldn’t be too hard. There is a variable allowed_cost which is decremented by 1 each time an instruction is added to the sequence. It should just be a case of changing the code to instead decrement the cost by the size of the instruction.
User avatar
segher
XCore Expert
Posts: 844
Joined: Sun Jul 11, 2010 1:31 am

Post by segher »

Whoops, that neq0 vs. nne0 thing really shows it is good to have tools to
do these things, humans get confused :-)

Wrt multiple results, the PowerPC port uses that for addc and that whole
family of insns (all using the carry bit). It might special-case it though.
And when an insn has (say) d,e as outputs, you also need to try e,d
because it takes the highest-numbered reg as final function result.

Insns with multiple results are pretty interesting because you can do
min/max in three insns (instead of the five it finds now).

Insn length... what I meant was prefer shorter insns, but still prefer fewer
insns. I'll just do this by hand, it's sometimes nice to see the longer insns
anyway.

Thanks again,


Segher
User avatar
segher
XCore Expert
Posts: 844
Joined: Sun Jul 11, 2010 1:31 am

Post by segher »

I'm generating all target functions with a max cost of 6 -- will put the result on the web when it finished.
But, hit a snag -- where it should print "r8", it prints "(null)" instead.

I think you should just pretend to have an infinite number of registers, superopt doesn't do register allocation :-)