LMUL and signed multiplication

lilltroll
XCore Expert
Posts: 956
Joined: Fri Dec 11, 2009 3:53 am
Location: Sweden, Eskilstuna

Post by lilltroll »

What is the "smart way" to use LMUL?
For example, say that I would like to calculate

signed_int128 * signed_int128 -> signed_int128

Or a 128bit MAC

signed_int64 * signed_int64 + signed_int128 -> signed_int128 as fast as possible.

Using the LMUL macro in XC, how can I write a fast loop that multiplies integers of any length stored in some type of array or struct?
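
A minimal sketch of the arbitrary-length case, written in plain C rather than XC so the arithmetic can be checked on a host. `lmul_model` is a hypothetical stand-in for what the XC `lmul` builtin computes (a*b + c + d returned as a high and a low word). `mul_words` and the little-endian word layout are assumptions made for illustration, not anything taken from the XMOS libraries.

```c
#include <stddef.h>
#include <stdint.h>

/* Host-side model of LMUL: a*b + c + d as a {hi, lo} pair.
   The sum cannot overflow 64 bits: (2^32-1)^2 + 2*(2^32-1) = 2^64 - 1. */
static void lmul_model(uint32_t a, uint32_t b, uint32_t c, uint32_t d,
                       uint32_t *hi, uint32_t *lo)
{
    uint64_t r = (uint64_t)a * b + c + d;
    *hi = (uint32_t)(r >> 32);
    *lo = (uint32_t)r;
}

/* Unsigned schoolbook multiply: r[0 .. na+nb-1] = a[0 .. na-1] * b[0 .. nb-1].
   Index 0 is the least significant word (assumed layout). */
static void mul_words(uint32_t *r, const uint32_t *a, size_t na,
                      const uint32_t *b, size_t nb)
{
    for (size_t k = 0; k < na + nb; k++)
        r[k] = 0;

    for (size_t i = 0; i < na; i++) {
        uint32_t carry = 0;
        for (size_t j = 0; j < nb; j++) {
            /* One LMUL per word pair: partial product, plus the word
               already accumulated in r, plus the carry from the last step. */
            lmul_model(a[i], b[j], r[i + j], carry, &carry, &r[i + j]);
        }
        r[i + nb] = carry;   /* this word has not been written by earlier rows */
    }
}
```

For a truncated signed_int128 * signed_int128 -> signed_int128, only result words 0..3 are needed. Those low words are the same for signed and unsigned operands, so the partial products with i + j >= 4 could simply be skipped.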

I checked the simulator output in debugger mode.

long long A,B,C;

A*B
A*B+C

The generated code starts with LMUL and LADD

but then it uses
MUL
ADD
MUL
ADD

to calculate the most significant part. Could it use MAC or LMUL instead, or is there some magic with the sign?
Somehow it handles -1*1 = -1 and -1*-1 = 1 ... and I find that very nice :mrgreen: You can see the magic happen by stepping through the instructions in the simulator and watching the registers.

I understand the thing with circular math, making the ADD instruction functional both for signed and unsigned - but I haven't understood the math behind signed and unsigned multiplication.
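
The math can be sketched in plain C. In two's complement a negative 32-bit value a is stored as a + 2^32, so the unsigned product is too large by 2^32 times the other operand. The low word is therefore identical for signed and unsigned multiplication, and only the high word needs a correction. For a truncated 64x64 -> 64 product the cross terms only contribute their low 32 bits, which is probably why plain MUL and ADD are enough for the upper word. The function names below are hypothetical, and this is a host-side illustration under those assumptions, not what the compiler literally emits.

```c
#include <stdint.h>

/* Bit pattern of the signed 32x32 -> 64 product, built from the
   unsigned product plus a correction of the high word. */
static uint64_t smul_bits(int32_t a, int32_t b)
{
    uint32_t au = (uint32_t)a, bu = (uint32_t)b;
    uint64_t p  = (uint64_t)au * bu;      /* unsigned product */
    uint32_t hi = (uint32_t)(p >> 32);
    uint32_t lo = (uint32_t)p;            /* same for signed and unsigned */

    if (a < 0) hi -= bu;                  /* a was read as a + 2^32 */
    if (b < 0) hi -= au;                  /* b was read as b + 2^32 */
    return ((uint64_t)hi << 32) | lo;
}

/* Truncated 64x64 -> 64 product from 32-bit words. The result is the
   same bit pattern whether the operands are signed or unsigned. */
static uint64_t mul64_low(uint64_t a, uint64_t b)
{
    uint32_t al = (uint32_t)a, ah = (uint32_t)(a >> 32);
    uint32_t bl = (uint32_t)b, bh = (uint32_t)(b >> 32);

    uint64_t p  = (uint64_t)al * bl;      /* needs the full 64 bits (LMUL) */
    uint32_t hi = (uint32_t)(p >> 32);
    hi += al * bh;                        /* only the low 32 bits matter (MUL, ADD) */
    hi += ah * bl;                        /* likewise (MUL, ADD) */
    return ((uint64_t)hi << 32) | (uint32_t)p;
}
```

With these, smul_bits(-1, 1) gives the bit pattern of -1 and smul_bits(-1, -1) gives 1, matching what the simulator shows.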


Probably not the most confused programmer anymore on the XCORE forum.

Post by lilltroll »

Making a signed 96-bit MAC:

int64(int Ah, uint Al) * int32(int Bh) + int96(int Ch, uint Cm, uint Cl) => int96(int yh, uint ym, uint yl)

Something like this below? It uses the macs instruction at the end to handle the signs of A and B.

Code:

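	// Low partial product, with Bh treated as unsigned: Al*Bh + Cl -> {ym,yl}
	{ym,yl}=lmul(Al,Bh,0,Cl);
	// Add the middle word of C (carry must be 0 on entry); the carry out goes to carry
	asm("ladd %0, %1, %2, %3, %4 " : "=r"(carry), "=r"(ym) : "r"(Cm), "r"(ym), "r"(carry));
	// Signed high partial product: Ah*Bh + (carry:ym) -> {yh,ym}
	{yh,ym}=macs(Ah,Bh,carry,ym);
	yh+=Ch;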
and with an example:

Code:

#include <xs1.h>
#include <print.h>

long Ah=0x40000001,Bh=0x40000001,Ch=0x10000001,yh;
unsigned long yl,ym,Al=0,Cm=2,Cl=3;
int carry=0;

int main(){
	{ym,yl}=lmul(Al,Bh,0,Cl);
	asm("ladd %0, %1, %2, %3, %4 " : "=r"(carry), "=r"(ym) : "r"(Cm), "r"(ym), "r"(carry));
	{yh,ym}=macs(Ah,Bh,carry,ym);
	yh+=Ch;
			
	printhexln(yh);
	printhexln(ym);
	printhexln(yl);
	return 0;
}
giving the console output
20000001 (yh)
80000003 (ym)
3 (yl)

In hex: A * B + C = 10000000 80000001 00000000 + 10000001 00000002 00000003 = 20000001 80000003 00000003
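
The same three steps can be modelled in plain C to check the numbers on a host. This is only a sketch: `mac96_model` and `words96` are hypothetical names, the lmul, ladd and macs steps are replaced by 64-bit arithmetic, and, like the XC code above, it treats Bh as unsigned in the first step, so it is only expected to be right for Bh >= 0.

```c
#include <stdint.h>

/* Three 32-bit words of a 96-bit result, stored as raw bit patterns. */
typedef struct { uint32_t h, m, l; } words96;

/* (Ah:Al) * Bh + (Ch:Cm:Cl), assuming Bh >= 0. */
static words96 mac96_model(int32_t Ah, uint32_t Al, int32_t Bh,
                           uint32_t Ch, uint32_t Cm, uint32_t Cl)
{
    words96 y;

    /* lmul step: Al*Bh + 0 + Cl -> {ym, yl} */
    uint64_t t = (uint64_t)Al * (uint32_t)Bh + Cl;
    y.l = (uint32_t)t;
    uint32_t ym = (uint32_t)(t >> 32);

    /* ladd step: Cm + ym -> {carry, ym} */
    uint64_t s = (uint64_t)Cm + ym;
    uint32_t carry = (uint32_t)(s >> 32);
    ym = (uint32_t)s;

    /* macs step: signed Ah*Bh + (carry:ym) -> {yh, ym} */
    uint64_t acc = ((uint64_t)carry << 32) | ym;
    acc += (uint64_t)((int64_t)Ah * Bh);
    y.m = (uint32_t)acc;

    /* final add of the top word of C */
    y.h = (uint32_t)(acc >> 32) + Ch;
    return y;
}
```

Calling mac96_model(0x40000001, 0, 0x40000001, 0x10000001, 2, 3) reproduces the three words printed above.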

Post by lilltroll »

This must be better for "MAC96". Can I reduce it further by skipping the if-else? Can abs() be done more efficiently?

Code:

unsigned int ym=Cm;
unsigned int yl=Cl;
int yh=Ch;

if(x>=0)
	{ym,yl}=mac(Al,x,ym,yl);
else
	{ym,yl}=mac(Al,-x,ym,yl);
{yh,ym}=macs(Ah,x,Ch,ym);
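
One possible branch-free alternative, sketched in plain C so it can be tested on a host. It relies on the identity Al * x = Al * xu - 2^32 * Al for x < 0, where xu is x reinterpreted as unsigned: multiply unsigned, then subtract Al from the next word up whenever x is negative. Note that negating x before an unsigned multiply adds Al * |x| instead of subtracting it, so the else branch above may not give the intended sign. `mac96_signed` is a hypothetical name, and whether the mask-and-subtract is actually faster than the branch on an XCORE would have to be measured.

```c
#include <stdint.h>

/* y = (Ah:Al) * x + (Ch:Cm:Cl) for signed A, x and C.
   y[0] is the low word and y[2] the high word, as raw bit patterns. */
static void mac96_signed(int32_t Ah, uint32_t Al, int32_t x,
                         uint32_t Ch, uint32_t Cm, uint32_t Cl,
                         uint32_t y[3])
{
    uint32_t xu   = (uint32_t)x;
    uint32_t mask = 0u - (xu >> 31);          /* all ones if x < 0, else 0 */

    /* Low partial product, unsigned: Al*xu + Cl cannot overflow 64 bits. */
    uint64_t lo = (uint64_t)Al * xu + Cl;
    y[0] = (uint32_t)lo;

    /* Carry word of the low product plus the middle word of C. */
    uint64_t mid = (lo >> 32) + Cm;

    /* Signed high partial product, plus mid, minus the sign correction. */
    uint64_t hi = (uint64_t)((int64_t)Ah * x) + mid - (Al & mask);
    y[1] = (uint32_t)hi;
    y[2] = (uint32_t)(hi >> 32) + Ch;
}
```

For example, A = 3, x = -2 and C = 10 gives the words 0, 0, 4 (high to low), and the operands from the earlier example give 20000001 80000003 00000003 again.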