LMUL and signed multiplication

lilltroll
XCore Expert
Posts: 955
Joined: Fri Dec 11, 2009 3:53 am
Location: Sweden, Eskilstuna


Postby lilltroll » Sun Feb 21, 2010 12:28 pm

What is the "smart way" to implement LMUL?
For example, say that I would like to calculate

signed_int128 * signed_int128 -> signed_int128

Or a 128bit MAC

signed_int64 * signed_int64 + signed_int128 -> signed_int128 as fast as possible.

Using the LMUL macro in XC, how can I write a fast loop that multiplies integers of any length stored in some type of array or struct?
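
One way to picture such a loop is the schoolbook method, where each inner step is exactly one a*b + c + d operation, which is what lmul provides. The sketch below is plain C, not XC: lmul_emul is a hypothetical stand-in that assumes lmul(a, b, c, d) returns a*b + c + d as a {high, low} pair, and mul_limbs is an illustrative name. It works on unsigned little-endian 32-bit limbs. For a truncated result such as signed_int128 * signed_int128 -> signed_int128, the low limbs of the unsigned product are the same as those of the two's complement signed product, so only the upper limbs of a full-width result would need sign handling.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical stand-in for the XC lmul built-in: a*b + c + d as {hi, lo}.
   The sum always fits in 64 bits: (2^32-1)^2 + 2*(2^32-1) = 2^64 - 1. */
static void lmul_emul(uint32_t a, uint32_t b, uint32_t c, uint32_t d,
                      uint32_t *hi, uint32_t *lo)
{
    uint64_t r = (uint64_t)a * b + c + d;
    *hi = (uint32_t)(r >> 32);
    *lo = (uint32_t)r;
}

/* Schoolbook product of two unsigned little-endian limb arrays.
   out needs room for na + nb limbs and must not alias a or b. */
static void mul_limbs(const uint32_t *a, size_t na,
                      const uint32_t *b, size_t nb, uint32_t *out)
{
    for (size_t i = 0; i < na + nb; i++)
        out[i] = 0;

    for (size_t i = 0; i < na; i++) {
        uint32_t carry = 0;
        for (size_t j = 0; j < nb; j++) {
            /* one lmul per limb pair: product + running carry + partial sum */
            lmul_emul(a[i], b[j], carry, out[i + j], &carry, &out[i + j]);
        }
        out[i + nb] = carry;  /* this limb has not been written by earlier rows */
    }
}
```

Calling mul_limbs(a, 4, b, 4, out) and keeping out[0..3] would give the truncated 128-bit product; to save work, the inner loop could stop at limb index 3 instead of computing limbs that are thrown away.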

I checked the simulator output in debugger mode.

long long A,B,C;

A*B
A*B+C

it starts with LMUL and LADD

but it uses
MUL
ADD
MUL
ADD

when calculating the most significant part. Could it use MAC or LMUL instead, or is there some magic with the sign?
Somehow it handles -1 * 1 = -1 and -1 * -1 = 1, which I find very nice, and you can watch the magic happen in the registers when single-stepping instructions in the simulator.
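
The observed sequence is consistent with how a 64 x 64 -> 64 bit product splits into 32-bit halves. Only the Al*Bl term needs its full 64-bit result (the LMUL step). The cross terms Ah*Bl and Al*Bh only contribute their low 32 bits to the upper word, so a plain MUL and ADD is enough for each, and Ah*Bh lands entirely above bit 63. Because the result is only defined modulo 2^64, the same bit pattern is correct for signed and unsigned operands. A minimal sketch in plain C (the function name is illustrative, and the mapping to instructions in the comments is an assumption about what the compiler does):

```c
#include <stdint.h>

/* Low 64 bits of a 64x64 product, built from 32-bit halves.
   Valid for signed operands in two's complement as well, because
   the result is only defined modulo 2^64. */
static uint64_t mul64_from_halves(uint64_t a, uint64_t b)
{
    uint32_t al = (uint32_t)a, ah = (uint32_t)(a >> 32);
    uint32_t bl = (uint32_t)b, bh = (uint32_t)(b >> 32);

    uint64_t low = (uint64_t)al * bl;   /* full 32x32->64 product (LMUL-like) */
    uint32_t hi  = (uint32_t)(low >> 32);

    hi += ah * bl;                      /* low 32 bits only (MUL, ADD) */
    hi += al * bh;                      /* low 32 bits only (MUL, ADD) */
    /* ah*bh would start at bit 64, so it never needs to be computed */

    return ((uint64_t)hi << 32) | (uint32_t)low;
}
```

With a = -1 and b = 1 stored as two's complement 64-bit patterns, this returns the pattern for -1, and with a = b = -1 it returns 1, without any sign test.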

I understand the idea of circular (wrap-around) math, which makes the ADD instruction work for both signed and unsigned values, but I haven't understood the math behind signed and unsigned multiplication.
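
For a full 32 x 32 -> 64 bit product the difference between signed and unsigned shows up only in the high word. Reading a negative 32-bit value a as unsigned gives a + 2^32, so the unsigned product differs from the signed one by 2^32 * b when a < 0 and by 2^32 * a when b < 0 (the remaining term is a multiple of 2^64 and drops out). The low word is therefore identical, and the signed high word is the unsigned high word minus those corrections. A small C sketch of that identity, with an illustrative function name:

```c
#include <stdint.h>

/* Signed 32x32->64 product obtained from an unsigned multiply.
   The low word needs no correction; the high word is corrected
   modulo 2^32 for each negative operand. */
static void mul32_signed_via_unsigned(int32_t a, int32_t b,
                                      int32_t *hi, uint32_t *lo)
{
    uint32_t au = (uint32_t)a, bu = (uint32_t)b;
    uint64_t p  = (uint64_t)au * bu;    /* unsigned product */
    uint32_t h  = (uint32_t)(p >> 32);

    if (a < 0) h -= bu;                 /* remove the extra 2^32 * b */
    if (b < 0) h -= au;                 /* remove the extra 2^32 * a */

    *lo = (uint32_t)p;
    *hi = (int32_t)h;                   /* assumes the usual two's complement conversion */
}
```

For -1 * 1 the unsigned product is 0x00000000FFFFFFFF, and the correction turns the high word into 0xFFFFFFFF, which gives -1. For -1 * -1 the two corrections cancel the 0xFFFFFFFE high word down to 0, which gives 1.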
Probably not the most confused programmer anymore on the XCORE forum.

Postby lilltroll » Sun Feb 21, 2010 8:04 pm

Making a signed 96-bit MAC:

int64(int Bh,uint Bl) * int32(int Ah) + int96(int Ch, uint Cm,uint Cl) => int96(int yh,uint ym,uint yl)

Something like this below? It uses the macs instruction at the end to handle the sign of A and B.

Code:

	{ym,yl}=lmul(Al,Bh,0,Cl);
	asm("ladd %0, %1, %2, %3, %4 " : "=r"(carry), "=r"(ym) : "r"(Cm), "r"(ym), "r"(carry));
	{yh,ym}=macs(Ah,Bh,carry,ym);
	yh+=Ch;
and with an example:

Code:

#include <xs1.h>
#include <print.h>

long Ah=0x40000001,Bh=0x40000001,Ch=0x10000001,yh;
unsigned long yl,ym,Al=0,Cm=2,Cl=3;
int carry=0;

int main(){
	{ym,yl}=lmul(Al,Bh,0,Cl);
	asm("ladd %0, %1, %2, %3, %4 " : "=r"(carry), "=r"(ym) : "r"(Cm), "r"(ym), "r"(carry));
	{yh,ym}=macs(Ah,Bh,carry,ym);
	yh+=Ch;
			
	printhexln(yh);
	printhexln(ym);
	printhexln(yl);
	return 0;
}
giving the console output
20000001 (yh)
80000003 (ym)
3 (yl)

In hex: A * B + C = 10000000 80000001 00000000 + 10000001 00000002 00000003 = 20000001 80000003 00000003
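
The same four steps can be checked on a PC with plain C. This is only a sketch: it assumes lmul returns a*b + c + d as {high, low}, that ladd adds two words and returns the carry, and that macs adds the signed product a*b to the 64-bit accumulator {c:d}. The int96 struct and the function name are hypothetical helpers, not part of xs1.h.

```c
#include <stdint.h>

/* Hypothetical 96-bit container: signed high word, unsigned middle and low words. */
typedef struct { int32_t h; uint32_t m, l; } int96;

/* Mirrors the lmul / ladd / macs / add sequence from the post, step by step. */
static int96 mac96_steps(uint32_t Al, int32_t Ah, int32_t Bh, int96 C)
{
    int96 y;

    /* {ym,yl} = lmul(Al, Bh, 0, Cl): Bh is treated as unsigned here */
    uint64_t t = (uint64_t)Al * (uint32_t)Bh + C.l;
    uint32_t ym = (uint32_t)(t >> 32);
    y.l = (uint32_t)t;

    /* ladd: ym = Cm + ym, with the carry out kept for the high word */
    uint64_t s = (uint64_t)C.m + ym;
    ym = (uint32_t)s;
    uint32_t carry = (uint32_t)(s >> 32);

    /* {yh,ym} = macs(Ah, Bh, carry, ym): signed product plus {carry:ym} */
    uint64_t acc = ((uint64_t)carry << 32) | ym;
    acc += (uint64_t)((int64_t)Ah * Bh);
    y.m = (uint32_t)acc;

    /* yh += Ch, done in unsigned arithmetic so that it wraps */
    y.h = (int32_t)((uint32_t)(acc >> 32) + (uint32_t)C.h);
    return y;
}
```

With Al = 0, Ah = Bh = 0x40000001 and C = {0x10000001, 2, 3}, this yields h = 0x20000001, m = 0x80000003, l = 3, matching the console output above.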

Postby lilltroll » Mon Feb 22, 2010 12:17 pm

This must be better for "MAC96". Can I reduce it further by skipping the if-else? Can abs() be done more efficiently?

Code:

unsigned int ym=Cm;
unsigned int yl=Cl;
int yh=Ch;

if(x>=0)
	{ym,yl}=mac(Al,x,ym,yl);
else
	{ym,yl}=mac(Al,-x,ym,yl);

{yh,ym}=macs(Ah,x,Ch,ym);
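
One possible way to drop the if-else is to multiply Al by the raw bit pattern of x and then subtract Al from the middle word when x is negative, using a mask instead of a branch. The sketch below is plain C for checking the arithmetic on a PC, not XC. It assumes A is the signed 64-bit value {Ah:Al}, x is a signed 32-bit value and C is a signed 96-bit value. The int96 struct and both function names are hypothetical. It also propagates the carry out of the low word, which a 64-bit accumulate on {ym:yl} alone would drop.

```c
#include <stdint.h>

/* Hypothetical 96-bit container: signed high word, unsigned middle and low words. */
typedef struct { int32_t h; uint32_t m, l; } int96;

/* Branch-free magnitude of x; INT32_MIN maps to 0x80000000. */
static uint32_t abs32_branchless(int32_t x)
{
    uint32_t m = -(uint32_t)(x < 0);        /* all ones if x < 0, else zero */
    return ((uint32_t)x ^ m) - m;
}

/* y = {Ah:Al} * x + C, computed modulo 2^96 without testing the sign of x. */
static int96 mac96_branchless(uint32_t Al, int32_t Ah, int32_t x, int96 C)
{
    int96 y;
    uint32_t xu   = (uint32_t)x;
    uint32_t mask = -(uint32_t)(x < 0);     /* all ones if x < 0, else zero */

    /* low word: Al * xu + Cl always fits in 64 bits */
    uint64_t t = (uint64_t)Al * xu + C.l;
    y.l = (uint32_t)t;

    /* upper 64 bits, all arithmetic modulo 2^64 */
    uint64_t u = ((uint64_t)(uint32_t)C.h << 32) | C.m;
    u += t >> 32;                           /* carry word from the low product */
    u -= (uint64_t)(Al & mask);             /* undo reading negative x as unsigned */
    u += (uint64_t)((int64_t)Ah * x);       /* signed high product (macs-like) */

    y.m = (uint32_t)u;
    y.h = (int32_t)(uint32_t)(u >> 32);
    return y;
}
```

For example, A = 1, x = -1, C = 0 gives all three words equal to 0xFFFFFFFF, which is -1 in 96 bits. Whether this maps to fewer instructions than the if-else version on the XCore would have to be checked in the simulator.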