## LMUL and signed multiplication

lilltroll

What is the "smart way" to use LMUL?
For example, say that I would like to calculate

signed_int128 * signed_int128 -> signed_int128

or a 128-bit MAC

signed_int64 * signed_int64 + signed_int128 -> signed_int128 as fast as possible.

Using the LMUL macro in XC, how can I write a fast loop that multiplies integers of any length stored in some type of array or struct?

I checked the simulator output in debugger mode.

long long A,B,C;

A*B
A*B+C

it starts with LMUL and LADD

but it uses
MUL
MUL

calculating the most significant part. Could it use MAC or LMUL instead, or is there some magic with the sign?
Somehow it handles -1 * 1 = -1 and -1 * -1 = 1, which I find very nice, and you can see the magic happen by watching the registers while stepping through the instructions in the simulator.

I understand the thing with circular math, making the ADD instruction functional both for signed and unsigned - but I haven't understood the math behind signed and unsigned multiplication.
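
A note on the math, as far as I can work it out: modulo 2^64, a two's-complement product has the same bit pattern whether the operands are read as signed or unsigned, and the cross terms are shifted up by 32 bits, so only their low 32 bits survive. That would explain why two plain MULs are enough for the most significant word. The sketch below models this in portable C rather than XC. `mul64_low` is a hypothetical helper name, and the mapping of its steps onto LMUL and MUL is an assumption based on the instruction sequence described above, not something checked against the compiler.

```c
#include <assert.h>
#include <stdint.h>

/* Low 64 bits of a 64x64 product, built from 32-bit limbs.
   The same code serves signed and unsigned operands, because
   two's-complement multiplication modulo 2^64 ignores the sign. */
uint64_t mul64_low(uint64_t a, uint64_t b)
{
    uint32_t al = (uint32_t)a, ah = (uint32_t)(a >> 32);
    uint32_t bl = (uint32_t)b, bh = (uint32_t)(b >> 32);

    /* Full 32x32 -> 64 product of the low limbs (assumed LMUL-like step). */
    uint64_t low = (uint64_t)al * bl;

    /* The cross terms land at bit 32, so only their low 32 bits matter
       (assumed MUL-like steps). ah*bh lands at bit 64 and is dropped. */
    uint32_t cross = ah * bl + al * bh;

    return low + ((uint64_t)cross << 32);
}

/* The two cases from the post: -1 * 1 and -1 * -1. */
void mul64_low_selfcheck(void)
{
    assert((int64_t)mul64_low((uint64_t)-1, 1) == -1);
    assert((int64_t)mul64_low((uint64_t)-1, (uint64_t)-1) == 1);
}
```

The sign only starts to matter once the result is wider than the operands, which is where the signed `macs` form or an explicit correction comes in.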
lilltroll
Making a signed 96-bit MAC:

int64(int Bh,uint Bl) * int32(int Ah) + int96(int Ch, uint Cm,uint Cl) => int96(int yh,uint ym,uint yl)

Something like the code below? It uses the macs instruction at the end to handle the signs of A and B.

```
{ym,yl}=lmul(Al,Bh,0,Cl);
asm("ladd %0, %1, %2, %3, %4 " : "=r"(carry), "=r"(ym) : "r"(Cm), "r"(ym), "r"(carry));
{yh,ym}=macs(Ah,Bh,carry,ym);
yh+=Ch;
```
and with an example:

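```
#include <xs1.h>
#include <print.h>

long Ah=0x40000001,Bh=0x40000001,Ch=0x10000001,yh;
unsigned long yl,ym,Al=0,Cm=2,Cl=3;
int carry=0;

int main(){
    // Low partial product: Al*Bh + 0 + Cl, returned as {high word, low word}
    {ym,yl}=lmul(Al,Bh,0,Cl);
    // Intended as ym = Cm + ym + carry-in, with the carry-out left in carry
    asm("ladd %0, %1, %2, %3, %4 " : "=r"(carry), "=r"(ym) : "r"(Cm), "r"(ym), "r"(carry));
    // Signed high partial product: Ah*Bh + (carry:ym)
    {yh,ym}=macs(Ah,Bh,carry,ym);
    yh+=Ch;

    printhexln(yh);
    printhexln(ym);
    printhexln(yl);
    return 0;
}
```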
giving the console output:
20000001 (yh)
80000003 (ym)
3 (yl)

In hex, A * B + C = 10000000 80000001 00000000 + 10000001 00000002 00000003 = 20000001 80000003 00000003
lilltroll
This should be better for a "MAC96". Can I reduce it further by skipping the if-else? Is there a more efficient way to do abs()?

```
unsigned int ym=Cm;
unsigned int yl=Cl;
int yh=Ch;

if(x>=0)
    {ym,yl}=mac(Al,x,ym,yl);
else
    {ym,yl}=mac(Al,-x,ym,yl);
{yh,ym}=macs(Ah,x,Ch,ym);
```
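
One possible way to drop both the if-else and the abs() is a mask-based sign correction. This is a different technique from the snippet above, sketched in portable C rather than XC. The idea: do the low multiply unsigned, then subtract `Al` from the upper part when `x` is negative, because reading a negative `x` as unsigned adds 2^32 to it. The type `i96` and the functions `mac96` and `mac96_selfcheck` are hypothetical names for illustration, and whether this is actually faster on the XS1 has not been measured here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 96-bit container: three 32-bit limbs, sign lives in hi. */
typedef struct {
    int32_t  hi;
    uint32_t mid;
    uint32_t lo;
} i96;

/* Returns c + (ah:al) * x, truncated to 96 bits, without branching on x. */
i96 mac96(i96 c, int32_t ah, uint32_t al, int32_t x)
{
    /* Unsigned 32x32 multiply-accumulate on the low limb; fits in 64 bits. */
    uint64_t u = (uint64_t)al * (uint32_t)x + c.lo;

    /* Reading a negative x as unsigned adds 2^32 to it, so the product
       above is too large by al * 2^32. The mask is all ones when x < 0. */
    uint32_t mask = (uint32_t)0 - (uint32_t)(x < 0);
    uint32_t corr = al & mask;

    /* Upper 64 bits: signed ah * x, plus (c.hi:c.mid), plus the carry word,
       minus the correction. Unsigned arithmetic wraps modulo 2^64. */
    uint64_t upper = (uint64_t)((int64_t)ah * x)
                   + (((uint64_t)(uint32_t)c.hi << 32) | c.mid)
                   + (u >> 32)
                   - corr;

    i96 y;
    y.lo  = (uint32_t)u;
    y.mid = (uint32_t)upper;
    /* This conversion wraps on common compilers (implementation-defined before C23). */
    y.hi  = (int32_t)(uint32_t)(upper >> 32);
    return y;
}

/* Checks against the worked example earlier in the thread and a negative case. */
void mac96_selfcheck(void)
{
    i96 c = { 0x10000001, 2u, 3u };
    i96 y = mac96(c, 0x40000001, 0u, 0x40000001);
    assert(y.hi == 0x20000001 && y.mid == 0x80000003u && y.lo == 3u);

    i96 zero = { 0, 0u, 0u };
    y = mac96(zero, 0, 5u, -3);   /* 5 * -3 = -15 */
    assert(y.hi == -1 && y.mid == 0xFFFFFFFFu && y.lo == 0xFFFFFFF1u);
}
```

In XC terms, the low step would correspond to an unsigned `mac`, and the upper step to `macs` plus one masked subtract, though that mapping is an assumption on my part.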
