For example, say that I would like to calculate

**signed_int128** * **signed_int128** -> **signed_int128**

or a 128-bit MAC

**signed_int64** * **signed_int64** + **signed_int128** -> **signed_int128**

as fast as possible.

Using the LMUL macro in XC, how can I write a fast loop that multiplies integers of any length stored in some type of array or struct?
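To make the question concrete, here is a minimal plain-C sketch of the kind of loop I have in mind: schoolbook multiplication over 32-bit words, one LMUL-style step per word pair. The helper `lmul_emul` is a hypothetical portable stand-in that computes a*b + c + d and splits it into a high and a low word, which is how I understand LMUL to behave; I believe XC exposes the real instruction as `lmul` in `xs1.h`, but treat that as an assumption. The names `mp_mul` and `lmul_emul` are mine, and the operands are treated as unsigned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical portable stand-in for an LMUL-style step: a*b + c + d,
   split into high and low 32-bit words. The sum always fits in 64 bits:
   (2^32-1)^2 + 2*(2^32-1) = 2^64 - 1. */
void lmul_emul(uint32_t a, uint32_t b, uint32_t c, uint32_t d,
               uint32_t *hi, uint32_t *lo)
{
    uint64_t r = (uint64_t)a * b + c + d;
    *hi = (uint32_t)(r >> 32);
    *lo = (uint32_t)r;
}

/* r[0 .. na+nb-1] = a[0 .. na-1] * b[0 .. nb-1]
   Words are stored least significant first; operands are unsigned.
   r must not overlap a or b. */
void mp_mul(uint32_t *r, const uint32_t *a, size_t na,
            const uint32_t *b, size_t nb)
{
    assert(na > 0 && nb > 0);

    for (size_t i = 0; i < na + nb; i++)
        r[i] = 0;

    for (size_t i = 0; i < na; i++) {
        uint32_t carry = 0;
        for (size_t j = 0; j < nb; j++) {
            /* product + word already in the result + carry from the
               previous column, all in one step */
            lmul_emul(a[i], b[j], r[i + j], carry, &carry, &r[i + j]);
        }
        r[i + nb] = carry;  /* this word has not been written before */
    }
}
```

With `na = nb = 4` this gives a 128 x 128 -> 256 bit unsigned product. If only the low 128 bits are kept, the same words come out for two's complement signed inputs, so the inner loop could stop at the fourth result word in that case.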

I checked the simulator output in debugger mode.

long long A,B,C;

A*B

A*B+C

It starts with LMUL and LADD, but then it uses

MUL

ADD

MUL

ADD

when calculating the most significant part. Could it use MAC or LMUL instead, or is there some magic with the sign?
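One likely explanation for the MUL/ADD pairs, sketched in plain C: when a 64 x 64 product is truncated to 64 bits, only the low x low term needs a full 32 x 32 -> 64 multiply. The two cross terms are shifted up by 32 bits, so only their low 32 bits survive, and a truncating multiply is enough. This is my reading of the arithmetic, not a statement about what the compiler actually does; the function name `mul64_from_halves` is made up for the illustration.

```c
#include <stdint.h>

/* 64 x 64 -> 64 bit product built from 32-bit halves.
   a_lo*b_lo : full 64-bit product needed (LMUL-style).
   a_lo*b_hi, a_hi*b_lo : shifted left by 32, so only their low 32 bits
                          land in the result (MUL-style, then ADD).
   a_hi*b_hi : shifted left by 64, drops out completely. */
uint64_t mul64_from_halves(uint64_t a, uint64_t b)
{
    uint32_t a_lo = (uint32_t)a, a_hi = (uint32_t)(a >> 32);
    uint32_t b_lo = (uint32_t)b, b_hi = (uint32_t)(b >> 32);

    uint64_t low = (uint64_t)a_lo * b_lo;  /* full product */
    uint32_t hi  = (uint32_t)(low >> 32);

    hi += a_lo * b_hi;  /* truncating multiply, wraps mod 2^32 */
    hi += a_hi * b_lo;  /* truncating multiply, wraps mod 2^32 */

    return ((uint64_t)hi << 32) | (uint32_t)low;
}
```

No sign handling appears anywhere: passing the bit patterns of -1 and 1 returns the bit pattern of -1, and -1 times -1 returns 1, because the truncated result is the same for signed and unsigned two's complement operands.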

Somehow it handles -1 * 1 = -1 and -1 * -1 = 1, which I find very nice :mrgreen:, and you can watch the magic happen by looking at the registers while stepping through the instructions in the simulator.

I understand the circular (wrap-around) math that makes the ADD instruction work for both signed and unsigned values, but I haven't understood the math behind signed versus unsigned multiplication.
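As far as I can work out, the difference only shows up in the upper half of a full-width product. A small C sketch of that idea, assuming two's complement and 32-bit words (the function name `signed_hi_from_unsigned` is hypothetical): a negative word a stands for u(a) - 2^32, where u(a) is the same bits read as unsigned, and multiplying that out leaves a correction on the high word only.

```c
#include <stdint.h>

/* High 32 bits of the signed 32 x 32 -> 64 product, computed from the
   unsigned product. With neg(x) = 1 if the top bit of x is set, else 0:

     s(a)*s(b) = u(a)*u(b)
               - 2^32 * (neg(a)*u(b) + neg(b)*u(a))
               + 2^64 * neg(a)*neg(b)

   Modulo 2^32 everything but u(a)*u(b) vanishes, so the low word is
   identical for signed and unsigned. Modulo 2^64 the last term vanishes,
   and the middle term becomes a subtraction on the high word. */
uint32_t signed_hi_from_unsigned(uint32_t a, uint32_t b)
{
    uint64_t p  = (uint64_t)a * b;        /* unsigned full product */
    uint32_t hi = (uint32_t)(p >> 32);

    if (a & 0x80000000u) hi -= b;         /* a negative: subtract b */
    if (b & 0x80000000u) hi -= a;         /* b negative: subtract a */

    return hi;                            /* wraps mod 2^32 */
}
```

For the bit patterns of -1 and 1 this returns 0xFFFFFFFF, the upper word of -1; for -1 and -1 it returns 0, the upper word of 1. So a result truncated to the operand width needs no sign logic at all, and only a widening multiply has to care whether the inputs are signed.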