In short: in dual issue mode, the processor can execute two instructions truly concurrently, potentially doubling your performance.
An XS2 has two lanes: the memory lane can execute all memory instructions,
branches, and basic arithmetic, and the resource lane can execute all resource
instructions and basic arithmetic. Each thread can choose to execute in dual issue
mode, in which case the processor will execute two 16-bit instructions or a single
32-bit instruction in a single thread cycle. In dual issue mode, all instructions
must be aligned: 32-bit instructions must be 32-bit aligned and pairs of 16-bit
instructions must be aligned on a 32-bit boundary. The program counter is
always aligned to a 32-bit boundary and points to an issue slot rather than to
an individual instruction. [...]
Where two instructions are executed simultaneously, any destination operands
should be disjoint. If they are not disjoint, an exception will be raised.
When the resource lane stalls a thread, the other lane will be stalled also. This is
normally not observable, except when an interrupt or an exception is raised. On
an interrupt or exception, no registers will be overwritten, and the PC will point to
the instruction to be reexecuted.
If an instruction in one of the two lanes causes an exception, then this exception is
reported. If the other lane is executing an instruction then this second instruction
is aborted. If the instructions in both lanes cause an exception, then only one
exception is reported, and both instructions are aborted, but any memory store
which is in progress will complete. On an exception, the saved PC value is set to
the instruction that caused the exception.
[...]
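To make the instruction alignment rule concrete for myself, I wrote a minimal C sketch of how I understand it: every issue slot starts on a 32-bit boundary, so the program counter always has its two lowest bits cleared. The helper names `is_word_aligned` and `issue_slot_base` are my own and do not come from the XMOS documentation; this only models the address arithmetic, not the processor.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers, named by me for illustration only. */

/* True if addr lies on a 32-bit (4-byte) boundary. */
static inline bool is_word_aligned(uint32_t addr) {
    return (addr & UINT32_C(0x3)) == 0;
}

/* Start address of the 32-bit issue slot that contains addr.
 * Assumption: in dual issue mode the PC only ever holds such values. */
static inline uint32_t issue_slot_base(uint32_t addr) {
    return addr & ~UINT32_C(0x3);
}

/* Small self-check of the arithmetic above. */
static inline void issue_slot_self_check(void) {
    assert(is_word_aligned(0x1000));        /* slot start */
    assert(!is_word_aligned(0x1002));       /* second 16-bit instruction of a pair */
    assert(issue_slot_base(0x1002) == 0x1000);
}
```

Under this reading, a pair of 16-bit instructions at 0x1000 and 0x1002 shares one issue slot, and the PC would report 0x1000 for both.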
I think the details above are very important when you are writing assembler code. When you are writing XC or C code, the compiler takes care of these details for you. You can switch between the two modes per function with [[dual_issue]] and [[single_issue]]. Simply enabling optimization also makes the compiler use dual issue mode.
The instructions that can be issued on each lane are listed on pp. 286 to 289. The load instructions, for example, process 8-, 16- or 32-bit values. The XS2 is a 32-bit MCU, so it is not surprising that the largest loadable value is 32 bits.
My questions:
============
1) Some threads here on the forum claim that in dual issue mode the memory of arrays and structs must be aligned to 64 bits, otherwise unpredictable things happen. Is this really true? I have not found this restriction anywhere in the official XMOS documents so far, and I think it would also contradict the XS2 architecture.
2) If it is true, is there a short code example showing that unpredictable things happen when the 64-bit alignment is ignored?
3) And why must it be 64 bit and not 32 bit?
4) Why do you need alignment at all?
5) Are there other data types that have to be aligned?
6) What exactly are the "unpredictable things" that can happen?
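
For reference, this is how I would force and check 64-bit alignment of a buffer in plain C11. It is only a sketch under my own assumptions: the function names are hypothetical, it is not XMOS-specific, and it relies on the usual (implementation-defined) behaviour that the low bits of a pointer converted to `uintptr_t` reflect its alignment. It does not reproduce any misbehaviour; it only shows what "aligned to 64 bit" versus "aligned to 32 bit" means for an array.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if p is aligned to `alignment` bytes.
 * Assumption: alignment is a power of two. */
static inline bool is_aligned(const void *p, size_t alignment) {
    return ((uintptr_t)p & (alignment - 1)) == 0;
}

/* A word buffer explicitly placed on a 64-bit (8-byte) boundary. */
static inline const uint32_t *aligned_words(void) {
    static alignas(8) uint32_t buf[4];
    return buf;
}

/* The buffer start is 64-bit aligned because of alignas(8). */
static inline bool first_word_is_64bit_aligned(void) {
    return is_aligned(aligned_words(), 8);
}

/* The second word is 4 bytes further on: 32-bit aligned, not 64-bit aligned. */
static inline bool second_word_is_only_32bit_aligned(void) {
    const uint32_t *w = aligned_words() + 1;
    return is_aligned(w, 4) && !is_aligned(w, 8);
}
```

Without the `alignas(8)`, a `uint32_t` array is only guaranteed 32-bit alignment, so whether it also happens to sit on a 64-bit boundary depends on where the compiler and linker place it.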