The current architecture is:
- tile[2] 2xTDM16 I/O (1-bit ports), 2xI2S output (4-bit port) (<-> shared memory)
- tile[2] 4x submixer tasks (shared memory -> channel)
- tile[2] summing/output mixer task feeding I/O task (channel -> shared memory)
- tile[3] 2xTDM16 I/O (-> shared memory)
- tile[3] 4x submixer tasks (shared memory -> channel)
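To make the signal flow concrete, here is a rough plain-C sketch of the per-frame arithmetic in a submixer task and in the summing mixer. It is only an illustration under assumptions that are not part of my actual setup: 32-bit samples, Q1.31 gains, and four TDM slots per submixer. The names `submix`, `sum_bus`, `SUBMIX_INPUTS` and `saturate32` are hypothetical, and ordinary function arguments stand in for the shared memory and channels.

```c
#include <stdint.h>

#define TDM_CHANNELS  16  /* slots in one TDM16 frame */
#define SUBMIX_INPUTS 4   /* assumption: slots handled by one submixer */

/* Clamp a 64-bit accumulator to the 32-bit sample range. */
static int32_t saturate32(int64_t x)
{
    if (x > INT32_MAX) return INT32_MAX;
    if (x < INT32_MIN) return INT32_MIN;
    return (int32_t)x;
}

/* Hypothetical submixer: weighted sum of SUBMIX_INPUTS consecutive
 * slots of one frame, starting at first_slot. Gains are Q1.31. */
int32_t submix(const int32_t frame[TDM_CHANNELS], int first_slot,
               const int32_t gain_q31[SUBMIX_INPUTS])
{
    int64_t acc = 0;
    for (int i = 0; i < SUBMIX_INPUTS; i++) {
        /* Scale each product back to sample range before accumulating,
         * so the 64-bit accumulator cannot overflow. */
        int64_t product = (int64_t)frame[first_slot + i] * gain_q31[i];
        acc += product / (INT64_C(1) << 31);
    }
    return saturate32(acc);
}

/* Hypothetical summing mixer: adds the partial sums received from
 * the submixers and saturates the result. */
int32_t sum_bus(const int32_t *partials, int count)
{
    int64_t acc = 0;
    for (int i = 0; i < count; i++) {
        acc += partials[i];
    }
    return saturate32(acc);
}
```

In the real design each `submix` call would be its own task reading from shared memory, and the values passed to `sum_bus` would arrive over channels, some of them from the other tile.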
Other arrangements would give better parallelism and less communication overhead (for example, one DAC per tile, optionally mult'ing (splitting) the TDM streams to both tiles, which avoids any cross-tile communication in the signal plane), but at the cost of slightly less flexibility and greater CPU overhead.
I'm wondering whether anyone here who has built something similar has any general advice on these tradeoffs.