Channel limits for I2S

Zip
Junior Member
Posts: 5
Joined: Fri Oct 24, 2025 9:24 pm

Post by Zip »

I've been reading the manual for lib_i2s and I'm kind of confused about the resource usage limits. Looking in table 4, I see that with an MCLK of 24.576 MHz and 32-bit words, MAX IN and MAX OUT are both 1. Does that mean that I can only have one input data port and one output data port? I was hoping I could use XMOS chips to build a serious multichannel interface with 24 input channels and 16 output channels. That means 12 input ports and 8 output ports.
Ross
Verified
XCore Legend
Posts: 1312
Joined: Thu Dec 10, 2009 9:20 pm
Location: Bristol, UK

Post by Ross »

These values do look rather pessimistic and the table is a little hard to interpret, IMO. I will chase this internally for an update.

Though, for those kinds of channel counts we'd typically expect to use a TDM interface.

Are you just wanting to do I2S -> I2S or USB -> I2S?
Technical Director @ XMOS. Opinions expressed are my own
infiniteimprobability
Verified
XCore Legend
Posts: 1180
Joined: Thu May 27, 2010 10:08 am

Post by infiniteimprobability »

Zip wrote: Fri Oct 24, 2025 11:22 pm I've been reading the manual for lib_i2s and I'm kind of confused about the resource usage limits. Looking in table 4, I see that with an MCLK of 24.576 MHz and 32-bit words, MAX IN and MAX OUT are both 1. Does that mean that I can only have one input data port and one output data port? I was hoping I could use XMOS chips to build a serious multichannel interface with 24 input channels and 16 output channels. That means 12 input ports and 8 output ports.
I agree with Ross on this one - it's (very) pessimistic documentation. The "old" I2S (non-master, now deprecated) did struggle at high sample rates due to having too many callbacks and not being optimised for best use of port buffers.
Since version 6.0.0 the "frame based" API is the default (see changelog). I suspect the docs didn't get updated (our bad).

We actually test 8ch in, 8ch out at 192kHz as part of the test params, so it looks like the docs are lagging. We do this on a fully loaded chip, so we know that I2S gets at most 1/8 of the core MIPS. Real world performance will be better.

Apologies for this and you were right to post a query about this. An issue has been raised to address the docs so it won't get lost - https://github.com/xmos/lib_i2s/issues/159

Internally, lib_i2s does lots of port activity at the start of each frame, loads up the output buffers and empties the input buffers. So you have the majority of the frame period (1/LRCLK) to execute the callbacks which exchange samples and check for restart.
The only things that affect performance are: number of IO (total number of input and output ports), how quickly the callbacks return and the LRCLK rate. Master clock rate is a don't care because we use internal hardware dividers to generate the bit clock.
32b I2S is slightly more efficient than 24/16b because we don't need to pre-set the port buffer count.

We also test for callback back-pressure up to 384 kHz for 6 in 6 out, which is roughly where it runs out of steam. So 12 channels total at 384kHz for a fully loaded chip.

So you can definitely have 8 in 8 out at 192kHz with 62.5MIPS threads (worst case). You could likely double that with a few more MIPS (have say max 5 threads so each one gets 120MIPS) and short callbacks. Halve the max sample rate and you can double the channel counts as a rule of thumb.
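
To put rough numbers on those thread speeds, here is a minimal sketch (not taken from lib_i2s) that converts a thread's MIPS into an instruction budget per LRCLK period. The helper name `instructions_per_frame` is made up for illustration, and the budget has to cover both the library's port handling and your callbacks; how it splits between the two isn't stated here.

```python
def instructions_per_frame(thread_mips: float, lrclk_hz: int) -> float:
    """Instructions one thread can issue during a single LRCLK period (1/LRCLK)."""
    return thread_mips * 1_000_000 / lrclk_hz


# 62.5 MIPS is the worst case quoted above, 120 MIPS the "max 5 threads" case.
for mips in (62.5, 120.0):
    for rate_hz in (96_000, 192_000, 384_000):
        budget = instructions_per_frame(mips, rate_hz)
        print(f"{mips:5.1f} MIPS @ {rate_hz // 1000:3d} kHz: "
              f"{budget:6.1f} instructions per frame")
```

Halving the sample rate doubles the per-frame budget, which is where the "halve the rate, double the channels" rule of thumb comes from.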

However, since you only have 16 x 1b ports per tile, you need MCLK, LRCLK and BCLK, and you only get 2 channels per data line, you will likely be IO limited. There are ways around this:

1) Put an I2S slave on the second tile and get more data lines
2) Use 4b ports for I2S data. This does mean you hit the maximum sample/IO rate sooner, but it might be enough.

What is your desired sample rate?
Engineer at XMOS
Zip
Junior Member
Posts: 5
Joined: Fri Oct 24, 2025 9:24 pm

Post by Zip »

I was hoping to hit 192 kHz with 24-in, 16-out. This would be straight to and from USB, no processing. According to my math this fits within 3 packets per endpoint per isochronous frame.
upav
Verified
Active Member
Posts: 36
Joined: Wed May 22, 2024 3:30 pm

Post by upav »

Hey Zip,

If by packets, you mean the number of USB transactions per frame,
then we only support 2 transactions per isochronous frame, I'm afraid.
Error handling for high-bandwidth endpoints drives lib_xua to the edge of timing;
2 packets is the max we were able to get without seriously rewriting lib_xud/lib_xua and compromising the current performance.

Cheers,
Pavel
xmos software engineer
Zip
Junior Member
Posts: 5
Joined: Fri Oct 24, 2025 9:24 pm

Post by Zip »

upav wrote: Mon Nov 03, 2025 4:17 pm If by packets, you mean the number of USB transactions per frame,
then we only support 2 transactions per isochronous frame, I'm afraid.
Error handling for high-bandwidth endpoint drives lib_xua to the edge of timing,
2 packets is the max we were able to get without seriously rewriting lib_xud/lib_xua and compromising the current performance.
Yes I meant transactions. Ross seemed to imply in this post that you guys are upping the transactions from 2 to 3, or did I misread this?

Alternatively, 96 kHz would also be reasonable. In that case it would be nice if 32-in plus 8-out was also an option. This should fit within 2 transactions.
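
The transaction counts can be checked with a quick sketch. It assumes 32-bit (4-byte) sample slots, 1024-byte transactions and 8000 microframes per second, and it ignores any extra sample per packet that an asynchronous endpoint might need; `transactions_needed` is a made-up helper, not a lib_xua function.

```python
import math

MICROFRAMES_PER_SEC = 8000   # high-speed USB microframe rate
MAX_PACKET_BYTES = 1024      # assumed max payload per isochronous transaction
BYTES_PER_SAMPLE = 4         # assumes 32-bit sample slots


def transactions_needed(channels: int, sample_rate_hz: int) -> int:
    """Transactions per microframe needed to carry one direction of audio."""
    samples_per_microframe = sample_rate_hz / MICROFRAMES_PER_SEC
    payload_bytes = channels * BYTES_PER_SAMPLE * samples_per_microframe
    return math.ceil(payload_bytes / MAX_PACKET_BYTES)


print(transactions_needed(24, 192_000))  # 3
print(transactions_needed(16, 192_000))  # 2
print(transactions_needed(32, 96_000))   # 2
print(transactions_needed(8, 96_000))    # 1
```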
infiniteimprobability
Verified
XCore Legend
Posts: 1180
Joined: Thu May 27, 2010 10:08 am

Post by infiniteimprobability »

Pavel is right about the 2 transactions per microframe. It took quite a lot of work to get that going (some very carefully crafted dual-issue assembler and testing), and it also requires an 800MHz part. On that note, the TQFP128 only supports 600MHz so you'll need the BGA265, which can be ordered in the 800MHz speed grade.

So given we support two transactions, each of 1024B, at an 8kHz SoF rate, we get 1024×2×8×8000 = 131072000 bits/s max throughput per ISO endpoint.

Running a lot of channels at high sample rates means that it is likely you'll need to use 32b sample slots since the 24b packing may struggle (Ross may have some extra info on this). So in each direction (in and out) we get:

@96kHz -> 131072000÷96000÷32 = 42.667, or 42 channels theoretical max
@192kHz -> 131072000÷192000÷32 = 21.333, or 21 channels theoretical max

So 24 channels at 192kHz doesn't fit in the USB payload if using 32b slots.
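
The same arithmetic as a small sketch, in case you want to try other rates. The defaults mirror the figures above (2 transactions of 1024B, 8kHz SoF rate, 32b slots); `max_channels` is a made-up helper, and the result is a theoretical ceiling only.

```python
def max_channels(sample_rate_hz: int,
                 slot_bits: int = 32,
                 transactions: int = 2,
                 bytes_per_transaction: int = 1024,
                 microframes_per_sec: int = 8000) -> int:
    """Theoretical max channels per ISO endpoint, rounded down."""
    bits_per_sec = bytes_per_transaction * transactions * 8 * microframes_per_sec
    return bits_per_sec // (sample_rate_hz * slot_bits)


print(max_channels(96_000))   # 42
print(max_channels(192_000))  # 21

# The configurations discussed in this thread:
print(24 <= max_channels(192_000))  # False: 24 channels at 192kHz does not fit
print(32 <= max_channels(96_000))   # True: 32 channels at 96kHz fits
```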

32 channels at 96kHz does fit, and in fact this is a tested configuration (see here) albeit with TDM8 IO where we have 8 channels per data line. So 4 data lines only for one direction as long as the DACs support TDM. This is actually a tougher case than your second requirement so is safe.

8 channels per data line at 96kHz gives a bit clock of 96000 * 32 * 8 = 24.576MHz, which is fine for us.
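
For other formats the bit clock is just the product of frame rate, slot width and slots per frame. A minimal sketch, with `bit_clock_hz` as a made-up helper name:

```python
def bit_clock_hz(lrclk_hz: int, slot_bits: int, slots_per_frame: int) -> int:
    """Bit clock for an I2S/TDM bus: frame rate x slot width x slots per frame."""
    return lrclk_hz * slot_bits * slots_per_frame


tdm8_96k = bit_clock_hz(96_000, 32, 8)    # TDM8: 8 slots per data line
i2s_192k = bit_clock_hz(192_000, 32, 2)   # plain I2S: 2 slots per data line

print(f"TDM8 @ 96kHz: {tdm8_96k / 1e6:.3f} MHz")   # 24.576 MHz
print(f"I2S @ 192kHz: {i2s_192k / 1e6:.3f} MHz")   # 12.288 MHz
```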

Note also that USB audio uses its own implementation of I2S/TDM, but it's pretty efficient and will match or possibly exceed what can be done in lib_i2s.

Also potentially useful, we have an example of USB audio where an additional I2S slave interface has been added to get the IO count up here.
Engineer at XMOS
Zip
Junior Member
Posts: 5
Joined: Fri Oct 24, 2025 9:24 pm

Post by Zip »

TDM should be workable. I was hoping 4-line I2S per ADC/DAC was also possible, is this really not possible?
infiniteimprobability
Verified
XCore Legend
Posts: 1180
Joined: Thu May 27, 2010 10:08 am

Post by infiniteimprobability »

Have a look at the port map, which is great for IO planning. You get max 16 x 1b ports per tile and 2 channels per I2S line. However, you also need MCLK, BCLK and LRCLK. So max 13 x 1b ports per tile for I2S data -> 26 channels.
You can put I2C (config etc.) on a 4b port, which can help. But you'd need a second, slave, I2S interface (see the example link in my previous post on how to do this - the extra_I2S build) to get the channel count up to your desired number. Say you wanted 32 + 8 channels: that's 20 x 1b ports for data, plus clocks for each tile, so it does sound possible.
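
A rough sketch of that port budget, using only the numbers above (16 x 1b ports per tile, 3 of them for clocks, 2 channels per I2S line). The helper names are made up, and it only counts 1b ports, so check the port map for which pins are actually free on your package.

```python
import math

ONE_BIT_PORTS_PER_TILE = 16
CLOCK_PORTS = 3  # MCLK, BCLK, LRCLK


def data_lines_needed(channels: int, channels_per_line: int = 2) -> int:
    """1b data ports needed; 2 channels per line for I2S, 8 for TDM8."""
    return math.ceil(channels / channels_per_line)


def fits_on_one_tile(data_lines: int) -> bool:
    """True if the data lines plus the three clocks fit in one tile's 1b ports."""
    return data_lines + CLOCK_PORTS <= ONE_BIT_PORTS_PER_TILE


i2s_lines = data_lines_needed(32) + data_lines_needed(8)         # 16 + 4 = 20
tdm_lines = data_lines_needed(32, 8) + data_lines_needed(8, 8)   # 4 + 1 = 5

print(i2s_lines, fits_on_one_tile(i2s_lines))  # 20 False
print(tdm_lines, fits_on_one_tile(tdm_lines))  # 5 True
```

With plain I2S the 32 + 8 case needs the second tile; with TDM8 the same channel count would fit in one tile's 1b ports.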
Engineer at XMOS