time delay about xCORE VOCAL FUSION SPEAKER algorithm

Posted: Wed Jan 03, 2018 1:52 am
by andy wong
Hi,
I have just started using XMOS. I have a problem with the 3100: when I build the 1i0o0_I2S_ONLY_48KHz_Cir43 configuration, the delay from the microphone board input to the speaker output is more than 50ms. My guess is that this delay comes from the far-end processing (beamforming, AEC, AGC, etc.). Is that correct? If so, can I turn off some functions such as AEC, since mine is on-site equipment?
I would be very grateful if anyone could answer my question.
Wish you a Happy New Year!

Re: time delay about xCORE VOCAL FUSION SPEAKER algorithm

Posted: Fri Jan 05, 2018 10:55 am
by infiniteimprobability
Hi,
The total delay for all processing blocks is around 170 milliseconds, or 2750ish samples. The block size is 256 and there are 4 such buffers internally, which is 64ms before we add processing time, which accounts for the remainder. There are a lot of processing stages, as well as time-to-frequency domain conversions, and this inherently adds latency.
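The buffering arithmetic in this post can be checked with a short sketch. The 16 kHz sample rate is an assumption: the post does not state it, but it is the rate at which 4 x 256 samples comes out at 64ms. The names below are illustrative and are not part of any XMOS library. A later post in this thread corrects the total latency figure, so this only illustrates how samples map to milliseconds.

```python
# Assumption: 16 kHz processing rate (not stated in the post; inferred
# from 4 * 256 samples being quoted as 64 ms).
SAMPLE_RATE_HZ = 16_000
BLOCK_SIZE = 256      # samples per block, from the post
NUM_BUFFERS = 4       # internal buffers, from the post


def samples_to_ms(samples, rate_hz=SAMPLE_RATE_HZ):
    """Convert a sample count to milliseconds at the given sample rate."""
    return 1000.0 * samples / rate_hz


buffer_samples = BLOCK_SIZE * NUM_BUFFERS
buffer_ms = samples_to_ms(buffer_samples)

print(f"{buffer_samples} buffered samples = {buffer_ms} ms at {SAMPLE_RATE_HZ} Hz")
```

Under the same assumption, one block of 256 samples corresponds to 16ms, so each additional buffer stage adds that much delay.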

The AEC and BAP blocks (BAP contains the beamformer, NS, AGC, other filtering, etc.) are designed to work hand in hand. It is not easy to separate them (it's not a simple case of forwarding echo-removed samples), and although I am sure it is possible with enough work, it is not supported or recommended.

Changing the block size is also not supported and has not been tested - all AVS and other ASR tests have been done in the current configuration. It may not even work due to the block size assumptions.

I assume this also addresses your other post:
http://www.xcore.com/viewtopic.php?f=3&t=6297

Re: time delay about xCORE VOCAL FUSION SPEAKER algorithm

Posted: Tue Jan 09, 2018 3:31 am
by andy wong
My friend, thank you very much~

Since you say that separating AEC and BAP is possible, I'm going to give it a try. Can you give me some suggestions on how to get started? My current approach is to try changing SMARTHOME.

You can ignore my other post. I'm sorry to bother you.

Re: time delay about xCORE VOCAL FUSION SPEAKER algorithm

Posted: Fri May 25, 2018 11:05 am
by infiniteimprobability
CORRECTION

The latency of the VocalFusion processing is actually on the order of 50ms. Apologies to anyone who took the incorrect number above - my mistake.
Even adding lib_mic_array (1.125ms), some upsampling to 48k and USB full speed delays, the mic-to-host latency will be comfortably below 60ms.
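As a rough illustration of that budget, the sketch below adds up the two figures given in this post and shows what is left under the 60ms bound for upsampling and USB. Those two delays are not quantified in the post, so they are deliberately left out rather than guessed. The 50ms value is only "on the order of", and the function name is hypothetical.

```python
VOCALFUSION_MS = 50.0   # approximate processing latency, from the post
MIC_ARRAY_MS = 1.125    # lib_mic_array delay, from the post
TARGET_MS = 60.0        # upper bound quoted in the post


def remaining_budget_ms(target_ms, *stage_ms):
    """Return how much of the target latency is left after the given stages."""
    return target_ms - sum(stage_ms)


known_ms = VOCALFUSION_MS + MIC_ARRAY_MS
headroom_ms = remaining_budget_ms(TARGET_MS, VOCALFUSION_MS, MIC_ARRAY_MS)

print(f"known stages: {known_ms} ms, headroom under {TARGET_MS} ms: {headroom_ms} ms")
```

With these numbers, the upsampling and USB full speed stages together would need to stay under roughly 9ms for the total to remain below 60ms.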