Moving weights to external RAM

Discussions relating to machine learning and XMOS
andy-aic
Member++
Posts: 26
Joined: Thu Jun 27, 2024 3:38 pm

Moving weights to external RAM

Post by andy-aic »

Hi,

I've opened an issue on GitHub about this (https://github.com/xmos/ai_tools/issues/903) but it doesn't look like I'll be getting feedback anytime soon, so I figured I should try here.

I'm trying to run a model on the xcore.ai EVK, but according to the execution profile our models are heavily I/O-bound:

Code:

Cumulative times for invoke()...
53    OP_XC_ld_flash                   2659902      26.60ms
14    OP_XC_conv2d_v2                  80202        0.80ms
4     OP_UNPACK                        3227         0.03ms
4     OP_SPLIT                         9783         0.10ms
20    OP_XC_add                        12066        0.12ms
21    OP_XC_lookup                     12327        0.12ms
12    OP_XC_mul                        6970         0.07ms
3     OP_RESHAPE                       508          0.01ms
I figure we could get much better performance if we loaded the model weights into LPDDR1 at initialization rather than reading them from flash during inference. The problem is that this doesn't seem straightforward, or at least I haven't figured out how to do it.
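To make it concrete, here's the kind of init-time flow I'm imagining. This is only a sketch: fl_connect()/fl_readData()/fl_disconnect() come from libquadflash (check the exact header and signatures against your tools version), but the flash offset and size, the ".ExtMem.data" section name, and model_set_weights_ptr() are placeholders I made up; as far as I can tell the generated model code currently expects a flash server, so a hook like that would be part of the implementation work.

Code:

/* Sketch only: copy the weights blob from flash into LPDDR once at
 * startup, then hand the in-RAM pointer to the inference engine instead
 * of letting OP_XC_ld_flash stream weights on every invoke().
 * Assumptions: WEIGHTS_FLASH_OFFSET / WEIGHTS_SIZE are hypothetical,
 * ".ExtMem.data" depends on your linker script, and
 * model_set_weights_ptr() does not exist in ai_tools today. */
#include <quadflash.h>   /* libquadflash; header name may vary by tools version */

#define WEIGHTS_FLASH_OFFSET 0x100000u  /* hypothetical flash offset */
#define WEIGHTS_SIZE         0x200000u  /* hypothetical size: 2 MiB */

/* Buffer placed in external LPDDR via a linker section attribute. */
__attribute__((section(".ExtMem.data")))
static unsigned char weights_ddr[WEIGHTS_SIZE];

int load_weights_to_ddr(fl_QSPIPorts *qspi_ports)
{
    if (fl_connect(qspi_ports) != 0)
        return -1;                      /* flash not reachable */

    /* One bulk read at init instead of many small reads per inference. */
    if (fl_readData(WEIGHTS_FLASH_OFFSET, WEIGHTS_SIZE, weights_ddr) != 0) {
        fl_disconnect();
        return -1;
    }
    fl_disconnect();

    /* Hypothetical hook: point the runtime at the in-RAM copy.
     * model_set_weights_ptr(weights_ddr); */
    return 0;
}

If there's already a supported path for this (a compiler option or a different model init variant), that would obviously be preferable to hand-rolling it.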

I'd appreciate it if someone could point me in the right direction. If some implementation work is needed, I'm happy to contribute it to the ai_tools repo and docs myself.

Thanks for your help :)
Ross
Verified
XCore Legend
Posts: 1071
Joined: Thu Dec 10, 2009 9:20 pm
Location: Bristol, UK

Post by Ross »

Mod edit: Moved to "Machine Learning & AI"
Technical Director @ XMOS. Opinions expressed are my own