Moving weights to external RAM

Discussions relating to machine learning and XMOS
andy-aic
Member++
Posts: 26
Joined: Thu Jun 27, 2024 3:38 pm

Moving weights to external RAM

Post by andy-aic »

Hi,

I've opened an issue on GitHub about this (https://github.com/xmos/ai_tools/issues/903) but it doesn't look like I'll be getting feedback anytime soon, so I figured I should try here.

I'm trying to run a model on the xcore.ai EVK, but according to the execution profile our models are heavily I/O-bound:

Code:

Cumulative times for invoke()...
53    OP_XC_ld_flash                   2659902      26.60ms
14    OP_XC_conv2d_v2                  80202        0.80ms
4     OP_UNPACK                        3227         0.03ms
4     OP_SPLIT                         9783         0.10ms
20    OP_XC_add                        12066        0.12ms
21    OP_XC_lookup                     12327        0.12ms
12    OP_XC_mul                        6970         0.07ms
3     OP_RESHAPE                       508          0.01ms
I figure we could get much better performance if we loaded the model weights into LPDDR1 at initialization rather than reading them from flash during inference. The problem is that this doesn't seem straightforward, or at least I haven't figured out how to do it.
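To make it concrete, here's the kind of init-time flow I'm imagining. This is only a sketch: fl_connect()/fl_readData()/fl_disconnect() come from libquadflash (check the exact header and signatures against your tools version), but the flash offset and size, the ".ExtMem.data" section name, and model_set_weights_ptr() are placeholders I made up; as far as I can tell the generated model code currently expects a flash server, so a hook like that would be part of the implementation work.

Code:

/* Sketch only: copy the weights blob from flash into LPDDR once at
 * startup, then hand the in-RAM pointer to the inference engine instead
 * of letting OP_XC_ld_flash stream weights on every invoke().
 * Assumptions: WEIGHTS_FLASH_OFFSET / WEIGHTS_SIZE are hypothetical,
 * ".ExtMem.data" depends on your linker script, and
 * model_set_weights_ptr() does not exist in ai_tools today. */
#include <quadflash.h>   /* libquadflash; header name may vary by tools version */

#define WEIGHTS_FLASH_OFFSET 0x100000u  /* hypothetical flash offset */
#define WEIGHTS_SIZE         0x200000u  /* hypothetical size: 2 MiB */

/* Buffer placed in external LPDDR via a linker section attribute. */
__attribute__((section(".ExtMem.data")))
static unsigned char weights_ddr[WEIGHTS_SIZE];

int load_weights_to_ddr(fl_QSPIPorts *qspi_ports)
{
    if (fl_connect(qspi_ports) != 0)
        return -1;                      /* flash not reachable */

    /* One bulk read at init instead of many small reads per inference. */
    if (fl_readData(WEIGHTS_FLASH_OFFSET, WEIGHTS_SIZE, weights_ddr) != 0) {
        fl_disconnect();
        return -1;
    }
    fl_disconnect();

    /* Hypothetical hook: point the runtime at the in-RAM copy.
     * model_set_weights_ptr(weights_ddr); */
    return 0;
}

If there's already a supported path for this (a compiler option or a different model init variant), that would obviously be preferable to hand-rolling it.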

I'd appreciate it if someone could point me in the right direction. If some implementation work is needed, I'm happy to contribute it to the ai_tools repo and docs myself.

Thanks for your help :)
Ross
Verified
XCore Legend
Posts: 1071
Joined: Thu Dec 10, 2009 9:20 pm
Location: Bristol, UK

Post by Ross »

Mod edit: Moved to "Machine Learning & AI"
Technical Director @ XMOS. Opinions expressed are my own