Heater wrote:
But a co-processor to what?
Any host with an XLink interface would do. One with a large FLASH to store the bootable XMOS program image might be useful.
Folknology wrote:
However I am not sure it works in the simpler cases...
The XOSS diagrams depict a single XMOS connected to an FPGA subsystem doing the external memory interface. In systems with few or no XMOS devices networked to that interface you're probably right with respect to cost; the questionable benefit in these cases comes from having the host operating system interface of a XOSS-like design. All system designers need to make a cost estimate, which here would include comparing an XMOS network against an FPGA, and I would hope that for systems with more XMOS devices connected the equation balances more favourably, irrespective of having the host O/S feature.
A simpler solution than XOSS, for a system with few networked XMOS parts, is to directly attach one XMOS processor to a host through an XLink (as in the left-hand side of illustration 3 in XOSS), and have a software protocol for data exchange. The problem then is to design the software to attain the performance required for your app, which is what this thread is about.
These things we can do today (almost).
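To make "a software protocol for data exchange" a bit more concrete, here is a minimal sketch of message framing that could sit on top of a byte-oriented link between host and XMOS. Everything in it is my own assumption for illustration: the frame layout (type, length, payload, XOR checksum) and the names frame_encode and frame_decode are hypothetical, not part of XOSS or of any XMOS library, and the actual XLink send/receive is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FRAME_MAX_PAYLOAD 255u
#define FRAME_OVERHEAD    3u   /* type byte + length byte + checksum byte */

/* XOR of all bytes: a hypothetical integrity check, not an XLink feature. */
static uint8_t frame_checksum(const uint8_t *buf, size_t n)
{
    uint8_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum ^= buf[i];
    return sum;
}

/* Pack one message into out[]. Returns bytes written, or 0 if it does not fit. */
static size_t frame_encode(uint8_t type, const uint8_t *payload, size_t len,
                           uint8_t *out, size_t out_cap)
{
    assert(out != NULL);
    if (len > FRAME_MAX_PAYLOAD || out_cap < len + FRAME_OVERHEAD)
        return 0;
    out[0] = type;
    out[1] = (uint8_t)len;
    if (len > 0)
        memcpy(&out[2], payload, len);
    out[2 + len] = frame_checksum(out, 2 + len);
    return len + FRAME_OVERHEAD;
}

/* Unpack one frame. Returns 1 on success, 0 if truncated, oversized or corrupt. */
static int frame_decode(const uint8_t *in, size_t n, uint8_t *type,
                        uint8_t *payload, size_t payload_cap, size_t *len)
{
    if (n < FRAME_OVERHEAD)
        return 0;
    size_t plen = in[1];
    if (n < plen + FRAME_OVERHEAD || plen > payload_cap)
        return 0;
    if (frame_checksum(in, 2 + plen) != in[2 + plen])
        return 0;
    *type = in[0];
    *len = plen;
    if (plen > 0)
        memcpy(payload, &in[2], plen);
    return 1;
}
```

The host would call frame_encode before pushing bytes down the link and the XMOS side would call frame_decode on what arrives (or vice versa). Whether a checksum is worth the cycles on a link like this is a design choice; the point is only that the hard part is the software performance, not the framing.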
I'll gladly be corrected, but I imagine implementing an external address bus in XMOS is not going to be low-cost. Providing the I/O pins is one aspect, but you also need to route the internal memory address and data lines through a crossbar switch, to allow the CPU access to both internal SRAM and external SDRAM at different times. Then you may have interference when a thread accessing some external address temporarily blocks all other threads (badness), so you may want a per-thread level-2 cache too. And you need to modify the CPU hold-off logic to suspend the thread concerned when an external access is required. And as mentioned you either need to drive the SDRAM interface (refresh and stuff), or select an auto-refresh device, and you really should be targeting 400MHz DDR2/3. And there are probably issues of power, timing and other hardware stuff that are beyond my understanding.
Hmm. Instead of a direct EMIF, maybe it would be easier to add an 8-channel (per-thread) DMA device on-chip which interfaces to an external memory on behalf of the CPU when an address in an external range is placed onto the address bus. But then you need a paged memory of sorts, so the CPU/DMA knows how much to load into SRAM and where. And for data you also need to write back to external memory, which needs a cache table. Hmm.
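To show what I mean by a paged memory with a cache table and write-back, here is a small software model. It is only a sketch under my own assumptions: the page size, the number of SRAM slots, the direct-mapped placement and all the names (paged_mem, pm_touch and so on) are invented for illustration, the "external memory" is just an array, and it models one channel rather than eight. It says nothing about how real hardware would do it.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE  64u   /* bytes per page (assumed) */
#define NUM_SLOTS  4u    /* page slots available in internal SRAM (assumed) */
#define EXT_PAGES  32u   /* size of the modelled external memory, in pages */

typedef struct {
    uint8_t  ext[EXT_PAGES * PAGE_SIZE];  /* stands in for external SDRAM */
    uint8_t  sram[NUM_SLOTS][PAGE_SIZE];  /* pages currently held on-chip */
    uint32_t tag[NUM_SLOTS];              /* cache table: which page is in each slot */
    uint8_t  valid[NUM_SLOTS];
    uint8_t  dirty[NUM_SLOTS];            /* slot modified since it was loaded */
    unsigned loads, writebacks;           /* counters for the "DMA" transfers */
} paged_mem;

static void pm_init(paged_mem *m) { memset(m, 0, sizeof *m); }

/* Make sure the page holding addr is in SRAM; return a pointer to the byte. */
static uint8_t *pm_touch(paged_mem *m, uint32_t addr)
{
    assert(addr < EXT_PAGES * PAGE_SIZE);
    uint32_t page = addr / PAGE_SIZE;
    uint32_t slot = page % NUM_SLOTS;     /* direct-mapped placement */
    if (!m->valid[slot] || m->tag[slot] != page) {
        if (m->valid[slot] && m->dirty[slot]) {
            /* Write the evicted page back to external memory first. */
            memcpy(&m->ext[m->tag[slot] * PAGE_SIZE], m->sram[slot], PAGE_SIZE);
            m->writebacks++;
        }
        memcpy(m->sram[slot], &m->ext[page * PAGE_SIZE], PAGE_SIZE);
        m->tag[slot] = page;
        m->valid[slot] = 1;
        m->dirty[slot] = 0;
        m->loads++;
    }
    return &m->sram[slot][addr % PAGE_SIZE];
}

static uint8_t pm_read(paged_mem *m, uint32_t addr)
{
    return *pm_touch(m, addr);
}

static void pm_write(paged_mem *m, uint32_t addr, uint8_t value)
{
    *pm_touch(m, addr) = value;
    m->dirty[(addr / PAGE_SIZE) % NUM_SLOTS] = 1;
}
```

Every miss here costs a whole page transfer, and a dirty eviction costs two, which is the point where the thread concerned would have to be held off while the others carry on.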
I love the idea of running a virtual XS1 core inside the FPGA
Dude, that's your idea. And XOSS' provision of an XLink interface belongs to Ali/Paul.
The link Heater is referring to is ...
Thanks, that adds clarity.
I would be interested in your response to this Julian in terms of making it simple.
Wish I could make it so. Probably what's missing is a good book on Systems and Software Design for the Xcore.