New multicore chips and architecture

XCore Legend
Posts: 1274
Joined: Thu Dec 10, 2009 10:20 pm

Post by Folknology »

For those that haven't noticed yet:


Hope XMOS don't mind me posting about this, but it is interesting. My first thought is: where are the supported parallel/concurrent-equipped languages? Something like an XC equivalent at the low level, and perhaps occam, Go or Erlang at a higher level. I see no such support, just the usual languages. To me that's an Achilles heel: all that horsepower underneath and little to help you exploit it.

What does everyone here think of it?
XCore Addict
Posts: 133
Joined: Tue Dec 15, 2009 10:23 pm

Post by yzoer »

Thanks for sharing!

I think it's an interesting proposition, but aimed at a completely different market than XMOS. Likewise, the folks at (Chuck Moore et al.) provide a chip for $20 that contains 144 cores but is aimed at low power, thanks to being asynchronous. Again, a completely different market.

I think XMOS fits somewhere in the middle. The price point is attractive, as chips cost less than $5 and are easy to work with. Kind of a cross between a low-cost FPGA and a high-end microprocessor.

Parallella seems to talk down GPUs but, as somebody already mentioned in the forum, GPUs have hundreds of cores these days with vastly superior performance. Granted, they're probably not as flexible, but I wouldn't say they're inferior! Floating point on Parallella definitely helps though, and I can see applications that will benefit from that over something like XMOS.

Bit of a ramble, it's early here and my kid wants to play Lego :)

In all, I'll probably pledge $100 and see where it leads...if anything, I like to encourage hardware projects :)

XCore Expert
Posts: 844
Joined: Sun Jul 11, 2010 1:31 am

Post by segher »

[Their documentation is not freely available, so I'm deducing from what _is_
on the web. Ugh.]

800MHz 32-bit core, 64 registers that can hold integer as well as floating
point values. Only single-precision floating point. 32kB of RAM per core,
no cache. No coherency. Every core can access the RAM of every other
core.

At least the assembler language seems to be inspired by ARM.

Interconnect is a single 2-D mesh (not a toroid as far as I have seen).
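
To make the mesh-versus-toroid difference concrete, here is a rough sketch comparing worst-case hop counts. The 4x4 layout is an assumption (16 cores arranged as a square), and the function names are made up for illustration; none of this comes from their documentation.

```python
def mesh_hops(src, dst):
    """Hops between (row, col) nodes on a 2-D mesh with no wraparound."""
    return abs(src[0] - dst[0]) + abs(src[1] - dst[1])


def torus_hops(src, dst, rows, cols):
    """Same, but each dimension may also wrap around its edge."""
    dr = abs(src[0] - dst[0])
    dc = abs(src[1] - dst[1])
    return min(dr, rows - dr) + min(dc, cols - dc)


ROWS = COLS = 4  # assumed 4x4 layout for a 16-core part
corner_a, corner_b = (0, 0), (3, 3)
print(mesh_hops(corner_a, corner_b))               # 6
print(torus_hops(corner_a, corner_b, ROWS, COLS))  # 2
```

So on a plain mesh the corner-to-corner trip is the expensive case, which a toroid would shorten at the cost of wraparound wiring.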

Dual issue for simple integer; everything else is single issue.

Taken branches cause a 2-cycle bubble. Not bad.
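
A back-of-envelope way to see why a 2-cycle bubble is "not bad": the sketch below assumes a base rate of one instruction per cycle and a hypothetical share of taken branches (the 10% figure is an assumption, not a measurement).

```python
def average_cpi(taken_branch_fraction, bubble_cycles=2, base_cpi=1.0):
    """Average cycles per instruction when each taken branch adds a bubble."""
    return base_cpi + taken_branch_fraction * bubble_cycles


# Assumed workload: 1 instruction in 10 is a taken branch.
print(average_cpi(0.10))
```

Under those assumptions the pipeline loses about a fifth of its throughput to branches; a longer bubble scales that loss linearly.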

Respected Member
Posts: 296
Joined: Thu Dec 10, 2009 10:33 pm

Post by Heater »

Parallella chips may implement a lot of parallelism, but that is where any comparison with XMOS devices ends. Parallella is not aimed at the same high-speed real-time applications.

Parallella cores do not have direct or close coupling with the I/O pins like XMOS does. That makes them useless for the "software as silicon" possibilities that the XMOS excels at.

Parallella cores probably have no notion of deterministic timing, timers, events, etc., which takes them further away from the usefulness of an XMOS in the role of "soft hardware".

They seem to be aimed more at compute-intensive applications where parallel processing is used to good effect, at levels from my quad-core Intel box up to supercomputers. In that world there are things like OpenMP, which parallelizes your C code across multiple cores in shared-memory systems, and OpenMPI, which distributes work tasks across network-connected nodes. OpenMP is a standard part of Intel compilers, GCC C/C++ and other compilers, so I expect Parallella to adopt that (just a guess).
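
For anyone who hasn't seen the OpenMP style of work sharing, here is a rough Python sketch of the same pattern: split a loop into contiguous slices, hand each slice to a worker, then combine the partial results (a "parallel for" with a sum reduction). This is not OpenMP and not anything Parallella has announced; the function names are made up, and Python threads only illustrate the partitioning, they won't give a real speed-up on CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk_bounds(n_items, n_workers):
    """Split range(n_items) into n_workers contiguous, near-equal slices."""
    base, extra = divmod(n_items, n_workers)
    bounds, start = [], 0
    for w in range(n_workers):
        stop = start + base + (1 if w < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def partial_sum(data, start, stop):
    """Work done by one worker: sum of squares over its own slice."""
    return sum(x * x for x in data[start:stop])


def parallel_sum_of_squares(data, n_workers=4):
    """Fan the slices out to workers, then reduce the partial sums."""
    bounds = chunk_bounds(len(data), n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        parts = pool.map(lambda b: partial_sum(data, *b), bounds)
        return sum(parts)


data = list(range(1000))
print(parallel_sum_of_squares(data) == sum(x * x for x in data))  # True
```

In OpenMP the slicing and the reduction are done for you by a pragma on the loop, which is why it is attractive for existing C code.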

Interesting devices though, would love to have a play if they are as cheap as they say and the required dev software doesn't cost an arm and a leg.
XCore Expert
Posts: 546
Joined: Thu Dec 10, 2009 10:41 pm
Location: St. Leonards-on-Sea, E. Sussex, UK.

Post by leon_heller »

I pledged $100 when I heard about it a few days ago.
Posts: 31
Joined: Fri Aug 31, 2012 3:42 pm

Post by Carpentier »

I want to know exactly: did you really give 100 USD to support this project?
If the project gets started and your money goes to it, I understand that you will receive a project board containing a preliminary device, maybe not the final version.

These products look very powerful, but are they really available and documented?
Are they really free of bugs?
XCore Expert
Posts: 546
Joined: Thu Dec 10, 2009 10:41 pm
Location: St. Leonards-on-Sea, E. Sussex, UK.

Post by leon_heller »

One only pays the money if the project is fully subscribed.

First chips will have 16 cores. They have been prototyped using FPGAs, but the funding is needed to produce working chips and PCBs.

Here is an update:

Experienced Member
Posts: 67
Joined: Fri Aug 24, 2012 9:37 pm

Post by SpacedCowboy »

I got in on the kickstarter as well. We'll see how it turns out.

Since we're talking about other chips (seriously, is that even allowed? [grin]), another thing that's caught my eye recently is the Blackfin 60x series ( ... oduct.html). These are medium-speed (500MHz) fixed-point DSP chips with dual cores, each core being able to retire two 16-bit MACs per clock, assuming you set them up correctly (DMA into level-1 RAM from SDRAM, DMA out of level-1 RAM to SDRAM, work using level-1 RAM). The C/C++ compiler offered by Analog Devices allows you to set things up like this. There's also a gcc port, but I'm not sure what facilities that offers.

They also have Ethernet MAC (x2), CAN, USB, DMA-driven parallel ports that can parse video etc. (x3), UARTs (x2), TWI, SPI, synchronous serial ports that run at ~80MHz (x3), a DDR SDRAM controller, and (to bring it struggling back to being relevant) 4 link ports.

Each link port is a bidirectional DMA-driven 8-bit port with some extra signals to do handshaking. There's no protocol sitting on top of it to handle message routing between nodes, but the hardware is there. The data is still preliminary, but it looks as though the clock will be able to run up to 250MHz, so each port can do ~250MB/sec.
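
The ~250MB/sec figure is just port width times clock. As a quick sketch (assuming one 8-bit transfer per clock and ignoring any handshake overhead, which the preliminary data doesn't settle):

```python
def link_bandwidth_mb_per_s(width_bits, clock_mhz, transfers_per_clock=1):
    """Peak throughput of a parallel link port in MB/s."""
    bytes_per_transfer = width_bits / 8
    return bytes_per_transfer * clock_mhz * transfers_per_clock


per_port = link_bandwidth_mb_per_s(8, 250)
print(per_port)      # 250.0 MB/s for one port
print(per_port * 4)  # 1000.0 MB/s across all 4 link ports
```

The aggregate number is a ceiling only; whether the DMA engines can keep all four ports busy at once is a separate question.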

Oh, and they have a *very* friendly BGA layout: only the outer 3 layers of pins are used for anything other than power/ground. After trying to route out the XS1-G4 I really appreciate that. Cost has been estimated as ~$15 in 1k quantities, which translates to ~$35-$40 in quantity-1 if you extrapolate from other Analog Devices 1k figures to what Digikey is charging for quantity-1. That's for the high-end 609 part; there are lesser chips (which still have the link ports but less internal RAM, or no vision-processor) which will presumably be cheaper.

To be honest, these new DSPs (not yet available in quantity, but AD will sell you an eval kit with a working '609 part on it) look pretty tasty. Sure, you don't get 4 independent 125MHz threads, or 8 independent 67MHz threads, but you do get 2 500MHz threads. If you're running multitasking, it's a step down in predictability on the timeslicing front, but on the flip side a single thread can run a lot faster. And it has links.

It's also pretty easy to do the "software as silicon" approach, given that every port pin on the chip (that's 112 pins in a 7x16-bit port configuration) can register for interrupts (level or edge), or you may be able to get away with the DMA-driven ports if you can massage the input protocol into something the DMA engines can handle.

I've ordered an eval kit, just to play with one. It ought to be arriving Mon/Tue. It'll be interesting to see how far I can take it... If I can get a basic timeslicing "OS" up and running, and then implement a message-routing protocol, then a hypercube of {DSP, DDR RAM} nodes could be pretty intriguing.
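
As a rough idea of what the routing side might look like: with 4 link ports per node, one natural fit is a 4-dimensional hypercube (16 nodes), where each port covers one dimension. The sketch below uses the textbook bit-fixing (e-cube) scheme; the port-to-dimension mapping and the function names are assumptions for illustration, not anything from the Blackfin documentation.

```python
def neighbours(node, dimensions):
    """Node IDs reachable in one hop: flip exactly one address bit."""
    return [node ^ (1 << d) for d in range(dimensions)]


def route(src, dst, dimensions):
    """Bit-fixing route: correct differing address bits, lowest first."""
    path = [src]
    current = src
    for d in range(dimensions):
        bit = 1 << d
        if (current ^ dst) & bit:  # this dimension still differs
            current ^= bit         # hop over the link for dimension d
            path.append(current)
    return path


DIMENSIONS = 4  # assumed: one dimension per link port, 16 nodes total
print(neighbours(0b0000, DIMENSIONS))     # [1, 2, 4, 8]
print(route(0b0000, 0b1011, DIMENSIONS))  # [0, 1, 3, 11]
```

The number of hops equals the number of differing address bits, so no message needs more than 4 hops in a 16-node cube.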