500 MIPS limitation per tile

DemoniacMilk
XCore Addict
Posts: 191
Joined: Tue Jul 05, 2016 2:19 pm

Post by DemoniacMilk »

I was just wondering: what exactly causes the limit of 500 MIPS per tile and 100 MIPS per core?


Gothmag
XCore Addict
Posts: 129
Joined: Wed May 11, 2016 3:50 pm

Post by Gothmag »

It's a 500 MHz processor per tile, so a tile performs up to 500 million instructions per second. Divides aren't single-cycle, so I suppose you will lose some performance with those. In dual-issue mode it can run at up to 1000 MIPS. The 100 MIPS per core comes from the way the scheduler is built: the guaranteed minimum per core is 100 MIPS at best and 62.5 MIPS at worst (500/8, with all eight cores active). An employee may be able to clarify what happens with fewer than 5 cores (assuming XS2); I haven't tested this myself. I believe the hardware also has 8 sets of registers for efficient switching, which is why it can switch threads every clock cycle without losing performance, but I'm having trouble remembering where I read that.
DemoniacMilk
XCore Addict
Posts: 191
Joined: Tue Jul 05, 2016 2:19 pm

Post by DemoniacMilk »

Thank you for your reply!
Gothmag wrote:It's a 500 MHZ processor, so it performs 500 Million Instructions Per Second
Yes, this confuses me a bit. What exactly is clocked at 500 MHz? From what I understood, the reference clock per tile is 100 MHz. I am not sure whether the reference clock equals the core clock, but if it does, that would explain the maximum of 100 MIPS in single-issue mode.

Or is it a minimum of 100 MIPS?
In that case, the 500 MIPS per tile must be due to a shared resource, probably memory? I have read somewhere that memory access is scheduled round-robin, so for n active cores each core is given 500/n million memory accesses per second, with n being at least 5. If more than 5 cores are running and executing instructions that do not need memory access, should that allow a tile to exceed 500 MIPS (all single issue)?
peter
XCore Addict
Posts: 230
Joined: Wed Mar 10, 2010 12:46 pm

Post by peter »

The core clock runs at 500 MHz, meaning that one instruction completes every 2 ns. However, to keep the machine simple, there is a hardware limitation that restricts each logical core to a maximum issue rate of one instruction every 5 cycles. This guarantees that by the time a core issues an instruction, its previous instruction has completed, so no complicated bypass or stall logic is required when an instruction depends on the result of the previous one. This makes the hardware simpler at the cost of peak instruction issue rate for a logical core.

The other 4 cycles have to be filled by other logical cores or they are simply not used. Running with only one logical core active will result in the machine being idle for 4/5 of the cycles. The architecture has been designed/optimised for running applications where there are multiple active logical cores.

The maximum issue rate of a given logical core is therefore 1/5 of 500 MHz, i.e. 100 MHz. These can be dual-issue instructions, giving the 200 MIPS peak per logical core.
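
As a rough sketch of the arithmetic above (not XMOS code; the function names are made up for illustration): it assumes a 500 MHz core clock, the 1-in-5 issue limit, and that when more than five logical cores are active the issue slots are shared equally between them, which is how the 62.5 MIPS figure for eight cores mentioned earlier in the thread comes out.

```python
def per_core_mips(active_cores, core_clock_mhz=500.0,
                  min_issue_interval=5, dual_issue=False):
    """Peak instruction rate (MIPS) of one logical core.

    Assumption: issue slots are shared equally (round-robin) between the
    active cores, and no core may issue more often than once every
    `min_issue_interval` core clock cycles.
    """
    if active_cores < 1:
        raise ValueError("at least one logical core must be active")
    # A core gets one slot per round; a round is never shorter than 5 cycles.
    cycles_between_issues = max(active_cores, min_issue_interval)
    issues_per_us = core_clock_mhz / cycles_between_issues
    # In dual-issue mode each issue slot can carry two instructions.
    return issues_per_us * (2 if dual_issue else 1)


def tile_mips(active_cores, **kwargs):
    """Peak instruction rate (MIPS) of the whole tile."""
    return active_cores * per_core_mips(active_cores, **kwargs)


if __name__ == "__main__":
    for n in (1, 4, 5, 8):
        print(n, per_core_mips(n), tile_mips(n))
```

With one active core the tile only reaches 100 MIPS (4 out of 5 cycles idle); from five cores upwards the tile stays at 500 MIPS single issue while the per-core share drops.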

The peripheral clock runs at 100 MHz, and this is what drives the timers and clock blocks by default. This gives a default 10 ns resolution on timers and port I/O.
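
To illustrate the 10 ns figure, here is a small sketch of converting between timer ticks and nanoseconds, assuming the default 100 MHz peripheral clock. The helper names are hypothetical and not part of any XMOS library.

```python
REF_CLOCK_HZ = 100_000_000  # assumed default peripheral/reference clock
NS_PER_S = 1_000_000_000


def ticks_to_ns(ticks, ref_clock_hz=REF_CLOCK_HZ):
    """Duration in nanoseconds of a number of timer ticks."""
    return ticks * NS_PER_S // ref_clock_hz


def ns_to_ticks(ns, ref_clock_hz=REF_CLOCK_HZ):
    """Whole timer ticks that fit into `ns` nanoseconds (rounds down)."""
    return ns * ref_clock_hz // NS_PER_S


if __name__ == "__main__":
    print(ticks_to_ns(1))         # one tick at 100 MHz
    print(ns_to_ticks(1_000_000)) # ticks in one millisecond
```

Anything shorter than one tick cannot be resolved by the timer, so `ns_to_ticks` rounds down to whole ticks.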
DemoniacMilk
XCore Addict
Posts: 191
Joined: Tue Jul 05, 2016 2:19 pm

Post by DemoniacMilk »

Awesome answer, thank you.