Expected speed gain from switching processors?

MuellerNick
Member++
Posts: 29
Joined: Fri Dec 11, 2009 9:33 am

Postby MuellerNick » Sun Aug 04, 2019 9:46 am

Hi!

I currently have a "startKIT" dev board.*)
For a job application that I *really* want, I wrote some software to show that there is a better way than using an FPGA + an ARM.
The software runs on a variant of the XS1-A8A-64-FB96 (not exactly that xCORE), and I am using 5 cores for the benchmark tests I made.

Now to my question:
Is my math right that if I switch to an XL210, also using just 5 cores, I can expect about a threefold performance gain?

The final product would use more cores than just the 5 (for IO), but that won't matter.

Thanks,
Nick

*) I do have two older ones, but they also use XS1 processors, so no gain.
Furthermore, I would have bought a sliceKIT, but they have not been available for months. When will the next batch arrive?
akp
Respected Member
Posts: 314
Joined: Thu Nov 26, 2015 11:47 pm

Postby akp » Tue Aug 06, 2019 4:02 pm

My suspicion is it would matter what you were doing with the FPGA + ARM.

If you use five logical cores, they will all run at 100MHz, regardless of whether it's XCORE-200 or XS1 (assuming the 500 MIPS speed grade). If you could get it down to 4 logical cores, the XS1 would run each core at 125MHz, but the XCORE-200 is limited to a maximum of 100MHz per core. I suspect you could get a speed-up if you hand-coded dual-issue assembly and could make use of the new XS2 instructions, but it would take effort. I would guess the theoretical maximum speed-up is about 2x, but you are unlikely to achieve that. Maybe I calculated the speed-up differently from you.
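
The per-core figures in this post follow from sharing the tile clock among the active logical cores, with a floor on the divisor. A minimal sketch of that arithmetic, assuming a 500 MHz tile, a minimum divisor of 4 for XS1 and 5 for XCORE-200 (both inferred from the 125 MHz and 100 MHz figures quoted here), and an idealized factor of 2 for dual issue. The function name and constants are made up for illustration:

```python
def per_core_mhz(tile_mhz, active_cores, min_divisor):
    """Issue rate of one logical core in MHz.

    The tile clock is shared among the active cores, but a single core
    never gets more than tile_mhz / min_divisor.
    """
    return tile_mhz / max(active_cores, min_divisor)

XS1_MIN_DIVISOR = 4  # assumption: gives the 125 MHz figure at 4 cores
XS2_MIN_DIVISOR = 5  # assumption: gives the 100 MHz per-core cap

for cores in (4, 5, 8):
    xs1 = per_core_mhz(500, cores, XS1_MIN_DIVISOR)
    xs2 = per_core_mhz(500, cores, XS2_MIN_DIVISOR)
    # Dual issue is modeled as a perfect 2x, which real code rarely reaches.
    print(f"{cores} cores: XS1 {xs1:.1f} MHz, XCORE-200 {xs2:.1f} MHz, "
          f"XCORE-200 ideal dual issue {2 * xs2:.1f} MIPS")
```

With five cores, both families come out at the same per-core rate, so any gain on the newer part would have to come from dual issue or the new instructions.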
MuellerNick
Member++
Posts: 29
Joined: Fri Dec 11, 2009 9:33 am

Postby MuellerNick » Tue Aug 06, 2019 5:30 pm

Thanks for your input!
Well, the job description required FPGA, ARM and DSP knowledge (PID control).
So I suppose they hit a speed bump with just a µC. They make quite fast controls (without being too specific from my side).
So I thought (I'm "not too good" at FPGA) that an XMOS would be quite the match. So I wrote a PID controller over the weekend, tuned it by guesswork, and got to 600,000 loops per second.
But I absolutely don't know their specs. I guess that this is still too slow for them.
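
For readers unfamiliar with the kind of loop being benchmarked, here is a minimal sketch of one integer (fixed-point) PID update. It is not Nick's code, which was not posted; the Q16 scaling, the per-sample gains, the output clamp, and all names are assumptions chosen for illustration, and Python is used only to show the arithmetic:

```python
Q = 16  # assumption: gains are held as Q16 fixed-point integers

class PID:
    def __init__(self, kp, ki, kd, out_min, out_max):
        # Convert the float gains to integers once, outside the hot loop.
        # The gains are per-sample, so the sample period is folded into ki and kd.
        self.kp = int(round(kp * (1 << Q)))
        self.ki = int(round(ki * (1 << Q)))
        self.kd = int(round(kd * (1 << Q)))
        self.out_min = out_min
        self.out_max = out_max
        self.integral = 0
        self.prev_error = 0

    def step(self, setpoint, measurement):
        """One control-loop iteration using integer arithmetic only."""
        error = setpoint - measurement
        self.integral += error            # no anti-windup in this sketch
        derivative = error - self.prev_error
        self.prev_error = error
        out = (self.kp * error
               + self.ki * self.integral
               + self.kd * derivative) >> Q
        # Clamp to the actuator range.
        return max(self.out_min, min(self.out_max, out))

pid = PID(kp=2.0, ki=0.5, kd=0.0, out_min=-1000, out_max=1000)
print(pid.step(100, 90))
print(pid.step(100, 90))
```

The loop body is a handful of multiplies, adds, and one shift, which is why loop rates in the hundreds of thousands per second are plausible on a single 100 MHz logical core.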

And I misunderstood the XMOS speed specs. Today, I realized that I won't gain anything by using an XL2xx. OK, I will need more cores for the bells and whistles, but these aren't speed critical.

Anyhow, I'll tidy up my code and complete it. Then I'll measure loops/s and signal delay and let them know. Without the source or which CPU I used. :-)


Nick
akp
Respected Member
Posts: 314
Joined: Thu Nov 26, 2015 11:47 pm

Postby akp » Tue Aug 06, 2019 5:45 pm

Good luck, hope you nail the job application. Sounds like you're putting some good effort into it.
akp
Respected Member
Posts: 314
Joined: Thu Nov 26, 2015 11:47 pm

Postby akp » Tue Aug 06, 2019 9:13 pm

Here are some other thoughts.
- If you get an XVF3000/XVF3100, it is guaranteed to run at 600MHz, so that would give you a 20% overclock. You can try to see if your development chip will boot at 600MHz by editing the xn file; see other threads on the forum.
- Most likely your best bet will be to see if you can split your most computationally intensive core over multiple cores and use a device with more cores per tile (e.g. 8). Then you can run the cores that need to be fast at 100MHz and pipeline the computation more efficiently. Passing data between pipeline stages through a FIFO in shared memory is fast compared with using channels. Or, if you can use streaming channels, that's probably better due to the built-in synchronization.
- refer to the tips and tricks e.g. https://xcore.github.io/doc_tips_and_tr ... eedup.html
- search for ancient stuff on this forum from the true assembly gurus
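
The shared-memory FIFO idea from the list above can be sketched as a single-producer, single-consumer ring buffer. This is only an illustration of the data structure in Python; the class and method names are made up, and on an xCORE device it would be written in xC or C, where memory visibility between cores needs care that this sketch does not model:

```python
class RingFifo:
    """Single-producer, single-consumer ring buffer."""

    def __init__(self, capacity):
        # One slot is kept empty so that "full" and "empty" look different.
        self.buf = [0] * (capacity + 1)
        self.head = 0  # next slot to read; only the consumer writes this
        self.tail = 0  # next slot to write; only the producer writes this

    def push(self, value):
        """Producer side: returns False instead of blocking when full."""
        nxt = (self.tail + 1) % len(self.buf)
        if nxt == self.head:
            return False
        self.buf[self.tail] = value
        self.tail = nxt
        return True

    def pop(self):
        """Consumer side: returns None when empty."""
        if self.head == self.tail:
            return None
        value = self.buf[self.head]
        self.head = (self.head + 1) % len(self.buf)
        return value

# Two pipeline stages, run one after the other here for simplicity:
fifo = RingFifo(capacity=4)
for sample in (3, 5, 7):
    fifo.push(sample * 2)          # stage 1: scale the sample
results = []
while (item := fifo.pop()) is not None:
    results.append(item + 1)       # stage 2: add an offset
print(results)
```

Because each index has exactly one writer, the two stages need no lock, which is where the speed advantage over a channel handshake would come from.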

cheers
MuellerNick
Member++
Posts: 29
Joined: Fri Dec 11, 2009 9:33 am

Postby MuellerNick » Thu Aug 08, 2019 6:41 pm

Thanks again for your thoughts!
After two evenings of thinking and testing, I made it to 1.2 million loops per second and a signal delay of 520 ns.
No assembler harmed! With a little trick, I even get almost 1.5 million loops/s.

Haven't implemented FF0 ... FF2, but I don't expect that to slow down too much. And I'm running out of cores on the startKIT. :-)

Now I'll write a nice proposal ...

Nick

XMOS is so damned cool! But I don't understand why XMOS moved away from promoting this kind of application. All that Alexa stuff. I personally couldn't care less.
