NetStamp project

XCore Project reviews, ideas, videos and proposals.
Heater
Respected Member
Posts: 296
Joined: Thu Dec 10, 2009 10:33 pm

Post by Heater »

Ah yes, it is convoluted and it does not sit in the memory map, and that is the whole point of the ZPU idea. Perhaps you haven't quite realized how crazy we Parallax Propeller chip users have become :)

Some two years ago I started to learn the Propeller assembly language by writing an Intel 8080 emulator in PASM, which then grew into a full Z80 emulator that could run CP/M from an SD card. That was OK but it was limited to using 16K of the 32K RAM on board the Propeller. Since then a number of board designs have emerged that add 512K or so of RAM, driven by bit-banging the bus with the general purpose I/O pins. Some designs end up using most of the pins; some use external latches and such to reduce the number of pins required.

These designs were in part inspired by that Z80 emulation and the idea of being able to run emulations in general on the Propeller, so now they are running CP/M, MP/M with banked memory, Sinclair Spectrum, NASCOM etc. There are 6502 and Motorola 6809 emulators in the works as well.

Amazingly the speed of these emulations approaches that of the original chips.

Even the designs that use most of the pins for RAM manage to multiplex an SD card interface on the RAM bus and drive video and keyboard from a pin or two left over. There is a complete CP/M computer that fits in a matchbox done this way.

Enter the ZPU idea. I've got this Propeller board with 512K RAM. It has 8 cores for doing high speed real-time things. I could put a huge amount of code that does not need the speed into that RAM if I had a convenient way to program it. Ah, there is the ZPU with its minimalist instruction set that can be emulated easily in one core of the Propeller, and it has a GCC compiler to go with it. Perfect.

Enter the XMOS. Same problem. A super nice multi-core high speed real-time processing device, but sadly limited in its RAM capacity. The ZPU will work there as well, provided we can get the pins to drive the RAM. No matter if its execution speed is only a handful of MIPS; that is still useful.
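To give a flavour of why the ZPU is so easy to emulate: it is a pure stack machine with byte-wide opcodes. Below is a minimal sketch in plain C of such an interpreter, covering just IM (push or extend a 7-bit immediate), NOP and ADD. The opcode values follow my reading of the Zylin ZPU instruction set, but treat them as assumptions to be checked against the official docs.

```c
#include <stdint.h>

/* Sketch of a ZPU-style stack machine. Opcode values below are
   assumed from the Zylin ZPU small ISA (IM = 0x80|imm7, ADD = 0x05,
   NOP = 0x0B); verify against the real documentation before use. */
enum { OP_ADD = 0x05, OP_NOP = 0x0B };

uint32_t zpu_run(const uint8_t *code, int len)
{
    uint32_t stack[64];
    int sp = 0;      /* number of values on the stack      */
    int idim = 0;    /* was the previous instruction an IM? */

    for (int pc = 0; pc < len; pc++) {
        uint8_t op = code[pc];
        if (op & 0x80) {
            /* IM: first in a run pushes the sign-extended 7-bit
               immediate; consecutive IMs shift-and-OR into the top. */
            uint32_t imm = op & 0x7F;
            if (idim)
                stack[sp - 1] = (stack[sp - 1] << 7) | imm;
            else
                stack[sp++] = (imm & 0x40) ? (imm | ~0x7Fu) : imm;
            idim = 1;
        } else {
            idim = 0;
            switch (op) {
            case OP_ADD: sp--; stack[sp - 1] += stack[sp]; break;
            case OP_NOP: break;
            }
        }
    }
    return stack[sp - 1];  /* top of stack */
}
```

Running the byte sequence 0x83, 0x0B, 0x84, 0x05 (push 3, nop, push 4, add) should leave 7 on top of the stack. A real emulation would of course add the load/store, branch and call opcodes, but the dispatch loop stays this simple, which is exactly what makes it attractive for one Propeller cog or one XS1 thread.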

Now, what I'm learning is that the XMOS devices are not so flexible in their use of pins, what with having to program whole blocks as IN or OUT at a time as opposed to the one-at-a-time approach of the Propeller. Still, there must be a way.
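For what it's worth, the usual workaround for whole-port-at-a-time output is a shadow register: keep a software copy of the port state and rewrite the entire port to change one bit. A sketch in plain C, where `port_out` and `last_driven` are hypothetical stand-ins for the real whole-port output primitive and the physical pins, not actual XMOS APIs:

```c
#include <stdint.h>

/* Shadow-register trick: the whole port must be written at once,
   so keep a software copy and re-drive all bits to change one pin. */
static uint32_t shadow;       /* last value we drove onto the port */
static uint32_t last_driven;  /* stands in for the physical pins   */

/* Placeholder for the real whole-port output primitive. */
static void port_out(uint32_t value)
{
    last_driven = value;
}

/* Emulate per-pin output on a block-wide port. */
void pin_write(int bit, int level)
{
    if (level) shadow |=  (1u << bit);
    else       shadow &= ~(1u << bit);
    port_out(shadow);  /* re-drive every pin of the port */
}
```

The obvious caveat is that every thread touching the port must go through the same shadow copy, which on a multi-threaded XS1 means funnelling pin writes through one owning thread or a channel.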


Heater

Post by Heater »

Further to the ZPU on XMOS idea.

I notice on the Folknology blog that on your open source hardware travels you passed by the idea of an ARM processor coupled with an FPGA, discarded due to its extreme complexity. A decision with which I concur.

The XMOS device makes an excellent replacement for an FPGA in many situations where speed and/or real-time response is required, or simply a lot of pins, or perhaps the plug-and-play expandability offered by the links.

That still leaves the remaining question of what happened to the general purpose computing part of the plan that was originally filled by the ARM?

My thesis is that a virtual machine like the ZPU running from external RAM, even if it is slow SPI RAM, can fill that requirement in many situations, much like a processor core embedded in an FPGA does.

Still, this is a discussion for the ZPU project whenever I get it posted. I don't have an XMOS chip to hand to check it out on yet.
skoe
Experienced Member
Posts: 94
Joined: Tue Apr 27, 2010 10:55 pm

Post by skoe »

Heater wrote:What with having to program whole blocks as IN or OUT at a time as opposed to the one at a time approach of the Propeller. Still there must be a way.
I'm sure there is. There's something like port precedence for pins that are shared by more than one port of different widths. Just search the forum or read the docs again.

For my current project I had to choose between the Propeller and the XMOS. Even if the performance and the features of the Propeller can't compete with the XMOS (e.g. 20 MIPS per thread vs. 100 MIPS per thread, no I/O timestamps etc.), it has very good documentation, a very active community and it is easier to solder (0.8 mm QFP and even DIP for the breadboards =) ). Currently the choice is 80% for XMOS :)

Heater: How about starting a new thread about this topic? Maybe a moderator can move these OT posts to that one, to clean up this thread. I'll have some more remarks in that new thread...
Folknology
XCore Legend
Posts: 1274
Joined: Thu Dec 10, 2009 10:20 pm

Post by Folknology »

I couldn't hope to fully understand all of the reasons for developing the ZPU, and I have zero experience of the Propeller, but this may be the right thread for a discussion of this nature, in a funny sort of way. Let me see if I can unpack that.

In particular this point you made I find fascinating on a number of levels:
I notice on the Folknology blog that on your open source hardware travels you passed by the idea of an ARM processor coupled with an FPGA. Discarded due to its extreme complexity. A decision with which I concur.

The XMOS device makes an excellent replacement for an FPGA in many situations where speed and/or real-time response is required. Or simply a lot of pins, or perhaps the plug and play expandability offered by the links.

That still leaves the remaining question of what happened to the general purpose computing part of the plan that was originally filled by the ARM?
Simple answer: I replaced the ARM with an XS1 core.

The more interesting response would be: why isn't the XS1 considered a "general purpose computing part"? I would love your personal opinion on it, even if it's just a Freudian slip. I am playing devil's advocate here to further the point that this thread is very relevant to the bigger idea you expressed. Obviously a ZPU offers something important apart from being "a general computing part": it offers all sorts of goodies like emulation of the classic CPUs of our past, and it can do so in a performant manner.

Now that's all fine and dandy if you are trying to run older code written for those processors, but if you want to take advantage of the real power sitting dormant in those XS1 cores you are likely going to need something designed for the task. Obviously that could be the common suspects C or C++; better, you could use XC or even low level XS1 assembly. My choice here would be XC over the others because I think it frames the event driven problem very well (after all, it was designed for that very thing!).

However, and this is where it gets interesting: what if you had an open palette, free rein to choose how you develop with something like the NetStamp? What would you come up with? The reason I ask is not just out of curiosity but practical implementation, for that is indeed what I have proposed for the Amino stack: a complete development environment, including languages that actually compile on the XS1 cores themselves, maybe even a JIT based solution. That is why I am working with folks inside and outside of XMOS in order to achieve this goal. It is really up to us how that is shaped, because Amino is a community based project, and I also expect there to be more than one solution. I know the eLua folks are interested in porting to Amino and XMOS; they might even make it their standard platform, as they don't currently have one. I also know that David May has done some XS1 self-compile work internally, and I am hoping to work with him on that and perhaps integrate it into Amino. I am also not ruling out ideas akin to the original Smalltalk concept, where it is not just a language but a complete environment ("objects all the way down until you get to turtles"), using the Scandinavian interpretation of message based OO.

So in that sense I would love to see you working inside, underneath or on top of the Amino stack in order to achieve the "general purpose computing part". Please consider this an invitation to join the Amino community, and if there is anything I can do to enable that, please just ask.
I have created the Amino Community group for anyone wishing to get involved in the Amino project.

P.S. At this point I am not ruling out any language or ideas; let's see what the community wants.
skoe

Post by skoe »

The first thing I thought when I saw your board: so many useful interfaces, but is there space left for "real" applications? If you take a full FAT32+LFN implementation for the SD card, an IP stack for the Ethernet interface and some sort of USB implementation, there are already 20 to 40 kByte gone (guessed). If you want to access a graphical LCD there must be some kByte of bitmap font etc. Then there's not much left for complex applications.

Most probably this is not exactly what the XMOS was made for. Look at e.g. the LPC1769: 512 kByte of internal program flash memory and 64 kByte of SRAM, plus many standard interfaces like USB, Ethernet, SPI, UART etc. in hardware. But it has only one core, which is also the reason why it is not suitable for my current project.

One possibility could be to combine a general purpose controller with the realtime multithread power of the XMOS. Another way is to take one thread of the XMOS to run a small bytecode interpreter (it could be a Java or eLua VM or an emulated CPU like the ZPU). This VM could take its program code from external memory, even from the serial flash, plus possibly some kind of cache in the XMOS RAM.
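That fetch-with-cache idea can be sketched quite simply. In the C below, `ext_read` is a hypothetical stand-in for the slow external memory access (simulated here from an array), and the cache is direct mapped:

```c
#include <stdint.h>

#define LINE_SIZE 64   /* bytes fetched per miss            */
#define N_LINES   16   /* 1K of cache held in on-chip SRAM  */

/* Hypothetical slow external memory, simulated as an array;
   on real hardware this would be a serial-flash or SPI-RAM read. */
static uint8_t ext_mem[4096];
static int ext_reads;  /* count of slow accesses, for inspection */
static uint8_t ext_read(uint32_t addr)
{
    ext_reads++;
    return ext_mem[addr];
}

static uint8_t  cache[N_LINES][LINE_SIZE];
static uint32_t tags[N_LINES];
static int      valid[N_LINES];

/* Fetch one VM code byte, filling a whole cache line on a miss. */
uint8_t fetch(uint32_t addr)
{
    uint32_t base = addr & ~(uint32_t)(LINE_SIZE - 1);
    uint32_t line = (addr / LINE_SIZE) % N_LINES;

    if (!valid[line] || tags[line] != base) {
        for (int i = 0; i < LINE_SIZE; i++)       /* slow line fill */
            cache[line][i] = ext_read(base + i);
        tags[line] = base;
        valid[line] = 1;
    }
    return cache[line][addr % LINE_SIZE];
}
```

Since bytecode fetches are heavily sequential, even a tiny direct-mapped cache like this turns most fetches into on-chip reads, which is what would make a VM-from-serial-flash scheme tolerable.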

Most probably you had the same understanding of most of this already. Just wanted to tell you why I'm happy about Heater's plan :)
Folknology

Post by Folknology »

Hi Thomas

Let me try to address your points:

First of all, NetStamp is based around an L2 124QFN. Not only does this provide dual cores and 16 hardware threads, it also provides 144K of usable memory: 16K of OTP split across the cores, and then each core has 64K of SRAM. That's a fair bit to be going on with, but obviously not quite up to LPC1769 standards.

Secondly, on the comparison point: unlike the LPC1769 we are unlikely to be running uClinux or another RTOS. Rather, we are leaning on an OS-less, event driven architecture. The Amino stack is being designed to get out of your way as much as possible but make development less arduous at the same time. For instance, even though Amino and its features look considerable, they are being designed around hardware service modules (HSMs). These modules are being designed to be dynamically loaded and unloaded as required. Thus one is unlikely to be using everything all of the time; Amino will provide a more dynamic way of handling resources, more akin to a mobile event based model. That way much more efficient use of shared resources such as memory can be realised.

I am still not keen on FAT based interfaces and would generally prefer to use other faster and more lightweight storage systems (although no doubt someone will create an Amino FAT32 port). In fact I favour the use of fast shared native and distributed storage (think an XC version of mnesia); file systems are not required in most cases for Amino, although they could be loaded (dynamically) if you so wish. I could actually call it Amnesia! A native Amnesia implementation has a number of advantages:
1) It can run in SRAM, Flash, SD card or even USB storage.
2) It uses native data structures, so serialisation comes free.
3) It is atomically safe and can be used between threads using a channel approach.
4) It does not add the bloat and overhead of file systems and operating systems.

As for graphical user interfaces, well, you can add them as dynamic modules if you wish to develop them. I, on the other hand, prefer headless design. A headless design does not take up large chunks of SRAM in order to operate; rather, it uses a smaller buffer to deliver JSON or markup to an HTTP client. That is, the user interface is web based. This web based user interface is built around JavaScript, JSON, Ajax and minimal amounts of markup. The markup and JavaScript are static resources that are delivered via Amnesia to a lightweight HTTP server over Ethernet to the browser. The browsing device runs the GUI and special Amino widgets, and Amino just passes JSON data back in response to HTTP header or GET requests. That way you can use your computer, laptop, iPad, iPhone or Android phone as a GUI device.
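As a rough illustration of how little RAM the JSON side of such a headless interface needs, here is a sketch in plain C. The field layout and function name are invented for illustration, not an Amino API:

```c
#include <stdio.h>

/* Format a single sensor reading as JSON into a small fixed buffer,
   the way a headless node might answer an HTTP GET request.
   Returns the number of characters written, or -1 on overflow. */
int json_reading(char *buf, int len, const char *name, int value)
{
    int n = snprintf(buf, (size_t)len, "{\"%s\":%d}", name, value);
    return (n < 0 || n >= len) ? -1 : n;
}
```

A few dozen bytes of buffer per response is all the device-side "GUI" costs; the browser supplies the rendering, fonts and widgets from static resources.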

Even the high level development can be done using this same interface, perhaps using Bespin/Clamato or similar.


But I am wandering off topic; much of this conversation is centred around the Amino stack. Hopefully, though, it gives you an idea of what Amino is about and how NetStamp is likely to use those features and operate. This is also why NetStamp requires many features: it isn't just an embedded stamp, it's a whole lot more.


P.S. We also have 4 Mbits of Flash as well as the MicroSD storage.
Last edited by Folknology on Fri May 14, 2010 11:28 am, edited 2 times in total.
skoe

Post by skoe »

We talked about it on IRC already, but I want to repeat it here to pick up the thread: loading modules of code like overlays could be a suitable solution for my "problems".
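A sketch of that overlay scheme in plain C: one fixed RAM region, with modules copied in from backing store on demand. Real code overlays on an XS1 would additionally need position independent code or fixed link addresses; here the modules are just data blobs, and `store_read` is a hypothetical stand-in for a flash or SD read:

```c
#include <stdint.h>
#include <string.h>

#define OVERLAY_SIZE 256  /* the one shared RAM region for overlays */

/* Hypothetical backing store: each module is a blob at a known slot;
   on real hardware this would live in serial flash or on the SD card. */
static const uint8_t store[2][OVERLAY_SIZE] = {
    { 0xAA, 0xAA },   /* module 0 */
    { 0xBB, 0xBB },   /* module 1 */
};
static void store_read(int id, uint8_t *dst)
{
    memcpy(dst, store[id], OVERLAY_SIZE);
}

static uint8_t overlay_ram[OVERLAY_SIZE]; /* single shared region  */
static int resident = -1;                 /* which module is loaded */

/* Ensure module `id` occupies the overlay region, loading on demand. */
const uint8_t *overlay_get(int id)
{
    if (resident != id) {
        store_read(id, overlay_ram);  /* evict by overwriting */
        resident = id;
    }
    return overlay_ram;
}
```

The trade-off is the classic one: RAM use drops to one region plus the loader, at the cost of a reload delay whenever you switch modules, so the module boundaries want to follow the application's phases.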

Edit: Let me add that I didn't even think about running uClinux or similar OSs; even on a Cortex M3 I would not do that.
Last edited by skoe on Fri May 14, 2010 8:07 am, edited 1 time in total.
Heater

Post by Heater »

Folknology: "...why isn't the XS1 considered "general purpose computing part""

Well, of course on some scale it is general purpose. I was simply alluding to the fact that the XMOS devices are severely RAM constrained. Given their intended use as "programmable hardware" or an FPGA replacement, that RAM limitation is quite reasonable. On the other hand, it's not a lot of space to move in compared to a tiny ARM with 256MB RAM and 256MB Flash that will run Linux and/or whatever else.

So really that is what I meant by "what happened to the general purpose computing part of the plan?" In a way you have answered that by describing your vision of web based interfaces to the Amino stack. I might have expected you to go with ARM + XMOS instead of ARM + FPGA, but again the complexity goes through the roof.

I think I just find it frustrating that in XMOS we have a quite speedy 32 bit CPU and a wonderful tool chain in C and XC, but we are constrained to programming on the scale of old 8 bit processors. If only we could bolt that 256K RAM and Flash onto one core...

The ZPU. Please be clear that the ZPU has nothing to do with emulations or running older code, if you are thinking in terms of Z80s, CP/M, old games etc. The original motivation for Zylin to design the ZPU was simply to be able to run C code compiled with your favourite GCC in a small corner of an FPGA. A smaller corner than any other 32 bit CPU core would occupy.

For me the ZPU architecture is a way to get big lumps of C code to run on an XMOS (or Parallax micro) from external RAM whilst using minimal resources within the chip. Whether this actually proves useful or not remains to be seen, but it is a shot at adding the "general purpose computing" part to those devices.

You have a lot of interesting and ambitious plans for the Amino project, and I thank you for the invitation to join in. Sadly I don't think I have the skills or the time to go where you are heading. I will, however, continue chipping away at the ZPU for XMOS, which I will gladly throw into the pot.
Folknology

Post by Folknology »

Thanks Heater.

I agree that the XMOS XS1 as packaged and shipping is RAM constrained on two fronts: a limited choice of 64K in package, and no way to add memory externally into the memory map. This is also commonly the case for many microcontrollers. For me an ideal XMOS package would be one with greater in-package memory choices, 256K or 512K, with equal on-package flash. I am less interested in attaching external memory, as that means either more pins (please no!) or less I/O for a given pin count, although pins could be shared of course.

Having said that, I do not know yet what the limitations are for Amino within 144K; I am hoping that the HSM (dynamic loading) approach makes much more efficient use of the memory. It will be interesting to see how well this works in practice.

I am looking forward to seeing the ZPU, by the way; it's an interesting approach.

regards
Al
Heater

Post by Heater »

Yep. We want a multi-core high speed event driven CPU. We want all the I/O, timer and link goodness, for all those real-time "software defined silicon" tasks.

We (I at least) also want the possibility of huge amounts of RAM and Flash so that we can accommodate big jobs that an ARM can handle with ease.

No, we don't want to give up our precious pins, and no, we don't want a chip with a million pins to make all this possible.

Sounds like we want to have our cake and eat it. It's just not possible. Or is it?

I think a neat solution to this dilemma would be a new chip from XMOS, maybe just a single core. This chip would forgo a lot of I/O for a proper external memory bus, but crucially it would keep the xlinks. BINGO: an XMOS capable of running hundreds of megabytes of code, with a hot line (or lines) through the xlinks to other devices doing the real-time peripheral stuff. The whole shebang programmable with the same tools.

Dah...I'm dreaming again.