Code only runs after flash as

Ross
XCore Expert
Posts: 968
Joined: Thu Dec 10, 2009 9:20 pm
Location: Bristol, UK

Post by Ross »

gerrykurz wrote:
When I type xrun -- dumpstate or xrun --dump-state

I get the following error message

xrun: No .xe file passed to --dumpstate option
You will want to pass your .xe file to that command so that symbols can be resolved for you.

If you are running with xrun binary.xe, then use xrun --dumpstate binary.xe
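
As a minimal sketch of that substitution, the helper below only prints the command so it can be checked before it is run against a board. The function name dumpstate_cmd and the file name app_name.xe are placeholders for illustration, not names from this thread.

```shell
#!/bin/sh
# Print the xrun dump-state command for a given .xe file.
# dumpstate_cmd and app_name.xe are hypothetical; use your own binary name.
dumpstate_cmd() {
    xe="$1"
    if [ -z "$xe" ]; then
        echo "usage: dumpstate_cmd <file.xe>" >&2
        return 1
    fi
    printf 'xrun --dumpstate %s\n' "$xe"
}

dumpstate_cmd app_name.xe
```

Whatever name you normally pass to xrun is the name that goes after --dumpstate.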


gerrykurz
XCore Addict
Posts: 204
Joined: Sun Jun 01, 2014 10:25 pm

Post by gerrykurz »

Hi Ross,

Thank you for your response but I don't quite understand what you are saying.

If I do an xrun with my application.xe file, then everything works fine.

The problem I am having is booting out of flash on a power up condition.

In every other scenario, my application code works fine.

I don't know what binary.xe is....

How can I get some direct support for this issue?
infiniteimprobability
XCore Legend
Posts: 1126
Joined: Thu May 27, 2010 10:08 am

Post by infiniteimprobability »

Hi - binary.xe means your binary file, i.e. substitute the name of your application for "binary".

gerrykurz wrote:
How can I get some direct support for this issue?
Ways of getting support:
- Private ticketed direct support is a paid for service via the Enterprise tools license
- via an FAE if a direct customer
- via your distributor
- Via this forum (for most of us at XMOS, answering this forum is not our primary role and we have chosen to be here to support our community of developers). Support is best effort via the forum.
- If it's clearly an XMOS bug, then this can be reported by anyone with an xmos.com account.

I think a lot of the non-direct (i.e. community member) support has been pretty good on this thread. For example, the suggestion about setting --spi-div <value> to something much larger (see xflash --help for setting the value) is a very good one and it would be good to hear the results of this.

However, be aware that the 13.2 tools had a bug which caused --spi-div to be ignored in the second stage boot. 13.1 was OK, as is 14, where the bug has been fixed (see the tools 14 release notes). Please check your tools version and let us know what it is.
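
A rough sketch of that version check, covering only the versions mentioned above. The function name spi_div_status, the divider value 8, and app.xe are assumptions made for illustration; anything not named in this thread is reported as unknown rather than guessed.

```shell
#!/bin/sh
# Report whether --spi-div is honoured by the second stage boot,
# based only on the tools versions discussed in this thread.
spi_div_status() {
    case "$1" in
        13.2|13.2.*) echo "ignored" ;;
        13.1|13.1.*|14|14.*) echo "honoured" ;;
        *) echo "unknown" ;;
    esac
}

# Dry run: print (do not execute) an xflash command with a larger divider.
# The value 8 and the file name app.xe are placeholders.
if [ "$(spi_div_status 14)" = "honoured" ]; then
    printf 'xflash --spi-div %s %s\n' 8 app.xe
fi
```

If the status comes back as ignored, changing the divider is not a meaningful test until the tools are updated.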

Also, how did you get on with the boot debugging FAQ which I posted in direct response to this thread?
Here's an FAQ with some further info on debugging the boot.

http://www.xcore.com/questions/3233/wha ... #node-3234
That should reveal more info about where things are going amiss and provide pointers to help track this elusive bug down.

I recall from a previous thread that you have a multi-chip setup. Something in this area is likely to be the cause of the issue because we know xflash in general works - could you post your custom .xn file?

It would be interesting to know whether flashing an app on just the first device in the chain works too. That would show whether it's a SPI speed/connection issue or a link network setup bug.
colin
Experienced Member
Posts: 74
Joined: Mon Dec 16, 2013 12:14 pm

Post by colin »

xflash does indeed verify what it writes to flash memory, and the fact that your device boots on occasion does suggest that the memory has been successfully written.

Is this a multi-node network of devices where one node is responsible for booting other nodes? It would be useful to see your XN file and the values used for the PLL and link modes/speeds.

If this is a network of nodes then there are a couple of hidden xflash options that you could use to modify the delays used in the bootloader:

--link-reset-delay option increases the delay used before a link is reset. The default value is: 8192 clock cycles.

--s2l-worst-case-boot-time increases the delay used between the PLL register being written and attempted communication with the node that was reset. The default value is: 60000 clock cycles.

Try increasing these options independently or in combination to see if it improves the frequency of successful boots.

Alternatively there have been a number of improvements to the timings used in the bootloader for tools 14. You could install tools 14 and use xflash to program your device.

If this is a network setup timing issue then I would expect either of these options to help resolve your boot issues.
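
One way to organise those experiments is a dry-run sweep that prints candidate xflash commands with both delays doubled at each step, starting from the defaults quoted above. This is only a sketch: the doubling schedule, the space-separated "option value" syntax, the function name sweep_cmds, and the file name app.xe are all assumptions, so check them against your tools before running anything.

```shell
#!/bin/sh
# Dry run: print xflash commands with progressively larger bootloader delays.
# Starting points are the defaults quoted above (8192 and 60000 clock cycles).
# The doubling steps, option/value syntax, and app.xe are assumptions.
sweep_cmds() {
    xe="$1"
    steps="$2"
    link_delay=8192
    boot_time=60000
    i=0
    while [ "$i" -lt "$steps" ]; do
        link_delay=$((link_delay * 2))
        boot_time=$((boot_time * 2))
        printf 'xflash --link-reset-delay %s --s2l-worst-case-boot-time %s %s\n' \
            "$link_delay" "$boot_time" "$xe"
        i=$((i + 1))
    done
}

sweep_cmds app.xe 3
```

Power cycle the board several times after each flash and note how often it boots, so the effect of each setting can be compared.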
gerrykurz
XCore Addict
Posts: 204
Joined: Sun Jun 01, 2014 10:25 pm

Post by gerrykurz »

Progress on this issue so far, and thanks to everyone for the help.

This design does use two L16 devices, with the first booting from flash and the second booting via link. Right now I am testing with a very simple bit of code that only uses one core of tile 0, so I don't think the network timing or link timing should be an issue, as no application code is being loaded to the other tiles on the second device. I have included my .xn file nevertheless.

I have just installed and tried version 14 of the tools and I can verify that the xflash --spi-div option does work in V14 and does not work in V13.2.

However, slowing down the SPI clock does not solve the problem of booting from flash out of power up.

So to summarize, the application runs without issue in both debug and run mode when loaded via jtag.

xflash (all versions) is able to program and verify the flash device with no errors.

After running xflash, the device boots and runs the application successfully.

The only issue is that the application does not start properly when coming out of a power up condition. I have confirmed that it reaches the second stage bootloader and then fails after an extended interaction with the flash device where the spi clock speed changes to two different clock rates from the first stage bootloader spi clock. Is this expected behavior?

One question that seems relevant: what is the difference between how the device boots the application after running xflash and how it boots out of an external reset? Does xflash load the application directly, the same as xrun, or does the application load from flash after running xflash?

I will now try the boot debugging as suggested in the previous reply and report the results.
gerrykurz
XCore Addict
Posts: 204
Joined: Sun Jun 01, 2014 10:25 pm

Post by gerrykurz »

OK here are the results of running the xrun dump state.

The boot process gets into the application but seems to hang at a particular address in RAM, 0x00010730.

The actual code at this location seems to have no relationship to the problem as can be seen in the attached dump state files. In the debug build, the code hangs in dsp_status.dtor and in the release build, it hangs in the call to the configure_clock_rate function.

I have included a dump state from the application running correctly for comparison.

This dump state shows two loops running on two logical cores in tile 0.

So any new ideas about this?

Is this a reportable bug?
segher
XCore Expert
Posts: 844
Joined: Sun Jul 11, 2010 1:31 am

Post by segher »

I don't think your "only one tile" test was correct. You
have to use a modified XN for that.

Please don't zip text files, it is quite inconvenient (so I
haven't read them yet).
gerrykurz
XCore Addict
Posts: 204
Joined: Sun Jun 01, 2014 10:25 pm

Post by gerrykurz »

OK, thanks, you are right, I need to modify the .xn file.

Here are the links to the dump state outputs:

Debug build dump state no xe option

Debug build dump state

Release build dump state

Release build running correctly dump state
Ross
XCore Expert
Posts: 968
Joined: Thu Dec 10, 2009 9:20 pm
Location: Bristol, UK

Post by Ross »

Have you tried this without the xtag plugged in?
segher
XCore Expert
Posts: 844
Joined: Sun Jul 11, 2010 1:31 am

Post by segher »

In all cases, CPU 0 hangs trying to output to a channel,
so the channel's buffers are full. This is at 10730, in the
bootloader.

The other three CPUs are in the ROM, trying to boot from
a channel (waiting for the first data to come in).