USB and internal xlinks failing over time

Technical discussions around xCORE processors (e.g. xcore-200 & xcore.ai).
Machdisk
Member++
Posts: 20
Joined: Wed Aug 01, 2018 11:17 am

Post by Machdisk »

Easy enough, it's just a couple of switching regulators like the attached. Adjusting R89 and R94 lets me set any delay I want. Is there any particular value that you think is optimal?
mon2
XCore Legend
Posts: 1913
Joined: Thu Jun 10, 2010 11:43 am

Post by mon2 »

In the posted schematic, the 1v0 rail is powering up based on the RC enable circuit? However, the enable is based on a 5v0 rail input.

The proper solution is to use the same 3v3 rail that is powering the XMOS CPU -> wait until the threshold is reached on the 3v3 rail (often signalled by a PG = power good (open-drain) output with a local pull-up resistor) -> once the 3v3 power_good is HIGH -> use this signal to enable the 1v0 rail for the XMOS CPU. No RC circuits are needed, nor should they be used.

How does the +24V0 power source interact with this power supply? Is it an external adapter? If yes, what is the DC cable length between the adapter and this design?

I do not see the 3v3 power supply details in the posting, so those would be good to review.

How does the +24V0 rail convert to the +5v0 and +3v3 rails?
Machdisk
Member++
Posts: 20
Joined: Wed Aug 01, 2018 11:17 am

Post by Machdisk »

There were reasons for doing it that way at the time. They have since become irrelevant, beyond the fact that this part has no power good output, is very affordable and has served me well in the past. Still, I don't see why it would cause an issue, unless 3V3 fails to come up, but then we have a duff board anyway, so I'm not overly worried about that and have never seen it happen. The RC delays are very reliable and let me reorganise the sequence however I desire. The rails do come up exactly as specified (with the diodes preventing any variation in behaviour on a quick reboot). The 3v3 is a little difficult to post as half of it is part of a block, with varying parts outside the block, to allow me to do some multi-channel layout copying. It's effectively the same as the above, but with different R90/R92 values, different values for the RC network, and deriving from 24V directly.

There are three near identical circuits like the above.

One for 5V which comes up first.

One for 3v3 which comes up 20ms later. Those both derive directly from 24V and that sequencing is also robust.

The third is the above 1V0 which derives directly from 5V as shown.

The 24Vdc source is an in line power brick with a 1 metre cable. There is a soft start FET switch that ramps the 24V over 40ms even if it is plugged in fully powered up. That has all been proven to work smoothly after some tweaking.

Per the documents you linked, I did consider the input surge. I did initially have some ringing from the input capacitance (I use low impedance polycaps for the input and output caps) interacting with the cable length, with the rise time being a bit quick, but slowing the soft start cured those spikes completely and made no difference to the observed failure modes. A good thought though; I thought that might be it for a while.
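As a rough illustration of why a slower soft start reduces the input surge: for an ideal linear ramp, the charging current into the input capacitance is I = C * dV/dt. The sketch below uses the 24 V and 40 ms figures from this post; the 100 uF capacitance and the 1 ms comparison ramp are hypothetical values chosen only for illustration, and cable inductance and capacitor ESR are ignored.

```python
def ramp_inrush_current(c_farads, v_final, t_ramp_s):
    """Charging current (A) into a capacitance during an ideal linear voltage ramp."""
    return c_farads * v_final / t_ramp_s

# Hypothetical total input capacitance; not taken from the thread.
C_INPUT_ASSUMED = 100e-6

# 24 V ramped over 40 ms (figures from the post) versus a hypothetical 1 ms ramp.
for t_ramp in (40e-3, 1e-3):
    amps = ramp_inrush_current(C_INPUT_ASSUMED, 24.0, t_ramp)
    print(f"ramp {t_ramp * 1e3:.0f} ms -> about {amps:.2f} A charging current")
```

This only models the capacitor charging current; the ringing described above also depends on the cable inductance, which this sketch does not capture.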

Equally I have tried various soft start sequencing with no improvement. Currently for the RC sequencing networks I have:

5V R89=22k, R94=4k7
3V3 R89=33k, R94=4k7
1V0 R89=6.8k, R94=12k (apologies, changed these two values since the schematic I posted. Hadn't realised it had the old values when I posted it).

Given the varied input voltages that gives me a nice consistent 5V - 20ms - 3v3 - 10ms - 1V0 - 1s - reset goes high. The 3n3 soft start capacitors prevent any of the rails rising too quickly so no problems there. I haven't tried a closer sequence yet. You think less than 10ms between 3V3 and 1V0?
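For anyone wanting to sanity-check RC enable delays like these, here is a minimal sketch. It assumes, without confirmation from the schematic, that R89 is the top resistor and R94 the bottom resistor of a divider from the input rail to the regulator enable pin, with a delay capacitor across R94, and that the input appears as a step. The capacitor value and enable threshold below are hypothetical placeholders, so the printed delays are illustrative only.

```python
import math

def divider_voltage(v_in, r_top, r_bottom):
    """Steady-state voltage at the divider midpoint (the assumed enable pin)."""
    return v_in * r_bottom / (r_top + r_bottom)

def enable_delay(v_in, r_top, r_bottom, c_farads, v_threshold):
    """Time (s) for a capacitor across r_bottom to charge from 0 V to v_threshold.

    Assumes a step input. Returns None if the divider never reaches the threshold.
    """
    v_final = divider_voltage(v_in, r_top, r_bottom)
    if v_final <= v_threshold:
        return None
    # The capacitor charges through the Thevenin resistance r_top || r_bottom.
    r_thevenin = r_top * r_bottom / (r_top + r_bottom)
    return -r_thevenin * c_farads * math.log(1.0 - v_threshold / v_final)

# (input voltage, R89, R94) per rail, using the resistor values listed above.
RAILS = {
    "5V": (24.0, 22e3, 4.7e3),
    "3V3": (24.0, 33e3, 4.7e3),
    "1V0": (5.0, 6.8e3, 12e3),
}

# Hypothetical values, NOT from the thread: delay capacitor and enable threshold.
C_ASSUMED = 1e-6
VTH_ASSUMED = 1.2

for name, (v_in, r89, r94) in RAILS.items():
    v_en = divider_voltage(v_in, r89, r94)
    delay = enable_delay(v_in, r89, r94, C_ASSUMED, VTH_ASSUMED)
    print(f"{name}: enable settles at {v_en:.2f} V, delay {delay * 1e3:.2f} ms")
```

Because the 24V input is itself soft-started and the 1V0 stage is fed from the 5V rail, the real delays will differ from this step-input estimate; a scope measurement remains the reference.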
CousinItt
Respected Member
Posts: 366
Joined: Wed May 31, 2017 6:55 pm

Post by CousinItt »

Another wild idea: could the damage be done as the board is powered down? Is one rail staying up for a while without others, or without reset?

Re the PLL cap. I haven't seen any ripple specs, but it would seem sensible to have it as close as possible. For the explorer kit, which uses the TQFP, the capacitor is right on the pins. The pin length inside the package would be several mm, and the loop area fairly low. For the BGA I would have thought it best to put as many of the supply caps as possible on the other side of the board, including the PLL cap.

There's some fairly sound advice here:

http://processors.wiki.ti.com/index.php ... decoupling under Methods of placement for bypass capacitors.
mon2
XCore Legend
Posts: 1913
Joined: Thu Jun 10, 2010 11:43 am

Post by mon2 »

Hi. I still have only skeleton details of your application, and it is difficult to go back and forth on the requests due to very little free time.

Consider:

1) Replace the external power supply with a quality standard PC power supply and apply the +12V rail to your black box design. The switchers should be fine with the +12v input feed. Does this allow for your unit to work without failure?

2) If you still fail - how experienced is your SMD assembly house? Could they be violating the reflow profiles for this device? If the XMOS CPUs are not in a sealed dry pack, they should be slowly baked, and a good PCBA shop will know this. This is only a real concern if you are noticing any popcorn effect on the XMOS device; it appears you are ok here, but I thought to mention it.

3) What else do you have mated with the XMOS device? Are those legs of the circuit powered by any chance by the +5v feed? That is, some feed where the voltage rail is active BEFORE the +3v3 / +1v0 rails? That is a serious no-no if that is the case.

4) Any inductive loads (motors? solenoids?) around this design? If yes, that could be a show stopper. We have worked with some crazy inductive loads in some OEM designs and resolved to zero field failures. Post back if you are working with such a design.

I am losing track in all of this dialog, but to confirm - you could be working for days / weeks and then the design just fails and remains permanently damaged?

If possible, consider to use your power supply of choice + your XMOS design -> apply a simple LED blink routine that uses somehow both tiles -> remove any other connection to the outside (best if you can do this) -> now does this Cray 1 design as a blinker work for you and continue to work till you decide to halt?

Unless there is a silicon failure from the factory (doubt it), you have some event that is a transient that is surfacing after a period of time or violating some voltage spec. Just my 2 bits.
mon2
XCore Legend
Posts: 1913
Joined: Thu Jun 10, 2010 11:43 am

Post by mon2 »

More thoughts..

1) if you take a virgin board and power cycle (without any IP), does this same board continue to work at the end of xx power on / off cycles? Can you detect the board ok with the toolchain? Do not load your IP yet in these tests but perform only a flash erase at most.

2) if the above test allows for your board to survive, proceed to upload a proven XMOS IP example. Perhaps something like the XMOS USB HID (probably the easiest to apply), CDC or similar example. Not your custom code at this time. Confirm the IP works. Then proceed to power cycle again xx times (5-10 times at least). Be sure to allow for the power supply caps to discharge between cycles. Does the custom board survive this test?

3) what are the details of your PLL? Any chance you are overclocking or configuring the PLL out of spec which may be leading to these failures?
[Attached image: xmos_pll.png]
4) I am not fully comfortable with the power supply as defined for this product. If the issues persist, consider applying a lower-voltage input supply (i.e. a standard PC power supply should be ample) with a proper power good output for the 3v3 and 1v0 rails. While the datasheet for the CPU notes that it is ok to have at most 50 ms between the 3v3 and 1v0 rails, I would suggest shortening this turn-on delay.

Suggesting to validate that a simple LED blinky is solid on your custom design and continues to operate regardless of the number of power cycles -> only then proceed to test the IP with the other tile. Do confirm that your PLL values are within spec else you may be over-clocking the CPU.
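The sequencing points raised in this thread can be turned into a small check on scope measurements. This is only a sketch: the 50 ms limit is the figure quoted in this post (verify it against the datasheet for your part), the event names are made up for illustration, the reset rule is an assumption rather than something stated in the thread, and the sample timings are the 5V - 20ms - 3v3 - 10ms - 1V0 - 1s - reset sequence reported earlier in the thread.

```python
# Limit as quoted in this post; confirm against the device datasheet.
MAX_3V3_TO_1V0_MS = 50.0

def check_power_up(events_ms):
    """Check measured power-up times (ms) against the sequencing rules discussed.

    events_ms maps a hypothetical event name to the time it became valid.
    Returns a list of problems; an empty list means all checks passed.
    """
    problems = []
    t_3v3 = events_ms["3V3"]
    t_1v0 = events_ms["1V0"]
    t_reset = events_ms["RESET_RELEASE"]

    if t_1v0 < t_3v3:
        problems.append("1V0 came up before 3V3")
    elif t_1v0 - t_3v3 > MAX_3V3_TO_1V0_MS:
        problems.append(f"1V0 lags 3V3 by {t_1v0 - t_3v3:.1f} ms (limit {MAX_3V3_TO_1V0_MS} ms)")

    # Assumption: reset should only be released once both rails are up.
    if t_reset < max(t_3v3, t_1v0):
        problems.append("reset released before both rails were up")
    return problems

# Timings reported earlier in the thread, relative to the 5V rail.
measured = {"5V": 0.0, "3V3": 20.0, "1V0": 30.0, "RESET_RELEASE": 1030.0}
print(check_power_up(measured) or "sequence OK")
```

The same function could be run on power-down measurements with the rules inverted if the power-down order turns out to matter.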
Machdisk
Member++
Posts: 20
Joined: Wed Aug 01, 2018 11:17 am

Post by Machdisk »

Hi Mon, I'm working through your various tests but a few answers for your last two posts while I work:
1)
2) Very experienced assembly house; boards have been X-rayed to ensure proper assembly. I don't believe this to be the issue.
3) Nothing is supplied to the XMOS that is powered off the 5V rail. The 5V powers the analogue side of the ADCs and DACs but their digital interfaces are powered off the same 3v3 supply as the XMOS.
4) The only inductive loads anywhere are the coils for a few small signal relays that are powered off 5V, but I have seen the issue several times with that board disconnected entirely.
5) The board that worked for long periods erroneously had incorrect pull ups on X2D04, 5, 6, 7 and X2D67, 68, 69 and 70, which pulled them all to 3v3 instead of the appropriate combination of ground and 3v3. Within a day or so of fixing those to the correct values the board died between two JTAG runs. I have never seen (in 6 boards or so) a failure while the code is running. I have had many failures without cycling the power rails (just between IP loads/runs).
6) Chips have come from several batches so a silicon failure seems unlikely.
7) If I take a virgin board and don't load IP I won't be able to tell whether the failures have occurred, because the failures show up as failed USB enumeration and unusual GPIO behaviour. I can always load and run code on a failed unit; it will just not exchange data correctly between running cores, or the IO acts strangely.
8) I'll be trying out the various simpler code setups later today.
9) 24MHz Crystal running per the multichannel dev kit. Don't believe there is an issue here as this all measures correctly and is effectively stock. Mode resistors are definitely correct.
10) Trying the power supply tests you describe today.
11) A simple blink LED piece of code would probably appear to work unless it is shuttling data across the links between tiles, because the failure mode would not affect that. It's usually just the P32 input GPIOs on tile 2 that show odd readings, and the only thing connected to that port is a bunch of pull ups, pull downs and open collector outputs from some comparators.
mon2
XCore Legend
Posts: 1913
Joined: Thu Jun 10, 2010 11:43 am

Post by mon2 »

What are the details of your JTAG (XTAG) programmer interface and wiring?

Are you on the same ground potential for your XMOS target board and the PC used for programming the target CPU via XTAG?

Is the XTAG tool known to be working otherwise with other kits?

Is the FLASH external to the CPU? If yes, then you are using QSPI mode enabled flash and respectively, X0D01, X0D04..X0D07, and X0D10 for booting?
Machdisk
Member++
Posts: 20
Joined: Wed Aug 01, 2018 11:17 am

Post by Machdisk »

XMOS xTAG 3 programmer. Definitely the same ground potential (all on the same ground plane). I have three identical xTAG programmers and they all behave the same.

I can program to the flash chip on the board and the unit will boot from the flash (Which is external QSPI yes) and then behave the same way it will if I program it using the xtag.
mon2
XCore Legend
Posts: 1913
Joined: Thu Jun 10, 2010 11:43 am

Post by mon2 »

When you reflash the IP, do you power cycle the power supply / your custom PCB? Or do you hot-plug the XTAG interface? I am still concerned about possible inrush current / transients from the +24v feed over the fairly long copper cabling.

Let us see the results of your PC power supply @ +12v input or even the +5v from the alternate power source (PC power supply).