FreeRTOS task scheduling exception debugging

grumpi · Post by **grumpi** » Mon Apr 14, 2025 8:52 am

Hello everyone,

I'm currently stuck with a bug that occurs intermittently in our XMOS firmware. I'm using xgdb and encountering the following exception, but I'm having trouble identifying the specific task causing it.

```
Thread 1.1 hit Catchpoint -1 (XCore Exception ET_LOAD_STORE), 0x0008470a in prvSelectHighestPriorityTask (xCoreID=0)
at .../modules/rtos/modules/FreeRTOS/FreeRTOS-SMP-Kernel/tasks.c:894
894 if( pxTCB->xTaskRunState == taskTASK_NOT_RUNNING )
(gdb) bt
#0 0x000800c4 in _DoException ()
#1 0x0008470a in prvSelectHighestPriorityTask (xCoreID=0) at .../modules/rtos/modules/FreeRTOS/FreeRTOS-SMP-Kernel/tasks.c:894
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
```
Does anyone have suggestions on how to retrieve more information about the cause of this exception, or how to identify which task is triggering it?
Unfortunately, since debugging is only possible via JTAG on this device, I assume rtos_printf won't be usable in this context, right?

Thanks a lot!

grumpi · Post by **grumpi** » Mon Apr 14, 2025 12:01 pm

A quick update: it seems that the whole internal RAM got reset to zero. Can this be caused by the debugging itself, or is it a consequence of the exception? This makes debugging even harder...
Any suggestions on how to proceed?
Thanks!

CiaranW · Post by **CiaranW** » Mon Apr 14, 2025 4:08 pm

Hi,

It seems that the exception is occurring while the RTOS is attempting to choose the next task to run. An 'info registers' command will give a bit more information about what exception occurred. Specifically the 'ed' register will show the address that was involved in the load_store exception.

Although, from what I can see in your logs above, it could be that pxTCB is NULL and that this is causing the exception (if 'ed' is zero, this would confirm the NULL dereference). It could be that part of the program is erroneously setting a range of memory to zero and causing this exception. Is the exception always hit in prvSelectHighestPriorityTask? Could it be that a task is being removed incorrectly, leaving a dangling reference to it within the scheduler?

The debugger should not erase all internal RAM, nor should the default exception handler, so this suggests some other problem. Could be a faulty part of the software (e.g. a memset with bad address), the watchdog resetting the system, an external reset or hardware issue, for example. If it's the software, then it should be possible to catch it using a watchpoint on some memory you aren't expecting to change.

As for logging: if you don't have `xscope` available, you're right that `rtos_printf` (or standard `printf`) will halt the core while printing, which breaks real-time operation. One workaround is to log to an in-memory circular buffer — then, when the program crashes or hits a breakpoint, you can inspect the buffer from the debugger (assuming memory hasn't been wiped at that point).
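As a rough illustration of the circular-buffer idea, here is a minimal sketch in C. All names (`dbg_log`, `dbg_log_put`, `DBG_LOG_ENTRIES`) are hypothetical and not part of FreeRTOS or the XMOS SDK, and the sketch assumes a single writer at a time; on an SMP build the call would need to be wrapped in a critical section or use an atomic increment.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names throughout; nothing here comes from FreeRTOS or the XMOS SDK. */
#define DBG_LOG_ENTRIES 64u /* number of slots, must be a power of two */

static_assert((DBG_LOG_ENTRIES & (DBG_LOG_ENTRIES - 1u)) == 0u,
              "DBG_LOG_ENTRIES must be a power of two");

typedef struct {
    uint32_t seq;      /* sequence number, shows ordering and wrap-around */
    uint32_t event_id; /* application-defined event code */
    uint32_t arg;      /* one word of context, e.g. a TCB address */
} dbg_log_entry_t;

/* Globals so they are easy to find from the debugger. */
volatile dbg_log_entry_t dbg_log[DBG_LOG_ENTRIES];
volatile uint32_t dbg_log_next; /* total number of entries ever written */

/* Record one event, overwriting the oldest slot once the buffer is full.
 * ASSUMPTION: only one context writes at a time. */
void dbg_log_put(uint32_t event_id, uint32_t arg)
{
    uint32_t seq = dbg_log_next;
    volatile dbg_log_entry_t *e = &dbg_log[seq & (DBG_LOG_ENTRIES - 1u)];

    e->seq = seq;
    e->event_id = event_id;
    e->arg = arg;
    dbg_log_next = seq + 1u;
}

/* Number of valid entries currently held in the buffer. */
uint32_t dbg_log_count(void)
{
    uint32_t n = dbg_log_next;
    return (n < DBG_LOG_ENTRIES) ? n : DBG_LOG_ENTRIES;
}
```

After a crash or breakpoint, `p dbg_log_next` and `p dbg_log` in the debugger would show how many events were recorded and what the last ones were; the newest entry sits at index `(dbg_log_next - 1) % DBG_LOG_ENTRIES`.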

Hope this helps,
Ciaran

grumpi · Post by **grumpi** » Thu Apr 17, 2025 9:54 am

Hey Ciaran,
thanks a lot for your reply and the helpful debugging tips!

It looks like I'm dealing with two separate issues:

1. The first is related to corruption in the xSuspendedTaskList. E.g. it contains an item whose pxContainer points to a pxReadyTasksList instead of xSuspendedTaskList. I assume something goes wrong while moving the item between lists — possibly during a task state transition.

2. The second issue is making debugging extremely difficult. After an unpredictable amount of time, the debugger starts returning the same fixed value for every memory address. Previously this value was 0, which led me to believe the memory was being wiped, but now it's a different constant. Interestingly, reading registers still works fine. Are those copied once to xgdb or read via JTAG each time I call info register? I am wondering if the JTAG connection could be the issue here.
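To narrow down when the list corruption in the first issue appears, one option is a consistency check that walks a list and verifies that every item's container pointer refers back to that list. The sketch below uses simplified stand-in types (`mini_list_t`, `mini_item_t`), not the real FreeRTOS `List_t`/`ListItem_t`; it only illustrates the check, and the field mapping noted in the comments is an assumption about how it would translate.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, hypothetical stand-ins for FreeRTOS list types. */
struct mini_list;

typedef struct mini_item {
    struct mini_item *next;
    struct mini_list *container; /* plays the role of pxContainer */
} mini_item_t;

typedef struct mini_list {
    size_t count;    /* plays the role of uxNumberOfItems */
    mini_item_t end; /* sentinel, plays the role of xListEnd */
} mini_list_t;

typedef enum {
    LIST_OK = 0,
    LIST_BAD_CONTAINER, /* an item points at a different list */
    LIST_BAD_LENGTH,    /* item count does not match the traversal */
    LIST_BROKEN_LINK    /* NULL next pointer found */
} list_check_t;

void mini_list_init(mini_list_t *l)
{
    l->count = 0;
    l->end.next = &l->end;
    l->end.container = NULL;
}

/* Insert at the head; enough for demonstrating the check. */
void mini_list_push(mini_list_t *l, mini_item_t *it)
{
    it->next = l->end.next;
    it->container = l;
    l->end.next = it;
    l->count++;
}

/* Walk the list; on failure *bad is set to the offending item, if any. */
list_check_t mini_list_check(const mini_list_t *list, const mini_item_t **bad)
{
    assert(list != NULL && bad != NULL);

    const mini_item_t *it = list->end.next;
    size_t seen = 0;
    *bad = NULL;

    while (it != &list->end) {
        if (it == NULL) { return LIST_BROKEN_LINK; }
        /* Also bounds the loop if the links form an unexpected cycle. */
        if (seen == list->count) { *bad = it; return LIST_BAD_LENGTH; }
        if (it->container != list) { *bad = it; return LIST_BAD_CONTAINER; }
        seen++;
        it = it->next;
    }
    return (seen == list->count) ? LIST_OK : LIST_BAD_LENGTH;
}
```

Calling an equivalent check from a few suspect code paths (with the scheduler's lists protected appropriately) could help bracket the point at which an item ends up in the wrong list.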

Have you ever seen something like this?

Thanks again,
Mischa

CiaranW · Post by **CiaranW** » Thu Apr 17, 2025 12:26 pm

Hi Mischa,

1. I'm not very familiar with FreeRTOS, but hopefully with the debugger issue resolved this can be tracked down.

2. Regarding registers - yes, they get cached inside GDB. You can clear the cache by issuing the 'maintenance flush register-cache' command in gdb, which will force them to be read from the target again.

Regarding the memory issue - this is symptomatic of the system getting reset behind the debugger's back, so xgdb is trying to communicate with the target when the target is in the wrong state. The usual cause of this is the use of the Watchdog timer resetting the system - because the watchdog isn't stopped when the debugger halts the program. This will result in the same data being returned over & over. So this would be my expectation - a good step would be to disable the watchdog, and ensure that the system is stable.

It's possible that the JTAG connection is defective, but generally this is pretty stable - so either it works or it doesn't, so it wouldn't be my first guess.

If this doesn't lead anywhere, you can send me the gdb log and I can take a look. This can contain detailed information about your program, so you may want to private message it to me (upload it to pastebin or some other external service). To create the log, modify your connect/attach command to add the log-level and log-file options:

connect --log-level=trace,xdbg::usb=warn --log-file=my_log.txt

And the log will be created in 'my_log.txt'.

Thanks,
Ciaran
