Malloc stuck in loop?

Technical questions regarding the XTC tools and programming with XMOS.
monk_is_batman
Active Member
Posts: 38
Joined: Wed Jun 09, 2010 3:20 am
Location: Maine, USA


Post by monk_is_batman »

I seem to have run into a problem where malloc loops forever. At the same point in my program, malloc never returns. When I stop in the debugger, execution is inside _malloc, which is called by malloc. It stays inside _malloc and never returns, but it is not stuck on a single instruction; it seems to move through some sort of loop that I haven't completely identified.

This does not seem to be an issue with the lock, as it's not stuck on a single instruction. Has anyone encountered anything like this before? Any ideas on what might cause this?


richard
Respected Member
Posts: 318
Joined: Tue Dec 15, 2009 12:46 am

Post by richard »

This obviously shouldn't happen. Is the problem easily reproducible? If it is, it would be great if you could post the steps required to recreate what you are seeing; otherwise I have some questions for you.

Are you able to disassemble the code near the pc where it is getting stuck? There is a loop in _malloc that iterates over a linked list of free blocks. One possible explanation for the function never returning would be a cycle in this linked list (e.g. due to memory corruption or a library bug).
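
As an illustration of the cycle hypothesis, here is a minimal sketch of how a free list could be checked for a cycle using Floyd's two-pointer walk. The `free_block` struct and the function name `free_list_has_cycle` are hypothetical and chosen for this example; the real layout of the library's heap metadata is an assumption and may differ.

```c
#include <stddef.h>

/* Hypothetical free-block header: a size followed by a next pointer.
   This layout is an assumption, not the library's documented format. */
struct free_block {
    size_t size;
    struct free_block *next;
};

/* Returns 1 if the list starting at head contains a cycle, 0 otherwise.
   Floyd's algorithm: slow advances one node per step, fast advances two.
   If the list is cyclic the two pointers eventually meet. */
int free_list_has_cycle(const struct free_block *head)
{
    const struct free_block *slow = head;
    const struct free_block *fast = head;

    while (fast != NULL && fast->next != NULL) {
        slow = slow->next;
        fast = fast->next->next;
        if (slow == fast)
            return 1;
    }
    return 0; /* reached the end of the list, so no cycle */
}
```

A first-fit search that only stops when it finds a large enough block or reaches a NULL next pointer would spin forever on a cyclic list if no block on the cycle is big enough, which would match a malloc that never returns without being stuck on one instruction.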

Is malloc() / free() called from multiple threads?

Post by monk_is_batman »

I have yet to reproduce this outside of the project I'm working in. I'll keep testing and see if I can find out an easy way to reproduce it in a smaller project. It sounds like it might be related to memory corruption like you mentioned. I am calling malloc/free from multiple threads (could this cause the issue?), as well as doing some out of the ordinary threading/memory tasks.

I've managed to grab the following out of the XDE disassembly window. I believe I have caught it at various places between the arrows.

Code:

0x00014bee <__malloc+30>:  add  (2rus)      r11, r1, 0x0
0x00014bf0 <__malloc+32>:  ldw  (2rus)      r3, r11[0x0] 
0x00014bf2 <__malloc+34>:  bf   (ru6)        r3, 0x25
0x00014bf4 <__malloc+36>:  ldw  (2rus)      r4, r3[0x0]
0x00014bf6 <__malloc+38>:  lsu  (3r)        r1, r4, r0  <-----------
0x00014bf8 <__malloc+40>:  bt   (ru6)        r1, 0xe
0x00014bfa <__malloc+42>:  add  (2rus)      r1, r0, 0x8
0x00014bfc <__malloc+44>:  lsu  (3r)        r1, r4, r1
0x00014bfe <__malloc+46>:  bf   (ru6)        r1, 0x3
0x00014c00 <__malloc+48>:  ldw  (2rus)      r0, r3[0x1]
0x00014c02 <__malloc+50>:  stw  (2rus)      r0, r11[0x0]
0x00014c04 <__malloc+52>:  bu   (u6)         0x28
0x00014c06 <__malloc+54>:  add  (3r)        r1, r3, r0
0x00014c08 <__malloc+56>:  ldw  (2rus)      r2, r3[0x1]
0x00014c0a <__malloc+58>:  stw  (2rus)      r1, r11[0x0]
0x00014c0c <__malloc+60>:  sub  (3r)        r4, r4, r0
0x00014c0e <__malloc+62>:  stw  (2rus)      r4, r1[0x0]
0x00014c10 <__malloc+64>:  ldw  (2rus)      r1, r11[0x0]
0x00014c12 <__malloc+66>:  stw  (2rus)      r2, r1[0x1]
0x00014c14 <__malloc+68>:  bu   (u6)         0x1f
0x00014c16 <__malloc+70>:  ldw  (2rus)      r5, r3[0x1]
0x00014c18 <__malloc+72>:  add  (2rus)      r1, r3, 0x4    <----------------
0x00014c1a <__malloc+74>:  bt   (ru6)        r5, -0x17
0x00014c1c <__malloc+76>:  add  (3r)        r5, r3, r4
0x00014c1e <__malloc+78>:  eq   (3r)         r5, r5, r2
0x00014c20 <__malloc+80>:  bf   (ru6)        r5, -0x1a
0x00014c22 <__malloc+82>:  sub  (3r)        r1, r0, r4
0x00014c24 <__malloc+84>:  ldw  (lru6)      r4, dp[0x9]
0x00014c28 <__malloc+88>:  ldaw (l2rus)    r4, r4[-0x8]
0x00014c2c <__malloc+92>:  sub  (3r)        r2, r4, r2
0x00014c2e <__malloc+94>:  lsu  (3r)        r1, r2, r1
0x00014c30 <__malloc+96>:  bt   (ru6)        r1, 0x16
0x00014c32 <__malloc+98>:  ldc  (ru6)       r1, 0x0

Not sure if this will be any help, but here is the entire function dumped from the .xe file for the project.

Code:

<__malloc>:
           0x00011f46: 42 77:       entsp (u6)      0x2
           0x00011f48: 01 55:       stw (ru6)       r4, sp[0x1]
           0x00011f4a: 40 55:       stw (ru6)       r5, sp[0x0]
           0x00011f4c: d6 a6:       mkmsk (rus)     r1, 0x2
           0x00011f4e: 14 c8:       lsu (3r)        r1, r1, r0
           0x00011f50: 01 f0 42 78: bf (lru6)       r1, 0x42 <.bt247>
           0x00011f54: c4 96:       neg (2r)        r1, r0
           0x00011f56: d6 46:       zext (rus)      r1, 0x2
           0x00011f58: 01 10:       add (3r)        r0, r0, r1
           0x00011f5a: 40 92:       add (2rus)      r0, r0, 0x4
.bt256     0x00011f5c: 03 f0 46 60: ldaw (lru6)     r1, dp[0xc6]
           0x00011f60: 00 f0 85 58: ldw (lru6)      r2, dp[0x5]
.bt253     0x00011f64: b4 90:       add (2rus)      r11, r1, 0x0
           0x00011f66: bc 09:       ldw (2rus)      r3, r11[0x0]
           0x00011f68: e5 78:       bf (ru6)        r3, 0x25 <.bt248>
           0x00011f6a: 4c 08:       ldw (2rus)      r4, r3[0x0]
           0x00011f6c: d0 c8:       lsu (3r)        r1, r4, r0
           0x00011f6e: 4e 70:       bt (ru6)        r1, 0xe <.bt249>
           0x00011f70: 90 94:       add (2rus)      r1, r0, 0x8
           0x00011f72: d1 c8:       lsu (3r)        r1, r4, r1
           0x00011f74: 43 78:       bf (ru6)        r1, 0x3 <.bt250>
           0x00011f76: 0d 08:       ldw (2rus)      r0, r3[0x1]
           0x00011f78: 8c 01:       stw (2rus)      r0, r11[0x0]
           0x00011f7a: 28 73:       bu (u6)         0x28 <.bt251>
.bt250     0x00011f7c: 1c 10:       add (3r)        r1, r3, r0
           0x00011f7e: 2d 08:       ldw (2rus)      r2, r3[0x1]
           0x00011f80: 9c 01:       stw (2rus)      r1, r11[0x0]
           0x00011f82: 00 19:       sub (3r)        r4, r4, r0
           0x00011f84: 44 00:       stw (2rus)      r4, r1[0x0]
           0x00011f86: 9c 09:       ldw (2rus)      r1, r11[0x0]
           0x00011f88: 25 00:       stw (2rus)      r2, r1[0x1]
           0x00011f8a: 1f 73:       bu (u6)         0x1f <.bt252>
.bt249     0x00011f8c: 5d 08:       ldw (2rus)      r5, r3[0x1]
           0x00011f8e: 5c 92:       add (2rus)      r1, r3, 0x4
           0x00011f90: 57 75:       bt (ru6)        r5, -0x17 <.bt253>
           0x00011f92: 9c 12:       add (3r)        r5, r3, r4
           0x00011f94: 16 31:       eq (3r)         r5, r5, r2
           0x00011f96: 5a 7d:       bf (ru6)        r5, -0x1a <.bt253>
           0x00011f98: 50 1a:       sub (3r)        r1, r0, r4
           0x00011f9a: 00 f0 04 59: ldw (lru6)      r4, dp[0x4]
           0x00011f9e: 80 fd ec a7: ldaw (l2rus)    r4, r4[-0x8]
           0x00011fa2: e2 18:       sub (3r)        r2, r4, r2
           0x00011fa4: 19 c8:       lsu (3r)        r1, r2, r1
           0x00011fa6: 56 70:       bt (ru6)        r1, 0x16 <.bt254>
           0x00011fa8: 40 68:       ldc (ru6)       r1, 0x0
           0x00011faa: 9c 01:       stw (2rus)      r1, r11[0x0]
           0x00011fac: 1c 10:       add (3r)        r1, r3, r0
           0x00011fae: 00 f0 45 50: stw (lru6)      r1, dp[0x5]
           0x00011fb2: 0b 73:       bu (u6)         0xb <.bt252>
.bt248     0x00011fb4: 00 f0 c4 58: ldw (lru6)      r3, dp[0x4]
           0x00011fb8: bc fc ec a7: ldaw (l2rus)    r3, r3[-0x8]
           0x00011fbc: 3e 18:       sub (3r)        r3, r3, r2
           0x00011fbe: 3c c8:       lsu (3r)        r3, r3, r0
           0x00011fc0: c9 70:       bt (ru6)        r3, 0x9 <.bt254>
           0x00011fc2: 38 10:       add (3r)        r3, r2, r0
           0x00011fc4: 00 f0 c5 50: stw (lru6)      r3, dp[0x5]
           0x00011fc8: 38 90:       add (2rus)      r3, r2, 0x0
.bt252     0x00011fca: 0c 00:       stw (2rus)      r0, r3[0x0]
.bt251     0x00011fcc: 4c 92:       add (2rus)      r0, r3, 0x4
.bt255     0x00011fce: 40 5d:       ldw (ru6)       r5, sp[0x0]
           0x00011fd0: 01 5d:       ldw (ru6)       r4, sp[0x1]
           0x00011fd2: c2 77:       retsp (u6)      0x2
.bt254     0x00011fd4: 00 68:       ldc (ru6)       r0, 0x0
           0x00011fd6: 05 77:       bu (u6)         -0x5 <.bt255>
.bt247     0x00011fd8: 08 68:       ldc (ru6)       r0, 0x8
           0x00011fda: 01 f0 01 77: bu (lu6)        -0x41 <.bt256>

Post by richard »

monk_is_batman wrote: I am calling malloc/free from multiple threads (could this cause the issue?)

The heap is protected by a lock, so they should be safe to call from multiple threads.

monk_is_batman wrote: I've managed to grab the following out of xde disassembly window, I have caught it in various places between the arrows I believe.

This pc range agrees with the theory that the problem is caused by a cycle in the linked list of free blocks.

I've attached an object file that contains updated versions of malloc / free / etc. that check some invariants at the start and end of each function. If you link with this object file it will override the default implementations of these functions in the standard libraries. In the XDE you need to add the full path to the object file to Mapper/Linker -> Miscellaneous -> Other options in the project properties.

Could you try linking your program with this file and running it again? If something goes wrong it will print the reason to stdout and exit the program.
Attachments
dbg-malloc.zip
(5.97 KiB) Downloaded 239 times

Post by monk_is_batman »

The object file you gave me worked great. When I ran it, it found a problem in an earlier call to free. It said something like "cycle detected in __Free__"; I'm not sure of the exact wording because my XDE is out of commission right now. When I get it back up and running I can post it for sure. The cause of the issue was code that freed the same memory twice (I don't know how I came up with it). It was as follows:

Code:

if (msg->content) free(msg->content);
free(msg->content);

When I called malloc after that, it would not return. I'm assuming the second free corrupts the linked list of free blocks in some way.
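
To illustrate how a second free of the same block can corrupt a free list, here is a toy sketch. It assumes the simplest possible design, where freeing pushes the block onto the head of a singly linked list with no check for duplicates. The names `toy_block`, `toy_free`, and `toy_walk` are hypothetical, and this is not the XMOS library's actual implementation.

```c
#include <stddef.h>

/* Toy free-list node; NOT the library's real data structure. */
struct toy_block {
    struct toy_block *next;
};

/* Toy "free": push the block onto the head of the free list without
   checking whether the block is already on the list. */
void toy_free(struct toy_block **free_list, struct toy_block *blk)
{
    blk->next = *free_list;
    *free_list = blk;
}

/* Walk the list for at most `limit` steps. Returns the number of nodes
   visited, or -1 if the end was not reached within the limit. */
int toy_walk(const struct toy_block *head, int limit)
{
    int steps = 0;
    while (head != NULL) {
        if (steps == limit)
            return -1;
        head = head->next;
        steps++;
    }
    return steps;
}
```

In this toy model, calling `toy_free` twice on the same block makes the block's `next` pointer refer to itself, so any unbounded walk of the list never terminates. A common guard in application code is to set the pointer to NULL right after freeing it (for example `msg->content = NULL;`), since the C standard defines `free(NULL)` as a no-op.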

I do not pretend to know the inner workings of malloc/free in the library, but is it possible to change the way it works so that free raises an exception when you try to free memory that cannot be freed?

Post by richard »

monk_is_batman wrote: I do not pretend to know the inner workings of malloc/free in the library, but is it possible to change the way it works so that free raises an exception when you try to free memory that cannot be freed?

I'm not sure how easy this is to do given the current information maintained about the heap, but I'll investigate this, as it would make it easier to diagnose misuses of malloc/free.
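
Until the library offers something like this, one possible application-level workaround is a thin wrapper that tracks live allocations and refuses to free a pointer it does not know about. The sketch below is only an illustration: `checked_malloc`, `checked_free`, and `MAX_LIVE` are hypothetical names, the fixed-size table is an arbitrary choice, and the code is not thread-safe as written, so calls from multiple threads would need a lock around the table.

```c
#include <stddef.h>
#include <stdlib.h>

#define MAX_LIVE 256 /* arbitrary capacity chosen for this sketch */

static void *live[MAX_LIVE]; /* pointers handed out and not yet freed */

/* Allocate and record the pointer. Returns NULL if malloc fails or the
   tracking table is full. */
void *checked_malloc(size_t size)
{
    void *p = malloc(size);
    if (p == NULL)
        return NULL;
    for (size_t i = 0; i < MAX_LIVE; i++) {
        if (live[i] == NULL) {
            live[i] = p;
            return p;
        }
    }
    free(p); /* no room to track it, so do not hand it out */
    return NULL;
}

/* Free p only if it is a tracked live pointer. Returns 0 on success or
   when p is NULL, and -1 if p was never allocated here or was already
   freed. */
int checked_free(void *p)
{
    if (p == NULL)
        return 0;
    for (size_t i = 0; i < MAX_LIVE; i++) {
        if (live[i] == p) {
            live[i] = NULL;
            free(p);
            return 0;
        }
    }
    return -1; /* caller can report the error or trap here */
}
```

With this wrapper, the double free from earlier in the thread would show up as a -1 return from the second `checked_free` call instead of silently corrupting the heap. The linear scan costs time on every call, so this is better suited to debug builds.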

Post by monk_is_batman »

Great, I appreciate you looking into this. Thanks for all of your help.