100mbps MII stuck?

hasde
Member
Posts: 9
Joined: Thu Aug 11, 2016 3:52 pm

100mbps MII stuck?

Post by hasde »

Hello guys,
I am using the XS1 sliceKIT and two SK100 Ethernet slices at the Triangle and Circle slots. For Ethernet, I am using lib_ethernet 3.1.2 with the "lite MAC" running twice at 100 Mbps.

The rest of the application uses a lot of interfaces to support several I/Os on different slices (meaning Ethernet frames are also passed between tasks and tiles to some extent). Most of the tasks are either combinable or even distributable, depending on their resource usage.
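(As a sketch of what I mean by combinable vs. distributable — placeholder names, not my real tasks:)

Code: Select all

#include <xs1.h>

interface heartbeat_if {
  void tick();
};

interface reg_if {
  void set(unsigned addr, unsigned value);
  unsigned get(unsigned addr);
};

// A combinable task: its body is a single while(1){select}, so the
// compiler may merge it with other combinable tasks onto one logical core.
[[combinable]]
void heartbeat(server interface heartbeat_if i) {
  while (1) {
    select {
      case i.tick():
        // react to the heartbeat
        break;
    }
  }
}

// A distributable task: every case is a plain interface transaction, so
// its calls can be run in-line on the client's core and it needs no
// logical core of its own.
[[distributable]]
void register_file(server interface reg_if i) {
  unsigned regs[8] = {0};
  while (1) {
    select {
      case i.set(unsigned addr, unsigned value):
        regs[addr & 7] = value;
        break;
      case i.get(unsigned addr) -> unsigned result:
        result = regs[addr & 7];
        break;
    }
  }
}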

When generating only small traffic from a PC (which the sliceKIT also answers), everything is OK. When I add traffic whose frames will be dropped (because there is no protocol handler for them), at some point the Ethernet MAC stops receiving frames from the MII. I tested this with debug_printf's: "i_mii.get_incoming_packet()" in mii.xc is no longer being called at some point.
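For reference, my receive path follows the usual lib_ethernet 3.x client pattern, roughly like this (sketch only; names are from the lib_ethernet headers as far as I recall, and filter registration via the config interface is omitted):

Code: Select all

#include "ethernet.h"

// Wait for the MAC's packet_ready() notification, then collect the
// frame with get_packet(). Frames with no matching handler are dropped.
void rx_handler(client ethernet_rx_if rx) {
  char buf[ETHERNET_MAX_PACKET_SIZE];
  ethernet_packet_info_t info;
  while (1) {
    select {
      case rx.packet_ready():
        rx.get_packet(info, buf, ETHERNET_MAX_PACKET_SIZE);
        // ... dispatch on EtherType here ...
        break;
    }
  }
}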

However, this behavior is port dependent: if I "spam" port 0, port 1 still works until I spam that one too. The rest of the application stays alive the whole time (heartbeat debug_printf's). It seems to make no difference whether I run it in the Debug or Release configuration with the debugger attached, or standalone from flash memory.

The MII assembly part is also still running, but apparently idle.

Has anybody experienced similar issues? I am not even close to a solution or workaround.

Thank you very much in advance,
Sebastian


hasde
Member
Posts: 9
Joined: Thu Aug 11, 2016 3:52 pm

Post by hasde »

PS: When both interfaces are active (or receive data), the ports tend to fail faster.

The PHY keeps receiving (the LEDs blink), and after some time the PC (Windows) tends to send only ARP broadcast frames, since the XMOS device can no longer reply.

It seems mii_lite_lld.S is still running, since a breakpoint at "mii_rxd_preamble:" is actually still being hit.
mon2
XCore Legend
Posts: 1913
Joined: Thu Jun 10, 2010 11:43 am

Post by mon2 »

Be sure that the printfs are not the bottleneck causing the issues you describe.

https://www.xcore.com/forum/viewtopic.php?f=44&t=2956
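If printing does turn out to be the problem, debug_printf can be routed over the xSCOPE link instead of JTAG by adding a config.xscope file next to the application; with "basic" ioMode the prints become far less intrusive. A minimal sketch:

Code: Select all

<!-- config.xscope in the application directory: route print I/O over
     the xSCOPE link instead of JTAG -->
<xSCOPEconfig ioMode="basic" enabled="true">
</xSCOPEconfig>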
hasde
Member
Posts: 9
Joined: Thu Aug 11, 2016 3:52 pm

Post by hasde »

Hello,
thanks for the hint. The same behavior occurs with all debug_printf's disabled (my release configuration). Also, in debug mode the printfs already go out via xSCOPE.

The heartbeats are output only every 5 or 10 seconds in debug mode.

Best regards
Sebastian
mon2
XCore Legend
Posts: 1913
Joined: Thu Jun 10, 2010 11:43 am

Post by mon2 »

What if you remove one of the Ethernet slice boards and focus on 'spamming' only the single remaining one?

Does your IP permit high-speed spamming of the single active Ethernet port?

What is the version of your sliceKIT? There is an advisory which may not be relevant, since both of your Ethernet ports are functional, but it is worth a quick review:

https://www.xmos.com/support/boards?pro ... nent=16243

Perhaps you are running out of bandwidth on the logical CPU cores when both Ethernet ports are heavily loaded? That is very much IP dependent.
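As a rough back-of-envelope: at 100 Mbps a stream of minimum-size frames (64 bytes plus 8 bytes preamble and 12 bytes inter-frame gap = 84 bytes on the wire, i.e. 672 bits) arrives at about 148,800 frames per second. On an XS1 tile at 500 MHz, a logical core gets at most 62.5 MIPS once eight cores are active, which works out to only ~420 instructions per frame per core — not much headroom for two ports plus the application.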

The xCORE-200 (XS2) devices feature a gigabit interface, so, in theory, your slower rates should also be supportable on the XS1 architecture.
hasde
Member
Posts: 9
Joined: Thu Aug 11, 2016 3:52 pm

Post by hasde »

I also thought of overloading the CPU, but assumed that would mainly cause packet loss. Maybe it is worth spending some money and upgrading the sliceKIT to xCORE-200. The current kit says "XP-SKC-L2 1V2" and has the XS1-L16-128.

Actually, I am using Circle for the traffic; the slice at Triangle is mostly unused so far, though it also responds to ARP and ICMP. However, physically removing that slice already seems to improve the behavior without fully solving the issue (does that point to CPU overload?).

First I will try reducing the application to a single port, and then we'll see.

Where can I actually see whether packets are still being stored in the MAC RX queue? I am a bit lost in the MII (assembly) code.
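One idea I had for watching this: a small counter task, bumped wherever a buffer is taken from or handed back to the MII layer (sketch only; the interface and call sites are mine, not from lib_ethernet):

Code: Select all

#include "debug_print.h"

interface buf_count_if {
  void taken();
  void released();
};

// Call taken() wherever the MAC fetches a buffer from the MII
// (get_incoming_packet) and released() wherever a buffer is handed
// back (release_packet). If in_flight only ever grows, RX buffers
// are leaking.
[[distributable]]
void buffer_counter(server interface buf_count_if i) {
  int in_flight = 0;
  while (1) {
    select {
      case i.taken():
        in_flight++;
        debug_printf("MII buffers in flight: %d\n", in_flight);
        break;
      case i.released():
        in_flight--;
        break;
    }
  }
}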

Best regards
Sebastian
mon2
XCore Legend
Posts: 1913
Joined: Thu Jun 10, 2010 11:43 am

Post by mon2 »

Review this webpage and the related software on using the dual Ethernet slice boards:

https://www.xmos.com/support/boards?product=17634


Perhaps the following code will help you debug the Ethernet traffic with Wireshark:

https://github.com/xcore/sw_ethernet_tap
hasde
Member
Posts: 9
Joined: Thu Aug 11, 2016 3:52 pm

Post by hasde »

Thanks for the hints. The xCORE-200 sliceKITs are ordered; until they arrive, I will try chaining two XS1 sliceKITs and spreading the workload across the tiles.
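(For the record, the plan is simply to pin tasks to tiles in main()'s par, along these lines — placeholder tasks, not my real application:)

Code: Select all

#include <platform.h>
#include <timer.h>

// Placeholder tasks, just to show the placement pattern.
void producer(chanend c) {
  while (1) {
    c <: 1;
    delay_milliseconds(500);
  }
}

void consumer(chanend c) {
  int x;
  while (1) {
    c :> x;
  }
}

// Each 'on tile[n]:' pins a task to that tile, so e.g. one MAC + MII
// pair can be kept away from the protocol handlers.
int main(void) {
  chan c;
  par {
    on tile[0]: producer(c);
    on tile[1]: consumer(c);
  }
  return 0;
}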

Best regards
Sebastian
hasde
Member
Posts: 9
Joined: Thu Aug 11, 2016 3:52 pm

Post by hasde »

OK, adding the second sliceKIT did not really help.

However, I found one interesting thing in mii_ethernet_mac.xc:

There is no check for (incoming_tcount == 0), so whenever a packet is not collected by the application / all Ethernet RX clients before the next packet arrives from the MII, that uncollected packet apparently never gets released back to the MII. Consequently, the MII runs out of free buffers over time.

In this case, console output can make things worse by introducing delays between the RX notification and the get_packet call.

When I catch this situation (and drop the new packet if the old one was not yet collected), the behavior improves. However, I cannot confirm yet that this solves the issue completely... more testing is under way.
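Roughly, the check I added looks like this (a sketch against the server loop in mii_ethernet_mac.xc; variable names such as incoming_tcount are from lib_ethernet 3.1.2 as I remember them, and the surrounding event/notification syntax is simplified):

Code: Select all

// Fragment of the MAC server's select loop, not standalone code.
case i_mii.incoming_packet():
  int * unsafe data;
  size_t nbytes;
  unsigned timestamp;
  {data, nbytes, timestamp} = i_mii.get_incoming_packet();
  if (data) {
    if (incoming_tcount != 0) {
      // The previous frame has not yet been collected by every
      // interested client: drop the new frame immediately so its
      // buffer is returned to the MII layer instead of leaking.
      i_mii.release_packet(data);
    } else {
      // ... original path: remember the frame, set incoming_tcount
      // to the number of interested clients and notify them ...
    }
  }
  break;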

Best regards
Sebastian

edit: the original problem usually occurred after seconds; now the application runs for minutes without any trouble... bug reported.
edit2: the app has now been running overnight with the "spam" traffic generator enabled, and it did not freeze any more. So I consider this issue solved.