This is the mail archive of the ecos-discuss@sourceware.org mailing list for the eCos project.



"packets eaten" with AT91 EMAC Ethernet driver


Hello,

Since I fixed the bugs in the AT91 EMAC driver
(RX: reset ‘bytes_in_list’, the position in the current sg_list; TX: on a TXERR IRQ, reset the software pointer and set all used bits to 0 instead of 1),
I keep having the same problem: after communicating over Ethernet with the AT91 EMAC for a while, packets get “eaten”.
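
To illustrate the RX fix mentioned above, here is a minimal sketch of copying one received frame into a scatter-gather list. It is not the code from if_at91.c: the `sg_entry` struct and the `copy_frame_to_sg()` function are hypothetical, and only the name `bytes_in_list` is taken from the description. The point is that the write position inside the current element has to go back to 0 whenever the copy moves on to the next element.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical scatter-gather element (assumption, not the eCos type). */
struct sg_entry {
    unsigned char *buf;  /* destination buffer */
    size_t len;          /* capacity of this buffer in bytes */
};

/* Copy 'src_len' bytes of one received frame into the sg list.
 * Returns the number of bytes actually copied. */
size_t copy_frame_to_sg(const unsigned char *src, size_t src_len,
                        struct sg_entry *sg, size_t sg_count)
{
    size_t copied = 0;
    size_t bytes_in_list = 0;  /* write position inside sg[i] */
    size_t i = 0;

    assert(src != NULL && sg != NULL);

    while (copied < src_len && i < sg_count) {
        size_t room = sg[i].len - bytes_in_list;
        size_t chunk = src_len - copied;
        if (chunk > room)
            chunk = room;
        memcpy(sg[i].buf + bytes_in_list, src + copied, chunk);
        copied += chunk;
        bytes_in_list += chunk;
        if (bytes_in_list == sg[i].len) {
            i++;
            bytes_in_list = 0;  /* the reset: a new element starts at offset 0 */
        }
    }
    return copied;
}
```

Without the reset, the second element would be written starting at a stale offset, which corrupts the data or overruns the buffer.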


TX packets get stuck, and they need an incoming RX packet to get out.


It is not a problem with the pointers in the EMAC: there is only one TX buffer, so nothing can get out of sync there. I also checked the RX counters/pointers twice, and they are OK.



See this ping:


64 bytes from 10.0.54.249: icmp_seq=0 ttl=64 time=22 ms
64 bytes from 10.0.54.249: icmp_seq=1 ttl=64 time=9 ms
64 bytes from 10.0.54.249: icmp_seq=2 ttl=64 time=9 ms
64 bytes from 10.0.54.249: icmp_seq=3 ttl=64 time=9 ms
64 bytes from 10.0.54.249: icmp_seq=4 ttl=64 time=9 ms
64 bytes from 10.0.54.249: icmp_seq=5 ttl=64 time=9 ms
64 bytes from 10.0.54.249: icmp_seq=6 ttl=64 time=22 ms
64 bytes from 10.0.54.249: icmp_seq=7 ttl=64 time=9 ms
64 bytes from 10.0.54.249: icmp_seq=8 ttl=64 time=9 ms
64 bytes from 10.0.54.249: icmp_seq=9 ttl=64 time=9 ms
64 bytes from 10.0.54.249: icmp_seq=10 ttl=64 time=1024 ms
64 bytes from 10.0.54.249: icmp_seq=11 ttl=64 time=43 ms
64 bytes from 10.0.54.249: icmp_seq=12 ttl=64 time=2011 ms
64 bytes from 10.0.54.249: icmp_seq=13 ttl=64 time=2011 ms
64 bytes from 10.0.54.249: icmp_seq=14 ttl=64 time=2009 ms
64 bytes from 10.0.54.249: icmp_seq=15 ttl=64 time=1083 ms
64 bytes from 10.0.54.249: icmp_seq=16 ttl=64 time=233 ms
64 bytes from 10.0.54.249: icmp_seq=17 ttl=64 time=2029 ms
64 bytes from 10.0.54.249: icmp_seq=18 ttl=64 time=1050 ms
64 bytes from 10.0.54.249: icmp_seq=19 ttl=64 time=1009 ms
64 bytes from 10.0.54.249: icmp_seq=20 ttl=64 time=2098 ms
64 bytes from 10.0.54.249: icmp_seq=21 ttl=64 time=1116 ms
64 bytes from 10.0.54.249: icmp_seq=22 ttl=64 time=1009 ms
64 bytes from 10.0.54.249: icmp_seq=23 ttl=64 time=4025 ms
64 bytes from 10.0.54.249: icmp_seq=24 ttl=64 time=4009 ms
64 bytes from 10.0.54.249: icmp_seq=25 ttl=64 time=4009 ms
64 bytes from 10.0.54.249: icmp_seq=26 ttl=64 time=4009 ms
64 bytes from 10.0.54.249: icmp_seq=27 ttl=64 time=4009 ms
64 bytes from 10.0.54.249: icmp_seq=28 ttl=64 time=5009 ms
64 bytes from 10.0.54.249: icmp_seq=29 ttl=64 time=5009 ms

-> and 5 packets lost

When you stop the ping and start it again, it takes a while before replies arrive. Wireshark shows that the first packets sent to the EMAC are, of course, ARP requests (the ARP table was cleared). The first 5 ARP requests are each answered with a ping reply: those are the 5 “eaten” packets of the previous ping. The next ARP request gets an ARP reply, and then the ping replies start.

This problem is caused by a TX burst.
* To get the problem, I have to send other packets during the ping. Running a ping while using the httpd monitor in eCos is sufficient.


* I can trigger the problem very easily with my echo program:

I run the inetd echo service as a test program. Each UDP or TCP packet received on port 7 must be returned to the sender.

This works perfectly as long as I send non-fragmented IP packets.

When I start sending fragmented packets (larger than the Ethernet MTU), which results in a burst of packets, the echo reply no longer arrives and packets get “eaten”.
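
As a rough guide to how large such a burst is, the sketch below estimates how many IPv4 fragments a UDP datagram produces for a given MTU. It assumes a 20-byte IPv4 header without options and an 8-byte UDP header, and uses the rule that every fragment except the last carries a multiple of 8 bytes of data. The function name `udp_fragment_count()` is made up for this example.

```c
#include <assert.h>
#include <stddef.h>

#define IPV4_HEADER_LEN 20u  /* assumption: no IP options */
#define UDP_HEADER_LEN   8u

/* Number of IPv4 fragments needed for a UDP payload of the given size. */
size_t udp_fragment_count(size_t udp_payload_len, size_t mtu)
{
    size_t per_fragment;  /* IP payload bytes carried by each full fragment */
    size_t total;         /* UDP header plus payload */

    assert(mtu >= IPV4_HEADER_LEN + 8u);

    /* Fragment data is rounded down to a multiple of 8 bytes. */
    per_fragment = (mtu - IPV4_HEADER_LEN) & ~(size_t)7;
    total = udp_payload_len + UDP_HEADER_LEN;

    /* Round up: a partial last fragment still costs one packet. */
    return (total + per_fragment - 1) / per_fragment;
}
```

With an MTU of 1500, a payload of 1472 bytes still fits in one packet, while 1473 bytes already needs two, so the echo reply becomes a back-to-back burst of two TX packets.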
Wireshark shows what happens:

<- echo request fragment1
<- echo request final fragment2
-> echo reply fragment1
ERROR (fragment 2 does not come; then I start a ping)
<- ping request 1
-> echo reply final fragment2
<- ping request 2
-> ping reply 2


So the TCP/IP stack answers the echo request with an echo reply, which is a burst of 2 packets. The second packet of the burst then gets "eaten".
Does anyone have an idea why?
Maybe the 'can_send()' function blocks too long?
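
One way to reason about the 'can_send()' question is a toy model of a single TX buffer with a software queue in front of it. This is not eCos code and not a diagnosis: every name below is hypothetical. The only assumption is that the stack hands a packet to the driver only while the can_send gate reports room, and retries the queue when a TX-complete event is processed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_MAX 8

/* Toy model (assumption): one hardware TX buffer plus a software queue. */
struct tx_model {
    bool   tx_busy;  /* the single TX buffer is in flight */
    size_t queued;   /* packets waiting in the software queue */
    size_t on_wire;  /* packets handed to the hardware so far */
};

/* Plays the role of a driver's can_send(): how many packets fit right now. */
int model_can_send(const struct tx_model *m)
{
    return m->tx_busy ? 0 : 1;
}

/* Hand queued packets to the hardware while the gate is open. */
void model_kick(struct tx_model *m)
{
    while (m->queued > 0 && model_can_send(m) > 0) {
        m->queued--;
        m->tx_busy = true;
        m->on_wire++;
    }
}

/* The stack submits one packet. */
void model_submit(struct tx_model *m)
{
    assert(m->queued < QUEUE_MAX);
    m->queued++;
    model_kick(m);
}

/* A TX-complete event was processed: free the buffer, retry the queue. */
void model_tx_done(struct tx_model *m)
{
    m->tx_busy = false;
    model_kick(m);
}
```

In this model the second packet of a burst stays queued until `model_tx_done()` runs. If that step only happened as a side effect of some other event, such as handling an RX packet, the behaviour would look like the trace above.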


Attached is my current if_at91.c AT91 EMAC driver (I am waiting for answers from Atmel before cleaning it up; also, I currently write directly to SRAM, whereas I should do it via the linker with an sram section in the .ldi file).

Jürgen

Attachment: if_at91.c
Description: Binary data

-- 
Before posting, please read the FAQ: http://ecos.sourceware.org/fom/ecos
and search the list archive: http://ecos.sourceware.org/ml/ecos-discuss
