This is the mail archive of the ecos-discuss@sourceware.org mailing list for the eCos project.


Re: FreeBSD Netstack EPIPE error


Bell, Andrew [Allen & Heath UK] wrote:
Hi Gary,

Thanks for the reply.

One end of the connection is eCos the other is Java running on a Linux
JVM.

The connection is dropped by the Java side. After repeated packets are
not acknowledged by the eCos host, followed by the eCos netstack
transmitting an out-of-order segment, the Java protocol stack appears to
wind back to the last packet with a sequence number the two hosts agreed
on and resends it. After a couple more attempts the Java netstack gives
up and drops the connection.

Looking at the packet capture from eCos with cyg_io_eth_net_debug set,
there is a complete lack of xmit activity; I see the retransmits from
the Java host, but the eCos netstack fails to ack them. It is almost as
if the protocol stack has stalled.

From what I understand, the protocol stack is interrupt driven. Here's
the final twist: my main worker thread in eCos is very busy (writing to
flash) for a period of around 16 seconds, which is when I observe the
netstack xmit starvation.

I've made sure my main thread's priority is lower (higher in integer
terms) than the priority of the internal netstack delivery threads.

Is there any way a user thread can cause netstack starvation? BTW I'm
not locking out interrupts during this time.


Wow! 16 seconds writing to FLASH. This is quite possibly your problem. The V1 FLASH drivers will lock interrupts during write & erase operations (this happens in the drivers, regardless of what your code may do).

Is there any way to do shorter FLASH operations?
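
One way to picture "shorter FLASH operations" is to split a large write into small pieces, so that any interrupt lockout inside the driver lasts only for one piece at a time. The following is a minimal sketch, not eCos code: chunked_write(), flash_write_fn, pause_fn and ram_write() are hypothetical names invented for illustration, and the RAM-backed writer exists only so the loop can be tried on a host machine.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical callback types; these are NOT part of the eCos API. */
typedef int  (*flash_write_fn)(void *dst, const void *src, size_t len);
typedef void (*pause_fn)(void);

/*
 * Write len bytes from src to dst in pieces of at most chunk bytes.
 * between (may be NULL) is called after every piece except the last,
 * giving pending interrupts and other threads a chance to run.
 * Returns 0 on success, -1 on bad arguments, or the first non-zero
 * code returned by do_write.
 */
int chunked_write(flash_write_fn do_write, pause_fn between,
                  void *dst, const void *src, size_t len, size_t chunk)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    if (do_write == NULL || chunk == 0)
        return -1;

    while (len > 0) {
        size_t n = len < chunk ? len : chunk;
        int rc = do_write(d, s, n);   /* one short operation */
        if (rc != 0)
            return rc;
        d += n;
        s += n;
        len -= n;
        if (between != NULL && len > 0)
            between();
    }
    return 0;
}

/* RAM-backed stand-in for a flash writer, for host-side testing only. */
int ram_write_calls;

int ram_write(void *dst, const void *src, size_t len)
{
    memcpy(dst, src, len);
    ram_write_calls++;
    return 0;
}
```

On a real target, do_write would presumably be a thin wrapper around the flash driver's program call, and the chunk size would have to respect the device's programming granularity; both are assumptions to check against your flash package. Erase operations work on whole blocks, so they cannot be shortened this way.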

Bell, Andrew [Allen & Heath UK] wrote:
Hello All,

I'm having FreeBSD netstack issues with an eCos port for a Motorola
852T board based on an A&M Adder.

Our eCos application keeps dropping socket connections with an EPIPE
(broken pipe) after a period of high tx activity. The Ethereal capture
of the stream shows that, shortly after the burst of tx activity, the
eCos netstack stops sending acks to the front end, ignores retransmits
from the front end, then eventually emits an out-of-order segment for
which Ethereal calculates an RTT of 1158229289 seconds!

I've run the BSD tests, enabled stack checking and enabled assertions.
I've turned on MBUF warnings, enabled cyg_io_eth_net_debug and
increased CYGPKG_NET_USAGE to (1008 * 1024) + (MAXSOCK * 1024), none of
which has given any clues.

If anyone can point me in the right direction I'd be grateful.

AFAIK, EPIPE is only returned if the receiving end of a TCP connection breaks off while the Tx end is still trying to send.

Are both "ends" of your connections eCos applications?  On the same
or different machines?

Is this failure something that can be tested/demonstrated separately?
In other words, can you send a test case that duplicates the problem?

Finally, do you have any idea if it's hardware/platform specific?



--
------------------------------------------------------------
Gary Thomas                 |  Consulting for the
MLB Associates              |    Embedded world
------------------------------------------------------------

--
Before posting, please read the FAQ: http://ecos.sourceware.org/fom/ecos
and search the list archive: http://ecos.sourceware.org/ml/ecos-discuss

