This is the mail archive of the
mailing list for the eCos project.
Re: RFC, fix for bogus timeouts in select()
- From: Andrew Lunn <andrew at lunn dot ch>
- To: Øyvind Harboe <oyvind dot harboe at zylin dot com>
- Cc: ecos-patches at sources dot redhat dot com
- Date: Thu, 21 Oct 2004 20:38:09 +0200
- Subject: Re: RFC, fix for bogus timeouts in select()
- References: <1098362231.21934.21.camel@famine>
> Index: current/ChangeLog
> RCS file: /cvs/ecos/ecos/packages/io/fileio/current/ChangeLog,v
> retrieving revision 1.46
> diff -u -w -r1.46 ChangeLog
> --- current/ChangeLog 4 Oct 2004 11:50:06 -0000 1.46
> +++ current/ChangeLog 21 Oct 2004 12:32:21 -0000
> @@ -1,3 +1,15 @@
> +2004-10-21 Oyvind Harboe <email@example.com>
> + * src/select.cxx: Fix problem with bogus timeouts in select().
> + The problem is that a thread can receive data while it is currently
> + starved for CPU. It can then wake up with data arrived and timeout
> + expired. The fix is to check for data after timeout has expired. One
> + can of course claim that select() is "doing the right thing", but
> + it is a royal pain for developers to track down this sort of thing
> + so removing this API tripwire seems worthwhile. E.g. serial drivers
> + can spend a lot of time in DSRs copying lots of traffic. Not easily
> + dealt with at an application level.
Although the current implementation is probably not optimal, I don't
see any tripwire in the API. A task/process can get
descheduled/rescheduled at any time. Think about this on a Unix
system. The select() system call exits on a timeout, and you are back
in the libc select() function when you get time-sliced. While some
other process is running, the Ethernet device interrupt goes off and
the stack puts new data into the socket, ready for userspace to
read sometime in the future. Your process then gets the CPU back and
the libc select() function returns to your application. select() tells
you it has timed out, but there is in fact data to be read on the
socket. In practice this makes little difference. The next time around
the loop, select() will exit immediately, telling you there is data on
the socket.
Any application that assumes that select returning a timeout means
there is no data on the socket is broken.
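To make the point concrete, here is a rough sketch of an application loop that treats a select() timeout only as "time to do idle work", never as proof that the descriptor is empty. It assumes a plain POSIX environment; wait_readable() and drain_with_timeouts() are hypothetical names made up for this illustration and are not part of eCos or the patch under discussion.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: wait up to timeout_ms for fd to become readable.
 * Returns 1 if readable, 0 on timeout, -1 on error (EINTR included). */
int wait_readable(int fd, long timeout_ms)
{
    fd_set rfds;
    struct timeval tv;
    int n;

    assert(fd >= 0 && fd < FD_SETSIZE);
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    n = select(fd + 1, &rfds, NULL, NULL, &tv);
    if (n < 0)
        return -1;
    return (n > 0 && FD_ISSET(fd, &rfds)) ? 1 : 0;
}

/* Hypothetical loop: run at most max_rounds iterations. A timeout is only
 * counted (a real application would do its periodic work here); data that
 * arrived around the time of the timeout is picked up by the next select()
 * call, which returns immediately. Returns bytes read, or -1 on error. */
long drain_with_timeouts(int fd, int max_rounds, long timeout_ms,
                         int *timeouts)
{
    char buf[256];
    long total = 0;
    int i;

    for (i = 0; i < max_rounds; i++) {
        int r = wait_readable(fd, timeout_ms);
        ssize_t got;

        if (r < 0)
            return -1;
        if (r == 0) {
            (*timeouts)++;      /* idle work only; no "no data" assumption */
            continue;
        }
        got = read(fd, buf, sizeof buf);
        if (got < 0)
            return -1;
        if (got == 0)
            break;              /* end of file: writer closed */
        total += got;
    }
    return total;
}
```

The race can be imitated with a pipe: call wait_readable() on the empty read end so that it times out, write a few bytes to the write end, then call it again. The second call reports the descriptor readable straight away, which is why a late-arriving byte costs at most one extra trip around the loop.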
I will take a closer look at the patch though.