This is the mail archive of the
ecos-discuss@sourceware.org
mailing list for the eCos project.
RE: Re: Network TCP Handler: stale socket disposal
- From: John Mills <johnmills at speakeasy dot net>
- To: eCos Users <ecos-discuss at ecos dot sourceware dot org>
- Cc: Alok Singh <alok dot singh at broadcom dot com>
- Date: Fri, 28 Sep 2007 08:31:03 -0500 (EST)
- Subject: RE: [ECOS] Re: Network TCP Handler: stale socket disposal
- Reply-to: John Mills <john dot m dot mills at alum dot mit dot edu>
Alok -
Thanks for your feedback. That makes the success rate 50:50 (2 of 4
respondents) for the patch.
The web server in our product is a somewhat secondary administrative
function and we serve simple content that we control 100%. That allows me
to summarily close many browser inquiries. I have added and fixed the POST
code so it handles our binary firmware images. Either of these changes may
have closed some vulnerabilities that may be affecting you - I don't know.
Operationally our problem was triggered by vulnerability scanners used by
our customers' SysAdmins. These locked up our product in the course of
their overall test scenarios. In principle that meant we were also
vulnerable to the corresponding hacker exploits. That's what I meant in an
earlier post about a specific, observed functional problem.
The lock-up is broader than just web service. When the socket-descriptor
pool ('zone') is depleted, no new net sockets can be allocated. This
affected other, primary functions in our product, making it a critical
issue.
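As a rough cross-check of that depletion, the zone counters printed by
cyg_kmem_print_stats() (see the quoted dump below) can be turned into an
"items still held" figure. The sketch below is only an illustration: the
struct and function names are hypothetical, and it ASSUMES that 'Allocs'
counts every allocation attempt, including the ones reported under
'Fails'. The quoted figures are consistent with that reading, but the
stack sources have not been checked here.

```c
#include <stdio.h>

/* Counters as printed for one VM zone by cyg_kmem_print_stats().
 * The struct itself is hypothetical; only the field meanings come
 * from the printed dump. */
struct zone_stats {
    long total;       /* "Total"  */
    long free_items;  /* "Free"   */
    long allocs;      /* "Allocs" */
    long frees;       /* "Frees"  */
    long fails;       /* "Fails"  */
};

/* Items currently held by callers.
 * ASSUMPTION: 'allocs' includes failed attempts, so successful
 * allocations are (allocs - fails). */
long zone_in_use(const struct zone_stats *z)
{
    return (z->allocs - z->fails) - z->frees;
}

/* The counters agree with each other if held + free == total. */
int zone_consistent(const struct zone_stats *z)
{
    return zone_in_use(z) + z->free_items == z->total;
}

/* Print a one-line summary for a zone. */
void zone_report(const char *name, const struct zone_stats *z)
{
    printf("%s: %ld of %ld in use%s\n", name, zone_in_use(z), z->total,
           z->free_items == 0 ? " (zone exhausted)" : "");
}
```

Fed the 'socket' line from the dump (Total 32, Free 0, Allocs 10319,
Frees 3971, Fails 6316), zone_in_use() gives 32: every descriptor is
held and none has come back, which matches the lock-up described above.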
I traced the problem by putting 'diag_printf' lines at points where
sockets were created and deallocated, working down to find "what didn't
happen" when a socket was lost. Sounds like you have the same road ahead
of you.
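The tracing approach can be sketched roughly as follows. This is not the
code used in our product: the function names, the fixed table size and
the call sites are hypothetical, and printf stands in for diag_printf so
the sketch builds on a host. On eCos the TRACE macro would map to
diag_printf from <cyg/infra/diag.h>.

```c
#include <stdio.h>

/* Hypothetical trace hook: on eCos this would be diag_printf. */
#define TRACE printf

/* Assumed table size; it matches the 32 sockets configured in the
 * quoted report but is otherwise arbitrary. */
#define MAX_TRACKED 32

static int live[MAX_TRACKED];  /* 1 while the given id is allocated */
static int live_count;         /* number of ids currently allocated */

/* Call at each point where a socket is created. */
void trace_sock_alloc(int id, const char *where)
{
    if (id >= 0 && id < MAX_TRACKED && !live[id]) {
        live[id] = 1;
        live_count++;
    }
    TRACE("sock+ id=%d at %s (live=%d)\n", id, where, live_count);
}

/* Call at each point where a socket is deallocated. */
void trace_sock_free(int id, const char *where)
{
    if (id >= 0 && id < MAX_TRACKED && live[id]) {
        live[id] = 0;
        live_count--;
    }
    TRACE("sock- id=%d at %s (live=%d)\n", id, where, live_count);
}

/* Number of sockets created but not yet released. */
int trace_sock_live(void)
{
    return live_count;
}

/* List ids that were created and never released: the "what didn't
 * happen" cases. */
void trace_sock_dump(void)
{
    int i;
    for (i = 0; i < MAX_TRACKED; i++) {
        if (live[i]) {
            TRACE("still held: id=%d\n", i);
        }
    }
}
```

Calling trace_sock_dump() once the test load has stopped lists the
sockets whose release never ran, which narrows down which close path to
read next.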
Thanks again for your response.
- John Mills
On Fri, 28 Sep 2007, Alok Singh wrote:
> Hi John/everybody,
> The patch didn't work for me. I still had all the sockets exhausted, so the web server hangs and doesn't accept any new connections. The number of sockets I'm creating while configuring eCos is 32. Please see the dump of "cyg_kmem_print_stats()" below, taken when the problem occurs.
>
> Test case: I have a script that opens and closes a connection to the web server every second. It takes around 2 hours to exhaust the 'socket' zone. The free count in the 'tcpcb' zone also comes down. Even if I stop the script, the sockets never recover. I'm currently debugging it (trying to understand TCP by reading Comer and Stevens).
>
> Any ideas are welcome.
>
> ***************
> cyg_kmem_print_stats() -
> Network stack mbuf stats:
> mbufs 32, clusters 6, free clusters 6
> Failed to get 0 times
> Waited to get 0 times
> Drained queues to get 0 times
> VM zone 'ripcb':
> Total: 32, Free: 32, Allocs: 0, Frees: 0, Fails: 0
> VM zone 'tcpcb':
> Total: 32, Free: 1, Allocs: 3989, Frees: 3958, Fails: 0
> VM zone 'udpcb':
> Total: 32, Free: 31, Allocs: 14, Frees: 13, Fails: 0
> VM zone 'socket':
> Total: 32, Free: 0, Allocs: 10319, Frees: 3971, Fails: 6316
> Misc mpool: total 131056, free 79008, max free block 77972
> Mbufs pool: total 130944, free 128768, blocksize 128
> Clust pool: total 262144, free 247808, blocksize 2048
> ***********************************************************
>
> -Alok