This is the mail archive of the ecos-discuss@sources.redhat.com mailing list for the eCos project.


Re: Problem with IP Stack ...


On Mon, Sep 09, 2002 at 05:52:19PM +0200, Thomas BINDER wrote:
> Andrew Lunn wrote:
> > 
> > > There is one interesting thing, however, I discovered just this
> > > morning. I redirected the diag_printf messages to a different
> > > (debug) stream (not IP based) and ECOS does not crash any longer!
> > 
> > That's not really surprising. Using the stack to debug the stack is
> > never a good idea. Have gdb use the serial port, not the stack.
> 
> Well, I was not really debugging the stack. Isn't that a `regular'
> warning message that gets printed (and thereby somehow crashes the
> stack)? 

To send the message out, you need an mbuf and maybe a cluster. These
are just what you don't have at the time the problem happens. Maybe
the code does not correctly handle these conditions. They don't
'normally' happen that often and so may not have been tested. 
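To give an idea of what "handling it correctly" means, the output path
has to do something like the following BSD-mbuf-style sequence. This is
a generic sketch, not the actual eCos code; build_packet is a made-up
name. The two bail-out branches are exactly the paths that only run
once mbufs or clusters have run out:

    #include <sys/param.h>
    #include <sys/mbuf.h>
    #include <string.h>
    #include <errno.h>

    /* Sketch of building one outgoing packet with the BSD mbuf API. */
    static int build_packet(const char *buf, int len, struct mbuf **out)
    {
        struct mbuf *m;

        MGETHDR(m, M_DONTWAIT, MT_DATA);        /* need an mbuf header     */
        if (m == NULL)
            return ENOBUFS;                     /* out of mbufs            */

        if (len > MHLEN) {                      /* payload needs a cluster */
            MCLGET(m, M_DONTWAIT);
            if ((m->m_flags & M_EXT) == 0) {    /* no cluster attached     */
                m_freem(m);
                return ENOBUFS;                 /* out of clusters         */
            }
        }
        memcpy(mtod(m, char *), buf, len);
        m->m_len = m->m_pkthdr.len = len;
        *out = m;
        return 0;
    }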
  
> > That's not correct. The number of mbufs is not going to zero. You have
> > lots of mbufs free. The information it prints out for clusters is 0,
> > but that figure is bogus anyway. The stack gets clusters from the
> > pool. It does not return them to the pool, but puts them on a linked
> > list. When it needs another cluster, it first tries the linked list
> > and then tries the pool.
> 
> You're right. This figure is indeed bogus. What is still interesting
> is that the number of free clusters is not increasing above 27 (in
> my case). What happens to the rest? What is the Pool actually used
> for if memory is never returned to it (as opposed to the mbuf pool)?

You still need some pool of memory to hand out the 2 Kbyte chunks that
end up on the linked list in the first place. Using a pool is the
easiest way.
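In other words, something along these lines. All the names here are
made up for illustration; it is not the real stack code, just the
recycling scheme described above:

    #define CL_SIZE  2048           /* one cluster = 2 Kbytes            */
    #define CL_POOL  32             /* clusters available from the pool  */

    union cluster {
        union cluster *next;        /* link while on the free list       */
        unsigned char  data[CL_SIZE];
    };

    static union cluster  pool[CL_POOL];  /* the fixed pool of memory    */
    static int            pool_used = 0;  /* how many taken from pool    */
    static union cluster *free_list = 0;  /* recycled clusters           */

    static void *cluster_alloc(void)
    {
        union cluster *c;

        if (free_list != 0) {             /* 1st choice: recycled list   */
            c = free_list;
            free_list = c->next;
            return c;
        }
        if (pool_used < CL_POOL)          /* 2nd choice: carve from pool */
            return &pool[pool_used++];
        return 0;                         /* out of clusters             */
    }

    static void cluster_free(void *p)
    {
        union cluster *c = (union cluster *)p;

        /* Freed clusters go onto the list, never back to the pool, so
         * the pool's own free count only ever goes down.
         */
        c->next = free_list;
        free_list = c;
    }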

> > So i think you are running out of clusters, not mbufs. To prove this,
> > change io/eth/current/net/eth_drv.c:873. The error message is
> > wrong. MCLGET allocates a cluster, not an mbuf. Change this message so
> > you can tell it apart from the error at line 820.
> 
> You're right again. I've already checked it.

OK. Is there anything else you know but aren't telling us? :-)
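For anyone reading along without the source handy, those two allocation
sites have roughly this shape. The function name and message strings
below are invented; the only point is that the second failure should
say "cluster" so the two can be told apart:

    #include <sys/param.h>
    #include <sys/mbuf.h>
    #include <cyg/infra/diag.h>

    static struct mbuf *alloc_recv_mbuf(void)
    {
        struct mbuf *m;

        MGETHDR(m, M_DONTWAIT, MT_DATA);
        if (m == NULL) {
            /* the ~line 820 failure: really out of mbufs */
            diag_printf("recv: out of mbufs\n");
            return NULL;
        }
        MCLGET(m, M_DONTWAIT);
        if ((m->m_flags & M_EXT) == 0) {
            /* the ~line 873 failure: out of clusters, not mbufs */
            diag_printf("recv: out of clusters\n");
            m_freem(m);
            return NULL;
        }
        return m;
    }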

> > Whats interesting is that number of free clusters does not slowly
> > drop, but jumps. Some sort of event is happening which causes a whole
> > lot of clusters to suddenly get lost. I would look at the ring
> > buffer. What happens when the receive ring buffer wraps around when
> > the ethernet device is receiving faster than the stack is taking them
> > out of the ring buffer. I've had problems like this with TX.
> 
> What ringbuffers are you referring to? 

In your ethernet device? It depends on the intelligence/dumbness of
your ethernet device. The intelligent ones have a ring of buffers to
do receives into. The device DMAs into the next free buffer and then
interrupts the processor so it can process it. If the ethernet is
receiving faster than the processor is processing, and the ring is
not set up correctly, it can loop around and stomp over the buffer
the processor is currently working on.
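A typical receive descriptor ring looks something like this. Everything
here (names, the OWN-bit layout) is made up; your controller will
differ, but the wrap problem is the same:

    #define NRXDESC 8

    struct rx_desc {
        volatile unsigned long status;   /* OWN bit: 1 = owned by device   */
        unsigned char         *buffer;   /* where the device DMAs a frame  */
        unsigned long          length;
    };

    #define DESC_OWN 0x80000000u

    static struct rx_desc rx_ring[NRXDESC];
    static int            rx_next;       /* next descriptor to look at     */

    /* Called when the device signals "frame received". */
    static void rx_poll(void)
    {
        while ((rx_ring[rx_next].status & DESC_OWN) == 0) {
            /* Hand rx_ring[rx_next].buffer up to the stack here.  The
             * crucial step: only set DESC_OWN again once the data has
             * been copied out or a fresh buffer substituted.  Give the
             * descriptor back too early and a burst of frames can wrap
             * the ring, so the device DMAs over a buffer the stack is
             * still reading.
             */
            rx_ring[rx_next].status = DESC_OWN;     /* give it back */
            rx_next = (rx_next + 1) % NRXDESC;      /* advance      */
        }
    }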

I don't know what your ethernet device is. I could well be barking up
the wrong tree.

    Andrew

-- 
Before posting, please read the FAQ: http://sources.redhat.com/fom/ecos
and search the list archive: http://sources.redhat.com/ml/ecos-discuss

