This is the mail archive of the ecos-discuss@sources.redhat.com mailing list for the eCos project.


Re: [SMP]serious bug in synchronisation primitives


sandeep <shimple0@yahoo.com> writes:

> > The DESTRUCT and BREAK wake reasons are explicitly intended for cases
> > where the mutex will not be locked. This is why they set the result to
> is there any list of tests (if any) in ecos cvs that do not intend to
> lock mutex directly/indirectly?

I'm not sure what you are asking here. The kill and release tests test
the basic mechanisms.

> > It is unclear to me at present why that thread should be seeing a
> > BREAK at that point, however.  It is possible that there is an SMP
> > race condition somewhere in the POSIX code that handles thread
> > cancellation. However, a quick glance through it hasn't brought
> > anything to light.
> With luck and reverse tracing on who sets wakeup_reason, this was pretty
> quick. verified things happening in direction by examining the state on
> all 4 processors when (self == owner) failed at the end of mutex lock.
> 
> here is the finding with compat-posix-tm_basic test -
> 

[snip scenario]

The simple fix for this particular problem is to replace 

        pthread_mutex.lock();

with

        while( !pthread_mutex.lock() )
            continue;

in pthread_setcancelstate().
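
As a rough illustration of the retry pattern, here is a minimal sketch
that uses a stand-in mutex rather than the real kernel class. It
assumes, as discussed above, that lock() returns false when the thread
is woken without acquiring the mutex (for example by a BREAK).
FlakyMutex and lock_with_retry are hypothetical names used only for
this example.

```cpp
#include <cassert>

// Hypothetical stand-in for a kernel mutex whose lock() can return
// false when the waiting thread is woken without owning the mutex
// (for example by a BREAK wake reason). This is not the eCos class.
class FlakyMutex {
public:
    // spurious_wakes: how many lock() calls fail before one succeeds.
    explicit FlakyMutex(int spurious_wakes) : spurious_(spurious_wakes) {}

    // Returns true only when the mutex was actually acquired.
    bool lock() {
        ++attempts_;
        if (spurious_ > 0) {
            --spurious_;
            return false;   // woken, but the mutex is not held
        }
        locked_ = true;
        return true;
    }

    void unlock() { locked_ = false; }
    bool locked() const { return locked_; }
    int attempts() const { return attempts_; }

private:
    int spurious_;
    int attempts_ = 0;
    bool locked_ = false;
};

// Retry until the lock is really held, ignoring failed wakeups.
inline void lock_with_retry(FlakyMutex& m) {
    while (!m.lock())
        continue;
}
```

With a FlakyMutex constructed to fail twice, lock_with_retry() makes
three attempts and returns with the mutex held, which is the property
the caller relies on.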

I suspect that a number of other calls to pthread_mutex.lock() would
benefit from a similar modification. Others may require:

        if( !pthread_mutex.lock() )
            PTHREAD_RETURN(EINTR);

However, that depends on the exact specification of the API call and
whether it is permitted to return EINTR.
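
The second form might look roughly like the sketch below. It again
uses a stand-in mutex, and a plain return in place of the
PTHREAD_RETURN macro so that the example is self-contained.
InterruptibleMutex and guarded_increment are hypothetical names, and
whether a given API call may report EINTR at all still has to be
checked against its specification.

```cpp
#include <cassert>
#include <cerrno>

// Hypothetical stand-in mutex: when interrupt_next is set, the next
// lock() call fails once, modelling a wakeup without ownership.
struct InterruptibleMutex {
    bool interrupt_next = false;
    bool held = false;

    // Returns true only when the mutex was actually acquired.
    bool lock() {
        if (interrupt_next) {
            interrupt_next = false;
            return false;
        }
        held = true;
        return true;
    }

    void unlock() { held = false; }
};

// Hypothetical API call that reports the interruption to its caller
// instead of retrying. The protected state is left untouched when the
// lock was not obtained.
inline int guarded_increment(InterruptibleMutex& m, int& counter) {
    if (!m.lock())
        return EINTR;       // caller decides whether to retry
    ++counter;
    m.unlock();
    return 0;
}
```

The point of the early return is that the protected data is never
touched unless lock() reported success.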

> 
> Was wondering (haven't looked much into it) if some of the found races
> could affect no-SMP case also??

I suspect not. I think we would have tripped over these problems in
our testing before this. It's the genuine concurrency of SMP that is
causing the problems. In uniprocessor systems, threads rarely
sleep in the mutexes and are usually in a condition variable sleep,
where the semantics are better defined.

> 
> I hope smp bugs would be concern for lot of people - as gathered from list,
> SMP port for SPARC (LEON) processor is planned, someone at IIT
> kharagpur seems to be porting ecos to multicore architecture from TI,
> a port for multicore cradle architecture and existing in public cvs
> for ix86 smp port. it could be possible that for certain non-cvs
> multicore/multiprocessor architectures SMP ports are continuing,
> already exist but not mentioned or are being planned.
> 
> I was wondering if someone is porting SMP ecos to multicore
> architectures from IBM/Sony?? It could give a phenomenal boost to
> ecos, financial boost aside. I am just a programmer, my speculations
> could be wrong.

We are definitely seeing SMP move into the embedded space. Having
solid SMP support would be a real advantage over some other embedded
operating systems.

-- 
Nick Garnett                    eCos Kernel Architect
http://www.ecoscentric.com/     The eCos and RedBoot experts




