This is the mail archive of the ecos-discuss@sources.redhat.com mailing list for the eCos project.



questions about condition variable example from eCos book


I've got some questions about the condition variable example code in
the eCos book on pages 107-109.  I'm new to eCos so it's possible that
I'm misinterpreting something.

First, the description of cyg_cond_signal() says that it will "wake up
exactly one thread waiting on the condition variable [...]  If there are
no threads waiting for the condition variable when it is signaled, nothing
happens."  But it also says "a race condition could arise if more than
one thread is waiting for the condition variable.  This is why it is
important for the waiting thread to retest the condition variable to
ensure its proper state."

I don't understand where this race condition comes from.  Even if there
are multiple threads waiting on the same condition, cyg_cond_signal() will
only wake up one, right?  So how could the thread wake up without the
condition having been signalled?

The example code contains two threads.  Here's an excerpt,
slightly reformatted and without most of the comments:

line
#
11    void thread_a (cyg_addrword_t index)
12    {
14      while (1)
15        {
16          // Acquire data into the buffer...
19          buffer_empty = false;
22          cyg_mutex_lock (& mut_cond_var);
25          cyg_cond_signal (& cond_var);
28          cyg_mutex_unlock (& mut_cond_var);
29        }
30    }

35    void thread_b (cyg_addrword_t index)
36    {
38      while (1)
39        {
41          cyg_mutex_lock (& mut_cond_var);
44          while (buffer_empty == true)
45            {
46              cyg_cond_wait (& cond_var);
47            }
49          // get the buffer data...
52          buffer_empty = true;
55          cyg_mutex_unlock (& mut_cond_var);
57          // Process the data in the buffer...
58        }
59    }


So in this example, why is it not adequate for the while statement
on line 44 to be an if statement instead?  I'll concede that a while
is better defensive programming, but it doesn't seem strictly
necessary as the text claims.
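
For what it's worth, one case where the retest would matter can be
sketched with POSIX threads standing in for the eCos calls.  That
substitution is an assumption: the pthread_* functions are not from the
book, and eCos may behave differently.  Under POSIX, a second consumer
that was never waiting can take the data between the signal and the
moment the woken thread re-acquires the mutex, and pthread_cond_wait()
is also allowed to wake spuriously.  The sketch does not force that
timing; it only shows a wait that stays correct if it happens.  The
names consumer, run_demo, done and consumed are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t mut = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static bool buffer_empty = true;
static bool done = false;      /* hypothetical shutdown flag */
static int consumed = 0;       /* how many consumers got the item */

/* Each consumer tries to take at most one item. */
void *consumer(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&mut);
    /* Retest after every wakeup: another consumer may already have
       emptied the buffer by the time this thread holds the mutex. */
    while (buffer_empty && !done)
        pthread_cond_wait(&cond, &mut);
    if (!buffer_empty) {
        buffer_empty = true;
        consumed++;
    }
    pthread_mutex_unlock(&mut);
    return NULL;
}

/* Start two consumers, hand over exactly one item, then shut down.
   Returns the number of consumers that took the item. */
int run_demo(void)
{
    pthread_t t[2];
    int rc;

    buffer_empty = true;
    done = false;
    consumed = 0;

    for (int i = 0; i < 2; i++) {
        rc = pthread_create(&t[i], NULL, consumer, NULL);
        assert(rc == 0);
        (void)rc;
    }

    /* Publish one item; the flag is changed with the mutex held. */
    pthread_mutex_lock(&mut);
    buffer_empty = false;
    pthread_cond_signal(&cond);
    pthread_mutex_unlock(&mut);

    /* Tell any consumer that is still waiting to give up. */
    pthread_mutex_lock(&mut);
    done = true;
    pthread_cond_broadcast(&cond);
    pthread_mutex_unlock(&mut);

    for (int i = 0; i < 2; i++)
        pthread_join(t[i], NULL);
    return consumed;
}
```

However the two consumers are scheduled, only one of them should see
buffer_empty == false while holding the mutex, so run_demo() should
report one consumed item.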

Secondly, I think this code does have two actual race conditions.
Suppose the two threads are at the same priority, and timeslicing is
enabled.  They handshake successfully once.  Here's one possible
scenario of the succeeding events:

   thread_b   line 49  gets data from the buffer (still holding mutex)

   thread_a   line 16  acquires data
              line 19  sets buffer_empty = false
              line 22  wait for mutex (still held by thread_b)

   thread_b   line 52  sets buffer_empty = true
              line 55  gives up mutex
              line 57  processes the data
              line 41  locks the mutex
              line 44  checks buffer_empty, it is true!
              line 46  waits for condition to be signalled (yielding mutex)

   thread_a   line 22  wakes up and acquires mutex
              line 25  signals condition
              line 28  releases mutex
              line 16  starts acquiring more data

   thread_b   line 46  wakes up (re-acquiring the mutex)
              line 44  tests buffer_empty, finds it true!
              line 46  waits for condition to be signalled (yielding mutex)

thread_b has effectively missed the signal from thread_a, though it
should get it the next time around.  However, that could be
arbitrarily far in the future, and in the meantime it's not
processing the data thread_a was trying to pass to it.

It seems fairly clear that line 19 "buffer_empty = false" in thread_a should
actually come after line 22 acquires the mutex, in order to prevent exactly
this sort of race condition.

In general, any time you use a single variable to communicate state between
two processes, it should be protected by a mutual exclusion mechanism of
some sort.  There are some exceptions to this rule, but any time you think
you can avoid the need for mutual exclusion, you should study the problem
*very* carefully.
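
As a small illustration of that rule, here is a sketch that keeps a flag
and the mutex guarding it in one structure, so that every read and write
goes through the lock.  It uses POSIX threads rather than the eCos API
(an assumption made so the sketch can be compiled on a desktop system),
and struct guarded_flag and its three functions are hypothetical names,
not anything from the book.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical wrapper: the flag lives next to the mutex that guards it. */
struct guarded_flag {
    pthread_mutex_t lock;
    bool value;
};

void guarded_flag_init(struct guarded_flag *f, bool initial)
{
    int rc = pthread_mutex_init(&f->lock, NULL);
    assert(rc == 0);
    (void)rc;
    f->value = initial;
}

/* Write the flag only while holding the mutex. */
void guarded_flag_set(struct guarded_flag *f, bool v)
{
    pthread_mutex_lock(&f->lock);
    f->value = v;
    pthread_mutex_unlock(&f->lock);
}

/* Read the flag only while holding the mutex. */
bool guarded_flag_get(struct guarded_flag *f)
{
    bool v;
    pthread_mutex_lock(&f->lock);
    v = f->value;
    pthread_mutex_unlock(&f->lock);
    return v;
}
```

Accessors like these are not enough by themselves for the condition
variable case: there the test of the flag and the wait have to happen
under a single acquisition of the mutex, as thread_b does on lines 41-47.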

A more general problem with the example code is that if there is a single
buffer shared between thread_a and thread_b, there needs to be something
to prevent thread_a from refilling the buffer before thread_b is done
with it.  To solve this problem, it may be necessary to move the
"acquire data" portion of thread_a after the mutex has been locked.  So
the code with these two fixes would be:

14      while (1)
15        {
            cyg_mutex_lock (& mut_cond_var);
16          // Acquire data into the buffer...
19          buffer_empty = false;
25          cyg_cond_signal (& cond_var);
28          cyg_mutex_unlock (& mut_cond_var);
29        }

Since the thread has the mutex locked during the acquisition of data and
setting buffer_empty = false, those could be done in either order, but as
a matter of style it seems best to not set the variable until after the
acquisition is completed.
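
A fuller sketch of the same idea, again using POSIX threads as a
stand-in for the eCos calls (an assumption), is below.  It goes one step
beyond the fix above by making the producer wait until the consumer has
emptied the buffer; the second condition variable cond_empty and the
names producer, consumer, run_handshake, items_to_send and received_sum
are hypothetical additions, not part of the book's example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t mut = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond_full = PTHREAD_COND_INITIALIZER;  /* data ready */
static pthread_cond_t cond_empty = PTHREAD_COND_INITIALIZER; /* data taken */
static bool buffer_empty = true;
static int buffer;             /* the single shared buffer */
static int items_to_send;
static long received_sum;

void *producer(void *arg)
{
    (void)arg;
    for (int i = 1; i <= items_to_send; i++) {
        pthread_mutex_lock(&mut);
        /* Do not refill until the consumer has taken the last item. */
        while (!buffer_empty)
            pthread_cond_wait(&cond_empty, &mut);
        buffer = i;                 /* "acquire data" under the mutex */
        buffer_empty = false;       /* flag changes only under the mutex */
        pthread_cond_signal(&cond_full);
        pthread_mutex_unlock(&mut);
    }
    return NULL;
}

void *consumer(void *arg)
{
    (void)arg;
    for (int i = 0; i < items_to_send; i++) {
        pthread_mutex_lock(&mut);
        while (buffer_empty)
            pthread_cond_wait(&cond_full, &mut);
        int item = buffer;          /* copy the data out */
        buffer_empty = true;
        pthread_cond_signal(&cond_empty);
        pthread_mutex_unlock(&mut);
        received_sum += item;       /* process outside the mutex */
    }
    return NULL;
}

/* Pass the values 1..n through the buffer; return the sum received. */
long run_handshake(int n)
{
    pthread_t a, b;
    int rc;

    buffer_empty = true;
    items_to_send = n;
    received_sum = 0;

    rc = pthread_create(&a, NULL, producer, NULL);
    assert(rc == 0);
    rc = pthread_create(&b, NULL, consumer, NULL);
    assert(rc == 0);
    (void)rc;

    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return received_sum;
}
```

If no item is lost or overwritten, run_handshake(n) should return the
sum 1 + 2 + ... + n.  Copying the item out before releasing the mutex
lets the consumer process it while the producer refills the buffer.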

Eric



