Thread synchronization

To allow threads to cooperate and compete for resources, it is necessary to provide mechanisms for synchronization and communication. The classic synchronization mechanisms are mutexes/condition variables and semaphores. These are provided in the eCos kernel, together with other synchronization/communication mechanisms that are common in real-time systems, such as event flags and message queues.

One of the problems that must be dealt with in any real-time system is priority inversion. This is where a high priority thread is (wrongly) prevented from continuing by one at lower priority. The classic example is a high priority thread waiting at a mutex already held by a low priority thread. If the low priority thread is then preempted by a medium priority thread, priority inversion has occurred, since the high priority thread is prevented from continuing by an unrelated thread of lower priority.
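The scenario can be illustrated with a small host-side model. This is only a sketch: the names sim_thread, sim_pick_next and sim_inversion_scenario are hypothetical and are not part of the eCos API. It assumes that a lower number means a higher priority, and it models a thread waiting at a mutex with a simple flag.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model, not the eCos API.
   Assumption: a lower number means a higher priority. */
typedef struct {
    int priority;
    int blocked;    /* non-zero while the thread waits at a mutex */
} sim_thread;

/* Return the highest-priority thread that is not blocked, or NULL. */
const sim_thread *sim_pick_next(const sim_thread *threads, size_t count)
{
    const sim_thread *best = NULL;
    for (size_t i = 0; i < count; i++) {
        if (threads[i].blocked)
            continue;
        if (best == NULL || threads[i].priority < best->priority)
            best = &threads[i];
    }
    return best;
}

/* The low priority thread owns the mutex, the high priority thread is
   waiting at it, and a medium priority thread has become runnable.
   Returns the priority of the thread the scheduler would run. */
int sim_inversion_scenario(void)
{
    const sim_thread threads[3] = {
        { 1,  1 },  /* high: blocked at the mutex */
        { 5,  0 },  /* medium: runnable, does not use the mutex */
        { 10, 0 },  /* low: runnable, owns the mutex */
    };
    const sim_thread *next = sim_pick_next(threads, 3);
    assert(next != NULL);
    return next->priority;
}
```

In this model the medium priority thread is selected, so the mutex owner cannot run to release the mutex and the high priority thread stays blocked for as long as the medium priority thread keeps running.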

This problem received much attention in 1997, when the Mars Pathfinder mission had to reset the computers on the ground exploration robot repeatedly because a priority inversion problem would cause them to hang.

There are several solutions to this problem. The simplest is to employ a priority ceiling protocol, where all threads that acquire the mutex have their priority boosted to some predetermined value. This has a number of disadvantages: it requires the maximum priority of the threads using the mutex to be known a priori; if the ceiling priority is too high, it acts as a global lock, disabling all scheduling; and it is pessimistic, taking action to prevent the problem even when it does not arise.
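A minimal sketch of the priority ceiling idea follows. It is a host-side model rather than kernel code: pc_thread, pc_mutex, pc_lock and pc_unlock are hypothetical names, a lower number is assumed to mean a higher priority, and waiting for a mutex that is already owned is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model, not the eCos API.
   Assumption: a lower number means a higher priority. */
typedef struct {
    int base_priority;      /* priority assigned by the application */
    int current_priority;   /* priority the scheduler actually uses */
} pc_thread;

typedef struct {
    int ceiling;            /* predetermined priority for any owner */
    pc_thread *owner;       /* NULL while the mutex is free */
} pc_mutex;

/* Acquire a free mutex and boost the new owner to the ceiling. */
void pc_lock(pc_mutex *mutex, pc_thread *thread)
{
    assert(mutex->owner == NULL);
    /* The ceiling must be known a priori and must be at least as
       high as the priority of every thread that uses the mutex. */
    assert(mutex->ceiling <= thread->base_priority);
    mutex->owner = thread;
    thread->current_priority = mutex->ceiling;
}

/* Release the mutex and return the owner to its base priority. */
void pc_unlock(pc_mutex *mutex)
{
    assert(mutex->owner != NULL);
    mutex->owner->current_priority = mutex->owner->base_priority;
    mutex->owner = NULL;
}
```

The boost in pc_lock is applied on every acquisition, whether or not a higher priority thread ever contends for the mutex, which is the pessimism described above.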

A better solution is to use a priority inheritance protocol. Here, the priority of the thread that owns the mutex is boosted to equal that of the highest priority thread that is waiting for it. This technique does not require a priori knowledge of the priorities of the threads that are going to use the mutex, and the priority of the owning thread is only boosted when a higher priority thread is waiting. This reduces the effect on the scheduling of other threads, and is more optimistic than the priority ceiling protocol. A disadvantage of this mechanism is that the cost of each synchronization call is increased, since the inheritance protocol must be obeyed each time.
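Priority inheritance can be sketched in the same spirit. The names pi_thread, pi_mutex, pi_try_lock and pi_unlock are hypothetical, a lower number is assumed to mean a higher priority, and the model ignores wait queues and nested mutexes. It is not a description of how any particular kernel implements the protocol.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model, not the eCos API.
   Assumption: a lower number means a higher priority. */
typedef struct {
    int base_priority;      /* priority assigned by the application */
    int current_priority;   /* may be boosted while owning a mutex */
} pi_thread;

typedef struct {
    pi_thread *owner;       /* NULL while the mutex is free */
} pi_mutex;

/* Returns 1 if the mutex was acquired, 0 if the caller must wait.
   A waiter of higher priority lends its priority to the owner. */
int pi_try_lock(pi_mutex *mutex, pi_thread *thread)
{
    if (mutex->owner == NULL) {
        mutex->owner = thread;
        return 1;
    }
    if (thread->current_priority < mutex->owner->current_priority)
        mutex->owner->current_priority = thread->current_priority;
    return 0;
}

/* Release the mutex and drop any inherited priority. Restoring the
   base priority is only correct when the owner holds no other mutex. */
void pi_unlock(pi_mutex *mutex)
{
    assert(mutex->owner != NULL);
    mutex->owner->current_priority = mutex->owner->base_priority;
    mutex->owner = NULL;
}
```

Unlike the ceiling approach, an uncontended lock leaves the owner's priority unchanged. The extra comparison on the contended path is a small instance of the per-call cost mentioned above.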

A third approach to priority inversion is to recognize that relative thread priorities have been poorly chosen, and thus that the system in which it occurs is faulty. In this case the kernel needs the ability to detect when priority inversion has taken place, and to raise an exception when it occurs to aid debugging. The detection code is then removed from the shipping version.
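One way to picture this approach is a check that is compiled in only for debug builds. The sketch below is an assumption-laden model: SKETCH_DETECT_INVERSION, dbg_thread, dbg_mutex and dbg_try_lock are hypothetical names, a lower number is assumed to mean a higher priority, and a counter stands in for the exception a kernel might raise.

```c
#include <stddef.h>

/* Hypothetical build switch: define it as 0 for the shipping version. */
#ifndef SKETCH_DETECT_INVERSION
#define SKETCH_DETECT_INVERSION 1
#endif

/* Hypothetical model, not the eCos API.
   Assumption: a lower number means a higher priority. */
typedef struct {
    int priority;
} dbg_thread;

typedef struct {
    dbg_thread *owner;      /* NULL while the mutex is free */
    unsigned inversions;    /* potential inversions seen so far */
} dbg_mutex;

/* Returns 1 if the mutex was acquired, 0 if the caller must wait. */
int dbg_try_lock(dbg_mutex *mutex, dbg_thread *thread)
{
    if (mutex->owner == NULL) {
        mutex->owner = thread;
        return 1;
    }
#if SKETCH_DETECT_INVERSION
    /* A higher priority thread is about to wait for a lower priority
       owner. Record it; a kernel could raise an exception instead. */
    if (thread->priority < mutex->owner->priority)
        mutex->inversions++;
#endif
    return 0;
}
```

With the switch set to 0 the check disappears from the compiled code, so the shipping version pays no cost for it.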

The current eCos release provides a relatively simple implementation of mutex priority inheritance. This implementation works only with the multi-level queue scheduler, and it does not handle the rare case of nested mutexes completely correctly. However, it is both fast and deterministic. Mutex priority inheritance can be disabled if the application does not require it, which reduces both code size and data space.

Future releases will provide alternative implementations of mutex priority inheritance, and application developers will be able to choose the implementation appropriate to their application.