eCos kernel overview

This is an overview of the internal workings of the eCos kernel.

The scheduler

At the core of the kernel is the scheduler. This defines the way in which threads are run, and provides the mechanisms by which they may synchronize. It also controls the means by which interrupts affect thread execution. No single scheduler can cover all possible system configurations, so several scheduling policies need to be supported for different purposes. In this release three schedulers are provided; they are described in more detail in the Sched subdirectory.

At present the system will only support a single scheduler at any one time. Future systems may allow multiple schedulers to co-exist, but this will be hidden behind the scheduler API in the current release.

To make scheduling safe we need a mechanism to protect the scheduler data structures from concurrent access. The traditional approach to this is to disable interrupts during the critical regions. Unfortunately this increases the maximum interrupt dispatch latency, which is to be avoided in any real-time system.

The mechanism chosen for eCos is to maintain a counter, Scheduler::sched_lock, that prevents any rescheduling while it is non-zero. The current thread can claim the lock by calling Scheduler::lock(). This increments the counter and prevents any further scheduling. The function Scheduler::unlock() decrements the counter and, if it returns to zero, allows scheduling to continue.
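The counter can be pictured with a minimal sketch. This is not the eCos source: the class name SketchScheduler and the reschedule_allowed() helper are hypothetical, and only sched_lock, lock() and unlock() follow the names used in the text. It assumes a uni-processor.

```cpp
#include <cassert>

// Illustrative only: a nested scheduler lock on a uni-processor.
class SketchScheduler {
public:
    // Claim the lock. Calls may nest, so this is a plain increment.
    static void lock() { ++sched_lock; }

    // Release one level of nesting.
    static void unlock() {
        assert(sched_lock > 0 && "unlock() without a matching lock()");
        --sched_lock;
    }

    // Hypothetical helper: rescheduling is permitted only when the
    // counter is back at zero.
    static bool reschedule_allowed() { return sched_lock == 0; }

private:
    static unsigned sched_lock;
};

unsigned SketchScheduler::sched_lock = 0;
```

Two nested lock() calls need two unlock() calls before reschedule_allowed() reports true again.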

For this to work in the presence of interrupts, it is necessary for the Interrupt Service Routines (ISRs) to defer any scheduler-oriented operations until the lock is about to return to zero. We do this by splitting the work of an ISR into two parts, with the second part, the Deferred Service Routine (DSR), being queued until the scheduler decides it is safe to run. This is covered in more detail in Interrupts and in Interrupt and exception handlers.
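The ISR/DSR split can be sketched as follows. This is an illustration under stated assumptions, not the eCos implementation: DeferredWorkSketch, post_dsr() and run_pending() are hypothetical names, and the sketch is single-threaded, so it ignores the interrupt masking a real kernel would need around the queue.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <utility>

// Illustrative only: ISRs queue their second half (the DSR), and the
// queue is drained when the scheduler lock is about to return to zero.
class DeferredWorkSketch {
public:
    void lock() { ++sched_lock_; }

    void unlock() {
        assert(sched_lock_ > 0);
        if (sched_lock_ == 1) {
            // About to reach zero: run DSRs while the lock is still held.
            run_pending();
        }
        --sched_lock_;
    }

    // Called from an ISR: never touch scheduler state here, only queue.
    void post_dsr(std::function<void()> dsr) {
        pending_.push(std::move(dsr));
        if (sched_lock_ == 0) {
            // Nobody holds the lock, so claim and release it to run
            // the DSR immediately.
            lock();
            unlock();
        }
    }

private:
    void run_pending() {
        while (!pending_.empty()) {
            std::function<void()> dsr = std::move(pending_.front());
            pending_.pop();
            dsr();
        }
    }

    unsigned sched_lock_ = 0;
    std::queue<std::function<void()>> pending_;
};
```

A DSR posted while the lock is held runs only when the final unlock() is reached; one posted while the lock is free runs at once.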

On a uni-processor, Scheduler::lock() is a simple increment of Scheduler::sched_lock. It does not need to be a read-modify-write cycle since the lock is strictly nested. The mere fact that the current thread is running implies that the lock has not been claimed by another thread, so it is always claimable.

Scheduler::unlock() is generic to all scheduler implementations.

Thread synchronization

To allow threads to cooperate and compete for resources, it is necessary to provide mechanisms for synchronization and communication. The classic synchronization mechanisms are mutexes/condition variables and semaphores. These are provided in the eCos kernel, together with other synchronization/communication mechanisms that are common in real-time systems, such as event flags and message queues.

One of the problems that must be dealt with in any real-time system is priority inversion. This is where a high priority thread is (wrongly) prevented from continuing by one at lower priority. The normal example is of a high priority thread waiting at a mutex already held by a low priority thread. If the low priority thread is preempted by a medium priority thread, then priority inversion has occurred, since the high priority thread is prevented from continuing by an unrelated thread of lower priority.

This problem got much attention recently when the Mars Pathfinder mission had to reset the computers on the ground exploration robot repeatedly because a priority inversion problem would cause it to hang.

There are several solutions to this problem. The simplest is to employ a priority ceiling protocol, where all threads that acquire the mutex have their priority boosted to some predetermined value. This has a number of disadvantages: it requires the maximum priority of the threads using the mutex to be known in advance; if the ceiling priority is too high, it acts as a global lock disabling all scheduling; and it is pessimistic, taking action to prevent the problem even when it does not arise.
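A minimal sketch of the ceiling idea, not the eCos mutex code. CeilingThread and CeilingMutexSketch are hypothetical names, and the sketch assumes that a larger number means a higher priority.

```cpp
#include <cassert>

// Illustrative only. Assumption: larger number = higher priority.
struct CeilingThread {
    int base_priority;  // priority assigned by the application
    int priority;       // effective priority seen by the scheduler
};

class CeilingMutexSketch {
public:
    explicit CeilingMutexSketch(int ceiling) : ceiling_(ceiling) {}

    bool try_lock(CeilingThread& t) {
        if (owner_ != nullptr) return false;
        // The ceiling must be chosen in advance and must be at least
        // as high as the priority of every thread using the mutex.
        assert(t.base_priority <= ceiling_);
        owner_ = &t;
        t.priority = ceiling_;  // boosted whether or not anyone is waiting
        return true;
    }

    void unlock(CeilingThread& t) {
        assert(owner_ == &t);
        t.priority = t.base_priority;
        owner_ = nullptr;
    }

private:
    int ceiling_;
    CeilingThread* owner_ = nullptr;
};
```

The unconditional boost in try_lock() is what makes the protocol pessimistic: the owner runs at the ceiling even when no higher priority thread ever contends for the mutex.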

A better solution is to use a priority inheritance protocol. Here, the priority of the thread that owns the mutex is boosted to equal that of the highest priority thread that is waiting for it. This technique does not require prior knowledge of the priorities of the threads that are going to use the mutex, and the priority of the owning thread is only boosted when a higher priority thread is waiting. This reduces the effect on the scheduling of other threads, and is more optimistic than the priority ceiling protocol. A disadvantage of this mechanism is that the cost of each synchronization call is increased since the inheritance protocol must be obeyed each time.
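The inheritance rule can be sketched in the same style. Again this is an illustration rather than the eCos implementation: InheritThread and InheritMutexSketch are hypothetical, a larger number is assumed to mean a higher priority, and a thread is assumed to hold only one mutex at a time, so nesting is not handled.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Illustrative only. Assumption: larger number = higher priority.
struct InheritThread {
    int base_priority;
    int priority;  // effective priority
};

class InheritMutexSketch {
public:
    // Returns true if acquired. Otherwise the caller is recorded as a
    // waiter and the owner inherits its priority if that is higher.
    bool lock(InheritThread& t) {
        if (owner_ == nullptr) { owner_ = &t; return true; }
        waiters_.push_back(&t);
        owner_->priority = std::max(owner_->priority, t.priority);
        return false;
    }

    // Drop any inherited priority, then hand the mutex to the highest
    // priority waiter. Restoring to base_priority is only correct
    // because nesting is excluded by assumption.
    void unlock(InheritThread& t) {
        assert(owner_ == &t);
        t.priority = t.base_priority;
        owner_ = nullptr;
        if (waiters_.empty()) return;
        auto next = std::max_element(
            waiters_.begin(), waiters_.end(),
            [](const InheritThread* a, const InheritThread* b) {
                return a->priority < b->priority;
            });
        owner_ = *next;
        waiters_.erase(next);
        // The new owner inherits from the waiters that remain.
        for (InheritThread* w : waiters_) {
            owner_->priority = std::max(owner_->priority, w->priority);
        }
    }

    const InheritThread* owner() const { return owner_; }

private:
    InheritThread* owner_ = nullptr;
    std::vector<InheritThread*> waiters_;
};
```

The extra bookkeeping in lock() and unlock() is the per-call cost the text refers to.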

A third approach to priority inversion is to recognize that relative thread priorities have been poorly chosen and thus that the system in which it occurs is faulty. In this case the kernel needs the ability to detect when priority inversion has taken place, and to raise an exception when it occurs to aid debugging. The detection code is then removed from the shipping version.

The current eCos release provides a relatively simple implementation of mutex priority inheritance. This implementation will only work in the multi-level queue scheduler, and it does not handle the rare case of nested mutexes completely correctly. However, it is both fast and deterministic. Mutex priority inheritance can be disabled if the application does not require it. This will reduce both code size and data space.

Future releases will provide alternative implementations of mutex priority inheritance, and application developers will be able to choose the implementation appropriate to their application.

Exceptions

An exception is a synchronous event caused by the execution of a thread. These include both the machine exceptions raised by hardware (such as divide-by-zero, memory fault and illegal instruction) and machine exceptions raised by software (such as deadline overrun). The standard C++ exception mechanism is too expensive to use for this, and in any case has the wrong semantics for the exception handling in an RTOS.

The simplest, and most flexible, mechanism for exception handling is to call a function. This function needs context in which to work, so access to some working data is required. The function may also need to be handed some data about the exception raised: at least the exception number and some optional parameters.

The exception handler receives a data argument which is a value that was registered with the handler and points to context information. It also receives an exception_number which identifies the exception taken, and an error code which contains any additional information (such as a memory fault address) needed to handle the exception. Returning from the function will allow the thread to continue.

Exception handlers may be either global or per-thread, or both, depending on configuration options. If exceptions are per-thread, it is necessary to have an exception handler attached to each thread.
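A minimal sketch of such a handler and its dispatch. None of these names are the eCos API: ExcAddr, ExceptionHandler, HandlerSlot, ExcThread, ExceptionDispatchSketch, ExceptionRecord and record_exception are all hypothetical. The lookup order (per-thread handler first, then global) is an assumption chosen for illustration, since the text says only that the choice depends on configuration.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative only: handler signature modelled on the description above.
using ExcAddr = std::uintptr_t;
using ExceptionHandler = void (*)(ExcAddr data, int exception_number,
                                  ExcAddr error_code);

struct HandlerSlot {
    ExceptionHandler handler;  // nullptr if nothing is registered
    ExcAddr data;              // value registered together with the handler
};

struct ExcThread {
    HandlerSlot per_thread;
};

class ExceptionDispatchSketch {
public:
    void set_global(ExceptionHandler h, ExcAddr data) {
        global_ = HandlerSlot{h, data};
    }

    // Assumed policy: prefer the thread's own handler, else the global one.
    // Returns false if no handler is registered at all.
    bool deliver(const ExcThread& t, int exception_number,
                 ExcAddr error_code) const {
        const HandlerSlot& slot =
            t.per_thread.handler ? t.per_thread : global_;
        if (!slot.handler) return false;
        slot.handler(slot.data, exception_number, error_code);
        return true;  // returning from the handler lets the thread continue
    }

private:
    HandlerSlot global_{nullptr, 0};
};

// Example handler: 'data' points at a record that it fills in.
struct ExceptionRecord {
    int number;
    ExcAddr code;
};

void record_exception(ExcAddr data, int exception_number, ExcAddr error_code) {
    ExceptionRecord* rec = reinterpret_cast<ExceptionRecord*>(data);
    assert(rec != nullptr);
    rec->number = exception_number;
    rec->code = error_code;
}
```

Passing the context as an integer-sized data value keeps the handler a plain function while still giving it somewhere to work.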

Interrupts

Interrupts are asynchronous events caused by external devices. They may occur at any time and are not associated in any way with the thread that is currently running.

The handling of interrupts is one of the more complex areas in RTOS design, largely because it is the least well defined. The ways in which interrupt vectors are named, how interrupts are delivered to the software and how interrupts are masked are all highly architecture- (and in some cases board-) specific. The approach taken in eCos is to provide a generalized mechanism with sufficient hooks for system-specific code to be inserted where needed.

Let us start by considering the issue of interrupt vectors. Hardware support differs greatly here: from the Intel Architecture and the 680X0 having support for vectoring individual interrupts to their own vectors, to most RISC architectures that only have a single vector. In the first case it is possible to attach an ISR directly to the vector and know that it need only concern itself with the device in question. In the second case it is necessary to determine which device is actually interrupting and then vector to the correct ISR. Where there is an external interrupt controller, it will be possible to query that and provide what is essentially a software implementation of hardware vectoring. Otherwise the actual hardware devices must be tested, by calling the ISRs in turn and letting them make the determination. Since it is possible for two devices to interrupt simultaneously, it is necessary to call all ISRs each time an interrupt occurs.
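The last case, where no interrupt controller can be queried, can be sketched as a chain of ISRs sharing one vector. The names SketchDevice, device_isr and dispatch_shared_vector are hypothetical, and the pending flag stands in for a hardware status register.

```cpp
#include <cassert>
#include <vector>

// Illustrative only: 'pending' models a device's interrupt status bit.
struct SketchDevice {
    bool pending;
    int serviced;  // how many interrupts this device has had handled
};

// Each ISR tests its own device and reports whether it was interrupting.
bool device_isr(SketchDevice& dev) {
    if (!dev.pending) return false;
    dev.pending = false;  // acknowledge the device
    ++dev.serviced;
    return true;
}

// Entry point for a single shared vector. Every ISR is called, because
// more than one device may be interrupting at the same time.
int dispatch_shared_vector(const std::vector<SketchDevice*>& chain) {
    int handled = 0;
    for (SketchDevice* dev : chain) {
        assert(dev != nullptr);
        if (device_isr(*dev)) ++handled;
    }
    return handled;
}
```

Stopping at the first ISR that claims the interrupt would leave a second, simultaneous interrupt unserviced, which is why the loop always runs to the end of the chain.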

Interrupt masking has a similar variety of support. Most processors have a simple interrupt mask bit in a status register. The 680X0 has seven levels of masking. Any board with an interrupt controller can be programmed to provide similar multi-level masking. It is necessary to keep the interrupt masking mechanism simple and efficient, and to use only architectural support, because the cost of manipulating an on-board interrupt controller may be too high. However, individual device drivers may want access to their individual mask bits in the interrupt controller, so support for this must be provided.

Most of the infrastructure necessary for a (somewhat) portable treatment of interrupts is implemented in the eCos Hardware Abstraction Layer (HAL), which is documented in The eCos Hardware Abstraction Layer (HAL).

Counters, clocks, alarms and timers

If the hardware provides a periodic clock or timer, it will be used to drive timing-related features of the system. Many CPU architectures now have built-in timer registers that can provide a periodic interrupt, and these should be used to drive the timing features where possible. Otherwise an external timer/clock chip must be used.

We draw a distinction between Counters, Clocks, Alarms and Timers. A Counter maintains a monotonically increasing counter that is driven by some source of ticks. A Clock is a counter driven by a regular source of ticks (i.e. it counts time). Clocks have a resolution associated with them. A default system Clock is driven by the periodic interrupt described above, and tracks real-time. Other interrupt sources may drive other Counters that may or may not track real-time at different resolutions. Some Counters may be driven by aperiodic events and thus have no relation to real-time at all.

An Alarm is attached to a Counter and provides a mechanism for generating single-shot or periodic events based on the counter's value. A Timer is simply an Alarm that is attached to a Clock.
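The relationship between these objects can be sketched as follows. This is not the eCos Clock API: Tick, SketchAlarm and SketchCounter are hypothetical names. In the sketch, a Clock would simply be a SketchCounter whose tick() is called from a periodic interrupt, and a Timer would be a SketchAlarm attached to it.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <vector>

using Tick = std::uint64_t;

// Illustrative only: an alarm fires when its counter reaches 'trigger'.
struct SketchAlarm {
    Tick trigger;                      // counter value at which to fire
    Tick interval;                     // 0 = single-shot, otherwise the period
    std::function<void(Tick)> action;  // called with the current count
    bool enabled;
};

class SketchCounter {
public:
    void add_alarm(SketchAlarm* a) {
        assert(a != nullptr);
        alarms_.push_back(a);
    }

    // Advance the counter by one tick and fire any alarms that are due.
    // Actions must not add alarms to this counter while it is ticking.
    void tick() {
        ++value_;
        for (SketchAlarm* a : alarms_) {
            if (!a->enabled || a->trigger > value_) continue;
            a->action(value_);
            if (a->interval != 0) {
                a->trigger += a->interval;  // periodic: re-arm
            } else {
                a->enabled = false;         // single-shot: disarm
            }
        }
    }

    Tick current_value() const { return value_; }

private:
    Tick value_ = 0;
    std::vector<SketchAlarm*> alarms_;
};
```

Whether the counter tracks real time depends only on what calls tick(); the alarm logic is the same for a periodic clock interrupt and for an aperiodic event source.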

The system (including the kernel) represents time in units of ticks. These are clock-specific time units and are usually the period of the timer interrupt, or a multiple thereof. Conversion of ticks into conventional time and date units should occur only when required via library functions. Equivalence between Clock time and real-time can be made with an RTC (real-time clock), NTP (network time protocol) or user input.

The representation of the current tick count needs to be 64-bit. This requires either compiler support for 64-bit integers, or assembly code. Even at the extreme of a 1 ns tick (ticks will typically be longer than 1 ms), this gives a 584 year rollover period.
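The rollover figure can be checked with a short calculation. The function name rollover_years is hypothetical, and the sketch assumes a year of 365.25 days.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Years until a 64-bit tick counter wraps, for a tick period given in
// nanoseconds. Assumes a year of 365.25 days.
double rollover_years(double tick_period_ns) {
    assert(tick_period_ns > 0.0);
    const double ticks =
        static_cast<double>(std::numeric_limits<std::uint64_t>::max());
    const double seconds = ticks * tick_period_ns / 1e9;
    const double seconds_per_year = 365.25 * 24.0 * 3600.0;
    return seconds / seconds_per_year;
}
```

For a 1 ns tick the result is a little over 584 years, in line with the figure above; for a 1 ms tick it is a million times longer.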

The Clock API and configuration options that affect clock, counter and alarm behavior are described in detail in Counters, clocks and alarms.

