This is the mail archive of the ecos-discuss@sourceware.org mailing list for the eCos project.



Re: DSR Scheduling Problem


>>>>> "Jay" == Jay Foster <jay@systech.com> writes:

    Jay> I am experiencing a problem with a serial driver that I'm
    Jay> developing. It uses separate RX and TX ISRs and DSRs. The RX
    Jay> and TX ISRs just schedule the corresponding DSR to run.

    Jay> The test begins by transmitting data, which is looped back to
    Jay> the receiver. It starts out with:
    Jay> 	TX ISR -> TX DSR
    Jay> 	TX ISR -> TX DSR
    Jay> 	...
    Jay> 	TX ISR -> TX DSR

    Jay> Then I get the RX ISR during the TX DSR, which just schedules
    Jay> the RX DSR. However, the RX DSR does not run until 39 ms
    Jay> later, resulting in an overrun error. During this time
    Jay> period, the TX ISR and TX DSR continue their work
    Jay> transmitting the remaining data. After all of the data has
    Jay> been sent, THEN the RX DSR runs.

    Jay> Looking at the code for post_dsr() and call_dsr() in
    Jay> hal/common/current/src/drv_api.c, I noticed that the DSRs are
    Jay> queued at the head of the list, and dequeued also from the
    Jay> head of the list. This seems wrong, as it can (and apparently
    Jay> does) cause DSRs to get delayed by other DSRs that are queued
    Jay> later. Seems like it would be better to queue them on the end
    Jay> of the list and dequeue them from the head of the list, so
    Jay> that the DSRs would get run in the order in which they are
    Jay> queued.

I think the problem is more fundamental than this. You appear to have
a system which performs only I/O for at least 39 ms, handling
interrupts and running DSRs. No threads get to run during this time,
so no thread can respond to any events or timers. For some
applications this may be perfectly acceptable, but for a
general-purpose driver it is unhealthy. Instead you should look at
ways of getting a better balance between I/O and processing. Options
include:

1) carefully optimizing the existing TX DSR code. If there is some way
   to speed this up, it may solve the problem completely.

2) minimize the number of interrupts, e.g. by only triggering an
   interrupt when the fifo is empty or nearly empty, and then filling
   it completely in the DSR. This could greatly reduce the number of
   interrupts (a rough sketch of this follows the list).

3) do more work in the ISR and only request a DSR when necessary. For
   example, if the hardware only allows you to transmit a single
   character at a time, then try to arrange to do this in the ISR and
   only go for the DSR when a transmission is complete. Typically this
   would avoid the overheads of calling the DSR plus some interrupt
   masking and unmasking (there is a sketch of this shape after the
   list).

   Getting this working with the current generic serial code is an
   exercise left to the reader. Other subsystems such as SPI do make
   it easy for the device drivers to decide what bits of I/O should
   happen in ISR vs. DSR vs. the calling thread.

4) the previous approach has a risk: if the serial I/O is very fast
   compared with the cpu then you may end up saturating the cpu with
   ISRs instead of DSRs. In such cases it would be better to do the
   I/O at thread level in a busy polling loop, with careful use of
   thread priorities to get the right balance between serial I/O and
   everything else in the system. That eliminates all the overheads of
   interrupt handling. Again the SPI subsystem allows this, and even
   includes a polled flag so that applications can give a hint to the
   device driver (a sketch of such a polling thread follows the list).

5) throttle I/O. If the cpu cannot keep up with all the I/O activity
   then the system is out of balance. Assuming it is not possible to
   switch to a faster processor, you should instead slow down I/O by
   running at a lower baud rate.
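
To make option 2 concrete, a TX DSR along these lines could refill the
whole hardware FIFO in one pass. This is only a sketch: the DSR
prototype and cyg_drv_interrupt_unmask() are the standard eCos driver
API, but the register name, FIFO depth and buffer variables are
invented for illustration.

    #include <cyg/infra/cyg_type.h>
    #include <cyg/hal/drv_api.h>

    #define UART_THR       (*(volatile cyg_uint8 *)0xFFF00000) /* hypothetical TX register */
    #define UART_FIFO_SIZE 16                                  /* hypothetical FIFO depth  */
    #define TX_BUF_SIZE    256

    extern cyg_uint8    tx_buf[TX_BUF_SIZE];  /* software TX ring buffer        */
    extern volatile int tx_head, tx_tail;     /* maintained by the write() path */

    /* TX DSR: on each "FIFO empty" interrupt, refill the whole FIFO
       rather than sending one character, so far fewer interrupts occur. */
    static void uart_tx_dsr(cyg_vector_t vector, cyg_ucount32 count,
                            cyg_addrword_t data)
    {
        int space = UART_FIFO_SIZE;

        while (space-- > 0 && tx_head != tx_tail) {
            UART_THR = tx_buf[tx_head];
            tx_head  = (tx_head + 1) % TX_BUF_SIZE;
        }
        cyg_drv_interrupt_unmask(vector);  /* the ISR masked it; allow the next one */
    }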
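
Option 3 could look roughly like the following. Again this is only a
sketch: CYG_ISR_HANDLED and CYG_ISR_CALL_DSR are the normal eCos ISR
return flags, while the register and buffer names are made up.

    #include <cyg/infra/cyg_type.h>
    #include <cyg/hal/drv_api.h>

    #define UART_THR    (*(volatile cyg_uint8 *)0xFFF00000)  /* hypothetical TX register */
    #define TX_BUF_SIZE 256

    extern cyg_uint8    tx_buf[TX_BUF_SIZE];
    extern volatile int tx_head, tx_tail;

    /* TX ISR: send the next character straight from the ISR and only
       ask for the DSR once the buffer has drained, e.g. to wake the
       writing thread. */
    static cyg_uint32 uart_tx_isr(cyg_vector_t vector, cyg_addrword_t data)
    {
        cyg_drv_interrupt_acknowledge(vector);

        if (tx_head != tx_tail) {
            UART_THR = tx_buf[tx_head];
            tx_head  = (tx_head + 1) % TX_BUF_SIZE;
            return CYG_ISR_HANDLED;                 /* no DSR needed this time */
        }
        cyg_drv_interrupt_mask(vector);             /* nothing left to send    */
        return CYG_ISR_HANDLED | CYG_ISR_CALL_DSR;  /* transmission complete   */
    }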
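
And option 4, the polled variant, is essentially a dedicated thread
whose priority is tuned against the rest of the application. The
thread calls below are the standard kernel C API; the uart_* helpers
are placeholders for whatever polled access the hardware needs.

    #include <cyg/kernel/kapi.h>

    static unsigned char poll_stack[4096];
    static cyg_thread    poll_thread_data;
    static cyg_handle_t  poll_thread;

    /* Hypothetical polled-mode helpers for the hardware. */
    extern int  uart_rx_ready(void);
    extern int  uart_rx_byte(void);
    extern void uart_handle_rx(int c);
    extern void uart_service_tx(void);

    static void uart_poll_entry(cyg_addrword_t data)
    {
        for (;;) {
            while (uart_rx_ready())
                uart_handle_rx(uart_rx_byte());
            uart_service_tx();
            cyg_thread_yield();          /* give other runnable threads a chance */
        }
    }

    void uart_poll_start(void)
    {
        cyg_thread_create(10,            /* priority: tune against the application */
                          uart_poll_entry, 0, "uart poll",
                          poll_stack, sizeof(poll_stack),
                          &poll_thread, &poll_thread_data);
        cyg_thread_resume(poll_thread);
    }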

The current DSR scheduling code works just fine for most systems.
Switching to fifo order would require more cpu cycles, so you would be
penalizing all systems for the sake of a few where I/O and cpu
performance are out of balance. That is certainly not acceptable as a
default. It might be acceptable as a configuration option, but should
only be used when there is no other way to reduce the I/O overheads.
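
For completeness, the difference at issue is roughly the following
(illustrative C only, not the actual drv_api.c code): head insert and
remove need a single list head and nothing more, whereas fifo order
means maintaining a tail pointer, or walking the list, on every post.

    struct dsr_rec {
        struct dsr_rec *next;
        /* ... DSR function pointer, count, data ... */
    };

    static struct dsr_rec *dsr_head;

    /* Current scheme (sketch): push at the head, pop at the head. O(1)
       with one pointer of state, but later DSRs run before earlier ones. */
    static void post_dsr_lifo(struct dsr_rec *d)
    {
        d->next  = dsr_head;
        dsr_head = d;
    }

    /* FIFO alternative (sketch): append at the tail so DSRs run in
       posting order, at the cost of extra state and work on every post.
       The dequeue path must also clear dsr_tail when the list empties. */
    static struct dsr_rec *dsr_tail;

    static void post_dsr_fifo(struct dsr_rec *d)
    {
        d->next = 0;
        if (dsr_tail != 0)
            dsr_tail->next = d;
        else
            dsr_head = d;
        dsr_tail = d;
    }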

Bart

-- 
Bart Veer                       eCos Configuration Architect
http://www.ecoscentric.com/     The eCos and RedBoot experts


-- 
Before posting, please read the FAQ: http://ecos.sourceware.org/fom/ecos
and search the list archive: http://ecos.sourceware.org/ml/ecos-discuss

