This is the mail archive of the ecos-patches@sources.redhat.com mailing list for the eCos project.



Bug in eCos kernel timers


Hi!

We found a bug in Cyg_Counter::rem_alarm that, with CYGIMP_KERNEL_COUNTERS_MULTI_LIST enabled, occasionally corrupts the kernel timer lists.

The problem occurs in the (very unlikely) event that rem_alarm is interrupted between selecting the list pointer and performing the actual remove operation. For example, a thread disables a timer that has an interval while the tick function re-inserts the very same timer into a different list (as its head); the remove then operates on a stale list pointer. Other error conditions are possible even without an interval. In general, the problem can occur whenever a timer is used from several threads.
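
The window described above can be modelled in a few lines of standalone C++. This is only a sketch under stated assumptions: Alarm, ModelCounter and every function name below are hypothetical stand-ins, not the eCos classes; the "interrupt" is an ordinary callback invoked at the point where the tick function could run; and the model uses std::list, so the stale pointer shows up as an alarm that stays queued rather than as the list corruption we see on the real kernel lists.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <list>

// Hypothetical stand-ins for illustration only; these are not the eCos classes.
struct Alarm {
    unsigned trigger;   // tick at which the alarm fires
    unsigned interval;  // re-arm interval in ticks
};

class ModelCounter {
public:
    static constexpr std::size_t kLists = 8;

    void add(Alarm* a) { list_for(a).push_back(a); }

    // Models the tick function: fire the alarm and re-insert it at its next
    // trigger time, which here maps to a different list.
    void fire_and_rearm(Alarm* a) {
        list_for(a).remove(a);
        a->trigger += a->interval;
        add(a);
    }

    // Old ordering: the list is selected first, so the "interrupt" can run
    // between the selection and the remove.
    void rem_unlocked_select(Alarm* a, const std::function<void()>& interrupt) {
        std::list<Alarm*>* lp = &list_for(a);
        interrupt();      // tick function moves the alarm to another list
        lp->remove(a);    // stale pointer: the alarm is no longer in *lp
    }

    // New ordering: selection and remove form one critical section, so the
    // "interrupt" can only run before it begins.
    void rem_locked_select(Alarm* a, const std::function<void()>& interrupt) {
        interrupt();
        std::list<Alarm*>* lp = &list_for(a);  // models: lock taken here
        lp->remove(a);                         // models: still locked here
    }

    bool queued(const Alarm* a) const {
        for (const auto& l : lists_)
            if (std::find(l.begin(), l.end(), a) != l.end()) return true;
        return false;
    }

private:
    std::list<Alarm*>& list_for(const Alarm* a) {
        return lists_[a->trigger % kLists];
    }
    std::list<Alarm*> lists_[kLists];
};

// Returns true when the stale list pointer leaves the alarm queued.
bool stale_pointer_leaves_alarm_queued() {
    ModelCounter c;
    Alarm a{3, 2};
    c.add(&a);
    c.rem_unlocked_select(&a, [&] { c.fire_and_rearm(&a); });
    return c.queued(&a);
}

// Returns true when selecting the list inside the critical section removes it.
bool locked_selection_removes_alarm() {
    ModelCounter c;
    Alarm a{3, 2};
    c.add(&a);
    c.rem_locked_select(&a, [&] { c.fire_and_rearm(&a); });
    return !c.queued(&a);
}
```

Both helper functions return true in this model: with the old ordering the alarm survives the remove, with the new ordering it is gone. The patch below makes the same change of ordering in the kernel by taking the scheduler lock before the list is selected.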

The bug caused our boards to hang occasionally after 3 days of operation. The application continuously starts and stops 300 timers. To make the bug more likely to trigger, we inserted a large number of NOPs between the selection of the pointer and the remove operation.

best regards,
Tom
--
--- packages/kernel/current/ChangeLog.old	Wed Jun 25 11:32:26 2003
+++ packages/kernel/current/ChangeLog	Wed Jun 25 11:30:24 2003
@@ -1,5 +1,11 @@
+2003-06-25  Thomas Binder  <Thomas.Binder@frequentis.com>
+
+	* src/common/clock.cxx (Cyg_Counter::rem_alarm): Bugfix: call
+	Cyg_Scheduler::lock() before calculation of index into alarm_list
+	array to avoid race condition with multi list counters.
+
 2003-06-06  David Brennan  <eCos@brennanhome.com>
 2003-06-23  Nick Garnett  <nickg@balti.calivar.com>
 
 	* cdl/kernel.cdl: Added tests/bin_sem3 to list of kernel tests.
 
Index: packages/kernel/current/src/common/clock.cxx
===================================================================
RCS file: /project/cvsroot/vcs_3020_series/vds6000/software/os/ecos/ecos/packages/kernel/current/src/common/clock.cxx,v
retrieving revision 1.3
diff -a -U5 -r1.3 clock.cxx
--- packages/kernel/current/src/common/clock.cxx	2003/03/26 15:51:42	1.3
+++ packages/kernel/current/src/common/clock.cxx	2003/06/25 09:21:41
@@ -395,10 +395,12 @@
     CYG_ASSERTCLASS( this, "Bad counter object" );
     CYG_ASSERTCLASS( alarm, "Bad alarm passed" );
     
     Cyg_Alarm_List *alarm_list_ptr;     // pointer to list
 
+    Cyg_Scheduler::lock();
+
 #if defined(CYGIMP_KERNEL_COUNTERS_SINGLE_LIST)
 
     alarm_list_ptr = &alarm_list;
 
 #elif defined(CYGIMP_KERNEL_COUNTERS_MULTI_LIST)
@@ -411,12 +413,10 @@
 #error "No CYGIMP_KERNEL_COUNTERS_x_LIST config"
 #endif
 
     // Now that we have the list pointer, we can use common code for
     // both list organizations.
-
-    Cyg_Scheduler::lock();
 
     CYG_INSTRUMENT_ALARM( REM, this, alarm );
 
     alarm_list_ptr->remove( alarm );
     
