This is the mail archive of the ecos-discuss@sourceware.org mailing list for the eCos project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: Problems with "Scheduler lock not zero"


Øyvind Harboe wrote:
On 11/7/06, Jürgen Lambrecht <J.Lambrecht@televic.com> wrote:

Hello Harboe,


I'm wondering.... What would be the observable effects of the
scheduler lock count not being zero if asserts weren't enabled?

If the answer is that everything, except timeslicing, would work just
fine, then I may have observed this on another project.
Now we're mainly testing release versions, so with asserts disabled..


I am using the MLQ scheduler again instead of the bitmap scheduler, and I have not seen the error since.


I'm using the MLQ scheduler and I do see the problem.

Perhaps you still have the problem, only more rarely?

It would not surprise me one bit if this problem is timing sensitive
and pretty much anything could make it come or go.

From my ecos.ecc:


cdl_component CYGSEM_KERNEL_SCHED_MLQUEUE {
   # Flavor: bool
   # No user value, uncomment the following line to provide one.
   # user_value 1
   # value_source default
   # Default value: 1


My eCos is from Feb 15, 2006, so I have that patch from Nick.


I fetched a fresh version today, and the problem exists with our HAL &
CVS HEAD. Since this problem appears to be rare, I would suspect that
a) either our HAL is somehow provoking a rare problem or b) our HAL is
busted. We're using the opencores ethermac, otherwise it is basically
an EB40a.

Our HAL is based on the EB55, with a memory-mapped ethermac, an I2C driver, and an extended TFTP server.

As an experiment I disabled timeslicing (since I'm using pthreads this requires a bit of hacking), and the problem persists.
Be careful: to be able to use the bitmap scheduler, I had to make sure each priority was unique. By default, both the main thread and the tftpd thread have priority 10 (CYGNUM_LIBC_MAIN_THREAD_PRIORITY and CYGPKG_NET_TFTPD_THREAD_PRIORITY), so I changed their priorities to 8 and 9.
Now we use the MLQ scheduler again, but I kept the bitmap priorities for the main and tftpd threads, so timeslicing no longer comes into play between them. Maybe that is the reason we no longer see the problem?
(The networking threads still have their default priorities, CYGPKG_NET_THREAD_PRIORITY and CYGPKG_NET_FAST_THREAD_PRIORITY (7 and 6).)
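For reference, the unique-priority change described above might look roughly like this in ecos.ecc (a sketch only; the surrounding generated comments and exact savefile layout will differ in your own configuration):

```
cdl_option CYGNUM_LIBC_MAIN_THREAD_PRIORITY {
    # Flavor: data
    user_value 8
};

cdl_option CYGPKG_NET_TFTPD_THREAD_PRIORITY {
    # Flavor: data
    user_value 9
};
```

With the bitmap scheduler, every active thread must sit at a distinct priority level, so any two options that default to the same value (here, both defaulted to 10) need an explicit user_value.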


-- Before posting, please read the FAQ: http://ecos.sourceware.org/fom/ecos and search the list archive: http://ecos.sourceware.org/ml/ecos-discuss

