Re: Sleeping thread (tid 100033, pid 16): panic in FreeBSD
On Tue, Dec 20, 2011 at 6:32 AM, John Baldwin <jhb@freebsd.org> wrote:
> On Tuesday, December 20, 2011 9:22:48 am mdf@freebsd.org wrote:
>> On Tue, Dec 20, 2011 at 5:52 AM, John Baldwin <jhb@freebsd.org> wrote:
>> > On Saturday, December 17, 2011 10:41:15 pm mdf@freebsd.org wrote:
>> >> On Sat, Dec 17, 2011 at 5:45 PM, Alexander Kabaev <kabaev@gmail.com> =
wrote:
>> >> > On Sun, 18 Dec 2011 01:09:00 +0100
>> >> > "O. Hartmann" <ohartman@zedat.fu-berlin.de> wrote:
>> >> >
>> >> >> Sleeping thread (tid 100033, pid 16) owns a non sleepable lock
>> >> >> panic: sleeping thread
>> >> >> cpuid =3D 0
>> >> >>
>> >> >> PID 16 is always USB on my box.
>> >> >
>> >> > You really need to give us a backtrace when you quote panics. It is
>> >> > impossible to make any sense of the above panic message without mor=
e
>> >> > context.
>> >>
>> >> In the case of this panic, the stack of the thread which panics is
>> >> useless; it's someone trying to propagate priority that discovered it=
..
>> >> =A0A backtrace on tid 100033 would be useful.
>> >>
>> >> With WITNESS enabled, it's possible to have this panic display the
>> >> stack of the incorrectly sleeping thread at the time it acquired the
>> >> lock, as well, but this code isn't in CURRENT or any release. =A0I ha=
ve
>> >> a patch at $WORK I can dig up on Monday.
>> >
>> > Huh? =A0The stock kernel dumps a stack trace of the offending thread i=
f you have
>> > DDB enabled:
>> >
>> > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0/*
>> > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 * If the thread is asleep, then we are=
probably about
>> > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 * to deadlock. =A0To make debugging th=
is easier, just
>> > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 * panic and tell the user which thread=
misbehaved so
>> > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 * they can hopefully get a stack trace=
from the truly
>> > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 * misbehaving thread.
>> > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 */
>> > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0if (TD_IS_SLEEPING(td)) {
>> > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0printf(
>> > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0"Sleeping thread (tid %d, pid %d) owns =
a non-sleepable lock\n",
>> > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0td->td_tid, td-=
>td_proc->p_pid);
>> > #ifdef DDB
>> > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0db_trace_thread(td, -1)=
;
>> > #endif
>> > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0panic("sleeping thread"=
);
>> > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0}
>>
>> Hmm, maybe this wasn't in 7, or maybe I'm just remembering that we
>> added code to print *which* lock it holds (using WITNESS data). =A0I do
>> recall that this panic alone was often not sufficient to debug the
>> problem.
>
> I think the db_trace_thread() has been around for a while (since 5 or 6),
> but it is true that we don't tell you which lock is held even with this.
> That might be a useful thing to output before the panic.
This patch isn't quite right since I had to hand-edit it. There's a
small chance I can commit this in the near future, but of someone else
wants to take it, feel free. Style isn't yet fixed up to be FreeBSD
standard either.
--- /data/sb/bsd.git/sys/kern/subr_turnstile.c 2011-12-12
10:23:12.542196632 -0800
+++ kern/subr_turnstile.c 2011-12-09 10:59:29.882643558 -0800
@@ -165,10 +165,43 @@
static void turnstile_dtor(void *mem, int size, void *arg);
#endif
static int turnstile_init(void *mem, int size, int flags);
static void turnstile_fini(void *mem, int size);
+#ifdef INVARIANTS
+static void
+sleeping_thread_owns_a_nonsleepable_lock(struct thread *td)
+{
+ printf("Sleeping thread (tid %d, pid %d) owns a non-sleepable lock\n",
+ td->td_tid, td->td_proc->p_pid);
+#ifdef DDB
+ db_trace_thread(td, -1);
+#endif
+#ifdef WITNESS
+ struct lock_list_entry *lock_list, *lle;
+ int i;
+
+ lock_list =3D td->td_sleeplocks;
+ if (lock_list =3D=3D NULL || lock_list->ll_count =3D=3D 0) {
+ printf("Thread does not appear to hold any mutexes!\n");
+ return;
+ }
+
+ for (lle =3D lock_list; lle !=3D NULL; lle =3D lle->ll_next) {
+ for (i =3D lle->ll_count - 1; i >=3D 0; i--) {
+ struct lock_instance *li =3D &lle->ll_children[i];
+
+ printf("Lock %s acquired at %s:%d\n",
+ li->li_lock->lo_name, li->li_file, li->li_line);
+ }
+ }
+#endif /* WITNESS */
+}
+#else
+#define sleeping_thread_owns_a_nonsleepable_lock(td) do { } while (0)
+#endif /* INVARIANTS */
+
/*
* Walks the chain of turnstiles and their owners to propagate the priorit=
y
* of the thread being blocked to all the threads holding locks that have =
to
* release their locks before this thread can run again.
*/
@@ -210,19 +243,31 @@
* If the thread is asleep, then we are probably about
* to deadlock. To make debugging this easier, just
* panic and tell the user which thread misbehaved so
* they can hopefully get a stack trace from the truly
* misbehaving thread.
*/
if (TD_IS_SLEEPING(td)) {
- printf(
- "Sleeping thread (tid %d, pid %d) owns a non-sleepable lock\n",
- td->td_tid, td->td_proc->p_pid);
-#ifdef DDB
- db_trace_thread(td, -1);
-#endif
- panic("sleeping thread");
+ sleeping_thread_owns_a_nonsleepable_lock(td);
+ panic("sleeping thread %p owns a nonsleepable lock",
+ td);
}
/*
* If this thread already has higher priority than the
* thread that is being blocked, we are finished.
Cheers,
matthew
_______________________________________________
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"
討論串 (同標題文章)
完整討論串 (本文為第 7 之 7 篇):