Re: [poll / rfc] kdb_stop_cpus
On 3 Jun 2011, at 16:13, Andriy Gapon wrote:
> I wonder if anybody uses kdb_stop_cpus with non-default value.
> If yes, I am very interested to learn about your use case for it.
The issue that prompted the sysctl was non-NMI IPIs being used to enter
the debugger or reboot following a core hanging with interrupts
disabled. With the switch to NMI IPIs in some of those circumstances,
life is better -- at least, on hardware that supports non-maskable IPIs.
I seem to recall sparc64 doesn't, however? Not sure about MIPS, etc.
Attilio has since significantly improved our shutdown behaviour --
initially, the switch to NMI IPIs broke other things (because certain
IPIs then improperly preempted threads holding spinlocks), but that
pretty much all seems worked out now.
Robert
>
> I think that the default kdb behavior is the correct one, so it doesn't make sense
> to have a knob to turn on incorrect behavior.
> But I may be missing something obvious.
>
> The comment in the code doesn't really satisfy me:
> /*
> * Flag indicating whether or not to IPI the other CPUs to stop them on
> * entering the debugger. Sometimes, this will result in a deadlock as
> * stop_cpus() waits for the other cpus to stop, so we allow it to be
> * disabled. In order to maximize the chances of success, use a hard
> * stop for that.
> */
>
> The hard stop should be sufficiently mighty.
> Yes, I am aware of supposedly extremely rare situations where a deadlock could
> happen even when using hard stop. But I'd rather fix that than have this switch.
>
> Oh, the commit message (from 2004) explains it:
>> Add a new sysctl, debug.kdb.stop_cpus, which controls whether or not we
>> attempt to IPI other cpus when entering the debugger in order to stop
>> them while in the debugger. The default remains to issue the stop;
>> however, that can result in a hang if another cpu has interrupts disabled
>> and is spinning, since the IPI won't be received and the KDB will wait
>> indefinitely. We probably need to add a timeout, but this is a useful
>> stopgap in the mean time.
>
> But that was before we started using hard stop in this context (in 2009).
>
> --
> Andriy Gapon
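
The 2004 commit message quoted above suggests a timeout as the longer-term
fix for the unbounded wait. As a rough illustration only, here is a
user-space C simulation of that idea: a bounded poll that reports which
CPUs never acknowledged the stop. Every name in it (sim_cpu,
sim_deliver_stop, sim_stop_cpus, SIM_MAXCPU, max_polls) is hypothetical
and is not the FreeBSD kernel API; the "CPUs" are plain structs, and a
maskable stop is simply ignored by a CPU modelled as spinning with
interrupts disabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SIM_MAXCPU 8	/* hypothetical limit for this simulation */

/* Hypothetical per-CPU state; not a kernel structure. */
struct sim_cpu {
	bool intr_disabled_spinning;	/* never sees a maskable IPI */
	bool stopped;			/* has acknowledged the stop */
};

/*
 * Model delivery of a stop request. A "hard" (NMI-style) stop always
 * lands; a maskable one is lost on a CPU spinning with interrupts off.
 */
void
sim_deliver_stop(struct sim_cpu *c, bool hard)
{
	if (hard || !c->intr_disabled_spinning)
		c->stopped = true;
}

/*
 * Ask every CPU except 'self' to stop and poll at most 'max_polls'
 * times. Returns a bitmask of CPUs that did not stop, so the caller
 * can enter the debugger anyway instead of waiting indefinitely.
 */
uint32_t
sim_stop_cpus(struct sim_cpu *cpus, int ncpus, int self, bool hard,
    int max_polls)
{
	uint32_t pending = 0;

	assert(ncpus > 0 && ncpus <= SIM_MAXCPU);
	for (int i = 0; i < ncpus; i++)
		if (i != self)
			pending |= (uint32_t)1 << i;

	for (int poll = 0; poll < max_polls && pending != 0; poll++) {
		for (int i = 0; i < ncpus; i++) {
			if ((pending & ((uint32_t)1 << i)) == 0)
				continue;
			sim_deliver_stop(&cpus[i], hard);
			if (cpus[i].stopped)
				pending &= ~((uint32_t)1 << i);
		}
	}
	return (pending);
}
```

With four simulated CPUs where CPU 2 is spinning with interrupts
disabled, a maskable stop from CPU 0 returns with bit 2 still set once
the poll budget is exhausted, while a hard stop returns 0. A real kernel
implementation would also have to pick a sensible timeout and decide
what to do about CPUs that are still running.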
_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"