Re: smp_rendezvous runs with interrupts and preemption enabled o

看板FB_current作者時間14年前 (2011/10/29 02:01), 編輯推噓0(000)
留言0則, 0人參與, 最新討論串1/4 (看更多)
On Fri, Oct 28, 2011 at 8:37 AM, Ryan Stone <rysto32@gmail.com> wrote: > I'm seeing issues on a unicore systems running a derivative of FreeBSD > 8.2-RELEASE if something calls mem_range_attr_set. =A0It turns out that > the root cause is a bug in smp_rendezvous_cpus. =A0The first part of > smp_rendezvous_cpus attempts to short-circuit the non-SMP case(note > that smp_started is never set to 1 on a unicore system): > > =A0 =A0 =A0 =A0if (!smp_started) { > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0if (setup_func !=3D NULL) > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0setup_func(arg); > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0if (action_func !=3D NULL) > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0action_func(arg); > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0if (teardown_func !=3D NULL) > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0teardown_func(arg); > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0return; > =A0 =A0 =A0 =A0} > > The problem is that this runs with interrupts enabled, outside of a > critical section. =A0My system runs with device_polling enabled with hz > set to 2500, so its quite easy to wedge the system by having a thread > run mem_range_attr_set. =A0That has to do a smp_rendezvous, and if a > timer interrupt happens to go off half-way through the action_func and > preempt this thread, the system ends up deadlocked(although once it's > wedged, typing at the serial console stands a good chance of unwedging > the system. =A0Go figure). > > I know that smp_rendezvous was reworked substantially on HEAD, but by > inspection it looks like the bug is still present, as the > short-circuit behaviour is still there. > > I am not entirely sure of the best way to fix this. =A0Is it as simple > as doing a spinlock_enter before setup_func and a spinlock_exit after > teardown_func? =A0It seems to boot fine, but I'm not at all confident > that I understand the nuances of smp_rendezvous to be sure that there > aren't side effects that I don't know about. Looks like Max didn't have time to upstream this fix: struct mtx smp_ipi_mtx; +MTX_SYSINIT(smp_ipi_mtx, &smp_ipi_mtx, "smp rendezvous", MTX_SPIN); .... static void mp_start(void *dummy) { - mtx_init(&smp_ipi_mtx, "smp rendezvous", NULL, MTX_SPIN); .... if (!smp_started) { + mtx_lock_spin(&smp_ipi_mtx); if (setup_func !=3D NULL) setup_func(arg); if (action_func !=3D NULL) action_func(arg); if (teardown_func !=3D NULL) teardown_func(arg); + mtx_unlock_spin(&smp_ipi_mtx); return; } Cheers, matthew _______________________________________________ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"
文章代碼(AID): #1Egkta7R (FB_current)
文章代碼(AID): #1Egkta7R (FB_current)