Re: rc and smf
Dan Melomedman wrote:
> Joerg Sonnenberger wrote:
>
>>Actually, this is exactly one of the situations where I don't want
>>automatic, silent restarts. It hides problems, which is in my position
>>even more problematic. "Magic restart" doesn't solve every problem.
>>
>>Joerg
>
>
> Nothing solves every problem. Supervision solves the 'Oops, something
> crashed, and needs to be restarted' problem. If my nearby nuclear power
> plant's reactor monitoring software running on a Unix box gets killed
> due to a memory leak, I want it restarted immediately, not wait for the
> administrator to find out by the time the reactor melts down.
No, you do not.
What you DO want, when *any* fault occurs of that nature, is for a
totally separate system - usually a 'state machine' - or even *gravity*
to take over and 'safe' that plant until the real cause is scrutinized
by a team of experts. Too much is at stake to blindly restart a daemon
OR the OS.
Unix has no more business running nuclear power plants than Windows does.
That is specialized RTOS ground. Or state machines monitored by
specialized computers. Or both.
> All fault
> tolerant systems have some kind of supervision in software.
All seriously critical ones have hardware / firmware fall-backs and
manual overrides as well.
All failures, be they at an oil refinery, a chemical plant, a power plant,
or on web and mail servers, *should* be brought to human attention,
examined, and attended to by folks with brains. That way we can fix them,
not be victims of them.
Bill
Full thread (this is post 24 of 53):