Re: three kernel patches for review
That's interesting, but how do you know the location of the bad
addresses? Run memtest at boot?
Run memtest manually and export a table of bad addresses to a file?
As a module gets further into its service life, more and more bits are
going to break. How do we handle that?
That said, I like the idea of being able to survive with bad RAM for a
while.
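
To make the "table of bad addresses" idea concrete, here is a rough
sketch of what I imagine the consuming side could look like. It is not
taken from the patch under review: the helper names, the 4 KiB page
size, and the idea of a flat array of addresses produced by a memory
tester are all assumptions of mine. The table is collapsed to a sorted
list of page frame numbers, which an allocator could then consult. If
more bits break later, the table would simply be regenerated and
reloaded.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. A real kernel would use its own constant. */
#define SKETCH_PAGE_SHIFT 12

/*
 * Hypothetical helper: convert a table of bad physical addresses (as a
 * memory tester might report them) into a sorted, de-duplicated list of
 * page frame numbers. Works in place and returns the number of frames.
 */
static size_t
bad_addrs_to_frames(uint64_t *addrs, size_t n)
{
	size_t i, j, out = 0;

	assert(addrs != NULL || n == 0);

	/* Reduce every address to the page frame that contains it. */
	for (i = 0; i < n; i++)
		addrs[i] >>= SKETCH_PAGE_SHIFT;

	/* Insertion sort; such a table is expected to be short. */
	for (i = 1; i < n; i++) {
		uint64_t key = addrs[i];

		for (j = i; j > 0 && addrs[j - 1] > key; j--)
			addrs[j] = addrs[j - 1];
		addrs[j] = key;
	}

	/* Drop duplicates: several bad bits may share one page. */
	for (i = 0; i < n; i++) {
		if (out == 0 || addrs[out - 1] != addrs[i])
			addrs[out++] = addrs[i];
	}
	return out;
}

/*
 * Hypothetical check: does the physical address paddr fall in a page
 * that is listed as bad? Binary search over the sorted frame list.
 */
static int
frame_is_bad(const uint64_t *frames, size_t n, uint64_t paddr)
{
	uint64_t pfn = paddr >> SKETCH_PAGE_SHIFT;
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (frames[mid] == pfn)
			return 1;
		if (frames[mid] < pfn)
			lo = mid + 1;
		else
			hi = mid;
	}
	return 0;
}
```

The cost is one whole page per bad bit, which still leaves nearly all
of the module usable.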
Raphael Marmier
Chris Pressey wrote:
> On Tue, 19 Apr 2005 18:48:02 -0700 (PDT)
> Matthew Dillon <dillon@apollo.backplane.com> wrote:
>
>
>>[...]
>> Otherwise, the ANSIfication patch generally looks good, go ahead
>> and commit #1 after you clean up that comment.
>>
>> On #2 ... looks reasonable. Commit away!
>
>
> OK, committed!
>
>
>> On #3 ... that doesn't look so reasonable. I suppose on a machine
>> with huge amounts of memory one might want such a mechanism, but
>> frankly if memory is bad (especially if it is ECC'd memory), the
>> only correct solution is to replace it.
>
>
> I'm going to call you on that one, Matt - _why_ do you say that is the
> only correct solution?
>
> My understanding of the service curve of RAM is that it is not like that
> of disks. Entropy does affect RAM, but at a much longer time-scale, so
> the first few bad bits you see are much more likely to be flukes than an
> indication that the RAM stick is reaching the end of its useful life.
>
> Also, the conventional wisdom that the thing you should do when you have
> bad bits in a stick of RAM is to replace the entire stick, sounds like
> it stems from the fact that the OS has no way of remapping those bad
> bits (like it has with a disk.) Of course, with this patch, that fact
> would no longer be a fact, and that wisdom wouldn't hold water anymore.
>
> On Wed, 20 Apr 2005 10:24:37 +0200
> Joerg Sonnenberger <joerg@britannica.bec.de> wrote:
>
>
>>I'm also split on the badram patch. I have some RAM modules which have
>>static bad bits, so they could be used with the bad ram patch. On the
>>other hand, such modules should be replaced and burned :)
>
>
> Same question to you, Joerg - _why_ should they be replaced and burned?
>
> When I consider the sheer amount of resources that go into manufacturing
> a stick, and that there are typically millions of still-good bits that
> could still be put to use on a "bad" one, I'd consider it a rotten shame
> to just throw it out.
>
> The Linux BadRAM project's website also lists some sound motivations,
> including a commercial one:
>
> http://rick.vanrein.org/linux/badram/
>
> Anyway, that's my case for including this patch. If you still don't
> think it should go in, I won't say anything further, but please do at
> least consider the reasons I've given.
>
> -Chris