Re: failing disk, or not?
George Georgalis wrote:
> I'm seeing some disk errors in dfly that I cannot reproduce when
> checking the partition from another OS:
>
> ad4: UDMA ICRC error writing fsbn 249842603 of 110387664-110387679 (ad4 bn 249842603; cn 15551 tn 250 sn 38) retrying
> ad4: UDMA ICRC error writing fsbn 249842603 of 110387664-110387679 (ad4 bn 249842603; cn 15551 tn 250 sn 38) retrying
> ad4: UDMA ICRC error writing fsbn 249842603 of 110387664-110387679 (ad4 bn 249842603; cn 15551 tn 250 sn 38) retrying
> ad4: UDMA ICRC error writing fsbn 488278315 of 229605520-229605535 (ad4 bn 488278315; cn 30393 tn 234 sn 28) retrying
> ad4: UDMA ICRC error writing fsbn 488278315 of 229605520-229605535 (ad4 bn 488278315; cn 30393 tn 234 sn 28) retrying
> ad4: UDMA ICRC error writing fsbn 488278443 of 229605584-229605599 (ad4 bn 488278443; cn 30393 tn 236 sn 30) retrying
>
> This happened while running dvdbackup, and I reproduced it by running
> a dd read from the partition. However, after several attempts I cannot
> reproduce it from Linux with a badblocks check (read or non-destructive
> write) or a dd read from the partition. I know failures can be
> intermittent, but getting no errors at all from Linux so far seems odd
> if the disk is really failing.
Might DFLY be attempting I/O beyond the permitted
end of the assigned area? Or to an area that Linux
is not trying to access?
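A quick sanity check on that first question, as a sketch only. The probe line further down reports ad4 as [387621/16/63], and each error line prints an "ad4 bn" value. Assuming 512-byte sectors, that the C/H/S triple multiplies out to the drive's full sector count, and that "bn" is meant as an absolute sector number on the disk (the log confirms none of these), the reported values can be compared against the end of the disk:

```python
# Geometry from the probe line "ad4: 190782MB <ST3200822AS> [387621/16/63]".
CYLINDERS, HEADS, SECTORS_PER_TRACK = 387621, 16, 63
total_sectors = CYLINDERS * HEADS * SECTORS_PER_TRACK

# "ad4 bn" values copied from the three distinct error lines.
reported_bn = [249842603, 488278315, 488278443]


def beyond_end(bn, total=total_sectors):
    """True if a sector number lies past the last addressable sector."""
    return bn >= total


for bn in reported_bn:
    status = "BEYOND end of disk" if beyond_end(bn) else "within disk"
    print(f"bn {bn:>11}: {status} (disk has {total_sectors} sectors)")
```

Under those assumptions the first value falls inside the disk and the other two fall past its end. That would fit either an out-of-range request or a driver printing a number that is not a true absolute sector; the arithmetic alone cannot tell the two apart.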
>
> # df -h
> Filesystem Size Used Avail Capacity Mounted on
> /dev/ad4s3a 248M 122M 106M 54% /
> /dev/ad4s3d 248M 1.3M 227M 1% /var
> /dev/ad4s3e 124G 94G 20G 83% /usr
> procfs 4.0K 4.0K 0B 100% /proc
>
> A bit of history: I did have a system lockup a week or two ago (I could
> switch virtual terminals but no keyboard input was accepted). I didn't
> file a bug because I was haphazardly experimenting (in user space) and
> couldn't explain well enough what I was doing at the time; now I don't
> even remember. An fsck was required, and with a 95GB /usr that took
> quite a while (comments welcome on why softupdates didn't help here),
Best case, SU just leaves data in an earlier state rather than
half-committed. It is more transaction-oriented than journalling.
fsck -y doesn't care about the content of the data, only about its
proper file indexing, so *maybe* some time is saved during
a 'preen', but there are no savings at all with fsck -y.
> Also, the /usr partition was near or over 100% capacity, but I
> never got disk full errors, i.e. didn't *completely* run out of space.
>
UFS normally keeps around a 10% reserve, and will usually stand 102% or
so before it even throws an error message.
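To put rough numbers on the reserve, here is a sketch using the rounded /usr figures from the df -h output above. It assumes df computes Capacity as used / (used + avail) and that the space missing from Size is the reserve. The inputs are rounded to whole gigabytes, so the results are approximate.

```python
# Rounded values for /dev/ad4s3e from the df -h output, in GB.
size_gb, used_gb, avail_gb = 124, 94, 20


def capacity_percent(used, avail):
    """Capacity as a percentage of the space ordinary users may fill."""
    return 100.0 * used / (used + avail)


# Space that is neither used nor available to ordinary users.
reserve_gb = size_gb - used_gb - avail_gb
reserve_percent = 100.0 * reserve_gb / size_gb

print(f"capacity now: about {capacity_percent(used_gb, avail_gb):.1f}%")
print(f"reserve: {reserve_gb} GB, about {reserve_percent:.1f}% of the filesystem")

# If the reserve were consumed as well, avail would go negative by reserve_gb.
print(f"capacity with the reserve consumed: "
      f"about {capacity_percent(size_gb, -reserve_gb):.1f}%")
```

With these rounded inputs the reserve on this filesystem comes out nearer 8% than 10%, and Capacity can read above 100% once writes start eating into it, which is consistent with seeing "near or over 100%" without a disk-full error.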
> At this point can I be sure my disk is failing or could there be some
> driver instability? The full dmesg is below.
>
> Don't see it in dmesg, but ad4 is a 200GB Seagate drive on an nvidia
> SATA controller. Disk Product Number ST3200822AS, Part Number 9W2854-301
>
> Thanks,
> // George
>
>
Cutting ...
> agp0: <NVIDIA Generic AGP Controller> mem 0xe0000000-0xe3ffffff at device 0.0 on pci0
> agp0: Unable to find NVIDIA Memory Controller 1.
Unable? That's odd.
> device_probe_and_attach: agp0 attach returned 19
> isab0: <PCI to ISA bridge (vendor=10de device=00e0)> at device 1.0 on pci0
> isa0: <ISA bus> on isab0
> pci0: <unknown card> (vendor=0x10de, dev=0x00e4) at 1.1 irq 10
NVIDIA nForce3 250 SMBus Controller?
*SNIP*
> atapci0: <Generic PCI ATA controller> port 0xf000-0xf00f at device 8.0 on pci0
> ata0: at 0x1f0 irq 14 on atapci0
> installed MI handler for int 14
> ata1: at 0x170 irq 15 on atapci0
> installed MI handler for int 15
> atapci1: <Generic PCI ATA controller> port 0xec00-0xec7f,0xeb00-0xeb0f,0xb70-0xb73,0x970-0x977,0xbf0-0xbf3,0x9f0-0x9f7 irq 11 at device 10.0 on pci0
> ata2: at 0x9f0 on atapci1
> installed MI handler for int 11
> ata3: at 0x970 on atapci1
*snip*
> ad0: 58644MB <Maxtor 6Y060L0> [119150/16/63] at ata0-master BIOSDMA
> ad4: DMA limited to UDMA33, non-ATA66 cable or device
> ad4: 190782MB <ST3200822AS> [387621/16/63] at ata2-master BIOSDMA
I'm puzzled:
- ata0-master claims /dev/ad0
- ata1-master claims /dev/acd0
- ata2-master claims /dev/ad4
- ata3 seems empty...
So how do we skip /dev/ad1, /dev/ad2, and /dev/ad3 to arrive at /dev/ad4?
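One possible explanation, offered as an assumption and not something this dmesg confirms: if the ATA driver numbers disks statically by position (two unit numbers per channel, master first), as FreeBSD's ata driver does when built with the ATA_STATIC_ID option, then the unit number depends on where a drive is attached and not on how many disks were found. A sketch of that rule, with a hypothetical helper name:

```python
def static_ad_unit(channel, role):
    """Unit number under position-based numbering: two slots per channel."""
    slot = {"master": 0, "slave": 1}[role]
    return 2 * channel + slot


# The two hard disks that appear in the probe lines.
for channel, role in [(0, "master"), (2, "master")]:
    print(f"ata{channel}-{role} -> ad{static_ad_unit(channel, role)}")
```

If that is what is happening here, ata0-master gets ad0 and ata2-master gets ad4, matching the probe lines, and ad1 through ad3 are simply the positions that hold no hard disk.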
> Mounting root from ufs:/dev/ad4s3a
> ad4: UDMA ICRC error writing fsbn 249842603 of 110387664-110387679 (ad4 bn 249842603; cn 15551 tn 250 sn 38) retrying
> ad4: UDMA ICRC error writing fsbn 249842603 of 110387664-110387679 (ad4 bn 249842603; cn 15551 tn 250 sn 38) retrying
> ad4: UDMA ICRC error writing fsbn 249842603 of 110387664-110387679 (ad4 bn 249842603; cn 15551 tn 250 sn 38) retrying
> ad4: UDMA ICRC error writing fsbn 488278315 of 229605520-229605535 (ad4 bn 488278315; cn 30393 tn 234 sn 28) retrying
> ad4: UDMA ICRC error writing fsbn 488278315 of 229605520-229605535 (ad4 bn 488278315; cn 30393 tn 234 sn 28) retrying
> ad4: UDMA ICRC error writing fsbn 488278443 of 229605584-229605599 (ad4 bn 488278443; cn 30393 tn 236 sn 30) retrying
>
>
You are on slice 3 (ad4s3), presumably well up in the cylinder count.
Might the areas above be a geometry mapping conflict?
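On the geometry question, one check can be done from the log alone (a sketch, with the head count as my assumption). The cn/tn/sn values in the error lines only reproduce the printed block numbers if the disk is treated as having 255 heads and 63 sectors per track, whereas the probe line reports [387621/16/63]. The 255 figure appears nowhere in the log; it is inferred here by trying the common BIOS translation.

```python
def chs_to_block(cn, tn, sn, heads, sectors_per_track):
    """Linear block number, treating sn as a zero-based offset in the track."""
    return (cn * heads + tn) * sectors_per_track + sn


# (bn, cn, tn, sn) copied from the three distinct error lines.
errors = [
    (249842603, 15551, 250, 38),
    (488278315, 30393, 234, 28),
    (488278443, 30393, 236, 30),
]

# Try the probe line's head count and the assumed BIOS-style head count.
for heads in (16, 255):
    matches = all(
        chs_to_block(cn, tn, sn, heads, 63) == bn for bn, cn, tn, sn in errors
    )
    verdict = "matches" if matches else "does not match"
    print(f"{heads} heads x 63 sectors: {verdict} the printed bn values")
```

So the error lines and the probe line appear to describe the disk with two different geometries. That does not by itself show a fault, but it is the kind of mismatch I had in mind.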
Bill