Re: "PANIC: could not open critical system index 2662" - twice - Mailing list pgsql-general

From Evgeny Morozov
Subject Re: "PANIC: could not open critical system index 2662" - twice
Date
Msg-id 010201877eb3c125-d9aff979-cad6-407b-8b4e-012487666088-000000@eu-west-1.amazonses.com
In response to Re: "PANIC: could not open critical system index 2662" - twice  (Laurenz Albe <laurenz.albe@cybertec.at>)
Responses Re: "PANIC: could not open critical system index 2662" - twice
List pgsql-general
> Hmm, I am not certain. The block was filled with zeros from your error
> message, and I think such blocks don't trigger a checksum warning.

OK, so data_checksums=on might not have made any difference in this case?
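
As a way to check for zero-filled blocks directly, here is a minimal Python sketch that lists the all-zero blocks in a file. It assumes the default 8 kB PostgreSQL block size; the function name `find_zero_blocks` and the path passed to it are hypothetical, not anything from this thread. As far as I know, PostgreSQL accepts an all-zero page as a new, uninitialized page without verifying a checksum, which would be consistent with the point quoted above.

```python
BLOCK_SIZE = 8192  # assumption: PostgreSQL's default block size (BLCKSZ)


def find_zero_blocks(path, block_size=BLOCK_SIZE):
    """Return the block numbers in `path` that consist entirely of zero bytes."""
    zero_block = bytes(block_size)
    zero_blocks = []
    with open(path, "rb") as f:
        block_no = 0
        while True:
            block = f.read(block_size)
            if not block:
                break
            # A short trailing read is compared against a zero run of equal length.
            if block == zero_block[:len(block)]:
                zero_blocks.append(block_no)
            block_no += 1
    return zero_blocks
```

This only reads the file, so it should be safe to run against a copy of whichever file backs the affected relation. A zero block is not necessarily corruption, since a relation can legitimately contain uninitialized pages.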


> So if your disk replaces a valid block with zeros (filesystem check
> after crash?), that could explain what you see.

If by "crash" you mean the OS crashing, that did not happen here. The OS
is on separate disks, which have not reported any errors.

When we first ran into this problem the PG data was on a ZFS RAIDZ (i.e.
RAID5-like) volume of 3 disks. For one of them, `zpool status -v`
reported read, write and checksum error counts > 0, but it also said
"errors: No known data errors", and the disk status remained "online"
(it did not become "faulted" or "offline"). (Now we have the PG data on
a ZFS mirror volume of 2 new disks, which have not reported any errors.)

I don't know whether ZFS zero-fills blocks on disk errors. As I
understand it, ZFS should have been able to recover from disk errors
(even ones that were "unrecoverable" at the hardware level) using the
data on the other two disks (which did not report any errors). Thus, PG
should not have seen any corrupted data (if ZFS was working correctly).
https://unix.stackexchange.com/questions/341614/understanding-the-error-reporting-of-zfs-on-linux
seems to confirm this. Am I misunderstanding something?



