Re: corrupted tuple (header?), pg_filedump output - Mailing list pgsql-hackers
From | Eric Parusel |
---|---|
Subject | Re: corrupted tuple (header?), pg_filedump output |
Date | |
Msg-id | 423B8B8D.30106@globalrelay.net |
In response to | corrupted tuple (header?), pg_filedump output (Eric Parusel <lists@globalrelay.net>) |
List | pgsql-hackers |
I've brought this back on-list, probably best that way..?

Eric Parusel wrote:
> Tom Lane wrote:
>
>> What it kinda looks like from here is that you suffered a "page tear":
>> the itemid pointers at the front of the page may be self-consistent, but
>> they don't quite match the state of the rest of the page. For instance
>> the claimed item-2 header is obviously bogus but it looks like there is
>> a valid header starting a few bytes after where the itemid points.
>> I suspect that the itemid pointers are one generation earlier or later
>> than the remainder of the page. Since disks typically write in 512-byte
>> sectors and there is nothing else in the first 512 bytes except the
>> itemids, we could imagine that that sector got written and then the rest
>> of the page did not. Postgres is supposed to protect against this sort
>> of thing in case of a system crash, but I wouldn't want to swear that
>> the protections are completely bulletproof. Have you had any power
>> failures or system crashes lately? What sort of hardware and OS is this
>> on?
>
>
> Hmm...
> Here is some system information:
>
> Dell PE1750, 2GB ECC ram, 2x73GB 10K scsi attached to Perc4/di
> (raid-on-motherboard, LSI megaraid chipset, battery-backed cache,
> write-back cache enabled), firmware/drivers is up to date as of a month
> ago.
>
> The OS is RHEL3, kept up to date with the newest kernel for it.
>
> PgSQL 8.0.1 installed from RPMs on postgresql.org, it had 8.0.0
> installed from DGPG RPMs initially until 8.0.1 came out.
>
> No power failures or crashes since it's been up...
>
> It's been up and running with moderate to heavy load for about 2 months
> now.
>
> I don't think there have been any pgsql backend (if that's the word for
> them) processes crashing or anything of that sort...
>
> Pretty heavy write load on the box, it will be getting a 14 disk raid10
> array plugged into it soon to speed things up.
>
>
>
> I can't remember and I couldn't find it, but is there a consistency
> checking tool (pg_fsck or something?) for pgsql? Or I suppose a dump of
> the whole database (which I do nightly) ensures all the data is readable...
>
> If there's anything else I can do to help figure this out, let me know..
>
> Thanks,
> Eric
>

How would I go about double checking I don't have this problem on other
pages? As above, a successful db dump would verify everything's fine?
I suppose a dump and reload after that point would verify that my
indexes and anything else in base/ is fine?

How would I figure out where and how much to overwrite with dd if I was
to clear this page? Or how would I set the invalid item's itemid to empty?

Obviously, stuff like this tends not to be in the documentation :D

Thanks for the help,
Eric
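
On the "where and how much to overwrite with dd" question, the arithmetic can be sketched as below. This is only a rough illustration, not a tested recovery procedure. It assumes the default 8192-byte block size and 1 GB segment files (131072 blocks each); a custom build may differ. The relation file path and the helper names `locate_block` and `dd_zero_command` are made up for the example. It also assumes that a page of all zeroes is treated as an empty page, which is my understanding but worth confirming on the list before trying it. The script only prints a dd command; it does not touch any file.

```python
# Sketch: find where a given heap block lives on disk and build a dd command
# that would overwrite exactly that one block with zeroes.
# Assumptions (not from the thread): default BLCKSZ and 1 GB segment files.

BLCKSZ = 8192          # assumed default block size in bytes
RELSEG_SIZE = 131072   # assumed blocks per segment file (1 GB / 8192)


def locate_block(block_number, blcksz=BLCKSZ, relseg_size=RELSEG_SIZE):
    """Return (segment_number, block_within_segment, byte_offset_in_segment)."""
    if block_number < 0:
        raise ValueError("block number must be non-negative")
    segment, block_in_segment = divmod(block_number, relseg_size)
    return segment, block_in_segment, block_in_segment * blcksz


def dd_zero_command(relfile, block_number, blcksz=BLCKSZ, relseg_size=RELSEG_SIZE):
    """Build (but do not run) a dd command that zeroes one block.

    relfile is the path of the relation's first segment; later segments
    carry a numeric suffix (.1, .2, ...).
    """
    segment, block_in_segment, _ = locate_block(block_number, blcksz, relseg_size)
    path = relfile if segment == 0 else f"{relfile}.{segment}"
    # conv=notrunc keeps dd from truncating the rest of the file.
    return (
        f"dd if=/dev/zero of={path} bs={blcksz} "
        f"seek={block_in_segment} count=1 conv=notrunc"
    )


if __name__ == "__main__":
    # Hypothetical relation file and block number, for illustration only.
    print(locate_block(3))
    print(dd_zero_command("base/17230/17245", 3))
```

Anything like this would want the postmaster shut down and a copy of the file taken first, since every row on the zeroed page is lost.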