Thread: FATAL: cache lookup failed for access method 6881280
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 I'm getting this message when I try to access one of the databases in my cluster: FATAL: cache lookup failed for access method 6881280 I get the message any way I try to access it: remotely via TCP, locally on the db server, as well as with a single-user mode postgres command. Any ideas what could be the cause, or what I could do to fix it? All the other (30+) databases in the cluster appear to be perfectly fine. Thanks, Brock - -- Brock Williams brock@cotcomsol.com Cottonwood Computer Solutions, Inc. www.cotcomsol.com 406-896-4910 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.0 (FreeBSD) iD8DBQFC8AjygMCVEEcrWuIRAr0rAJ0TJwvWDxxNSYq2nib2P2ix2IT9ggCeNixx UvLi8JF08ga2PiBW6hr0DaU= =Wum4 -----END PGP SIGNATURE-----
Have you tried restarting the Postmaster? Best Wishes, Chris Travers Metatron Technology Consulting Brock Williams wrote: >-----BEGIN PGP SIGNED MESSAGE----- >Hash: SHA1 > >I'm getting this message when I try to access one of the databases in >my cluster: > >FATAL: cache lookup failed for access method 6881280 > >I get the message any way I try to access it: remotely via TCP, >locally on the db server, as well as with a single-user mode postgres >command. > >Any ideas what could be the cause, or what I could do to fix it? All >the other (30+) databases in the cluster appear to be perfectly fine. > >Thanks, > >Brock >- -- >Brock Williams brock@cotcomsol.com >Cottonwood Computer Solutions, Inc. >www.cotcomsol.com 406-896-4910 >-----BEGIN PGP SIGNATURE----- >Version: GnuPG v1.4.0 (FreeBSD) > >iD8DBQFC8AjygMCVEEcrWuIRAr0rAJ0TJwvWDxxNSYq2nib2P2ix2IT9ggCeNixx >UvLi8JF08ga2PiBW6hr0DaU= >=Wum4 >-----END PGP SIGNATURE----- > >---------------------------(end of broadcast)--------------------------- >TIP 5: don't forget to increase your free space map settings > > > >
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Yep. I tried starting and stopping a couple of times. I also tried stopping postmaster and starting a single-user mode postgres process. Same result. Brock Chris Travers wrote: > Have you tried restarting the Postmaster? > > Best Wishes, > Chris Travers > Metatron Technology Consulting > > Brock Williams wrote: > > I'm getting this message when I try to access one of the databases in > my cluster: > > FATAL: cache lookup failed for access method 6881280 > > I get the message any way I try to access it: remotely via TCP, > locally on the db server, as well as with a single-user mode postgres > command. > > Any ideas what could be the cause, or what I could do to fix it? All > the other (30+) databases in the cluster appear to be perfectly fine. > > Thanks, > > Brock >> - ---------------------------(end of broadcast)--------------------------- TIP 5: don't forget to increase your free space map settings >> >> >> > ---------------------------(end of broadcast)--------------------------- > TIP 2: Don't 'kill -9' the postmaster -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.0 (FreeBSD) iD8DBQFC8AkegMCVEEcrWuIRApjZAJ9qvkzqIvVHKkgKyOfEl/Z3xr0JbACgtc0U 5CsANX2XfrkEkBoXRSPv298= =3uBS -----END PGP SIGNATURE-----
Brock Williams <brock@cotcomsol.com> writes: > I'm getting this message when I try to access one of the databases in > my cluster: > FATAL: cache lookup failed for access method 6881280 Unless you are using a custom index access method you didn't tell us about, this is pretty strong evidence of data corruption in the pg_class system catalog (specifically some pg_class.relam entry). If the corruption extends only as far as those four bytes, it'd be pretty easy to poke the correct value back in ... but it seems likely that this is just the tip of the iceberg :-( Please get a copy of pg_filedump from http://sources.redhat.com/rhdb/utilities.html and show us a dump of the pg_class file from that database (I like the format it produces with -i -f options; note pg_class is file 1259 in all known PG versions). regards, tom lane
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Tried to post directy to the list but I got a bounce back. Must be too big... Here it is posted to my web site: http://www.cotcomsol.com/~brock/postgresql_debug.txt Thanks for the help, Brock Tom Lane wrote: > Brock Williams <brock@cotcomsol.com> writes: > >>I'm getting this message when I try to access one of the databases in >>my cluster: > > >>FATAL: cache lookup failed for access method 6881280 > > > Unless you are using a custom index access method you didn't tell us > about, this is pretty strong evidence of data corruption in the pg_class > system catalog (specifically some pg_class.relam entry). > > If the corruption extends only as far as those four bytes, it'd be > pretty easy to poke the correct value back in ... but it seems likely > that this is just the tip of the iceberg :-( > > Please get a copy of pg_filedump from > http://sources.redhat.com/rhdb/utilities.html > and show us a dump of the pg_class file from that database (I like the > format it produces with -i -f options; note pg_class is file 1259 in all > known PG versions). > > regards, tom lane > > ---------------------------(end of broadcast)--------------------------- > TIP 5: don't forget to increase your free space map settings -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.0 (FreeBSD) iD8DBQFC8EPYgMCVEEcrWuIRAhCcAJ9RZEMrKFpbRfjHOVW1Vfj6eys+VgCfaKhA 2EV6MSgumAvDv+Lv0k2WrdI= =IZnE -----END PGP SIGNATURE-----
Brock Williams <brock@cotcomsol.com> writes: > Tried to post directy to the list but I got a bounce back. Must be too > big... > Here it is posted to my web site: > http://www.cotcomsol.com/~brock/postgresql_debug.txt Hmm, this is quite odd. You have Item 22 -- Length: 156 Offset: 4760 (0x1298) Flags: USED XMIN: 1 CMIN: 0 XMAX: 0 CMAX|XVAC: 0 Block Id: 1 linp Index: 22 Attributes: 24 Size: 32 infomask: 0x0910 (HASOID|XMIN_COMMITTED|XMAX_INVALID) 1298: 01000000 00000000 00000000 00000000 ................ 12a8: 00000100 16001800 10092000 4f410000 .......... .OA.. 12b8: 70675f70 726f635f 70726f6e 616d655f pg_proc_proname_ 12c8: 61726773 5f6e7370 5f696e64 65780000 args_nsp_index.. 12d8: 00000000 00000000 00000000 00000000 ................ 12e8: 00000000 00000000 00000000 00000000 ................ 12f8: 0b000000 00000000 01000000 00006900 ..............i. -------- 1308: 01000000 00000000 94000000 00c0da44 ...............D -------- 1318: 01000000 00000000 00006900 04000000 ..........i..... -------- 1328: 00000000 00000000 00000000 ............ A non-broken 8.0 installation has Item 22 -- Length: 156 Offset: 4760 (0x1298) Flags: USED XMIN: 1 CMIN: 0 XMAX: 0 CMAX|XVAC: 0 Block Id: 1 linp Index: 22 Attributes: 24 Size: 32 infomask: 0x0910 (HASOID|XMIN_COMMITTED|XMAX_INVALID) 1298: 01000000 00000000 00000000 00000000 ................ 12a8: 00000100 16001800 10092000 4f410000 .......... .OA.. 12b8: 70675f70 726f635f 70726f6e 616d655f pg_proc_proname_ 12c8: 61726773 5f6e7370 5f696e64 65780000 args_nsp_index.. 12d8: 00000000 00000000 00000000 00000000 ................ 12e8: 00000000 00000000 00000000 00000000 ................ 12f8: 0b000000 00000000 01000000 93010000 ................ -------- 1308: 4f410000 00000000 94000000 0060da44 OA...........`.D -------- 1318: 00000000 00000000 00006900 04000000 ..........i..... -------- 1328: 00000000 00000000 00000000 ............ where I underlined the parts that are different. The first different word is the relam field, which leads directly to the reported error message, and the other two changes are bad news as well. (Note: the difference at offset 1314 is the reltuples field, which could legitimately be different across installations, and your value matches the other pg_proc indexes so it's presumably OK.) I have no idea what happened here ... but if you can put those values back the way they were you'll probably be all right. You'll need to shut down the postmaster before editing the file (else it may have the page cached in shared memory), and if there is a pg_internal.init file in the same directory delete it. regards, tom lane
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Tom: Thanks for the help. I hex-edited the file back to what you recommended, and it did in fact work. I did a dump/restore on the database just to be safe as well. Brock Tom Lane wrote: > Brock Williams <brock@cotcomsol.com> writes: > >>Tried to post directy to the list but I got a bounce back. Must be too >>big... >>Here it is posted to my web site: >>http://www.cotcomsol.com/~brock/postgresql_debug.txt > > > Hmm, this is quite odd. You have > > Item 22 -- Length: 156 Offset: 4760 (0x1298) Flags: USED > XMIN: 1 CMIN: 0 XMAX: 0 CMAX|XVAC: 0 > Block Id: 1 linp Index: 22 Attributes: 24 Size: 32 > infomask: 0x0910 (HASOID|XMIN_COMMITTED|XMAX_INVALID) > > 1298: 01000000 00000000 00000000 00000000 ................ > 12a8: 00000100 16001800 10092000 4f410000 .......... .OA.. > 12b8: 70675f70 726f635f 70726f6e 616d655f pg_proc_proname_ > 12c8: 61726773 5f6e7370 5f696e64 65780000 args_nsp_index.. > 12d8: 00000000 00000000 00000000 00000000 ................ > 12e8: 00000000 00000000 00000000 00000000 ................ > 12f8: 0b000000 00000000 01000000 00006900 ..............i. > -------- > 1308: 01000000 00000000 94000000 00c0da44 ...............D > -------- > 1318: 01000000 00000000 00006900 04000000 ..........i..... > -------- > 1328: 00000000 00000000 00000000 ............ > > A non-broken 8.0 installation has > > Item 22 -- Length: 156 Offset: 4760 (0x1298) Flags: USED > XMIN: 1 CMIN: 0 XMAX: 0 CMAX|XVAC: 0 > Block Id: 1 linp Index: 22 Attributes: 24 Size: 32 > infomask: 0x0910 (HASOID|XMIN_COMMITTED|XMAX_INVALID) > > 1298: 01000000 00000000 00000000 00000000 ................ > 12a8: 00000100 16001800 10092000 4f410000 .......... .OA.. > 12b8: 70675f70 726f635f 70726f6e 616d655f pg_proc_proname_ > 12c8: 61726773 5f6e7370 5f696e64 65780000 args_nsp_index.. > 12d8: 00000000 00000000 00000000 00000000 ................ > 12e8: 00000000 00000000 00000000 00000000 ................ > 12f8: 0b000000 00000000 01000000 93010000 ................ > -------- > 1308: 4f410000 00000000 94000000 0060da44 OA...........`.D > -------- > 1318: 00000000 00000000 00006900 04000000 ..........i..... > -------- > 1328: 00000000 00000000 00000000 ............ > > where I underlined the parts that are different. The first different > word is the relam field, which leads directly to the reported error > message, and the other two changes are bad news as well. (Note: the > difference at offset 1314 is the reltuples field, which could > legitimately be different across installations, and your value matches > the other pg_proc indexes so it's presumably OK.) > > I have no idea what happened here ... but if you can put those values > back the way they were you'll probably be all right. You'll need to > shut down the postmaster before editing the file (else it may have the > page cached in shared memory), and if there is a pg_internal.init file > in the same directory delete it. > > regards, tom lane - -- Brock Williams brock@cotcomsol.com Cottonwood Computer Solutions, Inc. www.cotcomsol.com 406-896-4910 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.0 (FreeBSD) iD8DBQFC8PlMgMCVEEcrWuIRArLtAJ0fSPeN7z6a1XfiIuXqMEBizAS4wACgnLcp hb+XBsKHmi7ToPf4p+D0Btg= =hjNE -----END PGP SIGNATURE-----