> > ERROR: index "pg_class_oid_index" is not a btree
>
> That means you got bogus data while reading the metapage.
> I'm beginning to wonder about the hardware on this server ...
This happened again, and this time I went back through
the logs and found that it is always the exact same query causing
the issue. I also found it occurring on different servers,
which rules out RAM anyway (they still share disk, so the disks remain suspect).
This query also sometimes gives errors like this:
ERROR: could not read block 3 of relation 1663/1554846571/3925298284: read only 0 of 8192 bytes
However, the final number changes: these are invariably temporary relations.
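
For anyone grepping their own logs for the same pattern, here is a minimal sketch that pulls the pieces out of that message. It assumes the three numbers in the path are tablespace OID / database OID / relfilenode (1663 being the default tablespace); the function and field names are just made up for illustration.

```python
import re

# Matches the "could not read block" message quoted above.
# Assumption: the path is tablespace OID / database OID / relfilenode.
READ_ERROR = re.compile(
    r"could not read block (?P<block>\d+) of relation "
    r"(?P<tablespace>\d+)/(?P<database>\d+)/(?P<relfilenode>\d+): "
    r"read only (?P<got>\d+) of (?P<want>\d+) bytes"
)


def parse_read_error(line):
    """Return the numeric fields of a short-read error, or None if no match."""
    match = READ_ERROR.search(line)
    if match is None:
        return None
    return {name: int(value) for name, value in match.groupdict().items()}


def distinct_relfilenodes(lines):
    """Collect the set of relfilenodes seen across many log lines."""
    parsed = (parse_read_error(line) for line in lines)
    return {p["relfilenode"] for p in parsed if p is not None}
```

Feeding it the whole log shows whether only the last number varies between occurrences, which is what you would expect if a fresh temporary relation is involved each time.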
The query itself is a GROUP BY over a large view and the explain plan is
107 rows, with nothing esoteric about it. Most of the tables used are
fairly common ones. I'm trying to duplicate on a non-production box, without
success so far, and I'm loath to run it on production as it sometimes
causes multiple backends to freeze up and requires a forceful restart.
Any ideas on how to carefully debug this? There are a couple of quicksorts
when I run EXPLAIN ANALYZE on a non-prod system, which I am guessing is where
the temp tables come from (work_mem is 24MB). I'm not sure I understand
what could be causing both the 'read 0' and btree errors for the
same query - bad blocks on disk for one of the underlying tables?
I'll work next on checking each of the tables the view is using.
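
One cautious way to do that is a rough sketch like the one below, which only builds the statements so they can be reviewed and run one at a time. The idea is that a full count(*) has to read every heap block if the planner picks a sequential scan, so a bad block should surface as a read error. That is an assumption about the plan, and it does not exercise the indexes at all. The helper names and the table list are hypothetical.

```python
def quote_ident(name):
    """Double-quote an SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def scan_statements(tables):
    """Build one full-table count per (schema, table) pair.

    Nothing is executed here; the caller decides when and where to run each
    statement, for example one at a time with a statement timeout set.
    """
    return [
        f"SELECT count(*) FROM {quote_ident(schema)}.{quote_ident(table)};"
        for schema, table in tables
    ]


if __name__ == "__main__":
    # Hypothetical table list; substitute the tables the view actually uses.
    for statement in scan_statements([("public", "orders"), ("public", "items")]):
        print(statement)
```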
--
Greg Sabino Mullane greg@endpoint.com
End Point Corporation
PGP Key: 0x14964AC8