Matt Beauregard <matt@designscape.com.au> writes:
>> The plot thickens ... what did you say the error message was
>> exactly?
> !759 marauder@bofh:~$ pg_dump -h tweedledee ds
> getTypes(): SELECT failed. Explanation from backend: 'ERROR:
> Relation 'pg_user' does not exist
> NOTICE: mdopen: couldn't open pg_user: No such file or directory
This is strange in itself. There should be an (empty) file named
pg_user in your $PGDATA/base/template1 directory --- is there?
What about pg_views?
> Tried it, couldn't get the breakpoint to trigger the first time,
> restarted postgres, got a different set of messages again.
> template1=> select * from pg_user;
> ERROR: Relation 'pg_user' does not exist
> (bt:)
> #0 0x812db0c in elog ()
> #1 0x806dbae in heap_openr ()
Hmm. Tracing this back makes it appear that the pg_user row in pg_class
isn't being found by an index scan. Since the row clearly is there when
you do a sequential scan, this suggests that the pg_class_relname_index
index is corrupted. If so, how'd it get that way? Is there a
pg_class_relname_index file in template1? How big is it?
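If it is easier than poking around by hand, something like the following rough sketch would answer the file questions in one go. It assumes the on-disk layout described above, where each relation is a file named after it under $PGDATA/base/template1; report_files is just a made-up helper name, not anything that ships with Postgres.

```python
import os

def report_files(pgdata, names=("pg_user", "pg_views", "pg_class_relname_index")):
    """Map each name to its size in bytes, or None if the file is missing.

    Assumes relations are stored as files named after the relation in
    <pgdata>/base/template1, as described in this message.
    """
    dbdir = os.path.join(pgdata, "base", "template1")
    sizes = {}
    for name in names:
        path = os.path.join(dbdir, name)
        sizes[name] = os.path.getsize(path) if os.path.isfile(path) else None
    return sizes

if __name__ == "__main__":
    # PGDATA must point at the same data directory the postmaster uses.
    pgdata = os.environ.get("PGDATA", "")
    for name, size in report_files(pgdata).items():
        print(name, "missing" if size is None else f"{size} bytes")
```

A zero-length pg_user is what I'd expect to see; a missing file, or a missing or zero-length index file, would be worth reporting.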
One thing I am wondering about, since you mention restarting the
postmaster, is whether you're being careful to start the postmaster
in a consistent environment --- in particular, with consistent
LOCALE-related environment variable values. A number of people have
managed to produce corrupted or corrupted-acting indexes by making
entries with different LOCALE values at different times. That affects
the sort order of the index and can produce inconsistent index ordering,
thereby preventing some entries from being found. This doesn't seem
like a very likely theory for template1, since it normally is read-only
after initdb and usually hasn't got any non-ASCII names in it anyway,
but the inconsistent behavior is hard to explain without some such
factor...
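To illustrate the failure mode (this is a toy model, not the backend's btree code): the sketch below keeps a sorted list as a stand-in for an index and uses two made-up comparison keys, c_key for byte-wise ordering like the C locale and ci_key as a stand-in for a case-insensitive locale. Entries made under one ordering can become unreachable by binary search under the other, even though a sequential scan still sees them.

```python
import bisect

def c_key(s):
    # Byte-wise ordering, as with the C locale: uppercase sorts before lowercase.
    return s.encode("ascii")

def ci_key(s):
    # Hypothetical stand-in for a locale that ignores case when sorting.
    return s.lower()

def insert(index, name, key):
    # Place the entry where binary search under `key` says it belongs.
    keys = [key(n) for n in index]
    index.insert(bisect.bisect_left(keys, key(name)), name)

def lookup(index, name, key):
    # Binary search; only reliable if the list really is sorted under `key`.
    keys = [key(n) for n in index]
    i = bisect.bisect_left(keys, key(name))
    return i < len(index) and index[i] == name

def build_mixed_index():
    index = []
    for name in ("Zebra", "apple", "mango"):
        insert(index, name, c_key)      # entries made under one ordering
    insert(index, "Banana", ci_key)     # later entry made under another
    return index

if __name__ == "__main__":
    index = build_mixed_index()
    print(index)
    print("sequential scan finds Banana:", "Banana" in index)
    print("index lookup (C ordering) finds Banana:", lookup(index, "Banana", c_key))
    print("index lookup (case-insensitive) finds Zebra:", lookup(index, "Zebra", ci_key))
```

Both index lookups come back false although both entries are in the list, which is the same shape as the symptom here: the row is visible to a sequential scan but not to the index scan.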
regards, tom lane