On Thursday 24 September 2009 23:02:15 Michael Brown wrote:
> > I think this must mean that corrupt data is being read from the relcache
> > init file. The reason a restart fixes it is probably that restart
> > forcibly removes the old init file, which is good for recovery but not
> > so good for finding out what's wrong. Could you modify
> > RelationCacheInitFileRemove (at the bottom of relcache.c) to rename the
> > file someplace else instead of deleting it? And then send me a copy
> > of the bad file once you have one?
>
> I have captured and attached the file as saved-pg_internal.init.bak.

In case it helps, I noticed the following in gdb:

    (gdb) p *(RelIdCacheEnt*)status.curEntry
    $1 = {reloid = 932863600, reldesc = 0x0}

This reloid is far too high to be realistic: we have only just hit the two
million mark for OIDs in pg_class. That seems to support your theory that
the relcache init file is corrupt.
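
For anyone who wants to capture the file the same way, here is a minimal
sketch of the "rename instead of delete" idea. It is not the actual change
made to RelationCacheInitFileRemove; the helper name preserve_init_file and
the ".bak" suffix are assumptions for illustration, and it uses only the
standard C rename() call rather than any PostgreSQL internals.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not PostgreSQL code): instead of unlinking the
 * file at "path", move it aside to "<path>.bak" so it can be inspected
 * later. Returns 0 on success, -1 on failure.
 */
int
preserve_init_file(const char *path)
{
    char        bakpath[1024];
    int         n;

    /* Build the destination name, refusing paths that would be truncated. */
    n = snprintf(bakpath, sizeof(bakpath), "%s.bak", path);
    if (n < 0 || (size_t) n >= sizeof(bakpath))
        return -1;

    /* rename() keeps the file contents intact under the new name. */
    return (rename(path, bakpath) == 0) ? 0 : -1;
}
```

Calling preserve_init_file("pg_internal.init") where the file would
otherwise be unlinked leaves a pg_internal.init.bak copy behind, at the cost
of overwriting any earlier saved copy with the same name.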
Michael