pg_internal.init is hazardous to your health - Mailing list pgsql-hackers

From Tom Lane
Subject pg_internal.init is hazardous to your health
Date
Msg-id 14353.1161138553@sss.pgh.pa.us
Responses Re: pg_internal.init is hazardous to your health  (Gavin Sherry <swm@linuxworld.com.au>)
Re: pg_internal.init is hazardous to your health  ("Simon Riggs" <simon@2ndquadrant.com>)
List pgsql-hackers
Dirk Lutzebaeck and I just spent a tense couple of hours trying to
figure out why a large database Down Under wasn't coming up after being
reloaded from a base backup plus PITR recovery.  The symptoms were that
the recovery went fine, but backend processes would fail at startup or
soon after with "could not open relation XX/XX/XX: No such file" type of
errors.

The answer that ultimately emerged was that they'd been running a
nightly maintenance script that did REINDEX SYSTEM (among other things,
I suppose).  REINDEX gives each rebuilt index a new relfilenode, and
pg_internal.init caches the relfilenodes of the system indexes.  The
PITR base backup included pg_internal.init files that were appropriate
when it was taken, and the PITR recovery process did nothing whatsoever
to update 'em :-(.  So incoming backends picked up init files with
obsolete relfilenode values and went looking for files that no longer
existed.

We don't actually need to *update* the file, per se; we only need to
remove it if it's no longer valid --- the next incoming backend will
rebuild it.  I could see fixing this by making WAL recovery run around
and zap all the .init files (only problem is to find 'em), or we could
add a new kind of WAL record saying "remove the .init file for database
XYZ", to be emitted whenever someone removes the active one.  Thoughts?

Meanwhile, if you're trying to recover from a PITR backup and it's not
working, try removing any pg_internal.init files you can find.
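
For anyone who wants to script that workaround, here is a minimal
sketch in Python.  It is an illustration, not anything shipped with
PostgreSQL: the function names are made up, and it assumes the init
files may sit anywhere under the data directory, including directories
reached through symlinks, so it simply walks the whole tree.  It
defaults to a dry run that only lists what it finds.  Running it with
the server stopped is a cautious assumption on my part, not a stated
requirement.

```python
import os

# File name taken from the message above; everything else is illustrative.
INIT_FILE_NAME = "pg_internal.init"


def find_init_files(data_dir):
    """Return the paths of all pg_internal.init files under data_dir.

    followlinks=True makes the walk descend into symlinked directories
    as well. Assumption: the tree contains no symlink cycles.
    """
    found = []
    for dirpath, _dirnames, filenames in os.walk(data_dir, followlinks=True):
        if INIT_FILE_NAME in filenames:
            found.append(os.path.join(dirpath, INIT_FILE_NAME))
    return sorted(found)


def remove_init_files(data_dir, dry_run=True):
    """List the init files under data_dir and, unless dry_run, delete them.

    Returns the list of paths that were found (and deleted if not dry_run).
    """
    found = find_init_files(data_dir)
    if not dry_run:
        for path in found:
            os.remove(path)
    return found


# Example use (the path is hypothetical):
#   for path in remove_init_files("/path/to/data"):        # dry run
#       print("would remove", path)
#   remove_init_files("/path/to/data", dry_run=False)      # really delete
```

Deleting the files rather than trying to patch them relies on the point
made above: the next incoming backend rebuilds a missing init file.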
        regards, tom lane

