
Re: Idea for improving buildfarm robustness - Mailing list pgsql-hackers

From	Tom Lane
Subject	Re: Idea for improving buildfarm robustness
Date	September 30, 2015 16:59:43
Msg-id	3196.1443621558@sss.pgh.pa.us
In response to	Re: Idea for improving buildfarm robustness (Jim Nasby <Jim.Nasby@BlueTreble.com>)
Responses	Re: Idea for improving buildfarm robustness
List	pgsql-hackers


Jim Nasby <Jim.Nasby@bluetreble.com> writes:
> Ouch. So it sounds like there's value to seeing if pg_control isn't what 
> we expect it to be.

> Instead of looking at the inode (portability problem), what if 
> pg_control contained a random number that was created at initdb time? On 
> startup postmaster would read that value and then if it ever changed 
> after that you'd know something just went wrong.

> Perhaps even stronger would be to write a new random value on startup; 
> that way you'd know if an old copy accidentally got put in place.

Or maybe better than an otherwise-useless random value, write the
postmaster's start time.

But none of these would offer that much added safety IMV.  If you don't
restart the postmaster very often, it's not unlikely that what you were
trying to restore is a backup from earlier in the current postmaster's
run.  Another problem with checking the contents of pg_control, rather
than only its presence, is that the checkpointer will overwrite it every
so often, and thereby put back whatever we were expecting to find there.
If the postmaster's recheck interval is significantly less than the
checkpoint interval, then you'll *probably* notice before the evidence
vanishes, but it's hardly guaranteed.
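
To put a rough number on that race, here is a back-of-the-envelope sketch. It is a toy model, not anything measured: it assumes the overwrite lands at a random moment, so that the wait until the next recheck and the wait until the next checkpoint are independent and uniformly distributed over their respective intervals. The function name and parameters are hypothetical.

```python
def detection_probability(recheck_s: float, checkpoint_s: float) -> float:
    """Chance that a recheck fires before the next checkpoint rewrites pg_control.

    Toy model (an assumption, not measured behaviour):
      wait until next recheck    ~ Uniform(0, recheck_s)
      wait until next checkpoint ~ Uniform(0, checkpoint_s)
    and the two waits are independent.
    """
    if recheck_s <= 0 or checkpoint_s <= 0:
        raise ValueError("intervals must be positive")
    if recheck_s <= checkpoint_s:
        # P(recheck first) = 1 - r / (2c) when r <= c
        return 1.0 - recheck_s / (2.0 * checkpoint_s)
    # P(recheck first) = c / (2r) when r > c
    return checkpoint_s / (2.0 * recheck_s)
```

With illustrative values of a 60-second recheck and a 300-second checkpoint interval, this model gives 0.9: likely, but some of the time the evidence is gone before anyone looks.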

It strikes me that a different approach that might be of value would
be to re-read postmaster.pid and make sure that (a) it's still there
and (b) it still contains the current postmaster's PID.  This would
be morally equivalent to what Jim suggests above, and it would dodge
the checkpointer-destroys-the-evidence problem, and it would have the
additional advantage that we'd notice when a brain-dead DBA decides
to manually remove postmaster.pid so he can start a new postmaster.
(It's probably far too late to avoid data corruption at that point,
but better late than never.)
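
A minimal sketch of that recheck, written in Python purely for illustration (the postmaster itself is C, and this is not its code). It relies on the PID being the first line of postmaster.pid; the function name, the arguments, and the choice to treat an unparsable first line as a mismatch are assumptions of the sketch.

```python
import os
import tempfile


def pid_file_still_ours(pid_path: str, my_pid: int) -> bool:
    """Return True only if pid_path exists and its first line is my_pid."""
    try:
        with open(pid_path, "r") as f:
            first_line = f.readline().strip()
    except FileNotFoundError:
        # (a) the file is gone, e.g. someone removed it by hand
        return False
    try:
        # (b) the file must still name this process
        return int(first_line) == my_pid
    except ValueError:
        # Empty or garbled first line: treat it as "not ours" (sketch's choice)
        return False


if __name__ == "__main__":
    # Demo against a throwaway directory, not a real data directory.
    with tempfile.TemporaryDirectory() as data_dir:
        path = os.path.join(data_dir, "postmaster.pid")
        with open(path, "w") as f:
            f.write(f"{os.getpid()}\n")
        print(pid_file_still_ours(path, os.getpid()))  # file present, PID matches
        os.remove(path)
        print(pid_file_still_ours(path, os.getpid()))  # file removed
```

A caller would run this on whatever recheck interval it uses and shut down once the function returns False. Other I/O errors (permissions, say) are left to propagate here; a real implementation would have to decide how to treat them.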

This is still not bulletproof against all overwrite-with-a-backup
scenarios, but it seems like a noticeable improvement over what we
discussed yesterday.

			regards, tom lane
