Re: pg_standby -l might destory the archived file - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: pg_standby -l might destory the archived file
Date
Msg-id 1244105585.23910.153.camel@ebony.2ndQuadrant
Whole thread Raw
In response to Re: pg_standby -l might destory the archived file  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Tue, 2009-06-02 at 14:54 -0400, Tom Lane wrote:
> Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
> > Tom Lane wrote:
> >> That's a good point; don't we recover files under names like
> >> RECOVERYXLOG, not under names that could possibly conflict with regular
> >> WAL files?
> 
> > Yes. But we rename RECOVERYXLOG to 000000010000000000000057 or similar 
> > at the end of recovery, in exitArchiveRecovery().
> 
> > Thinking about this some more, I think we should've changed 
> > exitArchiveRecovery() rather than RemoveOldXlogFiles(): it would be more 
> > robust if exitArchiveRecovery() always copied the last WAL file rather 
> > than just renamed it. It doesn't seem safe to rely on the file the 
> > symlink points to to be valid after recovery is finished, and we might 
> > write to it before it's recycled, so the current fix isn't complete.
> 
> Hmm.  I think really the reason it's coded that way is that we assumed
> the recovery command would be physically copying the file from someplace
> else.  pg_standby is violating the backend's expectations by using a
> symlink.  And I really doubt that the technique is saving anything, since
> the data has to be read in from the archive location anyway.
> 
> I'm leaning back to the position that pg_standby's -l option is simply a
> bad idea and should be removed.

ISTM we didn't clearly state what the recovery_command should do either
way. Even if you remove the pg_standby option that will not fix the
problem for people who have written their own script, or existing users
of pg_standby. The safe way is to do as Heikki suggests and copy the
final file into place and I would add that we must then fsync it also.
That should be back-patched possibly as far as 8.0. Documenting a change
is not nearly enough.

Removing -l is a separate discussion. If there was a consensus against
it, I would suggest that we deprecate the option, so that it does
nothing.

As an aside, I would be also much more comfortable if there was an
option to not recycle the WAL files at all, as a safe mode for error
checking at least. The question is: why do we need to zero fill the file
first anyway? We could save the 8 bytes for the prev pointer on every
record if we added enough zeroes onto every WAL write to zap any
pre-existing header data, causing recovery to fail.

-- Simon Riggs           www.2ndQuadrant.comPostgreSQL Training, Services and Support



pgsql-hackers by date:

Previous
From: Simon Riggs
Date:
Subject: Re: [COMMITTERS] pgsql: Fix LOCK TABLE to eliminate the race condition that could make it
Next
From: Simon Riggs
Date:
Subject: Re: pg_standby -l might destory the archived file