Re: Streaming replication and WAL archive interactions - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Streaming replication and WAL archive interactions
Msg-id 55534930.4040905@iki.fi
In response to Re: Streaming replication and WAL archive interactions  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Streaming replication and WAL archive interactions  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On 05/13/2015 03:36 PM, Robert Haas wrote:
> On Mon, May 11, 2015 at 12:00 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>> And here is a new version of the patch. I kept the approach of using pgstat,
>> but it now only polls pgstat every 10 seconds, and doesn't block to wait for
>> updated stats.
>
> It's not entirely a new problem, but this error message has gotten pretty crazy:
>
> +                               (errmsg("WAL archival
> (archive_mode=on/always/shared) requires wal_level \"archive\",
> \"hot_standby\", or \"logical\"")));
>
> Maybe: WAL archival cannot be enabled when wal_level is "minimal"
>
> I think the documentation should be explicit about what happens if the
> primary archives a file and dies before the standby gets notified that
> the archiving happened.

Yes, good point.

>  The standby, running in shared mode, is then
> promoted.  My first guess would be that the standby will end up with
> files that it thinks it needs to archive but, being unable to do so
> because they're already there, they'll live forever in pg_xlog.  I
> hope that's not the case.

Hmm. That is exactly what happens. The standby will attempt to archive 
them, which will fail, so the archiver will get stuck retrying.

That's not actually a new problem though. Even with a single server 
doing archiving, it's possible that you crash just after archive_command 
has archived a file, but before it has created the .done file. After 
restart, the server will try to archive the file again, which will fail. 
But yeah, with this patch, that's much more likely to happen after a 
promotion.

Our manual says that archive_command should refuse to overwrite an
existing file. But to work around the double-archival problem, where the
same file is archived twice, it would be even better if it simply
returned success when the file already exists *and has identical
contents*. I don't know how to code that logic in a simple one-liner though.
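
For illustration only, the "succeed if the file exists and has identical contents" rule could be sketched as a small helper script rather than a one-liner. This is not part of the patch or of PostgreSQL. The function name archive_wal_segment, the flat archive directory layout, and the copy-to-temp-then-link step are all assumptions made for the example. The return value follows the archive_command convention of zero for success and nonzero for failure.

```python
import filecmp
import os
import shutil
import tempfile


def archive_wal_segment(src_path: str, archive_dir: str) -> int:
    """Hypothetical helper: return 0 if the segment is safely archived, 1 otherwise."""
    dest_path = os.path.join(archive_dir, os.path.basename(src_path))

    if os.path.exists(dest_path):
        # Already archived (e.g. by the old primary, or before a crash).
        # Report success only if the contents match byte for byte;
        # never overwrite a file that differs.
        return 0 if filecmp.cmp(src_path, dest_path, shallow=False) else 1

    # Copy to a temporary file in the archive directory first, so that a
    # partially written segment never appears under its final name.
    fd, tmp_path = tempfile.mkstemp(dir=archive_dir, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp, open(src_path, "rb") as src:
            shutil.copyfileobj(src, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())
        try:
            # os.link fails if dest_path already exists, so a concurrent
            # archiver cannot be silently overwritten.
            os.link(tmp_path, dest_path)
        except FileExistsError:
            # Someone else archived it in the meantime: compare again.
            return 0 if filecmp.cmp(src_path, dest_path, shallow=False) else 1
        return 0
    finally:
        os.unlink(tmp_path)
```

To use something like this from archive_command, a thin wrapper would have to call the function with the path that replaces %p and pass the return value to sys.exit. The full-file comparison costs one extra read of the segment, but only in the already-archived case.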

- Heikki


