Re: [GENERAL] Shared WAL archive between master and standby: WALs notalways identical - Mailing list pgsql-general

From Adrian Klaver
Subject Re: [GENERAL] Shared WAL archive between master and standby: WALs notalways identical
Date
Msg-id 7209546d-e946-70a4-9bb5-c498566a58ae@aklaver.com
Whole thread Raw
In response to Re: [GENERAL] Shared WAL archive between master and standby: WALs notalways identical  (Sasa Vilic <sasavilic@gmail.com>)
Responses Re: [GENERAL] Shared WAL archive between master and standby: WALs notalways identical  (Jon Nelson <jnelson+pgsql@jamponi.net>)
Re: [GENERAL] Shared WAL archive between master and standby: WALs notalways identical  (Sasa Vilic <sasavilic@gmail.com>)
List pgsql-general
On 02/27/2017 11:14 PM, Sasa Vilic wrote:
> On 2017-02-28 06:14, Adrian Klaver wrote:
>> On 02/27/2017 05:52 PM, Sasa Vilic wrote:
>>> Because standby is running in syncronous replication, whereby wal
>>> archiver is asynchronous. Therefore there is a small window where slave
>>> has received the data but master has not pushed it yet to wal archive.
>>
>> Exactly. The standby already has the latest information, it would gain
>> nothing from fetching it from the archive and anything it wrote to the
>> archive would only have the information it got from the master at the
>> point of failure. That is before you promoted it, after it would be on
>> its own path independent of the master.
>
> Hi Adrian,
>
> I am afraid that you don't understand me.
>
> Standby is not fetching WAL from archive, it fetches it directly from
> master and is done synchronously, which means that master will only
> confirm transaction to client when WAL is streamed and applied at
> standby. On the other hand, master does not have to wait for WAL
> archiver. If master crashes before WAL archiver is able to send WAL, we
> would still have it on standby.
>
> Let us for the sake of demonstration consider that we have same very low
> busy but very critical system:
>
> 1. Your client connects to primary server (master) and performs changes
> on data
> 2. It just happen that this is a moment where PostgreSQL opens new WAL
> segment. It writes few kilobytes in this new WAL segment but it has
> almost 16MB to write before segment is complete. So the wal archiver has
> to wait before it can push wal segment in wal archive
> 3. Secondary server (standby) is doing same, it is writing changes to
> newly created WAL segment
> 4. Your client issues COMMIT
>     - primary waits until changes are applied at secondary
>     - primary flushes changes to WAL
>     - secondary confirms transaction to primary
>     - primary confirms transaction to client
>     - WAL is still not processed by wal archiver because it is only i.e.
> 1 MB big and we are still left 15MB to go
> 5. Primary server crashes, i.e. due to catastrophic disk failure
>     - everything stops and can't be recovered
>     - wal archiver is dead, but even if it were alive it wouldn't send
> WAL to archive anyway because 16MB of wal segment was not filled up
> 6. We promote our secondary server to master
>     - In secondary server's WAL we already got changes from primary
>     - Secondary continues appending new changes to wal segment
> 7. Eventually WAL segment on secondary fills up and then pushes it to
> wal archive.
> 8. Although primary is dead, we didn't loose anything because lost WAL
> data was pushed by secondary.

I understand the above, what I did not understand, from your original post:

"My problem is that sometimes WAL uploaded from master and from slave
are not 100% identical. In most cases they are but occasionally they are
not. I have written small script that ensures that upload is free of
race condition and I log md5 sum of each WAL."

To me that reads as you sending WALs to the archive from both the master
and the standby in parallel, instead of sequentially as you imply in the
outline above. It would seem to be confirmed by the setting of
archive_mode = always:

https://www.postgresql.org/docs/9.6/static/runtime-config-wal.html#RUNTIME-CONFIG-WAL-ARCHIVING

https://www.postgresql.org/docs/9.6/static/warm-standby.html#CONTINUOUS-ARCHIVING-IN-STANDBY

Seems to mean the simpler thing to do would be to set standby to
archive_mode = on, in which case the standby would not contribute WAL's
until it was promoted which would seem to be what you want.

>
> Regards,
> Sasa
>


--
Adrian Klaver
adrian.klaver@aklaver.com


pgsql-general by date:

Previous
From: "Sven R. Kunze"
Date:
Subject: Re: [GENERAL] ERROR: functions in index expression must be markedIMMUTABLE
Next
From: Jon Nelson
Date:
Subject: Re: [GENERAL] Shared WAL archive between master and standby: WALs notalways identical