Re: How to Qualifying or quantify risk of loss in asynchronous replication - Mailing list pgsql-general

From Thomas Munro
Subject Re: How to Qualifying or quantify risk of loss in asynchronous replication
Date
Msg-id CAEepm=0WicoiJ74MAYe1pcyVeUzvQpYNptYj907=peXbvg2EGw@mail.gmail.com
In response to Re: How to Qualifying or quantify risk of loss in asynchronous replication  (otheus uibk <otheus.uibk@gmail.com>)
List pgsql-general
On Wed, Mar 16, 2016 at 9:59 PM, otheus uibk <otheus.uibk@gmail.com> wrote:
>> In asynchronous replication,
>> the primary writes to the WAL and flushes the disk.  Then, for any
>> standbys that happen to be connected, a WAL sender process trundles
>> along behind feeding new WAL down the socket as soon as it can, but
>> it can be running arbitrarily far behind or not running at all (the
>> network could be down or saturated, the standby could be temporarily
>> down or up but not reading the stream fast enough, etc etc).
>
>
>
> This is the *process* I want more detail about. The question is the same as
> above:
>> (is it true that) PG async guarantees that the WAL
>> is *sent* to the receivers, but not that they are received, before the
>> client receives acknowledgement?

The primary writes WAL to disk, and then wakes up walsender processes,
and they read the WAL from disk (presumably straight out of the OS
page cache) in the background and send it down the network some time
later.  Async replication doesn't guarantee anything about the WAL
being sent.
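
To make the ordering concrete, here is a minimal toy model in Python (not
PostgreSQL code). It assumes a hypothetical ToyPrimary class that counts
WAL records instead of byte positions; the names commit, sender_step and
unsent are invented for illustration. The point it tries to show is that
the client is acknowledged as soon as the local flush is done, and that
whatever the sender has not yet shipped is what could be lost if the
primary disappeared at that moment.

```python
from dataclasses import dataclass, field


@dataclass
class ToyPrimary:
    """Hypothetical model: positions are record counts, not real LSNs."""
    wal: list = field(default_factory=list)
    flushed_lsn: int = 0  # records durable on the primary's own disk
    sent_lsn: int = 0     # records the sender has pushed to the network

    def commit(self, record):
        # Write and "flush" locally, then acknowledge the client.
        # Nothing here waits for the sender.
        self.wal.append(record)
        self.flushed_lsn = len(self.wal)
        return self.flushed_lsn

    def sender_step(self, max_records):
        # Background sender: ships at most max_records when it gets to run.
        upto = min(self.flushed_lsn, self.sent_lsn + max_records)
        batch = self.wal[self.sent_lsn:upto]
        self.sent_lsn = upto
        return batch

    def unsent(self):
        # Acknowledged work that no standby can have seen yet.
        return self.flushed_lsn - self.sent_lsn


primary = ToyPrimary()
for i in range(10):
    primary.commit(f"txn-{i}")      # all ten commits are acknowledged

at_risk_before = primary.unsent()   # sender has not run at all yet
shipped = primary.sender_step(max_records=4)
at_risk_after = primary.unsent()    # sender is still behind
```

In this sketch all ten commits return before a single record leaves the
machine, which is the situation async replication allows.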

Look for WalSndWakeupRequest() in xlog.c, a macro that leads to a call
to WalSndWakeup() in walsender.c, which sets latches (a mechanism for
waking processes) on all walsenders, and see the WaitLatchOrSocket()
calls in walsender.c, which wait for that to happen.
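
As a rough illustration of that wake-up pattern, here is a small Python
sketch. It is only an analogy: threading.Event stands in for a latch,
and the Latch class, run_demo function and record names are made up for
this example rather than taken from the PostgreSQL source. The writer
appends a record, sets the latch and moves on; the sender thread sleeps
on the latch, resets it, and then ships whatever has accumulated.

```python
import threading


class Latch:
    """Toy stand-in for a latch: set() wakes a waiter, reset() re-arms it."""

    def __init__(self):
        self._event = threading.Event()

    def set(self):
        self._event.set()

    def reset(self):
        self._event.clear()

    def wait(self, timeout=None):
        return self._event.wait(timeout)


def run_demo(n_records=5):
    wal, sent = [], []
    lock = threading.Lock()
    latch = Latch()
    done = threading.Event()

    def walsender():
        pos = 0
        while True:
            latch.wait(timeout=0.1)   # sleep until woken, or time out
            latch.reset()             # reset before re-checking for work
            with lock:
                pending = wal[pos:]
            sent.extend(pending)      # "send" everything written so far
            pos += len(pending)
            if done.is_set():
                with lock:
                    if pos == len(wal):
                        return        # caught up and the writer is finished

    sender = threading.Thread(target=walsender)
    sender.start()

    for i in range(n_records):
        with lock:
            wal.append(f"record-{i}")  # write WAL locally
        latch.set()                    # wake the sender; do not wait for it

    done.set()
    latch.set()
    sender.join()
    return wal, sent


wal, sent = run_demo(5)
```

The writer never blocks on the sender here; the demo only waits at the
end (sender.join()) so that the result can be inspected.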

--
Thomas Munro
http://www.enterprisedb.com