Re: Inconsistent DB data in Streaming Replication - Mailing list pgsql-hackers

From Florian Pflug
Subject Re: Inconsistent DB data in Streaming Replication
Date
Msg-id 53401A75-FE8F-4E8A-B3D2-ED2BED113AC2@phlo.org
In response to Re: Inconsistent DB data in Streaming Replication  (Amit Kapila <amit.kapila@huawei.com>)
List pgsql-hackers
On Apr17, 2013, at 12:22 , Amit Kapila <amit.kapila@huawei.com> wrote:
> Do you mean to say that as an error has occurred, so it would not be able to
> flush received WAL, which could result in loss of WAL?
> I think even if error occurs, it will call flush in WalRcvDie(), before
> terminating WALReceiver.

Hm, true, but for that to prevent the problem, the inner processing
loop would always have to read up to EOF before it exits and we
attempt to send a reply, which I don't think it necessarily does.
Assume that the master sends a chunk of data, waits a bit, and
finally sends the shutdown record and exits. The slave might then
receive the first chunk, and that might trigger sending a reply. By
the time the reply is sent, the master has already sent the shutdown
record and closed the connection, so we'll fail to reply and abort.
Since the shutdown record has never been read from the socket,
XLogWalRcvFlush won't flush it, and the slave ends up behind the
master.
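
To make the ordering concrete, here is a toy in-memory model of that
race. It is only a sketch, not walreceiver code: ToyConnection,
run_slave, scenario and the record names are all made up for
illustration, and the "flush" is simply the list of records the slave
managed to read before it stopped.

```python
from collections import deque


class ToyConnection:
    """In-memory stand-in for the replication socket (hypothetical)."""

    def __init__(self):
        self.inbound = deque()    # sent by the master, not yet read by the slave
        self.peer_closed = False  # True once the master has closed its end

    def master_send(self, record):
        self.inbound.append(record)

    def master_close(self):
        self.peer_closed = True

    def slave_recv(self):
        # Buffered data stays readable even after the peer has closed.
        return self.inbound.popleft() if self.inbound else None

    def slave_send_reply(self):
        if self.peer_closed:
            raise ConnectionError("peer closed the connection")


def run_slave(conn, drain_before_reply):
    """Return the records the slave could flush before it stopped."""
    received = []
    try:
        while True:
            record = conn.slave_recv()
            if record is None:
                break
            received.append(record)
            if not drain_before_reply:
                conn.slave_send_reply()  # may fail with data still unread
        if drain_before_reply:
            conn.slave_send_reply()      # fails too, but nothing is left unread
    except ConnectionError:
        pass  # abort; only what was already read can be flushed
    return received


def scenario(drain_before_reply):
    conn = ToyConnection()
    conn.master_send("wal-chunk-1")
    conn.master_send("shutdown-record")
    conn.master_close()  # master exits before the slave replies
    return run_slave(conn, drain_before_reply)


print(scenario(drain_before_reply=False))
print(scenario(drain_before_reply=True))
```

With drain_before_reply=False the slave stops after the first chunk
and never sees the shutdown record; with drain_before_reply=True it
reads both records before the failed reply.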

Also, since XLogWalRcvProcessMsg responds to keepalive messages,
we might error out of the inner processing loop if the server
closes the socket after sending a keepalive but before we attempt
to respond.

Fixing this on the receive side alone seems quite messy and fragile.
So instead, I think we should let the master send a shutdown message
after it has sent everything it wants to send, and wait for the client
to acknowledge it before shutting down the socket.

If the client fails to respond, we could log a fat WARNING.
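
A rough sketch of what I have in mind, again as a toy model rather
than walsender code. The message names, the two queues standing in for
the two directions of the socket, and the timeout are all assumptions
made for illustration; how long the master should actually wait is an
open question.

```python
import logging
import queue
import threading

log = logging.getLogger("toy_walsender")

SHUTDOWN = "shutdown-message"  # hypothetical protocol message
ACK = "shutdown-ack"           # hypothetical protocol message


def master_shutdown(pending, to_slave, from_slave, ack_timeout):
    """Send everything, announce shutdown, wait for the ack, then close."""
    for record in pending:
        to_slave.put(record)
    to_slave.put(SHUTDOWN)
    try:
        acked = from_slave.get(timeout=ack_timeout) == ACK
    except queue.Empty:
        acked = False
    if not acked:
        log.warning("standby did not acknowledge shutdown, it may be behind")
    return acked  # only at this point would the master close the socket


def slave_receive(to_slave, from_slave):
    """Read until the shutdown message arrives, then acknowledge it."""
    flushed = []
    while True:
        msg = to_slave.get()
        if msg == SHUTDOWN:
            from_slave.put(ACK)
            return flushed
        flushed.append(msg)


def demo(slave_responds, ack_timeout):
    to_slave, from_slave = queue.Queue(), queue.Queue()
    result = {}
    worker = None
    if slave_responds:
        worker = threading.Thread(
            target=lambda: result.update(
                flushed=slave_receive(to_slave, from_slave)))
        worker.start()
    acked = master_shutdown(["wal-chunk-1", "shutdown-record"],
                            to_slave, from_slave, ack_timeout)
    if worker is not None:
        worker.join()
    return acked, result.get("flushed")


print(demo(slave_responds=True, ack_timeout=5.0))
print(demo(slave_responds=False, ack_timeout=0.1))
```

The point of the design is that the master, not the slave, decides
when the stream is complete, so the slave never has to guess whether
more data is still sitting unread in the socket.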

best regards,
Florian Pflug