Re: Inconsistent DB data in Streaming Replication - Mailing list pgsql-hackers

From Samrat Revagade
Subject Re: Inconsistent DB data in Streaming Replication
Date
Msg-id CAF8Q-GyF=vrm+WLHhCLtLtg0skb_LkZwFEWjhRvcG=iybFyzwg@mail.gmail.com
In response to Re: Inconsistent DB data in Streaming Replication  (Hannu Krosing <hannu@2ndQuadrant.com>)
Responses Re: Inconsistent DB data in Streaming Replication
List pgsql-hackers
<div dir="ltr"><p class="">>>it's one of the reasons why a fresh base backup is required when starting old master as new standby? >>If yes, I agree with you. I've often heard the complaints about a backup when restarting new standby. >>That's really big problem.
<p class="">I think Fujii Masao is on the same page.
<p class=""> 
<p class="">>In case of syncrep the master just waits for confirmation from standby before returning to client on commit.
<p class="">>Not just commit, you must stop any *writing* of the wal records effectively killing any parallelism.<br/>
>Main issue is that it will make *all* backends dependant on each sync commit, essentially serialising all backends commits, with the serialisation *including* the latency of roundtrip to client. With current sync streaming the other backends can continue to write wal, with proposed approach you cannot write any records after the one waiting an ACK from standby.
<p class=""> 
<p class="">Let me rephrase the proposal in a more
accurate manner:
<p class="">Consider the following scenario:
<p class=""> 
<p class="">(1) A client sends the "COMMIT" command to the master server.
<p class="">(2) The master writes the WAL record to disk.
<p class="">(3) The master writes the data page related to this transaction, i.e. via checkpoint or bgwriter.
<p class="">(4) The master sends WAL records continuously to the standby, up to the commit WAL record.
<p class="">(5) The standby receives the WAL records, writes them to disk, and then replies with an ACK.
<p class="">(6) The master returns a success indication to the client after it receives the ACK.
<p class=""> 
<p class="">If failover happens between (3) and (4), the WAL and DB data on the old master are ahead of those on the new master. After failover, the new master continues running new transactions independently of the old master. The WAL records and DB data then become inconsistent between the two servers. To resolve these inconsistencies, a backup of the new master needs to be taken onto the new standby.
<p class=""><br />
<p class="">But taking a backup is not feasible for a large database, several TB in size, over a slow WAN.<br />
<p class="">So to avoid this type of inconsistency without taking a fresh backup, we are thinking of doing the following:
<p class=""> <br />
<p class="">>>
I think that you can introduce GUC specifying whether this extra check is required to avoid a backup when failback.
<p class="">Approach:
<p class="">Introduce a new GUC option specifying whether to prevent PostgreSQL from writing DB data before the corresponding WAL records have been replicated to the standby. That is, if this GUC option is enabled, PostgreSQL waits for the corresponding WAL records to be not only written to disk but also replicated to the standby before writing DB data.
<p class=""><br />
<p class="">So the process becomes as follows:
<p class="">(1) A client sends
the "COMMIT" command to the master server.
<p class="">(2) The master writes the commit WAL record to disk.
<p class="">(3) The master sends WAL records continuously to the standby, up to the commit WAL record.
<p class="">(4) The standby receives the WAL records, writes them to disk, and then replies with an ACK.
<p class="">(5) <b>The master then forces a write of the data page related to this transaction.</b>
<p class="">(6) The master returns a success indication to the client after it receives the ACK.
<p class=""> 
<p class="">While the master is waiting to force a write (point 5) of this data page, streaming replication continues. Also, other data page writes do not depend on this particular page write. So the commit of data pages is not serialized.
<p class="" style="style"><br />
<p class="" style="style">Regards,
<p class="" style="style">Samrat
<p class=""><br /></div>
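
The difference between the current write ordering and the proposed one can be sketched as a toy model. This is only an illustration under stated assumptions, not PostgreSQL code: LSNs are plain integers, and `ToyMaster`, `wait_for_standby` (standing in for the proposed, still unnamed GUC), `run_scenario`, and `pages_ahead_of_standby` are hypothetical names invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMaster:
    """Toy model of a master server; LSNs are plain integers here."""
    wait_for_standby: bool       # hypothetical stand-in for the proposed GUC
    flushed_lsn: int = 0         # WAL written to the master's local disk
    replicated_lsn: int = 0      # WAL the standby has ACKed
    dirty_pages: dict = field(default_factory=dict)  # page id -> LSN of last change
    disk_pages: dict = field(default_factory=dict)   # page id -> LSN written to disk

    def may_write_page(self, page_id):
        page_lsn = self.dirty_pages[page_id]
        # Usual WAL-before-data rule: the WAL must be on local disk first.
        if page_lsn > self.flushed_lsn:
            return False
        # Proposed extra check: the WAL must also have reached the standby.
        if self.wait_for_standby and page_lsn > self.replicated_lsn:
            return False
        return True

    def try_write_page(self, page_id):
        """Write the page if allowed; return whether the write happened."""
        if not self.may_write_page(page_id):
            return False
        self.disk_pages[page_id] = self.dirty_pages.pop(page_id)
        return True


def pages_ahead_of_standby(master):
    """Pages on the old master's disk that reflect WAL the standby never received."""
    return sorted(p for p, lsn in master.disk_pages.items()
                  if lsn > master.replicated_lsn)


def run_scenario(wait_for_standby):
    m = ToyMaster(wait_for_standby=wait_for_standby)
    m.dirty_pages["page_a"] = 100   # page changed by the committing transaction
    m.flushed_lsn = 100             # commit WAL record flushed on the master
    m.replicated_lsn = 90           # standby has only ACKed WAL up to 90
    m.try_write_page("page_a")      # checkpoint or bgwriter attempts the write
    # Failover is assumed to happen at this point.
    return pages_ahead_of_standby(m)


print(run_scenario(False))  # ['page_a']: old master's data is ahead of the new master
print(run_scenario(True))   # []: the page write was held back until replication
```

With the flag off, the old master ends up with a data page that the new master has no WAL for, which is the case that forces a fresh backup. With the flag on, only the write of that one page is held back; writes of other pages whose WAL is already replicated are unaffected in this model.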
