Re: Synchronous replication patch built on SR - Mailing list pgsql-hackers
From | Fujii Masao |
---|---|
Subject | Re: Synchronous replication patch built on SR |
Date | |
Msg-id | AANLkTinPxvwvv42_yNkYecqX0J3Bjk111DRXEONcVcB1@mail.gmail.com |
In response to | Re: Synchronous replication patch built on SR (Boszormenyi Zoltan <zb@cybertec.at>) |
Responses | Re: Synchronous replication patch built on SR |
List | pgsql-hackers |
On Wed, May 19, 2010 at 5:41 PM, Boszormenyi Zoltan <zb@cybertec.at> wrote:
>> Isn't reading the same WAL twice (by walreceiver and startup process)
>> inefficient?
>
> Yes, and I didn't implement that because it's inefficient.

So I'd like to propose using LSN instead of XID, since LSN can be
easily handled by both walreceiver and the startup process.

>> Currently
>> PQputCopyData() cannot be executed in COPY OUT, but we can relax
>> that.
>>
>
> And I implemented just that, in a way that upon walreceiver startup
> it sends a new protocol message to the walsender by calling
> PQsetDuplexCopy() (see my patch) and the walsender response is ACK.
> This protocol message is intentionally not handled by the normal
> backend, so plain libpq clients cannot mess up their COPY streams.

Is the newly introduced message type "Set Duplex Copy" really required?
I think the standby can send its replication mode to the master via a
Query or CopyData message, both of which are already used in SR. For
example, how about including the mode in the handshake message
"START_REPLICATION"? If we do that, we would not need to introduce the
new libpq function PQsetDuplexCopy(). BTW, I often got complaints about
adding new libpq functions when I implemented SR ;)

In the patch, PQputCopyData() checks the newly introduced pg_conn field
"duplexCopy". Instead, how about checking the existing field
"replication"? Or we can just allow PQputCopyData() to proceed even in
the COPY OUT state.

> We can change the walreceiver so it sends similarly encapsulated
> messages as the walsender does. In our patch, the walreceiver
> currently sends the raw XIDs. If we add a minimal protocol
> encapsulation, we can distinguish between the XIDs (or later LSNs)
> and the "mark me synchronous from now on" message.
>
> The only problem is: what should be the point when such a client
> becomes synchronous from the master's POV, so the XID/LSN reports
> will count and transactions are made to wait for this client?
One idea is to switch to "sync" when the LSN gap becomes less than or
equal to XLOG_SEG_SIZE (16MB by default). That is, walsender calculates
the gap between the current WAL write location on the master and the
last receive/flush/replay location on the standby. If the
gap <= XLOG_SEG_SIZE, it instructs backends to wait for replication
from then on.

> As a side note, the async walreceivers' behaviour should be kept
> so they don't send anything back and the message that
> PQsetDuplexCopy() sends to the master would then only
> prepare the walsender that its client will become synchronous
> in the near future.

I agree that walreceiver should send no replication ack if "async" mode
is chosen. OTOH, in the "sync" case, walreceiver should always send an
ack, even if the gap is large and the master doesn't wait for
replication yet. As mentioned above, walsender needs the ack to
calculate the gap.

>> Seems s/min_sync_replication_clients/max_sync_replication_clients
>>
>
> No, "min" is indicating the minimum number of walreceiver reports
> needed before a transaction can be released from under the waiting.
> The other reports coming from walreceivers are ignored.

Hmm... when min_sync_replication_clients = 2 and there are three
"synchronous" standbys, does the master wait for only two of them? Is
the standby that the master ignores fixed, or does it change
dynamically (or randomly)?

>> min_sync_replication_clients is required to prevent outside attacker
>> from connecting to the master as "synchronous" standby, and degrading
>> the performance on the master?
>
> ???
>
> Properly configured pg_hba.conf prevents outside attackers
> to connect as replication clients, no?

Yes :)
I'd like to just know the use case of min_sync_replication_clients.
Sorry, I've not understood yet how useful this option is.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
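
Illustration (not part of the original message or the patch): the
gap-based switch to "sync" proposed in the message above might be
sketched as follows. This is a minimal sketch under stated assumptions:
LSNs are modeled as plain byte offsets, and the names StandbyState,
waiting_enabled, and on_ack are hypothetical. The real walsender is
written in C and would differ in detail.

```python
# Hypothetical sketch; none of these names come from the patch.
# Assumption: an LSN is modeled as a plain integer byte offset into WAL.
XLOG_SEG_SIZE = 16 * 1024 * 1024  # default WAL segment size, in bytes


class StandbyState:
    """Per-standby state kept by a (hypothetical) walsender."""

    def __init__(self):
        # Backends do not wait for this standby until it has caught up.
        self.waiting_enabled = False

    def on_ack(self, master_write_lsn, standby_lsn):
        """Handle an ack carrying the standby's receive/flush/replay LSN.

        Returns the computed gap in bytes.
        """
        gap = master_write_lsn - standby_lsn
        # Once the gap is small enough, backends wait "from then on",
        # so the flag is never cleared again in this sketch.
        if not self.waiting_enabled and gap <= XLOG_SEG_SIZE:
            self.waiting_enabled = True
        return gap
```

Because the standby acks even while the gap is still large, the master
can observe the gap shrinking and flip the flag exactly once. What
should happen if the standby later falls behind again is not covered by
the proposal, so the sketch leaves the flag set.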