Thread: libpq changes for synchronous replication
On Fri, Sep 17, 2010 at 5:09 PM, Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote: > That said, there's a few small things that can be progressed regardless of > the details of synchronous replication. There's the changes to trigger > failover with a signal, and it seems that we'll need some libpq changes to > allow acknowledgments to be sent back to the master regardless of the rest > of the design. We can discuss those in separate threads in parallel. Agreed. The attached patch introduces new function which is used to send ACK back from walreceiver. The function sends a message to XLOG stream by calling PQputCopyData. Also I allowed PQputCopyData to be called even during COPY OUT. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Attachment
On 17/09/10 12:22, Fujii Masao wrote: > On Fri, Sep 17, 2010 at 5:09 PM, Heikki Linnakangas > <heikki.linnakangas@enterprisedb.com> wrote: >> That said, there's a few small things that can be progressed regardless of >> the details of synchronous replication. There's the changes to trigger >> failover with a signal, and it seems that we'll need some libpq changes to >> allow acknowledgments to be sent back to the master regardless of the rest >> of the design. We can discuss those in separate threads in parallel. > > Agreed. The attached patch introduces new function which is used > to send ACK back from walreceiver. The function sends a message > to XLOG stream by calling PQputCopyData. Also I allowed PQputCopyData > to be called even during COPY OUT. Oh, that's simple. It doesn't feel right to always accept PQputCopyData in COPY OUT mode, though. IMHO there should be a new COPY IN+OUT mode. It should be pretty safe to add a CopyInOutResponse message to the protocol without a protocol version bump. Thoughts on that? -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes: > It doesn't feel right to always accept PQputCopyData in COPY OUT mode, > though. IMHO there should be a new COPY IN+OUT mode. Yeah, I was going to make the same complaint. Breaking basic error-checking functionality in libpq is not very acceptable. > It should be pretty safe to add a CopyInOutResponse message to the > protocol without a protocol version bump. Thoughts on that? Not if it's something that an existing application might see. If it can only happen in replication mode it's OK. Personally I think this demonstrates that piggybacking replication data transfer on the COPY protocol was a bad design to start with. It's probably time to split them apart. regards, tom lane
On Fri, 2010-09-17 at 18:22 +0900, Fujii Masao wrote: > On Fri, Sep 17, 2010 at 5:09 PM, Heikki Linnakangas > <heikki.linnakangas@enterprisedb.com> wrote: > > That said, there's a few small things that can be progressed regardless of > > the details of synchronous replication. There's the changes to trigger > > failover with a signal, and it seems that we'll need some libpq changes to > > allow acknowledgments to be sent back to the master regardless of the rest > > of the design. We can discuss those in separate threads in parallel. > > Agreed. The attached patch introduces new function which is used > to send ACK back from walreceiver. The function sends a message > to XLOG stream by calling PQputCopyData. Also I allowed PQputCopyData > to be called even during COPY OUT. Does this differ from Zoltan's code? -- Simon Riggs www.2ndQuadrant.comPostgreSQL Development, 24x7 Support, Training and Services
On Mon, Sep 20, 2010 at 11:55 PM, Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote: > It doesn't feel right to always accept PQputCopyData in COPY OUT mode, > though. IMHO there should be a new COPY IN+OUT mode. > > It should be pretty safe to add a CopyInOutResponse message to the protocol > without a protocol version bump. Thoughts on that? Or we check "replication" field in PGConn, and accept PQputCopyData in COPY OUT mode only if it indicates TRUE? This is much simpler, but maybe not versatile.. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Hi, Tom Lane írta: > Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes: > >> It doesn't feel right to always accept PQputCopyData in COPY OUT mode, >> though. IMHO there should be a new COPY IN+OUT mode. >> > > Yeah, I was going to make the same complaint. Breaking basic > error-checking functionality in libpq is not very acceptable. > if you looked at my sync replication patch, basically I only added the checking in PQputCopyData that it's allowed in COPY IN mode iff the pgconn was set up for replication. I introduced a new libpq function PQsetDuplexCopy() at the time but Fujii's idea was that it can be omitted and use the conn->replication pointer instead. It seems he forgot about it. Something like this might work: if (conn->asyncStatus != PGASYNC_COPY_IN && !(conn->asyncStatus == PGASYNC_COPY_OUT && conn->replication&& conn->replication[0])) ... This way the original error checking is still in place and only a replication client can do a duplex COPY. >> It should be pretty safe to add a CopyInOutResponse message to the >> protocol without a protocol version bump. Thoughts on that? >> > > Not if it's something that an existing application might see. If > it can only happen in replication mode it's OK. > My PQsetDuplexCopy() call was only usable for a replication client, it resulted in an "unknown protocol message" for a regular client. For a replication client, walsender sent an ack and libpq have set the "duplex copy" flag so it allowed PQputCopyData while in COPY OUT. I'd like a little comment from you whether it's a good idea, or the above check is enough. > Personally I think this demonstrates that piggybacking replication > data transfer on the COPY protocol was a bad design to start with. > It's probably time to split them apart. > Best regards, Zoltán Böszörményi -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
Simon Riggs írta: > On Fri, 2010-09-17 at 18:22 +0900, Fujii Masao wrote: > >> On Fri, Sep 17, 2010 at 5:09 PM, Heikki Linnakangas >> <heikki.linnakangas@enterprisedb.com> wrote: >> >>> That said, there's a few small things that can be progressed regardless of >>> the details of synchronous replication. There's the changes to trigger >>> failover with a signal, and it seems that we'll need some libpq changes to >>> allow acknowledgments to be sent back to the master regardless of the rest >>> of the design. We can discuss those in separate threads in parallel. >>> >> Agreed. The attached patch introduces new function which is used >> to send ACK back from walreceiver. The function sends a message >> to XLOG stream by calling PQputCopyData. Also I allowed PQputCopyData >> to be called even during COPY OUT. >> > > Does this differ from Zoltan's code? > Somewhat. See my other mail to Tom. Best regards, Zoltán Böszörményi -- ---------------------------------- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/
On Tue, Sep 21, 2010 at 1:17 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes: >> It doesn't feel right to always accept PQputCopyData in COPY OUT mode, >> though. IMHO there should be a new COPY IN+OUT mode. > > Yeah, I was going to make the same complaint. Breaking basic > error-checking functionality in libpq is not very acceptable. > >> It should be pretty safe to add a CopyInOutResponse message to the >> protocol without a protocol version bump. Thoughts on that? > > Not if it's something that an existing application might see. If > it can only happen in replication mode it's OK. The attached patch adds a CopyXLogResponse message. The walsender sends it after processing START_REPLICATION command, instead of CopyOutResponse. During Copy XLog mode, walreceiver can receive some data from walsender, and can send some data to walsender. > Personally I think this demonstrates that piggybacking replication > data transfer on the COPY protocol was a bad design to start with. > It's probably time to split them apart. In the patch, replication data is still transferred on COPY protocol. If we'd transfer that on dedicated protocol for replication, we would need to duplicate PQgetCopyData and PQputCopyData and define those duplicated functions as something like PQgetXLogData and PQputXLogData for replication. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Attachment
On Mon, Sep 20, 2010 at 12:17 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Personally I think this demonstrates that piggybacking replication > data transfer on the COPY protocol was a bad design to start with. > It's probably time to split them apart. This appears to be the only obvious unresolved issue regarding this patch: https://commitfest.postgresql.org/action/patch_view?id=412 I don't have a strong personal position on whether or not we should do this, but it strikes me that Tom hasn't given much justification for why he thinks we should do this, what benefit we'd get from it, or what the design should look like. So I guess the question is whether Tom - or anyone - would like to make a case for a more serious protocol overhaul, or whether we should just go with the approach proposed here. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Mon, Sep 20, 2010 at 12:17 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Personally I think this demonstrates that piggybacking replication >> data transfer on the COPY protocol was a bad design to start with. >> It's probably time to split them apart. > This appears to be the only obvious unresolved issue regarding this patch: > https://commitfest.postgresql.org/action/patch_view?id=412 > I don't have a strong personal position on whether or not we should do > this, but it strikes me that Tom hasn't given much justification for > why he thinks we should do this, what benefit we'd get from it, or > what the design should look like. So I guess the question is whether > Tom - or anyone - would like to make a case for a more serious > protocol overhaul, or whether we should just go with the approach > proposed here. I was objecting to v1 of the patch. v2 seems somewhat cleaner --- it at least avoids changing the behavior of libpq for normal COPY operation. I'm still a bit concerned by the prospect of having to shove further warts into the COPY data path in future, but maybe its premature to complain about that when it hasn't happened yet. Just in a quick scan, I don't have any objection to v2 except that the protocol documentation is lacking. regards, tom lane
On Mon, Nov 15, 2010 at 7:26 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Mon, Sep 20, 2010 at 12:17 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> Personally I think this demonstrates that piggybacking replication >>> data transfer on the COPY protocol was a bad design to start with. >>> It's probably time to split them apart. > >> This appears to be the only obvious unresolved issue regarding this patch: > >> https://commitfest.postgresql.org/action/patch_view?id=412 > >> I don't have a strong personal position on whether or not we should do >> this, but it strikes me that Tom hasn't given much justification for >> why he thinks we should do this, what benefit we'd get from it, or >> what the design should look like. So I guess the question is whether >> Tom - or anyone - would like to make a case for a more serious >> protocol overhaul, or whether we should just go with the approach >> proposed here. > > I was objecting to v1 of the patch. v2 seems somewhat cleaner --- it at > least avoids changing the behavior of libpq for normal COPY operation. > I'm still a bit concerned by the prospect of having to shove further > warts into the COPY data path in future, but maybe its premature to > complain about that when it hasn't happened yet. It's not an unreasonable complaint, but I don't have a very clear idea what to do about it. > Just in a quick scan, I don't have any objection to v2 except that the > protocol documentation is lacking. OK, I'll mark it Waiting on Author pending that issue. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Tue, Nov 16, 2010 at 10:49 AM, Robert Haas <robertmhaas@gmail.com> wrote: >> Just in a quick scan, I don't have any objection to v2 except that the >> protocol documentation is lacking. > > OK, I'll mark it Waiting on Author pending that issue. The patch is touching protocol.sgml as follows. Isn't this enough? ----------------------------- *** a/doc/src/sgml/protocol.sgml --- b/doc/src/sgml/protocol.sgml *************** *** 1344,1350 **** The commands accepted in walsender mode are: WAL position <replaceable>XXX</>/<replaceable>XXX</>. The server can reply with an error, e.g. if the requested section of WAL has already been recycled. On success, server responds with a ! CopyOutResponse message, and then starts to stream WAL to the frontend. WAL will continue to be streamed untilthe connection is broken; no further commands will be accepted. </para> --- 1344,1350 ---- WAL position <replaceable>XXX</>/<replaceable>XXX</>. The server can reply with an error,e.g. if the requested section of WAL has already been recycled. On success, server responds with a ! CopyXLogResponse message, and then starts to stream WAL to the frontend. WAL will continue to be streamed untilthe connection is broken; no further commands will be accepted. </para> *************** *** 2696,2701 **** CopyOutResponse (B) --- 2696,2737 ---- <varlistentry> <term> + CopyXLogResponse (B) + </term> + <listitem> + <para> + + <variablelist> + <varlistentry> + <term> + Byte1('W') + </term> + <listitem> + <para> + Identifies the message as a Start Copy XLog response. + This message is used only for Streaming Replication. + </para> + </listitem> + </varlistentry> + <varlistentry> + <term> + Int32 + </term> + <listitem> + <para> + Length of message contents in bytes, including self. + </para> + </listitem> + </varlistentry> + </variablelist> + + </para> + </listitem> + </varlistentry> + + + <varlistentry> + <term> DataRow (B) </term> <listitem> ----------------------------- Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
On Thu, Nov 18, 2010 at 7:43 AM, Fujii Masao <masao.fujii@gmail.com> wrote: > On Tue, Nov 16, 2010 at 10:49 AM, Robert Haas <robertmhaas@gmail.com> wrote: >>> Just in a quick scan, I don't have any objection to v2 except that the >>> protocol documentation is lacking. >> >> OK, I'll mark it Waiting on Author pending that issue. > > The patch is touching protocol.sgml as follows. Isn't this enough? How about some updates to the "Message Flow" section, especially the section on "COPY Operations"? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Thu, Nov 18, 2010 at 7:43 AM, Fujii Masao <masao.fujii@gmail.com> wrote: >> The patch is touching protocol.sgml as follows. Isn't this enough? > How about some updates to the "Message Flow" section, especially the > section on "COPY Operations"? Yeah. You're adding a new fundamental state to the protocol; it's not enough to bury that in the description of a message format. I don't think a whole lot of new verbiage is needed, but the COPY section needs to point out that this is a different state that allows both send and receive, and explain what the conditions are for getting into and out of that state. regards, tom lane
Excerpts from Tom Lane's message of vie nov 19 12:25:13 -0300 2010: > Robert Haas <robertmhaas@gmail.com> writes: > > On Thu, Nov 18, 2010 at 7:43 AM, Fujii Masao <masao.fujii@gmail.com> wrote: > >> The patch is touching protocol.sgml as follows. Isn't this enough? > > > How about some updates to the "Message Flow" section, especially the > > section on "COPY Operations"? > > Yeah. You're adding a new fundamental state to the protocol; it's not > enough to bury that in the description of a message format. I don't > think a whole lot of new verbiage is needed, but the COPY section needs > to point out that this is a different state that allows both send and > receive, and explain what the conditions are for getting into and out of > that state. Is it sane that the new message has so specific a name? -- Álvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera <alvherre@commandprompt.com> writes: > Excerpts from Tom Lane's message of vie nov 19 12:25:13 -0300 2010: >> Yeah. You're adding a new fundamental state to the protocol; it's not >> enough to bury that in the description of a message format. I don't >> think a whole lot of new verbiage is needed, but the COPY section needs >> to point out that this is a different state that allows both send and >> receive, and explain what the conditions are for getting into and out of >> that state. > Is it sane that the new message has so specific a name? Yeah, it might be better to call it something generic like CopyBoth. regards, tom lane
On Sat, Nov 20, 2010 at 2:04 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Alvaro Herrera <alvherre@commandprompt.com> writes: >> Excerpts from Tom Lane's message of vie nov 19 12:25:13 -0300 2010: >>> Yeah. You're adding a new fundamental state to the protocol; it's not >>> enough to bury that in the description of a message format. I don't >>> think a whole lot of new verbiage is needed, but the COPY section needs >>> to point out that this is a different state that allows both send and >>> receive, and explain what the conditions are for getting into and out of >>> that state. > >> Is it sane that the new message has so specific a name? > > Yeah, it might be better to call it something generic like CopyBoth. Thanks for the review! The attached patch s/CopyXLog/CopyBoth/g and adds the description about CopyBoth into the COPY section. While modifying the code, it occurred to me that we might have to add new ExecStatusType like PGRES_COPY_BOTH and use that for CopyBoth mode, for the sake of consistency. But since it's just alias of PGRES_COPY_BOTH for now, i.e., there is no specific behavior for that ExecStatusType, I don't think that it's worth adding that yet. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Attachment
Excerpts from Fujii Masao's message of jue nov 25 10:47:12 -0300 2010: > The attached patch s/CopyXLog/CopyBoth/g and adds the description > about CopyBoth into the COPY section. I gave this a look. It seems good, but I'm not sure about this bit: + case 'W': /* Start Copy Both */ + /* + * We don't need to use getCopyStart here since CopyBothResponse + * specifies neither the copy format nor the number of columns in + * the Copy data. They should be always zero. + */ + conn->result = PQmakeEmptyPGresult(conn, PGRES_COPY_OUT); + if (!conn->result) + return; + conn->asyncStatus = PGASYNC_COPY_BOTH; + conn->copy_already_done = 0; + break; I guess this was OK when this was conceived as CopyXlog, but since it's now a generic mechanism, this seems a bit unwise. Should this be reconsidered so that it's possible to change the format or number of columns? (The paragraph added to the docs is also a bit too specific about this being used exclusively in streaming replication, ISTM) > While modifying the code, it occurred to me that we might have to add new > ExecStatusType like PGRES_COPY_BOTH and use that for CopyBoth mode, > for the sake of consistency. But since it's just alias of PGRES_COPY_BOTH > for now, i.e., there is no specific behavior for that ExecStatusType, I don't > think that it's worth adding that yet. I'm not so sure about this. If we think that it's worth adding a new possible state, we should do so now; we will not be able to change this behavior later. -- Álvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
The one time this year top-posting seems appropriate...this patch seems stalled waiting for some sort of response to theconcerns Alvaro raised here.<br /><br /> Alvaro Herrera wrote: <blockquote cite="mid:1290818506-sup-5440@alvh.no-ip.org"type="cite"><pre wrap="">Excerpts from Fujii Masao's message of jue nov 25 10:47:12-0300 2010: </pre><blockquote type="cite"><pre wrap="">The attached patch s/CopyXLog/CopyBoth/g and adds the description about CopyBoth into the COPY section. </pre></blockquote><pre wrap=""> I gave this a look. It seems good, but I'm not sure about this bit: + case 'W': /* Start Copy Both */ + /* + * We don't need to use getCopyStart here since CopyBothResponse + * specifies neither the copy format nor the number of columns in + * the Copy data. They should be always zero. + */ + conn->result = PQmakeEmptyPGresult(conn, PGRES_COPY_OUT); + if (!conn->result) + return; + conn->asyncStatus = PGASYNC_COPY_BOTH; + conn->copy_already_done = 0; + break; I guess this was OK when this was conceived as CopyXlog, but since it's now a generic mechanism, this seems a bit unwise. Should this be reconsidered so that it's possible to change the format or number of columns? (The paragraph added to the docs is also a bit too specific about this being used exclusively in streaming replication, ISTM) </pre><blockquote type="cite"><pre wrap="">While modifying the code, it occurred to me that we might have to add new ExecStatusType like PGRES_COPY_BOTH and use that for CopyBoth mode, for the sake of consistency. But since it's just alias of PGRES_COPY_BOTH for now, i.e., there is no specific behavior for that ExecStatusType, I don't think that it's worth adding that yet. </pre></blockquote><pre wrap=""> I'm not so sure about this. If we think that it's worth adding a new possible state, we should do so now; we will not be able to change this behavior later. </pre></blockquote><br />
On Mon, Dec 6, 2010 at 3:07 AM, Greg Smith <greg@2ndquadrant.com> wrote: > The one time this year top-posting seems appropriate...this patch seems > stalled waiting for some sort of response to the concerns Alvaro raised > here. Sorry for the delay. I didn't have the time. > I gave this a look. It seems good, but I'm not sure about this bit: Thanks for the review! > I guess this was OK when this was conceived as CopyXlog, but since it's > now a generic mechanism, this seems a bit unwise. Should this be > reconsidered so that it's possible to change the format or number of > columns? I changed CopyBothResponse message so that it includes the format and number of columns of copy data. Please see the attached patch. > (The paragraph added to the docs is also a bit too specific about this > being used exclusively in streaming replication, ISTM) Yes. But it seems difficult to generalize the docs more because currently only SR uses Copy-both. So I had to write that, for example, the condition to get into the state is only "START REPLICATION" command. > While modifying the code, it occurred to me that we might have to add new > ExecStatusType like PGRES_COPY_BOTH and use that for CopyBoth mode, > for the sake of consistency. But since it's just alias of PGRES_COPY_BOTH > for now, i.e., there is no specific behavior for that ExecStatusType, I > don't > think that it's worth adding that yet. > > > I'm not so sure about this. If we think that it's worth adding a new > possible state, we should do so now; we will not be able to change this > behavior later. OK. I added that new state. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Attachment
On Mon, Dec 6, 2010 at 12:54 AM, Fujii Masao <masao.fujii@gmail.com> wrote: > On Mon, Dec 6, 2010 at 3:07 AM, Greg Smith <greg@2ndquadrant.com> wrote: >> The one time this year top-posting seems appropriate...this patch seems >> stalled waiting for some sort of response to the concerns Alvaro raised >> here. > > Sorry for the delay. I didn't have the time. > >> I gave this a look. It seems good, but I'm not sure about this bit: > > Thanks for the review! > >> I guess this was OK when this was conceived as CopyXlog, but since it's >> now a generic mechanism, this seems a bit unwise. Should this be >> reconsidered so that it's possible to change the format or number of >> columns? > > I changed CopyBothResponse message so that it includes the format > and number of columns of copy data. Please see the attached patch. > >> (The paragraph added to the docs is also a bit too specific about this >> being used exclusively in streaming replication, ISTM) > > Yes. But it seems difficult to generalize the docs more because currently > only SR uses Copy-both. So I had to write that, for example, the condition > to get into the state is only "START REPLICATION" command. > >> While modifying the code, it occurred to me that we might have to add new >> ExecStatusType like PGRES_COPY_BOTH and use that for CopyBoth mode, >> for the sake of consistency. But since it's just alias of PGRES_COPY_BOTH >> for now, i.e., there is no specific behavior for that ExecStatusType, I >> don't >> think that it's worth adding that yet. >> >> >> I'm not so sure about this. If we think that it's worth adding a new >> possible state, we should do so now; we will not be able to change this >> behavior later. > > OK. I added that new state. Committed with just a few changes to the documentation. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Sat, Dec 11, 2010 at 11:37 PM, Robert Haas <robertmhaas@gmail.com> wrote: > Committed with just a few changes to the documentation. Thanks a lot! Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center