Thread: making write location work (was: Efficient transaction-controlled synchronous replication)
making write location work (was: Efficient transaction-controlled synchronous replication)
From
Robert Haas
Date:
On Fri, Mar 18, 2011 at 8:16 AM, Robert Haas <robertmhaas@gmail.com> wrote: > On Fri, Mar 18, 2011 at 3:52 AM, Simon Riggs <simon@2ndquadrant.com> wrote: >>> I agree to get rid of write_location. >> >> No, don't remove it. >> >> We seem to be just looking for things to tweak without any purpose. >> Removing this adds nothing for us. >> >> We will have the column in the future, it is there now, so leave it. > > Well then can we revert the part of your patch that causes it to not > actually work any more? Specifically, if we're not going to remove write location, then I think we need to apply something like the attached. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Attachment
Re: making write location work (was: Efficient transaction-controlled synchronous replication)
From
Simon Riggs
Date:
On Wed, Mar 23, 2011 at 3:35 PM, Robert Haas <robertmhaas@gmail.com> wrote: > On Fri, Mar 18, 2011 at 8:16 AM, Robert Haas <robertmhaas@gmail.com> wrote: >> On Fri, Mar 18, 2011 at 3:52 AM, Simon Riggs <simon@2ndquadrant.com> wrote: >>>> I agree to get rid of write_location. >>> >>> No, don't remove it. >>> >>> We seem to be just looking for things to tweak without any purpose. >>> Removing this adds nothing for us. >>> >>> We will have the column in the future, it is there now, so leave it. >> >> Well then can we revert the part of your patch that causes it to not >> actually work any more? > > Specifically, if we're not going to remove write location, then I > think we need to apply something like the attached. The protocol supports different write/fsync values, so the view should display them. We don't know what the standby end will be doing with the data in all cases. For the main server, making the additional change will just decrease performance, for no benefit. In the future we would have a parameter that says how often we send replies, but there's no point having a parameter if there is only one meaningful value for standby servers currently. Please leave this as it is now. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: making write location work (was: Efficient transaction-controlled synchronous replication)
From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes: > Specifically, if we're not going to remove write location, then I > think we need to apply something like the attached. > while (walrcv_receive(0, &type, &buf, &len)) > XLogWalRcvProcessMsg(type, buf, len); > + /* Let the master know that we received some data. */ > + XLogWalRcvSendReply(); What if we didn't actually receive any new data? regards, tom lane
Re: Re: making write location work (was: Efficient transaction-controlled synchronous replication)
From
Robert Haas
Date:
On Wed, Mar 23, 2011 at 12:10 PM, Simon Riggs <simon@2ndquadrant.com> wrote: >> Specifically, if we're not going to remove write location, then I >> think we need to apply something like the attached. > > The protocol supports different write/fsync values, so the view should > display them. That's exactly the point. Currently, we have a protocol that supports different write and fsync values, but the code as written does not actually ever send a reply at any time when the two values can ever be different. So there is no point in sending both of them. The write location is completely redundant with the fsync location and therefore completely useless. We shouldn't bother sending the value twice, or displaying it twice, if it's absolutely 100% guaranteed to be identical in every case. The point of the patch that I posted is that it restores the previous behavior, where we send an update before flushing WAL and again after flushing WAL. If we do that, then the write location can be ahead of the flush location when we've written but not flushed. If we don't do that, and only send replies after flushing everything, then the two fields are perforce always the same on the master. I don't see that as being a useful behavior, and in fact I think it could be quite confusing. Someone might assume that if we bother to expose both a write_location and a flush_location, they are somehow different. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: making write location work (was: Efficient transaction-controlled synchronous replication)
From
Robert Haas
Date:
On Wed, Mar 23, 2011 at 12:44 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> Specifically, if we're not going to remove write location, then I >> think we need to apply something like the attached. > >> while (walrcv_receive(0, &type, &buf, &len)) >> XLogWalRcvProcessMsg(type, buf, len); > >> + /* Let the master know that we received some data. */ >> + XLogWalRcvSendReply(); > > What if we didn't actually receive any new data? The portion of the code immediately preceding what's included in the diff guards against that, and there is a second guard in XLogWalRcvSendReply(). -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Re: making write location work (was: Efficient transaction-controlled synchronous replication)
From
Robert Haas
Date:
On Wed, Mar 23, 2011 at 2:43 PM, Simon Riggs <simon@2ndquadrant.com> wrote: > On Wed, Mar 23, 2011 at 6:22 PM, Robert Haas <robertmhaas@gmail.com> wrote: >> On Wed, Mar 23, 2011 at 12:10 PM, Simon Riggs <simon@2ndquadrant.com> wrote: >>>> Specifically, if we're not going to remove write location, then I >>>> think we need to apply something like the attached. >>> >>> The protocol supports different write/fsync values, so the view should >>> display them. >> >> That's exactly the point. > > No its not. > >> Currently, we have a protocol that supports >> different write and fsync values, but the code as written does not >> actually ever send a reply at any time when the two values can ever be >> different. So there is no point in sending both of them. The write >> location is completely redundant with the fsync location and therefore >> completely useless. We shouldn't bother sending the value twice, or >> displaying it twice, if it's absolutely 100% guaranteed to be >> identical in every case. > > As of 9.1, we now support other tools that use the protocol, so you > cannot assume you know what is being sent, just because one sender has > certain characteristics. Oh, really? Is this strictly hypothetical or is such a beast planned/already in existence? I'm just afraid this is going to be confusing to users who will expect it to do something that it doesn't. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Re: making write location work (was: Efficient transaction-controlled synchronous replication)
From
Simon Riggs
Date:
On Wed, Mar 23, 2011 at 7:29 PM, Robert Haas <robertmhaas@gmail.com> wrote: > On Wed, Mar 23, 2011 at 2:43 PM, Simon Riggs <simon@2ndquadrant.com> wrote: >> On Wed, Mar 23, 2011 at 6:22 PM, Robert Haas <robertmhaas@gmail.com> wrote: >>> On Wed, Mar 23, 2011 at 12:10 PM, Simon Riggs <simon@2ndquadrant.com> wrote: >>>>> Specifically, if we're not going to remove write location, then I >>>>> think we need to apply something like the attached. >>>> >>>> The protocol supports different write/fsync values, so the view should >>>> display them. >>> >>> That's exactly the point. >> >> No its not. >> >>> Currently, we have a protocol that supports >>> different write and fsync values, but the code as written does not >>> actually ever send a reply at any time when the two values can ever be >>> different. So there is no point in sending both of them. The write >>> location is completely redundant with the fsync location and therefore >>> completely useless. We shouldn't bother sending the value twice, or >>> displaying it twice, if it's absolutely 100% guaranteed to be >>> identical in every case. >> >> As of 9.1, we now support other tools that use the protocol, so you >> cannot assume you know what is being sent, just because one sender has >> certain characteristics. > > Oh, really? Is this strictly hypothetical or is such a beast > planned/already in existence? Ask Magnus. In any case, that's not the only argument for keeping it. We introduce the view in this release and I would like it to stay the same from now, since we know we will need that info later. No more minor tweaks, please. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: Re: making write location work (was: Efficient transaction-controlled synchronous replication)
From
Robert Haas
Date:
On Wed, Mar 23, 2011 at 3:33 PM, Simon Riggs <simon@2ndquadrant.com> wrote: > In any case, that's not the only argument for keeping it. We introduce > the view in this release and I would like it to stay the same from > now, since we know we will need that info later. At least as I understand it, it's not our project policy to carry around code that doesn't accomplish anything useful. I have no objection to keeping the field; I simply think that if we're going to have it, we should make it work, as in fact it did before you changed it without discussion. You haven't offered any evidence at all that it introduces any kind of a performance regression AT ALL, much less that such any such regression can't be trivially patched around by making SyncRepReleaseWaiters exit quickly if the flush LSN hasn't advanced. The onus is as much on you to justify the change as it is on me to justify changing it back. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Re: making write location work (was: Efficient transaction-controlled synchronous replication)
From
Simon Riggs
Date:
On Wed, Mar 23, 2011 at 8:20 PM, Robert Haas <robertmhaas@gmail.com> wrote: > On Wed, Mar 23, 2011 at 3:33 PM, Simon Riggs <simon@2ndquadrant.com> wrote: >> In any case, that's not the only argument for keeping it. We introduce >> the view in this release and I would like it to stay the same from >> now, since we know we will need that info later. > > At least as I understand it, it's not our project policy to carry > around code that doesn't accomplish anything useful. I have no > objection to keeping the field; I simply think that if we're going to > have it, we should make it work, as in fact it did before you changed > it without discussion. You haven't offered any evidence at all that > it introduces any kind of a performance regression AT ALL, much less > that such any such regression can't be trivially patched around by > making SyncRepReleaseWaiters exit quickly if the flush LSN hasn't > advanced. The onus is as much on you to justify the change as it is > on me to justify changing it back. What a stupid conversation. There's no onus on me to have to keep justifying to you why the code is the way it is, but I do. If you want to make a change that I already know reduces performance, you have to have a good reason. So far, you don't. Stop fussing and wrap the release. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: Re: making write location work (was: Efficient transaction-controlled synchronous replication)
From
Robert Haas
Date:
On Thu, Mar 24, 2011 at 2:28 PM, Simon Riggs <simon@2ndquadrant.com> wrote: > The protocol supports sending two values, so two are displayed. > > If you wish to remove one from the display then that only makes sense > if you also prevent the protocol from supporting two values. > > There is no benefit in doing that, so why do it? We are going to put > that back in 9.2 if you remove it now. Why not leave it, so we don't > need to rewrite all the monitoring tools that will use this view? If we're going to put it back in 9.2, then we shouldn't remove it now.We should just make it work. It's a three line patch. If 9.2 is going to meaningfully distinguish between write location and flush location, then we may as well do the same thing in 9.1. Then we'll be ahead of the game: not only will the view have the same columns in both releases, but they'll actually have the same semantics in both releases. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Re: making write location work (was: Efficient transaction-controlled synchronous replication)
From
Fujii Masao
Date:
On Fri, Mar 25, 2011 at 6:11 AM, Robert Haas <robertmhaas@gmail.com> wrote: > On Thu, Mar 24, 2011 at 2:28 PM, Simon Riggs <simon@2ndquadrant.com> wrote: >> The protocol supports sending two values, so two are displayed. >> >> If you wish to remove one from the display then that only makes sense >> if you also prevent the protocol from supporting two values. >> >> There is no benefit in doing that, so why do it? We are going to put >> that back in 9.2 if you remove it now. Why not leave it, so we don't >> need to rewrite all the monitoring tools that will use this view? What are you planning to use write_location for? BTW, I'm thinking to add recv_location (not write_location) in 9.2 to support another sync rep mode which makes transactions wait until the standby has received (not fsync'd) the WAL. You're planning the same? > If we're going to put it back in 9.2, then we shouldn't remove it now. > We should just make it work. It's a three line patch. If 9.2 is > going to meaningfully distinguish between write location and flush > location, then we may as well do the same thing in 9.1. Then we'll be > ahead of the game: not only will the view have the same columns in > both releases, but they'll actually have the same semantics in both > releases. +1 Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Re: Re: making write location work (was: Efficient transaction-controlled synchronous replication)
From
Robert Haas
Date:
On Fri, Mar 25, 2011 at 5:03 AM, Fujii Masao <masao.fujii@gmail.com> wrote: > On Fri, Mar 25, 2011 at 6:11 AM, Robert Haas <robertmhaas@gmail.com> wrote: >> On Thu, Mar 24, 2011 at 2:28 PM, Simon Riggs <simon@2ndquadrant.com> wrote: >>> The protocol supports sending two values, so two are displayed. >>> >>> If you wish to remove one from the display then that only makes sense >>> if you also prevent the protocol from supporting two values. >>> >>> There is no benefit in doing that, so why do it? We are going to put >>> that back in 9.2 if you remove it now. Why not leave it, so we don't >>> need to rewrite all the monitoring tools that will use this view? > > What are you planning to use write_location for? BTW, I'm thinking to > add recv_location (not write_location) in 9.2 to support another sync rep > mode which makes transactions wait until the standby has received > (not fsync'd) the WAL. You're planning the same? > >> If we're going to put it back in 9.2, then we shouldn't remove it now. >> We should just make it work. It's a three line patch. If 9.2 is >> going to meaningfully distinguish between write location and flush >> location, then we may as well do the same thing in 9.1. Then we'll be >> ahead of the game: not only will the view have the same columns in >> both releases, but they'll actually have the same semantics in both >> releases. > > +1 I think we have adequate consensus on this topic, so committed. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Re: making write location work (was: Efficient transaction-controlled synchronous replication)
From
Simon Riggs
Date:
On Wed, Mar 23, 2011 at 6:22 PM, Robert Haas <robertmhaas@gmail.com> wrote: > On Wed, Mar 23, 2011 at 12:10 PM, Simon Riggs <simon@2ndquadrant.com> wrote: >>> Specifically, if we're not going to remove write location, then I >>> think we need to apply something like the attached. >> >> The protocol supports different write/fsync values, so the view should >> display them. > > That's exactly the point. No its not. > Currently, we have a protocol that supports > different write and fsync values, but the code as written does not > actually ever send a reply at any time when the two values can ever be > different. So there is no point in sending both of them. The write > location is completely redundant with the fsync location and therefore > completely useless. We shouldn't bother sending the value twice, or > displaying it twice, if it's absolutely 100% guaranteed to be > identical in every case. As of 9.1, we now support other tools that use the protocol, so you cannot assume you know what is being sent, just because one sender has certain characteristics. > The point of the patch that I posted is that it restores the previous > behavior, where we send an update before flushing WAL and again after > flushing WAL. If we do that, then the write location can be ahead of > the flush location when we've written but not flushed. If we don't do > that, and only send replies after flushing everything, then the two > fields are perforce always the same on the master. I don't see that > as being a useful behavior, and in fact I think it could be quite > confusing. Someone might assume that if we bother to expose both a > write_location and a flush_location, they are somehow different. They can be in 9.1 and almost certainly will be in 9.2 -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: Re: making write location work (was: Efficient transaction-controlled synchronous replication)
From
Fujii Masao
Date:
On Thu, Mar 24, 2011 at 3:22 AM, Robert Haas <robertmhaas@gmail.com> wrote: > On Wed, Mar 23, 2011 at 12:10 PM, Simon Riggs <simon@2ndquadrant.com> wrote: >>> Specifically, if we're not going to remove write location, then I >>> think we need to apply something like the attached. >> >> The protocol supports different write/fsync values, so the view should >> display them. > > That's exactly the point. Currently, we have a protocol that supports > different write and fsync values, but the code as written does not > actually ever send a reply at any time when the two values can ever be > different. So there is no point in sending both of them. The write > location is completely redundant with the fsync location and therefore > completely useless. We shouldn't bother sending the value twice, or > displaying it twice, if it's absolutely 100% guaranteed to be > identical in every case. > > The point of the patch that I posted is that it restores the previous > behavior, where we send an update before flushing WAL and again after > flushing WAL. If we do that, then the write location can be ahead of > the flush location when we've written but not flushed. If we don't do > that, and only send replies after flushing everything, then the two > fields are perforce always the same on the master. I don't see that > as being a useful behavior, and in fact I think it could be quite > confusing. Someone might assume that if we bother to expose both a > write_location and a flush_location, they are somehow different. I agree with Robert. It's useless and confusing to send the same location as flush_location as write_location redundantly. We should either remove write_location from the pg_stat_replication view and the protocol at all, or apply the proposed patch. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Re: Re: making write location work (was: Efficient transaction-controlled synchronous replication)
From
Simon Riggs
Date:
On Thu, Mar 24, 2011 at 11:45 AM, Fujii Masao <masao.fujii@gmail.com> wrote: > On Thu, Mar 24, 2011 at 3:22 AM, Robert Haas <robertmhaas@gmail.com> wrote: >> On Wed, Mar 23, 2011 at 12:10 PM, Simon Riggs <simon@2ndquadrant.com> wrote: >>>> Specifically, if we're not going to remove write location, then I >>>> think we need to apply something like the attached. >>> >>> The protocol supports different write/fsync values, so the view should >>> display them. >> >> That's exactly the point. Currently, we have a protocol that supports >> different write and fsync values, but the code as written does not >> actually ever send a reply at any time when the two values can ever be >> different. So there is no point in sending both of them. The write >> location is completely redundant with the fsync location and therefore >> completely useless. We shouldn't bother sending the value twice, or >> displaying it twice, if it's absolutely 100% guaranteed to be >> identical in every case. >> >> The point of the patch that I posted is that it restores the previous >> behavior, where we send an update before flushing WAL and again after >> flushing WAL. If we do that, then the write location can be ahead of >> the flush location when we've written but not flushed. If we don't do >> that, and only send replies after flushing everything, then the two >> fields are perforce always the same on the master. I don't see that >> as being a useful behavior, and in fact I think it could be quite >> confusing. Someone might assume that if we bother to expose both a >> write_location and a flush_location, they are somehow different. > > I agree with Robert. It's useless and confusing to send the same > location as flush_location as write_location redundantly. We should > either remove write_location from the pg_stat_replication view and > the protocol at all, or apply the proposed patch. You may be confused, but that doesn't mean its useless. The protocol supports sending two values, so two are displayed. If you wish to remove one from the display then that only makes sense if you also prevent the protocol from supporting two values. There is no benefit in doing that, so why do it? We are going to put that back in 9.2 if you remove it now. Why not leave it, so we don't need to rewrite all the monitoring tools that will use this view? Just say this in the docs. "Currently, standbys return the same value for write and flush locations and so in most cases these values will be the same. It is possible to write a user program that sends replies at different times and in that case the values would differ. In later release these values will differ for standbys. when more options are available." It's almost certain that Magnus will fix the bug in pg_xlogstream (or whatever its called) and a tool will be available, so lets leave this. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services