Thread: Sync Rep for 2011CF1

Sync Rep for 2011CF1

From
Simon Riggs
Date:
Here's the latest patch for sync rep.

>From here, I will be developing the patch further on public git
repository towards commit. My expectation is that commit is at least 2
weeks away, though there are no major unresolved problems. I expect
essential follow on patches to continue for a further 2-4 weeks after
that first commit.

I will add my own reviewer's notes tomorrow.

In terms of testing, the patch hasn't been tested further than my own
laptop as yet, so it seems likely there's a few trivial howlers in
there. That is simply because of my recent flu.

I've requested Heikki as main reviewer and he's accepted. Other comments
are also welcome about the user interface and the reply protocol are
also welcome. Please don't bother performance testing yet. I'll let you
know when that is appropriate.

--
 Simon Riggs           http://www.2ndQuadrant.com/books/
 PostgreSQL Development, 24x7 Support, Training and Services


Attachment

Re: Sync Rep for 2011CF1

From
Magnus Hagander
Date:
On Sat, Jan 15, 2011 at 22:40, Simon Riggs <simon@2ndquadrant.com> wrote:
>
> Here's the latest patch for sync rep.
>
> >From here, I will be developing the patch further on public git
> repository towards commit. My expectation is that commit is at least 2

That's great. Just one tiny detail - which repository and which branch? ;)

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: Sync Rep for 2011CF1

From
Heikki Linnakangas
Date:
(grr, I wrote this on Monday already, but just found it in my drafts 
folder, unsent)

On 15.01.2011 23:40, Simon Riggs wrote:
>
> Here's the latest patch for sync rep.
>
>> From here, I will be developing the patch further on public git
> repository towards commit. My expectation is that commit is at least 2
> weeks away, though there are no major unresolved problems. I expect
> essential follow on patches to continue for a further 2-4 weeks after
> that first commit.

Thanks! Some quick observations after first read-through:

* The docs for synchronous_replication still claim that it means two 
different things in master and standby. Looking at the code, I believe 
that's not true anymore.

* it seems like overkill to not let clients to even connect when 
allow_standalone_primary=off and no synchronous standbys are available. 
What if you just want to run a read-only query?

* Please separate the hot standby feedback loop into a separate patch on 
top of the synch rep patch. I know it's not a lot of code, but it's 
still easier to handle features separately.

* The UI differs from what was agreed on here: 
http://archives.postgresql.org/message-id/4D1DCF5A.7070808@enterprisedb.com.

* Instead of the short-circuit for autovacuum in SyncRepWaitOnQueue(), 
it's probably better to set synchronous_commit=off locally when the 
autovacuum process starts.

* the "queue id" thing is dead code at the moment, as there is only one 
queue. I gather this is a leftover from having different queues for 
"apply", "sync", "write" modes, but I think it would be better to just 
remove it for now.

PS, I'm surprised how small this patch is. Thinking about it some more, 
I don't know why I expected this to be a big patch.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


Re: Sync Rep for 2011CF1

From
Simon Riggs
Date:
On Fri, 2011-01-21 at 14:45 +0200, Heikki Linnakangas wrote:
> (grr, I wrote this on Monday already, but just found it in my drafts 
> folder, unsent)

No worries, thanks for commenting.

> Thanks! Some quick observations after first read-through:
> 
> * The docs for synchronous_replication still claim that it means two 
> different things in master and standby. Looking at the code, I believe 
> that's not true anymore.

Probably. The docs changed so many times I had gone "code-blind".

> * it seems like overkill to not let clients to even connect when 
> allow_standalone_primary=off and no synchronous standbys are available. 
> What if you just want to run a read-only query?

That's what Aidan requested, I agreed and so its there. You're using
sync rep because of writes, so you have a read-write app. If you allow
connections then half of the app will work, half will not. Half-working
isn't very useful, as Aidan eloquently explained. If your app is all
read-only you wouldn't be using sync rep anyway. That's the argument,
but I've not got especially strong feelings it has to be this way.

Perhaps discuss that on a separate thread? See what everyone thinks?

> * Please separate the hot standby feedback loop into a separate patch on 
> top of the synch rep patch. I know it's not a lot of code, but it's 
> still easier to handle features separately.

I tried to do that initially, but there is interaction between those
features. The way I have it is that the replies from the standby act as
keepalives to the master. So the hot standby feedback is just an extra
parameter and an extra field. Removing that doesn't really make the
patch any easier to understand.

> * The UI differs from what was agreed on here: 
> http://archives.postgresql.org/message-id/4D1DCF5A.7070808@enterprisedb.com.

You mean synchronous_standbys is not there yet? Yes, I know. It can be
added after we commit this, its only a small bit of code and no
dependencies. I figured we had bigger things to agree first.

> * Instead of the short-circuit for autovacuum in SyncRepWaitOnQueue(), 
> it's probably better to set synchronous_commit=off locally when the 
> autovacuum process starts.

Even better plan, thanks.

> * the "queue id" thing is dead code at the moment, as there is only one 
> queue. I gather this is a leftover from having different queues for 
> "apply", "sync", "write" modes, but I think it would be better to just 
> remove it for now.

It's a trivial patch to add options to either fsync or apply, so I was
expecting to add that back in this release also.

> PS, I'm surprised how small this patch is. Thinking about it some more, 
> I don't know why I expected this to be a big patch.

Yes, it's the decisions which seem fairly big this time.

-- Simon Riggs           http://www.2ndQuadrant.com/books/PostgreSQL Development, 24x7 Support, Training and Services



Re: Sync Rep for 2011CF1

From
Robert Haas
Date:
On Fri, Jan 21, 2011 at 7:45 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
> * it seems like overkill to not let clients to even connect when
> allow_standalone_primary=off and no synchronous standbys are available. What
> if you just want to run a read-only query?

For what it's worth, +1.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Sync Rep for 2011CF1

From
Magnus Hagander
Date:
On Fri, Jan 21, 2011 at 14:24, Simon Riggs <simon@2ndquadrant.com> wrote:
> On Fri, 2011-01-21 at 14:45 +0200, Heikki Linnakangas wrote:
>> * it seems like overkill to not let clients to even connect when
>> allow_standalone_primary=off and no synchronous standbys are available.
>> What if you just want to run a read-only query?
>
> That's what Aidan requested, I agreed and so its there. You're using
> sync rep because of writes, so you have a read-write app. If you allow
> connections then half of the app will work, half will not. Half-working
> isn't very useful, as Aidan eloquently explained. If your app is all
> read-only you wouldn't be using sync rep anyway. That's the argument,
> but I've not got especially strong feelings it has to be this way.
>
> Perhaps discuss that on a separate thread? See what everyone thinks?

I'll respond here once, and we'll see if more people want to comment
then we can move it :-)

Doesn't this make a pretty strange assumption - namely that you have a
single application? We support multiple databases, and multiple users,
and multiple pretty much anything - in most cases, people deploy
multiple apps. (They may well be part of the same "solution" or
whatever you want to call it, but parts may well be readonly - like a
reporting app, or even just a monitoring client)


--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: Sync Rep for 2011CF1

From
Heikki Linnakangas
Date:
On 21.01.2011 15:24, Simon Riggs wrote:
> On Fri, 2011-01-21 at 14:45 +0200, Heikki Linnakangas wrote:
>> * it seems like overkill to not let clients to even connect when
>> allow_standalone_primary=off and no synchronous standbys are available.
>> What if you just want to run a read-only query?
>
> That's what Aidan requested, I agreed and so its there. You're using
> sync rep because of writes, so you have a read-write app. If you allow
> connections then half of the app will work, half will not. Half-working
> isn't very useful, as Aidan eloquently explained. If your app is all
> read-only you wouldn't be using sync rep anyway. That's the argument,
> but I've not got especially strong feelings it has to be this way.

It's also possible that most of your transactions in fact do "set 
synchronous_replication=off", and only a few actually do synchronous 
replication. It would be pretty bad to not allow connections in that 
case. And what if you want to connect to the server to diagnose the 
issue? Oh, you can't... Besides, we're not kicking out existing 
connections, are we? Seems inconsistent to let the old connections live.

IMHO the only reasonable option is to allow connections as usual, and 
only fail (or block forever) at COMMIT.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


Re: Sync Rep for 2011CF1

From
Robert Haas
Date:
On Fri, Jan 21, 2011 at 10:33 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
> It's also possible that most of your transactions in fact do "set
> synchronous_replication=off", and only a few actually do synchronous
> replication. It would be pretty bad to not allow connections in that case.
> And what if you want to connect to the server to diagnose the issue? Oh, you
> can't... Besides, we're not kicking out existing connections, are we? Seems
> inconsistent to let the old connections live.
>
> IMHO the only reasonable option is to allow connections as usual, and only
> fail (or block forever) at COMMIT.

Another point is that the synchronous standby could come back at any
time.  There's no reason not to let the client do all the work they
want up until the commit - maybe the standby will pop back up before
the COMMIT actually issued.  Or even if it doesn't, as soon as it pops
back up, all those COMMITs get released.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Sync Rep for 2011CF1

From
Simon Riggs
Date:
On Fri, 2011-01-21 at 17:33 +0200, Heikki Linnakangas wrote:
> On 21.01.2011 15:24, Simon Riggs wrote:
> > On Fri, 2011-01-21 at 14:45 +0200, Heikki Linnakangas wrote:
> >> * it seems like overkill to not let clients to even connect when
> >> allow_standalone_primary=off and no synchronous standbys are available.
> >> What if you just want to run a read-only query?
> >
> > That's what Aidan requested, I agreed and so its there. You're using
> > sync rep because of writes, so you have a read-write app. If you allow
> > connections then half of the app will work, half will not. Half-working
> > isn't very useful, as Aidan eloquently explained. If your app is all
> > read-only you wouldn't be using sync rep anyway. That's the argument,
> > but I've not got especially strong feelings it has to be this way.
> 
> It's also possible that most of your transactions in fact do "set 
> synchronous_replication=off", and only a few actually do synchronous 
> replication. It would be pretty bad to not allow connections in that 
> case. And what if you want to connect to the server to diagnose the 
> issue? Oh, you can't... Besides, we're not kicking out existing 
> connections, are we? Seems inconsistent to let the old connections live.
> 
> IMHO the only reasonable option is to allow connections as usual, and 
> only fail (or block forever) at COMMIT.

We all think our own proposed options are the only reasonable thing, but
that helps us not at all in moving forwards. I've put much time into
delivering options many other people want, so there is a range of
function. I think we should hear from Aidan first before we decide to
remove that aspect.

-- Simon Riggs           http://www.2ndQuadrant.com/books/PostgreSQL Development, 24x7 Support, Training and Services



Re: Sync Rep for 2011CF1

From
Simon Riggs
Date:
On Fri, 2011-01-21 at 14:34 +0100, Magnus Hagander wrote:
> On Fri, Jan 21, 2011 at 14:24, Simon Riggs <simon@2ndquadrant.com> wrote:
> > On Fri, 2011-01-21 at 14:45 +0200, Heikki Linnakangas wrote:
> >> * it seems like overkill to not let clients to even connect when
> >> allow_standalone_primary=off and no synchronous standbys are available.
> >> What if you just want to run a read-only query?
> >
> > That's what Aidan requested, I agreed and so its there. You're using
> > sync rep because of writes, so you have a read-write app. If you allow
> > connections then half of the app will work, half will not. Half-working
> > isn't very useful, as Aidan eloquently explained. If your app is all
> > read-only you wouldn't be using sync rep anyway. That's the argument,
> > but I've not got especially strong feelings it has to be this way.
> >
> > Perhaps discuss that on a separate thread? See what everyone thinks?
> 
> I'll respond here once, and we'll see if more people want to comment
> then we can move it :-)
> 
> Doesn't this make a pretty strange assumption - namely that you have a
> single application? We support multiple databases, and multiple users,
> and multiple pretty much anything - in most cases, people deploy
> multiple apps. (They may well be part of the same "solution" or
> whatever you want to call it, but parts may well be readonly - like a
> reporting app, or even just a monitoring client)

There are various problems whatever we do. If we don't like one way, we
must balance that by judging what happens if we do things the other way.

-- Simon Riggs           http://www.2ndQuadrant.com/books/PostgreSQL Development, 24x7 Support, Training and Services



Re: Sync Rep for 2011CF1

From
Aidan Van Dyk
Date:
On Fri, Jan 21, 2011 at 11:59 AM, Simon Riggs <simon@2ndquadrant.com> wrote:

> We all think our own proposed options are the only reasonable thing, but
> that helps us not at all in moving forwards. I've put much time into
> delivering options many other people want, so there is a range of
> function. I think we should hear from Aidan first before we decide to
> remove that aspect.

Since invited, I'll describe what I *want* do to do.  I understand I
may not get it ;-)

When no sync slave is connected, yes, I want to stop things hard.  I
don't mind read-only queries working, but what I want to avoid (if
possible) is having the master do lots of inserts/updates/deletes for
clients, fsyncing them all to disk (so on some strange event causing
recovery they'll be considered commit) and just delay the commit
return until it has a valid sync slave connected and caught up again.
And *I*'ld prefer if client transactions get errors right away rather
than begin to hang if a sync slave is not connected.

Even with single server, there's the window where stuff could be
"committed" but the client not notified yet.  And that leads to
transactions which need to be verified.  And with sync rep, that
window get's a little larger.  But I'ld prefer not to make it a hanger
door, *especially* when it gets flung open at the point where the shit
has hit the fan and we're in the midst of switching over to manual
processing...

So, in my case, I'ld like it if PG couldn't do anything to generate
any user-initiated WAL unless there is a sync slave connected.  Yes, I
understand that leads to hard-fail, and yes, I understand I'm in the
minority, maybe almost singular in that desire.

a.

--
Aidan Van Dyk                                             Create like a god,
aidan@highrise.ca                                       command like a king,
http://www.highrise.ca/                                   work like a slave.


Re: Sync Rep for 2011CF1

From
Robert Haas
Date:
On Fri, Jan 21, 2011 at 12:23 PM, Aidan Van Dyk <aidan@highrise.ca> wrote:
> On Fri, Jan 21, 2011 at 11:59 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>
>> We all think our own proposed options are the only reasonable thing, but
>> that helps us not at all in moving forwards. I've put much time into
>> delivering options many other people want, so there is a range of
>> function. I think we should hear from Aidan first before we decide to
>> remove that aspect.
>
> Since invited, I'll describe what I *want* do to do.  I understand I
> may not get it ;-)
>
> When no sync slave is connected, yes, I want to stop things hard.  I
> don't mind read-only queries working, but what I want to avoid (if
> possible) is having the master do lots of inserts/updates/deletes for
> clients, fsyncing them all to disk (so on some strange event causing
> recovery they'll be considered commit) and just delay the commit
> return until it has a valid sync slave connected and caught up again.
> And *I*'ld prefer if client transactions get errors right away rather
> than begin to hang if a sync slave is not connected.
>
> Even with single server, there's the window where stuff could be
> "committed" but the client not notified yet.  And that leads to
> transactions which need to be verified.  And with sync rep, that
> window get's a little larger.  But I'ld prefer not to make it a hanger
> door, *especially* when it gets flung open at the point where the shit
> has hit the fan and we're in the midst of switching over to manual
> processing...
>
> So, in my case, I'ld like it if PG couldn't do anything to generate
> any user-initiated WAL unless there is a sync slave connected.  Yes, I
> understand that leads to hard-fail, and yes, I understand I'm in the
> minority, maybe almost singular in that desire.

What you're proposing is to fail things earlier than absolutely
necessary (when they try to XLOG, rather than at commit) but still
later than what I think Simon is proposing (not even letting them log
in).

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Sync Rep for 2011CF1

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Fri, Jan 21, 2011 at 12:23 PM, Aidan Van Dyk <aidan@highrise.ca> wrote:
>> When no sync slave is connected, yes, I want to stop things hard.

> What you're proposing is to fail things earlier than absolutely
> necessary (when they try to XLOG, rather than at commit) but still
> later than what I think Simon is proposing (not even letting them log
> in).

I can't see a reason to disallow login, because read-only transactions
can still run in such a situation --- and, indeed, might be fairly
essential if you need to inspect the database state on the way to fixing
the replication problem.  (Of course, we've already had the discussion
about it being a terrible idea to configure replication from inside the
database, but that doesn't mean there might not be views or status you
would wish to look at.)
        regards, tom lane


Re: Sync Rep for 2011CF1

From
Aidan Van Dyk
Date:
On Fri, Jan 21, 2011 at 1:03 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Fri, Jan 21, 2011 at 12:23 PM, Aidan Van Dyk <aidan@highrise.ca> wrote:
>>> When no sync slave is connected, yes, I want to stop things hard.
>
>> What you're proposing is to fail things earlier than absolutely
>> necessary (when they try to XLOG, rather than at commit) but still
>> later than what I think Simon is proposing (not even letting them log
>> in).
>
> I can't see a reason to disallow login, because read-only transactions
> can still run in such a situation --- and, indeed, might be fairly
> essential if you need to inspect the database state on the way to fixing
> the replication problem.  (Of course, we've already had the discussion
> about it being a terrible idea to configure replication from inside the
> database, but that doesn't mean there might not be views or status you
> would wish to look at.)

And just disallowing new logins is probably not even enough, because
it allows current logged in clients "forward progress", leading
towards an eventual hang (with now committed data on the master).

Again, I'm trying to stop "forward progress" as soon as possible when
a sync slave isn't replicating.  And I'ld like clients to fail with
errors sooner (hopefully they get to the commit point) rather than
accumulate the WAL synced to the master and just wait at the commit.

So I think that's a more complete picture of my quick "not do anything
with no synchronous slave replicating" that I think was what led to
the no-login approach.

a.

--
Aidan Van Dyk                                             Create like a god,
aidan@highrise.ca                                       command like a king,
http://www.highrise.ca/                                   work like a slave.


Re: Sync Rep for 2011CF1

From
Robert Haas
Date:
On Fri, Jan 21, 2011 at 1:09 PM, Aidan Van Dyk <aidan@highrise.ca> wrote:
> On Fri, Jan 21, 2011 at 1:03 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Robert Haas <robertmhaas@gmail.com> writes:
>>> On Fri, Jan 21, 2011 at 12:23 PM, Aidan Van Dyk <aidan@highrise.ca> wrote:
>>>> When no sync slave is connected, yes, I want to stop things hard.
>>
>>> What you're proposing is to fail things earlier than absolutely
>>> necessary (when they try to XLOG, rather than at commit) but still
>>> later than what I think Simon is proposing (not even letting them log
>>> in).
>>
>> I can't see a reason to disallow login, because read-only transactions
>> can still run in such a situation --- and, indeed, might be fairly
>> essential if you need to inspect the database state on the way to fixing
>> the replication problem.  (Of course, we've already had the discussion
>> about it being a terrible idea to configure replication from inside the
>> database, but that doesn't mean there might not be views or status you
>> would wish to look at.)
>
> And just disallowing new logins is probably not even enough, because
> it allows current logged in clients "forward progress", leading
> towards an eventual hang (with now committed data on the master).
>
> Again, I'm trying to stop "forward progress" as soon as possible when
> a sync slave isn't replicating.  And I'ld like clients to fail with
> errors sooner (hopefully they get to the commit point) rather than
> accumulate the WAL synced to the master and just wait at the commit.
>
> So I think that's a more complete picture of my quick "not do anything
> with no synchronous slave replicating" that I think was what led to
> the no-login approach.

Well, stopping all WAL activity with an error sounds *more* reasonable
than refusing all logins, but I'm not personally sold on it.  For
example, a brief network disruption on the connection between master
and standby would cause the master to grind to a halt... and then
almost immediately resume operations.  More generally, if you have
short-running transactions, there's not much difference between
wait-at-commit and wait-at-WAL, and if you have long-running
transactions, then wait-at-WAL might be gumming up the works more than
necessary.

One idea might be to wait both before and after commit.  If
allow_standalone_primary is off, and a commit is attempted, we check
whether there's a slave connected, and if not, wait for one to
connect.  Then, we write and sync the commit WAL record.  Next, we
wait for the WAL to be ack'd.  Of course, the standby might disappear
between the first check and the second, but it would greatly reduce
the possibility of the master being ahead of the standby after a
crash, which might be useful for some people.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Sync Rep for 2011CF1

From
Aidan Van Dyk
Date:
On Fri, Jan 21, 2011 at 1:32 PM, Robert Haas <robertmhaas@gmail.com> wrote:

>> Again, I'm trying to stop "forward progress" as soon as possible when
>> a sync slave isn't replicating.  And I'ld like clients to fail with
>> errors sooner (hopefully they get to the commit point) rather than
>> accumulate the WAL synced to the master and just wait at the commit.

> Well, stopping all WAL activity with an error sounds *more* reasonable
> than refusing all logins, but I'm not personally sold on it.  For
> example, a brief network disruption on the connection between master
> and standby would cause the master to grind to a halt... and then
> almost immediately resume operations.

Yup.  And I'm OK with that.  In my case, it would be much better to
have a few quick failures, which can complete automatically a few
seconds later then to have a big buildup of transactions to re-verify
by hand upon starting manual processing.

But again, I'll stress that I'm talking about whe the master has no
sync slave connected.  a "brief netowrk disruption" between the
master/slave isn't likely going to disconnect the slave.  TCP is
pretty good at handling those.  If the master thinks it has a sync
slave connected, I'm fine with it continuing to queue WAL for it even
if it's lagging noticeably.

>                                        More generally, if you have
> short-running transactions, there's not much difference between
> wait-at-commit and wait-at-WAL, and if you have long-running
> transactions, then wait-at-WAL might be gumming up the works more than
> necessary.

Again, when there is not sync slave *connected*, I don't want to wait
*at all*.  I want to fail ASAP.  If there is a sync slave, and it's
just slow, I don't really care where it waits.

From my experience, if the slave is not connected (i.e TCP connection
has been disconnected), then we're in something like:

1) Proper slave shutdown: pilot error here stopping it if the master requires it
2) Master start, slave not connected yet:  I'm fine with getting
errors here... We *hope* a slave will be here soon, but...
3) network has seperated master/slave:  TCP means it's been like this
for a long time already...
4) Slave hardware/os low-level hang/crash: TCP means it's been like
this for a while already before master's os tears down the connection
5) Slave has crashed (or rebooted) and slave OS has closed/rejected
our TCP connection

In all of these, I'ld love for my master not to be generating WAL and
letting clients think they are making progress.  And I'm hoping that
for #3 & 4 above, PG will have keepalive type traffic that will
prevent me from queing WAL for normal TCP connection time values.

> One idea might be to wait both before and after commit.  If
> allow_standalone_primary is off, and a commit is attempted, we check
> whether there's a slave connected, and if not, wait for one to
> connect.  Then, we write and sync the commit WAL record.  Next, we
> wait for the WAL to be ack'd.  Of course, the standby might disappear
> between the first check and the second, but it would greatly reduce
> the possibility of the master being ahead of the standby after a
> crash, which might be useful for some people.

Ya, but that becomes much more expensive.  Instead of it just being a
"write WAL, fsync WAL, send WAL, wait for slave", it becomes "write
WAL, fsync WAL, send WAL, wait for slave fsync, write WAL, fsync WAL,
send WAL, wait for slave fsync".  And it's expense is all the time,
rather than just when the "no slave no go" situations arise.

And it doesn't reduce the transactions I need to verify by hand
either, because that waiting/error still only happens at the COMMIT
statement from the client.

--
Aidan Van Dyk                                             Create like a god,
aidan@highrise.ca                                       command like a king,
http://www.highrise.ca/                                   work like a slave.


Re: Sync Rep for 2011CF1

From
Robert Haas
Date:
On Fri, Jan 21, 2011 at 1:59 PM, Aidan Van Dyk <aidan@highrise.ca> wrote:
> Yup.  And I'm OK with that.  In my case, it would be much better to
> have a few quick failures, which can complete automatically a few
> seconds later then to have a big buildup of transactions to re-verify
> by hand upon starting manual processing.

Why would you need to do that?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Sync Rep for 2011CF1

From
Simon Riggs
Date:
On Fri, 2011-01-21 at 13:32 -0500, Robert Haas wrote:

> One idea might be to wait both before and after commit.  If
> allow_standalone_primary is off, and a commit is attempted, we check
> whether there's a slave connected, and if not, wait for one to
> connect.  Then, we write and sync the commit WAL record.  Next, we
> wait for the WAL to be ack'd.  Of course, the standby might disappear
> between the first check and the second, but it would greatly reduce
> the possibility of the master being ahead of the standby after a
> crash, which might be useful for some people.

I like this idea.

I think it would be too invasive to make a check before we insert each
WAL record, as Aidan suggests. Even if we did that, you aren't protected
when a standby goes down because you'll still have written half a
transaction and still be waiting.

So I propose that 

if (!allow_standalone_primary)   ConfirmSyncRepAvailable();

before PreCommit_Notify(). That puts transaction into a wait state that
lasts until a sync rep standby is available. Note that it is before the
actual commit, so if we decide we need to we can cancel those
transactions and have them properly abort.

I won't add that code yet, in case better ideas emerge.

There is no support for preventing connections at startup, so I will
remove that completely, now.

-- Simon Riggs           http://www.2ndQuadrant.com/books/PostgreSQL Development, 24x7 Support, Training and Services



Re: Sync Rep for 2011CF1

From
Robert Haas
Date:
On Sat, Jan 22, 2011 at 8:31 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On Fri, 2011-01-21 at 13:32 -0500, Robert Haas wrote:
>
>> One idea might be to wait both before and after commit.  If
>> allow_standalone_primary is off, and a commit is attempted, we check
>> whether there's a slave connected, and if not, wait for one to
>> connect.  Then, we write and sync the commit WAL record.  Next, we
>> wait for the WAL to be ack'd.  Of course, the standby might disappear
>> between the first check and the second, but it would greatly reduce
>> the possibility of the master being ahead of the standby after a
>> crash, which might be useful for some people.
>
> I like this idea.
>
> I think it would be too invasive to make a check before we insert each
> WAL record, as Aidan suggests. Even if we did that, you aren't protected
> when a standby goes down because you'll still have written half a
> transaction and still be waiting.
>
> So I propose that
>
> if (!allow_standalone_primary)
>    ConfirmSyncRepAvailable();
>
> before PreCommit_Notify(). That puts transaction into a wait state that
> lasts until a sync rep standby is available. Note that it is before the
> actual commit, so if we decide we need to we can cancel those
> transactions and have them properly abort.
>
> I won't add that code yet, in case better ideas emerge.
>
> There is no support for preventing connections at startup, so I will
> remove that completely, now.

Time's running short - do you have an updated patch?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Sync Rep for 2011CF1

From
Robert Haas
Date:
On Sun, Jan 30, 2011 at 11:44 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> Time's running short - do you have an updated patch?

This patch hasn't been updated in more than three weeks.  I assume
this should now be marked Returned with Feedback, and we'll revisit
synchronous replication for 9.2?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Sync Rep for 2011CF1

From
Bruce Momjian
Date:
Robert Haas wrote:
> On Sun, Jan 30, 2011 at 11:44 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> > Time's running short - do you have an updated patch?
> 
> This patch hasn't been updated in more than three weeks.  I assume
> this should now be marked Returned with Feedback, and we'll revisit
> synchronous replication for 9.2?

Seems it is time for someone else to take the patch and complete it? 
Who can do that?

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +


Re: Sync Rep for 2011CF1

From
Simon Riggs
Date:
On Mon, 2011-02-07 at 09:55 -0500, Robert Haas wrote:
> On Sun, Jan 30, 2011 at 11:44 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> > Time's running short - do you have an updated patch?
> 
> This patch hasn't been updated in more than three weeks.  I assume
> this should now be marked Returned with Feedback, and we'll revisit
> synchronous replication for 9.2?

Hi Robert,

I have time to complete that in next two weeks, but you are right I
haven't had it in last few weeks.

Cheers

-- Simon Riggs           http://www.2ndQuadrant.com/books/PostgreSQL Development, 24x7 Support, Training and Services



Re: Sync Rep for 2011CF1

From
Robert Haas
Date:
On Mon, Feb 7, 2011 at 11:33 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On Mon, 2011-02-07 at 09:55 -0500, Robert Haas wrote:
>> On Sun, Jan 30, 2011 at 11:44 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> > Time's running short - do you have an updated patch?
>>
>> This patch hasn't been updated in more than three weeks.  I assume
>> this should now be marked Returned with Feedback, and we'll revisit
>> synchronous replication for 9.2?
>
> I have time to complete that in next two weeks, but you are right I
> haven't had it in last few weeks.

Well, the current CommitFest ends in one week, and we need to leave
time for someone (Heikki, most likely) to review, so there's really
only a couple of days left.

Bruce's suggestion of having someone else pick it up seems like it
might work.  The obvious candidates are probably Heikki Linnakangas,
Tom Lane, Fujii Masao, and (if you squint a little) me.  I am not
clear that any of those people have the necessary time available
immediately, however.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Sync Rep for 2011CF1

From
Robert Haas
Date:
On Mon, Feb 7, 2011 at 12:28 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Mon, Feb 7, 2011 at 11:33 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> On Mon, 2011-02-07 at 09:55 -0500, Robert Haas wrote:
>>> On Sun, Jan 30, 2011 at 11:44 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>>> > Time's running short - do you have an updated patch?
>>>
>>> This patch hasn't been updated in more than three weeks.  I assume
>>> this should now be marked Returned with Feedback, and we'll revisit
>>> synchronous replication for 9.2?
>>
>> I have time to complete that in next two weeks, but you are right I
>> haven't had it in last few weeks.
>
> Well, the current CommitFest ends in one week, and we need to leave
> time for someone (Heikki, most likely) to review, so there's really
> only a couple of days left.
>
> Bruce's suggestion of having someone else pick it up seems like it
> might work.  The obvious candidates are probably Heikki Linnakangas,
> Tom Lane, Fujii Masao, and (if you squint a little) me.  I am not
> clear that any of those people have the necessary time available
> immediately, however.

I just spoke to my manager at EnterpriseDB and he cleared my schedule
for the next two days to work on this.  So I'll go hack on this now.
I haven't read the patch yet so I don't know for sure how quite I'll
be able to get up to speed on it, so if someone who is more familiar
with this code wants to grab the baton away from me, feel free.
Otherwise, I'll see what I can do with it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Sync Rep for 2011CF1

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> ... Well, the current CommitFest ends in one week, ...

Really?  I thought the idea for the last CF of a development cycle was
that it kept going till we'd dealt with everything.  Arbitrarily
rejecting stuff we haven't dealt with doesn't seem fair.
        regards, tom lane


Re: Sync Rep for 2011CF1

From
Simon Riggs
Date:
On Mon, 2011-02-07 at 12:39 -0500, Robert Haas wrote:

> I just spoke to my manager at EnterpriseDB and he cleared my schedule
> for the next two days to work on this.  So I'll go hack on this now.
> I haven't read the patch yet so I don't know for sure how quite I'll
> be able to get up to speed on it, so if someone who is more familiar
> with this code wants to grab the baton away from me, feel free.
> Otherwise, I'll see what I can do with it.

Presumably you have a reason for declaring war? I'm sorry for that, I
really am.

-- Simon Riggs           http://www.2ndQuadrant.com/books/PostgreSQL Development, 24x7 Support, Training and Services



Re: Sync Rep for 2011CF1

From
Robert Haas
Date:
On Mon, Feb 7, 2011 at 12:43 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> ... Well, the current CommitFest ends in one week, ...
>
> Really?  I thought the idea for the last CF of a development cycle was
> that it kept going till we'd dealt with everything.  Arbitrarily
> rejecting stuff we haven't dealt with doesn't seem fair.

Uh, we did that with 8.4 and it was a disaster.  The CommitFest lasted
*five months*. We've been doing schedule-based CommitFests ever since
and it's worked much better.  I agree it's unfair to reject things
without looking at them, and I'd like to avoid that if at all
possible, but punting things because they need more work than can be
done in the time available is another thing entirely.  I do NOT want
to still be working on the items for this CommitFest in June - that's
about when I'd like to be releasing beta3.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Sync Rep for 2011CF1

From
Robert Haas
Date:
On Mon, Feb 7, 2011 at 12:59 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On Mon, 2011-02-07 at 12:39 -0500, Robert Haas wrote:
>
>> I just spoke to my manager at EnterpriseDB and he cleared my schedule
>> for the next two days to work on this.  So I'll go hack on this now.
>> I haven't read the patch yet so I don't know for sure how quite I'll
>> be able to get up to speed on it, so if someone who is more familiar
>> with this code wants to grab the baton away from me, feel free.
>> Otherwise, I'll see what I can do with it.
>
> Presumably you have a reason for declaring war? I'm sorry for that, I
> really am.

What the hell are you talking about?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Sync Rep for 2011CF1

From
Robert Haas
Date:
On Sat, Jan 15, 2011 at 4:40 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> Here's the latest patch for sync rep.

Here is a rebased version of this patch which applies to head of the
master branch.  I haven't tested it yet beyond making sure that it
compiles and passes the regression tests -- but this fixes the bitrot.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment

Re: Sync Rep for 2011CF1

From
Dimitri Fontaine
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> done in the time available is another thing entirely.  I do NOT want
> to still be working on the items for this CommitFest in June - that's
> about when I'd like to be releasing beta3.

Except that's not how we work here.  You want to change that with
respect to the release management process and schedule (or lack
thereof).  Tradition and current practice say you need to reach
consensus to be able to bypass compromising.

Good luck with that.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Re: Sync Rep for 2011CF1

From
"Joshua D. Drake"
Date:
On Mon, 2011-02-07 at 17:59 +0000, Simon Riggs wrote:
> On Mon, 2011-02-07 at 12:39 -0500, Robert Haas wrote:
> 
> > I just spoke to my manager at EnterpriseDB and he cleared my schedule
> > for the next two days to work on this.  So I'll go hack on this now.
> > I haven't read the patch yet so I don't know for sure how quite I'll
> > be able to get up to speed on it, so if someone who is more familiar
> > with this code wants to grab the baton away from me, feel free.
> > Otherwise, I'll see what I can do with it.
> 
> Presumably you have a reason for declaring war? I'm sorry for that, I
> really am.

Simon,

My impression was that Robert had received a release from current
responsibilities to help you with your patch, not that he was declaring
war or some such thing. I believe we all want SyncRep to be successful. 

Sincerely,

JD


-- 
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering
http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt



Re: Sync Rep for 2011CF1

From
Robert Haas
Date:
On Mon, Feb 7, 2011 at 2:09 PM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> done in the time available is another thing entirely.  I do NOT want
>> to still be working on the items for this CommitFest in June - that's
>> about when I'd like to be releasing beta3.
>
> Except that's not how we work here.  You want to change that with
> respect to the release management process and schedule (or lack
> thereof).  Tradition and current practice say you need to reach
> consensus to be able to bypass compromising.
>
> Good luck with that.

I'm not trying to bypass compromising, and I don't know what makes you
think otherwise.  I am trying to ensure that the CommitFest wraps up
in a timely fashion, which is something we have done consistently for
every CommitFest in the 9.0 and 9.1 cycles to date, including the last
CommitFest of the 9.0 cycle.  It is not somehow a deviation from past
community practice to boot patches that can't be finished up in the
time available during the CommitFest.  That has been routine practice
for a long time.

I have worked very hard on this CommitFest, both to line up patch
reviewers and to review myself.  I want to make sure that every patch
gets a good, thorough review before the CommitFest is over.  I think
there is general consensus that this is important and that we will
lose contributors if we don't do it.  However, I don't want to prolong
the CommitFest indefinitely in the face of patches that the authors
are not actively working on or can't finish in the time available, or
where there is no consensus that the proposed change is what we want.
I believe that this, too, is a generally accepted principle in our
community, not something I just made up.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Sync Rep for 2011CF1

From
Josh Berkus
Date:
>> I just spoke to my manager at EnterpriseDB and he cleared my schedule
>> for the next two days to work on this.  So I'll go hack on this now.
>> I haven't read the patch yet so I don't know for sure how quite I'll
>> be able to get up to speed on it, so if someone who is more familiar
>> with this code wants to grab the baton away from me, feel free.
>> Otherwise, I'll see what I can do with it.
> 
> Presumably you have a reason for declaring war? I'm sorry for that, I
> really am.

How is clearing out his whole schedule to help review & fix the patch
declaring war?  You have an odd attitude towards assistance, Simon.

--                                  -- Josh Berkus                                    PostgreSQL Experts Inc.
                        http://www.pgexperts.com
 


Re: Sync Rep for 2011CF1

From
Thom Brown
Date:
On 7 February 2011 18:20, Robert Haas <robertmhaas@gmail.com> wrote:
> On Sat, Jan 15, 2011 at 4:40 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> Here's the latest patch for sync rep.
>
> Here is a rebased version of this patch which applies to head of the
> master branch.  I haven't tested it yet beyond making sure that it
> compiles and passes the regression tests -- but this fixes the bitrot.

"When the primary is started with allow_standalone_primary enabled,
the primary will not allow connections until a standby connects that
also has synchronous_replication enabled. This is a convenience to
ensure that we don't allow connections before write transactions will
return successfully."

Shouldn't this be if allow_standalone_primary is disabled?

Also spotted some indentation, spelling and grammatical issues, which
I've applied to the patch if it's of interest (attached).

--
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935

Attachment

Re: Sync Rep for 2011CF1

From
Robert Haas
Date:
On Mon, Feb 7, 2011 at 2:56 PM, Dave Page <dpage@pgadmin.org> wrote:
> On Mon, Feb 7, 2011 at 6:55 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Mon, Feb 7, 2011 at 12:43 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Robert Haas <robertmhaas@gmail.com> writes:
>>>> ... Well, the current CommitFest ends in one week, ...
>>>
>>> Really?  I thought the idea for the last CF of a development cycle was
>>> that it kept going till we'd dealt with everything.  Arbitrarily
>>> rejecting stuff we haven't dealt with doesn't seem fair.
>>
>> Uh, we did that with 8.4 and it was a disaster.  The CommitFest lasted
>> *five months*. We've been doing schedule-based CommitFests ever since
>> and it's worked much better.
>
> Rejecting stuff because we haven't gotten round to dealing with it in
> such a short period of time is a damn good way to limit the number of
> contributions we get. I don't believe we've agreed at any point that
> the last commitfest should be the same time length as the others

News to me.

http://wiki.postgresql.org/wiki/PostgreSQL_9.1_Development_Plan

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Sync Rep for 2011CF1

From
Dave Page
Date:
On Mon, Feb 7, 2011 at 6:55 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Mon, Feb 7, 2011 at 12:43 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Robert Haas <robertmhaas@gmail.com> writes:
>>> ... Well, the current CommitFest ends in one week, ...
>>
>> Really?  I thought the idea for the last CF of a development cycle was
>> that it kept going till we'd dealt with everything.  Arbitrarily
>> rejecting stuff we haven't dealt with doesn't seem fair.
>
> Uh, we did that with 8.4 and it was a disaster.  The CommitFest lasted
> *five months*. We've been doing schedule-based CommitFests ever since
> and it's worked much better.

Rejecting stuff because we haven't gotten round to dealing with it in
such a short period of time is a damn good way to limit the number of
contributions we get. I don't believe we've agreed at any point that
the last commitfest should be the same time length as the others (when
we originally came up with the commitfest idea, it certainly wasn't
expected), and deciding that without giving people advanced notice is
a really good way to piss them off and encourage them to go work on
other things.

If we're going to put a time limit on this - and I think we should -
we should publish a date ASAP, that gives everyone a fair chance to
finish their work - say, 4 weeks.

Then, if we want to make the last commitfest the same length as the
others next year, we can make that decision and document those plans.

--
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Sync Rep for 2011CF1

From
"Kevin Grittner"
Date:
Robert Haas <robertmhaas@gmail.com> wrote:
> Dave Page <dpage@pgadmin.org> wrote:
>> Robert Haas <robertmhaas@gmail.com> wrote:
>>> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>>> Robert Haas <robertmhaas@gmail.com> writes:
>>>>> ... Well, the current CommitFest ends in one week, ...
>>>>
>>>> Really?  I thought the idea for the last CF of a development
>>>> cycle was that it kept going till we'd dealt with everything. 
>>>> Arbitrarily rejecting stuff we haven't dealt with doesn't seem
>>>> fair.
>>>
>>> Uh, we did that with 8.4 and it was a disaster.  The CommitFest
>>> lasted *five months*. We've been doing schedule-based
>>> CommitFests ever since and it's worked much better.
>>
>> Rejecting stuff because we haven't gotten round to dealing with
>> it in such a short period of time is a damn good way to limit the
>> number of contributions we get. I don't believe we've agreed at
>> any point that the last commitfest should be the same time length
>> as the others
> 
> News to me.
> 
> http://wiki.postgresql.org/wiki/PostgreSQL_9.1_Development_Plan
I believe that with tighter management of the process, it should be
possible to reduce the average delay between someone writing a
feature and that feature appearing in a production release by about
two months without compromising quality.  Getting hypothetical for a
moment, delaying release of 50 features for two months to allow
release of one feature ten months earlier is likely to frustrate a
lot more people than having the train leave the station on time and
putting that one feature into the next release.
My impression was that Robert is trying to find a way to help get
Simon's patch into this release without holding everything up for
it.  In my book, that's not a declaration of war; it's community
spirit.
-Kevin


Re: Sync Rep for 2011CF1

From
Dave Page
Date:
On Mon, Feb 7, 2011 at 8:59 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Mon, Feb 7, 2011 at 2:56 PM, Dave Page <dpage@pgadmin.org> wrote:
>> On Mon, Feb 7, 2011 at 6:55 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>>> On Mon, Feb 7, 2011 at 12:43 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>>> Robert Haas <robertmhaas@gmail.com> writes:
>>>>> ... Well, the current CommitFest ends in one week, ...
>>>>
>>>> Really?  I thought the idea for the last CF of a development cycle was
>>>> that it kept going till we'd dealt with everything.  Arbitrarily
>>>> rejecting stuff we haven't dealt with doesn't seem fair.
>>>
>>> Uh, we did that with 8.4 and it was a disaster.  The CommitFest lasted
>>> *five months*. We've been doing schedule-based CommitFests ever since
>>> and it's worked much better.
>>
>> Rejecting stuff because we haven't gotten round to dealing with it in
>> such a short period of time is a damn good way to limit the number of
>> contributions we get. I don't believe we've agreed at any point that
>> the last commitfest should be the same time length as the others
>
> News to me.
>
> http://wiki.postgresql.org/wiki/PostgreSQL_9.1_Development_Plan

Yes, and? It doesn't say beta 1 at the after a month of the last
commitfest, which is the milestone which marks the end of development.
It says alpha 4, and possibly more alphas. It's pretty clear that it
is expected that development and polishing will continue past the 20th
February.


--
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Sync Rep for 2011CF1

From
Dimitri Fontaine
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> I'm not trying to bypass compromising, and I don't know what makes you
> think otherwise.  I am trying to ensure that the CommitFest wraps up

Well, I'm too tired to allow myself posting such comments, sorry to have
left the previous mail through.  More than one commit fest saw its time
frame extended for 1 or 2 weeks already, I think, all I'm saying is that
this one will certainly not be an exception, and that's for the best.

Be sure I appreciate the efforts you're putting into the mix!

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Re: Sync Rep for 2011CF1

From
Josh Berkus
Date:
On 2/7/11 11:41 AM, Robert Haas wrote:
> However, I don't want to prolong
> the CommitFest indefinitely in the face of patches that the authors
> are not actively working on or can't finish in the time available, or
> where there is no consensus that the proposed change is what we want.
> I believe that this, too, is a generally accepted principle in our
> community, not something I just made up.

+1.

I, for one, would vote against extending beta if Sync Rep isn't ready
yet.  There's plenty of other "big features" in 9.1, and Sync Rep will
benefit from having additional development time given the number of
major spec points we only cleared up a few weeks ago.

I think the majority of our users would prefer a 9.1 in May to one that
has Sync Rep and is delivered in September.  If they had a choice.

Speaking of which, time to do some reviewing ...

--                                  -- Josh Berkus                                    PostgreSQL Experts Inc.
                        http://www.pgexperts.com
 


Re: Sync Rep for 2011CF1

From
Robert Haas
Date:
On Mon, Feb 7, 2011 at 3:06 PM, Dave Page <dpage@pgadmin.org> wrote:
>>> Rejecting stuff because we haven't gotten round to dealing with it in
>>> such a short period of time is a damn good way to limit the number of
>>> contributions we get. I don't believe we've agreed at any point that
>>> the last commitfest should be the same time length as the others
>>
>> News to me.
>>
>> http://wiki.postgresql.org/wiki/PostgreSQL_9.1_Development_Plan
>
> Yes, and? It doesn't say beta 1 at the after a month of the last
> commitfest, which is the milestone which marks the end of development.
> It says alpha 4, and possibly more alphas. It's pretty clear that it
> is expected that development and polishing will continue past the 20th
> February.

You're moving the bar.  It DOES say that the CommitFest will end on
February 15th.  Now, if we want to have a discussion about changing
that, let's have that discussion (perhaps on a thread where the
subject has something to do with the topic), but we DID talk about
this, it WAS agreed, and it's been sitting there on the wiki for
something like 8 months.  Obviously, there will continue to be
polishing after the CommitFest is over, but that's not the same thing
as saying we're going to lengthen the CommitFest itself.

I think we need to step back a few paces here and talk about what
we're trying to accomplish by making some change to the proposed and
agreed CommitFest schedule.  If there's a concern that some patches
haven't been thoroughly reviewed at this point, then I think that's a
fair concern, and let's talk about which ones they are and see what we
can do about it.  I don't believe that's the case, and it's certainly
not the case for sync rep, which was submitted in an unpolished state
by Simon's own admission, reviewed and discussed, then sat for three
weeks without an update.  So perhaps the concern is that sync rep is a
make or break for this release.  OK, then fine, let's talk about
whether it's worth slipping the release for that feature.  I have no
problem with either of those conversations, and I'm happy to offer my
opinions and listen to the opinions of others, and we can make some
decision.

I think, though, that we need to be explicit about what we're doing,
and why we're doing it.  I have been working hard on this CommitFest
for a long time (since approximately a month before it started) at the
cost of development projects I would have liked to have worked on,
because I knew we were going to be overwhelmed with patches.  I have
helped as many people as I can with as many patches as I have been
able to.  I think that finishing on time (or at least as close to on
time as we can manage) is important to our success as a development
community, just as having good features is.  We don't have to agree on
what the best thing to do is, but I would certainly appreciate it if
everyone could at least credit me with acting in good faith.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Sync Rep for 2011CF1

From
"Joshua D. Drake"
Date:
On Mon, 2011-02-07 at 12:24 -0800, Josh Berkus wrote:

> +1.
> 
> I, for one, would vote against extending beta if Sync Rep isn't ready
> yet.  There's plenty of other "big features" in 9.1, and Sync Rep will
> benefit from having additional development time given the number of
> major spec points we only cleared up a few weeks ago.
> 
> I think the majority of our users would prefer a 9.1 in May to one that
> has Sync Rep and is delivered in September.  If they had a choice.
> 

+1

JD

-- 
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering
http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt



Re: Sync Rep for 2011CF1

From
Dave Page
Date:
On Mon, Feb 7, 2011 at 9:25 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> You're moving the bar.  It DOES say that the CommitFest will end on
> February 15th.  Now, if we want to have a discussion about changing
> that, let's have that discussion (perhaps on a thread where the
> subject has something to do with the topic), but we DID talk about
> this, it WAS agreed, and it's been sitting there on the wiki for
> something like 8 months.  Obviously, there will continue to be
> polishing after the CommitFest is over, but that's not the same thing
> as saying we're going to lengthen the CommitFest itself.

I'm not moving the bar - I'm talking practically. Regardless of when
we consider the commitfest itself over, development and commit work of
new features has always continued until beta 1, and that has not
changed as far as I'm aware.

> I think, though, that we need to be explicit about what we're doing,
> and why we're doing it.  I have been working hard on this CommitFest
> for a long time (since approximately a month before it started) at the
> cost of development projects I would have liked to have worked on,
> because I knew we were going to be overwhelmed with patches.  I have
> helped as many people as I can with as many patches as I have been
> able to.  I think that finishing on time (or at least as close to on
> time as we can manage) is important to our success as a development
> community, just as having good features is.  We don't have to agree on
> what the best thing to do is, but I would certainly appreciate it if
> everyone could at least credit me with acting in good faith.

Oh, I have absolutely no doubt you're working in good faith, and
personally I thank you for the hard work you've put in. I just
disagree with your interpretation of the timetable.


--
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Sync Rep for 2011CF1

From
Robert Haas
Date:
On Mon, Feb 7, 2011 at 3:14 PM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> I'm not trying to bypass compromising, and I don't know what makes you
>> think otherwise.  I am trying to ensure that the CommitFest wraps up
>
> Well, I'm too tired to allow myself posting such comments, sorry to have
> left the previous mail through.

Thanks, I understand.

> More than one commit fest saw its time
> frame extended for 1 or 2 weeks already, I think, all I'm saying is that
> this one will certainly not be an exception, and that's for the best.

We've actually done really well.  The last CommitFest in 9.0 wrapped
up on 2/17 (two days late), and the others were mostly right on time
as well.  The CommitFests for 9.1 ended on: 8/15 (on time), 10/26 (9
days late, but there was no activity on the last two of those days, so
say 7 days late), and 12/21 (six days late).  As far as I can tell,
the difference primarily has to do with who manages the CommitFests
and how aggressively they follow up on patches that are dropped.  The
last CommitFest we have that really ran late was the final CommitFest
of the 8.4 cycle, and it was that event that led me to accept Josh
Berkus's invitation to be a CF manager for the first 9.0 CommitFest.
Because five month CommitFests with the tree frozen are awful and
sucky for everyone except the people who are getting extra time to
finish their patches, and they aren't really that great for those
people either.

As far as I am concerned, everything from now until we've released a
stable beta with no known issues is time that I can't spend doing
development.  So I'd like to minimize that time - not by arbitrarily
throwing patches out the window - but by a combination of postponing
patches that are not done and working my ass off to finish as much as
possible.

> Be sure I appreciate the efforts you're putting into the mix!

Thanks.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Sync Rep for 2011CF1

From
Robert Haas
Date:
On Mon, Feb 7, 2011 at 3:34 PM, Dave Page <dpage@pgadmin.org> wrote:
> On Mon, Feb 7, 2011 at 9:25 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> You're moving the bar.  It DOES say that the CommitFest will end on
>> February 15th.  Now, if we want to have a discussion about changing
>> that, let's have that discussion (perhaps on a thread where the
>> subject has something to do with the topic), but we DID talk about
>> this, it WAS agreed, and it's been sitting there on the wiki for
>> something like 8 months.  Obviously, there will continue to be
>> polishing after the CommitFest is over, but that's not the same thing
>> as saying we're going to lengthen the CommitFest itself.
>
> I'm not moving the bar - I'm talking practically. Regardless of when
> we consider the commitfest itself over, development and commit work of
> new features has always continued until beta 1, and that has not
> changed as far as I'm aware.

I don't think that's really true.  Go back and read the output of 'git
log REL9_0_BETA1'.  It's bug fixes, rearrangements of things that were
committed but turned out to be controversial, documentation work,
release note editing, pgindent crap...  sure, it wasn't a totally hard
freeze, but it was pretty solid slush.  We did a good job not letting
things drag out, and FWIW I think that was a good decision.  I don't
remember too many people being unhappy about their patches getting
punted, either.  There were one or two, but generally we punted things
that needed major rework or just weren't getting updated in a timely
fashion, and that, combined with a lot of hard work on Tom's part
among others, worked fine.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Sync Rep for 2011CF1

From
Dave Page
Date:
On Mon, Feb 7, 2011 at 9:46 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Mon, Feb 7, 2011 at 3:34 PM, Dave Page <dpage@pgadmin.org> wrote:
>> On Mon, Feb 7, 2011 at 9:25 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>>> You're moving the bar.  It DOES say that the CommitFest will end on
>>> February 15th.  Now, if we want to have a discussion about changing
>>> that, let's have that discussion (perhaps on a thread where the
>>> subject has something to do with the topic), but we DID talk about
>>> this, it WAS agreed, and it's been sitting there on the wiki for
>>> something like 8 months.  Obviously, there will continue to be
>>> polishing after the CommitFest is over, but that's not the same thing
>>> as saying we're going to lengthen the CommitFest itself.
>>
>> I'm not moving the bar - I'm talking practically. Regardless of when
>> we consider the commitfest itself over, development and commit work of
>> new features has always continued until beta 1, and that has not
>> changed as far as I'm aware.
>
> I don't think that's really true.  Go back and read the output of 'git
> log REL9_0_BETA1'.  It's bug fixes, rearrangements of things that were
> committed but turned out to be controversial, documentation work,
> release note editing, pgindent crap...  sure, it wasn't a totally hard
> freeze, but it was pretty solid slush.  We did a good job not letting
> things drag out, and FWIW I think that was a good decision.  I don't
> remember too many people being unhappy about their patches getting
> punted, either.  There were one or two, but generally we punted things
> that needed major rework or just weren't getting updated in a timely
> fashion, and that, combined with a lot of hard work on Tom's part
> among others, worked fine.

I guess we disagree on what we consider to be "development" then. Just
looking back to April, I see various committers whacking things around
that look to me like the fine tuning and completion of earlier
patches.

Oh - and just so we're clear... I too want us to get the release out
promptly, I'm just concerned that we don't blindside developers.

--
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Sync Rep for 2011CF1

From
Chris Browne
Date:
dpage@pgadmin.org (Dave Page) writes:
> On Mon, Feb 7, 2011 at 6:55 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Mon, Feb 7, 2011 at 12:43 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Robert Haas <robertmhaas@gmail.com> writes:
>>>> ... Well, the current CommitFest ends in one week, ...
>>>
>>> Really?  I thought the idea for the last CF of a development cycle was
>>> that it kept going till we'd dealt with everything.  Arbitrarily
>>> rejecting stuff we haven't dealt with doesn't seem fair.
>>
>> Uh, we did that with 8.4 and it was a disaster.  The CommitFest lasted
>> *five months*. We've been doing schedule-based CommitFests ever since
>> and it's worked much better.
>
> Rejecting stuff because we haven't gotten round to dealing with it in
> such a short period of time is a damn good way to limit the number of
> contributions we get. I don't believe we've agreed at any point that
> the last commitfest should be the same time length as the others (when
> we originally came up with the commitfest idea, it certainly wasn't
> expected), and deciding that without giving people advanced notice is
> a really good way to piss them off and encourage them to go work on
> other things.
>
> If we're going to put a time limit on this - and I think we should -
> we should publish a date ASAP, that gives everyone a fair chance to
> finish their work - say, 4 weeks.
>
> Then, if we want to make the last commitfest the same length as the
> others next year, we can make that decision and document those plans.

There *is* a problem that there doesn't seem to be enough time to
readily allow development of larger features without people getting
stuck fighting with the release periods.  But that's not the problem
taking place here.  It was documented, last May, that the final
CommitFest for 9.1 was to complete 2011-02-15, and there did seem to be
agreement on that.

It sure looks to me like there are going to be a bunch of items that,
based on the recognized policies, need to get deferred to 9.2, and the
prospects for Sync Rep getting into 9.1 don't look notably good to me.

Looking at things statistically, the 9.1 commitfests have had the
following numbers of items:
 #1 - 2010-09 - 52, of which 26 were committed #2 - 2010-11 - 43, of which 23 were committed #3 - 2011-01 - 98, of
which35 have been committed, and 10 are                considered ready to commit.
 

It may appear unfair to not offer everyone a "fair chance to finish
their work," but it's not as if the date wasn't published Plenty Long
Ago. and well-publicized.

But deferring the end of the CommitFest would be Not Fair to those that
*did* get their proposed changes ready for the preceding Fests.  We
cannot evade unfairness.

It's definitely readily arguable that fairness requires that:
- Items not committable by 2011-02-15 be deferred to the 2011-Next fest
  There are around 25 items right now that are sitting with [Waiting  for Author] and [Returned with Feedback]
statuses. They largely seem  like pretty fair game for "next fest."
 
- Large items that weren't included in the 2010-11 fest be considered  problematic to try to integrate into 9.1
  There sure seem to be some large items in the 2011-01 fest, which I  thought wasn't supposed to be the case.

We shouldn't just impose policy for the sake of imposing policy, but I
do recall Really Long CommitFests being pretty disastrous.  And there's
*SO* much outstanding in this particular fest that it's getting past
time for doing some substantial triage so that reviewer attentions may
be directed towards the items most likely to be acceptable for 9.1.

I hate to think that 9.1 won't include Simon's SR material, but that may
have to be.
-- 
http://www3.sympatico.ca/cbbrowne/slony.html
"It's a pretty rare beginner who isn't clueless.  If beginners weren't
clueless, the infamous Unix learning cliff wouldn't be a problem."
-- david parsons


Re: Sync Rep for 2011CF1

From
Simon Riggs
Date:
On Mon, 2011-02-07 at 15:25 -0500, Robert Haas wrote:

> I would certainly appreciate it if
> everyone could at least credit me with acting in good faith.

I think you are, if that helps.

-- Simon Riggs           http://www.2ndQuadrant.com/books/PostgreSQL Development, 24x7 Support, Training and Services



Re: Sync Rep for 2011CF1

From
Simon Riggs
Date:
On Mon, 2011-02-07 at 11:50 -0800, Josh Berkus wrote:
> >> I just spoke to my manager at EnterpriseDB and he cleared my schedule
> >> for the next two days to work on this.  So I'll go hack on this now.
> >> I haven't read the patch yet so I don't know for sure how quite I'll
> >> be able to get up to speed on it, so if someone who is more familiar
> >> with this code wants to grab the baton away from me, feel free.
> >> Otherwise, I'll see what I can do with it.
> > 
> > Presumably you have a reason for declaring war? I'm sorry for that, I
> > really am.
> 
> How is clearing out his whole schedule to help review & fix the patch
> declaring war?  You have an odd attitude towards assistance, Simon.

It seems likely that Robert had not read my reply where I said I had
time to work on this project before posting. In that case, I withdraw my
comments and apologise.

-- Simon Riggs           http://www.2ndQuadrant.com/books/PostgreSQL Development, 24x7 Support, Training and Services



Re: Sync Rep for 2011CF1

From
Robert Haas
Date:
On Mon, Feb 7, 2011 at 5:16 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On Mon, 2011-02-07 at 11:50 -0800, Josh Berkus wrote:
>> >> I just spoke to my manager at EnterpriseDB and he cleared my schedule
>> >> for the next two days to work on this.  So I'll go hack on this now.
>> >> I haven't read the patch yet so I don't know for sure how quite I'll
>> >> be able to get up to speed on it, so if someone who is more familiar
>> >> with this code wants to grab the baton away from me, feel free.
>> >> Otherwise, I'll see what I can do with it.
>> >
>> > Presumably you have a reason for declaring war? I'm sorry for that, I
>> > really am.
>>
>> How is clearing out his whole schedule to help review & fix the patch
>> declaring war?  You have an odd attitude towards assistance, Simon.
>
> It seems likely that Robert had not read my reply where I said I had
> time to work on this project before posting. In that case, I withdraw my
> comments and apologise.

I did read it, but I still don't think saying I'm going to work on a
patch that's been stalled for weeks - or really months - constitutes
any sort of declaration of war, peace, or anything else.  I have just
as much of a right to work on a given feature as you or anyone else
does.  Typically, when I work on patches and help get them committed,
the response is "thanks".  I'm not so sure what's different in this
case.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Sync Rep for 2011CF1

From
Robert Haas
Date:
On Mon, Feb 7, 2011 at 1:20 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Sat, Jan 15, 2011 at 4:40 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> Here's the latest patch for sync rep.
>
> Here is a rebased version of this patch which applies to head of the
> master branch.  I haven't tested it yet beyond making sure that it
> compiles and passes the regression tests -- but this fixes the bitrot.

As I mentioned yesterday that I would do, I spent some time working on
this.  I think that there are somewhere between three and six
independently useful features in this patch, plus a few random changes
to the documentation that I'm not sure whether want or not (e.g.
replacing master by primary in a few places, or the other way around).

One problem with the core synchronous replication technology is that
walreceiver cannot both receive WAL and write WAL at the same time.
It switches back and forth between reading WAL from the network socket
and flushing it to disk.  The impact of that is somewhat mitigated in
the current patch because it only implements the "fsync" level of
replication, and chances are that the network read time is small
compared to the fsync time.  But it would certainly suck for the
"receive" level we've talked about having in the past, because after
receiving each batch of WAL, the WAL receiver wouldn't be able to send
any more acknowledgments until the fsync completed, and that's bound
to be slow.  I'm not really sure how bad it will be in "fsync" mode;
it may be tolerable, but as Simon noted in a comment, in the long run
it'd certainly be nicer to have the WAL writer process running during
recovery.

As a general comment on the quality of the code, I think that the
overall logic is probably sound, but there are an awful lot of
debugging leftovers and inconsistencies between various parts of the
patch.  For example, when I initially tested it, *asynchronous*
replication kept breaking between the master and the standby, and I
couldn't figure out why.  I finally realized that there was a ten
second pause that had been inserting into the WAL receiver loop as a
debugging tool which was allowing the standby to get far enough behind
that the master was able to recycle WAL segments the standby still
needed.  Under ordinary circumstances, I would say that a patch like
this was not mature enough to submit for review, let alone commit.
For that reason, I am pretty doubtful about the chances of getting
this finished for 9.1 without some substantial prolongation of the
schedule.

That having been said, there is at least one part of this patch which
looks to be in pretty good shape and seems independently useful
regardless of what happens to the rest of it, and that is the code
that sends replies from the standby back to the primary.  This allows
pg_stat_replication to display the write/flush/apply log positions on
the standby next to the sent position on the primary, which as far as
I am concerned is pure gold.  Simon had this set up to happen only
when synchronous replication or XID feedback in use, but I think
people are going to want it even with plain old asynchronous
replication, because it provides a FAR easier way to monitor standby
lag than anything we have today.  I've extracted this portion of the
patch, cleaned it up a bit, written docs, and attached it here.

I wasn't too sure how to control the timing of the replies.  It's
worth noting that you have to send them pretty frequently for the
distinction between xlog written and xlog flushed to have any value.
What I've done here is made it so that every time we read all
available data on the socket, we send a reply.  After flushing, we
send another reply.  And then just for the heck of it we send a reply
at least every 10 seconds (configurable), which causes the
last-known-apply position to eventually get updated on the master.
This means the apply position can lag reality by a bit.  Simon's
version adds a latch, so that the startup process can poke the WAL
receiver to send a reply when the apply position moves.  But this is
substantially more complex and I'm not sure it's worth it.  If we were
implementing the "apply" level of synchronized replication, we'd
clearly need that for performance not to stink.  But since the patch
is only implementing "fsync" anyway, it doesn't seem necessary for
now.

The only real complaint I can imagine about offering this
functionality all the time is that it uses extra bandwidth.  I'm
inclined to think that the ability to shut it off completely is
sufficient answer to that complaint.

<dons asbestos underwear>

Comments?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment

Re: Sync Rep for 2011CF1

From
Magnus Hagander
Date:
On Tue, Feb 8, 2011 at 19:53, Robert Haas <robertmhaas@gmail.com> wrote:
> On Mon, Feb 7, 2011 at 1:20 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Sat, Jan 15, 2011 at 4:40 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>>> Here's the latest patch for sync rep.
>>
>> Here is a rebased version of this patch which applies to head of the
>> master branch.  I haven't tested it yet beyond making sure that it
>> compiles and passes the regression tests -- but this fixes the bitrot.
>
> As I mentioned yesterday that I would do, I spent some time working on
> this.  I think that there are somewhere between three and six
> independently useful features in this patch, plus a few random changes
> to the documentation that I'm not sure whether want or not (e.g.
> replacing master by primary in a few places, or the other way around).
>
> One problem with the core synchronous replication technology is that
> walreceiver cannot both receive WAL and write WAL at the same time.
> It switches back and forth between reading WAL from the network socket
> and flushing it to disk.  The impact of that is somewhat mitigated in
> the current patch because it only implements the "fsync" level of
> replication, and chances are that the network read time is small
> compared to the fsync time.  But it would certainly suck for the
> "receive" level we've talked about having in the past, because after
> receiving each batch of WAL, the WAL receiver wouldn't be able to send
> any more acknowledgments until the fsync completed, and that's bound
> to be slow.  I'm not really sure how bad it will be in "fsync" mode;
> it may be tolerable, but as Simon noted in a comment, in the long run
> it'd certainly be nicer to have the WAL writer process running during
> recovery.
>
> As a general comment on the quality of the code, I think that the
> overall logic is probably sound, but there are an awful lot of
> debugging leftovers and inconsistencies between various parts of the
> patch.  For example, when I initially tested it, *asynchronous*
> replication kept breaking between the master and the standby, and I
> couldn't figure out why.  I finally realized that there was a ten
> second pause that had been inserting into the WAL receiver loop as a
> debugging tool which was allowing the standby to get far enough behind
> that the master was able to recycle WAL segments the standby still
> needed.  Under ordinary circumstances, I would say that a patch like
> this was not mature enough to submit for review, let alone commit.
> For that reason, I am pretty doubtful about the chances of getting
> this finished for 9.1 without some substantial prolongation of the
> schedule.
>
> That having been said, there is at least one part of this patch which
> looks to be in pretty good shape and seems independently useful
> regardless of what happens to the rest of it, and that is the code
> that sends replies from the standby back to the primary.  This allows
> pg_stat_replication to display the write/flush/apply log positions on
> the standby next to the sent position on the primary, which as far as
> I am concerned is pure gold.  Simon had this set up to happen only
> when synchronous replication or XID feedback in use, but I think
> people are going to want it even with plain old asynchronous
> replication, because it provides a FAR easier way to monitor standby
> lag than anything we have today.  I've extracted this portion of the
> patch, cleaned it up a bit, written docs, and attached it here.

+1. I haven't actually looked at the patch, but having this ability
would be *great*.

I also agree with the general idea of trying to break it into smaller
parts - even if they only provide small parts each on it's own. That
also makes it easier to get an overview of exactly how much is left,
to see where to focus.


> The only real complaint I can imagine about offering this
> functionality all the time is that it uses extra bandwidth.  I'm
> inclined to think that the ability to shut it off completely is
> sufficient answer to that complaint.

Yes, agreed.

I would usually not worry about the bandwidth, really, I'd be more
worried about potentially increasing latency somewhere.


> <dons asbestos underwear>

The ones with little rocketships on them?


--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: Sync Rep for 2011CF1

From
Robert Haas
Date:
On Tue, Feb 8, 2011 at 2:34 PM, Magnus Hagander <magnus@hagander.net> wrote:
> I would usually not worry about the bandwidth, really, I'd be more
> worried about potentially increasing latency somewhere.

The time to read and write the socket doesn't seem like it should be
significant, unless the network buffers fill up.... or I'm missing
something.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Sync Rep for 2011CF1

From
Robert Haas
Date:
On Tue, Feb 8, 2011 at 2:34 PM, Magnus Hagander <magnus@hagander.net> wrote:
> I also agree with the general idea of trying to break it into smaller
> parts - even if they only provide small parts each on it's own. That
> also makes it easier to get an overview of exactly how much is left,
> to see where to focus.

And on that note, here's the rest of the patch back, rebased over what
I posted ~90 minutes ago.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment

Re: Sync Rep for 2011CF1

From
"Joshua D. Drake"
Date:
On Tue, 2011-02-08 at 13:53 -0500, Robert Haas wrote:

> That having been said, there is at least one part of this patch which
> looks to be in pretty good shape and seems independently useful
> regardless of what happens to the rest of it, and that is the code
> that sends replies from the standby back to the primary.  This allows
> pg_stat_replication to display the write/flush/apply log positions on
> the standby next to the sent position on the primary, which as far as
> I am concerned is pure gold.  Simon had this set up to happen only
> when synchronous replication or XID feedback in use, but I think
> people are going to want it even with plain old asynchronous
> replication, because it provides a FAR easier way to monitor standby
> lag than anything we have today.  I've extracted this portion of the
> patch, cleaned it up a bit, written docs, and attached it here.

Score! +1

JD

-- 
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering
http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt



Re: Sync Rep for 2011CF1

From
Fujii Masao
Date:
On Wed, Feb 9, 2011 at 3:53 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> That having been said, there is at least one part of this patch which
> looks to be in pretty good shape and seems independently useful
> regardless of what happens to the rest of it, and that is the code
> that sends replies from the standby back to the primary.  This allows
> pg_stat_replication to display the write/flush/apply log positions on
> the standby next to the sent position on the primary, which as far as
> I am concerned is pure gold.  Simon had this set up to happen only
> when synchronous replication or XID feedback in use, but I think
> people are going to want it even with plain old asynchronous
> replication, because it provides a FAR easier way to monitor standby
> lag than anything we have today.  I've extracted this portion of the
> patch, cleaned it up a bit, written docs, and attached it here.

What about also sending back the timestamp of the last applied
transaction? That's more user-friendly than the apply location
when we calculate the lag of replication, I think.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Re: Sync Rep for 2011CF1

From
Peter Eisentraut
Date:
On mån, 2011-02-07 at 12:55 -0500, Robert Haas wrote:
> On Mon, Feb 7, 2011 at 12:43 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Robert Haas <robertmhaas@gmail.com> writes:
> >> ... Well, the current CommitFest ends in one week, ...
> >
> > Really?  I thought the idea for the last CF of a development cycle was
> > that it kept going till we'd dealt with everything.  Arbitrarily
> > rejecting stuff we haven't dealt with doesn't seem fair.
> 
> Uh, we did that with 8.4 and it was a disaster.  The CommitFest lasted
> *five months*. We've been doing schedule-based CommitFests ever since
> and it's worked much better.

The previous three commit fests contained about 50 patches each and
lasted one month each.  The current commit fest contains about 100
patches, so it shouldn't be surprising that it will take about 2 months
to get through it.

Moreover, under the current process, it is apparent that reviewing is
the bottleneck.  More code gets written than gets reviewed.  By
insisting on the current schedule, we would just push the growing review
backlog ahead of ourselves.  The solution (at least short-term, while
maintaining the process) has to be to increase the resources (in
practice: time) dedicated to reviewing relative to coding.




Re: Sync Rep for 2011CF1

From
Andrew Dunstan
Date:

On 02/09/2011 07:53 AM, Peter Eisentraut wrote:
> On mån, 2011-02-07 at 12:55 -0500, Robert Haas wrote:
>> On Mon, Feb 7, 2011 at 12:43 PM, Tom Lane<tgl@sss.pgh.pa.us>  wrote:
>>> Robert Haas<robertmhaas@gmail.com>  writes:
>>>> ... Well, the current CommitFest ends in one week, ...
>>> Really?  I thought the idea for the last CF of a development cycle was
>>> that it kept going till we'd dealt with everything.  Arbitrarily
>>> rejecting stuff we haven't dealt with doesn't seem fair.
>> Uh, we did that with 8.4 and it was a disaster.  The CommitFest lasted
>> *five months*. We've been doing schedule-based CommitFests ever since
>> and it's worked much better.
> The previous three commit fests contained about 50 patches each and
> lasted one month each.  The current commit fest contains about 100
> patches, so it shouldn't be surprising that it will take about 2 months
> to get through it.
>
> Moreover, under the current process, it is apparent that reviewing is
> the bottleneck.  More code gets written than gets reviewed.  By
> insisting on the current schedule, we would just push the growing review
> backlog ahead of ourselves.  The solution (at least short-term, while
> maintaining the process) has to be to increase the resources (in
> practice: time) dedicated to reviewing relative to coding.


Personally I think it's not unreasonable to extend the final commitfest 
of the release some. It doesn't need to be a huge amount longer, 
certainly not five months, but a couple of weeks to a month might be fair.

cheers

andrew


Re: Sync Rep for 2011CF1

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> On 02/09/2011 07:53 AM, Peter Eisentraut wrote:
>> The previous three commit fests contained about 50 patches each and
>> lasted one month each.  The current commit fest contains about 100
>> patches, so it shouldn't be surprising that it will take about 2 months
>> to get through it.

> Personally I think it's not unreasonable to extend the final commitfest 
> of the release some. It doesn't need to be a huge amount longer, 
> certainly not five months, but a couple of weeks to a month might be fair.

Yeah.  IIRC, in our first cycle using the CF process, we expected the
last CF to take longer than others.  I am not sure where the idea came
from that we'd be able to finish this one in a month.

I do accept the fact that we mustn't let it drag on indefinitely.
But two months instead of one isn't indefinite, and it seems more
realistic given the amount of work to be done.
        regards, tom lane


Re: Sync Rep for 2011CF1

From
Robert Haas
Date:
On Wed, Feb 9, 2011 at 9:42 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> On 02/09/2011 07:53 AM, Peter Eisentraut wrote:
>>> The previous three commit fests contained about 50 patches each and
>>> lasted one month each.  The current commit fest contains about 100
>>> patches, so it shouldn't be surprising that it will take about 2 months
>>> to get through it.
>
>> Personally I think it's not unreasonable to extend the final commitfest
>> of the release some. It doesn't need to be a huge amount longer,
>> certainly not five months, but a couple of weeks to a month might be fair.
>
> Yeah.  IIRC, in our first cycle using the CF process, we expected the
> last CF to take longer than others.  I am not sure where the idea came
> from that we'd be able to finish this one in a month.

It came from the fact that we did it last time.

> I do accept the fact that we mustn't let it drag on indefinitely.
> But two months instead of one isn't indefinite, and it seems more
> realistic given the amount of work to be done.

The work will expand to fill the time available.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Sync Rep for 2011CF1

From
Robert Haas
Date:
On Wed, Feb 9, 2011 at 7:53 AM, Peter Eisentraut <peter_e@gmx.net> wrote:
> Moreover, under the current process, it is apparent that reviewing is
> the bottleneck.  More code gets written than gets reviewed.  By
> insisting on the current schedule, we would just push the growing review
> backlog ahead of ourselves.  The solution (at least short-term, while
> maintaining the process) has to be to increase the resources (in
> practice: time) dedicated to reviewing relative to coding.

Yep.  People who submit patches must also review patches if they want
their own stuff reviewed.

It sounds to me like what's being proposed is that I should spend
another month working on other people's patches, while they work on
their own patches.  I can't get excited about that.  The situation
with reviewing has gotten totally out of hand.  I review and commit
more patches as part of each CommitFest than anyone except Tom, and I
think there have been some CommitFests where I did more patches than
he did (though he still wins by a mile if you factor in patch
complexity).   But on the flip side, I can't always get a reviewer for
my own patches, or sometimes I get a perfunctory review that someone
spent ten minutes on.  Huh?

So I heartily approve of the suggestion that we need to devote more
energy to reviewing, if it means "more reviewing by the people who are
not me".  And allow me to suggest that that energy get put in NOW,
rather than a month from now.  Most of the patches that still need
review are not that complicated.  At least half of them could probably
be meaningfully reviewed in an hour or two.  Then the author could
post an update tomorrow.  Then the reviewer could spend another 30
minutes and mark them ready for committer.  Next!

There are certainly some patches in this CommitFest that need more
attention than that, and that probably need the attention of a senior
community member.  Jeff's range types patch and Alvaro's key lock
patch are two of those.  And I would be willing to do that, except
that I'm already listed as a reviewer for FOURTEEN PATCHES this
CommitFest, plus I committed some others that someone else reviewed
and am also functioning as CommitFest manager.  The problem isn't so
much the amount of calendar time that's required to get through 100
patches as the many people either submit half-baked code and assume
that they or someone else will fix it later, or else they submit code
but don't do an amount of review work equal to the amount of review
work they generate.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Sync Rep for 2011CF1

From
"David E. Wheeler"
Date:
On Feb 9, 2011, at 9:20 AM, Robert Haas wrote:

> There are certainly some patches in this CommitFest that need more
> attention than that, and that probably need the attention of a senior
> community member.  Jeff's range types patch and Alvaro's key lock
> patch are two of those.  And I would be willing to do that, except
> that I'm already listed as a reviewer for FOURTEEN PATCHES this
> CommitFest, plus I committed some others that someone else reviewed
> and am also functioning as CommitFest manager.  The problem isn't so
> much the amount of calendar time that's required to get through 100
> patches as the many people either submit half-baked code and assume
> that they or someone else will fix it later, or else they submit code
> but don't do an amount of review work equal to the amount of review
> work they generate.

Frankly, I think you should surrender some of those 14 and cajole some other folks to take on more.

Best,

David

Re: Sync Rep for 2011CF1

From
Robert Haas
Date:
On Wed, Feb 9, 2011 at 1:09 PM, David E. Wheeler <david@kineticode.com> wrote:
> Frankly, I think you should surrender some of those 14 and cajole some other folks to take on more.

Happily...  only trouble is, I suck at cajoling.  Even my begging is
distinctly sub-par.

Pleeeeeeeease?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Sync Rep for 2011CF1

From
"David E. Wheeler"
Date:
On Feb 9, 2011, at 10:29 AM, Robert Haas wrote:

>> Frankly, I think you should surrender some of those 14 and cajole some other folks to take on more.
>
> Happily...  only trouble is, I suck at cajoling.  Even my begging is
> distinctly sub-par.
>
> Pleeeeeeeease?

Try this:

“Listen up, bitches! I'm tired of Tom and me having to do all the work. All of you who submitted patches need to review
someother patches! If you haven't submitted a review for someone else's patch by commitfest end, your patches will be
marked"returned."” 

Then maybe cuff Jeff or Alvaro or someone, to show you mean business.

HTH,

David



Re: Sync Rep for 2011CF1

From
Stephen Frost
Date:
* Robert Haas (robertmhaas@gmail.com) wrote:
> On Wed, Feb 9, 2011 at 1:09 PM, David E. Wheeler <david@kineticode.com> wrote:
> > Frankly, I think you should surrender some of those 14 and cajole some other folks to take on more.
>
> Happily...  only trouble is, I suck at cajoling.  Even my begging is
> distinctly sub-par.
>
> Pleeeeeeeease?

Erm, I've been through the commitfest app a couple of different times,
but have ignored things which are marked 'Needs Reivew' when there's a
reviewer listed...

If there are patches where you're marked as the reviewer but you don't
have time to review them or want help, take your name off as a reviewer
for them and/or speak up and explicitly ask for help.  I'm not going to
start reviewing something if I think someone else is already working on
it..
Thanks,
    Stephen

Re: Sync Rep for 2011CF1

From
Robert Haas
Date:
On Wed, Feb 9, 2011 at 1:32 PM, David E. Wheeler <david@kineticode.com> wrote:
> On Feb 9, 2011, at 10:29 AM, Robert Haas wrote:
>>> Frankly, I think you should surrender some of those 14 and cajole some other folks to take on more.
>>
>> Happily...  only trouble is, I suck at cajoling.  Even my begging is
>> distinctly sub-par.
>>
>> Pleeeeeeeease?
>
> Try this:
>
> “Listen up, bitches! I'm tired of Tom and me having to do all the work. All of you who submitted patches need to
reviewsome other patches! If you haven't submitted a review for someone else's patch by commitfest end, your patches
willbe marked "returned."” 
>
> Then maybe cuff Jeff or Alvaro or someone, to show you mean business.

That tends not to get a lot of community support, and it isn't my
intention anyway.  We actually do not need to impose a draconian rule;
we just need everyone to put in a little extra effort to get us over
the hump.

But speaking of that, I just so happen to notice you haven't signed up
to review any patches this CF.  How about grabbing one or two?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Sync Rep for 2011CF1

From
"David E. Wheeler"
Date:
On Feb 9, 2011, at 10:56 AM, Robert Haas wrote:

>> “Listen up, bitches! I'm tired of Tom and me having to do all the work. All of you who submitted patches need to
reviewsome other patches! If you haven't submitted a review for someone else's patch by commitfest end, your patches
willbe marked "returned."” 
>>
>> Then maybe cuff Jeff or Alvaro or someone, to show you mean business.
>
> That tends not to get a lot of community support, and it isn't my
> intention anyway.  We actually do not need to impose a draconian rule;
> we just need everyone to put in a little extra effort to get us over
> the hump.

Agreed. Let me remove my tongue from my cheek.

> But speaking of that, I just so happen to notice you haven't signed up
> to review any patches this CF.  How about grabbing one or tw

ha ha! Alas, I'm completely overcommitted at this point. Been having a hard time making time for PGXN. I've been
trackingthe extension stuff closely, though, as you can imagine. 

Looking at the patches without reviewers anyway, frankly none look like the sorts of things I have the expertise to
testin any but the most superficial way. Are there more that should have the reviewer removed? If there were one I
couldgive a couple of hours to and speak with some knowledge, I could fix up some time next week. 

Best,

David

Re: Sync Rep for 2011CF1

From
Robert Haas
Date:
On Wed, Feb 9, 2011 at 2:01 PM, David E. Wheeler <david@kineticode.com> wrote:
> ha ha! Alas, I'm completely overcommitted at this point. Been having a hard time making time for PGXN. I've been
trackingthe extension stuff closely, though, as you can imagine. 

It's a common problem, and of course none of us are in a position to
dictate how other people spend their time.  But the issue on the table
is whether we want PostgreSQL 9.1 to be released in 2011.  If yes,
then without making any statements about what any particular person
has to or must do, we collectively need to step it up a notch or two.

> Looking at the patches without reviewers anyway, frankly none look like the sorts of things I have the expertise to
testin any but the most superficial way. Are there more that should have the reviewer removed? If there were one I
couldgive a couple of hours to and speak with some knowledge, I could fix up some time next week. 

I just sent a note on some that seem like they could use more looking
at, but there may be other ones too.  Now is not the time to hold back
because you think someone else might be working on it.  Most of the
time, the fact that a patch has a reviewer means that they either
intended to or actually did review it at some point in time, but not
that they are necessarily working on it right this minute, and
certainly not that other input isn't welcome.  This is especially true
towards the end of the CommitFest or when the thread hasn't had
anything new posted to it for several days.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Sync Rep for 2011CF1

From
Fujii Masao
Date:
On Wed, Feb 9, 2011 at 5:25 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Feb 8, 2011 at 2:34 PM, Magnus Hagander <magnus@hagander.net> wrote:
>> I also agree with the general idea of trying to break it into smaller
>> parts - even if they only provide small parts each on it's own. That
>> also makes it easier to get an overview of exactly how much is left,
>> to see where to focus.
>
> And on that note, here's the rest of the patch back, rebased over what
> I posted ~90 minutes ago.

Though I haven't read the patch enough yet, I have one review comment.

While walsender uses the non-blocking I/O function (i.e.,
pq_getbyte_if_available)
for the receive, it uses the blocking one (i.e., pq_flush, etc) for the send.
So, sync_rep_timeout_server would not work well when the walsender
gets blocked in sending WAL. This is one the problems which I struggled
with when I created the SyncRep patch before. I think that we need to
introduce the non-blocking send function for the replication timeout.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Re: Sync Rep for 2011CF1

From
Heikki Linnakangas
Date:
On 08.02.2011 20:53, Robert Haas wrote:
> That having been said, there is at least one part of this patch which
> looks to be in pretty good shape and seems independently useful
> regardless of what happens to the rest of it, and that is the code
> that sends replies from the standby back to the primary.  This allows
> pg_stat_replication to display the write/flush/apply log positions on
> the standby next to the sent position on the primary, which as far as
> I am concerned is pure gold.  Simon had this set up to happen only
> when synchronous replication or XID feedback in use, but I think
> people are going to want it even with plain old asynchronous
> replication, because it provides a FAR easier way to monitor standby
> lag than anything we have today.  I've extracted this portion of the
> patch, cleaned it up a bit, written docs, and attached it here.

Thanks!

> I wasn't too sure how to control the timing of the replies.  It's
> worth noting that you have to send them pretty frequently for the
> distinction between xlog written and xlog flushed to have any value.
> What I've done here is made it so that every time we read all
> available data on the socket, we send a reply.  After flushing, we
> send another reply.  And then just for the heck of it we send a reply
> at least every 10 seconds (configurable), which causes the
> last-known-apply position to eventually get updated on the master.
> This means the apply position can lag reality by a bit.

Seems reasonable. As the patch stands, however, the standby doesn't send 
any status updates if its busy receiving, writing, and flushing the 
incoming WAL. That would happen if you have a fast network, and slow 
disk, and the standby is catching up, e.g after restoring a base backup.

I added a XLogWalRcvSendReply() call into XLogWalRcvFlush() so that it 
also sends a status update every time the WAL is flushed. If the 
walreceiver is busy receiving and flushing, that would happen once per 
WAL segment, which seems sensible.

The comment above StandbyReplyMessage said that its message type is 'r'. 
However, no message type was actually sent for the replies. A message 
type byte seems like a good idea, for the sake of extensibility, so I 
made the code match that comment. I also added documentation of this new 
message type in the manual section about the streaming replication protocol.

I committed the patch with those changes, and some minor comment tweaks 
and other kibitzing.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


Re: Sync Rep for 2011CF1

From
Robert Haas
Date:
On Tue, Feb 8, 2011 at 3:25 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Feb 8, 2011 at 2:34 PM, Magnus Hagander <magnus@hagander.net> wrote:
>> I also agree with the general idea of trying to break it into smaller
>> parts - even if they only provide small parts each on it's own. That
>> also makes it easier to get an overview of exactly how much is left,
>> to see where to focus.
>
> And on that note, here's the rest of the patch back, rebased over what
> I posted ~90 minutes ago.

Another rebase.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment

Re: Sync Rep for 2011CF1

From
Fujii Masao
Date:
On Fri, Feb 11, 2011 at 4:06 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
> I committed the patch with those changes, and some minor comment tweaks and
> other kibitzing.

+            * 'd' means a standby reply wrapped in a COPY BOTH packet.
+            */

Typo: s/COPY BOTH/CopyData


+   msgtype = pq_getmsgbyte(&input_message);
+   if (msgtype != 'r')
+       ereport(COMMERROR,
+               (errcode(ERRCODE_PROTOCOL_VIOLATION),
+                errmsg("unexpected message type %c", msgtype)));

I think that proc_exit(0) needs to be called in error case.


+   static StringInfoData input_message;
+   StandbyReplyMessage reply;
+   char msgtype;
+
+   initStringInfo(&input_message);

Doesn't the repeat of initStringInfo() cause the memory leak?


@@ -518,6 +584,7 @@ WalSndLoop(void)       {           if (!XLogSend(output_message, &caughtup))               break;
+           ProcessRepliesIfAny();

Why is ProcessRepliesIfAny() required there?


We added new columns "write_location", "flush_location" and
"apply_location". So, for the sake of consistency, the column
name "sent_location" should be changed to "send_location"?

Regards,.

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Re: Sync Rep for 2011CF1

From
Fujii Masao
Date:
On Fri, Feb 11, 2011 at 4:06 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
> I added a XLogWalRcvSendReply() call into XLogWalRcvFlush() so that it also
> sends a status update every time the WAL is flushed. If the walreceiver is
> busy receiving and flushing, that would happen once per WAL segment, which
> seems sensible.

This change can make the callback function "WalRcvDie()" call ereport(ERROR)
via XLogWalRcvFlush(). This looks unsafe.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Re: Sync Rep for 2011CF1

From
Fujii Masao
Date:
On Mon, Feb 14, 2011 at 2:08 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
> On Fri, Feb 11, 2011 at 4:06 AM, Heikki Linnakangas
> <heikki.linnakangas@enterprisedb.com> wrote:
>> I committed the patch with those changes, and some minor comment tweaks and
>> other kibitzing.

I have another comment:

The description of wal_receiver_status_interval is in "18.5.4.
Streaming Replication".
But I think that it should be moved to "18.5.5. Standby Servers" since
it's a parameter
to control the behavior of the standby server rather than that of the master.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Re: Sync Rep for 2011CF1

From
Jaime Casanova
Date:
On Sat, Jan 15, 2011 at 4:40 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>
> Here's the latest patch for sync rep.
>

I was looking at this code and found something in SyncRepWaitOnQueue
we declare a timeout variable that is a long and another that is a
boolean (this last one in the else part of the "if
(!IsOnSyncRepQueue())"), and then use the boolean one as if it were
the long one

+       else
+       {
+           bool release = false;
+           bool timeout = false;
+
+           SpinLockAcquire(&queue->qlock);
+
+           /*
+            * Check the LSN on our queue and if its moved far enough then
+            * remove us from the queue. First time through this is
+            * unlikely to be far enough, yet is possible. Next time we are
+            * woken we should be more lucky.
+            */
+           if (XLByteLE(XactCommitLSN, queue->lsn))
+               release = true;
+           else if (timeout > 0 &&
+               TimestampDifferenceExceeds(GetCurrentTransactionStopTimestamp(),
+                                       now,
+                                       timeout))
+           {
+               release = true;
+               timeout = true;
+           }

the other two things are on postgresql.conf.sample:
- we have two replication_timeout_client, obviously one of them should
be replication_timeout_server
- synchronous_replication_feedback is off by default, but docs says otherwise

i also have been testing this, until now the only issue i have found
is that if i set allow_standalone_primary to off and there isn't a
standby connected i need to stop the server with -m immediate which is
at least surprising

--
Jaime Casanova         www.2ndQuadrant.com
Professional PostgreSQL: Soporte y capacitación de PostgreSQL


Re: Sync Rep for 2011CF1

From
Simon Riggs
Date:
On Tue, 2011-02-15 at 01:45 -0500, Jaime Casanova wrote:
> On Sat, Jan 15, 2011 at 4:40 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> >
> > Here's the latest patch for sync rep.
> >
> 
> I was looking at this code and found something in SyncRepWaitOnQueue
> we declare a timeout variable that is a long and another that is a
> boolean (this last one in the else part of the "if
> (!IsOnSyncRepQueue())"), and then use the boolean one as if it were
> the long one

OK, thanks.

> +       else
> +       {
> +           bool release = false;
> +           bool timeout = false;
> +
> +           SpinLockAcquire(&queue->qlock);
> +
> +           /*
> +            * Check the LSN on our queue and if its moved far enough then
> +            * remove us from the queue. First time through this is
> +            * unlikely to be far enough, yet is possible. Next time we are
> +            * woken we should be more lucky.
> +            */
> +           if (XLByteLE(XactCommitLSN, queue->lsn))
> +               release = true;
> +           else if (timeout > 0 &&
> +               TimestampDifferenceExceeds(GetCurrentTransactionStopTimestamp(),
> +                                       now,
> +                                       timeout))
> +           {
> +               release = true;
> +               timeout = true;
> +           }
> 
> the other two things are on postgresql.conf.sample:
> - we have two replication_timeout_client, obviously one of them should
> be replication_timeout_server
> - synchronous_replication_feedback is off by default, but docs says otherwise
> 
> i also have been testing this, until now the only issue i have found
> is that if i set allow_standalone_primary to off and there isn't a
> standby connected i need to stop the server with -m immediate which is
> at least surprising

I think that code is being ripped out, so will check again later.

-- Simon Riggs           http://www.2ndQuadrant.com/books/PostgreSQL Development, 24x7 Support, Training and Services



Re: Sync Rep for 2011CF1

From
Robert Haas
Date:
On Mon, Feb 14, 2011 at 12:08 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
> On Fri, Feb 11, 2011 at 4:06 AM, Heikki Linnakangas
> <heikki.linnakangas@enterprisedb.com> wrote:
>> I committed the patch with those changes, and some minor comment tweaks and
>> other kibitzing.
>
> +            * 'd' means a standby reply wrapped in a COPY BOTH packet.
> +            */
>
> Typo: s/COPY BOTH/CopyData

Fixed.

> +   msgtype = pq_getmsgbyte(&input_message);
> +   if (msgtype != 'r')
> +       ereport(COMMERROR,
> +               (errcode(ERRCODE_PROTOCOL_VIOLATION),
> +                errmsg("unexpected message type %c", msgtype)));
>
> I think that proc_exit(0) needs to be called in error case.

Fixed.

> +   static StringInfoData input_message;
> +   StandbyReplyMessage reply;
> +   char msgtype;
> +
> +   initStringInfo(&input_message);
>
> Doesn't the repeat of initStringInfo() cause the memory leak?

Yeah.  Fixed, I hope.

> @@ -518,6 +584,7 @@ WalSndLoop(void)
>        {
>            if (!XLogSend(output_message, &caughtup))
>                break;
> +           ProcessRepliesIfAny();
>
> Why is ProcessRepliesIfAny() required there?

I'm not sure if that's 100% necessary, but it seems harmless enough.

> We added new columns "write_location", "flush_location" and
> "apply_location". So, for the sake of consistency, the column
> name "sent_location" should be changed to "send_location"?

I was thinking about stream_location or streaming_location, per
discussion on the other thread.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Sync Rep for 2011CF1

From
Robert Haas
Date:
On Tue, Feb 15, 2011 at 1:11 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
> On Mon, Feb 14, 2011 at 2:08 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
>> On Fri, Feb 11, 2011 at 4:06 AM, Heikki Linnakangas
>> <heikki.linnakangas@enterprisedb.com> wrote:
>>> I committed the patch with those changes, and some minor comment tweaks and
>>> other kibitzing.
>
> I have another comment:
>
> The description of wal_receiver_status_interval is in "18.5.4.
> Streaming Replication".
> But I think that it should be moved to "18.5.5. Standby Servers" since
> it's a parameter
> to control the behavior of the standby server rather than that of the master.

Fixed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Sync Rep for 2011CF1

From
Robert Haas
Date:
On Mon, Feb 14, 2011 at 12:25 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
> On Fri, Feb 11, 2011 at 4:06 AM, Heikki Linnakangas
> <heikki.linnakangas@enterprisedb.com> wrote:
>> I added a XLogWalRcvSendReply() call into XLogWalRcvFlush() so that it also
>> sends a status update every time the WAL is flushed. If the walreceiver is
>> busy receiving and flushing, that would happen once per WAL segment, which
>> seems sensible.
>
> This change can make the callback function "WalRcvDie()" call ereport(ERROR)
> via XLogWalRcvFlush(). This looks unsafe.

Good catch.  Is the cleanest solution to pass a boolean parameter to
XLogWalRcvFlush() indicating whether we're in the midst of dying?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Sync Rep for 2011CF1

From
Fujii Masao
Date:
On Wed, Feb 16, 2011 at 2:08 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Mon, Feb 14, 2011 at 12:25 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
>> On Fri, Feb 11, 2011 at 4:06 AM, Heikki Linnakangas
>> <heikki.linnakangas@enterprisedb.com> wrote:
>>> I added a XLogWalRcvSendReply() call into XLogWalRcvFlush() so that it also
>>> sends a status update every time the WAL is flushed. If the walreceiver is
>>> busy receiving and flushing, that would happen once per WAL segment, which
>>> seems sensible.
>>
>> This change can make the callback function "WalRcvDie()" call ereport(ERROR)
>> via XLogWalRcvFlush(). This looks unsafe.
>
> Good catch.  Is the cleanest solution to pass a boolean parameter to
> XLogWalRcvFlush() indicating whether we're in the midst of dying?

Agreed if the comment about why such a boolean parameter is
required is added.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Re: Sync Rep for 2011CF1

From
Robert Haas
Date:
On Tue, Feb 15, 2011 at 10:13 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
> On Wed, Feb 16, 2011 at 2:08 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Mon, Feb 14, 2011 at 12:25 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
>>> On Fri, Feb 11, 2011 at 4:06 AM, Heikki Linnakangas
>>> <heikki.linnakangas@enterprisedb.com> wrote:
>>>> I added a XLogWalRcvSendReply() call into XLogWalRcvFlush() so that it also
>>>> sends a status update every time the WAL is flushed. If the walreceiver is
>>>> busy receiving and flushing, that would happen once per WAL segment, which
>>>> seems sensible.
>>>
>>> This change can make the callback function "WalRcvDie()" call ereport(ERROR)
>>> via XLogWalRcvFlush(). This looks unsafe.
>>
>> Good catch.  Is the cleanest solution to pass a boolean parameter to
>> XLogWalRcvFlush() indicating whether we're in the midst of dying?
>
> Agreed if the comment about why such a boolean parameter is
> required is added.

OK, done.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Sync Rep for 2011CF1

From
Simon Riggs
Date:
On Tue, 2011-02-15 at 12:08 -0500, Robert Haas wrote:
> On Mon, Feb 14, 2011 at 12:25 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
> > On Fri, Feb 11, 2011 at 4:06 AM, Heikki Linnakangas
> > <heikki.linnakangas@enterprisedb.com> wrote:
> >> I added a XLogWalRcvSendReply() call into XLogWalRcvFlush() so that it also
> >> sends a status update every time the WAL is flushed. If the walreceiver is
> >> busy receiving and flushing, that would happen once per WAL segment, which
> >> seems sensible.
> >
> > This change can make the callback function "WalRcvDie()" call ereport(ERROR)
> > via XLogWalRcvFlush(). This looks unsafe.
> 
> Good catch.  Is the cleanest solution to pass a boolean parameter to
> XLogWalRcvFlush() indicating whether we're in the midst of dying?

Surely if you do this then sync rep will fail to respond correctly if
WalReceiver dies.

Why is it OK to write to disk, but not OK to reply?

-- Simon Riggs           http://www.2ndQuadrant.com/books/PostgreSQL Development, 24x7 Support, Training and Services



Re: Sync Rep for 2011CF1

From
Heikki Linnakangas
Date:
On 16.02.2011 17:36, Simon Riggs wrote:
> On Tue, 2011-02-15 at 12:08 -0500, Robert Haas wrote:
>> On Mon, Feb 14, 2011 at 12:25 AM, Fujii Masao<masao.fujii@gmail.com>  wrote:
>>> On Fri, Feb 11, 2011 at 4:06 AM, Heikki Linnakangas
>>> <heikki.linnakangas@enterprisedb.com>  wrote:
>>>> I added a XLogWalRcvSendReply() call into XLogWalRcvFlush() so that it also
>>>> sends a status update every time the WAL is flushed. If the walreceiver is
>>>> busy receiving and flushing, that would happen once per WAL segment, which
>>>> seems sensible.
>>>
>>> This change can make the callback function "WalRcvDie()" call ereport(ERROR)
>>> via XLogWalRcvFlush(). This looks unsafe.
>>
>> Good catch.  Is the cleanest solution to pass a boolean parameter to
>> XLogWalRcvFlush() indicating whether we're in the midst of dying?
>
> Surely if you do this then sync rep will fail to respond correctly if
> WalReceiver dies.
>
> Why is it OK to write to disk, but not OK to reply?

Because the connection might be dead. A broken connection is a likely 
cause of walreceiver death.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


Re: Sync Rep for 2011CF1

From
Simon Riggs
Date:
On Wed, 2011-02-16 at 17:40 +0200, Heikki Linnakangas wrote:
> On 16.02.2011 17:36, Simon Riggs wrote:
> > On Tue, 2011-02-15 at 12:08 -0500, Robert Haas wrote:
> >> On Mon, Feb 14, 2011 at 12:25 AM, Fujii Masao<masao.fujii@gmail.com>  wrote:
> >>> On Fri, Feb 11, 2011 at 4:06 AM, Heikki Linnakangas
> >>> <heikki.linnakangas@enterprisedb.com>  wrote:
> >>>> I added a XLogWalRcvSendReply() call into XLogWalRcvFlush() so that it also
> >>>> sends a status update every time the WAL is flushed. If the walreceiver is
> >>>> busy receiving and flushing, that would happen once per WAL segment, which
> >>>> seems sensible.
> >>>
> >>> This change can make the callback function "WalRcvDie()" call ereport(ERROR)
> >>> via XLogWalRcvFlush(). This looks unsafe.
> >>
> >> Good catch.  Is the cleanest solution to pass a boolean parameter to
> >> XLogWalRcvFlush() indicating whether we're in the midst of dying?
> >
> > Surely if you do this then sync rep will fail to respond correctly if
> > WalReceiver dies.
> >
> > Why is it OK to write to disk, but not OK to reply?
> 
> Because the connection might be dead. A broken connection is a likely 
> cause of walreceiver death.

Would it not be possible to check that? 

-- Simon Riggs           http://www.2ndQuadrant.com/books/PostgreSQL Development, 24x7 Support, Training and Services



Re: Sync Rep for 2011CF1

From
Robert Haas
Date:
On Wed, Feb 16, 2011 at 11:32 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On Wed, 2011-02-16 at 17:40 +0200, Heikki Linnakangas wrote:
>> On 16.02.2011 17:36, Simon Riggs wrote:
>> > On Tue, 2011-02-15 at 12:08 -0500, Robert Haas wrote:
>> >> On Mon, Feb 14, 2011 at 12:25 AM, Fujii Masao<masao.fujii@gmail.com>  wrote:
>> >>> On Fri, Feb 11, 2011 at 4:06 AM, Heikki Linnakangas
>> >>> <heikki.linnakangas@enterprisedb.com>  wrote:
>> >>>> I added a XLogWalRcvSendReply() call into XLogWalRcvFlush() so that it also
>> >>>> sends a status update every time the WAL is flushed. If the walreceiver is
>> >>>> busy receiving and flushing, that would happen once per WAL segment, which
>> >>>> seems sensible.
>> >>>
>> >>> This change can make the callback function "WalRcvDie()" call ereport(ERROR)
>> >>> via XLogWalRcvFlush(). This looks unsafe.
>> >>
>> >> Good catch.  Is the cleanest solution to pass a boolean parameter to
>> >> XLogWalRcvFlush() indicating whether we're in the midst of dying?
>> >
>> > Surely if you do this then sync rep will fail to respond correctly if
>> > WalReceiver dies.
>> >
>> > Why is it OK to write to disk, but not OK to reply?
>>
>> Because the connection might be dead. A broken connection is a likely
>> cause of walreceiver death.
>
> Would it not be possible to check that?

I'm not actually sure that it matters that much whether we do or not.
ISTM that the WAL receiver is normally going to exit the main loop (in
WalReceiverMain) right here:
       /* Process any requests or signals received recently */       ProcessWalRcvInterrupts();

But to get to that point, we either have to be making our first pass
through the loop (in which case nothing interesting has happened yet)
or we have to have just completed an iteration through the loop (in
which case we just sent a reply).  I think that the only thing that
can have changed since the last reply is the replay position, which
this version of the sync rep patch doesn't care about anyway.  Even if
it did, I'm not sure it'd be worth complicating the die path to
squeeze in one final reply.

Actually, on further reflection, I'm not even sure why we bother with
the fsync.  It seems like a useful safeguard but I'm not seeing how we
can get to that point without having fsync'd everything anyway.  Am I
missing something?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Sync Rep for 2011CF1

From
Heikki Linnakangas
Date:
On 16.02.2011 19:29, Robert Haas wrote:
> Actually, on further reflection, I'm not even sure why we bother with
> the fsync.  It seems like a useful safeguard but I'm not seeing how we
> can get to that point without having fsync'd everything anyway.  Am I
> missing something?

WalRcvDie() is called on error. For example, if the connection dies 
unexpectedly during walrcv_receive().

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


Re: Sync Rep for 2011CF1

From
Robert Haas
Date:
On Wed, Feb 16, 2011 at 12:34 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
> On 16.02.2011 19:29, Robert Haas wrote:
>>
>> Actually, on further reflection, I'm not even sure why we bother with
>> the fsync.  It seems like a useful safeguard but I'm not seeing how we
>> can get to that point without having fsync'd everything anyway.  Am I
>> missing something?
>
> WalRcvDie() is called on error. For example, if the connection dies
> unexpectedly during walrcv_receive().

Ah, OK.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Sync Rep for 2011CF1

From
Simon Riggs
Date:
On Fri, 2011-01-21 at 14:45 +0200, Heikki Linnakangas wrote:

> * The UI differs from what was agreed on here:
> http://archives.postgresql.org/message-id/4D1DCF5A.7070808@enterprisedb.com.

Patch to add server_name parameter, plus mechanism to send info from
standby to master. While doing that, refactor into 3 message types, not
just 1. This addresses Fujii's comment that we may not wish to send
feedback as often as other replies, but doesn't actually alter yet when
the feedback is sent (nor will I do that anytime soon).

Complete but rough hack, for comments, but nothing surprising.

--
 Simon Riggs           http://www.2ndQuadrant.com/books/
 PostgreSQL Development, 24x7 Support, Training and Services


Attachment

Re: Sync Rep for 2011CF1

From
Simon Riggs
Date:
On Fri, 2011-02-18 at 00:48 +0000, Simon Riggs wrote:

> Complete but rough hack, for comments, but nothing surprising.

This is an implicit requirement from our earlier agreed API, so its
blocking further work on Sync Rep.

I'm looking to commit this in about 3-4 hours unless I get comments.

-- Simon Riggs           http://www.2ndQuadrant.com/books/PostgreSQL Development, 24x7 Support, Training and Services