Thread: Hot Standby Feedback should default to on in 9.3+

Hot Standby Feedback should default to on in 9.3+

From
Andres Freund
Date:
Hi,

The subject says it all.

There are workloads where it's detrimental, but in general having it
default to on improves the experience tremendously, because getting
conflicts because of vacuum is rather confusing.

In the workloads where it might not be a good idea (very long queries on
the standby, many dead tuples on the primary) you need to think very
carefully about the strategy of avoiding conflicts anyway, and explicit
configuration is required as well.

Does anybody have an argument against changing the default value?

Greetings,

Andres Freund

-- 
Andres Freund                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: Hot Standby Feedback should default to on in 9.3+

From
Simon Riggs
Date:
On 30 November 2012 19:02, Andres Freund <andres@2ndquadrant.com> wrote:

> The subject says it all.
>
> There are workloads where it's detrimental, but in general having it
> default to on improves the experience tremendously, because getting
> conflicts because of vacuum is rather confusing.
>
> In the workloads where it might not be a good idea (very long queries on
> the standby, many dead tuples on the primary) you need to think very
> carefully about the strategy of avoiding conflicts anyway, and explicit
> configuration is required as well.
>
> Does anybody have an argument against changing the default value?

I don't see a technical objection, perhaps others do.

-- 
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: Hot Standby Feedback should default to on in 9.3+

From
Robert Haas
Date:
On Fri, Nov 30, 2012 at 2:02 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> Does anybody have an argument against changing the default value?

Well, the disadvantage of it is that the standby can bloat the master,
which might be surprising to some people, too.  But I don't really
have a lot of skin in this game.

While we're talking about changing defaults, how about changing the
default value of the recovery.conf parameter 'standby_mode' to on?
Not sure about anybody else, but I never want it any other way.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Hot Standby Feedback should default to on in 9.3+

From
Josh Berkus
Date:
> In the workloads where it might not be a good idea (very long queries on
> the standby, many dead tuples on the primary) you need to think very
> carefully about the strategy of avoiding conflicts anyway, and explicit
> configuration is required as well.
> 
> Does anybody have an argument against changing the default value?

On balance, I think it's a good idea.  It's easier for new users,
conceptually, to deal with table bloat than query cancel.

Have we done testing on how much query cancel it actually eliminates?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: Hot Standby Feedback should default to on in 9.3+

From
Heikki Linnakangas
Date:
On 30.11.2012 21:02, Andres Freund wrote:
> Hi,
>
> The subject says it all.
>
> There are workloads where it's detrimental, but in general having it
> default to on improves the experience tremendously, because getting
> conflicts because of vacuum is rather confusing.
>
> In the workloads where it might not be a good idea (very long queries on
> the standby, many dead tuples on the primary) you need to think very
> carefully about the strategy of avoiding conflicts anyway, and explicit
> configuration is required as well.
>
> Does anybody have an argument against changing the default value?

-1. By default, I would expect a standby server to not have any 
meaningful impact on the performance of the master. With hot standby 
feedback, you can bloat the master very badly if you're not careful.

Think of someone setting up a test server, by setting it up as a standby 
from the master. Now, when someone holds a transaction open in the test 
server, you get bloat in the master. Or if you set up a standby for 
reporting purposes - a very common use case - you would not expect a 
long running ad-hoc query in the standby to bloat the master. That's 
precisely why you set up such a standby in the first place.

You could of course still turn it off, but you would have to know about 
it in the first place. I think it's a reasonable assumption that a 
standby does *not* affect the master (aside from the bandwidth and disk 
space required to retain/ship the WAL). If you have to remember to 
explicitly set a GUC to get that behavior, that's a pretty big gotcha.

- Heikki



Re: Hot Standby Feedback should default to on in 9.3+

From
Claudio Freire
Date:
On Fri, Nov 30, 2012 at 5:46 PM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:
>
> Think of someone setting up a test server, by setting it up as a standby
> from the master. Now, when someone holds a transaction open in the test
> server, you get bloat in the master. Or if you set up a standby for
> reporting purposes - a very common use case - you would not expect a long
> running ad-hoc query in the standby to bloat the master. That's precisely
> why you set up such a standby in the first place.

Without hot standby feedback, reporting queries are impossible. I've
experienced it. Cancellations make it impossible to finish any
decently complex reporting query.



Re: Hot Standby Feedback should default to on in 9.3+

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> While we're talking about changing defaults, how about changing the
> default value of the recovery.conf parameter 'standby_mode' to on?
> Not sure about anybody else, but I never want it any other way.

Dunno, it's been only a couple of days since there was a thread about
somebody who had turned it on and not gotten the results he wanted
(because he was only trying to do a point-in-time recovery not create
a standby).  There's enough other configuration needed to set up a
standby node that I'm not sure flipping this default helps the case
much.

But having said that, would it be practical to get rid of the explicit
standby_mode parameter altogether?  I'm thinking we could assume standby
mode is wanted if primary_conninfo has a nonempty value.

There remains the case of a standby being fed solely from WAL archive
without primary_conninfo, but that's a pretty darn corner-y corner case,
and I doubt it has to be easy to set up.  One possibility for it is to
allow primary_conninfo to be set to "none", which would still trigger
standby mode but could be coded to not enable connection attempts.
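
A minimal sketch of that inference rule, written in Python purely for
illustration. The function name infer_standby_mode and the keys of the
returned dictionary are hypothetical; nothing like this exists in the
server, and the "none" value is only the possibility floated above.

```python
def infer_standby_mode(primary_conninfo):
    """Hypothetical rule: derive standby mode from primary_conninfo alone.

    Returns a dict with two illustrative keys:
      standby_mode - whether recovery should keep running as a standby
      connect      - whether streaming connection attempts should be made
    """
    value = (primary_conninfo or "").strip()
    if not value:
        # Empty or unset: plain archive or point-in-time recovery.
        return {"standby_mode": False, "connect": False}
    if value.lower() == "none":
        # Proposed special value: standby fed solely from the WAL archive.
        return {"standby_mode": True, "connect": False}
    # Any other nonempty value is treated as a real connection string.
    return {"standby_mode": True, "connect": True}
```

Under this sketch a point-in-time recovery simply leaves primary_conninfo
empty, which would avoid the accidental-standby case mentioned above.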

Mind you, I'm not sure that such a design is easier to understand or
document.  But it would be one less parameter.

        regards, tom lane



Re: Hot Standby Feedback should default to on in 9.3+

From
Heikki Linnakangas
Date:
On 30.11.2012 22:49, Claudio Freire wrote:
> On Fri, Nov 30, 2012 at 5:46 PM, Heikki Linnakangas
> <hlinnakangas@vmware.com>  wrote:
>>
>> Think of someone setting up a test server, by setting it up as a standby
>> from the master. Now, when someone holds a transaction open in the test
>> server, you get bloat in the master. Or if you set up a standby for
>> reporting purposes - a very common use case - you would not expect a long
>> running ad-hoc query in the standby to bloat the master. That's precisely
>> why you set up such a standby in the first place.
>
> Without hot standby feedback, reporting queries are impossible. I've
> experienced it. Cancellations make it impossible to finish any
> decently complex reporting query.

Maybe so, but I'd rather get cancellations in the standby, and then read 
up on feedback and the other options and figure out how to make it work, 
than get severe bloat in the master and scratch my head wondering what's 
causing it.

- Heikki



Re: Hot Standby Feedback should default to on in 9.3+

From
"Kevin Grittner"
Date:
Claudio Freire wrote:

> Without hot standby feedback, reporting queries are impossible.
> I've experienced it. Cancellations make it impossible to finish
> any decently complex reporting query.

With what setting of max_standby_streaming_delay? I would rather
default that to -1 than default hot_standby_feedback on. That way
what you do on the standby only affects the standby.

A default that allows anyone who has a read-only login to a standby
to bloat the server by default, which may require hours of down
time to correct, seems dangerous to me.

-Kevin



Re: Hot Standby Feedback should default to on in 9.3+

From
Claudio Freire
Date:
On Fri, Nov 30, 2012 at 6:06 PM, Kevin Grittner <kgrittn@mail.com> wrote:
>
>> Without hot standby feedback, reporting queries are impossible.
>> I've experienced it. Cancellations make it impossible to finish
>> any decently complex reporting query.
>
> With what setting of max_standby_streaming_delay? I would rather
> default that to -1 than default hot_standby_feedback on. That way
> what you do on the standby only affects the standby.

1d



Re: Hot Standby Feedback should default to on in 9.3+

From
Tom Lane
Date:
Claudio Freire <klaussfreire@gmail.com> writes:
> Without hot standby feedback, reporting queries are impossible. I've
> experienced it. Cancellations make it impossible to finish any
> decently complex reporting query.

The original expectation was that slave-side cancels would be
infrequent.  Maybe there's some fixing/tuning to be done there.

        regards, tom lane



Re: Hot Standby Feedback should default to on in 9.3+

From
"Kevin Grittner"
Date:
Claudio Freire wrote:

>> With what setting of max_standby_streaming_delay? I would rather
>> default that to -1 than default hot_standby_feedback on. That
>> way what you do on the standby only affects the standby.
> 
> 1d

Was there actually a transaction hanging open for an entire day on
the standby? Was it a query which actually ran that long, or an
ill-behaved user or piece of software?

I have most certainly managed databases where holding up vacuuming
on the source would cripple performance to the point that users
would have demanded that any other process causing it must be
immediately canceled. And canceling it wouldn't be enough at that
point -- the bloat would still need to be fixed before they could
work efficiently.

-Kevin



Re: Hot Standby Feedback should default to on in 9.3+

From
Claudio Freire
Date:
On Fri, Nov 30, 2012 at 6:20 PM, Kevin Grittner <kgrittn@mail.com> wrote:
> Claudio Freire wrote:
>
>>> With what setting of max_standby_streaming_delay? I would rather
>>> default that to -1 than default hot_standby_feedback on. That
>>> way what you do on the standby only affects the standby.
>>
>> 1d
>
> Was there actually a transaction hanging open for an entire day on
> the standby? Was it a query which actually ran that long, or an
> ill-behaved user or piece of software?

No, and if there was, I wouldn't care for it to be cancelled.

Queries were being cancelled way before that timeout was reached,
probably something to do with wal_keep_segments on the master side
being unable to keep up for that long.

> I have most certainly managed databases where holding up vacuuming
> on the source would cripple performance to the point that users
> would have demanded that any other process causing it must be
> immediately canceled. And canceling it wouldn't be enough at that
> point -- the bloat would still need to be fixed before they could
> work efficiently.

I wouldn't mind occasional cancels, but these were recurring. When a
query ran long enough, there was no way for it to finish, no matter
how many times you tried. The master never stops being busy, that's
probably a factor.



Re: Hot Standby Feedback should default to on in 9.3+

From
Heikki Linnakangas
Date:
On 30.11.2012 23:40, Claudio Freire wrote:
> On Fri, Nov 30, 2012 at 6:20 PM, Kevin Grittner<kgrittn@mail.com>  wrote:
>> Claudio Freire wrote:
>>
>>>> With what setting of max_standby_streaming_delay? I would rather
>>>> default that to -1 than default hot_standby_feedback on. That
>>>> way what you do on the standby only affects the standby.
>>>
>>> 1d
>>
>> Was there actually a transaction hanging open for an entire day on
>> the standby? Was it a query which actually ran that long, or an
>> ill-behaved user or piece of software?
>
> No, and if there was, I wouldn't care for it to be cancelled.
>
> Queries were being cancelled way before that timeout was reached,
> probably something to do with wal_keep_segments on the master side
> being unable to keep up for that long.

Running out of wal_keep_segments would produce a different error, 
requiring a new base backup.

>> I have most certainly managed databases where holding up vacuuming
>> on the source would cripple performance to the point that users
>> would have demanded that any other process causing it must be
>> immediately canceled. And canceling it wouldn't be enough at that
>> point -- the bloat would still need to be fixed before they could
>> work efficiently.
>
> I wouldn't mind occasional cancels, but these were recurring. When a
> query ran long enough, there was no way for it to finish, no matter
> how many times you tried. The master never stops being busy, that's
> probably a factor.

Hmm, it sounds like max_standby_streaming_delay=1d didn't work as 
intended for some reason. It should've given the query one day to run 
before canceling it. Unless the standby was running one day behind the 
master already, but that seems unlikely. Any chance you could reproduce 
that?
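
To illustrate the "already behind" caveat: as I understand the
documentation, max_standby_streaming_delay is counted from the moment
the WAL was received on the standby, not from the start of the query.
The following is a rough Python model of that rule, not the server's
actual code; the function names are made up for the example, and None
stands in for the setting -1 (wait forever).

```python
from datetime import datetime, timedelta


def conflict_deadline(wal_received_at, max_delay):
    """Time after which a conflicting standby query may be canceled."""
    if max_delay is None:
        # Models max_standby_streaming_delay = -1: never cancel.
        return None
    return wal_received_at + max_delay


def grace_remaining(now, wal_received_at, max_delay):
    """How much longer replay would wait for a conflicting query."""
    deadline = conflict_deadline(wal_received_at, max_delay)
    if deadline is None:
        return None
    # Once the deadline has passed there is no grace period left.
    return max(deadline - now, timedelta(0))


# WAL received 23 hours ago and only now being replayed, delay set to 1d.
received = datetime(2012, 11, 30, 0, 0)
now = datetime(2012, 11, 30, 23, 0)
print(grace_remaining(now, received, timedelta(days=1)))  # 1:00:00
```

In this model a query that starts while replay is 23 hours behind gets
one hour, not one day, before it can be canceled.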

- Heikki



Re: Hot Standby Feedback should default to on in 9.3+

From
Claudio Freire
Date:
On Fri, Nov 30, 2012 at 6:49 PM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:
>>> I have most certainly managed databases where holding up vacuuming
>>> on the source would cripple performance to the point that users
>>> would have demanded that any other process causing it must be
>>> immediately canceled. And canceling it wouldn't be enough at that
>>> point -- the bloat would still need to be fixed before they could
>>> work efficiently.
>>
>>
>> I wouldn't mind occasional cancels, but these were recurring. When a
>> query ran long enough, there was no way for it to finish, no matter
>> how many times you tried. The master never stops being busy, that's
>> probably a factor.
>
>
> Hmm, it sounds like max_standby_streaming_delay=1d didn't work as intended
> for some reason. It should've given the query one day to run before
> canceling it. Unless the standby was running one day behind the master
> already, but that seems unlikely. Any chance you could reproduce that?

I have a pre-production server with replication for these tests. I
could create a fake stream of writes on it, disable feedback, and see
what happens.



Re: Hot Standby Feedback should default to on in 9.3+

From
Andres Freund
Date:
On 2012-11-30 14:35:37 -0500, Robert Haas wrote:
> On Fri, Nov 30, 2012 at 2:02 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> > Does anybody have an argument against changing the default value?
>
> Well, the disadvantage of it is that the standby can bloat the master,
> which might be surprising to some people, too.  But I don't really
> have a lot of skin in this game.

Sure, that's a problem. But ISTM that it's a problem everyone running
postgres has to know about anyway from running the master itself.

> While we're talking about changing defaults, how about changing the
> default value of the recovery.conf parameter 'standby_mode' to on?
> Not sure about anybody else, but I never want it any other way.

Hm. But only if there is a recovery.conf I guess?

Andres

-- 
Andres Freund                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: Hot Standby Feedback should default to on in 9.3+

From
Daniel Farina
Date:
On Fri, Nov 30, 2012 at 11:35 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, Nov 30, 2012 at 2:02 PM, Andres Freund <andres@2ndquadrant.com> wrote:
>> Does anybody have an argument against changing the default value?
>
> Well, the disadvantage of it is that the standby can bloat the master,
> which might be surprising to some people, too.  But I don't really
> have a lot of skin in this game.

Under this precept, we used to not enable hot standby feedback and
instead allowed more or less unbounded staleness of the standby
through very long cancellation times. Although not immediate,
eventually we decided that enough people were getting confused by
sufficiently long standby delay caused by bad queries and idle in xact
backends, so now we have enabled feedback for new database replicants,
along with some fairly un-aggressive cancellation timeouts. It's all
rather messy and not very satisfying.  We have yet to know if feedback
causes or solves problems, on average.

In very early versions we tried the default cancellation settings, and
query cancellation confused everyone a *lot*.  That went away in a
hurry as a result, so I suppose it's not entirely unreasonable to say
in retrospect that the defaults can be considered kind of bad.

Longer term, I think I'd be keen to switch all our user-controlled
replication to logical except for use cases where the workload of the
standby is under our (and not the user's) control, such as for
failover.

Unfortunately, our experience with the feature and its use suggests
that the contract granted by the mechanisms seen in hot standby is
too complex for full-stack developers to keep in careful consideration
along with all the other things they want to do with their application
and/or have to remember about Postgres to get by.

--
fdr



Re: Hot Standby Feedback should default to on in 9.3+

From
Magnus Hagander
Date:
On Fri, Nov 30, 2012 at 9:46 PM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:
> On 30.11.2012 21:02, Andres Freund wrote:
>>
>> Hi,
>>
>> The subject says it all.
>>
>> There are workloads where it's detrimental, but in general having it
>> default to on improves the experience tremendously, because getting
>> conflicts because of vacuum is rather confusing.
>>
>> In the workloads where it might not be a good idea (very long queries on
>> the standby, many dead tuples on the primary) you need to think very
>> carefully about the strategy of avoiding conflicts anyway, and explicit
>> configuration is required as well.
>>
>> Does anybody have an argument against changing the default value?
>
>
> -1. By default, I would expect a standby server to not have any meaningful
> impact on the performance of the master. With hot standby feedback, you can
> bloat the master very badly if you're not careful.

I'm with Heikki on the -1 on this. It's certainly unexpected to have
the slave affect the master by default - people will expect the master
to be independent.

Also, it doesn't IMHO actually *help*. The big thing that makes it
harder for people to set up replication that way is wal_level=minimal
by default, and in a smaller sense max_wal_senders (but
wal_level=minimal also has the interesting property that it's not
enough to change it to wal_level=hot_standby if you figure it out too
late - you have to turn off hot standby on the slave, start it, have
it catch up, shut it down, and reenable hot standby). And they
require a *restart* of the master, which is a lot worse than a small
change to the config of the *slave*. So unless you're suggesting to
change the default of those two values as well, I'm not sure it really
helps that much...
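
A small Python sketch of the check implied above, assuming the settings
have already been read into a dictionary. The function name
primary_blockers is hypothetical; the fallback values used when a
setting is missing are the shipped defaults (wal_level=minimal,
max_wal_senders=0).

```python
def primary_blockers(settings):
    """Return reasons why a master cannot feed a streaming hot standby yet."""
    problems = []
    # The shipped default is minimal, which a hot standby cannot use.
    if settings.get("wal_level", "minimal") != "hot_standby":
        problems.append("wal_level must be hot_standby (restart required)")
    # The shipped default is 0, so no standby can connect for streaming.
    if int(settings.get("max_wal_senders", 0)) < 1:
        problems.append("max_wal_senders must be at least 1 (restart required)")
    return problems
```

With an untouched configuration both problems are reported, and both
need a restart of the master to fix.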


> Think of someone setting up a test server, by setting it up as a standby
> from the master. Now, when someone holds a transaction open in the test
> server, you get bloat in the master. Or if you set up a standby for
> reporting purposes - a very common use case - you would not expect a long
> running ad-hoc query in the standby to bloat the master. That's precisely
> why you set up such a standby in the first place.
>
> You could of course still turn it off, but you would have to know about it
> in the first place. I think it's a reasonable assumption that a standby does
> *not* affect the master (aside from the bandwidth and disk space required to
> retain/ship the WAL). If you have to remember to explicitly set a GUC to get
> that behavior, that's a pretty big gotcha.

+1. Having your reporting query time out *shows you* the problem.
Having the master bloat for you won't show the problem until later -
when it's much bigger, and it's much more pain to recover from.


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: Hot Standby Feedback should default to on in 9.3+

From
Magnus Hagander
Date:
On Fri, Nov 30, 2012 at 10:09 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Claudio Freire <klaussfreire@gmail.com> writes:
>> Without hot standby feedback, reporting queries are impossible. I've
>> experienced it. Cancellations make it impossible to finish any
>> decently complex reporting query.
>
> The original expectation was that slave-side cancels would be
> infrequent.  Maybe there's some fixing/tuning to be done there.

It depends completely on the query pattern on the master. Saying that
cancellations make it "impossible to finish any decently complex
reporting query" is completely incorrect - it depends on the queries
on the *master*, not on the complexity of the query on the slave. I
know a lot of scenarios where query cancels pretty much never happen
at all.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: Hot Standby Feedback should default to on in 9.3+

From
Andres Freund
Date:
On 2012-11-30 22:46:06 +0200, Heikki Linnakangas wrote:
> On 30.11.2012 21:02, Andres Freund wrote:
> >Hi,
> >
> >The subject says it all.
> >
> >There are workloads where it's detrimental, but in general having it
> >default to on improves the experience tremendously, because getting
> >conflicts because of vacuum is rather confusing.
> >
> >In the workloads where it might not be a good idea (very long queries on
> >the standby, many dead tuples on the primary) you need to think very
> >carefully about the strategy of avoiding conflicts anyway, and explicit
> >configuration is required as well.
> >
> >Does anybody have an argument against changing the default value?
>
> -1. By default, I would expect a standby server to not have any meaningful
> impact on the performance of the master. With hot standby feedback, you can
> bloat the master very badly if you're not careful.

True. But everyone running postgres hopefully knows the problem
already. So that effect is relatively easy to explain.

The other control possibilities we have are rather hard to understand
and to set up in my experience.

> Think of someone setting up a test server, by setting it up as a standby
> from the master. Now, when someone holds a transaction open in the test
> server, you get bloat in the master. Or if you set up a standby for
> reporting purposes - a very common use case - you would not expect a long
> running ad-hoc query in the standby to bloat the master. That's precisely
> why you set up such a standby in the first place.

But you can't do any meaningful reporting without changing the current
variables around this anyway. If you have any writes on the master,
barely any significant query ever completes.

The two basic choices we give people suck more imo:
* you set up a large delay: it possibly takes a very long time to catch
  up if the primary dies, and you don't see any up-to-date data in later
  queries
* you abort queries: you can't do any reporting queries

Both are unusable for most scenarios, and getting the former just right
is hard.

Imo a default of on works in far more scenarios than the contrary.

Greetings,

Andres Freund

-- 
Andres Freund                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: Hot Standby Feedback should default to on in 9.3+

From
Josh Berkus
Date:
All:

Well, the problem is that we have three configurations, each of which
only works for one very common scenario:

- reporting slave: feedback off, very long replication_delay
- load-balancing slave: feedback on, short replication_delay
- backup/failover slave: feedback off, short replication_delay

I don't think anyone without a serious market survey can say that any of
the above scenarios is more common than the others; I run into all three
pretty darned frequently.
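
The three profiles can be written down as a small Python table, as a
sketch only. It assumes "replication_delay" above is shorthand for
max_standby_streaming_delay; the concrete values are placeholders ("1d"
is the value mentioned earlier in this thread, "30s" is the shipped
default), and the names PROFILES and render_conf are made up for the
example.

```python
# Scenario -> standby settings; values are illustrative placeholders.
PROFILES = {
    "reporting": {
        "hot_standby_feedback": "off",
        "max_standby_streaming_delay": "1d",
    },
    "load-balancing": {
        "hot_standby_feedback": "on",
        "max_standby_streaming_delay": "30s",
    },
    "failover": {
        "hot_standby_feedback": "off",
        "max_standby_streaming_delay": "30s",
    },
}


def render_conf(profile):
    """Render one profile as postgresql.conf-style lines, sorted by name."""
    settings = PROFILES[profile]
    return "\n".join(f"{name} = {value}" for name, value in sorted(settings.items()))


print(render_conf("reporting"))
```

No single default for hot_standby_feedback matches all three rows,
which is the point being made above.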

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: Hot Standby Feedback should default to on in 9.3+

From
Andres Freund
Date:
On 2012-11-30 16:09:15 -0500, Tom Lane wrote:
> Claudio Freire <klaussfreire@gmail.com> writes:
> > Without hot standby feedback, reporting queries are impossible. I've
> > experienced it. Cancellations make it impossible to finish any
> > decently complex reporting query.
>
> The original expectation was that slave-side cancels would be
> infrequent.  Maybe there's some fixing/tuning to be done there.

I've mostly seen snapshot conflicts. It's hard to say anything more
precise because we don't log any additional information (it's admittedly
not easy).

I think it would already help a lot if
ResolveRecoveryConflictWithSnapshot would log:
* the relfilenode (already passed)
* the removed xid
* the pid of the backend holding the oldest snapshot
* the oldest xid of that backend

Most of that should be easy to get.
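
Just to show the shape of such a message, here is a Python mock-up. It
is not a patch: the real function is C code in the server, and the
wording of the line, the field names and the helper
format_snapshot_conflict are all hypothetical.

```python
def format_snapshot_conflict(relfilenode, removed_xid, holder_pid, holder_oldest_xid):
    """Build one log line carrying the four fields listed above."""
    return (
        "recovery conflict on snapshot: "
        f"relfilenode={relfilenode} removed_xid={removed_xid} "
        f"holder_pid={holder_pid} holder_oldest_xid={holder_oldest_xid}"
    )


# Example values are invented for the illustration.
print(format_snapshot_conflict(16384, 1200, 4242, 1150))
```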

But I don't think we really can expect a very low rate of conflicts if
the primary has few long-running transactions but the standby does.

Greetings,

Andres Freund

-- 
Andres Freund                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: Hot Standby Feedback should default to on in 9.3+

From
"Albe Laurenz"
Date:
Magnus Hagander wrote:
>> On 30.11.2012 21:02, Andres Freund wrote:
>>> There are workloads where it's detrimental, but in general having it
>>> default to on improves the experience tremendously, because getting
>>> conflicts because of vacuum is rather confusing.
>>>
>>> In the workloads where it might not be a good idea (very long queries on
>>> the standby, many dead tuples on the primary) you need to think very
>>> carefully about the strategy of avoiding conflicts anyway, and explicit
>>> configuration is required as well.
>>>
>>> Does anybody have an argument against changing the default value?
>>
>>
>> -1. By default, I would expect a standby server to not have any meaningful
>> impact on the performance of the master. With hot standby feedback, you can
>> bloat the master very badly if you're not careful.
> 
> I'm with Heikki on the -1 on this. It's certainly unexpected to have
> the slave affect the master by default - people will expect the master
> to be independent.

I agree.

> +1. Having your reporting query time out *shows you* the problem.
> Having the master bloat for you won't show the problem until later -
> when it's much bigger, and it's much more pain to recover from.

I couldn't agree more.

There are different requirements, and there will always be
people who need to change the defaults, but the way it is is
the safest in my opinion.

Yours,
Laurenz Albe

Re: Hot Standby Feedback should default to on in 9.3+

From
Robert Haas
Date:
On Fri, Nov 30, 2012 at 6:41 PM, Andres Freund <andres@2ndquadrant.com> wrote:
>> While we're talking about changing defaults, how about changing the
>> default value of the recovery.conf parameter 'standby_mode' to on?
>> Not sure about anybody else, but I never want it any other way.
>
> Hm. But only if there is a recovery.conf I guess?

Yeah, wouldn't make much sense otherwise.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company