Re: max_standby_delay considered harmful - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: max_standby_delay considered harmful
Msg-id 1273472865.3936.1954.camel@ebony
In response to Re: max_standby_delay considered harmful  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: max_standby_delay considered harmful
Re: max_standby_delay considered harmful
List pgsql-hackers
On Sun, 2010-05-09 at 20:56 -0400, Robert Haas wrote:

> >> > Seems like it could take FOREVER on a busy system.  Surely that's not
> >> > OK.  The fact that Hot Standby has to take exclusive locks that can't
> >> > be released until WAL replay has progressed to a certain point seems
> >> > like a fairly serious wart.
> >>
> >> If this is a serious wart then it's not one of hot standby, but one of
> >> postgres proper. AccessExclusiveLocks (SELECT-blocking locks that is, as
> >> opposed to UPDATE/DELETE-blocking locks) are never necessary from a
> >> correctness POV, they're only there for implementation reasons.
> >>
> >> Getting rid of them doesn't seem completely insurmountable either - just as
> >> multiple row versions remove the need to block SELECTs due to concurrent
> >> UPDATEs, multiple datafile versions could remove the need to block SELECTs
> >> due to concurrent ALTERs. But people seem to live with them quite well,
> >> judged from the amount of work put into getting rid of them (zero). I
> >> therefore fail to see why they should pose a significant problem in HS
> >> setups.
> > The difference is that in HS you have to wait for a moment where *no exclusive
> > lock at all* exists, possibly without contending for any of them, while on the
> > master you might not even be blocked by the existence of any of those locks.
> >
> > If you have two sessions which in overlapping transactions lock different
> > tables exclusively you have no problem shutting the master down, but you will
> > never reach a point where no exclusive lock is taken on the slave.
> 
> A possible solution to this in the shutdown case is to kill anyone
> waiting on a lock held by the startup process at the same time we kill
> the startup process, and to kill anyone who subsequently waits for
> such a lock as soon as they attempt to take it.  

I already explained that killing the startup process first is a bad idea
for many reasons when shutdown was discussed. Can't remember who added
the new standby shutdown code recently, but it sounds like their design
was pretty poor if it didn't include shutting down properly with HS. I
hope they fix the bug they have introduced. HS was never designed to
work that way, so there is no flaw there; it certainly worked when
committed.

> I'm not sure if this
> would also make sense in the pause case.

Not sure why pausing replay would make any difference at all. Being
between one WAL record and the next is a valid and normal state that
exists many thousands of times per second. If making that state longer
would cause problems we would already have seen any issues. There are
none; it will work fine.

> Another possible solution would be to try to figure out if there's a
> way to delay application of WAL that requires the taking of AELs to
> the point where we could apply it all at once.  That might not be
> feasible, though, or only in some cases, and it's certainly 9.1
> material (at least) in any case.

Locks usually protect users from accessing a table while it's being
clustered or dropped or something like that. Locks are not bad. They are
also used by some developers to specifically serialize access to an
object. AccessExclusiveLocks are rare in normal running and not to be
avoided when they do exist. HS correctly supports locking, as and when
such locks are made on the master.

-- Simon Riggs           www.2ndQuadrant.com


