Thread: pgsql: Resolve timing issue with logging locks for Hot Standby.

pgsql: Resolve timing issue with logging locks for Hot Standby.

From
Simon Riggs
Date:
Resolve timing issue with logging locks for Hot Standby.
We log AccessExclusiveLocks for replay onto standby nodes,
but because of timing issues on ProcArray it is possible to
log a lock that is still held by a just committed transaction
that is very soon to be removed. To avoid any timing issue we
avoid applying locks made by transactions with InvalidXid.

Simon Riggs, bug report Tom Lane, diagnosis Pavan Deolasee

Branch
------
master

Details
-------
http://git.postgresql.org/pg/commitdiff/c172b7b02e6f6008d6dad66ddee8f67faf223c5b

Modified Files
--------------
src/backend/storage/ipc/procarray.c |    8 +--
src/backend/storage/ipc/standby.c   |  110 ++++++++++++++++++++++++-----------
src/backend/storage/lmgr/lock.c     |   12 ++++-
src/include/storage/standby.h       |    2 +-
4 files changed, 88 insertions(+), 44 deletions(-)


Re: pgsql: Resolve timing issue with logging locks for Hot Standby.

From
Tom Lane
Date:
Simon Riggs <simon@2ndQuadrant.com> writes:
> Resolve timing issue with logging locks for Hot Standby.
> We log AccessExclusiveLocks for replay onto standby nodes,
> but because of timing issues on ProcArray it is possible to
> log a lock that is still held by a just committed transaction
> that is very soon to be removed. To avoid any timing issue we
> avoid applying locks made by transactions with InvalidXid.

> Simon Riggs, bug report Tom Lane, diagnosis Pavan Deolasee

I see this was only applied to HEAD.  Wouldn't back-patching be in
order?  The problem is in 9.1 as well, no?

            regards, tom lane

Re: pgsql: Resolve timing issue with logging locks for Hot Standby.

From
Simon Riggs
Date:
On Mon, Jan 30, 2012 at 12:30 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Simon Riggs <simon@2ndQuadrant.com> writes:
>> Resolve timing issue with logging locks for Hot Standby.
>> We log AccessExclusiveLocks for replay onto standby nodes,
>> but because of timing issues on ProcArray it is possible to
>> log a lock that is still held by a just committed transaction
>> that is very soon to be removed. To avoid any timing issue we
>> avoid applying locks made by transactions with InvalidXid.
>
>> Simon Riggs, bug report Tom Lane, diagnosis Pavan Deolasee
>
> I see this was only applied to HEAD.  Wouldn't back-patching be in
> order?  The problem is in 9.1 as well, no?

Yes, it is. I prefer to give a little time before backpatching to
avoid mistakes (of my own making), especially since we're busy enough
not to want to divert energy to other releases right now. The patch
will make it in before next minor release.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: pgsql: Resolve timing issue with logging locks for Hot Standby.

From
Robert Haas
Date:
On Mon, Jan 30, 2012 at 4:13 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On Mon, Jan 30, 2012 at 12:30 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Simon Riggs <simon@2ndQuadrant.com> writes:
>>> Resolve timing issue with logging locks for Hot Standby.
>>> We log AccessExclusiveLocks for replay onto standby nodes,
>>> but because of timing issues on ProcArray it is possible to
>>> log a lock that is still held by a just committed transaction
>>> that is very soon to be removed. To avoid any timing issue we
>>> avoid applying locks made by transactions with InvalidXid.
>>
>>> Simon Riggs, bug report Tom Lane, diagnosis Pavan Deolasee
>>
>> I see this was only applied to HEAD.  Wouldn't back-patching be in
>> order?  The problem is in 9.1 as well, no?
>
> Yes, it is. I prefer to give a little time before backpatching to
> avoid mistakes (of my own making), especially since we're busy enough
> not to want to divert energy to other releases right now. The patch
> will make it in before next minor release.

If it's going to go in before the next minor release, there's no real
value in holding off, is there?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: pgsql: Resolve timing issue with logging locks for Hot Standby.

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Mon, Jan 30, 2012 at 4:13 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> On Mon, Jan 30, 2012 at 12:30 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> I see this was only applied to HEAD. �Wouldn't back-patching be in
>>> order? �The problem is in 9.1 as well, no?

>> Yes, it is. I prefer to give a little time before backpatching to
>> avoid mistakes (of my own making), especially since we're busy enough
>> not to want to divert energy to other releases right now. The patch
>> will make it in before next minor release.

> If it's going to go in before the next minor release, there's no real
> value in holding off, is there?

It seems like a strange approach to me as well.  The longer you wait,
the more the patch details will have faded from your mind, increasing
the chance that you'll mess up any necessary cross-branch adjustments.
I also consider the possibility to be non-zero that it won't get done
at all before the next release.

Another aspect of this is that when the same-or-almost-the-same
patch is applied to multiple branches, it's a *lot* easier for future
archeology in the commit logs if the patch goes into all the affected
branches at the same time (and with the same commit message, please).

Lastly, one of the main reasons for wanting this particular bug fixed
(since after all it's just an Assert failure in the end) is to not have
transient buildfarm failures from it.  Those failures are still
happening, in 9.1 if not HEAD, and still distracting attention from
more significant issues.

            regards, tom lane

Re: pgsql: Resolve timing issue with logging locks for Hot Standby.

From
Andrew Dunstan
Date:

On 01/30/2012 10:45 AM, Tom Lane wrote:
> Another aspect of this is that when the same-or-almost-the-same patch
> is applied to multiple branches, it's a *lot* easier for future
> archeology in the commit logs if the patch goes into all the affected
> branches at the same time (and with the same commit message, please).

Yeah, I've been pinged on this before, for exactly this reason. It makes
release note preparation significantly easier, AIUI.

cheers

andrew

Re: pgsql: Resolve timing issue with logging locks for Hot Standby.

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> On 01/30/2012 10:45 AM, Tom Lane wrote:
>> Another aspect of this is that when the same-or-almost-the-same patch
>> is applied to multiple branches, it's a *lot* easier for future
>> archeology in the commit logs if the patch goes into all the affected
>> branches at the same time (and with the same commit message, please).

> Yeah, I've been pinged on this before, for exactly this reason. It makes
> release note preparation significantly easier, AIUI.

Not just release notes, but routine questions like "when was this bug
fixed, and in which branches?".  I keep a regularly-updated copy of
the output of src/tools/git_changelog (before git we used cvs2cl
to get similar output from CVS).  It's a rare day that I don't have
some occasion to consult it.  I don't appreciate committers randomly
cluttering that log by making commits that the tool doesn't know how
to merge.

            regards, tom lane

Re: pgsql: Resolve timing issue with logging locks for Hot Standby.

From
Simon Riggs
Date:
On Mon, Jan 30, 2012 at 4:38 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> On 01/30/2012 10:45 AM, Tom Lane wrote:
>>> Another aspect of this is that when the same-or-almost-the-same patch
>>> is applied to multiple branches, it's a *lot* easier for future
>>> archeology in the commit logs if the patch goes into all the affected
>>> branches at the same time (and with the same commit message, please).
>
>> Yeah, I've been pinged on this before, for exactly this reason. It makes
>> release note preparation significantly easier, AIUI.
>
> Not just release notes, but routine questions like "when was this bug
> fixed, and in which branches?".  I keep a regularly-updated copy of
> the output of src/tools/git_changelog (before git we used cvs2cl
> to get similar output from CVS).  It's a rare day that I don't have
> some occasion to consult it.  I don't appreciate committers randomly
> cluttering that log by making commits that the tool doesn't know how
> to merge.

As you request it, I will endeavour to do this next time.

I think its more likely errors will occur that way, however.
Responsibility for any such errors is still mine so there may be times
where that outweighs other thoughts, but as policy I'm happy to make
apply-all-at-once the default.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services