Re: change in LOCK behavior - Mailing list pgsql-hackers

From Tom Lane
Subject Re: change in LOCK behavior
Date
Msg-id 14951.1349916214@sss.pgh.pa.us
Whole thread Raw
In response to Re: change in LOCK behavior  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: change in LOCK behavior  (Bruce Momjian <bruce@momjian.us>)
Re: change in LOCK behavior  (Simon Riggs <simon@2ndQuadrant.com>)
List pgsql-hackers
Robert Haas <robertmhaas@gmail.com> writes:
> On Wed, Oct 10, 2012 at 7:10 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Yeah, I think that last is the key point: this patch was sold on the
>> grounds that it wouldn't cause any interesting user-visible behavioral
>> change, but your example blows that claim into tiny little pieces.
>> 
>> I'm inclined to think we need to revert this.  The performance gain is
>> not worth the prospect of breaking a lot of applications that used to
>> work reliably.  Robert?

> Yeah, I have to admit that I didn't consider the possibility that a
> lock wait might intervene between the first and second snapshots.  But
> something about this isn't making sense to me.

It's not hard to see what's happening.  The old code was:
* take parse/plan snapshot* parse & plan query (includes taking AccessShare lock on tables)* take execution snapshot*
runquery
 

Taking the AccessShare lock blocks until the transaction with exclusive
lock commits; therefore, the execution snapshot will see that
transaction's effects.

In the new code, we re-use the parse/plan snapshot for execution, so it
predates the other transaction's commit, so we don't see its effects.

> The whole idea of that
> patch is that we reuse the parse/plan snapshot at execution time.  If
> it's wrong to use a snapshot that might be stale at execution time,
> then it's equally wrong to use it for parse/plan.

Nope, because the parse/plan snapshot is only relevant to planner
estimation purposes.  It doesn't have to be exactly right.  (What does
have to be exactly right is our idea of the table structure, but that's
okay because any catalog consultation we do is with SnapshotNow, and
hence will see any DDL the other transaction might have done.)

> But that sounds more like a happy accident than anything else.

[ shrug... ]  Maybe it's a happy accident and maybe somebody designed it
with malice aforethought many years ago.  But the point is that this
type of thing worked, with 100% reliability, before 9.2.  Now it does
not.  I don't think we can accept that.

> It seems to me that the
> root of the issue here is that people is not that people expect two
> snapshots -- indeed, a number of people strongly supported getting rid
> of that behavior at the time -- but rather that they expect the
> snapshot to be taken after locks are acquired.

Sure.  Maybe we can rejigger things in a way that does that, although
I think the stumbling block is going to be parse-time calls to
user-defined I/O functions for constants --- which might need a
snapshot.  It might be possible to redesign things so that all tables
are locked before we do anything that requires a non-SnapshotNow
snapshot, and then take a single "planning/execution" snapshot.  But
that is not this patch, and would be a lot more invasive than this
patch, and would certainly not be back-patchable to 9.2.

I think we have to revert and go back to the drawing board on this.
        regards, tom lane



pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: change in LOCK behavior
Next
From: Robert Haas
Date:
Subject: Re: [PATCH 8/8] Introduce wal decoding via catalog timetravel