Re: INSERT...ON DUPLICATE KEY LOCK FOR UPDATE - Mailing list pgsql-hackers

From: Peter Geoghegan
Subject: Re: INSERT...ON DUPLICATE KEY LOCK FOR UPDATE
Msg-id: CAM3SWZRYdbg=SY+jc6V=uEuZp+cBzTkpZtzjGoAyYgqNSJfqQw@mail.gmail.com
In response to: Re: INSERT...ON DUPLICATE KEY LOCK FOR UPDATE (Robert Haas <robertmhaas@gmail.com>)
Responses: Re: INSERT...ON DUPLICATE KEY LOCK FOR UPDATE (Peter Geoghegan <pg@heroku.com>)
List: pgsql-hackers
On Mon, Sep 30, 2013 at 8:32 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>>> I doubt that any change to HeapTupleSatisfiesMVCC() will be
>>> acceptable.  This feature needs to restrain itself to behavior changes
>>> that only affect users of this feature, I think.
>>
>> I agree with the principle of what you're saying, but I'm not aware
>> that those changes to HeapTupleSatisfiesMVCC() imply any behavioral
>> changes for those not using the feature.

> Well, at a minimum, it's a performance worry.  Those functions are
> *hot*.  Single branches do matter there.

Well, that certainly is a reasonable concern. Offhand, I suspect that
branch prediction helps immensely. But even if it doesn't, couldn't it
be the case that returning earlier there actually helps? Where we have
a real xid (so TransactionIdIsCurrentTransactionId() must do more than
a single test of a scalar variable), and the row is *only* locked
(which is already very cheap to check - it's another scalar variable
that we already test in a few other places in that function), isn't
there on average a high chance that the tuple ought to be visible to
our snapshot anyway?

>> I am starting to wonder if it's really necessary to have a "blessed"
>> update that can see the locked, not-otherwise-visible tuple.

> If we're not going to just error out over the invisible tuple the user
> needs some way to interact with it.  The details are negotiable.

I think that we will error out over an invisible tuple at higher
isolation levels. Certainly, what we do there today instead of
EvalPlanQual() looping is consistent with that behavior.

>> Couldn't EvalPlanQual() be said to be an MVCC violation on similar
>> grounds? It also "reaches into the future". Locking a row isn't really
>> that distinct from updating it in terms of the code footprint, but
>> also from a logical perspective.
>
> Yes, EvalPlanQual() is definitely an MVCC violation.

So I think that you can at least see why I'd consider the two (my
tweaks to HeapTupleSatisfiesMVCC() and EvalPlanQual()) to be isomorphic.
It just becomes the job of this new locking infrastructure to worry
about the would-be invisibility of the locked tuple, and raise a
serialization error accordingly at higher isolation levels.

>> The important observation here is that an updater, in effect, locks
>> both the old and new sets of values (for non-HOT updates). And as I've
>> already noted, any practical "value locking" implementation isn't
>> going to be able to prevent the update from immediately locking the
>> old, because that doesn't touch an index. Hence, there is an
>> irresolvable disconnect between value and row locking.
>
> This part I don't follow.  "locking the old"?  What irresolvable
> disconnect?  I mean, they're different things; I get *that*.

Well, if you update a row, the old row version's values are locked, in
the sense that any upserter interested in inserting the same values as
the old version is going to have to block pending the outcome of the
updating xact.
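
To make that concrete, here's a sketch (my illustration only, with a
hypothetical table t that has a unique index on (a, b)); a plain
INSERT shows the blocking behavior that an upserter would inherit:

-- Session 1: the update moves the row's unique key from (1, 1) to
-- (2, 2), but the old values stay "claimed" until this xact resolves.
BEGIN;
UPDATE t SET a = 2, b = 2 WHERE a = 1 AND b = 1;

-- Session 2, concurrently: inserting the old values blocks on
-- session 1, because (1, 1) may or may not end up free again
-- depending on whether session 1 commits or aborts.
INSERT INTO t (a, b) VALUES (1, 1);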

The disconnect is that any attempt at a clever dance to interleave
value and row locking, such that this definitely just works the first
time, seems totally futile - I'm emphasizing this because it's the
obvious way to approach this basic problem. It turns out that it
could only be done at great expense, in a way that would immediately
be dismissed as totally outlandish.

> OK, I take your point, I think.  The existing system already acquires
> value locks when a tuple lock is held, during an UPDATE, and we can't
> change that.

Right.

>> I think that re-ordering is an important property of any design where
>> we cannot bail out with serialization failures.

> I worry about the behavior being confusing and hard to understand in
> the presence of multiple unique indexes and reordering.  Perhaps I
> simply don't understand the problem domain well-enough yet.

It's only confusing if you are worried about what concurrent sessions
do with respect to each other at this low level. In that case, just
use a higher isolation level and pay the price. Doing this doesn't
introduce any additional anomalies described and prohibited by the
standard, and indeed the order of retrying in the event of a conflict
is totally undefined today, so this line of thinking is not
inconsistent with how things already work. Strictly speaking, some
unique constraint violations today might be more appropriate as
serialization failures. So with this new functionality, when it's
used, they're going to be actual serialization failures where that's
appropriate, and elsewhere we'll go and do something other than
error. Why burden read committed like that? (Actually, fwiw I suspect
that currently the SSI guarantees *can* be violated by unique retry
re-ordering, but that's a whole other story, and it's pretty subtle.)

Let me come right out and say it: yes, part of the reason that I'm
taking this line is that it's convenient for my implementation from a
number of different perspectives. But one of those perspectives is
that it will help performance in the face of contention immensely,
without violating any actual precept held today (by us, by the
standard, or by anyone else AFAIK), and besides, my basic design is
informed by sincerely-held beliefs about what will actually work
within the constraints presented.

> From a user perspective, I would really think people would want to
> specify a set of key columns and then update if a match is found on
> those key columns. Suppose there's a unique index on (a, b) and
> another on (c), and the user passes in (a,b,c)=(1,1,1).  It's hard for
> me to imagine that the user will be happy to update either (1,1,2) or
> (2,2,1), whichever exists.  In what situation would that be the
> desired behavior?

You're right - that isn't desirable. The reason that we go to all
this trouble with locking multiple values concurrently boils down to
sparing the user from having to specify a constraint name - it's
usually really obvious *to users* who understand their schema, so why
bother them with that esoteric detail? The user *is* more or less
required to have a particular constraint in mind when writing their
DML (for upsert). It could be that that constraint has a 1:1
correlation with another constraint in practice, which would also
work out fine - they'd specify one or the other constrained column
(maybe both) in the subsequent update's predicate. But generally,
yes, they're out of luck here, until we get around to implementing
MERGE in its full generality, towards which I think what I've
proposed is a logical stepping stone (again, because it involves
locking values across unique indexes simultaneously).
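
To spell Robert's scenario out concretely (my illustration, with a
hypothetical table t):

CREATE TABLE t (
    a int,
    b int,
    c int,
    UNIQUE (a, b),
    UNIQUE (c)
);

-- An upsert of (a, b, c) = (1, 1, 1) could find a duplicate on (a, b)
-- if (1, 1, 2) exists, or on (c) if (2, 2, 1) exists. Which row ends
-- up locked depends on which constraint actually conflicted, which is
-- why the DML's author needs a particular constraint in mind.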

Now, at least what I've proposed has the advantage of allowing the
user to add some controls in their update's predicate. So if they only
had updating (1,1,2) in mind, they could put WHERE a = 1 AND b = 1 in
there too (I'm imagining the wCTE pattern is used). They'd then be
able to inspect the value of the FOUND pseudo-variable or whatever.
Now, I'm assuming that we'd somehow be able to tell that the insert
hasn't succeeded (i.e. it locked), and maybe that doesn't accord very
well with these kinds of facilities as they exist today, but it
doesn't seem like too much extra work (MySQL would consider that both
the locked and updated rows were affected, which might help us here).
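
Roughly what I'm imagining, reusing the hypothetical table t from
above (a sketch only - the exact spelling of the syntax and of the
rejects projection is still up in the air):

WITH rej AS (
    INSERT INTO t (a, b, c) VALUES (1, 1, 1)
    ON DUPLICATE KEY LOCK FOR UPDATE
    RETURNING REJECTS *
)
UPDATE t SET c = rej.c
FROM rej
WHERE t.a = 1 AND t.b = 1;
-- The extra qual means only the (a, b) = (1, 1) row can ever be the
-- target; if the conflict was actually on c, nothing gets updated.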

MySQL's INSERT...ON DUPLICATE KEY UPDATE has nothing like this - there
is no indication of why you ended up updating, and you cannot have a
separate update qual. Users had better just get it right!
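
For comparison, the MySQL idiom looks like this (hypothetical table t
again):

INSERT INTO t (a, b, c) VALUES (1, 1, 1)
ON DUPLICATE KEY UPDATE c = VALUES(c);
-- Whichever existing row conflicted - on whichever unique key - is
-- the one that gets updated; there is no separate update qual.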

Maybe what's really needed here is INSERT...ON DUPLICATE KEY LOCK FOR
UPDATE RETURNING LOCKED... . You can see what was actually locked, and
act on *that* as appropriate. Though you don't get to see the actual
values of default expressions and so on, which is a notable
disadvantage relative to RETURNING REJECTS... .
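
Something like this, as purely hypothetical syntax:

WITH locked AS (
    INSERT INTO t (a, b, c) VALUES (1, 1, 1)
    ON DUPLICATE KEY LOCK FOR UPDATE
    RETURNING LOCKED *
)
SELECT * FROM locked;
-- The application gets to see which existing row was locked, and on
-- what values, before deciding whether to update it or throw an error.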

The advantage of RETURNING LOCKED would be that you could check
whether it LOCKED for the reason you thought it should have. If it
didn't, then surely what you'd prefer would be a unique constraint
violation, so you can just go and throw an error in application code
(or maybe consider another value for the columns that surprised you).

What do others think?

> Also, under such a programming model, if somebody drops a unique index
> or adds a new one, the behavior of someone's application can
> completely change.  I have a hard time swallowing that.  It's an
> established precedent that dropping a unique index can make some other
> operation fail (e.g. ADD FOREIGN KEY, and more recently CREATE VIEW ..
> GROUP BY), and of course it can cause performance or plan changes.
> But overturning the semantics is, I think, something new, and it
> doesn't feel like a good direction.

In what sense is causing, or preventing, an error (the current state
of affairs) not a behavioral change? I'd have thought it a very
significant one. If what you're saying here is true, wouldn't that
mandate that we specify the name of a unique index inline, within DML?
I thought we were in agreement that that wasn't desirable.

If you think it's a bit odd that we lock every value while the user
essentially has one constraint in mind when writing their DML,
consider:

1) We need this for MERGE anyway.

2) Don't underestimate the intellectual overhead for developers and
operations personnel of adding an application-defined significance to
unique indexes that they don't otherwise have. It sure would suck if a
refactoring effort to normalize unique index naming style had the
effect of breaking a slew of application code. Certainly, everyone
else seems to have reached the same conclusion in their own
implementation of upsert, because they don't require that a unique
index be specified, even when that could have unexpected results.

3) The problems presented by getting the details wrong can be
ameliorated by developers who feel they might be an issue for them,
as already described. I think in the vast majority of cases it just
obviously won't be a problem to begin with.

-- 
Peter Geoghegan


