Re: SERIALIZABLE and INSERTs with multiple VALUES - Mailing list pgsql-general

From Kevin Grittner
Subject Re: SERIALIZABLE and INSERTs with multiple VALUES
Date
Msg-id CACjxUsP2vYcqyfNvvTgsKZjBmAX5Xx3r20KR-pB-2cgE2KZYbw@mail.gmail.com
In response to Re: SERIALIZABLE and INSERTs with multiple VALUES  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: SERIALIZABLE and INSERTs with multiple VALUES
List pgsql-general
On Thu, Oct 13, 2016 at 2:16 PM, Peter Geoghegan <pg@bowt.ie> wrote:
> On Thu, Oct 13, 2016 at 6:19 AM, Kevin Grittner <kgrittn@gmail.com> wrote:

>> Every situation that generates a false positive hurts performance;
>> we went to great lengths to minimize those cases.

>> To generate a
>> serialization failure on a single transaction has to be considered
>> a bug, because a retry *CAN NOT SUCCEED*!  This is likely to break
>> many frameworks designed to work with serializable transactions.
>
> It sounds like you're talking about the original complaint about a
> multi-value INSERT. It took me a minute to decide that that's probably
> what you meant, because everyone already agrees that that isn't okay
> -- you don't need to convince me.

That second part, yeah -- that's about generating a serialization
failure with one transaction.  It's pretty bad if you can't get a
set that contains one transaction to behave as though the
transactions in that set were run one at a time.  ;-)
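The reasoning can be sketched with a toy model (not PostgreSQL's actual C code; names here are illustrative). Under SSI, a transaction is aborted, in simplified terms, only when it sits at the pivot of a dangerous structure: it has both an incoming and an outgoing rw-antidependency edge. A set containing a single transaction has no edges at all, so aborting it is always a false positive, and a retry cannot change that:

```python
def has_dangerous_structure(rw_edges, txn):
    """Simplified SSI pivot test.

    rw_edges: set of (reader, writer) rw-antidependency pairs, where
    the reader read data the writer concurrently modified.
    """
    has_out = any(r == txn for r, w in rw_edges)  # txn read someone's write set
    has_in = any(w == txn for r, w in rw_edges)   # someone read txn's write set
    return has_out and has_in

# Two concurrent transactions can form edges between each other...
edges = {("T1", "T2"), ("T2", "T1")}
print(has_dangerous_structure(edges, "T1"))  # -> True: T1 is a pivot

# ...but a lone transaction has no edges, so it can never qualify.
print(has_dangerous_structure(set(), "T1"))  # -> False
```

(The real implementation adds further conditions before aborting, but the point stands: with one transaction there is nothing to conflict with, so a serialization failure there is a bug.)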

> We must still determine if a fix along the lines of the one proposed
> by Thomas is basically acceptable (that is, that it does not clearly
> break any documented guarantees, even if it is overly strict).
> Separately, I'd be interested in seeing how specifically we could do
> better with the patch that you have in the works for this.

Basically, rather than just failing, I think we should call
CheckForSerializableConflictOut() (which determines whether the
tuple we are reading causes a rw-conflict between our current
transaction and the transaction which last wrote that tuple) and
PredicateLockTuple() (which tells later updates or deletes that
we've read the tuple).
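In toy form (again Python pseudocode, not the C internals; the dict shapes and names are assumptions for illustration), the proposed read path records the conflict and the read instead of erroring out:

```python
def read_conflicting_tuple(txn, tup, rw_edges, predicate_locks):
    """Treat the conflicting tuple as a read, not as a failure.

    Analogue of CheckForSerializableConflictOut: note a rw-conflict
    between our transaction and the tuple's last writer.  Analogue of
    PredicateLockTuple: record the read so that later updates or
    deletes of this tuple see it and flag the reverse conflict.
    """
    writer = tup["xmin"]  # transaction that last wrote the tuple
    if writer != txn:
        rw_edges.add((txn, writer))           # conflict out: we read their write
    predicate_locks.setdefault(tup["id"], set()).add(txn)  # remember our read

edges, locks = set(), {}
read_conflicting_tuple("T1", {"id": 42, "xmin": "T2"}, edges, locks)
print(edges)  # -> {('T1', 'T2')}
print(locks)  # -> {42: {'T1'}}
```

Only if those edges later complete a dangerous structure does the transaction need to fail, rather than failing unconditionally at the point of the read.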

> In general, I see value in reducing false positives, but I don't
> understand why your concern here isn't just about preferring to keep
> them to a minimum (doing our best).

That's exactly what I want to do, rather than what is the easiest
and first thing to come to mind.

> In other words, I don't understand
> why these false positives are special, and I'm still not even clear on
> whether you are actually arguing that they are special. (Except, of
> course, the multi-value case -- that's clearly not okay.)
>
> So, with the fix proposed by Thomas applied, will there be any
> remaining false positives that are qualitatively different to existing
> false positive cases? And, if so, how?

The INSERT ... ON CONFLICT DO NOTHING case does not write the
tuple, so this would be the first place we would be generating a
"write conflict" when we're not writing a tuple.  (You might argue
that "behind the scenes we write a tuple that disappears
automagically", but that's an implementation detail that might
someday change and should not be something users need to think
about a lot.)  We put a lot of effort into minimizing false
positives everywhere we could, and I'm not sure why you seem to be
arguing that we should not do so here.  If it proves impractical to
"do it right", we would not destroy logical correctness by using
the patch Thomas proposed, but we would increase the number of
transaction rollbacks and retries, which has a performance hit.

BTW, feel free to post a fix for the locking issue separately
when/if you have one.  I'm not looking at that for the moment,
since it sounded like you had already looked at it and were working
on something.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

