From: Merlin Moncure
Subject: Re: BUG #12330: ACID is broken for unique constraints
Date:
Msg-id: CAHyXU0zr42rgGuqB6=5fW6TEw8183s89U-2epXJ1Me8WDW=RRw@mail.gmail.com
In response to: Re: BUG #12330: ACID is broken for unique constraints (Kevin Grittner <kgrittn@ymail.com>)
Responses: Re: BUG #12330: ACID is broken for unique constraints (Kevin Grittner <kgrittn@ymail.com>)
List: pgsql-hackers
On Mon, Dec 29, 2014 at 8:03 AM, Kevin Grittner <kgrittn@ymail.com> wrote:
> Merlin Moncure <mmoncure@gmail.com> wrote:
>> On Fri, Dec 26, 2014 at 12:38 PM, Kevin Grittner <kgrittn@ymail.com>
>> wrote:
>>> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>>
>>>> Just for starters, a 40XXX error report will fail to provide the
>>>> duplicated key's value.  This will be a functional regression,
>>>
>>> Not if, as is normally the case, the transaction is retried from
>>> the beginning on a serialization failure.  Either the code will
>>> check for a duplicate (as in the case of the OP on this thread) and
>>> they won't see the error, *or* the transaction which created
>>> the duplicate key will have committed before the start of the retry
>>> and you will get the duplicate key error.
>>
>> I'm not buying that; that argument assumes duplicate key errors are
>> always 'upsert' driven.  Although the OP's code may have checked for
>> duplicates, it's perfectly reasonable (and in many cases preferable) to
>> force the transaction to fail and report the error directly back to
>> the application.  The application will then switch on the error code
>> and decide what to do: retry for deadlock/serialization or abort for
>> data integrity error.  IOW, the error handling semantics are
>> fundamentally different and should not be mixed.
>
> I think you might be agreeing with me without realizing it.  Right
> now you get "duplicate key error" even if the duplication is caused
> by a concurrent transaction -- it is not possible to check the
> error code (well, SQLSTATE, technically) to determine whether this
> is fundamentally a serialization problem.  What we're talking about
> is returning the serialization failure return code for the cases
> where it is a concurrent transaction causing the failure and
> continuing to return the duplicate key error for all other cases.
>
> Either I'm not understanding what you wrote above, or you seem to
> be arguing for being able to distinguish between errors caused by
> concurrent transactions and those which aren't.

Well, I'm arguing that duplicate key errors are not serialization
failures unless it's likely the insertion would succeed upon retry;
that is, for a proper insert, not an upsert.  If that's the case with
what you're proposing, then it makes sense to me.  But that's not what
it sounds like...your language suggests, AIUI, that the error merely
being caused by another, concurrent transaction would be sufficient to
switch to a serialization error (feel free to correct me if I'm
wrong!).

In other words, the current behavior is:
txn A and B begin
txn A inserts
txn B inserts the same key, blocks on A's lock, and waits
txn A commits; B aborts with a duplicate key error

Assuming that case is untouched, then we're good!  My long-winded
point above is that this case must fail with a duplicate key error; a
serialization error would suggest the transaction should be retried,
and it shouldn't be...it would simply fail a second time.  (The
sketch below walks through this interleaving.)
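
To make the interleaving concrete, here is a sketch in the same vein
as the one above, again assuming psycopg2; the DSN and table
t(id int primary key) are placeholders.  Txn B blocks on txn A's
uncommitted row and, once A commits, fails with SQLSTATE 23505 rather
than a serialization failure:

    # Two concurrent transactions inserting the same key; assumes a
    # reachable database (placeholder DSN) and table t(id int primary key).
    import threading
    import psycopg2

    dsn = "dbname=test"  # placeholder connection string
    conn_a = psycopg2.connect(dsn)
    conn_b = psycopg2.connect(dsn)

    with conn_a.cursor() as cur:
        cur.execute("INSERT INTO t (id) VALUES (1)")  # txn A inserts, holds lock

    def txn_b():
        try:
            with conn_b.cursor() as cur:
                # Blocks on A's uncommitted row until A commits or aborts.
                cur.execute("INSERT INTO t (id) VALUES (1)")
            conn_b.commit()
        except psycopg2.Error as e:
            print(e.pgcode)  # 23505 today; a retry could not succeed
            conn_b.rollback()

    t = threading.Thread(target=txn_b)
    t.start()
    conn_a.commit()  # releases B's lock wait; B then fails
    t.join()
    conn_a.close()
    conn_b.close()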

merlin


