Re: Serialization errors despite KEY SHARE/NO KEY UPDATE - Mailing list pgsql-general

From Olivier Dony
Subject Re: Serialization errors despite KEY SHARE/NO KEY UPDATE
Date
Msg-id 5614026C.10404@odoo.com
Whole thread Raw
In response to Re: Serialization errors despite KEY SHARE/NO KEY UPDATE  (Kevin Grittner <kgrittn@ymail.com>)
Responses Re: Serialization errors despite KEY SHARE/NO KEY UPDATE  (Jim Nasby <Jim.Nasby@BlueTreble.com>)
List pgsql-general
On 10/05/2015 11:17 PM, Kevin Grittner wrote:
> Jim Nasby <Jim.Nasby@BlueTreble.com> wrote:
>> On 10/2/15 11:44 AM, Olivier Dony wrote:
>>> On 10/02/2015 12:28 AM, Jim Nasby wrote:
>>>> On 9/29/15 9:47 AM, Olivier Dony wrote:
>
>>>>> -- Setup tables
>>>>> CREATE TABLE users ( id serial PRIMARY KEY,
>>>>>                       name varchar,
>>>>>                       date timestamp );
>>>>> CREATE TABLE orders ( id serial PRIMARY KEY,
>>>>>                        name varchar,
>>>>>                        user_id int REFERENCES users (id) );
>>>>> INSERT INTO users (id, name) VALUES (1, 'foo');
>>>>> INSERT INTO orders (id, name) VALUES (1, 'order 1');
>>>>>
>>>>>
>>>>> -- Run 2 concurrent transactions: T1 and T2
>>>>>                               T1                               T2
>>>>> |-----------------------------|----------------------------------|
>>>>>         BEGIN ISOLATION LEVEL
>>>>>                    REPEATABLE READ;
>>>>>
>>>>>    UPDATE orders
>>>>>         SET name = 'order of foo',
>>>>>                 user_id = 1
>>>>>         WHERE id = 1;
>>>>>
>>>>>                                  BEGIN ISOLATION LEVEL
>>>>>                                        REPEATABLE READ;
>>>>>
>>>>>                                  UPDATE users
>>>>>                                  SET date = now()
>>>>>                                  WHERE id = 1;
>>>>>
>>>>>                                  COMMIT;
>>>>>
>>>>>    UPDATE orders
>>>>>    SET name = 'order of foo (2)',
>>>>>        user_id = 1
>>>>>    WHERE id = 1;
>>>>>
>>>>> T1 fails with:
>>>>> ERROR:  could not serialize access due to concurrent update
>>>>> CONTEXT:  SQL statement "SELECT 1 FROM ONLY "public"."users" x WHERE "id" OPERATOR(pg_catalog.=) $1 FOR KEY SHARE
OFx" 
>>>>
>>>> This isn't a locking failure, it's a serialization failure. I'm not sure
>>>> why it's happening though... is there an index on date?
>>>
>>> I don't think so. I can reproduce the problem with the queries quoted
>>> above, and the only index that seems to be present is the PK
>
>> I'm not sure. Perhaps Kevin Grittner (author of serializable patch) can
>> shed some light.
>
> This has nothing to do with the serializable patch[1], but it is
> related to the new multixact code in 9.3.  I thought I remembered
> that this was a regression from 9.2 behavior which was fixed in a
> patch after being reported, and sure enough tests of the tips of
> branches showed that prior to 9.3 T2 blocked until T1 committed,
> rather than generating a serialization failure, while on 9.5 and
> master there was no blocking or error.
>
> Using `git bisect` to see where this was fixed shows:
>
> commit 05315498012530d44cd89a209242a243374e274d
> Author: Alvaro Herrera <alvherre@alvh.no-ip.org>
> Date:   Tue Jul 29 15:41:06 2014 -0400
>
>      Avoid uselessly looking up old LOCK_ONLY multixacts
>
>      Commit 0ac5ad5134f2 removed an optimization in multixact.c that skipped
>      fetching members of MultiXactId that were older than our
>      OldestVisibleMXactId value.  The reason this was removed is that it is
>      possible for multixacts that contain updates to be older than that
>      value.  However, if the caller is certain that the multi does not
>      contain an update (because the infomask bits say so), it can pass this
>      info down to GetMultiXactIdMembers, enabling it to use the old
>      optimization.
>
>      Pointed out by Andres Freund in 20131121200517.GM7240@alap2.anarazel.de

That's exactly it, thanks a lot for tracking down the root cause!
I understand (a little bit) better now.


> I don't know whether, after being in 9.5 for over a year it could
> be considered for a back-patch to 9.3 and 9.4; I leave that to
> Álvaro and others who have been repairing the multixact problems to
> determine.

 From what I understand of the original discussion between Andres, Álvaro and
Tom, a back-patch seemed desirable because this issue has performance
consequences. I have to add that in our case it also causes spurious
serialization failures that were not present in 9.2 and aren't either in 9.5.
The backwards-compatibility problem of the patch might however apply to both
9.3 and 9.4 now...

For what it's worth, our system uses many long-running transactions (background
jobs, batch data imports, etc.) that are frequently interrupted and rolled back
by micro-transactions coming from users who just happen to update minor data on
their records (such as their last login date).

Having to replay those long transactions is particularly expensive and they
stand a good chance of being interrupted again during the replay, etc.

We would happily skip the micro-transactions (as a perf workaround) if there
was a way to detect this situation, but we couldn't find a way to do that in
9.3. <9.3 we used SELECT FOR UPDATE NOWAIT to guard similar cases.

If there is any way I could help to make the back-patch happen, please let me know!


--
Olivier



>
> [1]  The serializable patch has no effect if there are not
> serializable transactions in use, and it doesn't generate errors
> "due to concurrent update" in any new situations.
>


pgsql-general by date:

Previous
From: John McKown
Date:
Subject: Re: Processing data from table using awk.
Next
From: John R Pierce
Date:
Subject: Re: Recording exceptions within function (autonomous transactions?)