Thread: regression, deadlock in high frequency single-row UPDATE

regression, deadlock in high frequency single-row UPDATE

From
Andrew Sackville-West
Date:
Hi,

I've discussed this problem on irc a couple of times and think I've
found a regression that plagues our application, introduced in some
version newer than 9.1.9, and still present in 9.3.5.

Multiple instances of the *exact* same single row update:

UPDATE
       "z8z6px927zu6qzzbnb5ntgghxg"."access_grants" ag
SET
        last_issued=DEFAULT
FROM
        "z8z6px927zu6qzzbnb5ntgghxg"."oauth_clients" oc
WHERE
        oc.id = ag.client_id
        AND ag.entity_name = 'user'
        AND ag.entity_id = 129
        AND oc.client_id = '3hp45h9d4f9wwtx7cvpus6rdb4s5kb9f'
RETURNING
        ag.id
;

if performed with sufficient concurrency will produce a deadlock.

I've attached a snippet of logs, all uncommented lines from
postgresql.conf, and the table definitions of the involved
tables. This first instance of the deadlock appears in line 142; I
left a bunch of lead-in in case it's relevant.

I think this is a regression as we only see the behavior under
postgres 9.3.x (reproduced locally on 9.3.4 and 9.3.5 in a VMWare VM
running Ubuntu 11.04, but also evident in 9.3.3 on Amazon RDS). I am
unable to reproduce in the earlier versions I've been able to test against
(9.0.something and 9.1.9).

To reproduce the problem, I have to fork about 100 api calls against
our application, with the results you see in the logs. The queries in
the log are the *only* activity in the application and database at the
time and I have not filtered the logs at all, other than snipping to a
reasonable window around the problem. I can produce more extensive
logs if needed, but there really is nothing else.

I have not been able to reproduce the deadlock by making concurrent
UPDATEs via what amounts to a bash fork-bomb w/ psql, but I suspect my
methodology might be too crude there. The application is able to spin
up a large number of lightweight threads fairly quickly and presumably
attain the needed level of concurrency.

Thanks,

A

Attachment

Re: regression, deadlock in high frequency single-row UPDATE

From
Marti Raudsepp
Date:
On Fri, Aug 1, 2014 at 2:30 AM, Andrew Sackville-West <awest@janrain.com> wrote:
> I have not been able to reproduce the deadlock by making concurrent
> UPDATEs via what amounts to a bash fork-bomb w/ psql, but I suspect my
> methodology might be too crude there.

Another way is using pgbench. Write a script file with the SQL query
in it (must be on a single line). Then launch it as:
    pgbench -f scriptfile.sql -n -c 10 dbname

Regards,
Marti

Re: regression, deadlock in high frequency single-row UPDATE

From
Andrew Sackville-West
Date:
On Fri, Aug 01, 2014 at 06:40:51PM +0300, Marti Raudsepp wrote:
> On Fri, Aug 1, 2014 at 2:30 AM, Andrew Sackville-West <awest@janrain.com> wrote:
> > I have not been able to reproduce the deadlock by making concurrent
> > UPDATEs via what amounts to a bash fork-bomb w/ psql, but I suspect my
> > methodology might be too crude there.
>
> Another way is using pgbench. Write a script file with the SQL query
> in it (must be on a single line). Then launch it as:
>     pgbench -f scriptfile.sql -n -c 10 dbname
>

a perfect, thanks I'll see if I can reproduce this way and follow up.

A


--

Re: regression, deadlock in high frequency single-row UPDATE

From
Alvaro Herrera
Date:
Andrew Sackville-West wrote:
> Hi,
>
> I've discussed this problem on irc a couple of times and think I've
> found a regression that plagues our application, introduced in some
> version newer than 9.1.9, and still present in 9.3.5.

I don't think you have provided everything: for instance I see that your
log has INSERTs into table access_tokens, which has a foreign key
relationship to/from some other table you showed.

I'm too lazy to unscramble the whole thing into a real schema right now.
pg_dump output would be more helpful, as would some trivial sample data
enough to reproduce the problem.

It'd probably be easier to reproduce if you wrap the statements in
transactions and have them sleep before COMMIT.  (If you run two of
these updates in parallel, the first one blocks the second until the
first one commits, because they update the same row.  What happens if
the blocking session runs the aforementioned INSERT while the other is
blocked?  Perhaps three sessions are needed to show a problem without
explicit transactions.)

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Re: regression, deadlock in high frequency single-row UPDATE

From
Andrew Sackville-West
Date:
On Thu, Jul 31, 2014 at 08:59:50PM -0400, Alvaro Herrera wrote:
> Andrew Sackville-West wrote:
> > Hi,
> >
> > I've discussed this problem on irc a couple of times and think I've
> > found a regression that plagues our application, introduced in some
> > version newer than 9.1.9, and still present in 9.3.5.
>
> I don't think you have provided everything: for instance I see that your
> log has INSERTs into table access_tokens, which has a foreign key
> relationship to/from some other table you showed.

Ah you're correct, I'll attach the description of that table. But,
really, that is all that's going on in the database.

>
> I'm too lazy to unscramble the whole thing into a real schema right now.
> pg_dump output would be more helpful, as would some trivial sample data
> enough to reproduce the problem.

It's really minimal data. I'll attach a dump of the access_grants and
oauth_clients table. The access_tokens table is not empty, but I'm not
sure how relevant the data is. What is shown in the log is
typical. There is another table in play here, refresh_tokens, but
we're not actually using them in this case -- the table is empty, and
the reference to it in access_tokens in null.


> It'd probably be easier to reproduce if you wrap the statements in
> transactions and have them sleep before COMMIT.

The issue here is, with no code changes, this works fine under very
heavy load in production against 9.0.x, 9.1.9, etc. But under
9.3.{3,4,5} it fails. That code has no transaction. I think adding a
transaction is a distraction from what appears to me to be a
regression.

Interestingly, if I change the code to include a transaction around
the UPDATE, the problem goes away (or so it seems so far in my testing).

> (If you run two of
> these updates in parallel, the first one blocks the second until the
> first one commits, because they update the same row.  What happens if
> the blocking session runs the aforementioned INSERT while the other is
> blocked?  Perhaps three sessions are needed to show a problem without
> explicit transactions.)

I've run pg_bench from 3 machines all pointing at the same database
running this same set of statements and it fails to induce the
deadlock.

It occurs to me that I haven't completely described the sequence of
actions, so here is a snippet of the actual SQL that a given action
induces, when it works as expected:

SET standard_conforming_strings TO on;

SET datestyle TO ISO;

SET client_encoding TO utf8;

SELECT "user_1"."id", "user_1"."uuid", "user_1"."lastUpdated" AT TIME
ZONE 'UTC', "user_1"."id" as _id
FROM "z8z6px927zu6qzzbnb5ntgghxg"."user_1" AS "user_1"  WHERE "user_1"."id" = 129;

UPDATE "z8z6px927zu6qzzbnb5ntgghxg"."access_grants" ag
SET last_issued=DEFAULT
FROM "z8z6px927zu6qzzbnb5ntgghxg"."oauth_clients" oc
WHERE oc.id = ag.client_id AND ag.entity_name = 'user'
  AND ag.entity_id = 129 AND oc.client_id = '3hp45h9d4f9wwtx7cvpus6rdb4s5kb9f'
RETURNING ag.id;

INSERT INTO "z8z6px927zu6qzzbnb5ntgghxg"."access_tokens"
  (token, access_id, refresh_token_id, issued, expiry)
VALUES
  ('xys7vzs94huex2y7', 2, null, timezone('UTC'::text, now()), timezone('UTC'::text, now()) + '1:00:00':: interval);

This represents one complete action. There are several of these
running concurrently, interleaved together. There is no transaction
involved here. As far as I can tell, the application is using
bog-standard libpq (via FFI wrapper in Haskell) with no non-default
configuration applied to the connection.

I realize there's not a lot to go on here, and I don't expect help
debugging my application. I've got a patch that seems to resolve the
problem for us. But I'm suspicious of this change in behavior and hope
that pointing it out can help in some way.

regards,

A



>
> --
> Álvaro Herrera                http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services

--

Attachment

Re: regression, deadlock in high frequency single-row UPDATE

From
Alvaro Herrera
Date:
Andrew Sackville-West wrote:

> > I don't think you have provided everything: for instance I see that your
> > log has INSERTs into table access_tokens, which has a foreign key
> > relationship to/from some other table you showed.
>
> Ah you're correct, I'll attach the description of that table. But,
> really, that is all that's going on in the database.

See, access_tokens has a foreign key to access_grants.  So when the
insert to access_tokens happens, the process also acquires a lock on a
tuple on access_grants, which should not conflict with the UPDATE; but
apparently it is blocking in some cases, for unknown reasons.  But note
that one of the original tables had a large number of foreign keys to
other tables; are those necessary to reproduce the problem?  I mean, if
you just copy those three tables to an otherwise empty database and then
run the hanging application, do you see a problem?  If not, does adding
all those other tables cause the problem to show up?

Note that the set of columns covered by unique indexes is very
important.  Your \d does not have an unique index on the updated column
AFAICS; but if it did have one, things would be completely different
(because then the insert would definitely conflict with the update).

It would help a lot of you posted the table definitions as pg_dump -t
output rather than \d, because it's much easier for me to reproduce the
setup, rather than cutting and pasting the \d output into CREATE TABLE
commands.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Re: regression, deadlock in high frequency single-row UPDATE

From
Andrew Sackville-West
Date:
Some months later (stupid business priorities...) we have something to
report on this. We have been able to produce a minimal schema to
demonstrate the problem, and successfully induced the deadlock using
pgbench (thanks to Paulo Tanimoto, cc-ed here).

Please see:

https://gist.github.com/andrewsw-janrain/40d1687db013b1e7c3b3

for detailed instructions on how to trigger the deadlock.

I would be thrilled to learn that we've done something wrong here,
otherwise I think this represents a regression introduced in 9.3.

A

On Mon, Aug 04, 2014 at 12:30:27PM -0400, Alvaro Herrera wrote:
> Andrew Sackville-West wrote:
>
> > > I don't think you have provided everything: for instance I see that your
> > > log has INSERTs into table access_tokens, which has a foreign key
> > > relationship to/from some other table you showed.
> >
> > Ah you're correct, I'll attach the description of that table. But,
> > really, that is all that's going on in the database.
>
> See, access_tokens has a foreign key to access_grants.  So when the
> insert to access_tokens happens, the process also acquires a lock on a
> tuple on access_grants, which should not conflict with the UPDATE; but
> apparently it is blocking in some cases, for unknown reasons.  But note
> that one of the original tables had a large number of foreign keys to
> other tables; are those necessary to reproduce the problem?  I mean, if
> you just copy those three tables to an otherwise empty database and then
> run the hanging application, do you see a problem?  If not, does adding
> all those other tables cause the problem to show up?
>
> Note that the set of columns covered by unique indexes is very
> important.  Your \d does not have an unique index on the updated column
> AFAICS; but if it did have one, things would be completely different
> (because then the insert would definitely conflict with the update).
>
> It would help a lot of you posted the table definitions as pg_dump -t
> output rather than \d, because it's much easier for me to reproduce the
> setup, rather than cutting and pasting the \d output into CREATE TABLE
> commands.
>
> --
> Álvaro Herrera                http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services
>
>
> --
> Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-bugs

--

Re: regression, deadlock in high frequency single-row UPDATE

From
Mark Kirkwood
Date:
On 09/12/14 12:55, Andrew Sackville-West wrote:
> Some months later (stupid business priorities...) we have something to
> report on this. We have been able to produce a minimal schema to
> demonstrate the problem, and successfully induced the deadlock using
> pgbench (thanks to Paulo Tanimoto, cc-ed here).
>
> Please see:
>
> https://gist.github.com/andrewsw-janrain/40d1687db013b1e7c3b3
>
> for detailed instructions on how to trigger the deadlock.
>
> I would be thrilled to learn that we've done something wrong here,
> otherwise I think this represents a regression introduced in 9.3.
>
>


That is interesting - FWIW I can reproduce:

- 9.2.9 no deadlock
- 9.4rc1 many deadlocks

so something has changed after 9.2 for sure!

Just mucking about I changed certain things in your schema setup
(removed ON DELETE CASCADE, use CURRENT_TIMESTAMP instead of
now()...lessen the use of DEFAULT)...however still seeing deadlocks in
9.4, so at least it is easy to reproduce!

Regards

Mark

Re: regression, deadlock in high frequency single-row UPDATE

From
hubert depesz lubaczewski
Date:
For whatever it's worth, I was able to bisect the problem to this commit:

commit
0ac5ad5134f2769ccbaefec73844f8504c4d6182

Author: Alvaro Herrera <alvherre@alvh.no-ip.org>

Date:   Wed Jan 23 12:04:59 2013 -0300

    Improve concurrency of foreign key locking
...


depesz


On Tue, Dec 9, 2014 at 6:58 AM, Mark Kirkwood <mark.kirkwood@catalyst.net.nz
> wrote:

> On 09/12/14 12:55, Andrew Sackville-West wrote:
>
>> Some months later (stupid business priorities...) we have something to
>> report on this. We have been able to produce a minimal schema to
>> demonstrate the problem, and successfully induced the deadlock using
>> pgbench (thanks to Paulo Tanimoto, cc-ed here).
>>
>> Please see:
>>
>> https://gist.github.com/andrewsw-janrain/40d1687db013b1e7c3b3
>>
>> for detailed instructions on how to trigger the deadlock.
>>
>> I would be thrilled to learn that we've done something wrong here,
>> otherwise I think this represents a regression introduced in 9.3.
>>
>>
>>
>
> That is interesting - FWIW I can reproduce:
>
> - 9.2.9 no deadlock
> - 9.4rc1 many deadlocks
>
> so something has changed after 9.2 for sure!
>
> Just mucking about I changed certain things in your schema setup (removed
> ON DELETE CASCADE, use CURRENT_TIMESTAMP instead of now()...lessen the use
> of DEFAULT)...however still seeing deadlocks in 9.4, so at least it is easy
> to reproduce!
>
> Regards
>
> Mark
>
>
>
>
> --
> Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-bugs
>

Re: regression, deadlock in high frequency single-row UPDATE

From
Alvaro Herrera
Date:
hubert depesz lubaczewski wrote:
> For whatever it's worth, I was able to bisect the problem to this commit:
>
> commit
> 0ac5ad5134f2769ccbaefec73844f8504c4d6182
>
> Author: Alvaro Herrera <alvherre@alvh.no-ip.org>
>
> Date:   Wed Jan 23 12:04:59 2013 -0300
>
>     Improve concurrency of foreign key locking

I'm not surprised in the least, TBH.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: regression, deadlock in high frequency single-row UPDATE

From
Andrew Sackville-West
Date:
On Mon, Dec 8, 2014 at 9:58 PM, Mark Kirkwood <mark.kirkwood@catalyst.net.nz
> wrote:

> On 09/12/14 12:55, Andrew Sackville-West wrote:
>
>>
>> https://gist.github.com/andrewsw-janrain/40d1687db013b1e7c3b3
>>
>> for detailed instructions on how to trigger the deadlock.
>
>

>
>>
> That is interesting - FWIW I can reproduce:
>
> - 9.2.9 no deadlock
> - 9.4rc1 many deadlocks
>
> so something has changed after 9.2 for sure!
>
> Just mucking about I changed certain things in your schema setup (removed
> ON DELETE CASCADE, use CURRENT_TIMESTAMP instead of now()...lessen the use
> of DEFAULT)...however still seeing deadlocks in 9.4, so at least it is easy
> to reproduce!
>

Can you speak to what benefits those changes might yield? Or point to
relevant docs?

thanks

A


>
> Regards
>
> Mark
>
>
>
>
> --
> Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-bugs
>

Re: regression, deadlock in high frequency single-row UPDATE

From
Andrew Sackville-West
Date:
On Tue, Dec 9, 2014 at 6:14 AM, Alvaro Herrera <alvherre@2ndquadrant.com>
wrote:

> hubert depesz lubaczewski wrote:
> > For whatever it's worth, I was able to bisect the problem to this commi=
t:
> >
> > commit
> > 0ac5ad5134f2769ccbaefec73844f8504c4d6182
> >
> > Author: Alvaro Herrera <alvherre@alvh.no-ip.org>
> >
> > Date:   Wed Jan 23 12:04:59 2013 -0300
> >
> >     Improve concurrency of foreign key locking
>
> I'm not surprised in the least, TBH.
>
>
Thanks for your attention to this. Is there anything we're doing wrong that
could improve this in current 9.3 versions?

A


> --
> =C3=81lvaro Herrera                http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>
>
> --
> Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-bugs
>

Re: regression, deadlock in high frequency single-row UPDATE

From
Andres Freund
Date:
On 2014-12-09 09:38:33 -0800, Andrew Sackville-West wrote:
> On Tue, Dec 9, 2014 at 6:14 AM, Alvaro Herrera <alvherre@2ndquadrant.com>
> wrote:
>
> > hubert depesz lubaczewski wrote:
> > > For whatever it's worth, I was able to bisect the problem to this commit:
> > >
> > > commit
> > > 0ac5ad5134f2769ccbaefec73844f8504c4d6182
> > >
> > > Author: Alvaro Herrera <alvherre@alvh.no-ip.org>
> > >
> > > Date:   Wed Jan 23 12:04:59 2013 -0300
> > >
> > >     Improve concurrency of foreign key locking
> >
> > I'm not surprised in the least, TBH.
> >
> >
> Thanks for your attention to this. Is there anything we're doing wrong that
> could improve this in current 9.3 versions?

I don't think anybody can seriously can give you a serious workaround
until the issue has been diagnosed further.

One thing you could try would be to create a unique index that spans all
the columns in the table... That'd prevent usage of lower locklevels
added in the above commit.

Greetings,

Andres Freund

--
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: regression, deadlock in high frequency single-row UPDATE

From
Alvaro Herrera
Date:
Andrew Sackville-West wrote:
> Some months later (stupid business priorities...) we have something to
> report on this. We have been able to produce a minimal schema to
> demonstrate the problem, and successfully induced the deadlock using
> pgbench (thanks to Paulo Tanimoto, cc-ed here).
>
> Please see:
>
> https://gist.github.com/andrewsw-janrain/40d1687db013b1e7c3b3
>
> for detailed instructions on how to trigger the deadlock.
>
> I would be thrilled to learn that we've done something wrong here,
> otherwise I think this represents a regression introduced in 9.3.

To avoid having to visit an external URL that might well not exist at
all, and to simplify the test case which still has too many extraneous
details, here's a simpler version that still causes the UPDATE to
deadlock:

-- schema.sql:
create table pktab (id int primary key, data serial not null);
create table fktab (fk int references pktab);
insert into pktab values (1);

-- query.sql:
update pktab set data = default;
insert into fktab values (1);

This doesn't deadlock in 9.2, yet in 9.3 it raises an error in a few
seconds with pgbench -c 16.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: regression, deadlock in high frequency single-row UPDATE

From
Mark Kirkwood
Date:
On 10/12/14 06:28, Andrew Sackville-West wrote:
> On Mon, Dec 8, 2014 at 9:58 PM, Mark Kirkwood

>     Just mucking about I changed certain things in your schema setup
>     (removed ON DELETE CASCADE, use CURRENT_TIMESTAMP instead of
>     now()...lessen the use of DEFAULT)...however still seeing deadlocks
>     in 9.4, so at least it is easy to reproduce!
>
>
> Can you speak to what benefits those changes might yield? Or point to
> relevant docs?
>

Not so much advantages - just seeing if I could still reproduce the
issue with a simpler test case i.e ensuring you were not doing anything
'odd' - and you are not :-)

The next step would be trying to figure out what commit introduced this
behaviour - but depesz has already beaten me to that (nice work)!

Cheers

Mark

Re: regression, deadlock in high frequency single-row UPDATE

From
Andrew Sackville-West
Date:
On Tue, Dec 09, 2014 at 06:43:20PM +0100, Andres Freund wrote:
> On 2014-12-09 09:38:33 -0800, Andrew Sackville-West wrote:
> > On Tue, Dec 9, 2014 at 6:14 AM, Alvaro Herrera <alvherre@2ndquadrant.com>
> > wrote:
> >
> > > hubert depesz lubaczewski wrote:
> > > > For whatever it's worth, I was able to bisect the problem to this commit:
> > > >
> > > > commit
> > > > 0ac5ad5134f2769ccbaefec73844f8504c4d6182
> > > >
> > > > Author: Alvaro Herrera <alvherre@alvh.no-ip.org>
> > > >
> > > > Date:   Wed Jan 23 12:04:59 2013 -0300
> > > >
> > > >     Improve concurrency of foreign key locking
> > >
> > > I'm not surprised in the least, TBH.
> > >
> > >
> > Thanks for your attention to this. Is there anything we're doing wrong that
> > could improve this in current 9.3 versions?
>
> I don't think anybody can seriously can give you a serious workaround
> until the issue has been diagnosed further.

agreed.

>
> One thing you could try would be to create a unique index that spans all
> the columns in the table... That'd prevent usage of lower locklevels
> added in the above commit.


Thanks for this idea. I'll investigate.

thanks


A

Re: regression, deadlock in high frequency single-row UPDATE

From
Alvaro Herrera
Date:
Mark Kirkwood wrote:

> Not so much advantages - just seeing if I could still reproduce the issue
> with a simpler test case i.e ensuring you were not doing anything 'odd' -
> and you are not :-)
>
> The next step would be trying to figure out what commit introduced this
> behaviour - but depesz has already beaten me to that (nice work)!

So I traced through the problem using the simplified test case I posted.
It goes like this:

* tuple on the referenced table has been updated a bunch of times
  already, and keeps being updated, so there's a large update chain to
  follow.

* two or more transactions (T1 and T2, say) have snapshots in which
  some version of the tuple (not the last one) is visible.

* transaction T3 has a ForKeyShare lock on the latest version of the
  tuple, which is grabbed because of the INSERT in the referencing
  table.  Note that these locks are propagated upwards in an update
  chain, so even if T3 locked an old version, all future versions are
  also locked by T3.

* each of T1 and T2 tries to update the version they have visible; this
  fails because it's not the last version, so they need to go through
  EvalPlanQual, which walks through the update chain.

At this point, T1 passes through heap_lock_tuple, gets the HW tuple
lock, marks itself as locker in the tuple's Xmax.  T1 is scheduled out;
time for T2 to run.  T2 goes through heap_lock_tuple, grabs HW lock;
examines Xmax, sees that T1 is the locker, goes to sleep until T1
awakes.  T1 is scheduled to run again, runs heap_update.  The first
thing it needs is to do LockTuple ...  but T2 is holding that one, so T1
goes to sleep until T2 awakes -- now they are both sleeping on each
other.

Kaboom.

I'm not seeing a solution here.  There is something wrong in having to
release the HW lock then grab it again, in that EvalPlanQual dance.  It
sounds like we need to hold on to the tuple's HW lock until after
heap_update has run, but it's not at all clear how this would work --
the interface through EvalPlanQual is messy enough as it is without
entertaining the idea that a "please keep HW lock until later" flag
needs to be passed up and down all over the place.

One simple idea that occurs to me is have some global state in heapam.c;
when the EPQ code is invoked from ExecUpdate, we tell heapam to keep the
HW lock, and it's only released once heap_update is finished.  That
would keep T2 away from the tuple.  We'd need a PG_TRY block in
ExecUpdate to reset the global state once the update is done.  (It's not
entirely clear that we need all this for deletes too.  The main
difference is that after a delete completes the tuple is gone, which is
obviously not the case in an update.  Also, deletes don't create update
chains.)

One thing I haven't thought too much about is why doesn't this happen in
9.2.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: regression, deadlock in high frequency single-row UPDATE

From
Mark Kirkwood
Date:
On 11/12/14 08:33, Alvaro Herrera wrote:
> Mark Kirkwood wrote:
>
>> Not so much advantages - just seeing if I could still reproduce the issue
>> with a simpler test case i.e ensuring you were not doing anything 'odd' -
>> and you are not :-)
>>
>> The next step would be trying to figure out what commit introduced this
>> behaviour - but depesz has already beaten me to that (nice work)!
>
> So I traced through the problem using the simplified test case I posted.
> It goes like this:
>
> * tuple on the referenced table has been updated a bunch of times
>    already, and keeps being updated, so there's a large update chain to
>    follow.
>
> * two or more transactions (T1 and T2, say) have snapshots in which
>    some version of the tuple (not the last one) is visible.
>
> * transaction T3 has a ForKeyShare lock on the latest version of the
>    tuple, which is grabbed because of the INSERT in the referencing
>    table.  Note that these locks are propagated upwards in an update
>    chain, so even if T3 locked an old version, all future versions are
>    also locked by T3.
>
> * each of T1 and T2 tries to update the version they have visible; this
>    fails because it's not the last version, so they need to go through
>    EvalPlanQual, which walks through the update chain.
>
> At this point, T1 passes through heap_lock_tuple, gets the HW tuple
> lock, marks itself as locker in the tuple's Xmax.  T1 is scheduled out;
> time for T2 to run.  T2 goes through heap_lock_tuple, grabs HW lock;
> examines Xmax, sees that T1 is the locker, goes to sleep until T1
> awakes.  T1 is scheduled to run again, runs heap_update.  The first
> thing it needs is to do LockTuple ...  but T2 is holding that one, so T1
> goes to sleep until T2 awakes -- now they are both sleeping on each
> other.
>
> Kaboom.
>
> I'm not seeing a solution here.  There is something wrong in having to
> release the HW lock then grab it again, in that EvalPlanQual dance.  It
> sounds like we need to hold on to the tuple's HW lock until after
> heap_update has run, but it's not at all clear how this would work --
> the interface through EvalPlanQual is messy enough as it is without
> entertaining the idea that a "please keep HW lock until later" flag
> needs to be passed up and down all over the place.
>
> One simple idea that occurs to me is have some global state in heapam.c;
> when the EPQ code is invoked from ExecUpdate, we tell heapam to keep the
> HW lock, and it's only released once heap_update is finished.  That
> would keep T2 away from the tuple.  We'd need a PG_TRY block in
> ExecUpdate to reset the global state once the update is done.  (It's not
> entirely clear that we need all this for deletes too.  The main
> difference is that after a delete completes the tuple is gone, which is
> obviously not the case in an update.  Also, deletes don't create update
> chains.)
>
> One thing I haven't thought too much about is why doesn't this happen in
> 9.2.
>

Hmmm - interesting, yeah I think the why-it-doesn't-in-9.2 part is the
next thing to understand :-)

Cheers

Mark

Re: regression, deadlock in high frequency single-row UPDATE

From
Mark Kirkwood
Date:
On 11/12/14 10:56, Mark Kirkwood wrote:
> On 11/12/14 08:33, Alvaro Herrera wrote:
>> Mark Kirkwood wrote:
>>
>>> Not so much advantages - just seeing if I could still reproduce the
>>> issue
>>> with a simpler test case i.e ensuring you were not doing anything
>>> 'odd' -
>>> and you are not :-)
>>>
>>> The next step would be trying to figure out what commit introduced this
>>> behaviour - but depesz has already beaten me to that (nice work)!
>>
>> So I traced through the problem using the simplified test case I posted.
>> It goes like this:
>>
>> * tuple on the referenced table has been updated a bunch of times
>>    already, and keeps being updated, so there's a large update chain to
>>    follow.
>>
>> * two or more transactions (T1 and T2, say) have snapshots in which
>>    some version of the tuple (not the last one) is visible.
>>
>> * transaction T3 has a ForKeyShare lock on the latest version of the
>>    tuple, which is grabbed because of the INSERT in the referencing
>>    table.  Note that these locks are propagated upwards in an update
>>    chain, so even if T3 locked an old version, all future versions are
>>    also locked by T3.
>>
>> * each of T1 and T2 tries to update the version they have visible; this
>>    fails because it's not the last version, so they need to go through
>>    EvalPlanQual, which walks through the update chain.
>>
>> At this point, T1 passes through heap_lock_tuple, gets the HW tuple
>> lock, marks itself as locker in the tuple's Xmax.  T1 is scheduled out;
>> time for T2 to run.  T2 goes through heap_lock_tuple, grabs HW lock;
>> examines Xmax, sees that T1 is the locker, goes to sleep until T1
>> awakes.  T1 is scheduled to run again, runs heap_update.  The first
>> thing it needs is to do LockTuple ...  but T2 is holding that one, so T1
>> goes to sleep until T2 awakes -- now they are both sleeping on each
>> other.
>>
>> Kaboom.
>>
>> I'm not seeing a solution here.  There is something wrong in having to
>> release the HW lock then grab it again, in that EvalPlanQual dance.  It
>> sounds like we need to hold on to the tuple's HW lock until after
>> heap_update has run, but it's not at all clear how this would work --
>> the interface through EvalPlanQual is messy enough as it is without
>> entertaining the idea that a "please keep HW lock until later" flag
>> needs to be passed up and down all over the place.
>>
>> One simple idea that occurs to me is have some global state in heapam.c;
>> when the EPQ code is invoked from ExecUpdate, we tell heapam to keep the
>> HW lock, and it's only released once heap_update is finished.  That
>> would keep T2 away from the tuple.  We'd need a PG_TRY block in
>> ExecUpdate to reset the global state once the update is done.  (It's not
>> entirely clear that we need all this for deletes too.  The main
>> difference is that after a delete completes the tuple is gone, which is
>> obviously not the case in an update.  Also, deletes don't create update
>> chains.)
>>
>> One thing I haven't thought too much about is why doesn't this happen in
>> 9.2.
>>
>
> Hmmm - interesting, yeah I think the why-it-doesn't-in-9.2 part is the
> next thing to understand :-)
>

In the most glaringly obvious (and possibly not at all helpful) analysis
- this is because we are using "FOR KEY SHARE" instead of "FOR SHARE" in
the newer code. Reverting this (see attached very basic patch) gets rid
of the deadlock (tested in 9.5devel). However this is obviously a less
than desirable solution, as it loses any possible concurrency
improvement - that we worked hard to gain in the patch that added this
new lock mode! We really want to figure out how to keep "FOR KEY SHARE"
and have it *not* deadlock.

Cheers

Mark


Attachment

Re: regression, deadlock in high frequency single-row UPDATE

From
Alvaro Herrera
Date:
Mark Kirkwood wrote:

> In the most glaringly obvious (and possibly not at all helpful) analysis -
> this is because we are using "FOR KEY SHARE" instead of "FOR SHARE" in the
> newer code. Reverting this (see attached very basic patch) gets rid of the
> deadlock (tested in 9.5devel). However this is obviously a less than
> desirable solution, as it loses any possible concurrency improvement - that
> we worked hard to gain in the patch that added this new lock mode! We really
> want to figure out how to keep "FOR KEY SHARE" and have it *not* deadlock.

Yeah.  The problem is that the INSERT in the fktab table takes a FOR KEY
SHARE lock in the pktab table.  This puts the inserting process' Xid
into the tuple's Xmax, but allows concurrent UPDATE while the inserter
transaction is still in progress.  If you use a FOR SHARE lock, the
UPDATE is disallowed until the inserter transaction is completed.

Having the inserter be live is what causes the heap_lock_tuple and
heap_update calls in the updater process to take the heavyweight tuple
lock.

I think what we should do here is NOT take the HW lock unless we're
going to sleep in heap_lock_tuple and heap_update.  Right now we always
take it if HeapTupleSatisfiesUpdate returns HeapTupleBeingUpdated, even
if we end up not sleeping to wait for something else.

I'm going to experiment with that idea and see if it leads to a
solution.  I tried the other idea yesterday (to keep the HW tuple lock
we acquire in heap_lock_tuple until heap_update is done) but aside from
being very complicated and bug-prone, it doesn't solve the problem
anyway.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: regression, deadlock in high frequency single-row UPDATE

From
Alvaro Herrera
Date:
Alvaro Herrera wrote:

> I'm going to experiment with that idea and see if it leads to a
> solution.  I tried the other idea yesterday (to keep the HW tuple lock
> we acquire in heap_lock_tuple until heap_update is done) but aside from
> being very complicated and bug-prone, it doesn't solve the problem
> anyway.

Here's a preliminary patch.  It does solve the deadlock in my simplified
test case.  If Andrew can confirm that it fixes his original problem
too, that'd be good.

Before this can be committed I need an isolationtester spec file that
reproduces the problem.  Now that I understand why it happens it should
be easy to produce: just have a transaction that does BEGIN, then the
insert, and keeps the transaction open; enough other sessions run the
UPDATE until the problem pops up.  (Also, comments on
Would_MultiXactIdWait_Block need work.)

FWIW this code should also have slightly better performance than the
original coding, since the heavyweight tuple lock acquisition is skipped
in some cases.  Not sure if that is measurable, though.  Maybe in
extreme cases such as the one in #8470 ...

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: regression, deadlock in high frequency single-row UPDATE

From
Andrew Sackville-West
Date:
On Thu, Dec 11, 2014 at 02:22:56PM -0300, Alvaro Herrera wrote:
> Alvaro Herrera wrote:
>
> > I'm going to experiment with that idea and see if it leads to a
> > solution.  I tried the other idea yesterday (to keep the HW tuple lock
> > we acquire in heap_lock_tuple until heap_update is done) but aside from
> > being very complicated and bug-prone, it doesn't solve the problem
> > anyway.
>
> Here's a preliminary patch.  It does solve the deadlock in my simplified
> test case.  If Andrew can confirm that it fixes his original problem
> too, that'd be good.

We can try that. We don't currently build our own pg, so it'll take a
bit for us to get there. I'll report back with results.

A

Re: regression, deadlock in high frequency single-row UPDATE

From
Mark Kirkwood
Date:
On 12/12/14 06:22, Alvaro Herrera wrote:
> Alvaro Herrera wrote:
>
>> I'm going to experiment with that idea and see if it leads to a
>> solution.  I tried the other idea yesterday (to keep the HW tuple lock
>> we acquire in heap_lock_tuple until heap_update is done) but aside from
>> being very complicated and bug-prone, it doesn't solve the problem
>> anyway.
>
> Here's a preliminary patch.  It does solve the deadlock in my simplified
> test case.  If Andrew can confirm that it fixes his original problem
> too, that'd be good.
>
> Before this can be committed I need an isolationtester spec file that
> reproduces the problem.  Now that I understand why it happens it should
> be easy to produce: just have a transaction that does BEGIN, then the
> insert, and keeps the transaction open; enough other sessions run the
> UPDATE until the problem pops up.  (Also, comments on
> Would_MultiXactIdWait_Block need work.)
>
> FWIW this code should also have slightly better performance than the
> original coding, since the heavyweight tuple lock acquisition is skipped
> in some cases.  Not sure if that is measurable, though.  Maybe in
> extreme cases such as the one in #8470 ...
>

Yeah, works for me too. I've tried your simplified and also Andrew's
original schema + queries against 9.5devel with deadlock.patch applied
(neither show any deadlocks in that case).

Cheers

Mark

Re: regression, deadlock in high frequency single-row UPDATE

From
Mark Kirkwood
Date:
On 12/12/14 06:22, Alvaro Herrera wrote:

> Before this can be committed I need an isolationtester spec file that
> reproduces the problem.  Now that I understand why it happens it should
> be easy to produce: just have a transaction that does BEGIN, then the
> insert, and keeps the transaction open; enough other sessions run the
> UPDATE until the problem pops up.  (Also, comments on
> Would_MultiXactIdWait_Block need work.)
>


FWIW - I was having a look at the isolationtester spec. Playing with the
pgbench setup to see the minimal number of session I needed to provoke
the deadlock  (unpatched 9.5 code) seemed to converge to about 4. I
managed to still see 1 deadlock with each session only doing 1
transaction - i.e:

$ pgbench -c 4 -C  -n -t 1 -f query.sql deadlock

...which resulted in 1 deadlock. So we may be able to get a reasonable
test case that shows this up! I'll take a look at setting up a test case
later - but don't let that stop you doing it 1st :-)

Cheers

Mark

Re: regression, deadlock in high frequency single-row UPDATE

From
Alvaro Herrera
Date:
Mark Kirkwood wrote:

> FWIW - I was having a look at the isolationtester spec. Playing with the
> pgbench setup to see the minimal number of session I needed to provoke the
> deadlock  (unpatched 9.5 code) seemed to converge to about 4. I managed to
> still see 1 deadlock with each session only doing 1 transaction - i.e:
>
> $ pgbench -c 4 -C  -n -t 1 -f query.sql deadlock
>
> ...which resulted in 1 deadlock. So we may be able to get a reasonable test
> case that shows this up! I'll take a look at setting up a test case later -
> but don't let that stop you doing it 1st :-)

Okay, I created a reproducer ... but it doesn't work.  I mean, the
commands are fine: if I execute them manually in four sessions in an
unpatched server I get a deadlock.  But isolationtester doesn't know to
handle more than one blocked session, so it just hangs when the third
session blocks after the second one is already blocked.  :-(

Here it is anyway.  If we get around to improving isolationtester to
handle more blocked sessions, it would be good to add this file.

setup {
   DROP TABLE IF EXISTS pktab;
   CREATE TABLE pktab (id int PRIMARY KEY, data SERIAL NOT NULL);
   INSERT INTO pktab VALUES (1, DEFAULT);
}

teardown {
   DROP TABLE pktab;
}

session "s1"
step "s1_advlock" { SELECT pg_advisory_lock(142857); }
step "s1_chain" { UPDATE pktab SET data = DEFAULT; }
step "s1_begin" { BEGIN; }
step "s1_grablock" { SELECT * FROM pktab FOR KEY SHARE; }
step "s1_advunlock" { SELECT pg_advisory_unlock(142857); }
step "s1_commit" { COMMIT; }

session "s2"
step "s2_update" { UPDATE pktab SET data = DEFAULT WHERE pg_advisory_lock_shared(142857) IS NOT NULL; }

session "s3"
step "s3_update" { UPDATE pktab SET data = DEFAULT WHERE pg_advisory_lock_shared(142857) IS NOT NULL; }

session "s4"
step "s4_update" { UPDATE pktab SET data = DEFAULT WHERE pg_advisory_lock_shared(142857) IS NOT NULL; }

permutation "s1_advlock" "s2_update" "s3_update" "s4_update" "s1_chain" "s1_grablock" "s1_advunlock" "s1_commit"

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: regression, deadlock in high frequency single-row UPDATE

From
Peter Geoghegan
Date:
On Fri, Dec 12, 2014 at 12:52 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
> Okay, I created a reproducer ... but it doesn't work.  I mean, the
> commands are fine: if I execute them manually in four sessions in an
> unpatched server I get a deadlock.  But isolationtester doesn't know to
> handle more than one blocked session, so it just hangs when the third
> session blocks after the second one is already blocked.  :-(

We need a stress-testing framework. I hacked together one for the ON
CONFLICT UPDATE patch (https://github.com/petergeoghegan/upsert), but
a more comprehensive effort is needed.

--
Peter Geoghegan

Re: regression, deadlock in high frequency single-row UPDATE

From
Alvaro Herrera
Date:
Peter Geoghegan wrote:
> On Fri, Dec 12, 2014 at 12:52 PM, Alvaro Herrera
> <alvherre@2ndquadrant.com> wrote:
> > Okay, I created a reproducer ... but it doesn't work.  I mean, the
> > commands are fine: if I execute them manually in four sessions in an
> > unpatched server I get a deadlock.  But isolationtester doesn't know to
> > handle more than one blocked session, so it just hangs when the third
> > session blocks after the second one is already blocked.  :-(
>
> We need a stress-testing framework. I hacked together one for the ON
> CONFLICT UPDATE patch (https://github.com/petergeoghegan/upsert), but
> a more comprehensive effort is needed.

Hm, that's nice, but I hope you realize that a framework based on shell
scripts cannot possibly work for us.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: regression, deadlock in high frequency single-row UPDATE

From
Peter Geoghegan
Date:
On Fri, Dec 12, 2014 at 1:16 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
> Hm, that's nice, but I hope you realize that a framework based on shell
> scripts cannot possibly work for us.

Of course it couldn't be committed, but maybe this test framework
doesn't need to live in the authoritative git repo. The effort might
be bike-shedded to death otherwise. Not everyone needs to run these
types of tests.

--
Peter Geoghegan

Re: regression, deadlock in high frequency single-row UPDATE

From
Alvaro Herrera
Date:
Alvaro Herrera wrote:
> Alvaro Herrera wrote:
>
> > I'm going to experiment with that idea and see if it leads to a
> > solution.  I tried the other idea yesterday (to keep the HW tuple lock
> > we acquire in heap_lock_tuple until heap_update is done) but aside from
> > being very complicated and bug-prone, it doesn't solve the problem
> > anyway.
>
> Here's a preliminary patch.

Here's a finished version of this patch, which I messed a bit with and
so needs some extra testing.  I want to push this shortly and backpatch
to 9.4 and 9.3.  I don't want to disrupt the 9.4.0 release next week,
but it'd be nice not to ship it with this bug.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: regression, deadlock in high frequency single-row UPDATE

From
Mark Kirkwood
Date:
On 13/12/14 10:57, Alvaro Herrera wrote:
> Alvaro Herrera wrote:
>> Alvaro Herrera wrote:
>>
>>> I'm going to experiment with that idea and see if it leads to a
>>> solution.  I tried the other idea yesterday (to keep the HW tuple lock
>>> we acquire in heap_lock_tuple until heap_update is done) but aside from
>>> being very complicated and bug-prone, it doesn't solve the problem
>>> anyway.
>>
>> Here's a preliminary patch.
>
> Here's a finished version of this patch, which I messed a bit with and
> so needs some extra testing.  I want to push this shortly and backpatch
> to 9.4 and 9.3.  I don't want to disrupt the 9.4.0 release next week,
> but it'd be nice not to ship it with this bug.
>

This patch seems good too. I'm possibly seeing it performing slightly
slower than the previous patch (approx 68 tps vs 63 on average -
completely untuned 9.5 running on a single SATA drive, but that could
just be natural variation/noise).

Regards

Mark

Re: regression, deadlock in high frequency single-row UPDATE

From
Mark Kirkwood
Date:
On 14/12/14 15:22, Mark Kirkwood wrote:
> On 13/12/14 10:57, Alvaro Herrera wrote:
>> Alvaro Herrera wrote:
>>> Alvaro Herrera wrote:
>>>
>>>> I'm going to experiment with that idea and see if it leads to a
>>>> solution.  I tried the other idea yesterday (to keep the HW tuple lock
>>>> we acquire in heap_lock_tuple until heap_update is done) but aside from
>>>> being very complicated and bug-prone, it doesn't solve the problem
>>>> anyway.
>>>
>>> Here's a preliminary patch.
>>
>> Here's a finished version of this patch, which I messed a bit with and
>> so needs some extra testing.  I want to push this shortly and backpatch
>> to 9.4 and 9.3.  I don't want to disrupt the 9.4.0 release next week,
>> but it'd be nice not to ship it with this bug.
>>
>
> This patch seems good too. I'm possibly seeing it performing slightly
> slower than the previous patch (approx 68 tps vs 63 on average -
> completely untuned 9.5 running on a single SATA drive, but that could
> just be natural variation/noise).
>

Blast - and forgot to clarify that "seems good" means I've tested it
against both your and Andrew's deadlock producing schema (no deadlocks
with this patch applied)...

Cheers

Mark

Re: regression, deadlock in high frequency single-row UPDATE

From
Andrew Sackville-West
Date:
On Fri, Dec 12, 2014 at 06:57:52PM -0300, Alvaro Herrera wrote:
> Alvaro Herrera wrote:
> > Alvaro Herrera wrote:
> >
> > > I'm going to experiment with that idea and see if it leads to a
> > > solution.  I tried the other idea yesterday (to keep the HW tuple lock
> > > we acquire in heap_lock_tuple until heap_update is done) but aside from
> > > being very complicated and bug-prone, it doesn't solve the problem
> > > anyway.
> >
> > Here's a preliminary patch.
>
> Here's a finished version of this patch, which I messed a bit with and
> so needs some extra testing.  I want to push this shortly and backpatch
> to 9.4 and 9.3.  I don't want to disrupt the 9.4.0 release next week,
> but it'd be nice not to ship it with this bug.

We have tested this patch on the latest postgresql with our actual
application and it appears to work for us.

We'll be happy to test a patch against the 9.3 series, using our actual
application, as well when it becomes available.

thanks!

A

Re: regression, deadlock in high frequency single-row UPDATE

From
Alvaro Herrera
Date:
Alvaro Herrera wrote:
> Alvaro Herrera wrote:
>
> > I'm going to experiment with that idea and see if it leads to a
> > solution.  I tried the other idea yesterday (to keep the HW tuple lock
> > we acquire in heap_lock_tuple until heap_update is done) but aside from
> > being very complicated and bug-prone, it doesn't solve the problem
> > anyway.
>
> Here's a preliminary patch.  It does solve the deadlock in my simplified
> test case.  If Andrew can confirm that it fixes his original problem
> too, that'd be good.

Pushed this patch to master, 9.4 and 9.3.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: regression, deadlock in high frequency single-row UPDATE

From
Paulo Tanimoto
Date:
Hi Alvaro,

On Fri, Dec 26, 2014 at 8:56 AM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
>
> Pushed this patch to master, 9.4 and 9.3.
>

Thank you for porting and pushing your patch to these other branches.
Andrew, our team lead, is out this week, but I wanted to give you some
feedback as soon as possible.

I've built PostgreSQL 9.3 (git branch REL9_3_STABLE) with and without
this patch, and at least in my tests I was able to confirm that it
solves our issue.  Besides testing it with our simplified pgbench
scripts, I also tested it locally with our Haskell web server, which
was how we ran into the deadlock error in the first place.  In all my
tests, I consistently get the deadlock errors without the patch, and
no longer see them when the patch is applied.

Let us know if you'd like to see anything else tested on our end.

Take care,

Paulo