Thread: Simultaneous index creates on different schemas cause deadlock?

Simultaneous index creates on different schemas cause deadlock?

From
Paul Hinze
Date:
In our production environment, we separate each shard out into its own
schema to give us flexibility in shard location. This leaves us with a
lot of schemas hosted on a relatively smaller set of servers.

With so many schemas, database migrations started to take a long time
to run across all of them. To address this, we introduced some
parallelism in our migration-running process. So we'll have several
processes hopping across the shards at once performing DDL operations
and what have you. This helped our migration runtime considerably.

We had an odd situation crop up recently that confused me. The
migration we were running included a "CREATE INDEX CONCURRENTLY" (a
fairly common occurrence for us), and as it ran across our environment
we hit several deadlocks of the following form:

> Process 3445 waits for ShareLock on virtual transaction 5/27569425; blocked by process 3440.
> Process 3440 waits for ShareLock on virtual transaction 13/22769556; blocked by process 3445.

For every deadlock, each process involved was running one of our
"CREATE INDEX" statements.

I've managed to reproduce this behavior with a simple little ruby
script that creates 10 schemas, each with a 10k row table, and then
attempts to index those tables in parallel. For me, this deadlocks
every time:

https://gist.github.com/phinze/f38578d39f9e0ed4fd25

This is what confused me: since the indexes are each being created in
completely independent schemas, what resource could possibly be in
contention to cause these deadlocks?

Thanks for your time,

Paul


Re: Simultaneous index creates on different schemas cause deadlock?

From
Tom Lane
Date:
Paul Hinze <paul.t.hinze@gmail.com> writes:
> [ multiple CREATE INDEX CONCURRENTLY commands will deadlock with each other ]

Hm.  I guess the reason nobody noticed this before now is that generally
the idea with CREATE INDEX CONCURRENTLY is to minimize the impact on
system load, hence you wouldn't do more than one at a time.  Still, it's
surely a POLA violation that you *can't* do more than one at a time.

The cause is that each one will wait for all older snapshots to be
gone --- and it does that before dropping its own snapshot, so that the
other ones will see it as something to be waited out too.

Since we know that C.I.C. executes in its own transaction, and there
can't be more than one on the same table due to locking, it seems to me
that it'd be safe to drop our own snapshot before waiting for other
xacts to end.  That is, we could just rearrange the last few steps in
DefineIndex(), taking care to save snapshot->xmin before we destroy the
snapshot so that we still have that value to pass to
GetCurrentVirtualXIDs().

Anybody see a flaw in that solution?

            regards, tom lane


Re: [HACKERS] Simultaneous index creates on different schemas cause deadlock?

From
Andres Freund
Date:
On 2013-04-25 13:17:31 -0400, Tom Lane wrote:
> Paul Hinze <paul.t.hinze@gmail.com> writes:
> > [ multiple CREATE INDEX CONCURRENTLY commands will deadlock with each other ]
>
> Hm.  I guess the reason nobody noticed this before now is that generally
> the idea with CREATE INDEX CONCURRENTLY is to minimize the impact on
> system load, hence you wouldn't do more than one at a time.  Still, it's
> surely a POLA violation that you *can't* do more than one at a time.
>
> The cause is that each one will wait for all older snapshots to be
> gone --- and it does that before dropping its own snapshot, so that the
> other ones will see it as something to be waited out too.

Makes sense.

> Since we know that C.I.C. executes in its own transaction, and there
> can't be more than one on the same table due to locking, it seems to me
> that it'd be safe to drop our own snapshot before waiting for other
> xacts to end.  That is, we could just rearrange the last few steps in
> DefineIndex(), taking care to save snapshot->xmin before we destroy the
> snapshot so that we still have that value to pass to
> GetCurrentVirtualXIDs().
>
> Anybody see a flaw in that solution?

Except that it still will unnecessarily wait for other CICs, just not
deadlock, I don't see a problem. We could have a PROC_IN_CIC flag or
something so we can ignore other index creations, but I am not sure if
its worth the complication.

Greetings,

Andres Freund

--
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Andres Freund <andres@2ndquadrant.com> writes:
> On 2013-04-25 13:17:31 -0400, Tom Lane wrote:
>> Since we know that C.I.C. executes in its own transaction, and there
>> can't be more than one on the same table due to locking, it seems to me
>> that it'd be safe to drop our own snapshot before waiting for other
>> xacts to end.  That is, we could just rearrange the last few steps in
>> DefineIndex(), taking care to save snapshot->xmin before we destroy the
>> snapshot so that we still have that value to pass to
>> GetCurrentVirtualXIDs().
>>
>> Anybody see a flaw in that solution?

> Except that it still will unnecessarily wait for other CICs, just not
> deadlock, I don't see a problem. We could have a PROC_IN_CIC flag or
> something so we can ignore other index creations, but I am not sure if
> its worth the complication.

I'm not sure it's a good idea to ignore other CICs altogether --- they
could be executing user-defined index functions that do strange things
like consult other tables.  Since this seems to me to be a bit outside
the intended use-case for CIC anyway, I think it's good enough if they
just don't deadlock.

            regards, tom lane


Re: [HACKERS] Simultaneous index creates on different schemas cause deadlock?

From
"anarazel@anarazel.de"
Date:

Tom Lane <tgl@sss.pgh.pa.us> schrieb:

>Andres Freund <andres@2ndquadrant.com> writes:
>> On 2013-04-25 13:17:31 -0400, Tom Lane wrote:
>>> Since we know that C.I.C. executes in its own transaction, and there
>>> can't be more than one on the same table due to locking, it seems to
>me
>>> that it'd be safe to drop our own snapshot before waiting for other
>>> xacts to end.  That is, we could just rearrange the last few steps
>in
>>> DefineIndex(), taking care to save snapshot->xmin before we destroy
>the
>>> snapshot so that we still have that value to pass to
>>> GetCurrentVirtualXIDs().
>>>
>>> Anybody see a flaw in that solution?
>
>> Except that it still will unnecessarily wait for other CICs, just not
>> deadlock, I don't see a problem. We could have a PROC_IN_CIC flag or
>> something so we can ignore other index creations, but I am not sure
>if
>> its worth the complication.
>
>I'm not sure it's a good idea to ignore other CICs altogether --- they
>could be executing user-defined index functions that do strange things
>like consult other tables.  Since this seems to me to be a bit outside
>the intended use-case for CIC anyway, I think it's good enough if they
>just don't deadlock

Fine with me, especially as nobody seems to have complained so far other than the OP, so it doesn't seem to be to
common. 
I don't have access to the code ATM an I wonder whether DROP CONCURRENTLY has a similar problem? Depends a bit on how
thewaiting is done... 

Andres

---
Please excuse brevity and formatting - I am writing this on my mobile phone.


"anarazel@anarazel.de" <andres@anarazel.de> writes:
> I don't have access to the code ATM an I wonder whether DROP CONCURRENTLY has a similar problem? Depends a bit on how
thewaiting is done... 

It's not a problem --- that code doesn't depend on waiting for snapshots
to expire, it just checks for other sessions holding locks on the target
table.  (I also did some experimental testing to verify this.)

            regards, tom lane


Re: Simultaneous index creates on different schemas cause deadlock?

From
Paul Hinze
Date:
On Thu, Apr 25, 2013 at 12:17 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> The cause is that each one will wait for all older snapshots to be
> gone --- and it does that before dropping its own snapshot, so that the
> other ones will see it as something to be waited out too.

This makes sense. Thank you for explaining.

> Since we know that C.I.C. executes in its own transaction, and there
> can't be more than one on the same table due to locking, it seems to me
> that it'd be safe to drop our own snapshot before waiting for other
> xacts to end.  That is, we could just rearrange the last few steps in
> DefineIndex(), taking care to save snapshot->xmin before we destroy the
> snapshot so that we still have that value to pass to
> GetCurrentVirtualXIDs().

Seems reasonable to me. Looks like a fix landed in master yesterday:

http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=c3d09b3bd23f5f65b5eb8124a3c7592dad85a50c

Many thanks to Tom and all the pgsql-hackers for all the work you do!

Paul