Thread: Concurrent CREATE TABLE/DROP SCHEMA leaves inconsistent leftovers

Concurrent CREATE TABLE/DROP SCHEMA leaves inconsistent leftovers

From

Nikhil Sontakke

Date:

09 November 2011, 05:56:43

Hi,<br /><br />Consider the following sequence of events:<br /><br />s1 #> CREATE SCHEMA test_schema;<br /><br />s1
#>CREATE TABLE test_schema.c1(x int);<br /><br />Now open another session s2 and via gdb issue a breakpoint on
heap_create_with_catalog()which is called by DefineRelation().<br /><br />s2 #> CREATE TABLE test_schema.c2(y
int);<br/><br />The above will break on the function. Now issue a drop schema in session s1<br /><br />s1 #> DROP
SCHEMAtest_schema CASCADE;<br />NOTICE:  drop cascades to table test_schema.c1<br /> DROP SCHEMA<br /><br />Continuing
ingdb, also completes the creation of c2 table without any errors. We are now left with a dangling entry in pg_class
alongwith all the corresponding data files in our data directory. The problem becomes worse if c2 was created using a
TABLESPACE.Now dropping of that tablespace does not work at all. Am sure we can come up with myriad such other issues.
<br/><br />Am sure other CREATE commands in this namespace will have similar issues when faced with a concurrent DROP
SCHEMA.<br /><br />We definitely need some interlocking to handle this. For lack of better APIs, we could do a
LockDatabaseObject()call in AccessShareLock mode on the namespace and release the same on completion of the creation of
theobject. <br /><br />Thoughts? <br /><br />Regards,<br />Nikhils<br />

Re: Concurrent CREATE TABLE/DROP SCHEMA leaves inconsistent leftovers

From

Robert Haas

Date:

09 November 2011, 09:01:43

On Wed, Nov 9, 2011 at 4:56 AM, Nikhil Sontakke <nikkhils@gmail.com> wrote:
> Consider the following sequence of events:
>
> s1 #> CREATE SCHEMA test_schema;
>
> s1 #> CREATE TABLE test_schema.c1(x int);
>
> Now open another session s2 and via gdb issue a breakpoint on
> heap_create_with_catalog() which is called by DefineRelation().
>
> s2 #> CREATE TABLE test_schema.c2(y int);
>
> The above will break on the function. Now issue a drop schema in session s1
>
> s1 #> DROP SCHEMA test_schema CASCADE;
> NOTICE:  drop cascades to table test_schema.c1
> DROP SCHEMA
>
> Continuing in gdb, also completes the creation of c2 table without any
> errors. We are now left with a dangling entry in pg_class along with all the
> corresponding data files in our data directory. The problem becomes worse if
> c2 was created using a TABLESPACE. Now dropping of that tablespace does not
> work at all. Am sure we can come up with myriad such other issues.
>
> Am sure other CREATE commands in this namespace will have similar issues
> when faced with a concurrent DROP SCHEMA.
>
> We definitely need some interlocking to handle this. For lack of better
> APIs, we could do a LockDatabaseObject() call in AccessShareLock mode on the
> namespace and release the same on completion of the creation of the object.
>
> Thoughts?

In general, we've been reluctant to add locking on non-table objects
for reasons of overhead.  You can, for example, drop a type or
function while a query is running that depends on it (which is not
true for tables).  But I think it is sensible to do it for DDL
commands, which shouldn't be frequent enough for the overhead to
matter much.  When I rewrote the comment code for 9.1, I added locking
that works just this way, to prevent pg_description entries from being
orphaned; see the end of get_object_address().

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Concurrent CREATE TABLE/DROP SCHEMA leaves inconsistent leftovers

From

Nikhil Sontakke

Date:

09 November 2011, 10:52:07

> We definitely need some interlocking to handle this. For lack of better
> APIs, we could do a LockDatabaseObject() call in AccessShareLock mode on the
> namespace and release the same on completion of the creation of the object.
>
> Thoughts?

In general, we've been reluctant to add locking on non-table objects
for reasons of overhead. You can, for example, drop a type or
function while a query is running that depends on it (which is not
true for tables). But I think it is sensible to do it for DDL
commands, which shouldn't be frequent enough for the overhead to
matter much.

Agreed. Especially if the race condition has non-trivial downsides as mentioned in the tablespace case.

When I rewrote the comment code for 9.1, I added locking
that works just this way, to prevent pg_description entries from being
orphaned; see the end of get_object_address().

Yeah thanks, that does the object locking. For pre-9.1 versions, we will need a similar solution. I encountered the issue on 8.3.x..

Regards,
Nikhils

Re: Concurrent CREATE TABLE/DROP SCHEMA leaves inconsistent leftovers

From

Robert Haas

Date:

09 November 2011, 11:12:34

On Wed, Nov 9, 2011 at 9:51 AM, Nikhil Sontakke <nikkhils@gmail.com> wrote:
> Yeah thanks, that does the object locking. For pre-9.1 versions, we will
> need a similar solution. I encountered the issue on 8.3.x..

I don't think we should back-patch a fix of this type.  There is a lot
of cruftiness of this type scattered throughout the code base, and if
we start back-patching all the fixes for it, we're going to end up
destabilizing older branches for little real benefit.

Also, the fix would need to be quite different in older branches.  For
example, in the master branch, you can probably fix 90% of the issue
by adjusting dropcmds.c, which now handles drop operations for most
object types.  I believe KaiGai Kohei is still working on code which
will allow that code to support drop operations for most of the
remaining object types as well.  But in any previous release you would
need scattered fixes all over the code base.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Concurrent CREATE TABLE/DROP SCHEMA leaves inconsistent leftovers

From

Nikhil Sontakke

Date:

09 November 2011, 11:59:58

> Yeah thanks, that does the object locking. For pre-9.1 versions, we will
> need a similar solution. I encountered the issue on 8.3.x..

I don't think we should back-patch a fix of this type. There is a lot
of cruftiness of this type scattered throughout the code base, and if
we start back-patching all the fixes for it, we're going to end up
destabilizing older branches for little real benefit.

Ok, understood.

Thanks and Regards,
Nikhils

Re: Concurrent CREATE TABLE/DROP SCHEMA leaves inconsistent leftovers

From

Nikhil Sontakke

Date:

10 November 2011, 08:00:30

Hi,

Ok, understood.

PFA, a patch against git head. We take the AccessShareLock lock on the schema in DefineRelation now. Note that we do not want to interlock with other concurrent creations in the schema. We only want to interlock with deletion activity. So even performance wise this should not be much of an overhead in case of concurrent DDL operations to the same schema.

Adding this in DefineRelation handles creation of tables/views/types/sequences. I do not think we need to do stuff in ALTER commands, because the objects pre-exist and this issue appears to be with new objects only.

Comments?

Regards,
Nikhils

Attachment

git_head_lock_schema_ddl.patch

Re: Concurrent CREATE TABLE/DROP SCHEMA leaves inconsistent leftovers

From

Tom Lane

Date:

10 November 2011, 11:30:45

Nikhil Sontakke <nikkhils@gmail.com> writes:
> PFA, a patch against git head. We take the AccessShareLock lock on the
> schema in DefineRelation now.

Um ... why would we do this only for tables, and not for creations of
other sorts of objects that belong to schemas?

Also, if we are going to believe that this is a serious problem, what
of ALTER ... SET SCHEMA?

Also, the proposed solution is pretty silly on its face, because it has
not removed the race condition only made the window somewhat narrower.
You would have to acquire the lock as part of the initial schema lookup,
not lock the OID after the fact.  And could we please not do something
as silly as translate the OID back to a string and then look up that
string a second time?

(To be clear, I don't particularly believe that this is a problem worthy
of spending code space and cycles on.  But if it's deemed to be a
problem, I want to see a solution that's actually watertight.)
        regards, tom lane

Re: Concurrent CREATE TABLE/DROP SCHEMA leaves inconsistent leftovers

From

Nikhil Sontakke

Date:

10 November 2011, 13:48:38

Um ... why would we do this only for tables, and not for creations of
other sorts of objects that belong to schemas?

Right, we need to do it for other objects like functions etc. too.

Also, if we are going to believe that this is a serious problem, what
of ALTER ... SET SCHEMA?

I admit, I hadn't thought of this.

Also, the proposed solution is pretty silly on its face, because it has
not removed the race condition only made the window somewhat narrower.
You would have to acquire the lock as part of the initial schema lookup,
not lock the OID after the fact. And could we please not do something
as silly as translate the OID back to a string and then look up that
string a second time?

The comment mentions that part is a kluge but that we get to re-use the existing function because of it. The get_object_address function will bail out anyways if the schema has vanished from down under and it does lock it up immediately after it's found to be valid.

(To be clear, I don't particularly believe that this is a problem worthy
of spending code space and cycles on. But if it's deemed to be a
problem, I want to see a solution that's actually watertight.)

Got the message.

Regards,
Nikhils

Re: Concurrent CREATE TABLE/DROP SCHEMA leaves inconsistent leftovers

From

Daniel Farina

Date:

11 November 2011, 00:05:21

On Wed, Nov 9, 2011 at 1:56 AM, Nikhil Sontakke <nikkhils@gmail.com> wrote:
> Hi,
>
> Consider the following sequence of events:
>
> s1 #> CREATE SCHEMA test_schema;
>
> s1 #> CREATE TABLE test_schema.c1(x int);
>
> Now open another session s2 and via gdb issue a breakpoint on
> heap_create_with_catalog() which is called by DefineRelation().
>
> s2 #> CREATE TABLE test_schema.c2(y int);
>
> The above will break on the function. Now issue a drop schema in session s1
>
> s1 #> DROP SCHEMA test_schema CASCADE;
> NOTICE:  drop cascades to table test_schema.c1
> DROP SCHEMA
>
> Continuing in gdb, also completes the creation of c2 table without any
> errors. We are now left with a dangling entry in pg_class along with all the
> corresponding data files in our data directory. The problem becomes worse if
> c2 was created using a TABLESPACE. Now dropping of that tablespace does not
> work at all. Am sure we can come up with myriad such other issues.

Hmm.  Does this break pg_dump?  I have reported a bug whereby dangling
pg_class entries point to a namespace that has since been dropped in
the past (and has been reported many times before that, even).

The bug report is here, whereby I also aggregate other similar bug
reports that have taken place over a very long period of time:

http://archives.postgresql.org/pgsql-bugs/2011-02/msg00185.php

Given that the schema is successfully dropped, yet another table is
created presumably using this already-resolved schema OID, it seems
like it would run into this...

You could run this query, which should return 0, but may not in your case:

select count(distinct typnamespace) from pg_type where  not exists
(select 1 from pg_namespace where oid = pg_type.typnamespace);

--
fdr

Re: Concurrent CREATE TABLE/DROP SCHEMA leaves inconsistent leftovers

From

Nikhil Sontakke

Date:

11 November 2011, 02:04:24

> Continuing in gdb, also completes the creation of c2 table without any
> errors. We are now left with a dangling entry in pg_class along with all the
> corresponding data files in our data directory. The problem becomes worse if
> c2 was created using a TABLESPACE. Now dropping of that tablespace does not
> work at all. Am sure we can come up with myriad such other issues.

Hmm. Does this break pg_dump? I have reported a bug whereby dangling
pg_class entries point to a namespace that has since been dropped in
the past (and has been reported many times before that, even).

Sure does! I just tried it and got:
pg_dump: schema with OID 16384 does not exist

The bug report is here, whereby I also aggregate other similar bug
reports that have taken place over a very long period of time:

http://archives.postgresql.org/pgsql-bugs/2011-02/msg00185.php

I guess we DO need to pay attention to fix this properly now as there are some reports with 9.x too. And I have just provided a way to reproduce this reliably too.

Regards,
Nikhils

Re: Concurrent CREATE TABLE/DROP SCHEMA leaves inconsistent leftovers

From

Robert Haas

Date:

12 November 2011, 03:11:55

On Thu, Nov 10, 2011 at 7:00 AM, Nikhil Sontakke <nikkhils@gmail.com> wrote:
> PFA, a patch against git head. We take the AccessShareLock lock on the
> schema in DefineRelation now. Note that we do not want to interlock with
> other concurrent creations in the schema. We only want to interlock with
> deletion activity. So even performance wise this should not be much of an
> overhead in case of concurrent DDL operations to the same schema.

If all you need to do is lock a schema, you can just call
LockDatabaseObject(NamespaceRelationId, namespace_oid, 0,
AccessShareLock); there's no need to fake up an objectaddress just to
take a lock.  But I think that's not really all you need to do,
because somebody could drop the namespace between the time that you
decide what OID to lock and the time you acquire the lock.  So I think
you need something like what we did in RangeVarGetRelid().  See
attached patch.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment

lock-namespace.patch

Re: Concurrent CREATE TABLE/DROP SCHEMA leaves inconsistent leftovers

From

Nikhil Sontakke

Date:

14 November 2011, 12:48:47

If all you need to do is lock a schema, you can just call
LockDatabaseObject(NamespaceRelationId, namespace_oid, 0,
AccessShareLock); there's no need to fake up an objectaddress just to
take a lock. But I think that's not really all you need to do,
because somebody could drop the namespace between the time that you
decide what OID to lock and the time you acquire the lock. So I think
you need something like what we did in RangeVarGetRelid(). See
attached patch.

Thanks Robert. But currently there are very few callers of RangeVarGetAndCheckCreationNamespace() function. For the sake of completeness we will have to introduce a call to this function while creating all other objects too.

Regards,
Nikhils

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Concurrent CREATE TABLE/DROP SCHEMA leaves inconsistent leftovers

From

Robert Haas

Date:

14 November 2011, 13:39:32

On Mon, Nov 14, 2011 at 11:48 AM, Nikhil Sontakke
<nikhil.sontakke@enterprisedb.com> wrote:
>> If all you need to do is lock a schema, you can just call
>> LockDatabaseObject(NamespaceRelationId, namespace_oid, 0,
>> AccessShareLock); there's no need to fake up an objectaddress just to
>> take a lock.  But I think that's not really all you need to do,
>> because somebody could drop the namespace between the time that you
>> decide what OID to lock and the time you acquire the lock.  So I think
>> you need something like what we did in RangeVarGetRelid().  See
>> attached patch.
>
> Thanks Robert. But currently there are very few callers of
> RangeVarGetAndCheckCreationNamespace() function. For the sake of
> completeness we will have to introduce a call to this function while
> creating all other objects too.

Well, RangeVarGetAndCheckCreationNamespace is only (and can only) be
used for relations.  To get similar protection for other object types,
we'd need to add a similar logic elsewhere.  I haven't looked at where
it would need to go.

In fact, I think that the technique demonstrated here (which was
pioneered by Noah Misch) is actually quite general, and there are
probably a lot of places where we need to be doing it but currently
are not.  So it's probably going to take a while to get this
completely nailed down, but we can keep chipping away at it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Concurrent CREATE TABLE/DROP SCHEMA leaves inconsistent leftovers

From

Nikhil Sontakke

Date:

14 November 2011, 13:48:31

So it's probably going to take a while to get this
completely nailed down, but we can keep chipping away at it.

Agreed. So are you planning to commit this change? Or we want some more objects to be fixed? Last I looked at this, we will need locking to be done while creating tables, views, types, sequences, functions, indexes, extensions, constraints, operators stuff, ts stuff, rules, domains, etc. that can go into schemas.

So might even make sense to write a schema specific function based on your patch template to cater in general to schema locking during object creation.

Regards,
Nikhils

Re: Concurrent CREATE TABLE/DROP SCHEMA leaves inconsistent leftovers

From

Robert Haas

Date:

14 November 2011, 14:56:52

On Mon, Nov 14, 2011 at 12:48 PM, Nikhil Sontakke <nikkhils@gmail.com> wrote:
>> So it's probably going to take a while to get this
>> completely nailed down, but we can keep chipping away at it.
>
> Agreed. So are you planning to commit this change? Or we want some more
> objects to be fixed? Last I looked at this, we will need locking to be done
> while creating tables, views, types, sequences, functions, indexes,
> extensions, constraints, operators stuff, ts stuff, rules, domains, etc.
> that can go into schemas.

<reads the code>

Well, it looks to me like there are three different places that we
need to nail down: RangeVarGetAndCheckCreationNamespace() is used for
relations (except that a few places call RangeVarGetCreationNamespace
directly, which means my previous patch probably needs some tweaking
before commit), QualifiedNameGetCreationNamespace() is used for pretty
much all other schema-qualified objects, and LookupCreationNamespace()
is used for ALTER BLAH SET SCHEMA (which I think has a problem when
you rename an object into a schema that is concurrently being
dropped).

I'm fairly unhappy with the idea of modifying a function that is
described as doing a "get" or "lookup" to have the side effect of
"locking something".  So probably some renaming or refactoring is in
order here.  It seems like we're duplicating almost identical logic in
an awful lot of places in namespace.c.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Concurrent CREATE TABLE/DROP SCHEMA leaves inconsistent leftovers

From

Alvaro Herrera

Date:

14 November 2011, 15:54:30

Excerpts from Robert Haas's message of lun nov 14 15:56:43 -0300 2011:

> Well, it looks to me like there are three different places that we
> need to nail down: RangeVarGetAndCheckCreationNamespace() is used for
> relations (except that a few places call RangeVarGetCreationNamespace
> directly, which means my previous patch probably needs some tweaking
> before commit), QualifiedNameGetCreationNamespace() is used for pretty
> much all other schema-qualified objects, and LookupCreationNamespace()
> is used for ALTER BLAH SET SCHEMA (which I think has a problem when
> you rename an object into a schema that is concurrently being
> dropped).
> 
> I'm fairly unhappy with the idea of modifying a function that is
> described as doing a "get" or "lookup" to have the side effect of
> "locking something".  So probably some renaming or refactoring is in
> order here.  It seems like we're duplicating almost identical logic in
> an awful lot of places in namespace.c.

So RangeVarGetCheckAndLockCreationNamespace(), uh?  Pity you can't
stick a comma in there.

-- 
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: Concurrent CREATE TABLE/DROP SCHEMA leaves inconsistent leftovers

From

Robert Haas

Date:

14 November 2011, 16:08:39

On Mon, Nov 14, 2011 at 2:26 PM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:
> Excerpts from Robert Haas's message of lun nov 14 15:56:43 -0300 2011:
>
>> Well, it looks to me like there are three different places that we
>> need to nail down: RangeVarGetAndCheckCreationNamespace() is used for
>> relations (except that a few places call RangeVarGetCreationNamespace
>> directly, which means my previous patch probably needs some tweaking
>> before commit), QualifiedNameGetCreationNamespace() is used for pretty
>> much all other schema-qualified objects, and LookupCreationNamespace()
>> is used for ALTER BLAH SET SCHEMA (which I think has a problem when
>> you rename an object into a schema that is concurrently being
>> dropped).
>>
>> I'm fairly unhappy with the idea of modifying a function that is
>> described as doing a "get" or "lookup" to have the side effect of
>> "locking something".  So probably some renaming or refactoring is in
>> order here.  It seems like we're duplicating almost identical logic in
>> an awful lot of places in namespace.c.
>
> So RangeVarGetCheckAndLockCreationNamespace(), uh?  Pity you can't
> stick a comma in there.

Yeah, really.  :-)

Actually, I think that one could probably stay as-is.  "Check" implies
that there's something else going on besides just a lookup, and we
can't go nuts with it.  I'm more concerned about
QualifiedNameGetCreationNamespace() and LookupCreationNamespace().

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Concurrent CREATE TABLE/DROP SCHEMA leaves inconsistent leftovers

From

Daniel Farina

Date:

29 November 2011, 04:38:14

On Mon, Nov 14, 2011 at 12:07 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Mon, Nov 14, 2011 at 2:26 PM, Alvaro Herrera
> <alvherre@commandprompt.com> wrote:
>> Excerpts from Robert Haas's message of lun nov 14 15:56:43 -0300 2011:
>>
>>> Well, it looks to me like there are three different places that we
>>> need to nail down: RangeVarGetAndCheckCreationNamespace() is used for
>>> relations (except that a few places call RangeVarGetCreationNamespace
>>> directly, which means my previous patch probably needs some tweaking
>>> before commit), QualifiedNameGetCreationNamespace() is used for pretty
>>> much all other schema-qualified objects, and LookupCreationNamespace()
>>> is used for ALTER BLAH SET SCHEMA (which I think has a problem when
>>> you rename an object into a schema that is concurrently being
>>> dropped).
>>>
>>> I'm fairly unhappy with the idea of modifying a function that is
>>> described as doing a "get" or "lookup" to have the side effect of
>>> "locking something".  So probably some renaming or refactoring is in
>>> order here.  It seems like we're duplicating almost identical logic in
>>> an awful lot of places in namespace.c.
>>
>> So RangeVarGetCheckAndLockCreationNamespace(), uh?  Pity you can't
>> stick a comma in there.
>
> Yeah, really.  :-)
>
> Actually, I think that one could probably stay as-is.  "Check" implies
> that there's something else going on besides just a lookup, and we
> can't go nuts with it.  I'm more concerned about
> QualifiedNameGetCreationNamespace() and LookupCreationNamespace().

Hmm, just to prod this thread: has any fix for this been committed?
After Nikhil confirmed that this bug could cause pg_dump to not be
able to operate without direct catalog surgery I have encountered this
bug (and treated its symptoms in the same manner) twice.  I tried to
do my homework in a backbranch, but am not seeing anything beaming out
at me.

Cheers,

--
fdr

Re: Concurrent CREATE TABLE/DROP SCHEMA leaves inconsistent leftovers

From

Robert Haas

Date:

29 November 2011, 11:10:21

On Tue, Nov 29, 2011 at 3:37 AM, Daniel Farina <daniel@heroku.com> wrote:
> Hmm, just to prod this thread: has any fix for this been committed?
> After Nikhil confirmed that this bug could cause pg_dump to not be
> able to operate without direct catalog surgery I have encountered this
> bug (and treated its symptoms in the same manner) twice.  I tried to
> do my homework in a backbranch, but am not seeing anything beaming out
> at me.

I have plans to try to improve this, but it's one of those things that
I care about more than the people who write the checks do, so it
hasn't quite gotten to the top of the priority list yet.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Concurrent CREATE TABLE/DROP SCHEMA leaves inconsistent leftovers

From

Robert Haas

Date:

13 January 2012, 18:05:56

On Tue, Nov 29, 2011 at 10:10 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Nov 29, 2011 at 3:37 AM, Daniel Farina <daniel@heroku.com> wrote:
>> Hmm, just to prod this thread: has any fix for this been committed?
>> After Nikhil confirmed that this bug could cause pg_dump to not be
>> able to operate without direct catalog surgery I have encountered this
>> bug (and treated its symptoms in the same manner) twice.  I tried to
>> do my homework in a backbranch, but am not seeing anything beaming out
>> at me.
>
> I have plans to try to improve this, but it's one of those things that
> I care about more than the people who write the checks do, so it
> hasn't quite gotten to the top of the priority list yet.

All right... I have a patch that I think fixes this, at least so far
as relations are concerned.  I rewrote
RangeVarGetAndCheckCreationNamespace() extensively, did surgery on its
callers, and then modified CREATE OR REPLACE VIEW and ALTER <relkind>
.. SET SCHEMA to make use of it rather than doing things as they did
before.

So this patch prevents (1) concurrently dropping a schema in which a
new relation is being created, (2) concurrently dropping a schema into
which an existing relation is being moved, and (3) using CREATE OR
REPLACE VIEW to attempt to obtain a lock on a relation you don't own
(the command would eventually fail, of course, but if there were, say,
an outstanding AccessShareLock on the relation, you'd queue up for the
lock and thus prevent any further locks from being granted, despite
your lack of any rights on the target).

It doesn't fix any of the non-relation object types.  That would be
nice to do, but I think it's material for a separate patch.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment

concurrent-drop-schema.patch

Re: Concurrent CREATE TABLE/DROP SCHEMA leaves inconsistent leftovers

From

Hitoshi Harada

Date:

14 January 2012, 06:25:14

On Fri, Jan 13, 2012 at 2:05 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Nov 29, 2011 at 10:10 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>>
>> I have plans to try to improve this, but it's one of those things that
>> I care about more than the people who write the checks do, so it
>> hasn't quite gotten to the top of the priority list yet.
>
> All right... I have a patch that I think fixes this, at least so far
> as relations are concerned.  I rewrote
> RangeVarGetAndCheckCreationNamespace() extensively, did surgery on its
> callers, and then modified CREATE OR REPLACE VIEW and ALTER <relkind>
> .. SET SCHEMA to make use of it rather than doing things as they did
> before.
>
> So this patch prevents (1) concurrently dropping a schema in which a
> new relation is being created, (2) concurrently dropping a schema into
> which an existing relation is being moved, and (3) using CREATE OR
> REPLACE VIEW to attempt to obtain a lock on a relation you don't own
> (the command would eventually fail, of course, but if there were, say,
> an outstanding AccessShareLock on the relation, you'd queue up for the
> lock and thus prevent any further locks from being granted, despite
> your lack of any rights on the target).

The patch looks ok, though I wonder if we could have a way to release
the lock on namespace much before the end of transaction. Since the
lock is held all the time, now the possibility of dead lock is bigger.
Say,

-- tx1
BEGIN;
SELECT * FROM s.x;
DROP SCHEMA t;

-- tx2
BEGIN;
SELECT * FROM t.y;
DROP SCHEMA s;

I know it's a limited situation, though. Maybe the right way would be
to check the namespace at the end of the transaction if any object is
created, rather than locking it.

> It doesn't fix any of the non-relation object types.  That would be
> nice to do, but I think it's material for a separate patch.

I took a quick look under src/include/catalog and the objects that
have namespace are

- collation
- constraint
- conversion
- extension
- operator
- operator class
- operator family
- function (proc)
- ts_config
- ts_dict
- ts_parser
- ts_template
- (what's missing?)

I agree with you that it's not worth doing everything, but function is
nice to have. I don't mind if we don't have it with the other objects.

Thanks,
--
Hitoshi Harada

Re: Concurrent CREATE TABLE/DROP SCHEMA leaves inconsistent leftovers

From

Hitoshi Harada

Date:

14 January 2012, 06:37:42

On Sat, Jan 14, 2012 at 2:25 AM, Hitoshi Harada <umi.tanuki@gmail.com> wrote:
> On Fri, Jan 13, 2012 at 2:05 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Tue, Nov 29, 2011 at 10:10 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>>>
>>> I have plans to try to improve this, but it's one of those things that
>>> I care about more than the people who write the checks do, so it
>>> hasn't quite gotten to the top of the priority list yet.
>>
>> All right... I have a patch that I think fixes this, at least so far
>> as relations are concerned.  I rewrote
>> RangeVarGetAndCheckCreationNamespace() extensively, did surgery on its
>> callers, and then modified CREATE OR REPLACE VIEW and ALTER <relkind>
>> .. SET SCHEMA to make use of it rather than doing things as they did
>> before.
>>
>> So this patch prevents (1) concurrently dropping a schema in which a
>> new relation is being created, (2) concurrently dropping a schema into
>> which an existing relation is being moved, and (3) using CREATE OR
>> REPLACE VIEW to attempt to obtain a lock on a relation you don't own
>> (the command would eventually fail, of course, but if there were, say,
>> an outstanding AccessShareLock on the relation, you'd queue up for the
>> lock and thus prevent any further locks from being granted, despite
>> your lack of any rights on the target).
>
> The patch looks ok, though I wonder if we could have a way to release
> the lock on namespace much before the end of transaction. Since the
> lock is held all the time, now the possibility of dead lock is bigger.
> Say,
>
> -- tx1
> BEGIN;
> SELECT * FROM s.x;
> DROP SCHEMA t;
>
> -- tx2
> BEGIN;
> SELECT * FROM t.y;
> DROP SCHEMA s;

Although I wrote
> I know it's a limited situation, though. Maybe the right way would be
> to check the namespace at the end of the transaction if any object is
> created, rather than locking it.

actually what's surprising to me is that even SELECT holds lock on
namespace to the end of transaction. The ideal way is that we lock
only on CREATE or other DDL.

Thanks,
--
Hitoshi Harada

Re: Concurrent CREATE TABLE/DROP SCHEMA leaves inconsistent leftovers

From

Robert Haas

Date:

14 January 2012, 10:48:50

On Sat, Jan 14, 2012 at 5:25 AM, Hitoshi Harada <umi.tanuki@gmail.com> wrote:
> The patch looks ok, though I wonder if we could have a way to release
> the lock on namespace much before the end of transaction.

Well, that wold kind of miss the point, wouldn't it?  I mean, the race
is that the process dropping the schema might not see the newly
created object.  The only way to prevent that is to hold some sort of
lock until the newly created object is visible, which doesn't happen
until commit.

> I know it's a limited situation, though. Maybe the right way would be
> to check the namespace at the end of the transaction if any object is
> created, rather than locking it.

And then what?  If you find that you created an object in a namespace
that's been subsequently dropped, what do you do about that?  As far
as I can see, your own really choice would be to roll back the
transaction that uncovers this problem, which is likely to produce
more rollbacks than just letting the deadlock detector sort it out.

>> It doesn't fix any of the non-relation object types.  That would be
>> nice to do, but I think it's material for a separate patch.
>
> I took a quick look under src/include/catalog and the objects that
> have namespace are
>
> - collation
> - constraint
> - conversion
> - extension
> - operator
> - operator class
> - operator family
> - function (proc)
> - ts_config
> - ts_dict
> - ts_parser
> - ts_template
> - (what's missing?)
>
> I agree with you that it's not worth doing everything, but function is
> nice to have. I don't mind if we don't have it with the other objects.

I think the fix for all of them will be very symmetrical; it's just
relations that have a different code path.  I don't see the point of
plugging some but not others; that just provides a mixture of good and
bad examples for future hackers to copy, which doesn't seem ideal.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Concurrent CREATE TABLE/DROP SCHEMA leaves inconsistent leftovers

From

Hitoshi Harada

Date:

15 January 2012, 12:03:46

On Sat, Jan 14, 2012 at 6:48 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Sat, Jan 14, 2012 at 5:25 AM, Hitoshi Harada <umi.tanuki@gmail.com> wrote:
>> The patch looks ok, though I wonder if we could have a way to release
>> the lock on namespace much before the end of transaction.
>
> Well, that wold kind of miss the point, wouldn't it?  I mean, the race
> is that the process dropping the schema might not see the newly
> created object.  The only way to prevent that is to hold some sort of
> lock until the newly created object is visible, which doesn't happen
> until commit.
>
>> I know it's a limited situation, though. Maybe the right way would be
>> to check the namespace at the end of the transaction if any object is
>> created, rather than locking it.
>
> And then what?  If you find that you created an object in a namespace
> that's been subsequently dropped, what do you do about that?  As far
> as I can see, your own really choice would be to roll back the
> transaction that uncovers this problem, which is likely to produce
> more rollbacks than just letting the deadlock detector sort it out.

Right. I thought this patch introduced lock on schema in SELECT, but
actually we'd been locking schema with SELECT since before the patch.
So the behavior becomes consistent (between SELECT and CREATE) now
with it. And I agree that my insist is pointless after looking more.

Thanks,
--
Hitoshi Harada

Re: Concurrent CREATE TABLE/DROP SCHEMA leaves inconsistent leftovers

From

Robert Haas

Date:

16 January 2012, 10:53:41

On Sun, Jan 15, 2012 at 11:03 AM, Hitoshi Harada <umi.tanuki@gmail.com> wrote:
> On Sat, Jan 14, 2012 at 6:48 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Sat, Jan 14, 2012 at 5:25 AM, Hitoshi Harada <umi.tanuki@gmail.com> wrote:
>>> The patch looks ok, though I wonder if we could have a way to release
>>> the lock on namespace much before the end of transaction.
>>
>> Well, that wold kind of miss the point, wouldn't it?  I mean, the race
>> is that the process dropping the schema might not see the newly
>> created object.  The only way to prevent that is to hold some sort of
>> lock until the newly created object is visible, which doesn't happen
>> until commit.
>>
>>> I know it's a limited situation, though. Maybe the right way would be
>>> to check the namespace at the end of the transaction if any object is
>>> created, rather than locking it.
>>
>> And then what?  If you find that you created an object in a namespace
>> that's been subsequently dropped, what do you do about that?  As far
>> as I can see, your own really choice would be to roll back the
>> transaction that uncovers this problem, which is likely to produce
>> more rollbacks than just letting the deadlock detector sort it out.
>
> Right. I thought this patch introduced lock on schema in SELECT, but
> actually we'd been locking schema with SELECT since before the patch.

No, we lock the table with SELECT, not the schema.  This doesn't add
any additional locking for DML: nor is any needed, AFAICS.

> So the behavior becomes consistent (between SELECT and CREATE) now
> with it. And I agree that my insist is pointless after looking more.

OK.  Thanks for your review.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company