Thread: Prevent concurrent DROP SCHEMA when certain objects are beinginitially created in the namespace

Hello,

When an empty namespace is being initially populated with certain objects, it is possible for a DROP SCHEMA operation to come in and delete the namespace without using CASCADE.  These objects would be created but are left unusable.  This is capable of leaving behind pg_class, pg_type, and other such entries with invalid namespace values that need to be manually removed by the user.  To prevent this from happening, we should take an AccessShareLock on the namespace of which the type, function, etc. is being added to.

Attached is a patch that covers the following CREATE commands:
CREATE TYPE, CREATE FUNCTION, CREATE AGGREGATE, CREATE OPERATOR FAMILY, CREATE OPERATOR CLASS, and CREATE OPERATOR

Example Reproduction
Session1: CREATE SCHEMA testschema;
Session1: BEGIN;
Session1: CREATE TYPE testschema.testtype;
Session2: DROP SCHEMA testschema;
Session1: COMMIT;
// The DROP SCHEMA schema succeeds in session 2 and session 1's COMMIT goes through without error but the pg_type entry will still be there for typname testtype.  This example is for just a simple TYPE object but the other objects can be reproduced the same way in place of CREATE TYPE.

Related Postgres 9.2 commit (note in commit message that "We need similar protection for all other object types"):

The patch itself is relatively trivial by just adding copy/paste lock acquires near other lock acquires.  I wasn't sure if this needed a decently sized refactor such as the referenced commit above.

Regards,
Jimmy
Attachment
On Tue, Sep 04, 2018 at 03:09:21PM -0700, Jimmy Yih wrote:
> When an empty namespace is being initially populated with certain objects,
> it is possible for a DROP SCHEMA operation to come in and delete the
> namespace without using CASCADE.  These objects would be created but are
> left unusable.  This is capable of leaving behind pg_class, pg_type, and
> other such entries with invalid namespace values that need to be manually
> removed by the user.  To prevent this from happening, we should take an
> AccessShareLock on the namespace of which the type, function, etc. is being
> added to.
>
> Attached is a patch that covers the following CREATE commands:
> CREATE TYPE, CREATE FUNCTION, CREATE AGGREGATE, CREATE OPERATOR FAMILY,
> CREATE OPERATOR CLASS, and CREATE OPERATOR
>
> [...]
>
> The patch itself is relatively trivial by just adding copy/paste lock
> acquires near other lock acquires.  I wasn't sure if this needed a decently
> sized refactor such as the referenced commit above.

It seems to me that we are missing some dependency tracking in some of
those cases.  For example if you use a CREATE TYPE AS (...), then the
DROP SCHEMA would fail without specifying CASCADE, and session 2 would
block without a CASCADE.  So instead of adding more code paths where
LockDatabaseObject() is used, I would expect session 2 to block in
get_object_address all the time, which looks a bit lossy now by the
way.

Something which would be good to have for all those queries is a set of
isolation tests.  No need for multiple specs, you could just use one
spec with one session defining all the object types you would like to
work on.  How did you find this object list?  Did you test all the
objects available manually?

Please let me play with what you have here a bit..  If you have an
isolation test spec at hand, that would be helpful.
--
Michael

Attachment
Michael Paquier <michael@paquier.xyz> writes:
> On Tue, Sep 04, 2018 at 03:09:21PM -0700, Jimmy Yih wrote:
>> When an empty namespace is being initially populated with certain objects,
>> it is possible for a DROP SCHEMA operation to come in and delete the
>> namespace without using CASCADE.

> It seems to me that we are missing some dependency tracking in some of
> those cases.

No, I think Jimmy is right: it's a race condition.  The pg_depend entry
would produce the right result, except that it's not committed yet so
the DROP SCHEMA doesn't see it.

The bigger question is whether we want to do anything about this.
Historically we've not bothered with locking on database objects that
don't represent storage (ie, relations and databases).  If we're going to
take this seriously, then we should for example also acquire lock on any
function that's referenced in a view definition, to ensure it doesn't go
away before the view is committed and its dependencies become visible.
Likewise for operators, opclasses, collations, text search objects, you
name it.  And worse, we'd really need this sort of locking even in vanilla
DML queries, since objects could easily go away before the query is done.

I think that line of thought leads to an enormous increase in locking
overhead, for which we'd get little if any gain in usability.  So my
inclination is to make an engineering judgment that we won't fix this.

It'd be interesting to know whether any other DBMSes make an effort
to prevent this kind of problem.

            regards, tom lane



On September 4, 2018 9:11:25 PM PDT, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>Michael Paquier <michael@paquier.xyz> writes:
>> On Tue, Sep 04, 2018 at 03:09:21PM -0700, Jimmy Yih wrote:
>>> When an empty namespace is being initially populated with certain
>objects,
>>> it is possible for a DROP SCHEMA operation to come in and delete the
>>> namespace without using CASCADE.
>
>> It seems to me that we are missing some dependency tracking in some
>of
>> those cases.
>
>No, I think Jimmy is right: it's a race condition.  The pg_depend entry
>would produce the right result, except that it's not committed yet so
>the DROP SCHEMA doesn't see it.
>
>The bigger question is whether we want to do anything about this.
>Historically we've not bothered with locking on database objects that
>don't represent storage (ie, relations and databases).  If we're going
>to
>take this seriously, then we should for example also acquire lock on
>any
>function that's referenced in a view definition, to ensure it doesn't
>go
>away before the view is committed and its dependencies become visible.
>Likewise for operators, opclasses, collations, text search objects, you
>name it.  And worse, we'd really need this sort of locking even in
>vanilla
>DML queries, since objects could easily go away before the query is
>done.
>
>I think that line of thought leads to an enormous increase in locking
>overhead, for which we'd get little if any gain in usability.  So my
>inclination is to make an engineering judgment that we won't fix this.

Haven't we already significantly started down this road, to avoid a lot of the "tuple concurrently updated" type
errors?Would expanding this a git further really be that noticeable? 

Andres
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.


Andres Freund <andres@anarazel.de> writes:
> On September 4, 2018 9:11:25 PM PDT, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I think that line of thought leads to an enormous increase in locking
>> overhead, for which we'd get little if any gain in usability.  So my
>> inclination is to make an engineering judgment that we won't fix this.

> Haven't we already significantly started down this road, to avoid a lot of the "tuple concurrently updated" type
errors?

Not that I'm aware of.  We do not take locks on schemas, nor functions,
nor any other of the object types I mentioned.

> Would expanding this a git further really be that noticeable?

Frankly, I think it would be not so much "noticeable" as "disastrous".

Making the overhead tolerable would require very large compromises
in coverage, perhaps like "we'll only lock during DDL not DML".
At which point I'd question why bother.  We've seen no field reports
(that I can recall offhand, anyway) that trace to not locking these
objects.

            regards, tom lane


Something which would be good to have for all those queries is a set of
isolation tests.  No need for multiple specs, you could just use one
spec with one session defining all the object types you would like to
work on.  How did you find this object list?  Did you test all the
objects available manually?
Attached the isolation spec file.  I originally was only going to fix the simple CREATE TYPE scenario but decided to look up other objects that can reside in namespaces and ended up finding a handful of others.  I tested each one manually before and after adding the AccessShareLock acquire on the schema.

I think that line of thought leads to an enormous increase in locking
overhead, for which we'd get little if any gain in usability.  So my
inclination is to make an engineering judgment that we won't fix this.
As I was creating this patch, I had similar feelings on the locking overhead and was curious how others would feel about it as well.

Regards,
Jimmy


On Tue, Sep 4, 2018 at 10:05 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Andres Freund <andres@anarazel.de> writes:
> On September 4, 2018 9:11:25 PM PDT, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I think that line of thought leads to an enormous increase in locking
>> overhead, for which we'd get little if any gain in usability.  So my
>> inclination is to make an engineering judgment that we won't fix this.

> Haven't we already significantly started down this road, to avoid a lot of the "tuple concurrently updated" type errors?

Not that I'm aware of.  We do not take locks on schemas, nor functions,
nor any other of the object types I mentioned.

> Would expanding this a git further really be that noticeable?

Frankly, I think it would be not so much "noticeable" as "disastrous".

Making the overhead tolerable would require very large compromises
in coverage, perhaps like "we'll only lock during DDL not DML".
At which point I'd question why bother.  We've seen no field reports
(that I can recall offhand, anyway) that trace to not locking these
objects.

                        regards, tom lane

Attachment
Hi,

On 2018-09-05 01:05:54 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On September 4, 2018 9:11:25 PM PDT, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> I think that line of thought leads to an enormous increase in locking
> >> overhead, for which we'd get little if any gain in usability.  So my
> >> inclination is to make an engineering judgment that we won't fix this.
> 
> > Haven't we already significantly started down this road, to avoid a lot of the "tuple concurrently updated" type
errors?
> 
> Not that I'm aware of.  We do not take locks on schemas, nor functions,
> nor any other of the object types I mentioned.

Well, we kinda do, during some of their own DDL. CF
AcquireDeletionLock(), RangeVarGetAndCheckCreationNamespace(), and other
LockDatabaseObject() callers.  The
RangeVarGetAndCheckCreationNamespace() even locks the schema an object
is created in , which is pretty much what we're discussing here.

I thinkt he problem with the current logic is more that the
findDependentObjects() scan doesn't use a "dirty" scan, so it doesn't
ever get to seeing conflicting operations.


> > Would expanding this a git further really be that noticeable?
> 
> Frankly, I think it would be not so much "noticeable" as "disastrous".
> 
> Making the overhead tolerable would require very large compromises
> in coverage, perhaps like "we'll only lock during DDL not DML".
> At which point I'd question why bother.  We've seen no field reports
> (that I can recall offhand, anyway) that trace to not locking these
> objects.

Why would "we'll only lock during DDL not DML" be such a large
compromise? To me that's a pretty darn reasonable subset - preventing
corruption of the catalog contents is, uh, good?

Greetings,

Andres Freund


On Wed, Sep 05, 2018 at 12:14:41AM -0700, Jimmy Yih wrote:
> Attached the isolation spec file.  I originally was only going to fix the
> simple CREATE TYPE scenario but decided to look up other objects that can
> reside in namespaces and ended up finding a handful of others.  I tested
> each one manually before and after adding the AccessShareLock acquire on
> the schema.

(You should avoid top-posting, this is not the style of the lists.)

Thanks.  One problem I have with what you have here is that you just
test only locking path as the session dropping the session would just
block on the first concurrent object it finds.  If you don't mind I am
just stealing it, and extending it a bit ;)
--
Michael

Attachment
On Wed, Sep 05, 2018 at 09:23:37AM -0700, Andres Freund wrote:
> Well, we kinda do, during some of their own DDL. CF
> AcquireDeletionLock(), RangeVarGetAndCheckCreationNamespace(), and other
> LockDatabaseObject() callers.  The
> RangeVarGetAndCheckCreationNamespace() even locks the schema an object
> is created in , which is pretty much what we're discussing here.
>
> I think the problem with the current logic is more that the
> findDependentObjects() scan doesn't use a "dirty" scan, so it doesn't
> ever get to seeing conflicting operations.

Well, it seems to me that we don't necessarily want to go down to that
for sure on back-branches.  What's actually striking me as a very bad
thing is the inconsistent behavior when we need to get a namespace OID
after calling QualifiedNameGetCreationNamespace using a list of defnames
which does not lock the schema the DDL is working on.  And this happens
for basically all the object types Jimmy has mentioned.

For example, when creating a composite type, the namespace lock is taken
through RangeVarGetAndCheckCreationNamespace.  If a user tries to create
a root type, then we simply don't lock it, which leads to an
inconsistency of behavior between composite types and root types.  In my
opinion the inconsistency of behavior for the same DDL is not logic.

So I have been looking at that, and finished with the attached, which
actually fixes the set of problems reported.  Thoughts?
--
Michael

Attachment

Re: Prevent concurrent DROP SCHEMA when certain objects are beinginitially created in the namespace

From
Fabrízio de Royes Mello
Date:

On Thu, Sep 6, 2018 at 4:22 PM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Wed, Sep 05, 2018 at 09:23:37AM -0700, Andres Freund wrote:
> > Well, we kinda do, during some of their own DDL. CF
> > AcquireDeletionLock(), RangeVarGetAndCheckCreationNamespace(), and other
> > LockDatabaseObject() callers.  The
> > RangeVarGetAndCheckCreationNamespace() even locks the schema an object
> > is created in , which is pretty much what we're discussing here.
> >
> > I think the problem with the current logic is more that the
> > findDependentObjects() scan doesn't use a "dirty" scan, so it doesn't
> > ever get to seeing conflicting operations.
>
> Well, it seems to me that we don't necessarily want to go down to that
> for sure on back-branches.  What's actually striking me as a very bad
> thing is the inconsistent behavior when we need to get a namespace OID
> after calling QualifiedNameGetCreationNamespace using a list of defnames
> which does not lock the schema the DDL is working on.  And this happens
> for basically all the object types Jimmy has mentioned.
>
> For example, when creating a composite type, the namespace lock is taken
> through RangeVarGetAndCheckCreationNamespace.  If a user tries to create
> a root type, then we simply don't lock it, which leads to an
> inconsistency of behavior between composite types and root types.  In my
> opinion the inconsistency of behavior for the same DDL is not logic.
>
> So I have been looking at that, and finished with the attached, which
> actually fixes the set of problems reported.  Thoughts?

Hi,

I applied to current master and run "make check-world" and everything is ok...

I also run some similar tests as Jimmy pointed and using PLpgSQL to execute DDLs and the new consistent behavior is ok. Also I run one session using DROP SCHEMA at end and after COMMIT the session 2 report 'ERROR:  schema "testschema" does not exist', so avoiding concerns about lock overhead seems the proposed patch is ok.

Regards,

--
   Fabrízio de Royes Mello         Timbira - http://www.timbira.com.br/
   PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
On Thu, Sep 06, 2018 at 05:19:15PM -0300, Fabrízio de Royes Mello wrote:
> I also run some similar tests as Jimmy pointed and using PLpgSQL to execute
> DDLs and the new consistent behavior is ok. Also I run one session using
> DROP SCHEMA at end and after COMMIT the session 2 report 'ERROR:  schema
> "testschema" does not exist', so avoiding concerns about lock overhead
> seems the proposed patch is ok.

Thanks Fabrízio for the review.

I think so too, patching the low-level API is I think a proper way to go
particularly for back-branches because referencing objects which do not
exist at catalog level is a consistency problem.

Double-checking for the callers of QualifiedNameGetCreationNamespace,
CREATE STATISTICS could be called with CREATE TABLE LIKE, where it would
not get called but the table reference blocks the schema drop.

I am thinking about adding more tests to cover all the callers of
QualifiedNameGetCreationNamespace with:
- range type
- domain
- enum type
- statictics
- text search
- composite type

This way if any refactoring is done with this routine, then we don't
break schema lock logic.  Andres, Tom and others, any objections?
--
Michael

Attachment
Michael Paquier <michael@paquier.xyz> writes:
> This way if any refactoring is done with this routine, then we don't
> break schema lock logic.  Andres, Tom and others, any objections?

I'm still not very happy about this, mainly because it seems like
(a) it's papering over just a small fraction of the true problem
and (b) there's been no discussion about cost-benefit tradeoffs.
What's it going to cost us in terms of additional locking --- not
only performance, but the potential for new deadlock cases ---
and does this really fix enough real-world problems to be worth it?

            regards, tom lane


On Sat, Sep 08, 2018 at 04:21:34PM -0400, Tom Lane wrote:
> I'm still not very happy about this, mainly because it seems like
> (a) it's papering over just a small fraction of the true problem

Check.

> (b) there's been no discussion about cost-benefit tradeoffs.

Noted.  I am not sure how to actually prove how an approach is more
beneficial than another.

> What's it going to cost us in terms of additional locking --- not
> only performance, but the potential for new deadlock cases ---

That's more complicated, the cross-lock test coverage is poor.
Designing tests which show the behavior we'd expect is easy enough,
trying to imagine all deadlock issues is harder.  One thing which comes
with deadlocking is usually lock upgrade.  We could put more safeguards
to make sure that a given locked object does not get locked more.

> and does this really fix enough real-world problems to be worth it?

Using a dirty snapshots would likely help in most cases if we take a
low-level approach as Andres has suggested.  This is invasive though
after spending an hour or so looking at how that would be doable when
fetching the object dependencies.  And not back-patchable.  The lack of
complains about a rather old set of problems brings the point of doing
something only on HEAD, surely.

Another thing which has crossed my mind would be get completely rid of
QualifiedNameGetCreationNamespace and replace it with a set of RangeVar
on which we could apply RangeVarGetAndCheckCreationNamespace.  And we
expect callers to check for ACL_CREATE on the namespace in all cases.
That would simplify handling for all objects ...  Do you think that this
would be acceptable?  I am talking about objects which can be
schema-qualified.
--
Michael

Attachment
On Wed, Sep 5, 2018 at 1:05 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Would expanding this a git further really be that noticeable?
>
> Frankly, I think it would be not so much "noticeable" as "disastrous".
>
> Making the overhead tolerable would require very large compromises
> in coverage, perhaps like "we'll only lock during DDL not DML".
> At which point I'd question why bother.  We've seen no field reports
> (that I can recall offhand, anyway) that trace to not locking these
> objects.

I think that would actually be a quite principled separation.  If your
DML fails with a strange error, that sucks, but you can retry it and
the problem will go away -- it will fail in some more sane way, or it
will work.  On the other hand, if you manage to create an object in a
no-longer-existing schema, you now have a database that can't be
backed up in pg_dump, and the only way to fix it is to run manual
DELETE commands against the PostgreSQL catalogs.  It's not even easily
to figure out what you need to DELETE, because there can be references
to pg_namespace.oid from zillions of other catalogs -- pg_catcheck
will tell you, but that's a third-party tool which many users won't
have and won't know that they should use.

I do agree with you that locking every schema referenced by any object
in a query would suck big time.  At a minimum, we'd need to extend
fast-path locking to cover AccessShareLock on schemas.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company