Avoid orphaned objects dependencies, take 3 - Mailing list pgsql-hackers

From Bertrand Drouvot
Subject Avoid orphaned objects dependencies, take 3
Date
Msg-id ZiYjn0eVc7pxVY45@ip-10-97-1-34.eu-west-3.compute.internal
Whole thread Raw
Responses Re: Avoid orphaned objects dependencies, take 3
List pgsql-hackers
Hi,

This new thread is a follow-up of [1] and [2].

Problem description:

We have occasionally observed objects having an orphaned dependency, the 
most common case we have seen is functions not linked to any namespaces.

Examples to produce such orphaned dependencies:

Scenario 1:

session 1: begin; drop schema schem;
session 2: create a function in the schema schem
session 1: commit;

With the above, the function created in session 2 would be linked to a non
existing schema.

Scenario 2:

session 1: begin; create a function in the schema schem
session 2: drop schema schem;
session 1: commit;

With the above, the function created in session 1 would be linked to a non
existing schema.

A patch has been initially proposed to fix this particular 
(function-to-namespace) dependency (see [1]), but there could be much 
more scenarios (like the function-to-datatype one highlighted by Gilles 
in [1] that could lead to a function having an invalid parameter datatype).

As Tom said there are dozens more cases that would need to be 
considered, and a global approach to avoid those race conditions should 
be considered instead.

A first global approach attempt has been proposed in [2] making use of a dirty
snapshot when recording the dependency. But this approach appeared to be "scary"
and it was still failing to close some race conditions (see [2] for details).

Then, Tom proposed another approach in [2] which is that "creation DDL will have
to take a lock on each referenced object that'd conflict with a lock taken by
DROP".

This is what the attached patch is trying to achieve.

It does the following:

1) A new lock (that conflicts with a lock taken by DROP) has been put in place
when the dependencies are being recorded.

Thanks to it, the drop schema in scenario 2 would be locked (resulting in an
error should session 1 committs).

2) After locking the object while recording the dependency, the patch checks
that the object still exists.

Thanks to it, session 2 in scenario 1 would be locked and would report an error
once session 1 committs (that would not be the case should session 1 abort the
transaction).

The patch also adds a few tests for some dependency cases (that would currently
produce orphaned objects):

- schema and function (as the above scenarios)
- function and type
- table and type (which is I think problematic enough, as involving a table into
the game, to fix this stuff as a whole).

[1]:
https://www.postgresql.org/message-id/flat/a4f55089-7cbd-fe5d-a9bb-19adc6418ae9@darold.net#9af5cdaa9e80879beb1def3604c976e8
[2]: https://www.postgresql.org/message-id/flat/8369ff70-0e31-f194-2954-787f4d9e21dd%40amazon.com

Please note that I'm not used to with this area of the code so that the patch
might not take the option proposed by Tom the "right" way.

Adding the patch to the July CF.

Looking forward to your feedback,

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment

pgsql-hackers by date:

Previous
From: jian he
Date:
Subject: slightly misleading Error message in guc.c
Next
From: Bertrand Drouvot
Date:
Subject: Re: Patch to avoid orphaned dependencies