Rethinking dependency traversal during DROP - Mailing list pgsql-hackers

From Tom Lane
Subject Rethinking dependency traversal during DROP
Date
Msg-id 4264.1210439590@sss.pgh.pa.us
Whole thread Raw
List pgsql-hackers
I promise I won't expend any real effort on this until after the
commitfest is over, but ...

While working on the recently committed constraints patch, I got annoyed
by the way in which DROP CASCADE frequently emitted "noise" messages.
For instance:

regression=# create table p1 (f1 int);
CREATE TABLE
regression=# create table c1 (f2 int check (f2>0)) inherits (p1);
CREATE TABLE
regression=# drop table p1 cascade;
NOTICE:  drop cascades to table c1
NOTICE:  drop cascades to constraint c1_f2_check on table c1
DROP TABLE

The check constraint is auto-dependent on c1, so really the second
message shouldn't be there.  The reason it is there is that the
constraint also has a normal dependency on c1.f2, and if this pg_depend
link is traversed first (which seems to almost always be the case)
then you get the message.  The findAutoDeletableObjects() scan that is
supposed to prevent this behavior does not, because it only goes as far
as the objects that are directly auto- or internal-dependent on the
original target object (ie, p1).

I thought about fixing this by doing a new findAutoDeletableObjects()
scan whenever we recurse, adding-on new objects to not complain about.
This sort of works; I attach a test patch that does that, and the
regression test output changes it induces, which seem to be all to
the good.  But it's a pretty ugly idea for a number of reasons:

1. With this patch, every single call to recursiveDeletion is preceded
by findAutoDeletableObjects, which at the very least cries out for
refactoring.

2. The above observation puts the final nail in the coffin of the
original design idea, which was to do one recursive traversal of
pg_depend links during DROP.  It's indisputable that we now must
traverse the entire link tree twice (at least!), and the cost of
doing that seems annoying, especially when we are sitting there
building various lists of the objects in memory anyway.

3. We have got various open bug reports centered around the fact
that a cascaded DROP takes locks too late:
http://archives.postgresql.org/pgsql-hackers/2007-01/msg00937.php
http://archives.postgresql.org/pgsql-bugs/2007-03/msg00143.php
http://archives.postgresql.org/pgsql-bugs/2007-12/msg00188.php
If we are going to be forced into a two-pass approach, we really
ought to try to solve those issues at the same time.

So the idea I am toying with goes like this:

1. First, recursively scan the pg_depend tree to find every object that
would have to be deleted in order to drop the initial target object(s).
Take out locks on objects that are of lockable types before scanning for
their referencing objects, but don't do anything else yet.  Build an
ObjectAddresses list in memory of the objects-to-drop, and annotate each
entry with flags showing how we got to it (ie, via normal, auto, or
internal dependency links).

2. Scan the ObjectAddresses list and issue notices about any objects
that are reachable only via normal dependencies.  If there are any,
and it's not a CASCADE drop, error out.  (This means we will fail in
such a case after expending quite a lot less work than we do now.)

3. Scan the ObjectAddresses list back-to-front and perform the
deletions.  The back-to-front scan ensures dependent objects are deleted
before depended-on ones, which is critical in a number of cases.

I am not entirely certain that a back-to-front traversal is sufficient:
there may need to be some kind of sorting step, or maybe we should just
continue to drive the deletion pass off scans of pg_depend, seeing that
we'll have to do those anyway to delete the pg_depend entries.  So this
is just a back-of-the-napkin sketch of how it might work.

Comments?

            regards, tom lane

Index: src/backend/catalog/dependency.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/catalog/dependency.c,v
retrieving revision 1.71
diff -c -r1.71 dependency.c
*** src/backend/catalog/dependency.c    27 Mar 2008 03:57:33 -0000    1.71
--- src/backend/catalog/dependency.c    10 May 2008 16:36:34 -0000
***************
*** 701,706 ****
--- 701,708 ----
                      (errmsg("drop cascades to %s",
                              getObjectDescription(&owningObject))));

+         findAutoDeletableObjects(&owningObject, oktodelete, depRel, true);
+
          if (!recursiveDeletion(&owningObject, behavior, msglevel,
                                 object, oktodelete, depRel, alreadyDeleted))
              ok = false;
***************
*** 866,871 ****
--- 868,875 ----
                              (errmsg("drop cascades to %s",
                                      getObjectDescription(&otherObject))));

+                 findAutoDeletableObjects(&otherObject, oktodelete, depRel, true);
+
                  if (!recursiveDeletion(&otherObject, behavior, msglevel,
                                         object, oktodelete, depRel, NULL))
                      ok = false;
***************
*** 882,887 ****
--- 886,893 ----
                          (errmsg("drop auto-cascades to %s",
                                  getObjectDescription(&otherObject))));

+                 findAutoDeletableObjects(&otherObject, oktodelete, depRel, true);
+
                  if (!recursiveDeletion(&otherObject, behavior, msglevel,
                                         object, oktodelete, depRel, NULL))
                      ok = false;
*** ./expected/inherit.out    Fri May  9 16:44:55 2008
--- ./results/inherit.out    Sat May 10 12:28:20 2008
***************
*** 846,855 ****
  drop table p1 cascade;
  NOTICE:  drop cascades to table c2
  NOTICE:  drop cascades to table c3
- NOTICE:  drop cascades to constraint p2_f2_check on table c3
- NOTICE:  drop cascades to constraint p2_f2_check on table c2
  NOTICE:  drop cascades to table c1
- NOTICE:  drop cascades to constraint p2_f2_check on table c1
  drop table p2 cascade;
  create table pp1 (f1 int);
  create table cc1 (f2 text, f3 int) inherits (pp1);
--- 846,852 ----
***************
*** 904,911 ****

  drop table pp1 cascade;
  NOTICE:  drop cascades to table cc2
- NOTICE:  drop cascades to constraint pp1_a1_check on table cc2
- NOTICE:  drop cascades to constraint pp1_a2_check on table cc2
  NOTICE:  drop cascades to table cc1
- NOTICE:  drop cascades to constraint pp1_a1_check on table cc1
- NOTICE:  drop cascades to constraint pp1_a2_check on table cc1
--- 901,904 ----

======================================================================

*** ./expected/create_view.out    Thu Feb  1 14:10:30 2007
--- ./results/create_view.out    Sat May 10 12:28:22 2008
***************
*** 238,261 ****

  DROP SCHEMA temp_view_test CASCADE;
  NOTICE:  drop cascades to view temp_view_test.v9
- NOTICE:  drop cascades to rule _RETURN on view temp_view_test.v9
  NOTICE:  drop cascades to sequence temp_view_test.seq1
  NOTICE:  drop cascades to view temp_view_test.v8
- NOTICE:  drop cascades to rule _RETURN on view temp_view_test.v8
  NOTICE:  drop cascades to view temp_view_test.v7
- NOTICE:  drop cascades to rule _RETURN on view temp_view_test.v7
  NOTICE:  drop cascades to view temp_view_test.v6
- NOTICE:  drop cascades to rule _RETURN on view temp_view_test.v6
  NOTICE:  drop cascades to view temp_view_test.v5
- NOTICE:  drop cascades to rule _RETURN on view temp_view_test.v5
  NOTICE:  drop cascades to view temp_view_test.v4
- NOTICE:  drop cascades to rule _RETURN on view temp_view_test.v4
  NOTICE:  drop cascades to view temp_view_test.v3
- NOTICE:  drop cascades to rule _RETURN on view temp_view_test.v3
  NOTICE:  drop cascades to view temp_view_test.v2
- NOTICE:  drop cascades to rule _RETURN on view temp_view_test.v2
  NOTICE:  drop cascades to view temp_view_test.v1
- NOTICE:  drop cascades to rule _RETURN on view temp_view_test.v1
  NOTICE:  drop cascades to table temp_view_test.base_table2
  NOTICE:  drop cascades to rule _RETURN on view v5_temp
  NOTICE:  drop cascades to view v5_temp
--- 238,252 ----
***************
*** 280,286 ****
  NOTICE:  drop cascades to view v10_temp
  DROP SCHEMA testviewschm2 CASCADE;
  NOTICE:  drop cascades to view pubview
- NOTICE:  drop cascades to rule _RETURN on view pubview
  NOTICE:  drop cascades to table tbl4
  NOTICE:  drop cascades to rule _RETURN on view mytempview
  NOTICE:  drop cascades to view mytempview
--- 271,276 ----
***************
*** 288,300 ****
  NOTICE:  drop cascades to table tbl2
  NOTICE:  drop cascades to table tbl1
  NOTICE:  drop cascades to view nontemp4
- NOTICE:  drop cascades to rule _RETURN on view nontemp4
  NOTICE:  drop cascades to view nontemp3
- NOTICE:  drop cascades to rule _RETURN on view nontemp3
  NOTICE:  drop cascades to view nontemp2
- NOTICE:  drop cascades to rule _RETURN on view nontemp2
  NOTICE:  drop cascades to view nontemp1
- NOTICE:  drop cascades to rule _RETURN on view nontemp1
  NOTICE:  drop cascades to table t2
  NOTICE:  drop cascades to table t1
  NOTICE:  drop cascades to rule _RETURN on view temporal4
--- 278,286 ----

======================================================================

*** ./expected/namespace.out    Sun Aug 20 20:57:26 2006
--- ./results/namespace.out    Sat May 10 12:28:43 2008
***************
*** 40,48 ****

  DROP SCHEMA test_schema_1 CASCADE;
  NOTICE:  drop cascades to view test_schema_1.abc_view
- NOTICE:  drop cascades to rule _RETURN on view test_schema_1.abc_view
  NOTICE:  drop cascades to table test_schema_1.abc
- NOTICE:  drop cascades to default for table test_schema_1.abc column a
  -- verify that the objects were dropped
  SELECT COUNT(*) FROM pg_class WHERE relnamespace =
      (SELECT oid FROM pg_namespace WHERE nspname = 'test_schema_1');
--- 40,46 ----

======================================================================

*** ./expected/alter_table.out    Fri May  9 13:19:56 2008
--- ./results/alter_table.out    Sat May 10 12:29:55 2008
***************
*** 376,382 ****

  drop table atacc2 cascade;
  NOTICE:  drop cascades to table atacc3
- NOTICE:  drop cascades to constraint foo on table atacc3
  drop table atacc1;
  -- adding only to a parent is disallowed as of 8.4
  create table atacc1 (test int);
--- 376,381 ----
***************
*** 1251,1257 ****

  drop table p1 cascade;
  NOTICE:  drop cascades to table c1
- NOTICE:  drop cascades to constraint p1_a1_check on table c1
  -- test that operations with a dropped column do not try to reference
  -- its datatype
  create domain mytype as text;
--- 1250,1255 ----
***************
*** 1474,1481 ****
  NOTICE:  drop cascades to type alter2.posint
  NOTICE:  drop cascades to function alter2.plus1(integer)
  NOTICE:  drop cascades to view alter2.v1
- NOTICE:  drop cascades to rule _RETURN on view alter2.v1
  NOTICE:  drop cascades to sequence alter2.t1_f1_seq
  NOTICE:  drop cascades to default for table alter2.t1 column f1
  NOTICE:  drop cascades to table alter2.t1
- NOTICE:  drop cascades to constraint t1_f2_check on table alter2.t1
--- 1472,1477 ----

======================================================================


pgsql-hackers by date:

Previous
From: "Joshua D. Drake"
Date:
Subject: Re: Setting a pre-existing index as a primary key
Next
From: David Fetter
Date:
Subject: Re: Posting to hackers and patches lists