Thread: BUG #15934: pg_dump output in wrong order if custom operator class is used as subtype_opclass in a range type

The following bug has been logged on the website:

Bug reference:      15934
Logged by:          Tom Gottfried
Email address:      tom@intevation.de
PostgreSQL version: 11.4
Operating system:   Linux (Ubuntu 18.04)
Description:

Dear PostgreSQL developers,

consider the following to reproduce (to keep the example short, the operator
class is reduced to the minimum needed to reproduce the problem):

createdb test
psql -d test -c '
CREATE OPERATOR ~~ (
    FUNCTION = int4eq,
    LEFTARG = int,
    RIGHTARG = int
);
CREATE OPERATOR CLASS testclass FOR TYPE int USING btree AS
    OPERATOR 3 ~~;
CREATE TYPE testrange AS RANGE (
    subtype = int,
    subtype_opclass = testclass
);'
pg_dump -OcC test | psql -v ON_ERROR_STOP=

The output is as follows:
[...]
CREATE OPERATOR FAMILY
ERROR:  operator does not exist: integer public.~~ integer

Dropping and recreating the database as above and looking at the output of
`pg_dump   -OcC test' shows that the DDL for creating the operator used in
the operator class comes at the end (after the DDL for the operator
class):
--                                
-- PostgreSQL database dump
--                                
[...]
--
   
-- Name: testclass; Type: OPERATOR FAMILY; Schema: public; Owner: -
   
--

CREATE OPERATOR FAMILY public.testclass USING btree;


--
-- Name: testclass; Type: OPERATOR CLASS; Schema: public; Owner: -
--

CREATE OPERATOR CLASS public.testclass
    FOR TYPE integer USING btree FAMILY public.testclass AS
    OPERATOR 3 public.~~(integer,integer);


--
-- Name: testrange; Type: TYPE; Schema: public; Owner: -
--

CREATE TYPE public.testrange AS RANGE (
    subtype = integer,
    subtype_opclass = public.testclass
);


--
-- Name: ~~; Type: OPERATOR; Schema: public; Owner: -
--

CREATE OPERATOR public.~~ (
    FUNCTION = int4eq,
    LEFTARG = integer,
    RIGHTARG = integer
);


--
-- PostgreSQL database dump complete                    
--


A similar error can be produced using a custom-format archive file:
createdb test_new
pg_dump -Fc test | pg_restore -ed test_new
pg_restore: [archiver (db)] Error while PROCESSING TOC:
pg_restore: [archiver (db)] Error from TOC entry 1522; 2616 77795 OPERATOR
CLASS testclass postgres
pg_restore: [archiver (db)] could not execute query: ERROR:  operator does
not exist: integer public.~~ integer
    Command was: CREATE OPERATOR CLASS public.testclass
    FOR TYPE integer USING btree FAMILY public.testclass AS
    OPERATOR 3 public.~~(integer,integer);


Dropping and recreating the database as above again but without the
subtype_opclass in the type definition and running `pg_dump -OcC test | psql
-v ON_ERROR_STOP=' gives the following output without any error:
[...]
CREATE TYPE
CREATE OPERATOR
CREATE OPERATOR FAMILY
CREATE OPERATOR CLASS

If using the subtype_opclass, I expected a similar output but with the
'CREATE TYPE' at the end.

System information:
psql -tc 'select version()'
 PostgreSQL 11.4 (Ubuntu 11.4-1.pgdg18.04+1) on x86_64-pc-linux-gnu,
compiled by
 gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0, 64-bit

Thanks and best regards,
Tom


PG Bug reporting form <noreply@postgresql.org> writes:
> consider the following to reproduce (to keep the example short, the operator
> class is reduced to the minimum needed to reproduce the problem):
> ...
> Dropping and recreating the database as above and looking at the output of
> `pg_dump   -OcC test' shows that the DDL for creating the operator used in
> the operator class comes at the end (after the DDL for the operator
> class):

Huh.  I'm surprised nobody has tripped over this before, because it's
been broken since range types were introduced.

The core of the problem is that pg_dump doesn't treat opclass members
(pg_amop rows) as independent objects, which is problematic because
pg_depend does think they're independent objects.  So the dependency
entries involving this opclass are

 operator 3 (integer, integer) of operator family testclass for access method btree: ~~(integer,integer) | operator
~~(integer,integer)                     | n 
 operator 3 (integer, integer) of operator family testclass for access method btree: ~~(integer,integer) | operator
classtestclass for access method btree  | i 
 operator class testclass for access method btree                                                        | operator
familytestclass for access method btree | a 
 type testrange                                                                                          | operator
classtestclass for access method btree  | n 

pg_dump ignores the first two of these, meaning it has no knowledge
that operator ~~ needs to be dumped before opclass testclass.
Now it does have a general rule that all else being equal, operators
should be dumped before opclasses, and that saves it most of the time.
But in this example, all else is not equal, because the last-mentioned
dependency forces the opclass to be moved up, to before type testrange.
And the same general sort rules say that types come before operators,
so without a dependency to force the operator to be moved up too,
we get the wrong result.

Clearly, we gotta upgrade pg_dump's intelligence about handling this
dependency structure.  The most straightforward fix would involve
promoting pg_amop (and pg_amproc) entries to be full DumpableObjects,
but I'm apprehensive about the overhead that would add ...

I'll try to work on a fix later this week, with hopes of getting
it into next month's releases.

Thanks for the report!

            regards, tom lane



I wrote:
> Clearly, we gotta upgrade pg_dump's intelligence about handling this
> dependency structure.  The most straightforward fix would involve
> promoting pg_amop (and pg_amproc) entries to be full DumpableObjects,
> but I'm apprehensive about the overhead that would add ...

After further thought, I realized we could fix it far less invasively
than that, by teaching getDependencies() to translate pg_depend entries
for the pg_amop/amproc rows to look like dependencies for their parent
opfamily.  This looks messy, but in a typical database without any
custom opclasses, it's actually really cheap, because there won't be
any pg_depend entries satisfying the classid conditions.

I wasn't looking forward to back-patching the other idea, but this
patch seems like it should be pretty painless.

Again, thanks for the report!

            regards, tom lane

diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 8a31672..0cc9ede 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -17911,14 +17911,45 @@ getDependencies(Archive *fout)
     query = createPQExpBuffer();

     /*
+     * Messy query to collect the dependency data we need.  Note that we
+     * ignore the sub-object column, so that dependencies of or on a column
+     * look the same as dependencies of or on a whole table.
+     *
      * PIN dependencies aren't interesting, and EXTENSION dependencies were
      * already processed by getExtensionMembership.
      */
     appendPQExpBufferStr(query, "SELECT "
                          "classid, objid, refclassid, refobjid, deptype "
                          "FROM pg_depend "
-                         "WHERE deptype != 'p' AND deptype != 'e' "
-                         "ORDER BY 1,2");
+                         "WHERE deptype != 'p' AND deptype != 'e'\n");
+
+    /*
+     * Since we don't treat pg_amop entries as separate DumpableObjects, we
+     * have to translate their dependencies into dependencies of their parent
+     * opfamily.  Ignore internal dependencies though, as those will point to
+     * their parent opclass, which we needn't consider here (and if we did,
+     * it'd just result in circular dependencies).  Also, "loose" opfamily
+     * entries will have dependencies on their parent opfamily, which we
+     * should drop since they'd likewise become useless self-dependencies.
+     * (But be sure to keep deps on *other* opfamilies; see amopsortfamily.)
+     */
+    appendPQExpBufferStr(query, "UNION ALL\n"
+                         "SELECT 'pg_opfamily'::regclass AS classid, amopfamily AS objid, refclassid, refobjid,
deptype" 
+                         "FROM pg_depend d, pg_amop o "
+                         "WHERE deptype NOT IN ('p', 'e', 'i') AND "
+                         "classid = 'pg_amop'::regclass AND objid = o.oid "
+                         "AND NOT (refclassid = 'pg_opfamily'::regclass AND amopfamily = refobjid)\n");
+
+    /* Likewise for pg_amproc entries */
+    appendPQExpBufferStr(query, "UNION ALL\n"
+                         "SELECT 'pg_opfamily'::regclass AS classid, amprocfamily AS objid, refclassid, refobjid,
deptype" 
+                         "FROM pg_depend d, pg_amproc p "
+                         "WHERE deptype NOT IN ('p', 'e', 'i') AND "
+                         "classid = 'pg_amproc'::regclass AND objid = p.oid "
+                         "AND NOT (refclassid = 'pg_opfamily'::regclass AND amprocfamily = refobjid)\n");
+
+    /* Sort the output for efficiency below */
+    appendPQExpBufferStr(query, "ORDER BY 1,2");

     res = ExecuteSqlQuery(fout, query->data, PGRES_TUPLES_OK);