Thread: Possible performance regression with pg_dump of a large number of relations

Possible performance regression with pg_dump of a large number of relations

From
Luke Cowell
Date:
I've been troubleshooting an issue with slow pg_dump times on postgres 9.6.6. I believe something changed between 9.5.10 and 9.6.6 that has made dumps significantly slower for databases with a large number of relations. I posted this in irc and someone suggested that I should post this here. I'm sorry if this isn't the right place.

To simulate the issue I generated 150,000 relations spread across 1000 schemas (this roughly reflects my production
setup).

```ruby
File.write "many_relations.sql", (150000 / 150).times.flat_map {|n|
  [
   "create schema s_#{n};",
   150.times.map do |t|
     "create table s_#{n}.test_#{t} (id int);"
   end
   ]
}.join("\n")
```
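
If you'd rather not run Ruby, something like this DO block should build roughly the same fixture directly in psql (an untested sketch, not the exact script I used):

```sql
-- Sketch: generate 1000 schemas x 150 tables directly in SQL
-- (roughly equivalent to the Ruby script above).
DO $$
BEGIN
  FOR n IN 0..999 LOOP
    EXECUTE format('CREATE SCHEMA s_%s', n);
    FOR t IN 0..149 LOOP
      EXECUTE format('CREATE TABLE s_%s.test_%s (id int)', n, t);
    END LOOP;
  END LOOP;
END
$$;
```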

I have 2 identical pieces of hardware. I've installed 9.5 on one and 9.6 on the other. I've run the same generated piece of sql in a fresh database on both systems.

On my 9.5.10 system:
> time pg_dump -n s_10 testing > /dev/null
real    0m5.492s
user    0m1.424s
sys    0m0.184s

On my 9.6.6 system:
> time pg_dump -n s_10 testing > /dev/null
real    0m27.342s
user    0m1.748s
sys    0m0.248s

If I call that same pg_dump command with the verbose option, the delay is at the `pg_dump: reading user-defined tables` step.
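
If it helps with diagnosis, I believe the slow catalog query can be captured by temporarily logging slow statements on the server and re-running the dump (a sketch; the 1s threshold is arbitrary):

```sql
-- Sketch: log any statement taking longer than 1 second, then re-run pg_dump
-- and check the server log for the offending catalog query.
ALTER SYSTEM SET log_min_duration_statement = '1s';
SELECT pg_reload_conf();
-- ... run pg_dump, inspect the log, then revert:
ALTER SYSTEM RESET log_min_duration_statement;
SELECT pg_reload_conf();
```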

I don't have identical hardware, so I can't say for sure, but I believe this issue is still present in 10.1.

Is this a legitimate issue? Is there more information I can provide to help better assess the situation?

Thanks in advance everyone!

Luke



Re: Possible performance regression with pg_dump of a large number of relations

From
Jeff Janes
Date:
On Thu, Jan 11, 2018 at 5:26 PM, Luke Cowell <lcowell@gmail.com> wrote:
I've been troubleshooting an issue with slow pg_dump times on postgres 9.6.6. I believe something changed between 9.5.10 and 9.6.6 that has made dumps significantly slower for databases with a large number of relations. I posted this in irc and someone suggested that I should post this here. I'm sorry if this isn't the right place.

To simulate the issue I generated 150,000 relations spread across 1000 schemas (this roughly reflects my production setup).

```ruby
File.write "many_relations.sql", (150000 / 150).times.flat_map {|n|
  [
   "create schema s_#{n};",
   150.times.map do |t|
     "create table s_#{n}.test_#{t} (id int);"
   end
   ]
}.join("\n")
```

I have 2 identical pieces of hardware. I've installed 9.5 on one and 9.6 on the other. I've run the same generated piece of sql in a fresh database on both systems.

On my 9.5.10 system:
> time pg_dump -n s_10 testing > /dev/null
real    0m5.492s
user    0m1.424s
sys     0m0.184s

On my 9.6.6 system:
> time pg_dump -n s_10 testing > /dev/null
real    0m27.342s
user    0m1.748s
sys     0m0.248s

I don't get quite as large a regression as you do, from 6s to 19s.  It looks like there are multiple of them, but the biggest is caused by:

commit 5d589993cad212f7d556d52cc1e42fe18f65b057
Author: Stephen Frost <sfrost@snowman.net>
Date:   Fri May 6 14:06:50 2016 -0400

    pg_dump performance and other fixes

That commit covered a few different things, and I don't know which of the improvements it mentions is the one that motivated this, but the key change was to add this query:

EXISTS (SELECT 1 FROM pg_attribute at LEFT JOIN pg_init_privs pip ON(c.oid = pip.objoid AND pip.classoid = (SELECT oid FROM pg_class WHERE relname = 'pg_class') AND pip.objsubid = at.attnum)WHERE at.attrelid = c.oid AND at.attnum>0 and ((SELECT count(acl) FROM (SELECT unnest(coalesce(at.attacl,acldefault('c',c.relowner))) AS acl EXCEPT SELECT unnest(coalesce(pip.initprivs,acldefault('c',c.relowner)))) as foo) >1 OR (SELECT count(acl) FROM (SELECT unnest(coalesce(pip.initprivs,acldefault('c',c.relowner))) AS acl EXCEPT SELECT unnest(coalesce(at.attacl,acldefault('c',c.relowner)))) as foo) >0))AS changed_acl

Considering it runs 2 subqueries for every column (including the 6 hidden system columns) of every table, even ones that don't end up getting dumped out, it is no wonder it is slow.
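
For a rough sense of the scale involved, something along these lines (a back-of-the-envelope count, not a query taken from pg_dump) shows how many rows those subqueries get evaluated against:

```sql
-- Back-of-the-envelope: how many (relation, column) pairs the per-column
-- EXISTS clause has to consider, hidden system columns included.
-- On the 150,000-table test rig this is north of a million rows.
SELECT count(*)
FROM pg_attribute at
JOIN pg_class c ON c.oid = at.attrelid
WHERE c.relkind IN ('r', 'S', 'v', 'c', 'm', 'f');
```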

If you were just dumping the database with 150,000 objects, I wouldn't worry about a 20 second regression.  But I assume you intend to loop over every schema and dump each individually?
 
Cheers,

Jeff

Re: Possible performance regression with pg_dump of a large number of relations

From
Jeff Janes
Date:
On Fri, Jan 12, 2018 at 8:01 AM, Jeff Janes <jeff.janes@gmail.com> wrote:
That commit covered a few different things, and I don't know which of the improvements it mentions is the one that motivated this, but the key change was to add this query:

EXISTS (SELECT 1 FROM pg_attribute at LEFT JOIN pg_init_privs pip ON(c.oid = pip.objoid AND pip.classoid = (SELECT oid FROM pg_class WHERE relname = 'pg_class') AND pip.objsubid = at.attnum)WHERE at.attrelid = c.oid AND at.attnum>0 and ((SELECT count(acl) FROM (SELECT unnest(coalesce(at.attacl,acldefault('c',c.relowner))) AS acl EXCEPT SELECT unnest(coalesce(pip.initprivs,acldefault('c',c.relowner)))) as foo) >1 OR (SELECT count(acl) FROM (SELECT unnest(coalesce(pip.initprivs,acldefault('c',c.relowner))) AS acl EXCEPT SELECT unnest(coalesce(at.attacl,acldefault('c',c.relowner)))) as foo) >0))AS changed_acl

Sorry, that query reflects some munging I did to it.  The real part added to the query is:

EXISTS (SELECT 1 FROM pg_attribute at LEFT JOIN pg_init_privs pip ON(c.oid = pip.objoid AND pip.classoid = (SELECT oid FROM pg_class WHERE relname = 'pg_class') AND pip.objsubid = at.attnum)WHERE at.attrelid = c.oid AND ((SELECT array_agg(acl) FROM (SELECT unnest(coalesce(at.attacl,acldefault('c',c.relowner))) AS acl EXCEPT SELECT unnest(coalesce(pip.initprivs,acldefault('c',c.relowner)))) as foo) IS NOT NULL OR (SELECT array_agg(acl) FROM (SELECT unnest(coalesce(pip.initprivs,acldefault('c',c.relowner))) AS acl EXCEPT SELECT unnest(coalesce(at.attacl,acldefault('c',c.relowner)))) as foo) IS NOT NULL OR NULL IS NOT NULL OR NULL IS NOT NULL))AS changed_ac
 
Cheers,

Jeff

Re: Possible performance regression with pg_dump of a large number of relations

From
Luke Cowell
Date:
On Jan 12, 2018, at 8:01 AM, Jeff Janes <jeff.janes@gmail.com> wrote:

On Thu, Jan 11, 2018 at 5:26 PM, Luke Cowell <lcowell@gmail.com> wrote:
I've been troubleshooting an issue with slow pg_dump times on postgres 9.6.6. I believe something changed between 9.5.10 and 9.6.6 that has made dumps significantly slower for databases with a large number of relations. I posted this in irc and someone suggested that I should post this here. I'm sorry if this isn't the right place.

To simulate the issue I generated 150,000 relations spread across 1000 schemas (this roughly reflects my production setup).

```ruby
File.write "many_relations.sql", (150000 / 150).times.flat_map {|n|
  [
   "create schema s_#{n};",
   150.times.map do |t|
     "create table s_#{n}.test_#{t} (id int);"
   end
   ]
}.join("\n")
```

I have 2 identical pieces of hardware. I've installed 9.5 on one and 9.6 on the other. I've run the same generated piece of sql in a fresh database on both systems.

On my 9.5.10 system:
> time pg_dump -n s_10 testing > /dev/null
real    0m5.492s
user    0m1.424s
sys     0m0.184s

On my 9.6.6 system:
> time pg_dump -n s_10 testing > /dev/null
real    0m27.342s
user    0m1.748s
sys     0m0.248s

I don't get quite as large a regression as you do, from 6s to 19s.  It looks like there are multiple of them, but the biggest is caused by:

commit 5d589993cad212f7d556d52cc1e42fe18f65b057
Author: Stephen Frost <sfrost@snowman.net>
Date:   Fri May 6 14:06:50 2016 -0400

    pg_dump performance and other fixes

That commit covered a few different things, and I don't know which of the improvements it mentions is the one that motivated this, but the key change was to add this query:

EXISTS (SELECT 1 FROM pg_attribute at LEFT JOIN pg_init_privs pip ON(c.oid = pip.objoid AND pip.classoid = (SELECT oid FROM pg_class WHERE relname = 'pg_class') AND pip.objsubid = at.attnum)WHERE at.attrelid = c.oid AND at.attnum>0 and ((SELECT count(acl) FROM (SELECT unnest(coalesce(at.attacl,acldefault('c',c.relowner))) AS acl EXCEPT SELECT unnest(coalesce(pip.initprivs,acldefault('c',c.relowner)))) as foo) >1 OR (SELECT count(acl) FROM (SELECT unnest(coalesce(pip.initprivs,acldefault('c',c.relowner))) AS acl EXCEPT SELECT unnest(coalesce(at.attacl,acldefault('c',c.relowner)))) as foo) >0))AS changed_acl

Considering it runs 2 subqueries for every column (including the 6 hidden system columns) of every table, even ones that don't end up getting dumped out, it is no wonder it is slow.

If you were just dumping the database with 150,000 objects, I wouldn't worry about a 20 second regression.  But I assume you intend to loop over every schema and dump each individually?
 
Cheers,

Jeff

Hi Jeff, thanks for your attention on this. Yes, that is exactly our use case. We dump each schema individually so we would be paying that 20 second penalty each time. As a workaround I've been dumping the schemas in batches of 20, but this isn't really ideal as we'll lose access to a number of our existing workflows.

Luke

Re: Possible performance regression with pg_dump of a large number of relations

From
Stephen Frost
Date:
Greetings Jeff & Luke,

* Jeff Janes (jeff.janes@gmail.com) wrote:
> Sorry, that query reflects some munging I did to it.  The real part added
> to the query is:
>
> EXISTS (SELECT 1 FROM pg_attribute at LEFT JOIN pg_init_privs pip ON(c.oid
> = pip.objoid AND pip.classoid = (SELECT oid FROM pg_class WHERE relname =
> 'pg_class') AND pip.objsubid = at.attnum)WHERE at.attrelid = c.oid AND
> ((SELECT array_agg(acl) FROM (SELECT
> unnest(coalesce(at.attacl,acldefault('c',c.relowner))) AS acl EXCEPT SELECT
> unnest(coalesce(pip.initprivs,acldefault('c',c.relowner)))) as foo) IS NOT
> NULL OR (SELECT array_agg(acl) FROM (SELECT
> unnest(coalesce(pip.initprivs,acldefault('c',c.relowner))) AS acl EXCEPT
> SELECT unnest(coalesce(at.attacl,acldefault('c',c.relowner)))) as foo) IS
> NOT NULL OR NULL IS NOT NULL OR NULL IS NOT NULL))AS changed_ac

Yes, this is to check if any of the rights on the table or any of its
columns have been changed from what its initial rights are, as recorded
in pg_init_privs.

I've been playing around with this a bit tonight trying to think of a
way to avoid doing this work and it occurs to me that we really only
need to worry about initprivs on objects in schemas that are either
there at init time, or from extensions.  Not all of the objects in the
system can have init-privs because the only way to get init-privs is
at initdb time or from an extension creating a new object.
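
A quick way to see that is to look at what actually carries entries in pg_init_privs (just a sanity-check query):

```sql
-- Sanity check: which catalogs the existing init-privs entries point at.
-- On a stock install these all come from initdb or from extensions.
SELECT classoid::regclass AS catalog, count(*) AS entries
FROM pg_init_privs
GROUP BY classoid
ORDER BY entries DESC;
```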

As such, I've reworked the query (but not yet put it into pg_dump to
run it through the regression tests) to look like this (for anyone else
who wants to take a look at it and play with it):

EXPLAIN ANALYZE
SELECT c.tableoid, c.oid, c.relname,
(SELECT pg_catalog.array_agg(acl ORDER BY row_n)
    FROM (SELECT acl, row_n
        FROM pg_catalog.unnest(
            coalesce(c.relacl,pg_catalog.acldefault(
                    CASE WHEN c.relkind = 'S'
                        THEN 's'
                    ELSE 'r' END::"char",c.relowner)))
        WITH ORDINALITY AS perm(acl,row_n)
        WHERE NOT EXISTS (
            SELECT 1 FROM
            pg_catalog.unnest(
                coalesce(pip.initprivs,pg_catalog.acldefault(
                        CASE WHEN c.relkind = 'S'
                            THEN 's'
                        ELSE 'r' END::"char",c.relowner)))
                            AS init(init_acl)
            WHERE acl = init_acl)) as foo
        WHERE nsp.nspname IN ('information_schema','pg_catalog')
        OR ext.oid IS NOT NULL)
    AS relacl,
(SELECT pg_catalog.array_agg(acl ORDER BY row_n)
    FROM (SELECT acl, row_n
        FROM pg_catalog.unnest(
            coalesce(pip.initprivs,pg_catalog.acldefault(
                    CASE WHEN c.relkind = 'S'
                        THEN 's'
                    ELSE 'r' END::"char",c.relowner)))
        WITH ORDINALITY AS initp(acl,row_n)
        WHERE NOT EXISTS (
            SELECT 1 FROM pg_catalog.unnest(
                coalesce(c.relacl,pg_catalog.acldefault(
                        CASE WHEN c.relkind = 'S'
                            THEN 's'
                        ELSE 'r' END::"char",c.relowner)))
                            AS permp(orig_acl)
            WHERE acl = orig_acl)) as foo
        WHERE nsp.nspname IN ('information_schema','pg_catalog')
        OR ext.oid IS NOT NULL)
    as rrelacl,
NULL AS initrelacl,
NULL as initrrelacl,
c.relkind,
c.relnamespace,
(SELECT rolname
    FROM pg_catalog.pg_roles
    WHERE oid = c.relowner) AS rolname,
c.relchecks,
c.relhastriggers,
c.relhasindex,
c.relhasrules,
c.relhasoids,
c.relrowsecurity,
c.relforcerowsecurity,
c.relfrozenxid,
c.relminmxid,
tc.oid AS toid,
tc.relfrozenxid AS tfrozenxid,
tc.relminmxid AS tminmxid,
c.relpersistence,
c.relispopulated,
c.relreplident,
c.relpages,
CASE WHEN c.reloftype <> 0
    THEN c.reloftype::pg_catalog.regtype
ELSE NULL END AS reloftype,
d.refobjid AS owning_tab,
d.refobjsubid AS owning_col,
(SELECT spcname
    FROM pg_tablespace t
    WHERE t.oid = c.reltablespace) AS reltablespace,
array_remove(
    array_remove(
        c.reloptions,'check_option=local'),
    'check_option=cascaded') AS reloptions,
CASE WHEN 'check_option=local' = ANY (c.reloptions)
    THEN 'LOCAL'::text
    WHEN 'check_option=cascaded' = ANY (c.reloptions)
        THEN 'CASCADED'::text ELSE NULL END AS checkoption,
tc.reloptions AS toast_reloptions,
c.relkind = 'S' AND
EXISTS (SELECT 1 FROM pg_depend WHERE classid = 'pg_class'::regclass
    AND objid = c.oid AND objsubid = 0
    AND refclassid = 'pg_class'::regclass
    AND deptype = 'i') AS is_identity_sequence,
EXISTS (SELECT 1 FROM pg_attribute at
    LEFT JOIN pg_init_privs pip
    ON (c.oid = pip.objoid
        AND pip.classoid = 'pg_class'::regclass
        AND pip.objsubid = at.attnum)
    WHERE
    (nsp.nspname IN ('information_schema','pg_catalog')
     OR ext.oid IS NOT NULL)
    AND
    at.attrelid = c.oid AND (
        (SELECT pg_catalog.array_agg(acl ORDER BY row_n)
            FROM (SELECT acl, row_n
                FROM pg_catalog.unnest(
                    coalesce(at.attacl,
                        pg_catalog.acldefault('c',c.relowner)))
                WITH ORDINALITY AS perm(acl,row_n)
                WHERE
                    NOT EXISTS (
                    SELECT 1 FROM pg_catalog.unnest(
                        coalesce(pip.initprivs,
                            pg_catalog.acldefault('c',c.relowner)))
                    AS init(init_acl)
                    WHERE acl = init_acl)
                ) as foo) IS NOT NULL
        OR
        (SELECT pg_catalog.array_agg(acl ORDER BY row_n)
            FROM (SELECT acl, row_n
                FROM pg_catalog.unnest(
                    coalesce(pip.initprivs,
                        pg_catalog.acldefault('c',c.relowner)))
                WITH ORDINALITY AS initp(acl,row_n)
                WHERE NOT EXISTS (
                    SELECT 1 FROM pg_catalog.unnest(
                        coalesce(at.attacl,pg_catalog.
                            acldefault('c',c.relowner)))
                    AS permp(orig_acl)
                    WHERE acl = orig_acl)) as foo) IS NOT NULL
        OR NULL IS NOT NULL
        OR NULL IS NOT NULL)) AS changed_acl,
pg_get_partkeydef(c.oid) AS partkeydef,
c.relispartition AS ispartition,
pg_get_expr(c.relpartbound, c.oid) AS partbound
FROM pg_class c
JOIN pg_namespace nsp ON (c.relnamespace = nsp.oid)
LEFT JOIN pg_extension ext ON (nsp.oid = ext.extnamespace)
LEFT JOIN pg_depend d
ON (c.relkind = 'S'
    AND d.classid = c.tableoid
    AND d.objid = c.oid
    AND d.objsubid = 0
    AND d.refclassid = c.tableoid
    AND d.deptype IN ('a', 'i'))
LEFT JOIN pg_class tc ON (c.reltoastrelid = tc.oid)
LEFT JOIN pg_init_privs pip
ON (c.oid = pip.objoid
    AND pip.classoid = 'pg_class'::regclass
    AND pip.objsubid = 0)
WHERE c.relkind in ('r', 'S', 'v', 'c', 'm', 'f', 'p')
ORDER BY c.oid;

This ends up dropping the query time from around 8.6s on my system to
about 1s, with a test rig of 1000 schemas and 100,000 tables.
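
As a cross-check on that premise, a rough query like this shows which schemas contain relations that have init-privs entries at all; the expectation is that only initdb-time and extension schemas show up:

```sql
-- Sketch: schemas containing relations that carry pg_init_privs entries.
SELECT DISTINCT n.nspname
FROM pg_init_privs pip
JOIN pg_class c ON c.oid = pip.objoid
                AND pip.classoid = 'pg_class'::regclass
JOIN pg_namespace n ON n.oid = c.relnamespace
ORDER BY 1;
```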

Unfortunately, the way pg_dump is structured today, it really wouldn't
be easy to have it only run this query for the tables in the schema(s)
requested because it expects to be able to look for dependencies in the
data structure which results from this query (as I recall, it's been a
while since I looked, but I remember trying to figure out a way to do
that and it certainly didn't look easy to do).

There's another query that's currently taking about 4s on my system that
I believe could benefit from a similar treatment to reduce the work the
query is having to do.  Hopefully with those two changes we can get
pg_dump runtime on this use-case back closer to what it was before these
changes went in.

If folks get a chance to take a look at the query and/or test, that'd be
great.  I'll try to work up an actual patch to pg_dump this weekend to
run it through the regression tests and see if anything breaks.

Thanks!

Stephen


Re: Possible performance regression with pg_dump of a large number of relations

From
Luke Cowell
Date:
Hi Stephen, thank you for putting this together.

> If folks get a chance to take a look at the query and/or test, that'd be
> great.  I'll try to work up an actual patch to pg_dump this weekend to
> run it through the regression tests and see if anything breaks.

I'm not sure how I can help other than testing that this runs. I can confirm that it runs on 10.1. It does not run on 9.5 or 9.6 and gives this error:
> ERROR:  relation "pg_init_privs" does not exist
> LINE 139: LEFT JOIN pg_init_privs pip

I'm guessing that the error is not surprising and that the query is intended for an upcoming release of postgres and wouldn't be backported to 9.6.x

Luke

Re: Possible performance regression with pg_dump of a large number of relations

From
Stephen Frost
Date:
Hi there!

* Luke Cowell (lcowell@gmail.com) wrote:
> Hi Stephen, thank you for putting this together.

Yeah, it needs more work, which I figured out after actually hacking
together a patch for it and I've just not gotten back to it yet.

> > If folks get a chance to take a look at the query and/or test, that'd be
> > great.  I'll try to work up an actual patch to pg_dump this weekend to
> > run it through the regression tests and see if anything breaks.
>
> I'm not sure how I can help other than testing that this runs. I can confirm that it runs on 10.1. It does not run on
> 9.5 or 9.6 and gives this error:

Thanks for offering to help!  Once I've got an actual patch together
that works well enough to get through the regression tests, I'll
definitely let you know.  The general premise still looks viable, but
the actual query needs to be reworked.

> > ERROR:  relation "pg_init_privs" does not exist
> > LINE 139: LEFT JOIN pg_init_privs pip

I certainly hope that works on 9.6, since that's when pg_init_privs was
added..

> I'm guessing that the error is not surprising and that the query is intended for an upcoming release of postgres and
> wouldn't be backported to 9.6.x

Presuming I can make it work, the idea would be to back-port it to 9.6
and 10, since pg_init_privs and this code was added in 9.6.

Thanks again!

Stephen


Re: Possible performance regression with pg_dump of a large number of relations

From
Luke Cowell
Date:

> On Jan 24, 2018, at 2:56 PM, Stephen Frost <sfrost@snowman.net> wrote:
>
> Hi there!
>
>
>>> ERROR:  relation "pg_init_privs" does not exist
>>> LINE 139: LEFT JOIN pg_init_privs pip
>
> I certainly hope that works on 9.6, since that's when pg_init_privs was
> added..

My mistake. That error is from my 9.5 server. It does error on 9.6 as well, but with a different message:

$ psql --version
psql (PostgreSQL) 9.6.6
$ psql postgres < query.sql
ERROR:  function pg_get_partkeydef(oid) does not exist
LINE 126: pg_get_partkeydef(c.oid) AS partkeydef,
          ^
HINT:  No function matches the given name and argument types. You might need to add explicit type casts.

> Presuming I can make it work, the idea would be to back-port it to 9.6
> and 10, since pg_init_privs and this code was added in 9.6.

A 9.6 backport would be excellent.

Thanks again!

Luke

Re: Possible performance regression with pg_dump of a large number of relations

From
Stephen Frost
Date:
Luke,

* Luke Cowell (lcowell@gmail.com) wrote:
> > On Jan 24, 2018, at 2:56 PM, Stephen Frost <sfrost@snowman.net> wrote:
> >>> ERROR:  relation "pg_init_privs" does not exist
> >>> LINE 139: LEFT JOIN pg_init_privs pip
> >
> > I certainly hope that works on 9.6, since that's when pg_init_privs was
> > added..
>
> My mistake. That error is from my 9.5 server. It does error on 9.6 as well, but with a different message:

No worries!

> $ psql --version
> psql (PostgreSQL) 9.6.6
> $ psql postgres < query.sql
> ERROR:  function pg_get_partkeydef(oid) does not exist
> LINE 126: pg_get_partkeydef(c.oid) AS partkeydef,
>           ^
> HINT:  No function matches the given name and argument types. You might need to add explicit type casts.

Ah, yes, that makes sense; that function wasn't included in 9.6.  That's
not really a problem though: pg_dump knows how to talk to different
versions of PG, it just happens that the query I sent was the one for
HEAD and not for older versions.
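
For example, on a 9.6 server the version-specific query would just select placeholders for the partitioning bits, something along these lines (illustrative only, not the exact text from pg_dump):

```sql
-- Illustrative only: how partitioning columns degrade to placeholders
-- when the server is older than 10.
SELECT NULL::text AS partkeydef,
       false      AS ispartition,
       NULL::text AS partbound;
```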

> > Presuming I can make it work, the idea would be to back-port it to 9.6
> > and 10, since pg_init_privs and this code was added in 9.6.
>
> A 9.6 backport would be excellent.

Yup, I'll make sure that the query that's run for 9.6 server systems
works on, well, 9.6 systems. ;)

Thanks!

Stephen


Re: Possible performance regression with pg_dump of a large number of relations

From
Felipe Sateler
Date:
Hello Stephen, everyone,

On Fri, Jan 12, 2018 at 10:39:09PM -0500, Stephen Frost wrote:
> Greetings Jeff & Luke,
>
> * Jeff Janes (jeff.janes@gmail.com) wrote:
> > Sorry, that query reflects some munging I did to it.  The real part added
> > to the query is:
> >
> > EXISTS (SELECT 1 FROM pg_attribute at LEFT JOIN pg_init_privs pip ON(c.oid
> > = pip.objoid AND pip.classoid = (SELECT oid FROM pg_class WHERE relname =
> > 'pg_class') AND pip.objsubid = at.attnum)WHERE at.attrelid = c.oid AND
> > ((SELECT array_agg(acl) FROM (SELECT
> > unnest(coalesce(at.attacl,acldefault('c',c.relowner))) AS acl EXCEPT SELECT
> > unnest(coalesce(pip.initprivs,acldefault('c',c.relowner)))) as foo) IS NOT
> > NULL OR (SELECT array_agg(acl) FROM (SELECT
> > unnest(coalesce(pip.initprivs,acldefault('c',c.relowner))) AS acl EXCEPT
> > SELECT unnest(coalesce(at.attacl,acldefault('c',c.relowner)))) as foo) IS
> > NOT NULL OR NULL IS NOT NULL OR NULL IS NOT NULL))AS changed_ac
>
> Yes, this is to check if any of the rights on the table or any of its
> columns have been changed from what its initial rights are, as recorded
> in pg_init_privs.
>
> I've been playing around with this a bit tonight trying to think of a
> way to avoid doing this work and it occurs to me that we really only
> need to worry about initprivs on objects in schemas that are either
> there at init time, or from extensions.  Not all of the objects in the
> system can have init-privs because the only way to get init-privs is
> at initdb time or from an extension creating a new object.
>
> As such, I've reworked the query (but not yet put it into pg_dump to
> run it through the regression tests) to look like this (for anyone else
> who wants to take a look at it and play with it):
> <snip sql>
>
> This ends up dropping the query time from around 8.6s on my system to
> about 1s, with a test rig of 1000 schemas and 100,000 tables.

The effect seems much milder here: it reduces the time from 8s to 5.7s. Still an
improvement, though.

I'm using pg_dump 11.6. Has something been merged between then and now?

My use case is similar to the OP: I dump each schema separately, which
causes long delays between dumps.

>
> Unfortunately, the way pg_dump is structured today, it really wouldn't
> be easy to have it only run this query for the tables in the schema(s)
> requested because it expects to be able to look for dependencies in the
> data structure which results from this query (as I recall, it's been a
> while since I looked, but I remember trying to figure out a way to do
> that and it certainly didn't look easy to do).

This would be nice. Such a filter (when applied manually) reduces the query
time significantly.
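
For reference, the kind of manual filter I mean is just constraining the relation list to the schema being dumped before the per-column ACL checks run, roughly like this (simplified sketch; 's_10' is an example schema name):

```sql
-- Simplified sketch of the manual restriction: only list relations from the
-- schema being dumped ('s_10' is just an example).
SELECT c.oid, c.relname
FROM pg_class c
JOIN pg_namespace nsp ON nsp.oid = c.relnamespace
WHERE c.relkind IN ('r', 'S', 'v', 'c', 'm', 'f', 'p')
  AND nsp.nspname = 's_10'
ORDER BY c.oid;
```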

-- 
Saludos,
Felipe Sateler