pg_upgrade slowness for databases with many tables

From: Stefan Seifert
Hi,

Upgrading a database containing > 90000 tables and about the same number of
views and indexes with pg_upgrade takes several hours. I started the upgrade
more than five hours ago and the "Creating dump of database schemas" step is
still not finished.

ps xaf output:

60709 pts/10   S+     0:00  |       \_ sudo -u postgres pg_upgrade -d
/var/lib/pgsql/data.intern -D /var/lib/pgsql/data -b /usr/lib/postgresql92/bin
-B /usr/lib/postgresql94/bin -j12 -k
60710 pts/10   S+     0:00  |           \_ pg_upgrade -d
/var/lib/pgsql/data.intern -D /var/lib/pgsql/data -b /usr/lib/postgresql92/bin
-B /usr/lib/postgresql94/bin -j12 -k
60806 pts/10   S+     0:00  |               \_ pg_upgrade -d
/var/lib/pgsql/data.intern -D /var/lib/pgsql/data -b /usr/lib/postgresql92/bin
-B /usr/lib/postgresql94/bin -j12 -k
60809 pts/10   S+     0:00  |                   \_ sh -c
"/usr/lib/postgresql94/bin/pg_dump" --host "/data/pgsql" --port 50432 --
username "postgres" --schema-only --quote-all-identifiers --binary-upgrade --
format=custom  --file="pg_upgrade_dump_16428.custom" "zms" >>
"pg_upgrade_dump_16428.log" 2>&1
60815 pts/10   S+     3:52  |                       \_
/usr/lib/postgresql94/bin/pg_dump --host /data/pgsql --port 50432 --username
postgres --schema-only --quote-all-identifiers --binary-upgrade --
format=custom --file=pg_upgrade_dump_16428.custom zms


60719 pts/10   S+     0:00 /usr/lib/postgresql92/bin/postgres -D
/var/lib/pgsql/data.intern -p 50432 -b -c listen_addresses= -c
unix_socket_permissions=0700 -c unix_socket_directory=/data/pgsql
60720 ?        Ss     0:00  \_ postgres: logger process
60722 ?        Ss     0:00  \_ postgres: checkpointer process
60723 ?        Ss     0:08  \_ postgres: writer process
60724 ?        Ss     0:00  \_ postgres: wal writer process
60725 ?        Ss     0:04  \_ postgres: stats collector process
60817 ?        Rs   334:12  \_ postgres: postgres zms [local] SELECT

The database is version 9.2, pg_upgrade is version 9.4.1. In a discussion at
lwn.net I was told that 9.4.1 already contains fixes for these inefficiencies
and that it shouldn't be so slow anymore: http://lwn.net/Articles/645600/

What can I do to help improve pg_dump/pg_upgrade for my use case?

Regards,
Stefan

Re: pg_upgrade slowness for databases with many tables

From: Jeff Janes
On Fri, May 22, 2015 at 9:10 AM, Stefan Seifert <nine@detonation.org> wrote:

> Hi,
>
> Upgrading a database containing > 90000 tables and about the same number of
> views and indexes with pg_upgrade takes several hours. I started the upgrade
> more than five hours ago and the "Creating dump of database schemas" step is
> still not finished.
>
>
...


> The database is version 9.2, pg_upgrade is version 9.4.1. In a discussion at
> lwn.net I was told that 9.4.1 already contains fixes for these inefficiencies
> and that it shouldn't be so slow anymore: http://lwn.net/Articles/645600/
>
> What can I do to help improve pg_dump/pg_upgrade for my use case?
>

Unfortunately the 9.4 version of pg_dump has to run against the 9.2 server,
where some of the improvements are not applicable.

Your next upgrade should be much less painful.  But unfortunately this one
will be slow.

If it is intolerable, you could try to port
commit eeb6f37d89fc60c6449ca12ef9e into a custom build of 9.2.
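
Roughly something like this (a sketch only; the install prefix is just an
example, and the cherry-pick may well need manual conflict resolution against
the 9.2 sources):

    git clone git://git.postgresql.org/git/postgresql.git
    cd postgresql
    git checkout REL9_2_STABLE
    git cherry-pick eeb6f37d89fc60c6449ca12ef9e
    ./configure --prefix=/usr/local/pgsql-9.2-patched
    make && make install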

This is more a topic for the performance list than for bugs.

Cheers,

Jeff

Re: pg_upgrade slowness for databases with many tables

From: Andres Freund
On 2015-05-22 12:34:54 -0700, Jeff Janes wrote:
> On Fri, May 22, 2015 at 9:10 AM, Stefan Seifert <nine@detonation.org> wrote:
> > Upgrading a database containing > 90000 tables and about the same number of
> > views and indexes with pg_upgrade takes several hours. I started the upgrade
> > more than five hours ago and the "Creating dump of database schemas" step is
> > still not finished.

> > The database is version 9.2, pg_upgrade is version 9.4.1. In a discussion at
> > lwn.net I was told that 9.4.1 already contains fixes for these inefficiencies
> > and that it shouldn't be so slow anymore: http://lwn.net/Articles/645600/
> >
> > What can I do to help improve pg_dump/pg_upgrade for my use case?
> >
>
> Unfortunately the 9.4 version of pg_dump has to run against the 9.2 server,
> where some of the improvements are not applicable.
>
> Your next upgrade should be much less painful.  But unfortunately this one
> will be slow.
>
> If it is intolerable, you could try to port
> commit eeb6f37d89fc60c6449ca12ef9e into a custom build of 9.2.
>
> This is more a topic for the performance list than for bugs.

I actually suggested -bugs... So that's my fault, if any. There've been a
number of problems with the dependency code when dealing with larger numbers
of objects.

It'd be interesting to see a profile of 9.2 while a pg_dump is running,
to see what the problem is.

Greetings,

Andres Freund

Re: pg_upgrade slowness for databases with many tables

From: Stefan Seifert
On Saturday 23 May 2015 00:08:44 you wrote:
> On 2015-05-22 23:03:34 +0200, Stefan Seifert wrote:
> > On Friday 22 May 2015 21:43:52 you wrote:
> > > It'd be interesting to see a profile of 9.2 while a pg_dump is running,
> > > to see what the problem is.
> >
> > How can I get you such a profile?
>
> What are you running this on? I guess Linux, from the strace output and
> you being on lwn? Which distribution & version?

Correct. It's openSUSE 13.2. PostgreSQL packages are from the
server:database:postgresql repository.

> Depending on that you could do a 'perf record -p $pid_of_slow_backend',
> or the same with -g added. The latter gives a hierarchical profile, but
> requires postgres to be compiled with frame pointers
> (-fno-omit-frame-pointer); or use --call-graph dwarf, which is slower but
> works on more binaries and requires a relatively new kernel/perf.

I restarted the upgrade and am following it with strace. I'm taking perf
measurements for every kind of query pg_dump issues (at least every kind that
runs often enough for me to notice the change of pattern in the terminal
window).
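
The commands look roughly like this (a sketch; 60817 is just the backend PID
from the ps listing in my first mail, and the output file names follow
whatever query is being profiled):

    perf record -p 60817 -- sleep 30             # flat profile
    perf report --stdio > perf_report_current_query.txt
    perf record -g -p 60817 -- sleep 30          # hierarchical, latest report
    perf report --stdio > perf_report_current_query_g.txt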

After a couple of minutes of just locking tables, I noticed the following
different kinds of queries. Each entry is the current time and an example
query. I'm attaching perf reports with corresponding file names. The latest
one is the first to be run with -g.

It looks like LockReassignCurrentOwner is taking most of the time. The rest
goes to hash_seq_search with almost nothing remaining.
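
For scale: pg_dump takes an AccessShareLock on every table it dumps and holds
all of them until the end of its transaction, which is presumably why that
lock bookkeeping dominates. The number of locks the backend currently holds
can be checked from a second session with something like (sketch; 60817 is
again the backend PID):

    SELECT count(*)
    FROM pg_locks
    WHERE pid = 60817 AND locktype = 'relation';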

What I really wonder, though, is why pg_dump does all these queries for
single objects in the first place. If the goal is to create a complete schema
dump, why not do a select * from all the relevant pg_* tables and construct
the big picture in memory? Surely that would be much faster?
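
For the per-relation attribute query in the 16:02 entry below, I'm thinking
of something along these lines: one pass over pg_attribute for all relations
instead of one query per table (just a sketch, leaving out the options and
collation details):

    SELECT a.attrelid, a.attnum, a.attname, a.atttypmod, a.attnotnull,
           a.atthasdef, a.attisdropped, a.attlen, a.attalign, a.attislocal,
           pg_catalog.format_type(t.oid, a.atttypmod) AS atttypname
    FROM pg_catalog.pg_attribute a
    JOIN pg_catalog.pg_class c ON c.oid = a.attrelid
    LEFT JOIN pg_catalog.pg_type t ON t.oid = a.atttypid
    WHERE c.relkind IN ('r', 'v')
      AND a.attnum > 0
      AND NOT a.attisdropped
    ORDER BY a.attrelid, a.attnum;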

16:01:
recvfrom(9, "Q\0\0\1\334SELECT tableoid, oid, typname, typnamespace, typacl,
(SELECT rolname FROM pg_catalog.pg_roles WHERE oid = typowner) AS rolname,
typinput::oid AS typinput, typoutput::oid AS typoutput, typelem, typrelid,
CASE WHEN typrelid = 0 THEN ' '::\"char\" ELSE (SELECT relkind FROM pg_class
WHERE oid = typrelid) END AS typrelkind, typtype, typisdefined, typname[0] =
'_' AND typelem != 0 AND (SELECT typarray FROM pg_type te WHERE oid =
pg_type.typelem) = oid AS isarray FROM pg_type\0", 8192, 0, NULL, NULL) = 477
recvfrom(9, "Q\0\0\0004SET search_path = \"2004.atikon.com\", pg_catalog\0",
8192, 0, NULL, NULL) = 53

16:02:
recvfrom(9, "Q\0\0\3\24SELECT a.attnum, a.attname, a.atttypmod,
a.attstattarget, a.attstorage, t.typstorage, a.attnotnull, a.atthasdef,
a.attisdropped, a.attlen, a.attalign, a.attislocal,
pg_catalog.format_type(t.oid,a.atttypmod) AS atttypname,
array_to_string(a.attoptions, ', ') AS attoptions, CASE WHEN a.attcollation <>
t.typcollation THEN a.attcollation ELSE 0 END AS attcollation,
pg_catalog.array_to_string(ARRAY(SELECT pg_catalog.quote_ident(option_name) ||
' ' || pg_catalog.quote_literal(option_value) FROM
pg_catalog.pg_options_to_table(attfdwoptions) ORDER BY option_name), E',\n
') AS attfdwoptions FROM pg_catalog.pg_attribute a LEFT JOIN
pg_catalog.pg_type t ON a.atttypid = t.oid WHERE a.attrelid =
'3757242'::pg_catalog.oid AND a.attnum > 0::pg_catalog.int2 ORDER BY
a.attrelid, a.attnum\0", 8192, 0, NULL, NULL) = 789

16:13:
recvfrom(9, "Q\0\0\3\24SELECT a.attnum, a.attname, a.atttypmod,
a.attstattarget, a.attstorage, t.typstorage, a.attnotnull, a.atthasdef,
a.attisdropped, a.attlen, a.attalign, a.attislocal,
pg_catalog.format_type(t.oid,a.atttypmod) AS atttypname,
array_to_string(a.attoptions, ', ') AS attoptions, CASE WHEN a.attcollation <>
t.typcollation THEN a.attcollation ELSE 0 END AS attcollation,
pg_catalog.array_to_string(ARRAY(SELECT pg_catalog.quote_ident(option_name) ||
' ' || pg_catalog.quote_literal(option_value) FROM
pg_catalog.pg_options_to_table(attfdwoptions) ORDER BY option_name), E',\n
') AS attfdwoptions FROM pg_catalog.pg_attribute a LEFT JOIN
pg_catalog.pg_type t ON a.atttypid = t.oid WHERE a.attrelid =
'3898404'::pg_catalog.oid AND a.attnum > 0::pg_catalog.int2 ORDER BY
a.attrelid, a.attnum\0", 8192, 0, NULL, NULL) = 789

17:40:
recvfrom(9, "Q\0\0\3\36SELECT t.tableoid, t.oid, t.relname AS indexname,
pg_catalog.pg_get_indexdef(i.indexrelid) AS indexdef, t.relnatts AS indnkeys,
i.indkey, i.indisclustered, false AS indisreplident, t.relpages, c.contype,
c.conname, c.condeferrable, c.condeferred, c.tableoid AS contableoid, c.oid AS
conoid, pg_catalog.pg_get_constraintdef(c.oid, false) AS condef, (SELECT
spcname FROM pg_catalog.pg_tablespace s WHERE s.oid = t.reltablespace) AS
tablespace, array_to_string(t.reloptions, ', ') AS options FROM
pg_catalog.pg_index i JOIN pg_catalog.pg_class t ON (t.oid = i.indexrelid)
LEFT JOIN pg_catalog.pg_constraint c ON (i.indrelid = c.conrelid AND
i.indexrelid = c.conindid AND c.contype IN ('p','u','x')) WHERE i.indrelid =
'4300750'::pg_catalog.oid AND i.indisvalid AND i.indisready ORDER BY
indexname\0", 8192, 0, NULL, NULL) = 799

18:20:
recvfrom(9, "Q\0\0\0ASET search_path = \"2009.steuerberater-hauer.com\",
pg_catalog\0", 8192, 0, NULL, NULL) = 66
sendto(9, "C\0\0\0\10SET\0Z\0\0\0\5T", 15, 0, NULL, 0) = 15
recvfrom(9, "Q\0\0\0\270SELECT tableoid, oid, conname, confrelid,
pg_catalog.pg_get_constraintdef(oid) AS condef FROM pg_catalog.pg_constraint
WHERE conrelid = '3384167'::pg_catalog.oid AND contype = 'f'\0", 8192, 0,
NULL, NULL) = 185

Will continue to watch for new statements as time permits.

Regards,
Stefan

Re: pg_upgrade slowness for databases with many tables

From: Tom Lane
Stefan Seifert <nine@detonation.org> writes:
> It looks like LockReassignCurrentOwner is taking most of the time.

Yeah, so that is the issue that was alleviated by the patch you were
pointed to.

> What I really wonder, though, is why pg_dump does all these queries for
> single objects in the first place. If the goal is to create a complete
> schema dump, why not do a select * from all the relevant pg_* tables and
> construct the big picture in memory? Surely that would be much faster?

It's unclear that it would be faster, and it is clear that restructuring
pg_dump like that would be a lotta work.

            regards, tom lane