Thread: pg_upgrade slowness for databases with many tables
Hi,

upgrading a database containing > 90000 tables and about the same number of views and indexes with pg_upgrade takes several hours. I started the upgrade more than five hours ago and the "Creating dump of database schemas" step is still not finished.

ps xaf output:

60709 pts/10 S+ 0:00 | \_ sudo -u postgres pg_upgrade -d /var/lib/pgsql/data.intern -D /var/lib/pgsql/data -b /usr/lib/postgresql92/bin -B /usr/lib/postgresql94/bin -j12 -k
60710 pts/10 S+ 0:00 |   \_ pg_upgrade -d /var/lib/pgsql/data.intern -D /var/lib/pgsql/data -b /usr/lib/postgresql92/bin -B /usr/lib/postgresql94/bin -j12 -k
60806 pts/10 S+ 0:00 |     \_ pg_upgrade -d /var/lib/pgsql/data.intern -D /var/lib/pgsql/data -b /usr/lib/postgresql92/bin -B /usr/lib/postgresql94/bin -j12 -k
60809 pts/10 S+ 0:00 |       \_ sh -c "/usr/lib/postgresql94/bin/pg_dump" --host "/data/pgsql" --port 50432 --username "postgres" --schema-only --quote-all-identifiers --binary-upgrade --format=custom --file="pg_upgrade_dump_16428.custom" "zms" >> "pg_upgrade_dump_16428.log" 2>&1
60815 pts/10 S+ 3:52 |         \_ /usr/lib/postgresql94/bin/pg_dump --host /data/pgsql --port 50432 --username postgres --schema-only --quote-all-identifiers --binary-upgrade --format=custom --file=pg_upgrade_dump_16428.custom zms
60719 pts/10 S+ 0:00 /usr/lib/postgresql92/bin/postgres -D /var/lib/pgsql/data.intern -p 50432 -b -c listen_addresses= -c unix_socket_permissions=0700 -c unix_socket_directory=/data/pgsql
60720 ? Ss 0:00 \_ postgres: logger process
60722 ? Ss 0:00 \_ postgres: checkpointer process
60723 ? Ss 0:08 \_ postgres: writer process
60724 ? Ss 0:00 \_ postgres: wal writer process
60725 ? Ss 0:04 \_ postgres: stats collector process
60817 ? Rs 334:12 \_ postgres: postgres zms [local] SELECT

The database is version 9.2, pg_upgrade is version 9.4.1.
In a discussion at lwn.net I've been told that 9.4.1 already contains fixes for inefficiencies and that it shouldn't be so slow anymore: http://lwn.net/Articles/645600/

What can I do to help improve pg_dump/pg_upgrade for my use case?

Regards,
Stefan
On Fri, May 22, 2015 at 9:10 AM, Stefan Seifert <nine@detonation.org> wrote:

> Hi,
>
> upgrading a database containing > 90000 tables and about the same number of
> views and indexes with pg_upgrade takes several hours. I've started the
> upgrade more than five hours ago and the "Creating dump of database schemas"
> step is still not finished.
>
> ...
>
> The database is version 9.2, pg_upgrade is version 9.4.1. In a discussion at
> lwn.net I've been told, that 9.4.1 already contains fixes for inefficiencies
> and that it shouldn't be so slow anymore: http://lwn.net/Articles/645600/
>
> What can I do to help improve pg_dump/pg_upgrade for my use case?

Unfortunately the 9.4 version of pg_dump has to run against the 9.2 server, where some of the improvements are not applicable.

Your next upgrade should be much less painful. But unfortunately this one will be slow.

If it is intolerable, you could try to port commit eeb6f37d89fc60c6449ca12ef9e into a custom build of 9.2.

This is more a topic for the performance list than for bugs.

Cheers,

Jeff
On 2015-05-22 12:34:54 -0700, Jeff Janes wrote:
> On Fri, May 22, 2015 at 9:10 AM, Stefan Seifert <nine@detonation.org> wrote:
> > upgrading a database containing > 90000 tables and about the same number of
> > views and indexes with pg_upgrade takes several hours. I've started the
> > upgrade more than five hours ago and the "Creating dump of database schemas"
> > step is still not finished.
> >
> > The database is version 9.2, pg_upgrade is version 9.4.1. In a discussion at
> > lwn.net I've been told, that 9.4.1 already contains fixes for inefficiencies
> > and that it shouldn't be so slow anymore: http://lwn.net/Articles/645600/
> >
> > What can I do to help improve pg_dump/pg_upgrade for my use case?
>
> Unfortunately the 9.4 version of pg_dump has to run against the 9.2 server,
> where some of the improvements are not applicable.
>
> Your next upgrade should be much less painful. But unfortunately this one
> will be slow.
>
> If it is intolerable, you could try to port
> commit eeb6f37d89fc60c6449ca12ef9e into a custom build of 9.2.
>
> This is more a topic for the performance list than for bugs.

I actually suggested -bugs... So that's my fault, if any.

There've been a number of problems with the dependency code with a larger number of objects. It'd be interesting to see a profile of 9.2 while a pg_dump is running, to see what the problem is.

Greetings,

Andres Freund
On Saturday 23 May 2015 00:08:44 you wrote:
> On 2015-05-22 23:03:34 +0200, Stefan Seifert wrote:
> > On Friday 22 May 2015 21:43:52 you wrote:
> > > It'd be interesting to see a profile of 9.2 while a pg_dump is running,
> > > to see what the problem is.
> >
> > How can I get you such a profile?
>
> What are you running this on? I guess Linux from the strace output and
> being on lwn? Which distribution & version?

Correct. It's openSUSE 13.2. PostgreSQL packages are from the server:database:postgresql repository.

> Depending on either you could do a 'perf record -p $pid_of_slow_backend'
> or the same with -g added. The latter gives a hierarchical profile, but
> requires postgres to be compiled with frame pointers
> (-fno-omit-frame-pointer); or --call-graph dwarf, which is slower but
> works on more binaries and requires a relatively new kernel/perf.

I restarted the upgrade and am following it with strace. I'm taking perf measurements for every kind of query pg_dump does (and which is done often enough for me to notice the change in pattern in the terminal window). After a couple of minutes of just locking tables, I noticed the following different kinds of queries. Every entry is the current time and an example query. I'm attaching perf reports with corresponding file names. The latest is the first to be run with -g.

It looks like LockReassignCurrentOwner is taking most of the time. The rest goes to hash_seq_search, with almost nothing remaining.

What I really wonder though is: why does pg_dump do all these queries for single objects in the first place? If the target is to create a complete schema dump, why not do a select * from all the relevant pg_* tables and construct the big picture in memory? That would surely be much faster?
16:01:
recvfrom(9, "Q\0\0\1\334SELECT tableoid, oid, typname, typnamespace, typacl, (SELECT rolname FROM pg_catalog.pg_roles WHERE oid = typowner) AS rolname, typinput::oid AS typinput, typoutput::oid AS typoutput, typelem, typrelid, CASE WHEN typrelid = 0 THEN ' '::\"char\" ELSE (SELECT relkind FROM pg_class WHERE oid = typrelid) END AS typrelkind, typtype, typisdefined, typname[0] = '_' AND typelem != 0 AND (SELECT typarray FROM pg_type te WHERE oid = pg_type.typelem) = oid AS isarray FROM pg_type\0", 8192, 0, NULL, NULL) = 477
recvfrom(9, "Q\0\0\0004SET search_path = \"2004.atikon.com\", pg_catalog\0", 8192, 0, NULL, NULL) = 53

16:02:
recvfrom(9, "Q\0\0\3\24SELECT a.attnum, a.attname, a.atttypmod, a.attstattarget, a.attstorage, t.typstorage, a.attnotnull, a.atthasdef, a.attisdropped, a.attlen, a.attalign, a.attislocal, pg_catalog.format_type(t.oid,a.atttypmod) AS atttypname, array_to_string(a.attoptions, ', ') AS attoptions, CASE WHEN a.attcollation <> t.typcollation THEN a.attcollation ELSE 0 END AS attcollation, pg_catalog.array_to_string(ARRAY(SELECT pg_catalog.quote_ident(option_name) || ' ' || pg_catalog.quote_literal(option_value) FROM pg_catalog.pg_options_to_table(attfdwoptions) ORDER BY option_name), E',\n ') AS attfdwoptions FROM pg_catalog.pg_attribute a LEFT JOIN pg_catalog.pg_type t ON a.atttypid = t.oid WHERE a.attrelid = '3757242'::pg_catalog.oid AND a.attnum > 0::pg_catalog.int2 ORDER BY a.attrelid, a.attnum\0", 8192, 0, NULL, NULL) = 789

16:13:
recvfrom(9, "Q\0\0\3\24SELECT a.attnum, a.attname, a.atttypmod, a.attstattarget, a.attstorage, t.typstorage, a.attnotnull, a.atthasdef, a.attisdropped, a.attlen, a.attalign, a.attislocal, pg_catalog.format_type(t.oid,a.atttypmod) AS atttypname, array_to_string(a.attoptions, ', ') AS attoptions, CASE WHEN a.attcollation <> t.typcollation THEN a.attcollation ELSE 0 END AS attcollation, pg_catalog.array_to_string(ARRAY(SELECT pg_catalog.quote_ident(option_name) || ' ' || pg_catalog.quote_literal(option_value) FROM pg_catalog.pg_options_to_table(attfdwoptions) ORDER BY option_name), E',\n ') AS attfdwoptions FROM pg_catalog.pg_attribute a LEFT JOIN pg_catalog.pg_type t ON a.atttypid = t.oid WHERE a.attrelid = '3898404'::pg_catalog.oid AND a.attnum > 0::pg_catalog.int2 ORDER BY a.attrelid, a.attnum\0", 8192, 0, NULL, NULL) = 789

17:40:
recvfrom(9, "Q\0\0\3\36SELECT t.tableoid, t.oid, t.relname AS indexname, pg_catalog.pg_get_indexdef(i.indexrelid) AS indexdef, t.relnatts AS indnkeys, i.indkey, i.indisclustered, false AS indisreplident, t.relpages, c.contype, c.conname, c.condeferrable, c.condeferred, c.tableoid AS contableoid, c.oid AS conoid, pg_catalog.pg_get_constraintdef(c.oid, false) AS condef, (SELECT spcname FROM pg_catalog.pg_tablespace s WHERE s.oid = t.reltablespace) AS tablespace, array_to_string(t.reloptions, ', ') AS options FROM pg_catalog.pg_index i JOIN pg_catalog.pg_class t ON (t.oid = i.indexrelid) LEFT JOIN pg_catalog.pg_constraint c ON (i.indrelid = c.conrelid AND i.indexrelid = c.conindid AND c.contype IN ('p','u','x')) WHERE i.indrelid = '4300750'::pg_catalog.oid AND i.indisvalid AND i.indisready ORDER BY indexname\0", 8192, 0, NULL, NULL) = 799

18:20:
recvfrom(9, "Q\0\0\0ASET search_path = \"2009.steuerberater-hauer.com\", pg_catalog\0", 8192, 0, NULL, NULL) = 66
sendto(9, "C\0\0\0\10SET\0Z\0\0\0\5T", 15, 0, NULL, 0) = 15
recvfrom(9, "Q\0\0\0\270SELECT tableoid, oid, conname, confrelid, pg_catalog.pg_get_constraintdef(oid) AS condef FROM pg_catalog.pg_constraint WHERE conrelid = '3384167'::pg_catalog.oid AND contype = 'f'\0", 8192, 0, NULL, NULL) = 185

Will continue to watch for new statements as time permits.

Regards,
Stefan
Attachment
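The profile result is plausible at this scale. As a toy cost model (a sketch with assumed figures, not PostgreSQL source code): the dumping backend ends up holding roughly one local lock per table, and the pre-9.3 LockReassignCurrentOwner walks the entire local lock hash (via hash_seq_search) once per per-object query, so the total work grows quadratically with the object count:

```python
# Toy cost model of pre-patch lock reassignment, not PostgreSQL code.
# Assumes ~one AccessShareLock per dumped table, held for the whole run,
# and one full hash_seq_search pass over them per per-object query.

n_locks = 90_000     # > 90000 tables, each locked by pg_dump up front
n_queries = 90_000   # order-of-magnitude count of per-object queries

# Pre-patch behavior: every reassignment scans every held lock.
entries_scanned = n_queries * n_locks
print(f"{entries_scanned:,}")  # 8,100,000,000
```

Even at a few nanoseconds per hash entry, eight billion visits amounts to a substantial fraction of the runtime, which fits LockReassignCurrentOwner plus hash_seq_search dominating the perf reports.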
Stefan Seifert <nine@detonation.org> writes:
> It looks like LockReassignCurrentOwner is taking most of the time.

Yeah, so that is the issue that was alleviated by the patch you were pointed to.

> What I really wonder though is, why pg_dump does all these queries for single
> objects in the first place? If the target is to create a complete schema dump,
> why not doing a select * from all the relevant pg_* tables and construct the
> big picture in memory? That surely would be much faster?

It's unclear that it would be faster, and it is clear that restructuring pg_dump like that would be a lotta work.

regards, tom lane
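The patch Jeff pointed to works by having each resource owner remember its own recently acquired locks in a small fixed-size array, so reassignment at subtransaction or portal cleanup only touches those few locks instead of scanning the session's entire local lock table; a full hash scan is needed only when the array overflows. A rough Python sketch of that idea (the names, the array size, and the structure are my assumptions for illustration, not the actual C code):

```python
# Sketch of the idea behind commit eeb6f37d89fc60c6449ca12ef9e.
# Names and the cache size are illustrative assumptions, not the real code.

MAX_RESOWNER_LOCKS = 15  # small fixed cache; exact size is an assumption

class ResourceOwner:
    def __init__(self):
        self.locks = []          # locks this owner remembers taking
        self.overflowed = False  # too many locks to track individually

    def remember_lock(self, lock):
        if len(self.locks) < MAX_RESOWNER_LOCKS:
            self.locks.append(lock)
        else:
            self.overflowed = True  # fall back to scanning everything

def reassign_locks(owner, local_lock_table):
    """Return how many lock-table entries must be visited at cleanup."""
    if owner.overflowed:
        return len(local_lock_table)   # old behavior: scan every held lock
    return len(owner.locks)            # new behavior: just this owner's few

# A pg_dump session holds ~90000 locks for the whole run...
local_lock_table = {f"table_{i}": "AccessShareLock" for i in range(90_000)}

# ...but a typical per-object query's owner only took one or two.
owner = ResourceOwner()
owner.remember_lock("table_42")
print(reassign_locks(owner, local_lock_table))  # 1, not 90000
```

This is why the fix turns the per-query cost from O(total locks held) into O(locks taken by this query), which is what makes the 9.3+ pg_dump tolerable at 90000 tables.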