Thread: pg_dump incredibly slow dumping a single schema from a large db

pg_dump incredibly slow dumping a single schema from a large db

From: Mike Roest
Hey Everyone,
    I've got an interesting issue.  We're running PostgreSQL 9.1.1 on Linux x64 (CentOS 5.8):

aspdata=# select version();
                                                    version
---------------------------------------------------------------------------------------------------------------
 PostgreSQL 9.1.1 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-52), 64-bit


We currently have one DB we use for multiple independent tenant schemas. The database size is currently 56227005240 bytes (~52 GB) as reported by pg_database_size.
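For reference, that figure comes from something like the following (pg_size_pretty just makes the same number human-readable):

SELECT pg_database_size('aspdata');                  -- 56227005240
SELECT pg_size_pretty(pg_database_size('aspdata'));  -- roughly 52 GB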

There are 557 schemas, each with about 1300 objects (760 tables, 520 views).

We are using pg_dump to back up a single schema with a total size of 5480448 bytes (~5.2 MB), calculated with:

SELECT sum(pg_relation_size(schemaname || '.' || tablename))::bigint FROM pg_tables WHERE schemaname = 'miketest';

pg_dump -f /dumps/test.backup -Fc -n miketest aspdata

This dump is currently taking around 8 minutes.  While dumping, the pg_dump process is using 100% of one core on the server (a 24-core machine).  Running pg_dump with -v, I found that the following stages take the majority of the time:

reading user-defined tables (2 minutes and 20 seconds)
reading dependency data (5 minutes and 30 seconds)
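For clarity, the verbose run is just the same command as above with -v added:

pg_dump -v -f /dumps/test.backup -Fc -n miketest aspdata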

The size of the schema doesn't really seem to affect this; these times are almost identical for a 700 MB schema as well (obviously the data-dump portion takes longer with the bigger schema).

During "reading user-defined tables", the following query shows up for 10-20 seconds, then the pg_dump connection sits idle for the rest of the 2 minutes:

SELECT c.tableoid, c.oid, c.relname, c.relacl, c.relkind, c.relnamespace,
  (SELECT rolname FROM pg_catalog.pg_roles WHERE oid = c.relowner) AS rolname,
  c.relchecks, c.relhastriggers, c.relhasindex, c.relhasrules, c.relhasoids,
  c.relfrozenxid, tc.oid AS toid, tc.relfrozenxid AS tfrozenxid, c.relpersistence,
  CASE WHEN c.reloftype <> 0 THEN c.reloftype::pg_catalog.regtype ELSE NULL END AS reloftype,
  d.refobjid AS owning_tab, d.refobjsubid AS owning_col,
  (SELECT spcname FROM pg_tablespace t WHERE t.oid = c.reltablespace) AS reltablespace,
  array_to_string(c.reloptions, ', ') AS reloptions,
  array_to_string(array(SELECT 'toast.' || x FROM unnest(tc.reloptions) x), ', ') AS toast_reloptions
FROM pg_class c
LEFT JOIN pg_depend d ON (c.relkind = 'S' AND d.classid = c.tableoid AND d.objid = c.oid AND d.objsubid = 0 AND d.refclassid = c.tableoid AND d.deptype = 'a')
LEFT JOIN pg_class tc ON (c.reltoastrelid = tc.oid)
WHERE c.relkind in ('r', 'S', 'v', 'c', 'f')
ORDER BY c.oid

During "reading dependency data", the following queries show up for a few seconds, then the connection sits idle for the rest of the 5.5 minutes:
SELECT tableoid, oid, typname, typnamespace,
  (SELECT rolname FROM pg_catalog.pg_roles WHERE oid = typowner) AS rolname,
  typinput::oid AS typinput, typoutput::oid AS typoutput, typelem, typrelid,
  CASE WHEN typrelid = 0 THEN ' '::"char" ELSE (SELECT relkind FROM pg_class WHERE oid = typrelid) END AS typrelkind,
  typtype, typisdefined,
  typname[0] = '_' AND typelem != 0 AND (SELECT typarray FROM pg_type te WHERE oid = pg_type.typelem) = oid AS isarray
FROM pg_type

SELECT classid, objid, refclassid, refobjid, deptype
FROM pg_depend
WHERE deptype != 'p' AND deptype != 'e'
ORDER BY 1, 2

Non-production servers with fewer schemas don't seem to have any issue and perform the same dump in under 10 seconds on much lower-classed hardware.

Server Specs:

2 x Intel Xeon X5650
32 GB of RAM
Dell PERC H700 controller
Data drive: 6 x SAS2 15K in RAID10, FS: xfs
Log drive: 2 x SAS2 15K in RAID1, FS: xfs

There are two of these machines, one master and one slave via streaming replication over a gigabit network.


Thanks

Re: pg_dump incredibly slow dumping a single schema from a large db

From: Tom Lane
Mike Roest <mike.roest@replicon.com> writes:
> This dump is currently taking around 8 minutes.  While dumping the pg_dump
> process is using 100% of one core in the server (24 core machine).  Doing a
> -v pg_dump I found that the following stages are taking the majority of the
> time

> reading user-defined tables (2 minutes and 20 seconds)
> reading dependency data (5 minutes and 30 seconds)

Can you get an execution profile with oprofile or gprof or similar tool?
It doesn't surprise me a lot that pg_dump might have some issues with
large numbers of objects, but guessing which inefficiencies are hurting
you is difficult without more info.

            regards, tom lane

Re: pg_dump incredibly slow dumping a single schema from a large db

From: Mike Roest
For sure, I'll work on that now.  One thing I noticed looking through the pg_dump code based on the verbose messages: it seems to be grabbing the full dependency graph for the whole DB rather than limiting it by schema (not sure if limiting this would be possible).

That pg_depend query returns 9843923 rows from the DB, so processing it seems like it'll take quite a while.
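(If you want to check that count yourself, this mirrors pg_dump's filter on pg_depend:)

SELECT count(*) FROM pg_depend WHERE deptype != 'p' AND deptype != 'e';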

I'll get a -pg build of pg_dump going here on a dev box so I can get you a profile.
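Roughly what that build looks like (a sketch -- the PROFILE make variable adds the flag to CFLAGS and LDFLAGS, but your tree and paths may differ):

cd postgresql-9.1.1
./configure
make PROFILE="-pg" -C src/bin/pg_dump
# running the resulting pg_dump writes gmon.out to the current directory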



Re: pg_dump incredibly slow dumping a single schema from a large db

From: Mike Roest
Ok I just realized that's probably not going to be much help :)

  0.00      0.00     0.00        5     0.00     0.00  canonicalize_path
  0.00      0.00     0.00        5     0.00     0.00  trim_trailing_separator
  0.00      0.00     0.00        3     0.00     0.00  strlcpy
  0.00      0.00     0.00        2     0.00     0.00  join_path_components
  0.00      0.00     0.00        2     0.00     0.00  last_dir_separator
  0.00      0.00     0.00        1     0.00     0.00  find_my_exec
  0.00      0.00     0.00        1     0.00     0.00  first_dir_separator
  0.00      0.00     0.00        1     0.00     0.00  get_etc_path
  0.00      0.00     0.00        1     0.00     0.00  get_progname
  0.00      0.00     0.00        1     0.00     0.00  help
  0.00      0.00     0.00        1     0.00     0.00  make_relative_path
  0.00      0.00     0.00        1     0.00     0.00  resolve_symlinks
  0.00      0.00     0.00        1     0.00     0.00  set_pglocale_pgservice
  0.00      0.00     0.00        1     0.00     0.00  trim_directory
  0.00      0.00     0.00        1     0.00     0.00  validate_exec

That's the output of gprof pg_dump gmon.out.  (I built the -pg build on my dev box, then ran it on the server.  I'm now running the actual dump from my dev box against the server instead, to see if I get something more useful, since that output doesn't really seem to have much data in it.)


On Fri, Mar 30, 2012 at 11:09 AM, Mike Roest <mike.roest@replicon.com> wrote:
Here's the gmon.out from a -pg compiled 9.1.1 pg_dump.

--Mike





Re: pg_dump incredibly slow dumping a single schema from a large db

From: Tom Lane
Mike Roest <mike.roest@replicon.com> writes:
> Ok I just realized that's probably not going to be much help :)

gmon.out would be of no value to anybody else anyway --- making sense of
it requires the exact executable you took the measurements with.

>   0.00      0.00     0.00        5     0.00     0.00  canonicalize_path
>   0.00      0.00     0.00        5     0.00     0.00  trim_trailing_separator
>   0.00      0.00     0.00        3     0.00     0.00  strlcpy

Ugh.  There are some platforms on which gprof is busted to various
degrees; you may have one.  What platform is this exactly?

            regards, tom lane

Re: pg_dump incredibly slow dumping a single schema from a large db

From: Mike Roest
That was on the CentOS 5.8 x64 machine.  The one I'm trying it from now is Ubuntu 11.10 x64.




Re: pg_dump incredibly slow dumping a single schema from a large db

From: Tom Lane
Mike Roest <mike.roest@replicon.com> writes:
> That was on the CentOS 5.8 x64 machine.  The one I'm trying it from now is
> Ubuntu 11.10 x64

Hm.  On current Red-Hat-derived systems I'd recommend oprofile, although
you need root privileges to use that.  Not real sure what is available
on Ubuntu, but our crib sheet for oprofile says it doesn't currently
work there:
http://wiki.postgresql.org/wiki/Profiling_with_OProfile
That could be out-of-date info though.  Here's some info about
another possibility:
http://wiki.postgresql.org/wiki/Profiling_with_perf

            regards, tom lane

Re: pg_dump incredibly slow dumping a single schema from a large db

From: Mike Roest
> That could be out-of-date info though.  Here's some info about
> another possibility:
> http://wiki.postgresql.org/wiki/Profiling_with_perf

There we go, perf worked on the VM.

The file is 6 megs so I've dropped it here: http://dl.dropbox.com/u/13153/output.txt

That was from running perf for the length of the pg_dump command and then doing a perf report -n.
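Roughly what that amounts to (a sketch following the wiki page; the exact flags I used may have differed):

perf record -g -- pg_dump -f /dumps/test.backup -Fc -n miketest aspdata
perf report -n > output.txt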

Re: pg_dump incredibly slow dumping a single schema from a large db

From: Tom Lane
Mike Roest <mike.roest@replicon.com> writes:
> The file is 6 megs so I've dropped it here.
> That was doing perf for the length of the pg_dump command and then a perf
> report -n
> http://dl.dropbox.com/u/13153/output.txt

Hmm ... that's a remarkably verbose output format, but the useful part
of this info seems to be just

# Events: 2M cpu-clock
#
# Overhead    Samples  Command   Shared Object  Symbol
# ........  .........  .......   .............  ......
#
    65.96%    1635392          pg_dump  pg_dump                        [.] findLoop
              |
              --- findLoop
                 |
                 |--69.66%-- findDependencyLoops
                 |          sortDumpableObjects
                 |          main
                 |          __libc_start_main
                 |
                 |--30.34%-- findLoop
                 |          findDependencyLoops
                 |          sortDumpableObjects
                 |          main
                 |          __libc_start_main
                  --0.00%-- [...]

    10.16%     251955          pg_dump  pg_dump                        [.] getTables
              |
              --- getTables
                  getSchemaData
                  main
                  __libc_start_main


The file also shows that findObjectByDumpId() accounts for only a
negligible percentage of runtime, which means that the recursion path in
findLoop() is being executed hardly at all.  After staring at that for
awhile I realized that the O(N^2) part is the initial "is the
object in the workspace?" test.  The actual dependency chains are
probably never very long, but as we run through the collection of
objects we gradually add all of them to the front of the workspace.
So this is dumb; we should manage the "is the object already processed"
component of that with an O(1) check, like a bool array or some such,
rather than an O(N) search loop.
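A minimal sketch of that idea (illustrative names only, not the actual
pg_dump code):

#include <stdbool.h>
#include <stdlib.h>

typedef struct { int dumpId; } Obj;

/* Before: each "already processed?" test scans the workspace array,
 * O(N) per test and O(N^2) across all N objects. */
static bool in_workspace(Obj **ws, int count, Obj *obj)
{
    for (int i = 0; i < count; i++)
        if (ws[i] == obj)
            return true;
    return false;
}

/* After: a flag array indexed by dump ID makes the same test O(1). */
static bool *processed;

static void init_processed(int maxDumpId)
{
    processed = calloc(maxDumpId + 1, sizeof(bool));
}

static bool seen(Obj *obj)      { return processed[obj->dumpId]; }
static void mark_seen(Obj *obj) { processed[obj->dumpId] = true; }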

As for the getTables slowdown, the only part of that I can see that
looks to be both significant and entirely contained in getTables()
itself is the nested loop near the end that's trying to copy
the dumpable flags for owned sequences from their owning tables.
Do you have a whole lot of owned sequences?  Maybe we could postpone
that work until we have the fast table lookup array constructed,
which should reduce this from O(M*N) to O(M*logN).
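The "fast table lookup array" amounts to a binary search over tables
sorted by OID; a sketch under assumed names, not the committed patch:

#include <stddef.h>

typedef unsigned int Oid;
typedef struct { Oid oid; /* ... other fields ... */ } TableInfo;

/* O(log N) lookup of a sequence's owning table, instead of scanning
 * all N tables for each of the M owned sequences. */
static TableInfo *find_table_by_oid(TableInfo **byOid, int n, Oid target)
{
    int lo = 0, hi = n - 1;
    while (lo <= hi)
    {
        int mid = lo + (hi - lo) / 2;
        if (byOid[mid]->oid == target)
            return byOid[mid];
        else if (byOid[mid]->oid < target)
            lo = mid + 1;
        else
            hi = mid - 1;
    }
    return NULL;    /* no owning table found */
}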

            regards, tom lane

Re: pg_dump incredibly slow dumping a single schema from a large db

From: Tom Lane
I wrote:
> So this is dumb; we should manage the "is the object already processed"
> component of that with an O(1) check, like a bool array or some such,
> rather than an O(N) search loop.

> As for the getTables slowdown, the only part of that I can see that
> looks to be both significant and entirely contained in getTables()
> itself is the nested loop near the end that's trying to copy
> the dumpable flags for owned sequences from their owning tables.
> Do you have a whole lot of owned sequences?  Maybe we could postpone
> that work until we have the fast table lookup array constructed,
> which should reduce this from O(M*N) to O(M*logN).

I've committed fixes for both these issues.  If you are in a position to
test with 9.1 branch tip from git, it'd be nice to have confirmation
that these patches actually cure your problem.  For both of them, the
issue seems to only show up in a subset of cases, which may explain why
we'd not identified the problem before.

            regards, tom lane

Re: pg_dump incredibly slow dumping a single schema from a large db

From: Mike Roest
Cool,
  I've grabbed your changes and it seems to have completely cured the issue.
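Roughly what grabbing the branch tip looked like (a sketch; your checkout and build steps may vary):

git clone git://git.postgresql.org/git/postgresql.git
cd postgresql
git checkout REL9_1_STABLE
./configure
make -C src/bin/pg_dump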

reading user-defined tables has gone down to 9 seconds from 2.5 minutes
reading dependency data has gone down to 20 seconds from 5.5 minutes.  

Thanks so much Tom this is perfect.  I'm just pulling another backup using the stock 9.1.1 pg_dump to ensure the backups are equivalent.

Any idea when 9.1.4 with this change will be out, so we can pull the cluster up?


With regard to your previous question about sequences: there are 61K in the DB; it looks like our schema currently has about 115 sequences per tenant.
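(Counted with something along these lines:)

SELECT count(*) FROM pg_class WHERE relkind = 'S';  -- sequences in the whole DB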


--Mike


Re: pg_dump incredibly slow dumping a single schema from a large db

From: Mike Roest

> I'm just pulling another backup using the stock 9.1.1 pg_dump to ensure the backups are equivalent.

Schema & data are identical between the two backups.  The new backup passes all our tests for validating a tenant.

Thank you again for the quick response!

--Mike
 
Re: pg_dump incredibly slow dumping a single schema from a large db

From: Tom Lane

Mike Roest <mike.roest@replicon.com> writes:
> Any idea when 9.1.4 with this change will be out, so we can pull the cluster
> up?

Well, we just did some releases last month, so unless somebody finds a
really nasty security or data-loss issue, I'd think it will be a couple
of months before the next set.

            regards, tom lane

Re: pg_dump incredibly slow dumping a single schema from a large db

From: Mike Roest

:)  Yah, that makes sense, no big deal.  I'll probably just push this head build of pg_dump onto the production machines till it comes out.

Thanks again!
