Re: pg_dump and thousands of schemas - Mailing list pgsql-performance

From Bruce Momjian
Subject Re: pg_dump and thousands of schemas
Date
Msg-id 20120525035455.GC25444@momjian.us
In response to Re: pg_dump and thousands of schemas  (Jeff Janes <jeff.janes@gmail.com>)
Responses Re: pg_dump and thousands of schemas  ("Hugo <Nabble>" <hugo.tech@gmail.com>)
Re: pg_dump and thousands of schemas  (Jeff Janes <jeff.janes@gmail.com>)
List pgsql-performance
On Thu, May 24, 2012 at 08:20:34PM -0700, Jeff Janes wrote:
> On Thu, May 24, 2012 at 8:21 AM, Craig James <cjames@emolecules.com> wrote:
> >
> >
> > On Thu, May 24, 2012 at 12:06 AM, Hugo <Nabble> <hugo.tech@gmail.com> wrote:
> >>
> >> Hi everyone,
> >>
> >> We have a production database (postgresql 9.0) with more than 20,000
> >> schemas and 40Gb size. In the past we had all that information in just
> >> one schema and pg_dump used to work just fine (2-3 hours to dump
> >> everything). Then we decided to split the database into schemas, which
> >> makes a lot of sense for the kind of information we store and the plans
> >> we have for the future. The problem now is that pg_dump takes forever to
> >> finish (more than 24 hours) and we just can't have consistent daily
> >> backups like we had in the past. When I try to dump just one schema with
> >> almost nothing in it, it takes 12 minutes.
>
> Sorry, your original did not show up here, so I'm piggy-backing on
> Craig's reply.
>
> Is dumping just one schema out of thousands an actual use case, or is
> it just an attempt to find a faster way to dump all the schemata
> through a back door?
>
> pg_dump itself seems to have a lot of quadratic portions (plus another
> one on the server which it hits pretty heavily), and it is hard to know
> where to start addressing them.  It seems like addressing the overall
> quadratic nature might be a globally better option, but addressing
> just the problem with dumping one schema might be easier to kluge
> together.

Postgres 9.2 will have some speedups for pg_dump scanning large
databases --- that might help.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +
