On Thu, May 24, 2012 at 08:20:34PM -0700, Jeff Janes wrote:
> On Thu, May 24, 2012 at 8:21 AM, Craig James <cjames@emolecules.com> wrote:
> >
> >
> > On Thu, May 24, 2012 at 12:06 AM, Hugo <Nabble> <hugo.tech@gmail.com> wrote:
> >>
> >> Hi everyone,
> >>
> >> We have a production database (PostgreSQL 9.0) with more than 20,000
> >> schemas and 40 GB in size. In the past we had all that information in
> >> just one schema and pg_dump used to work just fine (2-3 hours to dump
> >> everything). Then we decided to split the database into schemas, which
> >> makes a lot of sense for the kind of information we store and the plans
> >> we have for the future. The problem now is that pg_dump takes forever
> >> to finish (more than 24 hours) and we just can't have consistent daily
> >> backups like we had in the past. When I try to dump just one schema
> >> with almost nothing in it, it takes 12 minutes.
>
> Sorry, your original did not show up here, so I'm piggy-backing on
> Craig's reply.
>
> Is dumping just one schema out of thousands an actual use case, or is
> it just an attempt to find a faster way to dump all the schemata
> through a back door?
>
> pg_dump itself seems to have a lot of quadratic portions (plus another
> one on the server which it hits pretty heavily), and it is hard to know
> where to start addressing them. It seems like addressing the overall
> quadratic nature might be a globally better option, but addressing
> just the problem with dumping one schema might be easier to kluge
> together.
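
For reference, a single schema is normally dumped with pg_dump's
-n/--schema switch; a minimal sketch, assuming a hypothetical schema
name customer_0001 and database name mydb:

    pg_dump -Fc -n customer_0001 -f customer_0001.dump mydb

Given the report above, even a near-empty schema takes ~12 minutes this
way, which would suggest the time is going into catalog scanning and
locking rather than data transfer.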
Postgres 9.2 will have some speedups for pg_dump scanning large
databases --- that might help.
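
If per-schema dumps end up being the stopgap on 9.0, something along
these lines could split the work (hypothetical paths and database name;
note that each pg_dump run takes its own snapshot, so cross-schema
consistency is lost):

    # list user schemas, then dump each one to its own custom-format archive
    psql -At -d mydb -c "SELECT nspname FROM pg_namespace
                         WHERE nspname NOT LIKE 'pg_%'
                           AND nspname <> 'information_schema'" | \
    while read schema; do
        pg_dump -Fc -n "$schema" -f "/backups/$schema.dump" mydb
    done

Several such runs can also be launched concurrently to cut the
wall-clock time (pg_dump's built-in parallel mode does not exist yet in
9.0), at the cost of extra server load and the consistency caveat above.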
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ It's impossible for everything to be true. +