Home > mailing lists

Re: Any work on better parallelization of pg_dump? - Mailing list pgsql-general

From	Alvaro Herrera
Subject	Re: Any work on better parallelization of pg_dump?
Date	August 29, 2016 16:13:40
Msg-id	20160829161317.GA388440@alvherre.pgsql Whole thread Raw
In response to	Re: Any work on better parallelization of pg_dump? (Jehan-Guillaume de Rorthais <ioguix@free.fr>)
Responses	Re: Any work on better parallelization of pg_dump? Re: Any work on better parallelization of pg_dump?
List	pgsql-general

Tree view

Jehan-Guillaume de Rorthais wrote:
> On Mon, 29 Aug 2016 13:38:03 +0200
> hubert depesz lubaczewski <depesz@depesz.com> wrote:
>
> > Hi,
> > we have rather uncommon case - DB with ~ 50GB of data, but this is
> > spread across ~ 80000 tables.
> >
> > Running pg_dump -Fd -jxx dumps in parallel, but only data, and MOST of
> > the time is spent on queries that run sequentially, and as far as I can
> > tell, get schema of tables, and sequence values.
> >
> > This happens on Pg 9.5. Are there any plans to make getting schema
> > faster for such cases? Either by parallelization, or at least by getting
> > schema for all tables "at once", and having pg_dump "sort it out",
> > instead of getting schema for each table separately?

Depesz: I suggest you start coding ASAP.

> Another issue I found in current implementation is how pg_restore deal with PK.
> As it takes an exclusif lock on the table, it is executed alone before indexes
> creation.
>
> Splitting the PK in unique index creation then the constraint creation might
> save a lot of time as other index can be built during the PK creation.

Yeah.  I recall there being some stupid limitation in ALTER TABLE .. ADD
CONSTRAINT USING INDEX to create a primary key from a previously
existing unique index, which would be very good to fix (I don't recall
what it was, but it was something infuriatingly silly).  I suggest you
start coding that ASAP.

(Two new contributors to pg_dump!  Yay!)

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

pgsql-general by date:

From: Jehan-Guillaume de Rorthais
Date: 29 August 2016, 13:11:36
Subject: Re: Any work on better parallelization of pg_dump?

From: hubert depesz lubaczewski
Date: 29 August 2016, 16:14:37
Subject: Re: Any work on better parallelization of pg_dump?

Re: Any work on better parallelization of pg_dump? - Mailing list pgsql-general

Previous

Next