Re: Any work on better parallelization of pg_dump? - Mailing list pgsql-general

From Jehan-Guillaume de Rorthais
Subject Re: Any work on better parallelization of pg_dump?
Date Mon, 29 Aug 2016 15:11:25 +0200
Msg-id 20160829151125.443631ac@firost
In response to Any work on better parallelization of pg_dump?  (hubert depesz lubaczewski <depesz@depesz.com>)
Responses Re: Any work on better parallelization of pg_dump?
List pgsql-general
On Mon, 29 Aug 2016 13:38:03 +0200
hubert depesz lubaczewski <depesz@depesz.com> wrote:

> Hi,
> we have a rather uncommon case: a DB with ~ 50GB of data, but spread
> across ~ 80000 tables.
>
> Running pg_dump -Fd -jxx dumps the data in parallel, but only the data;
> MOST of the time is spent on queries that run sequentially and, as far
> as I can tell, fetch the schema of the tables and the sequence values.
>
> This happens on Pg 9.5. Are there any plans to make getting the schema
> faster for such cases? Either by parallelization, or at least by getting
> the schema for all tables "at once" and having pg_dump "sort it out",
> instead of getting the schema for each table separately?
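
For reference, the kind of invocation described above would be something
like the following (the job count and paths are only illustrative):

  pg_dump -Fd -j 8 -f /path/to/dump_dir mydb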

Another issue I found in the current implementation is how pg_restore deals
with primary keys (PK). Because adding a PK takes an exclusive lock on the
table, it is executed alone, before the indexes are created.

Splitting the PK creation into a unique index build followed by the
constraint creation might save a lot of time, as the other indexes could
then be built while the PK's index is being built.
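
In SQL terms, the idea would be roughly the following (table, column and
index names are only illustrative):

  -- Build the index that will back the PK.  This only takes a SHARE
  -- lock, so other indexes on the same table could be built in
  -- parallel with it.
  CREATE UNIQUE INDEX t_pkey_idx ON t (id);

  -- Promote the existing index to a primary key.  This still takes an
  -- ACCESS EXCLUSIVE lock, but holds it only briefly since the index
  -- is already built.
  ALTER TABLE t ADD CONSTRAINT t_pkey PRIMARY KEY USING INDEX t_pkey_idx;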

Regards,

