Re: Parallel pg_dump for 9.1 - Mailing list pgsql-hackers
From: Robert Haas
Subject: Re: Parallel pg_dump for 9.1
Date:
Msg-id: 603c8f071003290840wa8b25dfr8eecfdd4e81fd16c@mail.gmail.com
In response to: Parallel pg_dump for 9.1 (Joachim Wieland <joe@mcknight.de>)
Responses: Re: Parallel pg_dump for 9.1
List: pgsql-hackers
On Mon, Mar 29, 2010 at 10:46 AM, Joachim Wieland <joe@mcknight.de> wrote:
> - There are ideas on how to solve the issue with the consistent
> snapshot, but in the end you can always solve it by stopping your
> application(s). I actually assume that whenever people are interested
> in a very fast dump, it is because they are doing some maintenance
> task (like migrating to a different server) that involves pg_dump. In
> these cases, they would stop their system anyway.
>
> Even if we had consistent snapshots in a future version, would we
> forbid people to run parallel dumps against old server versions? What
> I suggest is to just display a big warning if run against a server
> without consistent snapshot support (which currently is every
> version).

Seems reasonable.

> - Regarding the output of pg_dump, I am proposing two solutions. The
> first one is to introduce a new archive type "directory" where each
> table and each blob is a file in a directory, similar to the
> experimental "files" archive type. The idea has also come up that you
> should be able to specify multiple directories in order to make use
> of several physical disk drives. Taking this further, in order to
> manage all the mess that you can create with this, every file of the
> same backup needs to have a unique identifier, and pg_restore should
> have a check parameter that tells you whether your backup directory
> is in a sane and complete state (think about moving a file from one
> backup directory to another, or trying to restore from two
> directories that are from different backup sets...).

I think that specifying several directories is a piece of complexity
that would be best left alone for a first version of this. But a
single directory with multiple files sounds pretty reasonable (a
sketch of the proposed sanity check follows after this mail). Of
course, we'll also need to support that format in non-parallel mode,
and in pg_restore.

> The second solution to the single-file problem is to generate no
> output at all, i.e. whatever you export from your source database you
> import directly into your target database, which in the end turns out
> to be a parallel form of "pg_dump | psql".

This is a very interesting idea, but you might want to get the other
thing merged first, as it's going to present a different set of
issues.

> I am currently not planning to make parallel dumps work with the
> custom format, even though this would be possible if we changed the
> format to a certain degree.

I'm thinking we probably don't want to change the existing formats.

...Robert
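To make the "check parameter" idea above concrete, here is a minimal
sketch in C of the kind of validation pg_restore could run against a
directory-format backup. This is not pg_dump code: the "toc.dat" file
name, the one-line "BACKUP-ID:" header, and the TOC layout are all
assumptions invented for illustration. The mechanism is the point:
every member file records the backup set's unique identifier, and the
check verifies that every file listed in the TOC exists and belongs to
the same set.

/*
 * checkbackup.c -- sketch of a directory-backup sanity check.
 *
 * Assumed (hypothetical) layout:
 *   <dir>/toc.dat   first line "BACKUP-ID: <id>", then one member
 *                   file name per line (e.g. "16384.dat")
 *   <dir>/<member>  first line "BACKUP-ID: <id>" matching the TOC
 */
#include <stdio.h>
#include <string.h>

#define MAX_LINE 1024

/* Read the first line of a backup file and extract its backup ID. */
static int
read_backup_id(const char *path, char *id, size_t idlen)
{
    FILE *fp = fopen(path, "r");
    char  line[MAX_LINE];

    if (fp == NULL)
        return -1;              /* file missing: set is incomplete */
    if (fgets(line, sizeof(line), fp) == NULL ||
        strncmp(line, "BACKUP-ID: ", 11) != 0)
    {
        fclose(fp);
        return -1;              /* malformed header */
    }
    line[strcspn(line, "\n")] = '\0';
    snprintf(id, idlen, "%s", line + 11);
    fclose(fp);
    return 0;
}

int
main(int argc, char **argv)
{
    char  tocpath[MAX_LINE], path[MAX_LINE];
    char  expected_id[MAX_LINE], file_id[MAX_LINE];
    char  line[MAX_LINE];
    FILE *toc;
    int   ok = 1;

    if (argc != 2)
    {
        fprintf(stderr, "usage: %s <backup-directory>\n", argv[0]);
        return 1;
    }
    snprintf(tocpath, sizeof(tocpath), "%s/toc.dat", argv[1]);

    /* The TOC's own first line carries the backup set's ID. */
    if (read_backup_id(tocpath, expected_id, sizeof(expected_id)) != 0)
    {
        fprintf(stderr, "missing or malformed TOC: %s\n", tocpath);
        return 1;
    }

    toc = fopen(tocpath, "r");
    if (toc == NULL)
        return 1;
    fgets(line, sizeof(line), toc);     /* skip the BACKUP-ID line */

    /* Each remaining TOC line names one member file of the set. */
    while (fgets(line, sizeof(line), toc) != NULL)
    {
        line[strcspn(line, "\n")] = '\0';
        if (line[0] == '\0')
            continue;
        snprintf(path, sizeof(path), "%s/%s", argv[1], line);
        if (read_backup_id(path, file_id, sizeof(file_id)) != 0)
        {
            fprintf(stderr, "missing or unreadable member: %s\n", path);
            ok = 0;
        }
        else if (strcmp(file_id, expected_id) != 0)
        {
            fprintf(stderr, "%s belongs to backup set %s, not %s\n",
                    path, file_id, expected_id);
            ok = 0;
        }
    }
    fclose(toc);

    printf(ok ? "backup set is complete and consistent\n"
              : "backup set FAILED validation\n");
    return ok ? 0 : 1;
}

Run as, say, "checkbackup /path/to/dumpdir"; the nonzero exit status on
failure would let a wrapper script abort a restore attempt before it
starts. The same ID comparison would catch both of the failure modes
Joachim mentions: a file moved in from another backup directory, and an
attempted restore from two directories belonging to different sets.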