Re: Parallel pg_dump for 9.1 - Mailing list pgsql-hackers
From: Robert Haas
Subject: Re: Parallel pg_dump for 9.1
Date:
Msg-id: 603c8f071003290840wa8b25dfr8eecfdd4e81fd16c@mail.gmail.com
In response to: Parallel pg_dump for 9.1 (Joachim Wieland <joe@mcknight.de>)
Responses: Re: Parallel pg_dump for 9.1
List: pgsql-hackers
On Mon, Mar 29, 2010 at 10:46 AM, Joachim Wieland <joe@mcknight.de> wrote:
> - There are ideas on how to solve the issue with the consistent
> snapshot, but in the end you can always solve it by stopping your
> application(s). I actually assume that whenever people are interested
> in a very fast dump, it is because they are doing some maintenance
> task (like migrating to a different server) that involves pg_dump. In
> these cases, they would stop their system anyway.
>
> Even if we had consistent snapshots in a future version, would we
> forbid people to run parallel dumps against old server versions? What
> I suggest is to just display a big warning if run against a server
> without consistent snapshot support (which currently is every
> version).

Seems reasonable.

> - Regarding the output of pg_dump, I am proposing two solutions. The
> first one is to introduce a new archive type "directory" where each
> table and each blob is a file in a directory, similar to the
> experimental "files" archive type. The idea has also come up that you
> should be able to specify multiple directories in order to make use
> of several physical disk drives. Taking this further, in order to
> manage all the mess that you can create with this, every file of the
> same backup needs to have a unique identifier, and pg_restore should
> have a check parameter that tells you whether your backup directory
> is in a sane and complete state (think about moving a file from one
> backup directory to another, or trying to restore from two
> directories that are from different backup sets...).

I think that specifying several directories is a piece of complexity
that would be best left alone for a first version of this. But a
single directory with multiple files sounds pretty reasonable (a
sketch of the proposed sanity check follows after this mail). Of
course, we'll also need to support that format in non-parallel mode,
and in pg_restore.

> The second solution to the single-file problem is to generate no
> output at all, i.e. whatever you export from your source database you
> import directly into your target database, which in the end turns out
> to be a parallel form of "pg_dump | psql".

This is a very interesting idea, but you might want to get the other
thing merged first, as it's going to present a different set of
issues.

> I am currently not planning to make parallel dumps work with the
> custom format, even though this would be possible if we changed the
> format to a certain degree.

I'm thinking we probably don't want to change the existing formats.

...Robert
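To make the "check parameter" idea above concrete, here is a minimal
sketch in C of the kind of validation pg_restore could run against a
directory-format backup. This is not pg_dump code: the "toc.dat" file
name, the one-line "BACKUP-ID:" header, and the TOC layout are all
assumptions invented for illustration. The mechanism is the point:
every member file records the backup set's unique identifier, and the
check verifies that every file listed in the TOC exists and belongs to
the same set.

/*
 * checkbackup.c -- sketch of a directory-backup sanity check.
 *
 * Assumed (hypothetical) layout:
 *   <dir>/toc.dat   first line "BACKUP-ID: <id>", then one member
 *                   file name per line (e.g. "16384.dat")
 *   <dir>/<member>  first line "BACKUP-ID: <id>" matching the TOC
 */
#include <stdio.h>
#include <string.h>

#define MAX_LINE 1024

/* Read the first line of a backup file and extract its backup ID. */
static int
read_backup_id(const char *path, char *id, size_t idlen)
{
    FILE *fp = fopen(path, "r");
    char  line[MAX_LINE];

    if (fp == NULL)
        return -1;              /* file missing: set is incomplete */
    if (fgets(line, sizeof(line), fp) == NULL ||
        strncmp(line, "BACKUP-ID: ", 11) != 0)
    {
        fclose(fp);
        return -1;              /* malformed header */
    }
    line[strcspn(line, "\n")] = '\0';
    snprintf(id, idlen, "%s", line + 11);
    fclose(fp);
    return 0;
}

int
main(int argc, char **argv)
{
    char  tocpath[MAX_LINE], path[MAX_LINE];
    char  expected_id[MAX_LINE], file_id[MAX_LINE];
    char  line[MAX_LINE];
    FILE *toc;
    int   ok = 1;

    if (argc != 2)
    {
        fprintf(stderr, "usage: %s <backup-directory>\n", argv[0]);
        return 1;
    }
    snprintf(tocpath, sizeof(tocpath), "%s/toc.dat", argv[1]);

    /* The TOC's own first line carries the backup set's ID. */
    if (read_backup_id(tocpath, expected_id, sizeof(expected_id)) != 0)
    {
        fprintf(stderr, "missing or malformed TOC: %s\n", tocpath);
        return 1;
    }

    toc = fopen(tocpath, "r");
    if (toc == NULL)
        return 1;
    fgets(line, sizeof(line), toc);     /* skip the BACKUP-ID line */

    /* Each remaining TOC line names one member file of the set. */
    while (fgets(line, sizeof(line), toc) != NULL)
    {
        line[strcspn(line, "\n")] = '\0';
        if (line[0] == '\0')
            continue;
        snprintf(path, sizeof(path), "%s/%s", argv[1], line);
        if (read_backup_id(path, file_id, sizeof(file_id)) != 0)
        {
            fprintf(stderr, "missing or unreadable member: %s\n", path);
            ok = 0;
        }
        else if (strcmp(file_id, expected_id) != 0)
        {
            fprintf(stderr, "%s belongs to backup set %s, not %s\n",
                    path, file_id, expected_id);
            ok = 0;
        }
    }
    fclose(toc);

    printf(ok ? "backup set is complete and consistent\n"
              : "backup set FAILED validation\n");
    return ok ? 0 : 1;
}

Run as, say, "checkbackup /path/to/dumpdir"; the nonzero exit status on
failure would let a wrapper script abort a restore attempt before it
starts. The same ID comparison would catch both of the failure modes
Joachim mentions: a file moved in from another backup directory, and an
attempted restore from two directories belonging to different sets.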