diff --git a/doc/src/sgml/backup.sgml b/doc/src/sgml/backup.sgml
index c4215be..ff5c8b7 100644
*** a/doc/src/sgml/backup.sgml
--- b/doc/src/sgml/backup.sgml
*************** pg_restore -d
+ pg_dump -j <replaceable class="parameter">num</replaceable> -F d -f <replaceable class="parameter">out.dir</replaceable> <replaceable class="parameter">dbname</replaceable>
+ </programlisting>
+
+   <para>
+    You can use <command>pg_restore -j</command> to restore a dump in parallel.
+    This will work for any archive of either the "custom" or the "directory"
+    archive mode, whether or not it has been created with
+    <command>pg_dump -j</command>.
+   </para>
+
diff --git a/doc/src/sgml/perform.sgml b/doc/src/sgml/perform.sgml
index 1e7544a..34eace3 100644
*** a/doc/src/sgml/perform.sgml
--- b/doc/src/sgml/perform.sgml
*************** SELECT * FROM x, y, a, b, c WHERE someth
*** 1435,1440 ****
--- 1435,1449 ----
+    <para>
+     Experiment with the parallel dump and restore modes of both
+     <application>pg_dump</application> and
+     <application>pg_restore</application> and find the
+     optimal number of concurrent jobs to use. Dumping and restoring in
+     parallel by means of the <option>-j</option> option should give you
+     significantly higher performance than the serial mode.
+    </para>
+
     <para>
      Consider whether the whole dump should be restored as a single
      transaction. To do that, pass the <option>-1</option> or
      <option>--single-transaction</option> command-line option to
      <application>psql</application> or <application>pg_restore</application>.
diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
*** a/doc/src/sgml/ref/pg_dump.sgml
--- b/doc/src/sgml/ref/pg_dump.sgml
*************** PostgreSQL documentation
*** 73,82 ****
   transfer mechanism. <application>pg_dump</application> can be used to
   backup an entire database, then <application>pg_restore</application>
   can be used to examine the archive and/or select which parts of the
!  database are to be restored. The most flexible output file format is
!  the <quote>custom</quote> format (<option>-Fc</option>). It allows
!  for selection and reordering of all archived items, and is compressed
!  by default.
--- 73,84 ----
   transfer mechanism. <application>pg_dump</application> can be used to
   backup an entire database, then <application>pg_restore</application>
   can be used to examine the archive and/or select which parts of the
!  database are to be restored. The most flexible output file formats are
!  the <quote>custom</quote> format (<option>-Fc</option>) and the
!  <quote>directory</quote> format (<option>-Fd</option>). They allow for
!  selection and reordering of all archived items, and are compressed by
!  default. The <quote>directory</quote> format is the only format that
!  supports parallel dumps.
*************** PostgreSQL documentation
*** 251,257 ****
       can read. A directory format archive can be manipulated with
       standard Unix tools; for example, files in an uncompressed archive
       can be compressed with the <application>gzip</application> tool.
!      This format is compressed by default.
--- 253,260 ----
       can read. A directory format archive can be manipulated with
       standard Unix tools; for example, files in an uncompressed archive
       can be compressed with the <application>gzip</application> tool.
!      This format is compressed by default and also supports parallel
!      dumps.
*************** PostgreSQL documentation
*** 286,291 ****
--- 289,350 ----
+      <varlistentry>
+       <term><option>-j <replaceable class="parameter">njobs</replaceable></option></term>
+       <term><option>--jobs=<replaceable class="parameter">njobs</replaceable></option></term>
+       <listitem>
+        <para>
+         Run the dump in parallel by dumping <replaceable class="parameter">njobs</replaceable>
+         tables simultaneously. This option reduces the time of the dump but also
+         increases the load on the database server. You can only use this option with
+         the directory output format, because this is the only output format where
+         multiple processes can write their data at the same time.
+        </para>
+        <para>
+         <application>pg_dump</application> will open <replaceable class="parameter">njobs</replaceable>
+         + 1 connections to the database, so make sure your <xref linkend="guc-max-connections">
+         setting is high enough to accommodate all connections.
+        </para>
+        <para>
+         Requesting exclusive locks on database objects while running a parallel dump
+         could cause the dump to fail. The reason is that the
+         <application>pg_dump</application> master process requests shared locks on the
+         objects that the worker processes are going to dump later, in order to make
+         sure that nobody deletes them while the dump is running. If another client then
+         requests an exclusive lock on a table, that lock will not be granted, but will
+         queue behind the shared lock of the master process. Consequently, any other
+         access to the table will not be granted either and will queue behind the
+         exclusive lock request. This includes the worker process trying to dump the
+         table. Without any precautions this would be a classic deadlock situation. To
+         detect this conflict, the <application>pg_dump</application> worker process
+         requests another shared lock using the <literal>NOWAIT</literal> option. If the
+         worker process is not granted this shared lock, somebody else must have
+         requested an exclusive lock in the meantime, and there is no way to continue
+         with the dump, so <application>pg_dump</application> has no choice but to
+         abort the dump.
+        </para>
+        <para>
+         For a consistent backup, the database server needs to support synchronized
+         snapshots, a feature that was introduced in
+         <productname>PostgreSQL</productname> 9.2. With this feature, database clients
+         can request to see the exact same data set even though they connect at
+         different times. <command>pg_dump -j</command> uses multiple regular database
+         connections; it connects to the database once with the master process, and
+         once again for each worker job. Without the synchronized snapshot feature, the
+         worker jobs wouldn't be guaranteed to see the same data in each connection,
+         which could lead to an inconsistent backup.
+        </para>
+        <para>
+         If you want to run a parallel dump of a pre-9.2 server, you need to make sure
+         that the database content doesn't change between the time the master process
+         connects to the database and the time the last worker job connects. The
+         easiest way to do this is to halt any data-modifying processes (DDL and DML)
+         accessing the database before starting the backup. You also need to specify
+         the <option>--no-synchronized-snapshots</option> parameter when running
+         <command>pg_dump -j</command> against a pre-9.2
+         <productname>PostgreSQL</productname> server.
+        </para>
+       </listitem>
+      </varlistentry>
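The lock conflict described in the -j paragraph above is straightforward to reproduce. The following sketch is not part of the patch; the database name mydb and table name t1 are hypothetical. While a parallel dump is running:

    $ pg_dump -j 4 -F d -f /tmp/mydb.dir mydb

an exclusive lock request from another session queues behind the master process's shared lock on the table:

    $ psql mydb -c "BEGIN; LOCK TABLE t1 IN ACCESS EXCLUSIVE MODE; SELECT pg_sleep(60); COMMIT;"

A worker that later tries to dump t1 issues its shared lock request with NOWAIT; because the queued exclusive request prevents that lock from being granted immediately, the request fails and pg_dump aborts the dump instead of deadlocking.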
*************** PostgreSQL documentation
*** 691,696 ****
--- 750,766 ----
+      <varlistentry>
+       <term><option>--no-synchronized-snapshots</option></term>
+       <listitem>
+        <para>
+         This option allows running <command>pg_dump -j</command> against a pre-9.2
+         server; see the documentation of the <option>-j</option> parameter
+         for more details.
+        </para>
+       </listitem>
+      </varlistentry>
*************** CREATE DATABASE foo WITH TEMPLATE templa
*** 1062,1067 ****
--- 1132,1146 ----
+   <para>
+    To dump a database into a directory-format archive in parallel with
+    5 worker jobs:
+
+ <screen>
+ <prompt>$</prompt> <userinput>pg_dump -Fd mydb -j 5 -f dumpdir</userinput>
+ </screen>
+   </para>
+
    <para>
     To reload an archive file into a (freshly created) database named
     <literal>newdb</literal>:
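To complement the example added in the last hunk, a matching parallel restore of that directory archive could look as follows. This is a sketch, not part of the patch; it assumes the dump from the example above and a freshly created target database newdb:

    $ pg_dump -Fd mydb -j 5 -f dumpdir
    $ createdb newdb
    $ pg_restore -j 5 -d newdb dumpdir

As the backup.sgml hunk notes, pg_restore -j works for both custom and directory archives, whether or not the dump itself was taken in parallel.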