Re: Allowing multiple concurrent base backups - Mailing list pgsql-hackers
From | Heikki Linnakangas |
---|---|
Subject | Re: Allowing multiple concurrent base backups |
Date | |
Msg-id | 4D831C49.6070408@enterprisedb.com |
In response to | Re: Allowing multiple concurrent base backups (Robert Haas <robertmhaas@gmail.com>) |
Responses | Re: Allowing multiple concurrent base backups |
List | pgsql-hackers |
On 17.03.2011 21:39, Robert Haas wrote:
> On Mon, Jan 31, 2011 at 10:45 PM, Fujii Masao<masao.fujii@gmail.com> wrote:
>> On Tue, Feb 1, 2011 at 1:31 AM, Heikki Linnakangas
>> <heikki.linnakangas@enterprisedb.com> wrote:
>>> Hmm, good point. It's harmless, but creating the history file in the first
>>> place sure seems like a waste of time.
>>
>> The attached patch changes pg_stop_backup so that it doesn't create
>> the backup history file if archiving is not enabled.
>>
>> When I tested the multiple backups, I found that they can have the same
>> checkpoint location and the same history file name.
>>
>> --------------------
>> $ for ((i=0; i<4; i++)); do
>> pg_basebackup -D test$i -c fast -x -l test$i&
>> done
>>
>> $ cat test0/backup_label
>> START WAL LOCATION: 0/20000B0 (file 000000010000000000000002)
>> CHECKPOINT LOCATION: 0/20000E8
>> START TIME: 2011-02-01 12:12:31 JST
>> LABEL: test0
>>
>> $ cat test1/backup_label
>> START WAL LOCATION: 0/20000B0 (file 000000010000000000000002)
>> CHECKPOINT LOCATION: 0/20000E8
>> START TIME: 2011-02-01 12:12:31 JST
>> LABEL: test1
>>
>> $ cat test2/backup_label
>> START WAL LOCATION: 0/20000B0 (file 000000010000000000000002)
>> CHECKPOINT LOCATION: 0/20000E8
>> START TIME: 2011-02-01 12:12:31 JST
>> LABEL: test2
>>
>> $ cat test3/backup_label
>> START WAL LOCATION: 0/20000B0 (file 000000010000000000000002)
>> CHECKPOINT LOCATION: 0/20000E8
>> START TIME: 2011-02-01 12:12:31 JST
>> LABEL: test3
>>
>> $ ls archive/*.backup
>> archive/000000010000000000000002.000000B0.backup
>> --------------------
>>
>> This would cause a serious problem. Because the backup-end record
>> which indicates the same "START WAL LOCATION" can be written by the
>> first backup before the other finishes. So we might think wrongly that
>> we've already reached a consistency state by reading the backup-end
>> record (written by the first backup) before reading the last required WAL
>> file.
>>
>> /*
>> * Force a CHECKPOINT. Aside from being necessary to prevent torn
>> * page problems, this guarantees that two successive backup runs will
>> * have different checkpoint positions and hence different history
>> * file names, even if nothing happened in between.
>> *
>> * We use CHECKPOINT_IMMEDIATE only if requested by user (via passing
>> * fast = true). Otherwise this can take awhile.
>> */
>> RequestCheckpoint(CHECKPOINT_FORCE | CHECKPOINT_WAIT |
>> (fast ? CHECKPOINT_IMMEDIATE : 0));
>>
>> This problem happens because the above code (in do_pg_start_backup)
>> actually doesn't ensure that the concurrent backups have the different
>> checkpoint locations. ISTM that we should change the above or elsewhere
>> to ensure that.

Yes, good point.

>> Or we should include backup label name in the backup-end
>> record, to prevent a recovery from reading not-its-own backup-end record.

Backup labels are not guaranteed to be unique either, so including the
backup label in the backup-end record doesn't solve the problem. But
something else, like a backup-start counter in shared memory or the
process id, would work.

It won't make the history file names unique, though. Now that we rely on
the end-of-backup record for detecting end-of-backup, the history files
are just for documentation purposes. Do we want to give up on history
files for backups performed with pg_basebackup? Or we can include the
backup counter or similar in the filename too.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
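
To make the counter idea above concrete, here is a minimal standalone sketch in C. It is not PostgreSQL code and not a proposed patch: `BackupId`, `start_backup`, `backup_end_matches`, `history_file_name` and the `.<counter>.backup` suffix are all hypothetical names and formats invented for illustration, a plain static variable stands in for the shared-memory counter, and the default 16 MB WAL segment size is assumed. It shows two backups that share a START WAL LOCATION but can still be told apart, both by the backup-end check and by the history file name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Assumption: default 16 MB WAL segments and 24-character WAL file names. */
#define WAL_SEG_SIZE      (16u * 1024u * 1024u)
#define WAL_FILE_NAME_LEN 24

/* Hypothetical identity of one base backup: start location plus a counter. */
typedef struct BackupId
{
    uint64_t start_lsn;  /* START WAL LOCATION as one 64-bit value */
    uint32_t counter;    /* value of the backup-start counter (hypothetical) */
} BackupId;

/* Stand-in for a counter kept in shared memory; no locking in this sketch. */
static uint32_t backup_start_counter = 0;

/* Hand out a new identity; concurrent backups may share start_lsn. */
BackupId
start_backup(uint64_t start_lsn)
{
    BackupId id = { start_lsn, ++backup_start_counter };
    return id;
}

/* Recovery side: accept a backup-end record only if it is our own. */
bool
backup_end_matches(const BackupId *mine, const BackupId *record)
{
    return mine->start_lsn == record->start_lsn &&
           mine->counter == record->counter;
}

/*
 * Build a history file name that includes the counter.
 * Returns the snprintf result, or -1 if walfile has an unexpected length.
 */
int
history_file_name(char *buf, size_t len, const char *walfile,
                  const BackupId *id)
{
    assert(buf != NULL && walfile != NULL && id != NULL);
    if (strlen(walfile) != WAL_FILE_NAME_LEN)
        return -1;

    /* Offset of the start location within its WAL segment. */
    unsigned offset = (unsigned) (id->start_lsn % WAL_SEG_SIZE);

    return snprintf(buf, len, "%s.%08X.%u.backup",
                    walfile, offset, (unsigned) id->counter);
}
```

With this scheme, two backups started at 0/20000B0 get counters 1 and 2, so a recovery from the second backup would ignore the backup-end record written by the first, and the two history files would no longer collide. A process id could replace the counter in the same structure, at the cost of possible reuse across restarts.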