Thread: BUG #3110: Online Backup introduces Duplicate OIDs
A bit more information.  I found that if I take an offline backup created
around the same time as my online backup, roll forward the transaction log
files included in the offline backup using a recovery.conf file, the
duplicate records do NOT exist.

Therefore it seems there is no corruption in the WAL files.  The problem
must be in the PITR processing of the online backup file.

- r.
"Randy Isbell \(jisbell\)" <jisbell@cisco.com> writes: > I found that if I take an offline backup created around the same time as > my online backup, roll forward the transaction log files included in the > offline backup using a recovery.conf file, the duplicate records do NOT > exist. > Therefore it seems there is no corruption in the WAL files. The problem > must be in the PITR processing of the online backup file. ... or there's something wrong with your backup procedure. I hadn't looked closely at that point before, but I see you describe it as > 3. Issue pg_start_backup() > 4. Save off the data cluster > 5. Issue pg_stop_backup() > 6. Collect the WAL files > 7. Create a big hairy tar file with the stuff from items 4 and 6. > 8. Take the big hairy tar file to another server running the same pg > 8.2.3, untar and start postgres AFAICS this procedure is *not* suggested anywhere in our documentation. What's bothering me about it is that I don't see anything guaranteeing that you have a full set of WAL files back to pg_start_backup(). If checkpoints occur during step 4, as is virtually certain given you say step 4 takes 20 minutes, then WAL files you need will get renamed/recycled. What are you doing to "collect the WAL files" exactly? Also, what do you consider to be an "offline backup", and what's different in your process for creating that? regards, tom lane
On Mar 10, 2007, at 5:02 PM, Tom Lane wrote:

> "Randy Isbell \(jisbell\)" <jisbell@cisco.com> writes:
>> I found that if I take an offline backup created around the same time as
>> my online backup, roll forward the transaction log files included in the
>> offline backup using a recovery.conf file, the duplicate records do NOT
>> exist.
>> Therefore it seems there is no corruption in the WAL files. The problem
>> must be in the PITR processing of the online backup file.
>
> ... or there's something wrong with your backup procedure.
>
> I hadn't looked closely at that point before, but I see you describe it as
>
>> 3. Issue pg_start_backup()
>> 4. Save off the data cluster
>> 5. Issue pg_stop_backup()
>> 6. Collect the WAL files
>> 7. Create a big hairy tar file with the stuff from items 4 and 6.
>> 8. Take the big hairy tar file to another server running the same pg
>>    8.2.3, untar and start postgres
>
> AFAICS this procedure is *not* suggested anywhere in our documentation.

My description is abbreviated, but the process follows exactly that which
is documented in Chapter 23.3 "Continuous Archiving and Point-In-Time
Recovery".  I use the term "online backup" to refer to this process.

> What's bothering me about it is that I don't see anything guaranteeing
> that you have a full set of WAL files back to pg_start_backup().  If
> checkpoints occur during step 4, as is virtually certain given you say
> step 4 takes 20 minutes, then WAL files you need will get
> renamed/recycled.  What are you doing to "collect the WAL files" exactly?

"archive_command" is defined in postgresql.conf in order to save the WAL
files to an archive location.  The start and end WAL file names are
obtained from the backup history file.  All WAL files between the first
and last inclusive are saved with the data cluster snapshot.  In any
event, starting postgres with this saved cluster initiates the recovery
process and postgres does come up ready.

One thing I do find interesting: while the backup created with my
"online" / Chap 23.3 processing contains 18 WAL files, only the last 3
(most recent) are used during the recovery process.  Is that normal?

> Also, what do you consider to be an "offline backup", and what's
> different in your process for creating that?

My term "offline backup" refers to the backup process documented in
Chapter 23.2 "File System Level Backup".  In this backup postgres is shut
down and a snapshot of the data cluster is taken.

It seems odd that I'm the only person having this problem.  My database
is not that big (10MM tuples in 100 tables or so).  So I question my
processing or the environment.  After reviewing the process I find my
backup procedure matches the documentation.  Could this be FreeBSD 6.1 or
the filesystem?  How many folks are using FreeBSD 6.1 and UFS2?

- r.
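To make the "collect the WAL files" step concrete, a hedged sketch
assuming the archive directory /usr/local/pgsql/walarch that appears in
this thread; the archive_command line is an example setting, and the grep
simply pulls the first and last segment names out of the backup history
file:

    # postgresql.conf (example setting):
    #   archive_command = 'cp %p /usr/local/pgsql/walarch/%f'
    # The backup history file written at pg_stop_backup() brackets the range:
    grep -E '^(START|STOP) WAL LOCATION' /usr/local/pgsql/walarch/*.backup
    # every segment from the START file through the STOP file, inclusive,
    # must accompany the data cluster snapshot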
Randy Isbell <jisbell@cisco.com> writes:
> One thing I do find interesting: while the backup created with my
> "online" / Chap 23.3 processing contains 18 WAL files, only the last
> 3 (most recent) are used during the recovery process. Is that normal?

Hmm, no, that seems pretty suspicious.  The thing *should* be starting
from the checkpoint created by pg_start_backup(), but that observation
makes it sound like it is starting from some later one, probably whatever
happened to be current when pg_control got archived.  Please check this
by noting what the startup messages say about where replay commences, and
compare to the backup label file (plain text, you can just look at it)
and pg_control (use pg_controldata before starting the replay).

			regards, tom lane
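A quick sketch of the checks being asked for here, run against the
restored copy before the postmaster is started; $PGDATA is assumed to
point at the untarred cluster:

    cat "$PGDATA/backup_label"          # START/STOP/CHECKPOINT locations, plain text
    pg_controldata "$PGDATA" | grep -i 'checkpoint location'
    # replay should begin at the CHECKPOINT LOCATION recorded in backup_label,
    # not at whichever checkpoint pg_control happened to hold when it was copied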
On Mar 12, 2007, at 8:08 PM, Tom Lane wrote:

> Randy Isbell <jisbell@cisco.com> writes:
>> One thing I do find interesting: while the backup created with my
>> "online" / Chap 23.3 processing contains 18 WAL files, only the last
>> 3 (most recent) are used during the recovery process. Is that normal?
>
> Hmm, no, that seems pretty suspicious.  The thing *should* be starting
> from the checkpoint created by pg_start_backup(), but that observation
> makes it sound like it is starting from some later one, probably
> whatever happened to be current when pg_control got archived.  Please
> check this by noting what the startup messages say about where replay
> commences, and compare to the backup label file (plain text, you can
> just look at it) and pg_control (use pg_controldata before starting the
> replay).

Here is the pg_controldata output for the saved cluster prior to startup:

pg_control version number:            822
Catalog version number:               200611241
Database system identifier:           5038266934557908351
Database cluster state:               in production
pg_control last modified:             Tue Mar 6 12:43:53 2007
Current log file ID:                  2
Next log file segment:                6
Latest checkpoint location:           2/3035F48
Prior checkpoint location:            2/292EC
Latest checkpoint's REDO location:    2/3000840
Latest checkpoint's UNDO location:    0/0
Latest checkpoint's TimeLineID:       1
Latest checkpoint's NextXID:          0/18823426
Latest checkpoint's NextOID:          8774996
Latest checkpoint's NextMultiXactId:  1
Latest checkpoint's NextMultiOffset:  0
Time of latest checkpoint:            Tue Mar 6 12:41:41 2007
Minimum recovery ending location:     0/0
Maximum data alignment:               4
Database block size:                  8192
Blocks per segment of large relation: 131072
WAL block size:                       8192
Bytes per WAL segment:                16777216
Maximum length of identifiers:        64
Maximum columns in an index:          32
Date/time type storage:               64-bit integers
Maximum length of locale name:        128
LC_COLLATE:                           C
LC_CTYPE:                             C

Here is the backup label file:

# cat 0000000100000001000000F3.00FB4C3C.backup
START WAL LOCATION: 1/F3FB4C3C (file 0000000100000001000000F3)
STOP WAL LOCATION: 2/56912D8 (file 000000010000000200000005)
CHECKPOINT LOCATION: 1/F3FB4C3C
START TIME: 2007-03-06 12:25:05 CST
LABEL: /usr/local/bck/db/sn200703061224.tar.gz
STOP TIME: 2007-03-06 12:44:27 CST

Here are the startup messages:

<2007-03-12 10:39:10 CDT 82475>LOG:  could not create socket for statistics collector: Protocol not supported
<2007-03-12 10:39:10 CDT 82475>LOG:  trying another address for the statistics collector
<2007-03-12 10:39:10 CDT 82889>LOG:  database system was interrupted at 2007-03-06 12:43:53 CST
<2007-03-12 10:39:10 CDT 82889>LOG:  starting archive recovery
<2007-03-12 10:39:10 CDT 82889>LOG:  restore_command = "cp /usr/local/pgsql/walarch/%f %p"
cp: /usr/local/pgsql/walarch/00000001.history: No such file or directory
<2007-03-12 10:39:10 CDT 82889>LOG:  restored log file "000000010000000200000003" from archive
<2007-03-12 10:39:10 CDT 82889>LOG:  checkpoint record is at 2/3035F48
<2007-03-12 10:39:10 CDT 82889>LOG:  redo record is at 2/3000840; undo record is at 0/0; shutdown FALSE
<2007-03-12 10:39:10 CDT 82889>LOG:  next transaction ID: 0/18823426; next OID: 8774996
<2007-03-12 10:39:10 CDT 82889>LOG:  next MultiXactId: 1; next MultiXactOffset: 0
<2007-03-12 10:39:10 CDT 82889>LOG:  automatic recovery in progress
<2007-03-12 10:39:10 CDT 82889>LOG:  redo starts at 2/3000840
<2007-03-12 10:39:10 CDT 82889>LOG:  REDO @ 2/3000840; LSN 2/30024C0: prev 2/3000818; xid 18823427; bkpb1: Heap - insert: rel 1663/16390/22670; tid 211239/23
<2007-03-12 10:39:10 CDT 82889>LOG:  REDO @ 2/30024C0; LSN 2/3004514: prev 2/3000840; xid 18823427; bkpb1: Btree - insert: rel 1663/16390/8358831; tid 9103/77
<2007-03-12 10:39:10 CDT 82889>LOG:  REDO @ 2/3004514; LSN 2/3005E9C: prev 2/30024C0; xid 18823427; bkpb1: Btree - insert: rel 1663/16390/8359008; tid 36219/100
<2007-03-12 10:39:10 CDT 82889>LOG:  REDO @ 2/3005E9C; LSN 2/3007958: prev 2/3004514; xid 18823427; bkpb1: Btree - insert: rel 1663/16390/8359009; tid 15905/133
<2007-03-12 10:39:10 CDT 82889>LOG:  REDO @ 2/3007958; LSN 2/3008AAC: prev 2/3005E9C; xid 18823427; bkpb1: Btree - insert: rel 1663/16390/8359011; tid 13711/14
<2007-03-12 10:39:10 CDT 82889>LOG:  REDO @ 2/3008AAC; LSN 2/3009D98: prev 2/3007958; xid 18823427; bkpb1: Btree - insert: rel 1663/16390/8359012; tid 15516/235
<2007-03-12 10:39:10 CDT 82889>LOG:  REDO @ 2/3009D98; LSN 2/300B0D4: prev 2/3008AAC; xid 18823427; bkpb1: Btree - insert: rel 1663/16390/8359024; tid 12394/300
<2007-03-12 10:39:10 CDT 82889>LOG:  REDO @ 2/300B0D4; LSN 2/300B0FC: prev 2/3009D98; xid 18823427: Transaction - commit: 2007-03-06 12:41:41

From the backup label file it seems the starting checkpoint record should
be at 1/F3FB4C3C.  However the pg_control info indicates 2/3035F48.
Which is correct?
Randy Isbell <jisbell@cisco.com> writes:
> From the backup label file it seems the starting checkpoint record
> should be at 1/F3FB4C3C.  However the pg_control info indicates
> 2/3035F48.  Which is correct?

The backup label file, of course.  So we have the smoking gun: it's not
seeing the backup label file.  Question is why not.  You do have that in
the $PGDATA directory at the moment of postmaster start, no?

			regards, tom lane
On Mar 13, 2007, at 10:40 AM, Tom Lane wrote:

> The backup label file, of course.  So we have the smoking gun: it's not
> seeing the backup label file.  Question is why not.  You do have that in
> the $PGDATA directory at the moment of postmaster start, no?
>
> 			regards, tom lane

Well, buggers.  That was it.

The usage of backup_label is not clearly documented in the manual; I will
submit a user comment on the 23.3 page of the 8.2 manual.

After ensuring backup_label is present, the recovery proceeded from the
correct point.  Thanks for your help in tracking this down.

I believe we can safely say this is NOT a bug, and close this thread.

Regards,

- r.

Randy Isbell
Technical Leader
Linksys One - SBSBU
2200 E Pres Geo Bush Tnpk
RCDN4/N2-115
Richardson, TX 75082
Office: 972.813.1410
Cell: 817.239.3941
eMail: jisbell@cisco.com
Randy Isbell <jisbell@cisco.com> writes:
> Well, buggers.  That was it.
> The usage of backup_label is not clearly documented in the manual; I
> will submit a user comment on the 23.3 page of the 8.2 manual.

Please send a proposed rewording to the pgsql-docs list.  We do try to
incorporate comments from the interactive docs into the manual, but it's
a whole lot more certain to happen if you send it to the mailing list.

BTW, the backup process is supposed to result in backup_label being
present in the filesystem dump, so that you'd have had to go out of your
way for it to NOT be present (which is why we didn't think it needed much
documentation).  What exactly was happening?  If you didn't deliberately
remove it from the dump, there may still be some kind of bug here.

			regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> writes:

> BTW, the backup process is supposed to result in backup_label being
> present in the filesystem dump, so that you'd have had to go out of your
> way for it to NOT be present (which is why we didn't think it needed
> much documentation).  What exactly was happening?  If you didn't
> deliberately remove it from the dump, there may still be some kind of
> bug here.

Correct.  backup_label was getting created.  My cleanup of such items as
pg_xlog/ was overly aggressive and was removing backup_label.  While
testing with no database activity during the dump, the problem did not
exist.

I'll send verbiage to the docs folks as requested.  Thanks again for all
the help.

- r.
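A minimal sketch of a less aggressive cleanup, under the assumption that
$RESTORE_DIR is an illustrative path holding the untarred cluster copy:
empty pg_xlog (keeping the directory and its archive_status subdirectory)
but leave backup_label alone, since recovery needs it to find the
starting checkpoint:

    find "$RESTORE_DIR/pg_xlog" -type f -exec rm -f {} +
    mkdir -p "$RESTORE_DIR/pg_xlog/archive_status"
    test -f "$RESTORE_DIR/backup_label" && echo "backup_label still present"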
On Sun, 2007-03-18 at 07:39 -0700, Randy Isbell (jisbell) wrote:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
>
> > BTW, the backup process is supposed to result in backup_label being
> > present in the filesystem dump, so that you'd have had to go out of your
> > way for it to NOT be present (which is why we didn't think it needed
> > much documentation).  What exactly was happening?  If you didn't
> > deliberately remove it from the dump, there may still be some kind of
> > bug here.
>
> Correct.  backup_label was getting created.  My cleanup of such items as
> pg_xlog/ was overly aggressive and was removing backup_label.  While
> testing with no database activity during the dump, the problem did not
> exist.
>
> I'll send verbiage to the docs folks as requested.  Thanks again for all
> the help.

Not sure if we ever patched the docs for this one.  This happened again
recently.

I would like to introduce a WARNING message into recovery to show that a
backup_label was expected, yet was not present.  This is normal in some
use cases of recovery, but we need to be able to tell those situations
apart.  Any objections?

Perhaps we can upgrade that to an explicit option in the recovery.conf,
so we have backup_label_required=true as the default, which can be
overridden.  But that would break too many existing setups and we should
wait to see how the dust settles on 8.4 before we do that in the future.

--
 Simon Riggs
 www.2ndQuadrant.com
 PostgreSQL Training, Services and Support
Simon Riggs <simon@2ndquadrant.com> writes:
> I would like to introduce a WARNING message into recovery to show that a
> backup_label was expected, yet was not present.  This is normal in some
> use cases of recovery, but we need to be able to tell those situations
> apart.  Any objections?

Given that the case would sometimes be expected, I think a warning would
just add confusion.  If we could distinguish the cases and make it a hard
ERROR to be lacking the label when it's needed, then maybe, but I'm not
sure how.

			regards, tom lane
On Thu, 2008-06-05 at 10:59 -0400, Tom Lane wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
> > I would like to introduce a WARNING message into recovery to show that a
> > backup_label was expected, yet was not present.  This is normal in some
> > use cases of recovery, but we need to be able to tell those situations
> > apart.  Any objections?
>
> Given that the case would sometimes be expected, I think a warning would
> just add confusion.  If we could distinguish the cases and make it a
> hard ERROR to be lacking the label when it's needed, then maybe, but I'm
> not sure how.

OK, understood.  How about we do this:

* Add some more docs.

* Put a specific warning about the need for a backup_label file into the
  next release notes for 8.x.x releases.  "Important change to docs for
  recovery: ..."

* In the release note comments, add that in a future release we will add
  a parameter to recovery.conf "require_backup_label" which defaults to
  "true".

* Repeat this warning in the next release notes also.

* On the one after that we actually make the change.  Once we add this,
  if the backup_label is missing then we throw an ERROR.

--
 Simon Riggs
 www.2ndQuadrant.com
 PostgreSQL Training, Services and Support
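Purely as an illustration of the proposal above: restore_command is a
real recovery.conf setting (the same one used earlier in this thread),
while the require_backup_label line is only the parameter being proposed
here and does not exist in any released version:

    cat > "$PGDATA/recovery.conf" <<'EOF'
    restore_command = 'cp /usr/local/pgsql/walarch/%f %p'
    # proposed above, not implemented:
    # require_backup_label = true
    EOF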
Simon Riggs <simon@2ndquadrant.com> writes:
> * In the release note comments, add that in a future release we will add
>   a parameter to recovery.conf "require_backup_label" which defaults to
>   "true".

The problem that I'm having with that idea is that it breaks
crash-recovery after replay starts, because we intentionally remove the
backup label once we've read it.  (And no, please do not suggest
rewriting the config file ...)

Basically it seems that we need a more robust way to distinguish
"beginning a recovery" from "resuming a recovery".  Perhaps more state in
pg_control is needed?

			regards, tom lane
On Thu, 2008-06-05 at 11:39 -0400, Tom Lane wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
> > * In the release note comments, add that in a future release we will add
> >   a parameter to recovery.conf "require_backup_label" which defaults to
> >   "true".
>
> The problem that I'm having with that idea is that it breaks
> crash-recovery after replay starts, because we intentionally remove the
> backup label once we've read it.  (And no, please do not suggest
> rewriting the config file ...)

LOL :-)

Hmm, seems we can check for the backup_label.old if the first check
fails.  We don't actually remove the backup_label, we just move it out of
the way.

> Basically it seems that we need a more robust way to distinguish
> "beginning a recovery" from "resuming a recovery".  Perhaps more state
> in pg_control is needed?

That would work also.  Which do you prefer?  The former is back patchable
eventually, the latter is not, even if I like it as a cleaner solution.

Or should we just go with shouting DON'T REMOVE THE backup_label FILE in
the docs?  For that matter, it would be easier to list the files you
*are* allowed to remove *ever* somewhere prominent and leave it at that,
rather than try to explain all the various ways the other files are
essential.

--
 Simon Riggs
 www.2ndQuadrant.com
 PostgreSQL Training, Services and Support
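A rough sketch of the distinction being discussed, expressed as a check
an administrator could run by hand before starting the postmaster (the
real logic would of course live inside the server); paths follow the
usual $PGDATA layout:

    if [ -f "$PGDATA/backup_label" ]; then
        echo "beginning a recovery from a base backup"
    elif [ -f "$PGDATA/backup_label.old" ]; then
        echo "resuming a recovery: the label was already read and moved aside"
    else
        echo "no label at all: plain crash recovery, or the label was removed by mistake"
    fi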