Thread: BUG #3110: Online Backup introduces Duplicate OIDs

BUG #3110: Online Backup introduces Duplicate OIDs

From
"Randy Isbell \(jisbell\)"
Date:
A bit more information.=20=20

I found that if I take an offline backup created around the same time as my=
 online backup, roll forward the transaction log files included in the offl=
ine backup using a recovery.conf file, the duplicate records do NOT exist.

Therefore it seems there is no corruption in the WAL files.  The problem mu=
st be in the PITR processing of the online backup file.

- r.

Re: BUG #3110: Online Backup introduces Duplicate OIDs

From
Tom Lane
Date:
"Randy Isbell \(jisbell\)" <jisbell@cisco.com> writes:
> I found that if I take an offline backup created around the same time as
> my online backup, roll forward the transaction log files included in the
> offline backup using a recovery.conf file, the duplicate records do NOT
> exist.
> Therefore it seems there is no corruption in the WAL files.  The problem
> must be in the PITR processing of the online backup file.

... or there's something wrong with your backup procedure.

I hadn't looked closely at that point before, but I see you describe it
as

> 3. Issue pg_start_backup()
> 4. Save off the data cluster
> 5. Issue pg_stop_backup()
> 6. Collect the WAL files
> 7. Create a big hairy tar file with the stuff from items 4 and 6.
> 8. Take the big hairy tar file to another server running the same pg
> 8.2.3, untar and start postgres

AFAICS this procedure is *not* suggested anywhere in our documentation.
What's bothering me about it is that I don't see anything guaranteeing
that you have a full set of WAL files back to pg_start_backup().  If
checkpoints occur during step 4, as is virtually certain given you say
step 4 takes 20 minutes, then WAL files you need will get
renamed/recycled.  What are you doing to "collect the WAL files"
exactly?

Also, what do you consider to be an "offline backup", and what's
different in your process for creating that?

            regards, tom lane

Re: BUG #3110: Online Backup introduces Duplicate OIDs

From
Randy Isbell
Date:
On Mar 10, 2007, at 5:02 PM, Tom Lane wrote:

> "Randy Isbell \(jisbell\)" <jisbell@cisco.com> writes:
>> I found that if I take an offline backup created around the same
>> time as
>> my online backup, roll forward the transaction log files included
>> in the
>> offline backup using a recovery.conf file, the duplicate records
>> do NOT
>> exist.
>> Therefore it seems there is no corruption in the WAL files.  The
>> problem
>> must be in the PITR processing of the online backup file.
>
> ... or there's something wrong with your backup procedure.
>
> I hadn't looked closely at that point before, but I see you
> describe it
> as
>
>> 3. Issue pg_start_backup()
>> 4. Save off the data cluster
>> 5. Issue pg_stop_backup()
>> 6. Collect the WAL files
>> 7. Create a big hairy tar file with the stuff from items 4 and 6.
>> 8. Take the big hairy tar file to another server running the same pg
>> 8.2.3, untar and start postgres
>
> AFAICS this procedure is *not* suggested anywhere in our
> documentation.

My description is abbreviated, but the process follows exactly that
which is documented in Chapter 23.3 "Continuous Archiving and Point-
In-Time Recovery".  I use the term "online backup" to refer to this
process.

> What's bothering me about it is that I don't see anything guaranteeing
> that you have a full set of WAL files back to pg_start_backup().  If
> checkpoints occur during step 4, as is virtually certain given you say
> step 4 takes 20 minutes, then WAL files you need will get
> renamed/recycled.  What are you doing to "collect the WAL files"
> exactly?
>

"archive_command" is defined in postgresql.conf in order to save the
WAL files to an archive location.  The start and end WAL file names
are obtained from the backup history file.  All WAL files between the
first and last inclusive are saved with the data cluster snapshot.
In any event, starting postgres with this saved cluster initiates the
recovery process and postgres does come up ready.

One thing I do find interesting:  while the backup created with my
"online" / Chap 23.3 processing contains 18 WAL files, only the last
3 (most recent) are used during the recovery process.  Is that normal?

> Also, what do you consider to be an "offline backup", and what's
> different in your process for creating that?

My term "offline backup" refers to the backup process documented in
Chapter 23.2 "File System Level Backup".  In this backup postgres is
shutdown and a snapshot of the data cluster is taken.

It seems odd that I'm the only person having this problem.  My
database is not that big (10MM tuples in 100 tables or so).  So I
question my processing or the environment.   After reviewing the
process I find my backup procedure matches the documentation.  Could
this be FreeBSD 6.1 or the filesystem?  How many folks are using
FreeBSD 6.1 and UFS2?

- r.

Re: BUG #3110: Online Backup introduces Duplicate OIDs

From
Tom Lane
Date:
Randy Isbell <jisbell@cisco.com> writes:
> One thing I do find interesting:  while the backup created with my
> "online" / Chap 23.3 processing contains 18 WAL files, only the last
> 3 (most recent) are used during the recovery process.  Is that normal?

Hmm, no, that seems pretty suspicious.  The thing *should* be starting
from the checkpoint created by pg_start_backup(), but that observation
makes it sound like it is starting from some later one, probably
whatever happened to be current when pg_control got archived.  Please
check this by noting what the startup messages say about where replay
commences, and compare to the backup label file (plain text, you can
just look at it) and pg_control (use pg_controldata before starting the
replay).

            regards, tom lane

Re: BUG #3110: Online Backup introduces Duplicate OIDs

From
Randy Isbell
Date:
On Mar 12, 2007, at 8:08 PM, Tom Lane wrote:

> Randy Isbell <jisbell@cisco.com> writes:
>> One thing I do find interesting:  while the backup created with my
>> "online" / Chap 23.3 processing contains 18 WAL files, only the last
>> 3 (most recent) are used during the recovery process.  Is that
>> normal?
>
> Hmm, no, that seems pretty suspicious.  The thing *should* be starting
> from the checkpoint created by pg_start_backup(), but that observation
> makes it sound like it is starting from some later one, probably
> whatever happened to be current when pg_control got archived.  Please
> check this by noting what the startup messages say about where replay
> commences, and compare to the backup label file (plain text, you can
> just look at it) and pg_control (use pg_controldata before starting
> the
> replay).

Here is the pg_controldata output for the saved cluster prior to
startup:

pg_control version number:            822
Catalog version number:               200611241
Database system identifier:           5038266934557908351
Database cluster state:               in production
pg_control last modified:             Tue Mar  6 12:43:53 2007
Current log file ID:                  2
Next log file segment:                6
Latest checkpoint location:           2/3035F48
Prior checkpoint location:            2/292EC
Latest checkpoint's REDO location:    2/3000840
Latest checkpoint's UNDO location:    0/0
Latest checkpoint's TimeLineID:       1
Latest checkpoint's NextXID:          0/18823426
Latest checkpoint's NextOID:          8774996
Latest checkpoint's NextMultiXactId:  1
Latest checkpoint's NextMultiOffset:  0
Time of latest checkpoint:            Tue Mar  6 12:41:41 2007
Minimum recovery ending location:     0/0
Maximum data alignment:               4
Database block size:                  8192
Blocks per segment of large relation: 131072
WAL block size:                       8192
Bytes per WAL segment:                16777216
Maximum length of identifiers:        64
Maximum columns in an index:          32
Date/time type storage:               64-bit integers
Maximum length of locale name:        128
LC_COLLATE:                           C
LC_CTYPE:                             C


Here is the backup label file:
# cat 0000000100000001000000F3.00FB4C3C.backup
START WAL LOCATION: 1/F3FB4C3C (file 0000000100000001000000F3)
STOP WAL LOCATION: 2/56912D8 (file 000000010000000200000005)
CHECKPOINT LOCATION: 1/F3FB4C3C
START TIME: 2007-03-06 12:25:05 CST
LABEL: /usr/local/bck/db/sn200703061224.tar.gz
STOP TIME: 2007-03-06 12:44:27 CST


Here are the startup messages:

<2007-03-12 10:39:10 CDT 82475>LOG:  could not create socket for
statistics collector: Protocol not supported
<2007-03-12 10:39:10 CDT 82475>LOG:  trying another address for the
statistics collector
<2007-03-12 10:39:10 CDT 82889>LOG:  database system was interrupted
at 2007-03-06 12:43:53 CST
<2007-03-12 10:39:10 CDT 82889>LOG:  starting archive recovery
<2007-03-12 10:39:10 CDT 82889>LOG:  restore_command = "cp /usr/local/
pgsql/walarch/%f %p"
cp: /usr/local/pgsql/walarch/00000001.history: No such file or directory
<2007-03-12 10:39:10 CDT 82889>LOG:  restored log file
"000000010000000200000003" from archive
<2007-03-12 10:39:10 CDT 82889>LOG:  checkpoint record is at 2/3035F48
<2007-03-12 10:39:10 CDT 82889>LOG:  redo record is at 2/3000840;
undo record is at 0/0; shutdown FALSE
<2007-03-12 10:39:10 CDT 82889>LOG:  next transaction ID: 0/18823426;
next OID: 8774996
<2007-03-12 10:39:10 CDT 82889>LOG:  next MultiXactId: 1; next
MultiXactOffset: 0
<2007-03-12 10:39:10 CDT 82889>LOG:  automatic recovery in progress
<2007-03-12 10:39:10 CDT 82889>LOG:  redo starts at 2/3000840
<2007-03-12 10:39:10 CDT 82889>LOG:  REDO @ 2/3000840; LSN 2/30024C0:
prev 2/3000818; xid 18823427; bkpb1: Heap - insert: rel
1663/16390/22670; tid 211239/23
<2007-03-12 10:39:10 CDT 82889>LOG:  REDO @ 2/30024C0; LSN 2/3004514:
prev 2/3000840; xid 18823427; bkpb1: Btree - insert: rel
1663/16390/8358831; tid 9103/77
<2007-03-12 10:39:10 CDT 82889>LOG:  REDO @ 2/3004514; LSN 2/3005E9C:
prev 2/30024C0; xid 18823427; bkpb1: Btree - insert: rel
1663/16390/8359008; tid 36219/100
<2007-03-12 10:39:10 CDT 82889>LOG:  REDO @ 2/3005E9C; LSN 2/3007958:
prev 2/3004514; xid 18823427; bkpb1: Btree - insert: rel
1663/16390/8359009; tid 15905/133
<2007-03-12 10:39:10 CDT 82889>LOG:  REDO @ 2/3007958; LSN 2/3008AAC:
prev 2/3005E9C; xid 18823427; bkpb1: Btree - insert: rel
1663/16390/8359011; tid 13711/14
<2007-03-12 10:39:10 CDT 82889>LOG:  REDO @ 2/3008AAC; LSN 2/3009D98:
prev 2/3007958; xid 18823427; bkpb1: Btree - insert: rel
1663/16390/8359012; tid 15516/235
<2007-03-12 10:39:10 CDT 82889>LOG:  REDO @ 2/3009D98; LSN 2/300B0D4:
prev 2/3008AAC; xid 18823427; bkpb1: Btree - insert: rel
1663/16390/8359024; tid 12394/300
<2007-03-12 10:39:10 CDT 82889>LOG:  REDO @ 2/300B0D4; LSN 2/300B0FC:
prev 2/3009D98; xid 18823427: Transaction - commit: 2007-03-06 12:41:41


 From the backup label file it seems the starting checkpoint record
should be at 1/F3FB4C3C.  However the pg_control info indicates
2/3035F48.  Which is correct?

Re: BUG #3110: Online Backup introduces Duplicate OIDs

From
Tom Lane
Date:
Randy Isbell <jisbell@cisco.com> writes:
>  From the backup label file it seems the starting checkpoint record
> should be at 1/F3FB4C3C.  However the pg_control info indicates
> 2/3035F48.  Which is correct?

The backup label file, of course.  So we have the smoking gun: it's
not seeing the backup label file.  Question is why not.  You do have
that in the $PGDATA directory at the moment of postmaster start, no?

            regards, tom lane

Re: BUG #3110: Online Backup introduces Duplicate OIDs

From
Randy Isbell
Date:
On Mar 13, 2007, at 10:40 AM, Tom Lane wrote:

>
> The backup label file, of course.  So we have the smoking gun: it's
> not seeing the backup label file.  Question is why not.  You do have
> that in the $PGDATA directory at the moment of postmaster start, no?
>
>             regards, tom lane

Well, buggers.  That was it.

The usage of backup_label is not clearly documented in the manual, I
will submitted a user comment on the 23.3 page of the 8.2 manual.

After ensuring backup_label is present, the recovery proceeded from
the correct point.

Thanks for your help in tracking this down.  I believe we can safely
say this is NOT a bug, and close this thread.

Regards,
- r.

Randy Isbell
Technical Leader

Linksys One - SBSBU
2200 E Pres Geo Bush Tnpk
RCDN4/N2-115
Richardson, TX  75082

Office:    972.813.1410
Cell:       817.239.3941
eMail:    jisbell@cisco.com

Re: BUG #3110: Online Backup introduces Duplicate OIDs

From
Tom Lane
Date:
Randy Isbell <jisbell@cisco.com> writes:
> Well, buggers.  That was it.
> The usage of backup_label is not clearly documented in the manual, I
> will submitted a user comment on the 23.3 page of the 8.2 manual.

Please send a proposed rewording to the pgsql-docs list.  We do try
to incorporate comments from the interactive docs into the manual,
but it's a whole lot more certain to happen if you send it to the
mailing list.

BTW, the backup process is supposed to result in backup_label being
present in the filesystem dump, so that you'd have had to go out of your
way for it to NOT be present (which is why we didn't think it needed
much documentation).  What exactly was happening?  If you didn't
deliberately remove it from the dump, there may still be some kind of
bug here.

            regards, tom lane

Re: BUG #3110: Online Backup introduces Duplicate OIDs

From
"Randy Isbell \(jisbell\)"
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:

> BTW, the backup process is supposed to result in backup_label being
> present in the filesystem dump, so that you'd have had to go out of your
> way for it to NOT be present (which is why we didn't think it needed
> much documentation).  What exactly was happening?  If you didn't
> deliberately remove it from the dump, there may still be some kind of
> bug here.

Correct.  backup_label was getting created.  My cleanup of such items as pg=
_xlog/ was overly aggressive and removing backup_label.  While testing with=
 no database activity during the dump, the problem did not exist.=20=20

I'll send verbiage to the docs folks as requested.  Thanks again for all th=
e help.

- r.=09

Re: BUG #3110: Online Backup introduces Duplicate OIDs

From
Simon Riggs
Date:
On Sun, 2007-03-18 at 07:39 -0700, Randy Isbell (jisbell) wrote:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
>
> > BTW, the backup process is supposed to result in backup_label being
> > present in the filesystem dump, so that you'd have had to go out of
> your
> > way for it to NOT be present (which is why we didn't think it needed
> > much documentation).  What exactly was happening?  If you didn't
> > deliberately remove it from the dump, there may still be some kind
> of
> > bug here.
>
> Correct.  backup_label was getting created.  My cleanup of such items
> as pg_xlog/ was overly aggressive and removing backup_label.  While
> testing with no database activity during the dump, the problem did not
> exist.
>
> I'll send verbiage to the docs folks as requested.  Thanks again for
> all the help.

Not sure if we ever patched the docs for this one.

This happened again recently.

I would like to introduce a WARNING message into recovery to show that a
backup_label was expected, yet was not present. This is normal in some
use cases of recovery, but we need to be able to tell those situations
apart. Any objections?

Perhaps we can upgrade that to an explicit option in the recovery.conf,
so we have backup_label_required=true as the default, that can be
overridden. But that would break too many existing setups and we should
wait to see how the dust settles on 8.4 before we do that in the future.

--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support

Re: BUG #3110: Online Backup introduces Duplicate OIDs

From
Tom Lane
Date:
Simon Riggs <simon@2ndquadrant.com> writes:
> I would like to introduce a WARNING message into recovery to show that a
> backup_label was expected, yet was not present. This is normal in some
> use cases of recovery, but we need to be able to tell those situations
> apart. Any objections?

Given that the case would sometimes be expected, I think a warning would
just add confusion.  If we could distinguish the cases and make it a
hard ERROR to be lacking the label when it's needed, then maybe, but I'm
not sure how.

            regards, tom lane

Re: BUG #3110: Online Backup introduces Duplicate OIDs

From
Simon Riggs
Date:
On Thu, 2008-06-05 at 10:59 -0400, Tom Lane wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
> > I would like to introduce a WARNING message into recovery to show that a
> > backup_label was expected, yet was not present. This is normal in some
> > use cases of recovery, but we need to be able to tell those situations
> > apart. Any objections?
>
> Given that the case would sometimes be expected, I think a warning would
> just add confusion.  If we could distinguish the cases and make it a
> hard ERROR to be lacking the label when it's needed, then maybe, but I'm
> not sure how.

OK, understood.

How about we do this:

* Add some more docs

* Put a specific warning about the need for a backup_label file into the
next release notes for 8.x.x releases. "Important change to docs for
recovery:..."

* In the release note comments, add that in a future release we will add
a parameter to recovery.conf "require_backup_label" which defaults to
"true".

* Repeat this warning in the next release notes also

* On the one after that we actually make the change. Once we add this,
if the backup_label is missing then we throw an ERROR.

--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support

Re: BUG #3110: Online Backup introduces Duplicate OIDs

From
Tom Lane
Date:
Simon Riggs <simon@2ndquadrant.com> writes:
> * In the release note comments, add that in a future release we will add
> a parameter to recovery.conf "require_backup_label" which defaults to
> "true".

The problem that I'm having with that idea is that it breaks
crash-recovery after replay starts, because we intentionally remove the
backup label once we've read it.  (And no, please do not suggest
rewriting the config file ...)

Basically it seems that we need a more robust way to distinguish
"beginning a recovery" from "resuming a recovery".  Perhaps more state
in pg_control is needed?

            regards, tom lane

Re: BUG #3110: Online Backup introduces Duplicate OIDs

From
Simon Riggs
Date:
On Thu, 2008-06-05 at 11:39 -0400, Tom Lane wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
> > * In the release note comments, add that in a future release we will add
> > a parameter to recovery.conf "require_backup_label" which defaults to
> > "true".
>
> The problem that I'm having with that idea is that it breaks
> crash-recovery after replay starts, because we intentionally remove the
> backup label once we've read it.  (And no, please do not suggest
> rewriting the config file ...)

LOL :-)

Hmm, seems we can check for the backup_label.old if the first check
fails. We don't actually remove the backup_label, we just move it out of
the way.

> Basically it seems that we need a more robust way to distinguish
> "beginning a recovery" from "resuming a recovery".  Perhaps more state
> in pg_control is needed?

That would work also.

Which do you prefer? The former is back patchable eventually, the latter
is not, even if I like it as a cleaner solution.

Or should we just go with shouting DON'T REMOVE THE backup_label FILE in
the docs?

For that matter, it would be easier to list the files you *are* allowed
to remove *ever* somewhere prominent and leave it at that, rather than
try to explain all the various ways the other files are essential.

--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support