Thread: Problems with postgres online backup - restore

Problems with postgres online backup - restore

From
"Chris Barnes"
Date:

  I started an online backup of postgres, tar’d my data folder, copy to usb drive in production
and restored it into my RC environment. Have I missed something important?

When starting the database I receive many errors that look like the backup was corrupted.

 

2009-06-23 08:29:15 EDT:@:7614ERROR:  xlog flush request 10D/590D6578 is not satisfied --- flushed only to 10D/510C4FB8

2009-06-23 08:29:15 EDT:@:7614CONTEXT:  writing block 2 of relation 16403/16884/2696

2009-06-23 08:29:16 EDT:@:7614ERROR:  xlog flush request 10D/590D6578 is not satisfied --- flushed only to 10D/510C4FB8

2009-06-23 08:29:16 EDT:@:7614CONTEXT:  writing block 2 of relation 16403/16884/2696

2009-06-23 08:29:16 EDT:@:7614WARNING:  could not write block 2 of 16403/16884/2696

 

 

pgdb001=> select criteria_rank from client001.t1020screen where criteria_rank like '%TR009%';

ERROR:  missing chunk number 0 for toast value 738680

 

Version

[postgres@pgprd01 data]$ cat PG_VERSION

8.3

 

Online Backup

psql

checkpoint;

SELECT pg_switch_xlog();

SELECT pg_start_backup('postres_full_backup_June222009');

tar -czvf pgprd01_June22_2009_production.dmp.tar.gz data/

SELECT pg_stop_backup();

Restore

tar –xzvf pgprd01_June22_2009_production.dmp.tar.gz

 

 

 

 

Re: Problems with postgres online backup - restore

From
Gerhard Wiesinger
Date:
Hello,

I'd like to understand the PostgreSQL internals in "backup mode".

When I understood it correctly pg_start_backup() make a checkpoint and
stops writing to the data/ directory except the WAL.

All new transaction go into WAL which is also logical. But how is data
consistency done when the written/changed blocks don't fit into the buffer
cache?

E.g.
A lot of updates which exceed buffer cache (Where is data written except
WAL since data should be kept constant?)
SELECT FROM updated data: WHERE is the new data fetched from?

Thnx for any explainations.

Ciao,
Gerhard

--
http://www.wiesinger.com/


On Thu, 25 Jun 2009, Chris Barnes wrote:

>
> SELECT pg_switch_xlog();
>
> SELECT pg_start_backup('postres_full_backup_June222009');
>
> tar -czvf pgprd01_June22_2009_production.dmp.tar.gz data/
>
> SELECT pg_stop_backup();
>

Re: Problems with postgres online backup - restore

From
Richard Huxton
Date:
Gerhard Wiesinger wrote:
> Hello,
>
> I'd like to understand the PostgreSQL internals in "backup mode".
>
> When I understood it correctly pg_start_backup() make a checkpoint and
> stops writing to the data/ directory except the WAL.
>
> All new transaction go into WAL which is also logical. But how is data
> consistency done when the written/changed blocks don't fit into the
> buffer cache?

The data isn't kept consistent. Which is why you need the WAL. Restoring
from a PITR backup is basically the same idea as recovering from a
crash. Any blocks that might have been updated since you called
pg_start_backup() will be rewritten from the WAL.

--
   Richard Huxton
   Archonet Ltd

Re: Problems with postgres online backup - restore

From
Gerhard Wiesinger
Date:
Hello,

OK, what's then the difference doing a pg_start_backup() or just
doing the backup?

Isn't that a problem that the datablock are very inconsistent, even so
inconsistent that they are corrupt:

E.g. A part of a datablock is written when e.g. the file is tarred. =>
Datablock on backup is corrupt => An then even the WAL can't be applied.

Why does it work correctly? Or is there some design problem?

Thnx.

Ciao,
Gerhard

--
http://www.wiesinger.com/


On Thu, 25 Jun 2009, Richard Huxton wrote:

> Gerhard Wiesinger wrote:
>> Hello,
>>
>> I'd like to understand the PostgreSQL internals in "backup mode".
>>
>> When I understood it correctly pg_start_backup() make a checkpoint and
>> stops writing to the data/ directory except the WAL.
>>
>> All new transaction go into WAL which is also logical. But how is data
>> consistency done when the written/changed blocks don't fit into the buffer
>> cache?
>
> The data isn't kept consistent. Which is why you need the WAL. Restoring from
> a PITR backup is basically the same idea as recovering from a crash. Any
> blocks that might have been updated since you called pg_start_backup() will
> be rewritten from the WAL.
>
> --
>  Richard Huxton
>  Archonet Ltd
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>

Re: Problems with postgres online backup - restore

From
Richard Huxton
Date:
Gerhard Wiesinger wrote:
> Hello,
>
> OK, what's then the difference doing a pg_start_backup() or just doing
> the backup?

pg_start_backup() forces a checkpoint (and logs a label for your backup
too).

> Isn't that a problem that the datablock are very inconsistent, even so
> inconsistent that they are corrupt:
>
> E.g. A part of a datablock is written when e.g. the file is tarred. =>
> Datablock on backup is corrupt => An then even the WAL can't be applied.
>
> Why does it work correctly? Or is there some design problem?

It works because the WAL doesn't hold a list of row updates ("update row
12345 set field 4 = true") but block updates. Any update to a disk block
is recorded - table or index. The WAL doesn't really know anything about
  tables, columns, primary keys etc - just disk blocks.

One small optimisation is that the first time a block is touched after a
checkpoint the value of the whole block is written to WAL and after that
only updates.

So - if you have a checkpointed system (all updates guaranteed written
to disk) and a complete set of WAL files from that point on you can
always recreate the writes to any point in time after that.

http://www.postgresql.org/docs/8.3/static/continuous-archiving.html

--
   Richard Huxton
   Archonet Ltd

Re: Problems with postgres online backup - restore

From
Gerhard Wiesinger
Date:
Hello Richard,

OK, understood it and looks to me a good system with whole block overwrite
and then the deltas.

Thnx.

Ciao,
Gerhard

--
http://www.wiesinger.com/


On Fri, 26 Jun 2009, Richard Huxton wrote:

> Gerhard Wiesinger wrote:
>> Hello,
>>
>> OK, what's then the difference doing a pg_start_backup() or just doing the
>> backup?
>
> pg_start_backup() forces a checkpoint (and logs a label for your backup too).
>
>> Isn't that a problem that the datablock are very inconsistent, even so
>> inconsistent that they are corrupt:
>>
>> E.g. A part of a datablock is written when e.g. the file is tarred. =>
>> Datablock on backup is corrupt => An then even the WAL can't be applied.
>>
>> Why does it work correctly? Or is there some design problem?
>
> It works because the WAL doesn't hold a list of row updates ("update row
> 12345 set field 4 = true") but block updates. Any update to a disk block is
> recorded - table or index. The WAL doesn't really know anything about
> tables, columns, primary keys etc - just disk blocks.
>
> One small optimisation is that the first time a block is touched after a
> checkpoint the value of the whole block is written to WAL and after that only
> updates.
>
> So - if you have a checkpointed system (all updates guaranteed written to
> disk) and a complete set of WAL files from that point on you can always
> recreate the writes to any point in time after that.
>
> http://www.postgresql.org/docs/8.3/static/continuous-archiving.html
>
> --
>  Richard Huxton
>  Archonet Ltd
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>