Thread: Re: pg v. 8.4.5 misses objects and data after restoring from backup using wal

Re: pg v. 8.4.5 misses objects and data after restoring from backup using wal

From
pasman pasmański
Date:
The problem exists when you do a base backup with pg_backup?
Maybe rsync skip some data ?



------------
pasman

Hi!

Thanks for the answer.

On 01/03/11 18:36, pasman pasmański wrote:
> The problem exists when you do a base backup with pg_backup?
> Maybe rsync skip some data ?
>
>

Yes, problem exists using pg_start/stop_backup as i discribed.

I am sure that 'rsync -avH ...' and 'tar cvf ...' will skip at least
additions made to the filesystem tree after they are started. In this
case rsync says nothing and tar says

tar: directory-name: file changed as we read it

But
http://www.postgresql.org/docs/9.0/interactive/continuous-archiving.html
suggests to use tar on rsync and i guess that PostgreSQL recovery with
wal files takes care of these inconsistencies that are created during
copying filesystem, right?


Imre

> ------------
> pasman
>
>


On Jan 3, 2011, at 11:22 AM, Imre Oolberg wrote:

> But http://www.postgresql.org/docs/9.0/interactive/continuous-archiving.html suggests to use tar on rsync and i guess
thatPostgreSQL recovery with wal files takes care of these inconsistencies that are created during copying filesystem,
right?

Yes, but the database is recovered to the consistent state as of the pg_start_backup command, as I pointed out to you
before.Results of transactions that commit after the pg_start_backup command will not be in the backed up database. 

--
Scott Ribe
scott_ribe@elevated-dev.com
http://www.elevated-dev.com/
(303) 722-0567 voice





Re: Re: pg v. 8.4.5 misses objects and data after restoring from backup using wal

From
"Kevin Grittner"
Date:
Scott Ribe <scott_ribe@elevated-dev.com> wrote:

> Yes, but the database is recovered to the consistent state as of
> the pg_start_backup command, as I pointed out to you before.
> Results of transactions that commit after the pg_start_backup
> command will not be in the backed up database.

I doubt you.  The transactions between pg_start_backup and
pg_stop_backup must all be applied to get to a consistent state.
That means you will be *at least* to the point of the stop.  If
there are more WAL files available to apply, and you don't tell it
otherwise, it will keep going *past* the stop point until the
attempt to get a WAL file fails.

It sounds to me more like the OP isn't following all the
instructions for a PITR recovery, and is instead falling into crash
recovery.  That won't work on a copy made while a database was being
modified.

-Kevin

Scott Ribe <scott_ribe@elevated-dev.com> writes:
> On Jan 3, 2011, at 11:22 AM, Imre Oolberg wrote:
>> But http://www.postgresql.org/docs/9.0/interactive/continuous-archiving.html suggests to use tar on rsync and i
guessthat PostgreSQL recovery with wal files takes care of these inconsistencies that are created during copying
filesystem,right? 

> Yes, but the database is recovered to the consistent state as of the pg_start_backup command, as I pointed out to you
before.Results of transactions that commit after the pg_start_backup command will not be in the backed up database. 

That's either incorrect or poorly worded.  The only way to get a
consistent, usable database is to replay all the WAL that was generated
between pg_start_backup and pg_stop_backup.  That will fix up whatever
inconsistencies exist in the base backup fileset.  Once you've done
that, you do have the results of transactions that committed after
pg_start_backup (and up to pg_stop_backup).  If you haven't done that,
what you have is an inconsistent pile of rather arbitrary bits.  The
base backup by itself (without the concurrently-created WAL) is
*useless*.

            regards, tom lane

On Jan 3, 2011, at 11:42 AM, Tom Lane wrote:

> That's either incorrect or poorly worded.

Let's go with poorly worded then. (Actually I don't do this, I use streaming replication, where I don't have to copy
theWAL files because the standby picks up the ones that are needed.) 

Unless I misread his earlier post, he is starting the backup, copying files, stopping the backup, and expecting those
transactions(between start & stop) to be applied, even though files changed during that time (including WAL files) have
notall been copied. Again, unless I misread his earlier post, he is *not* copying WAL that was generated between start
&stop. 

--
Scott Ribe
scott_ribe@elevated-dev.com
http://www.elevated-dev.com/
(303) 722-0567 voice





Scott Ribe <scott_ribe@elevated-dev.com> writes:
> Unless I misread his earlier post, he is starting the backup, copying files, stopping the backup, and expecting those
transactions(between start & stop) to be applied, even though files changed during that time (including WAL files) have
notall been copied. Again, unless I misread his earlier post, he is *not* copying WAL that was generated between start
&stop. 

If that's the case, it just plain won't work.  You can either shut down
the postmaster and copy a static set of files, or you can apply WAL to a
base backup that was taken from a live database.  There is no way to get
consistent results from a backup of a live database without applying all
the WAL that was generated while the backup was being taken.

            regards, tom lane

Hi!

On 01/03/11 20:56, Tom Lane wrote:
> Scott Ribe<scott_ribe@elevated-dev.com>  writes:
>
>> Unless I misread his earlier post, he is starting the backup, copying files, stopping the backup, and expecting
thosetransactions (between start&  stop) to be applied, even though files changed during that time (including WAL
files)have not all been copied. Again, unless I misread his earlier post, he is *not* copying WAL that was generated
betweenstart&  stop. 
>>
> If that's the case, it just plain won't work.  You can either shut down
> the postmaster and copy a static set of files, or you can apply WAL to a
> base backup that was taken from a live database.  There is no way to get
> consistent results from a backup of a live database without applying all
> the WAL that was generated while the backup was being taken.
>
>             regards, tom lane
>

First of all, thanks for your time dealing with my situation. I must
stress that i have wal arhived starting from before issuing
pg_start_backup, wal archives generated before pg_start/stop_backup and
also some wal archive files generated after pg_stop_backup (and
permissions are ok). But for some reason my replay starts exactly where
.backup file says 'stop wal location', referring to my earlier post

STOP WAL LOCATION: 1/6325B2E0 (file 000000010000000100000063)

I am starting to think more in the direction what Kevin Grittner
suggested that i miss something in my procedure and instead following so
to say recovery path i am doing something else which strangely ends up
with working database process but actually misses some data. I really
tried to write down all the steps i take in my first mail, maybe you
could say something based on that or i could do some additional checking
(with pg_controldata or the like) before starting database in wal-replay
mode. In the meantime a try to read thru manual once more and try out
streaming replication, hope it advances my general understanding of the
process and helps in solving the current problem.


Best regards,

Imre


Imre Oolberg <imre@auul.pri.ee> writes:
> First of all, thanks for your time dealing with my situation. I must
> stress that i have wal arhived starting from before issuing
> pg_start_backup, wal archives generated before pg_start/stop_backup and
> also some wal archive files generated after pg_stop_backup (and
> permissions are ok). But for some reason my replay starts exactly where
> .backup file says 'stop wal location', referring to my earlier post

> STOP WAL LOCATION: 1/6325B2E0 (file 000000010000000100000063)

> I am starting to think more in the direction what Kevin Grittner
> suggested that i miss something in my procedure and instead following so
> to say recovery path i am doing something else which strangely ends up
> with working database process but actually misses some data.

I wonder whether you're failing to copy the backup_label file as part of
the base backup.  The presence of that file is what tells the slave
postmaster where it has to start recovering from.

            regards, tom lane

Re: Re: pg v. 8.4.5 misses objects and data after restoring from backup using wal

From
"Kevin Grittner"
Date:
Imre Oolberg <imre@auul.pri.ee> wrote:

> I am starting to think more in the direction what Kevin Grittner
> suggested that i miss something in my procedure and instead
> following so to say recovery path i am doing something else which
> strangely ends up with working database process but actually
> misses some data.

Did you keep the pg_xlog/archive_status directory and its *.backup
files?

It sounds like maybe it's not finding the recovery.conf file.  Are
the permissions right?  Do you have spaces or other special
characters in the directory path to your cluster data directory?  (I
notice you didn't use quoting around the path.)

Did you delete (or avoid copying) the postmaster.pid file?

Making sure to start with an empty log file and/or directory
(depending on how you log), what shows up in the log on your
recovery attempt?

-Kevin

Hi!

On 01/03/11 22:24, Tom Lane wrote:
I wonder whether you're failing to copy the backup_label file as part of
the base backup.  The presence of that file is what tells the slave
postmaster where it has to start recovering from.
		regards, tom lane 

Thank you so much for your attention, i really need to learn reading postgresql.org manuals, probably already a long time ago i learned from somewhere that i need to delete backup_label file before starting up wal-recoverying database and i did, as presented in my first mail

Restore went like this

1. mounted /mnt/backup/ under /var/lib/postgresql
2. removed backup_label
3. created recovery.conf with a line

restore_command = 'cp /data/backup/postgresql/archive-logs/%f %p'

4. emptied pg_xlog directory while having directory itself
5. started postgresql and it finishes accepting connections

Now everything works, in the process backup_label gets renamed backup_label.old, thanks again for your attention! Still lucky it was mistake in the recovery procedure having my backups made so far still usable :)


Imre