Re: FW: Setting up of PITR system. - Mailing list pgsql-admin

From Rajesh Kumar Mallah
Subject Re: FW: Setting up of PITR system.
Date
Msg-id a97c77030604081010p70ef330au57dc838c40127ca0@mail.gmail.com
In response to Re: FW: Setting up of PITR system.  (Grega Bremec <gregab@p0f.net>)
Responses Re: FW: Setting up of PITR system.
List pgsql-admin
> Rajesh Kumar Mallah wrote:
> |>| Do you see any problem with the current approach?
> |>| I have seen it working fine until now.
> |>
> |>I do, to be honest. The WAL location counter accounts for 4294967295
> |>positions and while I'm certain that's WAY more than the average number
> |>of transactions that go into a WAL, quite a number of small ones can
> |>certainly happen before a WAL is rolled over, and until then, you're
> |>dealing with the same log file.
> |>
> |>If two backups happen in that period of time for whatever reason, you're
> |>going to have a false positive by looking into ${WAL_ARCHIVE} and
> |>searching just for the WAL name, so including the location in the search
> |>of a WAL fragment is certainly necessary. In fact, going purely by
> |>chance, the probability of hitting the same location in two different
> |>log files in two subsequent backups is much lower than hitting the same
> |>WAL twice.
> |
> | The current wal log is not being removed from the wal archive area
> | in any case. The files less than the current ones are being rm'ed.
> |
> | I am sorry, I am not able to follow your apprehension. But I shall
> | surely try harder to understand your point.
>
> Hi Rajesh, list.
>
> I'm sorry I didn't get back to you earlier, I was at an IBM business
> conference for a couple of days; not to say it rendered me incapable of
> communicating via e-mail, but it did bring along certain social
> responsibilities which caused me to both stay up and sleep late, if you
> know what I mean. :)
>
> Let me explain the above predicament in more practical terms.
>
> Let us say you started a backup very soon after a WAL had been rolled
> over. Current WAL at that time was called, for example,
> ${PGDATA}/pg_xlog/000000010000000E0000000A. The location at that time
> was 000F594A (iow, early in the WAL cycle). [disclaimer: all events in
> this story are entirely fictional, any similarity to actual persons and
> events is purely coincidental :) ]
>
> pg_start_backup() will create a WAL backup:
>   ${PGDATA}/pg_xlog/000000010000000E0000000A.000F594A.backup
>
> which will be archived to ${WAL_ARCHIVE} under the same name, or
> possibly given a different extension, depending on archive_command. Let
> us assume for the purpose of this explanation, that archive_command
> consists only of cp -i </dev/null, although the problem would have been
> identical if one used gzip -c, for example.
>
> Now, this backup fails for whatever reason (rsync trouble, etc.). You
> abort it and leave WAL archive as it was. You diagnose the problem that
> caused the backup to fail and repeat the procedure. And since your
> diagnostic skills are so good it took you almost no time to fix it, the
> database engine is now at location 002D94AF in that _same_ WAL.
>
> Once you restart the backup script, pg_start_backup() is called and
> ${PGDATA}/pg_xlog/000000010000000E0000000A.002D94AF.backup is created
> and archived to ${WAL_ARCHIVE} under that same name.
>
> Your method of discovering logs to delete will now match _two_ "current"
> log file archives instead of one, because they both come from the same
> WAL, fail to actually delete the stale one (the one from position
> 000F594A) and thus clutter your backup with irrelevant WAL fragments.
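
To make the false positive concrete, here is a minimal Python sketch using the two fictional history-file names from the story above. The archive listing is a hard-coded list and match_by_wal_name is a hypothetical helper, not a function from the backup script discussed in this thread.

```python
from fnmatch import fnmatch

# Fictional archive contents, taken from the example above.
archive = [
    "000000010000000E0000000A.000F594A.backup",  # aborted first attempt
    "000000010000000E0000000A.002D94AF.backup",  # restarted backup
]

def match_by_wal_name(files, wal_name):
    """Return every backup history file that belongs to the given WAL segment."""
    return [f for f in files if fnmatch(f, wal_name + ".*.backup")]

hits = match_by_wal_name(archive, "000000010000000E0000000A")
print(len(hits))  # 2: the name alone cannot tell the two attempts apart
```

Both attempts share the WAL name, so a search keyed on the name alone returns two "current" candidates.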

Dear Grega,
Thanks for the reply.

Now I have started to understand!

Is cluttering of the WAL archive area, in cases where the backup
had to be restarted for whatever reason, the *only* concern?

If so, we should not be too bothered, because the next
successful backup will delete the extra clutter.

If there are other concerns, please let me know.


>
> The second part of the second paragraph was only to expose that,
> following the same logic as outlined above, if you take WAL locations as
> the criterion of removing stale WAL fragments instead of WAL names, it
> is far less likely to hit a false positive, because you would have to
> pg_start_backup() _exactly_ 4294967296 locations after the first one.
>
> Of course, you want to be unambiguous in your search of the perfect WAL
> archive, so you want to use _both_ WAL name and location as the criterion.
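
One possible way to apply both criteria is sketched below in Python. It assumes history files are named <24 hex digits>.<8 hex digits>.backup, as in the fictional example earlier in this message, and that every history file other than the one for the backup just started is stale. The function names are hypothetical and do not come from the script under review.

```python
import re

# Assumed layout: <24 hex digits WAL name>.<8 hex digits location>.backup
HISTORY_RE = re.compile(r"^([0-9A-F]{24})\.([0-9A-F]{8})\.backup$")

def parse_history_name(filename):
    """Split a backup history file name into (wal_name, location), or None."""
    m = HISTORY_RE.match(filename)
    return (m.group(1), m.group(2)) if m else None

def stale_history_files(files, current_wal, current_location):
    """Return history files that differ from the current WAL name or location."""
    current = (current_wal, current_location)
    stale = []
    for name in files:
        key = parse_history_name(name)
        # Skip anything that is not a history file (e.g. plain WAL segments).
        if key is not None and key != current:
            stale.append(name)
    return stale

archive = [
    "000000010000000E0000000A",                  # WAL segment, ignored here
    "000000010000000E0000000A.000F594A.backup",  # aborted first attempt
    "000000010000000E0000000A.002D94AF.backup",  # restarted backup
]
print(stale_history_files(archive, "000000010000000E0000000A", "002D94AF"))
# ['000000010000000E0000000A.000F594A.backup']
```

Comparing the (name, location) pair as a whole leaves only the history file of the restarted backup in place, even though both files come from the same WAL.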
>
> | The old log files without the base backup are not useful. Since
> | rsync is being used to optimise the copying by overwriting the
> | base backup every time, I don't think preserving the old files
> | makes sense. Had it been a non-overwriting backup, the files
> | would have made sense.
>
> I see. I was assuming you used rsync to copy the database cluster
> somewhere then tar it there, while it was lying still ("Fell, Destroyed"
> of Fugazi comes to mind :) ).
>
> I will get back to you with the review of your script later. A quick
> scan reveals there is not much left to be improved, though.


Please do not put in too much effort, as the drives in my other
server have been installed and I am adapting the script to do
remote backups (which is a more common scenario).

Thank You
Regds
Rajesh Kumar Mallah.

>
