Thread: Re: [GENERAL] Point in Time Recovery WAS: Hot Backup

Re: [GENERAL] Point in Time Recovery WAS: Hot Backup

From

"Sandeep Chadha"

Date:

09 October 2002, 12:46:11

I'd have agree on most of what you said. I still think most crashes occur due to data corruption which can only be
recoveredby using a good backup.  

Anyways my problem is I have a 5 gig database. I run a cron job every hour which runs pg_dump which takes over 30
minutesto run and degrades the db performance. I was hoping for something which can solve my problem and then I don't
haveto take backup every hour. Is there a plan on implementing incremental backup technique for pg_dump or Is it going
tobe same for next one or two releases. 

Thanks much For you time
Sandeep.

-----Original Message-----
From: scott.marlowe [mailto:scott.marlowe@ihs.com]
Sent: Wednesday, October 09, 2002 12:19 PM
To: Sandeep Chadha
Cc: Tom Lane; pgsql-hackers@postgresql.org; pgsql-general
Subject: [GENERAL] Point in Time Recovery WAS: Hot Backup

Hi Sandeep.  What you were calling Hot Backup is really called Point in
Time Recovery (PITR).  Hot Backup means making a complete backup of the
database while it is running, something Postgresql has supported for a
very long time.

On Mon, 7 Oct 2002, Sandeep Chadha wrote:

> Hello to all the Doers of Postgres!!!
>
> Last time I went through forums, people spoke highly about 7.3 and its
> capability to do hot backups. My problem is if the database goes down
> and I lose my main data store, then I will lose all transactions back
> to the time I did the pg_dump.

Let's make it clear that this kind of failure is EXTREMELY rare on real
database servers since they almost ALL run their data sets on RAID arrays.
While it is possible to lost >1 drive at the same time and all your
database, it is probably more likely to have a bad memory chip corrupt
your data silently, or a bad query delete data it shouldn't.

That said, there IS work ongoing to provide this facility for Postgresql,
but I would much rather have work done on making large complex queries run
faster, or fix the little issues with foreign keys cause deadlocks.

> Other databases (i e Oracle) solves this by retaining their archive
> logs in some physically separate storage. So, when you lose your data,
> you can restore the data from back-up, and then apply your archive log,
> and avoid losing any committed transactions.
>
> > Postgresql has been lacking this all along. I've installed postgres
> 7.3b2 and still don't see any archive's flushed to any other place.
> Please let me know how is hot backup procedure implemented in current
> 7.3 beta(2) release.

Again, you'll get better response to your questions if you call it "point
in time recovery" or pitr.  Hot backup is the wrong word, and something
Postgresql DOES have.

It also supports WALs, which stands for Write ahead logs.  These files
store what the database is about to do before it does it.  Should the
database crash with transactions pending, the server will come back up and
process the pending transactions that are in the WAL files, ensuring the
integrity of your database.

Point in Time recovery is very nice, but it's the last step in many to
ensure a stable, coherent database, and will probably be in 7.4 or
somewhere around there.  If you're running in a RAID array, then the loss
of your datastore should be a very remote possibility.

Re: [GENERAL] Point in Time Recovery WAS: Hot Backup

From

Rod Taylor

Date:

09 October 2002, 13:11:38

On Wed, 2002-10-09 at 12:46, Sandeep Chadha wrote:
> I'd have agree on most of what you said. I still think most crashes occur due to data corruption which can only be
recoveredby using a good backup. 
 
> 
> Anyways my problem is I have a 5 gig database. I run a cron job every hour which runs pg_dump which takes over 30
minutesto run and degrades the db performance. I was hoping for something which can solve my problem and then I don't
haveto take backup every hour. Is there a plan on implementing incremental backup technique for pg_dump or Is it going
tobe same for next one or two releases.
 
> 
> Thanks much For you time 

Oh, if thats your problem then use asynchronous replication instead. It
doesn't remove the slow time, but will distribute the slowness across
every transaction rather than all at once (via creation of replication
logs).  Things won't degrade much during the snapshot transfer itself,
as there isn't very much work involved (differences only).

Now periodically backup the secondary box.  Needs diskspace, but not
very much power otherwise.

#!/bin/sh
while(true)
doasynchreplicate.shpg_dumpall > "`date`.bak"
done



> -----Original Message-----
> From: scott.marlowe [mailto:scott.marlowe@ihs.com]
> Sent: Wednesday, October 09, 2002 12:19 PM
> To: Sandeep Chadha
> Cc: Tom Lane; pgsql-hackers@postgresql.org; pgsql-general
> Subject: [GENERAL] Point in Time Recovery WAS: Hot Backup 
> 
> 
> Hi Sandeep.  What you were calling Hot Backup is really called Point in 
> Time Recovery (PITR).  Hot Backup means making a complete backup of the 
> database while it is running, something Postgresql has supported for a 
> very long time.
> 
> On Mon, 7 Oct 2002, Sandeep Chadha wrote:
> 
> > Hello to all the Doers of Postgres!!!
> > 
> > Last time I went through forums, people spoke highly about 7.3 and its 
> > capability to do hot backups. My problem is if the database goes down 
> > and I lose my main data store, then I will lose all transactions back 
> > to the time I did the pg_dump.
> 
> Let's make it clear that this kind of failure is EXTREMELY rare on real 
> database servers since they almost ALL run their data sets on RAID arrays.  
> While it is possible to lost >1 drive at the same time and all your 
> database, it is probably more likely to have a bad memory chip corrupt 
> your data silently, or a bad query delete data it shouldn't.
> 
> That said, there IS work ongoing to provide this facility for Postgresql, 
> but I would much rather have work done on making large complex queries run 
> faster, or fix the little issues with foreign keys cause deadlocks.
> 
> > Other databases (i e Oracle) solves this by retaining their archive 
> > logs in some physically separate storage. So, when you lose your data, 
> > you can restore the data from back-up, and then apply your archive log, 
> > and avoid losing any committed transactions. 
> > 
> > > Postgresql has been lacking this all along. I've installed postgres 
> > 7.3b2 and still don't see any archive's flushed to any other place. 
> > Please let me know how is hot backup procedure implemented in current 
> > 7.3 beta(2) release.
> 
> Again, you'll get better response to your questions if you call it "point 
> in time recovery" or pitr.  Hot backup is the wrong word, and something 
> Postgresql DOES have.
> 
> It also supports WALs, which stands for Write ahead logs.  These files 
> store what the database is about to do before it does it.  Should the 
> database crash with transactions pending, the server will come back up and 
> process the pending transactions that are in the WAL files, ensuring the 
> integrity of your database.
> 
> Point in Time recovery is very nice, but it's the last step in many to 
> ensure a stable, coherent database, and will probably be in 7.4 or 
> somewhere around there.  If you're running in a RAID array, then the loss 
> of your datastore should be a very remote possibility.
> 
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
> subscribe-nomail command to majordomo@postgresql.org so that your
> message can get through to the mailing list cleanly
> 
--  Rod Taylor

Re: [GENERAL] Point in Time Recovery WAS: Hot Backup

From

Justin Clift

Date:

09 October 2002, 14:04:53

Rod Taylor wrote:
> 
<snip>
> Oh, if thats your problem then use asynchronous replication instead.

For specific info, the contrib/rserv package does master->slave
asynchronous replication as Rod is suggesting.  From memory it was
having troubles working with PostgreSQL 7.2.x, but someone recently
submitted patches that make it work.

There's a HOW-TO guide that a community member wrote on setting up rserv
with PostgreSQL 7.0.3, although it should be practically identical for
PostgreSQL 7.2.x (when rserv is patched to make it work).

http://techdocs.postgresql.org/techdocs/settinguprserv.php

That could be the basis for your async replication solution.

Hope that helps.

:-)

Regards and best wishes,

Justin Clift


> It doesn't remove the slow time, but will distribute the slowness across
> every transaction rather than all at once (via creation of replication
> logs).  Things won't degrade much during the snapshot transfer itself,
> as there isn't very much work involved (differences only).
> 
> Now periodically backup the secondary box.  Needs diskspace, but not
> very much power otherwise.
> 
> #!/bin/sh
> while(true)
> do
>         asynchreplicate.sh
>         pg_dumpall > "`date`.bak"
> done
> 
> > -----Original Message-----
> > From: scott.marlowe [mailto:scott.marlowe@ihs.com]
> > Sent: Wednesday, October 09, 2002 12:19 PM
> > To: Sandeep Chadha
> > Cc: Tom Lane; pgsql-hackers@postgresql.org; pgsql-general
> > Subject: [GENERAL] Point in Time Recovery WAS: Hot Backup
> >
> >
> > Hi Sandeep.  What you were calling Hot Backup is really called Point in
> > Time Recovery (PITR).  Hot Backup means making a complete backup of the
> > database while it is running, something Postgresql has supported for a
> > very long time.
> >
> > On Mon, 7 Oct 2002, Sandeep Chadha wrote:
> >
> > > Hello to all the Doers of Postgres!!!
> > >
> > > Last time I went through forums, people spoke highly about 7.3 and its
> > > capability to do hot backups. My problem is if the database goes down
> > > and I lose my main data store, then I will lose all transactions back
> > > to the time I did the pg_dump.
> >
> > Let's make it clear that this kind of failure is EXTREMELY rare on real
> > database servers since they almost ALL run their data sets on RAID arrays.
> > While it is possible to lost >1 drive at the same time and all your
> > database, it is probably more likely to have a bad memory chip corrupt
> > your data silently, or a bad query delete data it shouldn't.
> >
> > That said, there IS work ongoing to provide this facility for Postgresql,
> > but I would much rather have work done on making large complex queries run
> > faster, or fix the little issues with foreign keys cause deadlocks.
> >
> > > Other databases (i e Oracle) solves this by retaining their archive
> > > logs in some physically separate storage. So, when you lose your data,
> > > you can restore the data from back-up, and then apply your archive log,
> > > and avoid losing any committed transactions.
> > >
> > > > Postgresql has been lacking this all along. I've installed postgres
> > > 7.3b2 and still don't see any archive's flushed to any other place.
> > > Please let me know how is hot backup procedure implemented in current
> > > 7.3 beta(2) release.
> >
> > Again, you'll get better response to your questions if you call it "point
> > in time recovery" or pitr.  Hot backup is the wrong word, and something
> > Postgresql DOES have.
> >
> > It also supports WALs, which stands for Write ahead logs.  These files
> > store what the database is about to do before it does it.  Should the
> > database crash with transactions pending, the server will come back up and
> > process the pending transactions that are in the WAL files, ensuring the
> > integrity of your database.
> >
> > Point in Time recovery is very nice, but it's the last step in many to
> > ensure a stable, coherent database, and will probably be in 7.4 or
> > somewhere around there.  If you're running in a RAID array, then the loss
> > of your datastore should be a very remote possibility.
> >
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 3: if posting/reading through Usenet, please send an appropriate
> > subscribe-nomail command to majordomo@postgresql.org so that your
> > message can get through to the mailing list cleanly
> >
> --
>   Rod Taylor
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
> subscribe-nomail command to majordomo@postgresql.org so that your
> message can get through to the mailing list cleanly

-- 
"My grandfather once told me that there are two kinds of people: those
who work and those who take the credit. He told me to try to be in the
first group; there was less competition there."  - Indira Gandhi

Re: [GENERAL] Point in Time Recovery WAS: Hot Backup

From

Rod Taylor

Date:

09 October 2002, 14:13:16

On Wed, 2002-10-09 at 14:04, Justin Clift wrote:
> Rod Taylor wrote:
> > 
> <snip>
> > Oh, if thats your problem then use asynchronous replication instead.
> 
> For specific info, the contrib/rserv package does master->slave

Thanks.  I was having a heck of a time remembering what it was called or
even where the DBA found it.

--  Rod Taylor