Thread: Re: [GENERAL] Point in Time Recovery WAS: Hot Backup
I'd have agree on most of what you said. I still think most crashes occur due to data corruption which can only be recoveredby using a good backup. Anyways my problem is I have a 5 gig database. I run a cron job every hour which runs pg_dump which takes over 30 minutesto run and degrades the db performance. I was hoping for something which can solve my problem and then I don't haveto take backup every hour. Is there a plan on implementing incremental backup technique for pg_dump or Is it going tobe same for next one or two releases. Thanks much For you time Sandeep. -----Original Message----- From: scott.marlowe [mailto:scott.marlowe@ihs.com] Sent: Wednesday, October 09, 2002 12:19 PM To: Sandeep Chadha Cc: Tom Lane; pgsql-hackers@postgresql.org; pgsql-general Subject: [GENERAL] Point in Time Recovery WAS: Hot Backup Hi Sandeep. What you were calling Hot Backup is really called Point in Time Recovery (PITR). Hot Backup means making a complete backup of the database while it is running, something Postgresql has supported for a very long time. On Mon, 7 Oct 2002, Sandeep Chadha wrote: > Hello to all the Doers of Postgres!!! > > Last time I went through forums, people spoke highly about 7.3 and its > capability to do hot backups. My problem is if the database goes down > and I lose my main data store, then I will lose all transactions back > to the time I did the pg_dump. Let's make it clear that this kind of failure is EXTREMELY rare on real database servers since they almost ALL run their data sets on RAID arrays. While it is possible to lost >1 drive at the same time and all your database, it is probably more likely to have a bad memory chip corrupt your data silently, or a bad query delete data it shouldn't. That said, there IS work ongoing to provide this facility for Postgresql, but I would much rather have work done on making large complex queries run faster, or fix the little issues with foreign keys cause deadlocks. > Other databases (i e Oracle) solves this by retaining their archive > logs in some physically separate storage. So, when you lose your data, > you can restore the data from back-up, and then apply your archive log, > and avoid losing any committed transactions. > > > Postgresql has been lacking this all along. I've installed postgres > 7.3b2 and still don't see any archive's flushed to any other place. > Please let me know how is hot backup procedure implemented in current > 7.3 beta(2) release. Again, you'll get better response to your questions if you call it "point in time recovery" or pitr. Hot backup is the wrong word, and something Postgresql DOES have. It also supports WALs, which stands for Write ahead logs. These files store what the database is about to do before it does it. Should the database crash with transactions pending, the server will come back up and process the pending transactions that are in the WAL files, ensuring the integrity of your database. Point in Time recovery is very nice, but it's the last step in many to ensure a stable, coherent database, and will probably be in 7.4 or somewhere around there. If you're running in a RAID array, then the loss of your datastore should be a very remote possibility.
On Wed, 2002-10-09 at 12:46, Sandeep Chadha wrote: > I'd have agree on most of what you said. I still think most crashes occur due to data corruption which can only be recoveredby using a good backup. > > Anyways my problem is I have a 5 gig database. I run a cron job every hour which runs pg_dump which takes over 30 minutesto run and degrades the db performance. I was hoping for something which can solve my problem and then I don't haveto take backup every hour. Is there a plan on implementing incremental backup technique for pg_dump or Is it going tobe same for next one or two releases. > > Thanks much For you time Oh, if thats your problem then use asynchronous replication instead. It doesn't remove the slow time, but will distribute the slowness across every transaction rather than all at once (via creation of replication logs). Things won't degrade much during the snapshot transfer itself, as there isn't very much work involved (differences only). Now periodically backup the secondary box. Needs diskspace, but not very much power otherwise. #!/bin/sh while(true) doasynchreplicate.shpg_dumpall > "`date`.bak" done > -----Original Message----- > From: scott.marlowe [mailto:scott.marlowe@ihs.com] > Sent: Wednesday, October 09, 2002 12:19 PM > To: Sandeep Chadha > Cc: Tom Lane; pgsql-hackers@postgresql.org; pgsql-general > Subject: [GENERAL] Point in Time Recovery WAS: Hot Backup > > > Hi Sandeep. What you were calling Hot Backup is really called Point in > Time Recovery (PITR). Hot Backup means making a complete backup of the > database while it is running, something Postgresql has supported for a > very long time. > > On Mon, 7 Oct 2002, Sandeep Chadha wrote: > > > Hello to all the Doers of Postgres!!! > > > > Last time I went through forums, people spoke highly about 7.3 and its > > capability to do hot backups. My problem is if the database goes down > > and I lose my main data store, then I will lose all transactions back > > to the time I did the pg_dump. > > Let's make it clear that this kind of failure is EXTREMELY rare on real > database servers since they almost ALL run their data sets on RAID arrays. > While it is possible to lost >1 drive at the same time and all your > database, it is probably more likely to have a bad memory chip corrupt > your data silently, or a bad query delete data it shouldn't. > > That said, there IS work ongoing to provide this facility for Postgresql, > but I would much rather have work done on making large complex queries run > faster, or fix the little issues with foreign keys cause deadlocks. > > > Other databases (i e Oracle) solves this by retaining their archive > > logs in some physically separate storage. So, when you lose your data, > > you can restore the data from back-up, and then apply your archive log, > > and avoid losing any committed transactions. > > > > > Postgresql has been lacking this all along. I've installed postgres > > 7.3b2 and still don't see any archive's flushed to any other place. > > Please let me know how is hot backup procedure implemented in current > > 7.3 beta(2) release. > > Again, you'll get better response to your questions if you call it "point > in time recovery" or pitr. Hot backup is the wrong word, and something > Postgresql DOES have. > > It also supports WALs, which stands for Write ahead logs. These files > store what the database is about to do before it does it. Should the > database crash with transactions pending, the server will come back up and > process the pending transactions that are in the WAL files, ensuring the > integrity of your database. > > Point in Time recovery is very nice, but it's the last step in many to > ensure a stable, coherent database, and will probably be in 7.4 or > somewhere around there. If you're running in a RAID array, then the loss > of your datastore should be a very remote possibility. > > > ---------------------------(end of broadcast)--------------------------- > TIP 3: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to majordomo@postgresql.org so that your > message can get through to the mailing list cleanly > -- Rod Taylor
Rod Taylor wrote: > <snip> > Oh, if thats your problem then use asynchronous replication instead. For specific info, the contrib/rserv package does master->slave asynchronous replication as Rod is suggesting. From memory it was having troubles working with PostgreSQL 7.2.x, but someone recently submitted patches that make it work. There's a HOW-TO guide that a community member wrote on setting up rserv with PostgreSQL 7.0.3, although it should be practically identical for PostgreSQL 7.2.x (when rserv is patched to make it work). http://techdocs.postgresql.org/techdocs/settinguprserv.php That could be the basis for your async replication solution. Hope that helps. :-) Regards and best wishes, Justin Clift > It doesn't remove the slow time, but will distribute the slowness across > every transaction rather than all at once (via creation of replication > logs). Things won't degrade much during the snapshot transfer itself, > as there isn't very much work involved (differences only). > > Now periodically backup the secondary box. Needs diskspace, but not > very much power otherwise. > > #!/bin/sh > while(true) > do > asynchreplicate.sh > pg_dumpall > "`date`.bak" > done > > > -----Original Message----- > > From: scott.marlowe [mailto:scott.marlowe@ihs.com] > > Sent: Wednesday, October 09, 2002 12:19 PM > > To: Sandeep Chadha > > Cc: Tom Lane; pgsql-hackers@postgresql.org; pgsql-general > > Subject: [GENERAL] Point in Time Recovery WAS: Hot Backup > > > > > > Hi Sandeep. What you were calling Hot Backup is really called Point in > > Time Recovery (PITR). Hot Backup means making a complete backup of the > > database while it is running, something Postgresql has supported for a > > very long time. > > > > On Mon, 7 Oct 2002, Sandeep Chadha wrote: > > > > > Hello to all the Doers of Postgres!!! > > > > > > Last time I went through forums, people spoke highly about 7.3 and its > > > capability to do hot backups. My problem is if the database goes down > > > and I lose my main data store, then I will lose all transactions back > > > to the time I did the pg_dump. > > > > Let's make it clear that this kind of failure is EXTREMELY rare on real > > database servers since they almost ALL run their data sets on RAID arrays. > > While it is possible to lost >1 drive at the same time and all your > > database, it is probably more likely to have a bad memory chip corrupt > > your data silently, or a bad query delete data it shouldn't. > > > > That said, there IS work ongoing to provide this facility for Postgresql, > > but I would much rather have work done on making large complex queries run > > faster, or fix the little issues with foreign keys cause deadlocks. > > > > > Other databases (i e Oracle) solves this by retaining their archive > > > logs in some physically separate storage. So, when you lose your data, > > > you can restore the data from back-up, and then apply your archive log, > > > and avoid losing any committed transactions. > > > > > > > Postgresql has been lacking this all along. I've installed postgres > > > 7.3b2 and still don't see any archive's flushed to any other place. > > > Please let me know how is hot backup procedure implemented in current > > > 7.3 beta(2) release. > > > > Again, you'll get better response to your questions if you call it "point > > in time recovery" or pitr. Hot backup is the wrong word, and something > > Postgresql DOES have. > > > > It also supports WALs, which stands for Write ahead logs. These files > > store what the database is about to do before it does it. Should the > > database crash with transactions pending, the server will come back up and > > process the pending transactions that are in the WAL files, ensuring the > > integrity of your database. > > > > Point in Time recovery is very nice, but it's the last step in many to > > ensure a stable, coherent database, and will probably be in 7.4 or > > somewhere around there. If you're running in a RAID array, then the loss > > of your datastore should be a very remote possibility. > > > > > > ---------------------------(end of broadcast)--------------------------- > > TIP 3: if posting/reading through Usenet, please send an appropriate > > subscribe-nomail command to majordomo@postgresql.org so that your > > message can get through to the mailing list cleanly > > > -- > Rod Taylor > > ---------------------------(end of broadcast)--------------------------- > TIP 3: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to majordomo@postgresql.org so that your > message can get through to the mailing list cleanly -- "My grandfather once told me that there are two kinds of people: those who work and those who take the credit. He told me to try to be in the first group; there was less competition there." - Indira Gandhi
On Wed, 2002-10-09 at 14:04, Justin Clift wrote: > Rod Taylor wrote: > > > <snip> > > Oh, if thats your problem then use asynchronous replication instead. > > For specific info, the contrib/rserv package does master->slave Thanks. I was having a heck of a time remembering what it was called or even where the DBA found it. -- Rod Taylor