Thread: backup database by cloning itself

backup database by cloning itself

From
"filippo"
Date:
Hello,

my database is not very big so I want to adopt this backup strategy:

I want to clone my database every 1 hour  to another
database 'currenttime_mydatabase' in order to have 24 backup a day,
overwriting the yesterday backups by today-same-time backups.

This is good for me because I have all the backups readily available
to be read by my program (opening the backup read only). This is a
very important for my needs.

I'm writing a script run by cron each hour to do accomplish the backup
task.

My target is to have the backup operation not affecting the users, so
I want to be able to copy a database even if the database is used by
someone.

Can I use
CREATE DATABASE my_backup_database TEMPLATE current_database?

 Is there a better way to get what I need?

Thanks,
Filippo


Re: backup database by cloning itself

From
Ron Johnson
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 02/13/07 07:54, filippo wrote:
> Hello,
>
> my database is not very big so I want to adopt this backup strategy:
>
> I want to clone my database every 1 hour  to another
> database 'currenttime_mydatabase' in order to have 24 backup a day,
> overwriting the yesterday backups by today-same-time backups.
>
> This is good for me because I have all the backups readily available
> to be read by my program (opening the backup read only). This is a
> very important for my needs.

There's almost definitely a better way to do what you want to do.

What benefit are you trying to obtain by creating 720 almost
identical databases per month?

> I'm writing a script run by cron each hour to do accomplish the backup
> task.
>
> My target is to have the backup operation not affecting the users, so
> I want to be able to copy a database even if the database is used by
> someone.
>
> Can I use
> CREATE DATABASE my_backup_database TEMPLATE current_database?
>
>  Is there a better way to get what I need?

Have you tried pg_dump?

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFF0rtOS9HxQb37XmcRAtUxAKDWLK7x3uDGxwni47Y+o1yJsHXOzACg4XYu
ik9TtDFb6DJ+uZllXxahSMs=
=zcis
-----END PGP SIGNATURE-----

Re: backup database by cloning itself

From
Scott Marlowe
Date:
On Tue, 2007-02-13 at 07:54, filippo wrote:
> Hello,
>
> my database is not very big so I want to adopt this backup strategy:
>
> I want to clone my database every 1 hour  to another
> database 'currenttime_mydatabase' in order to have 24 backup a day,
> overwriting the yesterday backups by today-same-time backups.
>
> This is good for me because I have all the backups readily available
> to be read by my program (opening the backup read only). This is a
> very important for my needs.
>
> I'm writing a script run by cron each hour to do accomplish the backup
> task.
>
> My target is to have the backup operation not affecting the users, so
> I want to be able to copy a database even if the database is used by
> someone.
>
> Can I use
> CREATE DATABASE my_backup_database TEMPLATE current_database?
>
>  Is there a better way to get what I need?

Create database ain't gonna work, cause it needs a database with no
users connected.  You could do:

dropdb hour_13;
createdb hour_13
pg_dump masterdb | psql hour_13

with the number after hour being a var you set every hour when you run
it.

Re: backup database by cloning itself

From
Tom Lane
Date:
Scott Marlowe <smarlowe@g2switchworks.com> writes:
> On Tue, 2007-02-13 at 07:54, filippo wrote:
>> my database is not very big so I want to adopt this backup strategy:
>> I want to clone my database every 1 hour  to another
>> database 'currenttime_mydatabase' in order to have 24 backup a day,
>> overwriting the yesterday backups by today-same-time backups.
>> Can I use
>> CREATE DATABASE my_backup_database TEMPLATE current_database?

> Create database ain't gonna work, cause it needs a database with no
> users connected.

There's a more serious objection, which is that storing a duplicate
database under the same postmaster doesn't give you an independent copy.
If something bad happens to pg_clog or pg_global, *all* your backups may
be rendered useless.

Now if your purpose in making the backups is only to protect against
user errors, and not any sort of hardware failure or Postgres bug,
maybe this isn't an issue.  But it's not what I'd call a backup.

            regards, tom lane

Re: backup database by cloning itself

From
Ron Johnson
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 02/14/07 12:41, Tom Lane wrote:
> Scott Marlowe <smarlowe@g2switchworks.com> writes:
>> On Tue, 2007-02-13 at 07:54, filippo wrote:
>>> my database is not very big so I want to adopt this backup strategy:
>>> I want to clone my database every 1 hour  to another
>>> database 'currenttime_mydatabase' in order to have 24 backup a day,
>>> overwriting the yesterday backups by today-same-time backups.
>>> Can I use
>>> CREATE DATABASE my_backup_database TEMPLATE current_database?
>
>> Create database ain't gonna work, cause it needs a database with no
>> users connected.
>
> There's a more serious objection, which is that storing a duplicate
> database under the same postmaster doesn't give you an independent copy.
> If something bad happens to pg_clog or pg_global, *all* your backups may
> be rendered useless.
>
> Now if your purpose in making the backups is only to protect against
> user errors, and not any sort of hardware failure or Postgres bug,
> maybe this isn't an issue.  But it's not what I'd call a backup.

Maybe his real goal "all the backups readily available to be read by
my program (opening the backup read only)" is to have a historical
record of what certain records looked like in the past.

There are other ways of doing that, though.


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFF01oIS9HxQb37XmcRAqYQAKDoSNb76asUadv9InNXroshleKZEQCgl6w6
SwWu3841RN4B+GBBkxoa/DQ=
=bdEY
-----END PGP SIGNATURE-----

Re: backup database by cloning itself

From
"Ted Byers"
Date:
> Maybe his real goal "all the backups readily available to be read by
> my program (opening the backup read only)" is to have a historical
> record of what certain records looked like in the past.
>
> There are other ways of doing that, though.
>

If your speculation is right, perhaps the OP ought to explain a little more
fully why he needs 24 snapshots a day, or indeed any at all.

It seems to me that if you really want a historical record of what certain
tables looked like in the past, it would be smarter and more accurate to
create triggers, for each possible operation, that store the relevant
details in an audit table including especially who made the edits and when.
This strikes me as being much less work than developing code that processes
so many backups.

Cheers

Ted



Re: backup database by cloning itself

From
Ron Johnson
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 02/14/07 13:40, Ted Byers wrote:
>> Maybe his real goal "all the backups readily available to be read by
>> my program (opening the backup read only)" is to have a historical
>> record of what certain records looked like in the past.
>>
>> There are other ways of doing that, though.
>>
>
> If your speculation is right, perhaps the OP ought to explain a little
> more fully why he needs 24 snapshots a day, or indeed any at all.
>
> It seems to me that if you really want a historical record of what
> certain tables looked like in the past, it would be smarter and more
> accurate to create triggers, for each possible operation, that store the
> relevant details in an audit table including especially who made the
> edits and when. This strikes me as being much less work than developing
> code that processes so many backups.

I dunno about that.  We use triggers to populate log tables that get
extracted, truncated then loaded into a reporting db every night.
Because of the night time batch cycle, there is no quiescent time to
do this, so we have 2 log tables, and the triggers alternate which
log table to insert into, depending on whether the day number is
even or odd.

That's in addition to the trigger logic to insert into history tables.

It's a royal pain.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFF03URS9HxQb37XmcRAq0RAJ4iAHtbst+Gq9QndTr36lErYUwSmgCg7dM/
luIRI+F9eqYqUoMz9VNNaNc=
=NYOz
-----END PGP SIGNATURE-----

Re: backup database by cloning itself

From
"Webb Sprague"
Date:
> >> Maybe his real goal "all the backups readily available to be read by
> >> my program (opening the backup read only)" is to have a historical
> >> record of what certain records looked like in the past.

What postgresql "time travel"?  I have never used it, and it looks a
little bit unmaintained, but it might be perfect with some tweaking:

contrib/spi/README.timetravel

I know that someone at SFPUG presented on using time travel last year,
and she may have more info .

Re: backup database by cloning itself

From
"Ted Byers"
Date:
----- Original Message -----
From: "Ron Johnson" <ron.l.johnson@cox.net>
To: <pgsql-general@postgresql.org>
Sent: Wednesday, February 14, 2007 3:46 PM
Subject: Re: [GENERAL] backup database by cloning itself


>>
>> It seems to me that if you really want a historical record of what
>> certain tables looked like in the past, it would be smarter and more
>> accurate to create triggers, for each possible operation, that store the
>> relevant details in an audit table including especially who made the
>> edits and when. This strikes me as being much less work than developing
>> code that processes so many backups.
>
> I dunno about that.  We use triggers to populate log tables that get
> extracted, truncated then loaded into a reporting db every night.
> Because of the night time batch cycle, there is no quiescent time to
> do this, so we have 2 log tables, and the triggers alternate which
> log table to insert into, depending on whether the day number is
> even or odd.
>
> That's in addition to the trigger logic to insert into history tables.
>
> It's a royal pain.

Sounds painful, but not as painful as trying to do the same thing with a
suite of backups and client code to read the backups.  Or maybe the latter
task is easier than it looks at first glance.

Doesn't the amount of pain depend on the amount of traffic your servers see,
the capability of the hardware you can devote to the task, what you're doing
with the history tables, &c.?  When I've worked on this kind of problem,
everything was in a single DB. The traffic, though, was low enough that for
daily reports a simple view of the log selecting only the previous day's
data, was sufficient, but only occasionally used.  The principal reason for
the log was to support accountability, to know who made changes, and when,
and who knew what when.  Historical reconstructions were required, e.g.,
only when something went awry and there was a need to know if a good
decision was made based on bad data or a bad decision was made based on good
data, or, e.g., during an audit of the business processes and decision
support systems.

Cheers,

Ted



Re: backup database by cloning itself

From
"Webb Sprague"
Date:
Here is the link to Elein's presentation:

http://www.varlena.com/GeneralBits/Tidbits/tt.pdf

> What [about] postgresql "time travel"?  I have never used it, and it looks a
> little bit unmaintained, but it might be perfect with some tweaking:

Re: backup database by cloning itself

From
"filippo"
Date:
On 13 Feb, 14:54, "filippo" <filippo2...@virgilio.it> wrote:

> My target is to have the backup operation not affecting the users, so
> I want to be able to copy a database even if the database is used by
> someone.

I could use pg_dump/pg_restore. pg_dump doesn't have to have exclusive
access to database to perform the operation. My only problem is that
pg_dump create a backup on a file, the best to me whould be to have a
perfect clone (users/ data etc) of original database ready to be used
just after the cloning. Is it possible?

Thanks,

Filippo


Re: backup database by cloning itself

From
"filippo"
Date:
On 14 Feb, 08:33, ron.l.john...@cox.net (Ron Johnson) wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> There's almost definitely a better way to do what you want to do.
>
> What benefit are you trying to obtain by creating 720 almost
> identical databases per month?

I only need the last 24, overwriting each day. In my case 8 are enough
(8am to 24pm, one every 2 hours). The important thing is to be able to
open each very simply (read only) without restore anything. Just tell
my application (written be me), to point the basckup database instead
of latest one.

> Have you tried pg_dump?

probably this is ok. pg_dump create a backup file, not a true clone
db. I want a exact copy (data and users and everithing) with a
different name (05-mydatabase 06-mydatabase ...)


Re: backup database by cloning itself

From
"Filip Rembiałkowski"
Date:
13 Feb 2007 05:54:44 -0800, filippo <filippo2991@virgilio.it>:
> Hello,
>
> my database is not very big so I want to adopt this backup strategy:
>
> I want to clone my database every 1 hour  to another
> database 'currenttime_mydatabase' in order to have 24 backup a day,
> overwriting the yesterday backups by today-same-time backups.
>
> This is good for me because I have all the backups readily available
> to be read by my program (opening the backup read only). This is a
> very important for my needs.
>
> I'm writing a script run by cron each hour to do accomplish the backup
> task.
>
> My target is to have the backup operation not affecting the users, so
> I want to be able to copy a database even if the database is used by
> someone.
>
> Can I use
> CREATE DATABASE my_backup_database TEMPLATE current_database?

no. database used as template must not be accessed during copy

>
>  Is there a better way to get what I need?

you can script this:

pg_dump sourcedb |  psql targetdb


--
Filip Rembiałkowski

Re: backup database by cloning itself

From
"Ted"
Date:
On Feb 14, 2:14 am, "filippo" <filippo2...@virgilio.it> wrote:
> On 13 Feb, 14:54, "filippo" <filippo2...@virgilio.it> wrote:
>
> > My target is to have the backup operation not affecting the users, so
> > I want to be able to copy a database even if the database is used by
> > someone.
>
> I could use pg_dump/pg_restore. pg_dump doesn't have to have exclusive
> access to database to perform the operation. My only problem is that
> pg_dump create a backup on a file, the best to me whould be to have a
> perfect clone (users/ data etc) of original database ready to be used
> just after the cloning. Is it possible?
>
> Thanks,
>
> Filippo

Well, I could see you writing a client application that creates a
clone by first recreating all the schemas in your database and then
copying the data to the clone, and probably quite a bit more,  In such
a case, since you have absolute control over your client code, you can
do anything you want.  I am not sure, though, that that is the best
use of your time and hardware resources, especially if all you're
after is a backup.  Just think of all the overhead involved in
creating a new clone, and everything that implies, every hour.

But why not further explore your backup options if all you're
concerned about is a reliable backup.  You may find "23.3. On-line
backup and point-in-time recovery (PITR)" in the postgresql
documentation useful.  You haven't given any information about why it
might not be appropriate in your situation.  If you're really doing
what it looks to me like you're doing, then you may be in the
beginning stages of reinventing Postgresql's PITR capability.

The builtin support for PITR in Postgresql strikes me as sufficient
for what you say you need.  If you require more, which would imply you
want more than the simple backup you say you're after, then defining a
suitable suite of triggers and audit tables may serve.  Neither should
adversely affect your users. especially if your "database is not very
big ".

HTH

Ted


Re: backup database by cloning itself

From
Alvaro Herrera
Date:
filippo wrote:
> On 14 Feb, 08:33, ron.l.john...@cox.net (Ron Johnson) wrote:
> > -----BEGIN PGP SIGNED MESSAGE-----
> > There's almost definitely a better way to do what you want to do.
> >
> > What benefit are you trying to obtain by creating 720 almost
> > identical databases per month?
>
> I only need the last 24, overwriting each day. In my case 8 are enough
> (8am to 24pm, one every 2 hours). The important thing is to be able to
> open each very simply (read only) without restore anything. Just tell
> my application (written be me), to point the basckup database instead
> of latest one.

I wonder if you could have a PITR warm standby instead, and every hour
stop it and backup that.  This is very hand-wavy, you're expected to
fill in the details :-)

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Re: backup database by cloning itself

From
Ron Johnson
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 02/14/07 01:14, filippo wrote:
> On 13 Feb, 14:54, "filippo" <filippo2...@virgilio.it> wrote:
>
>> My target is to have the backup operation not affecting the users, so
>> I want to be able to copy a database even if the database is used by
>> someone.
>
> I could use pg_dump/pg_restore. pg_dump doesn't have to have exclusive
> access to database to perform the operation. My only problem is that
> pg_dump create a backup on a file, the best to me whould be to have a
> perfect clone (users/ data etc) of original database ready to be used
> just after the cloning. Is it possible?

pg_dump | pg_restore.

But you still haven't told us why you need copies of the database
every 2 hours.  What is the "business need" you are trying to solve.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFF1MGpS9HxQb37XmcRAlYvAJ92Hl9wI/7mb/zOh1xsZwRHR8uDvQCdFKE2
SIdsgnhecZKpEUMWYARLWWA=
=lqeo
-----END PGP SIGNATURE-----