Thread: backups

backups

From
Dennis Gearon
Date:
Along with backing up of my site, what files in 'pgdata' dir should I
back up?

Re: backups

From
Richard Welty
Date:
On Wed, 30 Jun 2004 08:52:21 -0700 Dennis Gearon <gearond@fireserve.net> wrote:

> Along with backing up of my site, what files in 'pgdata' dir should I
> back up?

is there a reason why pg_dump and pg_dumpall won't suffice?
i should think you don't really want to be messing around down
there unless you have a compelling reason for it.

richard
--
Richard Welty                                         rwelty@averillpark.net
Averill Park Networking                                         518-573-7592
    Java, PHP, PostgreSQL, Unix, Linux, IP Network Engineering, Security

Re: backups

From
Dennis Gearon
Date:
Thanks!

Joshua D. Drake wrote:

> Dennis Gearon wrote:
>
>> Along with backing up of my site, what files in 'pgdata' dir should I
>> back up?
>>
> If you are using pg_dump/pg_dumpall the only files that are really
> required are postgresql.conf pg_hba.conf.
>
> Sincerely,
>
> Joshua D. Drake
>
>
>
>> ---------------------------(end of broadcast)---------------------------
>> TIP 2: you can get off all lists at once with the unregister command
>>    (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
>
>
>
>


Re: backups

From
"Joshua D. Drake"
Date:
Dennis Gearon wrote:

> Along with backing up of my site, what files in 'pgdata' dir should I
> back up?
>
If you are using pg_dump/pg_dumpall the only files that are really
required are postgresql.conf pg_hba.conf.

Sincerely,

Joshua D. Drake



> ---------------------------(end of broadcast)---------------------------
> TIP 2: you can get off all lists at once with the unregister command
>    (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)



--
Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
Postgresql support, programming shared hosting and dedicated hosting.
+1-503-667-4564 - jd@commandprompt.com - http://www.commandprompt.com
PostgreSQL Replicator -- production quality replication for PostgreSQL


Re: backups

From
Dennis Gearon
Date:
jearl@bullysports.com wrote:

>Dennis Gearon <gearond@fireserve.net> writes:
>
>
>
>>Along with backing up of my site, what files in 'pgdata' dir should I
>>back up?
>>
>>
>
>You don't want to back files in the pgdata directory directly.
>Instead you want to use pg_dump to create a snapshot of your database
>and use that as your backup.
>
>Jason
>
>
>
Waht about these two files as Joshua talks about?
    postgresql.conf
    pg_hba.conf.

Re: backups

From
jearl@bullysports.com
Date:
Dennis Gearon <gearond@fireserve.net> writes:

> Along with backing up of my site, what files in 'pgdata' dir should I
> back up?

You don't want to back files in the pgdata directory directly.
Instead you want to use pg_dump to create a snapshot of your database
and use that as your backup.

Jason

Re: backups

From
Alvaro Herrera
Date:
On Wed, Jun 30, 2004 at 10:05:30AM -0700, Dennis Gearon wrote:

> Waht about these two files as Joshua talks about?
>    postgresql.conf
>    pg_hba.conf.

Maybe you want to worry about pg_ident.conf too, if you use that.

--
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"El sentido de las cosas no viene de las cosas, sino de
las inteligencias que las aplican a sus problemas diarios
en busca del progreso." (Ernesto Hernández-Novich)


Re: backups

From
Richard Welty
Date:
On Wed, 30 Jun 2004 10:05:30 -0700 Dennis Gearon <gearond@fireserve.net> wrote:
> Waht about these two files as Joshua talks about?
>     postgresql.conf
>     pg_hba.conf.

yes, make copies of those. depend on pg_dump or pg_dumpall for
everything else.

richard
--
Richard Welty                                         rwelty@averillpark.net
Averill Park Networking                                         518-573-7592
    Java, PHP, PostgreSQL, Unix, Linux, IP Network Engineering, Security

~Strange Operators~

From
Chris Gamache
Date:
I was digging through the operators on PostgreSQL and came across these... They
have no description in \do, aren't in the manual, and I can't seem to Google
them because Google filters out the special chars even if you put them in
double-quotes! I don't seen them in any of the contrib modules that I've
inserted...

~>=~
~<=~
~<>~
~<~
~=~
~>~

They aren't familiar to me, but I can tell that they are text comparison
operators. The details of what and how they compare is a mystery to me!

Could someone in-the-know explain, or point me to a description?





__________________________________
Do you Yahoo!?
New and Improved Yahoo! Mail - 100MB free storage!
http://promotions.yahoo.com/new_mail

Re: backups

From
Bruno Wolff III
Date:
On Wed, Jun 30, 2004 at 10:05:30 -0700,
  Dennis Gearon <gearond@fireserve.net> wrote:
> Waht about these two files as Joshua talks about?
>    postgresql.conf
>    pg_hba.conf.

Those are configuration files (along with pg_ident.conf) and don't contain
your data. You probably want a back of those as well, but you may not need
them when restoring your data if that instance of postgres has already
been configured. Also if the machine you are restoring to is significantly
different than the original machine, you may want to tweak those files.

Re: backups

From
Richard Welty
Date:
On Wed, 30 Jun 2004 15:34:12 -0500 Bruno Wolff III <bruno@wolff.to> wrote:
> Those are configuration files (along with pg_ident.conf) and don't contain
> your data. You probably want a back of those as well, but you may not need
> them when restoring your data if that instance of postgres has already
> been configured. Also if the machine you are restoring to is significantly
> different than the original machine, you may want to tweak those files.

yes. what actually makes sense is to set up a backup area for postgresql
on another partition, and pg_dump or pg_dumpall to a file system there,
rsync the config files to that file system in case they've changed, and then
run dump or whatever against the partition containing the backup data.

running dump against an active, live database won't produce anything
of value.

richard
--
Richard Welty                                         rwelty@averillpark.net
Averill Park Networking                                         518-573-7592
    Java, PHP, PostgreSQL, Unix, Linux, IP Network Engineering, Security

Re: backups

From
Date:
On 6/30/04 11:59 AM, "jearl@bullysports.com" <jearl@bullysports.com> wrote:

> You don't want to back files in the pgdata directory directly.
> Instead you want to use pg_dump to create a snapshot of your database
> and use that as your backup.

That's great for a small to medium database, but doesn't work worth a hoot
for large databases.  With several hundred million to over a billion rows,
the pg_dump isn't too bad, but I can't wait days for the reload to complete.
While replication may be an option to avoid shutting the primary DB down, we
currently have to shut down the database and do file system dumps (full
weekly, incrementals nightly).  Even with replication, we'd need to shut
down the shadow DB and do a file system backup - a replicate doesn't protect
you against replicated garbage.  We are also investigating using file system
snapshots - shut the DB down, snapshot, bring it back up.

I'm looking forward to point in time recovery...

What do other sites with mondo databases do?

Wes


Re: ~Strange Operators~

From
Tom Lane
Date:
Chris Gamache <cgg007@yahoo.com> writes:
> ~>=~
> ~<=~
> ~<>~
> ~<~
> ~=~
> ~>~

> They aren't familiar to me, but I can tell that they are text comparison
> operators. The details of what and how they compare is a mystery to me!

Those are the non-locale-aware operators that Peter added to support
LIKE.   I'm surprised to hear they aren't documented.

            regards, tom lane

Re: ~Strange Operators~

From
Dennis Gearon
Date:
Tom Lane wrote:

> Chris Gamache <cgg007@yahoo.com> writes:
>
>>~>=~
>>~<=~
>>~<>~
>>~<~
>>~=~
>>~>~
>
>
>>They aren't familiar to me, but I can tell that they are text comparison
>>operators. The details of what and how they compare is a mystery to me!
>
>
> Those are the non-locale-aware operators that Peter added to support
> LIKE.   I'm surprised to hear they aren't documented.
>
>             regards, tom lane
>
What are they, just binary?

BTW, isn't that what locale C is, binary?

Re: backups

From
Bruno Wolff III
Date:
On Wed, Jun 30, 2004 at 18:23:08 -0500,
  wespvp@syntegra.com wrote:
>
> What do other sites with mondo databases do?

There have been comments from people using storage systems that they
can freeze the storage system and get a consistant snap shot of the
file system. This can be used to do a restore. It will look just like
postgres crashed when coming back up.
If you find one of the posts about this in the archives the poster may
have more details on their storage systems.

Re: backups

From
Bruno Wolff III
Date:
On Wed, Jun 30, 2004 at 20:35:04 -0700,
  Dennis Gearon <gearond@fireserve.net> wrote:
> >
> Just shutting it down doesn't work?

If you can shut the database down then that works fine. Some people can't
do that and can't afford the long recovery time either.

Re: backups

From
Dennis Gearon
Date:
Bruno Wolff III wrote:

> On Wed, Jun 30, 2004 at 18:23:08 -0500,
>   wespvp@syntegra.com wrote:
>
>>What do other sites with mondo databases do?
>
>
> There have been comments from people using storage systems that they
> can freeze the storage system and get a consistant snap shot of the
> file system. This can be used to do a restore. It will look just like
> postgres crashed when coming back up.
> If you find one of the posts about this in the archives the poster may
> have more details on their storage systems.
>
Just shutting it down doesn't work?

Re: backups

From
Bill Montgomery
Date:
Bruno Wolff III wrote:

>On Wed, Jun 30, 2004 at 18:23:08 -0500,
>  wespvp@syntegra.com wrote:
>
>
>>What do other sites with mondo databases do?
>>
>>
>
>There have been comments from people using storage systems that they
>can freeze the storage system and get a consistant snap shot of the
>file system. This can be used to do a restore. It will look just like
>postgres crashed when coming back up.
>If you find one of the posts about this in the archives the poster may
>have more details on their storage systems.
>
>
We achieve the same effect with LVM. An rsync on the live set of files
is done to get the majority of files that have changed since the last
snapshot, then an LVM snapshot is taken and the relatively small set of
files that have changed are rsyncd. This minimizes the lifetime of the
snapshot. LVM snapshots degrade write performance on the LVM volume
group on which they reside, due to the dual-write they require, so we
want them around for as short a time as possible.

Of course, we also tar up the files after the snapshot rsync, in case
the primary server craps out in the middle of the next hour's
snapshot/rsync. Otherwise, we might find ourselves resorting to last
night's pg_dump.

Regards,

Bill Montgomery

Re: backups

From
Dennis Gearon
Date:
Bill Montgomery wrote:

> Bruno Wolff III wrote:
>
>> On Wed, Jun 30, 2004 at 18:23:08 -0500,
>>  wespvp@syntegra.com wrote:
>>
>>
>>> What do other sites with mondo databases do?
>>>
>>
>>
>> There have been comments from people using storage systems that they
>> can freeze the storage system and get a consistant snap shot of the
>> file system. This can be used to do a restore. It will look just like
>> postgres crashed when coming back up.
>> If you find one of the posts about this in the archives the poster may
>> have more details on their storage systems.
>>
>>
> We achieve the same effect with LVM. An rsync on the live set of files
> is done to get the majority of files that have changed since the last
> snapshot, then an LVM snapshot is taken and the relatively small set
> of files that have changed are rsyncd. This minimizes the lifetime of
> the snapshot. LVM snapshots degrade write performance on the LVM
> volume group on which they reside, due to the dual-write they require,
> so we want them around for as short a time as possible.
>
> Of course, we also tar up the files after the snapshot rsync, in case
> the primary server craps out in the middle of the next hour's
> snapshot/rsync. Otherwise, we might find ourselves resorting to last
> night's pg_dump.
>
> Regards,
>
> Bill Montgomery
>
What's LVM?

Re: backups

From
Bill Montgomery
Date:
Dennis Gearon wrote:

> What's LVM?

http://tldp.org/HOWTO/LVM-HOWTO/

LVM stands for Linux Volume Manager. It groups block devices (like hard
drives) into volume groups, then creates logical volumes on top of those
volume groups. In effect, it virtualizes away the nasty realities of
hard drives that don't dynamically change size, etc. and presents
"virtual" block devices to the OS.

One of the many wonderful features is that you can freeze one of these
virtual block devices ("logical volume" in LVM-speak) at a particular
instant in time, giving you a consistent view of that block device, and
any data on it, such as a filesystem. This is called a snapshot, and is
how, at our site, we get a consistent set of Postgres files to rsync from.

Regards,

Bill Montgomery

Re: backups

From
Christopher Petrilli
Date:
On Wed, 30 Jun 2004 18:23:08 -0500, wespvp@syntegra.com
<wespvp@syntegra.com> wrote:
>
> What do other sites with mondo databases do?
>

Let me offer the ideas of what I've used in some high-end environments
before. First, we've always used a mirror configuration in most
situations, simply for reliability and performance concerns (we can
get into the arguments of RAID-5, but that's neither here nor there).
So, it ends up being 1+0 (mirroring of striped sets).

What you can do is split the mirror and back up one side of the duplex
set. This leaves the database running on the other side, and when you
join them back together, the logs will catch up. This does have a
potential performance problem associated with it, of course, but
backups always do. It really depends on write-percentages.

If you still need high availability when doing backups, I've also used
triplex setups (3 mirrors), so that you still have one left. The
reality is, drive space is cheap, and the ability to pull them off and
do backups that way is very helpful. You can in-fact in an SAN
reattach them to another server for backups.

As someone else pointed out, you do have the issue of sanity of the
files when you do a backup, so given PostgreSQL's current lack of
PITR, I'd likely stop the database, split the mirrors, and restart the
database. I don't know of anyway to coalesce the database and quiet it
for 1 second to do the split.

Chris
--
| Christopher Petrilli
| petrilli@gmail.com

Re: backups

From
vinny@nospamthankyoumam_yapf.net
Date:
On Wed, 30 Jun 2004 22:32:26 -0500, bruno@wolff.to (Bruno Wolff III)
wrote:

>On Wed, Jun 30, 2004 at 18:23:08 -0500,
>  wespvp@syntegra.com wrote:
>>
>> What do other sites with mondo databases do?
>
>There have been comments from people using storage systems that they
>can freeze the storage system and get a consistant snap shot of the
>file system. This can be used to do a restore. It will look just like
>postgres crashed when coming back up.
>If you find one of the posts about this in the archives the poster may
>have more details on their storage systems.
>
>---------------------------(end of broadcast)---------------------------
>TIP 8: explain analyze is your friend

I've been playing around with something like that.
On my test server I have put the postgresql directory (including the
config files) onto a software raid-1 array. This array starts off as
just one disk, but when the time comes to create a backup, you can add
a secondary disk to the array, on-the-fly, so the database does not
have to stop for this. The recovery-synchronosing of the disk consumes
a few % of the CPU, but nothing too bad (it's disk-to-disk copying)

When syncing is complete I shutdown the database, remove the secondary
disk from the array and start the database up again. Ofcourse this is
in a test environment so this operation takes a few seconds, I have
yet to test what this will do with a normal production load.

Now the secondary disk is an exact copy of the datafiles as they were
when the database was offline, and because it is software-raid, the
secondary disk can now be mounted and backed-up. And because the files
were in an offline state at backup, they can be restored without the
database server having to recover at startup.

It seems to work ok in the test, but ofcourse this has to be tested on
a much much larger scale.