Thread: backup-strategies for large databases

backup-strategies for large databases

From
MirrorX
Date:
Hello to all,

I am trying to find an "acceptable" solution for a backup strategy on one of
our servers (it will be 8.4.8 soon, now it's 8.4.7). I am familiar with both
logical (dump/restore) and physical backups (pg_start_backup, WAL archiving
etc.) and have tried both in some other cases.

The issue here is that the server is heavily loaded. The daily traffic is
heavy, which means the DB size is increasing every day (by 30 GB on average)
and the size is already pretty large (~2 TB). So the techniques mentioned
above are not very suitable here. pg_dump, besides the fact that it would
take about a month to complete, is also not acceptable because there would be
no PITR then.

At the moment, copying the PGDATA folder (excluding the pg_xlog folder),
compressing it and storing it on a local storage disk takes about 60 hours,
and the resulting file is about 550 GB. The WAL archives are kept in a
different location, so that's not a problem. I don't even want to imagine how
much time the decompression and copy will take in a 'disaster' scenario.
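
For reference, the current base backup is essentially the standard procedure,
roughly like this (a simplified sketch; paths and the backup label are
placeholders):

# standard base backup: mark the start, tar PGDATA without pg_xlog, mark the end
psql -U postgres -c "SELECT pg_start_backup('nightly_base');"
tar -czf /backup/base_$(date +%Y%m%d).tar.gz --exclude=pg_xlog -C /var/lib/pgsql data
psql -U postgres -c "SELECT pg_stop_backup();"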

Soon, another server will be added, a fail-over one. But again, with this
technique, sending the PGDATA and the WAL there doesn't seem very efficient.

Plus, we cannot keep an older copy of PGDATA and just replicate the WAL,
because due to the heavy load it amounts to about 150 GB/day. So, even if we
assume that we have unlimited disk storage, it's not reasonable to use 5 TB
for the WAL (if PGDATA is sent once a month), and furthermore a lot of time
will be needed for the 2nd server to recover, since it will have to process
such a huge amount of WAL.

So, in conclusion, given that the size of the DB is large, a very big amount
of storage is at our disposal, and a new server will be up soon, what options
are there to reduce the downtime in a disaster scenario to a minimum? Are any
(file-system) solutions that keep the disks in sync, like DRBD, suitable?
That way the disk of the 2nd server would be in sync with the 1st. Even if
that works, I would still like to have a third backup on the storage disks,
so my question remains.

Thanks in advance for any suggestions, and sorry for the long post...





Re: backup-strategies for large databases

From
Scott Marlowe
Date:
One possible answer to your issues is data partitioning.  By partitioning
your data by date, primary key, or some other field, you can back up
individual partitions for incremental backups. I run a stats database that is
partitioned by day, and we can just back up yesterday's partition each night.
Each partition is in the 10 to 20 MB range.
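
In 8.4 that is usually done with inheritance plus CHECK constraints, along
these lines (a rough sketch with made-up table and column names; the trigger
or rule that routes inserts into the right child table is omitted):

# hypothetical daily partition of a "stats" table
psql -d statsdb <<'SQL'
CREATE TABLE stats_2011_08_14 (
    CHECK (logged_at >= DATE '2011-08-14' AND logged_at < DATE '2011-08-15')
) INHERITS (stats);
SQL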

Re: backup-strategies for large databases

From
MirrorX
Date:
Thanks a lot. I will definitely look into that option.

In the meantime, if there are any other suggestions I'd love to hear them.


Re: backup-strategies for large databases

From
Mikko Partio
Date:


On Sun, Aug 14, 2011 at 12:44 AM, MirrorX <mirrorx@gmail.com> wrote:
> The issue here is that the server is heavily loaded. The daily traffic is
> heavy, which means the DB size is increasing every day (by 30 GB on average)
> and the size is already pretty large (~2 TB).
>
> At the moment, copying the PGDATA folder (excluding the pg_xlog folder),
> compressing it and storing it on a local storage disk takes about 60 hours,
> and the resulting file is about 550 GB. The WAL archives are kept in a
> different location, so that's not a problem. I don't even want to imagine how
> much time the decompression and copy will take in a 'disaster' scenario.
>
> Plus, we cannot keep an older copy of PGDATA and just replicate the WAL,
> because due to the heavy load it amounts to about 150 GB/day. So, even if we
> assume that we have unlimited disk storage, it's not reasonable to use 5 TB
> for the WAL (if PGDATA is sent once a month), and furthermore a lot of time
> will be needed for the 2nd server to recover, since it will have to process
> such a huge amount of WAL.

We have a pretty similar situation: database size is ~3 TB with daily xlog generation of about 25 GB. We do a full backup (tar PGDATA + xlogs) every fortnight and back up just the xlogs in between. The full backup takes almost 48 hours and is about 500 GB in size. All backups are gzipped, of course.

The backup duration is not a problem, but the restore _might_ be. We have restored this database more than once, and each time it was fully restored surprisingly quickly (a matter of hours). Of course, if you have a 24/7 database this might not be acceptable, but then again, if that's the case you should have a standby ready anyway.
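
For what it's worth, the restore side is roughly this (a rough sketch; paths
are placeholders, and it assumes the WAL archive is a plain directory of
files):

# unpack the base backup, then let recovery replay the archived xlogs
tar -xzf /backup/base_20110814.tar.gz -C /var/lib/pgsql
cat > /var/lib/pgsql/data/recovery.conf <<'EOF'
restore_command = 'cp /backup/wal_archive/%f "%p"'
EOF
pg_ctl -D /var/lib/pgsql/data start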

Regards

Mikko


Re: backup-strategies for large databases

From
MirrorX
Date:
I looked into data partitioning and it is definitely something we will use
soon. But, as far as the backups are concerned, how can I take a backup
incrementally? If I understand correctly, the idea is to partition a big
table (using a date field, for example) and then take a dump of the 'daily'
partition each night, so that the dump of this specific table is relatively
small compared to the size of the initial table (something like the sketch
below). Is that right?

So, we are talking about logical backups without PITR. I am not saying that
it's a bad idea, I just want to make sure that I got it right.
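
Concretely, I picture the nightly job as something like this (just a sketch
of my understanding; the database and table names are made up):

# dump only yesterday's partition, as a compressed custom-format dump
YESTERDAY=$(date -d yesterday +%Y_%m_%d)
pg_dump -Fc -t "stats_${YESTERDAY}" -f "/backup/stats_${YESTERDAY}.dump" statsdb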

Thank you all again for your answers.


Re: backup-strategies for large databases

From
Scott Marlowe
Date:
On Mon, Aug 15, 2011 at 5:06 PM, MirrorX <mirrorx@gmail.com> wrote:
> I looked into data partitioning and it is definitely something we will use
> soon. But, as far as the backups are concerned, how can I take a backup
> incrementally? If I understand correctly, the idea is to partition a big
> table (using a date field, for example) and then take a dump of the 'daily'
> partition each night, so that the dump of this specific table is relatively
> small compared to the size of the initial table. Is that right?
>
> So, we are talking about logical backups without PITR. I am not saying that
> it's a bad idea, I just want to make sure that I got it right.

Exactly.  Sometimes PITR is the right answer, sometimes partitioning is.

Re: backup-strategies for large databases

From
John R Pierce
Date:
On 08/15/11 4:12 PM, Scott Marlowe wrote:
> Exactly.  Sometimes PITR is the right answer, sometimes partitioning is.

Those answer two completely different questions.



--
john r pierce                            N 37, W 122
santa cruz ca                         mid-left coast


Re: backup-strategies for large databases

From
Greg Smith
Date:
On 08/13/2011 05:44 PM, MirrorX wrote:
> At the moment, copying the PGDATA folder (excluding the pg_xlog folder),
> compressing it and storing it on a local storage disk takes about 60 hours,
> and the resulting file is about 550 GB. The WAL archives are kept in a
> different location, so that's not a problem. I don't even want to imagine how
> much time the decompression and copy will take in a 'disaster' scenario.
>

If you haven't actually run this test--confirmed that you can uncompress
the whole thing and get a working copy out of it again--I'd be concerned
that you haven't tested your backup procedure fully.  You can't really
tell if a backup is good or not unless you restore it.  And that process
will get you a read on just how bad the recovery situation will look if it
comes to that one day.

One technique I've used to accelerate the situation you're in is to
always keep a real filesystem copy of the last backup somewhere.  Then,
rather than archive the main database directly for the base backup, you
execute rsync to make that secondary copy identical to the one on the
master.  That should happen quite a bit faster than making a whole new
backup, so long as you use the --inplace option.  Once the standby copy
is done, if you want a compressed archive you can then make it from the
copy--with no extra load on the master.  And you can then copy that
again to another place too, followed by having it consume WAL files so
that it eventually turns into a warm standby.  If you want a true
fail-over here, you're going to have to make one that is replaying WAL
files as they arrive.
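
A rough sketch of that rsync step, with made-up hostnames and paths (the
pg_start_backup/pg_stop_backup calls are there so the refreshed copy is
usable as a base backup):

# refresh the standing copy of the last base backup on the backup host
psql -c "SELECT pg_start_backup('rsync_base');"
rsync -a --inplace --delete --exclude=pg_xlog \
    /var/lib/pgsql/data/ backuphost:/backups/pgdata_copy/
psql -c "SELECT pg_stop_backup();"

# compress the copy on the backup host, with no extra load on the master
ssh backuphost 'tar -czf /backups/base_$(date +%Y%m%d).tar.gz -C /backups pgdata_copy'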

> Are any (file-system) solutions that keep the disks in sync, like DRBD,
> suitable? That way the disk of the 2nd server would be in sync with the 1st.
> Even if that works, I would still like to have a third backup on the storage
> disks, so my question remains.
>

I doubt you'll be able to get DRBD to keep up reliably with the volume you've
got.  The only filesystem-level solution I've seen scale nicely to handle the
exact problem you have is using ZFS snapshots to make some of this easier.
For some people it's worth buying a Solaris license just to have that
technology available.
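
The ZFS idea in brief (dataset names and hosts are placeholders; this assumes
PGDATA lives on its own ZFS dataset):

# take an atomic snapshot of the data directory while the cluster is in backup mode
psql -c "SELECT pg_start_backup('zfs_base');"
zfs snapshot tank/pgdata@base_20110814
psql -c "SELECT pg_stop_backup();"

# ship only the changes since the previous snapshot to another box
zfs send -i tank/pgdata@base_20110807 tank/pgdata@base_20110814 | ssh backuphost zfs receive tank/pgdata_copy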

I had been hoping some of the new things in FreeBSD 9.0 would finally
make it a lot more practical to consider for this sort of thing once
that ships.  But it looks like the issues around not supporting Intel's
latest graphics drivers on recent "Sandy Bridge" servers may postpone
adopting that further for me.

--
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us


Re: backup-strategies for large databases

From
Niyas
Date:
Hi All,

I have another issue related to taking a backup of a rather large database.
I have been getting the following errors:

anil@ubuntu107:~/Desktop$ pg_dump -Uadmin -h192.168.2.5 dbname >
filename.sql

pg_dump: Dumping the contents of table "tbl_voucher" failed: PQgetCopyData() failed.
pg_dump: Error message from server: server closed the connection unexpectedly
        This probably means the server terminated abnormally before or while
        processing the request.

Due to this error I am not able to dump the database.

Please advise me on how to tackle this issue...


Re: backup-strategies for large databases

From
"Tomas Vondra"
Date:
On 26 August 2011, 11:48, Niyas wrote:
> Hi All,
>
> I have another issue related to taking a backup of a rather large database.
> I have been getting the following errors:
>
> anil@ubuntu107:~/Desktop$ pg_dump -Uadmin -h192.168.2.5 dbname >
> filename.sql
>
> pg_dump: Dumping the contents of table "tbl_voucher" failed: PQgetCopyData() failed.
> pg_dump: Error message from server: server closed the connection unexpectedly
>         This probably means the server terminated abnormally before or while
>         processing the request.
>
> Due to this error I am not able to dump the database.
>
> Please advise me on how to tackle this issue...

This is usually caused by a backend crash - you need to check the server log
for more details.
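
Something along these lines, assuming the default pg_log directory inside the
data directory (adjust the path for your logging setup):

# look for signs of a backend crash around the time pg_dump failed
grep -iE 'terminated by signal|out of memory' /var/lib/pgsql/data/pg_log/*.log | tail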

Tomas


Re: backup-strategies for large databases

From
Niyas
Date:
Actually, the database has not crashed. I can run my application perfectly.


Re: backup-strategies for large databases

From
"Tomas Vondra"
Date:
On 26 August 2011, 12:46, Niyas wrote:
> Actually, the database has not crashed. I can run my application perfectly.

That does not mean one of the backends did not crash. Check the log.

Tomas


Re: backup-strategies for large databases

From
Pavel Stehule
Date:
Hello

2011/8/26 Niyas <cmniyas@gmail.com>:
> Hi All,
>
> I have another issue related to taking a backup of a rather large database.
> I have been getting the following errors:
>
> anil@ubuntu107:~/Desktop$ pg_dump -Uadmin -h192.168.2.5 dbname >
> filename.sql
>
> pg_dump: Dumping the contents of table "tbl_voucher" failed: PQgetCopyData() failed.
> pg_dump: Error message from server: server closed the connection unexpectedly
>         This probably means the server terminated abnormally before or while
>         processing the request.
>
> Due to this error I am not able to dump the database.

Probably your data files are broken - you need to find the broken rows and
delete them.

Probably SELECT * FROM tbl_voucher fails too?
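
For example, first confirm that, then narrow the damage down (a sketch only,
not your exact data: the "id" column is an assumption, use whatever key the
table has):

# does a plain SELECT * hit the same problem as pg_dump?
psql -h 192.168.2.5 -U admin -c "SELECT * FROM tbl_voucher" dbname > /dev/null

# if so, scan the table in key ranges to narrow down where the broken rows are
psql -h 192.168.2.5 -U admin -c "SELECT * FROM tbl_voucher WHERE id BETWEEN 1 AND 100000" dbname > /dev/null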

Regards

Pavel Stehule

>
> Please advise me on how to tackle this issue...

Re: backup-strategies for large databases

From
Niyas
Date:
I also guessed the same at the initial stage of debugging, so I tried to
export the tbl_voucher data to a file, and it worked fine. Then I googled and
found a link explaining that the reason is the large size of the database,
but I didn't find any proper solution on the internet.

