Re: postgres large database backup - Mailing list pgsql-general

From Michael Loftis
Subject Re: postgres large database backup
Date
Msg-id CAHDg04tXCrWPpB13RrXYxFWoC3neyjKNWBQC8-Ju_f+uNPy9yw@mail.gmail.com
In response to postgres large database backup  (Atul Kumar <akumar14871@gmail.com>)
List pgsql-general
On Wed, Nov 30, 2022 at 8:40 AM Atul Kumar <akumar14871@gmail.com> wrote:
>
> Hi,
>
> I have a 10TB database running on postgres 11 on centos 7 "on premises", and I need to schedule the
> backup of this database in a faster way.
>
> The scheduled backup will be used for PITR purposes.
>
> So please let me know how I can take a quicker backup of my 10TB database. Is there any tool to take backups
> and subsequently incremental backups in a faster way, and restore them quickly for PITR when required?
>
> What should be the exact approach for scheduling such backups so that they can be restored quickly?

Faster than *what*?

If speed is the primary criterion, filesystem snapshots would be the
fastest to restore: use pg_start_backup() to put the DB cluster into
binary backup mode, take the snapshot, then pg_stop_backup(), and
capture the WALs generated in that window alongside your FS snapshot,
all on the same machine or shared storage.  To restore, bring back
the old snapshot plus the captured WALs with the DB shutdown/stopped;
startup is then normal "crash recovery", or you can select a PITR
target time/LSN within the short pg_start_backup() ...
pg_stop_backup() window.  If you're properly archiving WALs outside
of JUST the full backup you can PITR to any point after the full
backup snapshot, but the more transactions/WAL recovery has to replay
to reach the desired point, the longer it takes.
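
As a rough sketch of that flow on PG 11, assuming LVM and the
exclusive backup API (the volume names vg0/pgdata, the label, and the
snapshot size are all made up, adjust for your storage):

    # put the cluster into binary backup mode (fast checkpoint)
    psql -c "SELECT pg_start_backup('snap', true);"

    # snapshot the filesystem holding the data directory
    lvcreate --snapshot --size 100G --name pgdata_snap /dev/vg0/pgdata

    # leave backup mode; PG writes the backup history file
    psql -c "SELECT pg_stop_backup();"

The WALs written between those two calls have to be kept with the
snapshot, or it won't be consistent.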

pgbackrest can back up a PG cluster in multiple ways (including
taking a base backup while actively streaming WALs, or acting as the
WAL archiver), and a restore on the same machine as the backup
repository would basically be limited by I/O (well, unless you've got
all NVMe, in which case CPU, bus, or memory bandwidth becomes the
limiting factor).
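
As a sketch only (the stanza name "main" and all the paths here are
assumptions, not a drop-in config), a pgbackrest setup doing a weekly
full plus daily incrementals might look like:

    # /etc/pgbackrest/pgbackrest.conf
    [global]
    repo1-path=/var/lib/pgbackrest
    process-max=4            # parallel workers matter at 10TB

    [main]
    pg1-path=/var/lib/pgsql/11/data

    # postgresql.conf, letting pgbackrest be the WAL archiver:
    #   archive_mode = on
    #   archive_command = 'pgbackrest --stanza=main archive-push %p'

    # after a one-time 'pgbackrest --stanza=main stanza-create',
    # schedule from cron:
    pgbackrest --stanza=main --type=full backup    # weekly
    pgbackrest --stanza=main --type=incr backup    # daily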

Basically, no matter how you back up, 10TB takes a long time to
copy, and except in the "local FS snapshot" method I outlined above,
your limiting factor is going to be how fast you can move the data
back to where you need it.
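
(Back-of-envelope: at a sustained 500 MB/s, 10TB is about 20,000
seconds, call it five and a half hours, just to move the bytes; even
at ~1.2 GB/s, roughly 10GbE wire speed, you're still over two hours.)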

For critical DBs of this nature I've actually done almost exactly
the method I just outlined, only the backup/snapshot process happens
on a replica.  *NORMAL* failure recovery in that replicated cluster
is by failover, but for an actual backup restore due to disaster or a
need to go back in time (which is...extremely rare...) there's some
manual intervention to bring up a snapshot and play back WALs to the
point in time we want the DB cluster at.
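
To make the restore side concrete on PG 11 (recovery.conf is gone in
PG 12+): after bringing the snapshot back into place, PITR looks
roughly like this, with the archive path and target time as
placeholders:

    # $PGDATA/recovery.conf
    restore_command = 'cp /mnt/wal_archive/%f %p'
    recovery_target_time = '2022-11-30 08:00:00'
    recovery_target_action = 'promote'

    # start the cluster; it replays WAL to the target, then promotes
    pg_ctl -D /var/lib/pgsql/11/data start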

>
> Regards.



-- 

"Genius might be described as a supreme capacity for getting its possessors
into trouble of all kinds."
-- Samuel Butler


