Thread: Is file system replication sufficient for recovery?
Tom Korach <tom@safekeep.com> writes:
> We have a Postgresql instance (0.5-4TB in size) used for development and
> on-line reporting.
> We do not need high-availability, but we do need:
> 1. Quick disaster recovery (<1 hour) is important.
> 2. Recovery from corruption of the server or mistakes.
> Will file-system replication be enough to achieve this goal?
What do you mean exactly by "file-system replication"? Something
equivalent to rsync will absolutely not work against a running
Postgres server, because it won't capture a consistent state of
all the files. If you have (and trust) a filesystem with snapshot
capabilities, it might work to take a filesystem snapshot and hold
onto it long enough to rsync from the snapshot. I'm not sure about
the reliability or performance implications of such a setup, though.
See
https://www.postgresql.org/docs/current/backup-file.html
> Do we also need WAL file archiving?
Not as long as you capture the currently-active WAL files along
with the database contents.
regards, tom lane
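The snapshot-then-copy idea above can be sketched as an ordered command plan. This is only an illustration, assuming the data directory and its active WAL live together on a single LVM logical volume; the volume, snapshot, mount point, and destination names are hypothetical, and the function only builds the commands, it does not run them.

```python
def snapshot_copy_plan(origin_lv, snap_name, snap_size, mount_point, pgdata_subdir, dest):
    """Return the commands, in order, for copying PGDATA from a frozen snapshot.

    All arguments are illustrative placeholders, not values from the thread.
    """
    volume_group = origin_lv.rsplit("/", 1)[0]
    snap_dev = f"{volume_group}/{snap_name}"
    return [
        # 1. Freeze a point-in-time view of the volume holding data and WAL.
        ["lvcreate", "--snapshot", "--size", snap_size, "--name", snap_name, origin_lv],
        # 2. Mount the snapshot read-only so the copy source cannot change.
        ["mount", "-o", "ro", snap_dev, mount_point],
        # 3. Copy from the snapshot, never from the live data directory.
        ["rsync", "-a", f"{mount_point}/{pgdata_subdir}/", dest],
        # 4. Release the snapshot once the copy has finished.
        ["umount", mount_point],
        ["lvremove", "--yes", snap_dev],
    ]


plan = snapshot_copy_plan(
    "/dev/vg0/pgdata", "pgsnap", "10G", "/mnt/pgsnap", "data", "backup-host:/backups/pg/"
)
for command in plan:
    print(" ".join(command))
```

The snapshot only has to live as long as the copy takes. A server started on the copied files will treat them as a crashed instance and replay WAL, which is why the WAL has to be inside the same snapshot.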
Tom Korach <tom@safekeep.com> writes:
>> What do you mean exactly by "file-system replication"?
> RAID1 setup (specifically, between two disks or EBS volumes [on AWS]),
> using LVM.
Maybe I'm missing something, but AFAIK plain old RAID will not protect
you against any scenario except failure of a single disk. It certainly
won't do anything to help you revert to a prior database state.
The docs page I pointed you to is part of a chapter that lays out all
the backup methods the PG community considers reliable. I strongly
suggest sticking to one of those and not trying to take shortcuts.
(The following chapter on high-availability setups is relevant
reading as well.)
regards, tom lane
So could the following backup architecture make sense?
> Maybe I'm missing something, but AFAIK plain old RAID will not protect
> you against any scenario except failure of a single disk. It certainly
> won't do anything to help you revert to a prior database state.
The idea for RAID came from
https://www.postgresql.org/docs/current/different-replication-solutions.html
- File System (Block Device) Replication
A modified version of shared hardware functionality is file system replication, where all changes to a file system are mirrored to a file system residing on another computer. The only restriction is that the mirroring must be done in a way that ensures the standby server has a consistent copy of the file system — specifically, writes to the standby must be done in the same order as those on the primary. DRBD is a popular file system replication solution for Linux.
DRBD seems to work similarly to RAID but over the network, but I might be wrong.
According to that link (https://www.postgresql.org/docs/current/backup-file.html):
> An alternative file-system backup approach is to make a “consistent snapshot” of the data directory, if the file system supports that functionality (and you are willing to trust that it is implemented correctly).
> The typical procedure is to make a “frozen snapshot” of the volume containing the database
1. Periodic snapshots using the EBS mechanism (to get consistent snapshots).
2. Periodic pg_basebackup + WAL file archiving (to allow reverting to a previous state if we e.g. mistakenly drop a table).
Thanks,
Tom
On Thu, Dec 30, 2021 at 12:52 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> [...]
Repeat after me: "RAID is not a backup". Write it down, chisel it into your monitor, tattoo it on your arm. Yes, DRBD replicates data at the block level, but if you do a mass update and scramble your data in one place then you still have two perfectly identical copies of bad data.
DRBD is pretty neat in that you create a block device under DRBD, that device is replicated elsewhere, and then on top of that device you format a filesystem or can use it as a raw device. As writes happen locally they get shuffled to the other side.
--
Paul D. Carlucci
On Thu, Dec 30, 2021, 2:31 PM Tom Korach <tom@safekeep.com> wrote:
> [...]
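The "two perfectly identical copies of bad data" point can be shown with a toy model. This is a minimal sketch, not how RAID1 or DRBD are implemented; `MirroredVolume` and `take_backup` are hypothetical names invented for the illustration.

```python
import copy


class MirroredVolume:
    """Toy stand-in for a mirror: every write lands on both copies."""

    def __init__(self):
        self.primary = {}
        self.mirror = {}

    def write(self, key, value):
        self.primary[key] = value
        self.mirror[key] = value  # replicated immediately, good data or bad


def take_backup(volume):
    """Toy point-in-time backup: an independent copy of the current state."""
    return copy.deepcopy(volume.primary)


vol = MirroredVolume()
vol.write("balance:alice", 100)
backup = take_backup(vol)        # taken before the mistake
vol.write("balance:alice", 0)    # the mistaken mass update

print(vol.primary == vol.mirror)   # both copies agree on the bad value
print(backup["balance:alice"])     # only the backup still holds the old value
```

The mirror protects against losing one copy; only the independent point-in-time copy lets you go back to the state before the mistake.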
True. So we thought to use:
1. Periodic pg_basebackup with WAL archiving to allow recovering in case of corruption/mistakes.
2. EBS snapshots to quickly (minutes) recover from disk issues (at the cost of uncommitted transactions being lost - it's not critical).
According to the documentation, it seems that Postgresql will enter recovery mode from the file-system snapshot:
> An alternative file-system backup approach is to make a “consistent snapshot” of the data directory, if the file system supports that functionality... The typical procedure is to make a “frozen snapshot” of the volume containing the database,...
> This will work even while the database server is running. However, a backup created in this way saves the database files in a state as if the database server was not properly shut down; therefore, when you start the database server on the backed-up data, it will think the previous server instance crashed and will replay the WAL log.
So there are two resiliency goals: (1) overcoming corruption and (2) quickly handling disk failure. So far we are using streaming replication to handle #2, but we want a simpler setup.
If the above does not make sense, what other solution (besides replication) would suffice?
Thanks,
Tom
On Thu, Dec 30, 2021 at 4:30 PM Paul Carlucci <paul.carlucci@gmail.com> wrote:
> [...]
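The "periodic pg_basebackup with WAL archiving" option discussed above could look roughly like the following. This is a sketch under stated assumptions, not a recommended production setup: the backup root and archive directory are hypothetical paths, the `archive_command` and `restore_command` follow the simple `cp` form used as an example in the PostgreSQL continuous-archiving documentation, and the helper names are invented here. The code only builds the command line and settings; it does not run anything.

```python
import shlex
from datetime import datetime, timezone


def archive_settings(archive_dir):
    """postgresql.conf settings that enable WAL archiving to a directory."""
    return {
        "wal_level": "replica",
        "archive_mode": "on",
        # Refuse to overwrite an already-archived segment, then copy it.
        "archive_command": f"test ! -f {archive_dir}/%f && cp %p {archive_dir}/%f",
    }


def basebackup_command(backup_root, now):
    """Argument list for one compressed, tar-format base backup."""
    target = f"{backup_root}/base-{now:%Y%m%dT%H%M%SZ}"
    return ["pg_basebackup", "-D", target, "-Ft", "-z", "-X", "stream", "--checkpoint=fast"]


def recovery_settings(archive_dir, target_time):
    """Settings for replaying archived WAL up to a chosen point in time."""
    return {
        "restore_command": f"cp {archive_dir}/%f %p",
        "recovery_target_time": target_time,
    }


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    print(shlex.join(basebackup_command("/backups/pg", datetime.now(timezone.utc))))
    for name, value in archive_settings("/mnt/server/archivedir").items():
        print(f"{name} = '{value}'")
```

To undo a mistake such as a dropped table, the idea is to restore the most recent base backup taken before the mistake and replay archived WAL up to a `recovery_target_time` just before it. On PostgreSQL 12 and later this also requires a `recovery.signal` file in the restored data directory.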