Re: Will Altering and Modifying tables during backup result in a corrupted server after the restore? - Mailing list pgsql-general

From Stephen Frost
Subject Re: Will Altering and Modifying tables during backup result in a corrupted server after the restore?
Msg-id 20180521162648.GC27724@tamriel.snowman.net
In response to Will Altering and Modifying tables during backup result in a corrupted server after the restore?  (Yashwanth Govinda Setty <ygovindasetty@commvault.com>)
List pgsql-general
Greetings,

* Yashwanth Govinda Setty (ygovindasetty@commvault.com) wrote:
> 1.  Create a big table. Identify the physical file on the disk.
> 2.  While the backup process is backing up a file associated with the table, update the rows and add a column.
> 3.  Restore the server with transaction logs.
>
> We are backing up (copying) the entire postgres data directory. The database/table file being backed up (copied) is
> modified by running alter/update queries on the table.
> When we restore the copied data directory and replay/apply the transaction logs, will the server be restored to a
> healthy state?

You haven't really spelled out your actual backup process but it
certainly sounds like it's lacking.

> (The files modified during backup can be corrupted, will this affect the restore?)

With a proper PG-style filesystem-backup, any corruption due to ongoing
writes from PG will be handled by the transaction log.

To perform a proper PG-style filesystem-backup, you need to:

- Ensure that WAL is being archived somewhere.  This can be done with
  archive_command or with pg_receivewal.  All WAL archived during the
  backup *must* be saved or the backup will be inconsistent and
  incomplete (and, therefore, basically useless).

- Make sure to run pg_start_backup() before you begin copying *any*
  files.

- Make sure to run pg_stop_backup() after you have copied all files.

- Verify that all of the WAL generated between the pg_start_backup()
  call and the pg_stop_backup() call has been archived.  The
  information about what WAL is needed is returned from those calls.

- On restore, create and populate the backup_label file in the data
  directory to indicate that a backup is being restored.
  (Alternatively, create that file during the backup itself and store it
  in the backup system, to be restored when the backup is restored.)
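
The WAL verification step in the list above can be sketched roughly as
follows.  This is only an illustration, not how any particular tool does
it.  It assumes you already have the first and last WAL segment file
names for the backup (for example by passing the LSNs returned from
pg_start_backup() and pg_stop_backup() to pg_walfile_name()), that the
timeline did not change during the backup, and that the server uses the
default 16MB WAL segment size.  The function names are made up for this
example.

```python
from typing import Iterable, List

# Assumption: default 16MB wal_segment_size, which gives 256 segments
# per 4GB "log" number. A non-default segment size changes this value.
SEGMENTS_PER_LOG = 256


def wal_segments_between(first: str, last: str) -> List[str]:
    """List every WAL segment file name from first to last, inclusive.

    Names are the usual 24 hex digits: timeline, log number, segment.
    """
    timeline = first[:8]
    if last[:8] != timeline:
        raise ValueError("timeline changed during the backup")

    def to_number(name: str) -> int:
        # Flatten (log, segment) into one running segment counter.
        return int(name[8:16], 16) * SEGMENTS_PER_LOG + int(name[16:24], 16)

    return [
        "%s%08X%08X" % (timeline, n // SEGMENTS_PER_LOG, n % SEGMENTS_PER_LOG)
        for n in range(to_number(first), to_number(last) + 1)
    ]


def missing_from_archive(first: str, last: str,
                         archived: Iterable[str]) -> List[str]:
    """Return the required segments that are not present in the archive."""
    have = set(archived)
    return [seg for seg in wal_segments_between(first, last)
            if seg not in have]
```

If missing_from_archive() returns anything other than an empty list, the
backup should not be reported as successful, since the WAL needed to make
the copied files consistent is not all there.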

Ideally, you'd also verify the page-level checksums (if they're enabled)
when doing the backup, calculate your own checksum of the file (to
detect if it gets corrupted between the backup time and the restore
time) and verify that everything is physically written out to permanent
storage before claiming to have a successful backup.
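
As a rough sketch of the last two points (your own per-file checksum, and
making sure the copy is really on permanent storage), something like the
following could be run against each file in the backup destination.  It
assumes a POSIX system such as Linux, where fsync() on a read-only
descriptor and on a directory is permitted.  The choice of SHA-256 and
the function names are assumptions for this example.  It does not cover
verifying PG's page-level checksums.

```python
import hashlib
import os


def sha256_of_file(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Checksum a backed-up file so later corruption can be detected."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large relation files need not fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fsync_file_and_dir(path: str) -> None:
    """Flush a copied file, and its directory entry, to stable storage."""
    fd = os.open(path, os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)
    # Sync the containing directory too, so the file's name is durable.
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Store the checksum alongside the backup and recompute it at restore time;
a mismatch means the file changed after it was backed up.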

Further information is available here:

https://www.postgresql.org/docs/current/static/continuous-archiving.html#BACKUP-BASE-BACKUP

Generally speaking, however, I would strongly discourage people from
trying to write yet-another-PG-backup-tool.  There are several already,
my favorite being pgbackrest, and contributing to one of them would be
a better approach.

Thanks!

Stephen
