Thread: Will Altering and Modifying tables during backup result in a corrupted server after the restore?

Hi All,

 

We are testing the following scenario:

1. Create a big table and identify its physical file on the disk.
2. While the backup process is copying a file associated with the table, update the rows and add a column.
3. Restore the server with transaction logs.

 

We are backing up (copying) the entire Postgres data directory. While the table's file is being copied, it is modified by running ALTER/UPDATE queries on the table.

When we restore the copied data directory and replay/apply the transaction logs, will the server be restored to a healthy state?

(The files modified during the backup may be copied in a corrupted state. Will this affect the restore?)

 

Thanks,

Yashwanth

 

***************************Legal Disclaimer***************************
"This communication may contain confidential and privileged material for the
sole use of the intended recipient. Any unauthorized review, use or distribution
by others is strictly prohibited. If you have received the message by mistake,
please advise the sender by reply email and delete the message. Thank you."
**********************************************************************
I'm not sure whether that would corrupt the backup, but I wouldn't take a backup just by copying the data directory.

Use pg_basebackup instead; it's safer.

Regards,

Alvaro Aguayo
Operations Manager
Open Comb Systems E.I.R.L.

Office: (+51-1) 3377813 | Mobile: (+51) 995540103 | (+51) 954183248
Web: www.ocs.pe

----- Original Message -----
From: "Yashwanth Govinda Setty" <ygovindasetty@commvault.com>
To: "PostgreSql-general" <pgsql-general@postgresql.org>
Sent: Monday, 21 May, 2018 10:03:18
Subject: Will Altering and Modifying tables during backup result in a corrupted server after the restore? 

Hi All,

We are trying this scenario:
Here are the steps being done:

1. Creating a big table. Identify the physical file on the disk.
2. While backup process is backing up a file associated with the table - update the rows, add a column.
3. Restore the server with transaction logs

We are backing up (copying) the entire postgres data directory. The database/table file being backed up (copied), is
modified by running alter/update queries on the table.

When we restore the copied data directory and replay/apply the transaction logs, will the server be restored to a
healthy state?
 
(The files modified during backup can be corrupted, will this affect the restore?)

Thanks,
Yashwanth



Greetings,

* Yashwanth Govinda Setty (ygovindasetty@commvault.com) wrote:
> 1. Creating a big table. Identify the physical file on the disk.
> 2. While backup process is backing up a file associated with the table - update the rows, add a column.
> 3. Restore the server with transaction logs
>
> We are backing up (copying) the entire postgres data directory. The database/table file being backed up (copied), is
> modified by running alter/update queries on the table.
> When we restore the copied data directory and replay/apply the transaction logs, will the server be restored to a
> healthy state?

You haven't really spelled out your actual backup process but it
certainly sounds like it's lacking.

> (The files modified during backup can be corrupted, will this affect the restore?)

With a proper PG-style filesystem-backup, any corruption due to ongoing
writes from PG will be handled by the transaction log.

To perform a proper PG-style filesystem-backup, you need to:

- Ensure that WAL is being archived somewhere.  This can be done with
  archive_command or with pg_receivewal.  All WAL archived during the
  backup *must* be saved or the backup will be inconsistent and
  incomplete (and, therefore, basically useless).

- Make sure to run pg_start_backup() before you begin copying *any*
  files.

- Make sure to run pg_stop_backup() after you have copied all files.

- Verify that all of the WAL generated between the pg_start_backup()
  call and the pg_stop_backup() call has been archived.  The
  information about what WAL is needed is returned from those calls.

- On restore, create and populate the backup_label file in the data
  directory to indicate that it was a backup being restored.
  (Alternatively, create that file during the backup itself and store it
  in the backup system, to be restored when the backup is restored).
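
The ordering of the steps above can be sketched as follows. This is only a
minimal illustration, not a backup tool: `execute`, `copy_files` and
`backup_dir` are hypothetical names supplied by the caller, and the SQL
assumes the non-exclusive form of pg_start_backup()/pg_stop_backup() as it
exists in PostgreSQL 9.6 through 14 (the functions were renamed in version
15). WAL archiving is assumed to be configured already.

```python
from pathlib import Path


def run_nonexclusive_backup(execute, copy_files, backup_dir, label="sketch"):
    """Order the steps of a non-exclusive filesystem backup.

    execute(sql, params) is a hypothetical callable that runs SQL on ONE
    open session and returns rows as a list of tuples. copy_files() is a
    hypothetical callable that copies the data directory.
    """
    # Step 1: mark the start before any file is copied.
    # The third argument (false) selects the non-exclusive mode.
    start_lsn = execute(
        "SELECT pg_start_backup(%s, false, false)", (label,)
    )[0][0]

    # Step 2: copy the files. The session used above must stay connected.
    copy_files()

    # Step 3: mark the end. The contents of backup_label (and of
    # tablespace_map, if any) come back as columns of the result row.
    stop_lsn, label_text, spcmap_text = execute(
        "SELECT lsn, labelfile, spcmapfile FROM pg_stop_backup(false)", ()
    )[0]

    # Step 4: store backup_label with the backup so it exists on restore.
    Path(backup_dir, "backup_label").write_bytes(label_text.encode("utf-8"))
    if spcmap_text:
        Path(backup_dir, "tablespace_map").write_bytes(
            spcmap_text.encode("utf-8")
        )
    return start_lsn, stop_lsn
```

The two returned LSNs bound the WAL that must be present in the archive
before the backup can be considered usable.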

Ideally, you'd also verify the page-level checksums (if they're enabled)
when doing the backup, calculate your own checksum of the file (to
detect if it gets corrupted between the backup time and the restore
time) and verify that everything is physically written out to permanent
storage before claiming to have a successful backup.
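
As a rough sketch of the "calculate your own checksum" part, a backup
could record a SHA-256 digest per file and compare it again at restore
time. The function names below are made up for illustration. This detects
changes to the backup copy only; it does not verify PostgreSQL's
page-level checksums.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(backup_dir):
    """Map each file's path (relative to backup_dir) to its digest."""
    root = Path(backup_dir)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(backup_dir, manifest):
    """List files that were added, removed or altered since the manifest."""
    current = build_manifest(backup_dir)
    return sorted(
        name
        for name in manifest.keys() | current.keys()
        if manifest.get(name) != current.get(name)
    )
```

The manifest would be built right after the copy finishes and stored with
the backup; an empty result from changed_files() before restoring suggests
the copy has not changed in storage.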


Further information is available here:

https://www.postgresql.org/docs/current/static/continuous-archiving.html#BACKUP-BASE-BACKUP

Generally speaking, however, I would strongly discourage people from
trying to write yet-another-PG-backup-tool.  There are several already,
my favorite being pgbackrest, and contributing to one of them would be
a better approach.

Thanks!

Stephen

## Yashwanth Govinda Setty (ygovindasetty@commvault.com):

>   2.  Restore the server with transaction logs

This is missing a lot of details. If you do it right - see your email
thread from one week ago - you will be able to recover the database
server to a state as of the _end_ of the backup process (as marked by
the return of the pg_stop_backup() command).
If you do not follow the backup/restore documentation to the letter,
the database will be corrupted and will not start (sometimes people
report success with haphazard backup schemes, but that's just more luck
than they deserve, and nobody should rely on that).
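
One check that is part of doing it right is confirming that every WAL
segment between the start and stop locations of the backup is in the
archive. A minimal sketch, assuming the default 16MB segment size, a
backup that stays on a single timeline, and LSNs in the textual 'X/Y'
form; the function names are invented for illustration:

```python
WAL_SEG_SIZE = 16 * 1024 * 1024  # default segment size; an assumption


def parse_lsn(lsn):
    """Convert an LSN such as '0/2000028' to a 64-bit integer."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def wal_segment_name(timeline, segno, seg_size=WAL_SEG_SIZE):
    """Build the 24-hex-digit file name of a WAL segment."""
    segs_per_id = 0x100000000 // seg_size
    return "%08X%08X%08X" % (
        timeline, segno // segs_per_id, segno % segs_per_id
    )


def required_segments(start_lsn, stop_lsn, timeline=1,
                      seg_size=WAL_SEG_SIZE):
    """Name every segment from the start LSN through the stop LSN."""
    first = parse_lsn(start_lsn) // seg_size
    # A stop LSN exactly on a segment boundary belongs to the previous
    # segment, hence the "- 1".
    last = (parse_lsn(stop_lsn) - 1) // seg_size
    return [wal_segment_name(timeline, n, seg_size)
            for n in range(first, last + 1)]


def missing_segments(start_lsn, stop_lsn, archived_names, timeline=1):
    """Return the required segments that are absent from the archive."""
    archived = set(archived_names)
    return [name
            for name in required_segments(start_lsn, stop_lsn, timeline)
            if name not in archived]
```

For example, required_segments("0/2000028", "0/3000130") names segments
000000010000000000000002 and 000000010000000000000003; a backup is only
usable once missing_segments() comes back empty for its LSN range.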

Regards,
Christoph

-- 
Spare Space


Greetings,

* Christoph Moench-Tegeder (cmt@burggraben.net) wrote:
> ## Yashwanth Govinda Setty (ygovindasetty@commvault.com):
>
> >   2.  Restore the server with transaction logs
>
> This is missing a lot of details. If you do it right - see your email
> thread from one week ago - you will be able to recover the database
> server to a state as of the _end_ of the backup process (as marked by
> the return of the pg_stop_backup() command).
> If you do not follow the backup/restore documentation to the letter,
> the database will be corrupted and will not start (sometimes people
> report success with haphazard backup schemes, but that's just more luck
> than they deserve, and nobody should rely on that).

Please also note that the examples provided in the PG documentation are
purely for usage demonstration and shouldn't be considered a good idea
when it comes to implementing an actual solution.

Using only "cp" for archive_command is a particularly bad idea as it
doesn't sync the file to disk.  Be sure to also heed the recommendation
about using the non-exclusive backup method and *not* using the
exclusive backup method.
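
To illustrate what a plain "cp" leaves out, here is a minimal sketch of an
archiving helper that syncs the copied segment and its directory entry
before reporting success. The function name and arguments are
hypothetical, and the directory fsync assumes a POSIX system. It also
refuses to overwrite an existing file, as the archiving documentation
advises.

```python
import os
import shutil


def archive_wal_segment(src_path, archive_dir):
    """Copy one WAL segment into archive_dir and sync it to disk."""
    dest = os.path.join(archive_dir, os.path.basename(src_path))
    if os.path.exists(dest):
        # Never overwrite a segment that is already archived.
        raise FileExistsError(dest)

    # Write under a temporary name so a partial copy is never mistaken
    # for a complete segment.
    tmp = dest + ".tmp"
    with open(src_path, "rb") as src, open(tmp, "wb") as out:
        shutil.copyfileobj(src, out)
        out.flush()
        os.fsync(out.fileno())  # force the file contents to storage

    os.rename(tmp, dest)

    # Sync the directory so the new name itself survives a crash.
    dir_fd = os.open(archive_dir, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
    return dest
```

A wrapper script called from archive_command would need to exit non-zero
whenever this raises, so that the server keeps the segment and retries.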

Thanks!

Stephen
