Thread: [HACKERS] question: data file update when pg_basebackup in progress
Hello,
I'm checking how pg_basebackup works and I have a question (maybe there is no such issue):
When pg_basebackup is launched, a checkpoint is performed first, then all files are transferred to the pg_basebackup client. Is it possible that a data page (say page-N) in a data file is changed after the checkpoint and before pg_basebackup has finished?
If this happens, is it possible that only part of the changed page is transferred to the pg_basebackup client? That is, the pg_basebackup client gets page-N with part of the old content and part of the new content. How does PostgreSQL handle this kind of data page?
Thanks,
Rui Hai
Re: [HACKERS] question: data file update when pg_basebackup in progress
From: David G. Johnston
> When pg_basebackup is launched, a checkpoint is performed first, then all files are transferred to the pg_basebackup client. Is it possible that a data page (say page-N) in a data file is changed after the checkpoint and before pg_basebackup has finished?
I believe so.
> If this happens, is it possible that only part of the changed page is transferred to the pg_basebackup client? That is, the pg_basebackup client gets page-N with part of the old content and part of the new content. How does PostgreSQL handle this kind of data page?
The first write to a page after a checkpoint is always recorded in the WAL as a full page write. Every WAL file since the checkpoint must also be copied to the backed up system. The replay of those WAL files is what brings the remote and local system into sync with respect to all changes since the backup checkpoint.
David J.
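To make the full-page-write argument above concrete, here is a toy Python model, not PostgreSQL code. It assumes a page is just 8192 bytes, that the backup copied a torn mix of old and new content, and that WAL is a list of records. The record layout (`kind`, `page`, `image`, `offset`, `data`) and the helper names `torn_copy` and `replay` are made up for illustration.

```python
PAGE_SIZE = 8192  # PostgreSQL's default block size


def torn_copy(old: bytes, new: bytes, split: int) -> bytes:
    """Simulate a file copy that saw new content up to `split` and old content after it."""
    return new[:split] + old[split:]


def replay(pages: dict, wal: list) -> dict:
    """Apply hypothetical WAL records in order to a copied set of pages."""
    restored = dict(pages)
    for record in wal:
        if record["kind"] == "full_page_image":
            # The whole page is overwritten, so whatever was copied no longer matters.
            restored[record["page"]] = record["image"]
        elif record["kind"] == "delta":
            # A later, smaller change is applied on top of the now-consistent page.
            page = bytearray(restored[record["page"]])
            start = record["offset"]
            page[start:start + len(record["data"])] = record["data"]
            restored[record["page"]] = bytes(page)
    return restored


old_page = b"A" * PAGE_SIZE   # page-N as of the backup checkpoint
new_page = b"B" * PAGE_SIZE   # page-N after its first modification
final_page = bytearray(new_page)
final_page[100:104] = b"CCCC"  # a second, smaller modification
final_page = bytes(final_page)

# WAL written since the checkpoint: first touch is a full page image.
wal = [
    {"kind": "full_page_image", "page": 7, "image": new_page},
    {"kind": "delta", "page": 7, "offset": 100, "data": b"CCCC"},
]

# The backup holds a torn page: half new, half old.
backup = {7: torn_copy(old_page, new_page, PAGE_SIZE // 2)}
restored = replay(backup, wal)
print(restored[7] == final_page)
```

The point of the sketch is that the result does not depend on where the copy was torn: the full page image replaces the page wholesale before any later change is applied.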
On Wed, Apr 26, 2017 at 1:45 AM, David G. Johnston <david.g.johnston@gmail.com> wrote:
> The first write to a page after a checkpoint is always recorded in the WAL
> as a full page write. Every WAL file since the checkpoint must also be
> copied to the backed up system. The replay of those WAL files is what
> brings the remote and local system into sync with respect to all changes
> since the backup checkpoint.
This brings up the point that the presence of backup_label in a backup is critical, as it tells Postgres from which position in WAL it should begin recovery to bring the system up to a consistent state. pg_basebackup also makes sure that the last WAL segment needed is archived before the backup completes, so that recovery can run to completion.
--
Michael
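As a rough illustration of what backup_label carries, the Python sketch below pulls the starting WAL position out of a sample label. The sample text is an assumption written for this example, not output from a real backup; the exact set of lines varies by PostgreSQL version. The helper names `parse_lsn` and `recovery_start` are hypothetical.

```python
import re

# Illustrative content only; a real backup_label has more lines.
SAMPLE_BACKUP_LABEL = """\
START WAL LOCATION: 0/2000028 (file 000000010000000000000002)
CHECKPOINT LOCATION: 0/2000060
BACKUP METHOD: streamed
LABEL: pg_basebackup base backup
"""


def parse_lsn(text: str) -> int:
    """Convert an LSN written as 'hi/lo' (two hex numbers) into a 64-bit integer."""
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def recovery_start(label: str) -> tuple:
    """Return (start LSN, WAL segment file name) from backup_label text."""
    match = re.search(
        r"^START WAL LOCATION: ([0-9A-Fa-f]+/[0-9A-Fa-f]+) \(file ([0-9A-F]{24})\)$",
        label,
        re.MULTILINE,
    )
    if match is None:
        raise ValueError("no START WAL LOCATION line found")
    return parse_lsn(match.group(1)), match.group(2)


lsn, segment = recovery_start(SAMPLE_BACKUP_LABEL)
print(hex(lsn), segment)
```

Without this file, a restored data directory has no record of where the backup started, which is why its presence matters for reaching a consistent state.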