Re: block-level incremental backup - Mailing list pgsql-hackers

From	vignesh C
Subject	Re: block-level incremental backup
Date	July 31, 2019 17:59:30
Msg-id	CALDaNm310fUZ72nM2n=cD0eSHKRAoJPuCyvvR0dhTEZ9Oytyzg@mail.gmail.com
In response to	Re: block-level incremental backup (Robert Haas <robertmhaas@gmail.com>)
Responses	Re: block-level incremental backup
List	pgsql-hackers

On Tue, Jul 30, 2019 at 1:58 AM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Wed, Jul 10, 2019 at 2:17 PM Anastasia Lubennikova
> <a.lubennikova@postgrespro.ru> wrote:
> > In attachments, you can find a prototype of incremental pg_basebackup,
> > which consists of 2 features:
> >
> > 1) To perform incremental backup one should call pg_basebackup with a
> > new argument:
> >
> > pg_basebackup -D 'basedir' --prev-backup-start-lsn 'lsn'
> >
> > where lsn is a start_lsn of parent backup (can be found in
> > "backup_label" file)
> >
> > It calls BASE_BACKUP replication command with a new argument
> > PREV_BACKUP_START_LSN 'lsn'.
> >
> > For datafiles, only pages with LSN > prev_backup_start_lsn will be
> > included in the backup.
>>
One thought: if the file has not been modified since the previous backup,
there is no need to check the LSN of its pages at all.
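
As a rough illustration of this idea (not taken from the posted patch), the
sketch below selects blocks whose page LSN is newer than the parent backup's
start LSN and skips the scan entirely when the file is known to be unmodified.
The function names and the file_modified flag are hypothetical, and how
"unmodified" would be determined reliably is left open here.

```python
def parse_lsn(text):
    """Convert PostgreSQL's 'hi/lo' hexadecimal LSN text into one integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def blocks_to_back_up(page_lsns, prev_backup_start_lsn, file_modified=True):
    """Return the block numbers that belong in the incremental backup.

    page_lsns: page LSNs as text, where the list index is the block number.
    file_modified: hypothetical flag supplied by the caller; when False,
    no page LSN is examined.
    """
    if not file_modified:
        # Nothing changed in this file, so no per-page check is needed.
        return []
    threshold = parse_lsn(prev_backup_start_lsn)
    return [
        blkno
        for blkno, lsn in enumerate(page_lsns)
        if parse_lsn(lsn) > threshold
    ]
```

For example, with page LSNs 0/1000, 0/3000, 0/1500, 0/4000 and a parent start
LSN of 0/2000, only blocks 1 and 3 would be selected.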
>>
> > They are saved into 'filename.partial' file, 'filename.blockmap' file
> > contains an array of BlockNumbers.
> > For example, if we backuped blocks 1,3,5, filename.partial will contain
> > 3 blocks, and 'filename.blockmap' will contain array {1,3,5}.
>
> I think it's better to keep both the information about changed blocks
> and the contents of the changed blocks in a single file.  The list of
> changed blocks is probably quite short, and I don't really want to
> double the number of files in the backup if there's no real need. I
> suspect it's just overall a bit simpler to keep everything together.
> I don't think this is a make-or-break thing, and welcome contrary
> arguments, but that's my preference.
>
I feel Robert's suggestion is good.
We could also keep one metadata file per backup with some basic information
about all the files being backed up. This metadata file would be useful in
the following cases:
1) A table is dropped before the incremental backup.
2) A table is truncated and then has Insert/Update/Delete operations before
the incremental backup.

I feel that if we have this metadata, we can use it to detect the above
scenarios: it lets us identify deleted files, and so avoid the write followed
by delete in pg_combinebackup that Jeevan described in his previous mail.
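
A minimal sketch of how such metadata could be compared, assuming a
hypothetical format in which each backup's metadata maps a relative file path
to its size in blocks. Neither this format nor the classify_files function
comes from the patch; they only illustrate the comparison.

```python
def classify_files(parent_meta, current_meta):
    """Compare two per-backup metadata maps (path -> size in blocks).

    Returns files that disappeared, files that are new, and files that
    became shorter between the parent backup and the current one.
    """
    parent_paths = set(parent_meta)
    current_paths = set(current_meta)
    dropped = sorted(parent_paths - current_paths)
    created = sorted(current_paths - parent_paths)
    shrunk = sorted(
        path
        for path in parent_paths & current_paths
        if current_meta[path] < parent_meta[path]
    )
    return {"dropped": dropped, "created": created, "shrunk": shrunk}
```

With this information, a combine step could skip the files listed under
"dropped" up front instead of copying them from the parent backup and
deleting them afterwards.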

It can probably also help us decide how to divide the work among workers
if we plan to take backups in parallel.
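
One possible way to use the metadata for this, sketched under the assumption
that it records a size for each file: assign files to workers greedily,
largest first, always giving the next file to the least-loaded worker. The
assign_files function and its inputs are hypothetical, and this is only one
of several reasonable partitioning strategies.

```python
import heapq


def assign_files(file_sizes, n_workers):
    """Distribute files over workers so that total sizes stay balanced.

    file_sizes: mapping of file path -> size (any consistent unit).
    Returns a mapping of worker index -> list of assigned paths.
    """
    # Min-heap of (current load, worker index).
    heap = [(0, worker) for worker in range(n_workers)]
    heapq.heapify(heap)
    assignment = {worker: [] for worker in range(n_workers)}
    # Largest files first; ties broken by path for a stable result.
    ordered = sorted(file_sizes.items(), key=lambda item: (-item[1], item[0]))
    for path, size in ordered:
        load, worker = heapq.heappop(heap)
        assignment[worker].append(path)
        heapq.heappush(heap, (load + size, worker))
    return assignment
```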

Regards,
vignesh
EnterpriseDB: http://www.enterprisedb.com
