Re: block-level incremental backup - Mailing list pgsql-hackers

From Jeevan Ladhe
Subject Re: block-level incremental backup
Msg-id CAOgcT0PmGC-etzOKPefYV7wgdMiP+0mmxBfBBoHj0KdZiwcxuQ@mail.gmail.com
In response to Re: block-level incremental backup  (vignesh C <vignesh21@gmail.com>)
Responses Re: block-level incremental backup  (vignesh C <vignesh21@gmail.com>)
List pgsql-hackers
Hi Vignesh,

This backup technology extends pg_basebackup itself, which means we can still
take online backups. Internally, this is done using pg_start_backup and
pg_stop_backup. pg_start_backup performs a checkpoint, and this checkpoint is
used in the recovery process when starting the cluster from a backup image.
What incremental backup changes (compared to a traditional pg_basebackup) is
this: after doing the checkpoint, instead of copying the entire relation files,
it takes an input LSN, scans all the blocks in all relation files, and stores
the blocks having LSN >= InputLSN. This means it considers all the changes
that are already written into the relation files, including
insert/update/delete etc., up to the checkpoint performed internally by
pg_start_backup. As Jeevan Chalke mentioned upthread, the incremental backup
will also contain a copy of the WAL files. Once this incremental backup is
combined with the parent backup by means of the new combine process (which will
be introduced as part of this feature itself), the result should ideally look
like a full pg_basebackup. Note that any changes made by insert/delete/update
operations while the incremental backup was being taken will still be
available via the WAL files and, as in a normal restore, will be replayed from
the checkpoint onwards up to a consistent point.
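
To make the block selection and the combine step above a bit more concrete,
here is a minimal sketch. It is not the patch's code; it only illustrates the
idea. The names changed_blocks and combine are hypothetical, and the sketch
assumes the default 8 kB block size, little-endian storage of the page LSN in
the first 8 bytes of each page, relation files made of whole pages, and an
in-memory representation of the incremental data (the real on-disk format is
not described here). Truncation of the relation is deliberately not handled.

```python
import struct

BLCKSZ = 8192  # assumed: PostgreSQL's default block size


def page_lsn(page: bytes) -> int:
    """Return the LSN kept in the first 8 bytes of a page header."""
    # The page LSN is stored as two 32-bit halves, high half first.
    # Little-endian byte order is an assumption of this sketch.
    hi, lo = struct.unpack_from("<II", page, 0)
    return (hi << 32) | lo


def changed_blocks(relation: bytes, input_lsn: int):
    """Yield (block_number, page) for each page with LSN >= input_lsn."""
    # Assumes len(relation) is a multiple of BLCKSZ.
    for offset in range(0, len(relation), BLCKSZ):
        page = relation[offset:offset + BLCKSZ]
        if page_lsn(page) >= input_lsn:
            yield offset // BLCKSZ, page


def combine(parent: bytes, blocks) -> bytes:
    """Overlay the incremental blocks on the parent copy of the file."""
    out = bytearray(parent)
    for blkno, page in blocks:
        start = blkno * BLCKSZ
        if start > len(out):
            # Zero-fill any gap if the relation grew past the parent's end.
            out.extend(bytes(start - len(out)))
        out[start:start + BLCKSZ] = page
    return bytes(out)
```

With this, list(changed_blocks(data, input_lsn)) plays the role of the
incremental copy of one relation file, and combine(parent_data, ...) rebuilds
what the full copy of that file would have looked like.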

My two cents!

Regards,
Jeevan Ladhe

On Sat, Jul 20, 2019 at 11:22 PM vignesh C <vignesh21@gmail.com> wrote:
Hi Jeevan,

The idea is very nice.
When insert/update/delete and truncate/drop happen in various
combinations, how does the incremental backup handle the copying of the
blocks?


On Wed, Jul 17, 2019 at 8:12 PM Jeevan Chalke
<jeevan.chalke@enterprisedb.com> wrote:
>
>
>
> On Wed, Jul 17, 2019 at 7:38 PM Ibrar Ahmed <ibrar.ahmad@gmail.com> wrote:
>>
>>
>>
>> On Wed, Jul 17, 2019 at 6:43 PM Jeevan Chalke <jeevan.chalke@enterprisedb.com> wrote:
>>>
>>> On Wed, Jul 17, 2019 at 2:15 PM Ibrar Ahmed <ibrar.ahmad@gmail.com> wrote:
>>>>
>>>>
>>>> At what stage will you apply the WAL generated between the START/STOP backup?
>>>
>>>
>>> In this design, we are not touching any WAL-related code. The WAL files will
>>> get copied with each backup, either full or incremental. Thus, the last
>>> incremental backup will have the final WAL files, which will be copied as-is
>>> into the combined full backup, and they will get applied automatically if
>>> that data directory is used to start the server.
>>
>>
>> Ok, so you keep all the WAL files since the first backup, right?
>
>
> The WAL files will be copied anyway while taking a backup (full or incremental),
> but only the last incremental backup's WAL files are copied to the combined
> synthetic full backup.
>
>>>
>>>>
>>>> --
>>>> Ibrar Ahmed
>>>
>>>
>>> --
>>> Jeevan Chalke
>>> Technical Architect, Product Development
>>> EnterpriseDB Corporation
>>>
>>
>>
>> --
>> Ibrar Ahmed
>
>
>
> --
> Jeevan Chalke
> Technical Architect, Product Development
> EnterpriseDB Corporation
>


--
Regards,
vignesh
EnterpriseDB: http://www.enterprisedb.com

