Re: [RFC] Incremental backup v2: add backup profile to base backup - Mailing list pgsql-hackers

From Marco Nenciarini
Subject Re: [RFC] Incremental backup v2: add backup profile to base backup
Date
Msg-id 54327D42.9020504@2ndquadrant.it
In response to Re: [RFC] Incremental backup v2: add backup profile to base backup  (Andres Freund <andres@2ndquadrant.com>)
List pgsql-hackers
On 03/10/14 23:12, Andres Freund wrote:
> On 2014-10-03 17:31:45 +0200, Marco Nenciarini wrote:
>> I've updated the wiki page
>> https://wiki.postgresql.org/wiki/Incremental_backup following the result
>> of discussion on hackers.
>>
>> Compared to first version, we switched from a timestamp+checksum based
>> approach to one based on LSN.
>>
>> This patch adds an option to pg_basebackup and to replication protocol
>> BASE_BACKUP command to generate a backup_profile file. It is almost
>> useless by itself, but it is the foundation on which we will build the
>> file based incremental backup (and hopefully a block based incremental
>> backup after it).
>>
>> Any comment will be appreciated. In particular I'd appreciate comments
>> on correctness of relnode files detection and LSN extraction code.
>
> Can you describe the algorithm you implemented in words?
>


Here is the relnode file detection algorithm:

I've added a has_relfiles parameter to the sendDir function. If
has_relfiles is true, every file in the directory is tested with the
validateRelfilenodename function. If it returns true, the maxLSN
value is computed for the file.

The sendDir function is called with has_relfiles=true by the
sendTablespace function, and by sendDir itself when it is recursing into
a subdirectory:

* if has_relfiles is true
* if we are recursing into a "./global" or "./base" directory
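As a rough illustration of the recursion rule above, here is a minimal sketch. The helper name child_has_relfiles is hypothetical and is not part of the patch; it only encodes the two conditions under which sendDir passes has_relfiles=true to itself.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper (not in the patch): decide the has_relfiles value
 * to pass when sendDir recurses into the subdirectory subdir_path.
 */
static bool
child_has_relfiles(bool parent_has_relfiles, const char *subdir_path)
{
    /* Once inside a directory that holds relation files, stay in that mode. */
    if (parent_has_relfiles)
        return true;

    /* Otherwise only "./global" and "./base" switch the flag on. */
    return strcmp(subdir_path, "./global") == 0 ||
           strcmp(subdir_path, "./base") == 0;
}
```

With this rule, child_has_relfiles(false, "./base") is true, so the per-database directories below it inherit the flag, while a directory such as "./pg_xlog" reached from the top level does not get it.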

The validateRelfilenodename function has been taken from the
pg_computemaxlsn patch.

It's short enough to be pasted here:

static bool
validateRelfilenodename(char *name)
{
    int            pos = 0;

    while ((name[pos] >= '0') && (name[pos] <= '9'))
        pos++;

    if (name[pos] == '_')
    {
        pos++;
        while ((name[pos] >= 'a') && (name[pos] <= 'z'))
            pos++;
    }
    if (name[pos] == '.')
    {
        pos++;
        while ((name[pos] >= '0') && (name[pos] <= '9'))
            pos++;
    }

    if (name[pos] == 0)
        return true;
    return false;
}
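To make the accepted name pattern concrete (digits, an optional "_" plus lowercase fork suffix, an optional "." plus segment number), here is a small standalone sketch. It repeats the same checks with a const argument so it compiles on its own; count_accepted is a hypothetical helper added only for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Same checks as the function quoted above, taking a const pointer. */
static bool
validateRelfilenodename(const char *name)
{
    int pos = 0;

    /* relfilenode: a run of digits */
    while (name[pos] >= '0' && name[pos] <= '9')
        pos++;

    /* optional fork suffix such as "_fsm" or "_vm" */
    if (name[pos] == '_')
    {
        pos++;
        while (name[pos] >= 'a' && name[pos] <= 'z')
            pos++;
    }

    /* optional segment number such as ".1" */
    if (name[pos] == '.')
    {
        pos++;
        while (name[pos] >= '0' && name[pos] <= '9')
            pos++;
    }

    /* valid only if the whole name was consumed */
    return name[pos] == 0;
}

/* Hypothetical helper (not in the patch): count the accepted names. */
static size_t
count_accepted(const char *const *names, size_t n)
{
    size_t i, accepted = 0;

    for (i = 0; i < n; i++)
        if (validateRelfilenodename(names[i]))
            accepted++;
    return accepted;
}
```

Names such as "16384", "16384_fsm", "16384.1" and "16384_vm.2" pass, while "PG_VERSION", "pg_filenode.map" and "t3_16384" do not. Note that, as written, every component is optional, so an empty string also passes; that does not matter in practice as long as directory entries always have a non-empty name.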


To compute the maxLSN for a file, I rely on the fact that the file is
sent in TAR_SEND_SIZE chunks (32 kB), a size that is always a multiple
of the block size, so every chunk contains only whole pages. I've added
the following code inside the send loop:


+   char *page;
+
+   /* Scan every page to find the max file LSN */
+   for (page = buf; page < buf + (off_t) cnt; page += (off_t) BLCKSZ) {
+       pagelsn = PageGetLSN(page);
+       if (filemaxlsn < pagelsn)
+           filemaxlsn = pagelsn;
+   }
+
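For readers without the PostgreSQL headers at hand, here is a standalone sketch of the same scan. It is an approximation under stated assumptions, not the patch code: the SKETCH_ and sketch_ names are hypothetical, the block size is assumed to be the default 8192 bytes, and the page LSN is assumed to be stored in the first 8 bytes of the page as two 32-bit halves (high, then low) in host byte order, which is what PageGetLSN reads.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_BLCKSZ 8192          /* assumed default block size */

typedef uint64_t SketchLSN;         /* stand-in for XLogRecPtr */

/* Read the LSN from the start of a page: high 32 bits, then low 32 bits. */
static SketchLSN
sketch_page_get_lsn(const char *page)
{
    uint32_t hi, lo;

    memcpy(&hi, page, sizeof(hi));
    memcpy(&lo, page + sizeof(hi), sizeof(lo));
    return ((SketchLSN) hi << 32) | lo;
}

/*
 * Fold one chunk of cnt bytes into the running maximum for the file.
 * A trailing partial page, if any, is skipped in this sketch.
 */
static SketchLSN
sketch_chunk_max_lsn(const char *buf, size_t cnt, SketchLSN filemaxlsn)
{
    size_t off;

    for (off = 0; off + SKETCH_BLCKSZ <= cnt; off += SKETCH_BLCKSZ)
    {
        SketchLSN pagelsn = sketch_page_get_lsn(buf + off);

        if (filemaxlsn < pagelsn)
            filemaxlsn = pagelsn;
    }
    return filemaxlsn;
}
```

The caller would start with filemaxlsn = 0 for each file and pass the result of one chunk as the input of the next. With 32 kB chunks and 8 kB blocks, each full chunk covers exactly four pages.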

Regards,
Marco

--
Marco Nenciarini - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
marco.nenciarini@2ndQuadrant.it | www.2ndQuadrant.it

