Re: File based Incremental backup v8 - Mailing list pgsql-hackers
From | Fujii Masao |
---|---|
Subject | Re: File based Incremental backup v8 |
Date | |
Msg-id | CAHGQGwHbya2tCg6r0gPPERjJQKWAYdTgS52H2gqEDe14p1_5ig@mail.gmail.com |
In response to | Re: File based Incremental backup v8 (Marco Nenciarini <marco.nenciarini@2ndquadrant.it>) |
Responses | Re: File based Incremental backup v8 |
List | pgsql-hackers |
On Tue, Mar 3, 2015 at 12:36 AM, Marco Nenciarini <marco.nenciarini@2ndquadrant.it> wrote:
> On 02/03/15 14:21, Fujii Masao wrote:
>> On Thu, Feb 12, 2015 at 10:50 PM, Marco Nenciarini
>> <marco.nenciarini@2ndquadrant.it> wrote:
>>> Hi,
>>>
>>> I've attached an updated version of the patch.
>>
>> basebackup.c:1565: warning: format '%lld' expects type 'long long
>> int', but argument 8 has type '__off_t'
>> basebackup.c:1565: warning: format '%lld' expects type 'long long
>> int', but argument 8 has type '__off_t'
>> pg_basebackup.c:865: warning: ISO C90 forbids mixed declarations and code
>>
>
> I'll add an explicit cast at those two lines.
>
>> When I applied three patches and compiled the code, I got the above warnings.
>>
>> How can we get the full backup that we can use for the archive recovery, from
>> the first full backup and subsequent incremental backups? What commands should
>> we use for that, for example? It's better to document that.
>>
>
> I've sent a python PoC that supports the plain format only (not the tar one).
> I'm currently rewriting it in C (also with tar support) and I'll send a new patch containing it ASAP.

Yeah, if a special tool is required for that purpose, the patch should include it.

>> What does "1" of the heading line in backup_profile mean?
>>
>
> Nothing. It's a version number. If you think it's misleading I will remove it.

A version number of the file format of the backup profile? If it's required for
the validation of the backup profile file as a safeguard, it should be included
in the profile file. For example, it might be useful to check whether the
pg_basebackup executable is compatible with the "source" backup that you
specify. But more info might be needed for such validation.

>> Sorry if this has been already discussed so far. Why is a backup profile file
>> necessary? Maybe it's necessary in the future, but currently seems not.
>
> It's necessary because it's the only way to detect deleted files.

Maybe I'm missing something. It seems we can detect that even without a profile.
For example, imagine the case where a file has been deleted since the last
full backup and then an incremental backup is taken. In this case, the
deleted file exists only in the full backup. We can detect the deletion of
the file by checking both the full and incremental backups.

>> Have we really reached consensus on the current design, especially that
>> basically every file needs to be read to check whether it has been modified
>> since the last backup, even when *no* modification has happened since the last backup?
>
> The real problem here is that there is currently no way to detect that a file has not changed since the last backup. We agreed to not use file system timestamps as they are not reliable for that purpose.

TBH I prefer a timestamp-based approach in the first version of incremental
backup even if it's less reliable than the LSN-based one. I think that some users
who are using timestamp-based rsync (i.e., the default mode) for backups would
be satisfied with a timestamp-based one.

> Using LSN has a significant advantage over using checksums, as we can start the full copy as soon as we find a block with an LSN greater than the threshold.
> There are two cases: 1) the file has changed, so we can assume that we detect it after reading 50% of the file, then we send it taking advantage of the file system cache; 2) the file has not changed, so we read it without sending anything.
> It will end up producing I/O comparable to a normal backup.

Yeah, it might make the situation better than today. But I'm afraid that many
users might be disappointed by that behavior of an incremental
backup after the release...

Regards,

-- 
Fujii Masao
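
The deletion check suggested in the reply above (compare the full backup with the incremental one) can be sketched in Python. This is only a minimal illustration, not part of the patch: the function names `list_relative_files` and `deleted_since_full` are hypothetical, and it assumes plain-format backups in which every file that still exists when the incremental backup is taken leaves some entry in the incremental backup directory, even if unchanged. The message does not state that assumption; without it, an unchanged file and a deleted file would look the same.

```python
import os


def list_relative_files(root):
    """Return the set of regular-file paths under root, relative to root."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            found.add(os.path.relpath(full_path, root))
    return found


def deleted_since_full(full_dir, incr_dir):
    """List files present in the full backup but absent from the incremental one.

    Assumption (hypothetical layout): the incremental backup directory holds
    an entry for every file that still existed at incremental-backup time,
    so a path missing from it means the file was deleted.
    """
    return sorted(list_relative_files(full_dir) - list_relative_files(incr_dir))
```

Calling `deleted_since_full("/path/to/full", "/path/to/incr")` would return the relative paths to drop when merging the two backups; the directory paths here are placeholders.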