On 08/18/2014 08:05 AM, Alvaro Herrera wrote:
> Marco Nenciarini wrote:
>
>> To calculate the md5 checksum I've used the md5 code present in pgcrypto
>> contrib as the code in src/include/libpq/md5.h is not suitable for large
>> files. Since a core feature cannot depend on a piece of contrib, I've
>> moved the files
>>
>> contrib/pgcrypto/md5.c
>> contrib/pgcrypto/md5.h
>>
>> to
>>
>> src/backend/utils/hash/md5.c
>> src/include/utils/md5.h
>>
>> changing the pgcrypto extension to use them.
>
> We already have the FNV checksum implementation in the backend -- can't
> we use that one for this and avoid messing with MD5?
>
> (I don't think we're looking for a cryptographic hash here. Am I wrong?)

Hmm. Any user who can update a table can craft an update such that the
file's checksum matches that of an older backup. That may seem like an
onerous task: to correctly calculate the checksum of a file in a
previous backup, you need to know the LSNs and the exact data,
including deleted data, on every block in the table, and then construct
a suitable INSERT or UPDATE that modifies the table such that you get a
collision. But for some tables it could be trivial; you might know that
a table was bulk-loaded with a particular LSN and has no dead tuples.
Or you can simply create your own table and insert exactly the data you
want. Messing with your own table might seem harmless, but it lets you,
for example, construct a case where an index points to a tuple that no
longer exists, or where a row doesn't pass a CHECK constraint that was
added later. Even if there's no direct security issue with that, you
don't want that kind of uncertainty from a backup solution.
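
To give an idea of how little a non-cryptographic checksum buys here,
below is a standalone toy program (not backend code; it uses the
textbook 32-bit FNV-1a, and the buffer contents are made up) that
brute-forces eight spare bytes of a tampered buffer until the whole
buffer hashes to the same value as the original. Against a 32-bit
checksum that's on the order of 2^32 attempts, i.e. a minute or two on
a single core:

/* fnv_collide.c: demonstration only -- brute-force an 8-byte suffix so
 * that a tampered buffer has the same 32-bit FNV-1a hash as the
 * original.  This is plain FNV-1a, not the backend's checksum code;
 * the point is only that a ~32-bit non-cryptographic checksum can be
 * matched by trying ~2^32 candidate modifications.
 */
#include <stdio.h>
#include <stdint.h>
#include <string.h>

#define FNV_BASIS 2166136261u
#define FNV_PRIME 16777619u

/* Continue an FNV-1a hash over len bytes, starting from state h. */
static uint32_t
fnv1a_cont(uint32_t h, const unsigned char *buf, size_t len)
{
	for (size_t i = 0; i < len; i++)
	{
		h ^= buf[i];
		h *= FNV_PRIME;
	}
	return h;
}

int
main(void)
{
	unsigned char original[32] = "data as of the old backup";
	unsigned char tampered[32] = "data after the evil update";

	uint32_t target = fnv1a_cont(FNV_BASIS, original, sizeof(original));

	/* Hash everything except the last 8 "free" bytes just once. */
	uint32_t prefix = fnv1a_cont(FNV_BASIS, tampered, sizeof(tampered) - 8);

	/* Expected ~4 billion tries; each try only hashes 8 bytes. */
	for (uint64_t x = 0;; x++)
	{
		unsigned char suffix[8];

		memcpy(suffix, &x, 8);
		if (fnv1a_cont(prefix, suffix, 8) == target)
		{
			memcpy(tampered + sizeof(tampered) - 8, suffix, 8);
			printf("collision after %llu tries: both buffers hash to 0x%08x\n",
				   (unsigned long long) x, (unsigned) target);
			return 0;
		}
	}
}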
But more to the point, I thought the consensus was to use the highest
LSN of all the blocks in the file, no? That's essentially free to
calculate (if you have to read all the data anyway), and isn't
vulnerable to collisions.
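
For what it's worth, here's a rough standalone sketch of that approach,
just to make it concrete (this is not backend code; inside the server
you'd simply call PageGetLSN() on each page as you read it).  It
assumes the default 8 kB block size and a file written by the same
architecture, and reads pd_lsn -- the first eight bytes of each page
header, xlogid then xrecoff -- keeping the highest value seen:

/* max_lsn.c: standalone sketch -- scan a relation file 8 kB at a time
 * and report the highest page LSN found.
 */
#include <stdio.h>
#include <stdint.h>
#include <string.h>

#define BLCKSZ 8192				/* default PostgreSQL block size */

int
main(int argc, char **argv)
{
	if (argc != 2)
	{
		fprintf(stderr, "usage: %s <relation file>\n", argv[0]);
		return 1;
	}

	FILE	   *f = fopen(argv[1], "rb");

	if (!f)
	{
		perror("fopen");
		return 1;
	}

	unsigned char page[BLCKSZ];
	uint64_t	max_lsn = 0;

	while (fread(page, 1, BLCKSZ, f) == BLCKSZ)
	{
		uint32_t	xlogid;
		uint32_t	xrecoff;

		memcpy(&xlogid, page, 4);		/* pd_lsn.xlogid  */
		memcpy(&xrecoff, page + 4, 4);	/* pd_lsn.xrecoff */

		uint64_t	lsn = ((uint64_t) xlogid << 32) | xrecoff;

		if (lsn > max_lsn)
			max_lsn = lsn;
	}
	fclose(f);

	printf("highest page LSN: %X/%X\n",
		   (unsigned) (max_lsn >> 32), (unsigned) max_lsn);
	return 0;
}

Presumably you'd then compare that value against the LSN recorded for
the previous backup to decide whether the file needs to be re-copied;
since the value comes straight from the page headers, there's nothing
for user-supplied data to collide with.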
- Heikki