Re: backup manifests - Mailing list pgsql-hackers

From Tels
Subject Re: backup manifests
Msg-id c677dec6e478b2b3459f518354cc4fcf@bloodgate.com
In response to Re: backup manifests  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: backup manifests  (David Steele <david@pgmasters.net>)
Re: backup manifests  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
Moin Robert,

On 2019-11-22 20:01, Robert Haas wrote:
> On Fri, Nov 22, 2019 at 1:10 PM David Steele <david@pgmasters.net> wrote:
>> Well, the maximum amount of data that can be protected with a 32-bit CRC
>> is 512MB according to all the sources I found (NIST, Wikipedia, etc).  I
>> presume that's what we are talking about since I can't find any 64-bit
>> CRC code in core or this patch.
> 
> Could you give a more precise citation for this? I can't find a
> reference to that in the Wikipedia article off-hand and I don't know
> where to look in NIST. I apologize if I'm being dense here, but I
> don't see why there should be any limit on the amount of data that can
> be protected. The important thing is that if the original file F is
> altered to F', we hope that CHECKSUM(F) != CHECKSUM(F'). The
> probability of that, assuming that the alteration is random rather
> than malicious and that the checksum function is equally likely to
> produce every possible output, is just 1-2^-${CHECKSUM_BITS},
> regardless of the length of the message (except that there might be
> some special cases for very short messages, which don't matter here).
> 
> This analysis by me seems to match
> https://en.wikipedia.org/wiki/Cyclic_redundancy_check, which says:
> 
> "Typically an n-bit CRC applied to a data block of arbitrary length
> will detect any single error burst not longer than n bits, and the
> fraction of all longer error bursts that it will detect is (1 −
> 2^−n)."
> 
> Notice the phrase "a data block of arbitrary length" and the formula
> "1 - 2^-n".

It is related to the number of states, and the birthday problem factors
into it, too:

    https://en.wikipedia.org/wiki/Birthday_problem

A 32-bit checksum or hash can represent at most 2**32 distinct states
(fewer, if the algorithm isn't really good).

Each byte is 8 bits, so 2**32 bits / 8 is 512 MByte. If you process your
data bit by bit, each new bit adds a new state (consider: missing bit == 0,
added bit == 1). If each new state were represented by a different checksum,
all possible 2**32 values would be exhausted after processing 512 MByte;
after that you necessarily get one of the former states again - aka a
collision.

There is no way around it with so few bits, no matter what algorithm
you choose.
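
To put rough numbers on the birthday effect, here is a small illustrative
sketch (my own numbers, not anything from the patch), assuming the checksum
output is uniformly distributed over the 32-bit space:

    # approximate probability of at least one collision among n
    # uniformly distributed 32-bit checksum values (birthday bound)
    import math

    def collision_probability(n, bits=32):
        # classic approximation: p ~= 1 - exp(-n*(n-1) / (2 * 2**bits))
        return 1.0 - math.exp(-n * (n - 1) / (2.0 * 2.0**bits))

    for n in (10_000, 100_000, 1_000_000, 10_000_000):
        print(f"{n:>10} checksums -> p(collision) ~= {collision_probability(n):.4f}")

Already at about 100,000 values the chance of at least one collision is
around 70 percent, and at a million it is essentially certain.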

>> > Phrased more positively, if you want a cryptographic hash
>> > at all, you should probably use one that isn't widely viewed as too
>> > weak.
>> 
>> Sure.  There's another advantage to picking an algorithm with lower
>> collision rates, though.
>> 
>> CRCs are fine for catching transmission errors (as caveated above) but
>> not as great for comparing two files for equality.  With strong hashes
>> you can confidently compare local files against the path, size, and hash
>> stored in the manifest and save yourself a round-trip to the remote
>> storage to grab the file if it has not changed locally.
> 
> I agree in part. I think there are two reasons why a cryptographically
> strong hash is desirable for delta restore. First, since the checksums
> are longer, the probability of a false match happening randomly is
> lower, which is important. Even if the above analysis is correct and
> the chance of a false match is just 2^-32 with a 32-bit CRC, if you
> back up ten million files every day, you'll likely get a false match
> within a few years or less, and once is too often. Second, unlike what
> I supposed above, the contents of a PostgreSQL data file are not
> chosen at random, unlike transmission errors, which probably are more
> or less random. It seems somewhat possible that there is an adversary
> who is trying to choose the data that gets stored in some particular
> record so as to create a false checksum match. A CRC is a lot easier
> to fool than a crytographic hash, so I think that using a CRC of *any*
> length for this kind of use case would be extremely dangerous no
> matter the probability of an accidental match.

Agreed. See above.
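
To quantify the scenario above (a back-of-the-envelope sketch only; it
assumes each unchanged-file comparison has an independent 2^-32 chance of a
false match, the best case for a 32-bit checksum, and takes the ten million
files per day from the mail above):

    # expected number of false matches when comparing n files per day
    # against their 32-bit checksums, over a number of years
    files_per_day = 10_000_000
    years = 3
    p_false_match = 2.0 ** -32

    comparisons = files_per_day * 365 * years
    expected = comparisons * p_false_match
    print(f"{comparisons:.2e} comparisons -> ~{expected:.1f} expected false matches")

That comes out to roughly 2.5 expected false matches over three years, which
matches the "within a few years or less" above.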

However, if you choose a hash, please do not go below SHA-256. Both MD5
and SHA-1 already have practical collision attacks, and these are only
bound to get worse.

   https://www.mscs.dal.ca/~selinger/md5collision/
   https://shattered.io/

It might even be a wise idea to encode the hash algorithm used into the
manifest file, so it can be changed later. The hash length alone might not
be enough to decide which algorithm was used.
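
For illustration, one simple way to do that is to prefix each digest with the
algorithm name. This is only a sketch; the "algo:hexdigest  size  path" layout
and the function names below are made up here and are not the format used by
the patch:

    # hypothetical manifest entry: "<algo>:<hexdigest>  <size>  <path>"
    import hashlib

    def manifest_line(path, data, algo="sha256"):
        digest = hashlib.new(algo, data).hexdigest()
        return f"{algo}:{digest}  {len(data)}  {path}"

    def verify(line, data):
        checksum, _size, _path = line.split("  ", 2)
        algo, digest = checksum.split(":", 1)   # algorithm is self-describing
        return hashlib.new(algo, data).hexdigest() == digest

    entry = manifest_line("base/1/1234", b"example file contents")
    print(entry)
    print(verify(entry, b"example file contents"))   # True

With the algorithm carried in the entry itself, a later switch to a different
hash does not invalidate old manifests.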

>> This is the basic premise of what we call delta restore which can speed
>> up restores by orders of magnitude.
>> 
>> Delta restore is the main advantage that made us decide to require SHA1
>> checksums.  In most cases, restore speed is more important than backup
>> speed.
> 
> I see your point, but it's not the whole story. We've encountered a
> bunch of cases where the time it took to complete a backup exceeded
> the user's desired backup interval, which is obviously very bad, or
> even more commonly where it exceeded the length of the user's
> "low-usage" period when they could tolerate the extra overhead imposed
> by the backup. A few percentage points is probably not a big deal, but
> a user who has an 8-hour window to get the backup done overnight will
> not be happy if it's taking 6 hours now and we tack 40%-50% on to
> that. So I think that we either have to disable backup checksums by
> default, or figure out a way to get the overhead down to something a
> lot smaller than what current tests are showing -- which we could
> possibly do without changing the algorithm if we can somehow make it a
> lot cheaper, but otherwise I think the choice is between disabling the
> functionality altogether by default and adopting a less-expensive
> algorithm. Maybe someday when delta restore is in core and widely used
> and CPUs are faster, it'll make sense to revise the default, and
> that's cool, but I can't see imposing a big overhead by default to
> enable a feature core doesn't have yet...

Modern algorithms are amazingly fast on modern hardware; some are even
implemented in hardware nowadays:

  https://software.intel.com/en-us/articles/intel-sha-extensions

Quote from:

  https://neosmart.net/blog/2017/will-amds-ryzen-finally-bring-sha-extensions-to-intels-cpus/

  "Despite the extremely limited availability of SHA extension support
   in modern desktop and mobile processors, crypto libraries have already
   upstreamed support to great effect. Botan’s SHA extension patches show 
a
   significant 3x to 5x performance boost when taking advantage of the 
hardware
   extensions, and the Linux kernel itself shipped with hardware SHA 
support
   with version 4.4, bringing a very respectable 3.6x performance upgrade 
over
   the already hardware-assisted SSE3-enabled code."

If you need to load the data from disk and shove it over a network, the
hashing will certainly be very little overhead; it might even be completely
invisible, since it can run in parallel to all the other things. Sure, there
is the thing called zero-copy networking, but if you have to compress the
data before sending it to the network, you have to put it through the CPU
anyway. And if you have more than one core, the second one can do the
hashing in parallel to the first one doing the compression (see the sketch
below).
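
A minimal sketch of that idea (illustrative only; the chunk size, zlib
compression, and SHA-256 are my assumptions, not what the patch does):

    # hash on a second thread while the main thread compresses, chunk by chunk
    import hashlib, queue, threading, zlib

    CHUNK = 1024 * 1024          # 1 MB read size (arbitrary)

    def hash_worker(q, result):
        h = hashlib.sha256()
        for chunk in iter(q.get, None):        # None marks end of stream
            h.update(chunk)
        result["digest"] = h.hexdigest()

    def compress_and_hash(path):
        q, result = queue.Queue(maxsize=4), {}
        t = threading.Thread(target=hash_worker, args=(q, result))
        t.start()
        compressor = zlib.compressobj()
        out = bytearray()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(CHUNK), b""):
                q.put(chunk)                   # hashed on the worker thread
                out += compressor.compress(chunk)
        out += compressor.flush()
        q.put(None)
        t.join()
        return bytes(out), result["digest"]

Whether the two really overlap depends on the libraries releasing the GIL
(hashlib does for large buffers), but the structure is the point: the hashing
never makes the compressor wait.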

To get a feeling, one can use:

    openssl speed md5 sha1 sha256 sha512

On my really-not-fast desktop CPU (i5-4690T CPU @ 2.50GHz) it says:

  The 'numbers' are in 1000s of bytes per second processed.
   type       16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
   md5       122638.55k   277023.96k   487725.57k   630806.19k   683892.74k   688553.98k
   sha1      127226.45k   313891.52k   632510.55k   865753.43k   960995.33k   977215.19k
   sha256     77611.02k   173368.15k   325460.99k   412633.43k   447022.92k   448020.48k
   sha512     51164.77k   205189.87k   361345.79k   543883.26k   638372.52k   645933.74k

Or in other words, it can hash nearly 931 MByte/s with SHA-1 and about
427 MByte/s with SHA-256 (if I haven't miscalculated something). You'd need
a pretty fast disk (aka an M.2 SSD) and network (aka > 1 GBit) to top these
speeds, and then you'd use a real CPU for your server, not some poor Intel
power-saving surfing thingy-majingy :)
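
The same kind of measurement can also be done against Python's hashlib, which
typically wraps OpenSSL (a quick sketch, not a rigorous benchmark; buffer size
and repetition count are arbitrary):

    # rough single-core throughput of a few hash algorithms, openssl-speed style
    import hashlib, time

    BUF = b"\xa5" * (1024 * 1024)    # 1 MB buffer, hashed repeatedly
    REPS = 256                       # i.e. 256 MB per algorithm

    for algo in ("md5", "sha1", "sha256", "sha512"):
        h = hashlib.new(algo)
        start = time.perf_counter()
        for _ in range(REPS):
            h.update(BUF)
        elapsed = time.perf_counter() - start
        print(f"{algo:8s} {REPS / elapsed:8.1f} MByte/s")

    # sanity check of the openssl numbers above:
    # 977215.19k bytes/s / 1024**2 ~= 931 MByte/s for SHA-1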

Best regards,

Tels


