Re: [PATCH] Incremental backup: add backup profile to base backup - Mailing list pgsql-hackers

From Robert Haas
Subject Re: [PATCH] Incremental backup: add backup profile to base backup
Msg-id CA+TgmoZha+Yb1qP-Tbjf2xOPK4orpyVVSN90SvZCcp7yFzbbNA@mail.gmail.com
In response to Re: [PATCH] Incremental backup: add backup profile to base backup  (Heikki Linnakangas <hlinnakangas@vmware.com>)
List pgsql-hackers
On Mon, Aug 18, 2014 at 4:55 PM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:
> You're not thinking evil enough ;-). Let's say that you have a table that
> stores bank transfers. You can do a bank transfer to pay a merchant, get the
> goods delivered to you, and then a second transfer to yourself with a
> specially crafted message attached to it that makes the checksum match the
> state before the first transfer. If the backup is restored (e.g. by a daily
> batch job to a reporting system), it will appear as if neither transfer
> happened, and you get to keep your money.
>
> Far-fetched? Well, how about this scenario: a vandal just wants to cause
> damage. Creating a situation where a restore from backup causes the system
> to be inconsistent will certainly cause headaches to the admins, and leave
> them wondering what else is corrupt.
>
> Or how about this: you can do the trick to a system catalog, say
> pg_attribute, to make it look like a column is of type varlena, when it's
> actually since been ALTERed to be an integer. Now you can access arbitrary
> memory in the server, and take over the whole system.
>
> I'm sure any or all of those scenarios are highly impractical when you
> actually sit down and try to do it, but you don't want to rely on that. You
> have to be able to trust your backups.

Yeah.  I agree that these scenarios are far-fetched; however, they're
also preventable, so we should prevent them.

Also, with respect to checksum collisions, you should expect an
*accidental* checksum collision every so often as well.  For block
checksums, we're using a 16-bit value, which is OK because we'll still
detect 65535/65536 = 99.998% of corruption events.  The requirements
are much higher for incremental backup, because a checksum collision
here means automatic data corruption.  If we were crazy enough to use
a 16-bit block-level checksum in this context, about one
actually-modified block would fail to get copied out of every 8kB *
64k = 512MB of modified data, which would not make anybody very happy.
A 32-bit checksum would be much safer, and a 64-bit checksum would be
better still, but block LSNs seem better than either.
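
To make the arithmetic above concrete, here is a rough sketch of the
numbers.  It assumes checksums are uniformly distributed and that blocks
are 8kB; the function names are made up for illustration and are not
part of any patch.

```python
BLOCK_SIZE = 8 * 1024  # bytes; the 8kB block size assumed above


def detection_rate(bits):
    """Fraction of modified blocks that a uniformly distributed
    checksum of the given width would notice."""
    return 1 - 1 / 2**bits


def bytes_per_missed_block(bits, block_size=BLOCK_SIZE):
    """Expected amount of modified data per block that is wrongly
    skipped because its old and new checksums collide."""
    return block_size * 2**bits


for bits in (16, 32, 64):
    mib = bytes_per_missed_block(bits) // 2**20
    print(f"{bits}-bit: detects {detection_rate(bits):.6%}, "
          f"one miss per {mib} MiB modified")
```

For 16 bits this gives one missed block per 512MB of modified data, as
stated above; for 32 bits the same estimate comes out to 32TB.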

Of course, there's one case where block LSNs aren't better, which is
where somebody goes backwards in time - i.e. back up the database, run
it for a while, take a base backup, shut down, restore from backup, do
stuff that's different from what you did the first time through, try
to take an incremental backup against your base backup.  LSNs won't
catch that; checksums will.  Do we want to allow incremental backup in
that kind of situation?  It seems like playing with fire, but it's
surely not useless.
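
A toy model of that backwards-in-time case may help.  This is only a
sketch under my reading of the scenario (the restore is from the older,
first backup), with blocks modeled as (lsn, data) pairs and all names
and LSN values invented for the example; it is not how PostgreSQL
stores or compares anything.

```python
import hashlib


def changed_by_lsn(current, base_start_lsn):
    """Blocks an LSN-based incremental backup would copy: those whose
    LSN is newer than the start LSN of the base backup."""
    return {n for n, (lsn, _) in current.items() if lsn > base_start_lsn}


def changed_by_checksum(current, base):
    """Blocks a checksum-based incremental backup would copy: those
    whose contents hash differently from the base backup's copy."""
    def digest(data):
        return hashlib.sha256(data).digest()

    return {n for n, (_, data) in current.items()
            if n not in base or digest(data) != digest(base[n][1])}


# Base backup taken at LSN 200, after the first run touched block 0.
base_start_lsn = 200
base = {0: (180, b"first run"), 1: (90, b"untouched")}

# The cluster is then restored from an older backup (LSN 100) and does
# different work, so block 0 has new contents but an LSN below 200.
current = {0: (150, b"second run"), 1: (90, b"untouched")}

print(changed_by_lsn(current, base_start_lsn))  # set()
print(changed_by_checksum(current, base))       # {0}
```

The LSN test skips block 0 even though its contents differ from the
base backup, while the content comparison picks it up.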

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


