Re: Transparent column encryption - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Transparent column encryption
Date
Msg-id CA+Tgmob_6zqqy2xdqh--Rwesszr_e9SFFXgC3zZJHkbLbH3O_Q@mail.gmail.com
In response to Re: Transparent column encryption  (Jacob Champion <jchampion@timescale.com>)
Responses Re: Transparent column encryption
List pgsql-hackers
On Thu, Jul 21, 2022 at 2:30 PM Jacob Champion <jchampion@timescale.com> wrote:
> On Mon, Jul 18, 2022 at 9:07 AM Robert Haas <robertmhaas@gmail.com> wrote:
> > Even there, what can be accomplished with a feature that only encrypts
> > individual column values is by nature somewhat limited. If you have a
> > text column that, for one row, stores the value 'a', and for some
> > other row, stores the entire text of Don Quixote in the original
> > Spanish, it is going to be really difficult to keep an adversary who
> > can read from the disk from distinguishing those rows. If you want to
> > fix that, you're going to need to do block-level encryption or
> > something of that sort.
>
> A minimum padding option would fix the leak here, right? If every
> entry is the same length then there's no information to be gained, at
> least in an offline analysis.

Sure, but padding every text column that you have, even the ones
containing only 'a', out to the length of Don Quixote in the original
Spanish, is unlikely to be an appealing option.
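To put rough numbers on that cost, here is a minimal Python sketch. The row lengths are made up for illustration (they are not measurements of any real text), and `padding_overhead` is a hypothetical helper, not part of any proposed patch.

```python
def padding_overhead(lengths):
    """Return (bytes stored if every value is padded to the longest, bytes of real data)."""
    if not lengths:
        return 0, 0
    target = max(lengths)            # every value is padded out to the longest one
    stored = target * len(lengths)   # total bytes on disk after padding
    actual = sum(lengths)            # total bytes of real data
    return stored, actual


# Illustrative only: 999 one-byte values plus a single 2,000,000-byte value.
lengths = [1] * 999 + [2_000_000]
stored, actual = padding_overhead(lengths)
print(f"stored={stored} bytes, actual={actual} bytes, ratio={stored / actual:.1f}x")
```

With one very long outlier, nearly all of the stored bytes are padding, which is why padding to a global maximum is unattractive.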

> I think some work around that is probably going to be needed for
> serious use of this encryption, in part because of the use of text
> format as the canonical input. If the encrypted values of 1, 10, 100,
> and 1000 hypothetically leaked their exact lengths, then an encrypted
> int wouldn't be very useful. So I'd want to quantify (and possibly
> configure) exactly how much data you can encrypt in a single message
> before the length starts being leaked, and then make sure that my
> encrypted values stay inside that bound.

I think most ciphers these days are block ciphers, so you're going to
get output that is a multiple of the block size anyway - e.g. I think
for AES it's 128 bits = 16 bytes. So small differences in length will
be concealed naturally, which may be good enough for some use cases.
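As a rough illustration of that rounding, the sketch below computes ciphertext length for a 16-byte block size, assuming a padded block mode with PKCS#7-style padding (for example CBC). That mode is an assumption, not something this thread has settled on: a stream-like mode such as CTR or GCM would preserve the plaintext length instead. Any IV or authentication tag is ignored, and `padded_length` is a hypothetical helper.

```python
BLOCK_SIZE = 16  # AES block size in bytes (128 bits)


def padded_length(plaintext_len: int, block_size: int = BLOCK_SIZE) -> int:
    """Ciphertext length under PKCS#7-style padding.

    PKCS#7 always appends between 1 and block_size bytes, so an input that
    is already a multiple of the block size grows by one full block.
    """
    if plaintext_len < 0:
        raise ValueError("length must be non-negative")
    return (plaintext_len // block_size + 1) * block_size


# Text-format integers of different widths all land in the same bucket.
for value in ("1", "10", "100", "1000"):
    print(value, "->", padded_length(len(value.encode("utf-8"))), "bytes")
```

All four values come out at 16 bytes, so their relative lengths are hidden, while a value of 16 bytes or more moves to the next bucket and remains distinguishable from the short ones.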

I'm not really convinced that it's worth putting a lot of effort into
bolstering the security of this kind of tech above what it naturally
gives. I think it's likely to be a wild goose chase. If you have major
worries about someone reading your disk in its entirety, use full-disk
encryption. Selective encryption is only suitable when you want to add
a modest level of protection for individual values and are willing to
accept that some information leakage is likely if an adversary can in
fact read the full disk. Padding values to try to further obscure
things may be situationally useful, but if you find yourself worrying
too much about that sort of thing, you likely should have picked
stronger medicine initially.

-- 
Robert Haas
EDB: http://www.enterprisedb.com


