Re: Transparent column encryption - Mailing list pgsql-hackers
From | Peter Eisentraut |
---|---|
Subject | Re: Transparent column encryption |
Date | |
Msg-id | b12ab6ca-5a87-e79f-f03b-02b42c42e6bd@enterprisedb.com Whole thread Raw |
In response to | Re: Transparent column encryption (Stephen Frost <sfrost@snowman.net>) |
List | pgsql-hackers |
On 30.03.23 20:35, Stephen Frost wrote: > I do feel that column encryption is a useful capability and there's > large parts of this approach that I agree with, but I dislike the idea > of having our clients be able to depend on what gets returned for > non-encrypted columns while not being able to trust what encrypted > column results are and then trying to say it's 'transparent'. To that > end, it seems like just saying they get back a bytea and making it clear > that they have to provide the validation would be clear, while keeping > much of the rest. [Note that the word "transparent" has been removed from the feature name. I just didn't change the email thread name.] These thoughts are reasonable, but I think there is a tradeoff to be made between having featureful data validation and enhanced security. If you want your database system to validate your data, you have to send it in plain text. If you want to have your database system not see the plain text, then it cannot validate it. But there is still utility in it. You can't really depend on what gets returned even in the non-encrypted case, unless you cryptographically sign the schema against modification or something like that. So realistically, a client that cares strongly about the data it receives has to do some kind of client-side validation anyway. Note also that higher-level client libraries like JDBC effectively do client-side data validation, for example when you call ResultSet.getInt() etc. This is also one of the reasons why the user facing type declaration exists. You could just make all encrypted columns of type "opaque" or something and not make any promises about what's inside. But client APIs sort or rely on the application being able to ask the result set for what's inside a column value. If we just say, we don't know, then applications (or driver APIs) will have to be changed to accommodate that, but the intention was to not require that. So instead we say, it's supposed to be int, and then if it's sometimes actually not int, then your application throws an exception you can deal with. This is arguably a better developer experience, even if it concerns the data type purist. But do you have a different idea about how it should work? > Expanding out from that I'd imagine, pie-in-the-sky > and in some far off land, having our data type in/out validation > functions moved to the common library and then adding client-side > validation of the data going in/out of the encrypted columns would allow > application developers to be able to trust what we're returning (as long > as they're using libpq- and we'd have to document that independent > implementations of the protocol have to provide this or just continue to > return bytea's). As mentioned, some client libraries effectively already do that. But even if we could make this much more comprehensive, I don't see how this can ever actually satisfy your point. It would require that all participating clients apply validation all the time, and all other clients can rely on that happening, which is impossible.
pgsql-hackers by date: