Re: Transparent column encryption - Mailing list pgsql-hackers

From Peter Eisentraut
Subject Re: Transparent column encryption
Date
Msg-id a9381ab5-234a-f796-f43e-ebf84c3922f8@enterprisedb.com
Whole thread Raw
In response to Re: Transparent column encryption  (Jehan-Guillaume de Rorthais <jgdr@dalibo.com>)
Responses Re: Transparent column encryption
List pgsql-hackers
On 28.10.22 12:16, Jehan-Guillaume de Rorthais wrote:
> I did a review of the documentation and usability.

I have incorporated some of your feedback into the v11 patch I just posted.

> # Applying patch
> 
>    The patch applied on top of f13b2088fa2 without trouble. Notice a small
>    warning during compilation:
>    
>      colenccmds.c:134:27: warning: ‘encval’ may be used uninitialized
>    
>    A simple fix could be:
>    
>      +++ b/src/backend/commands/colenccmds.c
>      @@ -119,2 +119,3
>                      encval = defGetString(encvalEl);
>      +               *encval_p = encval;
>              }
>      @@ -132,4 +133,2
>                      *alg_p = alg;
>      -       if (encval_p)
>      -               *encval_p = encval;
>       }

fixed

> # Documentation
> 
>    * In page "create_column_encryption_key.sgml", both encryption algorithms for
>      a CMK are declared as the default one:
> 
>      +     <para>
>      +      The encryption algorithm that was used to encrypt the key material of
>      +      this column encryption key.  Supported algorithms are:
>      +      <itemizedlist>
>      +       <listitem>
>      +        <para><literal>RSAES_OAEP_SHA_1</literal> (default)</para>
>      +       </listitem>
>      +       <listitem>
>      +        <para><literal>RSAES_OAEP_SHA_256</literal> (default)</para>
>      +       </listitem>
>      +      </itemizedlist>
>      +     </para>
>    
>      As far as I understand the code, I suppose RSAES_OAEP_SHA_1 should be the
>      default.

fixed

>    I believe two information should be clearly shown to user somewhere in
>    chapter 5.5 instead of being buried deep in documentation:
>   
>    * «COPY does not support column decryption», currently buried in pg_dump page
>    * «When transparent column encryption is enabled, the client encoding must
>      match the server encoding», currently buried in the protocol description
>      page.
> 
>    * In the libpq doc of PQexecPrepared2, "paramForceColumnEncryptions" might
>      deserve a little more detail about the array format, like «0 means "don't
>      enforce" and anything else enforce the encryption is enabled on this
>      column». By the way, maybe this array could be an array of boolean?
> 
>    * In chapter 55.2.5 (protocol-flow) is stated: «when column encryption is
>      used, the plaintext is always in text format (not binary format).». Does it
>      means parameter "resultFormat" in "PQexecPrepared2" should always be 0? If
>      yes, is it worth keeping this argument? Moreover, this format constraint
>      should probably be explained in the libpq page as well.

I will keep these suggestions around.  Some of these things will 
probably change again, so I'll make sure to update the documentation 
when I touch it again.

> # Protocol
> 
>    * In the ColumnEncryptionKey message, it seems the field holding the length
>      key material is redundant with the message length itself, as all other
>      fields have a known size. The key material length is the message length -
>      (4+4+4+2). For comparison, the AuthenticationSASLContinue message has a
>      variable data size but rely only on the message length without additional
>      field.

I find that weird, though.  An explicit length seems better.  Things 
like AuthenticationSASLContinue only work if they have exactly one 
variable-length data item.

>    * I wonder if encryption related fields in ParameterDescription and
>      RowDescription could be optional somehow? The former might be quite large
>      when using a lot of parameters (like, imagine a large and ugly
>      "IN($1...$N)"). On the other hand, these messages are not sent in high
>      frequency anyway...

They are only used if you turn on the column_encryption protocol option. 
  Or did you mean make them optional even then?

> # libpq
> 
>    Would it be possible to have an encryption-ready PQexecParams() equivalent
>    of what PQprepare/describe/exec do?

I plan to do that.

> # psql
> 
>    * You already mark \d in the TODO. But having some psql command to list the
>      known CEK/CMK might be useful as well.

right

>    * about write queries using psql, would it be possible to use psql
>      parameters? Eg.:
> 
>        => \set c1 myval
>        => INSERT INTO mytable VALUES (:'c1') \gencr

No, because those are resolved by psql before libpq sees them.

> # Manual tests
> 
>    * The lookup error message is shown twice for some reason:
> 
>        => select * from test_tce;
>        no CMK lookup found for realm ""
>        
>        no CMK lookup found for realm ""
> 
>      It might worth adding the column name and the CMK/CEK names related to each
>      error message? Last, notice the useless empty line between messages.

I'll look into that.

>    * When "DROP IF EXISTS" a missing CEK or CMK, the command raise an
>      "unrecognized object type":
> 
>        => DROP COLUMN MASTER KEY IF EXISTS noexists;
>        ERROR:  unrecognized object type: 10
>        => DROP COLUMN ENCRYPTION KEY IF EXISTS noexists;
>        ERROR:  unrecognized object type: 8

fixed

>    * I was wondering what "pg_typeof" should return. It currently returns
>      "pg_encrypted_*". If this is supposed to be transparent from the client
>      perspective, shouldn't it return "attrealtypid" when the field is encrypted?

Interesting question.  Need to think about it.  I'm not sure what the 
purpose of pg_typeof really is.  The only use I can recall is for pgTAP.

>    * any reason to not support altering the CMK realm?

This could be added.  I have that in my notes.




pgsql-hackers by date:

Previous
From: Dean Rasheed
Date:
Subject: Re: Another multi-row VALUES bug
Next
From: Tom Lane
Date:
Subject: Re: Document parameter count limit