Re: Transparent column encryption - Mailing list pgsql-hackers

From Jehan-Guillaume de Rorthais
Subject Re: Transparent column encryption
Date
Msg-id 20221028121629.6467d8f9@karst
Whole thread Raw
In response to Re: Transparent column encryption  (Peter Eisentraut <peter.eisentraut@enterprisedb.com>)
Responses Re: Transparent column encryption
Re: Transparent column encryption
List pgsql-hackers
Hi,

I did a review of the documentation and usability.

# Applying patch

  The patch applied on top of f13b2088fa2 without trouble. Notice a small
  warning during compilation:

    colenccmds.c:134:27: warning: ‘encval’ may be used uninitialized

  A simple fix could be:

    +++ b/src/backend/commands/colenccmds.c
    @@ -119,2 +119,3
                    encval = defGetString(encvalEl);
    +               *encval_p = encval;
            }
    @@ -132,4 +133,2
                    *alg_p = alg;
    -       if (encval_p)
    -               *encval_p = encval;
     }

# Documentation

  * In page "create_column_encryption_key.sgml", both encryption algorithms for
    a CMK are declared as the default one:

    +     <para>
    +      The encryption algorithm that was used to encrypt the key material of
    +      this column encryption key.  Supported algorithms are:
    +      <itemizedlist>
    +       <listitem>
    +        <para><literal>RSAES_OAEP_SHA_1</literal> (default)</para>
    +       </listitem>
    +       <listitem>
    +        <para><literal>RSAES_OAEP_SHA_256</literal> (default)</para>
    +       </listitem>
    +      </itemizedlist>
    +     </para>

    As far as I understand the code, I suppose RSAES_OAEP_SHA_1 should be the
    default.

  I believe two information should be clearly shown to user somewhere in
  chapter 5.5 instead of being buried deep in documentation:

  * «COPY does not support column decryption», currently buried in pg_dump page
  * «When transparent column encryption is enabled, the client encoding must
    match the server encoding», currently buried in the protocol description
    page.

  * In the libpq doc of PQexecPrepared2, "paramForceColumnEncryptions" might
    deserve a little more detail about the array format, like «0 means "don't
    enforce" and anything else enforce the encryption is enabled on this
    column». By the way, maybe this array could be an array of boolean?

  * In chapter 55.2.5 (protocol-flow) is stated: «when column encryption is
    used, the plaintext is always in text format (not binary format).». Does it
    means parameter "resultFormat" in "PQexecPrepared2" should always be 0? If
    yes, is it worth keeping this argument? Moreover, this format constraint
    should probably be explained in the libpq page as well.

# Protocol

  * In the ColumnEncryptionKey message, it seems the field holding the length
    key material is redundant with the message length itself, as all other
    fields have a known size. The key material length is the message length -
    (4+4+4+2). For comparison, the AuthenticationSASLContinue message has a
    variable data size but rely only on the message length without additional
    field.

  * I wonder if encryption related fields in ParameterDescription and
    RowDescription could be optional somehow? The former might be quite large
    when using a lot of parameters (like, imagine a large and ugly
    "IN($1...$N)"). On the other hand, these messages are not sent in high
    frequency anyway...

# libpq

  Would it be possible to have an encryption-ready PQexecParams() equivalent
  of what PQprepare/describe/exec do?

# psql

  * You already mark \d in the TODO. But having some psql command to list the
    known CEK/CMK might be useful as well.

  * about write queries using psql, would it be possible to use psql
    parameters? Eg.:

      => \set c1 myval
      => INSERT INTO mytable VALUES (:'c1') \gencr

# Manual tests

  * The lookup error message is shown twice for some reason:

      => select * from test_tce;
      no CMK lookup found for realm ""

      no CMK lookup found for realm ""

    It might worth adding the column name and the CMK/CEK names related to each
    error message? Last, notice the useless empty line between messages.

  * When "DROP IF EXISTS" a missing CEK or CMK, the command raise an
    "unrecognized object type":

      => DROP COLUMN MASTER KEY IF EXISTS noexists;
      ERROR:  unrecognized object type: 10
      => DROP COLUMN ENCRYPTION KEY IF EXISTS noexists;
      ERROR:  unrecognized object type: 8

  * I was wondering what "pg_typeof" should return. It currently returns
    "pg_encrypted_*". If this is supposed to be transparent from the client
    perspective, shouldn't it return "attrealtypid" when the field is encrypted?

  * any reason to not support altering the CMK realm?

This patch is really interesting and would be a nice addition to the core.

Thanks!

Regards,



pgsql-hackers by date:

Previous
From: Etsuro Fujita
Date:
Subject: Re: Fast COPY FROM based on batch insert
Next
From: David Rowley
Date:
Subject: Re: Use LIMIT instead of Unique for DISTINCT when all distinct pathkeys are redundant