Re: Transparent Data Encryption (TDE) and encrypted files - Mailing list pgsql-hackers
From | Moon, Insung |
---|---|
Subject | Re: Transparent Data Encryption (TDE) and encrypted files |
Date | |
Msg-id | CAEMmqBvQ2bDJFdHbabM5Ti0MFfwpYrZV-NqkWaCwKe2mqUi2Uw@mail.gmail.com |
In response to | Re: Transparent Data Encryption (TDE) and encrypted files (Antonin Houska <ah@cybertec.at>) |
Responses | Re: Transparent Data Encryption (TDE) and encrypted files |
List | pgsql-hackers |
Hello.

On Tue, Oct 8, 2019 at 8:52 PM Antonin Houska <ah@cybertec.at> wrote:
>
> Robert Haas <robertmhaas@gmail.com> wrote:
>
> > On Mon, Oct 7, 2019 at 3:01 PM Antonin Houska <ah@cybertec.at> wrote:
> > > However the design doesn't seem to be stable enough at the
> > > moment for coding to make sense.
> >
> > Well, I think the question is whether working further on your patch
> > could produce some things that everyone would agree are a step
> > forward.
>
> It would have made a lot of sense several months ago (Masahiko Sawada actually
> used parts of our patch in the previous version of his patch (see [1]), but
> the requirement to use a different IV for each execution of the encryption
> changes things quite a bit.
>
> Besides the relation pages and SLRU (CLOG), which are already being discussed
> elsewhere in the thread, let's consider other two file types:
>
> * Temporary files (buffile.c): we derive the IV from PID of the process that
>   created the file + segment number + block within the segment. This
>   information does not change if you need to write the same block again. If
>   new IV should be used for each encryption run, we can simply introduce an
>   in-memory counter that generates the IV for each block. However it becomes
>   trickier if the temporary file is shared by multiple backends. I think it
>   might still be easier to expose the IV values to other backends via shared
>   memory than to store them on disk ...

I think temporary files can be encrypted in a slightly different way.
I previously had a lot of trouble with IV uniqueness, so I have proposed
using a unique encryption key for each file instead.

First, in the CTR mode we plan to use, 32 bits of the 128-bit nonce value
are used for the counter. The counter increases by one for every 16 bytes
encrypted, so in theory a total of 64 GiB (2^32 blocks of 16 bytes) can be
encrypted while the remaining 96 nonce bits stay the same.
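The 64 GiB figure follows directly from the counter width. As a quick check, here is a minimal Python sketch of the arithmetic; the constant and function names are only illustrative and do not come from any patch.

```python
# Arithmetic behind the 64 GiB limit for one fixed 96-bit nonce in CTR mode.
# All names here are illustrative assumptions, not taken from any patch.

AES_BLOCK_BYTES = 16             # the counter advances once per 16 bytes
COUNTER_BITS = 32                # counter part of the 128-bit value
NONCE_BITS = 128 - COUNTER_BITS  # the part that must never repeat under one key


def max_bytes_per_nonce(counter_bits: int = COUNTER_BITS) -> int:
    """Bytes that can be encrypted before the counter wraps around."""
    return (2 ** counter_bits) * AES_BLOCK_BYTES


if __name__ == "__main__":
    limit = max_bytes_per_nonce()
    print(f"nonce bits: {NONCE_BITS}, limit: {limit // 2**30} GiB")
```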
Therefore, in the case of buffile.c, which creates temporary files when
work_mem is insufficient, each file can hold at most 1 GiB, which is well
below the 64 GiB limit, so a simple IV value is sufficiently safe.

The problem is that a vulnerability arises when the 96-bit nonce values
(everything except the counter) are the same. At first I also tried to
generate the IV from PID (32 bits) + tempCounter (64 bits), but in the worst
case the same PID and tempCounter values are used again.

Therefore, I focused on the uniqueness of the encryption key rather than on
the uniqueness of the IV value. As described earlier, a separate encryption
key is used for each file. First, a random hash value is generated for the
file, and the key is derived from that hash value and the KEK (or MDEK)
using HMAC-SHA256. With this scheme, I think there is no need to store the
encryption key separately, nor to keep the IV in a separate file or in
memory. (The IV is a 64-bit hash value and a 32-bit counter.)

Also, the temporary file name is currently PID.tempFileCounter, but if it
is changed to PID.tempFileCounter.hashvalue, I think any process can
encrypt and decrypt the file.

Reference URL
https://wiki.postgresql.org/wiki/Transparent_Data_Encryption#TODO_for_Full-Cluster_Encryption

>
> * "Buffered transient file". This is to be used instead of OpenTransientFile()
>   if user needs the option to encrypt the file. (Our patch adds this API to
>   buffile.c. Currently we use it in reorderbuffer.c to encrypt the data
>   changes produced by logical decoding, but there should be more use cases.)

Agreed.

Best regards.
Moon.

>
>   In this case we cannot keep the IVs in memory because user can close the
>   file anytime and open it much later. So we derive the IV by hashing the file
>   path. However if we should generate the IV again and again, we need to store
>   it on disk in another way, probably one IV value per block (PGAlignedBlock).
>
> However since our implementation of both these file types shares some code,
> it might yet be easier if the shared temporary file also stored the IV on
> disk instead of exposing it via shared memory ...
>
> Perhaps this is what I can work on, but I definitely need some feedback.
>
> [1] https://www.postgresql.org/message-id/CAD21AoBjrbxvaMpTApX1cEsO=8N=nc2xVZPB0d9e-VjJ=YaRnw@mail.gmail.com
>
> --
> Antonin Houska
> Web: https://www.cybertec-postgresql.com
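
To illustrate the per-file key idea from the reply above (a random per-file hash value combined with the KEK through HMAC-SHA256, with the hash value carried in the file name), here is a rough Python sketch. It is not the proposed implementation. The function names, the 8-byte hash length taken from the "64 bits" remark, the choice of the KEK as the HMAC key, and the hex encoding in the file name are all assumptions made for illustration.

```python
# Rough sketch of per-file key derivation for temporary files.
# Every name and encoding choice below is hypothetical.
import hashlib
import hmac
import os

FILE_HASH_BYTES = 8  # assumed: the 64-bit per-file hash value


def new_file_hash() -> bytes:
    """Generate a random per-file value (assumed to come from the OS CSPRNG)."""
    return os.urandom(FILE_HASH_BYTES)


def derive_file_key(kek: bytes, file_hash: bytes) -> bytes:
    """Derive a 256-bit per-file key: HMAC-SHA256 keyed with the KEK."""
    return hmac.new(kek, file_hash, hashlib.sha256).digest()


def temp_file_name(pid: int, temp_file_counter: int, file_hash: bytes) -> str:
    """Build a PID.tempFileCounter.hashvalue name (hex encoding assumed)."""
    return f"{pid}.{temp_file_counter}.{file_hash.hex()}"


def file_hash_from_name(name: str) -> bytes:
    """Recover the hash value from the name so any process can re-derive the key."""
    return bytes.fromhex(name.rsplit(".", 1)[1])


if __name__ == "__main__":
    kek = os.urandom(32)  # stand-in for the real KEK (or MDEK)
    file_hash = new_file_hash()
    name = temp_file_name(os.getpid(), 0, file_hash)
    # A different process that knows the KEK only needs the file name.
    key_again = derive_file_key(kek, file_hash_from_name(name))
    print(key_again == derive_file_key(kek, file_hash))
```

Because the key is recomputed from the KEK and the file name, nothing beyond the name has to be stored per file. The safety of reusing a simple IV then depends on the random hash value not repeating under the same KEK.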