Temporary file access API - Mailing list pgsql-hackers

From Antonin Houska
Subject Temporary file access API
Date
Msg-id 4987.1644323098@antos
List pgsql-hackers
Here I'm starting a new thread to discuss a topic that's related to
Transparent Data Encryption (TDE), but could be useful even without it. The
problem has already been addressed, in some form, in the Cybertec TDE fork,
and I can post the code here if it helps. However, after reading [1] (and the
posts upthread), I've got another idea, so let's try to discuss it first.

It makes sense to me to implement the buffering first (i.e. writing/reading
a certain amount of data at a time) and to make the related functions aware of
encryption later: as long as we use a block cipher, we also need to read/write
(suitably sized) chunks rather than individual bytes (or arbitrary amounts of
data). (In theory, someone might need encryption but reject buffering, but I'm
not sure if this is a realistic use case.)

For the buffering, I imagine a "file stream" object that the user creates on
top of a file descriptor, such as

    FileStream  *FileStreamCreate(File file, int buffer_size)

    or

    FileStream  *FileStreamCreateFD(int fd, int buffer_size)

and uses functions like

    int FileStreamWrite(FileStream *stream, char *buffer, int amount)

    and

    int FileStreamRead(FileStream *stream, char *buffer, int amount)

to write and read data respectively.
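
To make the idea concrete, here is a minimal sketch of the FD-based variant
on plain POSIX descriptors. The struct layout, FileStreamFlush(), the exact
behaviour of FileStreamFDClose() and the FileStreamDemo() helper are
assumptions made for illustration; they are not taken from any existing
patch. Each stream is used in one direction only, and a pipe stands in for a
temporary file so that the demo leaves nothing on disk.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical layout; the proposal only names the type. */
typedef struct FileStream
{
    int   fd;           /* underlying descriptor, not owned by the stream */
    int   buffer_size;  /* capacity of buf */
    int   wlen;         /* bytes waiting to be written */
    int   rlen;         /* valid bytes in buf when reading */
    int   rpos;         /* next byte to hand out when reading */
    char *buf;
} FileStream;

FileStream *FileStreamCreateFD(int fd, int buffer_size)
{
    FileStream *stream;

    assert(buffer_size > 0);
    stream = calloc(1, sizeof(FileStream));
    if (stream == NULL)
        return NULL;
    stream->buf = malloc((size_t) buffer_size);
    if (stream->buf == NULL)
    {
        free(stream);
        return NULL;
    }
    stream->fd = fd;
    stream->buffer_size = buffer_size;
    return stream;
}

/* Assumed helper: push pending bytes to the descriptor; 0 on success. */
int FileStreamFlush(FileStream *stream)
{
    int done = 0;

    while (done < stream->wlen)
    {
        ssize_t n = write(stream->fd, stream->buf + done,
                          (size_t) (stream->wlen - done));
        if (n < 0)
            return -1;
        done += (int) n;
    }
    stream->wlen = 0;
    return 0;
}

int FileStreamWrite(FileStream *stream, char *buffer, int amount)
{
    int copied = 0;

    assert(stream->rlen == 0);          /* one direction per stream */
    while (copied < amount)
    {
        int n = stream->buffer_size - stream->wlen;

        if (n > amount - copied)
            n = amount - copied;
        memcpy(stream->buf + stream->wlen, buffer + copied, (size_t) n);
        stream->wlen += n;
        copied += n;
        /* Only a full buffer reaches the descriptor. */
        if (stream->wlen == stream->buffer_size && FileStreamFlush(stream) < 0)
            return -1;
    }
    return amount;
}

int FileStreamRead(FileStream *stream, char *buffer, int amount)
{
    int copied = 0;

    assert(stream->wlen == 0);          /* one direction per stream */
    while (copied < amount)
    {
        int n = stream->rlen - stream->rpos;

        if (n == 0)
        {
            /* Buffer exhausted: refill it with one read() call. */
            ssize_t got = read(stream->fd, stream->buf,
                               (size_t) stream->buffer_size);
            if (got < 0)
                return -1;
            if (got == 0)
                break;                  /* end of file */
            stream->rlen = n = (int) got;
            stream->rpos = 0;
        }
        if (n > amount - copied)
            n = amount - copied;
        memcpy(buffer + copied, stream->buf + stream->rpos, (size_t) n);
        stream->rpos += n;
        copied += n;
    }
    return copied;
}

/* Assumed semantics: flush and free the stream, leave the fd to the caller. */
int FileStreamFDClose(FileStream *stream)
{
    int rc = FileStreamFlush(stream);

    free(stream->buf);
    free(stream);
    return rc;
}

/* Round trip through a pipe; returns 0 if the data comes back intact. */
int FileStreamDemo(void)
{
    int   fds[2];
    char  msg[] = "hello, buffered world";
    char  out[64] = {0};
    int   n;
    FileStream *w, *r;

    if (pipe(fds) != 0)
        return -1;
    w = FileStreamCreateFD(fds[1], 8);
    r = FileStreamCreateFD(fds[0], 8);
    if (w == NULL || r == NULL)
        return -1;
    if (FileStreamWrite(w, msg, (int) strlen(msg)) < 0 || FileStreamFDClose(w) < 0)
        return -1;
    close(fds[1]);                      /* lets the reader see EOF */
    n = FileStreamRead(r, out, (int) sizeof(out) - 1);
    FileStreamFDClose(r);
    close(fds[0]);
    return (n == (int) strlen(msg) && strcmp(out, msg) == 0) ? 0 : -1;
}
```

With an 8-byte buffer, the 21-byte message reaches the descriptor as two full
chunks plus a 5-byte tail written on close, which is the behaviour a block
cipher layer would later rely on.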

Besides functions to close the streams explicitly (e.g. FileStreamClose() /
FileStreamFDClose()), we'd need to ensure that a stream is closed automatically
whenever the underlying file is. For example, if OpenTemporaryFile() was used
to obtain the file descriptor, the user expects the file to be closed and
deleted at the transaction boundary, so the corresponding stream should be
freed automatically as well.
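
One possible way to get the automatic cleanup is to track live streams and
let whatever code closes the file release the streams that sit on top of it.
The sketch below is self-contained and does not use any PostgreSQL
infrastructure; the list of live streams, FileStreamReleaseForFD() and
FileStreamLiveCount() are hypothetical names, and buffer handling is left out
to keep the focus on lifetime.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical layout, reduced to what lifetime tracking needs. */
typedef struct FileStream
{
    int                fd;
    struct FileStream *next;    /* link in the list of live streams */
} FileStream;

static FileStream *live_streams = NULL;

FileStream *FileStreamCreateFD(int fd, int buffer_size)
{
    FileStream *stream = malloc(sizeof(FileStream));

    (void) buffer_size;         /* buffer allocation omitted in this sketch */
    if (stream == NULL)
        return NULL;
    stream->fd = fd;
    stream->next = live_streams;
    live_streams = stream;
    return stream;
}

/* Explicit close: unlink the stream from the list and free it. */
void FileStreamFDClose(FileStream *stream)
{
    FileStream **link;

    for (link = &live_streams; *link != NULL; link = &(*link)->next)
    {
        if (*link == stream)
        {
            *link = stream->next;
            break;
        }
    }
    free(stream);
}

/*
 * Hypothetical hook for the code that closes a file automatically (for
 * example at a transaction boundary): free every stream still attached to
 * that descriptor. A real version would discard or flush buffered data first.
 */
int FileStreamReleaseForFD(int fd)
{
    FileStream **link = &live_streams;
    int          released = 0;

    while (*link != NULL)
    {
        FileStream *stream = *link;

        if (stream->fd == fd)
        {
            *link = stream->next;
            free(stream);
            released++;
        }
        else
            link = &stream->next;
    }
    return released;
}

/* Number of streams that have not been closed or released yet. */
int FileStreamLiveCount(void)
{
    FileStream *stream;
    int         count = 0;

    for (stream = live_streams; stream != NULL; stream = stream->next)
        count++;
    return count;
}
```

The caller may still close a stream explicitly; the hook only picks up
whatever is left when the file itself goes away.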

To avoid code duplication, buffile.c should use these streams internally as
well, as it also performs buffering. (Here we'd also need functions to change
the reading/writing position.)
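
For the positioning functions, the main point is that the stream has to track
which file offset its buffer corresponds to. Below is a minimal read-only
sketch on a POSIX descriptor; FileStreamSeek(), FileStreamTell(), the struct
fields and the demo helper are assumed names, and a writable stream would
additionally have to flush pending data before moving.

```c
#define _POSIX_C_SOURCE 200809L     /* for mkstemp() */
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical layout for a read-only stream with position tracking. */
typedef struct FileStream
{
    int    fd;
    int    buffer_size;
    off_t  buf_start;   /* file offset of buf[0] */
    int    len;         /* valid bytes in buf */
    int    pos;         /* next byte to hand out */
    char  *buf;
} FileStream;

FileStream *FileStreamCreateFD(int fd, int buffer_size)
{
    FileStream *stream;
    off_t       start = lseek(fd, 0, SEEK_CUR);

    assert(buffer_size > 0);
    if (start == (off_t) -1 || (stream = calloc(1, sizeof(FileStream))) == NULL)
        return NULL;
    if ((stream->buf = malloc((size_t) buffer_size)) == NULL)
    {
        free(stream);
        return NULL;
    }
    stream->fd = fd;
    stream->buffer_size = buffer_size;
    stream->buf_start = start;
    return stream;
}

int FileStreamRead(FileStream *stream, char *buffer, int amount)
{
    int copied = 0;

    while (copied < amount)
    {
        int n = stream->len - stream->pos;

        if (n == 0)
        {
            ssize_t got;

            /* The next chunk starts where the old one ended. */
            stream->buf_start += stream->len;
            stream->len = stream->pos = 0;
            got = read(stream->fd, stream->buf, (size_t) stream->buffer_size);
            if (got < 0)
                return -1;
            if (got == 0)
                break;              /* end of file */
            stream->len = n = (int) got;
        }
        if (n > amount - copied)
            n = amount - copied;
        memcpy(buffer + copied, stream->buf + stream->pos, (size_t) n);
        stream->pos += n;
        copied += n;
    }
    return copied;
}

/* Logical position as seen by the user of the stream. */
off_t FileStreamTell(FileStream *stream)
{
    return stream->buf_start + stream->pos;
}

/* Move to an absolute offset; stays in the buffer when possible. */
int FileStreamSeek(FileStream *stream, off_t offset)
{
    if (offset >= stream->buf_start && offset <= stream->buf_start + stream->len)
    {
        stream->pos = (int) (offset - stream->buf_start);
        return 0;                   /* no system call needed */
    }
    if (lseek(stream->fd, offset, SEEK_SET) == (off_t) -1)
        return -1;
    stream->buf_start = offset;     /* buffered data is discarded */
    stream->len = stream->pos = 0;
    return 0;
}

/* Returns 0 if seeks inside and outside the buffer both land correctly. */
int FileStreamSeekDemo(void)
{
    char  path[] = "/tmp/fstream_XXXXXX";
    char  out[8] = {0};
    int   fd = mkstemp(path);
    int   ok;
    FileStream *s;

    if (fd < 0)
        return -1;
    unlink(path);
    if (write(fd, "0123456789abcdef", 16) != 16 || lseek(fd, 0, SEEK_SET) != 0 ||
        (s = FileStreamCreateFD(fd, 8)) == NULL)
    {
        close(fd);
        return -1;
    }
    ok = FileStreamRead(s, out, 4) == 4 && memcmp(out, "0123", 4) == 0;
    ok = ok && FileStreamSeek(s, 6) == 0 &&     /* inside the buffer */
        FileStreamRead(s, out, 2) == 2 && memcmp(out, "67", 2) == 0;
    ok = ok && FileStreamSeek(s, 12) == 0 &&    /* outside the buffer */
        FileStreamRead(s, out, 4) == 4 && memcmp(out, "cdef", 4) == 0;
    ok = ok && FileStreamTell(s) == 16;
    free(s->buf);
    free(s);
    close(fd);
    return ok ? 0 : -1;
}
```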

Once we implement the encryption, we might need to add an argument to the
FileStreamCreate...() functions that helps to generate a unique IV, but the
...Read() / ...Write() functions would stay intact. And possibly one more
argument to specify the kind of cipher, in case we support more than one.
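
To illustrate what such an argument could feed into, here is one conceivable
scheme, shown purely as an assumption: the caller passes a per-file value at
stream creation, and the stream combines it with the number of the chunk being
written. The function names, the 16-byte IV length and the layout are all
hypothetical, and whether an IV built this way is adequate depends on the
cipher mode, which is not decided here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define STREAM_IV_LEN 16    /* assumed IV size, e.g. for a 128-bit block */

/*
 * Hypothetical helper: build the IV for one buffer-sized chunk from a
 * per-file value (supplied at stream creation) and the chunk number.
 * Both values are stored big-endian, 8 bytes each.
 */
void FileStreamBlockIV(uint64_t file_nonce, uint64_t block_no,
                       unsigned char iv[STREAM_IV_LEN])
{
    int i;

    for (i = 0; i < 8; i++)
    {
        iv[i] = (unsigned char) (file_nonce >> (56 - 8 * i));
        iv[8 + i] = (unsigned char) (block_no >> (56 - 8 * i));
    }
}

/* Returns 0 if different chunks and different files get different IVs. */
int FileStreamIVDemo(void)
{
    unsigned char a[STREAM_IV_LEN];
    unsigned char b[STREAM_IV_LEN];
    unsigned char c[STREAM_IV_LEN];

    FileStreamBlockIV(1, 0, a);     /* file 1, chunk 0 */
    FileStreamBlockIV(1, 1, b);     /* file 1, chunk 1 */
    FileStreamBlockIV(2, 0, c);     /* file 2, chunk 0 */
    if (memcmp(a, b, STREAM_IV_LEN) == 0 || memcmp(a, c, STREAM_IV_LEN) == 0)
        return -1;
    return 0;
}
```

Because the chunk number is known inside the stream, the ...Read() and
...Write() signatures would not have to change under this scheme.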

I think that's enough to start the discussion. Thanks in advance for any
feedback.

[1]
https://www.postgresql.org/message-id/CA%2BTgmoYGjN_f%3DFCErX49bzjhNG%2BGoctY%2Ba%2BXhNRWCVvDY8U74w%40mail.gmail.com

--
Antonin Houska
Web: https://www.cybertec-postgresql.com
