Re: storing an explicit nonce - Mailing list pgsql-hackers

From Robert Haas
Subject Re: storing an explicit nonce
Date
Msg-id CA+TgmoYGjN_f=FCErX49bzjhNG+GoctY+a+XhNRWCVvDY8U74w@mail.gmail.com
Whole thread Raw
In response to Re: storing an explicit nonce  (Stephen Frost <sfrost@snowman.net>)
List pgsql-hackers
On Wed, Oct 6, 2021 at 11:22 AM Stephen Frost <sfrost@snowman.net> wrote:
> Seems like it'd be easier to achieve that by having something that looks
> very close to how write() looks, but just happens to have the option to
> run the data through a stream cipher and maybe does better error
> handling for us.  Making that layer also do block-based access to the
> files underneath seems like a much larger effort that, sure, may make
> some things better too but if we could do that with the same API then it
> could also be done later if someone's interested in that.

Yeah, it's possible that is the best option, but I'm not really
convinced. I think the places that are doing I/O in small chunks are
pretty questionable. Like look at this code from pgstat.c, with block
comments omitted for brevity:

    rc = fwrite(&format_id, sizeof(format_id), 1, fpout);
    (void) rc;                  /* we'll check for error with ferror */
    rc = fwrite(&globalStats, sizeof(globalStats), 1, fpout);
    (void) rc;                  /* we'll check for error with ferror */
    rc = fwrite(&archiverStats, sizeof(archiverStats), 1, fpout);
    (void) rc;                  /* we'll check for error with ferror */
    rc = fwrite(&walStats, sizeof(walStats), 1, fpout);
    (void) rc;                  /* we'll check for error with ferror */
   rc = fwrite(slruStats, sizeof(slruStats), 1, fpout);
    (void) rc;                  /* we'll check for error with ferror */

I don't know exactly what the best way to write this code is, but I'm
fairly sure this isn't it. I suppose that whoever wrote this code
chose to use fwrite() rather than write() to get buffering, but that
had the effect of delaying the error checking by an amount that I
would consider unacceptable in new code -- we do all the fwrite()
calls to generate the entire file and then only check ferror() once at
the very end! If we did our own buffering, we could do this a lot
better. And if we used that same buffering layer everywhere, it might
not be too hard to make it use a block cipher rather than a stream
cipher. Now I don't intrinsically have strong feeling about whether
block ciphers or stream ciphers are better, but I think it's going to
be easier to analyze the security of the system and to maintain it
across future developments in cryptography if we can use the same kind
of cipher everywhere. If we use block ciphers for some things and
stream ciphers for other things, it is more complicated.

Perhaps that is unavoidable and I just shouldn't worry about it. It
may work out that we'll end up needing to do that anyway for one
reason or another. But all things being equal, I think it's nice if we
make all the places where we do I/O look more like each other, not
specifically because of TDE but because that's just better in general.
For example Andres is working on async I/O. Maybe this particular
piece of code is moot in terms of that project because I think Andres
is hoping to get the shared memory stats collector patch committed.
But, say that doesn't happen. The more all of the I/O looks the same,
the easier it will be to make all of it use whatever async I/O
infrastructure he creates. The more every module does things in its
own way, the harder it is. And batching work together into
reasonable-sized blocks is probably necessary for async I/O too.

I just can't look at code like that shown above and think anything
other than "blech".

-- 
Robert Haas
EDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: Role Self-Administration
Next
From: Tom Lane
Date:
Subject: Re: Running tests under valgrind is getting slower at an alarming pace