Re: Cryptohash OpenSSL error queue in FIPS enabled builds - Mailing list pgsql-hackers

From Daniel Gustafsson
Subject Re: Cryptohash OpenSSL error queue in FIPS enabled builds
Date
Msg-id 2E3162A4-874A-456E-AF89-3E93AAB7CE95@yesql.se
Whole thread Raw
In response to Re: Cryptohash OpenSSL error queue in FIPS enabled builds  (Michael Paquier <michael@paquier.xyz>)
Responses Re: Cryptohash OpenSSL error queue in FIPS enabled builds  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Cryptohash OpenSSL error queue in FIPS enabled builds  (Michael Paquier <michael@paquier.xyz>)
List pgsql-hackers
> On 25 Apr 2022, at 02:50, Michael Paquier <michael@paquier.xyz> wrote:
> On Sat, Apr 23, 2022 at 11:40:19PM +0200, Daniel Gustafsson wrote:
>> On 22 Apr 2022, at 19:01, Tom Lane <tgl@sss.pgh.pa.us> wrote:

>>> It seems like that solution means you're leaving an extra error in the queue to
>>> break unrelated code.  Wouldn't it be better to clear after grabbing the error?
>>> (Or maybe do both.)
>>
>> That's a very good point, doing it in both ends of the operation is better
>> here.
>
> Error queues are cleaned with ERR_clear_error() before specific SSL
> calls in the frontend and the backend, never after the fact.  If we
> assume that multiple errors can be stacked in the OpenSSL error queue,
> shouldn't we worry about cleaning up the error queue in code paths
> like pgtls_read/write(), be_tls_read/write() and be_tls_open_server()?
> So it seems to me that SSLerrmessage() should be treated the same way
> for the backend and the frontend.  Any opinions?

Well, clearing the queue before calling into OpenSSL is the programming pattern
which is quite universally followed so I'm not sure we need to litter the
codepaths with calls to clearing the queue as we leave.

In this particular codepath I think we can afford clearing it on the way out,
with a comment explaining why.  It's easily reproducible and adding a call and
a comment is a good documentation for ourselves of this OpenSSL behavior.  That
being said, clearing on the way in is the important bit.

> pgcrypto's openssl.c has the same problem under FIPS as it includes
> EVP calls.  Saying that, putting a cleanup in pg_cryptohash_create()
> before the fact, and one in SSLerrmessage() after consuming the error
> code should be fine to keep a clean queue.

pgcrypto doesn't really consume or even inspect the OpenSSL errors, but pass on
a PXE error based on the context of the operation.  We could clear the queue as
we leave, but as you say we already clear it before calling in other places so
it's not clear that it's useful.  We've had EVP in pgcrypto for some time
without seeing issues from error queues, one error left isn't that different
from two when not consumed.

The attached 0002 does however correctly (IMO) report the error as an init
error instead of the non-descript generic error, which isn't really all that
helpful.  I think that also removes the last consumer of the generic error, but
I will take another look with fresh eyes to confirm that.

0003 removes what I think is a weirdly placed questionmark from the message
that make it seem strangely ambiguous.  This needs to update the test answer
files as well, but I first wanted to float the idea before doing that.

--
Daniel Gustafsson        https://vmware.com/


Attachment

pgsql-hackers by date:

Previous
From: Nathan Bossart
Date:
Subject: Re: add checkpoint stats of snapshot and mapping files of pg_logical dir
Next
From: Tomas Vondra
Date:
Subject: Re: bogus: logical replication rows/cols combinations