Re: Unnecessary connection overhead due copy-on-write (mainly openssl) - Mailing list pgsql-hackers
From | Nico Williams |
---|---|
Subject | Re: Unnecessary connection overhead due copy-on-write (mainly openssl) |
Date | |
Msg-id | aENNKkE+JkkwBtmV@ubby |
In response to | Re: Unnecessary connection overhead due copy-on-write (mainly openssl) (Jacob Champion <jacob.champion@enterprisedb.com>) |
List | pgsql-hackers |
On Fri, Jun 06, 2025 at 11:58:38AM -0700, Jacob Champion wrote:
> > I'd expect all subsystems to recover cleanly from unclean shutdowns. I
> > know, that's a lot to expect, but nowadays pretty much all filesystems
> > used in production do, for example.
>
> I guess, but if we stop cleaning up entirely, we will suddenly be
> stressing those code paths... But maybe that's a community service? :)

The latter.

> I realize I'm making an argument from fear and ignorance. Maybe that
> ecosystem is very healthy. I'm just imagining the following
> conversation:
>
> DBA: we upgraded our server and our HSM is freaking out after a few
> thousand connections; what gives?
> us: oh, we stopped cleaning up after ourselves for performance! tell
> your vendor to fix their drivers!
> DBA: hahahaha

TPMs for example have a concept of session. You can have up to 64 open
sessions, and if you use the TPM resource manager and you're accessing
it through a file descriptor then the RM will just clean up when you
exit. Though if you're accessing the raw TPM directly then fail to
flush sessions then yes, you'll eventually be unable to create new ones.

However no one will be using a discrete or firmware TPM for TLS server
certificate private key usage: discrete TPMs are way way too slow for
that, and firmware TPMs are... also way too slow. You wouldn't bother
with a software TPM for this unless it's for privilege separation.

Anyways, if you were using a TPM then the user's startup scripts, or
postgres itself could just flush all sessions and be done.

Other types of hardware cryptographic providers also tend to have a
notion of "session", and they all tend to have relatively paltry limits,
which means that the software side that calls them will generally need
to be prepared to a) close its own sessions eagerly (at the cost of
extra overhead on the next operation), and b) recover from running out
of sessions (by flushing others at the cost of causing those that were
live to need retries).
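To make (a) and (b) concrete, here is a minimal sketch in Python. Everything in it is hypothetical: `ToyProvider`, `OutOfSessions`, and `sign_once` are invented stand-ins for a device with a small session table, not the OpenSSL, PKCS#11, or TPM APIs. It only shows the shape of the caller-side logic, assuming the provider reports exhaustion as a distinct error and offers some way to flush every session.

```python
class OutOfSessions(Exception):
    """Raised by the toy provider when its session table is full."""


class ToyProvider:
    """Hypothetical stand-in for a device with a small session limit."""

    def __init__(self, max_sessions=64):
        self.max_sessions = max_sessions
        self.open_sessions = set()
        self._next_handle = 1

    def open_session(self):
        if len(self.open_sessions) >= self.max_sessions:
            raise OutOfSessions()
        handle = self._next_handle
        self._next_handle += 1
        self.open_sessions.add(handle)
        return handle

    def close_session(self, handle):
        self.open_sessions.discard(handle)

    def flush_all(self):
        # Anyone still holding a flushed handle would have to retry.
        self.open_sessions.clear()

    def sign(self, handle, data):
        if handle not in self.open_sessions:
            raise KeyError("stale session handle")
        return b"sig:" + data


def sign_once(provider, data):
    """(a) open, use, and close eagerly; (b) flush and retry once if full."""
    try:
        handle = provider.open_session()
    except OutOfSessions:
        # Recover from exhaustion, e.g. sessions leaked by unclean exits.
        provider.flush_all()
        handle = provider.open_session()
    try:
        return provider.sign(handle, data)
    finally:
        # Eager close: pay setup cost again next time, never hold a slot.
        provider.close_session(handle)
```

With a provider whose table has been filled by leaked sessions, `sign_once` still succeeds and leaves no session open afterwards; the cost lands on whoever held the flushed sessions.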
But anyways, IIUC the OpenSSL engine interface is itself stateless and I
would expect providers to auto-recover.

And anyways I expect no one uses PG with HW cryptographic providers to
perform TLS server signatures. Instead the best current practice would
be to use short-lived server certificates with software keys and
longer-lived credentials in hardware with which to fetch new short-lived
credentials with software keys.

The kinds of HSMs that can do high rates of signatures are neither cheap
nor commonly used, and those do tend to have higher session limits, and
again you can recover from running out of sessions by flushing extant
sessions.

> > I doubt that PG w/ OpenSSL in any configuration maintains stateful
> > interactions with HW cryptographic providers.
>
> (Why? From looking over the Cryptoki/PKCS#11 stuff, for example, isn't
> a lot of that API stateful?)

PKCS#11 is stateful, yes (it has session handles), but there are
generally low limits on how many sessions you can keep open, therefore
high pressure to close them soon, therefore the inference is that that
must be what actually happens at the rather high cost of having to set
up new sessions often. That inference could be wrong, but then as you
note you'd be doing the community a service by testing it and making it
true in the future.

Nico
--