Hi,
Looking at [1] I, again, noticed that a decent portion of our connection
overhead is due to openssl's atexit handler.
On my older workstation (with a few noisy things running):
c=16;pgbench -n -M prepared -c$c -j$c -P1 -T10 -f <(echo 'select') -C
-> 3057 TPS
If I change the exit() in proc_exit() to a _exit():
-> 3633 TPS
The reason for this difference is that by default openssl registers an atexit
handler that frees a lot of memory that was initialized in postmaster. That in
turn triggers page faults, because the relevant pages now differ in the child
processes. That a) isn't cheap and b) causes contention with postmaster, since
those data structures are shared.
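To make the exit() vs _exit() difference concrete, here is a minimal
standalone sketch (not PostgreSQL code). cleanup_handler() is a hypothetical
stand-in for a library's atexit cleanup such as openssl's, and
handler_ran_in_child() is a made-up helper that forks a child and reports
whether the handler ran:

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Write end of the pipe the child's atexit handler reports on. */
static int report_fd = -1;

/* Hypothetical stand-in for a library cleanup routine. */
static void
cleanup_handler(void)
{
    const char marker = 'x';

    if (write(report_fd, &marker, 1) != 1)
        _exit(2);
}

/*
 * Fork a child that registers cleanup_handler() and then terminates with
 * either exit() or _exit().  Returns 1 if the handler ran in the child,
 * 0 if it did not, and -1 on error.
 */
static int
handler_ran_in_child(int use_underscore_exit)
{
    int     fds[2];
    pid_t   pid;
    char    buf;
    ssize_t nread;

    if (pipe(fds) != 0)
        return -1;

    pid = fork();
    if (pid < 0)
    {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0)
    {
        /* child: register the handler, then leave one way or the other */
        close(fds[0]);
        report_fd = fds[1];
        if (atexit(cleanup_handler) != 0)
            _exit(3);
        if (use_underscore_exit)
            _exit(0);           /* skips atexit handlers */
        exit(0);                /* runs atexit handlers */
    }

    /* parent: one byte means the handler ran, EOF means it did not */
    close(fds[1]);
    nread = read(fds[0], &buf, 1);
    close(fds[0]);
    waitpid(pid, NULL, 0);

    if (nread < 0)
        return -1;
    return nread == 1 ? 1 : 0;
}
```

handler_ran_in_child(0) should report that the handler ran, and
handler_ran_in_child(1) that it did not. The sketch only shows which exit path
runs the handlers; it says nothing about their cost.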
It's possible to tell openssl to not register an atexit handler, see [2]:
> OPENSSL_INIT_NO_ATEXIT
> By default OpenSSL will attempt to clean itself up when the process exits via
> an "atexit" handler. Using this option suppresses that behaviour. This means
> that the application will have to clean up OpenSSL explicitly using
> OPENSSL_cleanup().
One slight difficulty is that we initialize openssl somewhat indirectly, via
PostmasterMain()->InitProcessGlobals()->pg_prng_strong_seed() which then, if
built with openssl support, triggers initialization within RAND_status().
The quick hack of putting
#ifdef USE_OPENSSL
OPENSSL_init_crypto(OPENSSL_INIT_NO_ATEXIT, NULL);
#endif
at the start of PostmasterMain() gets the connection speed up a fair bit:
-> 3449 TPS
The reason this isn't as good as using _exit is that there are other libraries
with (effectively) atexit handlers. In particular ICU pulls in libstdc++,
which in turn seems to have a lot of destructors for global objects that
aren't cheap.
If I build without ICU support, the connection rate with exit() (and the
openssl "fix") is
-> 3863 TPS
and if I use _exit() it is
-> 3900 TPS
I.e. at that point the remaining atexit handlers only play a small role.
I don't know if there's a decent solution for the nontrivial overhead due to
ICU -> libstdc++'s atexit handlers.
There are a few related issues where we ourselves are to blame. The most prominent
one is that we go around and delete PostmasterContext in child processes. That
however doesn't really save memory, as the memory is still needed in
postmaster, we just end up causing page faults that trigger copy-on-write.
If I just comment out the MemoryContextDelete in PostgresMain() I see
connection rates improve from
-> 3891 TPS
to
-> 4004 TPS
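The copy-on-write effect can be illustrated with a small standalone sketch
(again not PostgreSQL code). It assumes a fork()-based system where
getrusage() reports minor faults. CowResult and touch_inherited_pages() are
hypothetical names, and the one-byte-per-page write merely stands in for what
freeing inherited memory does to allocator metadata:

```c
#include <stdlib.h>
#include <string.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

typedef struct CowResult
{
    long child_minor_faults;    /* faults while writing, -1 on error */
    int  parent_copy_intact;    /* 1 if the parent's data is unchanged */
} CowResult;

/* Minor page faults taken by this process so far, or -1 on error. */
static long
minor_faults(void)
{
    struct rusage ru;

    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1;
    return ru.ru_minflt;
}

/*
 * The parent fills npages worth of memory, then forks.  The child writes
 * one byte to each page and reports how many minor faults that took.
 */
static CowResult
touch_inherited_pages(size_t npages)
{
    CowResult      result = {-1, 0};
    long           pagesz = sysconf(_SC_PAGESIZE);
    long           faults = -1;
    size_t         len, i;
    unsigned char *mem;
    int            fds[2];
    pid_t          pid;

    if (pagesz <= 0 || npages == 0)
        return result;
    len = npages * (size_t) pagesz;
    if ((mem = malloc(len)) == NULL)
        return result;
    memset(mem, 0xAA, len);     /* parent populates the memory */

    if (pipe(fds) != 0 || (pid = fork()) < 0)
    {
        free(mem);
        return result;
    }

    if (pid == 0)
    {
        long before = minor_faults();
        long delta;

        for (i = 0; i < npages; i++)
            mem[i * (size_t) pagesz] = 0;   /* forces a private copy */
        delta = (before < 0) ? -1 : minor_faults() - before;
        if (write(fds[1], &delta, sizeof(delta)) != (ssize_t) sizeof(delta))
            _exit(1);
        _exit(0);
    }

    close(fds[1]);
    if (read(fds[0], &faults, sizeof(faults)) != (ssize_t) sizeof(faults))
        faults = -1;
    close(fds[0]);
    waitpid(pid, NULL, 0);

    /* the child's writes must not be visible in the parent */
    result.child_minor_faults = faults;
    result.parent_copy_intact = 1;
    for (i = 0; i < npages; i++)
        if (mem[i * (size_t) pagesz] != 0xAA)
            result.parent_copy_intact = 0;
    free(mem);
    return result;
}
```

The parent's copy stays intact, so nothing is saved on the postmaster side.
The child's fault count depends on the kernel, the allocator and settings such
as transparent huge pages, so treat it as indicative only.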
If I build a much more minimal postgres, disabling all optional dependencies
other than openssl, I see a significant improvement, just due to fewer mmaps for
the libraries:
-> 4865 TPS
Interestingly, further disabling openssl and zlib does not help.
Greetings,
Andres Freund
[1] https://postgr.es/m/CAFbpF8OA44_UG%2BRYJcWH9WjF7E3GA6gka3gvH6nsrSnEe9H0NA%40mail.gmail.com
[2] https://docs.openssl.org/3.1/man3/OPENSSL_init_crypto/#name