libpq contention due to gss even when not using gss - Mailing list pgsql-hackers

From Andres Freund
Subject libpq contention due to gss even when not using gss
Date
Msg-id 20240610181212.auytluwmbfl7lb5n@awork3.anarazel.de
List pgsql-hackers
Hi,

To investigate a report of both postgres and pgbouncer having issues when a
lot of new connections are established, I used pgbench -C.  Oddly, on an
early attempt, the bottleneck wasn't postgres+pgbouncer, it was pgbench. But
only when using TCP, not with unix sockets.

c=40;pgbench -C -n -c$c -j$c -T5 -f <(echo 'select 1') 'port=6432 host=127.0.0.1 user=test dbname=postgres password=fake'

host=127.0.0.1:                           16465
host=127.0.0.1,gssencmode=disable         20860
host=/tmp:                                49286

Note that the server does *not* support gss, yet gss has a substantial
performance impact.
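
For anyone wanting to repeat the comparison, the three cases differ only in the host part of the connection string and in whether gssencmode=disable is set. A minimal sketch, assuming the same port, user and database as in the command above; conninfo_for, run_case and the RUN switch are hypothetical helpers made up for this illustration, not part of pgbench or libpq, and the script file stands in for the <(echo 'select 1') used above:

```shell
# Sketch only: builds the three connection strings compared above.
# By default it just prints the pgbench commands; set RUN=1 to execute them.
c=40
base='port=6432 user=test dbname=postgres password=fake'   # values from the command above

conninfo_for() {    # hypothetical helper: append the per-case settings
    printf '%s %s' "$base" "$1"
}

run_case() {        # hypothetical helper: run (or print) one benchmark case
    conninfo=$(conninfo_for "$1")
    if [ "${RUN:-0}" = 1 ]; then
        script=$(mktemp)
        echo 'select 1' > "$script"
        pgbench -C -n -c"$c" -j"$c" -T5 -f "$script" "$conninfo"
        rm -f "$script"
    else
        # Dry run: show what would be executed.
        echo "pgbench -C -n -c$c -j$c -T5 -f select1.sql '$conninfo'"
    fi
}

run_case 'host=127.0.0.1'
run_case 'host=127.0.0.1 gssencmode=disable'
run_case 'host=/tmp'
```

Setting gssencmode=disable in the connection string is the only change between the first two cases. libpq also reads the PGGSSENCMODE environment variable, which should have the same effect without touching the connection string.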

Obviously the connection rates here are absurdly high and, outside of badly
written applications, likely never practically relevant. However, the number
of cores in systems is going up, and this quite possibly will become relevant
in more realistic scenarios (lock contention kicks in earlier the more cores
you have).

And it doesn't seem great that something as rarely used as gss introduces
overhead to very common paths.

Here's a bottom-up profile:

-   32.10%  pgbench  [kernel.kallsyms]      [k] queued_spin_lock_slowpath
   - 32.09% queued_spin_lock_slowpath
      - 16.15% futex_wake
           do_futex
           __x64_sys_futex
           do_syscall_64
         - entry_SYSCALL_64_after_hwframe
            - 16.15% __GI___lll_lock_wake
               - __GI___pthread_mutex_unlock_usercnt
                  - 5.12% gssint_select_mech_type
                     - 4.36% gss_inquire_attrs_for_mech
                        - 2.85% gss_indicate_mechs
                           - gss_indicate_mechs_by_attrs
                              - 1.58% gss_acquire_cred_from
                                   gss_acquire_cred
                                   pg_GSS_have_cred_cache
                                   select_next_encryption_method (inlined)
                                   init_allowed_encryption_methods (inlined)
                                   PQconnectPoll
                                   pqConnectDBStart (inlined)
                                   PQconnectStartParams
                                   PQconnectdbParams
                                   doConnect



Clearly the contention originates outside of our code, but is triggered by
doing pg_GSS_have_cred_cache() every time a connection is established.

Greetings,

Andres Freund


