Re: Database logins taking longer and longer, showing up as "authentication" in ps(1) - Mailing list pgsql-general

From Adam Sjøgren
Subject Re: Database logins taking longer and longer, showing up as "authentication" in ps(1)
Date
Msg-id 87tuwyojsu.fsf@tullinup.koldfront.dk
Whole thread Raw
In response to Re: Database logins taking longer and longer, showing up as "authentication" in ps(1)  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Database logins taking longer and longer, showing up as "authentication" in ps(1)  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
Tom writes:

> Stephen Frost <sfrost@snowman.net> writes:
>> * Adam Sjøgren (asjo@koldfront.dk) wrote:
>>> Sometimes new database logins slow down, from usually taking <0.05s to
>>> taking minutes. This is for psql as a normal user using Kerberos, for
>>> psql as the postgres superuser, for the web-application logging into the
>>> database, for everything.
>
> I think looking into pg_stat_activity and pg_locks might be interesting,
> although it's possible the delay is before the new session has made
> any entries there.
>
> Also, obtaining stack traces from a few of the stuck processes would
> likely be quite informative.

I will try to look at pg_stat_activity and pg_locks the next time - it
just happened now and I focused on getting some stacktraces -
unfortunately I do not have the -dbg package installed, so I don't know
how useful they are.

Here's the first one:

  $ ps faux | grep postgres | grep authentication
  postgres  16697  0.0  0.0 34686088 8352 ?       Ss   14:45   0:00  \_ postgres: 11/main: workqueue XXdb
127.0.0.1(42534)authentication 
  postgres  16701  0.0  0.0 34686088 6476 ?       Ss   14:45   0:00  \_ postgres: 11/main: anonymous XXdb
127.0.0.1(42536)authentication 
  postgres  16702  0.0  0.0 34686088 8348 ?       Ss   14:45   0:00  \_ postgres: 11/main: svc_user XXdb
127.0.0.1(42540)authentication 
  postgres  16704  0.0  0.0 34686040 5376 ?       Ss   14:45   0:00  \_ postgres: 11/main: postgres postgres [local]
authentication
  postgres  16705  0.0  0.0 34686088 8352 ?       Ss   14:45   0:00  \_ postgres: 11/main: svc_user XXdb
127.0.0.1(42542)authentication 
  postgres  16708  0.0  0.0 34686088 6476 ?       Ss   14:45   0:00  \_ postgres: 11/main: svc_user XXdb
127.0.0.1(42544)authentication 
  $ sudo gdb -p  16708
  GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.5) 7.11.1
    [...]
  Attaching to process 16708
  Reading symbols from /usr/lib/postgresql/11/bin/postgres...(no debugging symbols found)...done.
    [...]
  0x00007f8270a039f3 in delete_module () at ../sysdeps/unix/syscall-template.S:86
  86    ../sysdeps/unix/syscall-template.S: No such file or directory.
  (gdb) bt
  #0  0x00007f8270a039f3 in delete_module () at ../sysdeps/unix/syscall-template.S:86
  #1  0x000055f60d3d0101 in WaitEventSetWait ()
  #2  0x000055f60d2dccff in secure_read ()
  #3  0x000055f60d2e9138 in ?? ()
  #4  0x000055f60d2e9e75 in pq_getbyte ()
  #5  0x000055f60d3f3100 in PostgresMain ()
  #6  0x000055f60d102bf8 in ?? ()
  #7  0x000055f60d37e492 in PostmasterMain ()
  #8  0x000055f60d103fa5 in main ()
  (gdb)

I think I was too slow on this one, and this is what it looks like if
the process is just waiting for a connection (as you can guess, I'm not
used to looking at backtraces).

Here is another one, which hopefully is caught at a better moment, and
more useful:

  postgres  19495  0.0  0.0 34686088 8412 ?       Ss   14:48   0:00  \_ postgres: 11/main: workqueue XXdb
127.0.0.1(43390)authentication 
  postgres  19497  0.0  0.0 34686088 6476 ?       Ss   14:48   0:00  \_ postgres: 11/main: XXX XXdb 127.0.0.1(43392)
authentication
  postgres  19502  0.0  0.0 34686088 8412 ?       Ss   14:48   0:00  \_ postgres: 11/main: XXX XXdb 127.0.0.1(43394)
authentication
  $ sudo gdb -p  19495
  GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.5) 7.11.1
    [...]
  Attaching to process 19495
    [...]
  0x00007f8272cf2827 in futex_abstimed_wait_cancelable (private=128, abstime=0x0, expected=0,
futex_word=0x7f7a2e9f1d38)at ../sysdeps/unix/sysv/linux/futex-internal.h:205 
  205    ../sysdeps/unix/sysv/linux/futex-internal.h: No such file or directory.
  (gdb) bt
  #0  0x00007f8272cf2827 in futex_abstimed_wait_cancelable (private=128, abstime=0x0, expected=0,
futex_word=0x7f7a2e9f1d38)at ../sysdeps/unix/sysv/linux/futex-internal.h:205 
  #1  do_futex_wait (sem=sem@entry=0x7f7a2e9f1d38, abstime=0x0) at sem_waitcommon.c:111
  #2  0x00007f8272cf28d4 in __new_sem_wait_slow (sem=0x7f7a2e9f1d38, abstime=0x0) at sem_waitcommon.c:181
  #3  0x00007f8272cf297a in __new_sem_wait (sem=<optimized out>) at sem_wait.c:29
  #4  0x000055f60d36c252 in PGSemaphoreLock ()
  #5  0x000055f60d3e35cc in LWLockAcquire ()
  #6  0x000055f60d3d14ee in ProcArrayAdd ()
  #7  0x000055f60d3e01a3 in InitProcessPhase2 ()
  #8  0x000055f60d51c99b in InitPostgres ()
  #9  0x000055f60d3f2baf in PostgresMain ()
  #10 0x000055f60d102bf8 in ?? ()
  #11 0x000055f60d37e492 in PostmasterMain ()
  #12 0x000055f60d103fa5 in main ()
  (gdb) cont
  Continuing.
  ^C
  Program received signal SIGINT, Interrupt.
  0x00007f8272cf2827 in futex_abstimed_wait_cancelable (private=128, abstime=0x0, expected=0,
futex_word=0x7f7a2e9f1d38)at ../sysdeps/unix/sysv/linux/futex-internal.h:205 
  205    in ../sysdeps/unix/sysv/linux/futex-internal.h
  (gdb) bt
  #0  0x00007f8272cf2827 in futex_abstimed_wait_cancelable (private=128, abstime=0x0, expected=0,
futex_word=0x7f7a2e9f1d38)at ../sysdeps/unix/sysv/linux/futex-internal.h:205 
  #1  do_futex_wait (sem=sem@entry=0x7f7a2e9f1d38, abstime=0x0) at sem_waitcommon.c:111
  #2  0x00007f8272cf28d4 in __new_sem_wait_slow (sem=0x7f7a2e9f1d38, abstime=0x0) at sem_waitcommon.c:181
  #3  0x00007f8272cf297a in __new_sem_wait (sem=<optimized out>) at sem_wait.c:29
  #4  0x000055f60d36c252 in PGSemaphoreLock ()
  #5  0x000055f60d3e35cc in LWLockAcquire ()
  #6  0x000055f60d3d14ee in ProcArrayAdd ()
  #7  0x000055f60d3e01a3 in InitProcessPhase2 ()
  #8  0x000055f60d51c99b in InitPostgres ()
  #9  0x000055f60d3f2baf in PostgresMain ()
  #10 0x000055f60d102bf8 in ?? ()
  #11 0x000055f60d37e492 in PostmasterMain ()
  #12 0x000055f60d103fa5 in main ()
  (gdb) cont
  Continuing.
  (gdb) bt

  Program received signal SIGUSR1, User defined signal 1.
  0x00007f8270a039f3 in delete_module () at ../sysdeps/unix/syscall-template.S:86
  86    ../sysdeps/unix/syscall-template.S: No such file or directory.
  (gdb) bt
  #0  0x00007f8270a039f3 in delete_module () at ../sysdeps/unix/syscall-template.S:86
  #1  0x000055f60d3d0101 in WaitEventSetWait ()
  #2  0x000055f60d3d0574 in WaitLatchOrSocket ()
  #3  0x000055f60d3e0b41 in ProcSleep ()
  #4  0x000055f60d3db249 in ?? ()
  #5  0x000055f60d3dc591 in LockAcquireExtended ()
  #6  0x000055f60d3da365 in LockTuple ()
  #7  0x000055f60d144ea6 in heap_update ()
  #8  0x000055f60d2c59ff in ?? ()
  #9  0x000055f60d2c609b in ?? ()
  #10 0x000055f60d2a0d8b in standard_ExecutorRun ()
  #11 0x00007f8269670e45 in ?? () from /usr/lib/postgresql/11/lib/pg_stat_statements.so
  ---Type <return> to continue, or q <return> to quit---
  #12 0x000055f60d3f5cd5 in ?? ()
  #13 0x000055f60d3f5f28 in ?? ()
  #14 0x000055f60d3f6be5 in PortalRun ()
  #15 0x000055f60d3f26a1 in ?? ()
  #16 0x000055f60d3f3bcb in PostgresMain ()
  #17 0x000055f60d102bf8 in ?? ()
  #18 0x000055f60d37e492 in PostmasterMain ()
  #19 0x000055f60d103fa5 in main ()
  (gdb) cont
  Continuing.
  (gdb) bt
  (gdb) cont
  (gdb) cont
  ^C
  Program received signal SIGINT, Interrupt.
  0x00007f8270a039f3 in delete_module () at ../sysdeps/unix/syscall-template.S:86
  86    in ../sysdeps/unix/syscall-template.S
  (gdb) bt
  #0  0x00007f8270a039f3 in delete_module () at ../sysdeps/unix/syscall-template.S:86
  #1  0x000055f60d3d0101 in WaitEventSetWait ()
  #2  0x000055f60d2dccff in secure_read ()
  #3  0x000055f60d2e9138 in ?? ()
  #4  0x000055f60d2e9e75 in pq_getbyte ()
  #5  0x000055f60d3f3100 in PostgresMain ()
  #6  0x000055f60d102bf8 in ?? ()
  #7  0x000055f60d37e492 in PostmasterMain ()
  #8  0x000055f60d103fa5 in main ()
  (gdb) cont
  Continuing.
  ^C
  Program received signal SIGINT, Interrupt.
  0x00007f8270a039f3 in delete_module () at ../sysdeps/unix/syscall-template.S:86
  86    in ../sysdeps/unix/syscall-template.S
  (gdb) bt
  #0  0x00007f8270a039f3 in delete_module () at ../sysdeps/unix/syscall-template.S:86
  #1  0x000055f60d3d0101 in WaitEventSetWait ()
  #2  0x000055f60d2dccff in secure_read ()
  #3  0x000055f60d2e9138 in ?? ()
  #4  0x000055f60d2e9e75 in pq_getbyte ()
  #5  0x000055f60d3f3100 in PostgresMain ()
  #6  0x000055f60d102bf8 in ?? ()
  #7  0x000055f60d37e492 in PostmasterMain ()
  #8  0x000055f60d103fa5 in main ()
  (gdb) quit
  A debugging session is active.

          Inferior 1 [process 19495] will be detached.

  Quit anyway? (y or n) n
  Not confirmed.
  (gdb) detach
  Detaching from program: /usr/lib/postgresql/11/bin/postgres, process 19495
  (gdb) quit

Is this helpful, or should I do something different?

Here is a final one:

  postgres  21257  0.0  0.0 34686088 6476 ?       Ss   14:52   0:00  \_ postgres: 11/main: svc_user XXdb
127.0.0.1(44192)authentication 
  $ sudo gdb -p 21257
    [...]
  0x00007f8272cf2827 in futex_abstimed_wait_cancelable (private=128, abstime=0x0, expected=0,
futex_word=0x7f7a2ea001b8)at ../sysdeps/unix/sysv/linux/futex-internal.h:205 
  205    ../sysdeps/unix/sysv/linux/futex-internal.h: No such file or directory.
  (gdb) bt
  #0  0x00007f8272cf2827 in futex_abstimed_wait_cancelable (private=128, abstime=0x0, expected=0,
futex_word=0x7f7a2ea001b8)at ../sysdeps/unix/sysv/linux/futex-internal.h:205 
  #1  do_futex_wait (sem=sem@entry=0x7f7a2ea001b8, abstime=0x0) at sem_waitcommon.c:111
  #2  0x00007f8272cf28d4 in __new_sem_wait_slow (sem=0x7f7a2ea001b8, abstime=0x0) at sem_waitcommon.c:181
  #3  0x00007f8272cf297a in __new_sem_wait (sem=<optimized out>) at sem_wait.c:29
  #4  0x000055f60d36c252 in PGSemaphoreLock ()
  #5  0x000055f60d3e35cc in LWLockAcquire ()
  #6  0x000055f60d3d14ee in ProcArrayAdd ()
  #7  0x000055f60d3e01a3 in InitProcessPhase2 ()
  #8  0x000055f60d51c99b in InitPostgres ()
  #9  0x000055f60d3f2baf in PostgresMain ()
  #10 0x000055f60d102bf8 in ?? ()
  #11 0x000055f60d37e492 in PostmasterMain ()
  #12 0x000055f60d103fa5 in main ()
  (gdb) cont
  Continuing.
  bt

  Program received signal SIGUSR1, User defined signal 1.
  0x000055f60d10e50e in slot_getsomeattrs ()
  (gdb) bt
  #0  0x000055f60d10e50e in slot_getsomeattrs ()
  #1  0x000055f60d29d260 in ?? ()
  #2  0x000055f60d2a9965 in ExecScan ()
  #3  0x000055f60d2b7b2a in ?? ()
  #4  0x000055f60d2c71e6 in ?? ()
  #5  0x000055f60d2a0d8b in standard_ExecutorRun ()
  #6  0x00007f8269670e45 in ?? () from /usr/lib/postgresql/11/lib/pg_stat_statements.so
  #7  0x000055f60d3f571b in ?? ()
  #8  0x000055f60d3f6d40 in PortalRun ()
  #9  0x000055f60d3f26a1 in ?? ()
  #10 0x000055f60d3f3bcb in PostgresMain ()
  #11 0x000055f60d102bf8 in ?? ()
  #12 0x000055f60d37e492 in PostmasterMain ()
  #13 0x000055f60d103fa5 in main ()
  (gdb) cont
  Continuing.

  Program received signal SIGUSR1, User defined signal 1.
  0x000055f60d10e50e in slot_getsomeattrs ()
  (gdb) bt
  #0  0x000055f60d10e50e in slot_getsomeattrs ()
  #1  0x000055f60d29d260 in ?? ()
  #2  0x000055f60d2a9965 in ExecScan ()
  #3  0x000055f60d2b7b2a in ?? ()
  #4  0x000055f60d2c71e6 in ?? ()
  #5  0x000055f60d2a0d8b in standard_ExecutorRun ()
  #6  0x00007f8269670e45 in ?? () from /usr/lib/postgresql/11/lib/pg_stat_statements.so
  #7  0x000055f60d3f571b in ?? ()
  #8  0x000055f60d3f6d40 in PortalRun ()
  #9  0x000055f60d3f26a1 in ?? ()
  #10 0x000055f60d3f3bcb in PostgresMain ()
  #11 0x000055f60d102bf8 in ?? ()
  #12 0x000055f60d37e492 in PostmasterMain ()
  #13 0x000055f60d103fa5 in main ()
  (gdb) cont
  Continuing.
  (gdb) bt
  [Inferior 1 (process 21257) exited normally]
  (gdb) bt
  No stack.
  (gdb) cont
  The program is not being run.
  (gdb) quit

Let me know if there is something else I should try.


  Best regards,

    Adam

--
 "It is a sort of cheap and cheerful kind of                Adam Sjøgren
  abstraction, but it works well in practise."         asjo@koldfront.dk



pgsql-general by date:

Previous
From: Jehan-Guillaume de Rorthais
Date:
Subject: Re: BUG? Slave don't reconnect to the master
Next
From: Laura Smith
Date:
Subject: Inline count on a query