Re: How to investiage slow insert problem - Mailing list pgsql-performance

From Rural Hunter
Subject Re: How to investiage slow insert problem
Date
Msg-id 521418DA.6020608@gmail.com
Whole thread Raw
In response to How to investiage slow insert problem  (Jeff Janes <jeff.janes@gmail.com>)
List pgsql-performance
Hi Jeff,

Thanks a lot for such a detailed guide!

于 2013/8/21 8:24, Jeff Janes 写道:
>
> OK, that certainly does sound like network problems and not disk
> contention.  But what I don't see is why it would be listed as
> "active" in pg_stat_activity.  If it is blocked on a network
> connection, I would think it would show 'idle'.
>
> strace and gdb are aggressive forms of monitoring, so you should talk
> to your sys admins before using them (if your organization has that
> type of segregation of duties) or read their documentation.
>
> It is best to run them on a test system if you can get the problem to
> occur there.  If you can't, then I would be willing to run them
> against a hung backend on a production system, but I might be more
> adventurous than most.
>
> I have occasionally seen strace cause the straced program to
> seg-fault, but I've only seen this with GUI programs and never with
> postgresql.
>
> In theory attaching gdb could make a backend pause while holding a
> spinlock, and if left that way for long enough could cause a
> postgresql-wide panic over a stuck spinlock.  But I've never seen this
> happen without intentionally causing.  If the backend you attach is
> truly hung and hasn't already causes a panic, then it almost surely
> can't be happen, but just to be sure you should quit gdb as soon as
> possible so that the backend can continue unimpeded.
>
> Anyway, both are fairly easy to run once you find the pid of one of a
> stuck backend (either using ps, or using pg_stat_activity).  Then you
> give the pid to the debugging program with -p option.
>
> You probably have to run as the postgres user, or you won't have
> permissions to attach to the backend.
>
> With strace, once you attach you will see a stream of system calls go
> to your screen, until you hit ctrl-C to detach.  But if the backend is
> hung on a network connection, there should really only be one system
> call, and then it just wait until you detach or the network connection
> times out, like this:
>
> $ strace -p 21116
> Process 21116 attached - interrupt to quit
> recvfrom(9,
>
> So it is waiting on a recvfrom call, and that call never returns until
> I got sick of waiting and hit ctrl-C.  Not very interesting, but it
> does show it is indeed stuck on the network
>
> for gdb, it is similar to invoke:
>
> $ gdb -p 21116
>
> and it then produces several screenfuls of diagnostic gibberish and
> gives you an interactive command-line environment.  Once attached, you
> want to get a backtrace ("bt", return), and then quit promptly ("q",
> return, "y").
>
> That produces something like this:
>
> Loaded symbols for /lib64/libnss_files.so.2
> 0x00000032a80e9672 in __libc_recv (fd=<value optimized out>,
> buf=0xb33f60, n=8192, flags=0) at
> ../sysdeps/unix/sysv/linux/x86_64/recv.c:30
> 30          return INLINE_SYSCALL (recvfrom, 6, fd, buf, n, flags,
> NULL, NULL);
> (gdb) bt
> #0  0x00000032a80e9672 in __libc_recv (fd=<value optimized out>,
> buf=0xb33f60, n=8192, flags=0) at
> ../sysdeps/unix/sysv/linux/x86_64/recv.c:30
> #1  0x00000000005a4846 in secure_read (port=0x22f1190, ptr=0xb33f60,
> len=8192) at be-secure.c:304
> #2  0x00000000005ae33b in pq_recvbuf () at pqcomm.c:824
> #3  0x00000000005ae73b in pq_getbyte () at pqcomm.c:865
> #4  0x0000000000651c11 in SocketBackend (argc=<value optimized out>,
> argv=<value optimized out>, dbname=0x22d2a28 "jjanes", username=<value
> optimized out>)
>     at postgres.c:342
> #5  ReadCommand (argc=<value optimized out>, argv=<value optimized
> out>, dbname=0x22d2a28 "jjanes", username=<value optimized out>) at
> postgres.c:490
> #6  PostgresMain (argc=<value optimized out>, argv=<value optimized
> out>, dbname=0x22d2a28 "jjanes", username=<value optimized out>) at
> postgres.c:3923
> #7  0x000000000060e861 in BackendRun () at postmaster.c:3614
> #8  BackendStartup () at postmaster.c:3304
> #9  ServerLoop () at postmaster.c:1367
> #10 0x00000000006113b1 in PostmasterMain (argc=<value optimized out>,
> argv=<value optimized out>) at postmaster.c:1127
> #11 0x00000000005b0440 in main (argc=3, argv=0x22d1cb0) at main.c:199
> (gdb) q
> A debugging session is active.
>
>         Inferior 1 [process 21116] will be detached.
>
> Quit anyway? (y or n) y
> Detaching from program: /usr/local/pgsql9_2/bin/postgres, process 21116
>
> Cheers,
>
> Jeff



pgsql-performance by date:

Previous
From: Jeff Janes
Date:
Subject: How to investiage slow insert problem
Next
From: Pavel Stehule
Date:
Subject: Re: Function execute slow down in 9.2