Re: Debugging a backend stuck consuming CPU - Mailing list pgsql-general

From ktm@rice.edu
Subject Re: Debugging a backend stuck consuming CPU
Msg-id 20160519200641.GD32767@aart.rice.edu
In response to Re: Debugging a backend stuck consuming CPU  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Debugging a backend stuck consuming CPU
List pgsql-general
On Thu, May 19, 2016 at 09:58:45AM -0400, Tom Lane wrote:
> "ktm@rice.edu" <ktm@rice.edu> writes:
> > I am investigating a problem with a backend that appears to be stuck
> > and spinning while performing a "DISCARD ALL" command. The system is
> > running an older release 9.2.2.
>
> You do realize that the current release in that series is 9.2.17.
>
> > Are there any bugs that could be causing this behavior?
>
> Known bugs are summarized here:
> http://www.postgresql.org/docs/9.2/static/release.html
>
> > How can I tell what the process is actually doing?
>
> Getting a stack trace with gdb might be informative:
> https://wiki.postgresql.org/wiki/Generating_a_stack_trace_of_a_PostgreSQL_backend
>
>             regards, tom lane
>

Hi,

The stack traces look like just what I would expect to see while a 'DISCARD ALL'
command is running:

Continuing.

Program received signal SIGINT, Interrupt.
0x000000000073bc7c in MemoryContextSetParent ()
#0  0x000000000073bc7c in MemoryContextSetParent ()
#1  0x000000000073bde3 in MemoryContextDelete ()
#2  0x000000000054e3a9 in DropAllPreparedStatements ()
#3  0x00000000005365f3 in DiscardCommand ()
#4  0x00000000006582c7 in ?? ()
#5  0x00000000006592bd in ?? ()
#6  0x0000000000659a42 in PortalRun ()
#7  0x000000000065603d in ?? ()
#8  0x0000000000656ed0 in PostgresMain ()
#9  0x0000000000613b91 in ?? ()
#10 0x00000000006167fc in PostmasterMain ()
#11 0x00000000005b5290 in main ()
Continuing.

Program received signal SIGINT, Interrupt.
0x000000000073bc7c in MemoryContextSetParent ()
#0  0x000000000073bc7c in MemoryContextSetParent ()
#1  0x000000000073bde3 in MemoryContextDelete ()
#2  0x000000000054e3a9 in DropAllPreparedStatements ()
#3  0x00000000005365f3 in DiscardCommand ()
#4  0x00000000006582c7 in ?? ()
#5  0x00000000006592bd in ?? ()
#6  0x0000000000659a42 in PortalRun ()
#7  0x000000000065603d in ?? ()
#8  0x0000000000656ed0 in PostgresMain ()
#9  0x0000000000613b91 in ?? ()
#10 0x00000000006167fc in PostmasterMain ()
#11 0x00000000005b5290 in main ()
Continuing.

Program received signal SIGINT, Interrupt.
0x000000000073bc7c in MemoryContextSetParent ()
#0  0x000000000073bc7c in MemoryContextSetParent ()
#1  0x000000000073bde3 in MemoryContextDelete ()
#2  0x000000000070e7df in DropCachedPlan ()
#3  0x000000000054e3a9 in DropAllPreparedStatements ()
#4  0x00000000005365f3 in DiscardCommand ()
#5  0x00000000006582c7 in ?? ()
#6  0x00000000006592bd in ?? ()
#7  0x0000000000659a42 in PortalRun ()
#8  0x000000000065603d in ?? ()
#9  0x0000000000656ed0 in PostgresMain ()
#10 0x0000000000613b91 in ?? ()
#11 0x00000000006167fc in PostmasterMain ()
#12 0x00000000005b5290 in main ()
Continuing.

Program received signal SIGINT, Interrupt.
0x000000000073bc7c in MemoryContextSetParent ()
#0  0x000000000073bc7c in MemoryContextSetParent ()
#1  0x000000000073bde3 in MemoryContextDelete ()
#2  0x000000000054e3a9 in DropAllPreparedStatements ()
#3  0x00000000005365f3 in DiscardCommand ()
#4  0x00000000006582c7 in ?? ()
#5  0x00000000006592bd in ?? ()
#6  0x0000000000659a42 in PortalRun ()
#7  0x000000000065603d in ?? ()
#8  0x0000000000656ed0 in PostgresMain ()
#9  0x0000000000613b91 in ?? ()
#10 0x00000000006167fc in PostmasterMain ()
#11 0x00000000005b5290 in main ()
Continuing.

Program received signal SIGINT, Interrupt.
0x000000000070e7ff in DropCachedPlan ()
#0  0x000000000070e7ff in DropCachedPlan ()
#1  0x000000000054e3a9 in DropAllPreparedStatements ()
#2  0x00000000005365f3 in DiscardCommand ()
#3  0x00000000006582c7 in ?? ()
#4  0x00000000006592bd in ?? ()
#5  0x0000000000659a42 in PortalRun ()
#6  0x000000000065603d in ?? ()
#7  0x0000000000656ed0 in PostgresMain ()
#8  0x0000000000613b91 in ?? ()
#9  0x00000000006167fc in PostmasterMain ()
#10 0x00000000005b5290 in main ()
Detaching from program: /usr/pgsql-9.2/bin/postgres, process 38604
Undefined command: "exit".  Try "help".
Continuing.

Program received signal SIGINT, Interrupt.
0x000000000070e7ff in DropCachedPlan ()
#0  0x000000000070e7ff in DropCachedPlan ()
#1  0x000000000054e3a9 in DropAllPreparedStatements ()
#2  0x00000000005365f3 in DiscardCommand ()
#3  0x00000000006582c7 in ?? ()
#4  0x00000000006592bd in ?? ()
#5  0x0000000000659a42 in PortalRun ()
#6  0x000000000065603d in ?? ()
#7  0x0000000000656ed0 in PostgresMain ()
#8  0x0000000000613b91 in ?? ()
#9  0x00000000006167fc in PostmasterMain ()
#10 0x00000000005b5290 in main ()
Continuing.

Program received signal SIGINT, Interrupt.
0x000000000070e7ff in DropCachedPlan ()
#0  0x000000000070e7ff in DropCachedPlan ()
#1  0x000000000054e3a9 in DropAllPreparedStatements ()
#2  0x00000000005365f3 in DiscardCommand ()
#3  0x00000000006582c7 in ?? ()
#4  0x00000000006592bd in ?? ()
#5  0x0000000000659a42 in PortalRun ()
#6  0x000000000065603d in ?? ()
#7  0x0000000000656ed0 in PostgresMain ()
#8  0x0000000000613b91 in ?? ()
#9  0x00000000006167fc in PostmasterMain ()
#10 0x00000000005b5290 in main ()
Continuing.

Program received signal SIGINT, Interrupt.
0x000000000073bc7c in MemoryContextSetParent ()
#0  0x000000000073bc7c in MemoryContextSetParent ()
#1  0x000000000073bde3 in MemoryContextDelete ()
#2  0x000000000070e7df in DropCachedPlan ()
#3  0x000000000054e3a9 in DropAllPreparedStatements ()
#4  0x00000000005365f3 in DiscardCommand ()
#5  0x00000000006582c7 in ?? ()
#6  0x00000000006592bd in ?? ()
#7  0x0000000000659a42 in PortalRun ()
#8  0x000000000065603d in ?? ()
#9  0x0000000000656ed0 in PostgresMain ()
#10 0x0000000000613b91 in ?? ()
#11 0x00000000006167fc in PostmasterMain ()
#12 0x00000000005b5290 in main ()
Continuing.
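
Repeated interrupt-and-backtrace samples like the ones above are easier to read when tallied by innermost frame. The following is only a rough sketch, not part of the original session: the helper name innermost_frames is hypothetical, and it assumes the gdb output has been saved as plain text with frame lines in the "#N  0xADDR in func ()" form shown above.

```python
import re
from collections import Counter

# Matches gdb backtrace frame lines such as
#   "#0  0x000000000073bc7c in MemoryContextSetParent ()"
# The address part is optional because gdb omits it for some frames
# when debug symbols are available (an assumption about other logs).
FRAME_RE = re.compile(r"^#(\d+)\s+(?:0x[0-9a-fA-F]+ in )?(\S+) \(")


def innermost_frames(gdb_log: str) -> Counter:
    """Count how often each function appears as frame #0 in a saved gdb log."""
    counts = Counter()
    for line in gdb_log.splitlines():
        match = FRAME_RE.match(line.strip())
        # Only frame #0 (the function executing when interrupted) is tallied.
        if match and match.group(1) == "0":
            counts[match.group(2)] += 1
    return counts
```

Applied to the eight samples above, frame #0 is MemoryContextSetParent five times and DropCachedPlan three times, which is consistent with the backend still working through the prepared-statement cleanup rather than sitting at a single instruction.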

Does a DISCARD command take a lot of time, or is it like TRUNCATE? The
backend does have a very large memory footprint (12GB).

Regards,
Ken

