Re: Postmaster hangs - Mailing list pgsql-bugs
From | Craig Ringer |
---|---|
Subject | Re: Postmaster hangs |
Date | |
Msg-id | 1256624664.1709.80.camel@wallace.localnet |
In response to | Re: Postmaster hangs (Karen Pease <meme@daughtersoftiresias.org>) |
Responses | Re: Postmaster hangs |
List | pgsql-bugs |
On Tue, 2009-10-27 at 00:50 -0500, Karen Pease wrote:
> > OK, so there's nothing shriekingly obviously wrong with what the
> > postmaster is up to. But what about the backend that's stopped
> > responding? Try connecting gdb to that "postgres" process once it's
> > stopped responding and get a backtrace from that.
> >
>
> Okay -- I started up a psql instance, which immediately locks up. I
> then attached gdb to it and got this:
>
> (gdb) cont
> Continuing.
> ^C
> Program received signal SIGINT, Interrupt.
> 0x00fe2416 in __kernel_vsyscall ()

You didn't actually request a backtrace (bt), so all it shows is the top
stack frame. That doesn't tell us anything except that it's busy in a
system call in the kernel.

> > You can find out a bit more about what the kernel is doing using the
> > "magic" keyboard sequence "ALT-SysRQ-T" from a vconsole (not under X).
>
> Nothing happened. Nothing useful in dmesg -- certainly no stacktraces.

Your kernel might not have the "magic sysrq key" enabled. Run:

  sudo sysctl -w kernel.sysrq=1

and try again. Note that on some systems with weird keyboards you might
have to hold the "Fn" key (if you have one) or disable "F-Lock" (if you
have it) to get SysRq to be recognised. The print screen / PrtScn key is
usually shared with SysRq even if it's not marked as such.

Hmm, it looks like the SysRq magic key sequences even work under X. I
didn't think they did, but on my system here hitting alt-sysrq-t under
X11 dumps a bunch of task trace data into /var/log/kern.log (Ubuntu
system), including task info and some general info on the CPU states.
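Before pressing the key combination it can help to check whether the current kernel.sysrq value permits a task dump at all. The sketch below is not part of the original advice; it assumes the bitmask meaning described in the Linux kernel's SysRq documentation (0 disables everything, 1 enables everything, and bit 0x08 permits debugging dumps such as the task list). The helper names task_dump_allowed and read_sysrq are made up for illustration.

```python
# Bit in kernel.sysrq that permits debugging dumps such as the
# task list printed by Alt-SysRq-T (per the kernel SysRq documentation).
SYSRQ_ENABLE_DUMP = 0x08


def task_dump_allowed(sysrq_value: int) -> bool:
    """Return True if this kernel.sysrq value allows the keyboard task dump."""
    # 1 enables every SysRq function, 0 disables all of them;
    # any other value is a bitmask of permitted function groups.
    return sysrq_value == 1 or bool(sysrq_value & SYSRQ_ENABLE_DUMP)


def read_sysrq(path: str = "/proc/sys/kernel/sysrq") -> int:
    """Read the current setting from procfs (Linux only)."""
    with open(path) as f:
        return int(f.read().strip())
```

On a Linux machine, task_dump_allowed(read_sysrq()) reports whether the key sequence should work as things stand; if it returns False, the sysctl command above is the fix.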
Oct 27 14:20:19 wallace kernel: [13668207.501781] postgres S 28e389e2 0 3152 31105
Oct 27 14:20:19 wallace kernel: [13668207.501781] f352bcac 00200086 c3634358 28e389e2 00029346 00000000 c069a340 c07d2180
Oct 27 14:20:19 wallace kernel: [13668207.501781] c51fe480 c51fe6f8 c3634300 ffffffff c3634300 00000000 c3634300 c51fe6f8
Oct 27 14:20:19 wallace kernel: [13668207.501781] f352bca8 c01398c8 c90e0000 7fffffff c90e01a8 f352bcf8 c050ec7d f352bcc8
Oct 27 14:20:19 wallace kernel: [13668207.501781] Call Trace:
Oct 27 14:20:19 wallace kernel: [13668207.501781] [<c01398c8>] ? enqueue_task_fair+0x68/0x70
Oct 27 14:20:19 wallace kernel: [13668207.501781] [<c050ec7d>] schedule_timeout+0xad/0xe0
Oct 27 14:20:19 wallace kernel: [13668207.501781] [<c0156aea>] ? prepare_to_wait+0x3a/0x70
Oct 27 14:20:19 wallace kernel: [13668207.501781] [<c04aaf38>] unix_stream_data_wait+0x88/0xe0
Oct 27 14:20:19 wallace kernel: [13668207.501781] [<c0156890>] ? autoremove_wake_function+0x0/0x50
Oct 27 14:20:19 wallace kernel: [13668207.501781] [<c04ac631>] unix_stream_recvmsg+0x311/0x490
Oct 27 14:20:19 wallace kernel: [13668207.501781] [<c02b0990>] ? apparmor_socket_recvmsg+0x10/0x20
Oct 27 14:20:19 wallace kernel: [13668207.501781] [<c04356e4>] sock_recvmsg+0xf4/0x120
Oct 27 14:20:19 wallace kernel: [13668207.501781] [<c0156890>] ? autoremove_wake_function+0x0/0x50
Oct 27 14:20:19 wallace kernel: [13668207.501781] [<c01c3bb7>] ? __mem_cgroup_uncharge_common+0x137/0x170
Oct 27 14:20:19 wallace kernel: [13668207.501781] [<c01a39b8>] ? __dec_zone_page_state+0x18/0x20
Oct 27 14:20:19 wallace kernel: [13668207.501781] [<c01b0411>] ? page_remove_rmap+0x61/0x130
Oct 27 14:20:19 wallace kernel: [13668207.501781] [<c012eb00>] ? kunmap_atomic+0x50/0x60
Oct 27 14:20:19 wallace kernel: [13668207.501781] [<c043599c>] sys_recvfrom+0x7c/0xd0
Oct 27 14:20:19 wallace kernel: [13668207.501781] [<c019c305>] ? __pagevec_free+0x25/0x30
Oct 27 14:20:19 wallace kernel: [13668207.501781] [<c019eee0>] ? release_pages+0x160/0x1a0
Oct 27 14:20:19 wallace kernel: [13668207.501781] [<c02d2bfd>] ? rb_erase+0xcd/0x150
Oct 27 14:20:19 wallace kernel: [13668207.501781] [<c0435a26>] sys_recv+0x36/0x40
Oct 27 14:20:19 wallace kernel: [13668207.501781] [<c0435be7>] sys_socketcall+0x1b7/0x2b0
Oct 27 14:20:19 wallace kernel: [13668207.501781] [<c014617b>] ? sys_gettimeofday+0x2b/0x70
Oct 27 14:20:19 wallace kernel: [13668207.501781] [<c0109eef>] sysenter_do_call+0x12/0x2f

BTW, if you do get kernel task trace information and you decide to
redact it, it'd be good to include at least all your `postgres'
instances, all `httpd' instances, and the summary information at the
end.

--
Craig Ringer
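When reading a task trace like the one above, it can help to filter out the frames the kernel marks with "?". Those are addresses found on the stack that the unwinder could not confirm as part of the call chain. The sketch below is an illustration only, not something from the original message; FRAME_RE, reliable_frames and SAMPLE are hypothetical names, and SAMPLE is a shortened copy of lines from the trace above.

```python
import re

# Matches one call-trace entry, e.g.
#   [<c050ec7d>] schedule_timeout+0xad/0xe0
#   [<c01398c8>] ? enqueue_task_fair+0x68/0x70
# Group 2 is set only when the frame carries the "?" marker.
FRAME_RE = re.compile(
    r"\[<([0-9a-f]+)>\]\s+(\?\s+)?([A-Za-z0-9_.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def reliable_frames(log_text):
    """Return the function names of frames not marked '?', innermost first."""
    return [m.group(3) for m in FRAME_RE.finditer(log_text) if m.group(2) is None]


# Shortened excerpt of the trace quoted in this message.
SAMPLE = """\
Oct 27 14:20:19 wallace kernel: [13668207.501781] [<c01398c8>] ? enqueue_task_fair+0x68/0x70
Oct 27 14:20:19 wallace kernel: [13668207.501781] [<c050ec7d>] schedule_timeout+0xad/0xe0
Oct 27 14:20:19 wallace kernel: [13668207.501781] [<c0156aea>] ? prepare_to_wait+0x3a/0x70
Oct 27 14:20:19 wallace kernel: [13668207.501781] [<c04aaf38>] unix_stream_data_wait+0x88/0xe0
Oct 27 14:20:19 wallace kernel: [13668207.501781] [<c04ac631>] unix_stream_recvmsg+0x311/0x490
Oct 27 14:20:19 wallace kernel: [13668207.501781] [<c0435a26>] sys_recv+0x36/0x40
"""

print(reliable_frames(SAMPLE))
```

Applied to the full trace, the remaining frames run from sys_socketcall and sys_recv down to unix_stream_data_wait and schedule_timeout, which reads as a process sleeping in recv() on a Unix-domain socket. That interpretation is an inference from the function names, not something the log states outright.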