Thread: "pgbench -c 13" causes backend hangup.
Hi.

I've found a problem with PostgreSQL on Cygwin.

1. Run postmaster.
2. Run "pgbench -c 13" (this option means test with 13 connections).
3. pgbench never finishes, because some backends hang.

Are there any workarounds for this?

Software versions:
Windows 2000 Pro (with SP2)
PostgreSQL 7.1.3 (compiled by me)
cygwin 1.3.3-1 (binary)
cygipc 1.09-2

--
Yutaka tanida <yutaka@hi-net.zaq.ne.jp>
Yutaka,

On Sat, Sep 15, 2001 at 03:31:57PM +0900, Yutaka tanida wrote:
> I've found a problem about PostgreSQL and cygwin.
>
> 1. run postmaster
> 2. run "pgbench -c 13" (This option means test with 13 connection.)
> 3. pgbench never finishes ,because some backends hangup.
>
> Are there any workarounds about this?

Can you attach to the hung postgres.exe processes and get a backtrace to
determine where they are hanging?

Thanks,
Jason
Jason,

On Sat, 15 Sep 2001 10:29:32 -0400
Jason Tishler <jason@tishler.net> wrote:

> Can you attach to the hung postgres.exe processes and get a backtrace to
> determine where they are hanging?

Here's the log, but it contains nothing useful, so I can't determine what's wrong.

--
bash-2.05$ gdb -nw
GNU gdb 5.0 (20010428-1)
(snip)
(gdb) attach 776
Attaching to process 776
[Switching to thread 776.0x5b0]
(gdb) backtrace
#0 0x77fa018d in ?? ()
#1 0x77e5758a in ?? ()
--

Here is the backend log from when the backends hung.

--
DEBUG: ProcessUtility: end
DEBUG: CommitTransactionCommand
DEBUG: proc_exit(0)
DEBUG: shmem_exit(0)
DEBUG: exit(0)
postmaster: reaping dead processes...
postmaster: CleanupProc: pid 1480 exited with status 0
--

I suppose it's a bug in Cygwin's signal handling. I'll try to create another test case.

---
Yutaka tanida <yutaka@hi-net.zaq.ne.jp>
謎のWebsite http://www.hi-net.zaq.ne.jp/yutaka/
Yutaka,

On Sun, Sep 16, 2001 at 03:15:02PM +0900, Yutaka tanida wrote:
> On Sat, 15 Sep 2001 10:29:32 -0400
> Jason Tishler <jason@tishler.net> wrote:
>
> > Can you attach to the hung postgres.exe processes and get a backtrace to
> > determine where they are hanging?
>
> Here's log, but it's nothing to use....So,I can't determine what's wrong.
>
> --
> bash-2.05$ gdb -nw
> GNU gdb 5.0 (20010428-1)
> (snip)
> (gdb) attach 776
> Attaching to process 776
> [Switching to thread 776.0x5b0]
> (gdb) backtrace
> #0 0x77fa018d in ?? ()
> #1 0x77e5758a in ?? ()
> --

I believe that you need to start gdb as follows:

    $ gdb -nw /usr/bin/postgres.exe

or

    $ gdb -nw /usr/local/bin/postgres.exe

as appropriate. Also, you (usually) need to switch to thread 1 before
attempting the backtrace:

    (gdb) thread 1
    (gdb) bt
    ...

And finally, make sure that your postgres.exe is built with debug info
and unstripped.

> There's a backend log when they're hanged.
>
> --
> DEBUG: ProcessUtility: end
> DEBUG: CommitTransactionCommand
> DEBUG: proc_exit(0)
> DEBUG: shmem_exit(0)
> DEBUG: exit(0)
> postmaster: reaping dead processes...
> postmaster: CleanupProc: pid 1480 exited with status 0
> --
>
> I suppose it's a bug of cygwin's signal handling.I'll trying to create
> other testcase.

The above is a good hypothesis. A simple test case will greatly improve
our chances of getting the bug fixed.

BTW, I just tried pgbench last night and was able to reproduce the hang
with 12 clients, while 13 clients sometimes worked. Sounds like some kind
of race condition. My postgres.exe is currently stripped -- I will try
again tomorrow (or the next day) with a debuggable executable.

Thanks,
Jason
Jason,

On Sun, 16 Sep 2001 10:34:50 -0400
Jason Tishler <jason@tishler.net> wrote:

> I believe that you need to start gdb as follows:
(snip)

Thanks for the information. I was able to get a backtrace.

--
(gdb) bt
#0  0x77f827e8 in _libkernel32_a_iname ()
#1  0x77e56a15 in _libkernel32_a_iname ()
#2  0x77e56a3d in _libkernel32_a_iname ()
#3  0x00538a5d in semop ()
#4  0x004cf450 in IpcSemaphoreLock (semId=5250, sem=11, interruptOK=1 '\001')
    at ipc.c:426
#5  0x004d48b9 in ProcSleep (lockMethodTable=0xa0102e8, lockmode=4,
    lock=0x1abf527c, holder=0x1abf770c) at proc.c:666
#6  0x004d3795 in WaitOnLock (lockmethod=1, lockmode=4, lock=0x1abf527c,
    holder=0x1abf770c) at lock.c:955
#7  0x004d351c in LockAcquire (lockmethod=1, locktag=0x22f338, xid=3288,
    lockmode=4) at lock.c:739
#8  0x004d2b11 in XactLockTableWait (xid=3280) at lmgr.c:310
#9  0x0040c61b in heap_update (relation=0xa07e5a0, otid=0x22f488,
    newtup=0xa0ae810, ctid=0x22f408) at heapam.c:1627
#10 0x0048517a in ExecReplace (slot=0xa0ae010, tupleid=0x22f488,
    estate=0xa0ade50) at execMain.c:1451
#11 0x00484dc1 in ExecutePlan (estate=0xa0ade50, plan=0xa0addb0,
    operation=CMD_UPDATE, numberTuples=0, direction=ForwardScanDirection,
    destfunc=0xa0ae650) at execMain.c:1125
#12 0x00483f2a in ExecutorRun (queryDesc=0xa0ade38, estate=0xa0ade50,
    feature=3, count=0) at execMain.c:233
#13 0x004db703 in ProcessQuery (parsetree=0xa0ab000, plan=0xa0addb0,
    dest=Remote) at pquery.c:295
#14 0x004d8ff9 in pg_exec_query_string (
    query_string=0xa0aaca8 "update tellers set tbalance = tbalance + 850 where tid = 2\n",
    dest=Remote, parse_context=0xa025318) at postgres.c:810
#15 0x004daf29 in PostgresMain (argc=4, argv=0x22f778, real_argc=2,
    real_argv=0x614e2414, username=0xa01cc29 "Administrator")
    at postgres.c:1908
#16 0x004bfaf2 in DoBackend (port=0xa01c9c0) at postmaster.c:2114
#17 0x004bf662 in BackendStartup (port=0xa01c9c0) at postmaster.c:1897
#18 0x004be18d in ServerLoop () at postmaster.c:995
#19 0x004bd54c in PostmasterMain (argc=2, argv=0x614e2414)
    at postmaster.c:685
#20 0x00497551 in main (argc=2, argv=0x614e2414) at main.c:171
#21 0x6100401e in _libkernel32_a_iname ()
#22 0x6100421d in _libkernel32_a_iname ()
#23 0x6100425c in _libkernel32_a_iname ()
#24 0x0053924b in cygwin_crt0 ()
    at /cygnus/netrel/src/cygwin-1.3.3-2/winsup/cygwin/lib/cygwin_crt0.c:33
--

Hmm... is it a cygipc problem?

---
Yutaka tanida <yutaka@hi-net.zaq.ne.jp>
謎のWebsite http://www.hi-net.zaq.ne.jp/yutaka/
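The backtrace shows the backend blocked in semop() on semaphore 11 of set 5250. One way to learn more about a semaphore in that state is to query it from a separate process with semctl(). The following is only a sketch under assumptions: the names `sem_state` and `inspect_sem` are made up for illustration, it is not PostgreSQL or cygipc code, and it is not confirmed that cygipc implements the GETNCNT and GETPID commands.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* Snapshot of one semaphore in a System V semaphore set (hypothetical helper type). */
struct sem_state {
    int value;     /* current count (GETVAL); a decrement blocks while this is 0 */
    int waiters;   /* processes blocked waiting for the count to rise (GETNCNT) */
    int last_pid;  /* PID of the last process that operated on it (GETPID) */
};

/*
 * Fill *out with the state of semaphore number semnum in set semid.
 * Returns 0 on success, or -1 if any semctl() query fails (errno is set).
 */
int inspect_sem(int semid, int semnum, struct sem_state *out)
{
    int value, waiters, last_pid;

    assert(out != NULL);

    value = semctl(semid, semnum, GETVAL);
    waiters = semctl(semid, semnum, GETNCNT);
    last_pid = semctl(semid, semnum, GETPID);
    if (value < 0 || waiters < 0 || last_pid < 0)
        return -1;

    out->value = value;
    out->waiters = waiters;
    out->last_pid = last_pid;
    return 0;
}
```

For the hung backend above, the call would be inspect_sem(5250, 11, &st), using the semId and sem values from frame #4. A count of 0 together with a nonzero waiter count would be consistent with a wakeup that never arrived, though that alone does not say whether the fault lies in cygipc or elsewhere.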
Yutaka,

On Mon, Sep 17, 2001 at 12:00:55AM +0900, Yutaka tanida wrote:
> (gdb) bt
> #0 0x77f827e8 in _libkernel32_a_iname ()
> #1 0x77e56a15 in _libkernel32_a_iname ()
> #2 0x77e56a3d in _libkernel32_a_iname ()
> #3 0x00538a5d in semop ()
> #4 0x004cf450 in IpcSemaphoreLock (semId=5250, sem=11, interruptOK=1 '\001')
> at ipc.c:426
> [snip]
>
> Hmm... , it it a cygipc probrem?

Agreed. We have two options:

1. build a debuggable cygipc and continue debugging
2. wait for true Cygwin (server) support of System V IPC

Given the following:

http://sources.redhat.com/ml/cygwin-apps/2001-09/msg00017.html
http://www.cygwin.com/ml/cygwin-patches/2001-q3/msg00151.html

I'm leaning toward option #2 above.

FYI, Rob Collins is the developer that recently contributed significant
(and very well done) Cygwin pthreads support and appears to be really
cranking out the new Cygwin server (including System V IPC) functionality.

Jason
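For option 1, a standalone test that exercises System V semaphores under contention, without PostgreSQL, might narrow things down. The following is a rough sketch and not a known reproducer of this hang. The names `run_contention_test`, `sem_adjust` and `union semarg` are invented for illustration, and the locking pattern is a plain mutex, which is simpler than what the backends do.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/shm.h>
#include <sys/wait.h>

/* Argument type for semctl(SETVAL); given a private name because some
 * platforms already define union semun in <sys/sem.h> and others do not. */
union semarg { int val; struct semid_ds *buf; unsigned short *array; };

/* Add delta to semaphore 0 of the set: -1 locks (blocking while the
 * count is 0), +1 unlocks. Returns the result of semop(). */
static int sem_adjust(int semid, short delta)
{
    struct sembuf op = { .sem_num = 0, .sem_op = delta, .sem_flg = 0 };
    return semop(semid, &op, 1);
}

/*
 * Fork nprocs children; each increments a counter in shared memory iters
 * times while holding the semaphore. Returns the final counter, which
 * should equal nprocs * iters, or -1 on any failure. If a child hangs in
 * semop(), the parent blocks in wait(), which is the symptom to look for.
 */
long run_contention_test(int nprocs, int iters)
{
    long result = -1;
    int i, j, status, started = 0, failures;
    union semarg arg = { .val = 1 };
    int semid = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);
    int shmid = shmget(IPC_PRIVATE, sizeof(long), IPC_CREAT | 0600);
    void *mem = (shmid >= 0) ? shmat(shmid, NULL, 0) : (void *) -1;
    volatile long *counter = mem;

    assert(nprocs > 0 && iters > 0);
    if (semid < 0 || mem == (void *) -1 || semctl(semid, 0, SETVAL, arg) < 0)
        goto cleanup;
    *counter = 0;

    for (i = 0; i < nprocs; i++) {
        pid_t pid = fork();
        if (pid < 0)
            break;
        if (pid == 0) {                     /* child: contend for the lock */
            for (j = 0; j < iters; j++) {
                if (sem_adjust(semid, -1) < 0) _exit(1);
                (*counter)++;
                if (sem_adjust(semid, +1) < 0) _exit(1);
            }
            _exit(0);
        }
        started++;
    }

    failures = nprocs - started;            /* forks that did not happen */
    for (i = 0; i < started; i++) {
        if (wait(&status) < 0 || !WIFEXITED(status) || WEXITSTATUS(status) != 0)
            failures++;
    }
    if (failures == 0)
        result = *counter;

cleanup:                                    /* remove the IPC objects again */
    if (mem != (void *) -1) shmdt(mem);
    if (shmid >= 0) shmctl(shmid, IPC_RMID, NULL);
    if (semid >= 0) semctl(semid, 0, IPC_RMID);
    return result;
}
```

Calling run_contention_test(13, 1000) from a small main() would mirror the 13 clients used with pgbench. A run that never returns, or a result other than 13000, would point at the IPC layer rather than at PostgreSQL itself.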