weird hang while running in HS mode - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | weird hang while running in HS mode |
Date | |
Msg-id | AANLkTikn06AkL7v_gh0zssmqoO_unmCCyfg5cDSzddK-@mail.gmail.com Whole thread Raw |
Responses |
Re: weird hang while running in HS mode
|
List | pgsql-hackers |
While fooling around with Hot Standby today, I did this on the master: rhaas=# begin work; BEGIN rhaas=# lock table pgbench_accounts; LOCK TABLE Then on slave I did this: rhaas=# select * from pgbench_accounts; ERROR: canceling statement due to conflict with recovery DETAIL: User query might have needed to see row versions that must be removed. The error message appeared immediately, although max_standby_delay is set to 30s and both servers are running on the same machine. IMHO, that's a good example of why max_standby_delay in its current form is not a good thing, but that's a topic for another thread. I then did this on the master: rhaas=# rollback; ROLLBACK So at this point, one would think that there are no locks hanging around anywhere. Back to the standby: rhaas=# select * from pgbench_accounts; <really long hang> I did this from another session on the standby: rhaas=# select * from pg_stat_activity;datid | datname | procpid | usesysid | usename | application_name | client_addr | client_port | backend_start | xact_start | query_start | waiting | current_query -------+---------+---------+----------+---------+------------------+-------------+-------------+-------------------------------+-------------------------------+-------------------------------+---------+---------------------------------16384 |rhaas | 6706 | 10 | rhaas | psql | | -1 | 2010-05-12 14:07:21.73227-04 | 2010-05-12 14:08:02.657664-04 | 2010-05-12 14:08:02.657664-04 | t | select * from pgbench_accounts;16384 | rhaas | 6719 | 10 | rhaas | psql | | -1 | 2010-05-1214:10:15.916115-04 | 2010-05-12 14:10:24.18013-04 | 2010-05-12 14:10:24.18013-04 | f | select * from pg_stat_activity; (2 rows) OK, so it's blocked on a lock. Let's find out more. [rhaas ~]$ gdb -p 6709 GNU gdb 6.3.50-20050815 (Apple version gdb-1461.2) (Fri Mar 5 04:43:10 UTC 2010) Copyright 2004 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "x86_64-apple-darwin". /Users/rhaas/6709: No such file or directory Unable to access task for process-id 6709: (os/kern) failure. (gdb) q [rhaas ~]$ gdb -p 6706 GNU gdb 6.3.50-20050815 (Apple version gdb-1461.2) (Fri Mar 5 04:43:10 UTC 2010) Copyright 2004 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "x86_64-apple-darwin". /Users/rhaas/6706: No such file or directory Attaching to process 6706. Reading symbols for shared libraries . done Reading symbols for shared libraries ................. done 0x00007fff838d014e in semop () (gdb) bt #0 0x00007fff838d014e in semop () #1 0x00000001001ba1af in PGSemaphoreLock (sema=0x100d4db08, interruptOK=1 '\001') at pg_sema.c:420 #2 0x00000001001f8782 in ProcSleep (locallock=0x1010d9568, lockMethodTable=<value temporarily unavailable, due to optimizations>) at proc.c:971 #3 0x00000001001f4c31 in WaitOnLock (locallock=0x1010d9568, owner=0x101060628) at lock.c:1217 #4 0x00000001001f6254 in LockAcquireExtended (locktag=0x7fff5fbfdf00, lockmode=1, sessionLock=<value temporarily unavailable, due to optimizations>, dontWait=0 '\0', reportMemoryError=1 '\001') at lock.c:836 #5 0x00000001001f36bf in LockRelationOid (relid=16420, lockmode=1) at lmgr.c:79 #6 0x00000001000242ba in relation_open (relationId=16420, lockmode=1) at heapam.c:899 #7 0x00000001000244ac in try_heap_openrv (relation=<value temporarily unavailable, due to optimizations>, lockmode=<value temporarily unavailable, due to optimizations>) at heapam.c:1127 #8 0x00000001000c7f14 in parserOpenTable (pstate=0x101041c68, relation=0x101041a38, lockmode=1) at parse_relation.c:829 #9 0x00000001000c8141 in addRangeTableEntry (pstate=0x101041c68, relation=0x101041a38, alias=0x0, inh=1 '\001', inFromCl=1 '\001') at parse_relation.c:895 #10 0x00000001000b8927 in transformTableEntry [inlined] () at /Users/rhaas/pgsql/src/backend/parser/parse_clause.c:444 #11 0x00000001000b8927 in transformFromClauseItem (pstate=0x101041c68, n=0x101041a38, top_rte=0x7fff5fbfe1b0, top_rti=0x7fff5fbfe1bc, relnamespace=0x7fff5fbfe1a8, containedRels=0x7fff5fbfe1a0) at parse_clause.c:676 #12 0x00000001000b939a in transformFromClause (pstate=0x101041c68, frmList=<value temporarily unavailable, due to optimizations>) at parse_clause.c:128 #13 0x000000010009eccb in transformSelectStmt [inlined] () at /Users/rhaas/pgsql/src/backend/parser/analyze.c:800 #14 0x000000010009eccb in transformStmt (pstate=<value temporarily unavailable, due to optimizations>, parseTree=0x101041af0) at analyze.c:185 #15 0x00000001000a01a3 in parse_analyze (parseTree=0x101041af0, sourceText=0x101040e38 "select * from pgbench_accounts;", paramTypes=0x0, numParams=0) at analyze.c:93 #16 0x0000000100202cb4 in pg_analyze_and_rewrite (parsetree=0x101041af0, query_string=0x101040e38 "select * from pgbench_accounts;", paramTypes=0x0, numParams=0) at postgres.c:619 #17 0x0000000100203535 in exec_simple_query (query_string=0x101040e38 "select * from pgbench_accounts;") at postgres.c:977 #18 0x0000000100204241 in PostgresMain (argc=2, argv=<value temporarily unavailable, due to optimizations>, username=<value temporarily unavailable, due to optimizations>) at postgres.c:3854 #19 0x00000001001c653f in ServerLoop () at postmaster.c:3528 #20 0x00000001001c7432 in PostmasterMain (argc=3, argv=0x100808ac0) at postmaster.c:1092 #21 0x0000000100162055 in main (argc=3, argv=0x100808ac0) at main.c:188 (gdb) bt full #0 0x00007fff838d014e in semop () No symbol table info available. #1 0x00000001001ba1af in PGSemaphoreLock (sema=0x100d4db08, interruptOK=1 '\001') at pg_sema.c:420errStatus = <value temporarily unavailable, due to optimizations>sops = { sem_num =3, sem_op = -1, sem_flg = 0 } #2 0x00000001001f8782 in ProcSleep (locallock=0x1010d9568, lockMethodTable=<value temporarily unavailable, due to optimizations>) at proc.c:971lockmode = 1lock = (LOCK *) 0x100d2aad0proclock = (PROCLOCK *) 0x100d4a560hashcode = <value temporarily unavailable,due to optimizations>partitionLock = 49myHeldLocks = <value temporarily unavailable, due to optimizations>early_deadlock= <value temporarily unavailable, due to optimizations>allow_autovacuum_cancel = 1 '\001'myWaitStatus= 13806288proc = (PGPROC *) 0x100d2aaf8i = <value temporarily unavailable, due to optimizations> #3 0x00000001001f4c31 in WaitOnLock (locallock=0x1010d9568, owner=0x101060628) at lock.c:1217save_exception_stack = (sigjmp_buf *) 0x7fff5fbfe4a0save_context_stack = (ErrorContextCallback*) 0x7fff5fbfdf90local_sigjmp_buf = {1606417070, 32767, 1606409760, 32767, 1606409536, 32767, 17172008, 1, 17667432, 1, 1, 0, 1606409984, 32767, 2051077, 1, 530, 0, 8064, 895, 16, 0, 2, 0, 0, 0, 17170328, 1, 0, 0, 128, 0, 1606409712, 32767, 13935968, 1, 1606409728, 32767}lockMethodTable = (LockMethod) 0x1004a9fc0new_status = 0x101041fb0 "SELECT" #4 0x00000001001f6254 in LockAcquireExtended (locktag=0x7fff5fbfdf00, lockmode=1, sessionLock=<value temporarily unavailable, due to optimizations>, dontWait=0 '\0', reportMemoryError=1 '\001') at lock.c:836lockmethodid = <value temporarily unavailable, due to optimizations>lockMethodTable = (LockMethod) 0x1004a9fc0localtag= { lock = { locktag_field1 = 16384, locktag_field2 = 16420, locktag_field3 = 0, locktag_field4= 0, locktag_type = 0 '\0', locktag_lockmethodid = 1 '\001' }, mode = 1 }locallock = (LOCALLOCK *) 0x1010d9568lock = <value temporarily unavailable, due to optimizations>proclock = <value temporarilyunavailable, due to optimizations>proclocktag = { myLock = 0x100d2aad0, myProc = 0x100d4daf8 }found = 0 '\0'owner = (ResourceOwner) 0x101060628hashcode = 534311893proclock_hashcode = 311778389partitionLock = 49status= <value temporarily unavailable, due to optimizations> #5 0x00000001001f36bf in LockRelationOid (relid=16420, lockmode=1) at lmgr.c:79tag = { locktag_field1 = 16384, locktag_field2= 16420, locktag_field3 = 0, locktag_field4 = 0, locktag_type = 0 '\0', locktag_lockmethodid = 1 '\001' }res = <value temporarily unavailable, due to optimizations> #6 0x00000001000242ba in relation_open (relationId=16420, lockmode=1) at heapam.c:899r = <value temporarily unavailable, due to optimizations> #7 0x00000001000244ac in try_heap_openrv (relation=<value temporarily unavailable, due to optimizations>, lockmode=<value temporarily unavailable, due to optimizations>) at heapam.c:1127r = <value temporarily unavailable, due to optimizations> #8 0x00000001000c7f14 in parserOpenTable (pstate=0x101041c68, relation=0x101041a38, lockmode=1) at parse_relation.c:829rel = <value temporarily unavailable, due to optimizations>pcbstate= { pstate = 0x101041c68, location = 14, errcontext = { previous = 0x0, callback = 0x1000c4810<pcb_error_callback>, arg = 0x7fff5fbfdf80 } } #9 0x00000001000c8141 in addRangeTableEntry (pstate=0x101041c68, relation=0x101041a38, alias=0x0, inh=1 '\001', inFromCl=1 '\001') at parse_relation.c:895rte = <value temporarily unavailable, due to optimizations>refname = 0x101041a00 "pgbench_accounts"lockmode= <value temporarily unavailable, due to optimizations>rel = <value temporarily unavailable, dueto optimizations> #10 0x00000001000b8927 in transformTableEntry [inlined] () at /Users/rhaas/pgsql/src/backend/parser/parse_clause.c:444r = (RangeVar *) Cannot access memory at address 0x0 When I detached gdb, the query immediately returned results, even though it had been hung for a long time before that. I have no idea what is going on here and have not checked to see whether this is reproducible, but thought it was worth reporting anyway. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
pgsql-hackers by date: