Thread: weird hang while running in HS mode

weird hang while running in HS mode

From

Robert Haas

Date:

12 May 2010, 15:50:55

While fooling around with Hot Standby today, I did this on the master:

rhaas=# begin work;
BEGIN
rhaas=# lock table pgbench_accounts;
LOCK TABLE

Then on slave I did this:

rhaas=# select * from pgbench_accounts;
ERROR:  canceling statement due to conflict with recovery
DETAIL:  User query might have needed to see row versions that must be removed.

The error message appeared immediately, although max_standby_delay is
set to 30s and both servers are running on the same machine.  IMHO,
that's a good example of why max_standby_delay in its current form is
not a good thing, but that's a topic for another thread.  I then did
this on the master:

rhaas=# rollback;
ROLLBACK

So at this point, one would think that there are no locks hanging
around anywhere.  Back to the standby:

rhaas=# select * from pgbench_accounts;
<really long hang>

I did this from another session on the standby:

rhaas=# select * from pg_stat_activity;datid | datname | procpid | usesysid | usename | application_name |
client_addr | client_port |         backend_start         |
xact_start           |          query_start          | waiting |  current_query

-------+---------+---------+----------+---------+------------------+-------------+-------------+-------------------------------+-------------------------------+-------------------------------+---------+---------------------------------16384
|rhaas   |    6706 |       10 | rhaas   | psql             |         |          -1 | 2010-05-12 14:07:21.73227-04  |
2010-05-12
14:08:02.657664-04 | 2010-05-12 14:08:02.657664-04 | t       | select
* from pgbench_accounts;16384 | rhaas   |    6719 |       10 | rhaas   | psql             |         |          -1 |
2010-05-1214:10:15.916115-04 | 2010-05-12
 
14:10:24.18013-04  | 2010-05-12 14:10:24.18013-04  | f       | select
* from pg_stat_activity;
(2 rows)

OK, so it's blocked on a lock.  Let's find out more.

[rhaas ~]$ gdb -p 6709
GNU gdb 6.3.50-20050815 (Apple version gdb-1461.2) (Fri Mar  5
04:43:10 UTC 2010)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "x86_64-apple-darwin".
/Users/rhaas/6709: No such file or directory
Unable to access task for process-id 6709: (os/kern) failure.
(gdb) q
[rhaas ~]$ gdb -p 6706
GNU gdb 6.3.50-20050815 (Apple version gdb-1461.2) (Fri Mar  5
04:43:10 UTC 2010)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "x86_64-apple-darwin".
/Users/rhaas/6706: No such file or directory
Attaching to process 6706.
Reading symbols for shared libraries . done
Reading symbols for shared libraries ................. done
0x00007fff838d014e in semop ()
(gdb) bt
#0  0x00007fff838d014e in semop ()
#1  0x00000001001ba1af in PGSemaphoreLock (sema=0x100d4db08,
interruptOK=1 '\001') at pg_sema.c:420
#2  0x00000001001f8782 in ProcSleep (locallock=0x1010d9568,
lockMethodTable=<value temporarily unavailable, due to optimizations>)
at proc.c:971
#3  0x00000001001f4c31 in WaitOnLock (locallock=0x1010d9568,
owner=0x101060628) at lock.c:1217
#4  0x00000001001f6254 in LockAcquireExtended (locktag=0x7fff5fbfdf00,
lockmode=1, sessionLock=<value temporarily unavailable, due to
optimizations>, dontWait=0 '\0', reportMemoryError=1 '\001') at
lock.c:836
#5  0x00000001001f36bf in LockRelationOid (relid=16420, lockmode=1) at lmgr.c:79
#6  0x00000001000242ba in relation_open (relationId=16420, lockmode=1)
at heapam.c:899
#7  0x00000001000244ac in try_heap_openrv (relation=<value temporarily
unavailable, due to optimizations>, lockmode=<value temporarily
unavailable, due to optimizations>) at heapam.c:1127
#8  0x00000001000c7f14 in parserOpenTable (pstate=0x101041c68,
relation=0x101041a38, lockmode=1) at parse_relation.c:829
#9  0x00000001000c8141 in addRangeTableEntry (pstate=0x101041c68,
relation=0x101041a38, alias=0x0, inh=1 '\001', inFromCl=1 '\001') at
parse_relation.c:895
#10 0x00000001000b8927 in transformTableEntry [inlined] () at
/Users/rhaas/pgsql/src/backend/parser/parse_clause.c:444
#11 0x00000001000b8927 in transformFromClauseItem (pstate=0x101041c68,
n=0x101041a38, top_rte=0x7fff5fbfe1b0, top_rti=0x7fff5fbfe1bc,
relnamespace=0x7fff5fbfe1a8, containedRels=0x7fff5fbfe1a0) at
parse_clause.c:676
#12 0x00000001000b939a in transformFromClause (pstate=0x101041c68,
frmList=<value temporarily unavailable, due to optimizations>) at
parse_clause.c:128
#13 0x000000010009eccb in transformSelectStmt [inlined] () at
/Users/rhaas/pgsql/src/backend/parser/analyze.c:800
#14 0x000000010009eccb in transformStmt (pstate=<value temporarily
unavailable, due to optimizations>, parseTree=0x101041af0) at
analyze.c:185
#15 0x00000001000a01a3 in parse_analyze (parseTree=0x101041af0,
sourceText=0x101040e38 "select * from pgbench_accounts;",
paramTypes=0x0, numParams=0) at analyze.c:93
#16 0x0000000100202cb4 in pg_analyze_and_rewrite
(parsetree=0x101041af0, query_string=0x101040e38 "select * from
pgbench_accounts;", paramTypes=0x0, numParams=0) at postgres.c:619
#17 0x0000000100203535 in exec_simple_query (query_string=0x101040e38
"select * from pgbench_accounts;") at postgres.c:977
#18 0x0000000100204241 in PostgresMain (argc=2, argv=<value
temporarily unavailable, due to optimizations>, username=<value
temporarily unavailable, due to optimizations>) at postgres.c:3854
#19 0x00000001001c653f in ServerLoop () at postmaster.c:3528
#20 0x00000001001c7432 in PostmasterMain (argc=3, argv=0x100808ac0) at
postmaster.c:1092
#21 0x0000000100162055 in main (argc=3, argv=0x100808ac0) at main.c:188
(gdb) bt full
#0  0x00007fff838d014e in semop ()
No symbol table info available.
#1  0x00000001001ba1af in PGSemaphoreLock (sema=0x100d4db08,
interruptOK=1 '\001') at pg_sema.c:420errStatus = <value temporarily unavailable, due to optimizations>sops = { sem_num
=3, sem_op = -1, sem_flg = 0
 
}
#2  0x00000001001f8782 in ProcSleep (locallock=0x1010d9568,
lockMethodTable=<value temporarily unavailable, due to optimizations>)
at proc.c:971lockmode = 1lock = (LOCK *) 0x100d2aad0proclock = (PROCLOCK *) 0x100d4a560hashcode = <value temporarily
unavailable,due to optimizations>partitionLock = 49myHeldLocks = <value temporarily unavailable, due to
optimizations>early_deadlock= <value temporarily unavailable, due to optimizations>allow_autovacuum_cancel = 1
'\001'myWaitStatus= 13806288proc = (PGPROC *) 0x100d2aaf8i = <value temporarily unavailable, due to optimizations>
 
#3  0x00000001001f4c31 in WaitOnLock (locallock=0x1010d9568,
owner=0x101060628) at lock.c:1217save_exception_stack = (sigjmp_buf *) 0x7fff5fbfe4a0save_context_stack =
(ErrorContextCallback*) 0x7fff5fbfdf90local_sigjmp_buf = {1606417070, 32767, 1606409760, 32767, 1606409536,
 
32767, 17172008, 1, 17667432, 1, 1, 0, 1606409984, 32767, 2051077, 1,
530, 0, 8064, 895, 16, 0, 2, 0, 0, 0, 17170328, 1, 0, 0, 128, 0,
1606409712, 32767, 13935968, 1, 1606409728, 32767}lockMethodTable = (LockMethod) 0x1004a9fc0new_status = 0x101041fb0
"SELECT"
#4  0x00000001001f6254 in LockAcquireExtended (locktag=0x7fff5fbfdf00,
lockmode=1, sessionLock=<value temporarily unavailable, due to
optimizations>, dontWait=0 '\0', reportMemoryError=1 '\001') at
lock.c:836lockmethodid = <value temporarily unavailable, due to optimizations>lockMethodTable = (LockMethod)
0x1004a9fc0localtag= { lock = {   locktag_field1 = 16384,   locktag_field2 = 16420,   locktag_field3 = 0,
locktag_field4= 0,   locktag_type = 0 '\0',   locktag_lockmethodid = 1 '\001' }, mode = 1 
}locallock = (LOCALLOCK *) 0x1010d9568lock = <value temporarily unavailable, due to optimizations>proclock = <value
temporarilyunavailable, due to optimizations>proclocktag = { myLock = 0x100d2aad0, myProc = 0x100d4daf8
 
}found = 0 '\0'owner = (ResourceOwner) 0x101060628hashcode = 534311893proclock_hashcode = 311778389partitionLock =
49status= <value temporarily unavailable, due to optimizations>
 
#5  0x00000001001f36bf in LockRelationOid (relid=16420, lockmode=1) at lmgr.c:79tag = { locktag_field1 = 16384,
locktag_field2= 16420, locktag_field3 = 0, locktag_field4 = 0, locktag_type = 0 '\0', locktag_lockmethodid = 1 '\001'
 
}res = <value temporarily unavailable, due to optimizations>
#6  0x00000001000242ba in relation_open (relationId=16420, lockmode=1)
at heapam.c:899r = <value temporarily unavailable, due to optimizations>
#7  0x00000001000244ac in try_heap_openrv (relation=<value temporarily
unavailable, due to optimizations>, lockmode=<value temporarily
unavailable, due to optimizations>) at heapam.c:1127r = <value temporarily unavailable, due to optimizations>
#8  0x00000001000c7f14 in parserOpenTable (pstate=0x101041c68,
relation=0x101041a38, lockmode=1) at parse_relation.c:829rel = <value temporarily unavailable, due to
optimizations>pcbstate= { pstate = 0x101041c68, location = 14, errcontext = {   previous = 0x0,   callback =
0x1000c4810<pcb_error_callback>,   arg = 0x7fff5fbfdf80 }
 
}
#9  0x00000001000c8141 in addRangeTableEntry (pstate=0x101041c68,
relation=0x101041a38, alias=0x0, inh=1 '\001', inFromCl=1 '\001') at
parse_relation.c:895rte = <value temporarily unavailable, due to optimizations>refname = 0x101041a00
"pgbench_accounts"lockmode= <value temporarily unavailable, due to optimizations>rel = <value temporarily unavailable,
dueto optimizations>
 
#10 0x00000001000b8927 in transformTableEntry [inlined] () at
/Users/rhaas/pgsql/src/backend/parser/parse_clause.c:444r = (RangeVar *) Cannot access memory at address 0x0

When I detached gdb, the query immediately returned results, even
though it had been hung for a long time before that.

I have no idea what is going on here and have not checked to see
whether this is reproducible, but thought it was worth reporting
anyway.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Re: weird hang while running in HS mode

From

Fujii Masao

Date:

12 May 2010, 22:49:27

On Thu, May 13, 2010 at 3:50 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> rhaas=# rollback;
> ROLLBACK
>
> So at this point, one would think that there are no locks hanging
> around anywhere.  Back to the standby:
>
> rhaas=# select * from pgbench_accounts;
> <really long hang>

I think that this problem happens because the WAL record of ROLLBACK
is not flushed to the disk immediately (i.e., until another transaction
flushes the WAL records in wal_buffers) when we execute the ROLLBACK
command. The walsender sends only the WAL records in the disk, so the
standby server would not be able to replay the ROLLBACK and not release
the lock.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Re: weird hang while running in HS mode

From

Simon Riggs

Date:

13 May 2010, 06:47:32

On Thu, 2010-05-13 at 10:49 +0900, Fujii Masao wrote:
> On Thu, May 13, 2010 at 3:50 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> > rhaas=# rollback;
> > ROLLBACK
> >
> > So at this point, one would think that there are no locks hanging
> > around anywhere.  Back to the standby:
> >
> > rhaas=# select * from pgbench_accounts;
> > <really long hang>
> 
> I think that this problem happens because the WAL record of ROLLBACK
> is not flushed to the disk immediately (i.e., until another transaction
> flushes the WAL records in wal_buffers) when we execute the ROLLBACK
> command. The walsender sends only the WAL records in the disk, so the
> standby server would not be able to replay the ROLLBACK and not release
> the lock.

Rollbacks are always flushed to disk, so this explanation doesn't work.
Even if it were it would take no longer than ~1 sec if everything were
working correctly on the test system.

The "weird hang" is a lock wait and is perfectly normal in database
systems. Robert says he hasn't checked whether it is reproduceable, so
there is no evidence to show there is anything other than pilot error,
at this point.

-- Simon Riggs           www.2ndQuadrant.com

Re: weird hang while running in HS mode

From

Fujii Masao

Date:

13 May 2010, 07:08:14

On Thu, May 13, 2010 at 6:47 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> Rollbacks are always flushed to disk, so this explanation doesn't work.
> Even if it were it would take no longer than ~1 sec if everything were
> working correctly on the test system.

Yeah, rollbacks are always flushed sooner or later, but not *immediately*,
since RecordTransactionAbort() calls only XLogInsert() but not XLogFlush().
Until XLogFlush() is executed by another process, the WAL record of rollback
would stay in wal_buffers.

On the other hand, RecordTransactionCommit() calls XLogFlush(),
so commits are always flushed to the disk immediately.

> The "weird hang" is a lock wait and is perfectly normal in database
> systems. Robert says he hasn't checked whether it is reproduceable, so
> there is no evidence to show there is anything other than pilot error,
> at this point.

I was able to reproduce such a hang by not executing another transaction
after rollback. In this case, walsender cannot replicate the rollback
since it's not in the disk.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Re: weird hang while running in HS mode

From

Simon Riggs

Date:

13 May 2010, 07:23:09

On Thu, 2010-05-13 at 19:08 +0900, Fujii Masao wrote:

> I was able to reproduce such a hang by not executing another
> transaction after rollback. In this case, walsender cannot replicate
> the rollback since it's not in the disk.

WALWriter is not active?

-- Simon Riggs           www.2ndQuadrant.com

Re: weird hang while running in HS mode

From

Fujii Masao

Date:

13 May 2010, 07:32:38

On Thu, May 13, 2010 at 7:22 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On Thu, 2010-05-13 at 19:08 +0900, Fujii Masao wrote:
>
>> I was able to reproduce such a hang by not executing another
>> transaction after rollback. In this case, walsender cannot replicate
>> the rollback since it's not in the disk.
>
> WALWriter is not active?

WALWriter is active, but unfortunately it doesn't flush all of the WAL
records in wal_buffers. Please see XLogBackgroundFlush().

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Re: weird hang while running in HS mode

From

Simon Riggs

Date:

13 May 2010, 08:05:34

On Thu, 2010-05-13 at 19:32 +0900, Fujii Masao wrote:
> On Thu, May 13, 2010 at 7:22 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> > On Thu, 2010-05-13 at 19:08 +0900, Fujii Masao wrote:
> >
> >> I was able to reproduce such a hang by not executing another
> >> transaction after rollback. In this case, walsender cannot replicate
> >> the rollback since it's not in the disk.
> >
> > WALWriter is not active?
> 
> WALWriter is active, but unfortunately it doesn't flush all of the WAL
> records in wal_buffers. Please see XLogBackgroundFlush().

Yes, I wrote it. It flushes after at most 3 cycles, stated in comments.

-- Simon Riggs           www.2ndQuadrant.com

Re: weird hang while running in HS mode

From

Fujii Masao

Date:

13 May 2010, 08:14:12

On Thu, May 13, 2010 at 8:05 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On Thu, 2010-05-13 at 19:32 +0900, Fujii Masao wrote:
>> On Thu, May 13, 2010 at 7:22 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> > On Thu, 2010-05-13 at 19:08 +0900, Fujii Masao wrote:
>> >
>> >> I was able to reproduce such a hang by not executing another
>> >> transaction after rollback. In this case, walsender cannot replicate
>> >> the rollback since it's not in the disk.
>> >
>> > WALWriter is not active?
>>
>> WALWriter is active, but unfortunately it doesn't flush all of the WAL
>> records in wal_buffers. Please see XLogBackgroundFlush().
>
> Yes, I wrote it. It flushes after at most 3 cycles, stated in comments.

Yeah, what is worse is that RecordTransactionAbort() doesn't update the
XLogCtl->asyncCommitLSN, so rollback might not be flushed even after at
3 cycles.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Re: weird hang while running in HS mode

From

Simon Riggs

Date:

13 May 2010, 08:18:47

On Thu, 2010-05-13 at 20:13 +0900, Fujii Masao wrote:
> On Thu, May 13, 2010 at 8:05 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> > On Thu, 2010-05-13 at 19:32 +0900, Fujii Masao wrote:
> >> On Thu, May 13, 2010 at 7:22 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> >> > On Thu, 2010-05-13 at 19:08 +0900, Fujii Masao wrote:
> >> >
> >> >> I was able to reproduce such a hang by not executing another
> >> >> transaction after rollback. In this case, walsender cannot replicate
> >> >> the rollback since it's not in the disk.
> >> >
> >> > WALWriter is not active?
> >>
> >> WALWriter is active, but unfortunately it doesn't flush all of the WAL
> >> records in wal_buffers. Please see XLogBackgroundFlush().
> >
> > Yes, I wrote it. It flushes after at most 3 cycles, stated in comments.
> 
> Yeah, what is worse is that RecordTransactionAbort() doesn't update the
> XLogCtl->asyncCommitLSN, so rollback might not be flushed even after at
> 3 cycles.

What you mean then is that there is a bug, not that it should work this
way. Will look.

-- Simon Riggs           www.2ndQuadrant.com

Re: weird hang while running in HS mode

From

Simon Riggs

Date:

13 May 2010, 08:44:04

On Thu, 2010-05-13 at 20:13 +0900, Fujii Masao wrote:
> On Thu, May 13, 2010 at 8:05 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> > On Thu, 2010-05-13 at 19:32 +0900, Fujii Masao wrote:
> >> On Thu, May 13, 2010 at 7:22 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> >> > On Thu, 2010-05-13 at 19:08 +0900, Fujii Masao wrote:
> >> >
> >> >> I was able to reproduce such a hang by not executing another
> >> >> transaction after rollback. In this case, walsender cannot replicate
> >> >> the rollback since it's not in the disk.
> >> >
> >> > WALWriter is not active?
> >>
> >> WALWriter is active, but unfortunately it doesn't flush all of the WAL
> >> records in wal_buffers. Please see XLogBackgroundFlush().
> >
> > Yes, I wrote it. It flushes after at most 3 cycles, stated in comments.
> 
> Yeah, what is worse is that RecordTransactionAbort() doesn't update the
> XLogCtl->asyncCommitLSN, so rollback might not be flushed even after at
> 3 cycles.

Well spotted. Fix applied, thanks both.

-- Simon Riggs           www.2ndQuadrant.com