Re: Hot Standby 0.2.1 - Mailing list pgsql-hackers

From	Heikki Linnakangas
Subject	Re: Hot Standby 0.2.1
Date	September 23, 2009 09:07:36
Msg-id	4AB9E547.9040602@enterprisedb.com
In response to	Hot Standby 0.2.1 (Simon Riggs <simon@2ndQuadrant.com>)
Responses	Re: Hot Standby 0.2.1 Re: Hot Standby 0.2.1 Re: Hot Standby 0.2.1
List	pgsql-hackers

The logic in the lock manager to track the number of held
AccessExclusiveLocks (with ProcArrayIncrementNumHeldLocks and
ProcArrayDecrementNumHeldLocks) seems to be broken. I added an Assertion
into ProcArrayDecrementNumHeldLocks:

--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -1401,6 +1401,7 @@ ProcArrayIncrementNumHeldLocks(PGPROC *proc)
 void
 ProcArrayDecrementNumHeldLocks(PGPROC *proc)
 {
+   Assert(proc->numHeldLocks > 0);
    proc->numHeldLocks--;
 }

This tripped the assertion:

postgres=# CREATE TABLE foo (id int4 primary key);
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index
"foo_pkey" for table "foo"
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.

Making matters worse, the primary server refuses to start up after
that, tripping the assertion again in crash recovery:

$ bin/postmaster -D data
LOG:  database system was interrupted while in recovery at 2009-09-23
11:56:15 EEST
HINT:  This probably means that some data is corrupted and you will have
to use the last backup for recovery.
LOG:  database system was not properly shut down; automatic recovery in
progress
LOG:  redo starts at 0/32000070
LOG:  REDO @ 0/32000070; LSN 0/320000AC: prev 0/32000020; xid 0; len 32:
Heap2 - clean: rel 1663/11562/1249; blk 32 remxid 4352
LOG:  consistent recovery state reached
LOG:  REDO @ 0/320000AC; LSN 0/320000CC: prev 0/32000070; xid 0; len 4:
XLOG - nextOid: 24600
LOG:  REDO @ 0/320000CC; LSN 0/320000F4: prev 0/320000AC; xid 0; len 12:
Storage - file create: base/11562/16408
LOG:  REDO @ 0/320000F4; LSN 0/3200011C: prev 0/320000CC; xid 4364; len
12: Relation - exclusive relation lock: xid 4364 db 11562 rel 16408
LOG:  REDO @ 0/3200011C; LSN 0/320001D8: prev 0/320000F4; xid 4364; len
159: Heap - insert: rel 1663/11562/1259; tid 5/4
...
LOG:  REDO @ 0/32004754; LSN 0/32004878: prev 0/320046A8; xid 4364; len
264: Transaction - commit: 2009-09-23 11:55:51.888398+03; 15 inval
msgs:catcache id38 catcache id37 catcache id38 catcache id37 catcache
id38 catcache id37 catcache id7 catcache id6 catcache id26 smgr relcache
smgr relcache smgr relcache
TRAP: FailedAssertion("!(proc->numHeldLocks > 0)", File: "procarray.c",
Line: 1404)
LOG:  startup process (PID 27430) was terminated by signal 6: Aborted
LOG:  aborting startup due to startup process failure

I'm sure that's just a simple bug somewhere, but it highlights that we
need to be careful to avoid putting any extra work into the normal
recovery path. Otherwise bugs in hot standby related code can cause
crash recovery to fail.
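
To illustrate the point, here is a minimal standalone sketch of keeping
standby-only bookkeeping out of the plain crash recovery path. It is not
PostgreSQL code: RecoveryState, standby_enabled, replay_lock_record and
replay_release_record are hypothetical names invented for this example,
and the counter only loosely models numHeldLocks.

```c
#include <stdbool.h>

/* Hypothetical model of replay state; not a PostgreSQL structure. */
typedef struct RecoveryState
{
    bool standby_enabled;   /* assumed flag: true only when hot standby is active */
    int  tracked_locks;     /* loosely models proc->numHeldLocks */
} RecoveryState;

/* Replay of a lock record: count it only when standby tracking is on. */
void
replay_lock_record(RecoveryState *st)
{
    if (!st->standby_enabled)
        return;             /* plain crash recovery: no extra work */
    st->tracked_locks++;
}

/*
 * Replay of the matching release (e.g. at commit). Returns false when the
 * counter would go negative, which is the condition the Assert above
 * catches, instead of aborting the process.
 */
bool
replay_release_record(RecoveryState *st)
{
    if (!st->standby_enabled)
        return true;        /* nothing was tracked, nothing to undo */
    if (st->tracked_locks <= 0)
        return false;       /* unbalanced release */
    st->tracked_locks--;
    return true;
}
```

With standby_enabled set to false, both functions return without touching
the counter, so a bookkeeping bug in the standby-only branch cannot stop
crash recovery from completing in this model.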

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
