On Thu, Apr 23, 2015 at 7:09 AM, <grant@amazon.com> wrote:
> 1. Set max_prepared_transactions to 2 in postgresql.conf.
> 2. start Postgres.
> 3. Create two uncommitted prepared transactions: BEGIN; PREPARE
> TRANSACTION 'test1'; BEGIN; PREPARE TRANSACTION 'test2';
> 4. Set max_prepared_transactions to 1 in postgresql.conf.
> 5. Restart Postgres again.
>
> At this point the startup will fail with a fatal but the postmaster process
> keeps running.
>
> LOG: database system was interrupted; last known up at 2015-04-16 17:19:56
> PDT
> LOG: database system was not properly shut down; automatic recovery in
> progress
> LOG: record with zero length at 0/1826C70
> LOG: redo is not required
> LOG: recovering prepared transaction 1685
> LOG: recovering prepared transaction 1683
> FATAL: maximum number of prepared transactions reached
> HINT: Increase max_prepared_transactions (currently 1).
>
> Looks like their may be a LWLock that is not correctly getting released
> which causes the process to hang rather then exit.
Yep, the startup process remains stuck here, so we should release the
lock before issuing ERROR in twophase.c:
* frame #0: 0x00007fff95004b72 libsystem_kernel.dylib`semop + 10
frame #1: 0x000000010f59530f
postgres`PGSemaphoreLock(sema=0x00000001187a1190) + 63 at
pg_sema.c:387
frame #2: 0x000000010f632af9
postgres`LWLockAcquireCommon(lock=0x000000010fc452e0,
mode=LW_EXCLUSIVE, valptr=0x0000000000000000, val=0) + 377 at
lwlock.c:1037
frame #3: 0x000000010f63296b
postgres`LWLockAcquire(l=0x000000010fc452e0, mode=LW_EXCLUSIVE) + 43
at lwlock.c:900
frame #4: 0x000000010f2f975d postgres`AtAbort_Twophase + 93 at
twophase.c:284
frame #5: 0x000000010f2f9f24 postgres`AtProcExit_Twophase(code=1,
arg=0) + 20 at twophase.c:246
The patch attached fixes the problem for me, and I think that this
should be backpatched as well.
(Note to self: check the other calls of LWLockAcquire/Release to see
if there are other code paths in the same situation).
Regards,
--
Michael