CheckpointStartLock starvation - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject CheckpointStartLock starvation
Date
Msg-id 461152B3.5060404@enterprisedb.com
Responses Re: CheckpointStartLock starvation  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: CheckpointStartLock starvation  (ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp>)
List pgsql-hackers
I'm seeing a problem on my benchmark machine: checkpoints stop happening 
after the ramp-up period.

It looks like the bgwriter gets starved waiting on the
CheckpointStartLock. The CheckpointStartLock is held in shared mode over
an XLogFlush when committing. On an extremely busy system like a
benchmark, that is always long enough for a new transaction to acquire
the CheckpointStartLock again, so the lock is never free for the
bgwriter to take.
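
To illustrate the starvation described above, here is a toy model in C.
It is only a sketch, not PostgreSQL code: it assumes a lock that keeps
admitting new shared holders while another process waits for it, and
the names DemoHold and demo_first_free_tick are made up for this
example. Each shared hold is a half-open tick interval, and the
function looks for the first tick at which nobody holds the lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: one shared hold of the lock over ticks [start, end). */
typedef struct {
    int start;
    int end;
} DemoHold;

/*
 * Return the first tick in [0, horizon) at which no shared holder has the
 * lock, i.e. the first moment a waiter needing the lock to be free could
 * get it. Return -1 if the lock is held at every tick up to the horizon.
 */
int demo_first_free_tick(const DemoHold *holds, size_t nholds, int horizon)
{
    assert(horizon >= 0);

    for (int t = 0; t < horizon; t++) {
        bool held = false;

        for (size_t i = 0; i < nholds; i++) {
            if (holds[i].start <= t && t < holds[i].end) {
                held = true;
                break;
            }
        }
        if (!held)
            return t;
    }
    return -1;
}
```

With holds that overlap, such as [0,3), [2,5), [4,7), [6,9), the
function returns -1 for a horizon of 9: some committer always holds the
lock. With a gap between holds, such as [0,3) and [4,6), it returns 3.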

I'm running another test with more logging to confirm that's what's 
happening, but I'm pretty sure that's it...

As a proposed fix, instead of acquiring the CheckpointStartLock in
RecordTransactionCommit, we set a flag in MyProc saying "commit in
progress". After computing the new redo record pointer, the checkpoint
scans the procarray, notes every transaction that has a commit in
progress, and waits for all of them to finish before flushing clog.
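
The scheme above could look roughly like the following standalone C
sketch. It is not a patch: DemoProc, demo_procs and the demo_*
functions are hypothetical stand-ins for MyProc and the procarray, and
there is no locking or real waiting here. Remembering a transaction id
next to the flag is an assumption of this sketch, added so that a
backend which finishes one commit and starts another is not mistaken
for the one the checkpoint noted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_PROCS 8

/* Hypothetical per-backend entry, standing in for a procarray slot. */
typedef struct {
    bool     in_commit;   /* the "commit in progress" flag */
    unsigned xid;         /* assumed: id of the committing transaction */
} DemoProc;

static DemoProc demo_procs[DEMO_MAX_PROCS];

/* Backend side: set the flag before the commit record is written. */
void demo_commit_begin(size_t slot, unsigned xid)
{
    assert(slot < DEMO_MAX_PROCS);
    demo_procs[slot].xid = xid;
    demo_procs[slot].in_commit = true;
}

/* Backend side: clear the flag once the commit is complete. */
void demo_commit_end(size_t slot)
{
    assert(slot < DEMO_MAX_PROCS);
    demo_procs[slot].in_commit = false;
}

/*
 * Checkpoint side: note every transaction that is mid-commit right now.
 * xids_out must have room for DEMO_MAX_PROCS entries.
 */
size_t demo_collect_in_commit(unsigned *xids_out)
{
    size_t n = 0;

    for (size_t i = 0; i < DEMO_MAX_PROCS; i++) {
        if (demo_procs[i].in_commit)
            xids_out[n++] = demo_procs[i].xid;
    }
    return n;
}

/*
 * Checkpoint side: true while any of the noted transactions is still
 * committing. Commits that began after the scan are ignored, so the
 * checkpoint only waits for a fixed, finite set.
 */
bool demo_any_still_in_commit(const unsigned *xids, size_t n)
{
    for (size_t i = 0; i < DEMO_MAX_PROCS; i++) {
        if (!demo_procs[i].in_commit)
            continue;
        for (size_t j = 0; j < n; j++) {
            if (demo_procs[i].xid == xids[j])
                return true;
        }
    }
    return false;
}
```

A checkpoint would call demo_collect_in_commit once, then poll
demo_any_still_in_commit until it returns false before flushing clog.
Because the set is fixed at scan time, new commits arriving later
cannot extend the wait, which is what avoids the starvation.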

Unless someone has a better idea, I'll write a patch to do the above.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com

