Checkpoint gets stuck in mdsync - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Checkpoint gets stuck in mdsync
Date
Msg-id 4614BD46.2070800@enterprisedb.com
Responses Re: Checkpoint gets stuck in mdsync  (ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp>)
Re: Checkpoint gets stuck in mdsync  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Now that the CheckpointStartLock starvation has been taken care of, I'm 
seeing another problem with checkpoints in my test run: mdsync never 
finishes.

Here's what's happening:
1. checkpoint calls mdsync
2. mdsync starts processing pending fsyncs from pendingOpsTable
(at this point, normal backends have to start doing writes themselves, 
because bgwriter is busy checkpointing and isn't keeping buffers clean)
3. after fsyncing 10 files, it calls AbsorbFsyncRequests
4. AbsorbFsyncRequests puts entries back into pendingOpsTable for those 
files that were already fsynced.
5. mdsync starts over, goto 2.

The loop doesn't end until the test run is over; mdsync keeps fsyncing 
the same files over and over again.
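
The steps above can be modelled in a few lines of Python. This is only a toy sketch, not PostgreSQL code: pendingOpsTable is reduced to a set of file names, "fsync" just removes a name from the set, and the names mdsync_restarting, absorb_fsync_requests, hot_files and max_fsyncs are hypothetical. The max_fsyncs cap exists only so the demonstration terminates.

```python
FSYNCS_PER_ABSORB = 10  # per the description: absorb after every 10 fsyncs


def absorb_fsync_requests(pending_ops, hot_files):
    """Hypothetical stand-in for AbsorbFsyncRequests: backends have written
    to hot_files since the last call, so those files are pending again."""
    pending_ops.update(hot_files)


def mdsync_restarting(pending_ops, hot_files, max_fsyncs):
    """Toy model of steps 2-5. Returns (fsyncs_done, finished)."""
    done = 0
    while pending_ops:
        # Step 2: take the next files from the table (same order every pass).
        batch = sorted(pending_ops)[:FSYNCS_PER_ABSORB]
        for name in batch:
            pending_ops.discard(name)  # "fsync" the file
            done += 1
            if done >= max_fsyncs:
                return done, False  # gave up: the table never drained
        # Steps 3-4: after a full batch, absorb requests into the same table.
        if len(batch) == FSYNCS_PER_ABSORB:
            absorb_fsync_requests(pending_ops, hot_files)
        # Step 5: start over.
    return done, True


files = {f"rel{i:02d}" for i in range(20)}
print(mdsync_restarting(set(files), files, max_fsyncs=1000))  # (1000, False)
print(mdsync_restarting(set(files), set(), max_fsyncs=1000))  # (20, True)
```

With backends continually re-dirtying the same 20 files, the model hits the cap without ever emptying the table, and it only ever syncs the first 10 names. With no concurrent writes it finishes after 20 fsyncs.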

My proposed fix is to make a copy of pendingOpsTable before entering the 
loop. AbsorbFsyncRequests will put new requests into a fresh 
pendingOpsTable, while the mdsync loop drains the copy. I'll write a 
patch along those lines if there are no better ideas.
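
A minimal sketch of the copy-then-drain idea, again as a toy Python model rather than an actual patch. The table is a set of file names, and mdsync_snapshot and hot_files are hypothetical names; the assumption is that requests absorbed during the loop only need to be handled by the next checkpoint.

```python
FSYNCS_PER_ABSORB = 10  # per the description: absorb after every 10 fsyncs


def mdsync_snapshot(pending_ops, hot_files):
    """Drain a copy of the table; absorbed requests go to the emptied original.

    Returns the number of fsyncs performed."""
    to_sync = set(pending_ops)  # copy taken before entering the loop
    pending_ops.clear()         # "fresh" table for requests arriving meanwhile
    done = 0
    for i, name in enumerate(sorted(to_sync), start=1):
        done += 1  # "fsync" the file; the copy is never refilled
        if i % FSYNCS_PER_ABSORB == 0:
            # Stand-in for AbsorbFsyncRequests: new requests land in the
            # fresh table and cannot extend the loop over the copy.
            pending_ops.update(hot_files)
    return done


files = {f"rel{i:02d}" for i in range(20)}
pending = set(files)
print(mdsync_snapshot(pending, files))  # 20
print(len(pending))                     # 20 requests left for the next checkpoint
```

Because the loop iterates over a fixed copy, the work per call is bounded by the size of the table at the start, no matter how often backends re-dirty the same files.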

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com
