Re: checkpointer continuous flushing - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: checkpointer continuous flushing
Date
Msg-id CAA4eK1Kxy1OLK61cjTLdTcs0eXO1aVOhuMg11QL3BdgPOUkc1Q@mail.gmail.com
In response to checkpointer continuous flushing  (Fabien COELHO <coelho@cri.ensmp.fr>)
Responses Re: checkpointer continuous flushing  (Fabien COELHO <coelho@cri.ensmp.fr>)
List pgsql-hackers
On Mon, Jun 1, 2015 at 5:10 PM, Fabien COELHO <coelho@cri.ensmp.fr> wrote:

> Hello pg-devs,
>
> This patch is a simplified and generalized version of Andres Freund's August 2014 patch for flushing while writing during checkpoints, with some documentation and configuration warnings added.
>
> For the initial patch, see:
>
>   http://www.postgresql.org/message-id/20140827091922.GD21544@awork2.anarazel.de
>
> For the whole thread:
>
>   http://www.postgresql.org/message-id/alpine.DEB.2.10.1408251900211.11151@sto
>
> The objective is to help avoid PG stalling when fsyncing on checkpoints, and in general to get better latency-bound performance.

> -FlushBuffer(volatile BufferDesc *buf, SMgrRelation reln)
> +FlushBuffer(volatile BufferDesc *buf, SMgrRelation reln, bool flush_to_disk)
>  {
>   XLogRecPtr recptr;
>   ErrorContextCallback errcallback;
> @@ -2410,7 +2417,8 @@ FlushBuffer(volatile BufferDesc *buf, SMgrRelation reln)
>    buf->tag.forkNum,
>    buf->tag.blockNum,
>    bufToWrite,
> -  false);
> +  false,
> +  flush_to_disk);

Won't this lead to more unsorted writes (random I/O), since the
FlushBuffer requests (from the checkpointer or bgwriter) are not sorted
by file or by the order of blocks on disk?

I remember there was some discussion a while back about sorting
writes during checkpoint; one idea could be to evaluate that
approach together with this patch.  I just saw that Andres has
also made the same suggestion, which indicates that it is important
to look at both things together.
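
To make the sorting idea concrete, here is a minimal sketch of ordering
dirty pages by file and then by block number before they are written.
This is not code from either patch: DirtyPage, dirty_page_cmp and
sort_dirty_pages are hypothetical names, and DirtyPage is only a
simplified stand-in for the real buffer tag. It also assumes that block
order within a file roughly matches the on-disk layout, which depends on
the filesystem.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for a buffer tag (not the real BufferTag). */
typedef struct DirtyPage
{
    uint32_t file_id;    /* assumed identifier of the relation file */
    uint32_t fork_num;   /* fork within the relation */
    uint32_t block_num;  /* block offset within the fork */
} DirtyPage;

/* Order by file, then fork, then block, so each file is visited sequentially. */
int
dirty_page_cmp(const void *a, const void *b)
{
    const DirtyPage *pa = a;
    const DirtyPage *pb = b;

    if (pa->file_id != pb->file_id)
        return (pa->file_id < pb->file_id) ? -1 : 1;
    if (pa->fork_num != pb->fork_num)
        return (pa->fork_num < pb->fork_num) ? -1 : 1;
    if (pa->block_num != pb->block_num)
        return (pa->block_num < pb->block_num) ? -1 : 1;
    return 0;
}

/* Sort the list in place; a later write loop would then walk it in order. */
void
sort_dirty_pages(DirtyPage *pages, size_t n)
{
    assert(pages != NULL || n == 0);
    if (n > 1)
        qsort(pages, n, sizeof(DirtyPage), dirty_page_cmp);
}
```

A checkpoint-like loop could collect the pages to write, call
sort_dirty_pages() once, and then issue the writes in that order, so that
flushing each file covers a contiguous run of blocks rather than
scattered ones.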

Another related point: I think that currently even the fsync
requests are not issued in the order in which the files are stored on
disk, so that might also cause random I/O?

Yet another idea could be to allow the bgwriter to also fsync the dirty
buffers.  That may have the side effect of not being able to clean dirty
pages at the speed the system requires, but if that happens one could
think of having multiple bgwriter tasks.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
