Re: Partitioned checkpointing - Mailing list pgsql-hackers

From	Fabien COELHO
Subject	Re: Partitioned checkpointing
Date	September 11, 2015 16:28:46
Msg-id	alpine.DEB.2.10.1509111813310.21797@sto
In response to	Re: Partitioned checkpointing (Simon Riggs <simon@2ndQuadrant.com>)
List	pgsql-hackers

Hello Simon,

> The idea to do a partial pass through shared buffers and only write a
> fraction of dirty buffers, then fsync them is a good one.

Sure.

> The key point is that we spread out the fsyncs across the whole checkpoint
> period.

Yes, this is really Andres' suggestion, as I understood it.

> I think we should be writing out all buffers for a particular file in one
> pass, then issue one fsync per file.  >1 fsyncs per file seems a bad idea.

This is one of the things done in the "checkpoint continuous flushing" 
patch: buffers are sorted, so they are written per file and in order 
within each file, which helps to get sequential writes instead of random 
writes.

See https://commitfest.postgresql.org/6/260/
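
As a rough illustration of "sort, write per file, one fsync per file" (a 
sketch only, not code from the patch; the function name, the tuple layout 
and the use of plain files in a directory are assumptions made for the 
example):

```python
import os
from itertools import groupby
from operator import itemgetter

BLOCK_SIZE = 8192  # PostgreSQL's default page size, used here only for illustration


def write_sorted(dirty, directory):
    """Write dirty buffers grouped by file, in block order, one fsync per file.

    `dirty` is a list of (filename, block_number, data) tuples in any order.
    Returns the (filename, block_number) pairs in the order they were written.
    """
    order = []
    # Sorting by (file, block) turns scattered writes into ascending per-file runs.
    by_file = groupby(sorted(dirty, key=itemgetter(0, 1)), key=itemgetter(0))
    for name, group in by_file:
        fd = os.open(os.path.join(directory, name), os.O_RDWR | os.O_CREAT, 0o600)
        try:
            for _, block, data in group:
                os.pwrite(fd, data, block * BLOCK_SIZE)
                order.append((name, block))
            os.fsync(fd)  # a single fsync once all buffers of this file are written
        finally:
            os.close(fd)
    return order
```

Because every buffer of a file is written before its fsync, no file is 
synced more than once per pass.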

However for now the final fsync is not called, but Linux is told that the 
written buffers must be flushed, which is akin to an "asynchronous fsync", 
i.e. it asks to move data but does not wait for the data to be actually 
written, as a blocking fsync would.
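
To make the "asynchronous fsync" idea concrete, here is a small sketch. 
One Linux call that fits the description is sync_file_range() with 
SYNC_FILE_RANGE_WRITE; that it is the call meant here is an assumption, 
and the helper name flush_hint is hypothetical. Python has no wrapper for 
it, so the sketch goes through ctypes and simply reports failure on 
systems where the call is missing:

```python
import ctypes
import ctypes.util

SYNC_FILE_RANGE_WRITE = 2  # flag value from <linux/fs.h>


def flush_hint(fd, offset, nbytes):
    """Ask the kernel to start writing back a byte range without waiting.

    Returns True if the hint was issued, False if the call is unavailable
    or failed. This gives no durability guarantee: a real fsync is still
    needed later.
    """
    try:
        libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)
        fn = libc.sync_file_range
    except (OSError, AttributeError):
        return False  # not Linux, or libc does not export the symbol
    fn.argtypes = [ctypes.c_int, ctypes.c_int64, ctypes.c_int64, ctypes.c_uint]
    fn.restype = ctypes.c_int
    return fn(fd, offset, nbytes, SYNC_FILE_RANGE_WRITE) == 0
```

The call returns as soon as writeback has been initiated, whereas 
os.fsync(fd) would block until the data has reached the disk.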

Andres' suggestion, which has some points in common with Takashi-san's 
patch, is to also integrate the fsync into the buffer writing process. 
There are some details to think about: it is probably not a good idea to 
issue an fsync right after the corresponding writes, and it is better to 
wait for some delay before doing so, so the implementation is not 
straightforward.
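
One possible shape for such a delayed fsync, purely as a sketch (the 
class, its `delay` knob and the explicit `now` argument are hypothetical 
and not taken from any of the patches discussed):

```python
import heapq
import os


class DeferredFsync:
    """Queue files for fsync and sync each one only after a delay has elapsed."""

    def __init__(self, delay):
        self.delay = delay  # seconds between the last write and the fsync
        self._queue = []    # min-heap of (due_time, path)

    def written(self, path, now):
        """Record that all writes to `path` were finished at time `now`."""
        heapq.heappush(self._queue, (now + self.delay, path))

    def sync_due(self, now):
        """fsync every queued file whose delay has expired; return their paths."""
        done = []
        while self._queue and self._queue[0][0] <= now:
            _, path = heapq.heappop(self._queue)
            fd = os.open(path, os.O_RDWR)
            try:
                os.fsync(fd)
            finally:
                os.close(fd)
            done.append(path)
        return done
```

The writer would call written() after finishing a file and sync_due() 
from time to time between writes, so that the kernel has had a chance to 
flush most of the data before the blocking fsync is issued. How long the 
delay should be is exactly the open detail mentioned above.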

-- 
Fabien.
