Re: Partitioned checkpointing - Mailing list pgsql-hackers

From Fabien COELHO
Subject Re: Partitioned checkpointing
Date
Msg-id alpine.DEB.2.10.1509110936340.15994@sto
In response to Partitioned checkpointing  (Takashi Horikawa <t-horikawa@aj.jp.nec.com>)
Responses Re: Partitioned checkpointing  (Simon Riggs <simon@2ndQuadrant.com>)
Re: Partitioned checkpointing  (Takashi Horikawa <t-horikawa@aj.jp.nec.com>)
List pgsql-hackers
Hello Takashi-san,

I wanted to do some tests with this POC patch. For this purpose, it would 
be nice to have a GUC that allows this feature to be activated or deactivated.

Could you provide a patch with such a GUC? I would suggest making the 
number of partitions the GUC, so that setting it to 1 would basically 
reflect the current behavior.
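For instance (the GUC name here is hypothetical, just to illustrate what I 
have in mind):

```
# hypothetical postgresql.conf setting for testing the POC patch
checkpoint_partitions = 16    # 1 = current single-pass checkpoint behavior
```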

Some general comments:

I understand that this patch cuts the checkpoint of buffers into 16 
partitions, each addressing 1/16 of the buffers, and each with its own 
WAL-log entry, pacing, fsync, and so on.

I'm not sure why it would be much better. I agree that it may have some 
small positive influence on performance, but I'm afraid it may also 
degrade performance under some conditions. So I think that a better 
understanding of why performance improves, and a focus on that, could 
help obtain a more systematic gain.

This method interacts with the current proposal to improve the 
checkpointer's behavior by avoiding random I/Os, but the two could be 
combined.

I'm wondering whether the benefits you see are linked to the file-flushing 
behavior induced by fsyncing more often. If so, this is quite close to the 
"flushing" part of the current "checkpoint continuous flushing" patch, and 
could be redundant with, or less efficient than, what is done there, 
especially as tests have shown that the effect of flushing is *much* 
better on sorted buffers.

Another proposal around, suggested by Andres Freund I think, is that the 
checkpointer could fsync files while checkpointing rather than waiting 
for the end of the checkpoint. I think this may also be one of the 
reasons why your patch brings a benefit, but Andres' approach would be 
more systematic, because there would be no need to fsync each file 
several times (basically your patch issues 16 fsyncs per file). This 
suggests that the "partitioning" should be done at a lower level, from 
within CheckPointBuffers, which would take care of fsyncing files some 
time after writing buffers to them has finished.

-- 
Fabien.


