On Tue, Mar 15, 2022 at 11:04:26AM +0900, Michael Paquier wrote:
> On Mon, Mar 14, 2022 at 03:54:19PM +0530, Bharath Rupireddy wrote:
>> At times, the snapshot or mapping files can be large in number and on
>> some platforms it takes time for checkpoint to process all of them.
>> Having the stats about them in server logs can help us better analyze
>> why checkpoint took a long time and provide a better RCA.
> 
> Do you have any numbers to share regarding that?  Seeing information
> about 1k WAL segments being recycled and/or removed by a checkpoint
> is useful when the operation takes dozens of seconds to complete,
> because we can be talking about hundreds of gigs worth of files moved
> around.  If we are talking about 100~200 files of up to 10~20kB each
> for snapshot and mapping files, the information has less value, as
> that is worth only a portion of one WAL segment.

I don't have specific numbers to share, but as noted elsewhere [0], I
routinely see lengthy checkpoints that spend a lot of time in these cleanup
tasks.

[0] https://postgr.es/m/18ED8B1F-7F5B-4ABF-848D-45916C938BC7%40amazon.com

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com