Re: Avoid erroring out when unable to remove or parse logical rewrite files to save checkpoint work - Mailing list pgsql-hackers

From Nathan Bossart
Subject Re: Avoid erroring out when unable to remove or parse logical rewrite files to save checkpoint work
Msg-id 20220215175752.GA2413813@nathanxps13
In response to Re: Avoid erroring out when unable to remove or parse logical rewrite files to save checkpoint work  (Andres Freund <andres@anarazel.de>)
Responses Re: Avoid erroring out when unable to remove or parse logical rewrite files to save checkpoint work  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On Tue, Feb 15, 2022 at 09:09:52AM -0800, Andres Freund wrote:
> On 2022-02-10 21:30:45 +0530, Bharath Rupireddy wrote:
>> Replace ReadDir with ReadDirExtended (in CheckPointSnapBuild) and
>> get rid of lstat entirely.
> 
> I think this might be based on a slight misunderstanding / bad phrasing on my
> part.  We can use get_dirent_type() to optimize away the lstat on most
> platforms, ReadDirExtended itself doesn't do that automatically. I was trying
> to reference removing lstat calls by using get_dirent_type() in more places...
> 
> 
>> We still use ReadDir in CheckPointLogicalRewriteHeap
>> because unable to read directory would result a NULL from
>> ReadDirExtended and we may miss to fsync the remaining map files,
>> so here let's error out with ReadDir.
> 
> Then why is this skipping the lstat?
> 
> 
>> Also, convert "could not parse filename" and "could not remove file"
> errors to LOG messages in CheckPointLogicalRewriteHeap. This will
>> enable checkpoint not to waste the amount of work that it had done.
> 
> I still doubt this is a good idea.

IIUC you are advocating for something more like the attached patches.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
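
For readers without the attachments, the idea Andres describes above (use the type information already returned by readdir() and fall back to lstat() only when it is missing) can be sketched roughly as follows. This is not the attached patch and not PostgreSQL's get_dirent_type(); the names sketch_dirent_type, count_regular_files and SketchFileType are hypothetical, and the sketch assumes a platform whose struct dirent may expose d_type (Linux, the BSDs, macOS).

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE			/* exposes d_type / DT_* on glibc */
#endif
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical result type, for illustration only. */
typedef enum
{
	SK_UNKNOWN, SK_REG, SK_DIR, SK_LNK, SK_ERROR
} SketchFileType;

/*
 * Classify a directory entry.  If the filesystem filled in d_type we are
 * done without a system call; otherwise fall back to lstat().
 */
static SketchFileType
sketch_dirent_type(const char *path, const struct dirent *de)
{
	struct stat st;

	assert(path != NULL && de != NULL);

#if defined(DT_UNKNOWN) && defined(DT_REG) && defined(DT_DIR) && defined(DT_LNK)
	switch (de->d_type)
	{
		case DT_REG:
			return SK_REG;
		case DT_DIR:
			return SK_DIR;
		case DT_LNK:
			return SK_LNK;
		default:
			break;				/* DT_UNKNOWN or other: ask lstat() */
	}
#endif

	if (lstat(path, &st) < 0)
		return SK_ERROR;
	if (S_ISREG(st.st_mode))
		return SK_REG;
	if (S_ISDIR(st.st_mode))
		return SK_DIR;
	if (S_ISLNK(st.st_mode))
		return SK_LNK;
	return SK_UNKNOWN;
}

/*
 * Count regular files in dirpath; returns -1 if the directory cannot be
 * opened.  The caller decides whether that is fatal.
 */
static int
count_regular_files(const char *dirpath)
{
	DIR		   *dir = opendir(dirpath);
	struct dirent *de;
	int			n = 0;

	if (dir == NULL)
		return -1;

	while ((de = readdir(dir)) != NULL)
	{
		char		path[4096];

		if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
			continue;
		/* Skip entries whose full path would not fit in the buffer. */
		if (snprintf(path, sizeof(path), "%s/%s", dirpath, de->d_name)
			>= (int) sizeof(path))
			continue;
		if (sketch_dirent_type(path, de) == SK_REG)
			n++;
	}
	closedir(dir);
	return n;
}
```

Note that the saving comes from the d_type check, not from how the directory is opened, which matches Andres's point that ReadDirExtended does not remove the lstat by itself.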

