Re: Avoid erroring out when unable to remove or parse logical rewrite files to save checkpoint work - Mailing list pgsql-hackers

From Nathan Bossart
Subject Re: Avoid erroring out when unable to remove or parse logical rewrite files to save checkpoint work
Msg-id 20220201235538.GA740616@nathanxps13
In response to Re: Avoid erroring out when unable to remove or parse logical rewrite files to save checkpoint work  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
Responses Re: Avoid erroring out when unable to remove or parse logical rewrite files to save checkpoint work  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
List pgsql-hackers
On Mon, Jan 31, 2022 at 10:42:54AM +0530, Bharath Rupireddy wrote:
> After an off-list discussion with Andreas, proposing here a patch that
> basically replaces ReadDir call with ReadDirExtended and gets rid of
> lstat entirely. With this change, the checkpoint will only care about
> the snapshot and mapping files and not fail if it finds other files in
> the directories. Removing lstat enables us to make things faster as we
> avoid a bunch of extra system calls - one lstat call per mapping
> or snapshot file.

I think removing the lstat() is probably reasonable.  We currently aren't
doing proper error checking, and the chances of a non-regular file matching
the prefix are likely pretty low.  In the worst case, we'll LOG or ERROR
when unlinking or fsyncing fails.

However, I'm not sure about the change to ReadDirExtended().  That might be
okay for CheckPointSnapBuild(), which is just trying to remove old files,
but CheckPointLogicalRewriteHeap() is responsible for ensuring that files
are flushed to disk for the checkpoint.  If we stop reading the directory
after an error and let the checkpoint continue, isn't it possible that some
mapping files won't be persisted to disk?
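
To make the concern concrete, here is a minimal standalone sketch in plain
POSIX C.  It is not backend code: fsync_matching_files() and its parameters
are hypothetical names, and it uses opendir()/readdir() directly rather than
ReadDir()/ReadDirExtended().  The point is only that readdir() returns NULL
both at end-of-directory and on error, so a caller that must guarantee
durability needs to tell the two apart and report failure instead of
treating a read error as "no more files".

```c
#include <assert.h>
#include <dirent.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper (illustration only, not PostgreSQL code).
 *
 * fsync every entry in "dirpath" whose name starts with "prefix".
 * Returns 0 only if the whole directory was read and every matching
 * file was flushed; returns -1 on any error.  *nsynced is set to the
 * number of files flushed before returning.
 */
int
fsync_matching_files(const char *dirpath, const char *prefix, int *nsynced)
{
    DIR        *dir;
    struct dirent *de;
    size_t      prefixlen;
    int         result = 0;

    assert(dirpath != NULL && prefix != NULL && nsynced != NULL);

    prefixlen = strlen(prefix);
    *nsynced = 0;

    dir = opendir(dirpath);
    if (dir == NULL)
        return -1;

    for (;;)
    {
        char        path[4096];
        int         fd;

        /* readdir() only reports errors through errno, so clear it first. */
        errno = 0;
        de = readdir(dir);
        if (de == NULL)
        {
            /* NULL with errno set is a read error, not end-of-directory. */
            if (errno != 0)
                result = -1;
            break;
        }

        /* Ignore anything that does not look like one of our files. */
        if (strncmp(de->d_name, prefix, prefixlen) != 0)
            continue;

        if (snprintf(path, sizeof(path), "%s/%s", dirpath, de->d_name)
            >= (int) sizeof(path))
        {
            result = -1;        /* path would be truncated */
            break;
        }

        fd = open(path, O_RDWR);
        if (fd < 0)
        {
            result = -1;
            break;
        }
        if (fsync(fd) != 0)
        {
            close(fd);
            result = -1;
            break;
        }
        close(fd);
        (*nsynced)++;
    }

    closedir(dir);
    return result;
}
```

A caller that is only cleaning up old files could log a -1 and move on, but
a caller that is flushing data for a checkpoint presumably has to treat -1
as a reason not to let the checkpoint complete.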

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com


