Re: Avoid erroring out when unable to remove or parse logical rewrite files to save checkpoint work - Mailing list pgsql-hackers

From Bharath Rupireddy
Subject Re: Avoid erroring out when unable to remove or parse logical rewrite files to save checkpoint work
Date
Msg-id CALj2ACXGwc5UBx1Jc2wacqEM1oBmFHjRHxvTDA+X48sE7ObcFQ@mail.gmail.com
Whole thread Raw
In response to Re: Avoid erroring out when unable to remove or parse logical rewrite files to save checkpoint work  (Nathan Bossart <nathandbossart@gmail.com>)
Responses Re: Avoid erroring out when unable to remove or parse logical rewrite files to save checkpoint work  (Nathan Bossart <nathandbossart@gmail.com>)
List pgsql-hackers
On Thu, Feb 3, 2022 at 12:07 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
>
> On Wed, Feb 02, 2022 at 05:19:26PM +0530, Bharath Rupireddy wrote:
> > On Wed, Feb 2, 2022 at 5:25 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
> >> However, I'm not sure about the change to ReadDirExtended().  That might be
> >> okay for CheckPointSnapBuild(), which is just trying to remove old files,
> >> but CheckPointLogicalRewriteHeap() is responsible for ensuring that files
> >> are flushed to disk for the checkpoint.  If we stop reading the directory
> >> after an error and let the checkpoint continue, isn't it possible that some
> >> mappings files won't be persisted to disk?
> >
> > Unless I mis-read your above statement, with LOG level in
> > ReadDirExtended, I don't think we stop reading the files in
> > CheckPointLogicalRewriteHeap. Am I missing something here?
>
> ReadDirExtended() has the following comment:
>
>  * If elevel < ERROR, returns NULL after any error.  With the normal coding
>  * pattern, this will result in falling out of the loop immediately as
>  * though the directory contained no (more) entries.
>
> If there is a problem reading the directory, we will LOG and then exit the
> loop.  If we didn't scan through all the entries in the directory, there is
> a chance that we didn't fsync() all the files that need it.

Thanks. I get it. For syncing map files, we don't want to tolerate any
errors, whereas removal of the old map files (lesser than cutoff LSN)
can be tolerated in CheckPointLogicalRewriteHeap.

Here's the v7 version using ReadDir for CheckPointLogicalRewriteHeap.

Regards,
Bharath Rupireddy.

Attachment

pgsql-hackers by date:

Previous
From: Justin Pryzby
Date:
Subject: Re: Adding CI to our tree
Next
From: Pavel Stehule
Date:
Subject: Re: support for CREATE MODULE