Re: Avoid erroring out when unable to remove or parse logical rewrite files to save checkpoint work - Mailing list pgsql-hackers

From Nathan Bossart
Subject Re: Avoid erroring out when unable to remove or parse logical rewrite files to save checkpoint work
Msg-id 20220329224832.GA560657@nathanxps13
In response to Re: Avoid erroring out when unable to remove or parse logical rewrite files to save checkpoint work  (Thomas Munro <thomas.munro@gmail.com>)
Responses Re: Avoid erroring out when unable to remove or parse logical rewrite files to save checkpoint work
List pgsql-hackers

Thanks for taking a look!

On Thu, Mar 24, 2022 at 01:17:01PM +1300, Thomas Munro wrote:
>      /* we're only handling directories here, skip if it's not ours */
> -    if (lstat(path, &statbuf) == 0 && !S_ISDIR(statbuf.st_mode))
> +    if (lstat(path, &statbuf) != 0)
> +        ereport(ERROR,
> +                (errcode_for_file_access(),
> +                 errmsg("could not stat file \"%s\": %m", path)));
> +    else if (!S_ISDIR(statbuf.st_mode))
>          return;
> 
> Why is this a good place to silently ignore non-directories?
> StartupReorderBuffer() is already in charge of skipping random
> detritus found in the directory, so would it be better to do "if
> (get_dirent_type(...) != PGFILETYPE_DIR) continue" there, and then
> drop the lstat() stanza from ReorderBufferCleanupSerializedTXNs()
> completely?  Then perhaps its ReadDirExtended() should be using ERROR
> instead of INFO, so that missing/non-dir/b0rked directories raise an
> error.

My guess is that this was done because ReorderBufferCleanupSerializedTXNs()
is also called from ReorderBufferAllocate() and ReorderBufferFree().
However, it is odd that we just silently return if the slot path isn't a
directory in those cases.  I think we could use get_dirent_type() in
StartupReorderBuffer() as you suggested, and then we could let ReadDir()
ERROR for non-directories for the other callers of
ReorderBufferCleanupSerializedTXNs().  WDYT?
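
To make the proposed split concrete, here is a rough standalone POSIX sketch.
It is not PostgreSQL code.  path_is_directory(), cleanup_slot_dir(), and
startup_scan() are hypothetical stand-ins for get_dirent_type(),
ReorderBufferCleanupSerializedTXNs(), and StartupReorderBuffer(), a -1 return
stands in for where ReadDir() would ERROR, and the "xid" file-name prefix is
an assumption about how spill files are named.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

#define SKETCH_MAXPATH 4096

/* Hypothetical stand-in for get_dirent_type(): 1 = directory, 0 = other, -1 = lstat failed. */
static int
path_is_directory(const char *path)
{
    struct stat st;

    if (lstat(path, &st) != 0)
        return -1;
    return S_ISDIR(st.st_mode) ? 1 : 0;
}

/*
 * Hypothetical stand-in for the cleanup routine.  There is no lstat() stanza
 * here: if slotdir is missing or not a directory, opendir() fails and we
 * return -1, which is where the real code would ERROR via ReadDir().
 * Otherwise returns the number of "xid*" files removed.
 */
static int
cleanup_slot_dir(const char *slotdir)
{
    DIR        *dir;
    struct dirent *de;
    char        path[SKETCH_MAXPATH];
    int         removed = 0;

    assert(slotdir != NULL);

    dir = opendir(slotdir);
    if (dir == NULL)
        return -1;

    while ((de = readdir(dir)) != NULL)
    {
        /* only entries that look like spill files (assumed "xid" prefix) */
        if (strncmp(de->d_name, "xid", 3) != 0)
            continue;

        snprintf(path, sizeof(path), "%s/%s", slotdir, de->d_name);
        if (unlink(path) != 0)
        {
            closedir(dir);
            return -1;
        }
        removed++;
    }
    closedir(dir);
    return removed;
}

/*
 * Hypothetical stand-in for the startup scan: the caller skips anything that
 * is not a directory, so the cleanup routine never has to.  Returns the
 * number of slot directories cleaned, or -1 on failure.
 */
static int
startup_scan(const char *replslot_dir)
{
    DIR        *dir;
    struct dirent *de;
    char        path[SKETCH_MAXPATH];
    int         cleaned = 0;

    assert(replslot_dir != NULL);

    dir = opendir(replslot_dir);
    if (dir == NULL)
        return -1;

    while ((de = readdir(dir)) != NULL)
    {
        if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
            continue;

        snprintf(path, sizeof(path), "%s/%s", replslot_dir, de->d_name);

        /* skip random detritus here instead of inside the cleanup routine */
        if (path_is_directory(path) != 1)
            continue;

        if (cleanup_slot_dir(path) < 0)
        {
            closedir(dir);
            return -1;
        }
        cleaned++;
    }
    closedir(dir);
    return cleaned;
}
```

With this shape, a stray regular file under the slot directory is skipped by
the scan, while handing a non-directory straight to the cleanup routine (as
the other callers could only do if something were badly broken) fails loudly
instead of returning silently.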

> I don't understand why it's reporting readdir() errors at INFO
> but unlink() errors at ERROR, and as far as I can see the other paths
> that reach this code shouldn't be sending in paths to non-directories
> here unless something is seriously busted and that's ERROR-worthy.

I agree.  I'll switch it to ReadDir() in the next revision so that we ERROR
instead of INFO.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com


