Thread: pg11+: pg_ls_*dir LIMIT 1: temporary files .. not closed atend-of-transaction

pg11+: pg_ls_*dir LIMIT 1: temporary files .. not closed atend-of-transaction

From
Justin Pryzby
Date:
While working on a patch, I noticed this pre-existing behavior, which seems to
be new since v11, maybe due to changes to SRF.

|postgres=# SELECT pg_ls_dir('.') LIMIT 1;
|WARNING:  1 temporary files and directories not closed at end-of-transaction
|pg_ls_dir | pg_dynshmem

|postgres=# SELECT pg_ls_waldir() LIMIT 1;
|WARNING:  1 temporary files and directories not closed at end-of-transaction
|-[ RECORD 1 ]+-------------------------------------------------------------
|pg_ls_waldir | (00000001000031920000007B,16777216,"2020-03-08 03:50:34-07")


Note, that doesn't happen with "SELECT * FROM".

I'm not sure what the solution is to that, but my patch was going to make it
worse rather than better for pg_ls_tmpdir.

-- 
Justin



Justin Pryzby <pryzby@telsasoft.com> writes:
> While working on a patch, I noticed this pre-existing behavior, which seems to
> be new since v11, maybe due to changes to SRF.

> |postgres=# SELECT pg_ls_dir('.') LIMIT 1;
> |WARNING:  1 temporary files and directories not closed at end-of-transaction

Hmm, actually it looks to me like pg_ls_dir has been broken forever.
The reason the warning didn't show up before v11 is that CleanupTempFiles
didn't bleat about leaked "allocated" directories before that
(cf 9cb7db3f0).

I guess we ought to change that function to use returns-a-tuplestore
protocol instead of thinking it can hold a directory open across calls.
It's not hard to think of use-cases where the existing behavior would
cause issues worse than a nanny-ish WARNING, especially on platforms
with tight "ulimit -n" limits.

            regards, tom lane



On Sun, Mar 08, 2020 at 02:37:49PM -0400, Tom Lane wrote:
> Justin Pryzby <pryzby@telsasoft.com> writes:
> > While working on a patch, I noticed this pre-existing behavior, which seems to
> > be new since v11, maybe due to changes to SRF.
> 
> > |postgres=# SELECT pg_ls_dir('.') LIMIT 1;
> > |WARNING:  1 temporary files and directories not closed at end-of-transaction
> 
> Hmm, actually it looks to me like pg_ls_dir has been broken forever.
> The reason the warning didn't show up before v11 is that CleanupTempFiles
> didn't bleat about leaked "allocated" directories before that
> (cf 9cb7db3f0).
> 
> I guess we ought to change that function to use returns-a-tuplestore
> protocol instead of thinking it can hold a directory open across calls.
> It's not hard to think of use-cases where the existing behavior would
> cause issues worse than a nanny-ish WARNING, especially on platforms
> with tight "ulimit -n" limits.

Thanks for the analysis.

Do you mean it should enumerate all files during the initial SRF call, or use
something other than the SRF_* macros ?

-- 
Justin



Justin Pryzby <pryzby@telsasoft.com> writes:
> On Sun, Mar 08, 2020 at 02:37:49PM -0400, Tom Lane wrote:
>> I guess we ought to change that function to use returns-a-tuplestore
>> protocol instead of thinking it can hold a directory open across calls.
>> It's not hard to think of use-cases where the existing behavior would
>> cause issues worse than a nanny-ish WARNING, especially on platforms
>> with tight "ulimit -n" limits.

> Thanks for the analysis.

> Do you mean it should enumerate all files during the initial SRF call, or use
> something other than the SRF_* macros ?

It has to enumerate all the files during the first call.  I suppose it
could do that and then still hand back the results one-at-a-time, but
there seems little point compared to filling a tuplestore immediately.
So probably the SRF_ macros are useless here.

Another possible solution is to register an exprstate-shutdown hook to
ensure the resource is cleaned up, but I'm not very happy with that
because it does nothing to prevent the hazard of overrunning the
available resources if you have several of these active at once.

I've just finished scanning the source code and concluding that all
of these functions are similarly broken:

pg_ls_dir
pg_ls_dir_files
pg_tablespace_databases
pg_logdir_ls_internal
pg_timezone_names
pgrowlocks

The first five risk leaking an open directory, the last risks leaking
an active tablescan and open relation.

I don't see anything in the documentation (either funcapi.h or
xfunc.sgml) warning that the function might not get run to completion,
either ...

            regards, tom lane



I wrote:
> I've just finished scanning the source code and concluding that all
> of these functions are similarly broken:
> pg_ls_dir
> pg_ls_dir_files
> pg_tablespace_databases
> pg_logdir_ls_internal
> pg_timezone_names
> pgrowlocks

BTW, another thing I noticed while looking around is that some of
the functions using SRF_RETURN_DONE() think they should clean up
memory beforehand.  This is a waste of code/cycles, as long as the
memory was properly allocated in funcctx->multi_call_memory_ctx,
because funcapi.c takes care of deleting that context.

We should probably document that *any* manual cleanup before
SRF_RETURN_DONE() is an antipattern.  If you have to have cleanup,
it needs to be done via RegisterExprContextCallback instead.

            regards, tom lane



On Sun, Mar 08, 2020 at 03:40:09PM -0400, Tom Lane wrote:
> Justin Pryzby <pryzby@telsasoft.com> writes:
> > On Sun, Mar 08, 2020 at 02:37:49PM -0400, Tom Lane wrote:
> >> I guess we ought to change that function to use returns-a-tuplestore
> >> protocol instead of thinking it can hold a directory open across calls.
> > Thanks for the analysis.
> > Do you mean it should enumerate all files during the initial SRF call, or use
> > something other than the SRF_* macros ?
> It has to enumerate all the files during the first call.  I suppose it

> I've just finished scanning the source code and concluding that all
> of these functions are similarly broken:
> pg_ls_dir_files

I patched this one to see what it looks like and to allow /hopefully/ moving
forward one way or another with the pg_ls_tmpfile() patch set (or at least
avoid trying to do anything there which is too inconsistent with this fix).

> I don't see anything in the documentation (either funcapi.h or
> xfunc.sgml) warning that the function might not get run to completion,
> either ...

Also, at first glance, these seem to be passing constant "randomAccess=true"
rather than (bool) (rsinfo->allowedModes&SFRM_Materialize_Random)

$ git grep -wl SFRM_Materialize |xargs grep -l 'tuplestore_begin_heap(true'
contrib/dblink/dblink.c
contrib/pageinspect/brinfuncs.c
contrib/pg_stat_statements/pg_stat_statements.c
src/backend/access/transam/xlogfuncs.c
src/backend/commands/event_trigger.c
src/backend/commands/extension.c
src/backend/foreign/foreign.c
src/backend/replication/logical/launcher.c
src/backend/replication/logical/logicalfuncs.c
src/backend/replication/logical/origin.c
src/backend/replication/slotfuncs.c
src/backend/replication/walsender.c
src/backend/storage/ipc/shmem.c
src/backend/utils/adt/pgstatfuncs.c
src/backend/utils/misc/guc.c
src/backend/utils/misc/pg_config.c

-- 
Justin

Attachment
Justin Pryzby <pryzby@telsasoft.com> writes:
>>> On Sun, Mar 08, 2020 at 02:37:49PM -0400, Tom Lane wrote:
>>>> I guess we ought to change that function to use returns-a-tuplestore
>>>> protocol instead of thinking it can hold a directory open across calls.

> I patched this one to see what it looks like and to allow /hopefully/ moving
> forward one way or another with the pg_ls_tmpfile() patch set (or at least
> avoid trying to do anything there which is too inconsistent with this fix).

I reviewed this, added some test cases, and pushed it, so that we can see
if the buildfarm finds anything wrong.  (I'm not expecting that, because
this should all be pretty portable, but you never know.)  Assuming not,
we need to fix the other functions similarly, and then do something about
revising the documentation to warn against this coding style.  Do you
want to have a go at that?

> Also, at first glance, these seem to be passing constant "randomAccess=true"
> rather than (bool) (rsinfo->allowedModes&SFRM_Materialize_Random)

Hm.  Not a bug, but possibly a performance issue, if the tuplestore
gets big enough for that to matter.  (I think it doesn't matter until
we start spilling to temp files.)

            regards, tom lane



On Wed, Mar 11, 2020 at 03:32:38PM -0400, Tom Lane wrote:
> > I patched this one to see what it looks like and to allow /hopefully/ moving
> > forward one way or another with the pg_ls_tmpfile() patch set (or at least
> > avoid trying to do anything there which is too inconsistent with this fix).
> 
> I reviewed this, added some test cases, and pushed it, so that we can see

Thanks, tests were on my periphery..

|    In passing, fix bogus error report for stat() failure: it was
|    whining about the directory when it should be fingering the
|    individual file.  Doubtless a copy-and-paste error.

Thanks again ; that was my 0001 patch on the other thread.  No rebase conflict
even ;)
https://www.postgresql.org/message-id/20191228101650.GG12890%40telsasoft.com

> Do you want to have a go at that?

First draft attached.  Note that I handled pg_ls_dir, even though I'm proposing
on the other thread to collapse/merge/meld it with pg_ls_dir_files [0].
Possibly that's a bad idea with tuplestore, due to returning a scalar vs a row
and needing to conditionally call CreateTemplateTupleDesc vs
get_call_result_type.  I'll rebase that patch later today.

I didn't write test cases yet.  Also didn't look for functions not on your
list.

I noticed this doesn't actually do anything, but kept it for now...except in
pg_ls_dir error case:

src/include/utils/tuplestore.h:/* tuplestore_donestoring() used to be required, but is no longer used */
src/include/utils/tuplestore.h:#define tuplestore_donestoring(state)    ((void) 0)

I found a few documentation bits that I think aren't relevant, but could
possibly be misread to encourage the bad coding practice.  This is about *sql*
functions:

|37.5.8. SQL Functions Returning Sets
|When an SQL function is declared as returning SETOF sometype, the function's
|final query is executed TO COMPLETION, and each row it outputs is returned as
|an element of the result set.
|...
|Set-returning functions in the select list are always evaluated as though they
|are on the inside of a nested-loop join with the rest of the FROM clause, so
|that the function(s) are run TO COMPLETION before the next row from the FROM
|clause is considered.

-- 
Justin

[0] https://www.postgresql.org/message-id/20200310183037.GA29065%40telsasoft.com
v9-0008-generalize-pg_ls_dir_files-and-retire-pg_ls_dir.patch

Attachment
On Sun, Mar 08, 2020 at 04:30:44PM -0400, Tom Lane wrote:
> BTW, another thing I noticed while looking around is that some of
> the functions using SRF_RETURN_DONE() think they should clean up
> memory beforehand.  This is a waste of code/cycles, as long as the
> memory was properly allocated in funcctx->multi_call_memory_ctx,
> because funcapi.c takes care of deleting that context.
> 
> We should probably document that *any* manual cleanup before
> SRF_RETURN_DONE() is an antipattern.  If you have to have cleanup,
> it needs to be done via RegisterExprContextCallback instead.

This part appears to be already in place since
e4186762ffaa4188e16702e8f4f299ea70988b96:

|The memory context that is current when the SRF is called is a transient
|context that will be cleared between calls. This means that you do not need to
|call pfree on everything you allocated using palloc; it will go away anyway.
|However, if you want to allocate any data structures to live across calls, you
|need to put them somewhere else. The memory context referenced by
|multi_call_memory_ctx is a suitable location for any data that needs to survive
|until the SRF is finished running. In most cases, this means that you should
|switch into multi_call_memory_ctx while doing the first-call setup.

-- 
Justin



Nitpick: please see c4dcd9144ba6.

> From: Justin Pryzby <pryzbyj@telsasoft.com>
> Date: Wed, 11 Mar 2020 10:09:18 -0500
> Subject: [PATCH] SRF: avoid leaking resources if not run to completion
> 
> Change to return a tuplestore populated immediately and returned in full.
> 
> Discussion: https://www.postgresql.org/message-id/20200308173103.GC1357%40telsasoft.com

I wonder if this isn't saying that the whole value-per-call protocol is
bogus, in that it seems impossible to write a useful function with it.
Maybe we should add one final call with a special flag "function
shutdown" or something, so that these resources can be released if the
SRF isn't run to completion?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> I wonder if this isn't saying that the whole value-per-call protocol is
> bogus, in that it seems impossible to write a useful function with it.

Only if you have a *very* narrow definition of "useful function".
If you look through SRF_RETURN_DONE callers, only a small minority
are trying to do resource cleanup beforehand.

> Maybe we should add one final call with a special flag "function
> shutdown" or something, so that these resources can be released if the
> SRF isn't run to completion?

We already have an appropriate mechanism for cleaning up resources,
ie RegisterExprContextCallback.  I do not think what you're suggesting
could be made to work without an incompatible break in the API for
SRFs, so it's not an improvement over telling people more forcefully
about how to do resource cleanup correctly.

            regards, tom lane



On Thu, Mar 12, 2020 at 07:11:56AM -0500, Justin Pryzby wrote:
> > Do you want to have a go at that?
> 
> First draft attached.  Note that I handled pg_ls_dir, even though I'm proposing
> on the other thread to collapse/merge/meld it with pg_ls_dir_files [0].
> Possibly that's a bad idea with tuplestore, due to returning a scalar vs a row
> and needing to conditionally call CreateTemplateTupleDesc vs
> get_call_result_type.  I'll rebase that patch later today.
> 
> I didn't write test cases yet.  Also didn't look for functions not on your
> list.
> 
> I noticed this doesn't actually do anything, but kept it for now...except in
> pg_ls_dir error case:
> 
> src/include/utils/tuplestore.h:/* tuplestore_donestoring() used to be required, but is no longer used */
> src/include/utils/tuplestore.h:#define tuplestore_donestoring(state)    ((void) 0)

v2 attached - I will add to next CF in case you want to defer it until later.

-- 
Justin

Attachment
Justin Pryzby <pryzby@telsasoft.com> writes:
> v2 attached - I will add to next CF in case you want to defer it until later.

Thanks, reviewed and pushed.  Since this is a bug fix (at least in part)
I didn't want to wait.

            regards, tom lane



On Mon, Mar 16, 2020 at 09:38:50PM -0400, Tom Lane wrote:
> Justin Pryzby <pryzby@telsasoft.com> writes:
> > v2 attached - I will add to next CF in case you want to defer it until later.
> 
> Thanks, reviewed and pushed.  Since this is a bug fix (at least in part)
> I didn't want to wait.

Thanks for fixing my test case and pushing.

-- 
Justin



Justin Pryzby <pryzby@telsasoft.com> writes:
> Thanks for fixing my test case and pushing.

The buildfarm just showed up another instability in the test cases
we added:

=========================== regression.diffs ================
diff -U3 /home/bf/build/buildfarm-idiacanthus/HEAD/pgsql.build/../pgsql/src/test/regress/expected/misc_functions.out
/home/bf/build/buildfarm-idiacanthus/HEAD/pgsql.build/src/bin/pg_upgrade/tmp_check/regress/results/misc_functions.out
--- /home/bf/build/buildfarm-idiacanthus/HEAD/pgsql.build/../pgsql/src/test/regress/expected/misc_functions.out
2020-03-1708:14:50.292037956 +0100 
+++
/home/bf/build/buildfarm-idiacanthus/HEAD/pgsql.build/src/bin/pg_upgrade/tmp_check/regress/results/misc_functions.out
2020-03-2813:55:12.490024822 +0100 
@@ -169,11 +169,7 @@

 select (w).size = :segsize as ok
 from (select pg_ls_waldir() w) ss where length((w).name) = 24 limit 1;
- ok
-----
- t
-(1 row)
-
+ERROR:  could not stat file "pg_wal/000000010000000000000078": No such file or directory
 select count(*) >= 0 as ok from pg_ls_archive_statusdir();
  ok
 ----

It's pretty obvious what happened here: concurrent activity renamed or
removed the WAL segment between when we saw it in the directory and
when we tried to stat() it.

This seems like it would be just as much of a hazard for field usage
as it is for regression testing, so I propose that we fix these
directory-scanning functions to silently ignore ENOENT failures from
stat().  Are there any for which we should not do that?

            regards, tom lane



On Sat, Mar 28, 2020 at 01:13:54PM -0400, Tom Lane wrote:
> The buildfarm just showed up another instability in the test cases
> we added:

Yea, as you said, this is an issue with the *testcase*.  The function behavior
didn't change, we just weren't previously exercising it.

>  select (w).size = :segsize as ok
>  from (select pg_ls_waldir() w) ss where length((w).name) = 24 limit 1;
> - ok 
> -----
> - t
> -(1 row)
> -
> +ERROR:  could not stat file "pg_wal/000000010000000000000078": No such file or directory
>  select count(*) >= 0 as ok from pg_ls_archive_statusdir();
>   ok 
>  ----
> 
> It's pretty obvious what happened here: concurrent activity renamed or
> removed the WAL segment between when we saw it in the directory and
> when we tried to stat() it.
> 
> This seems like it would be just as much of a hazard for field usage
> as it is for regression testing,

That's clearly true for pg_ls_waldir(), which call pg_ls_dir_files, and
includes some metadata columns.

> so I propose that we fix these directory-scanning functions to silently
> ignore ENOENT failures from stat().  Are there any for which we should not do
> that?

I think it's reasonable to ignore transient ENOENT for tmpdir, logdir, and
probably archive_statusdir.  That doesn't currently affect pg_ls_dir(), which
lists file but not metadata for an arbitrary dir, so doesn't call stat().

Note that dangling links in the other functions currently cause (wrong [0])
error.  I guess it should be documented if broken link is will be ignored due
to ENOENT.

Maybe we should lstat() the file to determine if it's a dangling link; if
lstat() fails, then skip it.  Currently, we use stat(), which shows metdata of
a link's *target*.  Maybe we'd change that.

Note that I have a patch which generalizes pg_ls_dir_files and makes
pg_ls_dir() a simple wrapper, so if that's pursued, they would behave the same
unless I add another flag to do otherwise (but behaving the same has its
merits).  It already uses lstat() to show links to dirs as isdir=no, which was
needed to avoid recursing into links-to-dirs in the new helper function
pg_ls_dir_recurse().  https://commitfest.postgresql.org/26/2377/

[0] Which you fixed in 085b6b667 and I previously fixed at:
https://www.postgresql.org/message-id/attachment/106478/v2-0001-BUG-in-errmsg.patch
|$ sudo ln -s foo /var/log/postgresql/bar
|ts=# SELECT * FROM pg_ls_logdir() ORDER BY 3;
|ERROR:  could not stat directory "/var/log/postgresql": No such file or directory



Justin Pryzby <pryzby@telsasoft.com> writes:
> On Sat, Mar 28, 2020 at 01:13:54PM -0400, Tom Lane wrote:
>> so I propose that we fix these directory-scanning functions to silently
>> ignore ENOENT failures from stat().  Are there any for which we should not do
>> that?

> Maybe we should lstat() the file to determine if it's a dangling link; if
> lstat() fails, then skip it.  Currently, we use stat(), which shows metdata of
> a link's *target*.  Maybe we'd change that.

Hm, good point that ENOENT could refer to a symlink's target.  Still,
I'm not sure it's worth going out of our way to disambiguate that,
given that these directories aren't really supposed to contain symlinks.
(And on the third hand, if they aren't supposed to, then maybe these
functions needn't look through any symlinks?  In which case just
substituting lstat for stat would resolve the ambiguity.)

> Note that I have a patch which generalizes pg_ls_dir_files and makes
> pg_ls_dir() a simple wrapper, so if that's pursued, they would behave the same
> unless I add another flag to do otherwise (but behaving the same has its
> merits).  It already uses lstat() to show links to dirs as isdir=no, which was
> needed to avoid recursing into links-to-dirs in the new helper function
> pg_ls_dir_recurse().  https://commitfest.postgresql.org/26/2377/

I think we need a back-patchable fix for the ENOENT failure, seeing that
we back-patched the new regression test; intermittent buildfarm failures
are no fun in any branch.  So new functions aren't too relevant here,
although it's fair to look ahead at whether the same behavior will be
appropriate for them.

            regards, tom lane



I wrote:
> Justin Pryzby <pryzby@telsasoft.com> writes:
>> Maybe we should lstat() the file to determine if it's a dangling link; if
>> lstat() fails, then skip it.  Currently, we use stat(), which shows metdata of
>> a link's *target*.  Maybe we'd change that.

> Hm, good point that ENOENT could refer to a symlink's target.  Still,
> I'm not sure it's worth going out of our way to disambiguate that,
> given that these directories aren't really supposed to contain symlinks.
> (And on the third hand, if they aren't supposed to, then maybe these
> functions needn't look through any symlinks?  In which case just
> substituting lstat for stat would resolve the ambiguity.)

After looking at the callers of pg_ls_dir_files, and noticing that
it's already defined to ignore anything that's not a regular file,
I think switching to lstat makes sense.

I also grepped the other uses of ReadDir[Extended], and didn't see
any other ones that seemed desperately in need of changing.

So the attached seems like a sufficient fix.

            regards, tom lane

diff --git a/src/backend/utils/adt/genfile.c b/src/backend/utils/adt/genfile.c
index 01185f2..8429a12 100644
--- a/src/backend/utils/adt/genfile.c
+++ b/src/backend/utils/adt/genfile.c
@@ -596,10 +596,15 @@ pg_ls_dir_files(FunctionCallInfo fcinfo, const char *dir, bool missing_ok)

         /* Get the file info */
         snprintf(path, sizeof(path), "%s/%s", dir, de->d_name);
-        if (stat(path, &attrib) < 0)
+        if (lstat(path, &attrib) < 0)
+        {
+            /* Ignore concurrently-deleted files, else complain */
+            if (errno == ENOENT)
+                continue;
             ereport(ERROR,
                     (errcode_for_file_access(),
                      errmsg("could not stat file \"%s\": %m", path)));
+        }

         /* Ignore anything but regular files */
         if (!S_ISREG(attrib.st_mode))

On Sun, Mar 29, 2020 at 12:37:05PM -0400, Tom Lane wrote:
> I wrote:
> > Justin Pryzby <pryzby@telsasoft.com> writes:
> >> Maybe we should lstat() the file to determine if it's a dangling link; if
> >> lstat() fails, then skip it.  Currently, we use stat(), which shows metdata of
> >> a link's *target*.  Maybe we'd change that.
> 
> > Hm, good point that ENOENT could refer to a symlink's target.  Still,
> > I'm not sure it's worth going out of our way to disambiguate that,
> > given that these directories aren't really supposed to contain symlinks.
> > (And on the third hand, if they aren't supposed to, then maybe these
> > functions needn't look through any symlinks?  In which case just
> > substituting lstat for stat would resolve the ambiguity.)
> 
> After looking at the callers of pg_ls_dir_files, and noticing that
> it's already defined to ignore anything that's not a regular file,
> I think switching to lstat makes sense.

Yea, only pg_ls_dir() shows special file types (and currently the others even
hide dirs).

The essence of your patch is to ignore ENOENT, but you also changed to use
lstat(), which seems unrelated.  That means we'll now hide (non-broken)
symlinks.  Is that intentional/needed ?  I guess maybe you're trying to fix the
bug (?) that symlinks aren't skipped?  If so, I guess it should be a separate
commit, or the commit message should say so.  I think the doc update is already
handled by: 8b6d94cf6c8319bfd6bebf8b863a5db586c19c3b (we didn't used to say we
skipped specials, and now we say we do, and we'll to follow through RSN and
actually do it, too).

> diff --git a/src/backend/utils/adt/genfile.c b/src/backend/utils/adt/genfile.c
> index 01185f2..8429a12 100644
> --- a/src/backend/utils/adt/genfile.c
> +++ b/src/backend/utils/adt/genfile.c
> @@ -596,10 +596,15 @@ pg_ls_dir_files(FunctionCallInfo fcinfo, const char *dir, bool missing_ok)
>  
>          /* Get the file info */
>          snprintf(path, sizeof(path), "%s/%s", dir, de->d_name);
> -        if (stat(path, &attrib) < 0)
> +        if (lstat(path, &attrib) < 0)
> +        {
> +            /* Ignore concurrently-deleted files, else complain */
> +            if (errno == ENOENT)
> +                continue;
>              ereport(ERROR,
>                      (errcode_for_file_access(),
>                       errmsg("could not stat file \"%s\": %m", path)));
> +        }
>  
>          /* Ignore anything but regular files */
>          if (!S_ISREG(attrib.st_mode))




Justin Pryzby <pryzby@telsasoft.com> writes:
> On Sun, Mar 29, 2020 at 12:37:05PM -0400, Tom Lane wrote:
>> After looking at the callers of pg_ls_dir_files, and noticing that
>> it's already defined to ignore anything that's not a regular file,
>> I think switching to lstat makes sense.

> Yea, only pg_ls_dir() shows special file types (and currently the others even
> hide dirs).

> The essence of your patch is to ignore ENOENT, but you also changed to use
> lstat(), which seems unrelated.  That means we'll now hide (non-broken)
> symlinks.  Is that intentional/needed ?

Well, the following comment says "ignore anything but regular files",
so I'm supposing that that is the behavior that we actually want here
and failed to implement correctly.  There might be scope for
additional directory-reading functions, but I'd think you'd want
more information (such as the file type) returned from anything
that doesn't act this way.

In practice, since these directories shouldn't contain symlinks,
it's likely moot.  The only place in PG data directories where
we actually expect symlinks is pg_tablespace ... and that contains
symlinks to directories, so that this function would ignore them
anyway.

            regards, tom lane



On Sun, Mar 29, 2020 at 01:22:04PM -0400, Tom Lane wrote:
> Justin Pryzby <pryzby@telsasoft.com> writes:
> > On Sun, Mar 29, 2020 at 12:37:05PM -0400, Tom Lane wrote:
> >> After looking at the callers of pg_ls_dir_files, and noticing that
> >> it's already defined to ignore anything that's not a regular file,
> >> I think switching to lstat makes sense.
> 
> > Yea, only pg_ls_dir() shows special file types (and currently the others even
> > hide dirs).
> 
> > The essence of your patch is to ignore ENOENT, but you also changed to use
> > lstat(), which seems unrelated.  That means we'll now hide (non-broken)
> > symlinks.  Is that intentional/needed ?
> 
> Well, the following comment says "ignore anything but regular files",
> so I'm supposing that that is the behavior that we actually want here
> and failed to implement correctly.  There might be scope for
> additional directory-reading functions, but I'd think you'd want
> more information (such as the file type) returned from anything
> that doesn't act this way.

Maybe pg_stat_file() deserves similar attention ?  Right now, it'll fail on a
broken link.  If we changed it to lstat(), then it'd work, but it'd also show
metadata for the *link* rather than its target.

Patch proposed as v14-0001 patch here may be relevant:
https://www.postgresql.org/message-id/20200317190401.GY26184%40telsasoft.com
-    indicating if it is a directory.  Typical usages include:
+    indicating if it is a directory (or a symbolic link to a directory).
...

> In practice, since these directories shouldn't contain symlinks,
> it's likely moot.  The only place in PG data directories where
> we actually expect symlinks is pg_tablespace ... and that contains
> symlinks to directories, so that this function would ignore them
> anyway.

I wouldn't hesitate to make symlinks, at least in log.  It's surprising when
files are hidden, but I won't argue about the best behavior here.

I'm thinking of distributions or local configurations that use
/var/log/postgresql.  I didn't remember or didn't realize, but it looks like
debian's packages use logging_collector=off and then launch postmaster with
2> /var/log/postgres/...  It seems reasonable to do something like:
log/huge-querylog.csv => /zfs/compressed/...

-- 
Justin



Hello Justin,

>> Well, the following comment says "ignore anything but regular files",
>> so I'm supposing that that is the behavior that we actually want here
>> and failed to implement correctly.  There might be scope for
>> additional directory-reading functions, but I'd think you'd want
>> more information (such as the file type) returned from anything
>> that doesn't act this way.
>
> Maybe pg_stat_file() deserves similar attention ?  Right now, it'll fail on a
> broken link.  If we changed it to lstat(), then it'd work, but it'd also show
> metadata for the *link* rather than its target.

Yep. I think this traditional answer is the rational answer.

As I wrote about an earlier version of the patch, ISTM that instead of 
reinventing, extending, adapting various ls variants (with/without 
metadata, which show only files, which shows target of links, which shows 
directory, etc.) we would just need *one* postgres "ls" implementation 
which would be like "ls -la arg" (returns file type, dates), and then 
everything else is a wrapper around that with appropriate filtering that 
can be done at the SQL level, like you started with recurse.

It would reduce the amount of C code and I find the SQL-level approach 
quite elegant.

-- 
Fabien.



Fabien COELHO <coelho@cri.ensmp.fr> writes:
> As I wrote about an earlier version of the patch, ISTM that instead of 
> reinventing, extending, adapting various ls variants (with/without 
> metadata, which show only files, which shows target of links, which shows 
> directory, etc.) we would just need *one* postgres "ls" implementation 
> which would be like "ls -la arg" (returns file type, dates), and then 
> everything else is a wrapper around that with appropriate filtering that 
> can be done at the SQL level, like you started with recurse.

Yeah, I agree that some new function that can represent symlinks
explicitly in its output is the place to deal with this, for
people who want to deal with it.

In the meantime, there's still the question of what pg_ls_dir_files
should do exactly.  Are we content to have it ignore symlinks?
I remain inclined to think that's the right thing given its current
brief.

            regards, tom lane



Hello,

>> As I wrote about an earlier version of the patch, ISTM that instead of
>> reinventing, extending, adapting various ls variants (with/without
>> metadata, which show only files, which shows target of links, which shows
>> directory, etc.) we would just need *one* postgres "ls" implementation
>> which would be like "ls -la arg" (returns file type, dates), and then
>> everything else is a wrapper around that with appropriate filtering that
>> can be done at the SQL level, like you started with recurse.
>
> Yeah, I agree that some new function that can represent symlinks
> explicitly in its output is the place to deal with this, for
> people who want to deal with it.
>
> In the meantime, there's still the question of what pg_ls_dir_files
> should do exactly.  Are we content to have it ignore symlinks?
> I remain inclined to think that's the right thing given its current
> brief.

My 0.02€:

I agree that it is enough to reproduce the current behavior of various 
existing pg_ls* functions, but on the other hand outputing a column type 
char like ls (-, d, l…) looks like really no big deal. I'd say that the 
only reason not to do it may be to pass this before feature freeze.

-- 
Fabien.
On Tue, Mar 31, 2020 at 07:36:03AM +0200, Fabien COELHO wrote:
> > > As I wrote about an earlier version of the patch, ISTM that instead of
> > > reinventing, extending, adapting various ls variants (with/without
> > > metadata, which show only files, which shows target of links, which shows
> > > directory, etc.) we would just need *one* postgres "ls" implementation
> > > which would be like "ls -la arg" (returns file type, dates), and then
> > > everything else is a wrapper around that with appropriate filtering that
> > > can be done at the SQL level, like you started with recurse.
> > 
> > Yeah, I agree that some new function that can represent symlinks
> > explicitly in its output is the place to deal with this, for
> > people who want to deal with it.
> > 
> > In the meantime, there's still the question of what pg_ls_dir_files
> > should do exactly.  Are we content to have it ignore symlinks?
> > I remain inclined to think that's the right thing given its current
> > brief.
> 
> My 0.02€:
> 
> I agree that it is enough to reproduce the current behavior of various
> existing pg_ls* functions, but on the other hand outputing a column type
> char like ls (-, d, l…) looks like really no big deal. I'd say that the only
> reason not to do it may be to pass this before feature freeze.

Remember, there's two threads here, and this one is about the bug in stable
releases ($SUBJECT), and now the instability in the test that was added with
its fix.

I suggest to leave stat() alone in your patch for stable releases.  I think
it's okay if we change behavior so that a broken symlink is skipped instead of
erroring (as a side effect of skipping ENOENT with stat()).  But not okay if we
change pg_ls_logdir() to hide symlinks in back braches.

-- 
Justin



Justin Pryzby <pryzby@telsasoft.com> writes:
> I suggest to leave stat() alone in your patch for stable releases.  I think
> it's okay if we change behavior so that a broken symlink is skipped instead of
> erroring (as a side effect of skipping ENOENT with stat()).  But not okay if we
> change pg_ls_logdir() to hide symlinks in back braches.

Meh.  I'm not really convinced, but in the absence of anyone expressing
support for my position, I'll do it that way.  I don't think it's worth
doing both a stat and lstat to tell the difference between file-is-gone
and file-is-a-broken-symlink.

            regards, tom lane



Re: pg11+: pg_ls_*dir LIMIT 1: temporary files .. not closed at end-of-transaction

From
Alexander Lakhin
Date:
Hello hackers,
31.03.2020 19:41, Tom Lane wrote:
> Justin Pryzby <pryzby@telsasoft.com> writes:
>> I suggest to leave stat() alone in your patch for stable releases.  I think
>> it's okay if we change behavior so that a broken symlink is skipped instead of
>> erroring (as a side effect of skipping ENOENT with stat()).  But not okay if we
>> change pg_ls_logdir() to hide symlinks in back braches.
> Meh.  I'm not really convinced, but in the absence of anyone expressing
> support for my position, I'll do it that way.  I don't think it's worth
> doing both a stat and lstat to tell the difference between file-is-gone
> and file-is-a-broken-symlink.
As we've discovered in Bug #[16161], stat() for "concurrently-deleted
file" can also return ERROR_ACCESS_DENIED on Windows. It seems that
pg_upgradeCheck failures seen on
https://buildfarm.postgresql.org/cgi-bin/show_history.pl?nm=fairywren&br=REL_13_STABLE
caused by the same issue.
Shouldn't pg_ls_dir_files() retry stat() on ERROR_ACCESS_DENIED just
like the pgwin32_open() does to ignore files in "delete pending" state?

[16161]
https://www.postgresql.org/message-id/16161-7a985d2f1bbe8f71%40postgresql.org

Best regards,
Alexander



Alexander Lakhin <exclusion@gmail.com> writes:
> Shouldn't pg_ls_dir_files() retry stat() on ERROR_ACCESS_DENIED just
> like the pgwin32_open() does to ignore files in "delete pending" state?

That would soon lead us to changing every stat() caller in the system
to have Windows-specific looping logic.  No thanks.  If we need to do
this, let's put in a Windows wrapper layer comparable to pgwin32_open()
for open().

            regards, tom lane



Re: pg11+: pg_ls_*dir LIMIT 1: temporary files .. not closed at end-of-transaction

From
Alexander Lakhin
Date:
13.11.2020 23:16, Tom Lane wrote:
> Alexander Lakhin <exclusion@gmail.com> writes:
>> Shouldn't pg_ls_dir_files() retry stat() on ERROR_ACCESS_DENIED just
>> like the pgwin32_open() does to ignore files in "delete pending" state?
> That would soon lead us to changing every stat() caller in the system
> to have Windows-specific looping logic.  No thanks.  If we need to do
> this, let's put in a Windows wrapper layer comparable to pgwin32_open()
> for open().
Maybe pgwin32_safestat() should perform this... For now it checks
GetLastError() for ERROR_DELETE_PENDING, but as we've found out
previously this error code in fact is not returned by the OS.
And if the fix is not going to be quick, probably it should be discussed
in another thread...

Best regards,
Alexander



Alexander Lakhin <exclusion@gmail.com> writes:
> 13.11.2020 23:16, Tom Lane wrote:
>> That would soon lead us to changing every stat() caller in the system
>> to have Windows-specific looping logic.  No thanks.  If we need to do
>> this, let's put in a Windows wrapper layer comparable to pgwin32_open()
>> for open().

> Maybe pgwin32_safestat() should perform this...

Uh ... now that you mention it, that's gone since bed90759f.

There is code in win32stat.c that purports to cope with this case, though
I've not tested it personally.

            regards, tom lane