Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*) - Mailing list pgsql-hackers

From Justin Pryzby
Subject Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)
Date
Msg-id 20220315023725.GW28503@telsasoft.com
Whole thread Raw
In response to Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)  (Michael Paquier <michael@paquier.xyz>)
Responses Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)
Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)
List pgsql-hackers
On Mon, Mar 14, 2022 at 01:53:54PM +0900, Michael Paquier wrote:
> On Wed, Mar 09, 2022 at 10:50:45AM -0600, Justin Pryzby wrote:
> > I also changed pg_ls_dir_recurse() to handle concurrent removal of a dir, which
> > I noticed caused an infrequent failure on CI.  However I'm not including that
> > here, since the 2nd half of the patch set seems isn't ready due to lstat() on
> > windows.
> 
> lstat() has been a subject of many issues over the years with our
> internal emulation and issues related to its concurrency, but we use
> it in various areas of the in-core code, so that does not sound like
> an issue to me.  It depends on what you want to do with it in
> genfile.c and which data you'd expect, in addition to the detection of
> junction points for WIN32, I guess.

To avoid cycles, a recursion function would need to know whether to recurse
into a directory or to output that something is isdir=false or type=link, and
avoid recursing into it. 

> pg_ls_dir_recurse(), but that's a WITH RECURSIVE, so we would not
> really need it, do we?

Tom disliked it when I had written it as a recursive CTE, so I rewrote it in C.
129225.1606166058@sss.pgh.pa.us

> Hmm.  I am not convinced that we need a new set of SQL functions as
> presented in 0003 (addition of meta-data in pg_ls_dir()), and
> extensions of 0004 (do the same but for pg_ls_tmpdir) and 0005 (same
> for the other pg_ls* functions).  The changes of pg_ls_dir_files()
> make in my opinion the code harder to follow as the resulting tuple 
> size depends on the type of flag used, and you can already retrieve
> the rest of the information with a join, probably LATERAL, on
> pg_stat_file(), no?  The same can be said about 0007, actually.

Yes, one can get the same information with a lateral join (as I said 2 years
ago).  But it's more helpful to provide a function, rather than leave that to
people to figure out - possibly incorrectly, or badly, like by parsing the
output of COPY FROM PROGRAM 'ls -l'.  The query to handle tablespaces is
particularly obscure:
20200310183037.GA29065@telsasoft.com
20201223191710.GR30237@telsasoft.com

One could argue that most of the pg_ls_* functions aren't needed (including
1922d7c6e), since the same things are possible with pg_ls_dir() and
pg_stat_file().
|1922d7c6e Add SQL functions to monitor the directory contents of replication slots

The original, minimal goal of this patch was to show shared tempdirs in
pg_ls_tmpfile() - rather than hiding them misleadingly as currently happens.
20200310183037.GA29065@telsasoft.com
20200313131232.GO29065@telsasoft.com

I added the metadata function 2 years ago since it's silly to show metadata for
tmpdir but not other, arbitrary directories.
20200310183037.GA29065@telsasoft.com
20200313131232.GO29065@telsasoft.com
20201223191710.GR30237@telsasoft.com

> The addition of isdir for any of the paths related to pg_logical/ and
> the replication slots has also a limited interest, because we know
> already those contents, and these are not directories as far as I
> recall.

Except when we don't, since extensions can do things that core doesn't, as
Fabien pointed out.
alpine.DEB.2.21.2001160927390.30419@pseudo

> In the whole set, improving the docs as of 0001 makes sense, but the
> change is incomplete.  Most of 0002 also makes sense and should be
> stable enough.  I am less enthusiastic about any of the other changes
> proposed and what we can gain from these parts.

It is frustrating to hear this feedback now, after the patch has gone through
multiple rewrites over 2 years - based on other positive feedback and review.
I went to the effort to ask, numerous times, whether to write the patch and how
its interfaces should look.  Now, I'm hearing that not only the implementation
but its goals are wrong.  What should I have done to avoid that ?

20200503024215.GJ28974@telsasoft.com
20191227195918.GF12890@telsasoft.com
20200116003924.GJ26045@telsasoft.com
20200908195126.GB18552@telsasoft.com



pgsql-hackers by date:

Previous
From: "shiy.fnst@fujitsu.com"
Date:
Subject: RE: Column Filtering in Logical Replication
Next
From: Michael Paquier
Date:
Subject: Re: Estimating HugePages Requirements?