Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*) - Mailing list pgsql-hackers
From | Michael Paquier |
---|---|
Subject | Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*) |
Date | |
Msg-id | Yi7KYhyweiLf6XXc@paquier.xyz Whole thread Raw |
In response to | Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*) (Justin Pryzby <pryzby@telsasoft.com>) |
Responses |
Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*)
Re: pg_ls_tmpdir to show directories and shared filesets (and pg_ls_*) |
List | pgsql-hackers |
On Wed, Mar 09, 2022 at 10:50:45AM -0600, Justin Pryzby wrote: > I also changed pg_ls_dir_recurse() to handle concurrent removal of a dir, which > I noticed caused an infrequent failure on CI. However I'm not including that > here, since the 2nd half of the patch set seems isn't ready due to lstat() on > windows. lstat() has been a subject of many issues over the years with our internal emulation and issues related to its concurrency, but we use it in various areas of the in-core code, so that does not sound like an issue to me. It depends on what you want to do with it in genfile.c and which data you'd expect, in addition to the detection of junction points for WIN32, I guess. v34 has no references to pg_ls_dir_recurse(), but that's a WITH RECURSIVE, so we would not really need it, do we? @@ -27618,7 +27618,7 @@ SELECT convert_from(pg_read_binary_file('file_in_utf8.txt'), 'UTF8'); Returns a record containing the file's size, last access time stamp, last modification time stamp, last file status change time stamp (Unix platforms only), file creation time stamp (Windows only), and a flag - indicating if it is a directory. + indicating if it is a directory (or a symbolic link to a directory). </para> <para> This function is restricted to superusers by default, but other users This is from 0001, and this addition in the documentation is not completely right. As pg_stat_file() uses stat() to get back the information of a file/directory, we'd just follow the link if specifying one in the input argument. We could say instead, if we were to improve the docs, that "If filename is a link, this function returns information about the file or directory the link refers to." I would put that as a different paragraph. +select * from pg_ls_archive_statusdir() limit 0; + name | size | modification +------+------+-------------- +(0 rows) FWIW, this one is fine as of ValidateXLOGDirectoryStructure() that would make sure archive_status exists before any connection is attempted to the cluster. > +select * from pg_ls_logdir() limit 0; This test on pg_ls_logdir() would fail if running installcheck on a cluster that has logging_collector disabled. So this cannot be included. +select * from pg_ls_logicalmapdir() limit 0; +select * from pg_ls_logicalsnapdir() limit 0; +select * from pg_ls_replslotdir('') limit 0; +select * from pg_ls_tmpdir() limit 0; +select * from pg_ls_waldir() limit 0; +select * from pg_stat_file('.') limit 0; The rest of the patch set should be stable AFAIK, there are various steps when running a checkpoint that makes sure that any of these exist, without caring about the value of wal_level. + <para> + For each file in the specified directory, list the file and its + metadata. + Restricted to superusers by default, but other users can be granted + EXECUTE to run the function. + </para></entry> What is metadata in this case? (I have read the code and know what you mean, but folks only looking at the documentation may be puzzled by that). It could be cleaner to use the same tupledesc for any callers of this function, and return NULL in cases these are not adapted. + /* check the optional arguments */ + if (PG_NARGS() == 3) + { + if (!PG_ARGISNULL(1)) + { + if (PG_GETARG_BOOL(1)) + flags |= LS_DIR_MISSING_OK; + else + flags &= ~LS_DIR_MISSING_OK; + } + + if (!PG_ARGISNULL(2)) + { + if (PG_GETARG_BOOL(2)) + flags &= ~LS_DIR_SKIP_DOT_DIRS; + else + flags |= LS_DIR_SKIP_DOT_DIRS; + } + } The subtle different between the false and true code paths of those arguments 1 and 2 had better be explained? The bit-wise operations are slightly different in both cases, so it is not clear which part does what, and what's the purpose of this logic. - SetSingleFuncCall(fcinfo, 0); + /* isdir depends on metadata */ + Assert(!(flags&LS_DIR_ISDIR) || (flags&LS_DIR_METADATA)); + /* Unreasonable to show isdir and skip dirs */ + Assert(!(flags&LS_DIR_ISDIR) || !(flags&LS_DIR_SKIP_DIRS)); Incorrect code format. Spaces required. +-- This tests the missing_ok parameter, which causes pg_ls_tmpdir to succeed even if the tmpdir doesn't exist yet +-- The name='' condition is never true, so the function runs to completion but returns zero rows. +-- The query is written to ERROR if the tablespace doesn't exist, rather than silently failing to call pg_ls_tmpdir() +SELECT c.* FROM (SELECT oid FROM pg_tablespace b WHERE b.spcname='regress_tblspace' UNION SELECT 0 ORDER BY 1 DESC LIMIT 1) AS b , pg_ls_tmpdir(oid) AS c WHERE c.name='Does not exist'; So, here, we have a test that may not actually test what we want to test. Hmm. I am not convinced that we need a new set of SQL functions as presented in 0003 (addition of meta-data in pg_ls_dir()), and extensions of 0004 (do the same but for pg_ls_tmpdir) and 0005 (same for the other pg_ls* functions). The changes of pg_ls_dir_files() make in my opinion the code harder to follow as the resulting tuple size depends on the type of flag used, and you can already retrieve the rest of the information with a join, probably LATERAL, on pg_stat_file(), no? The same can be said about 0007, actually. The addition of isdir for any of the paths related to pg_logical/ and the replication slots has also a limited interest, because we know already those contents, and these are not directories as far as I recall. 0006 invokes a behavior change for pg_ls_logdir(), where it makes sense to me to fail if the directory does not exist, so I am not in favor of that. In the whole set, improving the docs as of 0001 makes sense, but the change is incomplete. Most of 0002 also makes sense and should be stable enough. I am less enthusiastic about any of the other changes proposed and what we can gain from these parts. -- Michael
Attachment
pgsql-hackers by date: