Thread: Can we avoid chdir'ing in resolve_symlinks() ?

Can we avoid chdir'ing in resolve_symlinks() ?

From
Tom Lane
Date:
find_my_exec() wants to obtain an absolute, symlink-free path
to the program's own executable, for what seem to me good
reasons.  However, chasing down symlinks is left to its
subroutine resolve_symlinks(), which does this:

     * To resolve a symlink properly, we have to chdir into its directory and
     * then chdir to where the symlink points; otherwise we may fail to
     * resolve relative links correctly (consider cases involving mount
     * points, for example).  After following the final symlink, we use
     * getcwd() to figure out where the heck we're at.

and then afterwards it has to chdir back to the original cwd.
That last step is a bit of a sore spot, because sometimes
(especially in sudo situations) we may not have the privileges
necessary to do that; I think this is the cause of the complaint
at [1].  Anyway the whole thing seems a bit excessively Rube
Goldbergian.  I'm wondering why we couldn't just read the
symlink(s), concatenate them together, and use canonicalize_path()
to clean up any mess.

This code was mine originally (336969e49), but I sure don't
remember why I wrote it like that.  I know we didn't have a
robust version of canonicalize_path() then, and that may have
been the main issue, but that offhand comment about mount
points bothers me.  But I can't reconstruct precisely what
I was worried about there.  The only contemporaneous discussion
thread I can find is [2], which doesn't go into coding details.

Thoughts?

            regards, tom lane

[1] https://www.postgresql.org/message-id/flat/CAH8yC8kOj0pmHF1RbK2Gb2t4YCcNG-5h0TwZ7yxk3Hzw6C0Otg%40mail.gmail.com
[2] https://www.postgresql.org/message-id/flat/4973.1099605411%40sss.pgh.pa.us



Re: Can we avoid chdir'ing in resolve_symlinks() ?

From
Isaac Morland
Date:
On Thu, 1 Sept 2022 at 19:39, Tom Lane <tgl@sss.pgh.pa.us> wrote:

This code was mine originally (336969e49), but I sure don't
remember why I wrote it like that.  I know we didn't have a
robust version of canonicalize_path() then, and that may have
been the main issue, but that offhand comment about mount
points bothers me.  But I can't reconstruct precisely what
I was worried about there.  The only contemporaneous discussion
thread I can find is [2], which doesn't go into coding details.

Does this happen in a context where we need to worried about the directory structure changing under us, either accidentally or maliciously?

I'm wondering because I understand cd'ing through the structure can avoid some of the related problems and might be the reason for doing it that way originally. My impression is that the modern equivalent would be to use openat() with O_PATH to step through the hierarchy. But then I'm not clear on how to get back to the absolute path, given a file descriptor for the final directory.


Re: Can we avoid chdir'ing in resolve_symlinks() ?

From
Tom Lane
Date:
Isaac Morland <isaac.morland@gmail.com> writes:
> On Thu, 1 Sept 2022 at 19:39, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> This code was mine originally (336969e49), but I sure don't
>> remember why I wrote it like that.

> Does this happen in a context where we need to worried about the directory
> structure changing under us, either accidentally or maliciously?

Well, one of the reasons it'd be a good idea to not change cwd is
that then you don't have to worry about that moving while you're
messing around.  But everything else that we're considering here is
either a component of PATH or a directory/symlink associated with
the PG installation.  If $badguy has control of any of that,
you've already lost, so I'm not excited about worrying about it.

> I'm wondering because I understand cd'ing through the structure can avoid
> some of the related problems and might be the reason for doing it that way
> originally.

Pretty sure I was not thinking about that.  I might have been
thinking about AFS installations, which IIRC often have two nominal
paths associated with them.  But I don't recall any details about how
that works, and anyway the comment says nothing about AFS.

> My impression is that the modern equivalent would be to use
> openat() with O_PATH to step through the hierarchy. But then I'm not clear
> on how to get back to the absolute path, given a file descriptor for the
> final directory.

Yeah.  The point here is not to open a particular file, but to derive
a pathname string for where the file is.

What I'm thinking right at the moment is that we don't necessarily
have to have the exact path that getcwd() would report.  We need
*some* path-in-absolute-form that works.  This leads me to think
that both the AFS case and the mount-point case are red herrings.
But I can't shake the feeling that I'm missing something.

            regards, tom lane



Re: Can we avoid chdir'ing in resolve_symlinks() ?

From
Andrew Dunstan
Date:
On 2022-09-01 Th 19:39, Tom Lane wrote:
> find_my_exec() wants to obtain an absolute, symlink-free path
> to the program's own executable, for what seem to me good
> reasons.  However, chasing down symlinks is left to its
> subroutine resolve_symlinks(), which does this:
>
>      * To resolve a symlink properly, we have to chdir into its directory and
>      * then chdir to where the symlink points; otherwise we may fail to
>      * resolve relative links correctly (consider cases involving mount
>      * points, for example).  After following the final symlink, we use
>      * getcwd() to figure out where the heck we're at.
>
> and then afterwards it has to chdir back to the original cwd.
> That last step is a bit of a sore spot, because sometimes
> (especially in sudo situations) we may not have the privileges
> necessary to do that; I think this is the cause of the complaint
> at [1].  Anyway the whole thing seems a bit excessively Rube
> Goldbergian.  I'm wondering why we couldn't just read the
> symlink(s), concatenate them together, and use canonicalize_path()
> to clean up any mess.
>
> This code was mine originally (336969e49), but I sure don't
> remember why I wrote it like that.  I know we didn't have a
> robust version of canonicalize_path() then, and that may have
> been the main issue, but that offhand comment about mount
> points bothers me.  But I can't reconstruct precisely what
> I was worried about there.  The only contemporaneous discussion
> thread I can find is [2], which doesn't go into coding details.
>
> Thoughts?
>
>             regards, tom lane
>
> [1] https://www.postgresql.org/message-id/flat/CAH8yC8kOj0pmHF1RbK2Gb2t4YCcNG-5h0TwZ7yxk3Hzw6C0Otg%40mail.gmail.com
> [2] https://www.postgresql.org/message-id/flat/4973.1099605411%40sss.pgh.pa.us
>
>

These days there seem to be library functions that do this, realpath(3)
and canonicalize_file_name(3). The latter is what seems to be called by
readlink(1). Should we be using one of those? I don't know how portable
they are. I don't see them here :-(
<https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/crt-alphabetical-function-reference?view=msvc-170>


cheers


andrew



--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: Can we avoid chdir'ing in resolve_symlinks() ?

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> On 2022-09-01 Th 19:39, Tom Lane wrote:
>> find_my_exec() wants to obtain an absolute, symlink-free path
>> to the program's own executable, for what seem to me good
>> reasons.  However, chasing down symlinks is left to its
>> subroutine resolve_symlinks(), which does this:

> These days there seem to be library functions that do this, realpath(3)
> and canonicalize_file_name(3). The latter is what seems to be called by
> readlink(1). Should we be using one of those?

Oh!  I see realpath() in POSIX, but not canonicalize_file_name().
It does look like realpath() would be helpful here, although if
it's not present on Windows that's a problem.

Quick googling suggests that _fullpath() could be used as a substitute.

            regards, tom lane



Re: Can we avoid chdir'ing in resolve_symlinks() ?

From
Tom Lane
Date:
I wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> These days there seem to be library functions that do this, realpath(3)
>> and canonicalize_file_name(3). The latter is what seems to be called by
>> readlink(1). Should we be using one of those?

> Oh!  I see realpath() in POSIX, but not canonicalize_file_name().
> It does look like realpath() would be helpful here, although if
> it's not present on Windows that's a problem.

After some surveying of man pages, I conclude that

(1) realpath() exists on all platforms of interest except Windows,
where it looks like we can use _fullpath() instead.

(2) AIX and Solaris 10 only implement the SUSv2 semantics,
where the caller must supply a buffer that it has no good way
to determine a safe size for.  Annoying.

(3) The Solaris 10 man page has this interesting disclaimer:

     The realpath() function might fail to return to the current
     directory if an error occurs.

which implies that on that platform it's basically implemented
in the same way as our current code.  Sigh.

I think we can ignore (3) though.  Solaris 11 seems to have an
up-to-speed implementation of realpath(), and 10 will be EOL
in January 2024 according to Wikipedia.

As for (2), both systems promise to report EINVAL for a null
pointer, which is also what SUSv2 says.  So I think what we
can do is approximately

    ptr = realpath(fname, NULL);
    if (ptr == NULL && errno == EINVAL)
    {
        ptr = pg_malloc(MAXPGPATH);
        ptr = realpath(fname, ptr);
    }

and just take it on faith that MAXPGPATH is enough on those
platforms.

            regards, tom lane



Re: Can we avoid chdir'ing in resolve_symlinks() ?

From
Tom Lane
Date:
Here's a draft patch for this.  It seems to work on Linux,
but the Windows code is just speculation.  In particular,
I did

    path = _fullpath(NULL, fname, 0);
    if (path == NULL)
        _dosmaperr(GetLastError());

but I'm not really sure that the _dosmaperr bit is needed,
because the _fullpath man page I found makes reference to
setting "errno" [1].  It's likely to be hard to test, because
most of the possible error cases should be nigh unreachable
in our usage; we already know the input is a valid reference
to an executable file.

BTW, I noticed what seems a flat-out bug in validate_exec:

        /* Win32 requires a .exe suffix for stat() */
-       if (strlen(path) >= strlen(".exe") &&
+       if (strlen(path) < strlen(".exe") ||
                pg_strcasecmp(path + strlen(path) - strlen(".exe"), ".exe") != 0)

Nobody's noticed because none of our executables have base names
shorter than 4 characters, but it's still a bug.

            regards, tom lane

[1] https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/fullpath-wfullpath?view=msvc-170

diff --git a/src/common/exec.c b/src/common/exec.c
index 22f04aafbe..fa01ff7b32 100644
--- a/src/common/exec.c
+++ b/src/common/exec.c
@@ -14,6 +14,15 @@
  *-------------------------------------------------------------------------
  */

+/*
+ * On macOS, "man realpath" avers:
+ *    Defining _DARWIN_C_SOURCE or _DARWIN_BETTER_REALPATH before including
+ *    stdlib.h will cause the provided implementation of realpath() to use
+ *    F_GETPATH from fcntl(2) to discover the path.
+ * This should be harmless everywhere else.
+ */
+#define _DARWIN_BETTER_REALPATH
+
 #ifndef FRONTEND
 #include "postgres.h"
 #else
@@ -58,11 +67,8 @@ extern int    _CRT_glob = 0;        /* 0 turns off globbing; 1 turns it on */
     (fprintf(stderr, __VA_ARGS__), fputc('\n', stderr))
 #endif

-#ifdef _MSC_VER
-#define getcwd(cwd,len)  GetCurrentDirectory(len, cwd)
-#endif
-
-static int    resolve_symlinks(char *path);
+static int    normalize_exec_path(char *path);
+static char *pg_realpath(const char *fname);

 #ifdef WIN32
 static BOOL GetTokenUser(HANDLE hToken, PTOKEN_USER *ppTokenUser);
@@ -87,7 +93,7 @@ validate_exec(const char *path)
     char        path_exe[MAXPGPATH + sizeof(".exe") - 1];

     /* Win32 requires a .exe suffix for stat() */
-    if (strlen(path) >= strlen(".exe") &&
+    if (strlen(path) < strlen(".exe") ||
         pg_strcasecmp(path + strlen(path) - strlen(".exe"), ".exe") != 0)
     {
         strlcpy(path_exe, path, sizeof(path_exe) - 4);
@@ -151,30 +157,16 @@ validate_exec(const char *path)
 int
 find_my_exec(const char *argv0, char *retpath)
 {
-    char        cwd[MAXPGPATH],
-                test_path[MAXPGPATH];
     char       *path;

-    if (!getcwd(cwd, MAXPGPATH))
-    {
-        log_error(errcode_for_file_access(),
-                  _("could not identify current directory: %m"));
-        return -1;
-    }
-
     /*
      * If argv0 contains a separator, then PATH wasn't used.
      */
-    if (first_dir_separator(argv0) != NULL)
+    strlcpy(retpath, argv0, MAXPGPATH);
+    if (first_dir_separator(retpath) != NULL)
     {
-        if (is_absolute_path(argv0))
-            strlcpy(retpath, argv0, MAXPGPATH);
-        else
-            join_path_components(retpath, cwd, argv0);
-        canonicalize_path(retpath);
-
         if (validate_exec(retpath) == 0)
-            return resolve_symlinks(retpath);
+            return normalize_exec_path(retpath);

         log_error(errcode(ERRCODE_WRONG_OBJECT_TYPE),
                   _("invalid binary \"%s\": %m"), retpath);
@@ -183,9 +175,8 @@ find_my_exec(const char *argv0, char *retpath)

 #ifdef WIN32
     /* Win32 checks the current directory first for names without slashes */
-    join_path_components(retpath, cwd, argv0);
     if (validate_exec(retpath) == 0)
-        return resolve_symlinks(retpath);
+        return normalize_exec_path(retpath);
 #endif

     /*
@@ -208,21 +199,15 @@ find_my_exec(const char *argv0, char *retpath)
             if (!endp)
                 endp = startp + strlen(startp); /* point to end */

-            strlcpy(test_path, startp, Min(endp - startp + 1, MAXPGPATH));
+            strlcpy(retpath, startp, Min(endp - startp + 1, MAXPGPATH));

-            if (is_absolute_path(test_path))
-                join_path_components(retpath, test_path, argv0);
-            else
-            {
-                join_path_components(retpath, cwd, test_path);
-                join_path_components(retpath, retpath, argv0);
-            }
+            join_path_components(retpath, retpath, argv0);
             canonicalize_path(retpath);

             switch (validate_exec(retpath))
             {
                 case 0:            /* found ok */
-                    return resolve_symlinks(retpath);
+                    return normalize_exec_path(retpath);
                 case -1:        /* wasn't even a candidate, keep looking */
                     break;
                 case -2:        /* found but disqualified */
@@ -241,105 +226,90 @@ find_my_exec(const char *argv0, char *retpath)


 /*
- * resolve_symlinks - resolve symlinks to the underlying file
+ * normalize_exec_path - resolve symlinks and convert to absolute path
  *
- * Replace "path" by the absolute path to the referenced file.
+ * Given a path that refers to an executable, chase through any symlinks
+ * to find the real file location; then convert that to an absolute path.
  *
+ * On success, replaces the contents of "path" with the absolute path.
+ * ("path" is assumed to be of size MAXPGPATH.)
  * Returns 0 if OK, -1 if error.
- *
- * Note: we are not particularly tense about producing nice error messages
- * because we are not really expecting error here; we just determined that
- * the symlink does point to a valid executable.
- *
- * Here we test HAVE_READLINK, which excludes Windows.  There's no point in
- * using our junction point-based replacement code for this, because that only
- * works for directories.
  */
 static int
-resolve_symlinks(char *path)
+normalize_exec_path(char *path)
 {
-#ifdef HAVE_READLINK
-    struct stat buf;
-    char        orig_wd[MAXPGPATH],
-                link_buf[MAXPGPATH];
-    char       *fname;
-
     /*
-     * To resolve a symlink properly, we have to chdir into its directory and
-     * then chdir to where the symlink points; otherwise we may fail to
-     * resolve relative links correctly (consider cases involving mount
-     * points, for example).  After following the final symlink, we use
-     * getcwd() to figure out where the heck we're at.
-     *
-     * One might think we could skip all this if path doesn't point to a
-     * symlink to start with, but that's wrong.  We also want to get rid of
-     * any directory symlinks that are present in the given path. We expect
-     * getcwd() to give us an accurate, symlink-free path.
+     * We used to do a lot of work ourselves here, but now we just let
+     * realpath(3) do all the heavy lifting.
      */
-    if (!getcwd(orig_wd, MAXPGPATH))
+    char       *abspath = pg_realpath(path);
+
+    if (abspath == NULL)
     {
         log_error(errcode_for_file_access(),
-                  _("could not identify current directory: %m"));
+                  _("could not resolve path \"%s\" to absolute form: %m"),
+                  path);
         return -1;
     }
+    strlcpy(path, abspath, MAXPGPATH);
+    free(abspath);

-    for (;;)
-    {
-        char       *lsep;
-        int            rllen;
-
-        lsep = last_dir_separator(path);
-        if (lsep)
-        {
-            *lsep = '\0';
-            if (chdir(path) == -1)
-            {
-                log_error(errcode_for_file_access(),
-                          _("could not change directory to \"%s\": %m"), path);
-                return -1;
-            }
-            fname = lsep + 1;
-        }
-        else
-            fname = path;
+#ifdef WIN32
+    /* On Windows, be sure to convert '\' to '/' */
+    canonicalize_path(path);
+#endif

-        if (lstat(fname, &buf) < 0 ||
-            !S_ISLNK(buf.st_mode))
-            break;
+    return 0;
+}

-        errno = 0;
-        rllen = readlink(fname, link_buf, sizeof(link_buf));
-        if (rllen < 0 || rllen >= sizeof(link_buf))
-        {
-            log_error(errcode_for_file_access(),
-                      _("could not read symbolic link \"%s\": %m"), fname);
-            return -1;
-        }
-        link_buf[rllen] = '\0';
-        strcpy(path, link_buf);
-    }

-    /* must copy final component out of 'path' temporarily */
-    strlcpy(link_buf, fname, sizeof(link_buf));
+/*
+ * pg_realpath() - realpath(3) with POSIX.1-2008 semantics
+ *
+ * This is equivalent to realpath(fname, NULL), in that it returns a
+ * malloc'd buffer containing the absolute path equivalent to fname.
+ * On error, returns NULL with errno set.
+ *
+ * On Windows, what you get is spelled per platform conventions,
+ * so you probably want to apply canonicalize_path() to the result.
+ *
+ * For now, this is needed only here so mark it static.  If you choose to
+ * move it into its own file, move the _DARWIN_BETTER_REALPATH #define too!
+ */
+static char *
+pg_realpath(const char *fname)
+{
+    char       *path;

-    if (!getcwd(path, MAXPGPATH))
+#ifndef WIN32
+    path = realpath(fname, NULL);
+    if (path == NULL && errno == EINVAL)
     {
-        log_error(errcode_for_file_access(),
-                  _("could not identify current directory: %m"));
-        return -1;
-    }
-    join_path_components(path, path, link_buf);
-    canonicalize_path(path);
+        /*
+         * Cope with old-POSIX systems that require a user-provided buffer.
+         * Assume MAXPGPATH is enough room on all such systems.
+         */
+        char       *buf = malloc(MAXPGPATH);

-    if (chdir(orig_wd) == -1)
-    {
-        log_error(errcode_for_file_access(),
-                  _("could not change directory to \"%s\": %m"), orig_wd);
-        return -1;
+        if (buf == NULL)
+            return NULL;        /* assume errno is set */
+        path = realpath(fname, buf);
+        if (path == NULL)        /* don't leak memory */
+        {
+            int            save_errno = errno;
+
+            free(buf);
+            errno = save_errno;
+        }
     }
-#endif                            /* HAVE_READLINK */
+#else                            /* WIN32 */
+    /* Microsoft is resolutely non-POSIX, but _fullpath does the same thing */
+    path = _fullpath(NULL, fname, 0);
+    if (path == NULL)
+        _dosmaperr(GetLastError());
+#endif

-    return 0;
+    return path;
 }



Re: Can we avoid chdir'ing in resolve_symlinks() ?

From
Peter Eisentraut
Date:
On 02.09.22 01:39, Tom Lane wrote:
> find_my_exec() wants to obtain an absolute, symlink-free path
> to the program's own executable, for what seem to me good
> reasons.

I still think they are bad reasons, and we should kill all that code. 
Just sayin' ...



Re: Can we avoid chdir'ing in resolve_symlinks() ?

From
Tom Lane
Date:
Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:
> On 02.09.22 01:39, Tom Lane wrote:
>> find_my_exec() wants to obtain an absolute, symlink-free path
>> to the program's own executable, for what seem to me good
>> reasons.

> I still think they are bad reasons, and we should kill all that code. 
> Just sayin' ...

Are you proposing we give up the support for relocatable installations?
I'm not here to defend that feature, but I bet somebody will.  (And
doesn't "make check" depend on it?)

            regards, tom lane



Re: Can we avoid chdir'ing in resolve_symlinks() ?

From
Peter Eisentraut
Date:
On 12.09.22 17:33, Tom Lane wrote:
> Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:
>> On 02.09.22 01:39, Tom Lane wrote:
>>> find_my_exec() wants to obtain an absolute, symlink-free path
>>> to the program's own executable, for what seem to me good
>>> reasons.
> 
>> I still think they are bad reasons, and we should kill all that code.
>> Just sayin' ...
> 
> Are you proposing we give up the support for relocatable installations?
> I'm not here to defend that feature, but I bet somebody will.  (And
> doesn't "make check" depend on it?)

I'm complaining specifically about the resolving of symlinks.  Why does

$ /usr/local/opt/postgresql@13/bin/pg_config --bindir

print

/usr/local/Cellar/postgresql@13/13.8/bin

when it clearly should print

/usr/local/opt/postgresql@13/bin

This is unrelated to the support for relocatable installations, AFAICT.




Re: Can we avoid chdir'ing in resolve_symlinks() ?

From
Tom Lane
Date:
Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:
> On 12.09.22 17:33, Tom Lane wrote:
>> Are you proposing we give up the support for relocatable installations?
>> I'm not here to defend that feature, but I bet somebody will.  (And
>> doesn't "make check" depend on it?)

> I'm complaining specifically about the resolving of symlinks.  Why does

> $ /usr/local/opt/postgresql@13/bin/pg_config --bindir
> print
> /usr/local/Cellar/postgresql@13/13.8/bin
> when it clearly should print
> /usr/local/opt/postgresql@13/bin

I'm not sure about your setup there, but if you mean that
/usr/local/opt/postgresql@13/bin is a symlink reading more or less
"./13.8/bin", I doubt that failing to canonicalize that is a good idea.
The point of finding the bindir is mainly to be able to navigate to its
sibling directories such as lib/, etc/, share/.  There's no certainty
that a symlink leading to the bin directory will have sibling symlinks
to those other directories.

            regards, tom lane



Re: Can we avoid chdir'ing in resolve_symlinks() ?

From
Andrew Dunstan
Date:
On 2022-09-12 Mo 16:07, Tom Lane wrote:
> Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:
>> On 12.09.22 17:33, Tom Lane wrote:
>>> Are you proposing we give up the support for relocatable installations?
>>> I'm not here to defend that feature, but I bet somebody will.  (And
>>> doesn't "make check" depend on it?)
>> I'm complaining specifically about the resolving of symlinks.  Why does
>> $ /usr/local/opt/postgresql@13/bin/pg_config --bindir
>> print
>> /usr/local/Cellar/postgresql@13/13.8/bin
>> when it clearly should print
>> /usr/local/opt/postgresql@13/bin
> I'm not sure about your setup there, but if you mean that
> /usr/local/opt/postgresql@13/bin is a symlink reading more or less
> "./13.8/bin", I doubt that failing to canonicalize that is a good idea.
> The point of finding the bindir is mainly to be able to navigate to its
> sibling directories such as lib/, etc/, share/.  There's no certainty
> that a symlink leading to the bin directory will have sibling symlinks
> to those other directories.
>
>             


I think the discussion here is a bit tangential to the original topic.

The point you make is reasonable, but it seems a bit more likely that in
the case Peter cites the symlink is one level higher in the tree, in
which case there's probably little value in resolving the symlink. Maybe
we could compromise and check if a path exists and only resolve symlinks
if it does not?


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: Can we avoid chdir'ing in resolve_symlinks() ?

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> I think the discussion here is a bit tangential to the original topic.

Indeed, because I just wanted to reimplement *how* we resolve the
executable path to absolute, not question whether we should do it at all.

> The point you make is reasonable, but it seems a bit more likely that in
> the case Peter cites the symlink is one level higher in the tree, in
> which case there's probably little value in resolving the symlink. Maybe
> we could compromise and check if a path exists and only resolve symlinks
> if it does not?

It's non-negotiable that we apply realpath() or a handmade equivalent
if the path we find to the executable turns out to be relative, ie
you did "../../postgres/bin/psql" or the equivalent.  In the case of
the server, we *will* chdir to someplace else, rendering the original
path useless.  psql might chdir in response to a user command, so it
likewise had better resolve the installation location while it can.

We could maybe skip realpath() if we find what appears to be an
absolute path to the executable.  However, I think that fails in
too many situations.  As an example, if I do
    ln -s /path/to/psql ~/bin
and then invoke psql using that symlink, we're not going to be
able to find any of the installation's supporting files unless
we resolve the symlink.  The executable path we'd deduce after
examining PATH is /home/tgl/bin/psql, which is plenty absolute,
but it doesn't help us find the rest of the PG installation.
That case works today, and I think a lot of people will be
sad if we break it.

I'm not familiar with how homebrew sets up the installation
layout, but I'm suspicious that the situation Peter refers to
has a similar problem, only with a symlink for the bin directory
not the individual executable.

I think the only potentially-workable alternative design is
to forget about relocatable installations and insist that the
supporting files be found at the installation path designated
at configure time.  But, again, that seems likely to break a
lot of setups that work today.  And I've still not heard a
positive reason why we should change it.

            regards, tom lane



Re: Can we avoid chdir'ing in resolve_symlinks() ?

From
Peter Eisentraut
Date:
On 13.09.22 01:26, Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> I think the discussion here is a bit tangential to the original topic.
> 
> Indeed, because I just wanted to reimplement *how* we resolve the
> executable path to absolute, not question whether we should do it at all.

Well, if we decided not to do it, then we could just delete the code and 
not have to think about how to change it.

> I'm not familiar with how homebrew sets up the installation
> layout, but I'm suspicious that the situation Peter refers to
> has a similar problem, only with a symlink for the bin directory
> not the individual executable.

I think the two contradicting use cases are:

1) You configure and install with prefix=/usr/local/pgsql, and then 
symlink ~/bin/pg_ctl -> /usr/local/pgsql/bin/pg_ctl; hoping that that 
will allow pg_ctl to find the other programs it needs in 
/usr/local/pgsql/bin.  This is what we currently support.

2) You configure and install with prefix=/usr/local/pgsql-14, and then 
symlink /usr/local/pgsql -> /usr/local/pgsql-14; hoping that you can 
then use /usr/local/pgsql as if that's where it actually is.  We don't 
currently support that.  (Note that it would work if you made a copy of 
the tree instead of using the symlink.)

I don't know if anyone uses #1 or what the details of such use are.

#2 is how Homebrew and some other packaging systems work.




Re: Can we avoid chdir'ing in resolve_symlinks() ?

From
Tom Lane
Date:
Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:
> 2) You configure and install with prefix=/usr/local/pgsql-14, and then 
> symlink /usr/local/pgsql -> /usr/local/pgsql-14; hoping that you can 
> then use /usr/local/pgsql as if that's where it actually is.  We don't 
> currently support that.  (Note that it would work if you made a copy of 
> the tree instead of using the symlink.)

What about it does not work?

            regards, tom lane



Re: Can we avoid chdir'ing in resolve_symlinks() ?

From
Peter Eisentraut
Date:
On 13.09.22 17:16, Tom Lane wrote:
> Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:
>> 2) You configure and install with prefix=/usr/local/pgsql-14, and then
>> symlink /usr/local/pgsql -> /usr/local/pgsql-14; hoping that you can
>> then use /usr/local/pgsql as if that's where it actually is.  We don't
>> currently support that.  (Note that it would work if you made a copy of
>> the tree instead of using the symlink.)
> 
> What about it does not work?

The problem is if another package or extension uses pg_config to find, 
say, libdir, includedir, or bindir and integrates it into its own build 
system or its own build products.  If those directories point to 
/usr/local/pgsql/{bin,include,lib}, then there is no problem.  But if 
they point to /usr/local/pgsql-14.5/{bin,include,lib}, then the next 
minor update will break those other packages.




Re: Can we avoid chdir'ing in resolve_symlinks() ?

From
Tom Lane
Date:
Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:
> On 13.09.22 17:16, Tom Lane wrote:
>> What about it does not work?

> The problem is if another package or extension uses pg_config to find, 
> say, libdir, includedir, or bindir and integrates it into its own build 
> system or its own build products.  If those directories point to 
> /usr/local/pgsql/{bin,include,lib}, then there is no problem.  But if 
> they point to /usr/local/pgsql-14.5/{bin,include,lib}, then the next 
> minor update will break those other packages.

That seems ... a tad far-fetched, and even more to the point,
it'd be the other package's fault not ours.  We have never promised
that those directories point to anyplace that's not PG-specific.
I certainly do not buy that that's a good argument for breaking
Postgres installation setups that work today.

Also, there is nothing in that scenario that is in any way dependent
on the use of symlinks, or even absolute paths, so I don't quite
see the relevance to the current discussion.

            regards, tom lane



Re: Can we avoid chdir'ing in resolve_symlinks() ?

From
Thomas Munro
Date:
On Sun, Sep 4, 2022 at 2:42 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Here's a draft patch for this.  It seems to work on Linux,
> but the Windows code is just speculation.  In particular,
> I did
>
>         path = _fullpath(NULL, fname, 0);
>         if (path == NULL)
>                 _dosmaperr(GetLastError());
>
> but I'm not really sure that the _dosmaperr bit is needed,
> because the _fullpath man page I found makes reference to
> setting "errno" [1].  It's likely to be hard to test, because
> most of the possible error cases should be nigh unreachable
> in our usage; we already know the input is a valid reference
> to an executable file.

I tried lots of crazy stuff[1] to try to get an error out of this
thing, but came up empty handed.  Unlike realpath(), _fullpath()
doesn't resolve symlinks (or junctions), so I guess there's less to go
wrong.  It still needs the present working directory, which is a
per-drive concept on this OS, but even bogus drives don't seem to
produce an error (despite what the manual says).

I'd still lean towards assuming errno is set, given that the manual
references errno and not GetLastError().  Typical manual pages
explicitly tell you when GetLastError() has the error (example:
GetFullPathName(), for which this might be intended as a more Unix-y
wrapper, but even if so there's nothing to say that _fullpath() can't
set errno directly itself, in which case you might clobber it that
way).

[1] https://cirrus-ci.com/task/4935917730267136?logs=main



Re: Can we avoid chdir'ing in resolve_symlinks() ?

From
Tom Lane
Date:
Thomas Munro <thomas.munro@gmail.com> writes:
> I tried lots of crazy stuff[1] to try to get an error out of this
> thing, but came up empty handed.  Unlike realpath(), _fullpath()
> doesn't resolve symlinks (or junctions), so I guess there's less to go
> wrong.  It still needs the present working directory, which is a
> per-drive concept on this OS, but even bogus drives don't seem to
> produce an error (despite what the manual says).

Interesting.

> I'd still lean towards assuming errno is set, given that the manual
> references errno and not GetLastError().

Agreed.  In the attached, I drop the _dosmaperr() step and instead
just do "errno = 0" before the call.  That way, if we ever do manage
to hit a _fullpath() failure, we can at least tell whether the errno
that's reported is real or not.

In this version I've attempted to resolve Peter's complaint by only
applying realpath() when the executable path we've obtained is relative
or has a symlink as the last component.  Things will definitely not
work right if either of those is true and we make no effort to get
a more trustworthy path.  I concede that things will usually work okay
without resolving a symlink that's two or more levels up the path,
but I wonder how much we want to trust that.  Suppose somebody changes
such a symlink while the server is running --- nothing very good is
likely to happen if it suddenly starts consulting some other libdir
or sharedir.  Maybe we need to add a flag telling whether we want
this behavior?  TBH I think that pg_config is the only place I'd
be comfortable with doing it like this.  Peter, would your concerns
be satisfied if we just made pg_config do it?

            regards, tom lane

diff --git a/src/common/exec.c b/src/common/exec.c
index 22f04aafbe..06a8601b85 100644
--- a/src/common/exec.c
+++ b/src/common/exec.c
@@ -14,6 +14,15 @@
  *-------------------------------------------------------------------------
  */

+/*
+ * On macOS, "man realpath" avers:
+ *    Defining _DARWIN_C_SOURCE or _DARWIN_BETTER_REALPATH before including
+ *    stdlib.h will cause the provided implementation of realpath() to use
+ *    F_GETPATH from fcntl(2) to discover the path.
+ * This should be harmless everywhere else.
+ */
+#define _DARWIN_BETTER_REALPATH
+
 #ifndef FRONTEND
 #include "postgres.h"
 #else
@@ -58,11 +67,8 @@ extern int    _CRT_glob = 0;        /* 0 turns off globbing; 1 turns it on */
     (fprintf(stderr, __VA_ARGS__), fputc('\n', stderr))
 #endif

-#ifdef _MSC_VER
-#define getcwd(cwd,len)  GetCurrentDirectory(len, cwd)
-#endif
-
-static int    resolve_symlinks(char *path);
+static int    normalize_exec_path(char *path);
+static char *pg_realpath(const char *fname);

 #ifdef WIN32
 static BOOL GetTokenUser(HANDLE hToken, PTOKEN_USER *ppTokenUser);
@@ -87,7 +93,7 @@ validate_exec(const char *path)
     char        path_exe[MAXPGPATH + sizeof(".exe") - 1];

     /* Win32 requires a .exe suffix for stat() */
-    if (strlen(path) >= strlen(".exe") &&
+    if (strlen(path) < strlen(".exe") ||
         pg_strcasecmp(path + strlen(path) - strlen(".exe"), ".exe") != 0)
     {
         strlcpy(path_exe, path, sizeof(path_exe) - 4);
@@ -151,30 +157,16 @@ validate_exec(const char *path)
 int
 find_my_exec(const char *argv0, char *retpath)
 {
-    char        cwd[MAXPGPATH],
-                test_path[MAXPGPATH];
     char       *path;

-    if (!getcwd(cwd, MAXPGPATH))
-    {
-        log_error(errcode_for_file_access(),
-                  _("could not identify current directory: %m"));
-        return -1;
-    }
-
     /*
      * If argv0 contains a separator, then PATH wasn't used.
      */
-    if (first_dir_separator(argv0) != NULL)
+    strlcpy(retpath, argv0, MAXPGPATH);
+    if (first_dir_separator(retpath) != NULL)
     {
-        if (is_absolute_path(argv0))
-            strlcpy(retpath, argv0, MAXPGPATH);
-        else
-            join_path_components(retpath, cwd, argv0);
-        canonicalize_path(retpath);
-
         if (validate_exec(retpath) == 0)
-            return resolve_symlinks(retpath);
+            return normalize_exec_path(retpath);

         log_error(errcode(ERRCODE_WRONG_OBJECT_TYPE),
                   _("invalid binary \"%s\": %m"), retpath);
@@ -183,9 +175,8 @@ find_my_exec(const char *argv0, char *retpath)

 #ifdef WIN32
     /* Win32 checks the current directory first for names without slashes */
-    join_path_components(retpath, cwd, argv0);
     if (validate_exec(retpath) == 0)
-        return resolve_symlinks(retpath);
+        return normalize_exec_path(retpath);
 #endif

     /*
@@ -208,21 +199,15 @@ find_my_exec(const char *argv0, char *retpath)
             if (!endp)
                 endp = startp + strlen(startp); /* point to end */

-            strlcpy(test_path, startp, Min(endp - startp + 1, MAXPGPATH));
+            strlcpy(retpath, startp, Min(endp - startp + 1, MAXPGPATH));

-            if (is_absolute_path(test_path))
-                join_path_components(retpath, test_path, argv0);
-            else
-            {
-                join_path_components(retpath, cwd, test_path);
-                join_path_components(retpath, retpath, argv0);
-            }
+            join_path_components(retpath, retpath, argv0);
             canonicalize_path(retpath);

             switch (validate_exec(retpath))
             {
                 case 0:            /* found ok */
-                    return resolve_symlinks(retpath);
+                    return normalize_exec_path(retpath);
                 case -1:        /* wasn't even a candidate, keep looking */
                     break;
                 case -2:        /* found but disqualified */
@@ -241,108 +226,139 @@ find_my_exec(const char *argv0, char *retpath)


 /*
- * resolve_symlinks - resolve symlinks to the underlying file
+ * normalize_exec_path - convert to absolute path and resolve symlinks
  *
- * Replace "path" by the absolute path to the referenced file.
+ * Given a path that refers to an executable, convert it to absolute form
+ * and make sure it does not refer to a symlink.
  *
+ * On success, replaces the contents of "path" with the absolute path.
+ * ("path" is assumed to be of size MAXPGPATH.)
  * Returns 0 if OK, -1 if error.
- *
- * Note: we are not particularly tense about producing nice error messages
- * because we are not really expecting error here; we just determined that
- * the symlink does point to a valid executable.
- *
- * Here we test HAVE_READLINK, which excludes Windows.  There's no point in
- * using our junction point-based replacement code for this, because that only
- * works for directories.
  */
 static int
-resolve_symlinks(char *path)
+normalize_exec_path(char *path)
 {
-#ifdef HAVE_READLINK
-    struct stat buf;
-    char        orig_wd[MAXPGPATH],
-                link_buf[MAXPGPATH];
-    char       *fname;
+    char       *abspath;

     /*
-     * To resolve a symlink properly, we have to chdir into its directory and
-     * then chdir to where the symlink points; otherwise we may fail to
-     * resolve relative links correctly (consider cases involving mount
-     * points, for example).  After following the final symlink, we use
-     * getcwd() to figure out where the heck we're at.
-     *
-     * One might think we could skip all this if path doesn't point to a
-     * symlink to start with, but that's wrong.  We also want to get rid of
-     * any directory symlinks that are present in the given path. We expect
-     * getcwd() to give us an accurate, symlink-free path.
+     * Canonicalize the given path, for cleanliness and to be sure that
+     * is_absolute_path() isn't fooled.  (This is redundant in some calling
+     * cases, but not all.)
      */
-    if (!getcwd(orig_wd, MAXPGPATH))
-    {
-        log_error(errcode_for_file_access(),
-                  _("could not identify current directory: %m"));
-        return -1;
-    }
+    canonicalize_path(path);

-    for (;;)
+    /*
+     * If it's absolute and doesn't refer to a symlink, return the
+     * canonicalized path as-is, without factoring out any symlinks occurring
+     * earlier in the path.  This supports situations where our whole
+     * installation tree is symlinked, for example our executable path is
+     * /usr/local/pgsql/bin/psql while /usr/local/pgsql is a symlink to
+     * /usr/local/pgsql-14.5.  It's also possible that the user has symlinked
+     * an individual program, say "ln -s /usr/local/pgsql/bin/psql ~/bin", and
+     * in that case we'd better resolve the symlink or we won't be able to
+     * find the rest of our installation.
+     *
+     * A case that will not work well is if the symlink is for the bin
+     * directory as a whole, as we won't be able to find sibling installation
+     * directories unless there are sibling symlinks.  We could deal with that
+     * by also insisting that the immediate parent dir not be a symlink.  But
+     * such setups appear to be rare if not nonexistent, so currently it
+     * doesn't seem worth the extra complication to handle that.
+     */
+    if (is_absolute_path(path))
     {
-        char       *lsep;
-        int            rllen;
-
-        lsep = last_dir_separator(path);
-        if (lsep)
-        {
-            *lsep = '\0';
-            if (chdir(path) == -1)
-            {
-                log_error(errcode_for_file_access(),
-                          _("could not change directory to \"%s\": %m"), path);
-                return -1;
-            }
-            fname = lsep + 1;
-        }
-        else
-            fname = path;
+        struct stat st;

-        if (lstat(fname, &buf) < 0 ||
-            !S_ISLNK(buf.st_mode))
-            break;
-
-        errno = 0;
-        rllen = readlink(fname, link_buf, sizeof(link_buf));
-        if (rllen < 0 || rllen >= sizeof(link_buf))
-        {
-            log_error(errcode_for_file_access(),
-                      _("could not read symbolic link \"%s\": %m"), fname);
-            return -1;
-        }
-        link_buf[rllen] = '\0';
-        strcpy(path, link_buf);
+        /* Note that caller already verified that stat() says it's S_ISREG */
+        if (lstat(path, &st) == 0 &&
+            S_ISREG(st.st_mode))
+            return 0;
     }

-    /* must copy final component out of 'path' temporarily */
-    strlcpy(link_buf, fname, sizeof(link_buf));
+    /*
+     * The path is relative or refers to a symlink, so resolve it to a full
+     * absolute path.
+     *
+     * We used to do a lot of work ourselves here, but now we just let
+     * realpath(3) do all the heavy lifting.
+     */
+    abspath = pg_realpath(path);

-    if (!getcwd(path, MAXPGPATH))
+    if (abspath == NULL)
     {
         log_error(errcode_for_file_access(),
-                  _("could not identify current directory: %m"));
+                  _("could not resolve path \"%s\" to absolute form: %m"),
+                  path);
         return -1;
     }
-    join_path_components(path, path, link_buf);
-    canonicalize_path(path);
+    strlcpy(path, abspath, MAXPGPATH);
+    free(abspath);

-    if (chdir(orig_wd) == -1)
-    {
-        log_error(errcode_for_file_access(),
-                  _("could not change directory to \"%s\": %m"), orig_wd);
-        return -1;
-    }
-#endif                            /* HAVE_READLINK */
+#ifdef WIN32
+    /* On Windows, be sure to convert '\' to '/' */
+    canonicalize_path(path);
+#endif

     return 0;
 }


+/*
+ * pg_realpath() - realpath(3) with POSIX.1-2008 semantics
+ *
+ * This is equivalent to realpath(fname, NULL), in that it returns a
+ * malloc'd buffer containing the absolute path equivalent to fname.
+ * On error, returns NULL with errno set.
+ *
+ * On Windows, what you get is spelled per platform conventions,
+ * so you probably want to apply canonicalize_path() to the result.
+ *
+ * For now, this is needed only here so mark it static.  If you choose to
+ * move it into its own file, move the _DARWIN_BETTER_REALPATH #define too!
+ */
+static char *
+pg_realpath(const char *fname)
+{
+    char       *path;
+
+#ifndef WIN32
+    path = realpath(fname, NULL);
+    if (path == NULL && errno == EINVAL)
+    {
+        /*
+         * Cope with old-POSIX systems that require a user-provided buffer.
+         * Assume MAXPGPATH is enough room on all such systems.
+         */
+        char       *buf = malloc(MAXPGPATH);
+
+        if (buf == NULL)
+            return NULL;        /* assume errno is set */
+        path = realpath(fname, buf);
+        if (path == NULL)        /* don't leak memory */
+        {
+            int            save_errno = errno;
+
+            free(buf);
+            errno = save_errno;
+        }
+    }
+#else                            /* WIN32 */
+
+    /*
+     * Microsoft is resolutely non-POSIX, but _fullpath() does the same thing.
+     * The documentation claims it reports errors by setting errno, which is a
+     * bit surprising for Microsoft, but we'll believe that until it's proven
+     * wrong.  Clear errno first, though, so we can at least tell if a failure
+     * occurs and doesn't set it.
+     */
+    errno = 0;
+    path = _fullpath(NULL, fname, 0);
+#endif
+
+    return path;
+}
+
+
 /*
  * Find another program in our binary's directory,
  * then make sure it is the proper version.

Re: Can we avoid chdir'ing in resolve_symlinks() ?

From
Peter Eisentraut
Date:
On 15.09.22 16:43, Tom Lane wrote:
> That seems ... a tad far-fetched, and even more to the point,
> it'd be the other package's fault not ours.  We have never promised
> that those directories point to anyplace that's not PG-specific.
> I certainly do not buy that that's a good argument for breaking
> Postgres installation setups that work today.
> 
> Also, there is nothing in that scenario that is in any way dependent
> on the use of symlinks, or even absolute paths, so I don't quite
> see the relevance to the current discussion.

Here is another variant of the same problem:

I have

$ which meson
/usr/local/bin/meson

Meson records its own path (somewhere under meson-info/ AFAICT), so it 
can re-run itself when any of the meson.build files change.  But since 
the above is a symlink, it records its own location as 
"/usr/local/Cellar/meson/0.63.1/bin/meson".  So now, whenever the meson 
package updates (even if it's just 0.63.0 -> 0.63.1), my build tree is 
broken.

To clarify, this instance is not at all the fault of any code in 
PostgreSQL.  But it's another instance where resolving symlinks just 
because we can causing problems.




Re: Can we avoid chdir'ing in resolve_symlinks() ?

From
Tom Lane
Date:
Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:
> To clarify, this instance is not at all the fault of any code in 
> PostgreSQL.  But it's another instance where resolving symlinks just 
> because we can causing problems.

[ shrug... ]  *Not* resolving symlinks when we can causes its
own set of problems, which maybe we don't see very clearly
because we have been doing it like that for a couple of decades.
I remain pretty hesitant to change this behavior.

What did you think of the compromise proposal to change only
the paths that pg_config outputs?  I've not tried to code that,
but I think it should be feasible.

            regards, tom lane



Re: Can we avoid chdir'ing in resolve_symlinks() ?

From
Peter Eisentraut
Date:
On 05.10.22 15:59, Tom Lane wrote:
> What did you think of the compromise proposal to change only
> the paths that pg_config outputs?  I've not tried to code that,
> but I think it should be feasible.

I don't think I understand what this proposal actually means.  What 
would be the behavior of pg_config and how would it be different from 
before?




Re: Can we avoid chdir'ing in resolve_symlinks() ?

From
Tom Lane
Date:
Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:
> On 05.10.22 15:59, Tom Lane wrote:
>> What did you think of the compromise proposal to change only
>> the paths that pg_config outputs?  I've not tried to code that,
>> but I think it should be feasible.

> I don't think I understand what this proposal actually means.  What 
> would be the behavior of pg_config and how would it be different from 
> before?

What I had in mind was:

* server and most frontend programs keep the same behavior, that is
fully resolve their executable's path to an absolute path (and then
navigate to the rest of the installation from there); but now they'll
use realpath() to avoid chdir'ing while they do that.

* pg_config applies realpath() if its initial PATH search produces a
relative path to the executable, or if the last component of that path
is a symlink.  Otherwise leave it alone, which would have the effect of
not expanding directory-level symlinks.

I think that changing pg_config's behavior would be enough to resolve
the complaints you listed, but perhaps I'm missing some fine points.

            regards, tom lane



Re: Can we avoid chdir'ing in resolve_symlinks() ?

From
Tom Lane
Date:
I wrote:
> I think that changing pg_config's behavior would be enough to resolve
> the complaints you listed, but perhaps I'm missing some fine points.

Meanwhile I've gone ahead and pushed my v1 patch (plus Munro's
recommendation about _fullpath error handling), so we can see if
the buildfarm blows up.  The question of whether we can sometimes
skip replacement of symlinks seems like material for a second patch
in any case.

            regards, tom lane