Thread: macOS SIP, next try
Previous discussions on macOS SIP: https://www.postgresql.org/message-id/flat/561E73AB.8060800%40gmx.net https://www.postgresql.org/message-id/flat/CA%2BTgmoYGi5oR8Rvb2-SY1_WEjd76H5gCqSukxFQ66qR7MewWAQ%40mail.gmail.com https://www.postgresql.org/message-id/flat/6a4d6124-41f0-756b-0811-c5c5def7ef4b%402ndquadrant.com Tests that use a temporary installation (i.e., make check) use a platform-specific environment variable to make programs and libraries find shared libraries in the temporary installation rather than in the actual configured installation target directory. On macOS, this is DYLD_LIBRARY_PATH. Ever since System Integrity Protection (SIP) arrived, this hasn't worked anymore, because DYLD_LIBRARY_PATH gets disabled by the system (in somewhat obscure ways). The workaround was to either disable SIP or to do make install before make check. The solution here is to use install_name_tool to edit the shared library path in the binaries (programs and libraries) after installation into the temporary prefix. The solution is, I think, correct in principle, but the implementation is hackish, since it hardcodes the names and versions of the interesting shared libraries in an unprincipled way. More refinement is needed here. Ideas welcome. In order to allow all that, we need to link everything with the option -headerpad_max_install_names so that there is enough room in the binaries to put in the temporary path in place of the original one. (The temporary path is strictly longer than the original one.) This bloats the binaries, so it might only be appropriate for development builds and should perhaps be hidden behind some option. Ideas also welcome here. The attached patch works for me. You can see that it disables setting DYLD_LIBRARY_PATH, but the tests still work. Verification on other system variantions welcome, of course. Comments welcome. I think this direction is promising, but some details need to be addressed.
Attachment
Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes: > The solution here is to use install_name_tool to edit the shared library > path in the binaries (programs and libraries) after installation into > the temporary prefix. Ah, this indeed seems like a promising direction. > The solution is, I think, correct in principle, but the implementation > is hackish, since it hardcodes the names and versions of the interesting > shared libraries in an unprincipled way. More refinement is needed > here. Ideas welcome. Yeah, it's pretty hacky. I tried the patch as given, and "make check" failed with error: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/install_name_tool: can't openfile: /Users/tgl/pgsql/tmp_install//Users/tgl/testversion/lib/*.so (No such file or directory) make: *** [temp-install] Error 1 evidently because there are no files named *.so in the "lib" directory in my build. I removed '$(abs_top_builddir)/tmp_install/$(libdir)'/*.so from the "call temp_install_postprocess" invocation, and then it got through "make temp-install", and most of the core regression tests worked. But a couple of them failed with +ERROR: could not load library "/Users/tgl/pgsql/tmp_install/Users/tgl/testversion/lib/postgresql/libpqwalreceiver.so": dlopen(/Users/tgl/pgsql/tmp_install/Users/tgl/testversion/lib/postgresql/libpqwalreceiver.so,10): Library not loaded: /Users/tgl/testversion/lib/libpq.5.dylib + Referenced from: /Users/tgl/pgsql/tmp_install/Users/tgl/testversion/lib/postgresql/libpqwalreceiver.so + Reason: image not found What this looks like to me is some confusion about whether "pgsql" or "postgresql" has been injected into the install path or not. (This build is using /Users/tgl/testversion as the install --prefix; I surmise that your path had "pgsql" or "postgres" in it already.) One plausible way to fix this is to use "find" on the install tree to locate files to modify, instead of hard-wiring knowledge into the "call temp_install_postprocess" invocation. A possibly better way is to arrange to invoke install_name_tool once per installed executable or shlib, at the time we do "make install" for it. Not sure how hard the latter approach is given our Makefiles. Another suggestion, after reading "man install_name_tool", is that maybe you could avoid having to know the names of the specific libraries if you didn't use per-library '-change' switches, but instead installed the test-installation rpath with the -add_rpath switch. This would be closer to the spirit of the existing test process anyway. Aside from removing that annoying requirement, this method would bound the amount of extra header space needed, so instead of "-headerpad_max_install_names" we could use "-headerpad N" with some not-hugely-generous N. That ought to damp down the executable bloat to the point where we'd not have to worry too much about it. (Pokes around ... it appears that on my old dinosaur prairiedog, install_name_tool exists and has the -change switch, but not anything for rpath. So we'd better investigate how old -add_rpath is. Maybe there's another way to set rpath that works further back? At least the -headerpad switches exist there.) I haven't experimented with any of these ideas, so maybe they won't work for some reason. But I think we ought to push forward looking for a solution, because it's getting harder and harder to disable SIP and still have a useful macOS installation :-( regards, tom lane
> On Dec 2, 2020, at 6:28 AM, Peter Eisentraut <peter.eisentraut@enterprisedb.com> wrote: > > Previous discussions on macOS SIP: > > https://www.postgresql.org/message-id/flat/561E73AB.8060800%40gmx.net > https://www.postgresql.org/message-id/flat/CA%2BTgmoYGi5oR8Rvb2-SY1_WEjd76H5gCqSukxFQ66qR7MewWAQ%40mail.gmail.com > https://www.postgresql.org/message-id/flat/6a4d6124-41f0-756b-0811-c5c5def7ef4b%402ndquadrant.com See also https://www.postgresql.org/message-id/flat/18012hGLG6HJ9pQDkHAMYuwQKg%40sparkpost.com — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Mark Dilger <mark.dilger@enterprisedb.com> writes: > See also https://www.postgresql.org/message-id/flat/18012hGLG6HJ9pQDkHAMYuwQKg%40sparkpost.com Yeah. As I recall from that thread and prior ones, the bottleneck is really just /bin/sh: something, either the kernel or sh itself, is clearing out DYLD_LIBRARY_PATH when a shell script starts. (Since PATH is not reset at the same time, this seems the height of stupidity/uselessness; but I digress.) But if "make" has a setting for that variable, the setting *is* successfully passed down to executed programs. Which implies that make doesn't use the shell while running stuff, which seems odd. But it gives us a possible escape hatch. I experimented a bit, and found that the attached very-quick-hack patch is enough to get through "make check" and also the isolation tests. Unsurprisingly, none of the TAP tests work. I experimented a bit and it looks like "prove" may be failing to pass down DYLD_LIBRARY_PATH to the test scripts, which would be annoying but could be worked around (say, by getting TestLib.pm to pull the value from somewhere). Based on what I've got here, this avenue probably ends up messier and more invasive than the install_name_tool avenue, since we'd likely have to teach quite a few places about explicitly passing down DYLD_LIBRARY_PATH. It does have the advantage that "make check" doesn't have to test executables that are different from the bits we'll really install; but that's probably not worth all that much. Unless someone can think of a cute way to centralize the changes some more, I'm inclined to think this isn't the way forward. But it seemed worth posting the results anyway. regards, tom lane diff --git a/src/common/exec.c b/src/common/exec.c index 6cbc820042..4de79f94ef 100644 --- a/src/common/exec.c +++ b/src/common/exec.c @@ -323,8 +323,9 @@ int find_other_exec(const char *argv0, const char *target, const char *versionstr, char *retpath) { - char cmd[MAXPGPATH]; + char cmd[MAXPGPATH * 2]; char line[MAXPGPATH]; + char *dlp = getenv("DYLD_LIBRARY_PATH"); if (find_my_exec(argv0, retpath) < 0) return -1; @@ -340,7 +341,11 @@ find_other_exec(const char *argv0, const char *target, if (validate_exec(retpath) != 0) return -1; - snprintf(cmd, sizeof(cmd), "\"%s\" -V", retpath); + if (dlp) + snprintf(cmd, sizeof(cmd), "DYLD_LIBRARY_PATH=\"%s\" \"%s\" -V", + dlp, retpath); + else + snprintf(cmd, sizeof(cmd), "\"%s\" -V", retpath); if (!pipe_read_line(cmd, line, sizeof(line))) return -1; diff --git a/src/test/regress/pg_regress.c b/src/test/regress/pg_regress.c index 5cfb4c4a49..dbde78608a 100644 --- a/src/test/regress/pg_regress.c +++ b/src/test/regress/pg_regress.c @@ -77,6 +77,7 @@ bool debug = false; char *inputdir = "."; char *outputdir = "."; char *bindir = PGBINDIR; +char *setlibpath = NULL; char *launcher = NULL; static _stringlist *loadextension = NULL; static int max_connections = 0; @@ -270,7 +271,8 @@ stop_postmaster(void) fflush(stderr); snprintf(buf, sizeof(buf), - "\"%s%spg_ctl\" stop -D \"%s/data\" -s", + "%s\"%s%spg_ctl\" stop -D \"%s/data\" -s", + setlibpath, bindir ? bindir : "", bindir ? "/" : "", temp_instance); @@ -1135,7 +1137,8 @@ psql_command(const char *database, const char *query,...) /* And now we can build and execute the shell command */ snprintf(psql_cmd, sizeof(psql_cmd), - "\"%s%spsql\" -X -c \"%s\" \"%s\"", + "%s\"%s%spsql\" -X -c \"%s\" \"%s\"", + setlibpath, bindir ? bindir : "", bindir ? "/" : "", query_escaped, @@ -1187,7 +1190,7 @@ spawn_process(const char *cmdline) */ char *cmdline2; - cmdline2 = psprintf("exec %s", cmdline); + cmdline2 = psprintf("%sexec %s", setlibpath, cmdline); execl(shellprog, shellprog, "-c", cmdline2, (char *) NULL); fprintf(stderr, _("%s: could not exec \"%s\": %s\n"), progname, shellprog, strerror(errno)); @@ -2290,6 +2293,15 @@ regression_main(int argc, char *argv[], init_function ifunc, test_function tfunc unlimit_core_size(); #endif + { + char *dlp = getenv("DYLD_LIBRARY_PATH"); + + if (dlp) + setlibpath = psprintf("DYLD_LIBRARY_PATH=\"%s\" ", dlp); + else + setlibpath = ""; + } + if (temp_instance) { FILE *pg_conf; @@ -2324,7 +2336,8 @@ regression_main(int argc, char *argv[], init_function ifunc, test_function tfunc /* initdb */ header(_("initializing database system")); snprintf(buf, sizeof(buf), - "\"%s%sinitdb\" -D \"%s/data\" --no-clean --no-sync%s%s > \"%s/log/initdb.log\" 2>&1", + "%s\"%s%sinitdb\" -D \"%s/data\" --no-clean --no-sync%s%s > \"%s/log/initdb.log\" 2>&1", + setlibpath, bindir ? bindir : "", bindir ? "/" : "", temp_instance, @@ -2397,7 +2410,8 @@ regression_main(int argc, char *argv[], init_function ifunc, test_function tfunc * Check if there is a postmaster running already. */ snprintf(buf2, sizeof(buf2), - "\"%s%spsql\" -X postgres <%s 2>%s", + "%s\"%s%spsql\" -X postgres <%s 2>%s", + setlibpath, bindir ? bindir : "", bindir ? "/" : "", DEVNULL, DEVNULL);
BTW, a couple other things that should be noted here: * Per observation in a nearby thread, install_name_tool seems to be provided by Apple's "Command Line Tools", but not by Xcode. This is also true of bison and flex, but we don't require those in a build-from-tarball. So relying on install_name_tool would represent an expansion of our requirements for end-user builds. I don't think this is a fatal objection, not least because the CLT are a great deal smaller than Xcode --- so we likely ought to encourage people to install just the former. But it's a change. * By chance I came across a previous thread in which someone suggested use of install_name_tool: https://www.postgresql.org/message-id/flat/CAN-RpxCFWqPXQD8CqP3UBqdMwUgQWLG%2By7vQgxQdJR8aiKB89A%40mail.gmail.com I griped in that thread that it didn't help for test executables that don't get installed, such as isolationtester, because the hack never got applied to them. Re-reading Peter's patch with that in mind, I see he replaces the path in isolationtester in-place, which means that it works in "make check" but will fail in "make installcheck". So I'm not sure how to cope with that, but it's a problem. regards, tom lane