Thread: Re: [COMMITTERS] pgsql: Refactor flex and bison make rules
Andrew Dunstan <andrew@dunslane.net> writes: > On 11/28/2012 02:14 PM, Alvaro Herrera wrote: >> Okapi has been failing sporadically on ecpg, and I wonder if it's >> related to this change. > Well, it looks like the make is broken and missing a clear dependency > requirement. I think we need to ask Jeremy to turn off parallel build > for okapi. Yeah, we already know that unpatched make 3.82 has got serious parallelism bugs: http://archives.postgresql.org/pgsql-hackers/2012-09/msg00397.php I wonder whether adding another .NOTPARALLEL directive would be a better idea than insisting people get hold of patched versions. regards, tom lane
On 11/28/12 6:01 PM, Tom Lane wrote: > Andrew Dunstan <andrew@dunslane.net> writes: >> On 11/28/2012 02:14 PM, Alvaro Herrera wrote: >>> Okapi has been failing sporadically on ecpg, and I wonder if it's >>> related to this change. > >> Well, it looks like the make is broken and missing a clear dependency >> requirement. I think we need to ask Jeremy to turn off parallel build >> for okapi. > > Yeah, we already know that unpatched make 3.82 has got serious > parallelism bugs: > http://archives.postgresql.org/pgsql-hackers/2012-09/msg00397.php > > I wonder whether adding another .NOTPARALLEL directive would be a better > idea than insisting people get hold of patched versions. We could put ifeq ($(MAKE_VERSION),3.82) .NOTPARALLEL: endif into Makefile.global.
On 11/28/2012 06:01 PM, Tom Lane wrote: > Andrew Dunstan <andrew@dunslane.net> writes: >> On 11/28/2012 02:14 PM, Alvaro Herrera wrote: >>> Okapi has been failing sporadically on ecpg, and I wonder if it's >>> related to this change. >> Well, it looks like the make is broken and missing a clear dependency >> requirement. I think we need to ask Jeremy to turn off parallel build >> for okapi. > Yeah, we already know that unpatched make 3.82 has got serious > parallelism bugs: > http://archives.postgresql.org/pgsql-hackers/2012-09/msg00397.php > > I wonder whether adding another .NOTPARALLEL directive would be a better > idea than insisting people get hold of patched versions. > > You mean in the preproc Makefile? Maybe. cheers andrew
Peter Eisentraut <peter_e@gmx.net> writes: > On 11/28/12 6:01 PM, Tom Lane wrote: >> I wonder whether adding another .NOTPARALLEL directive would be a better >> idea than insisting people get hold of patched versions. > We could put > ifeq ($(MAKE_VERSION),3.82) > .NOTPARALLEL: > endif > into Makefile.global. I don't wish to go *that* far. Parallel make works fine for most of the tree in 3.82, and shutting it off would penalize developers a lot. It appears to me that the case that okapi is hitting is specific to the ecpg preprocessor build rules, and indeed specific to the case where preproc.c needs to be rebuilt. A .NOTPARALLEL in ecpg/preproc/Makefile would probably be enough to fix it. (I'm a bit tempted to make the one already added to ecpg/Makefile conditional on the make version, as you suggest above, too.) regards, tom lane
On 11/28/2012 06:19 PM, Tom Lane wrote: > Peter Eisentraut <peter_e@gmx.net> writes: >> On 11/28/12 6:01 PM, Tom Lane wrote: >>> I wonder whether adding another .NOTPARALLEL directive would be a better >>> idea than insisting people get hold of patched versions. >> We could put >> ifeq ($(MAKE_VERSION),3.82) >> .NOTPARALLEL: >> endif >> into Makefile.global. > I don't wish to go *that* far. Parallel make works fine for most of the > tree in 3.82, and shutting it off would penalize developers a lot. > > It appears to me that the case that okapi is hitting is specific to the > ecpg preprocessor build rules, and indeed specific to the case where > preproc.c needs to be rebuilt. A .NOTPARALLEL in ecpg/preproc/Makefile > would probably be enough to fix it. (I'm a bit tempted to make the one > already added to ecpg/Makefile conditional on the make version, as you > suggest above, too.) > > There is something odd about okapi, because my linux/gcc buildfarm animal is using make 3.82 happily, with make_jobs = 4. cheers andrew
Andrew Dunstan <andrew@dunslane.net> writes: > On 11/28/2012 06:19 PM, Tom Lane wrote: >> It appears to me that the case that okapi is hitting is specific to the >> ecpg preprocessor build rules, and indeed specific to the case where >> preproc.c needs to be rebuilt. A .NOTPARALLEL in ecpg/preproc/Makefile >> would probably be enough to fix it. (I'm a bit tempted to make the one >> already added to ecpg/Makefile conditional on the make version, as you >> suggest above, too.) > There is something odd about okapi, because my linux/gcc buildfarm > animal is using make 3.82 happily, with make_jobs = 4. Yeah, and nobody else has seen this either. It might just be that okapi has exactly the right number of processors with exactly the right speeds to make the failure a lot more probable. Or maybe there's something weird about Gentoo's version of make (wouldn't be the first time). Anyway, deparallelizing just the ecpg/preproc build would cost very little in build time, since it's totally dominated by the preproc.c and preproc.o build steps anyway. I'm inclined to just do it and see if the problem goes away. regards, tom lane
Jeremy Drake <pgbuildfarm@jdrake.com> writes: > While we're talking about odd issues that only seem to happen on Okapi, > does anyone know of anything I can do to diagnose the pg_upgrade failure > on the 9.2 branch? There are no rogue (non-buildfarm-related) > postmaster/postgres processes running on the machine. [ digs around ... ] It looks like the failure is coming from here: if (strlen(path) >= sizeof(unp->sun_path)) return EAI_FAIL; What's the size of the sun_path member of struct sockaddr_un on your machine? I count 115 characters in your socket path ... maybe you just need a less deeply nested test directory. (If that is the problem, seems like we need to return something more helpful than EAI_FAIL here.) regards, tom lane
On Wed, 28 Nov 2012, Tom Lane wrote: > Andrew Dunstan <andrew@dunslane.net> writes: > > On 11/28/2012 02:14 PM, Alvaro Herrera wrote: > >> Okapi has been failing sporadically on ecpg, and I wonder if it's > >> related to this change. > > > Well, it looks like the make is broken and missing a clear dependency > > requirement. I think we need to ask Jeremy to turn off parallel build > > for okapi. > > Yeah, we already know that unpatched make 3.82 has got serious > parallelism bugs: > http://archives.postgresql.org/pgsql-hackers/2012-09/msg00397.php > > I wonder whether adding another .NOTPARALLEL directive would be a better > idea than insisting people get hold of patched versions. While we're talking about odd issues that only seem to happen on Okapi, does anyone know of anything I can do to diagnose the pg_upgrade failure on the 9.2 branch? There are no rogue (non-buildfarm-related) postmaster/postgres processes running on the machine.
On Wed, 28 Nov 2012, Tom Lane wrote: > Jeremy Drake <pgbuildfarm@jdrake.com> writes: > > While we're talking about odd issues that only seem to happen on Okapi, > > does anyone know of anything I can do to diagnose the pg_upgrade failure > > on the 9.2 branch? There are no rogue (non-buildfarm-related) > > postmaster/postgres processes running on the machine. > > [ digs around ... ] It looks like the failure is coming from here: > > if (strlen(path) >= sizeof(unp->sun_path)) > return EAI_FAIL; > > What's the size of the sun_path member of struct sockaddr_un on your > machine? I count 115 characters in your socket path ... maybe you > just need a less deeply nested test directory. > > (If that is the problem, seems like we need to return something > more helpful than EAI_FAIL here.) /usr/include/sys/un.h: char sun_path[108]; /* Path name. */ That seems to be it. This may be just the excuse I needed to set up dedicated users for my buildfarm animals.
On 11/28/2012 11:03 PM, Tom Lane wrote: > Jeremy Drake <pgbuildfarm@jdrake.com> writes: >> While we're talking about odd issues that only seem to happen on Okapi, >> does anyone know of anything I can do to diagnose the pg_upgrade failure >> on the 9.2 branch? There are no rogue (non-buildfarm-related) >> postmaster/postgres processes running on the machine. > [ digs around ... ] It looks like the failure is coming from here: > > if (strlen(path) >= sizeof(unp->sun_path)) > return EAI_FAIL; > > What's the size of the sun_path member of struct sockaddr_un on your > machine? I count 115 characters in your socket path ... maybe you > just need a less deeply nested test directory. > > (If that is the problem, seems like we need to return something > more helpful than EAI_FAIL here.) > > Looks like it was. Good catch. What's the best way to fix? Note that this problem was triggered by having a buildfarm buildroot path of length about 49 or 50. I'm lucky not to have triggered it myself. Do I need to add a check to limit the buildroot path length? cheers andrew
Overlength socket paths (was Re: [COMMITTERS] pgsql: Refactor flex and bison make rules)
From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes: > On 11/28/2012 11:03 PM, Tom Lane wrote: >> [ digs around ... ] It looks like the failure is coming from here: >> >> if (strlen(path) >= sizeof(unp->sun_path)) >> return EAI_FAIL; > Looks like it was. Good catch. What's the best way to fix? So far as I can see, none of the spec-defined EAI_XXX codes map very nicely to "path name too long". Possibly we could return EAI_SYSTEM and set errno to ENAMETOOLONG, but I'm not sure the latter is very portable either. Another line of attack is to just teach getnameinfo_unix() to malloc its result struct big enough to hold whatever the supplied path is. The portability risk here is if sun_path is not the last field in struct sockaddr_un on some platform --- but that seems a bit unlikely, and even if it isn't I doubt we access any other members besides sun_family and sun_path. I kind of like this approach, since it gets rid of the length limitation rather than just reporting it more clearly. regards, tom lane
Re: Overlength socket paths (was Re: [COMMITTERS] pgsql: Refactor flex and bison make rules)
From
Tom Lane
Date:
I wrote: > Andrew Dunstan <andrew@dunslane.net> writes: >> Looks like it was. Good catch. What's the best way to fix? > So far as I can see, none of the spec-defined EAI_XXX codes map very > nicely to "path name too long". Possibly we could return EAI_SYSTEM > and set errno to ENAMETOOLONG, but I'm not sure the latter is very > portable either. I tried this out and found that at least on Linux, gai_strerror() is too stupid to pay attention to errno anyway; you just get "System error", which is about as unhelpful as it could possibly be. I don't see any way that we can get a more specific error message to be printed without eliminating use of gai_strerror and providing our own infrastructure for reporting getaddrinfo errors. While that wouldn't be incredibly awful (we have such infrastructure already for ancient platforms...), it still kinda sucks. > Another line of attack is to just teach getaddrinfo_unix() to malloc its > result struct big enough to hold whatever the supplied path is. I tried this out too, and found that it doesn't work well, because both libpq and the backend expect to be able to copy getaddrinfo results into fixed-size SockAddr structs. We could probably fix that by adding another layer of pointers and malloc operations, but it would be somewhat invasive. Given the lack of prior complaints it's not clear to me that it's worth that much trouble --- although getting rid of our hard-wired assumptions about the maximum result size from getaddrinfo is attractive from a robustness standpoint. I'm a bit tempted to just replace EAI_FAIL with EAI_MEMORY here, so that you'd get messages similar to "Memory allocation failure". That might at least point your thoughts in the right direction, whereas "Non-recoverable failure in name resolution" certainly conveys nothing of use. Thoughts? regards, tom lane
Re: Overlength socket paths (was Re: [COMMITTERS] pgsql: Refactor flex and bison make rules)
From
Andrew Dunstan
Date:
On 11/29/2012 03:33 PM, Tom Lane wrote: > I wrote: >> Andrew Dunstan <andrew@dunslane.net> writes: >>> Looks like it was. Good catch. What's the best way to fix? >> So far as I can see, none of the spec-defined EAI_XXX codes map very >> nicely to "path name too long". Possibly we could return EAI_SYSTEM >> and set errno to ENAMETOOLONG, but I'm not sure the latter is very >> portable either. > I tried this out and found that at least on Linux, gai_strerror() is too > stupid to pay attention to errno anyway; you just get "System error", > which is about as unhelpful as it could possibly be. I don't see any > way that we can get a more specific error message to be printed without > eliminating use of gai_strerror and providing our own infrastructure for > reporting getaddrinfo errors. While that wouldn't be incredibly awful > (we have such infrastructure already for ancient platforms...), it > still kinda sucks. > >> Another line of attack is to just teach getaddrinfo_unix() to malloc its >> result struct big enough to hold whatever the supplied path is. > I tried this out too, and found that it doesn't work well, because both > libpq and the backend expect to be able to copy getaddrinfo results into > fixed-size SockAddr structs. We could probably fix that by adding > another layer of pointers and malloc operations, but it would be > somewhat invasive. Given the lack of prior complaints it's not clear > to me that it's worth that much trouble --- although getting rid of our > hard-wired assumptions about the maximum result size from getaddrinfo is > attractive from a robustness standpoint. > > I'm a bit tempted to just replace EAI_FAIL with EAI_MEMORY here, so > that you'd get messages similar to "Memory allocation failure". That > might at least point your thoughts in the right direction, whereas > "Non-recoverable failure in name resolution" certainly conveys nothing > of use. > > Thoughts? I don't think it's worth a heroic effort. Meanwhile I'll add a check in the upgrade test module(s) to check for overly long paths likely to give problems. cheers andrew
Re: Overlength socket paths (was Re: [COMMITTERS] pgsql: Refactor flex and bison make rules)
From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes: > On 11/29/2012 03:33 PM, Tom Lane wrote: >>> Another line of attack is to just teach getaddrinfo_unix() to malloc its >>> result struct big enough to hold whatever the supplied path is. >> I tried this out too, and found that it doesn't work well, because both >> libpq and the backend expect to be able to copy getaddrinfo results into >> fixed-size SockAddr structs. We could probably fix that by adding >> another layer of pointers and malloc operations, but it would be >> somewhat invasive. Given the lack of prior complaints it's not clear >> to me that it's worth that much trouble --- although getting rid of our >> hard-wired assumptions about the maximum result size from getaddrinfo is >> attractive from a robustness standpoint. > I don't think it's worth a heroic effort. Meanwhile I'll add a check in > the upgrade test module(s) to check for overly long paths likely to give > problems. I'm thinking maybe we should try to fix this. What's bugging me is that Jeremy's build path doesn't look all that unreasonably long. The scary scenario that's in the back of my mind is that one day somebody decides to rearrange the Red Hat build servers a bit and suddenly I can't build Postgres there anymore, because the build directory is nested a bit too deep. Murphy's law would of course dictate that I find this out only when under the gun to get a security update packaged. regards, tom lane
Re: Overlength socket paths (was Re: [COMMITTERS] pgsql: Refactor flex and bison make rules)
From
Noah Misch
Date:
On Thu, Nov 29, 2012 at 03:33:59PM -0500, Tom Lane wrote: > I wrote: > > So far as I can see, none of the spec-defined EAI_XXX codes map very > > nicely to "path name too long". Possibly we could return EAI_SYSTEM > > and set errno to ENAMETOOLONG, but I'm not sure the latter is very > > portable either. > > I tried this out and found that at least on Linux, gai_strerror() is too > stupid to pay attention to errno anyway; you just get "System error", > which is about as unhelpful as it could possibly be. I don't see any > way that we can get a more specific error message to be printed without > eliminating use of gai_strerror and providing our own infrastructure for > reporting getaddrinfo errors. While that wouldn't be incredibly awful > (we have such infrastructure already for ancient platforms...), it > still kinda sucks. RFC 2553 and successor standards do not call for gai_strerror() to look at anything other than its argument, so your finding for Linux surprises me less than its alternative. Adopt code like "rc == EAI_SYSTEM ? strerror(errno) : gai_strerror(rc)" to report the error, and your proposal to use ENAMETOOLONG sounds suitable. > > Another line of attack is to just teach getaddrinfo_unix() to malloc its > > result struct big enough to hold whatever the supplied path is. > > I tried this out too, and found that it doesn't work well, because both > libpq and the backend expect to be able to copy getaddrinfo results into > fixed-size SockAddr structs. We could probably fix that by adding > another layer of pointers and malloc operations, but it would be > somewhat invasive. Given the lack of prior complaints it's not clear > to me that it's worth that much trouble --- although getting rid of our > hard-wired assumptions about the maximum result size from getaddrinfo is > attractive from a robustness standpoint. Linux enforces a hard limit matching the static buffer in sockaddr_un. You'd proceed a bit further and hit "could not bind Unix socket: Invalid argument" or some such. I agree we should perhaps fix pg_upgrade to work even when its CWD is not usable as a socket path. It could create a temporary directory under /tmp and place the socket there, for example. Thanks, nm
Re: Overlength socket paths (was Re: [COMMITTERS] pgsql: Refactor flex and bison make rules)
From
Tom Lane
Date:
Noah Misch <noah@leadboat.com> writes: > Linux enforces a hard limit matching the static buffer in sockaddr_un. You'd > proceed a bit further and hit "could not bind Unix socket: Invalid argument" > or some such. Hm, I was wondering about that. The Single Unix Spec suggests that bind/connect ought to accept pathnames at least up to PATH_MAX, but if popular implementations don't honor that then it is a bit pointless for us to do a lot of pushups in userspace. > I agree we should perhaps fix pg_upgrade to work even when its CWD is not > usable as a socket path. It could create a temporary directory under /tmp and > place the socket there, for example. Yeah, I was starting to think that pg_upgrade's test script is the real culprit here. Every other variant of "make check" just puts the socket in the default place, typically /tmp, so it's rather useless that this one place is doing things differently. Another thing that we should possibly consider if we're going to hack on that is that "make check" is not currently very friendly to people who try to move the default socket location to someplace other than /tmp, such as the ever-popular /var/run/postgresql. The reason that this is problematic is that /var/run/postgresql may not be there at all in a build environment, and if it is, it's likely not writable by the user you're running your build as. So just using the default socket directory isn't real friendly in any case. In converting the Fedora packages to use /var/run/postgresql recently, I found I had to add the attached crude hacks to support running the regression tests during build. It'd be nice if the consideration were handled by unmodified sources ... regards, tom lane diff -Naur postgresql-9.2.0.sockets/contrib/pg_upgrade/test.sh postgresql-9.2.0/contrib/pg_upgrade/test.sh --- postgresql-9.2.0.sockets/contrib/pg_upgrade/test.sh 2012-09-06 17:26:17.000000000 -0400 +++ postgresql-9.2.0/contrib/pg_upgrade/test.sh 2012-09-06 18:13:18.178092176 -0400 @@ -62,10 +62,14 @@rm -rf "$logdir"mkdir "$logdir" +# we want the Unix sockets in $temp_root +PGHOST=$temp_root +export PGHOST +set -x$oldbindir/initdb -$oldbindir/pg_ctl start -l "$logdir/postmaster1.log" -w +$oldbindir/pg_ctl start -l "$logdir/postmaster1.log" -o "-c unix_socket_directories='$PGHOST'" -wif "$MAKE" -C "$oldsrc"installcheck; then pg_dumpall -f "$temp_root"/dump1.sql || pg_dumpall1_status=$? if [ "$newsrc" != "$oldsrc"]; then @@ -108,7 +112,7 @@pg_upgrade -d "${PGDATA}.old" -D "${PGDATA}" -b "$oldbindir" -B "$bindir" -pg_ctl start -l "$logdir/postmaster2.log" -w +pg_ctl start -l "$logdir/postmaster2.log" -o "-c unix_socket_directories='$PGHOST'" -wif [ $testhost = Msys ] ; then cmd /c analyze_new_cluster.bat diff -Naur postgresql-9.2.0.sockets/src/test/regress/pg_regress.c postgresql-9.2.0/src/test/regress/pg_regress.c --- postgresql-9.2.0.sockets/src/test/regress/pg_regress.c 2012-09-06 17:26:17.000000000 -0400 +++ postgresql-9.2.0/src/test/regress/pg_regress.c 2012-09-06 18:13:18.184092537 -0400 @@ -772,7 +772,7 @@ if (hostname != NULL) doputenv("PGHOST", hostname); else - unsetenv("PGHOST"); + doputenv("PGHOST", "/tmp"); unsetenv("PGHOSTADDR"); if (port != -1) { @@ -2233,7 +2233,7 @@ */ header(_("starting postmaster")); snprintf(buf, sizeof(buf), - SYSTEMQUOTE "\"%s/postgres\" -D \"%s/data\" -F%s -c \"listen_addresses=%s\" > \"%s/log/postmaster.log\"2>&1" SYSTEMQUOTE, + SYSTEMQUOTE "\"%s/postgres\" -D \"%s/data\" -F%s -c \"listen_addresses=%s\" -c \"unix_socket_directories=/tmp\"> \"%s/log/postmaster.log\" 2>&1" SYSTEMQUOTE, bindir, temp_install, debug ? " -d 5" : "", hostname ? hostname : "",
Re: Overlength socket paths (was Re: [COMMITTERS] pgsql: Refactor flex and bison make rules)
From
Andrew Dunstan
Date:
On 11/29/2012 05:20 PM, Tom Lane wrote: >> I don't think it's worth a heroic effort. Meanwhile I'll add a check in >> the upgrade test module(s) to check for overly long paths likely to give >> problems. > I'm thinking maybe we should try to fix this. What's bugging me is that > Jeremy's build path doesn't look all that unreasonably long. The scary > scenario that's in the back of my mind is that one day somebody decides > to rearrange the Red Hat build servers a bit and suddenly I can't build > Postgres there anymore, because the build directory is nested a bit too > deep. Murphy's law would of course dictate that I find this out only > when under the gun to get a security update packaged. > > The only thing it breaks AFAIK is pg_upgrade testing because pg_upgrade insists on setting the current directory as the socket dir. Maybe we need a pg_upgrade option to specify the socket dir to use. Or maybe the postmaster needs to check the length somehow before it calls StreamServerPort() so we can give a saner error message. cheers andrew
Re: Overlength socket paths (was Re: [COMMITTERS] pgsql: Refactor flex and bison make rules)
From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes: > ... Or maybe the > postmaster needs to check the length somehow before it calls > StreamServerPort() so we can give a saner error message. Hm. That's ugly, but a lot less invasive than trying to get the official API to pass the information through. Sounds like a plan to me. However, that's only addressing the reporting end of it; I think we also need to change the pg_upgrade test script as suggested by Noah. regards, tom lane
Re: Overlength socket paths (was Re: [COMMITTERS] pgsql: Refactor flex and bison make rules)
From
Andrew Dunstan
Date:
On 11/29/2012 06:23 PM, Tom Lane wrote: > Andrew Dunstan <andrew@dunslane.net> writes: >> ... Or maybe the >> postmaster needs to check the length somehow before it calls >> StreamServerPort() so we can give a saner error message. > Hm. That's ugly, but a lot less invasive than trying to get the > official API to pass the information through. Sounds like a plan to me. > > However, that's only addressing the reporting end of it; I think we > also need to change the pg_upgrade test script as suggested by Noah. > > The test script doesn't do anything. It's pg_upgrade itself that sets the socket dir. See option.c:get_sock_dir() cheers andrew
Re: Overlength socket paths (was Re: [COMMITTERS] pgsql: Refactor flex and bison make rules)
From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes: > On 11/29/2012 06:23 PM, Tom Lane wrote: >> However, that's only addressing the reporting end of it; I think we >> also need to change the pg_upgrade test script as suggested by Noah. > The test script doesn't do anything. It's pg_upgrade itself that sets > the socket dir. See option.c:get_sock_dir() Um ... that's *another* place that needs to change then, because the test script fires up postmasters on its own, outside of pg_upgrade. regards, tom lane
Re: Overlength socket paths (was Re: [COMMITTERS] pgsql: Refactor flex and bison make rules)
From
Tom Lane
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes: > Andrew Dunstan <andrew@dunslane.net> writes: >> ... Or maybe the >> postmaster needs to check the length somehow before it calls >> StreamServerPort() so we can give a saner error message. > Hm. That's ugly, but a lot less invasive than trying to get the > official API to pass the information through. Sounds like a plan to me. Here's a patch for that --- I think we should apply and back-patch this. regards, tom lane diff --git a/src/backend/libpq/pqcomm.c b/src/backend/libpq/pqcomm.c index 5e86987f221fee01c5049f925492e0f9c441d372..15a01a8324b8a6ff9727a1b02e7ba8fe385d3a39 100644 *** a/src/backend/libpq/pqcomm.c --- b/src/backend/libpq/pqcomm.c *************** StreamServerPort(int family, char *hostN *** 308,313 **** --- 308,321 ---- * that file path */ UNIXSOCK_PATH(unixSocketPath, portNumber, unixSocketDir); + if (strlen(unixSocketPath) >= UNIXSOCK_PATH_BUFLEN) + { + ereport(LOG, + (errmsg("Unix-domain socket path \"%s\" is too long (maximum %d bytes)", + unixSocketPath, + (int) (UNIXSOCK_PATH_BUFLEN - 1)))); + return STATUS_ERROR; + } if (Lock_AF_UNIX(unixSocketDir, unixSocketPath) != STATUS_OK) return STATUS_ERROR; service = unixSocketPath; diff --git a/src/include/libpq/pqcomm.h b/src/include/libpq/pqcomm.h index 604b5535df6b24885b782688fee9ff7fd284746b..635132dec9317b4045108c75cd8c67a70c360968 100644 *** a/src/include/libpq/pqcomm.h --- b/src/include/libpq/pqcomm.h *************** typedef struct *** 74,79 **** --- 74,92 ---- (port)) /* + * The maximum workable length of a socket path is what will fit into + * struct sockaddr_un. This is usually only 100 or so bytes :-(. + * + * For consistency, always pass a MAXPGPATH-sized buffer to UNIXSOCK_PATH(), + * then complain if the resulting string is >= UNIXSOCK_PATH_BUFLEN bytes. + * (Because the standard API for getaddrinfo doesn't allow it to complain in + * a useful way when the socket pathname is too long, we have to test for + * this explicitly, instead of just letting the subroutine return an error.) + */ + #define UNIXSOCK_PATH_BUFLEN sizeof(((struct sockaddr_un *) NULL)->sun_path) + + + /* * These manipulate the frontend/backend protocol version number. * * The major number should be incremented for incompatible changes. The minor diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c index 9eaf41025beb652c6c242035490d93319a6bc5d0..1386bb791a96fab087af1ba33546ab89772327fd 100644 *** a/src/interfaces/libpq/fe-connect.c --- b/src/interfaces/libpq/fe-connect.c *************** static int *** 1322,1328 **** connectDBStart(PGconn *conn) { int portnum; ! char portstr[128]; struct addrinfo *addrs = NULL; struct addrinfo hint; const char *node; --- 1322,1328 ---- connectDBStart(PGconn *conn) { int portnum; ! char portstr[MAXPGPATH]; struct addrinfo *addrs = NULL; struct addrinfo hint; const char *node; *************** connectDBStart(PGconn *conn) *** 1384,1389 **** --- 1384,1398 ---- node = NULL; hint.ai_family = AF_UNIX; UNIXSOCK_PATH(portstr, portnum, conn->pgunixsocket); + if (strlen(portstr) >= UNIXSOCK_PATH_BUFLEN) + { + appendPQExpBuffer(&conn->errorMessage, + libpq_gettext("Unix-domain socket path \"%s\" is too long (maximum %d bytes)\n"), + portstr, + (int) (UNIXSOCK_PATH_BUFLEN - 1)); + conn->options_valid = false; + goto connect_errReturn; + } #else /* Without Unix sockets, default to localhost instead */ node = DefaultHost;
Re: Overlength socket paths (was Re: [COMMITTERS] pgsql: Refactor flex and bison make rules)
From
Andrew Dunstan
Date:
On 11/29/2012 07:16 PM, Tom Lane wrote: > Andrew Dunstan <andrew@dunslane.net> writes: >> On 11/29/2012 06:23 PM, Tom Lane wrote: >>> However, that's only addressing the reporting end of it; I think we >>> also need to change the pg_upgrade test script as suggested by Noah. >> The test script doesn't do anything. It's pg_upgrade itself that sets >> the socket dir. See option.c:get_sock_dir() > Um ... that's *another* place that needs to change then, because the > test script fires up postmasters on its own, outside of pg_upgrade. > > True, but it doesn't do anything about setting the socket dir, so those just get the compiled-in defaults. What exactly do you want to change about the test script? cheers andrew
Re: Overlength socket paths (was Re: [COMMITTERS] pgsql: Refactor flex and bison make rules)
From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes: > On 11/29/2012 07:16 PM, Tom Lane wrote: >> Um ... that's *another* place that needs to change then, because the >> test script fires up postmasters on its own, outside of pg_upgrade. > True, but it doesn't do anything about setting the socket dir, so those > just get the compiled-in defaults. What exactly do you want to change > about the test script? Well, I was thinking about also solving the problem that the compiled-in default might be no good in a build environment. regards, tom lane