Re: Problem with dblink regression test - Mailing list pgsql-hackers

From Jim C. Nasby
Subject Re: Problem with dblink regression test
Msg-id 20050622164547.GZ84822@decibel.org
In response to Re: Problem with dblink regression test  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Problem with dblink regression test
List pgsql-hackers
On Wed, Jun 22, 2005 at 11:45:09AM -0400, Tom Lane wrote:
> "Andrew Dunstan" <andrew@dunslane.net> writes:
> > Tom Lane said:
> >> There are several buildfarm machines failing like this.  I think a
> >> possible solution is for the postmaster to do putenv("PGPORT=nnn") so
> >> that libpq instances running in postmaster children will default to the
> >> local installation's actual port rather than some compiled-in default
> >> port.
> 
> > If this diagnosis were correct, wouldn't every buildfarm member be failing
> > at the ContribCheck stage (if they get that far)? They all run on non
> > standard ports and all run the contrib installcheck suite if they can (this
> > is required, not optional). So if they show OK then they do not exhibit the
> > problem.
> 
> Now that I'm a little more awake ...
> 
> I think the difference between the working and not-working machines
> probably has to do with dynamic-linker configuration.  You have the
> buildfarm builds using "configure --prefix=something
> --with-pgport=something".  So, the copy of libpq.so installed into
> the prefix tree has the "right" default port.  But on a machine with
> a regular installation of Postgres, there is also going to be a copy
> of libpq.so in /usr/lib or some such place ... and that copy thinks
> the default port is where the regular postmaster lives (eg 5432).
> When dblink.so is loaded into the backend, if the dynamic linker chooses
> to resolve its requirement for libpq.so by loading /usr/lib/libpq.so,
> then the wrong things happen.
> 
> In the "make check" case this is masked because pg_regress.sh has set
> PGPORT in the postmaster's environment, and that will override the
> compiled-in default.  But of course the contrib tests only work in
> "installcheck" mode.
> 
> To believe this, you have to assume that "psql" links to the correct
> version (the test version) of libpq.so but dblink.so fails to do so.
> So it's only an issue on platforms where "rpath" works for executables
> but not for shared libraries.  I haven't run down exactly which
> buildfarm machines have shown this symptom --- do you know offhand?
> 
> (Thinks some more...)  Another possibility is that on the failing
> machines, there is a system-wide PGPORT environment variable; however,
> unless you specify "-p" on the postmaster command line when you start
> the "installed" postmaster, I'd expect that to change where the
> postmaster puts its socket, so that's probably not the right answer.
> 
> If this is the correct explanation, then fooling with PGPORT would
> mask this particular symptom, but it wouldn't fix the fundamental
> problem that we're loading the wrong version of libpq.so.  Eventually
> that would come back to bite us (whenever dblink.so requires some
> feature that doesn't exist in older libpq.so versions).

Here's the info I have for my two machines (platypus and cuckoo), both
of which are exhibiting this behavior.

I manually ran the dblink regression test on platypus to see what was
going on. With port=5682 added to the connection string, it connected
to the test database properly. Without it, it complained that the
contrib_regression database didn't exist. After I added
contrib_regression to the default PostgreSQL cluster on that machine,
it errored out saying that there was no buildfarm user, which is true
of the default install there. $PGPORT isn't set globally or in the
buildfarm user account.
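
This behavior is consistent with how libpq chooses a port: an explicit
port= in the connection string wins, then the PGPORT environment
variable, then the default compiled into whichever libpq.so was loaded.
Below is a minimal Python sketch of that precedence. The names
parse_conninfo and resolve_port are hypothetical and are not part of
libpq; the parsing is simplified (no quoting) and service files are
ignored.

```python
def parse_conninfo(conninfo):
    """Split a simple 'key=value key=value' string into a dict.

    Simplified for illustration: real libpq conninfo parsing also
    handles quoting and escaping, which this sketch does not.
    """
    params = {}
    for token in conninfo.split():
        key, sep, value = token.partition("=")
        if sep:
            params[key] = value
    return params


def resolve_port(conninfo, env, compiled_default):
    """Pick a port the way the thread describes libpq doing it.

    Precedence: explicit port= in the connection string, then the
    PGPORT environment variable, then the compiled-in default.
    """
    params = parse_conninfo(conninfo)
    if params.get("port"):
        return int(params["port"])
    if env.get("PGPORT"):
        return int(env["PGPORT"])
    return compiled_default


# A system-wide libpq.so built with default port 5432, no PGPORT set:
print(resolve_port("dbname=contrib_regression", {}, 5432))
# The same library with port=5682 in the connection string:
print(resolve_port("dbname=contrib_regression port=5682", {}, 5432))
```

With no port= and no PGPORT, the only thing left to decide the port is
the library's compiled-in default, so loading the wrong libpq.so would
send dblink to the default cluster, as seen above.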

ISTM there are a couple of ways a buildfarm machine could pass besides
what Tom's mentioned. If the machine doesn't have a default install at
all, dblink may behave differently. It's also possible that the default
install has both the contrib_regression database and the user that's
running the buildfarm.

Is there a way to confirm which libpq.so psql and/or dblink.so is
actually linked against? Are there any other tests I could run to shed
some light on this?
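
One possible check, sketched in Python on the assumption that the
platform has an ldd command that prints "name => path (address)" lines:
run ldd on both psql and dblink.so and compare the libpq.so paths. The
function names are hypothetical. Note that ldd resolves libraries in
the caller's environment, which may differ from the environment the
backend runs in.

```python
import subprocess


def libpq_paths(ldd_output):
    """Return what each libpq.so entry in ldd-style output resolves to.

    Each result is either a file path or the text 'not found'.
    """
    paths = []
    for line in ldd_output.splitlines():
        name, sep, rest = line.strip().partition(" => ")
        if sep and name.startswith("libpq.so"):
            # Drop the trailing "(0x...)" load address, if present.
            paths.append(rest.split(" (")[0].strip())
    return paths


def libpq_for(binary):
    """Run ldd on an executable or shared library and report its libpq."""
    result = subprocess.run(
        ["ldd", binary], capture_output=True, text=True, check=True
    )
    return libpq_paths(result.stdout)
```

Calling libpq_for() on the test installation's psql and on its
dblink.so, then comparing the two results, would show whether they
resolve to different copies of libpq.so.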
-- 
Jim C. Nasby, Database Consultant               decibel@decibel.org 
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"

