Thread: Help? Unexpected PostgreSQL compilation failure using generic compile script
Help? Unexpected PostgreSQL compilation failure using generic compile script
From
Martin Goodson
Date:
Hello.

For reasons I won't bore you with, we compile PostgreSQL from source rather than use the standard packages for some of our databases.

We've compiled numerous PostgreSQL versions, from 11.1 to 14.4, using a fairly generic and not particularly complicated compile script that has worked successfully on dozens (possibly hundreds, I don't keep track :) ) of Red Hat boxes using numerous different versions of RHEL. This script has worked without incident for *years*.

Until last week, when we tried to compile PostgreSQL 12.9 on a RHEL 7.9 box, where it bombed out with an error we have never seen before.

To be honest, I'm not sure what's going wrong. I am by no means a Linux sysadmin or compile expert. I just run the script (and a variety of other post-build steps ...)

Our basic process:

1. Install pre-requisite libraries/packages:

yum install pam-devel
yum install libxml2-devel
yum install libxslt-devel
yum install openldap
yum install openldap-devel
yum install uuid-devel
yum install readline-devel
yum install openssl-devel
yum install libicu-devel
yum install uuid-devel
yum install gcc
yum install make

2. Create a user to compile the source and own the software. For example, pgbuild

3. Build a couple of directories owned by the build user for the destination, source, etc.

We then run the following script under the build user.

targetdir={directory to install postgresql into}
sourcedir={directory where the postgresql unzipped and untarred tarball has been located}
builddir={temporary build directory}
port={port number}

rm -Rf ${targetdir}
rm -Rf ${builddir}
mkdir ${targetdir}
mkdir ${builddir}
cd ${builddir}

${sourcedir}/configure --prefix=${targetdir} --with-pgport=${port} \
    --with-openssl \
    --with-ldap \
    --with-pam \
    --with-icu \
    --with-libxml \
    --with-ossp-uuid \
    --with-libxslt \
    --with-libedit-preferred \
    --with-gssapi \
    --enable-debug
rc=$?
if [ $rc -ne 0 ]
then
    echo "#### ERROR! Configure returned non-zero code $rc - press RETURN to continue / Ctrl+C to abort"
    read ok
fi

make world
rc=$?
if [ $rc -ne 0 ]
then
    echo "#### ERROR! make world returned non-zero code $rc - press RETURN to continue / Ctrl+C to abort"
    read ok
fi

make check
rc=$?
if [ $rc -ne 0 ]
then
    echo "#### ERROR! make check returned non-zero code $rc - press RETURN to continue / Ctrl+C to abort"
    read ok
fi

make install-world
rc=$?
if [ $rc -ne 0 ]
then
    echo "#### ERROR! install-world returned non-zero code $rc - press RETURN to continue / Ctrl+C to abort"
    read ok
fi

So, pretty straightforward stuff. Run configure, make world, make check, make install-world and a little bit of basic error checking after each step.

For years we've been able to run this script without issue, until last week, when the configure failed with the following error on one of our servers. After the usual hundreds of lines of text, configure output the following:

checking for library containing gss_init_sec_context... no
configure: error: could not find function 'gss_init_sec_context' required for GSSAPI

And then bombed out with rc 1. Rest of the script aborted due to our error checking.

Bit odd, nothing we've seen before on dozens/numerous other compiles across the enterprise.

Then I spotted that our libraries pre-install doesn't include anything for GSSAPI. Bit of a bug in our pre-reqs step, perhaps we've got away with it previously and this one server in our whole estate doesn't have GSSAPI.

I need to figure out how to install GSSAPI, but that's a bit of a faff and I need to get this build tested in a hurry. So I simply removed the --with-gssapi, and tried again.

AND IT FAILED AGAIN.

This time it failed claiming it couldn't find the ldap library. Which is most -definitely- present.

I have no idea what's going on at this point. We have *never* had any issues like this. This script/process has been in place for years and we've never had any issues with it.

It gets weirder.

The configure and make world steps work perfectly if the script is run under root. Though, of course, the make check step fails. Running it under root was inadvertent, but the fact the configure and make steps seemed to have run successfully was a bit of a surprise.

So a fairly basic script that has been used for years suddenly fails on a fairly generic RHEL 7.9 server.

I am no compilation expert. Obviously. Have I missed something basic? As I said, we've not seen problems like this before. Could there be some sort of issue on the box's configuration? If it works for root but not our usual build user, could there be a user config issue with our account? Can anyone offer any insight on what I need to check? At the moment it all seems somewhat ... mystifying.

I am assuming there must be something wrong with the box/our configuration somewhere, but where to look? If anyone can help - even if it's to tell me I'm an idiot for missing one or more incredibly basic things somehow - I would be very grateful.

Many thanks.

Regards,

M.

--
Martin Goodson.

"Have you thought up some clever plan, Doctor?"
"Yes, Jamie, I believe I have."
"What're you going to do?"
"Bung a rock at it."
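The four near-identical rc checks in the script could be folded into one helper. What follows is only a sketch, not the script as it is actually run: run_step is a made-up name, and unlike the original it returns the failure to the caller instead of prompting to continue, so the steps can be chained with && and stop at the first error.

```shell
# run_step LABEL COMMAND [ARGS...]
# Hypothetical helper, not part of the original script.
# Runs COMMAND; on failure prints LABEL and the exit code to stderr
# and returns that exit code to the caller.
run_step() {
    step_label=$1
    shift
    "$@"
    step_rc=$?
    if [ "$step_rc" -ne 0 ]; then
        echo "#### ERROR! $step_label returned non-zero code $step_rc" >&2
    fi
    return "$step_rc"
}

# Intended use, with the variables from the build script
# (the remaining configure flags are left out here for brevity):
#   cd "${builddir}" &&
#   run_step "configure" "${sourcedir}/configure" --prefix="${targetdir}" --with-pgport="${port}" &&
#   run_step "make world" make world &&
#   run_step "make check" make check &&
#   run_step "install-world" make install-world
```

Chaining with && means a failed configure never reaches make, which is the same effect as pressing Ctrl+C at the prompt but works unattended.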
Re: Help? Unexpected PostgreSQL compilation failure using generic compile script
From
Tom Lane
Date:
Martin Goodson <kaemaril@googlemail.com> writes:
> So I simply removed the --with-gssapi, and tried again.
> AND IT FAILED AGAIN.
> This time it failed claiming it couldn't find the ldap library. Which is
> most -definitely- present.

Hard to debug this sort of thing remotely when you don't supply the exact
error messages. But ... do you have openldap-devel installed, or just
the base openldap package?

> The compile step and make world steps work perfectly if the script is
> run under root.

That is odd. Permissions problems on the libraries, maybe?

regards, tom lane
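For the exact error messages: configure writes the full compiler and linker output of every test to config.log, in the directory where it was run. A small sketch for pulling out the relevant part follows. The helper name show_failed_check and the 20-line window are arbitrary choices, and it assumes a grep that supports -A (GNU grep on RHEL does).

```shell
# show_failed_check LOGFILE PATTERN
# Hypothetical helper: print each "checking for ..." line in config.log
# that mentions PATTERN, plus the 20 lines after it, which is where the
# failing compile/link command and its error output are recorded.
show_failed_check() {
    grep -A 20 "checking for.*$2" "$1"
}

# Example, using the build directory from the script and the symbol
# named in the reported error:
#   show_failed_check "${builddir}/config.log" gss_init_sec_context
```

On the package question, rpm -q openldap openldap-devel reports whether both packages are installed.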
Re: Help? Unexpected PostgreSQL compilation failure using generic compile script
From
Martin Goodson
Date:
On 12/03/2023 21:52, Tom Lane wrote:
> Martin Goodson <kaemaril@googlemail.com> writes:
>> So I simply removed the --with-gssapi, and tried again.
>> AND IT FAILED AGAIN.
>> This time it failed claiming it couldn't find the ldap library. Which is
>> most -definitely- present.
> Hard to debug this sort of thing remotely when you don't supply the exact
> error messages. But ... do you have openldap-devel installed, or just
> the base openldap package?
>
>> The compile step and make world steps work perfectly if the script is
>> run under root.
> That is odd. Permissions problems on the libraries, maybe?
>
> regards, tom lane

Hi, Tom.

Sorry, I can get the complete log tomorrow - it's on my work PC, not my home. I clearly made insufficient notes, for which I apologize :(

Not sure about permissions on libraries. We just open up a session under root and execute yum install <blah blah>, and that has always worked in the past. Not sure what I'd need to check? I can perhaps ask our friendly neighbourhood UNIX sysadmin to check those?

We did install openldap and openldap-devel, however:

yum install pam-devel
yum install libxml2-devel
yum install libxslt-devel
yum install openldap
yum install openldap-devel
yum install uuid-devel
yum install readline-devel
yum install openssl-devel
yum install libicu-devel
yum install uuid-devel
yum install gcc
yum install make

Regards,

M.

--
Martin Goodson.

"Have you thought up some clever plan, Doctor?"
"Yes, Jamie, I believe I have."
"What're you going to do?"
"Bung a rock at it."
Re: Help? Unexpected PostgreSQL compilation failure using generic compile script
From
Adrian Klaver
Date:
On 3/12/23 14:43, Martin Goodson wrote:
> Hello.
>
> For reasons I won't bore you with, we compile PostgreSQL from source
> rather than use the standard packages for some of our databases.
>

> So a fairly basic script that has been used for years suddenly fails on
> a fairly generic RHEL 7.9 server.
>
> I am no compilation expert. Obviously. Have I missed something basic? As
> I said, we've not seen problems like this before. Could there be some
> sort of issue on the box's configuration? If it works for root but not
> our usual build user could there be a user config with our account? Can
> anyone offer any insight on what I need to check? At the moment it all
> seems somewhat ... mystifying.

SELinux issues?

Have you looked at the system logs to see if they shed any light?

>
> I am assuming there must be something wrong with the box/our
> configuration somewhere, but where to look? If anyone can help - even if
> it's to tell me I'm an idiot for missing one or more incredibly basic
> things somehow - I would be very grateful.
>
> Many thanks.
>
> Regards,
>
> M.
>

--
Adrian Klaver
adrian.klaver@aklaver.com
Re: Help? Unexpected PostgreSQL compilation failure using generic compile script
From
Martin Goodson
Date:
On 13/03/2023 00:02, Adrian Klaver wrote:
> On 3/12/23 14:43, Martin Goodson wrote:
>> Hello.
>>
>> For reasons I won't bore you with, we compile PostgreSQL from source
>> rather than use the standard packages for some of our databases.
>>
>
>> So a fairly basic script that has been used for years suddenly fails
>> on a fairly generic RHEL 7.9 server.
>>
>> I am no compilation expert. Obviously. Have I missed something basic?
>> As I said, we've not seen problems like this before. Could there be
>> some sort of issue on the box's configuration? If it works for root
>> but not our usual build user could there be a user config with our
>> account? Can anyone offer any insight on what I need to check? At the
>> moment it all seems somewhat ... mystifying.
>
> SELinux issues?
>
> Have you looked at the system logs to see if they shed any light?
>

Apologies for the delay in replying, it's been a busy week.

After a spot more testing today I found the problem, and an embarrassing one it was too. Can't believe I didn't spot it earlier.

One of my colleagues had earlier used our 'generic build account' to install an older version of PostgreSQL on the same server, and had set the account's PATH and LD_LIBRARY_PATH to point to that version in the .bash_profile script.

That's something we don't normally do - our 'build account' is deliberately left as a clean slate, as it were.

Bit bizarre that it was somehow only causing problems with the configure checks for the gssapi and ldap libraries, but there you go.

Feel a bit of a twit now, but definitely something I'll be explicitly checking beforehand on future compiles :(

--
Martin Goodson.

"Have you thought up some clever plan, Doctor?"
"Yes, Jamie, I believe I have."
"What're you going to do?"
"Bung a rock at it."