Thread: Make check fails on 8.3.7
I'm trying to install 8.3.7, but can't get past make check.
CentOS release 4.7 (Final), with an existing install of 8.3.1 running as
a warm standby

$ eval ./configure `pg_config --configure`
$ gmake
== All of PostgreSQL successfully made. Ready to install.
$ gmake check

Everything looks ok until it actually starts the tests:

./pg_regress --temp-install=./tmp_check --top-builddir=../../.. --srcdir=/home/postgres/postgresql-8.3.7/src/test/regress --temp-port=55432 --schedule=./parallel_schedule --multibyte=SQL_ASCII --load-language=plpgsql
============== creating temporary installation ==============
============== initializing database system ==============
============== starting postmaster ==============
running on port 55432 with pid 16296
============== creating database "regression" ==============
CREATE DATABASE
ALTER DATABASE
============== installing plpgsql ==============
CREATE LANGUAGE
============== running regression test queries ==============
parallel group (17 tests): boolean char name varchar text int2 int4 oid float4 uuid float8 txid money bit int8 enum numeric
     boolean ... FAILED
     char ... FAILED
     name ... ok
     varchar ... FAILED
     text ... FAILED
     int2 ... FAILED
     int4 ... FAILED
     int8 ... FAILED
     oid ... FAILED
     float4 ... FAILED
     float8 ... FAILED
     bit ... FAILED
     numeric ... FAILED
     txid ... FAILED
     uuid ... FAILED
     enum ... FAILED
     money ... ok
test strings ... FAILED
test numerology ... ok
parallel group (18 tests): point lseg box path polygon circle time date tinterval timetz comments reltime abstime tstypes inet interval timestamptz timestamp
     point ... FAILED
     lseg ... FAILED
     box ... FAILED
     path ... FAILED
     polygon ... FAILED
     circle ... FAILED
     date ... FAILED
     time ... FAILED
     timetz ... FAILED
     timestamp ... FAILED

and so on ...

83 of 114 tests failed.
Samples from the regression.diffs:

*** ./expected/boolean.out Fri Jun 1 18:40:19 2007
--- ./results/boolean.out Thu Jul 30 19:16:33 2009
***************
*** 75,83 ****
  (1 row)

  SELECT ' tru e '::text::boolean AS invalid; -- error
- ERROR: invalid input syntax for type boolean: " tru e "
  SELECT ''::text::boolean AS invalid; -- error
- ERROR: invalid input syntax for type boolean: ""
  CREATE TABLE BOOLTBL1 (f1 bool);
  INSERT INTO BOOLTBL1 (f1) VALUES (bool 't');
  INSERT INTO BOOLTBL1 (f1) VALUES (bool 'True');
--- 75,81 ----
***************
*** 136,142 ****
  -- For pre-v6.3 this evaluated to false - thomas 1997-10-23
  INSERT INTO BOOLTBL2 (f1) VALUES (bool 'XXX');
- ERROR: invalid input syntax for type boolean: "XXX"
  -- BOOLTBL2 should be full of false's at this point
  SELECT '' AS f_4, BOOLTBL2.* FROM BOOLTBL2;
  f_4 | f1
--- 134,139 ----

======================================================================

*** ./expected/tablespace.out Thu Jul 30 19:16:15 2009
--- ./results/tablespace.out Thu Jul 30 19:16:58 2009
***************
*** 48,54 ****
  ALTER INDEX testschema.anindex SET TABLESPACE testspace;
  INSERT INTO testschema.atable VALUES(3); -- ok
  INSERT INTO testschema.atable VALUES(1); -- fail (checks index)
- ERROR: duplicate key value violates unique constraint "anindex"
  SELECT COUNT(*) FROM testschema.atable; -- checks heap
  count
  -------
--- 48,53 ----
***************
*** 57,69 ****

  -- Will fail with bad path
  CREATE TABLESPACE badspace LOCATION '/no/such/location';
- ERROR: could not set permissions on directory "/no/such/location": No such file or directory
  -- No such tablespace
  CREATE TABLE bar (i int) TABLESPACE nosuchspace;
- ERROR: tablespace "nosuchspace" does not exist
  -- Fail, not empty
  DROP TABLESPACE testspace;
- ERROR: tablespace "testspace" is not empty
  DROP SCHEMA testschema CASCADE;
  NOTICE: drop cascades to table testschema.atable
  NOTICE: drop cascades to table testschema.asexecute
--- 56,65 ----

It looks in every case like the ERROR (and also HINT) lines are
causing the failures, but I'm not sure what setting I messed up to cause
that. What should I be looking for?

Thanks.

--
Christine Desmuke
Kansas Historical Society
cdesmuke@kshs.org
Christine Desmuke <cdesmuke@kshs.org> writes:
> I'm trying to install 8.3.7, but can't get past make check.
> CentOS release 4.7 (Final), with an existing install of 8.3.1 running as
> a warm standby
> ...
> It looks in every case like the ERROR (and also HINT) lines are
> causing the failures, but I'm not sure what setting I messed up to cause
> that. What should I be looking for?

Years ago I saw roughly similar symptoms when SELinux decided postgres
shouldn't be allowed to write to /dev/tty. I'm not sure how that would
relate to your situation, but it'd be worth checking for avc messages in
the kernel log ...

regards, tom lane
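A minimal sketch of the check suggested above, assuming the kernel log is available as a syslog-style text file. The helper name `find_avc` and the log path in the usage comment are illustrative assumptions, not something named in the thread; where the audit daemon is running, denials may be written to a separate audit log instead.

```shell
# Hypothetical helper: print SELinux AVC denial lines found in a log file.
# Assumes a plain-text syslog-style file; the path is supplied by the caller.
find_avc() {
    logfile="$1"
    # grep exits non-zero when nothing matches; report that case explicitly.
    grep -i 'avc:.*denied' "$logfile" || echo "no avc denials in $logfile"
}

# Possible use on the affected machine (path is an assumption):
#   find_avc /var/log/messages
# The kernel ring buffer can be searched the same way:
#   dmesg | grep -i avc
```

An empty result does not prove that SELinux is uninvolved, but a denial naming postgres or psql would point straight at the policy.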
Tom Lane wrote:
> Christine Desmuke <cdesmuke@kshs.org> writes:
>> I'm trying to install 8.3.7, but can't get past make check.
>> CentOS release 4.7 (Final), with an existing install of 8.3.1 running as
>> a warm standby
>> ...
>> It looks in every case like the ERROR (and also HINT) lines are
>> causing the failures, but I'm not sure what setting I messed up to cause
>> that. What should I be looking for?
>
> Years ago I saw roughly similar symptoms when SELinux decided postgres
> shouldn't be allowed to write to /dev/tty. I'm not sure how that would
> relate to your situation, but it'd be worth checking for avc messages in
> the kernel log ...
>
> regards, tom lane

Thanks for the suggestion. It is not SELinux (SELinux status: disabled),
but _something_ is preventing postgres (both the existing install and
the one I'm trying to check) from writing to /dev/tty. There are no
related messages in the kernel log or postgres log, however.

The permissions on /dev/tty appear to be ok:
[root@zu log]# ls -l /dev/tty
crw-rw-rw- 1 root root 5, 0 Jul 31 12:23 /dev/tty

Trying to invoke psql ought to write an error (because the warm standby
is perpetually in startup mode), but I'm just returned to the prompt:
[cdesmuke@zu ~]# psql -h localhost
psql: [cdesmuke@zu ~]#

Other programs can write to /dev/tty without problems:
[cdesmuke@zu ~]# mysql
ERROR 1045 (28000): Access denied for user 'cdesmuke'@'localhost' (using password: NO)

[cdesmuke@zu ~]$ pg_config
BINDIR = /usr/local/pgsql/bin
DOCDIR = /usr/local/pgsql/doc
INCLUDEDIR = /usr/local/pgsql/include
PKGINCLUDEDIR = /usr/local/pgsql/include
INCLUDEDIR-SERVER = /usr/local/pgsql/include/server
LIBDIR = /usr/local/pgsql/lib
PKGLIBDIR = /usr/local/pgsql/lib
LOCALEDIR =
MANDIR = /usr/local/pgsql/man
SHAREDIR = /usr/local/pgsql/share
SYSCONFDIR = /usr/local/pgsql/etc
PGXS = /usr/local/pgsql/lib/pgxs/src/makefiles/pgxs.mk
CONFIGURE =
CC = gcc
CPPFLAGS = -D_GNU_SOURCE
CFLAGS = -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Winline -Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing -fwrapv
CFLAGS_SL = -fpic
LDFLAGS = -Wl,-rpath,'/usr/local/pgsql/lib'
LDFLAGS_SL =
LIBS = -lpgport -lz -lreadline -ltermcap -lcrypt -ldl -lm
VERSION = PostgreSQL 8.3.1

The same thing happens whether trying as root, as postgres, or under my
own unprivileged username.

Trying to connect to this instance from another machine does generate
the expected error:
[cdesmuke@alberta ~]$ psql -h zu
psql: FATAL: the database system is starting up

Some other PG programs seem to work partway, but not fully:
[cdesmuke@zu ~]$ pg_dump -U gar
pg_dump: [archiver (db)] connection to database "gar" failed: [cdesmuke@zu ~]$

[i.e., the message "fe_sendauth: no password supplied" that I expected
to see did not appear, and the command prompt appeared immediately after
the "failed:" rather than on a new line.]

I'm out of ideas on what to check next. Thank you for any and all
suggestions.

--
Christine Desmuke
Kansas State Historical Society
cdesmuke@kshs.org
Christine Desmuke <cdesmuke@kshs.org> writes:
> Tom Lane wrote:
>> Years ago I saw roughly similar symptoms when SELinux decided postgres
>> shouldn't be allowed to write to /dev/tty.

> Thanks for the suggestion. It is not SELinux (SELinux status: disabled),
> but _something_ is preventing postgres (both the existing install and
> the one I'm trying to check) from writing to /dev/tty.

You *sure* selinux is disabled? Because that sounds exactly like a
long-ago selinux policy bug. I'd have thought everybody's machine
had the fix by now, but if this machine isn't too up2date, maybe not...

regards, tom lane
Tom Lane wrote:
> Christine Desmuke <cdesmuke@kshs.org> writes:
>> Tom Lane wrote:
>>> Years ago I saw roughly similar symptoms when SELinux decided postgres
>>> shouldn't be allowed to write to /dev/tty.
>
>> Thanks for the suggestion. It is not SELinux (SELinux status: disabled), but _something_ is preventing postgres (both the existing install and the one I'm trying to check) from writing to /dev/tty.
>
> You *sure* selinux is disabled? Because that sounds exactly like a
> long-ago selinux policy bug. I'd have thought everybody's machine
> had the fix by now, but if this machine isn't too up2date, maybe not...
>
> regards, tom lane

[root@zu ~]# sestatus
SELinux status: disabled
[root@zu ~]# more /etc/sysconfig/selinux
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
# enforcing - SELinux security policy is enforced.
# permissive - SELinux prints warnings instead of enforcing.
# disabled - SELinux is fully disabled.
SELINUX=disabled
# SELINUXTYPE= type of policy in use. Possible values are:
# targeted - Only targeted network daemons are protected.
# strict - Full SELinux protection.
SELINUXTYPE=targeted
[root@zu ~]# uname -r
2.6.9-78.0.22.ELsmp

Yes, the symptoms look like an selinux problem, except the machine
claims selinux is disabled, and there is nothing in the logs from avc
(or anything else notable). I've updated to the latest version of
CentOS 4, which is supposed to mirror RHEL 4. I'm probably missing
something obvious, but I dunno what.

Thank you, as always, for taking the time to help.

--
Christine Desmuke
Kansas State Historical Society
cdesmuke@kshs.org
On 31 Jul 2009, at 3:25, Christine Desmuke wrote:

> Samples from the regression.diffs:
>
> *** ./expected/boolean.out Fri Jun 1 18:40:19 2007
> --- ./results/boolean.out Thu Jul 30 19:16:33 2009
> ***************
> *** 75,83 ****
> (1 row)
>
> SELECT ' tru e '::text::boolean AS invalid; -- error
> - ERROR: invalid input syntax for type boolean: " tru e "
> SELECT ''::text::boolean AS invalid; -- error
> - ERROR: invalid input syntax for type boolean: ""
> CREATE TABLE BOOLTBL1 (f1 bool);

I'm not familiar with the regression test stuff, but I suppose the
output of a shell command gets captured in a file and those are then
diffed with the expected output?

If so, isn't it just the output of stderr getting lost here? What shell
are you using?

Alban Hertroys

--
If you can't see the forest for the trees,
cut the trees and you'll see there is no forest.
Alban Hertroys wrote:
> On 31 Jul 2009, at 3:25, Christine Desmuke wrote:
>
>> Samples from the regression.diffs:
>>
>> *** ./expected/boolean.out Fri Jun 1 18:40:19 2007
>> --- ./results/boolean.out Thu Jul 30 19:16:33 2009
>> ***************
>> *** 75,83 ****
>> (1 row)
>>
>> SELECT ' tru e '::text::boolean AS invalid; -- error
>> - ERROR: invalid input syntax for type boolean: " tru e "
>> SELECT ''::text::boolean AS invalid; -- error
>> - ERROR: invalid input syntax for type boolean: ""
>> CREATE TABLE BOOLTBL1 (f1 bool);
>
>
> I'm not familiar with the regression test stuff, but I suppose the
> output of a shell command gets captured in a file and those are then
> diffed with the expected output?
>
> If so, isn't it just the output of stderr getting lost here? What shell
> are you using?
>
> Alban Hertroys
>

Yes, it looks like stderr is lost. I'm running bash, and there is
nothing odd in .bash_profile

[postgres@zu ~]$ echo $SHELL
/bin/bash
[postgres@zu ~]$ more .bash_profile
# .bash_profile

# Get the aliases and functions
if [ -f ~/.bashrc ]; then
    . ~/.bashrc
fi

# User specific environment and startup programs
PATH=$PATH:$HOME/bin
export PATH

Any ideas?

--
Christine Desmuke
Kansas State Historical Society
cdesmuke@kshs.org
On 7 Aug 2009, at 4:02, Christine Desmuke wrote:

>> If so, isn't it just the output of stderr getting lost here? What
>> shell are you using?
>
> Yes, it looks like stderr is lost. I'm running bash, and there is
> nothing odd in .bash_profile

> Any ideas?

I have to admit I'm running out, this seems to be a rather odd problem.
Maybe someone who knows CentOS (or Linux in general) has some ideas
what's going on here.
Let's see if we can find any trace of where things are going wrong...

Is there anything about why the regression tests failed in the system
logs?

Were you redirecting the script output somewhere?

Does your stderr work? If you purposely cause an error, do you get an
error message? Can you write it to a file?

What are you running your shell from, the console or some kind of X- or
otherwise virtual console (screen for example)? If the latter, can you
try the regression tests from the console?

If that still doesn't show anything it's probably a good idea to run the
regression tests through trace, but that's probably going to create a
LOT of output to wade through. It should point you to the culprit though.

Alban Hertroys

--
If you can't see the forest for the trees,
cut the trees and you'll see there is no forest.
Alban Hertroys wrote:
> On 7 Aug 2009, at 4:02, Christine Desmuke wrote:
>
>>> If so, isn't it just the output of stderr getting lost here? What
>>> shell are you using?
>>
>> Yes, it looks like stderr is lost. I'm running bash, and there is
>> nothing odd in .bash_profile
>
>> Any ideas?
>
>
> I have to admit I'm running out, this seems to be a rather odd problem.
> Maybe someone who knows CentOS (or Linux in general) has some ideas
> what's going on here.
> Let's see if we can find any trace of where things are going wrong...
>
> Is there anything about why the regression tests failed in the system logs?

No, nothing.

> Were you redirecting the script output somewhere?

No, I was running make check, which does redirect the output to a series
of files, and then diffs those against the expected output.

> Does your stderr work? If you purposely cause an error, do you get an
> error message? Can you write it to a file?

[postgres@zu ~]$ less bangu
bangu: No such file or directory
[postgres@zu ~]$ less bangu 2>zztmp
[postgres@zu ~]$ cat zztmp
bangu: No such file or directory
[postgres@zu ~]$

but

[postgres@zu ~]$ /usr/local/pgsql/bin/psql -U nobody
psql: [postgres@zu ~]$

[That is, the expected error from psql about the nonexistent user is
swallowed.]

> What are you running your shell from, the console or some kind of X- or
> otherwise virtual console (screen for example)? If the latter, can you
> try the regression tests from the console?

Normally I run from an ssh session, but I tried this from the console as
well, with the same results.

> If that still doesn't show anything it's probably a good idea to run the
> regression tests through trace, but that's probably going to create a
> LOT of output to wade through. It should point you to the culprit though.

I'm going to try this over the weekend and see what I get.

>
> Alban Hertroys
>

--
Christine Desmuke
Kansas State Historical Society
cdesmuke@kshs.org
Christine Desmuke <cdesmuke@kshs.org> writes:
> [postgres@zu ~]$ /usr/local/pgsql/bin/psql -U nobody
> psql: [postgres@zu ~]$

> [That is, the expected error from psql about the nonexistent user is
> swallowed.]

Try the above under strace, and see what it shows happening to the
writes to stderr (they should be very close to the end of the strace
output).

regards, tom lane
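One way to follow that suggestion, sketched under the assumption that strace is installed and that its log was saved with `-o`. The helper name `stderr_writes` and the log file name `psql.strace` are illustrative; the psql command is the one from the message above.

```shell
# Hypothetical helper: reduce an strace log to the write() calls aimed at
# file descriptor 2 (stderr). The log is assumed to have been produced by
# something like:
#   strace -f -e trace=write -o psql.strace /usr/local/pgsql/bin/psql -U nobody
stderr_writes() {
    # In a basic regular expression the parenthesis is a literal character.
    grep 'write(2, ' "$1"
}

# Possible use:
#   stderr_writes psql.strace
```

Each matching line shows the string handed to write() and the byte count returned, which makes it possible to tell a failed write apart from a write whose message text was already empty.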
Christine Desmuke <cdesmuke@kshs.org> writes:
> [postgres@zu ~]$ /usr/local/pgsql/bin/psql -U nobody
> psql: [postgres@zu ~]$

Wait a minute ... I just looked closer at your sample there. That shows
that psql *is* able to output to stderr, because it was able to print
its own name. So we've been barking up the wrong tree with the
permissions theories. What is actually happening, evidently, is that
PQerrorMessage() is returning an empty string --- or perhaps a NULL
pointer --- when it should not.

So this leads to a different conclusion, which is that there's something
broken about your libpq. I'd look at whether you're linking to the
version you think you are, and try rebuilding libpq.

(I seem to recall seeing a similar report once before, but I can't find
it in the archives right now.)

regards, tom lane
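A minimal sketch of checking which libpq a binary actually resolves to at run time, assuming a Linux system with `ldd` available. The helper name `which_libpq` is an illustrative assumption; the psql path in the usage comment is the one shown earlier in the thread.

```shell
# Hypothetical helper: show the libpq shared object the dynamic linker
# would load for a given binary, or say so if none is listed.
which_libpq() {
    ldd "$1" | grep libpq || echo "no libpq dependency found for $1"
}

# Possible use:
#   which_libpq /usr/local/pgsql/bin/psql
# Running it against the psql built in the new source tree as well makes
# it easy to compare the two resolved library paths.
```

Because `ldd` applies the same search rules as the run-time linker, including any rpath recorded in the binary, the reported path is the library that would really be used, which may not be the freshly built one.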
I wrote:
> (I seem to recall seeing a similar report once before, but I can't find
> it in the archives right now.)

I found what I think is the bug I was remembering:
http://archives.postgresql.org/pgsql-bugs/2007-05/msg00074.php
but unfortunately it's not much help since we never did resolve
what was happening.

Can you reproduce the problem in a debug-enabled build? It would
be worth stepping through the code to see where it's losing track
of the error message.

regards, tom lane
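A rough sketch of how such a session might be set up, assuming gdb is installed and the tree was configured with PostgreSQL's `--enable-debug` option. The command file name `libpq-debug.gdb` is an illustrative assumption, and `set breakpoint pending on` may not be available in older gdb releases.

```shell
# Write a small gdb command file that stops where psql fetches the
# connection error text and then shows what the function returned.
cat > libpq-debug.gdb <<'EOF'
set breakpoint pending on
break PQerrorMessage
run
bt
finish
EOF

# Possible use, after rebuilding with: ./configure --enable-debug && gmake
#   gdb -x libpq-debug.gdb --args /usr/local/pgsql/bin/psql -U nobody
```

The `finish` command prints the value returned by PQerrorMessage(), so an empty string or NULL pointer shows up directly; from there the connection's error buffer can be inspected by stepping through libpq.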