Thread: initdb fails on ultra2 sparc64, freebsd 5.4
On a Sun ultra2 sparc64 initdb hangs here:

$ initdb -D /usr/local/pgsql/data
The files belonging to this database system will be owned by user "pgsql".
This user must also own the server process.
The database cluster will be initialized with locale C.
creating directory /usr/local/pgsql/data ... ok
creating directory /usr/local/pgsql/data/global ... ok
creating directory /usr/local/pgsql/data/pg_xlog ... ok
creating directory /usr/local/pgsql/data/pg_xlog/archive_status ... ok
creating directory /usr/local/pgsql/data/pg_clog ... ok
creating directory /usr/local/pgsql/data/pg_subtrans ... ok
creating directory /usr/local/pgsql/data/base ... ok
creating directory /usr/local/pgsql/data/base/1 ... ok
creating directory /usr/local/pgsql/data/pg_tblspc ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 1000
creating configuration files ... ok
creating template1 database in /usr/local/pgsql/data/base/1 ...

After killing initdb with ^C postgres is still running:

  PID USERNAME PRI NICE   SIZE    RES STATE    TIME   WCPU    CPU COMMAND
23061 pgsql     -8    0 30264K  8984K piperd   0:10  0.00%  0.00% postgres

$ uname -a
FreeBSD ultra2 5.4-RELEASE FreeBSD 5.4-RELEASE #0: Sun May 8 22:21:34 UTC 2005 root@binkley.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC sparc64

$ gcc -v
Using built-in specs.
Configured with: FreeBSD/sparc64 system compiler
Thread model: posix
gcc version 3.4.2 [FreeBSD] 20040728

PostgreSQL config.status 8.0.3
configured by ./configure, generated by GNU Autoconf 2.53,
with options \"'--with-libraries=/usr/local/lib' '--with-includes=/usr/local/include' '--with-docdir=/usr/local/share/doc/postgresql' '--with-openssl' '--disable-nls' '--enable-thread-safety' '--prefix=/usr/local' 'sparc64-portbld-freebsd5.4' 'LDFLAGS= -rpath=/usr/lib:/usr/local/lib -L/usr/local/lib' 'CFLAGS=-O -pipe ' 'CPPFLAGS=-I/usr/local/include' 'host_alias=sparc64-portbld-freebsd5.4' 'build_alias=sparc64-portbld-freebsd5.4' 'target_alias=sparc64-portbld-freebsd5.4' 'CC=cc'\"

David
--
http://howto.mainstreamlinux.com
David Walker <david@cosmicfires.com> writes:
> On a Sun ultra2 sparc64 initdb hangs here:

Where's "here"? Build with --enable-debug, then attach to the stuck
process with gdb and get a stack trace.

regards, tom lane
On a Sun ultra2 sparc64 initdb hangs.

cpu0: Sun Microsystems UltraSparc-I Processor (200.00 MHz CPU)

$ initdb -D /usr/local/pgsql/data
The files belonging to this database system will be owned by user "pgsql".
This user must also own the server process.
The database cluster will be initialized with locale C.
creating directory /usr/local/pgsql/data ... ok
creating directory /usr/local/pgsql/data/global ... ok
creating directory /usr/local/pgsql/data/pg_xlog ... ok
creating directory /usr/local/pgsql/data/pg_xlog/archive_status ... ok
creating directory /usr/local/pgsql/data/pg_clog ... ok
creating directory /usr/local/pgsql/data/pg_subtrans ... ok
creating directory /usr/local/pgsql/data/base ... ok
creating directory /usr/local/pgsql/data/base/1 ... ok
creating directory /usr/local/pgsql/data/pg_tblspc ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 1000
creating configuration files ... ok
creating template1 database in /usr/local/pgsql/data/base/1 ...

initdb hangs there, backend status:

  PID USERNAME PRI NICE   SIZE    RES STATE    TIME   WCPU    CPU COMMAND
43332 pgsql     -8    0 30264K  8888K piperd   0:11  0.00%  0.00% postgres

(gdb) where
#0 0x0000000040f7f0a8 in read () from /lib/libc.so.5
#1 0x0000000040ffdd54 in __sread () from /lib/libc.so.5
#2 0x0000000040ffddd8 in _sread () from /lib/libc.so.5
#3 0x0000000040fe8954 in __srefill () from /lib/libc.so.5
#4 0x0000000040fe2e40 in fread () from /lib/libc.so.5
#5 0x000000000016a0d8 in Int_yy_get_next_buffer () at lex.Int_yy.c:1217
#6 0x0000000000169c9c in Int_yylex () at lex.Int_yy.c:1052
#7 0x0000000000168674 in Int_yyparse () at y.tab.c:1090
#8 0x000000000016b2b8 in BootstrapMain (argc=4, argv=0x7fdffffeb58) at bootstrap.c:455
#9 0x00000000001fc774 in main (argc=5, argv=0x7fdffffeb50) at main.c:296

(gdb) disassemble
Dump of assembler code for function read:
0x0000000040f7f0a0 <read+0>: mov 3, %g1 ! 0x3
0x0000000040f7f0a4 <read+4>: ta %xcc, -4031
0x0000000040f7f0a8 <read+8>: bcc,a %xcc, 0x40f7f0bc <read+28>
0x0000000040f7f0ac <read+12>: nop
0x0000000040f7f0b0 <read+16>: mov %o7, %g1
0x0000000040f7f0b4 <read+20>: b 0x4110f600 <__sglue+9496>
0x0000000040f7f0b8 <read+24>: mov %g1, %o7
0x0000000040f7f0bc <read+28>: retl
0x0000000040f7f0c0 <read+32>: nop

Could this be the bug described in this URL?
http://www.netbsd.org/cgi-bin/query-pr-single.pl?number=21750
On 2005-05-13, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> David Walker <david@cosmicfires.com> writes:
>> On a Sun ultra2 sparc64 initdb hangs here:
>
> Where's "here"? Build with --enable-debug, then attach to the stuck
> process with gdb and get a stack trace.

I've just done some analysis on this on David's machine - the bottom
line is that --enable-thread-safety is broken on this platform, but I've
not yet established for certain whether this is a bug in FreeBSD or
incorrect compiler options in pg's configure (will be looking into that
in due course).

The reason for the hang is that the sending end of the pipe from initdb
to postgres is not being closed by popen(), so postgres never sees EOF
on it. In this context I am suspicious of the fact that while libpq is
being built with threading, the apps which link against it do not appear
to be.

--
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services
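The mechanism described in the message above, a pipe reader seeing EOF only once every write end has been closed, can be illustrated with a small sketch. This is not code from PostgreSQL or from FreeBSD's libc: the function name probe_pipe, the enum, and the dup()-ed "stray" descriptor are assumptions made up for the demonstration, and the read end is set non-blocking only so that the sketch reports "would block" where a blocking reader would hang.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* What the reader observes after the writer has "finished". */
enum probe { PROBE_EOF, PROBE_WOULD_BLOCK, PROBE_ERROR };

/*
 * Hypothetical helper: create a pipe, close the writer's descriptor,
 * and report what a read on the other end returns.  If
 * leave_stray_writer is non-zero, a duplicate of the write end is kept
 * open, simulating a sending end that never really gets closed.
 */
enum probe probe_pipe(int leave_stray_writer)
{
    int fds[2];
    char buf[8];
    enum probe result = PROBE_ERROR;

    if (pipe(fds) != 0)
        return PROBE_ERROR;

    int stray = leave_stray_writer ? dup(fds[1]) : -1;
    if (leave_stray_writer && stray < 0) {
        close(fds[0]);
        close(fds[1]);
        return PROBE_ERROR;
    }

    /* The writer closes the descriptor it knows about. */
    close(fds[1]);

    /* Non-blocking, so the sketch returns instead of hanging. */
    int flags = fcntl(fds[0], F_GETFL);
    if (flags != -1 && fcntl(fds[0], F_SETFL, flags | O_NONBLOCK) == 0) {
        ssize_t n = read(fds[0], buf, sizeof buf);
        if (n == 0)
            result = PROBE_EOF;          /* no writers left */
        else if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
            result = PROBE_WOULD_BLOCK;  /* some write end is still open */
    }

    close(fds[0]);
    if (stray >= 0)
        close(stray);
    return result;
}
```

Calling probe_pipe(0) should report PROBE_EOF and probe_pipe(1) should report PROBE_WOULD_BLOCK. With a blocking descriptor, the second case is where a reader would sit in read() indefinitely, which matches the "piperd" state and the backtrace shown earlier in the thread.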
Andrew - Supernews <andrew+nonews@supernews.com> writes:
> The reason for the hang is that the sending end of the pipe from initdb
> to postgres is not being closed by popen(), so postgres never sees EOF
> on it. In this context I am suspicious of the fact that while libpq is
> being built with threading, the apps which link against it do not appear
> to be.

initdb does not use libpq ... it might link to it, because of sloppy
LIBS list management, but it doesn't ever call it. There is, by
definition, no running postmaster available for libpq to contact.

regards, tom lane
On 2005-05-21, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Andrew - Supernews <andrew+nonews@supernews.com> writes:
>> The reason for the hang is that the sending end of the pipe from initdb
>> to postgres is not being closed by popen(), so postgres never sees EOF
>> on it. In this context I am suspicious of the fact that while libpq is
>> being built with threading, the apps which link against it do not appear
>> to be.
>
> initdb does not use libpq ... it might link to it, because of sloppy
> LIBS list management, but it doesn't ever call it. There is, by
> definition, no running postmaster available for libpq to contact.

Linking to it is enough to bring in libc_r, and pick up libc_r's versions
of at least some system calls. The ktrace results show this.

--
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services
Andrew - Supernews <andrew+nonews@supernews.com> writes:
> On 2005-05-21, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> initdb does not use libpq ... it might link to it,

> Linking to it is enough to bring in libc_r, and pick up libc_r's versions
> of at least some system calls. The ktrace results show this.

[ shrug... ] If libc_r behaves differently than libc in an app that
doesn't create multiple threads, I think that's an issue to be taking up
with the libc developers not us.

regards, tom lane
On 2005-05-21, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Andrew - Supernews <andrew+nonews@supernews.com> writes:
>> On 2005-05-21, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> initdb does not use libpq ... it might link to it,
>
>> Linking to it is enough to bring in libc_r, and pick up libc_r's versions
>> of at least some system calls. The ktrace results show this.
>
> [ shrug... ] If libc_r behaves differently than libc in an app that
> doesn't create multiple threads, I think that's an issue to be taking up
> with the libc developers not us.

Further investigation (not yet complete) suggests that this is a link
order dependency problem. It kicks in because libpq is being linked
explicitly with -lc_r (which should not be necessary or desirable),
causing libc and libc_r to be linked in the wrong order if the
executable is _not_ linked with -pthread.

--
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services