Re: Server hangs on multiple connections - Mailing list pgsql-bugs

From David Christian
Subject Re: Server hangs on multiple connections
Date
Msg-id A302AA78-CC1B-11D6-9296-0003933E390A@comtechmobile.com
In response to Re: Server hangs on multiple connections  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Server hangs on multiple connections
List pgsql-bugs
On Thursday, Sep 19, 2002, at 17:10 US/Eastern, Tom Lane wrote:

> Could you build with --enable-debug and --enable-cassert (if you didn't
> already), repeat the 'make check' scenario, and then attach to a few of
> the stuck backend processes with gdb and get stack traces from them?
> That would give us a little more info to work with.

Happy to.  Interestingly, when I build with --enable-debug and
--enable-cassert, the server doesn't lock up during 'make check';
instead it (very quickly) fails nearly all of the tests (78 of 79) and
exits.  I tried several times and got the same result each time.

$ ./configure --enable-debug --enable-cassert
$ make
$ make check

Here is the tail end of the 'make check' output in that case:

/bin/sh ./pg_regress --temp-install --top-builddir=../../..
--schedule=./parallel_schedule --multibyte=
============== creating temporary installation        ==============
============== initializing database system           ==============
============== starting postmaster                    ==============
running on port 65432 with pid 7893
============== creating database "regression"         ==============
CREATE DATABASE
============== dropping regression test user accounts ==============
============== installing PL/pgSQL                    ==============
============== running regression test queries        ==============
parallel group (13 tests):  boolean int4 varchar char name text int2
int8 oid float4 bit numeric float8
      boolean              ... FAILED
      char                 ... FAILED
      name                 ... FAILED
      varchar              ... FAILED
      text                 ... FAILED
      int2                 ... FAILED
      int4                 ... FAILED
      int8                 ... FAILED
      oid                  ... FAILED
      float4               ... FAILED
      float8               ... FAILED
      bit                  ... FAILED
      numeric              ... FAILED
test strings              ... FAILED
test numerology           ... FAILED
parallel group (20 tests):  point box lseg path circle polygon time
date timetz timestamp timestamptz interval abstime tinterval reltime
inet comments oidjoins type_sanity opr_sanity
      point                ... FAILED
      lseg                 ... FAILED
      box                  ... FAILED
      path                 ... FAILED
      polygon              ... FAILED
      circle               ... FAILED
      date                 ... FAILED
      time                 ... FAILED
      timetz               ... FAILED
      timestamp            ... FAILED
      timestamptz          ... FAILED
      interval             ... FAILED
      abstime              ... FAILED
      reltime              ... FAILED
      tinterval            ... FAILED
      inet                 ... FAILED
      comments             ... FAILED
      oidjoins             ... FAILED
      type_sanity          ... FAILED
      opr_sanity           ... FAILED
test geometry             ... FAILED
test horology             ... FAILED
test create_function_1    ... FAILED
test create_type          ... FAILED
test create_table         ... FAILED
test create_function_2    ... FAILED
test copy                 ... FAILED
parallel group (7 tests):  constraints triggers create_misc
create_operator create_aggregate create_index inherit
      constraints          ... FAILED
      triggers             ... FAILED
      create_misc          ... FAILED
      create_aggregate     ... FAILED
      create_operator      ... FAILED
      create_index         ... FAILED
      inherit              ... FAILED
test create_view          ... FAILED
test sanity_check         ... FAILED
test errors               ... FAILED
test select               ... FAILED
parallel group (16 tests):  select_distinct select_into
select_distinct_on select_implicit select_having subselect case union
join aggregates transactions portals arrays random btree_index
hash_index
      select_into          ... FAILED
      select_distinct      ... FAILED
      select_distinct_on   ... FAILED
      select_implicit      ... FAILED
      select_having        ... FAILED
      subselect            ... FAILED
      union                ... FAILED
      case                 ... FAILED
      join                 ... FAILED
      aggregates           ... FAILED
      transactions         ... FAILED
      random               ... failed (ignored)
      portals              ... FAILED
      arrays               ... FAILED
      btree_index          ... FAILED
      hash_index           ... FAILED
test privileges           ... ok
test misc                 ... FAILED
parallel group (5 tests):  alter_table select_views portals_p2 rules
foreign_key
      select_views         ... FAILED
      alter_table          ... FAILED
      portals_p2           ... FAILED
      rules                ... FAILED
      foreign_key          ... FAILED
parallel group (3 tests):  plpgsql limit temp
      limit                ... FAILED
      plpgsql              ... FAILED
      temp                 ... FAILED
============== shutting down postmaster               ==============

=====================================================
  78 of 79 tests failed, 1 of these failures ignored.
=====================================================

The differences that caused some tests to fail can be viewed in the
file `./regression.diffs'.  A copy of the test summary that you see
above is saved in the file `./regression.out'.

make[2]: *** [check] Error 1
rm regress.o
make[2]: Leaving directory
`/home/davidc/src/PostgreSQL/postgresql-7.2.2/src/test/regress'
make[1]: *** [check] Error 2
make[1]: Leaving directory
`/home/davidc/src/PostgreSQL/postgresql-7.2.2/src/test'
make: *** [check] Error 2


I then tried with ./configure --enable-debug alone, and that build did
hang in the place I described in my first message.  (Between builds, I
rm -rf'd the postgresql-7.2.2 directory to be sure I was using fully
clean source each time.)

$ ps auxw | grep 'postgres:'
davidc   15639  0.0  0.0  7212 1380 pts/0    S    21:44   0:00
postgres: stats buffer process
davidc   15641  0.0  0.0  6268 1428 pts/0    S    21:44   0:00
postgres: stats collector process
davidc   15712  0.0  0.2  6664 3176 pts/0    S    21:44   0:00
postgres: davidc regression [local] idle
davidc   15715  0.0  0.1  6660 3040 pts/0    S    21:44   0:00
postgres: davidc regression [local] SELECT
davidc   15716  0.0  0.1  6660 3044 pts/0    S    21:44   0:00
postgres: davidc regression [local] SELECT
davidc   15717  0.0  0.1  6660 2944 pts/0    S    21:44   0:00
postgres: davidc regression [local] idle
davidc   15722  0.0  0.1  6660 2864 pts/0    S    21:44   0:00
postgres: davidc regression [local] SELECT
davidc   15731  0.0  0.1  6572 2140 pts/0    S    21:44   0:00
postgres: davidc regression [local] startup
davidc   15732  0.0  0.1  6568 1944 pts/0    S    21:44   0:00
postgres: davidc regression [local] startup
davidc   15733  0.0  0.1  6620 2524 pts/0    S    21:44   0:00
postgres: davidc regression [local] SELECT
davidc   15737  0.0  0.1  6568 1980 pts/0    S    21:44   0:00
postgres: davidc regression [local] startup
davidc   15738  0.0  0.1  6660 2844 pts/0    S    21:44   0:00
postgres: davidc regression [local] CREATE
davidc   15742  0.0  0.1  6568 1940 pts/0    S    21:44   0:00
postgres: davidc regression [local] startup
davidc   15743  0.0  0.1  6568 1876 pts/0    S    21:44   0:00
postgres: davidc regression [local] startup
davidc   15744  0.0  0.1  6548 1696 pts/0    S    21:44   0:00
postgres: davidc regression [local] startup
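
To pull just the backend PIDs out of a listing like the one above,
something along these lines might work.  It is only a sketch:
backend_pids is a name made up here, and it assumes a ps that accepts
-e and -o (as procps on Linux does) and process titles of the form
"postgres: user database [host] activity".

```shell
# Hypothetical helper: read "PID ARGS" lines on stdin and print the PID
# of every PostgreSQL backend.  Backends are recognised by the bracketed
# host field (e.g. "[local]") in their process title; the stats buffer
# and stats collector processes have no such field and are skipped.
backend_pids() {
    awk '$2 == "postgres:" && $0 ~ /\[.*\]/ { print $1 }'
}

# Usage (assumes a procps-style ps):
#   ps -eo pid=,args= | backend_pids
```

The filter reads stdin instead of calling ps itself, so it can also be
run against a saved copy of the listing.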


I don't really know what I'm doing with gdb, but I scanned the man
page, and here's what I typed to attach to three of the stuck backends
(pids 15715, 15738, and 15744):


$ gdb src/test/regress/tmp_check/install/usr/local/pgsql/bin/postgres
15715
GNU gdb Yellow Dog Linux (5.1.1-1b)
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and
you are
welcome to change it and/or distribute copies of it under certain
conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for
details.
This GDB was configured as "ppc-yellowdog-linux"...
/home/davidc/src/PostgreSQL/postgresql-7.2.2/15715: No such file or
directory.
Attaching to program:
/home/davidc/src/PostgreSQL/postgresql-7.2.2/src/test/regress/
tmp_check/install/usr/local/pgsql/bin/postgres, process 15715
Reading symbols from /usr/lib/libz.so.1...done.
Loaded symbols for /usr/lib/libz.so.1
Reading symbols from /lib/libcrypt.so.1...done.
Loaded symbols for /lib/libcrypt.so.1
Reading symbols from /lib/libresolv.so.2...done.
Loaded symbols for /lib/libresolv.so.2
Reading symbols from /lib/libnsl.so.1...done.
Loaded symbols for /lib/libnsl.so.1
Reading symbols from /lib/libdl.so.2...done.
Loaded symbols for /lib/libdl.so.2
Reading symbols from /lib/libm.so.6...done.
Loaded symbols for /lib/libm.so.6
Reading symbols from /usr/lib/libhistory.so.4...done.
Loaded symbols for /usr/lib/libhistory.so.4
Reading symbols from /lib/libc.so.6...done.
Loaded symbols for /lib/libc.so.6
Reading symbols from /lib/ld.so.1...done.
Loaded symbols for /lib/ld.so.1
0x0fdc297c in __syscall_ipc () at soinit.c:76
76      soinit.c: No such file or directory.
         in soinit.c
(gdb) bt
#0  0x0fdc297c in __syscall_ipc () at soinit.c:76
#1  0x0fdc38c0 in semop (semid=4, sops=0x7fffea18, nsops=1) at
../sysdeps/unix/sysv/linux/semop.c:36
#2  0x100e4424 in IpcSemaphoreLock ()
#3  0x100eb018 in LWLockAcquire ()
#4  0x100e7f3c in LockAcquire ()
#5  0x100e7434 in LockRelation ()
#6  0x1002cc5c in relation_openr ()
#7  0x1002cdac in heap_openr ()
#8  0x100dc100 in fireRIRrules ()
#9  0x100dc878 in QueryRewrite ()
#10 0x100eeee4 in pg_analyze_and_rewrite ()
#11 0x100ef244 in pg_exec_query_string ()
#12 0x100f0688 in PostgresMain ()
#13 0x100cf3dc in DoBackend ()
#14 0x100cec54 in BackendStartup ()
#15 0x100cdaac in ServerLoop ()
#16 0x100cd564 in PostmasterMain ()
#17 0x100a26b8 in main ()
#18 0x0fd07f70 in __libc_start_main (argc=4, ubp_av=0x7ffff814,
ubp_ev=0x1, auxvec=0x7ffff8a8, rtld_fini=0x4, stinfo=0x10154c20,
stack_on_entry=0x1)
     at ../sysdeps/powerpc/elf/libc-start.c:119
(gdb) q
The program is running.  Quit anyway (and detach it)? (y or n) y
Detaching from program:
/home/davidc/src/PostgreSQL/postgresql-7.2.2/src/test/regress/
tmp_check/install/usr/local/pgsql/bin/postgres, process 15715


$ gdb src/test/regress/tmp_check/install/usr/local/pgsql/bin/postgres
15738
GNU gdb Yellow Dog Linux (5.1.1-1b)
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and
you are
welcome to change it and/or distribute copies of it under certain
conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for
details.
This GDB was configured as "ppc-yellowdog-linux"...
/home/davidc/src/PostgreSQL/postgresql-7.2.2/15738: No such file or
directory.
Attaching to program:
/home/davidc/src/PostgreSQL/postgresql-7.2.2/src/test/regress/
tmp_check/install/usr/local/pgsql/bin/postgres, process 15738
Reading symbols from /usr/lib/libz.so.1...done.
Loaded symbols for /usr/lib/libz.so.1
Reading symbols from /lib/libcrypt.so.1...done.
Loaded symbols for /lib/libcrypt.so.1
Reading symbols from /lib/libresolv.so.2...done.
Loaded symbols for /lib/libresolv.so.2
Reading symbols from /lib/libnsl.so.1...done.
Loaded symbols for /lib/libnsl.so.1
Reading symbols from /lib/libdl.so.2...done.
Loaded symbols for /lib/libdl.so.2
Reading symbols from /lib/libm.so.6...done.
Loaded symbols for /lib/libm.so.6
Reading symbols from /usr/lib/libhistory.so.4...done.
Loaded symbols for /usr/lib/libhistory.so.4
Reading symbols from /lib/libc.so.6...done.
Loaded symbols for /lib/libc.so.6
Reading symbols from /lib/ld.so.1...done.
Loaded symbols for /lib/ld.so.1
0x0fdc297c in __syscall_ipc () at soinit.c:76
76      soinit.c: No such file or directory.
         in soinit.c
(gdb) bt
#0  0x0fdc297c in __syscall_ipc () at soinit.c:76
#1  0x0fdc38c0 in semop (semid=4, sops=0x7fffe7a8, nsops=1) at
../sysdeps/unix/sysv/linux/semop.c:36
#2  0x100e4424 in IpcSemaphoreLock ()
#3  0x100eb018 in LWLockAcquire ()
#4  0x100e7f3c in LockAcquire ()
#5  0x100e7434 in LockRelation ()
#6  0x1003206c in index_beginscan ()
#7  0x1013d280 in SearchCatCache ()
#8  0x101420c8 in SearchSysCache ()
#9  0x100502c0 in CatalogIndexInsert ()
#10 0x1004baf0 in AddNewRelationTuple ()
#11 0x1004bd10 in heap_create_with_catalog ()
#12 0x1006c86c in DefineRelation ()
#13 0x100f1828 in ProcessUtility ()
#14 0x100ef2f8 in pg_exec_query_string ()
#15 0x100f0688 in PostgresMain ()
#16 0x100cf3dc in DoBackend ()
#17 0x100cec54 in BackendStartup ()
#18 0x100cdaac in ServerLoop ()
#19 0x100cd564 in PostmasterMain ()
#20 0x100a26b8 in main ()
#21 0x0fd07f70 in __libc_start_main (argc=4, ubp_av=0x7ffff814,
ubp_ev=0x1, auxvec=0x7ffff8a8, rtld_fini=0x4, stinfo=0x10154c20,
stack_on_entry=0x1)
---Type <return> to continue, or q <return> to quit---
     at ../sysdeps/powerpc/elf/libc-start.c:119
(gdb) q
The program is running.  Quit anyway (and detach it)? (y or n) y
Detaching from program:
/home/davidc/src/PostgreSQL/postgresql-7.2.2/src/test/regress/
tmp_check/install/usr/local/pgsql/bin/postgres, process 15738


$ gdb src/test/regress/tmp_check/install/usr/local/pgsql/bin/postgres
15744
GNU gdb Yellow Dog Linux (5.1.1-1b)
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and
you are
welcome to change it and/or distribute copies of it under certain
conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for
details.
This GDB was configured as "ppc-yellowdog-linux"...
/home/davidc/src/PostgreSQL/postgresql-7.2.2/15744: No such file or
directory.
Attaching to program:
/home/davidc/src/PostgreSQL/postgresql-7.2.2/src/test/regress/
tmp_check/install/usr/local/pgsql/bin/postgres, process 15744
Reading symbols from /usr/lib/libz.so.1...done.
Loaded symbols for /usr/lib/libz.so.1
Reading symbols from /lib/libcrypt.so.1...done.
Loaded symbols for /lib/libcrypt.so.1
Reading symbols from /lib/libresolv.so.2...done.
Loaded symbols for /lib/libresolv.so.2
Reading symbols from /lib/libnsl.so.1...done.
Loaded symbols for /lib/libnsl.so.1
Reading symbols from /lib/libdl.so.2...done.
Loaded symbols for /lib/libdl.so.2
Reading symbols from /lib/libm.so.6...done.
Loaded symbols for /lib/libm.so.6
Reading symbols from /usr/lib/libhistory.so.4...done.
Loaded symbols for /usr/lib/libhistory.so.4
Reading symbols from /lib/libc.so.6...done.
Loaded symbols for /lib/libc.so.6
Reading symbols from /lib/ld.so.1...done.
Loaded symbols for /lib/ld.so.1
0x0fdc297c in __syscall_ipc () at soinit.c:76
76      soinit.c: No such file or directory.
         in soinit.c
(gdb) bt
#0  0x0fdc297c in __syscall_ipc () at soinit.c:76
#1  0x0fdc38c0 in semop (semid=4, sops=0x7fffe738, nsops=1) at
../sysdeps/unix/sysv/linux/semop.c:36
#2  0x100e4424 in IpcSemaphoreLock ()
#3  0x100eb018 in LWLockAcquire ()
#4  0x100e7f3c in LockAcquire ()
#5  0x100e79b4 in XactLockTableInsert ()
#6  0x10040828 in StartTransaction ()
#7  0x10040bb0 in StartTransactionCommand ()
#8  0x1014a610 in InitPostgres ()
#9  0x100f036c in PostgresMain ()
#10 0x100cf3dc in DoBackend ()
#11 0x100cec54 in BackendStartup ()
#12 0x100cdaac in ServerLoop ()
#13 0x100cd564 in PostmasterMain ()
#14 0x100a26b8 in main ()
#15 0x0fd07f70 in __libc_start_main (argc=4, ubp_av=0x7ffff814,
ubp_ev=0x1, auxvec=0x7ffff8a8, rtld_fini=0x4, stinfo=0x10154c20,
stack_on_entry=0x1)
     at ../sysdeps/powerpc/elf/libc-start.c:119
(gdb) q
The program is running.  Quit anyway (and detach it)? (y or n) y
Detaching from program:
/home/davidc/src/PostgreSQL/postgresql-7.2.2/src/test/regress/
tmp_check/install/usr/local/pgsql/bin/postgres, process 15744


If I goofed on this, I'm afraid I will need to ask for some
hand-holding with using gdb properly.  I'm happy to go through the
steps to get you what you need to see.
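
For what it's worth, the manual gdb sessions above could probably be
scripted.  The following is a rough sketch, not something I have run on
this box: backtrace_all and the GDB override variable are names
invented here, and it relies only on gdb's -batch and -x options plus
the same "gdb program pid" attach form used above.

```shell
# Hypothetical helper: print a backtrace for each PID listed after the
# path to the postgres executable.  gdb attaches, runs "bt" from a
# command file, and detaches again when it exits in batch mode.
# GDB is an assumed override for the gdb binary (defaults to "gdb").
backtrace_all() {
    prog=$1
    shift
    cmds=$(mktemp) || return 1
    echo bt > "$cmds"
    for pid in "$@"; do
        echo "=== backtrace for pid $pid ==="
        "${GDB:-gdb}" -batch -x "$cmds" "$prog" "$pid" 2>&1
    done
    rm -f "$cmds"
}

# Usage, with the executable path and pids from this message:
#   backtrace_all src/test/regress/tmp_check/install/usr/local/pgsql/bin/postgres \
#       15715 15738 15744
```

Attaching needs the same permissions as the interactive sessions, so
it would have to be run as the user that owns the backends.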


>> I have run PostgreSQL since 7.1 successfully on Red Hat Linux i386 and
>> Mac OS X 10.2 ppc (the very box I am currently having problems with)
>> without the lockup problem.
>
> Have you run 7.2.* on this same box under OS X?  (Ie, could the problem
> be specific to YDL?)

Yes, I have, and I can hammer it all I want without it hanging.
Interestingly, I tried Yellow Dog's RPM also (7.2) and it exhibits the
same behavior (i.e., locking up on multiple connections).

Thanks,
David
