Thread: segmentation fault in psql

segmentation fault in psql

From
pgsql-bugs@postgresql.org
Date:
David George (david@onyxsoft.com) reports a bug with a severity of 1
The lower the number the more severe it is.

Short Description
segmentation fault in psql

Long Description
System info: Sparc Solaris 2.7 with GCC 2.95.2
I have compiled Postgresql 7.1RC1 without any problems.
initdb, createuser, createdb work fine.  psql works, I can create a table, and insert data into that table, but if I
try to select anything I get a core dump.  I even tried just a 'select CURRENT_USER;'.  Here is the output from gdb with
a backtrace:

GNU gdb 5.0
Copyright 2000 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "sparc-sun-solaris2.7"...
Core was generated by `psql test'.
Program terminated with signal 11, Segmentation Fault.
Reading symbols from /usr/local/lib/libpq.so.2...done.
Loaded symbols for /usr/local/lib/libpq.so.2
Reading symbols from /usr/lib/libresolv.so.2...done.
Loaded symbols for /usr/lib/libresolv.so.2
Reading symbols from /usr/lib/libgen.so.1...done.
Loaded symbols for /usr/lib/libgen.so.1
Reading symbols from /usr/lib/libnsl.so.1...done.
Loaded symbols for /usr/lib/libnsl.so.1
Reading symbols from /usr/lib/libsocket.so.1...done.
Loaded symbols for /usr/lib/libsocket.so.1
Reading symbols from /usr/lib/libdl.so.1...done.
Loaded symbols for /usr/lib/libdl.so.1
Reading symbols from /usr/lib/libm.so.1...done.
Loaded symbols for /usr/lib/libm.so.1
Reading symbols from /usr/lib/libc.so.1...done.
Loaded symbols for /usr/lib/libc.so.1
Reading symbols from /usr/lib/libmp.so.2...done.
Loaded symbols for /usr/lib/libmp.so.2
Reading symbols from /usr/platform/SUNW,UltraSPARC-IIi-Engine/lib/libc_psr.so.1...done.
Loaded symbols for /usr/platform/SUNW,UltraSPARC-IIi-Engine/lib/libc_psr.so.1
#0  0x274a8 in putc ()
(gdb) bt
#0  0x274a8 in putc ()
#1  0x21044 in print_aligned_text ()
#2  0x23360 in printTable ()
#3  0x23a44 in printQuery ()
#4  0x18820 in SendQuery ()
#5  0x1b044 in MainLoop ()
#6  0x1d5a8 in main ()
(gdb)


Sample Code


No file was uploaded with this report

Re: segmentation fault in psql

From
Tom Lane
Date:
David George (david@onyxsoft.com) writes:
> (gdb) bt
> #0  0x274a8 in putc ()
> #1  0x21044 in print_aligned_text ()
> #2  0x23360 in printTable ()
> #3  0x23a44 in printQuery ()
> #4  0x18820 in SendQuery ()
> #5  0x1b044 in MainLoop ()
> #6  0x1d5a8 in main ()
> (gdb)

Can't tell a lot from that.  Could you rebuild psql with debug symbols
so we can see a more complete backtrace?

            regards, tom lane

Re: segmentation fault in psql

From
David George
Date:
Tom Lane wrote:

> Can't tell a lot from that.  Could you rebuild psql with debug symbols
> so we can see a more complete backtrace?

Here is a backtrace with debug enabled:
(gdb) bt
#0  0x446cc in putc ()
#1  0x26748 in print_aligned_text (title=0x0, headers=0x746d0,
cells=0x746e0, footers=0x746f0, opt_align=0x74700 "l",
    opt_barebones=0 '\000', opt_border=1, fout=0x68458) at print.c:288
#2  0x28a2c in printTable (title=0x0, headers=0x746d0, cells=0x746e0,
footers=0x746f0, align=0x74700 "l", opt=0x6857c,
    fout=0x68438) at print.c:986
#3  0x29104 in printQuery (result=0x76a00, opt=0x6857c, fout=0x68438) at
print.c:1108
#4  0x1da8c in SendQuery (query=0x702f0 "select current_user;") at
common.c:459
#5  0x20714 in MainLoop (source=0x68428) at mainloop.c:427
#6  0x22c6c in main (argc=2, argv=0xffbef774) at startup.c:293

I had a thought.  I remember configure checking for sfio (which I actually
have installed), but it wasn't checking for libstdio.a so I added
(AC_CHECK_LIB(stdio,     main)) to configure.in right under the sfio check
and ran autoconf then configure again.  This time I don't get a segfault.
It outputs the following:

test=# select current_user;
 current_user
--------------
 david
(1 row)

Then it doesn't echo what I type.  Without exiting, I typed select
current_user; again and it did output the following even though it didn't
echo what I was typing:
 current_user
--------------
 david
(1 row)

I tried a create table and as soon as I pressed enter, my key presses
stopped echoing.

Versions:
Postgresql 7.1RC1
Sparc Solaris 2.7 11/99 (with Mar 7 2001 patch cluster)
gcc 2.95.3 (I was using 2.95.2 earlier)
readline 4.1
sfio 20000531
zlib 1.1.3

Re: segmentation fault in psql

From
Tom Lane
Date:
David George <david@onyxsoft.com> writes:
> Here is a backtrace with debug enabled:
> (gdb) bt
> #0  0x446cc in putc ()
> #1  0x26748 in print_aligned_text (title=0x0, headers=0x746d0,
> cells=0x746e0, footers=0x746f0, opt_align=0x74700 "l",
>     opt_barebones=0 '\000', opt_border=1, fout=0x68458) at print.c:288

Hmm.  Line 288 is

    fputc(' ', fout);

which is difficult to imagine screwing up.  So it does seem that you
must have library problems.

> I had a thought.  I remember configure checking for sfio (which I actually
> have installed), but it wasn't checking for libstdio.a so I added
> (AC_CHECK_LIB(stdio,     main)) to configure.in right under the sfio check
> and ran autoconf then configure again.  This time I don't get a segfault.

Uh, what are sfio and stdio anyway, and why would we want them?  putc is
in plain old libc in every system I've dealt with.  If you remove both
sfio and stdio from configure, does it work any better?

            regards, tom lane

Re: segmentation fault in psql

From
David George
Date:
Tom Lane wrote:

> Uh, what are sfio and stdio anyway, and why would we want them?  putc is
> in plain old libc in every system I've dealt with.  If you remove both
> sfio and stdio from configure, does it work any better?

Thanks.  Removing sfio from configure.in and reconfiguring/making did the job.
I didn't try it before because I figured Postgresql may have actually been using
sfio for something.

sfio is AT&T's replacement for stdio.  It is available at
http://www.research.att.com/sw/tools/sfio/

The reason for using it on Solaris is that Solaris can't fopen file
descriptors above 255.  So once a process has more than 255 open files,
any further fopens will fail mysteriously (I have forgotten what the error
message is, but it is something like EPERM or something stupid like that).

Here is a link to the Solaris FAQ that describes this:
http://www.science.uva.nl/pub/solaris/solaris2.html#q3.45
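
The descriptor limit described above can be probed with a small C routine.
This is only a sketch under stated assumptions: probe_fopen_at_or_above and
MAX_HELD are hypothetical names, /dev/null is assumed to exist, and the
process's open-file limit is assumed to be high enough to reach the requested
descriptor number.  It occupies every descriptor below a threshold with
open() and then tries one fopen().

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

#define MAX_HELD 4096   /* hypothetical cap on descriptors held by the probe */

/*
 * Occupy all descriptors below min_fd, then attempt one fopen().
 * Returns 0 if fopen() succeeded (its descriptor is stored in *fd_seen),
 * the errno value if fopen() failed, or -1 if the probe could not be
 * set up (for example, the open-file limit was reached first).
 */
int probe_fopen_at_or_above(int min_fd, int *fd_seen)
{
    int held[MAX_HELD];
    int nheld = 0;
    int highest = -1;
    int result = -1;
    FILE *fp;

    /* open() returns the lowest free descriptor, so once it hands back
     * min_fd - 1 or higher, everything below min_fd is in use. */
    while (highest < min_fd - 1 && nheld < MAX_HELD) {
        int fd = open("/dev/null", O_RDONLY);
        if (fd < 0)
            break;              /* hit the per-process limit or another error */
        held[nheld++] = fd;
        highest = fd;
    }

    if (highest >= min_fd - 1) {
        errno = 0;
        fp = fopen("/dev/null", "r");
        if (fp != NULL) {
            if (fd_seen != NULL)
                *fd_seen = fileno(fp);
            fclose(fp);
            result = 0;
        } else {
            result = errno;     /* the code a caller would have to cope with */
        }
    }

    /* Release the descriptors that were only held to fill the low range. */
    while (nheld > 0)
        close(held[--nheld]);
    return result;
}
```

Calling probe_fopen_at_or_above(256, &fd) on a system with the limitation
would be expected to return a nonzero errno value; on a system without it,
the call should return 0 with fd at 256 or higher.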

Re: segmentation fault in psql

From
Tom Lane
Date:
David George <david@onyxsoft.com> writes:
> Thanks.  Removing sfio from configure.in and reconfiguring/making did
> the job.  I didn't try it before because I figured Postgresql may have
> actually been using sfio for something.

No; I'm not sure why it's in configure's search list at all.

It sounds like we might be tripping over a bug in sfio's stdio
emulation.  You might want to report this to the sfio people.

> The reason for using it on Solaris is that Solaris can't fopen file
> descriptors above 255.  So once a process has more than 255 open files,
> any further fopens will fail mysteriously (I have forgotten what the
> error message is, but it is something like EPERM or something stupid
> like that).

As long as the error code is something appropriate (EMFILE one hopes)
then I think we should cope with this situation correctly.  If it really
is EPERM then you might find the backend giving weird errors when run
with a file descriptor limit above 256.

            regards, tom lane
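
One way a caller might cope with an appropriate error code such as EMFILE is
sketched below.  This is an illustration, not the backend's actual code:
fopen_with_retry and the release_one callback are hypothetical names, and the
callback is assumed to close one descriptor the caller can spare and return
nonzero if it did so.  Any other error, including EPERM, is passed straight
back to the caller.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical callback: free one spare descriptor, return nonzero on success. */
typedef int (*release_fn)(void);

/*
 * fopen() that retries only when the failure means "out of descriptors".
 * On any other error, or when nothing more can be released, it returns
 * NULL with errno preserved from the failed fopen() call.
 */
FILE *fopen_with_retry(const char *path, const char *mode, release_fn release_one)
{
    for (;;) {
        FILE *fp = fopen(path, mode);
        int saved;

        if (fp != NULL)
            return fp;

        saved = errno;
        /* EMFILE: per-process limit reached; ENFILE: system-wide limit. */
        if ((saved != EMFILE && saved != ENFILE) ||
            release_one == NULL || !release_one()) {
            errno = saved;      /* the callback may have changed errno */
            return NULL;
        }
        /* A descriptor was released; loop and try the fopen() again. */
    }
}
```

With this shape, an fopen that fails with EPERM instead of EMFILE would never
trigger the retry, which is the sort of weird error the reply above warns
about.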