Thread: 8.2beta1 crash possibly in libpq

8.2beta1 crash possibly in libpq

From: Mark Cave-Ayland

Hi everyone,

I'm in the process of generating the Windows installer for the latest
PostGIS 1.1.4 release and I'm getting a regression failure in one of
the libpq applications - the application in question is generating a
segfault.

Testing so far shows that the regression tests pass without segfaulting
in the following scenarios:

PostgreSQL 8.2beta1 / PostGIS 1.1.4 / Linux
PostgreSQL 8.1 / PostGIS 1.1.4 / Win32

So it appears it is something to do with 8.2beta1 and Win32. I've
compiled the application with debugging symbols enabled and get the
following backtrace from gdb in MingW:


(gdb) set args -f /tmp/pgis_reg_4060/dumper postgis_reg loadedshp
(gdb) run
Starting program: C:\msys\1.0\home\mca\postgis\pg82\postgis-1.1.4
\regress/../loader/pgsql2shp.exe -f /tmp/pgis_reg_4060/dumper
postgis_reg loadedshp
Initializing... 
Program received signal SIGSEGV, Segmentation fault.
0x63512c1c in ?? ()
(gdb) bt
#0  0x63512c1c in ?? ()
#1  0x0040c69c in _fu8__PQntuples () at pgsql2shp.c:2502
#2  0x00408481 in main (ARGC=5, ARGV=0x3d2750) at pgsql2shp.c:243
(gdb) 


I also turned on the logging in the server and get the following in the
server log:


2006-10-08 12:01:15 LOG:  statement: BEGIN;
2006-10-08 12:01:15 LOG:  statement: CREATE TABLE "loadedshp" (gid
serial PRIMARY KEY);
2006-10-08 12:01:15 NOTICE:  CREATE TABLE will create implicit sequence
"loadedshp_gid_seq" for serial column "loadedshp.gid"
2006-10-08 12:01:15 NOTICE:  CREATE TABLE / PRIMARY KEY will create
implicit index "loadedshp_pkey" for table "loadedshp"
2006-10-08 12:01:15 LOG:  statement: SELECT
AddGeometryColumn('','loadedshp','the_geom','-1','POINT',2);
2006-10-08 12:01:17 LOG:  statement: INSERT INTO "loadedshp" (the_geom)
VALUES ('01010000000000000000000000000000000000F03F');
2006-10-08 12:01:18 LOG:  statement: INSERT INTO "loadedshp" (the_geom)
VALUES ('01010000000000000000002240000000000000F0BF');
2006-10-08 12:01:18 LOG:  statement: INSERT INTO "loadedshp" (the_geom)
VALUES ('01010000000000000000002240000000000000F0BF');
2006-10-08 12:01:18 LOG:  statement: END;
2006-10-08 12:01:21 LOG:  statement: select asewkt(the_geom) from
loadedshp;
2006-10-08 12:01:36 LOG:  statement: DROP table loadedshp
2006-10-08 12:01:39 LOG:  statement: BEGIN;
2006-10-08 12:01:39 LOG:  statement: CREATE TABLE "loadedshp" (gid
serial PRIMARY KEY);
2006-10-08 12:01:39 NOTICE:  CREATE TABLE will create implicit sequence
"loadedshp_gid_seq" for serial column "loadedshp.gid"
2006-10-08 12:01:39 NOTICE:  CREATE TABLE / PRIMARY KEY will create
implicit index "loadedshp_pkey" for table "loadedshp"
2006-10-08 12:01:39 LOG:  statement: SELECT
AddGeometryColumn('','loadedshp','the_geom','-1','POINT',2);
2006-10-08 12:01:41 LOG:  statement: COPY "loadedshp" (the_geom) FROM
stdin;
2006-10-08 12:01:41 LOG:  statement: END;
2006-10-08 12:01:43 LOG:  statement: select asewkt(the_geom) from
loadedshp;
2006-10-08 12:02:34 LOG:  statement: SELECT postgis_version()
2006-10-08 12:02:34 LOG:  statement: SELECT a.attname, a.atttypid,
a.attlen, a.atttypmod FROM pg_attribute a, pg_class c WHERE a.attrelid =
c.oid and a.attnum > 0 AND a.atttypid != 0 AND c.relname = 'loadedshp'
2006-10-08 12:02:48 LOG:  could not receive data from client: No
connection could be made because the target machine actively refused
it.
2006-10-08 12:02:48 LOG:  unexpected EOF on client connection


AFAICT the backtrace and server log are indicating that the crash is
happening somewhere in libpq. If someone can help me figure out how to
load the libpq symbols into MingW's gdb then I can get a better
backtrace if required, as I can reproduce this 100% of the time. For
reference, the source for the application in question can be found at
http://svn.refractions.net/postgis/tags/1.1.4/loader/pgsql2shp.c.


Many thanks,

Mark.




Re: 8.2beta1 crash possibly in libpq

From: "Magnus Hagander"

> AFAICT the backtrace and server log are indicating that the
> crash is happening somewhere in libpq. If someone can help me
> figure out how to load the libpq symbols into MingW's gdb
> then I can get a better backtrace if required, as I can
> reproduce this 100% of the time. For reference, the source
> for the application in question can be found at
> http://svn.refractions.net/postgis/tags/1.1.4/loader/pgsql2shp.c.

If you figure out how to make gdb actually work on mingw, let us know -
not many people have ever managed to get it working, and I don't know of
anybody who can make it work repeatedly.

That said, libpq builds with Visual C++. Could you try building your
pgsql2shp with Visual C++ as well, and then use the Visual C++ debugger
(or windbg, really). They should give working backtraces.

//Magnus


Re: 8.2beta1 crash possibly in libpq

From: Mark Cave-Ayland

On Sun, 2006-10-08 at 17:53 +0200, Magnus Hagander wrote:
> > AFAICT the backtrace and server log are indicating that the
> > crash is happening somewhere in libpq. If someone can help me
> > figure out how to load the libpq symbols into MingW's gdb
> > then I can get a better backtrace if required, as I can
> > reproduce this 100% of the time. For reference, the source
> > for the application in question can be found at
> > http://svn.refractions.net/postgis/tags/1.1.4/loader/pgsql2shp.c.
> 
> If you figure out how to make gdb actually work on mingw, let us know -
> not many people have ever managed to get it working, and I don't know of
> anybody who can make it work repeatedly.
> 
> That said, libpq builds with Visual C++. Could you try building your
> pgsql2shp with Visual C++ as well, and then use the Visual C++ debugger
> (or windbg, really). They should give working backtraces.
> 
> //Magnus


Hi Magnus,

Getting closer I think. I managed to compile an MSVC libpq but it agreed
with the MingW backtrace in that it was jumping into the middle of
nowhere :(

I think I may be getting closer though: I've just done a comparison
build with PostgreSQL 8.1 and noticed that an extra message is being
emitted in the 8.2 build regarding PQntuples (which is where the crash
is occurring):



PG 8.1:

mca@MCAWINXP ~/postgis/pg81/postgis-1.1.4/loader
$ make
gcc -g -Wall -I.. -DUSE_ICONV -DUSE_VERSION=81   -c -o shpopen.o
shpopen.c
shpopen.c:176: warning: 'rcsid' defined but not used
gcc -g -Wall -I.. -DUSE_ICONV -DUSE_VERSION=81   -c -o dbfopen.o
dbfopen.c
dbfopen.c:206: warning: 'rcsid' defined but not used
gcc -g -Wall -I.. -DUSE_ICONV -DUSE_VERSION=81   -c -o getopt.o
getopt.c
gcc -g -Wall -I.. -DUSE_ICONV -DUSE_VERSION=81   -c -o shp2pgsql.o
shp2pgsql.c
shp2pgsql.c: In function `utf8':
shp2pgsql.c:1686: warning: passing arg 2 of `libiconv' from incompatible
pointer type
gcc -g -Wall -I.. -DUSE_ICONV -DUSE_VERSION=81 shpopen.o dbfopen.o
getopt.o shp2pgsql.o -liconv -o shp2pgsql.exe 
gcc -g -Wall -I.. -DUSE_ICONV -DUSE_VERSION=81
-IC:/msys/1.0/home/mca/pg81/REL-81~1.4/include -c pgsql2shp.c
gcc -g -Wall -I.. -DUSE_ICONV -DUSE_VERSION=81   -c -o PQunescapeBytea.o
PQunescapeBytea.c
gcc -g -Wall -I.. -DUSE_ICONV -DUSE_VERSION=81 shpopen.o dbfopen.o
getopt.o PQunescapeBytea.o pgsql2shp.o -liconv
C:/msys/1.0/home/mca/pg81/REL-81~1.4/lib/libpq.dll -o pgsql2shp.exe 


PG 8.2:

mca@MCAWINXP ~/postgis/pg82/postgis-1.1.4/loader
$ make
gcc -g -Wall -I.. -DUSE_ICONV -DUSE_VERSION=82   -c -o shpopen.o
shpopen.c
shpopen.c:176: warning: 'rcsid' defined but not used
gcc -g -Wall -I.. -DUSE_ICONV -DUSE_VERSION=82   -c -o dbfopen.o
dbfopen.c
dbfopen.c:206: warning: 'rcsid' defined but not used
gcc -g -Wall -I.. -DUSE_ICONV -DUSE_VERSION=82   -c -o getopt.o
getopt.c
gcc -g -Wall -I.. -DUSE_ICONV -DUSE_VERSION=82   -c -o shp2pgsql.o
shp2pgsql.c
shp2pgsql.c: In function `utf8':
shp2pgsql.c:1686: warning: passing arg 2 of `libiconv' from incompatible
pointer type
gcc -g -Wall -I.. -DUSE_ICONV -DUSE_VERSION=82 shpopen.o dbfopen.o
getopt.o shp2pgsql.o -liconv -o shp2pgsql.exe 
gcc -g -Wall -I.. -DUSE_ICONV -DUSE_VERSION=82
-IC:/msys/1.0/home/mca/pg82/REL-8~1.2BE/include -c pgsql2shp.c
gcc -g -Wall -I.. -DUSE_ICONV -DUSE_VERSION=82   -c -o PQunescapeBytea.o
PQunescapeBytea.c
gcc -g -Wall -I.. -DUSE_ICONV -DUSE_VERSION=82 shpopen.o dbfopen.o
getopt.o PQunescapeBytea.o pgsql2shp.o -liconv
C:/msys/1.0/home/mca/pg82/REL-8~1.2BE/lib/libpq.dll -o pgsql2shp.exe 
Info: resolving _PQntuples by linking to __imp__PQntuples (auto-import)


I think the key part is this line: "Info: resolving _PQntuples by
linking to __imp__PQntuples (auto-import)". Could it be that the linker
cannot find a reference to PQntuples and hence is jumping into random
code? I have verified that PQntuples does exist within libpq.dll using
the Microsoft Dependency Walker though.


Kind regards,

Mark.
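
To see at a glance which symbols the linker reports as auto-imported (the 8.2 build output above reports only _PQntuples), the make output can be filtered. The following is a minimal sketch only: the function name list_auto_imports is made up for illustration, and the pattern is based solely on the single "Info: resolving ..." message format quoted in this message, so other linker versions may word it differently.

```sh
#!/bin/sh
# Hypothetical helper (not part of PostGIS or PostgreSQL): read a build
# log on standard input and print the symbol named in each
# "Info: resolving ... (auto-import)" line.
list_auto_imports() {
    sed -n 's/^Info: resolving \(_[A-Za-z0-9_]*\) by linking to __imp_[A-Za-z0-9_]* (auto-import)$/\1/p'
}

# Demonstration using the message quoted above.
printf '%s\n' \
    'gcc -g -Wall -I.. -c pgsql2shp.c' \
    'Info: resolving _PQntuples by linking to __imp__PQntuples (auto-import)' \
    | list_auto_imports
```

In a real build this would be used as "make 2>&1 | list_auto_imports", which makes it easy to compare the 8.1 and 8.2 builds symbol by symbol.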





Re: 8.2beta1 crash possibly in libpq

From: "Magnus Hagander"

> Hi Magnus,
>
> Getting closer I think. I managed to compile an MSVC libpq but it agreed
> with the MingW backtrace in that it was jumping into the middle of
> nowhere :(

Oops. Sounds like a generic memory corruption then, overwriting the
return stack so the backtrace doesn't work.


> I think I may be getting closer though: I've just done a comparison
> build with PostgreSQL 8.1 and noticed that an extra message is being
> emitted in the 8.2 build regarding PQntuples (which is where the crash
> is occurring):

> mca@MCAWINXP ~/postgis/pg81/postgis-1.1.4/loader
> $ make
> gcc -g -Wall -I.. -DUSE_ICONV -DUSE_VERSION=81   -c -o shpopen.o
> shpopen.c


A question based on that - are you using gettext? I know gettext, and
possibly iconv, break if gettext is compiled with one version of VC++
and the program using it with a different version. If you are building
with it, try disabling it and see if that's where the problem comes from.

<snip>

> C:/msys/1.0/home/mca/pg82/REL-8~1.2BE/lib/libpq.dll -o pgsql2shp.exe
> Info: resolving _PQntuples by linking to __imp__PQntuples (auto-import)
>
>
> I think the key part is this line: "Info: resolving _PQntuples by
> linking to __imp__PQntuples (auto-import)". Could it be that the linker
> cannot find a reference to PQntuples and hence is jumping into random
> code? I have verified that PQntuples does exist within libpq.dll using
> the Microsoft Dependency Walker though.

This is fairly normal, and it's just info - not even a warning. If it
couldn't find the reference, you'd get one of those "could not find
entrypoint in DLL" error boxes when you tried to start the program. It
absolutely will not just pick a random memory address and jump to it.
You could possibly do that yourself if you were loading the DLL
manually, but since you're not doing that...

//Magnus



Re: 8.2beta1 crash possibly in libpq

From: Tom Lane

"Magnus Hagander" <mha@sollentuna.net> writes:
>> C:/msys/1.0/home/mca/pg82/REL-8~1.2BE/lib/libpq.dll -o pgsql2shp.exe
>> Info: resolving _PQntuples by linking to __imp__PQntuples (auto-import)

> This is fairly normal, and it's just info - not even a warning.

It seems pretty odd that it would only be whinging about PQntuples and
not any of the other libpq entry points, though.  I think Mark should
try to figure out why that is.
        regards, tom lane


Re: 8.2beta1 crash possibly in libpq

From: Mark Cave-Ayland

Hi Magnus,

I finally got to the bottom of this - it seems that the flags being
passed to MingW's linker were incorrect, but instead of erroring out it
decided to create a corrupt executable. Here is the command line that
was being used to link the pgsql2shp.exe executable, along with the
associated auto-import warning:


gcc -g -Wall -I.. -DUSE_ICONV -DUSE_VERSION=82 shpopen.o dbfopen.o
getopt.o PQunescapeBytea.o pgsql2shp.o -liconv
C:/msys/1.0/home/mca/pg82/REL-8~1.2BE/lib/libpq.dll -o pgsql2shp.exe
Info: resolving _PQntuples by linking to __imp__PQntuples (auto-import)


Note that libpq.dll is referenced directly by its full path rather than
with -L and -l, which I believe should be invalid syntax. This produces
a corrupt executable that crashes whenever PQntuples is accessed. On the
other hand, a correct executable can be produced by linking like this:


gcc -g -Wall -I.. -DUSE_ICONV -DUSE_VERSION=82 shpopen.o dbfopen.o
getopt.o PQunescapeBytea.o pgsql2shp.o -liconv
-LC:/msys/1.0/home/mca/pg82/REL-8~1.2BE/lib -lpq -o pgsql2shp.exe


Note there is no auto-import warning, and the use of -L and -l is what I
would expect. In actual fact, the incorrect link line was being produced
by an error in the configure.in script, so this won't be a scenario that
most people will experience.

The executables linked using the second method now work properly without
crashing during regression. The big mystery is that the command line
used to link the executables has been like that for several versions
now, and I have absolutely no idea why it only triggered this failure
when being linked against 8.2beta1 when it works perfectly on 8.1 and
8.0, and also why only PQntuples was affected.


Kind regards,

Mark.
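
The difference between the two link lines above can be sketched as a small shell helper of the kind a configure script might use to turn a library path into -L/-l flags. This is an illustration under stated assumptions, not the actual PostGIS configure.in change: the function name dll_to_ldflags is hypothetical, and it assumes the file is named lib<name>.<ext> with a single extension.

```sh
#!/bin/sh
# Hypothetical helper (not taken from PostGIS): convert a path such as
# /some/dir/libpq.dll into the linker flags "-L/some/dir -lpq".
dll_to_ldflags() {
    lib_path=$1
    lib_dir=$(dirname "$lib_path")     # directory that goes after -L
    lib_file=$(basename "$lib_path")   # e.g. libpq.dll
    lib_name=${lib_file#lib}           # strip the leading "lib"
    lib_name=${lib_name%.*}            # strip the final extension (e.g. .dll)
    printf '%s\n' "-L$lib_dir -l$lib_name"
}

# Demonstration with the path from the failing link line.
dll_to_ldflags "C:/msys/1.0/home/mca/pg82/REL-8~1.2BE/lib/libpq.dll"
```

For the path shown, this prints the same "-LC:/msys/1.0/home/mca/pg82/REL-8~1.2BE/lib -lpq" pair used in the working link command.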