Thread: Win32 GiST bug - more info

Win32 GiST bug - more info

From: "Mark Cave-Ayland"

Hi guys,

Further to Paul Ramsey's email regarding problems with GiST under Win32,
I've been able to get some more information using gdb under MingW.

It looks as if something strange is going on with one of the pointers
being used during the creation of the GiST index. Here is the output
from gdb which shows that the problem occurs in the gbox_union()
function when trying to build a GiST index using the PostGIS spatial
extensions to PostgreSQL:


$ gdb ./postgres.exe
GNU gdb 5.2.1
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i686-pc-mingw32"...
(gdb) attach 3984
Attaching to program `c:\pgsql75win\bin/./postgres.exe', process 3984
[Switching to thread 3984.0xdbc]
(gdb) continue
Continuing.

Program received signal SIGSEGV, Segmentation fault.
[Switching to thread 3984.0x1a8]
0x6d850a9a in gbox_union (fcinfo=0x22d470) at postgis_gist_72.c:342
342                     if (pageunion->high.x < cur->high.x)
(gdb) bt
#0  0x6d850a9a in gbox_union (fcinfo=0x22d470) at postgis_gist_72.c:342
#1  0x00000000 in ?? ()
(gdb) print pageunion
$1 = (BOX *) 0x141b6b0
(gdb) print cur
$2 = (BOX *) 0x7f7f7f7e
(gdb) print entryvec
$3 = (bytea *) 0x1453e24
(gdb) print sizep
$4 = (int *) 0x22d55c


The actual function itself looks like this:


Datum gbox_union(PG_FUNCTION_ARGS)
{
    bytea       *entryvec = (bytea *) PG_GETARG_POINTER(0);
    int           *sizep = (int *) PG_GETARG_POINTER(1);
    int            numranges,
                i;
    BOX           *cur,
               *pageunion;

#ifdef DEBUG_GIST
    elog(NOTICE,"GIST: gbox_union called\n");
    fflush( stdout );
#endif

    numranges = (VARSIZE(entryvec) - VARHDRSZ) / sizeof(GISTENTRY);
    pageunion = (BOX *) palloc(sizeof(BOX));
    cur = DatumGetBoxP(((GISTENTRY *) VARDATA(entryvec))[0].key);
    memcpy((void *) pageunion, (void *) cur, sizeof(BOX));

    for (i = 1; i < numranges; i++)
    {
        cur = DatumGetBoxP(((GISTENTRY *) VARDATA(entryvec))[i].key);
        if (pageunion->high.x < cur->high.x)
            pageunion->high.x = cur->high.x;
        if (pageunion->low.x > cur->low.x)
            pageunion->low.x = cur->low.x;
        if (pageunion->high.y < cur->high.y)
            pageunion->high.y = cur->high.y;
        if (pageunion->low.y > cur->low.y)
            pageunion->low.y = cur->low.y;
    }
    *sizep = sizeof(BOX);

    PG_RETURN_POINTER(pageunion);
}


Looking at the pointers, the value for cur looks quite suspicious ;)
Could it be some sort of alignment/pointer problem with the GISTENTRY or
VAR* macros? I can provide as much information as required if someone
can talk me through what I need to do using gdb. The version of
PostgreSQL being used was taken from CVS last Friday (21/05/2004).


Many thanks,

Mark.

---

Mark Cave-Ayland
Webbased Ltd.
Tamar Science Park
Derriford
Plymouth
PL6 8BX
England

Tel: +44 (0)1752 764445
Fax: +44 (0)1752 764446


This email and any attachments are confidential to the intended
recipient and may also be privileged. If you are not the intended
recipient please delete it from your system and notify the sender. You
should not copy it or use it for any purpose nor disclose or distribute
its contents to any other person.



Re: Win32 GiST bug - more info

From: Tom Lane

"Mark Cave-Ayland" <m.cave-ayland@webbased.co.uk> writes:
> (gdb) print cur
> $2 = (BOX *) 0x7f7f7f7e

> Looking at the pointers, the value for cur looks quite suspicious ;)

Yeah, it's been taken from memory that's not been initialized, or has
already been pfree'd.  In other words the caller is at fault, not this
subroutine.  Can't say much more than that though.

            regards, tom lane
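
An editorial aside on why a value such as 0x7f7f7f7e points at freed or uninitialized memory: PostgreSQL's CLOBBER_FREED_MEMORY debugging option overwrites freed chunks with 0x7F bytes, so a pointer read back from such a chunk consists almost entirely of 0x7f. Whether that option was enabled in the build discussed here is an assumption. The following is a minimal sketch of the effect; clobber(), clobber_bytes_in() and read_stale_pointer() are hypothetical helpers, not PostgreSQL functions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Fill byte used in this sketch; PostgreSQL's CLOBBER_FREED_MEMORY
 * debugging option uses 0x7F for freed chunks. */
#define CLOBBER_BYTE 0x7F

/* Simulate what a debugging allocator does to a chunk when it is freed. */
static void clobber(void *chunk, size_t len)
{
    memset(chunk, CLOBBER_BYTE, len);
}

/* Count how many of the four bytes of a 32-bit value equal CLOBBER_BYTE. */
static int clobber_bytes_in(uint32_t value)
{
    int count = 0;

    for (int i = 0; i < 4; i++)
        if (((value >> (8 * i)) & 0xFFu) == CLOBBER_BYTE)
            count++;
    return count;
}

/* Read a 32-bit "pointer" out of a clobbered chunk, as a 32-bit
 * backend would when it picks up a stale key pointer. */
static uint32_t read_stale_pointer(void)
{
    unsigned char chunk[16];
    uint32_t stale;

    assert(sizeof stale <= sizeof chunk);
    clobber(chunk, sizeof chunk);
    memcpy(&stale, chunk, sizeof stale);
    return stale;
}
```

With this helper, the value gdb printed for cur has three of its four bytes equal to the fill byte, while the healthy pageunion pointer 0x141b6b0 has none.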

Re: Win32 GiST bug - more info

From: "Mark Cave-Ayland"

Hi Tom,

Thanks for your reply. I found that I got a "better" backtrace by
executing a couple of commands in the psql.exe session before creating
the index. The improved backtrace is given below:


$ gdb postgres.exe
GNU gdb 5.2.1
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i686-pc-mingw32"...
(gdb) attach 3804
Attaching to program `c:\pgsql75win\bin/postgres.exe', process 3804
[Switching to thread 3804.0xec]
(gdb) continue
Continuing.

Program received signal SIGSEGV, Segmentation fault.
[Switching to thread 3804.0xe30]
0x6d850a9a in gbox_union (fcinfo=0x22d470) at postgis_gist_72.c:342
342                     if (pageunion->high.x < cur->high.x)
(gdb) bt
#0  0x6d850a9a in gbox_union (fcinfo=0x22d470) at postgis_gist_72.c:342
#1  0x77f57d70 in _libmsvcrt_a_iname ()
#2  0x0060896a in FunctionCall2 (flinfo=0x22e530, arg1=21174392, arg2=2282844)
    at fmgr.c:1132
#3  0x004078f5 in gistgetadjusted (r=0x13caef8, oldtup=0x188ad30,
    addtup=0x14318c8, giststate=0x22e230) at gist.c:813
#4  0x00406ba1 in gistlayerinsert (r=0x13caef8, blkno=0, itup=0x22e044,
    len=0x22e048, res=0x0, giststate=0x22e230) at gist.c:483
#5  0x0040642f in gistdoinsert (r=0x13caef8, itup=0x1411048, res=0x0,
    giststate=0x22e230) at gist.c:426
#6  0x0040616b in gistbuildCallback (index=0x13caef8, htup=0x1431edc,
    attdata=0x22e150, nulls=0x22e130 " ", tupleIsAlive=1, state=0x22e230)
    at gist.c:274
#7  0x00454e0a in IndexBuildHeapScan (heapRelation=0x13aefe8,
    indexRelation=0x13caef8, indexInfo=0x14110e8,
    callback=0x4060b0 <gistbuildCallback>, callback_state=0x22e230)
    at index.c:1599
#8  0x00405ff2 in gistbuild (fcinfo=0x22f770) at gist.c:185
#9  0x00609184 in OidFunctionCall3 (functionId=782, arg1=20639720,
    arg2=20754168, arg3=21041384) at fmgr.c:1399
#10 0x004549eb in index_build (heapRelation=0x13aefe8,
    indexRelation=0x13caef8, indexInfo=0x14110e8) at index.c:1348
#11 0x0045373c in index_create (heapRelationId=33591,
    indexRelationName=0x13b4cb8 "geomtest_idx", indexInfo=0x14110e8,
    accessMethodObjectId=783, classObjectId=0x1411028, primary=0 '\0',
    isconstraint=0 '\0', allow_system_table_mods=0 '\0', skip_build=0 '\0')
    at index.c:747
#12 0x004d1321 in DefineIndex (heapRelation=0x13b4cf8,
    indexRelationName=0x13b4cb8 "geomtest_idx",
    accessMethodName=0x13b4d28 "gist", attributeList=0x13b4df8, predicate=0x0,
    rangetable=0x0, unique=0 '\0', primary=0 '\0', isconstraint=0 '\0',
    is_alter_table=0 '\0', check_rights=1 '\001', skip_build=0 '\0',
    quiet=0 '\0') at indexcmds.c:339
#13 0x0058d801 in ProcessUtility (parsetree=0x13b4e18, dest=0x13b4e68,
    completionTag=0x22fac0 "") at utility.c:634
#14 0x0058bab5 in PortalRunUtility (portal=0x13fedb0, query=0x13b4b98,
    dest=0x13b4e68, completionTag=0x22fac0 "") at pquery.c:780
#15 0x0058be34 in PortalRunMulti (portal=0x13fedb0, dest=0x13b4e68,
    altdest=0x13b4e68, completionTag=0x22fac0 "") at pquery.c:844
#16 0x0058b4d2 in PortalRun (portal=0x13fedb0, count=2147483647,
    dest=0x13b4e68, altdest=0x13b4e68, completionTag=0x22fac0 "")
    at pquery.c:501
#17 0x005869de in exec_simple_query (
    query_string=0x13b4978 "create index geomtest_idx on geomtest using gist (geom gist_geometry_ops);") at postgres.c:930
#18 0x0058a206 in PostgresMain (argc=6, argv=0x3d4960, username=0x3d50b0 "mca")
    at postgres.c:2925
#19 0x00558ca3 in BackendRun (port=0x22fc40) at postmaster.c:2660
#20 0x00558e6f in SubPostmasterMain (argc=2, argv=0x3d24d8)
    at postmaster.c:2716
#21 0x0051d821 in main (argc=4, argv=0x3d24d0) at main.c:286
(gdb)


Does this give you a better idea of exactly what's happening?


Many thanks,

Mark.




Re: Win32 GiST bug - more info

From: Tom Lane

"Mark Cave-Ayland" <m.cave-ayland@webbased.co.uk> writes:
> Thanks for your reply. I found that I got a "better" backtrace by
> executing a couple of commands in the psql.exe session before creating
> the index. The improved backtrace was given below:

Ah, I think I know the problem: you haven't updated your code to conform
to the recently-revised API for GIST index support functions.  You need
to look at these diffs:

2004-03-30 10:45  teodor

    * contrib/btree_gist/btree_common.c,
    contrib/btree_gist/btree_gist.h,
    contrib/btree_gist/btree_gist.sql.in,
    contrib/btree_gist/btree_num.c.in, contrib/btree_gist/btree_ts.c,
    contrib/cube/cube.c, contrib/cube/cube.sql.in,
    contrib/intarray/_int.sql.in, contrib/intarray/_int_gist.c,
    contrib/intarray/_intbig_gist.c, contrib/ltree/_ltree_gist.c,
    contrib/ltree/ltree.sql.in, contrib/ltree/ltree_gist.c,
    contrib/rtree_gist/rtree_gist.c,
    contrib/rtree_gist/rtree_gist.sql.in, contrib/seg/seg.c,
    contrib/seg/seg.sql.in, contrib/tsearch/gistidx.c,
    contrib/tsearch/tsearch.sql.in, contrib/tsearch2/gistidx.c,
    contrib/tsearch2/tsearch.sql.in, contrib/tsearch2/untsearch.sql.in,
    src/backend/access/gist/gist.c, src/include/access/gist.h: Cleanup
    vectors of GISTENTRY and eliminate problem with 64-bit
    strict-aligned boxes. Change interface to user-defined GiST support
    methods union and picksplit. Now instead of bytea struct it used
    special GistEntryVector structure.

There should be some discussion in the pgsql-hackers archives, too.

I think the direct cause of the crash is you're computing the wrong
number of elements in the passed GISTENTRY vector and iterating off the
end of the actually allocated vector.

            regards, tom lane
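
An editorial sketch of how the element count can go wrong when a function written for the old interface is handed the new structure. The old code derives the count from a leading byte-length word; the revised structure starts with an element count instead. The types below (MockEntry, MockEntryVector) and both helper functions are simplified stand-ins invented for illustration, not the real PostgreSQL definitions, and the exact numbers depend on platform and struct sizes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-ins, NOT the real PostgreSQL definitions. */
typedef struct
{
    uintptr_t key;      /* plays the role of the entry's key datum */
    int32_t   bytes;
} MockEntry;

typedef struct
{
    int32_t   n;          /* new-style header: number of entries */
    MockEntry vector[4];  /* fixed capacity keeps the sketch simple */
} MockEntryVector;

#define MOCK_VARHDRSZ ((int32_t) sizeof(int32_t))

/* Old convention: treat the first 32-bit word as a total byte length and
 * divide the payload size by the entry size. */
static size_t count_old_style(const void *p)
{
    int32_t header;

    memcpy(&header, p, sizeof header);
    /* A small count such as 2 makes the subtraction negative; converting
     * that to an unsigned size before dividing yields a huge result. */
    return (size_t) (header - MOCK_VARHDRSZ) / sizeof(MockEntry);
}

/* New convention: the first word simply is the element count. */
static size_t count_new_style(const MockEntryVector *v)
{
    assert(v->n >= 0);
    return (size_t) v->n;
}
```

A loop bounded by the old-style result would walk far past the entries that were actually allocated, which fits the diagnosis above.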

Re: Win32 GiST bug - more info

From: "Mark Cave-Ayland"

Hi Tom/Win32 hackers,

Just to close out this thread: Tom was correct. I hadn't noticed that a
recent commit to the GiST code had altered part of the API, and our
out-of-date support functions were causing the crash. Our Linux development
box runs an older CVS snapshot, so we didn't notice there. I don't suppose
there's any way of having the compiler throw an error on this?

So in short, GiST indices work fine in Win32 :)


Cheers,

Mark.
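
For readers landing on this thread later, here is an editorial sketch of the union logic written against a counted entry vector, which is the shape the commit log describes. It uses self-contained stand-in types (MockEntry, MockEntryVector, and a local BOX) so that it compiles without the PostgreSQL headers; it is not the actual PostGIS fix.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in types so the sketch builds on its own; the field names of BOX
 * mirror those used in the function quoted earlier in the thread. */
typedef struct { double x, y; } Point;
typedef struct { Point high, low; } BOX;
typedef struct { BOX *key; } MockEntry;              /* stand-in for GISTENTRY */

#define MOCK_CAPACITY 8

typedef struct
{
    int32_t   n;                      /* number of valid entries */
    MockEntry vector[MOCK_CAPACITY];  /* the entries themselves */
} MockEntryVector;                    /* stand-in for GistEntryVector */

/* Return the bounding box covering every entry in the vector. */
static BOX box_union(const MockEntryVector *entryvec)
{
    assert(entryvec->n > 0 && entryvec->n <= MOCK_CAPACITY);

    BOX pageunion = *entryvec->vector[0].key;

    /* The loop bound comes from the explicit count, not from a byte length. */
    for (int32_t i = 1; i < entryvec->n; i++)
    {
        const BOX *cur = entryvec->vector[i].key;

        if (pageunion.high.x < cur->high.x)
            pageunion.high.x = cur->high.x;
        if (pageunion.low.x > cur->low.x)
            pageunion.low.x = cur->low.x;
        if (pageunion.high.y < cur->high.y)
            pageunion.high.y = cur->high.y;
        if (pageunion.low.y > cur->low.y)
            pageunion.low.y = cur->low.y;
    }
    return pageunion;
}
```

In a real support function the vector would still arrive through PG_GETARG_POINTER(0) and be cast to the expected type. Because that argument is an untyped pointer, the compiler has no way to notice that the caller and the function disagree about the layout, which is probably why the mismatch only showed up at run time.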

