Thread: GiST: Bad newtup On Exit From gistSplit() ?

GiST: Bad newtup On Exit From gistSplit() ?

From
Itai Zukerman
Date:
Just before the return from gistSplit() I see this:
 (gdb) p (*newtup)[0] $147 = {t_tid = {ip_blkid = {bi_hi = 0, bi_lo = 34}, ip_posid = 1}, t_info = 136} (gdb) p
(*newtup)[1]$148 = {t_tid = {ip_blkid = {bi_hi = 65510, bi_lo = 65535}, ip_posid = 65535}, t_info = 24575} (gdb) p nlen
$149= 2
 

(*newtup)[1] doesn't look right, correct?  Everything *seems* OK.  I
need to go back and trace gistFormTuple() and ItemPointerSet(), maybe?

-- 
Itai Zukerman  <http://www.math-hat.com/~zukerman/>


Re: GiST: Bad newtup On Exit From gistSplit() ?

From
Itai Zukerman
Date:
>   (gdb) p (*newtup)[0]
>   $147 = {t_tid = {ip_blkid = {bi_hi = 0, bi_lo = 34}, ip_posid = 1}, t_info = 136}
>   (gdb) p (*newtup)[1]
>   $148 = {t_tid = {ip_blkid = {bi_hi = 65510, bi_lo = 65535}, ip_posid = 65535}, t_info = 24575}
>   (gdb) p nlen
>   $149 = 2
>
> (*newtup)[1] doesn't look right, correct?

Sorry, my fault, that should've been *(newtup[1]), which indeed looks
fine.

I'm still not having any luck tracking down my SEGV.  If anyone is
interested in debugging or even just reproducing this, I'd be happy to
post or E-mail you my code.  It looks like shortly after calling
gistSplit() there's maybe a double-free() somewhere?  Here's a typical
session; the NOTICEs are generated by my picksplit function:

$ gdb ./postgres
(gdb) run -D /var/lib/postgres/data test <x

[... output as many tuples are inserted, and finally ...]

NOTICE:  g_sig_picksplit:  <--
NOTICE:  g_sig_picksplit:  left  (  9)
*.*.***..*..**....*....*.***.**.*..*.*.****....*..**..**.....*****...***..*.*.*.***..*.**..**.*..*.**.*..**.***.*...*.**.**.......*......***....**......*.***.*......*.*..**....**.*..*..**.*..*.*.***.*.*.**.*....*.**..*...*..*.*****.*.*...*..*.*...*..**..*..***....***.*.*.*..*...**.......*....**.*.*.*..*.*..***.*.****.....*.....**..**...*.*....*..**..**.***...*......*..*....*.*.*.....*.*..*....**......****.*.***.*........***..*....*.*....**.*..***.*.*...*..**.*...*...*...*.......*.*.....**......*............................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
NOTICE:  g_sig_picksplit:  right ( 50)
.**..********************************************************.*.********.****************************************************************.********.***************************************.**********************************.********************.*.*************.********.****************************.***.**.*******************.*.************************************************.******************************.*****.*******************************************************************************.********............................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................

Program received signal SIGSEGV, Segmentation fault.
0x4028e2ff in mallopt () from /lib/libc.so.6
(gdb) bt
#0  0x4028e2ff in mallopt () from /lib/libc.so.6
#1  0x4028d2ca in free () from /lib/libc.so.6
#2  0x081d05e5 in AllocSetDelete (context=0x82ae098) at aset.c:460
#3  0x081d0f03 in MemoryContextDelete (context=0x82ae098) at mcxt.c:188
#4  0x081d0f36 in MemoryContextDeleteChildren (context=0x82ae010) at mcxt.c:207
#5  0x081d0e8f in MemoryContextDelete (context=0x82ae010) at mcxt.c:161
#6  0x08094070 in AtCommit_Memory () at xact.c:685
#7  0x080943e6 in CommitTransaction () at xact.c:1033
#8  0x080946b6 in CommitTransactionCommand (forceCommit=0 '\0') at xact.c:1304
#9  0x08160966 in finish_xact_command (forceCommit=0 '\0') at postgres.c:978
#10 0x081607ef in pg_exec_query_string (query_string=0x82e4570, dest=Debug, parse_context=0x82adf88) at postgres.c:897
#11 0x081619f0 in PostgresMain (argc=4, argv=0x82960c0, username=0x82966c0 "postgres") at postgres.c:2013
#12 0x08110625 in main (argc=4, argv=0xbffffc14) at main.c:235

--
Itai Zukerman  <http://www.math-hat.com/~zukerman/>


Re: GiST: Bad newtup On Exit From gistSplit() ?

From
Tom Lane
Date:
Itai Zukerman <zukerman@math-hat.com> writes:
> I'm still not having any luck tracking down my SEGV.  If anyone is
> interested in debugging or even just reproducing this, I'd be happy to
> post or E-mail you my code.  It looks like shortly after calling
> gistSplit() there's maybe a double-free() somewhere?

It might help to build with --enable-cassert, if you didn't already.
        regards, tom lane


Re: GiST: Bad newtup On Exit From gistSplit() ?

From
Itai Zukerman
Date:
>> I'm still not having any luck tracking down my SEGV.  If anyone is
>> interested in debugging or even just reproducing this, I'd be happy to
>> post or E-mail you my code.  It looks like shortly after calling
>> gistSplit() there's maybe a double-free() somewhere?
>
> It might help to build with --enable-cassert, if you didn't already.

After recompiling with --enable-cassert, nserting into an empty table,
I get:
 TRAP: FailedAssertion("!((VfdCache[0].fd == (-1)))", File: "fd.c", Line: 1113)
 Program received signal SIGABRT, Aborted. 0x400eda51 in kill () from /lib/libc.so.6 (gdb) bt #0  0x400eda51 in kill ()
from/lib/libc.so.6 #1  0x400ed872 in raise () from /lib/libc.so.6 #2  0x400ee986 in abort () from /lib/libc.so.6 #3
0x081dc4d7in ExceptionalCondition (conditionName=0x8262d92 "!((VfdCache[0].fd == (-1)))", errorType=0x8262a13
"FailedAssertion",fileName=0x8262a0e "fd.c",      lineNumber=1113) at assert.c:46 #4  0x0816a085 in AtEOXact_Files ()
atfd.c:1113 #5  0x080a1921 in CommitTransaction () at xact.c:1031 #6  0x080a1c85 in CommitTransactionCommand
(forceCommit=0'\0') at xact.c:1304 #7  0x08177aaa in finish_xact_command (forceCommit=0 '\0') at postgres.c:978 #8
0x08177933in pg_exec_query_string (query_string=0x8314054, dest=Debug, parse_context=0x82ddb18) at postgres.c:897 #9
0x08178ba7in PostgresMain (argc=4, argv=0x82c5e00, username=0x82c6400 "postgres") at postgres.c:2013 #10 0x081233b9 in
main(argc=4, argv=0xbffffc14) at main.c:235
 

But if I restart postgres and start with some data in the table, I
get, as before:
 Program received signal SIGSEGV, Segmentation fault. 0x401312ff in mallopt () from /lib/libc.so.6 (gdb) bt #0
0x401312ffin mallopt () from /lib/libc.so.6 #1  0x401302ca in free () from /lib/libc.so.6 #2  0x081ea205 in
AllocSetDelete(context=0x82efffc) at aset.c:460 #3  0x081eb107 in MemoryContextDelete (context=0x82efffc) at mcxt.c:188
#4 0x081eb172 in MemoryContextDeleteChildren (context=0x82eff70) at mcxt.c:207 #5  0x081eb093 in MemoryContextDelete
(context=0x82eff70)at mcxt.c:161 #6  0x080a1587 in AtCommit_Memory () at xact.c:685 #7  0x080a192b in CommitTransaction
()at xact.c:1033 #8  0x080a1c85 in CommitTransactionCommand (forceCommit=0 '\0') at xact.c:1304 #9  0x08177aaa in
finish_xact_command(forceCommit=0 '\0') at postgres.c:978 #10 0x08177933 in pg_exec_query_string
(query_string=0x8313fcc,dest=Debug, parse_context=0x82efee4) at postgres.c:897 #11 0x08178ba7 in PostgresMain (argc=4,
argv=0x82c5e00,username=0x82c6400 "postgres") at postgres.c:2013 #12 0x081233b9 in main (argc=4, argv=0xbffffc14) at
main.c:235

Since this only happens after a call to gistSplit(), I still suspect
GiST.  But I've been going over the GiST code the past 3 days and I
still can't find anything substantially wrong.

-- 
Itai Zukerman  <http://www.math-hat.com/~zukerman/>


Re: GiST: Bad newtup On Exit From gistSplit() ?

From
Tom Lane
Date:
Itai Zukerman <zukerman@math-hat.com> writes:
> After recompiling with --enable-cassert, nserting into an empty table,
> I get:

>   TRAP: FailedAssertion("!((VfdCache[0].fd == (-1)))", File: "fd.c", Line: 1113)

Begins to look like a plain old wild store: ain't *nothing* should ever
write into VfdCache[0].fd.  If that's repeatable, you could try homing
in on the place that is clobbering that variable by stepping through
major routines with gdb.
        regards, tom lane


Re: GiST: Bad newtup On Exit From gistSplit() ?

From
Kevin Brown
Date:
Tom Lane wrote:
> Itai Zukerman <zukerman@math-hat.com> writes:
> > After recompiling with --enable-cassert, nserting into an empty table,
> > I get:
> 
> >   TRAP: FailedAssertion("!((VfdCache[0].fd == (-1)))", File: "fd.c", Line: 1113)
> 
> Begins to look like a plain old wild store: ain't *nothing* should ever
> write into VfdCache[0].fd.  If that's repeatable, you could try homing
> in on the place that is clobbering that variable by stepping through
> major routines with gdb.

If this is a system that supports hardware watchpoints (like Linux on
x86), then you should be able to do a "watch VfdCache[0].fd".  In
fact, if you know the specific value it's going to be set to, you can
do a conditional watchpoint: "watch VfdCache[0].fd if VfdCache[0] ==
<value>" (you can set the condition to be pretty much anything).

The program will stop on the statement immediately following the one
that scribbled on the area you're watching when the condition you
specify is met.  Note, though, that the condition is tested *after*
the modification happens.


-- 
Kevin Brown                          kevin@sysexperts.com


Re: GiST: Bad newtup On Exit From gistSplit() ?

From
Itai Zukerman
Date:
>> >   TRAP: FailedAssertion("!((VfdCache[0].fd == (-1)))", File: "fd.c", Line: 1113)

> If this is a system that supports hardware watchpoints (like Linux on
> x86), then you should be able to do a "watch VfdCache[0].fd".

That's exactly what I did, and tracked the problem down to a typo in
my code *blush*.  Thanks for all the great suggestions!

-- 
Itai Zukerman  <http://www.math-hat.com/~zukerman/>


Re: GiST: Bad newtup On Exit From gistSplit() ?

From
Teodor Sigaev
Date:
Send me 'x' file and I'll try to help you.

Itai Zukerman wrote:
>>  (gdb) p (*newtup)[0]
>>  $147 = {t_tid = {ip_blkid = {bi_hi = 0, bi_lo = 34}, ip_posid = 1}, t_info = 136}
>>  (gdb) p (*newtup)[1]
>>  $148 = {t_tid = {ip_blkid = {bi_hi = 65510, bi_lo = 65535}, ip_posid = 65535}, t_info = 24575}
>>  (gdb) p nlen
>>  $149 = 2
>>
>>(*newtup)[1] doesn't look right, correct?
> 
> 
> Sorry, my fault, that should've been *(newtup[1]), which indeed looks
> fine.
> 
> I'm still not having any luck tracking down my SEGV.  If anyone is
> interested in debugging or even just reproducing this, I'd be happy to
> post or E-mail you my code.  It looks like shortly after calling
> gistSplit() there's maybe a double-free() somewhere?  Here's a typical
> session; the NOTICEs are generated by my picksplit function:
> 
> $ gdb ./postgres
> (gdb) run -D /var/lib/postgres/data test <x
> 
> [... output as many tuples are inserted, and finally ...]
> 
> NOTICE:  g_sig_picksplit:  <--
> NOTICE:  g_sig_picksplit:  left  (  9)
*.*.***..*..**....*....*.***.**.*..*.*.****....*..**..**.....*****...***..*.*.*.***..*.**..**.*..*.**.*..**.***.*...*.**.**.......*......***....**......*.***.*......*.*..**....**.*..*..**.*..*.*.***.*.*.**.*....*.**..*...*..*.*****.*.*...*..*.*...*..**..*..***....***.*.*.*..*...**.......*....**.*.*.*..*.*..***.*.****.....*.....**..**...*.*....*..**..**.***...*......*..*....*.*.*.....*.*..*....**......****.*.***.*........***..*....*.*....**.*..***.*.*...*..**.*...*...*...*.......*.*.....**......*.................................................................................................................................................................................................................................................................................................................................................................................................................................................................
...........................................................................
> NOTICE:  g_sig_picksplit:  right ( 50)
.**..********************************************************.*.********.****************************************************************.********.***************************************.**********************************.********************.*.*************.********.****************************.***.**.*******************.*.************************************************.******************************.*****.*******************************************************************************.********.................................................................................................................................................................................................................................................................................................................................................................................................................................................................
...........................................................................
> 
> Program received signal SIGSEGV, Segmentation fault.
> 0x4028e2ff in mallopt () from /lib/libc.so.6
> (gdb) bt
> #0  0x4028e2ff in mallopt () from /lib/libc.so.6
> #1  0x4028d2ca in free () from /lib/libc.so.6
> #2  0x081d05e5 in AllocSetDelete (context=0x82ae098) at aset.c:460
> #3  0x081d0f03 in MemoryContextDelete (context=0x82ae098) at mcxt.c:188
> #4  0x081d0f36 in MemoryContextDeleteChildren (context=0x82ae010) at mcxt.c:207
> #5  0x081d0e8f in MemoryContextDelete (context=0x82ae010) at mcxt.c:161
> #6  0x08094070 in AtCommit_Memory () at xact.c:685
> #7  0x080943e6 in CommitTransaction () at xact.c:1033
> #8  0x080946b6 in CommitTransactionCommand (forceCommit=0 '\0') at xact.c:1304
> #9  0x08160966 in finish_xact_command (forceCommit=0 '\0') at postgres.c:978
> #10 0x081607ef in pg_exec_query_string (query_string=0x82e4570, dest=Debug, parse_context=0x82adf88) at
postgres.c:897
> #11 0x081619f0 in PostgresMain (argc=4, argv=0x82960c0, username=0x82966c0 "postgres") at postgres.c:2013
> #12 0x08110625 in main (argc=4, argv=0xbffffc14) at main.c:235
> 

-- 
Teodor Sigaev
teodor@stack.net