Re: BUG #18129: GiST index produces incorrect query results - Mailing list pgsql-bugs

From Alexander Lakhin
Subject Re: BUG #18129: GiST index produces incorrect query results
Date
Msg-id cea1e1d0-7f95-fcf0-4aaf-71f0cdd11f4a@gmail.com
In response to Re: BUG #18129: GiST index produces incorrect query results  (Heikki Linnakangas <hlinnaka@iki.fi>)
List pgsql-bugs
Hello Heikki,

26.09.2023 00:24, Heikki Linnakangas wrote:
> I spent a while trying to create a test case for this that would not require expanding the test data so much, but to no
> avail. I even tried to take the same data set, but instead of duplicating each element like you did, I appended a
> random number of integers to each array, but even that did not trigger the failure. That particular data set seems 
> cursed; how did you stumble upon it?
>
> I'm reluctant to just make all the arrays larger, as it makes the test 2x slower. So instead of running all the
> commands over the larger data set, I added one more copy of the CREATE INDEX and the test queries to the end that
> uses the larger set.
>
> See attached. It's a squashed version of the previous patches I posted, with the test case. Barring objections, I'll
> commit this.

Thank you for working on this!

I can't review the code change in depth, unfortunately (I need time to
study gist guts), but I've tried to just test it.
And with the patch applied I get an assertion failure when the server is
compiled with --with-blocksize=1 (and -O0):
CPPFLAGS="-O0" ./configure -q --with-blocksize=1 --enable-debug --enable-cassert && make -s -j8 && make -s -j8 -C contrib && make -s check -C contrib/intarray
not ok 1     - _int                                      806 ms
# (test process exited with exit code 2)
^C# could not stop postmaster: exit code was 2

(gdb) bt
#0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=140373756303168) at ./nptl/pthread_kill.c:44
#1  __pthread_kill_internal (signo=6, threadid=140373756303168) at ./nptl/pthread_kill.c:78
#2  __GI___pthread_kill (threadid=140373756303168, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#3  0x00007fab4f409476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#4  0x00007fab4f3ef7f3 in __GI_abort () at ./stdlib/abort.c:79
#5  0x0000559563a87ed9 in ExceptionalCondition (conditionName=0x559563b17a28 "ItemIdHasStorage(itemId)", 
fileName=0x559563b17990 "../../../../src/include/storage/bufpage.h", lineNumber=355) at assert.c:66
#6  0x00005595633334fb in PageGetItem (page=0x7fab43d15800 "", itemId=0x7fab43d1581c) at 
../../../../src/include/storage/bufpage.h:355
#7  0x00005595633358e9 in gistFindCorrectParent (r=0x7fab4176b118, child=0x559565b45b18, is_build=true) at gist.c:1063
#8  0x00005595633361b5 in gistfinishsplit (state=0x7ffc087cd250, stack=0x559565b45b18, giststate=0x559565b7ecd8, 
splitinfo=0x559565be0290, unlockbuf=false) at gist.c:1392
#9  0x0000559563335fdb in gistinserttuples (state=0x7ffc087cd250, stack=0x559565b45b18, giststate=0x559565b7ecd8, 
tuples=0x559565b6b5d8, ntup=1, oldoffnum=0, leftchild=12421, rightchild=12422, unlockbuf=false, unlockleftchild=false)
     at gist.c:1322
#10 0x000055956333612c in gistfinishsplit (state=0x7ffc087cd250, stack=0x559565b45fc8, giststate=0x559565b7ecd8, 
splitinfo=0x559565b6b560, unlockbuf=false) at gist.c:1368
#11 0x0000559563335fdb in gistinserttuples (state=0x7ffc087cd250, stack=0x559565b45fc8, giststate=0x559565b7ecd8, 
tuples=0x559565b692a8, ntup=1, oldoffnum=0, leftchild=12419, rightchild=12420, unlockbuf=false, unlockleftchild=false)
     at gist.c:1322
....

(in fact it crashes even on `make check`)

I haven't understood the reason yet; I just wanted to let you know that the
patch may need one more fix before committing.

>
> ERROR:  failed to add item to index page in "byteatest_a_idx1"
>
> If you make that value even larger, then it fails with a better error message:
>
> ERROR:  index row requires 8208 bytes, maximum size is 8191
>
> So the "failed to add item" error doesn't seem expected.
>

Yeah, I saw that anomaly too.

Best regards,
Alexander



