Re: issues with range types, btree_gist and constraints - Mailing list pgsql-hackers

From Tom Lane
Subject Re: issues with range types, btree_gist and constraints
Date
Msg-id 11942.1360105776@sss.pgh.pa.us
Whole thread Raw
In response to Re: issues with range types, btree_gist and constraints  (Tomas Vondra <tv@fuzzy.cz>)
Responses Re: issues with range types, btree_gist and constraints
List pgsql-hackers
Tomas Vondra <tv@fuzzy.cz> writes:
> I've managed to further simplify the test-case, and I've verified that
> it's reproducible on current 9.2 and 9.3 branches.

I'm not sure why you're getting unstable behavior --- it's pretty stable
for me, at least in a debug build.  What I'm finding is that gistsplit.c
appears to contain multiple bugs.  If you just insert the first 271 rows
of sample-1.csv, the 271'st row results in a gist page split.  The split
results in two pages having these root-page downlink entries:
Item   2 -- Length:   88  Offset: 8000 (0x1f40)  Flags: NORMAL Block Id: 2  linp Index: 65535  Size: 88 Has Nulls: 0
HasVarwidths: 1
 
 1f40: 00000200 ffff5840 93900000 00373637  ......X@.....767 1f50: 61626536 31313961 30313834 34366263
abe6119a018446bc1f60: 64373632 37666638 61346632 64900000  d7627ff8a4f2d... 1f70: 00626465 61613234 38323966 31626365
.bdeaa24829f1bce1f80: 36343561 39313463 38386665 61613739  645a914c88feaa79 1f90: 340d440f 00001800
4.D.....       
 

(ie, range 767abe6119a018446bcd7627ff8a4f2d..bdeaa24829f1bce645a914c88feaa794)
Item   3 -- Length:   72  Offset: 7928 (0x1ef8)  Flags: NORMAL Block Id: 3  linp Index: 65535  Size: 72 Has Nulls: 0
HasVarwidths: 1
 
 1ef8: 00000300 ffff4840 73900000 00626538  ......H@s....be8 1f08: 30333261 33656138 31326565 62323838
032a3ea812eeb2881f18: 64663361 65636161 30666361 34500000  df3aecaa0fca4P.. 1f28: 0053696d 706c6554 65737453 7472696e
.SimpleTestStrin1f38: 670d440f 00001800                    g.D.....        
 

(ie, range be8032a3ea812eeb288df3aecaa0fca4..SimpleTestString)

which looks fine .. but the leaf entry for SimpleTestString went into
the first of these two pages, which is 100% wrong.

The split produced by contrib/btree_gist is perfectly fine, and it
assigns SimpleTestString to the right-hand page as expected.  The
bug is being introduced in lines 435..480 of gistsplit.c, which
is code that doesn't fire unless we're working on a non-last
column of the index (which is why you couldn't reproduce this in
a 1-column index).  It seems that

(1) gistfindgroup decides that SimpleTestString is "equiv" to something.
It's not too clear what; for sure there is no other leaf key of the same
value.  The code in gistfindgroup looks a bit like it ought to think
that all the items are "equiv" to one of the union datums or the other,
but that's not the case --- no other item gets marked.  (After further
investigation it seems that gbt_var_penalty is returning a negative
penalty at the critical spot here, which might have something to
do with it --- the business with the tmp[4] array seems many degrees
away from trustworthy.)

(2) cleanupOffsets removes the item for SimpleTestString from
sv->spl_right[], evidently because it's "equiv".

(3) placeOne() sticks the entry into the spl_left[] array, which is
an absolutely horrid decision: it means the left-side page will now
need a range spanning the right-side page's range, when a moment
ago they had been disjoint.

(4) gistunionsubkey(), which apparently is supposed to fix the union
datums to reflect this rearrangement, does no such thing because it
only processes the index columns to the right of the one where the
damage has been done.  (It's not clear to me that it could ever
be allowed to skip any columns, but that's what it's doing.)

If I don't hear a commitment from Teodor to fix this, I'm going to
rip out pretty much all of this logic as being under-documented
and overly-buggy.
        regards, tom lane



pgsql-hackers by date:

Previous
From: Tomas Vondra
Date:
Subject: Re: PATCH: Split stats file per database WAS: autovacuum stress-testing our system
Next
From: Tom Lane
Date:
Subject: Re: split rm_name and rm_desc out of rmgr.c