Re: issues with range types, btree_gist and constraints - Mailing list pgsql-hackers

From Tom Lane
Subject Re: issues with range types, btree_gist and constraints
Date
Msg-id 19072.1360195807@sss.pgh.pa.us
Whole thread Raw
In response to Re: issues with range types, btree_gist and constraints  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: issues with range types, btree_gist and constraints  (Tomas Vondra <tv@fuzzy.cz>)
List pgsql-hackers
I wrote:
> Tomas Vondra <tv@fuzzy.cz> writes:
>> I've managed to further simplify the test-case, and I've verified that
>> it's reproducible on current 9.2 and 9.3 branches.

> It seems that

> (1) gistfindgroup decides that SimpleTestString is "equiv" to something.
> It's not too clear what; for sure there is no other leaf key of the same
> value.  The code in gistfindgroup looks a bit like it ought to think
> that all the items are "equiv" to one of the union datums or the other,
> but that's not the case --- no other item gets marked.

After looking closer at this, I see that it's marking left-side tuples
as "equiv" if they could be put in the right-side page for zero penalty,
and similarly marking right-side tuples as "equiv" if they could be put
in the left-side page for zero penalty.  IOW it's finding the tuples for
which it doesn't matter which side they go to.  So this is not quite as
insane as I thought, although the name "equiv" for the flag does not
seem like le mot juste, and the function name and comment are rather
misleading IMO.

> (After further
> investigation it seems that gbt_var_penalty is returning a negative
> penalty at the critical spot here, which might have something to
> do with it --- the business with the tmp[4] array seems many degrees
> away from trustworthy.)

This is a bug, but the core gist code ought not generate invalid indexes
just because the type-specific penalty function is doing something
stupid.

> (2) cleanupOffsets removes the item for SimpleTestString from
> sv->spl_right[], evidently because it's "equiv".

> (3) placeOne() sticks the entry into the spl_left[] array, which is
> an absolutely horrid decision: it means the left-side page will now
> need a range spanning the right-side page's range, when a moment
> ago they had been disjoint.

The above behaviors are fine, really, because they are driven by the
penalty function's claim that there's zero/negative cost to moving this
tuple to the other side.

> (4) gistunionsubkey(), which apparently is supposed to fix the union
> datums to reflect this rearrangement, does no such thing because it
> only processes the index columns to the right of the one where the
> damage has been done.  (It's not clear to me that it could ever
> be allowed to skip any columns, but that's what it's doing.)

This is a bug.  It's also a bug that gistunionsubkey() fails to null out
the existing union keys for each side before calling gistMakeUnionItVec
--- that can result in the recomputed union keys being larger than
necessary.

Furthermore, there's also a bug in gbt_var_bin_union(), the same one
I saw previously that it does the wrong thing when both sides of the
existing range need to be enlarged.

The attached patch fixes these things, but not the buggy penalty
function, because we ought to try to verify that these changes are
enough to prevent creation of an incorrect index.  I can't reproduce any
misbehavior anymore with this patch applied.  However, I'm troubled by
your report that the behavior is unstable, because I get the same result
each time if I start from an empty (truncated) table, with or without
the patch.  You may be seeing some further bug(s).  Could you test this
fix against your own test cases?

There's a lot of other things I don't much like in gistsplit.c, but
they're in the nature of cosmetic and documentation fixes.

            regards, tom lane


Attachment

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: PATCH: Split stats file per database WAS: autovacuum stress-testing our system
Next
From: Tomas Vondra
Date:
Subject: Re: PATCH: Split stats file per database WAS: autovacuum stress-testing our system