Thread: Index-only scan for btree_gist turns bpchar to char

Index-only scan for btree_gist turns bpchar to char

From
Alexander Lakhin
Date:
Hello hackers,

While testing the index-only scan fix, I've discovered that replacing
the index-only scan with the index scan changes contrib/btree_gist
output because index-only scan for btree_gist returns a string without
padding.
A simple demonstration (based on btree_gist/sql/char.sql):
CREATE EXTENSION btree_gist;

CREATE TABLE chartmp (a char(32));
INSERT INTO chartmp VALUES('31b0');

CREATE INDEX charidx ON chartmp USING gist ( a );
SET enable_seqscan=off;
EXPLAIN VERBOSE SELECT *, octet_length(a) FROM chartmp WHERE a BETWEEN
'31a' AND '31c';
SELECT *, octet_length(a) FROM chartmp WHERE a BETWEEN '31a' AND '31c';

SET enable_indexonlyscan=off;
EXPLAIN VERBOSE SELECT *, octet_length(a) FROM chartmp WHERE a BETWEEN
'31a' AND '31c';
SELECT *, octet_length(a) FROM chartmp WHERE a BETWEEN '31a' AND '31c';


                                  QUERY
PLAN                                 
------------------------------------------------------------------------------
 Index Only Scan using charidx on chartmp  (cost=0.12..8.15 rows=1
width=136)
   Index Cond: ((a >= '31a'::bpchar) AND (a <= '31c'::bpchar))
(2 rows)

  a   | octet_length
------+--------------
 31b0 |            4
(1 row)


                               QUERY PLAN                               
-------------------------------------------------------------------------
 Index Scan using charidx on chartmp  (cost=0.12..8.15 rows=1 width=136)
   Index Cond: ((a >= '31a'::bpchar) AND (a <= '31c'::bpchar))
(2 rows)

                a                 | octet_length
----------------------------------+--------------
 31b0                             |           32
(1 row)


It seems that loosing blank padding is incorrect (btree and btree_gin
preserve padding with index-only scan) but it's recorded in
contrib/btree_gist/expected/char.out.

Best regards,
Alexander



Re: Index-only scan for btree_gist turns bpchar to char

From
Tom Lane
Date:
Alexander Lakhin <exclusion@gmail.com> writes:
> While testing the index-only scan fix, I've discovered that replacing
> the index-only scan with the index scan changes contrib/btree_gist
> output because index-only scan for btree_gist returns a string without
> padding.

Ugh, yeah.  This seems to be because gbt_bpchar_compress() strips
trailing spaces (using rtrim1) before storing the value.  The
idea evidently is to simplify gbt_bpchar_consistent, but it's not
acceptable if the opclass is supposed to support index-only scan.

I see two ways to fix this:

* Disallow index-only scan, by removing the fetch function for this
opclass.  This'd require a module version bump, so people wouldn't
get that fix automatically.

* Change gbt_bpchar_compress to not trim spaces (it becomes just
like gbt_text_compress), and adapt gbt_bpchar_consistent to cope.
This does nothing for the problem immediately, unless you REINDEX
affected indexes --- but over time an index's entries would get
replaced with untrimmed versions.

I also wondered if we could make the fetch function reconstruct the
padding, but it doesn't seem to have access to the necessary info.

            regards, tom lane



Re: Index-only scan for btree_gist turns bpchar to char

From
Alexander Lakhin
Date:
04.01.2022 22:19, Tom Lane wrote:
> Alexander Lakhin <exclusion@gmail.com> writes:
>> While testing the index-only scan fix, I've discovered that replacing
>> the index-only scan with the index scan changes contrib/btree_gist
>> output because index-only scan for btree_gist returns a string without
>> padding.
> Ugh, yeah.  This seems to be because gbt_bpchar_compress() strips
> trailing spaces (using rtrim1) before storing the value.  The
> idea evidently is to simplify gbt_bpchar_consistent, but it's not
> acceptable if the opclass is supposed to support index-only scan.
>
> I see two ways to fix this:
>
> * Disallow index-only scan, by removing the fetch function for this
> opclass.  This'd require a module version bump, so people wouldn't
> get that fix automatically.
>
> * Change gbt_bpchar_compress to not trim spaces (it becomes just
> like gbt_text_compress), and adapt gbt_bpchar_consistent to cope.
> This does nothing for the problem immediately, unless you REINDEX
> affected indexes --- but over time an index's entries would get
> replaced with untrimmed versions.
I think that the second way is preferable in the long run. It doesn't
need an explanation after years, why index-only scan is not supported
for that type. One-time mentioning the change and the need for REINDEX
in release notes seems more future-oriented to me.

Best regards,
Alexander



Re: Index-only scan for btree_gist turns bpchar to char

From
Japin Li
Date:
On Wed, 05 Jan 2022 at 03:19, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Alexander Lakhin <exclusion@gmail.com> writes:
>> While testing the index-only scan fix, I've discovered that replacing
>> the index-only scan with the index scan changes contrib/btree_gist
>> output because index-only scan for btree_gist returns a string without
>> padding.
>
> Ugh, yeah.  This seems to be because gbt_bpchar_compress() strips
> trailing spaces (using rtrim1) before storing the value.  The
> idea evidently is to simplify gbt_bpchar_consistent, but it's not
> acceptable if the opclass is supposed to support index-only scan.
>
> I see two ways to fix this:
>
> * Disallow index-only scan, by removing the fetch function for this
> opclass.  This'd require a module version bump, so people wouldn't
> get that fix automatically.
>
> * Change gbt_bpchar_compress to not trim spaces (it becomes just
> like gbt_text_compress), and adapt gbt_bpchar_consistent to cope.
> This does nothing for the problem immediately, unless you REINDEX
> affected indexes --- but over time an index's entries would get
> replaced with untrimmed versions.
>
> I also wondered if we could make the fetch function reconstruct the
> padding, but it doesn't seem to have access to the necessary info.
>

If we fix this In the second way, the range query has the same results
in both seq scan and index only scan.  However, it will incur other
problems.  For the following query:

SELECT *, octet_length(a) FROM chartmp WHERE a = '31b0';

Currently, we can get

   a  | octet_length
------+--------------
 31b0 |            4

After fixed, we cannot get any result.  For the equal condition,
we must put the extra spaces to make it work.

Here is a patch for POC testing.

-- 
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.

diff --git a/contrib/btree_gist/btree_text.c b/contrib/btree_gist/btree_text.c
index 8019d11281..5fd425047f 100644
--- a/contrib/btree_gist/btree_text.c
+++ b/contrib/btree_gist/btree_text.c
@@ -121,16 +121,7 @@ gbt_bpchar_compress(PG_FUNCTION_ARGS)
     }
 
     if (entry->leafkey)
-    {
-
-        Datum        d = DirectFunctionCall1(rtrim1, entry->key);
-        GISTENTRY    trim;
-
-        gistentryinit(trim, d,
-                      entry->rel, entry->page,
-                      entry->offset, true);
-        retval = gbt_var_compress(&trim, &tinfo);
-    }
+        retval = gbt_var_compress(entry, &tinfo);
     else
         retval = entry;
 
@@ -179,7 +170,6 @@ gbt_bpchar_consistent(PG_FUNCTION_ARGS)
     bool        retval;
     GBT_VARKEY *key = (GBT_VARKEY *) DatumGetPointer(entry->key);
     GBT_VARKEY_R r = gbt_var_key_readable(key);
-    void       *trim = (void *) DatumGetPointer(DirectFunctionCall1(rtrim1, PointerGetDatum(query)));
 
     /* All cases served by this function are exact */
     *recheck = false;
@@ -189,7 +179,7 @@ gbt_bpchar_consistent(PG_FUNCTION_ARGS)
         tinfo.eml = pg_database_encoding_max_length();
     }
 
-    retval = gbt_var_consistent(&r, trim, strategy, PG_GET_COLLATION(),
+    retval = gbt_var_consistent(&r, query, strategy, PG_GET_COLLATION(),
                                 GIST_LEAF(entry), &tinfo, fcinfo->flinfo);
     PG_RETURN_BOOL(retval);
 }



Re: Index-only scan for btree_gist turns bpchar to char

From
Tom Lane
Date:
Japin Li <japinli@hotmail.com> writes:
> Here is a patch for POC testing.

This is certainly not right.  You've made gbt_bpchar_consistent
work identically to gbt_text_consistent, but it needs to implement
a test equivalent to bpchareq, ie ignore trailing spaces in both
inputs.

The minimum-effort fix would be to apply rtrim1 to both strings
in gbt_bpchar_consistent, but I wonder if we can improve on that
by pushing the ignore-trailing-spaces behavior further down.
I didn't look yet at whether gbt_var_consistent can support
any type-specific behavior.

            regards, tom lane



Re: Index-only scan for btree_gist turns bpchar to char

From
Japin Li
Date:
On Thu, 06 Jan 2022 at 00:34, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Japin Li <japinli@hotmail.com> writes:
>> Here is a patch for POC testing.
>
> This is certainly not right.  You've made gbt_bpchar_consistent
> work identically to gbt_text_consistent, but it needs to implement
> a test equivalent to bpchareq, ie ignore trailing spaces in both
> inputs.
>

Thanks for your explaintion!  The bpchareq already ignore trailing spaces
in both inputs.  The question is that the bpchar in btree_gist do not call
bpchareq, it always call texteq.  I tried the patch[1] and it works as
expected, howerver, I don't think it's good way to fix this problem.

> The minimum-effort fix would be to apply rtrim1 to both strings
> in gbt_bpchar_consistent, but I wonder if we can improve on that
> by pushing the ignore-trailing-spaces behavior further down.
> I didn't look yet at whether gbt_var_consistent can support
> any type-specific behavior.
>

Adding type-specific for gbt_var_consistent looks like more generally.
For example, for bpchar type, we should call bpchareq rather than texteq.

Am I understand right?

-- 
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.

[1]
diff --git a/contrib/btree_gist/btree_text.c b/contrib/btree_gist/btree_text.c
index 8019d11281..7f45ee6e3b 100644
--- a/contrib/btree_gist/btree_text.c
+++ b/contrib/btree_gist/btree_text.c
@@ -121,16 +121,7 @@ gbt_bpchar_compress(PG_FUNCTION_ARGS)
     }
 
     if (entry->leafkey)
-    {
-
-        Datum        d = DirectFunctionCall1(rtrim1, entry->key);
-        GISTENTRY    trim;
-
-        gistentryinit(trim, d,
-                      entry->rel, entry->page,
-                      entry->offset, true);
-        retval = gbt_var_compress(&trim, &tinfo);
-    }
+        retval = gbt_var_compress(entry, &tinfo);
     else
         retval = entry;
 
@@ -189,6 +180,11 @@ gbt_bpchar_consistent(PG_FUNCTION_ARGS)
         tinfo.eml = pg_database_encoding_max_length();
     }
 
+    r.lower = (bytea *) DatumGetPointer(DirectFunctionCall1(rtrim1,
+                                                            PointerGetDatum(r.lower)));
+    r.upper = (bytea *) DatumGetPointer(DirectFunctionCall1(rtrim1,
+                                                            PointerGetDatum(r.upper)));
+
     retval = gbt_var_consistent(&r, trim, strategy, PG_GET_COLLATION(),
                                 GIST_LEAF(entry), &tinfo, fcinfo->flinfo);
     PG_RETURN_BOOL(retval);



Re: Index-only scan for btree_gist turns bpchar to char

From
Tom Lane
Date:
Japin Li <japinli@hotmail.com> writes:
> On Thu, 06 Jan 2022 at 00:34, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> The minimum-effort fix would be to apply rtrim1 to both strings
>> in gbt_bpchar_consistent, but I wonder if we can improve on that
>> by pushing the ignore-trailing-spaces behavior further down.
>> I didn't look yet at whether gbt_var_consistent can support
>> any type-specific behavior.

> Adding type-specific for gbt_var_consistent looks like more generally.
> For example, for bpchar type, we should call bpchareq rather than texteq.

I looked at this and it does seem like it might work, as per attached
patch.  The one thing that is troubling me is that the opclass is set
up to apply gbt_text_same, which is formally the Wrong Thing for bpchar,
because the equality semantics shouldn't be quite the same.  But we
could not fix that without a module version bump, which is annoying.
I think that it might not be necessary to change it, because

(1) There's no such thing as unique GIST indexes, so it should not
matter if the "same" function is a bit stricter than the datatype's
nominal notion of equality.  It's certainly okay for that to vary
from the semantics applied by the consistent function --- GIST has
no idea that the consistent function is allegedly testing equality.

(2) If all the input values for a column have been coerced to the same
typmod, then it doesn't matter because two values that are equal after
space-stripping would be equal without space-stripping, too.

However, (2) doesn't hold for an existing index that the user has failed
to REINDEX, because then the index would contain some space-stripped
values that same() will not say are equal to incoming new values.
Again, I think this doesn't matter much, but maybe I'm missing
something.  I've not really dug into what GIST uses the same()
function for.

In any case, if we do need same() to implement the identical
behavior to bpchareq(), then the other solution isn't sufficient
either.

So in short, it seems like we ought to do some compatibility testing
and see if this code misbehaves at all with an index created by the
old code.  I don't particularly want to do that ... any volunteers?

            regards, tom lane

diff --git a/contrib/btree_gist/btree_text.c b/contrib/btree_gist/btree_text.c
index 8019d11281..be0eac7975 100644
--- a/contrib/btree_gist/btree_text.c
+++ b/contrib/btree_gist/btree_text.c
@@ -90,6 +90,76 @@ static gbtree_vinfo tinfo =
     NULL
 };

+/* bpchar needs its own comparison rules */
+
+static bool
+gbt_bpchargt(const void *a, const void *b, Oid collation, FmgrInfo *flinfo)
+{
+    return DatumGetBool(DirectFunctionCall2Coll(bpchargt,
+                                                collation,
+                                                PointerGetDatum(a),
+                                                PointerGetDatum(b)));
+}
+
+static bool
+gbt_bpcharge(const void *a, const void *b, Oid collation, FmgrInfo *flinfo)
+{
+    return DatumGetBool(DirectFunctionCall2Coll(bpcharge,
+                                                collation,
+                                                PointerGetDatum(a),
+                                                PointerGetDatum(b)));
+}
+
+static bool
+gbt_bpchareq(const void *a, const void *b, Oid collation, FmgrInfo *flinfo)
+{
+    return DatumGetBool(DirectFunctionCall2Coll(bpchareq,
+                                                collation,
+                                                PointerGetDatum(a),
+                                                PointerGetDatum(b)));
+}
+
+static bool
+gbt_bpcharle(const void *a, const void *b, Oid collation, FmgrInfo *flinfo)
+{
+    return DatumGetBool(DirectFunctionCall2Coll(bpcharle,
+                                                collation,
+                                                PointerGetDatum(a),
+                                                PointerGetDatum(b)));
+}
+
+static bool
+gbt_bpcharlt(const void *a, const void *b, Oid collation, FmgrInfo *flinfo)
+{
+    return DatumGetBool(DirectFunctionCall2Coll(bpcharlt,
+                                                collation,
+                                                PointerGetDatum(a),
+                                                PointerGetDatum(b)));
+}
+
+static int32
+gbt_bpcharcmp(const void *a, const void *b, Oid collation, FmgrInfo *flinfo)
+{
+    return DatumGetInt32(DirectFunctionCall2Coll(bpcharcmp,
+                                                 collation,
+                                                 PointerGetDatum(a),
+                                                 PointerGetDatum(b)));
+}
+
+static gbtree_vinfo bptinfo =
+{
+    gbt_t_bpchar,
+    0,
+    false,
+    gbt_bpchargt,
+    gbt_bpcharge,
+    gbt_bpchareq,
+    gbt_bpcharle,
+    gbt_bpcharlt,
+    gbt_bpcharcmp,
+    NULL
+};
+

 /**************************************************
  * Text ops
@@ -112,29 +182,8 @@ gbt_text_compress(PG_FUNCTION_ARGS)
 Datum
 gbt_bpchar_compress(PG_FUNCTION_ARGS)
 {
-    GISTENTRY  *entry = (GISTENTRY *) PG_GETARG_POINTER(0);
-    GISTENTRY  *retval;
-
-    if (tinfo.eml == 0)
-    {
-        tinfo.eml = pg_database_encoding_max_length();
-    }
-
-    if (entry->leafkey)
-    {
-
-        Datum        d = DirectFunctionCall1(rtrim1, entry->key);
-        GISTENTRY    trim;
-
-        gistentryinit(trim, d,
-                      entry->rel, entry->page,
-                      entry->offset, true);
-        retval = gbt_var_compress(&trim, &tinfo);
-    }
-    else
-        retval = entry;
-
-    PG_RETURN_POINTER(retval);
+    /* This should never have been distinct from gbt_text_compress */
+    return gbt_text_compress(fcinfo);
 }


@@ -179,18 +228,17 @@ gbt_bpchar_consistent(PG_FUNCTION_ARGS)
     bool        retval;
     GBT_VARKEY *key = (GBT_VARKEY *) DatumGetPointer(entry->key);
     GBT_VARKEY_R r = gbt_var_key_readable(key);
-    void       *trim = (void *) DatumGetPointer(DirectFunctionCall1(rtrim1, PointerGetDatum(query)));

     /* All cases served by this function are exact */
     *recheck = false;

-    if (tinfo.eml == 0)
+    if (bptinfo.eml == 0)
     {
-        tinfo.eml = pg_database_encoding_max_length();
+        bptinfo.eml = pg_database_encoding_max_length();
     }

-    retval = gbt_var_consistent(&r, trim, strategy, PG_GET_COLLATION(),
-                                GIST_LEAF(entry), &tinfo, fcinfo->flinfo);
+    retval = gbt_var_consistent(&r, query, strategy, PG_GET_COLLATION(),
+                                GIST_LEAF(entry), &bptinfo, fcinfo->flinfo);
     PG_RETURN_BOOL(retval);
 }

diff --git a/contrib/btree_gist/expected/char.out b/contrib/btree_gist/expected/char.out
index d715c045cc..45cf9f38d6 100644
--- a/contrib/btree_gist/expected/char.out
+++ b/contrib/btree_gist/expected/char.out
@@ -75,8 +75,8 @@ SELECT * FROM chartmp WHERE a BETWEEN '31a' AND '31c';
 (2 rows)

 SELECT * FROM chartmp WHERE a BETWEEN '31a' AND '31c';
-  a
-------
- 31b0
+                a
+----------------------------------
+ 31b0
 (1 row)

diff --git a/contrib/btree_gist/expected/char_1.out b/contrib/btree_gist/expected/char_1.out
index 867318002b..7b7824b717 100644
--- a/contrib/btree_gist/expected/char_1.out
+++ b/contrib/btree_gist/expected/char_1.out
@@ -75,8 +75,8 @@ SELECT * FROM chartmp WHERE a BETWEEN '31a' AND '31c';
 (2 rows)

 SELECT * FROM chartmp WHERE a BETWEEN '31a' AND '31c';
-  a
-------
- 31b0
+                a
+----------------------------------
+ 31b0
 (1 row)


Re: Index-only scan for btree_gist turns bpchar to char

From
Japin Li
Date:
On Fri, 07 Jan 2022 at 03:21, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I looked at this and it does seem like it might work, as per attached
> patch.  The one thing that is troubling me is that the opclass is set
> up to apply gbt_text_same, which is formally the Wrong Thing for bpchar,
> because the equality semantics shouldn't be quite the same.  But we
> could not fix that without a module version bump, which is annoying.
> I think that it might not be necessary to change it, because
>
> (1) There's no such thing as unique GIST indexes, so it should not
> matter if the "same" function is a bit stricter than the datatype's
> nominal notion of equality.  It's certainly okay for that to vary
> from the semantics applied by the consistent function --- GIST has
> no idea that the consistent function is allegedly testing equality.
>
> (2) If all the input values for a column have been coerced to the same
> typmod, then it doesn't matter because two values that are equal after
> space-stripping would be equal without space-stripping, too.
>
> However, (2) doesn't hold for an existing index that the user has failed
> to REINDEX, because then the index would contain some space-stripped
> values that same() will not say are equal to incoming new values.
> Again, I think this doesn't matter much, but maybe I'm missing
> something.  I've not really dug into what GIST uses the same()
> function for.
>
> In any case, if we do need same() to implement the identical
> behavior to bpchareq(), then the other solution isn't sufficient
> either.
>
> So in short, it seems like we ought to do some compatibility testing
> and see if this code misbehaves at all with an index created by the
> old code.  I don't particularly want to do that ... any volunteers?
>

Thanks for your patch, it looks good to me.  I'm not sure how to test this.

-- 
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.



Re: Index-only scan for btree_gist turns bpchar to char

From
Alexander Lakhin
Date:
Hello,
07.01.2022 09:26, Japin Li wrote:
> On Fri, 07 Jan 2022 at 03:21, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> In any case, if we do need same() to implement the identical
> behavior to bpchareq(), then the other solution isn't sufficient
> either.
>
> So in short, it seems like we ought to do some compatibility testing
> and see if this code misbehaves at all with an index created by the
> old code.  I don't particularly want to do that ... any volunteers?
>
> Thanks for your patch, it looks good to me.  I'm not sure how to test this.
I will test it tomorrow.

Best regards,
Alexander



Re: Index-only scan for btree_gist turns bpchar to char

From
Alexander Lakhin
Date:
07.01.2022 12:00, Alexander Lakhin wrote:
> Hello,
> 07.01.2022 09:26, Japin Li wrote:
>> On Fri, 07 Jan 2022 at 03:21, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>
>> In any case, if we do need same() to implement the identical
>> behavior to bpchareq(), then the other solution isn't sufficient
>> either.
>>
>> So in short, it seems like we ought to do some compatibility testing
>> and see if this code misbehaves at all with an index created by the
>> old code.  I don't particularly want to do that ... any volunteers?
>>
>> Thanks for your patch, it looks good to me.  I'm not sure how to test this.
> I will test it tomorrow.
I've made a simple test based on the regression test (see attachment)
and can confirm that REINDEX after upgrade fixes the index contents.

Differences after upgrade but before REINDEX:
--- /tmp/pgtest/char.out        2022-01-08 21:27:43.912274805 +0300
+++ /tmp/pgtest/char.expected   2022-01-08 21:27:43.896274765 +0300
@@ -40,8 +40,8 @@
 (2 rows)
 
 SELECT * FROM chartmp WHERE a BETWEEN '31a' AND '31c';
-  a  
-------
- 31b0
+                a                
+----------------------------------
+ 31b0                           
 (1 row)
 
REINDEX INDEX charidx
Differences after upgrade and REINDEX:
Files /tmp/pgtest/char.out and /tmp/pgtest/char.expected are identical

(Unfortunately for me) I found no anomalies related to gbt_text_same()
with an index created with the previous implementation. I've added
diagnostic logging that shows when gbt_text_same() returns 0 for keys
that are the equal but have different padding. So I've observed that
gbt_text_same() returns incorrect result, but all the btree_gist tests
still pass. Moreover, unconditional "*result = 0;" in gbt_text_same()
doesn't affect the tests at all.
I've found that gbt_text_same() is called by gistKeyIsEQ() from
backend/access/gist/gistutil.c, and made gistKeyIsEQ() return false any
time. And even with such change all check-world tests still pass (except
for isolation/predicate-gist that failed due to locking of pages split
differently). So for now, I still don't know how to get incorrect query
results due to incorrect gistKeyIsEQ() behavior/excessive page splitting.

Best regards,
Alexander

Attachment

Re: Index-only scan for btree_gist turns bpchar to char

From
Tom Lane
Date:
Alexander Lakhin <exclusion@gmail.com> writes:
> (Unfortunately for me) I found no anomalies related to gbt_text_same()
> with an index created with the previous implementation. I've added
> diagnostic logging that shows when gbt_text_same() returns 0 for keys
> that are the equal but have different padding. So I've observed that
> gbt_text_same() returns incorrect result, but all the btree_gist tests
> still pass. Moreover, unconditional "*result = 0;" in gbt_text_same()
> doesn't affect the tests at all.
> I've found that gbt_text_same() is called by gistKeyIsEQ() from
> backend/access/gist/gistutil.c, and made gistKeyIsEQ() return false any
> time. And even with such change all check-world tests still pass (except
> for isolation/predicate-gist that failed due to locking of pages split
> differently). So for now, I still don't know how to get incorrect query
> results due to incorrect gistKeyIsEQ() behavior/excessive page splitting.

Yeah, if that's the only use-case then it's pretty easy to see that
an overly strict equality test isn't going to hurt us much.  At worst
it'll cause the index to be a little bit inefficiently stored due to
unnecessary node splits.  Even then, that won't happen much in normal
use, since the discrepancy could only arise once in the lifespan of
an index node (when it first sees a new-style entry that could have
been considered equal to the old-style value).

So I think this solution will work, and I'll go ahead and push it.
Thanks for testing!  (and for the original report ...)

            regards, tom lane