Re: Index-only scan returns incorrect results when using acomposite GIST index with a gist_trgm_ops column. - Mailing list pgsql-bugs

From Kyotaro HORIGUCHI
Subject Re: Index-only scan returns incorrect results when using acomposite GIST index with a gist_trgm_ops column.
Date
Msg-id 20180118.211449.240575561.horiguchi.kyotaro@lab.ntt.co.jp
Whole thread Raw
In response to Re: Index-only scan returns incorrect results when using a compositeGIST index with a gist_trgm_ops column.  (Michael Paquier <michael.paquier@gmail.com>)
Responses Re: Index-only scan returns incorrect results when using a composite GIST index with a gist_trgm_ops column.
Re: Index-only scan returns incorrect results when using a composite GIST index with a gist_trgm_ops column.
List pgsql-bugs
Hello.

At Thu, 18 Jan 2018 17:25:05 +0900, Michael Paquier <michael.paquier@gmail.com> wrote in
<20180118082505.GA84508@paquier.xyz>
> On Thu, Jan 18, 2018 at 12:57:38PM +0500, Andrey Borodin wrote:
> >> Please find the attached patch.
> > I agree with you that current behavior is a bug and your patch seems correct.
> > I'm a bit worried about ninth strategy words: fetch is not necessary
> >> now, if opclass lacks compress methods - index only scan is
> >> possible. See
> >> https://github.com/postgres/postgres/commit/d3a4f89d8a3e500bd7c0b7a8a8a5ce1b47859128
> >> for details.
> >
> > Though there are tests in cube and seg for that, if your patch passes
> > check-world, than this behavior is not affected. 
> 
> The proposed patch has no regression tests. If the current set is not
> enough to stress the problem, you surely should add some (haven't
> checked the patch in detail, sorry ;p ).

Uggg.. I'm beaten again.. You're definitely right!

It was a bit hard to find the way to cause the failure without
extension but the first attached file is that.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
From dc496cba91feb8cb3aea4438337c98efdfac9b8c Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Date: Thu, 18 Jan 2018 20:57:13 +0900
Subject: [PATCH 1/2] Regression test for the failure of check_index_only.

check_index_only forgets the case where two or more index columns on
the same table column but with different operator class.
---
 src/test/regress/expected/create_index.out | 41 ++++++++++++++++++++++++++++++
 src/test/regress/sql/create_index.sql      | 26 +++++++++++++++++++
 2 files changed, 67 insertions(+)

diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out
index 031a0bc..1d107f6 100644
--- a/src/test/regress/expected/create_index.out
+++ b/src/test/regress/expected/create_index.out
@@ -2990,6 +2990,47 @@ explain (costs off)
 (4 rows)
 
 --
+-- Check for duplicate indexkey with different opclasses
+--
+-- opclass that doesn't have function 9 (GIST_FETCH_PROC) for GiST
+CREATE OPERATOR CLASS test_inet_ops FOR TYPE inet USING gist AS
+        OPERATOR        3       &&,
+        FUNCTION        1       inet_gist_consistent (internal, inet, smallint, oid, internal),
+        FUNCTION        2       inet_gist_union (internal, internal),
+        FUNCTION        3       inet_gist_compress (internal),
+        FUNCTION        5       inet_gist_penalty (internal, internal, internal),
+        FUNCTION        6       inet_gist_picksplit (internal, internal),
+        FUNCTION        7       inet_gist_same (inet, inet, internal);
+CREATE TABLE t (a inet);
+CREATE INDEX ON t USING gist (a test_inet_ops, a inet_ops);
+INSERT INTO t VALUES ('192.168.0.1');
+VACUUM t;
+SELECT * FROM t WHERE a && '192.168.0.1'; -- should retuan a value
+      a      
+-------------
+ 192.168.0.1
+(1 row)
+
+-- enfoce GiST
+SET enable_seqscan TO false;
+SET enable_bitmapscan TO false;
+EXPLAIN (ANALYZE, COSTS OFF, SUMMARY OFF, TIMING OFF) SELECT * FROM t WHERE a && '192.168.0.1'; -- shoudn't be index
onlyscan
 
+                        QUERY PLAN                        
+----------------------------------------------------------
+ Index Scan using t_a_a1_idx on t (actual rows=1 loops=1)
+   Index Cond: (a && '192.168.0.1'::inet)
+(2 rows)
+
+SELECT * FROM t WHERE a && '192.168.0.1'; -- also should return a value
+      a      
+-------------
+ 192.168.0.1
+(1 row)
+
+-- cleanup
+DROP TABLE t;
+DROP OPERATOR CLASS test_inet_ops USING gist;
+--
 -- REINDEX (VERBOSE)
 --
 CREATE TABLE reindex_verbose(id integer primary key);
diff --git a/src/test/regress/sql/create_index.sql b/src/test/regress/sql/create_index.sql
index a45e8eb..e8016f4 100644
--- a/src/test/regress/sql/create_index.sql
+++ b/src/test/regress/sql/create_index.sql
@@ -1026,6 +1026,32 @@ explain (costs off)
   select * from boolindex where not b order by i limit 10;
 
 --
+-- Check for duplicate indexkey with different opclasses
+--
+-- opclass that doesn't have function 9 (GIST_FETCH_PROC) for GiST
+CREATE OPERATOR CLASS test_inet_ops FOR TYPE inet USING gist AS
+        OPERATOR        3       &&,
+        FUNCTION        1       inet_gist_consistent (internal, inet, smallint, oid, internal),
+        FUNCTION        2       inet_gist_union (internal, internal),
+        FUNCTION        3       inet_gist_compress (internal),
+        FUNCTION        5       inet_gist_penalty (internal, internal, internal),
+        FUNCTION        6       inet_gist_picksplit (internal, internal),
+        FUNCTION        7       inet_gist_same (inet, inet, internal);
+CREATE TABLE t (a inet);
+CREATE INDEX ON t USING gist (a test_inet_ops, a inet_ops);
+INSERT INTO t VALUES ('192.168.0.1');
+VACUUM t;
+SELECT * FROM t WHERE a && '192.168.0.1'; -- should retuan a value
+-- enfoce GiST
+SET enable_seqscan TO false;
+SET enable_bitmapscan TO false;
+EXPLAIN (ANALYZE, COSTS OFF, SUMMARY OFF, TIMING OFF) SELECT * FROM t WHERE a && '192.168.0.1'; -- shoudn't be index
onlyscan
 
+SELECT * FROM t WHERE a && '192.168.0.1'; -- also should return a value
+-- cleanup
+DROP TABLE t;
+DROP OPERATOR CLASS test_inet_ops USING gist;
+
+--
 -- REINDEX (VERBOSE)
 --
 CREATE TABLE reindex_verbose(id integer primary key);
-- 
2.9.2

From d95fcda94f1ef39be6b46deb64a441e053d05f31 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Date: Thu, 18 Jan 2018 21:04:18 +0900
Subject: [PATCH 2/2] Fix check_index_only for the case of duplicate keycolumn

If there is an index that have multiple keys on the same attribute but
using different operator class, only the last key on the same
attribute affects the availability of the attribute. This causes
failure of rechecking index tuples. An attribute should be considered
as unavailable if any key on it cannot return the attribute.
---
 src/backend/optimizer/path/indxpath.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/src/backend/optimizer/path/indxpath.c b/src/backend/optimizer/path/indxpath.c
index 7fc7080..2fede1d 100644
--- a/src/backend/optimizer/path/indxpath.c
+++ b/src/backend/optimizer/path/indxpath.c
@@ -1866,6 +1866,7 @@ check_index_only(RelOptInfo *rel, IndexOptInfo *index)
     bool        result;
     Bitmapset  *attrs_used = NULL;
     Bitmapset  *index_canreturn_attrs = NULL;
+    Bitmapset  *index_cannotreturn_attrs = NULL;
     ListCell   *lc;
     int            i;
 
@@ -1905,7 +1906,8 @@ check_index_only(RelOptInfo *rel, IndexOptInfo *index)
 
     /*
      * Construct a bitmapset of columns that the index can return back in an
-     * index-only scan.
+     * index-only scan. We must have a value for all occurances of the same
+     * attribute since it can be used for rechecking.
      */
     for (i = 0; i < index->ncolumns; i++)
     {
@@ -1922,11 +1924,19 @@ check_index_only(RelOptInfo *rel, IndexOptInfo *index)
             index_canreturn_attrs =
                 bms_add_member(index_canreturn_attrs,
                                attno - FirstLowInvalidHeapAttributeNumber);
+        else
+            index_cannotreturn_attrs =
+                bms_add_member(index_cannotreturn_attrs,
+                               attno - FirstLowInvalidHeapAttributeNumber);
     }
 
+    index_canreturn_attrs = bms_del_members(index_canreturn_attrs,
+                                            index_cannotreturn_attrs);
+
     /* Do we have all the necessary attributes? */
     result = bms_is_subset(attrs_used, index_canreturn_attrs);
 
+
     bms_free(attrs_used);
     bms_free(index_canreturn_attrs);
 
-- 
2.9.2


pgsql-bugs by date:

Previous
From: Kyotaro HORIGUCHI
Date:
Subject: Re: Index-only scan returns incorrect results when using acomposite GIST index with a gist_trgm_ops column.
Next
From: PG Bug reporting form
Date:
Subject: BUG #15014: pg_trgm regexp with wchar not good?