Re: Fixing GIN for empty/null/full-scan cases - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Fixing GIN for empty/null/full-scan cases
Date
Msg-id 6785.1295394408@sss.pgh.pa.us
Whole thread Raw
In response to Re: Fixing GIN for empty/null/full-scan cases  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Fixing GIN for empty/null/full-scan cases  ("David E. Wheeler" <david@kineticode.com>)
List pgsql-hackers
I wrote:
> No, I see no reason to think that has much to do with it.  I'm wondering
> if your table is itself a bit bloated ...

Actually ... I notice you did not show EXPLAIN ANALYZE output for your
tests.  Now I'm wondering whether you tested the right thing at all.
I got burnt that way too.  Observe:

regression=# create index idx_gin_features on listings using gin(features) WHERE deleted_at IS NULL AND status = 1;
CREATE INDEX
regression=# explain analyze SELECT count(*) FROM listings     WHERE features @@ '(1368799&1368800&1369043)'::query_int
     AND deleted_at IS NULL AND status = 1;                                                    QUERY PLAN
                                     
 

--------------------------------------------------------------------------------------------------------------------Aggregate
(cost=9158.24..9158.25 rows=1 width=0) (actual time=153.633..153.634 rows=1 loops=1)  ->  Seq Scan on listings
(cost=0.00..9157.22rows=406 width=0) (actual time=0.048..153.493 rows=772 loops=1)        Filter: ((deleted_at IS NULL)
AND(features @@ '1368799 & 1368800 & 1369043'::query_int) AND (status = 1))Total runtime: 153.713 ms
 
(4 rows)

regression=# set enable_seqscan TO 0;
SET
regression=# explain analyze SELECT count(*) FROM listings     WHERE features @@ '(1368799&1368800&1369043)'::query_int
     AND deleted_at IS NULL AND status = 1;                                                                  QUERY PLAN
                                                                 
 

------------------------------------------------------------------------------------------------------------------------------------------------Aggregate
(cost=13253.42..13253.43 rows=1 width=0) (actual time=331.990..331.990 rows=1 loops=1)  ->  Bitmap Heap Scan on
listings (cost=4095.18..13252.40 rows=406 width=0) (actual time=164.785..331.858 rows=772 loops=1)        Recheck Cond:
((deleted_atIS NULL) AND (status = 1))        Filter: (features @@ '1368799 & 1368800 & 1369043'::query_int)        ->
BitmapIndex Scan on idx_gin_features  (cost=0.00..4095.07 rows=406215 width=0) (actual time=164.045..164.045
rows=406215loops=1)Total runtime: 332.169 ms
 
(6 rows)

The above is "using" the index, but only as a guide to where the rows
satisfying the partial-index predicate are --- note the lack of any
index condition in the indexscan node.  That's because the query_int
query is not in fact compatible with the core-provided index opclass.
We get much better results using intarray's gin__int_ops opclass:

regression=# drop index idx_gin_features;
DROP INDEX
regression=# create index idx_gin_features on listings using gin(features gin__int_ops) WHERE deleted_at IS NULL AND
status= 1;
 
CREATE INDEX
regression=# explain analyze SELECT count(*) FROM listings     WHERE features @@ '(1368799&1368800&1369043)'::query_int
     AND deleted_at IS NULL AND status = 1;                                                             QUERY PLAN
                                                       
 

--------------------------------------------------------------------------------------------------------------------------------------Aggregate
(cost=1159.20..1159.21 rows=1 width=0) (actual time=23.896..23.896 rows=1 loops=1)  ->  Bitmap Heap Scan on listings
(cost=31.15..1158.18rows=406 width=0) (actual time=22.912..23.813 rows=772 loops=1)        Recheck Cond: ((features @@
'1368799& 1368800 & 1369043'::query_int) AND (deleted_at IS NULL) AND (status = 1))        ->  Bitmap Index Scan on
idx_gin_features (cost=0.00..31.05 rows=406 width=0) (actual time=22.811..22.811 rows=772 loops=1)              Index
Cond:(features @@ '1368799 & 1368800 & 1369043'::query_int)Total runtime: 24.036 ms
 
(6 rows)

        regards, tom lane


pgsql-hackers by date:

Previous
From: "David E. Wheeler"
Date:
Subject: Re: Fixing GIN for empty/null/full-scan cases
Next
From: "David E. Wheeler"
Date:
Subject: Re: Fixing GIN for empty/null/full-scan cases