can't coax query planner into using all columns of a gist index - Mailing list pgsql-hackers

From Gideon Dresdner
Subject can't coax query planner into using all columns of a gist index
Date
Msg-id CAPGvaSPhimbRtYTifEjHx4KoQaAca620GYjWfrbaYNDnfvoO5w@mail.gmail.com
Whole thread Raw
Responses Re: can't coax query planner into using all columns of a gist index  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Greetings,

I had a discussion on IRC today with RhodiumToad regarding optimizing a specific query. We didn't manage to figure out how to get postgres to hit a GIST index. 

The bigger picture is that I am trying to do some bioinformatics and thought that postgres would be a great way of getting the 1000 genomes project "in the palm of my hand" instead of submitting C++ programs to a cluster, waiting for the results, etc. 

Anyway, there is a multicolumn index that we created:

CREATE INDEX qcregions_chrregion_index ON qcregions USING gist(chr, region) WHERE type = 'include'

- chr is an int between 1 and 24 (inclusive)
- region is an int4range
- type is an enum with two values, 'include' and 'exclude'.

Here are three relevant explain outputs:

Example 1:

EXPLAIN SELECT * FROM (values (12,5000), (13,5001) ) v(c,r)
WHERE EXISTS (SELECT region FROM qcregions
    WHERE type = 'include' and region @> v.r and chr = v.c);
                                              QUERY PLAN                                               
-------------------------------------------------------------------------------------------------------
 Nested Loop Semi Join  (cost=0.41..8.82 rows=1 width=8)
   ->  Values Scan on "*VALUES*"  (cost=0.00..0.03 rows=2 width=8)
   ->  Index Scan using qcregions_chrregion_index on qcregions  (cost=0.41..3464.26 rows=874 width=17)
         Index Cond: ((chr = "*VALUES*".column1) AND (region @> "*VALUES*".column2))
(4 rows)

Time: 1.284 ms

Example 2:

-- set enable_setbitmapscan = true
EXPLAIN SELECT * FROM (select * from vcf limit 2) AS vcf
WHERE EXISTS (SELECT region FROM qcregions
    WHERE qcregions.chr = vcf.chr
          AND qcregions.type = 'include'
          AND vcf.pos <@ qcregions.region);
                                                QUERY PLAN                                                
----------------------------------------------------------------------------------------------------------
 Nested Loop Semi Join  (cost=4862.57..18775.78 rows=1 width=64)
   ->  Limit  (cost=0.00..0.04 rows=2 width=64)
         ->  Seq Scan on vcf  (cost=0.00..1894654.40 rows=84801840 width=64)
   ->  Bitmap Heap Scan on qcregions  (cost=4862.57..7873.60 rows=874 width=17)
         Recheck Cond: ((vcf.pos <@ region) AND (type = 'include'::qcregiontype) AND (chr = vcf.chr))
         ->  BitmapAnd  (cost=4862.57..4862.57 rows=874 width=0)
               ->  Bitmap Index Scan on qcregions_chrregion_index  (cost=0.00..977.76 rows=20980 width=0)
                     Index Cond: (vcf.pos <@ region)
               ->  Bitmap Index Scan on qcregions_chr_index  (cost=0.00..3884.12 rows=215158 width=0)
                     Index Cond: (chr = vcf.chr)
(10 rows)

Time: 0.708 ms

Example 3 (same as example 2 but with enable_bitmapscan = false).

-- set enable_bitmapscan = false 
 EXPLAIN SELECT * FROM (select * from vcf limit 2) AS vcf
WHERE EXISTS (SELECT region FROM qcregions
    WHERE qcregions.chr = vcf.chr          
          AND qcregions.type = 'include'
          AND vcf.pos <@ qcregions.region);
                                            QUERY PLAN                                            
--------------------------------------------------------------------------------------------------
 Nested Loop Semi Join  (cost=0.43..38691.26 rows=1 width=64)
   ->  Limit  (cost=0.00..0.04 rows=2 width=64)
         ->  Seq Scan on vcf  (cost=0.00..1894654.40 rows=84801840 width=64)
   ->  Index Scan using qcregions_chr_index on qcregions  (cost=0.43..12891.38 rows=874 width=17)
         Index Cond: (chr = vcf.chr)
         Filter: ((type = 'include'::qcregiontype) AND (vcf.pos <@ region))
(6 rows)

Time: 1.214 ms

I am running psql (PostgreSQL) 9.4.4 on a laptop with 4 cores and 16 GB RAM.

Looking forward to hearing your thoughts,
Gideon.

pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: WIP: SCRAM authentication
Next
From: Jeff Janes
Date:
Subject: GIN pending clean up is not interruptable