BitmapScan mishaps - Mailing list pgsql-general

From Listmail
Subject BitmapScan mishaps
Date
Msg-id op.tp7qpie5zcizji@apollo13
Whole thread Raw
In response to Re: Using C# to create stored procedures  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: BitmapScan mishaps  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
    Hello everyone !

    I have this query :

annonces=> EXPLAIN ANALYZE SELECT * FROM annonces AS a WHERE
detect_time > CURRENT_TIMESTAMP - '7 DAY'::INTERVAL
AND detect_time >= '2006-10-30 16:17:45.064793'
AND vente
AND surface IS NOT NULL AND price IS NOT NULL
AND type_id IN
(1,12,24,17,18,19,20,33,35,50,51,7,52,4,13,41,14,16,26,28,43,53,15,29,30,31,45,32,34,46,47,6)
AND (a.zipcode IN (69001,69002,69003,69004,69005,69006,69007,69008,69009)
OR a.city_id IN (27595)
OR a.coords &&
'(45.74101689082,4.8371263505564),(45.75898310918,4.8628736494436)'::BOX) ;

                                                           QUERY   
PLAN

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Bitmap Heap Scan on annonces a  (cost=1657.06..7145.98 rows=1177
width=691) (actual time=118.342..118.854 rows=194 loops=1)
    Recheck Cond: (((vente AND (zipcode = ANY
('{69001,69002,69003,69004,69005,69006,69007,69008,69009}'::integer[])))
OR (vente AND (city_id = 27595)) OR (coords &&
'(45.75898310918,4.8628736494436),(45.74101689082,4.8371263505564)'::box))
AND (detect_time > (now() - '7 days'::interval)) AND (detect_time >=
'2006-10-30 16:17:45.064793'::timestamp without time zone))
    Filter: (vente AND (surface IS NOT NULL) AND (price IS NOT NULL) AND
(type_id = ANY
('{1,12,24,17,18,19,20,33,35,50,51,7,52,4,13,41,14,16,26,28,43,53,15,29,30,31,45,32,34,46,47,6}'::integer[])))
    ->  BitmapAnd  (cost=1657.06..1657.06 rows=2465 width=0) (actual
time=118.294..118.294 rows=0 loops=1)
          ->  BitmapOr  (cost=133.83..133.83 rows=4368 width=0) (actual
time=2.903..2.903 rows=0 loops=1)
                ->  Bitmap Index Scan on annonces_zip  (cost=0.00..56.54
rows=1825 width=0) (actual time=0.599..0.599 rows=1580 loops=1)
                      Index Cond: ((vente = true) AND (zipcode = ANY
('{69001,69002,69003,69004,69005,69006,69007,69008,69009}'::integer[])))
                ->  Bitmap Index Scan on annonces_city  (cost=0.00..43.30
rows=1904 width=0) (actual time=0.464..0.464 rows=1575 loops=1)
                      Index Cond: ((vente = true) AND (city_id = 27595))
                ->  Bitmap Index Scan on annonces_coords  (cost=0.00..33.10
rows=640 width=0) (actual time=1.837..1.837 rows=2166 loops=1)
                      Index Cond: (coords &&
'(45.75898310918,4.8628736494436),(45.74101689082,4.8371263505564)'::box)
          ->  Bitmap Index Scan on annonces_date  (cost=0.00..1522.68
rows=72241 width=0) (actual time=114.930..114.930 rows=68022 loops=1)
                Index Cond: ((detect_time > (now() - '7 days'::interval))
AND (detect_time >= '2006-10-30 16:17:45.064793'::timestamp without time
zone))
  Total runtime: 119.000 ms
(14 lignes)

    The interesting part is :

Bitmap Index Scan on annonces_date  (cost=0.00..1522.68 rows=72241
width=0) (actual time=114.930..114.930 rows=68022 loops=1)

    It bitmapscans about half the table...
    I realized this index was actually useless for all my queries, so I
dropped it, and behold :


                                      QUERY   
PLAN

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Bitmap Heap Scan on annonces a  (cost=133.83..7583.77 rows=1176
width=691) (actual time=5.483..18.731 rows=194 loops=1)
    Recheck Cond: ((vente AND (zipcode = ANY
('{69001,69002,69003,69004,69005,69006,69007,69008,69009}'::integer[])))
OR (vente AND (city_id = 27595)) OR (coords &&
'(45.75898310918,4.8628736494436),(45.74101689082,4.8371263505564)'::box))
    Filter: ((detect_time > (now() - '7 days'::interval)) AND (detect_time
>= '2006-10-30 16:17:45.064793'::timestamp without time zone) AND vente
AND (surface IS NOT NULL) AND (price IS NOT NULL) AND (type_id = ANY
('{1,12,24,17,18,19,20,33,35,50,51,7,52,4,13,41,14,16,26,28,43,53,15,29,30,31,45,32,34,46,47,6}'::integer[])))
    ->  BitmapOr  (cost=133.83..133.83 rows=4368 width=0) (actual
time=2.648..2.648 rows=0 loops=1)
          ->  Bitmap Index Scan on annonces_zip  (cost=0.00..56.54
rows=1825 width=0) (actual time=0.505..0.505 rows=1580 loops=1)
                Index Cond: ((vente = true) AND (zipcode = ANY
('{69001,69002,69003,69004,69005,69006,69007,69008,69009}'::integer[])))
          ->  Bitmap Index Scan on annonces_city  (cost=0.00..43.30
rows=1904 width=0) (actual time=0.415..0.415 rows=1575 loops=1)
                Index Cond: ((vente = true) AND (city_id = 27595))
          ->  Bitmap Index Scan on annonces_coords  (cost=0.00..33.10
rows=640 width=0) (actual time=1.725..1.725 rows=2166 loops=1)
                Index Cond: (coords &&
'(45.75898310918,4.8628736494436),(45.74101689082,4.8371263505564)'::box)
  Total runtime: 18.859 ms

    I guess saving 68022 rows of index scan is worth it !
    Now 18 ms to extract the rows I want from that table (not huge, but still
about 70 megabytes) is what I call : nice.

    Just thought it might be useful to some of you. I should have remembered
KISS !

    BTW, shouldn't the planner think about this also ? Bitmap-scanning half a
table is likely to be slower than seq scan anyway...




pgsql-general by date:

Previous
From: "A. Kretschmer"
Date:
Subject: Re: inserting multiple values in version 8.1.5
Next
From: Steve Gerhardt
Date:
Subject: UPDATE on two large datasets is very slow