Re: bitmap scan much slower than index scan, hash_search_with_hash_value - Mailing list pgsql-hackers
From | Pavel Stehule |
---|---|
Subject | Re: bitmap scan much slower than index scan, hash_search_with_hash_value |
Date | |
Msg-id | CAFj8pRA3f19NsDd7rNRgJw9mLv=FEFXNQRJivR+Te-X1jOCY7A@mail.gmail.com Whole thread Raw |
In response to | bitmap scan much slower than index scan, hash_search_with_hash_value (Sergey Koposov <koposov@ast.cam.ac.uk>) |
Responses |
Re: bitmap scan much slower than index scan,
hash_search_with_hash_value
|
List | pgsql-hackers |
Hello 2012/9/2 Sergey Koposov <koposov@ast.cam.ac.uk>: > Hi, > > I'm experiencing the case when bitmap scan is ~ 70 times slower than index > scan which seems to be caused by 1) very big table 2) some hash search logic > (hash_search_with_hash_value ) > > Here is the explain analyze of the query with bitmap scans allowed: > > wsdb=> explain analyze select * from test as t, crts.data as d1 > where d1.objid=t.objid and d1.mjd=t.mjd limit 10000; > QUERY > PLAN > ------------------------------------------------------------------------------------------------------------------------------------------------------- > Limit (cost=11514.04..115493165.44 rows=10000 width=68) (actual > time=27.512..66620.231 rows=10000 loops=1) > -> Nested Loop (cost=11514.04..1799585184.18 rows=155832 width=68) > (actual time=27.511..66616.807 rows=10000 loops=1) > -> Seq Scan on test t (cost=0.00..2678.40 rows=156240 width=28) > (actual time=0.010..4.685 rows=11456 loops=1) > -> Bitmap Heap Scan on data d1 (cost=11514.04..11518.05 rows=1 > width=40) (actual time=5.807..5.807 rows=1 loops=11456) > Recheck Cond: ((mjd = t.mjd) AND (objid = t.objid)) > -> BitmapAnd (cost=11514.04..11514.04 rows=1 width=0) > (actual time=5.777..5.777 rows=0 loops=11456) > -> Bitmap Index Scan on data_mjd_idx > (cost=0.00..2501.40 rows=42872 width=0) (actual time=3.920..3.920 rows=22241 > loops=11456) > Index Cond: (mjd = t.mjd) > -> Bitmap Index Scan on data_objid_idx > (cost=0.00..8897.90 rows=415080 width=0) (actual time=0.025..0.025 rows=248 > loops=11456) statistics on data_objid_idx table are absolutly out - so planner cannot find optimal plan Regard Pavel Stehule > Index Cond: (objid = t.objid) > Total runtime: 66622.026 ms > (11 rows) > > Here is the output when bitmap scans are disabled: > QUERY PLAN > QUERY > PLAN > ---------------------------------------------------------------------------------------------------------------------------------------------------- > Limit (cost=0.00..329631941.65 rows=10000 width=68) (actual > time=0.082..906.876 rows=10000 loops=1) > -> Nested Loop (cost=0.00..4979486036.95 rows=151062 width=68) (actual > time=0.081..905.683 rows=10000 loops=1) > Join Filter: (t.mjd = d1.mjd) > -> Seq Scan on test t (cost=0.00..2632.77 rows=151677 width=28) > (actual time=0.009..1.679 rows=11456 loops=1) > -> Index Scan using data_objid_idx on data d1 > (cost=0.00..26603.32 rows=415080 width=40) (actual time=0.010..0.050 > rows=248 loops=11456) > Index Cond: (objid = t.objid) > Total runtime: 907.462 ms > > When the bitmap scans are enabled the "prof" of postgres shows > 47.10% postmaster postgres [.] hash_search_with_hash_value > | > --- hash_search_with_hash_value > > 11.06% postmaster postgres [.] hash_seq_search > | > --- hash_seq_search > > 6.95% postmaster postgres [.] hash_any > | > --- hash_any > > 5.17% postmaster postgres [.] _bt_checkkeys > | > --- _bt_checkkeys > > 4.07% postmaster postgres [.] tbm_add_tuples > | > --- tbm_add_tuples > > 3.41% postmaster postgres [.] hash_search > | > --- hash_search > > > And the last note is that the crts.data table which is being bitmap scanned > is a 1.1Tb table with ~ 20e9 rows. My feeling is that the bitmap index scan > code > is somehow unprepared to combine two bitmaps for such a big table, and this > leads to the terrible performance. > > Regards, > Sergey > > PS Here are the schemas of the tables, just in case: > wsdb=> \d test > Table "koposov.test" > Column | Type | Modifiers > ---------+------------------+----------- > mjd | double precision | > fieldid | bigint | > intmag | integer | > objid | bigint | > > wsdb=> \d crts.data > Table "crts.data" > Column | Type | Modifiers > --------+------------------+----------- > objid | bigint | > mjd | double precision | > mag | real | > emag | real | > ra | double precision | > dec | double precision | > Indexes: > "data_mjd_idx" btree (mjd) WITH (fillfactor=100) > "data_objid_idx" btree (objid) WITH (fillfactor=100) > "data_q3c_ang2ipix_idx" btree (q3c_ang2ipix(ra, "dec")) WITH > (fillfactor=100) > > PPS shared_buffers=10GB, work_mem=1GB > All the test shown here were don in fully cached regime. > > PPS I can believe that what I'm seeing is a feature, not a bug of bitmap > scans, > and I can live with disabling them, but I still thought it's worth > reporting. > > > ***************************************************** > Sergey E. Koposov, PhD, Research Associate > Institute of Astronomy, University of Cambridge > Madingley road, CB3 0HA, Cambridge, UK > Tel: +44-1223-337-551 > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers
pgsql-hackers by date: