Re: bitmap scan much slower than index scan, hash_search_with_hash_value - Mailing list pgsql-hackers
From | Peter Geoghegan |
---|---|
Subject | Re: bitmap scan much slower than index scan, hash_search_with_hash_value |
Date | |
Msg-id | CAEYLb_UX0_5fw3-jW4neJa=bs0KmGpgH_9FWqVhKXh5AHrpAEQ@mail.gmail.com Whole thread Raw |
In response to | bitmap scan much slower than index scan, hash_search_with_hash_value (Sergey Koposov <koposov@ast.cam.ac.uk>) |
Responses |
Re: bitmap scan much slower than index scan,
hash_search_with_hash_value
|
List | pgsql-hackers |
On 2 September 2012 06:21, Sergey Koposov <koposov@ast.cam.ac.uk> wrote: > Hi, > > I'm experiencing the case when bitmap scan is ~ 70 times slower than index > scan which seems to be caused by 1) very big table 2) some hash search logic > (hash_search_with_hash_value ) > > Here is the explain analyze of the query with bitmap scans allowed: > > wsdb=> explain analyze select * from test as t, crts.data as d1 > where d1.objid=t.objid and d1.mjd=t.mjd limit 10000; > QUERY > PLAN > ------------------------------------------------------------------------------------------------------------------------------------------------------- > Limit (cost=11514.04..115493165.44 rows=10000 width=68) (actual > time=27.512..66620.231 rows=10000 loops=1) > -> Nested Loop (cost=11514.04..1799585184.18 rows=155832 width=68) > (actual time=27.511..66616.807 rows=10000 loops=1) > -> Seq Scan on test t (cost=0.00..2678.40 rows=156240 width=28) > (actual time=0.010..4.685 rows=11456 loops=1) > -> Bitmap Heap Scan on data d1 (cost=11514.04..11518.05 rows=1 > width=40) (actual time=5.807..5.807 rows=1 loops=11456) > Recheck Cond: ((mjd = t.mjd) AND (objid = t.objid)) > -> BitmapAnd (cost=11514.04..11514.04 rows=1 width=0) > (actual time=5.777..5.777 rows=0 loops=11456) > -> Bitmap Index Scan on data_mjd_idx > (cost=0.00..2501.40 rows=42872 width=0) (actual time=3.920..3.920 rows=22241 > loops=11456) > Index Cond: (mjd = t.mjd) > -> Bitmap Index Scan on data_objid_idx > (cost=0.00..8897.90 rows=415080 width=0) (actual time=0.025..0.025 rows=248 > loops=11456) > Index Cond: (objid = t.objid) > Total runtime: 66622.026 ms > (11 rows) > > Here is the output when bitmap scans are disabled: > QUERY PLAN > QUERY > PLAN > ---------------------------------------------------------------------------------------------------------------------------------------------------- > Limit (cost=0.00..329631941.65 rows=10000 width=68) (actual > time=0.082..906.876 rows=10000 loops=1) > -> Nested Loop (cost=0.00..4979486036.95 rows=151062 width=68) (actual > time=0.081..905.683 rows=10000 loops=1) > Join Filter: (t.mjd = d1.mjd) > -> Seq Scan on test t (cost=0.00..2632.77 rows=151677 width=28) > (actual time=0.009..1.679 rows=11456 loops=1) > -> Index Scan using data_objid_idx on data d1 > (cost=0.00..26603.32 rows=415080 width=40) (actual time=0.010..0.050 > rows=248 loops=11456) > Index Cond: (objid = t.objid) > Total runtime: 907.462 ms > > When the bitmap scans are enabled the "prof" of postgres shows > 47.10% postmaster postgres [.] hash_search_with_hash_value > | > --- hash_search_with_hash_value > > 11.06% postmaster postgres [.] hash_seq_search > | > --- hash_seq_search > > 6.95% postmaster postgres [.] hash_any > | > --- hash_any > > 5.17% postmaster postgres [.] _bt_checkkeys > | > --- _bt_checkkeys > > 4.07% postmaster postgres [.] tbm_add_tuples > | > --- tbm_add_tuples > > 3.41% postmaster postgres [.] hash_search > | > --- hash_search > > > And the last note is that the crts.data table which is being bitmap scanned > is a 1.1Tb table with ~ 20e9 rows. My feeling is that the bitmap index scan > code > is somehow unprepared to combine two bitmaps for such a big table, and this > leads to the terrible performance. I think that this kind of question is better suited to the pgsql-performance list. Granted, it was presented as a bug report (though they're generally sent to -bugs rather than -hackers), but I don't think that this is a valid bug. One obvious red-flag from your query plans is that there is a misestimation of the row return count of a few orders of magnitude in the Bitmap Index Scan node. Did you trying performing an ANALYZE to see if that helped? It may also be helpful to show pg_stats entries for both the data.mjd and test.mjd columns. You may find, prior to doing an ANALYZE, that there is no entries for one of those tables. Turning off the enable_* planner options in production is generally discouraged. Certainly, you'd be crazy to do that on a server-wide basis. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
pgsql-hackers by date: