Re: bitmap scan much slower than index scan, hash_search_with_hash_value - Mailing list pgsql-hackers

From Tom Lane
Subject Re: bitmap scan much slower than index scan, hash_search_with_hash_value
Date
Msg-id 22545.1346597069@sss.pgh.pa.us
Whole thread Raw
In response to Re: bitmap scan much slower than index scan, hash_search_with_hash_value  (Sergey Koposov <koposov@ast.cam.ac.uk>)
Responses Re: bitmap scan much slower than index scan, hash_search_with_hash_value  (Sergey Koposov <koposov@ast.cam.ac.uk>)
List pgsql-hackers
Sergey Koposov <koposov@ast.cam.ac.uk> writes:
> On Sun, 2 Sep 2012, Peter Geoghegan wrote:
>> One obvious red-flag from your query plans is that there is a
>> misestimation of the row return count of a few orders of magnitude in
>> the Bitmap Index Scan node. Did you trying performing an ANALYZE to
>> see if that helped? It may also be helpful to show pg_stats entries
>> for both the data.mjd and test.mjd columns. You may find, prior to
>> doing an ANALYZE, that there is no entries for one of those tables.

> The main large table is static and was analyzed. The test table was as 
> well. But as mentioned in another recent email, the query is the join, so 
> column correlation is a problem.

The problem is definitely the misestimation here:
         ->  Index Scan using data_objid_idx on data d1  (cost=0.00..26603.32 rows=415080 width=40) (actual
time=0.010..0.050rows=248 loops=11456)               Index Cond: (objid = t.objid)
 

The planner thinks that indexscan will be 2000 times more expensive than
it really is (assuming that the cost per retrieved row is linear, which
it isn't entirely, but close enough for the moment).  Of course, it's
also thinking the bitmap component scan on the same index will be 2000
times more expensive than reality, but that has only perhaps a 4X impact
on the total cost of the bitmap scan, since the use of the other index
is what dominates there.  With a more accurate idea of this join
condition's selectivity, I'm pretty certain it would have gone for a
plain indexscan, or else a bitmap scan using only this index.

So if there's a bug here, it's in eqjoinsel().  Can you pinpoint
anything unusual about the stats of the objid columns?
        regards, tom lane



pgsql-hackers by date:

Previous
From: Sergey Koposov
Date:
Subject: Re: bitmap scan much slower than index scan, hash_search_with_hash_value
Next
From: Phil Sorber
Date:
Subject: Re: Fwd: PATCH: psql boolean display