Thread: bitmap scan much slower than index scan, hash_search_with_hash_value

bitmap scan much slower than index scan, hash_search_with_hash_value

From
Sergey Koposov
Date:
Hi,

I'm experiencing a case where a bitmap scan is ~70 times slower than an
index scan, which seems to be caused by 1) a very big table and 2) some hash
search logic (hash_search_with_hash_value).

Here is the explain analyze of the query with bitmap scans allowed:

wsdb=> explain analyze select * from test as t, crts.data as d1
                where d1.objid=t.objid and d1.mjd=t.mjd limit 10000;
                                                                      QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=11514.04..115493165.44 rows=10000 width=68) (actual time=27.512..66620.231 rows=10000 loops=1)
   ->  Nested Loop  (cost=11514.04..1799585184.18 rows=155832 width=68) (actual time=27.511..66616.807 rows=10000 loops=1)
         ->  Seq Scan on test t  (cost=0.00..2678.40 rows=156240 width=28) (actual time=0.010..4.685 rows=11456 loops=1)
         ->  Bitmap Heap Scan on data d1  (cost=11514.04..11518.05 rows=1 width=40) (actual time=5.807..5.807 rows=1 loops=11456)
               Recheck Cond: ((mjd = t.mjd) AND (objid = t.objid))
               ->  BitmapAnd  (cost=11514.04..11514.04 rows=1 width=0) (actual time=5.777..5.777 rows=0 loops=11456)
                     ->  Bitmap Index Scan on data_mjd_idx  (cost=0.00..2501.40 rows=42872 width=0) (actual time=3.920..3.920 rows=22241 loops=11456)
                           Index Cond: (mjd = t.mjd)
                     ->  Bitmap Index Scan on data_objid_idx  (cost=0.00..8897.90 rows=415080 width=0) (actual time=0.025..0.025 rows=248 loops=11456)
                           Index Cond: (objid = t.objid)
 Total runtime: 66622.026 ms
(11 rows)

Here is the output when bitmap scans are disabled:
                                                                     QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..329631941.65 rows=10000 width=68) (actual time=0.082..906.876 rows=10000 loops=1)
   ->  Nested Loop  (cost=0.00..4979486036.95 rows=151062 width=68) (actual time=0.081..905.683 rows=10000 loops=1)
         Join Filter: (t.mjd = d1.mjd)
         ->  Seq Scan on test t  (cost=0.00..2632.77 rows=151677 width=28) (actual time=0.009..1.679 rows=11456 loops=1)
         ->  Index Scan using data_objid_idx on data d1  (cost=0.00..26603.32 rows=415080 width=40) (actual time=0.010..0.050 rows=248 loops=11456)
               Index Cond: (objid = t.objid)
 Total runtime: 907.462 ms

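(For reference, the plan above was produced by turning bitmap scans off
for this session only -- nothing beyond the standard planner GUC is
assumed here, i.e. something like:

    SET enable_bitmapscan = off;
    EXPLAIN ANALYZE
    SELECT *
      FROM test AS t, crts.data AS d1
     WHERE d1.objid = t.objid
       AND d1.mjd = t.mjd
     LIMIT 10000;
    RESET enable_bitmapscan;
)
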
When the bitmap scans are enabled the "prof" of postgres shows:

    47.10%  postmaster  postgres           [.] hash_search_with_hash_value
            |
            --- hash_search_with_hash_value

    11.06%  postmaster  postgres           [.] hash_seq_search
            |
            --- hash_seq_search

     6.95%  postmaster  postgres           [.] hash_any
            |
            --- hash_any

     5.17%  postmaster  postgres           [.] _bt_checkkeys
            |
            --- _bt_checkkeys

     4.07%  postmaster  postgres           [.] tbm_add_tuples
            |
            --- tbm_add_tuples

     3.41%  postmaster  postgres           [.] hash_search
            |
            --- hash_search


And one last note: the crts.data table which is being bitmap scanned is
a 1.1 TB table with ~20e9 rows. My feeling is that the bitmap index scan code
is somehow unprepared to combine two bitmaps for such a big table, and this
leads to the terrible performance.

Regards,    Sergey

PS Here are the schemas of the tables, just in case:
wsdb=> \d test
          Table "koposov.test"
 Column  |       Type       | Modifiers
---------+------------------+-----------
 mjd     | double precision |
 fieldid | bigint           |
 intmag  | integer          |
 objid   | bigint           |


wsdb=> \d crts.data
           Table "crts.data"
 Column |       Type       | Modifiers
--------+------------------+-----------
 objid  | bigint           |
 mjd    | double precision |
 mag    | real             |
 emag   | real             |
 ra     | double precision |
 dec    | double precision |
Indexes:
    "data_mjd_idx" btree (mjd) WITH (fillfactor=100)
    "data_objid_idx" btree (objid) WITH (fillfactor=100)
    "data_q3c_ang2ipix_idx" btree (q3c_ang2ipix(ra, "dec")) WITH (fillfactor=100)

PPS shared_buffers=10GB, work_mem=1GB
All the tests shown here were done in a fully cached regime.

PPS I can believe that what I'm seeing is a feature, not a bug of bitmap scans,
and I can live with disabling them, but I still thought it's worth reporting.


*****************************************************
Sergey E. Koposov, PhD, Research Associate
Institute of Astronomy, University of Cambridge
Madingley road, CB3 0HA, Cambridge, UK
Tel: +44-1223-337-551



Re: bitmap scan much slower than index scan, hash_search_with_hash_value

From
Pavel Stehule
Date:
Hello

2012/9/2 Sergey Koposov <koposov@ast.cam.ac.uk>:
> Hi,
>
> I'm experiencing the case when bitmap scan is ~ 70 times slower than index
> scan which seems to be caused by 1) very big table 2) some hash search logic
> (hash_search_with_hash_value )
>
> Here is the explain analyze of the query with bitmap scans allowed:
>
> wsdb=> explain analyze select * from test as t, crts.data as d1
>                 where d1.objid=t.objid and d1.mjd=t.mjd limit 10000;
>                                                                       QUERY
> PLAN
>
-------------------------------------------------------------------------------------------------------------------------------------------------------
>  Limit  (cost=11514.04..115493165.44 rows=10000 width=68) (actual
> time=27.512..66620.231 rows=10000 loops=1)
>    ->  Nested Loop  (cost=11514.04..1799585184.18 rows=155832 width=68)
> (actual time=27.511..66616.807 rows=10000 loops=1)
>          ->  Seq Scan on test t  (cost=0.00..2678.40 rows=156240 width=28)
> (actual time=0.010..4.685 rows=11456 loops=1)
>          ->  Bitmap Heap Scan on data d1  (cost=11514.04..11518.05 rows=1
> width=40) (actual time=5.807..5.807 rows=1 loops=11456)
>                Recheck Cond: ((mjd = t.mjd) AND (objid = t.objid))
>                ->  BitmapAnd  (cost=11514.04..11514.04 rows=1 width=0)
> (actual time=5.777..5.777 rows=0 loops=11456)
>                      ->  Bitmap Index Scan on data_mjd_idx
> (cost=0.00..2501.40 rows=42872 width=0) (actual time=3.920..3.920 rows=22241
> loops=11456)
>                            Index Cond: (mjd = t.mjd)



>                      ->  Bitmap Index Scan on data_objid_idx
> (cost=0.00..8897.90 rows=415080 width=0) (actual time=0.025..0.025 rows=248
> loops=11456)

the statistics on the data_objid_idx table are absolutely off - so the
planner cannot find an optimal plan

Regards

Pavel Stehule

>                            Index Cond: (objid = t.objid)
>  Total runtime: 66622.026 ms
> (11 rows)
>
> Here is the output when bitmap scans are disabled:
> QUERY PLAN
>                                                                      QUERY
> PLAN
>
----------------------------------------------------------------------------------------------------------------------------------------------------
>  Limit  (cost=0.00..329631941.65 rows=10000 width=68) (actual
> time=0.082..906.876 rows=10000 loops=1)
>    ->  Nested Loop  (cost=0.00..4979486036.95 rows=151062 width=68) (actual
> time=0.081..905.683 rows=10000 loops=1)
>          Join Filter: (t.mjd = d1.mjd)
>          ->  Seq Scan on test t  (cost=0.00..2632.77 rows=151677 width=28)
> (actual time=0.009..1.679 rows=11456 loops=1)
>          ->  Index Scan using data_objid_idx on data d1
> (cost=0.00..26603.32 rows=415080 width=40) (actual time=0.010..0.050
> rows=248 loops=11456)
>                Index Cond: (objid = t.objid)
>  Total runtime: 907.462 ms
>
> When the bitmap scans are enabled the "prof" of postgres shows
>     47.10%  postmaster  postgres           [.] hash_search_with_hash_value
>             |
>             --- hash_search_with_hash_value
>
>     11.06%  postmaster  postgres           [.] hash_seq_search
>             |
>             --- hash_seq_search
>
>      6.95%  postmaster  postgres           [.] hash_any
>             |
>             --- hash_any
>
>      5.17%  postmaster  postgres           [.] _bt_checkkeys
>             |
>             --- _bt_checkkeys
>
>      4.07%  postmaster  postgres           [.] tbm_add_tuples
>             |
>             --- tbm_add_tuples
>
>      3.41%  postmaster  postgres           [.] hash_search
>             |
>             --- hash_search
>
>
> And the last note is that the crts.data table which is being bitmap scanned
> is a 1.1Tb table with ~ 20e9 rows. My feeling is that the bitmap index scan
> code
> is somehow unprepared to combine two bitmaps for such a big table, and this
> leads to the terrible performance.
>
> Regards,
>         Sergey
>
> PS Here are the schemas of the tables, just in case:
> wsdb=> \d test
>           Table "koposov.test"
>  Column  |       Type       | Modifiers
> ---------+------------------+-----------
>  mjd     | double precision |
>  fieldid | bigint           |
>  intmag  | integer          |
>  objid   | bigint           |
>
> wsdb=> \d crts.data
>            Table "crts.data"
>  Column |       Type       | Modifiers
> --------+------------------+-----------
>  objid  | bigint           |
>  mjd    | double precision |
>  mag    | real             |
>  emag   | real             |
>  ra     | double precision |
>  dec    | double precision |
> Indexes:
>     "data_mjd_idx" btree (mjd) WITH (fillfactor=100)
>     "data_objid_idx" btree (objid) WITH (fillfactor=100)
>     "data_q3c_ang2ipix_idx" btree (q3c_ang2ipix(ra, "dec")) WITH
> (fillfactor=100)
>
> PPS shared_buffers=10GB, work_mem=1GB
> All the test shown here were don in fully cached regime.
>
> PPS I can believe that what I'm seeing is a feature, not a bug of bitmap
> scans,
> and I can live with disabling them, but I still thought it's worth
> reporting.
>
>
> *****************************************************
> Sergey E. Koposov, PhD, Research Associate
> Institute of Astronomy, University of Cambridge
> Madingley road, CB3 0HA, Cambridge, UK
> Tel: +44-1223-337-551
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers



Re: bitmap scan much slower than index scan, hash_search_with_hash_value

From
Peter Geoghegan
Date:
On 2 September 2012 06:21, Sergey Koposov <koposov@ast.cam.ac.uk> wrote:
> Hi,
>
> I'm experiencing the case when bitmap scan is ~ 70 times slower than index
> scan which seems to be caused by 1) very big table 2) some hash search logic
> (hash_search_with_hash_value )
>
> Here is the explain analyze of the query with bitmap scans allowed:
>
> wsdb=> explain analyze select * from test as t, crts.data as d1
>                 where d1.objid=t.objid and d1.mjd=t.mjd limit 10000;
>                                                                       QUERY
> PLAN
>
-------------------------------------------------------------------------------------------------------------------------------------------------------
>  Limit  (cost=11514.04..115493165.44 rows=10000 width=68) (actual
> time=27.512..66620.231 rows=10000 loops=1)
>    ->  Nested Loop  (cost=11514.04..1799585184.18 rows=155832 width=68)
> (actual time=27.511..66616.807 rows=10000 loops=1)
>          ->  Seq Scan on test t  (cost=0.00..2678.40 rows=156240 width=28)
> (actual time=0.010..4.685 rows=11456 loops=1)
>          ->  Bitmap Heap Scan on data d1  (cost=11514.04..11518.05 rows=1
> width=40) (actual time=5.807..5.807 rows=1 loops=11456)
>                Recheck Cond: ((mjd = t.mjd) AND (objid = t.objid))
>                ->  BitmapAnd  (cost=11514.04..11514.04 rows=1 width=0)
> (actual time=5.777..5.777 rows=0 loops=11456)
>                      ->  Bitmap Index Scan on data_mjd_idx
> (cost=0.00..2501.40 rows=42872 width=0) (actual time=3.920..3.920 rows=22241
> loops=11456)
>                            Index Cond: (mjd = t.mjd)
>                      ->  Bitmap Index Scan on data_objid_idx
> (cost=0.00..8897.90 rows=415080 width=0) (actual time=0.025..0.025 rows=248
> loops=11456)
>                            Index Cond: (objid = t.objid)
>  Total runtime: 66622.026 ms
> (11 rows)
>
> Here is the output when bitmap scans are disabled:
> QUERY PLAN
>                                                                      QUERY
> PLAN
>
----------------------------------------------------------------------------------------------------------------------------------------------------
>  Limit  (cost=0.00..329631941.65 rows=10000 width=68) (actual
> time=0.082..906.876 rows=10000 loops=1)
>    ->  Nested Loop  (cost=0.00..4979486036.95 rows=151062 width=68) (actual
> time=0.081..905.683 rows=10000 loops=1)
>          Join Filter: (t.mjd = d1.mjd)
>          ->  Seq Scan on test t  (cost=0.00..2632.77 rows=151677 width=28)
> (actual time=0.009..1.679 rows=11456 loops=1)
>          ->  Index Scan using data_objid_idx on data d1
> (cost=0.00..26603.32 rows=415080 width=40) (actual time=0.010..0.050
> rows=248 loops=11456)
>                Index Cond: (objid = t.objid)
>  Total runtime: 907.462 ms
>
> When the bitmap scans are enabled the "prof" of postgres shows
>     47.10%  postmaster  postgres           [.] hash_search_with_hash_value
>             |
>             --- hash_search_with_hash_value
>
>     11.06%  postmaster  postgres           [.] hash_seq_search
>             |
>             --- hash_seq_search
>
>      6.95%  postmaster  postgres           [.] hash_any
>             |
>             --- hash_any
>
>      5.17%  postmaster  postgres           [.] _bt_checkkeys
>             |
>             --- _bt_checkkeys
>
>      4.07%  postmaster  postgres           [.] tbm_add_tuples
>             |
>             --- tbm_add_tuples
>
>      3.41%  postmaster  postgres           [.] hash_search
>             |
>             --- hash_search
>
>
> And the last note is that the crts.data table which is being bitmap scanned
> is a 1.1Tb table with ~ 20e9 rows. My feeling is that the bitmap index scan
> code
> is somehow unprepared to combine two bitmaps for such a big table, and this
> leads to the terrible performance.

I think that this kind of question is better suited to the
pgsql-performance list. Granted, it was presented as a bug report
(though they're generally sent to -bugs rather than -hackers), but I
don't think that this is a valid bug.

One obvious red-flag from your query plans is that there is a
misestimation of the row return count of a few orders of magnitude in
the Bitmap Index Scan node. Did you try performing an ANALYZE to
see if that helped? It may also be helpful to show pg_stats entries
for both the data.mjd and test.mjd columns. You may find, prior to
doing an ANALYZE, that there are no entries for one of those tables.
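
Something along these lines against the standard pg_stats view would
show them (just a sketch; adjust the schema qualification as needed):

    SELECT schemaname, tablename, attname, n_distinct, correlation
      FROM pg_stats
     WHERE attname = 'mjd'
       AND tablename IN ('data', 'test');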

Turning off the enable_* planner options in production is generally
discouraged. Certainly, you'd be crazy to do that on a server-wide
basis.

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services



Re: bitmap scan much slower than index scan, hash_search_with_hash_value

From
Sergey Koposov
Date:
On Sun, 2 Sep 2012, Pavel Stehule wrote:

>
> statistics on data_objid_idx  table are absolutly out - so planner
> cannot find optimal plan

That doesn't have anything to do with the problem, AFAIU.
First, the data table is static and was analysed.
Second, the query in question is a join, and AFAIK the row count
estimation is known to be incorrect in the case of column
correlation.
Third, at least according to my understanding, in a fully cached regime a
bitmap scan should not take two orders of magnitude more CPU time than an
index scan.
    Sergey
>
> Regard
>
> Pavel Stehule
>
>>                            Index Cond: (objid = t.objid)
>>  Total runtime: 66622.026 ms
>> (11 rows)
>>
>> Here is the output when bitmap scans are disabled:
>> QUERY PLAN
>>                                                                      QUERY
>> PLAN
>>
----------------------------------------------------------------------------------------------------------------------------------------------------
>>  Limit  (cost=0.00..329631941.65 rows=10000 width=68) (actual
>> time=0.082..906.876 rows=10000 loops=1)
>>    ->  Nested Loop  (cost=0.00..4979486036.95 rows=151062 width=68) (actual
>> time=0.081..905.683 rows=10000 loops=1)
>>          Join Filter: (t.mjd = d1.mjd)
>>          ->  Seq Scan on test t  (cost=0.00..2632.77 rows=151677 width=28)
>> (actual time=0.009..1.679 rows=11456 loops=1)
>>          ->  Index Scan using data_objid_idx on data d1
>> (cost=0.00..26603.32 rows=415080 width=40) (actual time=0.010..0.050
>> rows=248 loops=11456)
>>                Index Cond: (objid = t.objid)
>>  Total runtime: 907.462 ms
>>
>> When the bitmap scans are enabled the "prof" of postgres shows
>>     47.10%  postmaster  postgres           [.] hash_search_with_hash_value
>>             |
>>             --- hash_search_with_hash_value
>>
>>     11.06%  postmaster  postgres           [.] hash_seq_search
>>             |
>>             --- hash_seq_search
>>
>>      6.95%  postmaster  postgres           [.] hash_any
>>             |
>>             --- hash_any
>>
>>      5.17%  postmaster  postgres           [.] _bt_checkkeys
>>             |
>>             --- _bt_checkkeys
>>
>>      4.07%  postmaster  postgres           [.] tbm_add_tuples
>>             |
>>             --- tbm_add_tuples
>>
>>      3.41%  postmaster  postgres           [.] hash_search
>>             |
>>             --- hash_search
>>
>>
>> And the last note is that the crts.data table which is being bitmap scanned
>> is a 1.1Tb table with ~ 20e9 rows. My feeling is that the bitmap index scan
>> code
>> is somehow unprepared to combine two bitmaps for such a big table, and this
>> leads to the terrible performance.
>>
>> Regards,
>>         Sergey
>>
>> PS Here are the schemas of the tables, just in case:
>> wsdb=> \d test
>>           Table "koposov.test"
>>  Column  |       Type       | Modifiers
>> ---------+------------------+-----------
>>  mjd     | double precision |
>>  fieldid | bigint           |
>>  intmag  | integer          |
>>  objid   | bigint           |
>>
>> wsdb=> \d crts.data
>>            Table "crts.data"
>>  Column |       Type       | Modifiers
>> --------+------------------+-----------
>>  objid  | bigint           |
>>  mjd    | double precision |
>>  mag    | real             |
>>  emag   | real             |
>>  ra     | double precision |
>>  dec    | double precision |
>> Indexes:
>>     "data_mjd_idx" btree (mjd) WITH (fillfactor=100)
>>     "data_objid_idx" btree (objid) WITH (fillfactor=100)
>>     "data_q3c_ang2ipix_idx" btree (q3c_ang2ipix(ra, "dec")) WITH
>> (fillfactor=100)
>>
>> PPS shared_buffers=10GB, work_mem=1GB
>> All the test shown here were don in fully cached regime.
>>
>> PPS I can believe that what I'm seeing is a feature, not a bug of bitmap
>> scans,
>> and I can live with disabling them, but I still thought it's worth
>> reporting.
>>
>>
>> *****************************************************
>> Sergey E. Koposov, PhD, Research Associate
>> Institute of Astronomy, University of Cambridge
>> Madingley road, CB3 0HA, Cambridge, UK
>> Tel: +44-1223-337-551
>>
>>
>> --
>> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-hackers
>

*****************************************************
Sergey E. Koposov, PhD, Research Associate
Institute of Astronomy, University of Cambridge
Madingley road, CB3 0HA, Cambridge, UK
Tel: +44-1223-337-551 Web: http://www.ast.cam.ac.uk/~koposov/



Re: bitmap scan much slower than index scan, hash_search_with_hash_value

From
Sergey Koposov
Date:
Thanks for your comments. 
On Sun, 2 Sep 2012, Peter Geoghegan wrote:
> On 2 September 2012 06:21, Sergey Koposov <koposov@ast.cam.ac.uk> wrote:
>
> I think that this kind of question is better suited to the
> pgsql-performance list. Granted, it was presented as a bug report
> (though they're generally sent to -bugs rather than -hackers), but I
> don't think that this is a valid bug.

The reason I was inclined to think that it is a bug is that I
encountered a similar bug before with bitmap scans and very big
tables:
http://archives.postgresql.org/pgsql-hackers/2011-08/msg00958.php
Furthermore, two orders of magnitude more CPU time for bitmap scans
compared to the index scan didn't sound right to me (although obviously
I'm not in a position to claim that it's 100% a bug).

> One obvious red-flag from your query plans is that there is a
> misestimation of the row return count of a few orders of magnitude in
> the Bitmap Index Scan node. Did you trying performing an ANALYZE to
> see if that helped? It may also be helpful to show pg_stats entries
> for both the data.mjd and test.mjd columns. You may find, prior to
> doing an ANALYZE, that there is no entries for one of those tables.

The main large table is static and was analyzed. The test table was as
well. But as mentioned in another recent email, the query is a join, so
column correlation is a problem.

> Turning off the enable_* planner options in production is generally
> discouraged. Certainly, you'd be crazy to do that on a server-wide
> basis.

I'm using PG for data mining and data analysis purposes, with very few
clients connected and very large tables, so enable_* is used quite often
to fix incorrect plans caused by incorrect selectivity estimates.
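
A minimal sketch of that pattern, using the query from this thread and
only the standard SET LOCAL command (so the override never outlives the
transaction, let alone the session):

    BEGIN;
    SET LOCAL enable_bitmapscan = off;   -- reverted automatically at COMMIT/ROLLBACK
    SELECT *
      FROM test AS t, crts.data AS d1
     WHERE d1.objid = t.objid
       AND d1.mjd = t.mjd
     LIMIT 10000;
    COMMIT;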

Regards,    Sergey

*****************************************************
Sergey E. Koposov, PhD, Research Associate
Institute of Astronomy, University of Cambridge
Madingley road, CB3 0HA, Cambridge, UK
Tel: +44-1223-337-551 Web: http://www.ast.cam.ac.uk/~koposov/



Re: bitmap scan much slower than index scan, hash_search_with_hash_value

From
Tom Lane
Date:
Sergey Koposov <koposov@ast.cam.ac.uk> writes:
> On Sun, 2 Sep 2012, Peter Geoghegan wrote:
>> One obvious red-flag from your query plans is that there is a
>> misestimation of the row return count of a few orders of magnitude in
>> the Bitmap Index Scan node. Did you trying performing an ANALYZE to
>> see if that helped? It may also be helpful to show pg_stats entries
>> for both the data.mjd and test.mjd columns. You may find, prior to
>> doing an ANALYZE, that there is no entries for one of those tables.

> The main large table is static and was analyzed. The test table was as 
> well. But as mentioned in another recent email, the query is the join, so 
> column correlation is a problem.

The problem is definitely the misestimation here:
         ->  Index Scan using data_objid_idx on data d1  (cost=0.00..26603.32 rows=415080 width=40) (actual time=0.010..0.050 rows=248 loops=11456)
               Index Cond: (objid = t.objid)

The planner thinks that indexscan will be 2000 times more expensive than
it really is (assuming that the cost per retrieved row is linear, which
it isn't entirely, but close enough for the moment).  Of course, it's
also thinking the bitmap component scan on the same index will be 2000
times more expensive than reality, but that has only perhaps a 4X impact
on the total cost of the bitmap scan, since the use of the other index
is what dominates there.  With a more accurate idea of this join
condition's selectivity, I'm pretty certain it would have gone for a
plain indexscan, or else a bitmap scan using only this index.

So if there's a bug here, it's in eqjoinsel().  Can you pinpoint
anything unusual about the stats of the objid columns?
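
(For reference, a sketch of how those stats can be pulled out of the
standard pg_stats view:

    SELECT schemaname, tablename, attname, n_distinct, correlation
      FROM pg_stats
     WHERE attname = 'objid';
)
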
        regards, tom lane



Re: bitmap scan much slower than index scan, hash_search_with_hash_value

From
Sergey Koposov
Date:
On Sun, 2 Sep 2012, Tom Lane wrote:

> Sergey Koposov <koposov@ast.cam.ac.uk> writes:
> The problem is definitely the misestimation here:
>
>          ->  Index Scan using data_objid_idx on data d1  (cost=0.00..26603.32 rows=415080 width=40) (actual time=0.010..0.050 rows=248 loops=11456)
>                Index Cond: (objid = t.objid)
>
> The planner thinks that indexscan will be 2000 times more expensive than
> it really is (assuming that the cost per retrieved row is linear, which
> it isn't entirely, but close enough for the moment).  Of course, it's
> also thinking the bitmap component scan on the same index will be 2000
> times more expensive than reality, but that has only perhaps a 4X impact
> on the total cost of the bitmap scan, since the use of the other index
> is what dominates there.  With a more accurate idea of this join
> condition's selectivity, I'm pretty certain it would have gone for a
> plain indexscan, or else a bitmap scan using only this index.
> So if there's a bug here, it's in eqjoinsel().  Can you pinpoint
> anything unusual about the stats of the objid columns?

Here are the pg_stats rows for two tables.
-[ RECORD 1 ]-----+----------------------------------------------------------------------------------------------------------------------
schemaname        | koposov
tablename         | test
attname           | objid
inherited         | f
null_frac         | 0
avg_width         | 8
n_distinct        | 12871
most_common_vals  | {1129054024305,1129054013805,1129054015519,1129054010351,1129054024288,1129054031271,1129054025160,1129054002377,1129054009879,1129054052531,1129054009504,1129054012457,1129054019711,1129054021674,1129054036120,1129054007511,1129054047608,1129054051315,1129053031933,1129054002167,1129054025887,1129054031101,1129054052009,1129054005693,1129054052179,1129054009770,1129054010667,1129054011055,1129054012202,1129054023796,1129054024869,1129054033130,1129054034568,1129054007984,1129054011841,1129054017365,1129054024604,1129055040074,1129054004140,1129054023593,1129054037199,1129054049199,1129054012698,1129054012820,1129054035783,1129054036867,1129054040468,1129054004406,1129054019158,1129054021526,1129054041113,1129054045723,1129055012252,1129054017556,1129054020772,1129054036364,1129054044186,1129053031526,1129054016861,1129054026443,1129054040073,1129054045776,1129054009002,1129054013202,1129054051372,1129054037023,1129054038374,1129054042219,1129054049868,1129054007536,1129054009831,1129054017925,1129054000969,1129054012326,1129054049904,1129054012362,1129054040299,1129054043170,1129054049082,1129054002797,1129054009865,1129054019731,1129054025094,1129054025513,1129054036242,1129054045152,1129054052444,1129053049005,1129054008818,1129054012536,1129054019393,1129054036625,1129054039488,1129054001843,1129054030113,1129054033380,1129054036119,1129054037048,1129054037726,1129054040232}
most_common_freqs | {0.0017,0.0016,0.0016,0.00153333,0.00146667,0.00143333,0.0014,0.00136667,0.00136667,0.00136667,0.00133333,0.00133333,0.00133333,0.00133333,0.00133333,0.0013,0.0013,0.0013,0.00126667,0.00126667,0.00126667,0.00126667,0.00126667,0.00123333,0.00123333,0.0012,0.0012,0.0012,0.0012,0.0012,0.0012,0.0012,0.0012,0.00116667,0.00116667,0.00116667,0.00116667,0.00116667,0.00113333,0.00113333,0.00113333,0.00113333,0.0011,0.0011,0.0011,0.0011,0.0011,0.00106667,0.00106667,0.00106667,0.00106667,0.00106667,0.00106667,0.00103333,0.00103333,0.00103333,0.00103333,0.001,0.001,0.001,0.001,0.001,0.001,0.000966667,0.000933333,0.000933333,0.0009,0.0009,0.0009,0.0009,0.000866667,0.000866667,0.000866667,0.000833333,0.000833333,0.000833333,0.0008,0.0008,0.0008,0.0008,0.000766667,0.000766667,0.000766667,0.000766667,0.000766667,0.000766667,0.000766667,0.000766667,0.000733333,0.000733333,0.000733333,0.000733333,0.000733333,0.000733333,0.0007,0.0007,0.0007,0.0007,0.0007,0.0007,0.0007}
histogram_bounds  | {1126055052854,1129053016156,1129053034242,1129054000425,1129054001170,1129054001752,1129054002346,1129054002907,1129054003399,1129054003950,1129054004482,1129054005041,1129054005624,1129054006138,1129054006636,1129054007360,1129054007880,1129054008405,1129054009003,1129054009521,1129054010022,1129054010472,1129054010987,1129054011411,1129054012093,1129054012747,1129054013320,1129054013877,1129054014498,1129054014886,1129054015395,1129054015882,1129054016429,1129054016996,1129054017481,1129054018011,1129054018594,1129054019010,1129054019419,1129054019857,1129054020367,1129054020851,1129054021386,1129054021974,1129054022470,1129054022961,1129054023507,1129054024144,1129054024709,1129054025236,1129054025668,1129054026108,1129054026713,1129054027136,1129054027725,1129054028321,1129054028791,1129054029410,1129054029930,1129054030588,1129054031188,1129054031714,1129054032265,1129054032885,1129054033562,1129054033982,1129054034471,1129054034995,1129054035565,1129054036163,1129054036618,1129054037228,1129054037814,1129054038511,1129054039023,1129054039716,1129054040221,1129054040896,1129054041519,1129054042087,1129054042712,1129054043109,1129054043572,1129054044106,1129054044719,1129054045295,1129054045953,1129054046667,1129054047347,1129054047912,1129054048614,1129054049229,1129054049970,1129054050705,1129054051360,1129054051926,1129054052381,1129054053001,1129055021424,1129055042151,1132053000980}
correlation       | 0.00388717
-[ RECORD 2 ]-----+----------------------------------------------------------------------------------------------------------------------
schemaname        | crts
tablename         | data
attname           | objid
inherited         | f
null_frac         | 0
avg_width         | 8
n_distinct        | 42984
most_common_vals  | {1001052054863,1101028035628,1107015007145,1129016053807,1001010035337,1001047071131,1101056005163,1109079016218,1112118039038,1115046030816,1115046038573,1115078006068,1121063029656,1126048008082,1132004060369,1138019056456,1004051056380,1004073058947,1004083045581,1007021032422,1007113045682,1009018027582,1012083068700,1012110014859,1015121028788,1101029052205,1101048059555,1101127022623,1104089078373,1104120055698,1109045027987,1109078024374,1109083064345,1109091021615,1112024034752,1112053044352,1112060039848,1115015037059,1115067015530,1115078013852,1118021049852,1118042046304,1118086054873,1121059018037,1121071057373,1121080059645,1123038024940,1123038039165,1123077021003,1129053014283,1129112028843,1132104013785,1135002067293,1135046049690,1135064053365,1138070003559,1140002058964,1140035053541,1140042049671,1140058002804,1146044021576,1146046009579,1001013030330,1001014040915,1001026061566,1001027033057,1001032050491,1001050029421,1001063047914,1001077045026,1001086040152,1001111064042,1001114077173,1001119045214,1001122002865,1004018025733,1004050058091,1004071054511,1004074007987,1004076056804,1004078064774,1004112043963,1004112068670,1004116012773,1004117067826,1004121051038,1004123012077,1004126006726,1004127028634,1007079042379,1007110052823,1007111033667,1007123042281,1009075018815,1009119043295,1012007002296,1012025045008,1012049033574,1012055017037,1012087051308}
most_common_freqs | {0.000233333,0.000233333,0.000233333,0.000233333,0.0002,0.0002,0.0002,0.0002,0.0002,0.0002,0.0002,0.0002,0.0002,0.0002,0.0002,0.0002,0.000166667,0.000166667,0.000166667,0.000166667,0.000166667,0.000166667,0.000166667,0.000166667,0.000166667,0.000166667,0.000166667,0.000166667,0.000166667,0.000166667,0.000166667,0.000166667,0.000166667,0.000166667,0.000166667,0.000166667,0.000166667,0.000166667,0.000166667,0.000166667,0.000166667,0.000166667,0.000166667,0.000166667,0.000166667,0.000166667,0.000166667,0.000166667,0.000166667,0.000166667,0.000166667,0.000166667,0.000166667,0.000166667,0.000166667,0.000166667,0.000166667,0.000166667,0.000166667,0.000166667,0.000166667,0.000166667,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333,0.000133333}
histogram_bounds  | {1001004004139,1001029020233,1001059002785,1001087029100,1001118009993,1004027052124,1004062046045,1004088060260,1004116023848,1007013040141,1007050044111,1007080021932,1007111006517,1009009008718,1009053001627,1009086036243,1009119018400,1012049009567,1012084095760,1012117034362,1015030070068,1015085100797,1018025013582,1018085006093,1101016044008,1101044062170,1101063008371,1101087058719,1101118037664,1104019022680,1104044088809,1104066053561,1104086083356,1104115067821,1107010050477,1107043017994,1107058030735,1107083056547,1107112094304,1109007017319,1109029048299,1109059031081,1109084031184,1109114001040,1112007032430,1112029071539,1112059005867,1112083065284,1112113082391,1115007032793,1115028081315,1115054012610,1115080039403,1115113004475,1118008012760,1118040026538,1118059016105,1118086012906,1118115042717,1121015044477,1121040023534,1121062004140,1121085034182,1121116010387,1123018070377,1123046044260,1123073063570,1123087064201,1126011041245,1126025050115,1126049009289,1126075061393,1126109065727,1129013063350,1129036011138,1129052004132,1129080036686,1129111040209,1132016030793,1132040002165,1132067026752,1132104035342,1135018109413,1135042003024,1135070060711,1135102062229,1138014077819,1138037001219,1138068005351,1140002057146,1140033028928,1140062055119,1140098008413,1143040044324,1143074050576,1146053003606,1149034019717,1152027024188,1155054018323,1160036012767,1171035015563}
correlation       | 1

After looking at them, I think I understand the reason -- the
n_distinct value for crts.data is terribly wrong. In reality it should
be ~130 million.
I already faced this problem at a certain point when doing "group by objid" and
PG was exhausting all the memory because of that wrong estimate. But
I know that estimating n_distinct is a known hard problem.

So probably that's the main reason for the problem...
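
(A direct check would be to compare the stored estimate with the real
distinct count -- the count(DISTINCT) is expensive on a table this size,
but definitive:

    SELECT count(DISTINCT objid) FROM crts.data;  -- ~130e6 in reality, vs the 42984 estimate above
)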

Regards,    Sergey



*****************************************************
Sergey E. Koposov, PhD, Research Associate
Institute of Astronomy, University of Cambridge
Madingley road, CB3 0HA, Cambridge, UK
Tel: +44-1223-337-551 Web: http://www.ast.cam.ac.uk/~koposov/

Re: bitmap scan much slower than index scan, hash_search_with_hash_value

From
Peter Geoghegan
Date:
On 2 September 2012 16:26, Sergey Koposov <koposov@ast.cam.ac.uk> wrote:
> After looking at them, I think I understand the reason -- the
> number of n_distinct for crts.data is terribly wrong. In reality it should
> be ~ 130 millions. I already faced this problem at certain point when doing
> "group by objid" and PG was excausting all the memory, because of that wrong
> estimate. But I know that it's a known hard problem to estimate n_distinct.
>
> So probably that's the main reason of a problem...

That's why we support altering that value with an ALTER TABLE...ALTER
COLUMN DDL statement. You might at least consider increasing the
statistics target for the column first though, to see if that will
make the n_distinct value better accord with reality.
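
Untested here, but the sort of thing meant is (the statistics target of
1000 and the 130e6 n_distinct figure below are just illustrative values;
pick ones that match reality):

    -- raise the per-column statistics target, then re-analyze:
    ALTER TABLE crts.data ALTER COLUMN objid SET STATISTICS 1000;
    ANALYZE crts.data (objid);

    -- or override the estimate directly; it is applied at the next ANALYZE:
    ALTER TABLE crts.data ALTER COLUMN objid SET (n_distinct = 130000000);
    ANALYZE crts.data (objid);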

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services



Re: bitmap scan much slower than index scan, hash_search_with_hash_value

From
Sergey Koposov
Date:
On Sun, 2 Sep 2012, Peter Geoghegan wrote:

> On 2 September 2012 16:26, Sergey Koposov <koposov@ast.cam.ac.uk> wrote:
>
> That's why we support altering that value with an ALTER TABLE...ALTER
> COLUMN DDL statement. You might at least consider increasing the
> statistics target for the column first though, to see if that will
> make the n_distinct value better accord with reality.

Thanks, I'd forgotten about that possibility.

After fixing n_distinct to the correct value, the index scan became the
preferred option, and the row count estimate for the index scan became
more realistic.

Sorry for the noise.    Sergey

*****************************************************
Sergey E. Koposov, PhD, Research Associate
Institute of Astronomy, University of Cambridge
Madingley road, CB3 0HA, Cambridge, UK
Tel: +44-1223-337-551 Web: http://www.ast.cam.ac.uk/~koposov/