Re: bitmap scan much slower than index scan, hash_search_with_hash_value - Mailing list pgsql-hackers

From Sergey Koposov
Subject Re: bitmap scan much slower than index scan, hash_search_with_hash_value
Date
Msg-id alpine.LRH.2.02.1209021242510.25232@calx046.ast.cam.ac.uk
Whole thread Raw
In response to Re: bitmap scan much slower than index scan, hash_search_with_hash_value  (Pavel Stehule <pavel.stehule@gmail.com>)
List pgsql-hackers
On Sun, 2 Sep 2012, Pavel Stehule wrote:

>
> statistics on data_objid_idx  table are absolutly out - so planner
> cannot find optimal plan

That doesn't have anything to do with the problem, AFAIU.
First, the data table is static and was analysed.
Second, the query in question is the join, and afaik the estimation of the 
number of rows is known to be incorrect, in the case of column 
correlation.
Third, according at least to my understanding in the fully cached regime 
bitmap scan should not take two orders of magnitude more CPU time than 
index scan.
    Sergey
>
> Regard
>
> Pavel Stehule
>
>>                            Index Cond: (objid = t.objid)
>>  Total runtime: 66622.026 ms
>> (11 rows)
>>
>> Here is the output when bitmap scans are disabled:
>> QUERY PLAN
>>                                                                      QUERY
>> PLAN
>>
----------------------------------------------------------------------------------------------------------------------------------------------------
>>  Limit  (cost=0.00..329631941.65 rows=10000 width=68) (actual
>> time=0.082..906.876 rows=10000 loops=1)
>>    ->  Nested Loop  (cost=0.00..4979486036.95 rows=151062 width=68) (actual
>> time=0.081..905.683 rows=10000 loops=1)
>>          Join Filter: (t.mjd = d1.mjd)
>>          ->  Seq Scan on test t  (cost=0.00..2632.77 rows=151677 width=28)
>> (actual time=0.009..1.679 rows=11456 loops=1)
>>          ->  Index Scan using data_objid_idx on data d1
>> (cost=0.00..26603.32 rows=415080 width=40) (actual time=0.010..0.050
>> rows=248 loops=11456)
>>                Index Cond: (objid = t.objid)
>>  Total runtime: 907.462 ms
>>
>> When the bitmap scans are enabled the "prof" of postgres shows
>>     47.10%  postmaster  postgres           [.] hash_search_with_hash_value
>>             |
>>             --- hash_search_with_hash_value
>>
>>     11.06%  postmaster  postgres           [.] hash_seq_search
>>             |
>>             --- hash_seq_search
>>
>>      6.95%  postmaster  postgres           [.] hash_any
>>             |
>>             --- hash_any
>>
>>      5.17%  postmaster  postgres           [.] _bt_checkkeys
>>             |
>>             --- _bt_checkkeys
>>
>>      4.07%  postmaster  postgres           [.] tbm_add_tuples
>>             |
>>             --- tbm_add_tuples
>>
>>      3.41%  postmaster  postgres           [.] hash_search
>>             |
>>             --- hash_search
>>
>>
>> And the last note is that the crts.data table which is being bitmap scanned
>> is a 1.1Tb table with ~ 20e9 rows. My feeling is that the bitmap index scan
>> code
>> is somehow unprepared to combine two bitmaps for such a big table, and this
>> leads to the terrible performance.
>>
>> Regards,
>>         Sergey
>>
>> PS Here are the schemas of the tables, just in case:
>> wsdb=> \d test
>>           Table "koposov.test"
>>  Column  |       Type       | Modifiers
>> ---------+------------------+-----------
>>  mjd     | double precision |
>>  fieldid | bigint           |
>>  intmag  | integer          |
>>  objid   | bigint           |
>>
>> wsdb=> \d crts.data
>>            Table "crts.data"
>>  Column |       Type       | Modifiers
>> --------+------------------+-----------
>>  objid  | bigint           |
>>  mjd    | double precision |
>>  mag    | real             |
>>  emag   | real             |
>>  ra     | double precision |
>>  dec    | double precision |
>> Indexes:
>>     "data_mjd_idx" btree (mjd) WITH (fillfactor=100)
>>     "data_objid_idx" btree (objid) WITH (fillfactor=100)
>>     "data_q3c_ang2ipix_idx" btree (q3c_ang2ipix(ra, "dec")) WITH
>> (fillfactor=100)
>>
>> PPS shared_buffers=10GB, work_mem=1GB
>> All the test shown here were don in fully cached regime.
>>
>> PPS I can believe that what I'm seeing is a feature, not a bug of bitmap
>> scans,
>> and I can live with disabling them, but I still thought it's worth
>> reporting.
>>
>>
>> *****************************************************
>> Sergey E. Koposov, PhD, Research Associate
>> Institute of Astronomy, University of Cambridge
>> Madingley road, CB3 0HA, Cambridge, UK
>> Tel: +44-1223-337-551
>>
>>
>> --
>> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-hackers
>

*****************************************************
Sergey E. Koposov, PhD, Research Associate
Institute of Astronomy, University of Cambridge
Madingley road, CB3 0HA, Cambridge, UK
Tel: +44-1223-337-551 Web: http://www.ast.cam.ac.uk/~koposov/



pgsql-hackers by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: Yet another failure mode in pg_upgrade
Next
From: Bruce Momjian
Date:
Subject: Re: [GENERAL] Multiple Slave Failover with PITR