Re: [9.2devel] why it doesn't do index scan only? - Mailing list pgsql-general

From Thom Brown
Subject Re: [9.2devel] why it doesn't do index scan only?
Date
Msg-id CAA-aLv7TB5EC=VhT561VhguM2K+guYE8iGGG0L03oexPo9QHug@mail.gmail.com
Whole thread Raw
In response to Re: [9.2devel] why it doesn't do index scan only?  (Pavel Stehule <pavel.stehule@gmail.com>)
Responses Re: [9.2devel] why it doesn't do index scan only?  (Pavel Stehule <pavel.stehule@gmail.com>)
List pgsql-general
On 8 October 2011 19:30, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> 2011/10/8 Thom Brown <thom@linux.com>:
>> On 8 October 2011 18:53, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>>> Hello
>>>
>>> 2011/10/8 Tom Lane <tgl@sss.pgh.pa.us>:
>>>> hubert depesz lubaczewski <depesz@depesz.com> writes:
>>>>> it is selecting 20 rows out of 30 million. why is it:
>>>>> 1. not using index only scan
>>>>> 2. not using even normal index scan?
>>>>
>>>> It thinks the bitmap scan is cheaper.  Whether that's true or not is not
>>>> very clear, but nobody is claiming that the costing of index-only scans
>>>> is accurate yet.
>>>>
>>>
>>> I did a few tests and bitmap scan is faster. Maybe there is a some
>>> issue. In very simple test (and very syntetic test)
>>>
>>> create table omega(a int);
>>> insert into omega select (random()*10000)::int from generate_series(1,400000);
>>>
>>> select count(*) from omega where a = 100;
>>>
>>> and index scan is faster than index only scan. There is lot of
>>> duplicates. When I used a bigger range, a speed of bitmap index, index
>>> only scan and index scan is similar - but index scan was faster
>>> everywhere.
>>
>> Here, index-only scan is massively faster than any other scan:
>>
>> test=# explain analyse select count(thing) from stuff where thing = 14;
>>
>> QUERY PLAN
>>
----------------------------------------------------------------------------------------------------------------------------------------------------
>>  Aggregate  (cost=99814.38..99814.39 rows=1 width=4) (actual
>> time=337.506..337.506 rows=1 loops=1)
>>   ->  Index Only Scan using idx_stuff_thing on stuff
>> (cost=0.00..99336.88 rows=191000 width=4) (actual
>> time=155.955..315.106 rows=196828 loops=1)
>>         Index Cond: (thing = 14)
>>  Total runtime: 337.639 ms
>> (4 rows)
>>
>> test=# set enable_indexonlyscan to false;
>> SET
>> test=# explain analyse select count(thing) from stuff where thing = 14;
>>
>> QUERY PLAN
>>
------------------------------------------------------------------------------------------------------------------------------------------------
>>  Aggregate  (cost=99814.38..99814.39 rows=1 width=4) (actual
>> time=164882.528..164882.528 rows=1 loops=1)
>>   ->  Index Scan using idx_stuff_thing on stuff  (cost=0.00..99336.88
>> rows=191000 width=4) (actual time=0.184..164494.806 rows=196828
>> loops=1)
>>         Index Cond: (thing = 14)
>>  Total runtime: 164882.666 ms
>> (4 rows)
>>
>> test=# set enable_indexonlyscan to false;
>> SET
>> test=# set enable_indexscan to false;
>> SET
>> test=# explain analyse select count(thing) from stuff where thing = 14;
>>
>> QUERY PLAN
>>
-----------------------------------------------------------------------------------------------------------------------------------------------
>>  Aggregate  (cost=170553.91..170553.92 rows=1 width=4) (actual
>> time=154102.221..154102.222 rows=1 loops=1)
>>   ->  Bitmap Heap Scan on stuff  (cost=2004.91..170076.41 rows=191000
>> width=4) (actual time=482.974..153730.892 rows=196828 loops=1)
>>         Recheck Cond: (thing = 14)
>>         ->  Bitmap Index Scan on idx_stuff_thing  (cost=0.00..1957.16
>> rows=191000 width=0) (actual time=421.854..421.854 rows=196828
>> loops=1)
>>               Index Cond: (thing = 14)
>>  Total runtime: 154107.415 ms
>> (6 rows)
>>
>> test=# set enable_indexonlyscan to false;
>> SET
>> test=# set enable_indexscan to false;
>> SET
>> test=# set enable_bitmapscan to false;
>> SET
>> test=# explain analyse select count(thing) from stuff where thing = 14;
>>                                                        QUERY PLAN
>>
--------------------------------------------------------------------------------------------------------------------------
>>  Aggregate  (cost=812977.50..812977.51 rows=1 width=4) (actual
>> time=121296.897..121296.897 rows=1 loops=1)
>>   ->  Seq Scan on stuff  (cost=0.00..812500.00 rows=191000 width=4)
>> (actual time=67.105..121215.296 rows=196828 loops=1)
>>         Filter: (thing = 14)
>>         Rows Removed by Filter: 14803172
>>  Total runtime: 121296.999 ms
>> (5 rows)
>>
>> Note: buffer cache cleared between queries.
>
> I did it. It is strange, so your times are significantly slower than I
> have. Have you enabled asserts?

The table contains 15 million rows with column values randomly
selected from the 1-350 range, with 60% within the 1-50 range, and
asserts are enabled.

--
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

pgsql-general by date:

Previous
From: "Francisco Figueiredo Jr."
Date:
Subject: Re: How to add xml data to table
Next
From: Pavel Stehule
Date:
Subject: Re: [9.2devel] why it doesn't do index scan only?