Thread: [9.2devel] why it doesn't do index scan only?

[9.2devel] why it doesn't do index scan only?

From

hubert depesz lubaczewski

Date:

08 October 2011, 13:39:21

hi
did:
create table test as select i as id, i || ' ' || repeat('depesz', 100) as z from generate_series(1,30000000) i;
create index q on test (id);
vacuum verbose analyze test;
vacuum verbose analyze test;

then I checked that index only scans work:

$ explain analyze select id from test order by id desc limit 5;
                                                               QUERY PLAN
                 

-----------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..0.50 rows=5 width=4) (actual time=0.036..0.040 rows=5 loops=1)
   ->  Index Only Scan Backward using q on test  (cost=0.00..3029050.54 rows=30000000 width=4) (actual
time=0.033..0.034rows=5 loops=1) 
 Total runtime: 0.061 ms
(3 rows)

but when I tried it with a bit more complex query, it failed?

$ explain analyze select id from test where id =
any('{29803895,11161654,17844200,2064469,10910969,6182261,12733552,3820132,3297207,6508781,5831136,4716412,3044368,14719611,8523750,21748704,19193684,13331907,4850220,10061877}'::int4[]);
                                                                                                        QUERY PLAN
                                                                                                   

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on test  (cost=88.47..169.00 rows=20 width=4) (actual time=0.126..0.149 rows=20 loops=1)
   Recheck Cond: (id = ANY
('{29803895,11161654,17844200,2064469,10910969,6182261,12733552,3820132,3297207,6508781,5831136,4716412,3044368,14719611,8523750,21748704,19193684,13331907,4850220,10061877}'::integer[]))
   ->  Bitmap Index Scan on q  (cost=0.00..88.46 rows=20 width=0) (actual time=0.117..0.117 rows=20 loops=1)
         Index Cond: (id = ANY
('{29803895,11161654,17844200,2064469,10910969,6182261,12733552,3820132,3297207,6508781,5831136,4716412,3044368,14719611,8523750,21748704,19193684,13331907,4850220,10061877}'::integer[]))
 Total runtime: 0.177 ms
(5 rows)

it is selecting 20 rows out of 30 million. why is it:
1. not using index only scan
2. not using even normal index scan?

Best regards,

depesz

--
The best thing about modern society is how easy it is to avoid contact with it.
                                                             http://depesz.com/

Re: [9.2devel] why it doesn't do index scan only?

From

Tom Lane

Date:

08 October 2011, 13:48:01

hubert depesz lubaczewski <depesz@depesz.com> writes:
> it is selecting 20 rows out of 30 million. why is it:
> 1. not using index only scan
> 2. not using even normal index scan?

It thinks the bitmap scan is cheaper.  Whether that's true or not is not
very clear, but nobody is claiming that the costing of index-only scans
is accurate yet.

            regards, tom lane

Re: [9.2devel] why it doesn't do index scan only?

From

Tom Lane

Date:

08 October 2011, 14:04:21

I wrote:
> hubert depesz lubaczewski <depesz@depesz.com> writes:
>> it is selecting 20 rows out of 30 million. why is it:
>> 1. not using index only scan
>> 2. not using even normal index scan?

> It thinks the bitmap scan is cheaper.

No, wait, I take that back --- it can't do a plain indexscan because
ScalarArrayOp (=ANY(ARRAY)) isn't supported as a plain indexscan qual,
only as a bitmap qual.  This is because we rely on the bitmap to
eliminate duplicates.  It was never worth improving on that before;
but now that plain indexscans have a potential performance advantage,
we ought to think about ways to use ScalarArrayOp in plain indexscans.

            regards, tom lane

Re: [9.2devel] why it doesn't do index scan only?

From

pasman pasmański

Date:

08 October 2011, 14:20:56

This beaviour is normal. Bitmap index scan is faster than index scan.

2011/10/8, hubert depesz lubaczewski <depesz@depesz.com>:
> hi
> did:
> create table test as select i as id, i || ' ' || repeat('depesz', 100) as z
> from generate_series(1,30000000) i;
> create index q on test (id);
> vacuum verbose analyze test;
> vacuum verbose analyze test;
>
> then I checked that index only scans work:
>
> $ explain analyze select id from test order by id desc limit 5;
>                                                                QUERY PLAN
>
>
-----------------------------------------------------------------------------------------------------------------------------------------
>  Limit  (cost=0.00..0.50 rows=5 width=4) (actual time=0.036..0.040 rows=5
> loops=1)
>    ->  Index Only Scan Backward using q on test  (cost=0.00..3029050.54
> rows=30000000 width=4) (actual time=0.033..0.034 rows=5 loops=1)
>  Total runtime: 0.061 ms
> (3 rows)
>
> but when I tried it with a bit more complex query, it failed?
>
> $ explain analyze select id from test where id =
>
any('{29803895,11161654,17844200,2064469,10910969,6182261,12733552,3820132,3297207,6508781,5831136,4716412,3044368,14719611,8523750,21748704,19193684,13331907,4850220,10061877}'::int4[]);
>
>                            QUERY PLAN
>
>
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>  Bitmap Heap Scan on test  (cost=88.47..169.00 rows=20 width=4) (actual
> time=0.126..0.149 rows=20 loops=1)
>    Recheck Cond: (id = ANY
>
('{29803895,11161654,17844200,2064469,10910969,6182261,12733552,3820132,3297207,6508781,5831136,4716412,3044368,14719611,8523750,21748704,19193684,13331907,4850220,10061877}'::integer[]))
>    ->  Bitmap Index Scan on q  (cost=0.00..88.46 rows=20 width=0) (actual
> time=0.117..0.117 rows=20 loops=1)
>          Index Cond: (id = ANY
>
('{29803895,11161654,17844200,2064469,10910969,6182261,12733552,3820132,3297207,6508781,5831136,4716412,3044368,14719611,8523750,21748704,19193684,13331907,4850220,10061877}'::integer[]))
>  Total runtime: 0.177 ms
> (5 rows)
>
> it is selecting 20 rows out of 30 million. why is it:
> 1. not using index only scan
> 2. not using even normal index scan?
>
> Best regards,
>
> depesz
>
> --
> The best thing about modern society is how easy it is to avoid contact with
> it.
>
> http://depesz.com/
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>


--
------------
pasman

Re: [9.2devel] why it doesn't do index scan only?

From

Pavel Stehule

Date:

08 October 2011, 14:54:26

Hello

2011/10/8 Tom Lane <tgl@sss.pgh.pa.us>:
> hubert depesz lubaczewski <depesz@depesz.com> writes:
>> it is selecting 20 rows out of 30 million. why is it:
>> 1. not using index only scan
>> 2. not using even normal index scan?
>
> It thinks the bitmap scan is cheaper.  Whether that's true or not is not
> very clear, but nobody is claiming that the costing of index-only scans
> is accurate yet.
>

I did a few tests and bitmap scan is faster. Maybe there is a some
issue. In very simple test (and very syntetic test)

create table omega(a int);
insert into omega select (random()*10000)::int from generate_series(1,400000);

select count(*) from omega where a = 100;

and index scan is faster than index only scan. There is lot of
duplicates. When I used a bigger range, a speed of bitmap index, index
only scan and index scan is similar - but index scan was faster
everywhere.

Regards

Pavel Stehule


>                        regards, tom lane
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>

Re: [9.2devel] why it doesn't do index scan only?

From

Thom Brown

Date:

08 October 2011, 15:21:44

On 8 October 2011 18:53, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> Hello
>
> 2011/10/8 Tom Lane <tgl@sss.pgh.pa.us>:
>> hubert depesz lubaczewski <depesz@depesz.com> writes:
>>> it is selecting 20 rows out of 30 million. why is it:
>>> 1. not using index only scan
>>> 2. not using even normal index scan?
>>
>> It thinks the bitmap scan is cheaper.  Whether that's true or not is not
>> very clear, but nobody is claiming that the costing of index-only scans
>> is accurate yet.
>>
>
> I did a few tests and bitmap scan is faster. Maybe there is a some
> issue. In very simple test (and very syntetic test)
>
> create table omega(a int);
> insert into omega select (random()*10000)::int from generate_series(1,400000);
>
> select count(*) from omega where a = 100;
>
> and index scan is faster than index only scan. There is lot of
> duplicates. When I used a bigger range, a speed of bitmap index, index
> only scan and index scan is similar - but index scan was faster
> everywhere.

Here, index-only scan is massively faster than any other scan:

test=# explain analyse select count(thing) from stuff where thing = 14;

QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=99814.38..99814.39 rows=1 width=4) (actual
time=337.506..337.506 rows=1 loops=1)
   ->  Index Only Scan using idx_stuff_thing on stuff
(cost=0.00..99336.88 rows=191000 width=4) (actual
time=155.955..315.106 rows=196828 loops=1)
         Index Cond: (thing = 14)
 Total runtime: 337.639 ms
(4 rows)

test=# set enable_indexonlyscan to false;
SET
test=# explain analyse select count(thing) from stuff where thing = 14;

QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=99814.38..99814.39 rows=1 width=4) (actual
time=164882.528..164882.528 rows=1 loops=1)
   ->  Index Scan using idx_stuff_thing on stuff  (cost=0.00..99336.88
rows=191000 width=4) (actual time=0.184..164494.806 rows=196828
loops=1)
         Index Cond: (thing = 14)
 Total runtime: 164882.666 ms
(4 rows)

test=# set enable_indexonlyscan to false;
SET
test=# set enable_indexscan to false;
SET
test=# explain analyse select count(thing) from stuff where thing = 14;

QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=170553.91..170553.92 rows=1 width=4) (actual
time=154102.221..154102.222 rows=1 loops=1)
   ->  Bitmap Heap Scan on stuff  (cost=2004.91..170076.41 rows=191000
width=4) (actual time=482.974..153730.892 rows=196828 loops=1)
         Recheck Cond: (thing = 14)
         ->  Bitmap Index Scan on idx_stuff_thing  (cost=0.00..1957.16
rows=191000 width=0) (actual time=421.854..421.854 rows=196828
loops=1)
               Index Cond: (thing = 14)
 Total runtime: 154107.415 ms
(6 rows)

test=# set enable_indexonlyscan to false;
SET
test=# set enable_indexscan to false;
SET
test=# set enable_bitmapscan to false;
SET
test=# explain analyse select count(thing) from stuff where thing = 14;
                                                        QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=812977.50..812977.51 rows=1 width=4) (actual
time=121296.897..121296.897 rows=1 loops=1)
   ->  Seq Scan on stuff  (cost=0.00..812500.00 rows=191000 width=4)
(actual time=67.105..121215.296 rows=196828 loops=1)
         Filter: (thing = 14)
         Rows Removed by Filter: 14803172
 Total runtime: 121296.999 ms
(5 rows)

Note: buffer cache cleared between queries.

--
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [9.2devel] why it doesn't do index scan only?

From

Pavel Stehule

Date:

08 October 2011, 15:30:55

2011/10/8 Thom Brown <thom@linux.com>:
> On 8 October 2011 18:53, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>> Hello
>>
>> 2011/10/8 Tom Lane <tgl@sss.pgh.pa.us>:
>>> hubert depesz lubaczewski <depesz@depesz.com> writes:
>>>> it is selecting 20 rows out of 30 million. why is it:
>>>> 1. not using index only scan
>>>> 2. not using even normal index scan?
>>>
>>> It thinks the bitmap scan is cheaper.  Whether that's true or not is not
>>> very clear, but nobody is claiming that the costing of index-only scans
>>> is accurate yet.
>>>
>>
>> I did a few tests and bitmap scan is faster. Maybe there is a some
>> issue. In very simple test (and very syntetic test)
>>
>> create table omega(a int);
>> insert into omega select (random()*10000)::int from generate_series(1,400000);
>>
>> select count(*) from omega where a = 100;
>>
>> and index scan is faster than index only scan. There is lot of
>> duplicates. When I used a bigger range, a speed of bitmap index, index
>> only scan and index scan is similar - but index scan was faster
>> everywhere.
>
> Here, index-only scan is massively faster than any other scan:
>
> test=# explain analyse select count(thing) from stuff where thing = 14;
>
> QUERY PLAN
>
----------------------------------------------------------------------------------------------------------------------------------------------------
>  Aggregate  (cost=99814.38..99814.39 rows=1 width=4) (actual
> time=337.506..337.506 rows=1 loops=1)
>   ->  Index Only Scan using idx_stuff_thing on stuff
> (cost=0.00..99336.88 rows=191000 width=4) (actual
> time=155.955..315.106 rows=196828 loops=1)
>         Index Cond: (thing = 14)
>  Total runtime: 337.639 ms
> (4 rows)
>
> test=# set enable_indexonlyscan to false;
> SET
> test=# explain analyse select count(thing) from stuff where thing = 14;
>
> QUERY PLAN
>
------------------------------------------------------------------------------------------------------------------------------------------------
>  Aggregate  (cost=99814.38..99814.39 rows=1 width=4) (actual
> time=164882.528..164882.528 rows=1 loops=1)
>   ->  Index Scan using idx_stuff_thing on stuff  (cost=0.00..99336.88
> rows=191000 width=4) (actual time=0.184..164494.806 rows=196828
> loops=1)
>         Index Cond: (thing = 14)
>  Total runtime: 164882.666 ms
> (4 rows)
>
> test=# set enable_indexonlyscan to false;
> SET
> test=# set enable_indexscan to false;
> SET
> test=# explain analyse select count(thing) from stuff where thing = 14;
>
> QUERY PLAN
>
-----------------------------------------------------------------------------------------------------------------------------------------------
>  Aggregate  (cost=170553.91..170553.92 rows=1 width=4) (actual
> time=154102.221..154102.222 rows=1 loops=1)
>   ->  Bitmap Heap Scan on stuff  (cost=2004.91..170076.41 rows=191000
> width=4) (actual time=482.974..153730.892 rows=196828 loops=1)
>         Recheck Cond: (thing = 14)
>         ->  Bitmap Index Scan on idx_stuff_thing  (cost=0.00..1957.16
> rows=191000 width=0) (actual time=421.854..421.854 rows=196828
> loops=1)
>               Index Cond: (thing = 14)
>  Total runtime: 154107.415 ms
> (6 rows)
>
> test=# set enable_indexonlyscan to false;
> SET
> test=# set enable_indexscan to false;
> SET
> test=# set enable_bitmapscan to false;
> SET
> test=# explain analyse select count(thing) from stuff where thing = 14;
>                                                        QUERY PLAN
>
--------------------------------------------------------------------------------------------------------------------------
>  Aggregate  (cost=812977.50..812977.51 rows=1 width=4) (actual
> time=121296.897..121296.897 rows=1 loops=1)
>   ->  Seq Scan on stuff  (cost=0.00..812500.00 rows=191000 width=4)
> (actual time=67.105..121215.296 rows=196828 loops=1)
>         Filter: (thing = 14)
>         Rows Removed by Filter: 14803172
>  Total runtime: 121296.999 ms
> (5 rows)
>
> Note: buffer cache cleared between queries.

I did it. It is strange, so your times are significantly slower than I
have. Have you enabled asserts?

Pavel


>
> --
> Thom Brown
> Twitter: @darkixion
> IRC (freenode): dark_ixion
> Registered Linux user: #516935
>
> EnterpriseDB UK: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>

Re: [9.2devel] why it doesn't do index scan only?

From

Thom Brown

Date:

08 October 2011, 15:42:34

On 8 October 2011 19:30, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> 2011/10/8 Thom Brown <thom@linux.com>:
>> On 8 October 2011 18:53, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>>> Hello
>>>
>>> 2011/10/8 Tom Lane <tgl@sss.pgh.pa.us>:
>>>> hubert depesz lubaczewski <depesz@depesz.com> writes:
>>>>> it is selecting 20 rows out of 30 million. why is it:
>>>>> 1. not using index only scan
>>>>> 2. not using even normal index scan?
>>>>
>>>> It thinks the bitmap scan is cheaper.  Whether that's true or not is not
>>>> very clear, but nobody is claiming that the costing of index-only scans
>>>> is accurate yet.
>>>>
>>>
>>> I did a few tests and bitmap scan is faster. Maybe there is a some
>>> issue. In very simple test (and very syntetic test)
>>>
>>> create table omega(a int);
>>> insert into omega select (random()*10000)::int from generate_series(1,400000);
>>>
>>> select count(*) from omega where a = 100;
>>>
>>> and index scan is faster than index only scan. There is lot of
>>> duplicates. When I used a bigger range, a speed of bitmap index, index
>>> only scan and index scan is similar - but index scan was faster
>>> everywhere.
>>
>> Here, index-only scan is massively faster than any other scan:
>>
>> test=# explain analyse select count(thing) from stuff where thing = 14;
>>
>> QUERY PLAN
>>
----------------------------------------------------------------------------------------------------------------------------------------------------
>>  Aggregate  (cost=99814.38..99814.39 rows=1 width=4) (actual
>> time=337.506..337.506 rows=1 loops=1)
>>   ->  Index Only Scan using idx_stuff_thing on stuff
>> (cost=0.00..99336.88 rows=191000 width=4) (actual
>> time=155.955..315.106 rows=196828 loops=1)
>>         Index Cond: (thing = 14)
>>  Total runtime: 337.639 ms
>> (4 rows)
>>
>> test=# set enable_indexonlyscan to false;
>> SET
>> test=# explain analyse select count(thing) from stuff where thing = 14;
>>
>> QUERY PLAN
>>
------------------------------------------------------------------------------------------------------------------------------------------------
>>  Aggregate  (cost=99814.38..99814.39 rows=1 width=4) (actual
>> time=164882.528..164882.528 rows=1 loops=1)
>>   ->  Index Scan using idx_stuff_thing on stuff  (cost=0.00..99336.88
>> rows=191000 width=4) (actual time=0.184..164494.806 rows=196828
>> loops=1)
>>         Index Cond: (thing = 14)
>>  Total runtime: 164882.666 ms
>> (4 rows)
>>
>> test=# set enable_indexonlyscan to false;
>> SET
>> test=# set enable_indexscan to false;
>> SET
>> test=# explain analyse select count(thing) from stuff where thing = 14;
>>
>> QUERY PLAN
>>
-----------------------------------------------------------------------------------------------------------------------------------------------
>>  Aggregate  (cost=170553.91..170553.92 rows=1 width=4) (actual
>> time=154102.221..154102.222 rows=1 loops=1)
>>   ->  Bitmap Heap Scan on stuff  (cost=2004.91..170076.41 rows=191000
>> width=4) (actual time=482.974..153730.892 rows=196828 loops=1)
>>         Recheck Cond: (thing = 14)
>>         ->  Bitmap Index Scan on idx_stuff_thing  (cost=0.00..1957.16
>> rows=191000 width=0) (actual time=421.854..421.854 rows=196828
>> loops=1)
>>               Index Cond: (thing = 14)
>>  Total runtime: 154107.415 ms
>> (6 rows)
>>
>> test=# set enable_indexonlyscan to false;
>> SET
>> test=# set enable_indexscan to false;
>> SET
>> test=# set enable_bitmapscan to false;
>> SET
>> test=# explain analyse select count(thing) from stuff where thing = 14;
>>                                                        QUERY PLAN
>>
--------------------------------------------------------------------------------------------------------------------------
>>  Aggregate  (cost=812977.50..812977.51 rows=1 width=4) (actual
>> time=121296.897..121296.897 rows=1 loops=1)
>>   ->  Seq Scan on stuff  (cost=0.00..812500.00 rows=191000 width=4)
>> (actual time=67.105..121215.296 rows=196828 loops=1)
>>         Filter: (thing = 14)
>>         Rows Removed by Filter: 14803172
>>  Total runtime: 121296.999 ms
>> (5 rows)
>>
>> Note: buffer cache cleared between queries.
>
> I did it. It is strange, so your times are significantly slower than I
> have. Have you enabled asserts?

The table contains 15 million rows with column values randomly
selected from the 1-350 range, with 60% within the 1-50 range, and
asserts are enabled.

--
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [9.2devel] why it doesn't do index scan only?

From

Pavel Stehule

Date:

08 October 2011, 15:48:16

>> I did it. It is strange, so your times are significantly slower than I
>> have. Have you enabled asserts?
>
> The table contains 15 million rows with column values randomly
> selected from the 1-350 range, with 60% within the 1-50 range, and
> asserts are enabled.
>

Now I repeated tests on litlle bit wide table with 9 milion rows, but
without success.

Try to disable asserts. I am not sure, but maybe there significantlly
change a speed.

Pavel

Re: [9.2devel] why it doesn't do index scan only?

From

Thom Brown

Date:

08 October 2011, 16:05:00

On 8 October 2011 19:47, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>>> I did it. It is strange, so your times are significantly slower than I
>>> have. Have you enabled asserts?
>>
>> The table contains 15 million rows with column values randomly
>> selected from the 1-350 range, with 60% within the 1-50 range, and
>> asserts are enabled.
>>
>
> Now I repeated tests on litlle bit wide table with 9 milion rows, but
> without success.
>
> Try to disable asserts. I am not sure, but maybe there significantlly
> change a speed.

Okay, here you go.  Results with debug_assertions = false:

Index-only scan: 173.389 ms (78.442 ms)
Index scan: 184239.399 ms (previously 164882.666 ms)
Bitmap scan: 159354.261 ms (previously 154107.415 ms)
Sequential scan: 134552.263 ms (previously 121296.999 ms)

So no particularly significant difference, except with the index-only
scan (which I repeated 3 times and it's about the same each time).

--
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [9.2devel] why it doesn't do index scan only?

From

Thom Brown

Date:

08 October 2011, 18:41:13

On 8 October 2011 21:13, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> 2011/10/8 Thom Brown <thom@linux.com>:
>> On 8 October 2011 19:47, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>>>>> I did it. It is strange, so your times are significantly slower than I
>>>>> have. Have you enabled asserts?
>>>>
>>>> The table contains 15 million rows with column values randomly
>>>> selected from the 1-350 range, with 60% within the 1-50 range, and
>>>> asserts are enabled.
>>>>
>>>
>>> Now I repeated tests on litlle bit wide table with 9 milion rows, but
>>> without success.
>>>
>>> Try to disable asserts. I am not sure, but maybe there significantlly
>>> change a speed.
>>
>> Okay, here you go.  Results with debug_assertions = false:
>>
>> Index-only scan: 173.389 ms (78.442 ms)
>> Index scan: 184239.399 ms (previously 164882.666 ms)
>> Bitmap scan: 159354.261 ms (previously 154107.415 ms)
>> Sequential scan: 134552.263 ms (previously 121296.999 ms)
>>
>> So no particularly significant difference, except with the index-only
>> scan (which I repeated 3 times and it's about the same each time).
>
> what is size of table?

4884MB

Here's how I set it up:

create table stuff (id serial, a text, b int, c int, d text, e text, f
int, g int, h text, thing int);

insert into stuff (a,b,c,d,e,f,g,h,thing) select 'nfewiufn weoifn
weiou fnwe fnoi weuoifn woeuifniunfeiwodnjnfdoewnfjienfjioe
wneinrienrf',53424524,26575757,'fhdsouhfuisdhfiudshif sdhiuf iu hfius
dhfiu sdiuf dsihfiusdhiufhdsiufhdsiu fhiuds fhiusd fiuds fidsf iuds
fiu dsiuf
sdhiu','dfoisdhjfojdsoifjdiosjfoidsjiofjds',52944234,5246463,'fdfidsjoifjdsoifjoidsjfodsjfd',ceil(random()*350)
from generate_series(1,6000000);

insert into stuff (a,b,c,d,e,f,g,h,thing) select 'nfewiufn weoifn
weiou fnwe fnoi weuoifn woeuifniunfeiwodnjnfdoewnfjienfjioe
wneinrienrf',53424524,26575757,'fhdsouhfuisdhfiudshif sdhiuf iu hfius
dhfiu sdiuf dsihfiusdhiufhdsiufhdsiu fhiuds fhiusd fiuds fidsf iuds
fiu dsiuf
sdhiu','dfoisdhjfojdsoifjdiosjfoidsjiofjds',52944234,5246463,'fdfidsjoifjdsoifjoidsjfodsjfd',ceil(random()*50)
from generate_series(1,9000000);

create index idx_stuff_thing on stuff (thing);

vacuum analyse stuff;

Testing without all the extra columns in the table yields a similar
result pattern.

--
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [9.2devel] why it doesn't do index scan only?

From

Pavel Stehule

Date:

08 October 2011, 19:08:25

2011/10/8 Thom Brown <thom@linux.com>:
> On 8 October 2011 19:47, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>>>> I did it. It is strange, so your times are significantly slower than I
>>>> have. Have you enabled asserts?
>>>
>>> The table contains 15 million rows with column values randomly
>>> selected from the 1-350 range, with 60% within the 1-50 range, and
>>> asserts are enabled.
>>>
>>
>> Now I repeated tests on litlle bit wide table with 9 milion rows, but
>> without success.
>>
>> Try to disable asserts. I am not sure, but maybe there significantlly
>> change a speed.
>
> Okay, here you go.  Results with debug_assertions = false:
>
> Index-only scan: 173.389 ms (78.442 ms)
> Index scan: 184239.399 ms (previously 164882.666 ms)
> Bitmap scan: 159354.261 ms (previously 154107.415 ms)
> Sequential scan: 134552.263 ms (previously 121296.999 ms)
>
> So no particularly significant difference, except with the index-only
> scan (which I repeated 3 times and it's about the same each time).

what is size of table?

It is mystic - I created 1.5GB long table, Everywhere I restarted
postgres and I did a reset of system cache and still I have index only
scan little bit slower than index scan.

I use

[root@nemesis pavel]# uname -a
Linux nemesis 2.6.35.14-97.fc14.i686 #1 SMP Sat Sep 17 00:34:02 UTC
2011 i686 i686 i386 GNU/Linux

Notebook Dell D830 - 2GB RAM, PostgreSQL used with default
configuration - 24MB Shared buffers

[pavel@nemesis ~]$ bonnie++
Writing a byte at a time...done
Writing intelligently...done
Rewriting...done
Reading a byte at a time...done
Reading intelligently...done
start 'em...done...done...done...done...done...
Create files in sequential order...done.
Stat files in sequential order...done.
Delete files in sequential order...done.
Create files in random order...done.
Stat files in random order...done.
Delete files in random order...done.
Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
nemesis          4G   170  99 38853  10 17292   6  1283  94 41249   7 140.2   6
Latency             56039us    1374ms    1647ms   93214us     365ms     429ms
Version  1.96       ------Sequential Create------ --------Random Create--------
nemesis             -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16  9777  47 +++++ +++ 20363  39 11331  57 +++++ +++ 23478  45
Latency             91217us    1079us    1147us     200us    1148us     199us

1.96,1.96,nemesis,1,1318123502,4G,,170,99,38853,10,17292,6,1283,94,41249,7,140.2,6,16,,,,,9777,47,+++++,+++,20363,39,11331,57,+++++,+++,23478,45,56039us,1374ms,1647ms,93214us,365ms,429ms,91217us,1079us,1147us,200us,1148us,199us




>
> --
> Thom Brown
> Twitter: @darkixion
> IRC (freenode): dark_ixion
> Registered Linux user: #516935
>
> EnterpriseDB UK: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>

Re: [9.2devel] why it doesn't do index scan only?

From

Pavel Stehule

Date:

09 October 2011, 00:36:15

2011/10/8 Thom Brown <thom@linux.com>:
> On 8 October 2011 21:13, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>> 2011/10/8 Thom Brown <thom@linux.com>:
>>> On 8 October 2011 19:47, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>>>>>> I did it. It is strange, so your times are significantly slower than I
>>>>>> have. Have you enabled asserts?
>>>>>
>>>>> The table contains 15 million rows with column values randomly
>>>>> selected from the 1-350 range, with 60% within the 1-50 range, and
>>>>> asserts are enabled.
>>>>>
>>>>
>>>> Now I repeated tests on litlle bit wide table with 9 milion rows, but
>>>> without success.
>>>>
>>>> Try to disable asserts. I am not sure, but maybe there significantlly
>>>> change a speed.
>>>
>>> Okay, here you go.  Results with debug_assertions = false:
>>>
>>> Index-only scan: 173.389 ms (78.442 ms)
>>> Index scan: 184239.399 ms (previously 164882.666 ms)
>>> Bitmap scan: 159354.261 ms (previously 154107.415 ms)
>>> Sequential scan: 134552.263 ms (previously 121296.999 ms)
>>>
>>> So no particularly significant difference, except with the index-only
>>> scan (which I repeated 3 times and it's about the same each time).
>>
>> what is size of table?
>
> 4884MB

It has a sense - index only scan  it is faster (and significantly
faster) on wider tables - or tables with strings where TOAST is not
active. Maybe there is a some issue because on thin tables is slower
(and I expect a should be faster everywhere).

Regards

Pavel Stehule

>
> Here's how I set it up:
>
> create table stuff (id serial, a text, b int, c int, d text, e text, f
> int, g int, h text, thing int);
>
> insert into stuff (a,b,c,d,e,f,g,h,thing) select 'nfewiufn weoifn
> weiou fnwe fnoi weuoifn woeuifniunfeiwodnjnfdoewnfjienfjioe
> wneinrienrf',53424524,26575757,'fhdsouhfuisdhfiudshif sdhiuf iu hfius
> dhfiu sdiuf dsihfiusdhiufhdsiufhdsiu fhiuds fhiusd fiuds fidsf iuds
> fiu dsiuf
sdhiu','dfoisdhjfojdsoifjdiosjfoidsjiofjds',52944234,5246463,'fdfidsjoifjdsoifjoidsjfodsjfd',ceil(random()*350)
> from generate_series(1,6000000);
>
> insert into stuff (a,b,c,d,e,f,g,h,thing) select 'nfewiufn weoifn
> weiou fnwe fnoi weuoifn woeuifniunfeiwodnjnfdoewnfjienfjioe
> wneinrienrf',53424524,26575757,'fhdsouhfuisdhfiudshif sdhiuf iu hfius
> dhfiu sdiuf dsihfiusdhiufhdsiufhdsiu fhiuds fhiusd fiuds fidsf iuds
> fiu dsiuf
sdhiu','dfoisdhjfojdsoifjdiosjfoidsjiofjds',52944234,5246463,'fdfidsjoifjdsoifjoidsjfodsjfd',ceil(random()*50)
> from generate_series(1,9000000);
>
> create index idx_stuff_thing on stuff (thing);
>
> vacuum analyse stuff;
>
> Testing without all the extra columns in the table yields a similar
> result pattern.
>
> --
> Thom Brown
> Twitter: @darkixion
> IRC (freenode): dark_ixion
> Registered Linux user: #516935
>
> EnterpriseDB UK: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>

Re: [9.2devel] why it doesn't do index scan only?

From

Thom Brown

Date:

09 October 2011, 07:39:41

On 9 October 2011 04:35, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> 2011/10/8 Thom Brown <thom@linux.com>:
>> On 8 October 2011 21:13, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>>> 2011/10/8 Thom Brown <thom@linux.com>:
>>>> On 8 October 2011 19:47, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>>>>>>> I did it. It is strange, so your times are significantly slower than I
>>>>>>> have. Have you enabled asserts?
>>>>>>
>>>>>> The table contains 15 million rows with column values randomly
>>>>>> selected from the 1-350 range, with 60% within the 1-50 range, and
>>>>>> asserts are enabled.
>>>>>>
>>>>>
>>>>> Now I repeated tests on litlle bit wide table with 9 milion rows, but
>>>>> without success.
>>>>>
>>>>> Try to disable asserts. I am not sure, but maybe there significantlly
>>>>> change a speed.
>>>>
>>>> Okay, here you go.  Results with debug_assertions = false:
>>>>
>>>> Index-only scan: 173.389 ms (78.442 ms)
>>>> Index scan: 184239.399 ms (previously 164882.666 ms)
>>>> Bitmap scan: 159354.261 ms (previously 154107.415 ms)
>>>> Sequential scan: 134552.263 ms (previously 121296.999 ms)
>>>>
>>>> So no particularly significant difference, except with the index-only
>>>> scan (which I repeated 3 times and it's about the same each time).
>>>
>>> what is size of table?
>>
>> 4884MB
>
> It has a sense - index only scan  it is faster (and significantly
> faster) on wider tables - or tables with strings where TOAST is not
> active. Maybe there is a some issue because on thin tables is slower
> (and I expect a should be faster everywhere).

No, that's my point, I re-tested it on a table with just 2 int
columns, and the results are roughly the same.  I added all the
columns to make it expensive to fetch the  column being queried.

--
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [9.2devel] why it doesn't do index scan only?

From

Pavel Stehule

Date:

09 October 2011, 07:52:20

2011/10/9 Thom Brown <thom@linux.com>:
> On 9 October 2011 04:35, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>> 2011/10/8 Thom Brown <thom@linux.com>:
>>> On 8 October 2011 21:13, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>>>> 2011/10/8 Thom Brown <thom@linux.com>:
>>>>> On 8 October 2011 19:47, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>>>>>>>> I did it. It is strange, so your times are significantly slower than I
>>>>>>>> have. Have you enabled asserts?
>>>>>>>
>>>>>>> The table contains 15 million rows with column values randomly
>>>>>>> selected from the 1-350 range, with 60% within the 1-50 range, and
>>>>>>> asserts are enabled.
>>>>>>>
>>>>>>
>>>>>> Now I repeated tests on litlle bit wide table with 9 milion rows, but
>>>>>> without success.
>>>>>>
>>>>>> Try to disable asserts. I am not sure, but maybe there significantlly
>>>>>> change a speed.
>>>>>
>>>>> Okay, here you go.  Results with debug_assertions = false:
>>>>>
>>>>> Index-only scan: 173.389 ms (78.442 ms)
>>>>> Index scan: 184239.399 ms (previously 164882.666 ms)
>>>>> Bitmap scan: 159354.261 ms (previously 154107.415 ms)
>>>>> Sequential scan: 134552.263 ms (previously 121296.999 ms)
>>>>>
>>>>> So no particularly significant difference, except with the index-only
>>>>> scan (which I repeated 3 times and it's about the same each time).
>>>>
>>>> what is size of table?
>>>
>>> 4884MB
>>
>> It has a sense - index only scan  it is faster (and significantly
>> faster) on wider tables - or tables with strings where TOAST is not
>> active. Maybe there is a some issue because on thin tables is slower
>> (and I expect a should be faster everywhere).
>
> No, that's my point, I re-tested it on a table with just 2 int
> columns, and the results are roughly the same.  I added all the
> columns to make it expensive to fetch the  column being queried.

then I don't understand

Regards

Pavel

>
> --
> Thom Brown
> Twitter: @darkixion
> IRC (freenode): dark_ixion
> Registered Linux user: #516935
>
> EnterpriseDB UK: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>

Re: [9.2devel] why it doesn't do index scan only?

From

Thom Brown

Date:

09 October 2011, 08:55:24

On 9 October 2011 11:51, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> 2011/10/9 Thom Brown <thom@linux.com>:
>> On 9 October 2011 04:35, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>>> 2011/10/8 Thom Brown <thom@linux.com>:
>>>> On 8 October 2011 21:13, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>>>>> 2011/10/8 Thom Brown <thom@linux.com>:
>>>>>> On 8 October 2011 19:47, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>>>>>>>>> I did it. It is strange, so your times are significantly slower than I
>>>>>>>>> have. Have you enabled asserts?
>>>>>>>>
>>>>>>>> The table contains 15 million rows with column values randomly
>>>>>>>> selected from the 1-350 range, with 60% within the 1-50 range, and
>>>>>>>> asserts are enabled.
>>>>>>>>
>>>>>>>
>>>>>>> Now I repeated tests on litlle bit wide table with 9 milion rows, but
>>>>>>> without success.
>>>>>>>
>>>>>>> Try to disable asserts. I am not sure, but maybe there significantlly
>>>>>>> change a speed.
>>>>>>
>>>>>> Okay, here you go.  Results with debug_assertions = false:
>>>>>>
>>>>>> Index-only scan: 173.389 ms (78.442 ms)
>>>>>> Index scan: 184239.399 ms (previously 164882.666 ms)
>>>>>> Bitmap scan: 159354.261 ms (previously 154107.415 ms)
>>>>>> Sequential scan: 134552.263 ms (previously 121296.999 ms)
>>>>>>
>>>>>> So no particularly significant difference, except with the index-only
>>>>>> scan (which I repeated 3 times and it's about the same each time).
>>>>>
>>>>> what is size of table?
>>>>
>>>> 4884MB
>>>
>>> It has a sense - index only scan  it is faster (and significantly
>>> faster) on wider tables - or tables with strings where TOAST is not
>>> active. Maybe there is a some issue because on thin tables is slower
>>> (and I expect a should be faster everywhere).
>>
>> No, that's my point, I re-tested it on a table with just 2 int
>> columns, and the results are roughly the same.  I added all the
>> columns to make it expensive to fetch the  column being queried.
>
> then I don't understand

Well here's some more reliable results since each has been run 3
times.  The row size in this sample is a 10th of previous ones (i.e.
1.5 million rows):

-- Narrow table (table with 2 int columns) --

Assertions = on

Index-only scan = 6.088, 5.885, 6.361
Index scan = 65.661, 63.441, 64.105
Bitmap scan = 68.448, 66.476, 64.626
Sequential scan = 244.129, 239.584, 242.680

Assertions = off

Index-only scan = 6.710, 6.709, 6.192
Index scan = 66.838, 67.534, 64.348
Bitmap scan = 75.662, 64.500, 69.080
Sequential scan = 232.065, 231.366, 231.547


-- Wide table (table as described in earlier post) --

Assertions = on

Index-only scan = 7.313, 7.299, 6.401
Index scan = 193.555, 186.564, 198.150
Bitmap scan = 199.082, 204.664, 207.902
Sequential scan = 643.765, 645.426, 621.150

Assertions = off

Index-only scan = 7.569, 6.477, 6.113
Index scan = 197.332, 197.286, 204.257
Bitmap scan = 207.838, 202.235, 203.322
Sequential scan = 653.739, 633.309, 619.081

Bitmap scans tend not to be faster than index scans in this scenario.
These numbers are produced by HEAD as of this morning.

--
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [9.2devel] why it doesn't do index scan only?

From

Tom Lane

Date:

09 October 2011, 13:56:22

Pavel Stehule <pavel.stehule@gmail.com> writes:
> 2011/10/9 Thom Brown <thom@linux.com>:
>> On 9 October 2011 04:35, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>>> It has a sense - index only scan  it is faster (and significantly
>>> faster) on wider tables - or tables with strings where TOAST is not
>>> active. Maybe there is a some issue because on thin tables is slower
>>> (and I expect a should be faster everywhere).

>> No, that's my point, I re-tested it on a table with just 2 int
>> columns, and the results are roughly the same.  I added all the
>> columns to make it expensive to fetch the  column being queried.

> then I don't understand

Are you sure you've remembered to vacuum the test table?  I get results
like yours (ie, no speed benefit for index-only scan) if the table
doesn't have its visibility-map bits set.

            regards, tom lane

Re: [9.2devel] why it doesn't do index scan only?

From

Pavel Stehule

Date:

09 October 2011, 14:02:52

2011/10/9 Tom Lane <tgl@sss.pgh.pa.us>:
> Pavel Stehule <pavel.stehule@gmail.com> writes:
>> 2011/10/9 Thom Brown <thom@linux.com>:
>>> On 9 October 2011 04:35, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>>>> It has a sense - index only scan  it is faster (and significantly
>>>> faster) on wider tables - or tables with strings where TOAST is not
>>>> active. Maybe there is a some issue because on thin tables is slower
>>>> (and I expect a should be faster everywhere).
>
>>> No, that's my point, I re-tested it on a table with just 2 int
>>> columns, and the results are roughly the same.  I added all the
>>> columns to make it expensive to fetch the  column being queried.
>
>> then I don't understand
>
> Are you sure you've remembered to vacuum the test table?  I get results
> like yours (ie, no speed benefit for index-only scan) if the table
> doesn't have its visibility-map bits set.

it should be - I didn't do VACUUM


Regards

Pavel
>
>                        regards, tom lane
>

Re: [9.2devel] why it doesn't do index scan only?

From

Pavel Stehule

Date:

09 October 2011, 14:39:18

2011/10/9 Pavel Stehule <pavel.stehule@gmail.com>:
> 2011/10/9 Tom Lane <tgl@sss.pgh.pa.us>:
>> Pavel Stehule <pavel.stehule@gmail.com> writes:
>>> 2011/10/9 Thom Brown <thom@linux.com>:
>>>> On 9 October 2011 04:35, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>>>>> It has a sense - index only scan  it is faster (and significantly
>>>>> faster) on wider tables - or tables with strings where TOAST is not
>>>>> active. Maybe there is a some issue because on thin tables is slower
>>>>> (and I expect a should be faster everywhere).
>>
>>>> No, that's my point, I re-tested it on a table with just 2 int
>>>> columns, and the results are roughly the same.  I added all the
>>>> columns to make it expensive to fetch the  column being queried.
>>
>>> then I don't understand
>>
>> Are you sure you've remembered to vacuum the test table?  I get results
>> like yours (ie, no speed benefit for index-only scan) if the table
>> doesn't have its visibility-map bits set.
>
> it should be - I didn't do VACUUM
>

yes, After VACUUM I got a significantly better times - index only scan
is about 5-6x better

Regards

Pavel Stehule

>
> Regards
>
> Pavel
>>
>>                        regards, tom lane
>>
>

Re: [9.2devel] why it doesn't do index scan only?

From

Thom Brown

Date:

09 October 2011, 17:41:40

On 9 October 2011 18:38, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> 2011/10/9 Pavel Stehule <pavel.stehule@gmail.com>:
>> 2011/10/9 Tom Lane <tgl@sss.pgh.pa.us>:
>>> Pavel Stehule <pavel.stehule@gmail.com> writes:
>>>> 2011/10/9 Thom Brown <thom@linux.com>:
>>>>> On 9 October 2011 04:35, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>>>>>> It has a sense - index only scan  it is faster (and significantly
>>>>>> faster) on wider tables - or tables with strings where TOAST is not
>>>>>> active. Maybe there is a some issue because on thin tables is slower
>>>>>> (and I expect a should be faster everywhere).
>>>
>>>>> No, that's my point, I re-tested it on a table with just 2 int
>>>>> columns, and the results are roughly the same.  I added all the
>>>>> columns to make it expensive to fetch the  column being queried.
>>>
>>>> then I don't understand
>>>
>>> Are you sure you've remembered to vacuum the test table?  I get results
>>> like yours (ie, no speed benefit for index-only scan) if the table
>>> doesn't have its visibility-map bits set.
>>
>> it should be - I didn't do VACUUM
>>
>
> yes, After VACUUM I got a significantly better times - index only scan
> is about 5-6x better

Something that I was expecting the planner to do with this patch,
which it doesn't, is pull in the index for queries like:

SELECT count(*) from my_table;

or

SELECT sum(indexed_column) from my_table;

I don't see why a non-partial index can't fulfill these queries.  I
can only get index-only scans with WHERE conditions.

--
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company