Thread: [GENERAL] Help on Index only scan

[GENERAL] Help on Index only scan

From
Ertan Küçükoğlu
Date:
Hello,

My table details:
robox=# \dS+ updates
                                                   Table "public.updates"
    Column     |  Type   |                         Modifiers
| Storage  | Stats target | Description
---------------+---------+--------------------------------------------------
---------+----------+--------------+-------------
 autoinc       | integer | not null default
nextval('updates_autoinc_seq'::regclass) | plain    |              |
 filename      | text    |
| extended |              |
 dateofrelease | date    |
| plain    |              |
 fileversion   | text    |
| extended |              |
 afile         | text    |
| extended |              |
 filehash      | text    |
| extended |              |
 active        | boolean |
| plain    |              |
Indexes:
    "updates_pkey" PRIMARY KEY, btree (autoinc)
    "update_filename" btree (filename)
    "updates_autoinc" btree (autoinc DESC)
    "updates_dateofrelease" btree (dateofrelease)
    "updates_filename_dateofrelease" btree (filename, dateofrelease)


robox=# select count(autoinc) from updates;
 count
-------
  2003
(1 row)

robox=# select autoinc, filename, fileversion from updates limit 10;
 autoinc |             filename             | fileversion
---------+----------------------------------+-------------
      18 | Robox.exe                        | 1.0.1.218
      19 | Robox.exe                        | 1.0.1.220
      20 | Robox.exe                        | 1.0.1.220
      21 | 8423bfc5a669864f9b66b6b15ce908b9 | 1.1.1.1
      22 | 4fdabb0c7adbc5a89fbe679ce76ccef9 | 1.1.1.1
      23 | f469d77bfa86c8917c7846c0f871137c | 1.1.1.1
      24 | bc10af4c8789718a9ca6565ea14cb17d | 1.1.1.1
      25 | d9f87ee46cdb41cd15c2f71ed599faf9 | 1.1.1.1
      26 | 6f7428a5364aae1d5914a66cba3e6f3b | 1.1.1.1
      27 | 66ec4cdb8d64ca1414f75c1fb9eaa518 | 1.1.1.1
(10 rows)

I want to have an index only scan for my below query:
select autoinc, fileversion from updates where filename = 'Robox.exe' order
by autoinc desc;

I simply could not understand planner and cannot provide right index for it.
Below index names "update_filename" and "updates_autoinc" are added just for
the query that I would like to have a index only scan plan. I also failed
with following indexes
"autoinc desc, filename, fileversion"
"autoinc desc, filename"

First 3 rows in above select results are actual data. You will find that I
have inserted about 2000 rows of dummy data to have somewhat meaningful plan
for the query.

Current planner result:
robox=# vacuum full;
VACUUM
robox=# explain analyze
robox-# select autoinc, fileversion
robox-# from updates
robox-# where filename = 'Robox.exe'
robox-# order by autoinc desc;
                                                          QUERY PLAN
----------------------------------------------------------------------------
--------------------------------------------------
 Sort  (cost=12.79..12.79 rows=3 width=12) (actual time=0.047..0.047 rows=3
loops=1)
   Sort Key: autoinc DESC
   Sort Method: quicksort  Memory: 25kB
   ->  Bitmap Heap Scan on updates  (cost=4.30..12.76 rows=3 width=12)
(actual time=0.040..0.040 rows=3 loops=1)
         Recheck Cond: (filename = 'Robox.exe'::text)
         Heap Blocks: exact=1
         ->  Bitmap Index Scan on update_filename  (cost=0.00..4.30 rows=3
width=0) (actual time=0.035..0.035 rows=3 loops=1)
               Index Cond: (filename = 'Robox.exe'::text)
 Planning time: 1.873 ms
 Execution time: 0.076 ms
(10 rows)


I appreciate any help on having right index(es) as I simply failed myself.

Regards,
Ertan Küçükoğlu




Re: [GENERAL] Help on Index only scan

From
Melvin Davidson
Date:

On Sun, Aug 13, 2017 at 5:59 PM, Ertan Küçükoğlu <ertan.kucukoglu@1nar.com.tr> wrote:
Hello,

My table details:
robox=# \dS+ updates
                                                   Table "public.updates"
    Column     |  Type   |                         Modifiers
| Storage  | Stats target | Description
---------------+---------+--------------------------------------------------
---------+----------+--------------+-------------
 autoinc       | integer | not null default
nextval('updates_autoinc_seq'::regclass) | plain    |              |
 filename      | text    |
| extended |              |
 dateofrelease | date    |
| plain    |              |
 fileversion   | text    |
| extended |              |
 afile         | text    |
| extended |              |
 filehash      | text    |
| extended |              |
 active        | boolean |
| plain    |              |
Indexes:
    "updates_pkey" PRIMARY KEY, btree (autoinc)
    "update_filename" btree (filename)
    "updates_autoinc" btree (autoinc DESC)
    "updates_dateofrelease" btree (dateofrelease)
    "updates_filename_dateofrelease" btree (filename, dateofrelease)


robox=# select count(autoinc) from updates;
 count
-------
  2003
(1 row)

robox=# select autoinc, filename, fileversion from updates limit 10;
 autoinc |             filename             | fileversion
---------+----------------------------------+-------------
      18 | Robox.exe                        | 1.0.1.218
      19 | Robox.exe                        | 1.0.1.220
      20 | Robox.exe                        | 1.0.1.220
      21 | 8423bfc5a669864f9b66b6b15ce908b9 | 1.1.1.1
      22 | 4fdabb0c7adbc5a89fbe679ce76ccef9 | 1.1.1.1
      23 | f469d77bfa86c8917c7846c0f871137c | 1.1.1.1
      24 | bc10af4c8789718a9ca6565ea14cb17d | 1.1.1.1
      25 | d9f87ee46cdb41cd15c2f71ed599faf9 | 1.1.1.1
      26 | 6f7428a5364aae1d5914a66cba3e6f3b | 1.1.1.1
      27 | 66ec4cdb8d64ca1414f75c1fb9eaa518 | 1.1.1.1
(10 rows)

I want to have an index only scan for my below query:
select autoinc, fileversion from updates where filename = 'Robox.exe' order
by autoinc desc;

I simply could not understand planner and cannot provide right index for it.
Below index names "update_filename" and "updates_autoinc" are added just for
the query that I would like to have a index only scan plan. I also failed
with following indexes
"autoinc desc, filename, fileversion"
"autoinc desc, filename"

First 3 rows in above select results are actual data. You will find that I
have inserted about 2000 rows of dummy data to have somewhat meaningful plan
for the query.

Current planner result:
robox=# vacuum full;
VACUUM
robox=# explain analyze
robox-# select autoinc, fileversion
robox-# from updates
robox-# where filename = 'Robox.exe'
robox-# order by autoinc desc;
                                                          QUERY PLAN
----------------------------------------------------------------------------
--------------------------------------------------
 Sort  (cost=12.79..12.79 rows=3 width=12) (actual time=0.047..0.047 rows=3
loops=1)
   Sort Key: autoinc DESC
   Sort Method: quicksort  Memory: 25kB
   ->  Bitmap Heap Scan on updates  (cost=4.30..12.76 rows=3 width=12)
(actual time=0.040..0.040 rows=3 loops=1)
         Recheck Cond: (filename = 'Robox.exe'::text)
         Heap Blocks: exact=1
         ->  Bitmap Index Scan on update_filename  (cost=0.00..4.30 rows=3
width=0) (actual time=0.035..0.035 rows=3 loops=1)
               Index Cond: (filename = 'Robox.exe'::text)
 Planning time: 1.873 ms
 Execution time: 0.076 ms
(10 rows)


I appreciate any help on having right index(es) as I simply failed myself.

Regards,
Ertan Küçükoğlu




--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

First, you do not need index "updates_autoinc", since autoinc is the Primary Key, you are just duplicating the index.

As far as "Index only scan" , since the table only has 2003 rows, the optimizer has determined it is faster just to
load all the rows into memory and then filter. If you really want to force an index scan, then you would have to do
SET enable_seqscan = off; Before doing the query, however you are just shooting yourself in the foot by doing that
as it will make the query slower.


--
Melvin Davidson
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.

Re: [GENERAL] Help on Index only scan

From
Ertan Küçükoğlu
Date:

On 14 Aug 2017, at 01:15, Melvin Davidson <melvin6925@gmail.com> wrote:


On Sun, Aug 13, 2017 at 5:59 PM, Ertan Küçükoğlu <ertan.kucukoglu@1nar.com.tr> wrote:
Hello,

My table details:
robox=# \dS+ updates
                                                   Table "public.updates"
    Column     |  Type   |                         Modifiers
| Storage  | Stats target | Description
---------------+---------+--------------------------------------------------
---------+----------+--------------+-------------
 autoinc       | integer | not null default
nextval('updates_autoinc_seq'::regclass) | plain    |              |
 filename      | text    |
| extended |              |
 dateofrelease | date    |
| plain    |              |
 fileversion   | text    |
| extended |              |
 afile         | text    |
| extended |              |
 filehash      | text    |
| extended |              |
 active        | boolean |
| plain    |              |
Indexes:
    "updates_pkey" PRIMARY KEY, btree (autoinc)
    "update_filename" btree (filename)
    "updates_autoinc" btree (autoinc DESC)
    "updates_dateofrelease" btree (dateofrelease)
    "updates_filename_dateofrelease" btree (filename, dateofrelease)


robox=# select count(autoinc) from updates;
 count
-------
  2003
(1 row)

robox=# select autoinc, filename, fileversion from updates limit 10;
 autoinc |             filename             | fileversion
---------+----------------------------------+-------------
      18 | Robox.exe                        | 1.0.1.218
      19 | Robox.exe                        | 1.0.1.220
      20 | Robox.exe                        | 1.0.1.220
      21 | 8423bfc5a669864f9b66b6b15ce908b9 | 1.1.1.1
      22 | 4fdabb0c7adbc5a89fbe679ce76ccef9 | 1.1.1.1
      23 | f469d77bfa86c8917c7846c0f871137c | 1.1.1.1
      24 | bc10af4c8789718a9ca6565ea14cb17d | 1.1.1.1
      25 | d9f87ee46cdb41cd15c2f71ed599faf9 | 1.1.1.1
      26 | 6f7428a5364aae1d5914a66cba3e6f3b | 1.1.1.1
      27 | 66ec4cdb8d64ca1414f75c1fb9eaa518 | 1.1.1.1
(10 rows)

I want to have an index only scan for my below query:
select autoinc, fileversion from updates where filename = 'Robox.exe' order
by autoinc desc;

I simply could not understand planner and cannot provide right index for it.
Below index names "update_filename" and "updates_autoinc" are added just for
the query that I would like to have a index only scan plan. I also failed
with following indexes
"autoinc desc, filename, fileversion"
"autoinc desc, filename"

First 3 rows in above select results are actual data. You will find that I
have inserted about 2000 rows of dummy data to have somewhat meaningful plan
for the query.

Current planner result:
robox=# vacuum full;
VACUUM
robox=# explain analyze
robox-# select autoinc, fileversion
robox-# from updates
robox-# where filename = 'Robox.exe'
robox-# order by autoinc desc;
                                                          QUERY PLAN
----------------------------------------------------------------------------
--------------------------------------------------
 Sort  (cost=12.79..12.79 rows=3 width=12) (actual time=0.047..0.047 rows=3
loops=1)
   Sort Key: autoinc DESC
   Sort Method: quicksort  Memory: 25kB
   ->  Bitmap Heap Scan on updates  (cost=4.30..12.76 rows=3 width=12)
(actual time=0.040..0.040 rows=3 loops=1)
         Recheck Cond: (filename = 'Robox.exe'::text)
         Heap Blocks: exact=1
         ->  Bitmap Index Scan on update_filename  (cost=0.00..4.30 rows=3
width=0) (actual time=0.035..0.035 rows=3 loops=1)
               Index Cond: (filename = 'Robox.exe'::text)
 Planning time: 1.873 ms
 Execution time: 0.076 ms
(10 rows)


I appreciate any help on having right index(es) as I simply failed myself.

Regards,
Ertan Küçükoğlu

First, you do not need index "updates_autoinc", since autoinc is the Primary Key, you are just duplicating the index.

Is that true even if that index is a descending one?


As far as "Index only scan" , since the table only has 2003 rows, the optimizer has determined it is faster just to
load all the rows into memory and then filter. If you really want to force an index scan, then you would have to do
SET enable_seqscan = off; Before doing the query, however you are just shooting yourself in the foot by doing that
as it will make the query slower.

I will try to load up more dummy rows to overflow the work_mem and observe results.

Sorry, my question was misleading. I do not want to use "set enable_seqscan = off" I want to be sure that when necessary (record count increases) relevant index(es) will be used.

Obviously I still can't read query plan as I did not understand that operation is still in memory.

Just for complete information, this is 64bit PostgreSQL 9.6.4 on Windows 10, EDB binaries.

Thanks.
Ertan Küçükoğlu 

Re: [GENERAL] Help on Index only scan

From
Melvin Davidson
Date:


On Sun, Aug 13, 2017 at 7:37 PM, Ertan Küçükoğlu <ertan.kucukoglu@1nar.com.tr> wrote:

On 14 Aug 2017, at 01:15, Melvin Davidson <melvin6925@gmail.com> wrote:


On Sun, Aug 13, 2017 at 5:59 PM, Ertan Küçükoğlu <ertan.kucukoglu@1nar.com.tr> wrote:
Hello,

My table details:
robox=# \dS+ updates
                                                   Table "public.updates"
    Column     |  Type   |                         Modifiers
| Storage  | Stats target | Description
---------------+---------+--------------------------------------------------
---------+----------+--------------+-------------
 autoinc       | integer | not null default
nextval('updates_autoinc_seq'::regclass) | plain    |              |
 filename      | text    |
| extended |              |
 dateofrelease | date    |
| plain    |              |
 fileversion   | text    |
| extended |              |
 afile         | text    |
| extended |              |
 filehash      | text    |
| extended |              |
 active        | boolean |
| plain    |              |
Indexes:
    "updates_pkey" PRIMARY KEY, btree (autoinc)
    "update_filename" btree (filename)
    "updates_autoinc" btree (autoinc DESC)
    "updates_dateofrelease" btree (dateofrelease)
    "updates_filename_dateofrelease" btree (filename, dateofrelease)


robox=# select count(autoinc) from updates;
 count
-------
  2003
(1 row)

robox=# select autoinc, filename, fileversion from updates limit 10;
 autoinc |             filename             | fileversion
---------+----------------------------------+-------------
      18 | Robox.exe                        | 1.0.1.218
      19 | Robox.exe                        | 1.0.1.220
      20 | Robox.exe                        | 1.0.1.220
      21 | 8423bfc5a669864f9b66b6b15ce908b9 | 1.1.1.1
      22 | 4fdabb0c7adbc5a89fbe679ce76ccef9 | 1.1.1.1
      23 | f469d77bfa86c8917c7846c0f871137c | 1.1.1.1
      24 | bc10af4c8789718a9ca6565ea14cb17d | 1.1.1.1
      25 | d9f87ee46cdb41cd15c2f71ed599faf9 | 1.1.1.1
      26 | 6f7428a5364aae1d5914a66cba3e6f3b | 1.1.1.1
      27 | 66ec4cdb8d64ca1414f75c1fb9eaa518 | 1.1.1.1
(10 rows)

I want to have an index only scan for my below query:
select autoinc, fileversion from updates where filename = 'Robox.exe' order
by autoinc desc;

I simply could not understand planner and cannot provide right index for it.
Below index names "update_filename" and "updates_autoinc" are added just for
the query that I would like to have a index only scan plan. I also failed
with following indexes
"autoinc desc, filename, fileversion"
"autoinc desc, filename"

First 3 rows in above select results are actual data. You will find that I
have inserted about 2000 rows of dummy data to have somewhat meaningful plan
for the query.

Current planner result:
robox=# vacuum full;
VACUUM
robox=# explain analyze
robox-# select autoinc, fileversion
robox-# from updates
robox-# where filename = 'Robox.exe'
robox-# order by autoinc desc;
                                                          QUERY PLAN
----------------------------------------------------------------------------
--------------------------------------------------
 Sort  (cost=12.79..12.79 rows=3 width=12) (actual time=0.047..0.047 rows=3
loops=1)
   Sort Key: autoinc DESC
   Sort Method: quicksort  Memory: 25kB
   ->  Bitmap Heap Scan on updates  (cost=4.30..12.76 rows=3 width=12)
(actual time=0.040..0.040 rows=3 loops=1)
         Recheck Cond: (filename = 'Robox.exe'::text)
         Heap Blocks: exact=1
         ->  Bitmap Index Scan on update_filename  (cost=0.00..4.30 rows=3
width=0) (actual time=0.035..0.035 rows=3 loops=1)
               Index Cond: (filename = 'Robox.exe'::text)
 Planning time: 1.873 ms
 Execution time: 0.076 ms
(10 rows)


I appreciate any help on having right index(es) as I simply failed myself.

Regards,
Ertan Küçükoğlu

First, you do not need index "updates_autoinc", since autoinc is the Primary Key, you are just duplicating the index.

Is that true even if that index is a descending one?


As far as "Index only scan" , since the table only has 2003 rows, the optimizer has determined it is faster just to
load all the rows into memory and then filter. If you really want to force an index scan, then you would have to do
SET enable_seqscan = off; Before doing the query, however you are just shooting yourself in the foot by doing that
as it will make the query slower.

I will try to load up more dummy rows to overflow the work_mem and observe results.

Sorry, my question was misleading. I do not want to use "set enable_seqscan = off" I want to be sure that when necessary (record count increases) relevant index(es) will be used.

Obviously I still can't read query plan as I did not understand that operation is still in memory.

Just for complete information, this is 64bit PostgreSQL 9.6.4 on Windows 10, EDB binaries.

Thanks.
Ertan Küçükoğlu 

>I will try to load up more dummy rows to overflow the work_mem and observe results.

If you are going to do that, don't forget to run ANALYZE on your table afterwards, as the optimizer users the statistics to determine the best query plan.
I strongly recommend that you acquaint yourself with how the optimizer works.


One final suggestion, it is not a good idea to create indexes on every column. You should only create indexes for columns (or combinations
of columns) the will be used often in query WHERE clauses.

I also recommend you consider purchasing  PostgreSQL High Performance
from either Amazon

or  PACKT

Melvin Davidson
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.

Re: [GENERAL] Help on Index only scan

From
Tom Lane
Date:
=?utf-8?Q?Ertan_K=C3=BC=C3=A7=C3=BCko=C4=9Flu?= <ertan.kucukoglu@1nar.com.tr> writes:
>>> I want to have an index only scan for my below query:
>>> select autoinc, fileversion from updates where filename = 'Robox.exe' order
>>> by autoinc desc;

>> On 14 Aug 2017, at 01:15, Melvin Davidson <melvin6925@gmail.com> wrote:
>> As far as "Index only scan" , since the table only has 2003 rows, the optimizer has determined it is faster just to
>> load all the rows into memory and then filter.

> Sorry, my question was misleading. I do not want to use "set enable_seqscan = off" I want to be sure that when
necessary(record count increases) relevant index(es) will be used. 

There's a considerable distance between "is the planner making appropriate
use of indexes" and "I insist on an index-only scan".  The reason you're
not getting an index-only scan here is that that requires an index that
includes every column referenced in the query, which you don't have.  At
minimum you'd need an index including all of autoinc, fileversion, and
filename to do this query with an IOS.  If you want it to be particularly
efficient for this query then you'd need the index's column order to be
(filename, autoinc, fileversion) --- putting filename means the entries
satisfying WHERE will be clumped in the index, and putting autoinc second
means that a backwards scan on that portion of the index is enough to
produce the requested sort ordering without an explicit sort step.

Whether it's worth maintaining an index this specialized depends on how
much update traffic you have versus how often you want to do this
particular query.  Often it's not worth the extra disk space and update
overhead to have such an index.

In any case, I wouldn't worry about it until you have an actual
performance problem.  Trying to tell on toy data what the planner
will do with production-sized data is usually a losing game.

            regards, tom lane