Thread: Question on Explain : Index Scan
Question on Index scan:
--------------------------------------------------------------------------->
test=# \d test_seqindex1
Table "public.test_seqindex1"
Column | Type | Modifiers
--------+-----------------------+-----------
sid | character varying(13) | not null
name | character varying(80) |
Indexes:
"test_seqindex1_pkey" PRIMARY KEY, btree (sid)
test=# explain select * from test_seqindex1 where sid='AA023';
QUERY PLAN
-------------------------------------------------------------------------------------------
Index Scan using test_seqindex1_pkey on test_seqindex1 (cost=0.00..8.27 rows=1 width=28)
Index Cond: ((sid)::text = 'AA023'::text)
test=# \d test_seqindex2
Table "public.test_seqindex2"
Column | Type | Modifiers
--------+-----------------------+-----------
eid | integer | not null
sid | character varying(13) |
ename | character varying(80) |
Indexes:
"test_seqindex2_pkey" PRIMARY KEY, btree (eid)
"idx_test_seqindex2_sid" btree (sid)
test=# explain select * from test_seqindex2 where sid='AA023';
QUERY PLAN
--------------------------------------------------------------------------------------
Bitmap Heap Scan on test_seqindex2 (cost=4.95..275.53 rows=73 width=30)
Recheck Cond: ((sid)::text = 'AA023'::text)
-> Bitmap Index Scan on idx_test_seqindex2_sid (cost=0.00..4.93 rows=73 width=0)
Index Cond: ((sid)::text = 'AA023'::text)
test=explain select * from test_seqindex1 t1,test_seqindex2 t2 where t1.sid=t2.sid;
QUERY PLAN
------------------------------------------------------------------------------------
Hash Join (cost=1231.55..46386.19 rows=920544 width=58)
Hash Cond: ((t2.sid)::text = (t1.sid)::text)
-> Seq Scan on test_seqindex2 t2 (cost=0.00..16225.97 rows=920697 width=30)
-> Hash (cost=581.80..581.80 rows=33580 width=28)
-> Seq Scan on test_seqindex1 t1 (cost=0.00..581.80 rows=33580 width=28)
(5 rows)
I was hoping the optimizer would do a join using index scan.
Could some one please explain me why its not doing an index scan rather than sequential scan .
Thanks
Deepak
--------------------------------------------------------------------------->
test=# \d test_seqindex1
Table "public.test_seqindex1"
Column | Type | Modifiers
--------+-----------------------+-----------
sid | character varying(13) | not null
name | character varying(80) |
Indexes:
"test_seqindex1_pkey" PRIMARY KEY, btree (sid)
test=# explain select * from test_seqindex1 where sid='AA023';
QUERY PLAN
-------------------------------------------------------------------------------------------
Index Scan using test_seqindex1_pkey on test_seqindex1 (cost=0.00..8.27 rows=1 width=28)
Index Cond: ((sid)::text = 'AA023'::text)
test=# \d test_seqindex2
Table "public.test_seqindex2"
Column | Type | Modifiers
--------+-----------------------+-----------
eid | integer | not null
sid | character varying(13) |
ename | character varying(80) |
Indexes:
"test_seqindex2_pkey" PRIMARY KEY, btree (eid)
"idx_test_seqindex2_sid" btree (sid)
test=# explain select * from test_seqindex2 where sid='AA023';
QUERY PLAN
--------------------------------------------------------------------------------------
Bitmap Heap Scan on test_seqindex2 (cost=4.95..275.53 rows=73 width=30)
Recheck Cond: ((sid)::text = 'AA023'::text)
-> Bitmap Index Scan on idx_test_seqindex2_sid (cost=0.00..4.93 rows=73 width=0)
Index Cond: ((sid)::text = 'AA023'::text)
test=explain select * from test_seqindex1 t1,test_seqindex2 t2 where t1.sid=t2.sid;
QUERY PLAN
------------------------------------------------------------------------------------
Hash Join (cost=1231.55..46386.19 rows=920544 width=58)
Hash Cond: ((t2.sid)::text = (t1.sid)::text)
-> Seq Scan on test_seqindex2 t2 (cost=0.00..16225.97 rows=920697 width=30)
-> Hash (cost=581.80..581.80 rows=33580 width=28)
-> Seq Scan on test_seqindex1 t1 (cost=0.00..581.80 rows=33580 width=28)
(5 rows)
I was hoping the optimizer would do a join using index scan.
Could some one please explain me why its not doing an index scan rather than sequential scan .
Thanks
Deepak
On Thu, Oct 21, 2010 at 3:47 AM, DM <dm.aeqa@gmail.com> wrote: > I was hoping the optimizer would do a join using index scan. > > Could some one please explain me why its not doing an index scan rather than > sequential scan . A index scan would be probably slower here because you're asking for a lot of rows. A lot of rows means a lot of I/O, and an index scan is more I/O intensive (since it has to read the index too). If you limit the result (by being more selective in your where clause, just like you do in the first two queries), postgres will most likely switch to index scan. You can see for yourself if index-scan would be faster in your case by running the following command before "explain (analyze)": set enable_seqscan = off; BTW, try to use explain analyze instead of explain, that way you'll see the actual timings too instead of just the planner estimates. Kind regards, Mathieu
perfecto, thank you for the explanation.
- Deepak
- Deepak
On Thu, Oct 21, 2010 at 3:20 AM, Mathieu De Zutter <mathieu@dezutter.org> wrote:
On Thu, Oct 21, 2010 at 3:47 AM, DM <dm.aeqa@gmail.com> wrote:A index scan would be probably slower here because you're asking for a
> I was hoping the optimizer would do a join using index scan.
>
> Could some one please explain me why its not doing an index scan rather than
> sequential scan .
lot of rows. A lot of rows means a lot of I/O, and an index scan is
more I/O intensive (since it has to read the index too). If you limit
the result (by being more selective in your where clause, just like
you do in the first two queries), postgres will most likely switch to
index scan.
You can see for yourself if index-scan would be faster in your case by
running the following command before "explain (analyze)":
set enable_seqscan = off;
BTW, try to use explain analyze instead of explain, that way you'll
see the actual timings too instead of just the planner estimates.
Kind regards,
Mathieu
Why is the difference in query plan, and the total runtime. Both tables have the same btree index
test=# explain analyze select * from test_seqindex1 where sid='AA023';
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------
Index Scan using test_seqindex1_pkey on test_seqindex1 (cost=0.00..8.27 rows=1 width=28) (actual time=0.017..0.018 rows=1 loops=1)
Index Cond: ((sid)::text = 'AA023'::text)
Total runtime: 0.035 ms
(3 rows)
test=# explain analyze select * from test_seqindex2 where sid='AA023';
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on test_seqindex2 (cost=4.95..275.53 rows=73 width=30) (actual time=57.833..71.577 rows=2 loops=1)
Recheck Cond: ((sid)::text = 'AA023'::text)
-> Bitmap Index Scan on idx_test_seqindex2_sid (cost=0.00..4.93 rows=73 width=0) (actual time=34.374..34.374 rows=2 loops=1)
Index Cond: ((sid)::text = 'AA023'::text)
Total runtime: 71.599 ms
(5 rows)
test=# \d test_seqindex1
Table "public.test_seqindex1"
Column | Type | Modifiers
--------+-----------------------+-----------
sid | character varying(13) | not null
name | character varying(80) |
Indexes:
"test_seqindex1_pkey" PRIMARY KEY, btree (sid)
test=# \d test_seqindex2
Table "public.test_seqindex2"
Column | Type | Modifiers
--------+-----------------------+-----------
eid | integer | not null
sid | character varying(13) |
ename | character varying(80) |
Indexes:
"test_seqindex2_pkey" PRIMARY KEY, btree (eid)
"idx_test_seqindex2_sid" btree (sid)
===============================================================================================
test=# explain analyze select * from test_seqindex1 where sid='AA023';
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------
Index Scan using test_seqindex1_pkey on test_seqindex1 (cost=0.00..8.27 rows=1 width=28) (actual time=0.017..0.018 rows=1 loops=1)
Index Cond: ((sid)::text = 'AA023'::text)
Total runtime: 0.035 ms
(3 rows)
test=# explain analyze select * from test_seqindex2 where sid='AA023';
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on test_seqindex2 (cost=4.95..275.53 rows=73 width=30) (actual time=57.833..71.577 rows=2 loops=1)
Recheck Cond: ((sid)::text = 'AA023'::text)
-> Bitmap Index Scan on idx_test_seqindex2_sid (cost=0.00..4.93 rows=73 width=0) (actual time=34.374..34.374 rows=2 loops=1)
Index Cond: ((sid)::text = 'AA023'::text)
Total runtime: 71.599 ms
(5 rows)
test=# \d test_seqindex1
Table "public.test_seqindex1"
Column | Type | Modifiers
--------+-----------------------+-----------
sid | character varying(13) | not null
name | character varying(80) |
Indexes:
"test_seqindex1_pkey" PRIMARY KEY, btree (sid)
test=# \d test_seqindex2
Table "public.test_seqindex2"
Column | Type | Modifiers
--------+-----------------------+-----------
eid | integer | not null
sid | character varying(13) |
ename | character varying(80) |
Indexes:
"test_seqindex2_pkey" PRIMARY KEY, btree (eid)
"idx_test_seqindex2_sid" btree (sid)
===============================================================================================
On Thu, Oct 21, 2010 at 11:09 AM, DM <dm.aeqa@gmail.com> wrote:
perfecto, thank you for the explanation.
- DeepakOn Thu, Oct 21, 2010 at 3:20 AM, Mathieu De Zutter <mathieu@dezutter.org> wrote:On Thu, Oct 21, 2010 at 3:47 AM, DM <dm.aeqa@gmail.com> wrote:A index scan would be probably slower here because you're asking for a
> I was hoping the optimizer would do a join using index scan.
>
> Could some one please explain me why its not doing an index scan rather than
> sequential scan .
lot of rows. A lot of rows means a lot of I/O, and an index scan is
more I/O intensive (since it has to read the index too). If you limit
the result (by being more selective in your where clause, just like
you do in the first two queries), postgres will most likely switch to
index scan.
You can see for yourself if index-scan would be faster in your case by
running the following command before "explain (analyze)":
set enable_seqscan = off;
BTW, try to use explain analyze instead of explain, that way you'll
see the actual timings too instead of just the planner estimates.
Kind regards,
Mathieu
On Thu, Oct 21, 2010 at 8:51 PM, DM <dm.aeqa@gmail.com> wrote: > Why is the difference in query plan, and the total runtime. Both tables have > the same btree index > > > test=# explain analyze select * from test_seqindex1 where sid='AA023'; > QUERY > PLAN > ------------------------------------------------------------------------------------------------------------------------------------- > Index Scan using test_seqindex1_pkey on test_seqindex1 (cost=0.00..8.27 > rows=1 width=28) (actual time=0.017..0.018 rows=1 loops=1) > Index Cond: ((sid)::text = 'AA023'::text) > Total runtime: 0.035 ms > (3 rows) > > test=# explain analyze select * from test_seqindex2 where sid='AA023'; > QUERY > PLAN > ---------------------------------------------------------------------------------------------------------------------------------- > Bitmap Heap Scan on test_seqindex2 (cost=4.95..275.53 rows=73 width=30) > (actual time=57.833..71.577 rows=2 loops=1) > Recheck Cond: ((sid)::text = 'AA023'::text) > -> Bitmap Index Scan on idx_test_seqindex2_sid (cost=0.00..4.93 rows=73 > width=0) (actual time=34.374..34.374 rows=2 loops=1) > Index Cond: ((sid)::text = 'AA023'::text) > Total runtime: 71.599 ms > (5 rows) > > > test=# \d test_seqindex1 > Table "public.test_seqindex1" > Column | Type | Modifiers > --------+-----------------------+----------- > sid | character varying(13) | not null > name | character varying(80) | > Indexes: > "test_seqindex1_pkey" PRIMARY KEY, btree (sid) > > test=# \d test_seqindex2 > Table "public.test_seqindex2" > Column | Type | Modifiers > --------+-----------------------+----------- > eid | integer | not null > sid | character varying(13) | > ename | character varying(80) | > Indexes: > "test_seqindex2_pkey" PRIMARY KEY, btree (eid) > "idx_test_seqindex2_sid" btree (sid) Index scans are very efficient when they return very few rows. If they return more (like in this case: estimate is 73 rows), the cost goes up linearly (here: 73 times slower). An index does not contain the elements according to their physical location, so it results in a lot of random I/O. Here the planner thinks it's better to do a bitmap index scan (first mark all pages that contain results and then read all these pages sequentially) so to have sequential I/O instead, which is usually faster. If your query really returns 73 rows, it would be a good choice. However, for your case, the estimates are off (73 matches estimated vs 2 matches actual). Therefore the planner needs better statistics on the actual values in your table. Look for "SET STATISTICS" here: http://www.postgresql.org/docs/9.0/static/sql-altertable.html Kind regards, Mathieu