disabling seqscan not using primary key index? - Mailing list pgsql-general
From | Luca Ferrari |
---|---|
Subject | disabling seqscan not using primary key index? |
Date | |
Msg-id | CAKoxK+6P38HkoQYXs_Kw0hOJ4sdTe7A34n=oY2EkspQKHxw4cQ@mail.gmail.com Whole thread Raw |
Responses |
Re: disabling seqscan not using primary key index?
|
List | pgsql-general |
Hi all, doing a little and trivial experiment, I decided to populate a table with a primary key (and thus an automatically generated btree index): testdb=# create table foo( pk serial primary key, i int ); CREATE TABLE testdb=# insert into foo( i ) select v from generate_series( 1, 1000000 ) v; INSERT 0 1000000 testdb=# explain select * from foo where i > 10; QUERY PLAN ------------------------------------------------------------ Seq Scan on foo (cost=0.00..16925.00 rows=999900 width=8) Filter: (i > 10) (2 rows) So far so good, but if I disable seqscan I would expect the planner to choose the primary key index, because that "should be" the preferred way to access the table. On the other hand, the planner enables JIT machinery and executes again a seqsca. testdb=# set enable_seqscan to off; SET testdb=# explain select * from foo where i > 10; QUERY PLAN ------------------------------------------------------------------------------- Seq Scan on foo (cost=10000000000.00..10000016925.00 rows=999900 width=8) Filter: (i > 10) JIT: Functions: 2 Options: Inlining true, Optimization true, Expressions true, Deforming true (5 rows) Now, I know the planner has done things right, and thus has selected the minor cost plan for the situation, but I would like to better understand why the primary key index is excluded and why the JIT machinery is used in this case. I know that columns pk and i are pretty identitcal, and the same query made against the column pk do exploits the index: testdb=# explain select * from foo where pk > 10; QUERY PLAN ------------------------------------------------------------------------------- Bitmap Heap Scan on foo (cost=6243.89..14835.76 rows=333350 width=8) Recheck Cond: (pk > 10) -> Bitmap Index Scan on foo_pkey (cost=0.00..6160.55 rows=333350 width=0) Index Cond: (pk > 10) (4 rows) and the final cost using the index is much lower than the cost of the seqscan + JIT, that makes me think the index should have been used also in the previous query. Any suggestion to better understand? Please note that even with an extended statistic, the planner does not choose to use an index: testdb=# create statistics st_foo ( dependencies ) on pk,i from foo; CREATE STATISTICS testdb=# analyze foo; ANALYZE testdb=# set enable_seqscan to off; SET testdb=# explain select * from foo where i > 10; QUERY PLAN ------------------------------------------------------------------------------- Seq Scan on foo (cost=10000000000.00..10000016925.00 rows=999900 width=8) Filter: (i > 10) JIT: Functions: 2 Options: Inlining true, Optimization true, Expressions true, Deforming true (5 rows) Even doing a quick computation about how much does it cost to get from index to the table for all tuples seems to me much lower than the initial cost for a not-sequentially-enabled cost: testdb=# select sum( relpages + reltuples * current_setting( 'cpu_tuple_cost' )::numeric ) from pg_class where relname in ( 'foo', 'foo_pkey' ); sum ------- 27170 I'm clearly missing something here. Thanks, Luca
pgsql-general by date: