Re: Range search on primary key does not use index scan - Mailing list pgsql-general

From Tom Lane
Subject Re: Range search on primary key does not use index scan
Date
Msg-id 750937.1608774188@sss.pgh.pa.us
Whole thread Raw
In response to Range search on primary key does not use index scan  (Shantanu Shekhar <shekharshan@yahoo.com>)
List pgsql-general
Shantanu Shekhar <shekharshan@yahoo.com> writes:
> (2) Here I am searching for employee_id < 123. I was expecting the plan would use the index on employees_pk to find
allleaf nodes where employee_id < 123 and then issue read of table blocks for each of the matching entries in the index
leaf.But looks like the query plan has decided on using full table scan instead and not using the index. Any ideas why
isthis happening? 
> testdb# explain select first_name, last_name from testschema.employees where employee_id < 123;                     
   QUERY PLAN                          ══════════════════════════════════════════════════════════════ Seq Scan on
employees (cost=0.00..10.62 rows=17 width=1032)   Filter: (employee_id < 123)(2 rows) 

In the absence of any statistics (and since you have no data in the table,
there are no stats either), the default assumption about an inequality
condition is that it's not very selective --- I think it's assumed to
select one-third of the table, but am too lazy to go check that right now.
In any case, it's enough of the table to discourage use of an indexscan.
With Postgres' normal cost settings, a potentially indexable condition
has to be estimated to select just a few percent of the table, else a
seqscan is going to look cheaper.

If you'd used an actual range condition (which I take to be a BETWEEN
clause or equivalent), you probably would have gotten an indexscan plan,
because the default estimate for that is a lot tighter.  But a one-sided
inequality can't reasonably be assumed to be selecting just a small
part of the table without any evidence to back that.  Now, if you'd
populated and analyzed the table, the planner might discover that only
a few percent of the rows have employee_id < 123 (if you've got many
thousands of employees), and then it'd pick an indexscan for the
query as it stands.

Taking a few steps back here, the way you are testing things is not going
to do anything except mislead you.  Queries on empty tables that lack any
statistics are unlikely to produce the same plans as queries on populated,
analyzed tables.  See for instance the advice at

https://www.postgresql.org/docs/current/using-explain.html#USING-EXPLAIN-CAVEATS

            regards, tom lane



pgsql-general by date:

Previous
From: Shantanu Shekhar
Date:
Subject: Range search on primary key does not use index scan
Next
From: Guyren Howe
Date:
Subject: Is there a good discussion of optimizations?