On Mon, 2025-04-28 at 15:22 +0200, Felipe López Montes wrote:
> I am using PostgreSQL 17.4 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 12.4.0, 64-bit,
> and working with the postgres_air Database.
>
> I have a very simple query (please forget about the sense of the query itself,
> I just want to focus on the planner):
>
> SELECT status
> FROM postgres_air.flight
> WHERE status = 'Canceled';
>
> And the following indexes:
>
> CREATE INDEX flight_status_index ON flight(status)
>
> CREATE INDEX flight_canceled ON flight(status)
> WHERE status = 'Canceled'
>
>
> Following the book PostgreSQL Query Optimization (Second Edition), there is a
> statement on page 90 talking about Partial Indexes that says that the planner
> will use the partial index rather than the full index on the flight table,
> however after doing my own tests I have checked that this is not true and the
> planner estimates that scanning the full index is cheaper than scanning the
> partial one and would like to understand why.
>
> I assume but might be wrong that having this partial index, lighter than the
> full table index, with both satisfying a specific index-suitable filter
> condition (in this case canceled flights represent 171 rows vs 683178 rows
> from the whole table), should be a reason for the planner to know that
> searching in the partial index should be faster than searching in the full
> index, besides the true fact that this partial index weights less than the
> full one.
>
> I also tried downgrading the version to the one used by the authors of the
> book but same behavior happens.
>
> Please see attached the different plan executions:
>
> Plan for the full index:
>
> QUERY PLAN
> Index Only Scan using flight_status_index on flight (cost=0.42..7.61 rows=182 width=11) (actual time=0.042..0.062
rows=171loops=1)
> Index Cond: (status = 'Canceled'::text)
> Heap Fetches: 0
> Planning Time: 0.173 ms
> Execution Time: 0.080 ms
>
> Plan for the partial index:
>
> QUERY PLAN
> Index Only Scan using flight_canceled on flight (cost=0.14..10.82 rows=182 width=11) (actual time=0.039..0.050
rows=171loops=1)
> Heap Fetches: 0
> Planning Time: 0.135 ms
> Execution Time: 0.066 ms
Which index is bigger (you can use \di+ in "psql")?
Could you run the pgstatindex() function from the "pgstattuple" extension on
both indexes and compare the output?
Does ANALYZE on the table make a difference?
Yours,
Laurenz Albe