Re: PostgreSQL Choosing Full Index Over Partial Index - Mailing list pgsql-performance
From | Felipe López Montes |
---|---|
Subject | Re: PostgreSQL Choosing Full Index Over Partial Index |
Date | |
Msg-id | CACJPJu8GW4T9SM5=GnA3T_6oxH91YL+5D0+WJ_1yOj2HcmpJFw@mail.gmail.com Whole thread Raw |
In response to | Re: PostgreSQL Choosing Full Index Over Partial Index (Laurenz Albe <laurenz.albe@cybertec.at>) |
List | pgsql-performance |
Hi Mr. Laurenz,
Thanks a lot for your response :).
The full index is bigger as it has an entry for all the rows of the table, whilst the partial one only has entries for canceled flights.
Output of pgstatindex() for the partial index:
version,tree_level,index_size,root_block_no,internal_pages,leaf_pages,empty_pages,deleted_pages,avg_leaf_density,leaf_fragmentation
4,0,16384,1,0,1,0,0,13.4,0
Output of pgstatindex() for the full index:
version,tree_level,index_size,root_block_no,internal_pages,leaf_pages,empty_pages,deleted_pages,avg_leaf_density,leaf_fragmentation
4,2,4825088,180,5,583,0,0,90.1,0
I already ran ANALYZE and also VACUUM ANALYZE, and it still goes for the full one.
Thanks a lot for your response :).
The full index is bigger as it has an entry for all the rows of the table, whilst the partial one only has entries for canceled flights.
Output of pgstatindex() for the partial index:
version,tree_level,index_size,root_block_no,internal_pages,leaf_pages,empty_pages,deleted_pages,avg_leaf_density,leaf_fragmentation
4,0,16384,1,0,1,0,0,13.4,0
Output of pgstatindex() for the full index:
version,tree_level,index_size,root_block_no,internal_pages,leaf_pages,empty_pages,deleted_pages,avg_leaf_density,leaf_fragmentation
4,2,4825088,180,5,583,0,0,90.1,0
I already ran ANALYZE and also VACUUM ANALYZE, and it still goes for the full one.
El lun, 28 abr 2025 a las 15:35, Laurenz Albe (<laurenz.albe@cybertec.at>) escribió:
On Mon, 2025-04-28 at 15:22 +0200, Felipe López Montes wrote:
> I am using PostgreSQL 17.4 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 12.4.0, 64-bit,
> and working with the postgres_air Database.
>
> I have a very simple query (please forget about the sense of the query itself,
> I just want to focus on the planner):
>
> SELECT status
> FROM postgres_air.flight
> WHERE status = 'Canceled';
>
> And the following indexes:
>
> CREATE INDEX flight_status_index ON flight(status)
>
> CREATE INDEX flight_canceled ON flight(status)
> WHERE status = 'Canceled'
>
>
> Following the book PostgreSQL Query Optimization (Second Edition), there is a
> statement on page 90 talking about Partial Indexes that says that the planner
> will use the partial index rather than the full index on the flight table,
> however after doing my own tests I have checked that this is not true and the
> planner estimates that scanning the full index is cheaper than scanning the
> partial one and would like to understand why.
>
> I assume but might be wrong that having this partial index, lighter than the
> full table index, with both satisfying a specific index-suitable filter
> condition (in this case canceled flights represent 171 rows vs 683178 rows
> from the whole table), should be a reason for the planner to know that
> searching in the partial index should be faster than searching in the full
> index, besides the true fact that this partial index weights less than the
> full one.
>
> I also tried downgrading the version to the one used by the authors of the
> book but same behavior happens.
>
> Please see attached the different plan executions:
>
> Plan for the full index:
>
> QUERY PLAN
> Index Only Scan using flight_status_index on flight (cost=0.42..7.61 rows=182 width=11) (actual time=0.042..0.062 rows=171 loops=1)
> Index Cond: (status = 'Canceled'::text)
> Heap Fetches: 0
> Planning Time: 0.173 ms
> Execution Time: 0.080 ms
>
> Plan for the partial index:
>
> QUERY PLAN
> Index Only Scan using flight_canceled on flight (cost=0.14..10.82 rows=182 width=11) (actual time=0.039..0.050 rows=171 loops=1)
> Heap Fetches: 0
> Planning Time: 0.135 ms
> Execution Time: 0.066 ms
Which index is bigger (you can use \di+ in "psql")?
Could you run the pgstatindex() function from the "pgstattuple" extension on
both indexes and compare the output?
Does ANALYZE on the table make a difference?
Yours,
Laurenz Albe
pgsql-performance by date: