Re: Estimates on partial index - Mailing list pgsql-performance

From Jeff Janes
Subject Re: Estimates on partial index
Date
Msg-id CAMkU=1ymn6xjf37LS6BAuuJE3PhCogRszzSBbTKcNFttjkKcmA@mail.gmail.com
Whole thread Raw
In response to Estimates on partial index  (Victor Yegorov <vyegorov@gmail.com>)
List pgsql-performance
On Thu, Aug 18, 2016 at 6:52 AM, Victor Yegorov <vyegorov@gmail.com> wrote:
> Greetings.
>
> I have a question on why planner chooses `IndexScan` for the following
> query:
>
>     SELECT la.loan_id, la.due_date, la.is_current
>       FROM loan_agreements la WHERE la.is_current AND '2016-08-11' >
> la.due_date;
>
...
>
> Planner chooses the following plan:
>
> QUERY PLAN
>
>
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>     Index Scan using idx_loan_agreemnets_loan_id_cond_is_current_true on
> loan_agreements la      (cost=0.42..16986.53 rows=226145 width=13) (actual
> time=0.054..462.394 rows=216530 loops=1)
>       Filter: ('2016-08-11'::date > due_date)
>       Rows Removed by Filter: 21304
>       Buffers: shared hit=208343 read=18399
>     Planning time: 0.168 ms
>     Execution time: 479.773 ms
>
> If I disable IndexScans, plan changes likes this:
>
> QUERY PLAN
>
>
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
>     Bitmap Heap Scan on loan_agreements la  (cost=2884.01..23974.88
> rows=226145 width=13) (actual time=38.893..200.376 rows=216530 loops=1)
>       Recheck Cond: is_current
>       Filter: ('2016-08-11'::date > due_date)
>       Rows Removed by Filter: 21304
>       Heap Blocks: exact=18117
>       Buffers: shared hit=18212 read=557
>       ->  Bitmap Index Scan on
> idx_loan_agreemnets_loan_id_cond_is_current_true  (cost=0.00..2827.47
> rows=237910 width=0) (actual time=35.166..35.166 rows=237853 loops=1)
>             Buffers: shared hit=119 read=533
>     Planning time: 0.171 ms
>     Execution time: 214.341 ms
>
> Question is — why IndexScan over partial index is estimated less than
> BitmapHeap + BitmapIndex scan. And how can I tell Planner, that IndexScan
> over 1/3 of table is not a good thing — IndexScan is touching 10x more pages

Both plans touch the same pages.  The index scan just touches some of
those pages over and over again.  A large setting of
effective_cache_size would tell it that the page will most likely
still be in cache when it comes back to touch it again, meaning the
cost of doing so will be small, basically free.

> and in a typical situation those are cold.

But they won't be, because it is heating them up itself, and
effective_cache_size says that stay then hot for the duration of the
query.

Also, with a random_page_cost of 2.5, you are telling it that even
cold pages are not all that cold.

What are the correlations of the is_current column to the ctid order,
and of the loan_id column to the ctid order?

Cheers,

Jeff


pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: Estimates on partial index
Next
From: Victor Yegorov
Date:
Subject: Re: Estimates on partial index