Home > mailing lists

Thread: Estimates on partial index

Estimates on partial index

From

Victor Yegorov

Date:

18 August 2016, 16:52:22

Greetings.

I have a question on why planner chooses `IndexScan` for the following query:

SELECT la.loan_id, la.due_date, la.is_current

FROM loan_agreements la WHERE la.is_current AND '2016-08-11' > la.due_date;

Relevant (cannot post it all, sorry) table definition is:

Column Type Modifiers

------------------------------ --------------------------- ---------

id bigint not null

...

is_current boolean not null

due_date date not null

loan_id bigint

Indexes:

"loan_agreements_pkey" PRIMARY KEY, btree (id)

...

"idx_loan_agreements_due_date" btree (due_date)

"idx_loan_agreemnets_loan_id_cond_is_current_true" btree (loan_id) WHERE is_current = true

Some stats:

SELECT relname,reltuples::numeric,relpages FROM pg_class WHERE oid IN ('loan_agreements'::regclass, 'idx_loan_agreemnets_loan_id_cond_is_current_true'::regclass, 'idx_loan_agreements_due_date'::regclass);

relname reltuples relpages

------------------------------------------------ --------- --------

idx_loan_agreements_due_date 664707 1828

idx_loan_agreemnets_loan_id_cond_is_current_true 237910 655

loan_agreements 664707 18117

Settings:

SELECT name,setting,unit FROM pg_settings WHERE name ~ '(buffers|mem|cost)$';

name setting unit

-------------------- -------- ----

autovacuum_work_mem 524288 kB

cpu_index_tuple_cost 0.005 ¤

cpu_operator_cost 0.0025 ¤

cpu_tuple_cost 0.01 ¤

maintenance_work_mem 16777216 kB

random_page_cost 2.5 ¤

seq_page_cost 1 ¤

shared_buffers 1572864 8kB

temp_buffers 8192 8kB

wal_buffers 2048 8kB

work_mem 65536 kB

PostgreSQL 9.4.8 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-16), 64-bit

Planner chooses the following plan:

QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Index Scan using idx_loan_agreemnets_loan_id_cond_is_current_true on loan_agreements la (cost=0.42..16986.53 rows=226145 width=13) (actual time=0.054..462.394 rows=216530 loops=1)

Filter: ('2016-08-11'::date > due_date)

Rows Removed by Filter: 21304

Buffers: shared hit=208343 read=18399

Planning time: 0.168 ms

Execution time: 479.773 ms

If I disable IndexScans, plan changes likes this:

QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------------------------------------------

Bitmap Heap Scan on loan_agreements la (cost=2884.01..23974.88 rows=226145 width=13) (actual time=38.893..200.376 rows=216530 loops=1)

Recheck Cond: is_current

Filter: ('2016-08-11'::date > due_date)

Rows Removed by Filter: 21304

Heap Blocks: exact=18117

Buffers: shared hit=18212 read=557

-> Bitmap Index Scan on idx_loan_agreemnets_loan_id_cond_is_current_true (cost=0.00..2827.47 rows=237910 width=0) (actual time=35.166..35.166 rows=237853 loops=1)

Buffers: shared hit=119 read=533

Planning time: 0.171 ms

Execution time: 214.341 ms

Question is — why IndexScan over partial index is estimated less than BitmapHeap + BitmapIndex scan. And how can I tell Planner, that IndexScan over 1/3 of table is not a good thing — IndexScan is touching 10x more pages and in a typical situation those are cold.

Thanks in advance.

Victor Y. Yegorov

Re: Estimates on partial index

From

Tom Lane

Date:

18 August 2016, 16:56:15

Victor Yegorov <vyegorov@gmail.com> writes:
> Settings:
>     random_page_cost     2.5      ¤
>     seq_page_cost        1        ¤

> Question is — why IndexScan over partial index is estimated less than
> BitmapHeap + BitmapIndex scan. And how can I tell Planner, that IndexScan
> over 1/3 of table is not a good thing — IndexScan is touching 10x more
> pages and in a typical situation those are cold.

In that case you've got random_page_cost too far down.  Values less than
the default of 4 are generally only appropriate if the bulk of your
database stays in RAM.

            regards, tom lane

Re: Estimates on partial index

From

Jeff Janes

Date:

18 August 2016, 18:59:34

On Thu, Aug 18, 2016 at 6:52 AM, Victor Yegorov <vyegorov@gmail.com> wrote:
> Greetings.
>
> I have a question on why planner chooses `IndexScan` for the following
> query:
>
>     SELECT la.loan_id, la.due_date, la.is_current
>       FROM loan_agreements la WHERE la.is_current AND '2016-08-11' >
> la.due_date;
>
...
>
> Planner chooses the following plan:
>
> QUERY PLAN
>
>
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>     Index Scan using idx_loan_agreemnets_loan_id_cond_is_current_true on
> loan_agreements la      (cost=0.42..16986.53 rows=226145 width=13) (actual
> time=0.054..462.394 rows=216530 loops=1)
>       Filter: ('2016-08-11'::date > due_date)
>       Rows Removed by Filter: 21304
>       Buffers: shared hit=208343 read=18399
>     Planning time: 0.168 ms
>     Execution time: 479.773 ms
>
> If I disable IndexScans, plan changes likes this:
>
> QUERY PLAN
>
>
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
>     Bitmap Heap Scan on loan_agreements la  (cost=2884.01..23974.88
> rows=226145 width=13) (actual time=38.893..200.376 rows=216530 loops=1)
>       Recheck Cond: is_current
>       Filter: ('2016-08-11'::date > due_date)
>       Rows Removed by Filter: 21304
>       Heap Blocks: exact=18117
>       Buffers: shared hit=18212 read=557
>       ->  Bitmap Index Scan on
> idx_loan_agreemnets_loan_id_cond_is_current_true  (cost=0.00..2827.47
> rows=237910 width=0) (actual time=35.166..35.166 rows=237853 loops=1)
>             Buffers: shared hit=119 read=533
>     Planning time: 0.171 ms
>     Execution time: 214.341 ms
>
> Question is — why IndexScan over partial index is estimated less than
> BitmapHeap + BitmapIndex scan. And how can I tell Planner, that IndexScan
> over 1/3 of table is not a good thing — IndexScan is touching 10x more pages

Both plans touch the same pages.  The index scan just touches some of
those pages over and over again.  A large setting of
effective_cache_size would tell it that the page will most likely
still be in cache when it comes back to touch it again, meaning the
cost of doing so will be small, basically free.

> and in a typical situation those are cold.

But they won't be, because it is heating them up itself, and
effective_cache_size says that stay then hot for the duration of the
query.

Also, with a random_page_cost of 2.5, you are telling it that even
cold pages are not all that cold.

What are the correlations of the is_current column to the ctid order,
and of the loan_id column to the ctid order?

Cheers,

Jeff

Re: Estimates on partial index

From

Victor Yegorov

Date:

18 August 2016, 21:40:39

2016-08-18 16:56 GMT+03:00 Tom Lane <tgl@sss.pgh.pa.us>:

In that case you've got random_page_cost too far down. Values less than
the default of 4 are generally only appropriate if the bulk of your
database stays in RAM.

Oh, that's interesting. I was under impression, that r_p_c reflects IO speed, like — make it smaller for SSDs.

To make this query prefer BitmapScan, I need to bump r_p_c to 5.8. And 6.0 makes it switch to SeqScan.

Does it means, that for huge DB (this one is ~1.5TB) it is better to increase r_p_c?

Still, this effect shows only for partial indexes, i.e. if I disable `idx_loan_agreemnets_loan_id_cond_is_current_true`,

than planner will not use any other and goes straight to SeqScan.

Does it means, that amount of table-related IO is not accounted for when planning scan over partial index?

Victor Y. Yegorov

Re: Estimates on partial index

From

Victor Yegorov

Date:

18 August 2016, 21:55:32

2016-08-18 18:59 GMT+03:00 Jeff Janes <jeff.janes@gmail.com>:

Both plans touch the same pages. The index scan just touches some of
those pages over and over again. A large setting of
effective_cache_size would tell it that the page will most likely
still be in cache when it comes back to touch it again, meaning the
cost of doing so will be small, basically free.

> and in a typical situation those are cold.

But they won't be, because it is heating them up itself, and
effective_cache_size says that stay then hot for the duration of the
query.

(Re-sending as I've missed to add the list.)

But IndexScan means, that not only index, table is also accessed.

And although index is small get's hot quite quickly (yes, e_c_s is 96GB on this dedicated box),

table is not. And this clearly adds up to the total time.

I am wondering, if heap page accesses are also accounted for during planning.

Also, with a random_page_cost of 2.5, you are telling it that even
cold pages are not all that cold.

Yes, this was new for me and I will review my setup.

Current setting is based on the fact we're running SSDs.

What are the correlations of the is_current column to the ctid order,
and of the loan_id column to the ctid order?

SELECT attname,null_frac,avg_width,n_distinct,correlation FROM pg_stats WHERE tablename='loan_agreements' AND attname IN ('loan_id','is_current','due_date');

attname null_frac avg_width n_distinct correlation

---------- --------- --------- ---------- -----------

due_date 0 4 1197 0.982312

is_current 0 1 2 0.547268

loan_id 0 8 -0.202438 0.937507

Victor Y. Yegorov

Re: Estimates on partial index

From

Jeff Janes

Date:

18 August 2016, 23:06:57

On Thu, Aug 18, 2016 at 11:55 AM, Victor Yegorov <vyegorov@gmail.com> wrote:
> 2016-08-18 18:59 GMT+03:00 Jeff Janes <jeff.janes@gmail.com>:
>>
>> Both plans touch the same pages.  The index scan just touches some of
>> those pages over and over again.  A large setting of
>> effective_cache_size would tell it that the page will most likely
>> still be in cache when it comes back to touch it again, meaning the
>> cost of doing so will be small, basically free.
>>
>> > and in a typical situation those are cold.
>>
>> But they won't be, because it is heating them up itself, and
>> effective_cache_size says that stay then hot for the duration of the
>> query.
>
>
> But IndexScan means, that not only index, table is also accessed.
> And although index is small get's hot quite quickly (yes, e_c_s is 96GB on
> this dedicated box),
> table is not.

Both types of scans have to touch the same set of pages.  The bitmap
hits all of the needed index pages first and memorizes the relevant
results, then hits all the needed table pages.  The regular index scan
keeps jumping back and forth from index to table. But they are the
same set of pages either way.

With a regular index scan, if the same table page is pointed to from
40 different places in the index, then it will be touched 40 different
times.  But at least 39 of those times it is going to already be in
memory.  The bitmap scan will touch the page just one and deal with
all 40 entries.

>  And this clearly adds up to the total time.

That isn't clear at all from the info you gave.  You would have to set
track_io_timing=on in order to show something like that.  And we don't
know if you ran each query once in the order shown, and posted what
you got (with one warming the cache for the other); or if you have ran
each repeatedly and posted representative examples with a pre-warmed
cache.

> I am wondering, if heap page accesses are also accounted for during
> planning.

It does account for them, but perhaps not perfectly.  See "[PERFORM]
index fragmentation on insert-only table with non-unique column" for
some arguments on that which might be relevant to you.

If you can come up with a data generator which creates data that
others can use to reproduce this situation, we can then investigate it
in more detail.

Cheers,

Jeff

Re: Estimates on partial index

From

Jim Nasby

Date:

19 August 2016, 01:01:19

On 8/18/16 3:06 PM, Jeff Janes wrote:
> If you can come up with a data generator which creates data that
> others can use to reproduce this situation, we can then investigate it
> in more detail.

BTW, the easy fix to this is most likely to create an index on due_date
WHERE is_current. Or perhaps partition the table so non-current rows
don't live with current ones. I'm not a fan of mixing history tracking
in with the main table, because it leads to problems like this.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461

Re: Estimates on partial index

From

Ashish Kumar Singh

Date:

19 August 2016, 07:21:26

K te er Xi

Sent from Nine

From: Jim Nasby <Jim.Nasby@BlueTreble.com>
Sent: 19-Aug-2016 03:32
To: Jeff Janes; Victor Yegorov
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Estimates on partial index

On 8/18/16 3:06 PM, Jeff Janes wrote:
> If you can come up with a data generator which creates data that
> others can use to reproduce this situation, we can then investigate it
> in more detail.

BTW, the easy fix to this is most likely to create an index on due_date
WHERE is_current. Or perhaps partition the table so non-current rows
don't live with current ones. I'm not a fan of mixing history tracking
in with the main table, because it leads to problems like this.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532) mobile: 512-569-9461

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: Estimates on partial index

From

Victor Yegorov

Date:

19 August 2016, 14:32:00

2016-08-18 23:06 GMT+03:00 Jeff Janes <jeff.janes@gmail.com>:

It does account for them, but perhaps not perfectly. See "[PERFORM]
index fragmentation on insert-only table with non-unique column" for
some arguments on that which might be relevant to you.

Thanks for pointing this out, good stuff to know.

If you can come up with a data generator which creates data that
others can use to reproduce this situation, we can then investigate it
in more detail.

I do not think this has any value anymore. I have reconfigured the server:

- r_p_c returned to the default 4.0

- turned on track_io_timing

- reindexed all indexes on the table

- went with the suggestion from Jim about partial index on `due_date`, although

delayed it a bit to get a better view on the situation

Running several explains in a row produces the following:

Index Scan using idx_loan_agreemnets_loan_id_cond_is_current_true on loan_agreements la (cost=0.42..21469.40 rows=224331 width=13) (actual time=0.040..2687.422 rows=216440 loops=1)

Filter: ('2016-08-11'::date > due_date)

Rows Removed by Filter: 21806

Buffers: shared hit=226670 read=692 dirtied=48

I/O Timings: read=9.854

Planning time: 1885.219 ms

Execution time: 2712.833 ms

Index Scan using idx_loan_agreemnets_loan_id_cond_is_current_true on loan_agreements la (cost=0.42..21469.40 rows=224331 width=13) (actual time=426.027..2273.617 rows=216440 loops=1)

Filter: ('2016-08-11'::date > due_date)

Rows Removed by Filter: 21806

Buffers: shared hit=227276

Planning time: 0.175 ms

Execution time: 2296.414 ms

Index Scan using idx_loan_agreemnets_loan_id_cond_is_current_true on loan_agreements la (cost=0.42..21469.40 rows=224331 width=13) (actual time=0.034..297.113 rows=216440 loops=1)

Filter: ('2016-08-11'::date > due_date)

Rows Removed by Filter: 21806

Buffers: shared hit=227276

Planning time: 0.173 ms

Execution time: 310.509 ms

Index Scan using idx_loan_agreemnets_loan_id_cond_is_current_true on loan_agreements la (cost=0.42..21469.40 rows=224331 width=13) (actual time=0.031..286.212 rows=216440 loops=1)

Filter: ('2016-08-11'::date > due_date)

Rows Removed by Filter: 21806

Buffers: shared hit=227276

Planning time: 0.163 ms

Execution time: 299.831 ms

This makes me think, that IO is not my issue here and, honestly, I have no clue what can be behind this.

What I noticed — queries do experience this kind of hiccups from time to time. CPU and IO monitoring shows no spikes at all, CPU is below 40% all the time.

Currently I am trying to find out ways how to track down what's going on here.

Still — thanks everyone for the feedback, it was valuable for me!

Victor Y. Yegorov

Re: Estimates on partial index

From

Victor Yegorov

Date:

19 August 2016, 15:51:00

2016-08-18 21:40 GMT+03:00 Victor Yegorov <vyegorov@gmail.com>:

Oh, that's interesting. I was under impression, that r_p_c reflects IO speed, like — make it smaller for SSDs.
To make this query prefer BitmapScan, I need to bump r_p_c to 5.8. And 6.0 makes it switch to SeqScan.

I was looking into different databases and queries around — many of them prefers to use indexes over SeqScans, even if index is not a "perfect" match,

like using index on the 2-nd column of the index (like searching for `rev` via IndexScan over `id,rev` index).

I need to bump r_p_c to 6 (at least) to make things shift towards BtimapScans, and I feel uncertain about such increase.

This makes me thinking — can this situation be an indication, that tables are bloated?

(I've performed reindexing recently, touching majority of indexes around, while tables were not touched.)

Victor Y. Yegorov