Thread: Estimates on partial index
Greetings.
--
I have a question on why planner chooses `IndexScan` for the following query:
SELECT la.loan_id, la.due_date, la.is_current
FROM loan_agreements la WHERE la.is_current AND '2016-08-11' > la.due_date;
Relevant (cannot post it all, sorry) table definition is:
Column Type Modifiers
------------------------------ --------------------------- ---------
id bigint not null
...
is_current boolean not null
due_date date not null
loan_id bigint
Indexes:
"loan_agreements_pkey" PRIMARY KEY, btree (id)
...
"idx_loan_agreements_due_date" btree (due_date)
"idx_loan_agreemnets_loan_id_cond_is_current_true" btree (loan_id) WHERE is_current = true
Some stats:
SELECT relname,reltuples::numeric,relpages FROM pg_class WHERE oid IN ('loan_agreements'::regclass, 'idx_loan_agreemnets_loan_id_cond_is_current_true'::regclass, 'idx_loan_agreements_due_date'::regclass);
relname reltuples relpages
------------------------------------------------ --------- --------
idx_loan_agreements_due_date 664707 1828
idx_loan_agreemnets_loan_id_cond_is_current_true 237910 655
loan_agreements 664707 18117
Settings:
SELECT name,setting,unit FROM pg_settings WHERE name ~ '(buffers|mem|cost)$';
name setting unit
-------------------- -------- ----
autovacuum_work_mem 524288 kB
cpu_index_tuple_cost 0.005 ¤
cpu_operator_cost 0.0025 ¤
cpu_tuple_cost 0.01 ¤
maintenance_work_mem 16777216 kB
random_page_cost 2.5 ¤
seq_page_cost 1 ¤
shared_buffers 1572864 8kB
temp_buffers 8192 8kB
wal_buffers 2048 8kB
work_mem 65536 kB
PostgreSQL 9.4.8 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-16), 64-bit
Planner chooses the following plan:
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Index Scan using idx_loan_agreemnets_loan_id_cond_is_current_true on loan_agreements la (cost=0.42..16986.53 rows=226145 width=13) (actual time=0.054..462.394 rows=216530 loops=1)
Filter: ('2016-08-11'::date > due_date)
Rows Removed by Filter: 21304
Buffers: shared hit=208343 read=18399
Planning time: 0.168 ms
Execution time: 479.773 ms
If I disable IndexScans, plan changes likes this:
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on loan_agreements la (cost=2884.01..23974.88 rows=226145 width=13) (actual time=38.893..200.376 rows=216530 loops=1)
Recheck Cond: is_current
Filter: ('2016-08-11'::date > due_date)
Rows Removed by Filter: 21304
Heap Blocks: exact=18117
Buffers: shared hit=18212 read=557
-> Bitmap Index Scan on idx_loan_agreemnets_loan_id_cond_is_current_true (cost=0.00..2827.47 rows=237910 width=0) (actual time=35.166..35.166 rows=237853 loops=1)
Buffers: shared hit=119 read=533
Planning time: 0.171 ms
Execution time: 214.341 ms
Question is — why IndexScan over partial index is estimated less than BitmapHeap + BitmapIndex scan. And how can I tell Planner, that IndexScan over 1/3 of table is not a good thing — IndexScan is touching 10x more pages and in a typical situation those are cold.
Thanks in advance.
Victor Y. Yegorov
Victor Yegorov <vyegorov@gmail.com> writes: > Settings: > random_page_cost 2.5 ¤ > seq_page_cost 1 ¤ > Question is — why IndexScan over partial index is estimated less than > BitmapHeap + BitmapIndex scan. And how can I tell Planner, that IndexScan > over 1/3 of table is not a good thing — IndexScan is touching 10x more > pages and in a typical situation those are cold. In that case you've got random_page_cost too far down. Values less than the default of 4 are generally only appropriate if the bulk of your database stays in RAM. regards, tom lane
On Thu, Aug 18, 2016 at 6:52 AM, Victor Yegorov <vyegorov@gmail.com> wrote: > Greetings. > > I have a question on why planner chooses `IndexScan` for the following > query: > > SELECT la.loan_id, la.due_date, la.is_current > FROM loan_agreements la WHERE la.is_current AND '2016-08-11' > > la.due_date; > ... > > Planner chooses the following plan: > > QUERY PLAN > > ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ > Index Scan using idx_loan_agreemnets_loan_id_cond_is_current_true on > loan_agreements la (cost=0.42..16986.53 rows=226145 width=13) (actual > time=0.054..462.394 rows=216530 loops=1) > Filter: ('2016-08-11'::date > due_date) > Rows Removed by Filter: 21304 > Buffers: shared hit=208343 read=18399 > Planning time: 0.168 ms > Execution time: 479.773 ms > > If I disable IndexScans, plan changes likes this: > > QUERY PLAN > > ---------------------------------------------------------------------------------------------------------------------------------------------------------------------- > Bitmap Heap Scan on loan_agreements la (cost=2884.01..23974.88 > rows=226145 width=13) (actual time=38.893..200.376 rows=216530 loops=1) > Recheck Cond: is_current > Filter: ('2016-08-11'::date > due_date) > Rows Removed by Filter: 21304 > Heap Blocks: exact=18117 > Buffers: shared hit=18212 read=557 > -> Bitmap Index Scan on > idx_loan_agreemnets_loan_id_cond_is_current_true (cost=0.00..2827.47 > rows=237910 width=0) (actual time=35.166..35.166 rows=237853 loops=1) > Buffers: shared hit=119 read=533 > Planning time: 0.171 ms > Execution time: 214.341 ms > > Question is — why IndexScan over partial index is estimated less than > BitmapHeap + BitmapIndex scan. And how can I tell Planner, that IndexScan > over 1/3 of table is not a good thing — IndexScan is touching 10x more pages Both plans touch the same pages. The index scan just touches some of those pages over and over again. A large setting of effective_cache_size would tell it that the page will most likely still be in cache when it comes back to touch it again, meaning the cost of doing so will be small, basically free. > and in a typical situation those are cold. But they won't be, because it is heating them up itself, and effective_cache_size says that stay then hot for the duration of the query. Also, with a random_page_cost of 2.5, you are telling it that even cold pages are not all that cold. What are the correlations of the is_current column to the ctid order, and of the loan_id column to the ctid order? Cheers, Jeff
2016-08-18 16:56 GMT+03:00 Tom Lane <tgl@sss.pgh.pa.us>:
In that case you've got random_page_cost too far down. Values less than
the default of 4 are generally only appropriate if the bulk of your
database stays in RAM.
Oh, that's interesting. I was under impression, that r_p_c reflects IO speed, like — make it smaller for SSDs.
To make this query prefer BitmapScan, I need to bump r_p_c to 5.8. And 6.0 makes it switch to SeqScan.
Does it means, that for huge DB (this one is ~1.5TB) it is better to increase r_p_c?
Still, this effect shows only for partial indexes, i.e. if I disable `idx_loan_agreemnets_loan_
than planner will not use any other and goes straight to SeqScan.
Does it means, that amount of table-related IO is not accounted for when planning scan over partial index?
--
Victor Y. Yegorov
2016-08-18 18:59 GMT+03:00 Jeff Janes <jeff.janes@gmail.com>:
Both plans touch the same pages. The index scan just touches some of
those pages over and over again. A large setting of
effective_cache_size would tell it that the page will most likely
still be in cache when it comes back to touch it again, meaning the
cost of doing so will be small, basically free.
> and in a typical situation those are cold.
But they won't be, because it is heating them up itself, and
effective_cache_size says that stay then hot for the duration of the
query.
(Re-sending as I've missed to add the list.)
But IndexScan means, that not only index, table is also accessed.
And although index is small get's hot quite quickly (yes, e_c_s is 96GB on this dedicated box),
table is not. And this clearly adds up to the total time.
I am wondering, if heap page accesses are also accounted for during planning.
Also, with a random_page_cost of 2.5, you are telling it that even
cold pages are not all that cold.
Yes, this was new for me and I will review my setup.
Current setting is based on the fact we're running SSDs.
What are the correlations of the is_current column to the ctid order,
and of the loan_id column to the ctid order?
SELECT attname,null_frac,avg_width,n_ distinct,correlation FROM pg_stats WHERE tablename='loan_agreements' AND attname IN ('loan_id','is_current','due_ date');
attname null_frac avg_width n_distinct correlation
---------- --------- --------- ---------- -----------
due_date 0 4 1197 0.982312
is_current 0 1 2 0.547268
loan_id 0 8 -0.202438 0.937507
Victor Y. Yegorov
On Thu, Aug 18, 2016 at 11:55 AM, Victor Yegorov <vyegorov@gmail.com> wrote: > 2016-08-18 18:59 GMT+03:00 Jeff Janes <jeff.janes@gmail.com>: >> >> Both plans touch the same pages. The index scan just touches some of >> those pages over and over again. A large setting of >> effective_cache_size would tell it that the page will most likely >> still be in cache when it comes back to touch it again, meaning the >> cost of doing so will be small, basically free. >> >> > and in a typical situation those are cold. >> >> But they won't be, because it is heating them up itself, and >> effective_cache_size says that stay then hot for the duration of the >> query. > > > But IndexScan means, that not only index, table is also accessed. > And although index is small get's hot quite quickly (yes, e_c_s is 96GB on > this dedicated box), > table is not. Both types of scans have to touch the same set of pages. The bitmap hits all of the needed index pages first and memorizes the relevant results, then hits all the needed table pages. The regular index scan keeps jumping back and forth from index to table. But they are the same set of pages either way. With a regular index scan, if the same table page is pointed to from 40 different places in the index, then it will be touched 40 different times. But at least 39 of those times it is going to already be in memory. The bitmap scan will touch the page just one and deal with all 40 entries. > And this clearly adds up to the total time. That isn't clear at all from the info you gave. You would have to set track_io_timing=on in order to show something like that. And we don't know if you ran each query once in the order shown, and posted what you got (with one warming the cache for the other); or if you have ran each repeatedly and posted representative examples with a pre-warmed cache. > I am wondering, if heap page accesses are also accounted for during > planning. It does account for them, but perhaps not perfectly. See "[PERFORM] index fragmentation on insert-only table with non-unique column" for some arguments on that which might be relevant to you. If you can come up with a data generator which creates data that others can use to reproduce this situation, we can then investigate it in more detail. Cheers, Jeff
On 8/18/16 3:06 PM, Jeff Janes wrote: > If you can come up with a data generator which creates data that > others can use to reproduce this situation, we can then investigate it > in more detail. BTW, the easy fix to this is most likely to create an index on due_date WHERE is_current. Or perhaps partition the table so non-current rows don't live with current ones. I'm not a fan of mixing history tracking in with the main table, because it leads to problems like this. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532) mobile: 512-569-9461
K te er Xi
Sent from Nine
From: Jim Nasby <Jim.Nasby@BlueTreble.com>
Sent: 19-Aug-2016 03:32
To: Jeff Janes; Victor Yegorov
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Estimates on partial index
Sent: 19-Aug-2016 03:32
To: Jeff Janes; Victor Yegorov
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Estimates on partial index
On 8/18/16 3:06 PM, Jeff Janes wrote:
> If you can come up with a data generator which creates data that
> others can use to reproduce this situation, we can then investigate it
> in more detail.
BTW, the easy fix to this is most likely to create an index on due_date
WHERE is_current. Or perhaps partition the table so non-current rows
don't live with current ones. I'm not a fan of mixing history tracking
in with the main table, because it leads to problems like this.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532) mobile: 512-569-9461
--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
> If you can come up with a data generator which creates data that
> others can use to reproduce this situation, we can then investigate it
> in more detail.
BTW, the easy fix to this is most likely to create an index on due_date
WHERE is_current. Or perhaps partition the table so non-current rows
don't live with current ones. I'm not a fan of mixing history tracking
in with the main table, because it leads to problems like this.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532) mobile: 512-569-9461
--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
2016-08-18 23:06 GMT+03:00 Jeff Janes <jeff.janes@gmail.com>:
It does account for them, but perhaps not perfectly. See "[PERFORM]
index fragmentation on insert-only table with non-unique column" for
some arguments on that which might be relevant to you.
Thanks for pointing this out, good stuff to know.
If you can come up with a data generator which creates data that
others can use to reproduce this situation, we can then investigate it
in more detail.
I do not think this has any value anymore. I have reconfigured the server:
- r_p_c returned to the default 4.0
- turned on track_io_timing
- reindexed all indexes on the table
- went with the suggestion from Jim about partial index on `due_date`, although
delayed it a bit to get a better view on the situation
Running several explains in a row produces the following:
Index Scan using idx_loan_agreemnets_loan_id_cond_is_current_true on loan_agreements la (cost=0.42..21469.40 rows=224331 width=13) (actual time=0.040..2687.422 rows=216440 loops=1)
Filter: ('2016-08-11'::date > due_date)
Rows Removed by Filter: 21806
Buffers: shared hit=226670 read=692 dirtied=48
I/O Timings: read=9.854
Planning time: 1885.219 ms
Execution time: 2712.833 ms
Index Scan using idx_loan_agreemnets_loan_id_cond_is_current_true on loan_agreements la (cost=0.42..21469.40 rows=224331 width=13) (actual time=426.027..2273.617 rows=216440 loops=1)
Filter: ('2016-08-11'::date > due_date)
Rows Removed by Filter: 21806
Buffers: shared hit=227276
Planning time: 0.175 ms
Execution time: 2296.414 ms
Index Scan using idx_loan_agreemnets_loan_id_cond_is_current_true on loan_agreements la (cost=0.42..21469.40 rows=224331 width=13) (actual time=0.034..297.113 rows=216440 loops=1)
Filter: ('2016-08-11'::date > due_date)
Rows Removed by Filter: 21806
Buffers: shared hit=227276
Planning time: 0.173 ms
Execution time: 310.509 ms
Index Scan using idx_loan_agreemnets_loan_id_cond_is_current_true on loan_agreements la (cost=0.42..21469.40 rows=224331 width=13) (actual time=0.031..286.212 rows=216440 loops=1)
Filter: ('2016-08-11'::date > due_date)
Rows Removed by Filter: 21806
Buffers: shared hit=227276
Planning time: 0.163 ms
Execution time: 299.831 ms
This makes me think, that IO is not my issue here and, honestly, I have no clue what can be behind this.
What I noticed — queries do experience this kind of hiccups from time to time. CPU and IO monitoring shows no spikes at all, CPU is below 40% all the time.
Currently I am trying to find out ways how to track down what's going on here.
Still — thanks everyone for the feedback, it was valuable for me!
Victor Y. Yegorov
2016-08-18 21:40 GMT+03:00 Victor Yegorov <vyegorov@gmail.com>:
Oh, that's interesting. I was under impression, that r_p_c reflects IO speed, like — make it smaller for SSDs.To make this query prefer BitmapScan, I need to bump r_p_c to 5.8. And 6.0 makes it switch to SeqScan.
I was looking into different databases and queries around — many of them prefers to use indexes over SeqScans, even if index is not a "perfect" match,
like using index on the 2-nd column of the index (like searching for `rev` via IndexScan over `id,rev` index).
I need to bump r_p_c to 6 (at least) to make things shift towards BtimapScans, and I feel uncertain about such increase.
This makes me thinking — can this situation be an indication, that tables are bloated?
(I've performed reindexing recently, touching majority of indexes around, while tables were not touched.)
Victor Y. Yegorov