Thread: Estimates on partial index

Estimates on partial index

From
Victor Yegorov
Date:
Greetings.

I have a question on why planner chooses `IndexScan` for the following query:

    SELECT la.loan_id, la.due_date, la.is_current
      FROM loan_agreements la WHERE la.is_current AND '2016-08-11' > la.due_date;

Relevant (cannot post it all, sorry) table definition is:

                Column                        Type             Modifiers
    ------------------------------ --------------------------- ---------
    id                             bigint                      not null
    ...
    is_current                     boolean                     not null
    due_date                       date                        not null
    loan_id                        bigint
    
    Indexes:
        "loan_agreements_pkey" PRIMARY KEY, btree (id)
        ...
        "idx_loan_agreements_due_date" btree (due_date)
        "idx_loan_agreemnets_loan_id_cond_is_current_true" btree (loan_id) WHERE is_current = true

Some stats:
    SELECT relname,reltuples::numeric,relpages FROM pg_class WHERE oid IN ('loan_agreements'::regclass, 'idx_loan_agreemnets_loan_id_cond_is_current_true'::regclass, 'idx_loan_agreements_due_date'::regclass);
                    relname                      reltuples relpages
    ------------------------------------------------ --------- --------
    idx_loan_agreements_due_date                        664707     1828
    idx_loan_agreemnets_loan_id_cond_is_current_true    237910      655
    loan_agreements                                     664707    18117


Settings:

    SELECT name,setting,unit FROM pg_settings WHERE name ~ '(buffers|mem|cost)$';
            name         setting  unit
    -------------------- -------- ----
    autovacuum_work_mem  524288   kB
    cpu_index_tuple_cost 0.005    ¤
    cpu_operator_cost    0.0025   ¤
    cpu_tuple_cost       0.01     ¤
    maintenance_work_mem 16777216 kB
    random_page_cost     2.5      ¤
    seq_page_cost        1        ¤
    shared_buffers       1572864  8kB
    temp_buffers         8192     8kB
    wal_buffers          2048     8kB
    work_mem             65536    kB
    PostgreSQL 9.4.8 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-16), 64-bit

Planner chooses the following plan:
                                                                                         QUERY PLAN
    ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    Index Scan using idx_loan_agreemnets_loan_id_cond_is_current_true on loan_agreements la      (cost=0.42..16986.53 rows=226145 width=13) (actual time=0.054..462.394 rows=216530 loops=1)
      Filter: ('2016-08-11'::date > due_date)
      Rows Removed by Filter: 21304
      Buffers: shared hit=208343 read=18399
    Planning time: 0.168 ms
    Execution time: 479.773 ms

If I disable IndexScans, plan changes likes this:
                                                                              QUERY PLAN
    ----------------------------------------------------------------------------------------------------------------------------------------------------------------------
    Bitmap Heap Scan on loan_agreements la  (cost=2884.01..23974.88 rows=226145 width=13) (actual time=38.893..200.376 rows=216530 loops=1)
      Recheck Cond: is_current
      Filter: ('2016-08-11'::date > due_date)
      Rows Removed by Filter: 21304
      Heap Blocks: exact=18117
      Buffers: shared hit=18212 read=557
      ->  Bitmap Index Scan on idx_loan_agreemnets_loan_id_cond_is_current_true  (cost=0.00..2827.47 rows=237910 width=0) (actual time=35.166..35.166 rows=237853 loops=1)
            Buffers: shared hit=119 read=533
    Planning time: 0.171 ms
    Execution time: 214.341 ms

Question is — why IndexScan over partial index is estimated less than BitmapHeap + BitmapIndex scan. And how can I tell Planner, that IndexScan over 1/3 of table is not a good thing — IndexScan is touching 10x more pages and in a typical situation those are cold.

Thanks in advance.

--
Victor Y. Yegorov

Re: Estimates on partial index

From
Tom Lane
Date:
Victor Yegorov <vyegorov@gmail.com> writes:
> Settings:
>     random_page_cost     2.5      ¤
>     seq_page_cost        1        ¤

> Question is — why IndexScan over partial index is estimated less than
> BitmapHeap + BitmapIndex scan. And how can I tell Planner, that IndexScan
> over 1/3 of table is not a good thing — IndexScan is touching 10x more
> pages and in a typical situation those are cold.

In that case you've got random_page_cost too far down.  Values less than
the default of 4 are generally only appropriate if the bulk of your
database stays in RAM.

            regards, tom lane


Re: Estimates on partial index

From
Jeff Janes
Date:
On Thu, Aug 18, 2016 at 6:52 AM, Victor Yegorov <vyegorov@gmail.com> wrote:
> Greetings.
>
> I have a question on why planner chooses `IndexScan` for the following
> query:
>
>     SELECT la.loan_id, la.due_date, la.is_current
>       FROM loan_agreements la WHERE la.is_current AND '2016-08-11' >
> la.due_date;
>
...
>
> Planner chooses the following plan:
>
> QUERY PLAN
>
>
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>     Index Scan using idx_loan_agreemnets_loan_id_cond_is_current_true on
> loan_agreements la      (cost=0.42..16986.53 rows=226145 width=13) (actual
> time=0.054..462.394 rows=216530 loops=1)
>       Filter: ('2016-08-11'::date > due_date)
>       Rows Removed by Filter: 21304
>       Buffers: shared hit=208343 read=18399
>     Planning time: 0.168 ms
>     Execution time: 479.773 ms
>
> If I disable IndexScans, plan changes likes this:
>
> QUERY PLAN
>
>
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
>     Bitmap Heap Scan on loan_agreements la  (cost=2884.01..23974.88
> rows=226145 width=13) (actual time=38.893..200.376 rows=216530 loops=1)
>       Recheck Cond: is_current
>       Filter: ('2016-08-11'::date > due_date)
>       Rows Removed by Filter: 21304
>       Heap Blocks: exact=18117
>       Buffers: shared hit=18212 read=557
>       ->  Bitmap Index Scan on
> idx_loan_agreemnets_loan_id_cond_is_current_true  (cost=0.00..2827.47
> rows=237910 width=0) (actual time=35.166..35.166 rows=237853 loops=1)
>             Buffers: shared hit=119 read=533
>     Planning time: 0.171 ms
>     Execution time: 214.341 ms
>
> Question is — why IndexScan over partial index is estimated less than
> BitmapHeap + BitmapIndex scan. And how can I tell Planner, that IndexScan
> over 1/3 of table is not a good thing — IndexScan is touching 10x more pages

Both plans touch the same pages.  The index scan just touches some of
those pages over and over again.  A large setting of
effective_cache_size would tell it that the page will most likely
still be in cache when it comes back to touch it again, meaning the
cost of doing so will be small, basically free.

> and in a typical situation those are cold.

But they won't be, because it is heating them up itself, and
effective_cache_size says that stay then hot for the duration of the
query.

Also, with a random_page_cost of 2.5, you are telling it that even
cold pages are not all that cold.

What are the correlations of the is_current column to the ctid order,
and of the loan_id column to the ctid order?

Cheers,

Jeff


Re: Estimates on partial index

From
Victor Yegorov
Date:
2016-08-18 16:56 GMT+03:00 Tom Lane <tgl@sss.pgh.pa.us>:
In that case you've got random_page_cost too far down.  Values less than
the default of 4 are generally only appropriate if the bulk of your
database stays in RAM.

Oh, that's interesting. I was under impression, that r_p_c reflects IO speed, like — make it smaller for SSDs.
To make this query prefer BitmapScan, I need to bump r_p_c to 5.8. And 6.0 makes it switch to SeqScan.

Does it means, that for huge DB (this one is ~1.5TB) it is better to increase r_p_c?

Still, this effect shows only for partial indexes, i.e. if I disable `idx_loan_agreemnets_loan_id_cond_is_current_true`,
than planner will not use any other and goes straight to SeqScan.
Does it means, that amount of table-related IO is not accounted for when planning scan over partial index?

--
Victor Y. Yegorov

Re: Estimates on partial index

From
Victor Yegorov
Date:
2016-08-18 18:59 GMT+03:00 Jeff Janes <jeff.janes@gmail.com>:
Both plans touch the same pages.  The index scan just touches some of
those pages over and over again.  A large setting of
effective_cache_size would tell it that the page will most likely
still be in cache when it comes back to touch it again, meaning the
cost of doing so will be small, basically free.

> and in a typical situation those are cold.

But they won't be, because it is heating them up itself, and
effective_cache_size says that stay then hot for the duration of the
query.

(Re-sending as I've missed to add the list.) 

But IndexScan means, that not only index, table is also accessed.
And although index is small get's hot quite quickly (yes, e_c_s is 96GB on this dedicated box),
table is not. And this clearly adds up to the total time.

I am wondering, if heap page accesses are also accounted for during planning.


Also, with a random_page_cost of 2.5, you are telling it that even
cold pages are not all that cold.

Yes, this was new for me and I will review my setup.
Current setting is based on the fact we're running SSDs.


What are the correlations of the is_current column to the ctid order,
and of the loan_id column to the ctid order?

    SELECT attname,null_frac,avg_width,n_distinct,correlation FROM pg_stats WHERE tablename='loan_agreements' AND attname IN ('loan_id','is_current','due_date');
     attname   null_frac avg_width n_distinct correlation
    ---------- --------- --------- ---------- -----------
    due_date           0         4       1197    0.982312
    is_current         0         1          2    0.547268
    loan_id            0         8  -0.202438    0.937507


--
Victor Y. Yegorov

Re: Estimates on partial index

From
Jeff Janes
Date:
On Thu, Aug 18, 2016 at 11:55 AM, Victor Yegorov <vyegorov@gmail.com> wrote:
> 2016-08-18 18:59 GMT+03:00 Jeff Janes <jeff.janes@gmail.com>:
>>
>> Both plans touch the same pages.  The index scan just touches some of
>> those pages over and over again.  A large setting of
>> effective_cache_size would tell it that the page will most likely
>> still be in cache when it comes back to touch it again, meaning the
>> cost of doing so will be small, basically free.
>>
>> > and in a typical situation those are cold.
>>
>> But they won't be, because it is heating them up itself, and
>> effective_cache_size says that stay then hot for the duration of the
>> query.
>
>
> But IndexScan means, that not only index, table is also accessed.
> And although index is small get's hot quite quickly (yes, e_c_s is 96GB on
> this dedicated box),
> table is not.

Both types of scans have to touch the same set of pages.  The bitmap
hits all of the needed index pages first and memorizes the relevant
results, then hits all the needed table pages.  The regular index scan
keeps jumping back and forth from index to table. But they are the
same set of pages either way.

With a regular index scan, if the same table page is pointed to from
40 different places in the index, then it will be touched 40 different
times.  But at least 39 of those times it is going to already be in
memory.  The bitmap scan will touch the page just one and deal with
all 40 entries.


>  And this clearly adds up to the total time.

That isn't clear at all from the info you gave.  You would have to set
track_io_timing=on in order to show something like that.  And we don't
know if you ran each query once in the order shown, and posted what
you got (with one warming the cache for the other); or if you have ran
each repeatedly and posted representative examples with a pre-warmed
cache.


> I am wondering, if heap page accesses are also accounted for during
> planning.

It does account for them, but perhaps not perfectly.  See "[PERFORM]
index fragmentation on insert-only table with non-unique column" for
some arguments on that which might be relevant to you.

If you can come up with a data generator which creates data that
others can use to reproduce this situation, we can then investigate it
in more detail.

Cheers,

Jeff


Re: Estimates on partial index

From
Jim Nasby
Date:
On 8/18/16 3:06 PM, Jeff Janes wrote:
> If you can come up with a data generator which creates data that
> others can use to reproduce this situation, we can then investigate it
> in more detail.

BTW, the easy fix to this is most likely to create an index on due_date
WHERE is_current. Or perhaps partition the table so non-current rows
don't live with current ones. I'm not a fan of mixing history tracking
in with the main table, because it leads to problems like this.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461


Re: Estimates on partial index

From
Ashish Kumar Singh
Date:

  K te er Xi

Sent from Nine

From: Jim Nasby <Jim.Nasby@BlueTreble.com>
Sent: 19-Aug-2016 03:32
To: Jeff Janes; Victor Yegorov
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Estimates on partial index

On 8/18/16 3:06 PM, Jeff Janes wrote:
> If you can come up with a data generator which creates data that
> others can use to reproduce this situation, we can then investigate it
> in more detail.

BTW, the easy fix to this is most likely to create an index on due_date
WHERE is_current. Or perhaps partition the table so non-current rows
don't live with current ones. I'm not a fan of mixing history tracking
in with the main table, because it leads to problems like this.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: Estimates on partial index

From
Victor Yegorov
Date:

2016-08-18 23:06 GMT+03:00 Jeff Janes <jeff.janes@gmail.com>:
It does account for them, but perhaps not perfectly.  See "[PERFORM]
index fragmentation on insert-only table with non-unique column" for
some arguments on that which might be relevant to you.

Thanks for pointing this out, good stuff to know.


If you can come up with a data generator which creates data that
others can use to reproduce this situation, we can then investigate it
in more detail.

I do not think this has any value anymore. I have reconfigured the server:
- r_p_c returned to the default 4.0
- turned on track_io_timing
- reindexed all indexes on the table
- went with the suggestion from Jim about partial index on `due_date`, although
  delayed it a bit to get a better view on the situation

Running several explains in a row produces the following:

    Index Scan using idx_loan_agreemnets_loan_id_cond_is_current_true on loan_agreements la  (cost=0.42..21469.40 rows=224331 width=13) (actual time=0.040..2687.422 rows=216440 loops=1)
      Filter: ('2016-08-11'::date > due_date)
      Rows Removed by Filter: 21806
      Buffers: shared hit=226670 read=692 dirtied=48
      I/O Timings: read=9.854
    Planning time: 1885.219 ms
    Execution time: 2712.833 ms
    
    Index Scan using idx_loan_agreemnets_loan_id_cond_is_current_true on loan_agreements la  (cost=0.42..21469.40 rows=224331 width=13) (actual time=426.027..2273.617 rows=216440 loops=1)
      Filter: ('2016-08-11'::date > due_date)
      Rows Removed by Filter: 21806
      Buffers: shared hit=227276
    Planning time: 0.175 ms
    Execution time: 2296.414 ms
    
    Index Scan using idx_loan_agreemnets_loan_id_cond_is_current_true on loan_agreements la  (cost=0.42..21469.40 rows=224331 width=13) (actual time=0.034..297.113 rows=216440 loops=1)
      Filter: ('2016-08-11'::date > due_date)
      Rows Removed by Filter: 21806
      Buffers: shared hit=227276
    Planning time: 0.173 ms
    Execution time: 310.509 ms
    
    Index Scan using idx_loan_agreemnets_loan_id_cond_is_current_true on loan_agreements la  (cost=0.42..21469.40 rows=224331 width=13) (actual time=0.031..286.212 rows=216440 loops=1)
      Filter: ('2016-08-11'::date > due_date)
      Rows Removed by Filter: 21806
      Buffers: shared hit=227276
    Planning time: 0.163 ms
    Execution time: 299.831 ms

This makes me think, that IO is not my issue here and, honestly, I have no clue what can be behind this.

What I noticed — queries do experience this kind of hiccups from time to time. CPU and IO monitoring shows no spikes at all, CPU is below 40% all the time.
Currently I am trying to find out ways how to track down what's going on here.

Still — thanks everyone for the feedback, it was valuable for me!


--
Victor Y. Yegorov

Re: Estimates on partial index

From
Victor Yegorov
Date:
2016-08-18 21:40 GMT+03:00 Victor Yegorov <vyegorov@gmail.com>:
Oh, that's interesting. I was under impression, that r_p_c reflects IO speed, like — make it smaller for SSDs.
To make this query prefer BitmapScan, I need to bump r_p_c to 5.8. And 6.0 makes it switch to SeqScan.

I was looking into different databases and queries around — many of them prefers to use indexes over SeqScans, even if index is not a "perfect" match,
like using index on the 2-nd column of the index (like searching for `rev` via IndexScan over `id,rev` index).
I need to bump r_p_c to 6 (at least) to make things shift towards BtimapScans, and I feel uncertain about such increase.

This makes me thinking — can this situation be an indication, that tables are bloated?
(I've performed reindexing recently, touching majority of indexes around, while tables were not touched.)


--
Victor Y. Yegorov