Re: [HACKERS] GUC for cleanup indexes threshold. - Mailing list pgsql-hackers

From Alexander Korotkov
Subject Re: [HACKERS] GUC for cleanup indexes threshold.
Msg-id CAPpHfdsyX_kOAkcKOcgYtWh0zSjP7i_U85r225mggP4-qpP7OA@mail.gmail.com
In response to Re: [HACKERS] GUC for cleanup indexes threshold.  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses Re: [HACKERS] GUC for cleanup indexes threshold.  (Alexander Korotkov <a.korotkov@postgrespro.ru>)
List pgsql-hackers
On Mon, Mar 5, 2018 at 5:56 AM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Sun, Mar 4, 2018 at 8:59 AM, Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
> On Fri, Mar 2, 2018 at 10:53 AM, Masahiko Sawada <sawada.mshk@gmail.com>
> wrote:
>>
>> > 2) In the append-only case, index statistics can lag indefinitely.
>>
>> The original proposal introduced a new GUC that specifies the fraction
>> of modified pages that triggers index cleanup.
>
>
> Regarding the original proposal, I didn't get what exactly it's intended
> to do.  You're checking whether
> vacuumed_pages >= nblocks * vacuum_cleanup_index_scale.
> But vacuumed_pages is a variable which can be incremented only when
> no indexes exist on the table.  When indexes are present, this variable is
> always zero.  I assume the intent is to compare the number of pages where
> at least one tuple was deleted against nblocks * vacuum_cleanup_index_scale.
> But that is also not an option for us, because we're going to optimize the
> case when exactly zero tuples are deleted by vacuum.

In the latest v4 patch, I compare scanned_pages with the threshold,
which means that if the number of pages modified since the last
vacuum is larger than the threshold, we force index cleanup.

Right, sorry, I overlooked that.  However, even if we use the number of
pages, I would still prefer a cumulative measure, so that the number of
vacuums is taken into account even if each of them touched only a small
number of pages.
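
To illustrate what a cumulative measure could look like, here is a minimal sketch. CleanupTracker and tracker_note_vacuum() are hypothetical names, not PostgreSQL code; the sketch only assumes that the pages touched by each vacuum are summed up and that the sum is reset once an index cleanup scan is performed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-index bookkeeping; not a PostgreSQL structure. */
typedef struct CleanupTracker
{
    double      pages_since_cleanup;    /* pages touched by vacuums since the
                                         * last index cleanup scan */
} CleanupTracker;

/*
 * Record one vacuum run and report whether index cleanup should be forced.
 * nblocks is the current table size in pages, scale is the threshold
 * fraction (the role played by vacuum_cleanup_index_scale in the thread).
 */
bool
tracker_note_vacuum(CleanupTracker *t, double scanned_pages,
                    double nblocks, double scale)
{
    assert(scanned_pages >= 0 && nblocks >= 0 && scale >= 0);

    /* Accumulate across vacuums instead of judging each run in isolation. */
    t->pages_since_cleanup += scanned_pages;

    if (t->pages_since_cleanup >= nblocks * scale)
    {
        t->pages_since_cleanup = 0;     /* the cleanup scan resets the sum */
        return true;
    }
    return false;
}
```

With a per-vacuum check, many vacuums that each touch slightly fewer pages than the threshold would never trigger cleanup; with the running sum they eventually do.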
 
> The thing I'm going to propose is to add the estimated number of tuples
> in the table to IndexVacuumInfo.  Then B-tree can remember, in the
> meta-page, the number of tuples at the time the index was last scanned.
> If the passed value differs too much from the value in the meta-page,
> then cleanup is forced.
>
> Any better ideas?
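
The proposal quoted above could be sketched roughly as follows. MetaSketch, cleanup_needed(), remember_cleanup() and the max_growth parameter are illustrative assumptions, not existing PostgreSQL definitions, and "differs too much" is interpreted here as relative growth of the tuple estimate.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a field stored in the B-tree meta-page. */
typedef struct MetaSketch
{
    double      last_cleanup_num_tuples;    /* table tuple estimate at the
                                             * last full index scan;
                                             * negative means "unknown" */
} MetaSketch;

/*
 * Decide whether cleanup must scan the index.  num_heap_tuples is the
 * estimate passed in by vacuum; max_growth is the tolerated relative
 * growth since the last scan (an assumed tuning knob).
 */
bool
cleanup_needed(const MetaSketch *meta, double num_heap_tuples,
               double max_growth)
{
    assert(num_heap_tuples >= 0 && max_growth >= 0);

    if (meta->last_cleanup_num_tuples < 0)
        return true;            /* no recorded scan yet */

    return num_heap_tuples >
        meta->last_cleanup_num_tuples * (1.0 + max_growth);
}

/* Called after a full index scan to record the current estimate. */
void
remember_cleanup(MetaSketch *meta, double num_heap_tuples)
{
    meta->last_cleanup_num_tuples = num_heap_tuples;
}
```

Because the comparison is against the value recorded at the last scan, growth from many small append-only vacuums adds up, so index statistics cannot lag indefinitely.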

I think that would work. But I'm concerned about metapage format
compatibility.

That's not a show-stopper.  The B-tree meta page has a version number, so
it's no problem to provide an online update.
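
As a rough illustration of how a version number permits such an update, consider the sketch below. The struct, the version constants and the function names are all hypothetical and chosen for the example only; the real meta-page layout is not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative version numbers; assumptions for this sketch only. */
#define SKETCH_OLD_VERSION 2
#define SKETCH_NEW_VERSION 3

/* Hypothetical meta-page image. */
typedef struct MetaPageSketch
{
    uint32_t    version;
    double      last_cleanup_num_tuples;    /* meaningful only when
                                             * version >= SKETCH_NEW_VERSION */
} MetaPageSketch;

/* Readers treat an old-format page as "no value recorded yet". */
double
meta_read_last_cleanup(const MetaPageSketch *meta)
{
    assert(meta->version >= SKETCH_OLD_VERSION);

    if (meta->version < SKETCH_NEW_VERSION)
        return -1.0;
    return meta->last_cleanup_num_tuples;
}

/* Writers upgrade the page in place the first time they store a value. */
void
meta_write_last_cleanup(MetaPageSketch *meta, double num_tuples)
{
    assert(meta->version >= SKETCH_OLD_VERSION);

    if (meta->version < SKETCH_NEW_VERSION)
        meta->version = SKETCH_NEW_VERSION;
    meta->last_cleanup_num_tuples = num_tuples;
}
```

Old pages stay readable until the first write, so no offline conversion step is needed in this scheme.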
 
And since I've not fully investigated index cleanup for other index
types, I'm not sure that interface makes sense. It might not be better,
but an alternative idea is to add a condition
(Irel[i]->rd_rel->relam == BTREE_AM_OID) in lazy_scan_heap.

I meant putting this logic *inside* btvacuumcleanup() while passing the
required measure through IndexVacuumInfo, which is accessible from
btvacuumcleanup().
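
A minimal sketch of that division of labour, under stated assumptions: VacuumInfoSketch stands in for IndexVacuumInfo, bt_cleanup_sketch() for btvacuumcleanup(), and the 10% growth limit is an arbitrary placeholder. None of these are the real PostgreSQL definitions. The point is only that the caller treats every index type the same way and needs no access-method check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for IndexVacuumInfo. */
typedef struct VacuumInfoSketch
{
    double      num_heap_tuples;    /* measure passed down by vacuum */
} VacuumInfoSketch;

/* Per-access-method cleanup callback; returns true if the index was scanned. */
typedef bool (*cleanup_fn) (const VacuumInfoSketch *info, void *am_state);

/* Hypothetical B-tree state (what the meta-page would hold). */
typedef struct BtStateSketch
{
    double      last_cleanup_num_tuples;
} BtStateSketch;

/* B-tree decides by itself whether the scan can be skipped. */
bool
bt_cleanup_sketch(const VacuumInfoSketch *info, void *am_state)
{
    BtStateSketch *bt = am_state;

    if (info->num_heap_tuples <= bt->last_cleanup_num_tuples * 1.1)
        return false;           /* little change since last scan: skip */

    bt->last_cleanup_num_tuples = info->num_heap_tuples;
    return true;
}

/* Other index types keep their existing behaviour: always scan. */
bool
always_cleanup_sketch(const VacuumInfoSketch *info, void *am_state)
{
    (void) info;
    (void) am_state;
    return true;
}

/* Caller side: no check of the index type, just invoke each callback. */
int
cleanup_all_sketch(const VacuumInfoSketch *info, cleanup_fn *fns,
                   void **states, int nindexes)
{
    int         scanned = 0;

    assert(nindexes >= 0);
    for (int i = 0; i < nindexes; i++)
    {
        if (fns[i] (info, states[i]))
            scanned++;
    }
    return scanned;
}
```

Keeping the decision in the callback means other index types are unaffected until they opt in to the same optimization.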

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

 
