Re: [HACKERS] GUC for cleanup indexes threshold. - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: [HACKERS] GUC for cleanup indexes threshold.
Msg-id CAD21AoCEowCyd5KpV5XAj+41gJShE8AXW2woXHweRFRfX577Bw@mail.gmail.com
In response to Re: [HACKERS] GUC for cleanup indexes threshold.  (Alexander Korotkov <a.korotkov@postgrespro.ru>)
Responses Re: [HACKERS] GUC for cleanup indexes threshold.  (Alexander Korotkov <a.korotkov@postgrespro.ru>)
List pgsql-hackers
On Sun, Mar 4, 2018 at 8:59 AM, Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
> On Fri, Mar 2, 2018 at 10:53 AM, Masahiko Sawada <sawada.mshk@gmail.com>
> wrote:
>>
>> > 2) In the append-only case, index statistics can lag indefinitely.
>>
>> The original proposal introduced a new GUC that specifies the fraction
>> of modified pages that triggers index cleanup.
>
>
> Regarding the original proposal, I didn't get what exactly it's intended
> to do.
> You're checking if vacuumed_pages >= nblocks * vacuum_cleanup_index_scale.
> But vacuumed_pages is a variable which can be incremented when
> no indexes exist on the table.  When indexes are present, this variable is
> always zero.  I assume it's intended to compare the number of pages where
> at least one tuple is deleted to nblocks * vacuum_cleanup_index_scale.
> But that is also not an option for us, because we're going to optimize the
> case when exactly zero tuples are deleted by vacuum.

In the latest v4 patch, I compare scanned_pages against the threshold:
if the number of pages modified since the last vacuum is larger than
the threshold, we force an index cleanup.
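
To make that comparison concrete, here is a minimal sketch in C. It is
not code from the patch: the helper name cleanup_threshold_exceeded, the
cleanup_scale parameter, and the BlockNumber typedef are stand-ins I
made up for illustration, and I use a strict ">" to match the wording
above, whereas the check quoted earlier uses ">=".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for PostgreSQL's BlockNumber (an unsigned 32-bit block count). */
typedef uint32_t BlockNumber;

/*
 * Hypothetical helper, not taken from the patch.
 *
 * Returns true when the number of heap pages scanned by this vacuum is
 * larger than cleanup_scale * nblocks, i.e. when index cleanup should be
 * forced under the rule described above.
 */
static bool
cleanup_threshold_exceeded(BlockNumber scanned_pages,
                           BlockNumber nblocks,
                           double cleanup_scale)
{
    double      threshold;

    /* A negative fraction makes no sense for this check. */
    assert(cleanup_scale >= 0.0);

    threshold = (double) nblocks * cleanup_scale;

    return (double) scanned_pages > threshold;
}
```

With a 1000-page table and a scale of 0.1, scanning 200 pages would
force the cleanup and scanning 50 pages would not.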

> The thing I'm going to propose is to add the estimated number of tuples
> in the table to IndexVacuumInfo.  Then B-tree can memorize in the
> meta-page the number of tuples at the time the index was last scanned.
> If the passed value differs too much from the value in the meta-page,
> then cleanup is forced.
>
> Any better ideas?

I think that would work, but I'm concerned about metapage format
compatibility. Also, since I haven't fully investigated index cleanup
for the other index types, I'm not sure that interface makes sense for
them. It may not be better, but an alternative is to add a condition
(Irel[i]->rd_rel->relam == BTREE_AM_OID) in lazy_scan_heap.
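
A rough sketch of the two ideas, only to show their shape. Everything
here is an assumption for illustration: tuple_count_drifted,
should_cleanup_index, and max_fraction are made-up names, "differs too
much" is interpreted as a relative difference, and the Oid typedef and
the BTREE_AM_OID define are local stand-ins for what PostgreSQL's
headers already provide.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins; PostgreSQL's own headers define both of these. */
typedef uint32_t Oid;
#define BTREE_AM_OID 403        /* OID of the btree entry in pg_am */

/*
 * Hypothetical: compare the heap tuple estimate passed in by vacuum with
 * the count remembered when the index was last scanned.  Returns true
 * when they differ by more than max_fraction of the remembered count.
 */
static bool
tuple_count_drifted(double heap_tuples_now,
                    double heap_tuples_at_last_scan,
                    double max_fraction)
{
    double      diff;

    assert(max_fraction >= 0.0);

    /* No baseline recorded yet: any tuples at all count as drift. */
    if (heap_tuples_at_last_scan <= 0.0)
        return heap_tuples_now > 0.0;

    diff = heap_tuples_now - heap_tuples_at_last_scan;
    if (diff < 0.0)
        diff = -diff;

    return diff > max_fraction * heap_tuples_at_last_scan;
}

/*
 * Hypothetical: restrict the skip-cleanup optimization to B-tree.  Any
 * other access method is always cleaned up; for B-tree the decision is
 * left to whichever heuristic is in use.
 */
static bool
should_cleanup_index(Oid relam, bool heuristic_wants_cleanup)
{
    if (relam != BTREE_AM_OID)
        return true;

    return heuristic_wants_cleanup;
}
```

The second function takes the heuristic's answer as a plain boolean so
that it does not assume which of the two checks, page threshold or
tuple-count drift, ends up being used.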

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

