Re: [HACKERS] GUC for cleanup indexes threshold. - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: [HACKERS] GUC for cleanup indexes threshold.
Date
Msg-id CAD21AoBWhBskHP6GEeAgkcq_8euPfjDbDJ80XFYMVd5f==iH_A@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] GUC for cleanup indexes threshold.  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-hackers
On Sat, Mar 4, 2017 at 4:37 AM, Peter Geoghegan <pg@bowt.ie> wrote:
> On Fri, Mar 3, 2017 at 8:13 AM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>> Thank you for clarification. Let me check my understanding. IIUC,
>> skipping second index vacuum path (lazy_cleanup_index) can not be
>> cause of leaving page as half-dead state but could leave recyclable
>> pages that are not marked as a recyclable. But second one, it can be
>> reclaimed by next index vacuum because btvacuumpage calls
>> RecordFreeIndexPage for recyclable page. Am I missing something?
>
> I think that this is a good summary. You have not missed anything.
>
>> My first motivation of this patch is to skip the second index vacuum
>> patch when vacuum skipped whole table by visibility map.
>
> Makes sense.
>
>> But as Robert
>> suggested on another thread, I changed it to have a threshold. If my
>> understanding is correct, we can have a threshold that specifies the
>> fraction of the scanned pages by vacuum. If it's set 0.1,
>> lazy_scan_heap can do the second vacuum index only when 10% of table
>> is scanned. IOW, if 90% of table pages is skipped, which means almost
>> of table has not changed since previous vacuum, we can skip the second
>> index vacuum.
>
> It sounds like a setting of 0.1 would leave us in a position where
> it's very unlikely that a VACUUM of indexes could fail to occur when
> autovacuum has been triggered in the common way, by the "vacuum
> threshold" having been exceeded. Does this feeling that I have about
> it seem accurate to you? Is that actually your intent? It's hard to
> really be sure, because there are so many confounding factors (e.g.
> HOT), but if that was 100% true, then I suppose there would
> theoretically be zero new risk (except, perhaps, of the "other type of
> bloat" that I described, which I am not very worried about).
>
> Please verify my understanding of your thought process: We don't have
> to freeze indexes at all, ever, so if we see index bloat as a separate
> problem, we also see that there is no need to *link* index needs to
> the need for freezing. XID burn rate is a very bad proxy for how
> bloated an index may be. Besides, we already have a separate trigger
> for the thing that *actually* matters to indexes (the vacuum threshold
> stuff).
>
> Maybe 0.0 is too low as a default for
> vacuum_cleanup_index_scale_factor, even though initially it seemed
> attractive to me for theoretical reasons. Something like 0.01 is still
> "practically zero", but removes the extreme sensitivity that would
> have with 0.0. So, 0.01 might make sense as a default for roughly the
> same reason that autovacuum_vacuum_threshold exists. (Maybe it should
> be more like autovacuum_vacuum_threshold, in that it's an absolute
> minimum number of heap blocks to trigger index clean-up.)

Agreed. Attached current design patch. Also I've changed default value to 0.01.

Remaining issues are two issues; first is what you mentioned about
that pg_class.relfrozenxid/pg_database.datfrozenxid could be advanced
past opaque->btpo.xact. The second is to find out the impact to type
of indexes other than btree. I'm investigating the second one now.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

pgsql-hackers by date:

Previous
From: Haribabu Kommi
Date:
Subject: Re: [HACKERS] ANALYZE command progress checker
Next
From: Amit Langote
Date:
Subject: Re: [HACKERS] contrib modules and relkind check