Re: [HACKERS] GUC for cleanup indexes threshold. - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: [HACKERS] GUC for cleanup indexes threshold.
Date
Msg-id CAD21AoA76m07oDC0qyEojfkB=HP+A=xAa+AXgLgJT_G3A=ZSZQ@mail.gmail.com
In response to Re: [HACKERS] GUC for cleanup indexes threshold.  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Wed, Mar 15, 2017 at 4:50 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Thu, Mar 9, 2017 at 10:21 PM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>> On Wed, Mar 8, 2017 at 1:43 AM, Peter Geoghegan <pg@bowt.ie> wrote:
>>> On Sat, Mar 4, 2017 at 1:30 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>>>> While I can't see this explained anywhere, I'm
>>>>> pretty sure that that's supposed to be impossible, which this patch
>>>>> changes.
>>>>>
>>>>
>>>> What makes you think that patch will allow pg_class.relfrozenxid to be
>>>> advanced past opaque->btpo.xact which was previously not possible?
>>>
>>> By not reliably recycling pages in a timely manner, we won't change
>>> anything about the dead page. It just sticks around. This is mostly
>>> fine, but we still need VACUUM to be able to reason about it (to
>>> determine if it is in fact recyclable), which is what I'm concerned
>>> about breaking here. It still needs to be *possible* to recycle any
>>> recyclable page at some point (e.g., when we find it convenient).
>>>
>>> pg_class.relfrozenxid is InvalidTransactionId for indexes because
>>> indexes generally don't store XIDs. This is the one exception that I'm
>>> aware of, presumably justified by the fact that it's only for
>>> recyclable pages anyway, and those are currently *guaranteed* to get
>>> recycled quickly. In particular, they're guaranteed to get recycled by
>>> the next VACUUM. They may be recycled in the course of anti-wraparound
>>> VACUUM, even if VACUUM has no garbage tuples to kill (even if we only
>>> do lazy_cleanup_index() instead of lazy_vacuum_index()). This is the
>>> case that this patch proposes to have us skip touching indexes for.
>>>
>>
>> To prevent this, I think we need to not skip lazy_cleanup_index
>> during anti-wraparound vacuum (aggressive = true), even if the number
>> of scanned pages is less than the threshold. This ensures that
>> pg_class.relfrozenxid is not advanced past opaque->btpo.xact with
>> minimal pain. Even if a dead btree page is left behind, subsequent
>> modifications create garbage on the heap, and autovacuum then
>> recycles that page during index vacuuming (lazy_vacuum_index).
>>
>
> What about if somebody does manual vacuum and there are no garbage
> tuples to clean, won't in that case also you want to avoid skipping
> the lazy_cleanup_index?

Yes, in that case lazy_cleanup_index will be skipped.

> Another option could be to skip updating the
> relfrozenxid if we have skipped the index cleanup.

That could make anti-wraparound VACUUM occur more frequently, and we
would then be unable to skip lazy_cleanup_index during aggressive
vacuum after all.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


