Re: [HACKERS] GUC for cleanup indexes threshold. - Mailing list pgsql-hackers

From Kyotaro HORIGUCHI
Subject Re: [HACKERS] GUC for cleanup indexes threshold.
Date
Msg-id 20180319.144505.166111203.horiguchi.kyotaro@lab.ntt.co.jp
In response to Re: [HACKERS] GUC for cleanup indexes threshold.  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses Re: [HACKERS] GUC for cleanup indexes threshold.  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
Re: [HACKERS] GUC for cleanup indexes threshold.  (Masahiko Sawada <sawada.mshk@gmail.com>)
Re: [HACKERS] GUC for cleanup indexes threshold.  (Alexander Korotkov <a.korotkov@postgrespro.ru>)
List pgsql-hackers
At Mon, 19 Mar 2018 11:12:58 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in
<CAD21AoAB8tQg9xwojupUJjKD=fMhtx6thDEPENDdhftVLWcR8A@mail.gmail.com>
> On Wed, Mar 14, 2018 at 9:25 PM, Alexander Korotkov
> <a.korotkov@postgrespro.ru> wrote:
> > On Wed, Mar 14, 2018 at 7:40 AM, Masahiko Sawada <sawada.mshk@gmail.com>
> > wrote:
> >>
> >> On Sat, Mar 10, 2018 at 3:40 AM, Alexander Korotkov
> >> <a.korotkov@postgrespro.ru> wrote:
> >> > On Fri, Mar 9, 2018 at 3:12 PM, Masahiko Sawada <sawada.mshk@gmail.com>
> >> > wrote:
> >> >>
> >> >> On Fri, Mar 9, 2018 at 8:43 AM, Alexander Korotkov
> >> >> <a.korotkov@postgrespro.ru> wrote:
> >> >> > 2) These parameters are reset during btbulkdelete() and set during
> >> >> > btvacuumcleanup().
> >> >>
> >> >> Can't we set these parameters even during btbulkdelete()? By keeping
> >> >> them up to date, we will able to avoid an unnecessary cleanup vacuums
> >> >> even after index bulk-delete.
> >> >
> >> >
> >> > We certainly can update cleanup-related parameters during
> >> > btbulkdelete().
> >> > However, in this case we would update B-tree meta-page during each
> >> > VACUUM cycle.  That may cause some overhead for non append-only
> >> > workloads.  I don't think this overhead would be sensible, because in
> >> > non append-only scenarios VACUUM typically writes much more of
> >> > information.
> >> > But I would like this oriented to append-only workload patch to be
> >> > as harmless as possible for other workloads.
> >>
> >> What overhead are you referring here? I guess the overhead is only the
> >> calculating the oldest btpo.xact. And I think it would be harmless.
> >
> >
> > I meant overhead of setting last_cleanup_num_heap_tuples after every
> > btbulkdelete with wal-logging of meta-page.  I bet it also would be
> > harmless, but I think that needs some testing.
> 
> Agreed.
> 
> After more thought, it might be too late but we can consider the
> possibility of another idea proposed by Peter. Attached patch
> addresses the original issue of index cleanups by storing the epoch
> number of page deletion XID into PageHeader->pd_prune_xid which is
> 4byte field.

Mmm. It seems to me that the story is coming back around to the
beginning. Could I try retelling it?

I understand that the initial problem was that vacuum runs
apparently unnecessary full scans on indexes many times. The
reason is that a cleanup scan may leave some (or, under certain
conditions, many) dead pages unrecycled, and we don't know
whether another cleanup is needed or not. Those pages will remain
forever unless we run additional cleanup scans at the appropriate
timing.

(If I understand it correctly,) Sawada-san's latest proposal is
(fundamentally the same as the first one,) just skipping the
cleanup scan if the vacuum scan just before it found that the
number of *live* tuples has increased. If there were many
deletions and insertions but no increase in the total number of
tuples, we don't run a cleanup. Consequently it had a wraparound
problem, which is addressed in this version.

(Ditto.) Alexander proposed to record the oldest xid of
recyclable pages in the metapage (along with the number of tuples
at the last cleanup). This prevents needless cleanup scans while
still reliably running cleanups to remove all recyclable pages.

I think that we can accept Sawada-san's proposal if we accept the
fact that indexes can retain recyclable pages for a long
time. (Honestly I don't think so.)

If (as I might have mentioned upthread for Yura's patch,) we
accept holding the information in the index meta page,
Alexander's way would be preferable. The difference between
Yura's and Alexander's is that the former runs a cleanup scan
whenever a recyclable page is present, while the latter avoids
the scan until some recyclable pages are known to be removable.

>               Comparing to the current proposed patch this patch
> doesn't need neither the page upgrade code nor extra WAL-logging. If

# By the way, my proposal was to store the information Yura
# proposed in the stats collector instead. The information may be
# available a bit late, but that does no harm. This needs neither
# extra WAL logging nor the upgrade code :p

> we also want to address cases other than append-only case we will

I'm afraid that "the problem for the other cases" is a new one
that this patch introduces, not an existing one.

> require the bulk-delete method of scanning whole index and of logging
> WAL. But it leads some extra overhead. With this patch we no longer
> need to depend on the full scan on b-tree index. This might be useful
> for a future when we make the bulk-delete of b-tree index not scan
> whole index.

Perhaps I'm taking something incorrectly, but isn't that just the
result of skipping "maybe needed" scans without considering the
actual necessity?

I also don't like extra WAL logging, but it happens only once (or
twice?) per vacuum cycle (for every index). On the other hand, I
want to keep the on-the-fly upgrade path out of the ordinary
path. (Reviving pg_upgrade's custom module?)

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


