Re: [HACKERS] GUC for cleanup indexes threshold. - Mailing list pgsql-hackers
From | Kyotaro HORIGUCHI |
---|---|
Subject | Re: [HACKERS] GUC for cleanup indexes threshold. |
Date | |
Msg-id | 20170922.173110.253964775.horiguchi.kyotaro@lab.ntt.co.jp |
In response to | Re: [HACKERS] GUC for cleanup indexes threshold. (Masahiko Sawada <sawada.mshk@gmail.com>) |
Responses | Re: [HACKERS] GUC for cleanup indexes threshold. |
List | pgsql-hackers |
At Fri, 22 Sep 2017 17:21:04 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in <CAD21AoBN9ucgMDuinx2ptU8upEToHnR-A35aBcQyZnLFvWdVPg@mail.gmail.com>
> On Fri, Sep 22, 2017 at 4:16 PM, Kyotaro HORIGUCHI
> <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> > At Fri, 22 Sep 2017 15:00:20 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in <CAD21AoD6zgb1W6ps1aXj0CcAB_chDYiiTNtEdpMhkefGg13-GQ@mail.gmail.com>
> >> On Tue, Sep 19, 2017 at 3:31 PM, Kyotaro HORIGUCHI
> >> <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> >> Could you elaborate on this? For example, in a btree index, the index
> >> cleanup skips the index scan if index_bulk_delete has been
> >> called during vacuuming, because stats != NULL. So I think we don't
> >> need such a flag.
> >
> > The flag works so that two successive full index scans don't
> > happen in a vacuum round. If any rows are fully deleted, the
> > btvacuumcleanup that follows does nothing.
> >
> > I think what you wanted to solve here was the problem that
> > index_vacuum_cleanup runs a full scan even if it ends with no
> > actual work, in manual or anti-wraparound vacuums. (I'm
> > getting a bit confused on this..) It is caused by using the
> > pointer "stats" as the flag to instruct it to do that. If the
> > stats-as-a-flag worked as expected, the GUC doesn't seem to be
> > required.
>
> Hmm, my proposal is that if a table hasn't changed much since the
> previous vacuum, we skip cleaning up the index.
>
> If the table has at least one garbage tuple we do the lazy_vacuum_index and
> then IndexBulkDeleteResult is stored, which causes btvacuumcleanup
> to be skipped. On the other hand, if the table doesn't have any
> garbage but some new tuples were inserted since the previous vacuum, we
> don't do the lazy_vacuum_index but do the lazy_cleanup_index. In this
> case, we always do the lazy_cleanup_index (i.e., we do the full scan)
> even if only one tuple is inserted. That's why I proposed a new GUC
> parameter which allows us to skip the lazy_cleanup_index in that case.

I think the problem raised in this thread is that the last index
scan may leave dangling pages.

> > In addition to that, as Simon and Peter pointed out,
> > index_bulk_delete can leave not-fully-removed pages (so-called
> > half-dead pages and pages that are recyclable but not registered
> > in FSM, AFAICS) in some cases, mainly because of the RecentGlobalXmin
> > interlock. In this case, just inhibiting the cleanup scan by a
> > threshold lets such dangling pages persist in the index. (I
> > couldn't make that many dangling pages, though..)
> >
> > The first patch in the mail (*1) does that. It seems to have some
> > bugs, though..
> >
> >
> > Since the dangling pages persist until autovacuum decides to scan
> > the owning table again, we should run a vacuum round (or
> > index_vacuum_cleanup itself) even when there are no dead rows if we want
> > to clean up such pages within a certain period. The second patch
> > does that.
> >
>
> IIUC half-dead pages are not relevant to this proposal. The proposal
> has two problems;
>
> * By skipping index cleanup we could leave recyclable pages that are
> not marked as recyclable.

Yes.

> * we stash an XID when a btree page is deleted, which is used to
> determine when it's finally safe to recycle the page

Is it a "problem" of this proposal?

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
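
For readers following the thread, the control flow under discussion can be sketched roughly as below. This is only a toy model, not PostgreSQL source: `SketchStats`, `sketch_bulk_delete`, `sketch_vacuum_cleanup`, `sketch_vacuum_round`, `changed_fraction` and `skip_threshold` are all hypothetical names invented for illustration. `SketchStats` stands in for IndexBulkDeleteResult, and the threshold parameter stands in for the proposed GUC, whose actual name and semantics are not given in this message.

```c
#include <stddef.h>

/* Hypothetical stand-in for IndexBulkDeleteResult; it only counts scans. */
typedef struct SketchStats
{
    int full_scans;             /* full index scans done in this round */
} SketchStats;

/* Models index_bulk_delete: scans the whole index and returns stats. */
static SketchStats *
sketch_bulk_delete(SketchStats *storage)
{
    storage->full_scans++;
    return storage;
}

/*
 * Models the btree cleanup behaviour described in the thread: the full
 * scan is done only when no bulk delete ran in this round (stats == NULL).
 */
static SketchStats *
sketch_vacuum_cleanup(SketchStats *stats, SketchStats *storage)
{
    if (stats == NULL)
    {
        storage->full_scans++;
        stats = storage;
    }
    return stats;
}

/*
 * One vacuum round. Returns the number of full index scans performed.
 * changed_fraction and skip_threshold are assumptions modelling the
 * proposed "skip cleanup if the table changed little" rule.
 */
static int
sketch_vacuum_round(long dead_tuples, double changed_fraction,
                    double skip_threshold)
{
    SketchStats storage = {0};
    SketchStats *stats = NULL;

    if (dead_tuples > 0)
        stats = sketch_bulk_delete(&storage);   /* lazy_vacuum_index path */
    else if (changed_fraction < skip_threshold)
        return storage.full_scans;              /* cleanup skipped: any dangling
                                                 * pages stay until a later round */

    stats = sketch_vacuum_cleanup(stats, &storage); /* lazy_cleanup_index path */
    return stats->full_scans;
}
```

In this model a round with dead tuples does one full scan (the bulk delete), a round with no dead tuples and a threshold of zero also does one full scan (the cleanup), and only a non-zero threshold avoids the scan entirely, which is exactly the case where recyclable-but-unregistered pages would be left behind.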