Re: [HACKERS] GUC for cleanup indexes threshold. - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: [HACKERS] GUC for cleanup indexes threshold.
Msg-id CAD21AoDCniZW_eBGJ8_Wt4of+d7xB1ZKk2X9_w0mZo-NL2Ni8A@mail.gmail.com
In response to Re: [HACKERS] GUC for cleanup indexes threshold.  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-hackers
On Sun, Jan 7, 2018 at 8:40 AM, Peter Geoghegan <pg@bowt.ie> wrote:
> On Sat, Jan 6, 2018 at 2:20 PM, Stephen Frost <sfrost@snowman.net> wrote:
>>> > IIRC the patches that makes the cleanup scan skip has a problem
>>> > pointed by Peter[1], that is that we stash an XID when a btree page is
>>> > deleted, which is used to determine when it's finally safe to recycle
>>> > the page. Yura's patch doesn't have that problem?
>>> >
>>> > [1] https://www.postgresql.org/message-id/CAH2-Wz%3D1%3Dt5fcGGfarQGcAWBqaCh%2BdLMjpYCYHpEyzK8Qg6OrQ%40mail.gmail.com
>
>> Masahiko Sawada, if this patch isn't viable or requires serious rework
>> to be acceptable, then perhaps we should change its status to 'returned
>> with feedback' and you can post a new patch for the next commitfest..?
>
> I believe that the problem that I pointed out with freezing/wraparound
> is a solvable problem. If we think about it carefully, we will come up
> with a good solution. I have tried to get the ball rolling with my
> pd_prune_xid suggestion. I think it's important to not lose sight of
> the fact that the page deletion/recycling XID thing is just one detail
> that we need to think about some more.
>
> I cannot fault Sawada-san for waiting to hear other people's views
> before proceeding. It really needs to be properly discussed.
>

Thank you for commenting.

IIUC we have two approaches. One is based on Peter's suggestion: we
can use pd_prune_xid to store the epoch of the XID at which the page
was deleted. That way, we can correctly mark deleted pages as
recyclable without breaking the on-disk format.
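
To illustrate why storing the epoch helps, here is a minimal sketch,
not code from any posted patch. It assumes the epoch kept in
pd_prune_xid is combined with the 32-bit deletion XID into a 64-bit
value, so the comparison stays correct even after the 32-bit XID
counter wraps around. The names full_xid and
deleted_page_is_recyclable are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;     /* 32-bit XID, wraps around */

/* Hypothetical helper: combine an epoch and a 32-bit XID into 64 bits. */
static uint64_t
full_xid(uint32_t epoch, TransactionId xid)
{
    return ((uint64_t) epoch << 32) | xid;
}

/*
 * Hypothetical check: a deleted page is treated as recyclable once its
 * deletion XID (with epoch) is older than the oldest xmin (with epoch).
 * The 64-bit comparison does not depend on wraparound distance.
 */
static bool
deleted_page_is_recyclable(uint32_t del_epoch, TransactionId del_xid,
                           uint32_t oldest_epoch, TransactionId oldest_xmin)
{
    assert(del_xid != 0);   /* 0 is InvalidTransactionId in PostgreSQL */

    return full_xid(del_epoch, del_xid) < full_xid(oldest_epoch, oldest_xmin);
}
```

For example, a page deleted at XID 100 in epoch 0 compares as older than
an xmin of 3000000000 in the same epoch, a distance at which a plain
32-bit circular comparison would give the wrong answer.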

The other idea was suggested by Sokolov Yura. His original patch adds
a flag to btpo_flags that indicates whether the btree has pages that
are deleted but not yet recyclable. I'd rather store it as a bool in
BTMetaPageData. Currently btm_version is defined as uint32, but I think
we won't use all of its bits. If we store the flag in part of
btm_version, we don't break the on-disk format. However, we currently
assume that vacuum on a btree index always scans the whole btree rather
than a part of it, and this approach would entrench that assumption
further. It could become a restriction in the future.
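
As a rough sketch of the btm_version idea, assuming the version number
stays in the low 16 bits and the top bit carries the flag. The bit
layout and all names below are hypothetical and are not taken from any
patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout: low 16 bits = version, top bit = flag. */
#define BTM_VERSION_MASK            0x0000FFFFu
#define BTM_HAS_UNRECYCLED_PAGES    0x80000000u

/* Pack a version number and the flag into one uint32 field. */
static uint32_t
btm_pack(uint32_t version, bool has_unrecycled)
{
    assert((version & ~BTM_VERSION_MASK) == 0);     /* must fit in low bits */

    return version | (has_unrecycled ? BTM_HAS_UNRECYCLED_PAGES : 0u);
}

/* Extract the version number, ignoring the flag bit. */
static uint32_t
btm_get_version(uint32_t field)
{
    return field & BTM_VERSION_MASK;
}

/* True if the index has deleted pages that are not yet recyclable. */
static bool
btm_has_unrecycled(uint32_t field)
{
    return (field & BTM_HAS_UNRECYCLED_PAGES) != 0;
}
```

With the flag clear, the packed field equals the plain version number,
so existing metapages would read back unchanged. Code that compares the
whole field against a version constant would need to mask it first.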

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
