Re: decoupling table and index vacuum - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: decoupling table and index vacuum
Date
Msg-id CAD21AoA8R5UevMTYqM7Ytv==aNWZxBmSMmcvOenUHxvRiRAUBA@mail.gmail.com
In response to Re: decoupling table and index vacuum  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On Fri, Apr 23, 2021 at 5:01 AM Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2021-04-22 12:15:27 -0400, Robert Haas wrote:
> > On Wed, Apr 21, 2021 at 5:38 PM Andres Freund <andres@anarazel.de> wrote:
> > > I'm not sure that's the only way to deal with this. While some form of
> > > generic "conveyor belt" infrastructure would be a useful building block,
> > > and it'd be sensible to use it here if it existed, it seems feasible to
> > > store dead tids in a different way here. You could e.g. have per-heap-vacuum
> > > files with a header containing LSNs that indicate the age of the
> > > contents.
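
For illustration only, here is a rough sketch of what such a per-vacuum
dead-TID file header could look like; the type and field names below are
invented, not existing PostgreSQL structures:

#include "postgres.h"
#include "access/xlogdefs.h"    /* XLogRecPtr */

/*
 * Hypothetical header for a per-heap-vacuum dead-TID file.  The LSNs
 * record when the contents were written, so a reader can judge how old
 * they are relative to the heap and whether the file can still be
 * trusted (e.g. after a crash).
 */
typedef struct DeadTidFileHeader
{
    uint32      magic;          /* file format identifier */
    uint32      version;        /* file format version */
    XLogRecPtr  start_lsn;      /* WAL insert position at file creation */
    XLogRecPtr  end_lsn;        /* WAL insert position of the last append */
    uint64      num_tids;       /* number of ItemPointerData that follow */
} DeadTidFileHeader;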
> >
> > That's true, but I have some reservations about being overly reliant on
> > the filesystem to provide structure here. There are good reasons to be
> > worried about bloating the number of files in the data directory. Hmm,
> > but maybe we could mitigate that. First, we could skip this for small
> > relations. If you can vacuum the table and all of its indexes using
> > the naive algorithm in <10 seconds, you probably shouldn't do anything
> > fancy. That would *greatly* reduce the number of additional files
> > generated. Second, we could forget about treating them as separate
> > relation forks and make them some other kind of thing entirely, in a
> > separate directory.
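
Purely as a sketch of the first mitigation, using a size threshold as a
crude proxy for the "can be vacuumed naively in under ~10 seconds" test
(the function name and the cutoff below are made up, just to illustrate
the shape of the check):

#include "postgres.h"
#include "storage/bufmgr.h"     /* RelationGetNumberOfBlocks() */
#include "utils/rel.h"

/*
 * Hypothetical policy check: only use separate dead-TID storage for
 * tables big enough that the naive "vacuum heap and all indexes in one
 * pass" approach is expected to be slow.  The cutoff of 8192 blocks
 * (~64MB with 8kB pages) is arbitrary.
 */
static bool
use_separate_dead_tid_storage(Relation heapRel)
{
    return RelationGetNumberOfBlocks(heapRel) > 8192;
}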
>
> I'm not *too* worried about this issue. IMO the big difference to the
> cost of additional relation forks is that such files would only exist
> when the table is modified to a somewhat meaningful degree. IME the
> practical issues with the number of files due to forks are cases where a
> huge number of tables exist that are practically never modified.
>
> That's not to say that I am sure that some form of "conveyor belt"
> storage *wouldn't* be the right thing. How were you thinking of dealing
> with the per-relation aspects of this? One conveyor belt per relation?
>
>
> > especially if we adopted Sawada-san's proposal to skip WAL logging. I
> > don't know if that proposal is actually a good idea, because it
> > effectively adds a performance penalty when you crash or fail over,
> > and that sort of thing can be an unpleasant surprise.  But it's
> > something to think about.
>
> I'm doubtful about skipping WAL logging entirely - I'd have to think
> harder about it, but I think that'd mean we'd restart from scratch after
> crashes / immediate restarts as well, because we couldn't rely on the
> contents of the "dead tid" files to be accurate. In addition to the
> replication issues you mention.

Yeah, not having WAL would have a big negative impact on various other
aspects. Can we piggyback the WAL for the TID fork on XLOG_HEAP2_PRUNE?
That is, we add the buffer for the TID fork to the XLOG_HEAP2_PRUNE
record and also record the 64-bit number of the first dead TID in the
list, so that we can add the dead TIDs to the TID fork while replaying
XLOG_HEAP2_PRUNE.
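
To be a bit more concrete, I'm imagining something like the sketch
below; the new field name and the extra block reference are tentative,
not a proposal of the actual record format:

/*
 * Sketch based on the existing xl_heap_prune record, with one tentative
 * addition.  Pruning would register the TID-fork page as an additional
 * block reference (another XLogRegisterBuffer() call), and the record
 * would carry the 64-bit position of the first dead TID being appended,
 * so that heap_xlog_prune() can redo the append into the TID fork.
 */
typedef struct xl_heap_prune
{
    TransactionId latestRemovedXid;
    uint16      nredirected;
    uint16      ndead;
    uint64      first_dead_tid; /* NEW (tentative): position in the TID
                                 * fork of the first dead TID appended by
                                 * this prune record */
    /* OFFSET NUMBERS are in the block reference 0 */
    /* the TID-fork page would be registered as block reference 1 */
} xl_heap_prune;

During replay, heap_xlog_prune() would then restore the dead TIDs into
the registered TID-fork page starting at first_dead_tid, in addition to
what it already does to the heap page.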

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/


