Re: xid wraparound danger due to INDEX_CLEANUP false - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: xid wraparound danger due to INDEX_CLEANUP false
Date
Msg-id CAH2-WznWj_J4LK2d8Uu1AhE09DSBJY1hezvb+SwNtVnt8z6kcQ@mail.gmail.com
Whole thread Raw
In response to Re: xid wraparound danger due to INDEX_CLEANUP false  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: xid wraparound danger due to INDEX_CLEANUP false  (Masahiko Sawada <masahiko.sawada@2ndquadrant.com>)
List pgsql-hackers
On Fri, Jun 26, 2020 at 5:39 AM Robert Haas <robertmhaas@gmail.com> wrote:
> My opinion is that there's no need to change the code in the
> back-branches, and that I don't really like the approach in master
> either.

I guess it's hard to see a way that we could fix this in the
backbranches, provided we aren't willing to tolerate a big refactor,
or a cleanup scan of the index (note that I mean btvacuumcleanup(),
not btvacuumscan(), which is quite different).

> I think what we're saying is that there is no worse consequence to
> turning off index_cleanup than some bloat that isn't likely to be
> recovered unless you REINDEX.

That's true.

> Now, what about master? I think it's fine to offer the AM a callback
> even when index_cleanup = false, for example so that it can freeze
> something in its metapage, but I don't agree with passing it the TIDs.
> That seems like it's just inviting it to ignore the emergency brake,
> and it's also incurring real overhead, because remembering all those
> TIDs can use a lot of memory.

You don't have to do anything with TIDs passed from vacuumlazy.c to
recycle pages that need to be recycled, since you only have to go
through btvacuumcleanup() to avoid the problem that we're talking
about (you don't have to call btvacuumscan() to kill TIDs that
vacuumlazy.c will have pruned). Killing TIDs/tuples in the index was
never something that would make sense, even within the confines of the
existing flawed nbtree recycling design. However, you do need to scan
the entire index to do that much. FWIW, that doesn't seem like it
*totally* violates the spirit of "index_cleanup = false", since you're
still not doing most of the usual nbtree vacuuming stuff (even though
you have to scan the index, there is still much less work total).

> If that API limitation causes a problem
> for some future index AM, that will be a good point to discuss when
> the patch for said AM is submitted for review. I entirely agree with
> you that the way btree arranges for btree recycling is crude, and I
> would be delighted if you want to improve it, either for v14 or for
> any future release, or if somebody else wants to do so. However, even
> if that never happens, so what?

I think that it's important to be able to describe an ideal (though
still realistic) design, even if it might remain aspirational for a
long time. I suspect that pushing the mechanism down into index AMs
has other non-obvious benefits.

> In retrospect, I regret committing this patch without better
> understanding the issues in this area. That was a fail on my part. At
> the same time, it doesn't really sound like the issues are all that
> bad. The potential index bloat does suck, but it can still suck less
> than the alternatives, and we have evidence that for at least one
> user, it was worth a major version upgrade just to replace the
> suckitude they had with the suckitude this patch creates.

I actually agree -- this is a really important feature, and I'm glad
that we have it. Even in this slightly flawed form. I remember a great
need for the feature back when I was involved in supporting Postgres
in production.

-- 
Peter Geoghegan



pgsql-hackers by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: Default setting for enable_hashagg_disk
Next
From: Ranier Vilela
Date:
Subject: Re: Possible NULL dereferencing (src/backend/tcop/pquery.c)