Re: xid wraparound danger due to INDEX_CLEANUP false - Mailing list pgsql-hackers

From Robert Haas
Subject Re: xid wraparound danger due to INDEX_CLEANUP false
Date
Msg-id CA+TgmoYSLS4OFBmKcQL7+D1QqfeiyU-Hg-_PRHNUcjsZmew8Yg@mail.gmail.com
In response to Re: xid wraparound danger due to INDEX_CLEANUP false  (Masahiko Sawada <sawada.mshk@gmail.com>)
List pgsql-hackers
On Thu, Nov 19, 2020 at 8:58 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> For HEAD, there was a discussion that we change lazy vacuum and
> bulkdelete and vacuumcleanup APIs so that it calls these APIs even
> when INDEX_CLEANUP is specified. That is, when INDEX_CLEANUP false is
> specified, it collects dead tuple TIDs into maintenance_work_mem space
> and passes the flag indicating INDEX_CLEANUP is specified or not to
> index AMs. Index AM decides whether doing bulkdelete/vacuumcleanup. A
> downside of this idea would be that we will end up using
> maintenance_work_mem even if all index AMs of the table don't do
> bulkdelete/vacuumcleanup at all.
>
> The second idea I came up with is to add an index AM API  (say,
> amcanskipindexcleanup = true/false) telling index cleanup is skippable
> or not. Lazy vacuum checks this flag for each index on the table
> before starting. If index cleanup is skippable in all indexes, it can
> choose one-pass vacuum, meaning no need to collect dead tuple TIDs in
> maintenance_work_mem. All in-core index AM will set to true. Perhaps
> it’s true (skippable) by default for backward compatibility.
>
> The in-core AMs including btree indexes will work same as before. This
> fix is to make it more desirable behavior and possibly to help other
> AMs that require to call vacuumcleanup in all cases. Once we fix it I
> wonder if we can disable index cleanup when autovacuum’s
> anti-wraparound vacuum.
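
To make the second idea above concrete, here is a rough sketch of the check it describes, assuming a per-AM boolean as proposed. Only the flag name amcanskipindexcleanup comes from the quoted mail; the struct, the function name, and the parameter names are hypothetical and are not PostgreSQL code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for an index AM descriptor. This is NOT
 * PostgreSQL's IndexAmRoutine; it exists only for illustration.
 */
typedef struct SketchIndexAm
{
    const char *name;
    bool        amcanskipindexcleanup;  /* flag name proposed in the quoted mail */
} SketchIndexAm;

/*
 * Returns true if lazy vacuum could pick a one-pass strategy, i.e. skip
 * collecting dead tuple TIDs: the user did not ask for index cleanup and
 * every index on the table reports that cleanup is skippable.
 */
static bool
sketch_can_use_one_pass_vacuum(const SketchIndexAm *indexes, size_t nindexes,
                               bool index_cleanup_requested)
{
    assert(indexes != NULL || nindexes == 0);

    /* Index cleanup was requested, so TIDs must be collected. */
    if (index_cleanup_requested)
        return false;

    /* A single AM that insists on being called forces the two-pass path. */
    for (size_t i = 0; i < nindexes; i++)
    {
        if (!indexes[i].amcanskipindexcleanup)
            return false;
    }

    return true;
}
```

Under these assumptions, a table whose indexes all set the flag would take the one-pass path when INDEX_CLEANUP false is given, while a single index that clears the flag would force the two-pass path.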

It (still) doesn't seem very sane to me to have an index that requires
cleanup in all cases. I mean, VACUUM could error or be killed just
before the index cleanup phase happens anyway, so it's not like an
index AM can licitly depend on getting called just because we visited
the heap. It could, of course, depend on getting called before
relfrozenxid is advanced, or before the heap's dead line pointers are
marked unused, or something like that, but it can't just be like, hey,
you have to call me.

I think this whole discussion is to some extent a product of the
contract between the index AM and the table AM being more than
slightly unclear. Maybe we need to clear up the definitional problems
first.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


