Re: Proposal: Global Index - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: Proposal: Global Index
Msg-id CAH2-WzmbrvL0xrRDm6pmNQGPK9BYJU8mbw-AzRR_r9tQFvEOzg@mail.gmail.com
In response to Re: Proposal: Global Index  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Proposal: Global Index  (Bruce Momjian <bruce@momjian.us>)
List pgsql-hackers
On Mon, Jan 11, 2021 at 10:37 AM Robert Haas <robertmhaas@gmail.com> wrote:
> I actually think the idea of lazily deleting the index entries is
> pretty good, but it won't work if the way the global index is
> implemented is by adding a tableoid column.

Perhaps there is an opportunity to apply some of the infrastructure
that Masahiko Sawada has been working on, that makes VACUUM more
incremental in certain specific scenarios:

https://postgr.es/m/CAD21AoD0SkE11fMw4jD4RENAwBMcw1wasVnwpJVw3tVqPOQgAw@mail.gmail.com

I think that VACUUM can be taught to skip the ambulkdelete() step for
indexes in many common scenarios. Global indexes might be one place in
which that's almost essential.
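
As a rough illustration of the kind of decision rule this implies, here is a minimal sketch in Python. It is not code from Sawada's patch or from PostgreSQL; the function name, its inputs, and the 2% threshold are all hypothetical, chosen only to show the shape of a "skip index vacuuming when little would be gained" test.

```python
def should_skip_index_vacuum(heap_pages: int,
                             pages_with_dead_items: int,
                             threshold: float = 0.02) -> bool:
    """Hypothetical heuristic: skip the ambulkdelete() pass for this VACUUM
    when only a small fraction of heap pages contain LP_DEAD items.

    The 2% default is an arbitrary illustrative value, not a tuned one.
    """
    if heap_pages == 0:
        # An empty table has nothing to delete from its indexes.
        return True
    dead_fraction = pages_with_dead_items / heap_pages
    return dead_fraction < threshold


# A partition with 5 affected pages out of 1000 would skip index vacuuming;
# one with half of its pages affected would not.
print(should_skip_index_vacuum(1000, 5))
print(should_skip_index_vacuum(1000, 500))
```

Under a rule like this, a partition that saw only a handful of UPDATEs would not force a scan of the whole global index.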

> However, there is a VACUUM amplification effect to worry about here
> which Wenjing seems not to be considering.

> That's not necessarily a death sentence for every use case, but it's
> going to be pretty bad for tables that are big and heavily updated.

The main way in which index vacuuming is currently a death sentence
for this design (as you put it) is that it's an all-or-nothing thing.
Presumably you'll need to VACUUM the entire global index for each
partition that receives even one UPDATE. That seems pretty extreme,
and probably not acceptable. In a way it's not really a new problem,
but the fact remains: it makes global indexes much less valuable.
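
To make the all-or-nothing cost concrete, here is a back-of-the-envelope model in Python. It is only a sketch under simplifying assumptions that are mine, not something stated upthread: every partition is the same size, a global index is about as large as all of the equivalent local indexes combined, and vacuuming any one partition requires one full scan of the global index. The function and parameter names are made up for illustration.

```python
def index_pages_scanned(partitions_vacuumed: int,
                        num_partitions: int,
                        local_index_pages: int,
                        global_index: bool) -> int:
    """Estimate index pages read by VACUUM across the vacuumed partitions.

    Assumes (hypothetically) that a global index is roughly the size of all
    per-partition local indexes put together.
    """
    if global_index:
        # Each partition's VACUUM scans the entire global index.
        global_index_pages = num_partitions * local_index_pages
        return partitions_vacuumed * global_index_pages
    # With local indexes, each VACUUM scans only that partition's index.
    return partitions_vacuumed * local_index_pages


local_cost = index_pages_scanned(10, 100, 1000, global_index=False)
global_cost = index_pages_scanned(10, 100, 1000, global_index=True)
print(local_cost, global_cost, global_cost // local_cost)
```

In this model the multiplier is simply the number of partitions, which is why the cost grows so quickly once many partitions each receive a few UPDATEs.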

However, it probably would be okay if a global index feature performed
poorly in scenarios where partitions get lots of UPDATEs that produce
lots of index bloat and cause lots of LP_DEAD line pointers to
accumulate in heap pages. It is probably reasonable to just expect
users to not do that if they want to get acceptable performance while
using a global index. Especially since it probably is not so bad if
the index bloat situation gets out of hand for just one of the
partitions (say the most recent one) every once in a while. You at
least don't have the same crazy I/O multiplier effect that you
described.

-- 
Peter Geoghegan


