Re: Proposal: Global Index for PostgreSQL - Mailing list pgsql-hackers

From Dilip Kumar
Subject Re: Proposal: Global Index for PostgreSQL
Date
Msg-id CAFiTN-tEBjDFp6udcD1vRweFuhYsH7DSAkDXTDMp7Tv-3kPjjg@mail.gmail.com
Whole thread Raw
In response to Re: Proposal: Global Index for PostgreSQL  (Dilip Kumar <dilipbalaut@gmail.com>)
List pgsql-hackers
On Mon, Jun 9, 2025 at 3:28 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Mon, Jun 9, 2025 at 2:03 PM Nikita Malakhov <hukutoc@gmail.com> wrote:

> > 4) Update-heavy partitioned tables that should run vacuum frequently. Significant
> > vacuum slowdown would result in going beyond SLAs without corresponding
> > significant improvements.
> >
> You've got it. I'm on board with prioritizing a VACUUM optimization
> solution for partitioned tables with global indexes. My initial
> proposal touched on a proof-of-concept experiment, which indicated no
> significant performance hit with global index after the optimization.
> I'll share the detailed VACUUM optimization proposal in this thread
> within the next couple of days.

As discussed earlier the one of main problems is that the global
indexes are vacuumed along with each partition whereas logically it
should be vacuumed only once when all the partitions are vacuum in an
ideal world.

So my proposal is that we make some infrastructure change in vacuum
api's such that there is option to tell heap_vacuum_rel() to skip the
global index vacuum and also return back the deadtid_store, vacuum
layer will store these deadtid_store, hash it by the reloid until it
vacuum all the partitions which are covered by a global index.  Once
that is done it will vacuum the global index, we also need to modify
the vac_tid_reaped() so that it can take additional input of reloid so
that it can find the appropriate deadtid_store by looking into the
hash and then look up the dead tid in respective deadtid_store.

We also need to do something for the autovacuum, because currently
autovacuum workers scan the pg_class and identify the relation which
needs to be vacuumed and vacuum it one at a time.  However once we
have educated vacuum machinery to first vacuum all the partitions and
then perform global index vacuum, the autovacuum worker should know
which all partitions are supposed to be vacuum together and the
autovacuum worker can pass a list of all those partitions together to
the vacuum machinery.

I believe this enhancement can be implemented in autovacuum without
significant difficulty. Currently, autovacuum scans pg_class to
generate a list of relations requiring vacuuming. For partitioned
tables that have a global index, we can extend this process by
additionally maintaining a list of all their leaf relations within the
parent table's entry.

I believe this enhancement can be implemented in autovacuum without
significant difficulty. Currently, autovacuum scans pg_class to
generate a list of relations requiring vacuuming. For partitioned
tables that have a global index, we can extend this process by
additionally maintaining a list of all their leaf relations within the
parent table's entry.

An autovacuum worker would then process all the leaf relations in this
complete hierarchy. To effectively prevent other workers from
attempting to vacuum the same hierarchy concurrently, the worker would
publish the top-most partitioned relation ID (which identifies the
table with the global index) as MyWorkerInfo->wi_tableoid. This
mechanism ensures that if one worker is processing a partitioned
table, the entire set of child relations under that top-level ID is
automatically skipped by other workers

While this approach offers benefits, it's important to acknowledge
certain potential drawbacks. One concern is a possible impact on
parallelism, as currently each partition might be vacuumed by a
separate worker, but with this change, all partitions covered by the
same global index would have to be processed by a single worker.
Another significant challenge lies in effectively choosing which
partitions to vacuum; for instance, if a table with a global index has
1000 partitions and only 10 meet the vacuum threshold in a given
cycle, this method might not be very efficient. Although we wouldn't
vacuum the global index once for every partition, we could still end
up vacuuming it numerous times by the time all partitions are
eventually processed. To mitigate this, an alternative strategy could
involve proactively vacuuming partitions that haven't fully met their
vacuum threshold but are close (e.g., reached 50% of the threshold).
This would allow us to combine more partitions for a single vacuum
operation, thereby reducing the number of times the global index needs
to be vacuumed.


--
Regards,
Dilip Kumar
Google



pgsql-hackers by date:

Previous
From: Perumal Raj
Date:
Subject: Resolved: Logical Replication slot disappeared after promote Standby
Next
From: Andy Fan
Date:
Subject: Re: Improve CRC32C performance on SSE4.2