Home > mailing lists

Re: Proposal: Global Index for PostgreSQL - Mailing list pgsql-hackers

From	Masahiko Sawada
Subject	Re: Proposal: Global Index for PostgreSQL
Date	June 18 02:07:25
Msg-id	CAD21AoCHenor-AXY5QsEdiVBPPTJm+dResmRvNzZLPeUotWyaw@mail.gmail.com Whole thread Raw
In response to	Re: Proposal: Global Index for PostgreSQL (Dilip Kumar <dilipbalaut@gmail.com>)
Responses	Re: Proposal: Global Index for PostgreSQL
List	pgsql-hackers

Tree view

On Sat, Jun 14, 2025 at 2:32 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Mon, Jun 9, 2025 at 3:28 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Mon, Jun 9, 2025 at 2:03 PM Nikita Malakhov <hukutoc@gmail.com> wrote:
>
> > > 4) Update-heavy partitioned tables that should run vacuum frequently. Significant
> > > vacuum slowdown would result in going beyond SLAs without corresponding
> > > significant improvements.
> > >
> > You've got it. I'm on board with prioritizing a VACUUM optimization
> > solution for partitioned tables with global indexes. My initial
> > proposal touched on a proof-of-concept experiment, which indicated no
> > significant performance hit with global index after the optimization.
> > I'll share the detailed VACUUM optimization proposal in this thread
> > within the next couple of days.
>
> As discussed earlier the one of main problems is that the global
> indexes are vacuumed along with each partition whereas logically it
> should be vacuumed only once when all the partitions are vacuum in an
> ideal world.
>
> So my proposal is that we make some infrastructure change in vacuum
> api's such that there is option to tell heap_vacuum_rel() to skip the
> global index vacuum and also return back the deadtid_store, vacuum
> layer will store these deadtid_store, hash it by the reloid until it
> vacuum all the partitions which are covered by a global index.  Once
> that is done it will vacuum the global index, we also need to modify
> the vac_tid_reaped() so that it can take additional input of reloid so
> that it can find the appropriate deadtid_store by looking into the
> hash and then look up the dead tid in respective deadtid_store.

Does it need to keep holding dead TIDs for each partition until it
completes vacuuming all partitions that are covered by the global
index? If so, it would end up holding a huge amount of memory in cases
where there are many partitions. How does maintanence_work_mem (or
autovacuum_work_mem) work in this context? Also, what if the
autovacuum worker who is processing the partitioned table with global
indexes gets cancelled? I guess that we would need to scan the
partitions again in order to collect dead TIDs to vacuum the global
index but it would be very expensive.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

pgsql-hackers by date:

From: Masahiko Sawada
Date: 18 June, 01:36:41
Subject: Re: POC: Parallel processing of indexes in autovacuum

From: Fujii Masao
Date: 18 June, 03:13:36
Subject: pg_dump misses comments on NOT NULL constraints

Re: Proposal: Global Index for PostgreSQL - Mailing list pgsql-hackers

Previous

Next