Re: Adding REPACK [concurrently] - Mailing list pgsql-hackers

From Antonin Houska
Subject Re: Adding REPACK [concurrently]
Msg-id 9548.1773744820@localhost
In response to Re: Adding REPACK [concurrently]  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
List pgsql-hackers
Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

> On 2026-Mar-16, Matthias van de Meent wrote:
>
> > On Mon, 16 Mar 2026 at 21:15, Antonin Houska <ah@cybertec.at> wrote:
>
> > > Anyway (fortunately?), the concurrent use of slots by REPACK is limited
> > > because, during the initialization of logical decoding, the backend needs to
> > > wait for all the transactions having XID assigned to finish, and these include
> > > the already running REPACK commands. See SnapBuildWaitSnapshot() and callers
> > > if you're interested in details.
> >
> > Huh, so would you be able to run more than one Repack Concurrently in
> > the same database? ISTM that would not be possible, apart from
> > possibly a mechanism comparable to the SAFE_IN_IC flag (to not wait on
> > those backends).
>
> Yeah, this sounds kind of bad news ...

Admittedly, it is a problem. I tried to address this in pg_squeeze by
pre-allocating slots when it's clear (due to scheduling) that more than one
table needs to be processed. This was an effort to achieve the best possible
performance rather than a response to user complaints about low
throughput. Nevertheless, I'm glad I happened to mention it before it's too
late.

Regarding a solution, a flag like SAFE_IN_IC alone does not help. The
information that a particular transaction is used by REPACK (and therefore
does not have to be decoded) would need to be propagated to the
xl_running_xacts WAL record too.

The enhancements I wrote for PG 20 (not all of them posted yet) that aim at
eliminating the impact of REPACK on the VACUUM xmin horizon should fix this
problem: due to MVCC-safety (i.e. preserving xmin/xmax of the tuples),
REPACK will not need an XID assigned (except for catalog changes, which will
happen in separate transactions), so it won't block the logical decoding setup
of other backends.
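
To make the blocking rule concrete, here is a toy Python model. It is not
PostgreSQL code: `Backend` and `blockers` are hypothetical names invented for
illustration, and the model assumes only what is described above, namely that
a backend setting up logical decoding waits for every other running
transaction that has an XID assigned. The real logic lives in
SnapBuildWaitSnapshot() and its callers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backend:
    """Hypothetical stand-in for a backend with an open transaction."""
    name: str
    has_xid: bool  # True if the transaction has an XID assigned


def blockers(starting: Backend, running: list[Backend]) -> list[str]:
    """Names of the backends that `starting` would have to wait for
    before its logical decoding setup can finish (toy rule: every
    other running transaction with an XID assigned)."""
    return [b.name for b in running if b is not starting and b.has_xid]


second_repack = Backend("repack_b", has_xid=True)

# Current behaviour: the first REPACK holds an XID for its whole run.
with_xid = blockers(second_repack, [Backend("repack_a", has_xid=True)])

# With the MVCC-safe approach: the first REPACK has no XID assigned.
without_xid = blockers(second_repack, [Backend("repack_a", has_xid=False)])

print(with_xid)     # ['repack_a']
print(without_xid)  # []
```

In this simplified model the second REPACK waits for the first only while the
first has an XID assigned. The short catalog-change transactions mentioned
above are deliberately left out.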

So the question is whether we should implement a workaround for PG 19 that
won't be needed in PG 20.

--
Antonin Houska
Web: https://www.cybertec-postgresql.com


