Home > mailing lists

Re: Adding REPACK [concurrently] - Mailing list pgsql-hackers

From	Antonin Houska
Subject	Re: Adding REPACK [concurrently]
Date	April 1 12:43:26
Msg-id	49194.1775036606@localhost Whole thread
In response to	Re: Adding REPACK [concurrently] (Amit Kapila <amit.kapila16@gmail.com>)
Responses	Re: Adding REPACK [concurrently]
List	pgsql-hackers

Tree view

Amit Kapila <amit.kapila16@gmail.com> wrote:

> On Tue, Mar 31, 2026 at 8:06 PM Antonin Houska <ah@cybertec.at> wrote:
> >
> > Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > 2.
> > > *
> > > + /*
> > > + * TODO Consider a GUC to reserve certain amount of replication slots for
> > > + * REPACK (CONCURRENTLY) and using it here.
> > > + */
> > > +#define MAX_REPACK_XIDS 16
> > > +
> > >
> > > This sounds a bit scary as reserving replication slots for REPACK
> > > (CONCURRENTLY) may not be what users expect. But it is not clear why
> > > replication slots need to be reserved for this.
> >
> > The point is that REPACK should not block replication [1]. Thus reserving
> > slots for non-REPACK users is probably more precise statement.
> >
>
> oh, so shouldn't be a separate patch than this and an important for
> this functionality to get committed? I don't see why we need to make
> such a GUC or knob as part of this patch if we need the same.

REPACK is a new user of replication slots. Without that, there is no other way
to "steal" the slots from the replication users.

> > > IIUC, two reasons why logical decoding can ignore REPACK
> > > (CONCURRENTLY) are (a) does not perform any catalog changes relevant
> > > to logical decoding, (b) neither walsenders nor backends performing
> > > logical decoding needs to care sending the WAL generated by REPACK
> > > (CONCURRENTLY). Is that understanding correct? If so, we may want to
> > > clarify why we want to ignore this command's WAL while sending changes
> > > downstream in the commit message or give some reference of the patch
> > > where the same is mentioned. This can help reviewing this patch
> > > independently.
> >
> > Correct, but in fact this diff only affects the setup of the logical decoding
> > rather than the decoding itself. On the other hand, if REPACK (CONCURRENTLY)
> > starts when the decoding backend's snapshot builder is already in the
> > SNAPBUILD_FULL_SNAPSHOT state, reorderbuffer.c processes the transaction
> > normally, and another part of the series (v46-0002) ensures that the table
> > rewriting is not decoded: REPACK simply tells heap_insert(), heap_update(),
> > heap_delete() not to put the extra (replication specific) information into the
> > corresponding WAL records. I suppose this is what you mean in (b).
> >
> > Regarding (a), yes, the absence of catalog changes in the REPACK's transaction
> > is the reason that even the logical decoding setup does not have to wait for
> > the transaction to finish.
> >
>
> Hmm, but we don't do any catalog changes for transactions that have
> DML say only INSERT but we don't have separate logic like REPACK in
> snapbuild machinery. Same is probably true for dml operations on an
> unlogged table which doesn't have WAL to send nor any catalog
> operations involved but we don't have any special path for that in
> snapbuild code path. So, why do we need special handling for this
> operation?

If an "ordinary" transaction, which had started before the snapshot builder
reached the SNAPBUILD_FULL_SNAPSHOT state, runs DML, the snapshot builder does
not know if the same transaction changed something in the catalog earlier. So
it needs to wait for this transaction to finish before it advances to
SNAPBUILD_CONSISTENT. For REPACK, we know that it cannot happen because it
cannot run in transaction block for other reasons. So the snapshot builder
does not have to wait.

Nevertheless, I'm not sure it's a good idea for snapbuild.c to handle special
cases like REPACK. Instead, I'm thinking if snapbuild.c can safely ignore XIDs
of backends connected to databases other than the one we're decoding. Thus the
restriction would be one backend running REPACK per database rather than
cluster.

--
Antonin Houska
Web: https://www.cybertec-postgresql.com

pgsql-hackers by date:

From: Amit Kapila
Date: 01 April, 12:39:31
Subject: Re: Adding REPACK [concurrently]

From: Richard Guo
Date: 01 April, 12:45:42
Subject: Re: Remove inner joins based on foreign keys

Re: Adding REPACK [concurrently] - Mailing list pgsql-hackers

Previous

Next