Home > mailing lists

Re: Adding REPACK [concurrently] - Mailing list pgsql-hackers

From	Amit Kapila
Subject	Re: Adding REPACK [concurrently]
Date	April 1 13:22:15
Msg-id	CAA4eK1+CNiuphtE=wwExsHD_McmYtGp2XK4qwX146VqRbgXmbA@mail.gmail.com Whole thread
In response to	Re: Adding REPACK [concurrently] (Antonin Houska <ah@cybertec.at>)
Responses	Re: Adding REPACK [concurrently] Re: Adding REPACK [concurrently]
List	pgsql-hackers

Tree view

On Wed, Apr 1, 2026 at 3:13 PM Antonin Houska <ah@cybertec.at> wrote:
>
> Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> > On Tue, Mar 31, 2026 at 8:06 PM Antonin Houska <ah@cybertec.at> wrote:
> > >
> > > Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > 2.
> > > > *
> > > > + /*
> > > > + * TODO Consider a GUC to reserve certain amount of replication slots for
> > > > + * REPACK (CONCURRENTLY) and using it here.
> > > > + */
> > > > +#define MAX_REPACK_XIDS 16
> > > > +
> > > >
> > > > This sounds a bit scary as reserving replication slots for REPACK
> > > > (CONCURRENTLY) may not be what users expect. But it is not clear why
> > > > replication slots need to be reserved for this.
> > >
> > > The point is that REPACK should not block replication [1]. Thus reserving
> > > slots for non-REPACK users is probably more precise statement.
> > >
> >
> > oh, so shouldn't be a separate patch than this and an important for
> > this functionality to get committed? I don't see why we need to make
> > such a GUC or knob as part of this patch if we need the same.
>
> REPACK is a new user of replication slots. Without that, there is no other way
> to "steal" the slots from the replication users.
>
> > > > IIUC, two reasons why logical decoding can ignore REPACK
> > > > (CONCURRENTLY) are (a) does not perform any catalog changes relevant
> > > > to logical decoding, (b) neither walsenders nor backends performing
> > > > logical decoding needs to care sending the WAL generated by REPACK
> > > > (CONCURRENTLY). Is that understanding correct? If so, we may want to
> > > > clarify why we want to ignore this command's WAL while sending changes
> > > > downstream in the commit message or give some reference of the patch
> > > > where the same is mentioned. This can help reviewing this patch
> > > > independently.
> > >
> > > Correct, but in fact this diff only affects the setup of the logical decoding
> > > rather than the decoding itself. On the other hand, if REPACK (CONCURRENTLY)
> > > starts when the decoding backend's snapshot builder is already in the
> > > SNAPBUILD_FULL_SNAPSHOT state, reorderbuffer.c processes the transaction
> > > normally, and another part of the series (v46-0002) ensures that the table
> > > rewriting is not decoded: REPACK simply tells heap_insert(), heap_update(),
> > > heap_delete() not to put the extra (replication specific) information into the
> > > corresponding WAL records. I suppose this is what you mean in (b).
> > >
> > > Regarding (a), yes, the absence of catalog changes in the REPACK's transaction
> > > is the reason that even the logical decoding setup does not have to wait for
> > > the transaction to finish.
> > >
> >
> > Hmm, but we don't do any catalog changes for transactions that have
> > DML say only INSERT but we don't have separate logic like REPACK in
> > snapbuild machinery. Same is probably true for dml operations on an
> > unlogged table which doesn't have WAL to send nor any catalog
> > operations involved but we don't have any special path for that in
> > snapbuild code path. So, why do we need special handling for this
> > operation?
>
> If an "ordinary" transaction, which had started before the snapshot builder
> reached the SNAPBUILD_FULL_SNAPSHOT state, runs DML, the snapshot builder does
> not know if the same transaction changed something in the catalog earlier. So
> it needs to wait for this transaction to finish before it advances to
> SNAPBUILD_CONSISTENT. For REPACK, we know that it cannot happen because it
> cannot run in transaction block for other reasons. So the snapshot builder
> does not have to wait.
>
> Nevertheless, I'm not sure it's a good idea for snapbuild.c to handle special
> cases like REPACK. Instead, I'm thinking if snapbuild.c can safely ignore XIDs
> of backends connected to databases other than the one we're decoding.
>

What if such transactions have made changes in the global catalog?
Even if that won't matter, I feel such a change would be quite
fundamental to snapbuild machinery and changing at this point would be
risky.

BTW, is the reason to skip REPACK while building a snapshot is that it
can take a long time to finish?

--
With Regards,
Amit Kapila.

pgsql-hackers by date:

From: Amit Langote
Date: 01 April, 12:51:10
Subject: Re: Eliminating SPI / SQL from some RI triggers - take 3

From: Andrew Dunstan
Date: 01 April, 13:39:05
Subject: Re: pg_waldump: support decoding of WAL inside tarfile

Re: Adding REPACK [concurrently] - Mailing list pgsql-hackers

Previous

Next