Re: Adding REPACK [concurrently] - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Adding REPACK [concurrently]
Date
Msg-id CAA4eK1+6GyTWSQ4XC8B=TJCkk=uFG1N+d7xg+Hp1N=uphVX0=Q@mail.gmail.com
Whole thread
In response to Re: Adding REPACK [concurrently]  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
List pgsql-hackers
On Mon, Apr 6, 2026 at 3:08 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> On 2026-Apr-06, vignesh C wrote:
>
> > On Mon, 6 Apr 2026 at 02:12, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> > Few comments:
>
> Thanks for reviewing this patch!
>
> > 1) Can we add a comment why we should error out here, as repack
> > concurrently requires this whereas repack does not require this check.
> > Even if it is required for decoding can't it be handled by replica
> > identity full:
> > +       /*
> > +        * If the identity index is not set due to replica identity being, PK
> > +        * might exist.
> > +        */
> > +       ident_idx = RelationGetReplicaIndex(rel);
> > +       if (!OidIsValid(ident_idx) && OidIsValid(rel->rd_pkindex))
> > +               ident_idx = rel->rd_pkindex;
> > +       if (!OidIsValid(ident_idx))
> > +               ereport(ERROR,
> > +
> > errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> > +                               errmsg("cannot process relation \"%s\"",
> > +                                          RelationGetRelationName(rel)),
> > +                               errhint("Relation \"%s\" has no
> > identity index.",
> > +                                               RelationGetRelationName(rel)));
>
> Ah, I was just rewriting that comment moments ago.  I think we could
> make it work with replica identity full in theory, but I'm guessing it
> would be useless.  You really need an index that lets you locate the
> affected tuples; otherwise the replay would take forever.  If the table
> is big, that would make the whole thing unworkable.  If the table isn't
> big, then you can probably just use straight repack without too much
> disruption.
>
>         /*
>          * Obtain the replica identity index -- either one that has been set
>          * explicitly, or the primary key.  If none of these cases apply, the
>          * table cannot be repacked concurrently.  It might be possible to have
>          * repack work with a FULL replica identity; however that requires more
>          * work and is not implemented yet.
>          */
>
> Now, it's possible to have a non-unique index on some columns (not good
> enough to be replica identity) which gives you a list of candidate
> tuples, and then the implementation chooses one based on the other
> columns which are present in the FULL replica identity.  (This seems a
> bit dangerous, because there might be multiple matching tuples; and how
> do you choose?)  Now let's further suppose you can narrow down to a
> single tuple; in that case, using FULL would work.
>

I think this is already working for apply worker where we compare
tuples if the index is non_uniuqe, see FindReplTupleInLocalRel.

>
  However, as I said,
> this requires more implementation effort.  We could entertain a patch
> for this during the pg20 cycle, though I'm doubtful that it would really
> be worth your while.
>

It is fine to support it later in pg20. In general, I wonder if we
have evaluated whether some of the apply-worker infrastructure could
have been reused here?

--
With Regards,
Amit Kapila.



pgsql-hackers by date:

Previous
From: Andrew Dunstan
Date:
Subject: Re: pg_get__*_ddl consolidation
Next
From: Andreas Karlsson
Date:
Subject: Re: DEREF_AFTER_NULL: src/common/jsonapi.c:2529