Re: Adding REPACK [concurrently] - Mailing list pgsql-hackers

From Alvaro Herrera
Subject Re: Adding REPACK [concurrently]
Date
Msg-id 202603121839.7ttist64mas5@alvherre.pgsql
Whole thread Raw
In response to Re: Adding REPACK [concurrently]  (Antonin Houska <ah@cybertec.at>)
Responses Re: Adding REPACK [concurrently]
List pgsql-hackers
On 2026-Mar-11, Antonin Houska wrote:

> I'm not sure it can be fixed nicely in the REPACK (CONCURRENTLY) patch. I
> think the problem is that, in the current tree, VARSIZE_ANY() is used in such
> a way that the compiler cannot check the "array bounds". The restore_tuple()
> function is special in that it uses VARSIZE_ANY() to check a variable
> allocated from the stack, so the compiler can check the size.
> 
> I'm trying to fix that in a new diff 0002 - the point is that VARSIZE_ANY()
> should not need to dereference a pointer to varattrib_4b, since the size
> information is always located at the beginning of the structure. Maybe you
> have better idea.

I have no immediate ideas on this.

I offer the following rather trivial fixup diffs, which I think should
be mostly be for 0002.

Other trivial things I'd like to change, if you don't mind,
- the name pgoutput_repack sounds wrong to me.  I would rather say
  rpck_output, repack_output, repack_plugin, ... or something.  I don't
  understand where the suffix "output" comes from in the name
  "pgoutput", but it sounds like arbitrary nonsense to me.
- I would rename the routines in that file to also have the name
  "repack", probably as prefixes.

One thing we need and is rather not trivial, is to go through the table
AM interface rather than calling heapam.c routines directly.  I'm
working on this now and will present a patch later.

Another thing I noticed while going over the code was that we seem to
spill whole tuples even for things like the old tuple of an UPDATE and
for DELETE, but unless I misunderstand how this works, these shouldn't
be necessary: we just need the replica identity so that we can locate
the tuple to operate on.  Especially for tuples that contain large
toasted attributes, this is likely important.

It may make sense to use the TupleTableSlot interface rather than
HeapTuple for everything.  I'm not really sure about this though.

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"Most hackers will be perfectly comfortable conceptualizing users as entropy
 sources, so let's move on."                               (Nathaniel Smith)
      https://mail.gnu.org/archive/html/monotone-devel/2007-01/msg00080.html

Attachment

pgsql-hackers by date:

Previous
From: Tomas Vondra
Date:
Subject: Re: Why clearing the VM doesn't require registering vm buffer in wal record
Next
From: Corey Huinker
Date:
Subject: Re: Import Statistics in postgres_fdw before resorting to sampling.