Re: Adding REPACK [concurrently] - Mailing list pgsql-hackers

From Antonin Houska
Subject Re: Adding REPACK [concurrently]
Date
Msg-id 137668.1768235610@localhost
Whole thread Raw
In response to Re: Adding REPACK [concurrently]  (Mihail Nikalayeu <mihailnikalayeu@gmail.com>)
Responses Re: Adding REPACK [concurrently]
List pgsql-hackers
Mihail Nikalayeu <mihailnikalayeu@gmail.com> wrote:

> On Thu, Jan 8, 2026 at 7:59 PM Antonin Houska <ah@cybertec.at> wrote:
> > v29 tries to fix the problem.
>
> Some comments for 0001-0004.

Thanks.

> ------ 0001 -----

> > * FIXME: this is missing a way to specify the index to use to repack one
> > * table, or whether to pass a WITH INDEX clause when multiple tables are
> > * used.  Something like --index[=indexname].  Adding that bleeds into
> > * vacuuming.c as well.
>
> Comments look stale.

This is an open question, see [1].

> > return "???";
> I think it is better to add Assert(false); before (done that way in a
> few places).

This is not really uncommon, see for example event_trigger.c. Added the
comment

    /* keep compiler quiet */

> > “An utility”
> Should be “A utility”

Not sure it *should be* [2], but "a utility" appears to be much more common in
the tree. Changed.

> > else if (pg_strcasecmp(cmd, "CLUSTER") == 0)
> >    cmdtype = PROGRESS_COMMAND_CLUSTER;
>
> Should we set PROGRESS_COMMAND_REPACK here? Because cluster is not
> used anywhere. Probably we may even delete PROGRESS_COMMAND_CLUSTER.

Good point. Actually we do not need this branch at all as there's no
pg_stat_get_progress_info('CLUSTER') call in system_views.sql. Removed.

> > CLUOPT_RECHECK_ISCLUSTERED
> It is not set anymore... Probably something is wrong here or we need
> to just remove that constant and check for it.

Yes, it got lost somehow. I added it where I think it's appropriate.

> ------ 0002 -----
>
> > rebuild_relation(Relation OldHeap, Relation index, bool verbose)
> It removes unused cmd parameter, but I think it is better to not add
> it in the previous commit.

Yes, the changes were not correctly split into diffs. Fixed.

> ------ 0003 -----
>
> > int       newxcnt = 0;
>
> I think it is better to use uint32 for consistency here.

This diff only moves code across functions, I'm not going to do other changes
now.

> Also, I think it is worth adding Assert(snapshot->snapshot_type ==
> SNAPSHOT_HISTORIC_MVCC)

ok

> ------ 0004 -----
>
> > /* Is REPACK (CONCURRENTLY) being run by this backend? */
> > if (am_decoding_for_repack())
>
> We should check change_useless_for_repack here - to avoid looking and
> TRUNCATE of unrelated tables.

In v29, if XLOG_HEAP_TRUNCATE of an unrelated table is seen here, we'll raise
ERROR unnecessarily instead of truncating the table. That's obviously wrong as
well. On the other hand, it's not trivial to teach change_useless_for_repack()
to filter the TRUNCATE records by file locator. So besides adding a call of
change_useless_for_repack(), I removed that ereport(ERROR) from heap_decode()
and added a comment to plugin_change() explaining why TRUNCATE on the table
being repacked should fire the Assert() statement: TRUNCATE shouldn't appear
here due to locking.

> > /* For the same reason, unlock TOAST relation. */
> > if (OldHeap->rd_rel->reltoastrelid)
> >    LockRelationOid(OldHeap->rd_rel->reltoastrelid, AccessExclusiveLock);
>
> Hm, we are locking here instead of unlocking ;)

Copy-pasto, the lock level is incorrect as well. Actually the whole idea of
unlocking index and TOAST relation is probably wrong: if some transaction
already locked the table with a lock that does not conflict with
ShareUpdateExclusiveLock, it should not need to wait for
ShareUpdateExclusiveLock on index / TOAST relation. At least I don't recall a
case where index / TOAST requires stronger lock than the main table. So I
removed the unlocking altogether.

> > (errhint("Relation \"%s\" has no identity index.",
> >       RelationGetRelationName(rel)))));
>
> One level of braces may be removed.
> > * to decode on behalf of REPACK (CONCURRENT)?
>
> CONCURRENTLY
> > * If recheck is required, it must have been preformed on the source
>
> "performed"
> > * On exit,'*scan_p' contains the scan descriptor used. The caller must close
> > * it when he no longer needs the tuple returned.
>
> There is no scan_p argument here.
> > * Copyright (c) 2012-2025, PostgreSQL Global Development Group
>
> 2026

ok

> > newtuple = change->data.tp.newtuple != NULL ?
> >   change->data.tp.newtuple : NULL;
>
> > oldtuple = change->data.tp.oldtuple != NULL ?
> >    change->data.tp.oldtuple : NULL;
> > newtuple = change->data.tp.newtuple != NULL ?
> >    change->data.tp.newtuple : NULL;
>
> Hm, should it be just x = y ?

Thanks for spotting this. The reason is that significant portion of this patch
is copied from the pg_squeeze extension, and there it originally looked like:

    oldtuple = change->data.tp.oldtuple != NULL ?
        &change->data.tp.oldtuple->tuple : NULL;

I failed to notice the unnecessary complexity when adapting the code to the
commit 08e6344fd642 in postgres core, and then copied it to the REPACK
patch. Fixed now.

> > apply_concurrent_insert
>
> Double newline at function start.

ok

> > heap2_decode
>
> Should we check for change_useless_for_repack here also? (for multi
> insert, for example).

Yes, done, thanks. While doing that, I've done the same for XLOG_HEAP_CONFIRM
in heap_decode() as it'd be bad for reorderbuffer.c to receive the CONFIRM
record w/o previously receiving the actual (speculative) INSERT.


[1] https://www.postgresql.org/message-id/7224.1762326739%40localhost
[2] https://en.wiktionary.org/wiki/an#Usage_notes

--
Antonin Houska
Web: https://www.cybertec-postgresql.com


Attachment

pgsql-hackers by date:

Previous
From: Tatsuya Kawata
Date:
Subject: Re: [PATCH] Add sampling statistics to autoanalyze log output
Next
From: Corey Huinker
Date:
Subject: Re: CAST(... ON DEFAULT) - WIP build on top of Error-Safe User Functions