Re: Adding REPACK [concurrently] - Mailing list pgsql-hackers

From Alvaro Herrera
Subject Re: Adding REPACK [concurrently]
Date
Msg-id 202507311650.3a44mqyi3xnw@alvherre.pgsql
Whole thread Raw
In response to Re: Adding REPACK [concurrently]  (Robert Treat <rob@xzilla.net>)
Responses Re: Adding REPACK [concurrently]
List pgsql-hackers
On 2025-Jul-26, Robert Treat wrote:

> For clarity, are you intending to commit this patch before having the
> other parts ready? (If that sounds like an objection, it isn't) After
> a first pass, I think there's some confusing bits in the new docs that
> could use straightening out, but there likely going to overlap changes
> once concurrently is brought in, so it might make sense to hold off on
> those.

I'm aiming at getting 0001 committed during the September commitfest,
and the CONCURRENTLY flag addition later in the pg19 cycle.  But I'd
rather have good-enough docs at every step of the way.  They don't have
to be *perfect* if we want to get everything in pg19, but I'd rather not
leave anything openly confusing even transiently.

That said, I did not review the docs this time around, so here's them
the same as they were in the previous post.  But if you want to suggest
changes for the docs in 0001, please do.  Just don't get too carried
away.

> speaking of, for this bit in src/backend/commands/cluster.c
> 
> +    switch (cmd)
> +    {
> +        case REPACK_COMMAND_REPACK:
> +            return "REPACK";
> +        case REPACK_COMMAND_VACUUMFULL:
> +            return "VACUUM";
> +        case REPACK_COMMAND_CLUSTER:
> +            return "VACUUM";
> +    }
> +    return "???";
> 
> The last one should return "CLUSTER" no?

Absolutely -- my blunder.

On 2025-Jul-27, Fujii Masao wrote:

> > The patch submitted here, largely by Antonin Houska with some
> > changes by me, is based on the the pg_squeeze code which he
> > authored, and first introduces a new command called REPACK to absorb
> > both VACUUM FULL and CLUSTER, followed by addition of a CONCURRENTLY
> > flag to allow some forms of REPACK to operate online using logical
> > decoding.
> 
> Does this mean REPACK CONCURRENTLY requires wal_level = logical, while
> plain REPACK (without CONCURRENTLY) works with any wal_level setting?
> If we eventually deprecate VACUUM FULL and CLUSTER, I think plain
> REPACK should still be allowed with wal_level = minimal or replica, so
> users with those settings can perform equivalent processing.

Absolutely.

One of the later patches in the series, which I have not included yet,
intends to implement the idea of transiently enabling wal_level=logical
for the table being repacked concurrently, so that you can still use
the concurrent mode if you have a non-logical-wal_level instance.

> + if (!cluster_is_permitted_for_relation(tableOid, userid,
> +    CLUSTER_COMMAND_CLUSTER))
> 
> As for the patch you attached, it seems to be an early WIP and
> might not be ready for review yet?? BTW, I got the following
> compilation failure and probably CLUSTER_COMMAND_CLUSTER
> the above should be GetUserId().

This was a silly merge mistake, caused by my squashing Antonin's 0004
(trivial code restructuring) into 0001 at the last minute and failing to
"git add" the compile fixes before doing git-format-patch.

Here's v17.  (I decided that calling my previous one "v1" after Antonin
had gone all the way to v15 was stupid on my part.)  The important part
here is that I rebased Antonin 0004's, that is, the addition of the
CONCURRENTLY flag, plus 0005 regression tests.

The only interesting change here is that I decided to not mess with the
grammar by allowing an unparenthesized CONCURRENTLY keyword; if you want
concurrent, you have to say "REPACK (CONCURRENTLY)".  This is at odds
with the way we use the keyword in other commands, but ISTM we don't
_need_ to support that legacy syntax.  Anyway, this is easy to put back
afterwards, if enough people find it not useless.

I've not reviewed 0003 in depth yet, just rebased it.  But it works to
the point that CI is happy with it.

I've not yet included Antonin's 0006 and 0007.


TODO list for 0001:

- addition of src/bin/scripts/repackdb
- clean up the progress report infrastructure
- doc review

-- 
Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/
Thou shalt check the array bounds of all strings (indeed, all arrays), for
surely where thou typest "foo" someone someday shall type
"supercalifragilisticexpialidocious" (5th Commandment for C programmers)

Attachment

pgsql-hackers by date:

Previous
From: Japin Li
Date:
Subject: Re: log_min_messages per backend type
Next
From: Alvaro Herrera
Date:
Subject: Re: log_min_messages per backend type