Re: Adding REPACK [concurrently] - Mailing list pgsql-hackers

From Antonin Houska
Subject Re: Adding REPACK [concurrently]
Msg-id 6607.1775633515@localhost
In response to Re: Adding REPACK [concurrently]  (Robert Treat <rob@xzilla.net>)
List pgsql-hackers
Robert Treat <rob@xzilla.net> wrote:

> On Mon, Apr 6, 2026 at 6:22 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> > On 2026-Apr-06, Mihail Nikalayeu wrote:
> <snip>
> >
> > Anyway, here's the three missing parts.  I have not yet edited the
> > deadlock-checker one to protect autovacuum from processing tables under
> > repack.
> >
>
> I have this lingering bit of paranoia that users could end up in a
> situation with a large / long running repack that goes past failsafe
> age which prevents the simpler fix of failsafe autovacuum from
> running. While the repack finishing would resolve this issue, we can't
> know ahead of time that the repack would finish in time, and
> statistically speaking, failsafe autovacuum should generally run much
> quicker than any repack could. I'm not sure if that means we should
> let failsafe vacuum cancel repacks (that seems a bit extreme), but
> maybe we want to help $operator to think about this decision, except
> if we don't allow autovacuum to wait and we don't allow it to respawn,
> I wonder if the end user will ever realize they are in this position.
> Granted, there doesn't seem like a clean fix for this...

If REPACK is not going to finish in time, I think it makes little difference
whether VACUUM is allowed to wait or not: even if it waits, it will start just
too late. One reason to avoid waiting might be to allow autovacuum to work on
other tables in between.

I agree that the DBA should have some guidance to assess whether REPACK or
(failsafe) VACUUM is the appropriate action. While failsafe VACUUM is clearly
a means to avoid XID wraparound, I tend to consider REPACK primarily a command
to remove table bloat. Or is there a situation where REPACK is better even to
avoid the wraparound?
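
To make the kind of guidance I have in mind concrete, here is a minimal
sketch of the decision rule. The function name and the rule itself are only
an assumption for illustration, not something from the patch; the one real
value is the default of vacuum_failsafe_age (1.6 billion), and the input is
assumed to be age(relfrozenxid) of the table.

```python
# Hypothetical helper for illustration; not part of PostgreSQL or the patch.
# Default of the vacuum_failsafe_age setting (1.6 billion transactions).
VACUUM_FAILSAFE_AGE_DEFAULT = 1_600_000_000


def suggest_action(relfrozenxid_age: int,
                   failsafe_age: int = VACUUM_FAILSAFE_AGE_DEFAULT) -> str:
    """Return "VACUUM" when wraparound is the pressing concern, else "REPACK".

    relfrozenxid_age is assumed to be age(relfrozenxid) for the table.
    """
    if relfrozenxid_age < 0 or failsafe_age <= 0:
        raise ValueError("age must be non-negative and threshold positive")
    # At or past the failsafe threshold, freezing matters more than bloat.
    if relfrozenxid_age >= failsafe_age:
        return "VACUUM"
    # Otherwise there is still headroom, so bloat removal can be considered.
    return "REPACK"
```

This deliberately ignores how long REPACK would take, which is exactly the
part we cannot know ahead of time.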

Technically, the deadlock can be avoided by not running DDL on the table
while REPACK is running. I'm just wondering whether, by mentioning this in
the REPACK documentation, we'd admit that the REPACK (CONCURRENTLY) feature
is actually incomplete. On the other hand, if we don't mention the risk of
deadlock, it's a similar situation to not mentioning it for commands like
ALTER TABLE: if ALTER TABLE performs a table rewrite, a deadlock can also
result in a significant amount of wasted resources. (Of course, it's not the
same if REPACK is considered a substitute for failsafe VACUUM, but I'm not
sure about that.)

--
Antonin Houska
Web: https://www.cybertec-postgresql.com


