Re: why there is not VACUUM FULL CONCURRENTLY? - Mailing list pgsql-hackers

From Antonin Houska
Subject Re: why there is not VACUUM FULL CONCURRENTLY?
Msg-id 26221.1738326749@antos
In response to Re: why there is not VACUUM FULL CONCURRENTLY?  (Matthias van de Meent <boekewurm+postgres@gmail.com>)
List pgsql-hackers
Matthias van de Meent <boekewurm+postgres@gmail.com> wrote:

> Further observations:
> 
> First, due to the XLog-based change detection this feature can't work
> for unlogged tables without first changing them to logged (which
> implies first writing the whole table to XLog, to not cause issues on
> any replicas). However, documentation for this limitation seems to be
> missing from the patches, and I hope a solution can be found without
> requiring LOGGED.

Currently I have no idea how to handle UNLOGGED tables. I'll at least fix the
documentation.

> Second, I'm concerned about long-running snapshots: While I've not
> read the patches fully, I think they work something like the
> following:
> 
> 1. Mark some start LSN as start for decoding changes
> 2. Do the usual REPACK operations, but with reduced locking
> 3. Apply the decoded changes
> 4. Switch the relfilenodes over
> 
> For (2), I think the scan needs a snapshot to guarantee we keep the
> original tuples of updates around, which will hold back any other
> VACUUM activity in the database. For CIC/RIC, a solution is being
> created [0], but I'm not sure the same can be applied to this REPACK
> CONCURRENTLY: while CIC/RIC doesn't care much about cross-page update
> chains (it's only interested in TID+field values for possibly-live
> tuples), REPACK seems to require access to the fields of the old
> versions of updated tuples to correctly apply updates, thus requiring
> a single snapshot for the full scan.
> 
> Maybe that's something that can be further improved upon, maybe not.
> REPACK CONCURRENTLY is an improvement over the current situation
> w.r.t. locks, but it'd be nice if this new system does not impact the
> visibility horizons of the cluster by more than the current.

A single snapshot is used because there is a single stream of decoded data
changes. Thus a new version of a tuple is either visible to the snapshot or it
appears in the stream, but not both.
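
A toy model may help illustrate the "visible to the snapshot or in the stream,
but not both" property. This is only a sketch under simplifying assumptions,
not the patch's implementation: the "WAL" is a plain list, a snapshot is
reduced to a single LSN, and the names Change, visible_at and repack are
hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Change:
    lsn: int                     # position in a toy WAL (assumption)
    kind: str                    # "insert", "update" or "delete"
    key: int
    value: Optional[str] = None  # new value; unused for "delete"


def apply_change(rows, change):
    """Apply one change to a dict that stands in for a heap."""
    if change.kind == "delete":
        rows.pop(change.key)
    else:
        rows[change.key] = change.value


def visible_at(history, snapshot_lsn):
    """Contents seen by a snapshot taken at snapshot_lsn."""
    rows = {}
    for change in sorted(history, key=lambda c: c.lsn):
        if change.lsn <= snapshot_lsn:
            apply_change(rows, change)
    return rows


def repack(history, snapshot_lsn):
    """Copy what the snapshot sees, then replay the decoded stream."""
    new_heap = visible_at(history, snapshot_lsn)
    # Every change falls on exactly one side of the snapshot LSN, so it is
    # either already in the copy or in the stream, never both.
    stream = [c for c in sorted(history, key=lambda c: c.lsn)
              if c.lsn > snapshot_lsn]
    for change in stream:
        apply_change(new_heap, change)
    return new_heap


history = [
    Change(1, "insert", 1, "a"),
    Change(2, "insert", 2, "b"),
    Change(3, "update", 1, "a2"),
    Change(4, "delete", 2),
    Change(5, "insert", 3, "c"),
]
```

In this model, repack(history, s) yields the same final contents for any
snapshot LSN s, because the copy and the stream partition the history.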

If part of the table was scanned using one snapshot, and another part with
another one, it'd be difficult to "put things together". For example, if the
first scan does not see a tuple for which the corresponding stream contains an
UPDATE change (because the old version is in the not-yet-scanned part of the
table), that UPDATE needs to be moved to the stream associated with another
snapshot. But that snapshot might not see the tuple either: it may have been
deleted in between, or it may only be found by yet another scan.
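
One simple way this can go wrong can be sketched with a toy model. All of it
is an assumption made for illustration: the table has two "pages", each scanned
under its own snapshot, snapshots are reduced to LSNs, and the Version class
and its field names are hypothetical. An UPDATE between the two snapshots
moves key 7 from page 1 to page 0.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Version:
    key: int
    value: str
    page: int                          # part of the table holding this version
    created_lsn: int                   # toy stand-in for "inserted at"
    removed_lsn: Optional[int] = None  # toy stand-in for "updated/deleted at"


def visible(version, snapshot_lsn):
    created = version.created_lsn <= snapshot_lsn
    removed = (version.removed_lsn is not None
               and version.removed_lsn <= snapshot_lsn)
    return created and not removed


def scan(heap, page, snapshot_lsn):
    """Copy the versions on one page that the given snapshot can see."""
    return {v.key: v.value for v in heap
            if v.page == page and visible(v, snapshot_lsn)}


heap = [
    Version(key=1, value="a", page=0, created_lsn=1),
    # UPDATE at LSN 15: old version on page 1, new version lands on page 0.
    Version(key=7, value="old", page=1, created_lsn=2, removed_lsn=15),
    Version(key=7, value="new", page=0, created_lsn=15),
]

S1, S2 = 10, 20  # snapshot LSNs for the first and second scan

copied = {}
copied.update(scan(heap, page=0, snapshot_lsn=S1))
# The stream following S1 contains the UPDATE of key 7, but its old
# version has not been copied yet, so there is nothing to apply it to.
update_target_present = 7 in copied

copied.update(scan(heap, page=1, snapshot_lsn=S2))
# Under S2 the old version on page 1 is already dead, and the new version
# sits on the page that was scanned earlier.
seen_by_either_scan = 7 in copied
```

Here neither scan copies key 7, so the UPDATE from the first stream cannot
simply be handed over to the second snapshot's stream either.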

Doing the repacking in several steps might be interesting, but I admit I
haven't yet thought that far.

-- 
Antonin Houska
Web: https://www.cybertec-postgresql.com


