Re: Adding REPACK [concurrently] - Mailing list pgsql-hackers

From Antonin Houska
Subject Re: Adding REPACK [concurrently]
Date
Msg-id 17670.1756289505@localhost
Whole thread Raw
In response to Re: Adding REPACK [concurrently]  (Mihail Nikalayeu <mihailnikalayeu@gmail.com>)
Responses Re: Adding REPACK [concurrently]
List pgsql-hackers
Mihail Nikalayeu <mihailnikalayeu@gmail.com> wrote:

> > A new kind of snapshot seems like (much) cleaner solution at the moment.
> 
> Do you mean some kind of snapshot which only uses
> TransactionIdDidCommit/Abort ignoring
> TransactionIdIsCurrentTransactionId/TransactionIdIsInProgress?
> Actually it behaves like SnapshotBelieveEverythingCommitted in that
> particular case, but TransactionIdDidCommit/Abort may be used as some
> kind of assert/error source to be sure everything is going as
> designed.

Given that there should be no (sub)transaction aborts in the new table, I
think you only need to check that the tuple has valid xmin and invalid xmax.

I think the XID should be in the commit log at the time the transaction is
being replayed, so it should be legal to use TransactionIdDidCommit/Abort in
Assert() statements. (And as long as REPACK CONCURRENTLY will use
ShareUpdateExclusiveLock, which conflicts with VACUUM, pg_class(relfrozenxid)
for given table should not advance during the processing, and therefore the
replayed XIDs should not be truncated from the commit log while REPACK
CONCURRENTLY is running.)

> And, yes, for the new snapshot we need to have
> HeapTupleSatisfiesUpdate to be modified.
> 
> Also, to deal with that particular race we may just use
> XactLockTableWait(xid, NULL, NULL, XLTW_None) before starting
> transaction replay.

Do you mean the race related to TransactionIdIsInProgress()? Not sure I
understand, as you suggested above that you no longer need the function.

> > No rush. First, the MVCC safety is not likely to be included in v19 [1].
> 
> That worries me - it is not the behaviour someone expects from a
> database by default. At least the warning should be much more visible
> and obvious.
> I think most of user will expect the same guarantees as [CREATE|RE]
> INDEX CONCURRENTLY provides.

It does not really worry me. The pg_squeeze extension is not MVCC-safe and I
remember there were only 1 or 2 related complaints throughout its
existence. (pg_repack isn't MVCC-safe as well, but I don't keep track of its
issues.)

Of course, user documentation should warn about the problem, in a way it does
for other commands (typically ALTER TABLE).

-- 
Antonin Houska
Web: https://www.cybertec-postgresql.com



pgsql-hackers by date:

Previous
From: Bertrand Drouvot
Date:
Subject: Re: Per backend relation statistics tracking
Next
From: Christoph Berg
Date:
Subject: Re: pgsql: oauth: Add unit tests for multiplexer handling