Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements - Mailing list pgsql-hackers

From Michail Nikolaev
Subject Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements
Date
Msg-id CANtu0oj66JjAq8xyRSeO=MuRHYS2XsYbhHRRESHtOcLJs=3+Sw@mail.gmail.com
In response to Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements  (Matthias van de Meent <boekewurm+postgres@gmail.com>)
Responses Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements
List pgsql-hackers
Hello!

> How would this deal with tuples not visible to the old snapshot?
> Presumably we can assume they're newer than that snapshot (the old
> snapshot didn't have it, but the new one does, so it's committed after
> the old snapshot, making them newer), so that backend must have
> inserted it into the index already, right?

Yes, exactly.

>> HeapTupleSatisfiesHistoricMVCC
> That function has this comment marker:
> "Only usable on tuples from catalog tables!"
> Is that correct even for this?

Yeah, we just need HeapTupleSatisfiesVisibility (which calls
HeapTupleSatisfiesMVCC) instead.

> Should this deal with any potential XID wraparound, too?

Yeah, it looks like we need to handle that case somehow.
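
To make the wraparound concern concrete, here is a minimal sketch of how normal 32-bit XIDs are ordered: the comparison is done modulo 2^32, so it is only meaningful while the two XIDs are less than 2^31 apart. That is why an "old" snapshot held for the whole build needs some protection. The names Xid and xid_precedes are stand-ins for illustration (the real check lives in TransactionIdPrecedes), and the special XIDs are ignored here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for PostgreSQL's 32-bit TransactionId (illustrative name). */
typedef uint32_t Xid;

/*
 * Modular "a is older than b" test for normal XIDs.
 * Assumption: special XIDs (invalid, bootstrap, frozen) are not handled.
 * The subtraction wraps modulo 2^32; the sign of the result, read as a
 * 32-bit signed value, gives the ordering.
 */
static bool
xid_precedes(Xid a, Xid b)
{
    return (int32_t) (a - b) < 0;
}
```

With this rule, an XID just before the counter wraps still precedes a small XID just after the wrap, but once two XIDs drift more than 2^31 apart the answer flips, and an old tuple starts to look like it is in the future.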

Possible options here:

1) Skip vac_truncate_clog while CIC is running. In fact, I think that's
not much worse than the current state: datfrozenxid is still updated
in the catalog and will be considered the next time
vac_update_datfrozenxid is called (the next VACUUM on any table).

2) Delay vac_truncate_clog while CIC is running.
In that case, if it was skipped, the index-building backend will need
to re-run it later.

3) Wait for 64-bit xids :)

4) Any ideas?

In addition, for the first and second options, we need logic to cancel
the second phase in the case of ForceTransactionIdLimitUpdate.
But maybe I'm missing something, and tuples may be frozen regardless of
the stored datfrozenxid values (past some horizon calculated at runtime
from the backends' xmin).

> How does this behave when the newly inserted tuple's xmin gets frozen?
> This would be allowed to happen during heap page pruning, afaik - no
> rules that I know of which are against that - but it would create
> issues where normal snapshot visibility rules would indicate it
> visible to both snapshots regardless of whether it actually was
> visible to the older snapshot when that snapshot was created...

Yes, good catch.
Assuming we have somehow prevented vac_truncate_clog from occurring
during CIC, we can leave frozen and potentially frozen
(xmin<frozenXID) tuples for the second phase.

So, the first phase processes items that are:
* not frozen
* xmin>frozenXID (cannot be frozen)
* visible by snapshot

and the second phase processes items that are:
* frozen
* xmin<frozenXID (may be frozen)
* not in the index summary
* visible by "old" snapshot
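
As a rough sketch of the split, following the rule above that frozen and potentially frozen (xmin<frozenXID) tuples are left for the second phase: the TupleInfo struct, the BuildPhase enum and classify_tuple are hypothetical names for illustration only, not anything from the patch. It reads the second-phase conditions as "frozen or xmin<frozenXID", and treats xmin equal to frozenXID as not potentially frozen, which is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t Xid;               /* stand-in for TransactionId */

typedef enum { PHASE_NONE, PHASE_FIRST, PHASE_SECOND } BuildPhase;

/* Hypothetical per-tuple facts; visibility is assumed to be precomputed. */
typedef struct
{
    Xid  xmin;
    bool frozen;                /* xmin is already marked frozen */
    bool visible_new_snapshot;  /* visible by the current snapshot */
    bool visible_old_snapshot;  /* visible by the "old" snapshot */
    bool in_index_summary;      /* already recorded by the first phase */
} TupleInfo;

/* Modular XID ordering for normal XIDs (special XIDs ignored). */
static bool
xid_precedes(Xid a, Xid b)
{
    return (int32_t) (a - b) < 0;
}

static BuildPhase
classify_tuple(const TupleInfo *t, Xid frozen_xid)
{
    /* Frozen, or old enough that it could have been frozen. */
    bool maybe_frozen = t->frozen || xid_precedes(t->xmin, frozen_xid);

    if (!maybe_frozen)
        return t->visible_new_snapshot ? PHASE_FIRST : PHASE_NONE;

    /* Potentially frozen: defer, and only if the index lacks it. */
    if (!t->in_index_summary && t->visible_old_snapshot)
        return PHASE_SECOND;

    return PHASE_NONE;
}
```

The point of the split is that xmin can only be trusted for the snapshot check while the tuple cannot have been frozen; everything else falls back to the "old" snapshot plus the index summary.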

You might also ask: why is the first stage needed at all? Why not just
use batch processing during the initial index build?

Best regards,
Mikhail.


