Re: Reducing relation locking overhead - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Reducing relation locking overhead
Date
Msg-id 16007.1134071604@sss.pgh.pa.us
Whole thread Raw
In response to Re: Reducing relation locking overhead  (Simon Riggs <simon@2ndquadrant.com>)
Responses Re: Reducing relation locking overhead
List pgsql-hackers
Simon Riggs <simon@2ndquadrant.com> writes:
> I'm hand-waving here, so I'll stop. But we do know there *is* a way,
> because this is already implemented elsewhere, somehow.

That's not really the point --- the question is whether the cure is
worse than the disease.  It's entirely possible that the tradeoffs
needed to support fully concurrent REINDEX would represent a higher
price than the feature is worth, or that it's simply too much work
to get there from here.

For instance, I would imagine that the way Oracle does this relies
on their use of rollback segments, which is something we're certainly
unlikely to emulate.

Given the discussion so far, it seems likely to me that completely
concurrent REINDEX is indeed out of reach, and that what we ought to
be thinking about is what sort of compromise design (ie, partially
concurrent REINDEX) is reasonable.

Something that might work is:

1. Take ShareUpdateExclusiveLock (this blocks VACUUM and DDL changes),
then run existing CREATE INDEX code.  The finished index may be missing
some tuples inserted during the run.

2. Commit transaction so that index becomes visible (we assume it's
marked so that the planner will know not to rely on it).  Continue to
hold ShareUpdateExclusiveLock so VACUUM doesn't run.

3. Attempt to acquire ShareLock (possibly a ConditionalLockAcquire/sleep
loop instead of just flat-out LockAcquire).  Once we have this we know
there are no active writer transactions.  Release the lock immediately.

4. Make a new scan of the table and insert any rows not already present
in the index.  (This need not process anything inserted later than step
3, because any new transactions will know to insert rows in the index
anyway.)

5. Mark index good and commit, releasing all locks.

I don't think that it's worth the effort and complexity to try to avoid
a full table scan in step 4.  At best you would save much less than 50%
of the total work, and the impact on normal operations is not free.

If what you want is a REINDEX rather than creating an independent new
index, then at step 5 you need to do a swap operation which'll require
obtaining exclusive lock on the index.  This creates another opportunity
for deadlock failures, but again a conditional loop might help.

There are still some issues about the behavior when the index is UNIQUE.
Ideally you would like the CREATE INDEX to fail on a duplicate, not any
concurrent writer transaction, but I don't think it's possible to
guarantee that.

Also, I'm not sure how we get rid of the broken index if there is a
failure later than step 2.
        regards, tom lane


pgsql-hackers by date:

Previous
From: Jan Wieck
Date:
Subject: Re: HOOKS for Synchronous Replication?
Next
From: "Jonah H. Harris"
Date:
Subject: Re: HOOKS for Synchronous Replication?