Tom Lane wrote:
> "Curtis Faith" <curtis@galtcapital.com> writes:
> > I don't dispute their conclusions in that context and under the
> > circumstances they outline of random distribution of deletion and
> > insertion values for the index keys. [But the random-distribution
> > assumption doesn't always hold.]
>
> That's a fair point. Nonetheless, we have little choice: we
> cannot move keys around during concurrent operations. If we
> push keys to the right, we may cause an indexscan moving left
> to miss them, and vice versa. So we can only eliminate empty pages.
>
> We could possibly allow VACUUM FULL to collapse partly-full
> pages, since in that case we know there are no concurrent scans.
Couldn't we take exclusive locks on both leaf pages and the parent using
a new LockBuffer-type function, named something like LockBufferNoWait,
that uses LWLockConditionalAcquire instead of LWLockAcquire? If all
three exclusive locks cannot be obtained, release whatever locks were
acquired, sleep, and try again, up to N retries.
(Actually, this would probably be four locks since the sibling pointer
of one of the siblings would have to be updated to point to the new
merged page instead of the to-be-deleted page.)
Having exclusive locks on all three pages prior to a merge would ensure
that no scans were interrupted during that merge.
Releasing all the exclusive locks whenever any one of them cannot be
obtained eliminates the possibility of deadlock, since VACUUM never
sleeps while holding a partial set of locks.
After N retries, VACUUM could give up on that merge, leaving it to be
picked up by a future VACUUM run.
I would think this would be fairly easy to implement, and that except
for very heavily scanned indexes, the locks could usually be acquired
and the merge would complete without causing any concurrency problems.
- Curtis