Tom Lane wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
> > We already do something similar for page deletions. Empty pages are not
> > deleted right away, but they are marked with BTP_DEAD, and then deleted
> > on a subsequent vacuum. Or something like that, I don't remember the
> > exact details.
>
> Right, and the reason for that is exactly that there might be a
> concurrent indexscan already "in flight" to the newly-dead page.
> We must wait to recycle the page until we are certain no such scans
> remain.
>
> It doesn't matter whether a concurrent indexscan visits the dead
> page or not, *because it's empty* and so there's nothing to miss.
> So there's no race condition. But if you try to move valid data
> across pages then there is a race condition.

Hmm...

Well, REINDEX is apparently a very expensive operation right now. But
how expensive would it be to go through the entire index and perform
the index page merge operation being discussed here, and nothing else?
If it's fast enough, might it be worthwhile to implement just this
alone as a separate maintenance command (e.g., VACUUM INDEX) that
acquires the appropriate lock (AccessExclusive, I'd expect) on the
index to prevent exactly the issues you're concerned about?
If that holds up even on large tables, it would be a nice
alternative to REINDEX, I'd think.
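
To make the race Tom describes concrete, here is a toy sketch in Python.
It is not PostgreSQL code: pages are plain lists of keys, and scan(),
merge_left(), and the hook parameter are hypothetical names invented for
this illustration. It only assumes a forward scan that reads a page and
then steps right, which is my simplified reading of the discussion above.

```python
# Toy model only: an "index" is a list of pages, each page a list of keys.
# None of these names correspond to real PostgreSQL functions.

def scan(pages, hook=None):
    """Forward scan: read each page in order, left to right.

    hook(i) runs after page i has been read, standing in for a
    concurrent backend that modifies the index mid-scan.
    """
    seen = []
    i = 0
    while i < len(pages):
        seen.extend(pages[i])   # read everything on the current page
        if hook:
            hook(i)
        i += 1                  # step to the right sibling
    return seen


def merge_left(pages, i):
    """Move all keys from page i+1 onto page i, leaving page i+1 empty."""
    pages[i].extend(pages[i + 1])
    pages[i + 1] = []


# Case 1: an empty page is left in place. The scan visits it and
# finds nothing, so nothing can be missed.
empty_case = scan([[1, 2], [], [5, 6]])

# Case 2: while the scan sits between page 0 and page 1, a concurrent
# merge moves key 3 onto page 0, which the scan has already read.
pages = [[1, 2], [3], [5, 6]]
racy_case = scan(pages, hook=lambda i: merge_left(pages, 0) if i == 0 else None)

# Case 3: the merge runs to completion before any scan starts, as it
# would if the merge held an exclusive lock on the index.
pages = [[1, 2], [3], [5, 6]]
merge_left(pages, 0)
locked_case = scan(pages)

print(empty_case)   # [1, 2, 5, 6]
print(racy_case)    # [1, 2, 5, 6]  (key 3 was missed)
print(locked_case)  # [1, 2, 3, 5, 6]
```

In the second case the scan never sees key 3, because the key moved
behind it. The third case is the behavior I'd hope for from a command
that takes the exclusive lock first.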
--
Kevin Brown kevin@sysexperts.com