Tom Lane wrote:
> The reason the critical section is so large is that we're manipulating
> the contents of a shared buffer, and we don't want a failure to leave a
> partially-modified page in the buffer. We could fix that if we were to
> memcpy the page into local storage and do all the pruning work there.
> Then the critical section would only surround copying the page back to
> the buffer and writing the WAL record. Copying the page is a tad
> annoying but heap_page_prune is an expensive operation anyway, and
> I think we really are at too much risk of PANIC the way it's being done
> now. Has anyone got a better idea?
We could do the pruning in two phases: first figure out what to do
without modifying anything, outside the critical section, and then
actually do it, inside the critical section.
Looking at heap_page_prune, we already collect information about what we
did in the redirected/nowdead/nowunused arrays for WAL logging purposes.
We could use that, but we would also have to teach heap_prune_chain to
not step into tuples that we've already decided to remove.
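
To make the two-phase idea concrete, here is a rough sketch on a toy
in-memory page. It is not the real heapam code. ToyItem, ToyPage,
PrunePlan, plan_chain, prune_plan, prune_execute and the is_root and
removable flags are all made up for illustration; only the
redirected/nowdead/nowunused array names follow the ones mentioned above.
The marked[] array is the assumed way of keeping the chain walk from
stepping into items that an earlier call already decided to remove.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_ITEMS 16
#define NO_NEXT   (-1)

typedef enum { ITEM_UNUSED, ITEM_NORMAL, ITEM_REDIRECT, ITEM_DEAD } ItemState;

typedef struct {
    ItemState state;
    int  next;       /* next chain member or NO_NEXT; redirect target if ITEM_REDIRECT */
    bool is_root;    /* hypothetical: a chain starts at this item */
    bool removable;  /* hypothetical stand-in for a visibility check */
} ToyItem;

typedef struct { int nitems; ToyItem items[MAX_ITEMS]; } ToyPage;

typedef struct {
    int  nredirected, nnowdead, nnowunused;
    int  redirected[MAX_ITEMS][2];  /* {from, to} pairs */
    int  nowdead[MAX_ITEMS];
    int  nowunused[MAX_ITEMS];
    bool marked[MAX_ITEMS];         /* items already scheduled for a change */
} PrunePlan;

/* Phase 1 helper: decide the fate of one item. Reads the page, never writes it. */
static void plan_chain(const ToyPage *page, int start, PrunePlan *plan)
{
    const ToyItem *it = &page->items[start];
    int run[MAX_ITEMS], nrun = 0, cur;

    if (plan->marked[start] || it->state != ITEM_NORMAL)
        return;
    if (!it->is_root) {
        /* Removable non-root item with no successor: free it directly. */
        if (it->removable && it->next == NO_NEXT) {
            plan->nowunused[plan->nnowunused++] = start;
            plan->marked[start] = true;
        }
        return;
    }
    /* Collect the leading run of removable members. The page still shows
     * already-decided items as normal, so consult marked[] to avoid them. */
    for (cur = start; cur != NO_NEXT && nrun < MAX_ITEMS; cur = page->items[cur].next) {
        if (plan->marked[cur] || page->items[cur].state != ITEM_NORMAL) {
            cur = NO_NEXT;          /* treat as end of chain */
            break;
        }
        if (!page->items[cur].removable)
            break;                  /* cur is the first surviving member */
        run[nrun++] = cur;
    }
    if (nrun == 0)
        return;
    if (cur != NO_NEXT) {           /* survivor exists: redirect the root to it */
        plan->redirected[plan->nredirected][0] = start;
        plan->redirected[plan->nredirected][1] = cur;
        plan->nredirected++;
    } else {                        /* whole chain is removable */
        plan->nowdead[plan->nnowdead++] = start;
    }
    plan->marked[start] = true;
    for (int i = 1; i < nrun; i++) {
        plan->nowunused[plan->nnowunused++] = run[i];
        plan->marked[run[i]] = true;
    }
}

/* Phase 1: build the whole plan without touching the page. */
static void prune_plan(const ToyPage *page, PrunePlan *plan)
{
    memset(plan, 0, sizeof(*plan));
    for (int i = 0; i < page->nitems; i++)
        plan_chain(page, i, plan);
}

/* Phase 2: apply the plan with plain assignments only. In a real system
 * this is the part that would sit inside the critical section. */
static void prune_execute(ToyPage *page, const PrunePlan *plan)
{
    assert(plan->nredirected + plan->nnowdead + plan->nnowunused <= page->nitems);
    for (int i = 0; i < plan->nredirected; i++) {
        ToyItem *it = &page->items[plan->redirected[i][0]];
        it->state = ITEM_REDIRECT;
        it->next = plan->redirected[i][1];
    }
    for (int i = 0; i < plan->nnowdead; i++) {
        page->items[plan->nowdead[i]].state = ITEM_DEAD;
        page->items[plan->nowdead[i]].next = NO_NEXT;
    }
    for (int i = 0; i < plan->nnowunused; i++) {
        page->items[plan->nowunused[i]].state = ITEM_UNUSED;
        page->items[plan->nowunused[i]].next = NO_NEXT;
    }
}
```

In this sketch anything that can fail (the chain walk, the bookkeeping)
happens in prune_plan, and the same arrays that drive prune_execute are
what one would hand to WAL logging. Without the marked[] check, an item
reached through its chain and then visited directly would be recorded
twice.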
--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com