Re: 8.3.0 Core with concurrent vacuum fulls

From: Tom Lane
Subject: Re: 8.3.0 Core with concurrent vacuum fulls
Msg-id: 26892.1204732782@sss.pgh.pa.us
In response to: Re: 8.3.0 Core with concurrent vacuum fulls ("Heikki Linnakangas" <heikki@enterprisedb.com>)
List: pgsql-hackers
"Heikki Linnakangas" <heikki@enterprisedb.com> writes:
> Tom Lane wrote:
>> I think we really are at too much risk of PANIC the way it's being done
>> now.  Has anyone got a better idea?

> We could do the pruning in two phases: first figure out what to do
> without modifying anything, outside the critical section, and then
> actually do it, inside the critical section.

> Looking at heap_page_prune, we already collect information about what
> we did in the redirected/nowdead/nowunused arrays for WAL logging
> purposes.

That's a thought, but ...

> We could use that, but we would also have to teach heap_prune_chain to 
> not step into tuples that we've already decided to remove.

... seems like this would require searching the aforementioned arrays
for each tuple examined, which could turn into an O(N^2) problem.
If there are many removable tuples it could easily end up slower than
copying.
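
For concreteness, the membership test heap_prune_chain would need on
each tuple it visits would look something like this (a purely
hypothetical helper, reusing the names of the existing WAL-logging
arrays; the redirected pairs would need the same treatment):

    static bool
    offset_already_planned(OffsetNumber offnum,
                           OffsetNumber *nowdead, int ndead,
                           OffsetNumber *nowunused, int nunused)
    {
        int         i;

        /* Linear scan of the collected arrays: O(N) work per tuple
         * visited, hence O(N^2) over a page with many prunable tuples. */
        for (i = 0; i < ndead; i++)
            if (nowdead[i] == offnum)
                return true;
        for (i = 0; i < nunused; i++)
            if (nowunused[i] == offnum)
                return true;
        return false;
    }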

[ thinks some more... ]  I guess we could use a flag array dimensioned
MaxHeapTuplesPerPage to mark already-processed tuples, so that you
wouldn't need to search the existing arrays but just index into the flag
array with the tuple's offset number.
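
To sketch how the two phases and the flag array could fit together
(plan_prune_chain and apply_prune_plan are hypothetical names, not
anything in the tree; MaxHeapTuplesPerPage, the offset-number macros,
and the critical-section macros are the real ones):

    bool        planned[MaxHeapTuplesPerPage + 1];  /* offsets are 1-based */
    OffsetNumber offnum;
    OffsetNumber maxoff = PageGetMaxOffsetNumber(page);

    memset(planned, 0, sizeof(planned));

    /* Phase 1: decide what to do, touching nothing; an ERROR here
     * cannot escalate to PANIC because the page is still unmodified. */
    for (offnum = FirstOffsetNumber; offnum <= maxoff;
         offnum = OffsetNumberNext(offnum))
    {
        if (planned[offnum])
            continue;           /* O(1) check replaces the array search */
        plan_prune_chain(page, offnum, planned);    /* hypothetical */
    }

    /* Phase 2: apply the recorded changes and emit the WAL record
     * inside the critical section; only straight-line code that
     * cannot fail runs between these macros. */
    START_CRIT_SECTION();
    apply_prune_plan(page);     /* hypothetical */
    /* ... MarkBufferDirty(), XLogInsert() of the prune record ... */
    END_CRIT_SECTION();

The point being that everything that can elog(ERROR) happens before
START_CRIT_SECTION.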

I wonder if the logic could be restructured to avoid this by taking
advantage of it being a two-pass process, instead of fighting it?
But that'd probably be a bigger change than we'd want to risk
back-patching.

Since I'm the one complaining about the PANIC risk, I guess I should
do the legwork here.
        regards, tom lane

