Re: Setting visibility map in VACUUM's second phase - Mailing list pgsql-hackers

From Jeff Janes
Subject Re: Setting visibility map in VACUUM's second phase
Date
Msg-id CAMkU=1wM-9u39QQJhWkHxz0bZCYbGUNMxirTDAqggrOh-R-qRg@mail.gmail.com
Whole thread Raw
In response to Re: Setting visibility map in VACUUM's second phase  (Pavan Deolasee <pavan.deolasee@gmail.com>)
Responses Re: Setting visibility map in VACUUM's second phase
List pgsql-hackers
On Thu, Jan 24, 2013 at 1:28 AM, Pavan Deolasee
<pavan.deolasee@gmail.com> wrote:
> Hi Jeff,
>
> On Thu, Jan 24, 2013 at 2:41 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
>>
>> lazy_vacuum_page now pins and unpins the vmbuffer for each page it marks
>> all-visible, which seems like a lot of needless traffic since the next
>> vmbuffer is likely to be the same as the previous one.
>>
>
> Good idea. Even though the cost of pinning/unpinning may not be high
> with respect to the vacuum cost itself, but it seems to be a good idea
> because we already do that at other places. Do you have any other
> review comments on the patch or I'll fix this and send an updated
> patch soon.

That was the only thing that stood out to me.  You had some doubts
about visibility_cutoff_xid, but I don't know enough about that to
address them.  I agree that it would be nice to avoid the code
duplication, but I don't see a reasonable way to do that.

It applies cleanly (some offsets), builds without warnings, passes
make check under cassert.    No documentation or regression tests are
needed.  We want this, and it does it.

I have verified the primary objective (setting vm sooner) but haven't
experimentally verified that this actually increases throughput for
some workload.  For pgbench when all data fits in shared_buffers, at
least it doesn't cause a readily noticeable slow down.

I haven't devised any patch-specific test cases, either for
correctness or performance.  I just used make check, pgbench, and the
"vacuum verbose" verification you told us about in the original
submission.

If we are going to scan a block twice, I wonder if it should be done
the other way around.  If there are any dead tuples that need to be
removed from indexes, there is no point in dirtying the page with a
HOT prune on the first pass when it will just be dirtied again after
the indexes are vacuumed.  I don't see this idea holding up your patch
though, as surely it would be more invasive than what you are doing.

Cheers,

Jeff



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: [PATCH 0/3] Work around icc miscompilation
Next
From: Steve Singer
Date:
Subject: Re: logical changeset generation v4 - Heikki's thoughts about the patch state