Re: [PROPOSAL] VACUUM Progress Checker. - Mailing list pgsql-hackers

From Amit Langote
Subject Re: [PROPOSAL] VACUUM Progress Checker.
Date
Msg-id 56E8AF3F.6020508@lab.ntt.co.jp
Whole thread Raw
In response to Re: [PROPOSAL] VACUUM Progress Checker.  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On 2016/03/16 2:33, Robert Haas wrote:
> On Tue, Mar 15, 2016 at 1:16 AM, Amit Langote
> <Langote_Amit_f8@lab.ntt.co.jp> wrote:
>> On 2016/03/15 3:41, Robert Haas wrote:
>>> Well, I think you need to study the index AMs and figure this out.
>>
>> OK.  I tried to put calls to the callback in appropriate places, but
>> couldn't get the resulting progress numbers to look sane.  So I ended up
>> concluding that any attempt to do so is futile unless I analyze each AM's
>> vacuum code carefully to be able to determine in advance the max bound on
>> the count of blocks that the callback will report.  Anyway, as you
>> suggest, we can improve it later.
> 
> I don't think there is any way to bound that, because new blocks can
> get added to the index concurrently, and we might end up needing to
> scan them.  Reporting the number of blocks scanned can still be
> useful, though - any chance you can just implement that part of it?

Do you mean the changes I made to index bulk delete API for this purpose
in last few versions of this patch?  With it, I have observed that
reported scanned blocks count (that is incremented for every
ReadBufferExtended() call across a index vacuum cycle using the new
callback) can be non-deterministic.  Would there be an accompanying
index_blocks_total (pg_indexes_size()-based) in the view, as well?

>>> But I think for starters you should write a patch that reports the following:
>>>
>>> 1. phase
>>> 2. number of heap blocks scanned
>>> 3. number of heap blocks vacuumed
>>> 4. number of completed index vac cycles
>>> 5. number of dead tuples collected since the last index vac cycle
>>> 6. number of dead tuples that we can store before needing to perform
>>> an index vac cycle
>>>
>>> All of that should be pretty straightforward, and then we'd have
>>> something we can ship.  We can add the detailed index reporting later,
>>> when we get to it, perhaps for 9.7.
>>
>> OK, I agree with this plan.  Attached updated patch implements this.
> 
> Sorta.  Committed after renaming what you called heap blocks vacuumed
> back to heap blocks scanned, adding heap blocks vacuumed, removing the
> overall progress meter which I don't believe will be anything close to
> accurate, fixing some stylistic stuff, arranging to update multiple
> counters automatically where it could otherwise produce confusion,
> moving the new view near similar ones in the file, reformatting it to
> follow the style of the rest of the file, exposing the counter
> #defines via a header file in case extensions want to use them, and
> overhauling and substantially expanding the documentation.

Thanks a lot for committing this.  The committed version is much better.
Some of the things therein should really have been part of the final
*submitted* patch; I will try to improve in that area in my submissions
henceforth.

Thanks,
Amit





pgsql-hackers by date:

Previous
From: David Rowley
Date:
Subject: Re: Parallel Aggregate
Next
From: Robert Haas
Date:
Subject: Re: Parallel Aggregate