Re: [PROPOSAL] VACUUM Progress Checker. - Mailing list pgsql-hackers

From
Subject Re: [PROPOSAL] VACUUM Progress Checker.
Date
Msg-id 2c0b4be2f2e84e019d6bc5ea9f2e9813@MP-MSGSS-MBX007.msg.nttdata.co.jp
Whole thread Raw
In response to Re: [PROPOSAL] VACUUM Progress Checker.  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
Hi,
> -----Original Message-----
> From: Robert Haas [mailto:robertmhaas@gmail.com]
> Sent: Wednesday, March 16, 2016 2:34 AM
> To: Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>
> Cc: Amit Langote <amitlangote09@gmail.com>; SPS ポクレ ヴィナヤック(
> 三技術) <pokurev@pm.nttdata.co.jp>; Kyotaro HORIGUCHI
> <horiguchi.kyotaro@lab.ntt.co.jp>; pgsql-hackers@postgresql.org; SPS 坂野
> 昌平(三技術) <bannos@nttdata.co.jp>
> Subject: Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.
> 
> On Tue, Mar 15, 2016 at 1:16 AM, Amit Langote
> <Langote_Amit_f8@lab.ntt.co.jp> wrote:
> > On 2016/03/15 3:41, Robert Haas wrote:
> >> On Sat, Mar 12, 2016 at 7:49 AM, Amit Langote
> <amitlangote09@gmail.com> wrote:
> >>> Instead, the attached patch adds a IndexBulkDeleteProgressCallback
> >>> which AMs should call for every block that's read (say, right before
> >>> a call to ReadBufferExtended) as part of a given vacuum run. The
> >>> callback with help of some bookkeeping state can count each block
> >>> and report to pgstat_progress API. Now, I am not sure if all AMs
> >>> read 1..N blocks for every vacuum or if it's possible that some
> >>> blocks are read more than once in single vacuum, etc.  IOW, some
> >>> AM's processing may be non-linear and counting blocks 1..N (where N
> >>> is reported total index blocks) may not be possible.  However, this
> >>> is the best I could think of as doing what we are trying to do here.
> >>> Maybe index AM experts can chime in on that.
> >>>
> >>> Thoughts?
> >>
> >> Well, I think you need to study the index AMs and figure this out.
> >
> > OK.  I tried to put calls to the callback in appropriate places, but
> > couldn't get the resulting progress numbers to look sane.  So I ended
> > up concluding that any attempt to do so is futile unless I analyze
> > each AM's vacuum code carefully to be able to determine in advance the
> > max bound on the count of blocks that the callback will report.
> > Anyway, as you suggest, we can improve it later.
> 
> I don't think there is any way to bound that, because new blocks can get
> added to the index concurrently, and we might end up needing to scan
> them.  Reporting the number of blocks scanned can still be useful, though -
> any chance you can just implement that part of it?
> 
> >> But I think for starters you should write a patch that reports the following:
> >>
> >> 1. phase
> >> 2. number of heap blocks scanned
> >> 3. number of heap blocks vacuumed
> >> 4. number of completed index vac cycles 5. number of dead tuples
> >> collected since the last index vac cycle 6. number of dead tuples
> >> that we can store before needing to perform an index vac cycle
> >>
> >> All of that should be pretty straightforward, and then we'd have
> >> something we can ship.  We can add the detailed index reporting
> >> later, when we get to it, perhaps for 9.7.
> >
> > OK, I agree with this plan.  Attached updated patch implements this.
> 
> Sorta.  Committed after renaming what you called heap blocks vacuumed
> back to heap blocks scanned, adding heap blocks vacuumed, removing the
> overall progress meter which I don't believe will be anything close to
> accurate, fixing some stylistic stuff, arranging to update multiple counters
> automatically where it could otherwise produce confusion, moving the new
> view near similar ones in the file, reformatting it to follow the style of the
> rest of the file, exposing the counter #defines via a header file in case
> extensions want to use them, and overhauling and substantially expanding
> the documentation.

Thank you for committing this feature.
There is one minor bug.
s/ pgstat_progress_update_params/ pgstat_progress_update_multi_param/g
 Attached patch fixes a minor bug.

Regards,
Vinayak Pokale

Attachment

pgsql-hackers by date:

Previous
From: Amit Langote
Date:
Subject: Typo in monitoring.sgml
Next
From: Amit Kapila
Date:
Subject: Re: [WIP] speeding up GIN build with parallel workers