Re: [PROPOSAL] VACUUM Progress Checker. - Mailing list pgsql-hackers

From Jim Nasby
Subject Re: [PROPOSAL] VACUUM Progress Checker.
Date
Msg-id 564DF17A.4050705@BlueTreble.com
Whole thread Raw
In response to Re: [PROPOSAL] VACUUM Progress Checker.  (Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>)
Responses Re: [PROPOSAL] VACUUM Progress Checker.  (Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>)
List pgsql-hackers
On 11/19/15 1:18 AM, Amit Langote wrote:
> 1) General purpose interface for (maintenance?) commands to report a set

I'm surprised no one has picked up on using this for DML. Certainly 
anyone works with ETL processes would love to be able to get some clue 
on the status of a long running query...

> About pass 2, ie, lazy_index_vacuum() and
> lazy_vacuum_heap(), I don't see how we can do better than reporting its
> progress only after finishing all of it without any finer-grained
> instrumentation. They are essentially block-box as far as the proposed
> instrumentation approach is concerned. Being able to report progress per
> index seems good but as a whole, a user would have to wait arbitrarily
> long before numbers move forward. We might as well just report a bool
> saying we're about to enter a potentially time-consuming index vacuum
> round with possibly multiple indexes followed by lazy_vacuum_heap()
> processing. Additionally, we can report the incremented count of the
> vacuuming round (pass 2) once we are through it.

Another option is to provide the means for the index scan routines to 
report their progress. Maybe every index AM won't use it, but it'd 
certainly be a lot better than staring at a long_running boolean.

> Note that we can leave them out of
> percent_done of overall vacuum progress. Until we have a good solution for
> number (3) above, it seems to difficult to incorporate index pages into
> overall progress.

IMHO we need to either put a big caution sign on any % estimate that it 
could be wildly off, or just forgo it completely for now. I'll bet that 
if we don't provide it some enterprising users will figure out the best 
way to do this (similar to how the bloat estimate query has evolved over 
time).

Even if we never get a % done indicator, just being able to see what 
'position' a command is at will be very valuable.

> As someone pointed out upthread, the final heap truncate phase can take
> arbitrarily long and is outside the scope of lazy_scan_heap() to
> instrument. Perhaps a bool, say, waiting_heap_trunc could be reported for
> the same. Note that, it would have to be reported from lazy_vacuum_rel().

ISTM this is similar to the problem of reporting index status, namely 
that a progress reporting method needs to accept reports from multiple 
places in the code.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: [PROPOSAL] TAP test example
Next
From: Big Mike
Date:
Subject: Re: Foreign Data Wrapper