Re: [PROPOSAL] VACUUM Progress Checker. - Mailing list pgsql-hackers

From Rahila Syed
Subject Re: [PROPOSAL] VACUUM Progress Checker.
Date
Msg-id CAH2L28sJyTG+5UmxFRxsdcQGQUznrQASk0eryHGNrQTSKXi=TA@mail.gmail.com
Whole thread Raw
In response to Re: [PROPOSAL] VACUUM Progress Checker.  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: [PROPOSAL] VACUUM Progress Checker.  (Amit Langote <amitlangote09@gmail.com>)
List pgsql-hackers
>+                if(!scan_all)
>+                    scanned_heap_pages = scanned_heap_pages +
>next_not_all_visible_block;

>I don't want to be too much of a stickler for details here, but it
>seems to me that this is an outright lie.

Initially the scanned_heap_pages were meant to report just the scanned pages and skipped pages were not added to the count.  Instead the skipped pages were deduced from number of total heap pages to be scanned to make the number of scanned pages eventually add up to total heap pages.   As per comments received later total heap pages were kept constant and skipped pages count was added to scanned pages to make the count add up to total heap pages at the end of scan. That said, as suggested, scanned_heap_pages should be renamed to current_heap_page to report current blkno in lazy_scan_heap loop which will add up to total heap pages(nblocks) at the end of scan. And scanned_heap_pages can be reported as a separate number which wont contain skipped pages.


>+        /*
>+         * Reporting vacuum progress to statistics collector
>+         */

>This patch doesn't report anything to the statistics collector, nor should it.
Yes. This was incorrectly added initially by referring to similar pgstat_report interface functions.

Thank you,
Rahila Syed

On Thu, Jan 28, 2016 at 2:27 AM, Robert Haas <robertmhaas@gmail.com> wrote:
On Tue, Jan 26, 2016 at 11:37 PM, Vinayak Pokale <vinpokale@gmail.com> wrote:
> Hi,
>
> Please find attached updated patch with an updated interface.

Well, this isn't right.  You've got this sort of thing:

+            scanned_index_pages += RelationGetNumberOfBlocks(Irel[i]);
+            /* Report progress to the statistics collector */
+            pgstat_report_progress_update_message(0, progress_message);
+            pgstat_report_progress_update_counter(1, scanned_heap_pages);
+            pgstat_report_progress_update_counter(3, scanned_index_pages);
+            pgstat_report_progress_update_counter(4,
vacrelstats->num_index_scans + 1);

The point of having pgstat_report_progress_update_counter() is so that
you can efficiently update a single counter without having to update
everything, when only one counter has changed.  But here you are
calling this function a whole bunch of times in a row, which
completely misses the point - if you are updating all the counters,
it's more efficient to use an interface that does them all at once
instead of one at a time.

But there's a second problem here, too, which is that I think you've
got this code in the wrong place.  The second point of the
pgstat_report_progress_update_counter interface is that this should be
cheap enough that we can do it every time the counter changes.  That's
not what you are doing here.  You're updating the counters at various
points in the code and just pushing new values for all of them
regardless of which ones have changed.  I think you should find a way
that you can update the value immediately at the exact moment it
changes.  If that seems like too much of a performance hit we can talk
about it, but I think the value of this feature will be greatly
weakened if users can't count on it to be fully and continuously up to
the moment.  When something gets stuck, you want to know where it's
stuck, not approximately kinda where it's stuck.

+                if(!scan_all)
+                    scanned_heap_pages = scanned_heap_pages +
next_not_all_visible_block;

I don't want to be too much of a stickler for details here, but it
seems to me that this is an outright lie.  The number of scanned pages
does not go up when we decide to skip some pages, because scanning and
skipping aren't the same thing.  If we're only going to report one
number here, it needs to be called something like "current heap page",
and then you can just report blkno at the top of each iteration of
lazy_scan_heap's main loop.  If you want to report the numbers of
scanned and skipped pages separately that'd be OK too, but you can't
call it the number of scanned pages and then actually report a value
that is not that.

+        /*
+         * Reporting vacuum progress to statistics collector
+         */

This patch doesn't report anything to the statistics collector, nor should it.

Instead of making the SQL-visible function
pg_stat_get_vacuum_progress(), I think it should be something more
generic like pg_stat_get_command_progress().  Maybe VACUUM will be the
only command that reports into that feature for right now, but I'd
hope for us to change that pretty soon after we get the first patch
committed.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

pgsql-hackers by date:

Previous
From: Tomas Vondra
Date:
Subject: Re: Template for commit messages
Next
From: Bruce Momjian
Date:
Subject: Re: Template for commit messages