Re: [PROPOSAL] VACUUM Progress Checker. - Mailing list pgsql-hackers

From Rahila Syed
Subject Re: [PROPOSAL] VACUUM Progress Checker.
Date
Msg-id CAH2L28sqDDt-aoTpzp1ma8s=1WiqRW3QEobBcjJQ7e2RbKhcDA@mail.gmail.com
Whole thread Raw
In response to Re: [PROPOSAL] VACUUM Progress Checker.  (Jim Nasby <Jim.Nasby@BlueTreble.com>)
Responses Re: [PROPOSAL] VACUUM Progress Checker.  (Vinayak <vinpokale@gmail.com>)
List pgsql-hackers
Hello,

Thank you for your comments.
Please find attached patch addressing following comments ,

>- duplicate_oids error in HEAD.
Check.

>- a compiler warning:
>pgstat.c:2898: warning: no previous prototype for ‘pgstat_reset_activityflag’
Check.

>One more change you could do is 's/activityflag/activity_flag/g',
Check.

>Type of the flag of 
vacuum activity.
The flag variable is an integer to incorporate more commands in future. 

>Type of st_progress_param and so.
st_progress_param is also given a generic name to incorporate different parameters reported from various commands.

>st_progress_param_float is currently totally useless.
Float parameter has currently been removed from the patch.

>Definition of progress_message.
>The definition of progress_message in lazy_scan_heap is "char
>[PROGRESS_MESSAGE_LENGTH][N_PROGRESS_PARAM]" which looks to be
>inversed.
Corrected.

>The current code subtracts the number of blocks when
>skipping_all_visible_blocks is set in two places. But I think
>it is enough to decrement when skipping.
In both the places, the pages are being skipped hence the total count was decremented.

>He suggested to keep total_heap_pages fixed while adding number
>of skipped pages to that of scanned pages. 
This has been done in the attached.

>snprintf(progress_message[0], PROGRESS_MESSAGE_LENGTH, "%s.%s",
>                          get_namespace_name(RelationGetNamespace(rel)),
>                          relname);
Check.

The previous implementation used to add total number of pages across all indexes of a relation to total_index_pages in every scan of 
indexes to account for total pages scanned. Thus, it  was equal to number of scans * total_index_pages.

In the attached patch, total_index_pages reflects total number of pages across all indexes of a relation.
And the column to report passes through indexes (phase 2) has been added to account for number of passes for index and heap vacuuming.
Number of scanned index pages is reset at the end of each pass.
This makes the reporting clearer.
The percent complete does not account for index pages. It just denotes percentage of heap scanned. 

>Spotted a potential oversight regarding report of scanned_pages. It
>seems pages that are skipped because of not getting a pin, being new,
>being empty could be left out of the progress equation.
Corrected.


>It's better to
>report that some other way, like use one of the strings to report a
>"phase" of processing that we're currently performing.
Has been included in the attached.

Some more comments need to be addressed which include name change of activity flag, reporting only changed parameters to shared memory,
ACTIVITY_IS_VACUUM flag being set unnecessarily for ANALYZE and FULL commands ,documentation for new view.  
Also, finer grain reporting from indexes and heap truncate phase is yet to be incorporated into the patch

Thank you,
Rahila Syed
 
Attachment

pgsql-hackers by date:

Previous
From: Kyotaro HORIGUCHI
Date:
Subject: [PoC] Asynchronous execution again (which is not parallel)
Next
From: Michael Paquier
Date:
Subject: Re: Proposal: Trigonometric functions in degrees