
Thread: [PROPOSAL] VACUUM Progress Checker.

[PROPOSAL] VACUUM Progress Checker.

From

Rahila Syed

Date:

30 June 2015, 07:38:04

Hello Hackers,

The following is a proposal for a feature to calculate VACUUM progress.

Use Case: Measuring the progress of long-running VACUUMs to help DBAs make an informed decision on whether to continue running VACUUM or abort it.

Design:

A shared preload library to store progress information from different backends running VACUUM, calculate the remaining time for each, and display progress in the form of a view.

VACUUM needs to be instrumented with a hook to collect progress information (pages vacuumed/scanned) periodically.

The patch attached implements a new hook to store vacuumed_pages and scanned_pages count at the end of each page scanned by VACUUM.

This information is stored in a shared memory structure.
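A minimal sketch of the hook mechanism described above, written as standalone C so it compiles without the PostgreSQL headers. The names `vacuum_progress_hook`, `vacuum_progress_hook_type`, `VacuumProgressSlot`, `store_progress`, `get_progress` and `scan_pages` are hypothetical and are not taken from the attached patch; the slot is an ordinary static variable here, whereas the proposal keeps it in shared memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for PostgreSQL's Oid so the sketch compiles on its own. */
typedef uint32_t Oid;

/* Hypothetical per-backend progress slot (shared memory in the proposal). */
typedef struct VacuumProgressSlot
{
    Oid     tableoid;
    int32_t scanned_pages;
    int32_t vacuumed_pages;
} VacuumProgressSlot;

/* Hypothetical hook signature: called once per page processed by VACUUM. */
typedef void (*vacuum_progress_hook_type) (Oid tableoid,
                                           int32_t scanned_pages,
                                           int32_t vacuumed_pages);

/* NULL means no extension has installed a progress hook. */
vacuum_progress_hook_type vacuum_progress_hook = NULL;

static VacuumProgressSlot my_slot;

/* What an extension's hook function might do: record the latest counts. */
void
store_progress(Oid tableoid, int32_t scanned_pages, int32_t vacuumed_pages)
{
    assert(scanned_pages >= 0 && vacuumed_pages >= 0);
    my_slot.tableoid = tableoid;
    my_slot.scanned_pages = scanned_pages;
    my_slot.vacuumed_pages = vacuumed_pages;
}

/* Read access for whoever displays the progress. */
const VacuumProgressSlot *
get_progress(void)
{
    return &my_slot;
}

/* Stand-in for the per-page loop in VACUUM. */
void
scan_pages(Oid tableoid, int32_t npages)
{
    int32_t vacuumed = 0;

    for (int32_t blkno = 0; blkno < npages; blkno++)
    {
        /* Pretend that every other page needed vacuuming. */
        if (blkno % 2 == 0)
            vacuumed++;

        /* Report at the end of each page, but only if a hook is installed. */
        if (vacuum_progress_hook != NULL)
            vacuum_progress_hook(tableoid, blkno + 1, vacuumed);
    }
}
```

With the hook left as NULL the loop does no reporting work at all, which is the usual argument for a hook-based design: the cost is paid only when an extension is loaded.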

In addition to measuring progress, the hook function also calculates the remaining time for VACUUM.

The frequency of collecting progress information can be reduced by adding delays between hook function calls.

Also, a GUC parameter, log_vacuum_min_duration, can be used. This will cause VACUUM progress to be calculated only if VACUUM runs longer than the specified number of milliseconds.

A value of zero calculates VACUUM progress for each page processed; -1 disables logging.
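The three cases for log_vacuum_min_duration can be written as a small predicate. This is only a sketch of the semantics as stated in the proposal; the function name `vacuum_progress_enabled` is hypothetical, and the strict "longer than" comparison is an assumption read from the wording above.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Decide whether progress should be calculated at this point.
 *   -1            : disabled
 *    0            : calculate for every page processed
 *    N (positive) : calculate only once VACUUM has run longer than N ms
 */
bool
vacuum_progress_enabled(int log_vacuum_min_duration, long elapsed_ms)
{
    assert(elapsed_ms >= 0);

    if (log_vacuum_min_duration < 0)
        return false;
    if (log_vacuum_min_duration == 0)
        return true;
    return elapsed_ms > log_vacuum_min_duration;
}
```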

Progress calculation :

percent_complete = scanned_pages * 100 / total_pages_to_be_scanned;

remaining_time = elapsed_time * (total_pages_to_be_scanned - scanned_pages) / scanned_pages;
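As a worked sketch of the two formulas, assuming they are evaluated in floating point: the function names are hypothetical, and returning -1.0 for the undefined cases (nothing to scan, or no page scanned yet) is a choice made for this example, not something the proposal specifies. Note that `scanned_pages * 100`, if evaluated in 32-bit integer arithmetic, would overflow once scanned_pages exceeds roughly 21 million, so the sketch converts to double first.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Percentage of the pages to be scanned that have been scanned so far.
 * Returns -1.0 when there is nothing to scan (percentage undefined).
 */
double
vacuum_percent_complete(int32_t scanned_pages,
                        int32_t total_pages_to_be_scanned)
{
    assert(scanned_pages >= 0 && total_pages_to_be_scanned >= 0);

    if (total_pages_to_be_scanned == 0)
        return -1.0;
    return (double) scanned_pages * 100.0 / (double) total_pages_to_be_scanned;
}

/*
 * Linear extrapolation of the remaining time from the scan rate so far.
 * Returns -1.0 until at least one page has been scanned, because the
 * formula divides by scanned_pages.
 */
double
vacuum_remaining_time(double elapsed_time,
                      int32_t scanned_pages,
                      int32_t total_pages_to_be_scanned)
{
    assert(elapsed_time >= 0.0);

    if (scanned_pages <= 0)
        return -1.0;
    return elapsed_time
        * (double) (total_pages_to_be_scanned - scanned_pages)
        / (double) scanned_pages;
}
```

For example, after 10 seconds with 250 of 1000 pages scanned, the formulas give 25 percent complete and 30 seconds remaining.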

Shared memory struct:

typedef struct PgStat_VacuumStats
{
    Oid     databaseoid;
    Oid     tableoid;
    int32   vacuumed_pages;
    int32   total_pages;
    int32   scanned_pages;
    double  elapsed_time;
    double  remaining_time;
} PgStat_VacuumStats;

/* Shared memory holds an array of max_connections such entries, one per backend. */

Reporting :

A view named 'pg_maintenance_progress' can be created using the values in the struct above.

pg_stat_maintenance can be called from any other backend and will display the progress of each running VACUUM.

Other uses of hook in VACUUM:

The cost of VACUUM in terms of pages hit, missed, and dirtied, as reported for autovacuum, can be collected using this hook.

Autovacuum reports it at the end of VACUUM for each table. With the hook, it can be done while VACUUM on a table is in progress.
This can be helpful for tracking manual VACUUMs too, not just autovacuum.

Read/write (I/O) rates can be computed along the lines of autovacuum.
Read rate patterns can be used to help tune future vacuums on the table (for example, shared buffers tuning).
Other resource usage can also be collected using the progress checker hook.

The attached patch is a POC patch of progress calculation for a single backend.

Also attached is a brief snapshot of the output log.

Attachment

Re: [PROPOSAL] VACUUM Progress Checker.

From

Pavel Stehule

Date:

30 June 2015, 07:53:57

Hi

2015-06-30 9:37 GMT+02:00 Rahila Syed <rahilasyed90@gmail.com>:

> Hello Hackers,
>
> Following is a proposal for feature to calculate VACUUM progress.

interesting idea - I would like to see it integrated into core.

> Use Case : Measuring progress of long running VACUUMs to help DBAs make informed decision
> whether to continue running VACUUM or abort it.
>
> Design:
>
> A shared preload library to store progress information from different backends running VACUUM, calculate remaining time for each and display progress in the form of a view.

probably a similar idea can be used for REINDEX, CREATE INDEX, COPY TO statements

I thought about the possibilities of progress visualization - and one possibility is one or two special columns in pg_stat_activity table - this info can be interesting for VACUUM started by autovacuum too.

Regards

Pavel


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [PROPOSAL] VACUUM Progress Checker.

From

Thomas Munro

Date:

30 June 2015, 07:59:17

On Tue, Jun 30, 2015 at 7:37 PM, Rahila Syed <rahilasyed90@gmail.com> wrote:
> Hello Hackers,
>
> Following is a proposal for feature to calculate VACUUM progress.
>
> Use Case : Measuring progress of long running VACUUMs to help DBAs make
> informed decision
> whether to continue running VACUUM or abort it.

+1

I was thinking recently that it would be very cool to see some
estimation of the progress of VACUUM and CLUSTER in a view similar to
pg_stat_activity, or the ps title.

-- 
Thomas Munro
http://www.enterprisedb.com

Re: [PROPOSAL] VACUUM Progress Checker.

From

dinesh kumar

Date:

30 June 2015, 08:09:56

On Tue, Jun 30, 2015 at 1:07 PM, Rahila Syed <rahilasyed90@gmail.com> wrote:

> Hello Hackers,
>
> Following is a proposal for feature to calculate VACUUM progress.
>
> Use Case : Measuring progress of long running VACUUMs to help DBAs make informed decision
> whether to continue running VACUUM or abort it.

+1

I am keen to know how progress reporting works when a statement is blocked on locks. Rather than displaying the stats in the LOG, shall we have this in pg_stat_vacuum_activity [a new catalog for all auto-vacuum stats]?

Best Regards,

Dinesh

manojadinesh.blogspot.com


Re: [PROPOSAL] VACUUM Progress Checker.

From

Simon Riggs

Date:

30 June 2015, 08:32:59

On 30 June 2015 at 08:52, Pavel Stehule <pavel.stehule@gmail.com> wrote:

> I thought about the possibilities of progress visualization - and one possibility is one or two special columns in pg_stat_activity table - this info can be interesting for VACUUM started by autovacuum too.

Yes, I suggest just a single column on pg_stat_activity called pct_complete

trace_completion_interval = 5s (default)

Every interval, we report the current % complete for any operation that supports it. We just show NULL if the current operation has not reported anything or never will.

We do this for VACUUM first, then we can begin adding other operations as we work out how (for that operation).
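The interval-based reporting described above could be sketched as follows. This is an illustration only: `ProgressReport` and `maybe_publish` are hypothetical names, times are plain millisecond counters supplied by the caller, and the `has_value` flag stands in for the NULL shown when an operation has not reported anything.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* What a reader of the column would see for one backend. */
typedef struct ProgressReport
{
    bool    has_value;          /* false is displayed as NULL */
    double  pct_complete;
    int64_t last_publish_ms;
} ProgressReport;

/*
 * Publish pct only if nothing has been published yet or at least
 * interval_ms has passed since the last publication.
 * Returns true when the report was updated.
 */
bool
maybe_publish(ProgressReport *report, int64_t now_ms,
              int64_t interval_ms, double pct)
{
    assert(pct >= 0.0 && pct <= 100.0);

    if (report->has_value && now_ms - report->last_publish_ms < interval_ms)
        return false;

    report->has_value = true;
    report->pct_complete = pct;
    report->last_publish_ms = now_ms;
    return true;
}
```

An operation that never calls `maybe_publish` leaves `has_value` false, so it keeps showing NULL without any special handling.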

--

Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [PROPOSAL] VACUUM Progress Checker.

From

Amit Langote

Date:

30 June 2015, 08:45:44

On 2015-06-30 PM 04:37, Rahila Syed wrote:
> 
> Design:
> 
> A shared preload library to store progress information from different
> backends running VACUUM, calculate remaining time for each and display
> progress in the
> in the form a view.
> 
> 
> [...]
> 
> Reporting :
> 
>  A view named 'pg_maintenance_progress' can be created using the values in
> the struct above.
> 
> pg_stat_maintenance can be called from any other backend and will display
> progress of
> 

+1

Just to clarify, the attached patch does not implement the view or the shared
memory initialization part yet, right? I understand your intention to get
comments on the proposed hooks and shared memory structure(s) at this point. By
the way, how does the regular approach of sending stats to the background stats
collector compare to the proposed hooks+shmem approach?

Thanks,
Amit

Re: [PROPOSAL] VACUUM Progress Checker.

From

Thom Brown

Date:

30 June 2015, 08:50:07

On 30 June 2015 at 08:37, Rahila Syed <rahilasyed90@gmail.com> wrote:


@@ -559,7 +567,9 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
             * following blocks.
             */
             if (next_not_all_visible_block - blkno > SKIP_PAGES_THRESHOLD)
+            {
                 skipping_all_visible_blocks = true;
+            }

There's no need to add those curly braces, or to subsequent if blocks.

Also, is this patch taking the visibility map into account for its calculations?

--

Thom

Re: [PROPOSAL] VACUUM Progress Checker.

From

"Syed, Rahila"

Date:

30 June 2015, 13:16:34

Hello,

>There's no need to add those curly braces, or to subsequent if blocks

Yes, those are added by mistake.

>Also, is this patch taking the visibility map into account for its calculations?

Yes, it subtracts skippable/all-visible pages from total pages to be scanned.

For each page processed by lazy_scan_heap, if the number of all-visible pages ahead exceeds the threshold, that number is subtracted from the ‘total pages to be scanned’ count.

The all visible pages are accounted for incrementally during the execution of VACUUM and not before starting the process.
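A simplified, standalone sketch of that incremental accounting. It is not the code from the patch: `scan_with_vm` is a hypothetical function, the visibility map is modelled as a plain array of flags, and the threshold value of 32 is copied in as an assumption about SKIP_PAGES_THRESHOLD in vacuumlazy.c. The skip test mirrors the condition shown in the quoted diff hunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed value; the real constant is defined in vacuumlazy.c. */
#define SKIP_PAGES_THRESHOLD 32

/*
 * Walk the heap block by block. Whenever the run of all-visible blocks
 * starting at the current block is longer than SKIP_PAGES_THRESHOLD, skip
 * the whole run and subtract it from the count of pages to be scanned.
 * Returns the number of pages actually scanned; *total_to_scan starts at
 * nblocks and shrinks as runs are skipped.
 */
int32_t
scan_with_vm(const bool *all_visible, int32_t nblocks, int32_t *total_to_scan)
{
    int32_t scanned = 0;
    int32_t blkno = 0;

    assert(nblocks >= 0);
    *total_to_scan = nblocks;

    while (blkno < nblocks)
    {
        /* Find the next block that is not all-visible. */
        int32_t next_not_all_visible_block = blkno;

        while (next_not_all_visible_block < nblocks &&
               all_visible[next_not_all_visible_block])
            next_not_all_visible_block++;

        if (next_not_all_visible_block - blkno > SKIP_PAGES_THRESHOLD)
        {
            /* Long run: skip it and shrink the total accordingly. */
            *total_to_scan -= next_not_all_visible_block - blkno;
            blkno = next_not_all_visible_block;
            continue;
        }

        /* Short run or ordinary page: scan this block. */
        scanned++;
        blkno++;
    }
    return scanned;
}
```

Because the total only shrinks when a long run is actually reached, a percentage computed early in the scan can understate progress for a table with large all-visible regions near its end.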

Thank you,

Rahila Syed



Re: [PROPOSAL] VACUUM Progress Checker.

From

Rahila Syed

Date:

02 July 2015, 02:00:56

Hello,

Thank you for suggestions.

>Yes, I suggest just a single column on pg_stat_activity called pct_complete

Reporting remaining time also can be crucial to make decisions regarding continuing or aborting VACUUM.

The same has been suggested in the thread below,

http://www.postgresql.org/message-id/13072.1284826206@sss.pgh.pa.us

>trace_completion_interval = 5s (default)

>Every interval, we report the current % complete for any operation that supports it. We just show NULL if the current operation has not reported anything or never will.

>We do this for VACUUM first, then we can begin adding other operations as we work out how (for that operation).

Thank you for explaining. This design seems good to me, except that adding more than one column (percent_complete, remaining_time) to pg_stat_activity, if required, can be less intuitive for users than having a separate view for VACUUM.

-Rahila Syed


Re: [PROPOSAL] VACUUM Progress Checker.

From

Rahila Syed

Date:

02 July 2015, 02:41:51

Hello,

>I thought about the possibilities of progress visualization - and one possibility is one or two special columns in pg_stat_activity table - this info can be interesting for VACUUM started by autovacuum too.

Thank you for the suggestion. The design with hooks and a separate view was mainly to keep most of the code outside core, as the proposed feature is specific to the VACUUM command. Also, having a separate view can give more flexibility in terms of displaying various progress parameters.

FWIW, there was resistance to including columns in pg_stat_activity earlier in the following thread,

http://www.postgresql.org/message-id/AANLkTi=TcuMA38oGUKX9p5WVPpY+M3L0XUp7=PLT+LCT@mail.gmail.com

Thank you,

Rahila Syed


Re: [PROPOSAL] VACUUM Progress Checker.

From

Amit Langote

Date:

02 July 2015, 02:58:36

On 2015-07-02 AM 11:41, Rahila Syed wrote:
> Hello,
> 
>> I thought about the possibilities of progress visualization - and one
>> possibility is one or two special columns in pg_stat_activity table - this
>> info can be interesting for VACUUM started by autovacuum too.
> 
> Thank you for suggestion. The design with hooks and a separate view was
> mainly to keep most of the code outside core as the feature proposed is
> specific to VACUUM command. Also, having a separate view can give more
> flexibility in terms of displaying various progress parameters.
> 

Unless I am missing something, I guess you can still keep the actual code that
updates counters outside the core if you adopt an approach that Simon
suggests. Whatever the view (existing/new), any related counters would have a
valid (non-NULL) value when read off the view iff hooks are set perhaps
because you have an extension that sets them. I guess that means any operation
that "supports" progress tracking would have an extension with suitable hooks
implemented.

Of course unless I misinterpreted Simon's words.

Thanks,
Amit

Re: [PROPOSAL] VACUUM Progress Checker.

From

Sameer Thakur

Date:

02 July 2015, 05:21:59

Hello,
> Thank you for suggestion. The design with hooks and a separate view was
> mainly to keep most of the code outside core as the feature proposed is
> specific to VACUUM command. Also, having a separate view can give more
> flexibility in terms of displaying various progress parameters.
> FWIW, there was resistance to include columns in pg_stat_activity earlier
> in the following thread,
> http://www.postgresql.org/message-id/AANLkTi=TcuMA38oGUKX9p5WVPpY+M3L0XUp7=PLT+LCT@...

Perhaps, as suggested in the link, the progress could be made available via a
function call which does the progress calculation "on demand". Then we do not
need a separate view and do not clutter pg_stat_activity, and we also get the
benefit of calculating progress only when it's needed.
 




Re: [PROPOSAL] VACUUM Progress Checker.

From

Simon Riggs

Date:

02 July 2015, 05:26:34

On 2 July 2015 at 03:00, Rahila Syed <rahilasyed90@gmail.com> wrote:

> >Yes, I suggest just a single column on pg_stat_activity called pct_complete
>
> Reporting remaining time also can be crucial to make decisions regarding continuing or aborting VACUUM.
> The same has been suggested in the thread below,
>
> http://www.postgresql.org/message-id/13072.1284826206@sss.pgh.pa.us
>
> >trace_completion_interval = 5s (default)
>
> >Every interval, we report the current % complete for any operation that supports it. We just show NULL if the current operation has not reported anything or never will.
>
> >We do this for VACUUM first, then we can begin adding other operations as we work out how (for that operation).
>
> Thank you for explaining. This design seems good to me except, adding more than one column (percent_complete, remaining_time)

It is attractive to have a "remaining_time" column, or a "predicted_completion_timestamp", but those columns are prediction calculations rather than actual progress reports. I'm interested in seeing a report that relates to actual progress made.

Predicted total work required is also interesting, but is a much less trustworthy figure.

I think we'll need to get wider input about the user interface for this feature.

> if required to pg_stat_activity can be less user intuitive than having a separate view for VACUUM.

I think it is a mistake to do something just for VACUUM.

Monitoring software will look at pg_stat_activity. I don't think we should invent a separate view for progress statistics because it will cause users to look in two places rather than just one. Reporting progress is fairly cheap instrumentation, calculating a prediction of completion time might be expensive.

Having said that, monitoring systems currently use a polling mechanism to retrieve status data. They look at information published by the backend. We don't currently have a mechanism to defer publication of expensive monitoring information until requested by the monitoring system. If you have a design for how that might work then say so, otherwise we need to assume a simple workflow: the backend publishes whatever it chooses, whenever it chooses and then that is made available via the monitoring system via views.

Your current design completely misses the time taken to scan indexes, which is significant.

There might be a justification to put this out of core, but measuring progress of VACUUM wouldn't be it, IMHO.

--

Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [PROPOSAL] VACUUM Progress Checker.

From

Guillaume Lelarge

Date:

02 July 2015, 05:33:41

On 2 Jul 2015 7:28 AM, "Simon Riggs" <simon@2ndquadrant.com> wrote:
>
> On 2 July 2015 at 03:00, Rahila Syed <rahilasyed90@gmail.com> wrote:
>
>>
>> >Yes, I suggest just a single column on pg_stat_activity called pct_complete
>>
>> Reporting remaining time also can be crucial to make decisions regarding continuing or aborting VACUUM.
>> The same has been suggested in the thread below,
>>
>> http://www.postgresql.org/message-id/13072.1284826206@sss.pgh.pa.us
>>
>> >trace_completion_interval = 5s (default)
>>
>> >Every interval, we report the current % complete for any operation that supports it. We just show NULL if the current operation has not reported anything or never will.
>>
>> >We do this for VACUUM first, then we can begin adding other operations as we work out how (for that operation).
>>
>> Thank you for explaining. This design seems good to me except, adding more than one columns(percent_complete, remaining_time)
>>
>
> It is attractive to have a "remaining_time" column, or a "predicted_completion_timestamp", but those columns are prediction calculations rather than actual progress reports. I'm interested in seeing a report that relates to actual progress made.
>

Agreed.

> Predicted total work required is also interesting, but is much less trustworthy figure.
>

And it is something a client app or an extension can compute. No need to put this in core as long as we have the actual progress.

> I think we'll need to get wider input about the user interface for this feature.
>
>>
>> if required to pg_stat_activity can be less user intuitive than having a separate view for VACUUM.
>
>
> I think it is a mistake to do something just for VACUUM.
>
> Monitoring software will look at pg_stat_activity. I don't think we should invent a separate view for progress statistics because it will cause users to look in two places rather than just one. Reporting progress is fairly cheap instrumentation, calculating a prediction of completion time might be expensive.
>

+1

> Having said that, monitoring systems currently use a polling mechanism to retrieve status data. They look at information published by the backend. We don't currently have a mechanism to defer publication of expensive monitoring information until requested by the monitoring system. If you have a design for how that might work then say so, otherwise we need to assume a simple workflow: the backend publishes whatever it chooses, whenever it chooses and then that is made available via the monitoring system via views.
>
>
> Your current design completely misses the time taken to scan indexes, which is significant.
>
> There might be a justification to put this out of core, but measuring progress of VACUUM wouldn't be it, IMHO.
>
> --
> Simon Riggs                http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [PROPOSAL] VACUUM Progress Checker.

From

"Syed, Rahila"

Date:

02 July 2015, 14:07:02

Hello,

>Unless I am missing something, I guess you can still keep the actual code that updates counters outside the core if you adopt an approach that Simon suggests.
Yes. The code to extract progress information from VACUUM and storing in shared memory can be outside core even with pg_stat_activity as a user interface.

>Whatever the view (existing/new), any related counters would have a valid (non-NULL) value when read off the view iff hooks are set perhaps because you have an extension that sets them.
>I guess that means any operation that "supports" progress tracking would have an extension with suitable hooks implemented.
Do you mean to say, any operation/application that want progress tracking feature will dynamically load the progress checker module which will set the hooks for progress reporting?
If yes, unless I am missing something such dynamic loading cannot happen if we use pg_stat_activity as it gets values from shared memory. Module has to be a shared_preload_library to allocate a shared memory. So this will mean the module can be loaded only at server restart. Am I missing something?

Thank you,
Rahila Syed




______________________________________________________________________
Disclaimer: This email and any attachments are sent in strictest confidence
for the sole use of the addressee and may contain legally privileged,
confidential, and proprietary data. If you are not the intended recipient,
please advise the sender by replying promptly to this email and then delete
and destroy this email and any attachments without any further use, copying
or forwarding.

Re: [PROPOSAL] VACUUM Progress Checker.

From

Tom Lane

Date:

02 July 2015, 14:31:40

"Syed, Rahila" <Rahila.Syed@nttdata.com> writes:
> Hello,
>> Unless I am missing something, I guess you can still keep the actual code that updates counters outside the core if you adopt an approach that Simon suggests.

> Yes. The code to extract progress information from VACUUM and storing in shared memory can be outside core even with pg_stat_activity as a user interface.

>> Whatever the view (existing/new), any related counters would have a valid (non-NULL) value when read off the view iff hooks are set perhaps because you have an extension that sets them.
>> I guess that means any operation that "supports" progress tracking would have an extension with suitable hooks implemented.

> Do you mean to say, any operation/application that want progress tracking feature will dynamically load the progress checker module which will set the hooks for progress reporting?
> If yes, unless I am missing something such dynamic loading cannot happen if we use pg_stat_activity as it gets values from shared memory. Module has to be a shared_preload_library to allocate a shared memory. So this will mean the module can be loaded only at server restart. Am I missing something?

TBH, I think that designing this as a hook-based solution is adding a
whole lot of complexity for no value.  The hard parts of the problem are
collecting the raw data and making the results visible to users, and
both of those require involvement of the core code.  Where is the benefit
from pushing some trivial intermediate arithmetic into an external module?
If there's any at all, it's certainly not enough to justify problems such
as you mention here.

So I'd just create a "pgstat_report_percent_done()" type of interface in
pgstat.c and then teach VACUUM to call it directly.
        regards, tom lane
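
Editorial sketch: the kind of direct-call interface suggested above might look roughly like the following. This is only an illustration under stated assumptions. `MockBackendStatus`, `my_status`, and `mock_vacuum` are hypothetical stand-ins, the body of `pgstat_report_percent_done()` is invented here, and nothing below is taken from any posted patch.

```c
#include <assert.h>

/* Hypothetical stand-in for one backend's status slot in shared memory. */
typedef struct MockBackendStatus
{
    int     pid;
    double  percent_done;   /* negative means "nothing reported" (shown as NULL) */
} MockBackendStatus;

static MockBackendStatus my_status = {0, -1.0};

/* Hypothetical reporting call: clamp the value and publish it. */
static void
pgstat_report_percent_done(double pct)
{
    if (pct < 0.0)
        pct = 0.0;
    if (pct > 100.0)
        pct = 100.0;
    my_status.percent_done = pct;
}

/* A VACUUM-like loop that reports after each heap page it processes. */
static void
mock_vacuum(unsigned total_pages)
{
    assert(total_pages > 0);
    for (unsigned blkno = 0; blkno < total_pages; blkno++)
    {
        /* ... real page processing would happen here ... */
        pgstat_report_percent_done((blkno + 1) * 100.0 / total_pages);
    }
}
```

The point of the shape is that the command only supplies a number, while storage and visibility stay in one place in core.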

Re: [PROPOSAL] VACUUM Progress Checker.

From

Amit Langote

Date:

03 July 2015, 05:24:45

On 2015-07-02 PM 11:00, Syed, Rahila wrote:
> Hello,
> 
>> Unless I am missing something, I guess you can still keep the actual code that updates counters outside the core if you adopt an approach that Simon suggests.

> Yes. The code to extract progress information from VACUUM and storing in shared memory can be outside core even with pg_stat_activity as a user interface.

>> Whatever the view (existing/new), any related counters would have a valid (non-NULL) value when read off the view iff hooks are set perhaps because you have an extension that sets them.
>> I guess that means any operation that "supports" progress tracking would have an extension with suitable hooks implemented.

> Do you mean to say, any operation/application that want progress tracking feature will dynamically load the progress checker module which will set the hooks for progress reporting?
> If yes, unless I am missing something such dynamic loading cannot happen if we use pg_stat_activity as it gets values from shared memory. Module has to be a shared_preload_library to allocate a shared memory. So this will mean the module can be loaded only at server restart. Am I missing something?
> 

Assuming that a set of hooks per command and shared memory structure(s) is the
way to go, I meant to say that hook implementations per command would be in
their separate modules (of course loaded at server start for shared memory). Of
those, your proposed patch has vacuum_progress, for example. And in the context
of my comment above, that means the view would say NULL for commands for which
the module has not been set up in advance. IOW, between showing NULL in the
view and dynamically loading hook functions, we choose the former because I
don't know what the latter means in postgres.

Having said that, Tom's suggestion to export pgstat.c function(s) for
command(s) may be a more appealing way to go.

Thanks,
Amit

Re: [PROPOSAL] VACUUM Progress Checker.

From

"Syed, Rahila"

Date:

03 July 2015, 09:46:38

Hello,

>TBH, I think that designing this as a hook-based solution is adding a whole lot of complexity for no value.  The hard parts of the problem are collecting the raw data and making the results visible to users, and both of those require involvement of the core code.  Where is the benefit from pushing some trivial intermediate arithmetic into an external module?
>If there's any at all, it's certainly not enough to justify problems such as you mention here.

>So I'd just create a "pgstat_report_percent_done()" type of interface in pgstat.c and then teach VACUUM to call it directly.

Thank you for the suggestion. I agree that adding code in core will reduce code complexity with no additional overhead.
Going by the consensus, I will update the patch with code to collect and store progress information from vacuum in pgstat.c and a UI using the pg_stat_activity view.

Thank you,
Rahila Syed

Re: [PROPOSAL] VACUUM Progress Checker.

From

Rahila Syed

Date:

15 July 2015, 20:18:48

Hello,

Please find attached an updated patch with an interface to calculate command progress in pgstat.c. This interface currently implements VACUUM progress tracking.
A column named percent_complete has been added in pg_stat_activity to report progress.

VACUUM calls the progress calculation interface periodically at an interval specified by pgstat_track_progress GUC in ms.

Progress calculation can be disabled by setting pgstat_track_progress as -1.
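
Editorial sketch: the interval-based throttling described above could be expressed as below. The variable `track_progress_interval_ms` and the function `progress_report_due()` are hypothetical names chosen for illustration; the attached patch may do this differently.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical copy of the GUC value in milliseconds; -1 disables reporting. */
static int track_progress_interval_ms = 5000;

/* Returns true when enough time has passed since the last progress report. */
static bool
progress_report_due(long now_ms, long last_report_ms)
{
    assert(now_ms >= last_report_ms);

    if (track_progress_interval_ms < 0)
        return false;           /* progress calculation disabled */

    /* An interval of 0 makes every call due, i.e. report on every page. */
    return (now_ms - last_report_ms) >= track_progress_interval_ms;
}
```

A caller would check this once per page and only compute and publish progress when it returns true.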

Remaining_time for VACUUM is not included in current patch to avoid cluttering pg_stat_activity with too many columns.

But the estimate as seen from the previous implementation seems reasonable enough to be included in progress information, maybe as an exclusive view for vacuum progress information.

GUC parameter 'pgstat_track_progress' is currently PGC_SUSET in line with 'track_activities' GUC parameter. Although IMO, pgstat_track_progress can be made PGC_USERSET in order to provide more flexibility to any user to enable/disable progress calculation provided progress is tracked only if track_activities GUC parameter is enabled.

In this patch, index scans are not taken into account for progress calculation as of now.

Thank you,

Rahila Syed.

Attachment

Vacuum_progress_checker_v1.patch

Re: [PROPOSAL] VACUUM Progress Checker.

From

Amit Langote

Date:

16 July 2015, 01:22:57

On 2015-07-16 AM 05:18, Rahila Syed wrote:
> 
> GUC parameter 'pgstat_track_progress' is currently PGC_SUSET in line with
> 'track_activities' GUC parameter.

Naming the GUC pgstat* seems a little inconsistent. It could be called,
say, track_maintenance_progress_interval/track_vacuum_progress_interval.
That way, it will look similar to existing track_* parameters:

#track_activities = on
#track_counts = on
#track_io_timing = off
#track_functions = none         # none, pl, all
#track_activity_query_size = 1024   # (change requires restart)

Also, adding the new GUC to src/backend/utils/misc/postgresql.conf.sample
might be helpful.

Thanks,
Amit

Re: [PROPOSAL] VACUUM Progress Checker.

From

Fujii Masao

Date:

16 July 2015, 04:28:07

On Thu, Jul 16, 2015 at 5:18 AM, Rahila Syed <rahilasyed90@gmail.com> wrote:
> Hello,
>
> Please find attached updated patch with an interface to calculate command
> progress in pgstat.c.

Thanks for updating the patch!

I got the following compiler warning.

guc.c:2316: warning: initialization makes pointer from integer without a cast

make check-world caused lots of failures in my environment.

The following query caused a segmentation fault.

SELECT name FROM  (SELECT pg_catalog.lower(name) AS name FROM
pg_catalog.pg_settings   UNION ALL SELECT 'session authorization'
UNION ALL SELECT 'all') ss  WHERE substring(name,1,3)='tra';

Regards,

-- 
Fujii Masao

Re: [PROPOSAL] VACUUM Progress Checker.

From

Sameer Thakur-2

Date:

16 July 2015, 04:30:19

Hello,
>Your current design completely misses the time taken to scan indexes, which is significant.
I tried to address this issue in the attached patch.
The patch calculates index scan progress by measuring against scanned pages
in LVRelStats. It checks for a change in the current page being scanned and
increments the progress counter. When the counter reaches the scanned pages
number in LVRelStats, progress is 100% complete. For now the progress is
emitted as a warning (so no config changes are needed to see progress).
Thoughts?
regards
Sameer IndexScanProgress.patch
<http://postgresql.nabble.com/file/n5858109/IndexScanProgress.patch>  





Re: [PROPOSAL] VACUUM Progress Checker.

From

Michael Paquier

Date:

16 July 2015, 04:35:20

On Thu, Jul 16, 2015 at 1:30 PM, Sameer Thakur-2 wrote:
> Thoughts?
> regards
> Sameer IndexScanProgress.patch
> <http://postgresql.nabble.com/file/n5858109/IndexScanProgress.patch>

I am not really willing to show up as the picky guy here, but could it
be possible to receive those patches as attached to emails instead of
having them referenced by URL? I imagine that you are directly using
the nabble interface.
-- 
Michael

Re: [PROPOSAL] VACUUM Progress Checker.

From

"Thakur, Sameer"

Date:

16 July 2015, 04:40:37

Hello,
>I am not really willing to show up as the picky guy here, but could it be possible to receive those patches as attached to emails instead of having them referenced by URL? I imagine that you are directly using the nabble interface.
Just configured a new mail client for nabble, did not know how to use it within an existing conversation.
Now I can send the patch attached!
Thanks
Sameer


Attachment

IndexScanProgress.patch

Re: [PROPOSAL] VACUUM Progress Checker.

From

dinesh kumar

Date:

16 July 2015, 05:19:24

Hi

On Wed, Jul 15, 2015 at 9:27 PM, Fujii Masao <masao.fujii@gmail.com> wrote:

> On Thu, Jul 16, 2015 at 5:18 AM, Rahila Syed <rahilasyed90@gmail.com> wrote:
> > Hello,
> >
> > Please find attached updated patch with an interface to calculate command
> > progress in pgstat.c.
>
> Thanks for updating the patch!
>
> I got the following compiler warning.
>
> guc.c:2316: warning: initialization makes pointer from integer without a cast
>
> make check-world caused lots of failures in my environment.

Yeah, I got a couple of warnings with plain make.

> The following query caused a segmentation fault.

It was the typo I believe. I see the problem is with GUC definition in guc.c. There should be "NULL" between gettext_noop and GUC_UNIT_MS.
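
Editorial sketch: the effect of the missing "NULL" can be illustrated with a reduced mock of a GUC table entry. `MockConfigGeneric`, `MOCK_GUC_UNIT_MS`, and the description string are hypothetical simplifications, not the real guc.c definitions; the real structure has more fields.

```c
#include <assert.h>
#include <stddef.h>

#define MOCK_GUC_UNIT_MS 0x10000    /* hypothetical flag value */

/* Reduced mock of a GUC entry: two descriptions followed by integer flags. */
typedef struct MockConfigGeneric
{
    const char *name;
    const char *short_desc;
    const char *long_desc;      /* may legitimately be NULL */
    int         flags;
} MockConfigGeneric;

/* Correct form: NULL fills long_desc, so the unit flag lands in flags. */
static const MockConfigGeneric good_entry = {
    "track_activity_progress",
    "Sets the interval for progress reporting.",
    NULL,
    MOCK_GUC_UNIT_MS
};

/*
 * The broken form would leave out the NULL:
 *     { "track_activity_progress", "Sets ...", MOCK_GUC_UNIT_MS }
 * The integer flag then initializes the long_desc pointer, which draws the
 * "makes pointer from integer without a cast" warning, and any code that
 * later reads long_desc as a string would follow a bogus address.
 */
static const char *
mock_long_desc_or_empty(const MockConfigGeneric *e)
{
    assert(e != NULL);
    return e->long_desc ? e->long_desc : "";
}
```

Under that reading, a query that walks pg_settings would plausibly be the first thing to touch the bad pointer, which would be consistent with the reported crash.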

Regards,

Dinesh

manojadinesh.blogspot.com

> SELECT name FROM (SELECT pg_catalog.lower(name) AS name FROM
> pg_catalog.pg_settings UNION ALL SELECT 'session authorization'
> UNION ALL SELECT 'all') ss WHERE substring(name,1,3)='tra';
>
> Regards,
>
> --
> Fujii Masao

Re: [PROPOSAL] VACUUM Progress Checker.

From

"Syed, Rahila"

Date:

16 July 2015, 10:21:45

Hello,

>Naming the GUC pgstat* seems a little inconsistent.
Sorry, there is a typo in the mail. The GUC name is 'track_activity_progress'.

>Also, adding the new GUC to src/backend/utils/misc/postgresql.conf.sample
>might be helpful
Yes.  I will update.

Thank you,
Rahila Syed

Re: [PROPOSAL] VACUUM Progress Checker.

From

Jim Nasby

Date:

17 July 2015, 22:39:51

On 7/15/15 11:38 PM, Thakur, Sameer wrote:
> Hello,
>> I am not really willing to show up as the picky guy here, but could it be possible to receive those patches as attached to emails instead of having them referenced by URL? I imagine that you are directly using the nabble interface.
> Just configured a new mail client for nabble, did not know how to use it within an existing conversation.

Does this actually handle multiple indexes? It doesn't appear so, which 
I'd think is a significant problem... :/

I'm also not seeing how this will deal with exhausting 
maintenance_work_mem. ISTM that when that happens you'd definitely want 
a better idea of what's going on...
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Data in Trouble? Get it in Treble! http://BlueTreble.com

Re: [PROPOSAL] VACUUM Progress Checker.

From

"Thakur, Sameer"

Date:

20 July 2015, 09:33:12

Hello,
>Does this actually handle multiple indexes? It doesn't appear so, which I'd think is a significant problem... :/
Please find v2 attached which does this.
>I'm also not seeing how this will deal with exhausting maintenance_work_mem. ISTM that when that happens you'd definitely want a better idea of what's going on...
Will work on this aspect in v3.
Thank you,
Sameer

Attachment

IndexScanProgress_v2.patch

Re: [PROPOSAL] VACUUM Progress Checker.

From

Jim Nasby

Date:

21 July 2015, 18:06:46

On 7/20/15 4:32 AM, Thakur, Sameer wrote:
> Hello,
>> Does this actually handle multiple indexes? It doesn't appear so, which I'd think is a significant problem... :/
> Please find v2 attached which does this.

I think it'd be better to combine both numbers into one report:

elog(WARNING,"Current/Overall index percentage completion %f/%f", 
current_index_progress * 100, all_index_progress);

It'd also be good to standardize on where the * 100 is happening.

Also, AFAIK:

(itemptr->ip_blkid.bi_hi != vacrelstats->last_scanned_page.bi_hi) || 
(itemptr->ip_blkid.bi_lo != vacrelstats->last_scanned_page.bi_lo)

can be replaced by

(itemptr->ipblkid != vacrelstats->last_scanned_page)

and

vacrelstats->current_index_scanned_page_count = 
vacrelstats->current_index_scanned_page_count + 1;

can simply be

vacrelstats->current_index_scanned_page_count++;
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Data in Trouble? Get it in Treble! http://BlueTreble.com

Re: [PROPOSAL] VACUUM Progress Checker.

From

Robert Haas

Date:

21 July 2015, 20:24:28

On Tue, Jun 30, 2015 at 4:32 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> Yes, I suggest just a single column on pg_stat_activity called pct_complete
>
> trace_completion_interval = 5s (default)
>
> Every interval, we report the current % complete for any operation that
> supports it. We just show NULL if the current operation has not reported
> anything or never will.

I am deeply skeptical about the usefulness of a progress-reporting
system that can only report one number.  I think that, in many cases,
it won't be possible to compute an accurate completion percentage, and
even if it is, people may want more data than that for various reasons
anyway.

For example, in the case of VACUUM, suppose there is a table with
1,000,000 heap pages and 200,000 index pages (so it's probably
over-indexed, but whatever).  After reading 500,000 heap pages, we
have found 0 dead tuples.  What percentage of the work have we
finished?

It's hard to say.  If we don't find any dead tuples, we've read half
the pages we will eventually read and are therefore half done.  But if
we find even 1 dead tuple, then we've got to scan all 200,000 index
pages, so we've read only 41.7% of the pages we'll eventually touch.
If we find so many dead tuples that we have to scan the indexes
multiple times for lack of maintenance_work_mem, we'll eventually read
1,000,000 + 200,000k pages, where k is the number of index scans; if
say k = 5 then we are only 25% done.  All of these scenarios are
plausible because, in all likelihood, the dirty pages in the table are
concentrated near the end.
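
Editorial sketch: the three percentages in the example above follow from dividing pages read so far by the total pages that will eventually be read. The helper `pct_done()` is a hypothetical name used only to reproduce that arithmetic.

```c
#include <assert.h>

/*
 * Percentage of all page reads finished so far, assuming the heap is read
 * once and every index scan pass reads all index pages.
 */
static double
pct_done(double heap_read, double heap_pages, double index_pages,
         int index_scans)
{
    double total = heap_pages + index_pages * index_scans;

    assert(total > 0);
    return heap_read * 100.0 / total;
}
```

With 500,000 of 1,000,000 heap pages read and 200,000 index pages, zero index scans gives 50%, one gives about 41.7%, and five give 25%, so the same observed state maps to very different answers depending on what the rest of the table holds.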

Now we could come up with ways of making good guesses about what is
likely to happen.  We could look at the data from pg_stat_all_tables,
historical results of vacuuming this table, the state of the
visibility map, and so on.  And that all might help.  But it's going
to be fairly hard to produce a percentage of completion that is
monotonically increasing and always accurately reflects the time
remaining.  Even if we can do it, it doesn't seem like a stretch to
suppose that sometimes people will want to look at the detail data.
Instead of getting told "we're X% done" (according to some arcane
formula), it's quite reasonable to think that people will want to get
a bunch of values, e.g.:

1. For what percentage of heap pages have we completed phase one (HOT
prune + mark all visible if appropriate + freeze + remember dead
tuples)?
2. For what percentage of heap pages have we completed phase two (mark
line pointers unused)?
3. What percentage of maintenance_work_mem are we currently using to
remember tuples?

For a query, the information we want back is likely to be even more
complicated; e.g. EXPLAIN output with row counts and perhaps timings
to date for each plan node.  We can insist that all status reporting
get boiled down to one number, but I suspect we would be better off
asking ourselves how we could let commands return a richer set of
data.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [PROPOSAL] VACUUM Progress Checker.

From

Simon Riggs

Date:

22 July 2015, 07:02:12

On 21 July 2015 at 21:24, Robert Haas <robertmhaas@gmail.com> wrote:

> On Tue, Jun 30, 2015 at 4:32 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> > Yes, I suggest just a single column on pg_stat_activity called pct_complete
> >
> > trace_completion_interval = 5s (default)
> >
> > Every interval, we report the current % complete for any operation that
> > supports it. We just show NULL if the current operation has not reported
> > anything or never will.
>
> I am deeply skeptical about the usefulness of a progress-reporting
> system that can only report one number. I think that, in many cases,
> it won't be possible to compute an accurate completion percentage, and
> even if it is, people may want more data than that for various reasons
> anyway.

The goal here was to have a common metric for all tasks. Multiple numbers are OK for me, but not extended detail (yet!)

As I said later:

Simon Riggs <simon@2ndquadrant.com> wrote:

> I'm interested in seeing a report that relates to actual progress made.

> Predicted total work required is also interesting, but is much less trustworthy figure.

I'm aware of the difficulties for VACUUM in particular and agree with your scenario/details.

That does not deter me from wishing to see high level information, even if it varies or is inaccurate. The "arrival time" on my Sat Nav is still useful, even if it changes because of traffic jams that develop while my journey is in progress. If the value bothers me, I look at the detail. So both summary and detail information are useful, but summary is more important even though it is less accurate.

> Instead of getting told "we're X% done" (according to some arcane
> formula), it's quite reasonable to think that people will want to get
> a bunch of values, e.g.:
>
> 1. For what percentage of heap pages have we completed phase one (HOT
> prune + mark all visible if appropriate + freeze + remember dead
> tuples)?
> 2. For what percentage of heap pages have we completed phase two (mark
> line pointers unused)?
> 3. What percentage of maintenance_work_mem are we currently using to
> remember tuples?
>
> For a query, the information we want back is likely to be even more
> complicated; e.g. EXPLAIN output with row counts and perhaps timings
> to date for each plan node. We can insist that all status reporting
> get boiled down to one number, but I suspect we would be better off
> asking ourselves how we could let commands return a richer set of
> data.

I agree that it is desirable to have a more detailed breakdown of what is happening. As soon as we do that we hit the need for very action-specific information reporting, which renders the topic much harder and much more specific.

For me, the user workflow looks like these....

Worried: "Task X is taking ages? When is it expected to finish?"

Ops: 13:50

<sometime later, about 14:00>

Worried: "Task X is still running? But I thought its ETA was 13:50?"

Ops: Now says 14:30

Worried: "Is it stuck, or is it making progress?"

Ops: Looks like its making progress

Worried: "Can we have a look at it and find out what its doing?"

Worried: "When will Task Y finish?"

Ops: Monday at 11am

Worried: "Bad news! We should cancel it on Sunday evening."

The point is that nobody looks at the detailed info until we have looked at the summary. So the summary of progress/completion time is important, even if it is possibly wrong. The detail is also useful. I think we should have both, but I'd like to see the summary info first, because it is the most useful, best leading indicator of problems.

In terms of VACUUM specifically: VACUUM should be able to assess beforehand whether it will scan the indexes, or it can just assume that it will need to scan the indexes. Perhaps VACUUM can pre-scan the VM to decide how big a task it has before it starts.

--

Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [PROPOSAL] VACUUM Progress Checker.

From

"Thakur, Sameer"

Date:

22 July 2015, 10:09:47

Hello,
>I think it'd be better to combine both numbers into one report:
>It'd also be good to standardize on where the * 100 is happening.
Done
>can be replaced by
>(itemptr->ipblkid != vacrelstats->last_scanned_page)
Get compiler error : invalid operands to binary != (have ‘BlockIdData’ and ‘BlockIdData’)
>vacrelstats->current_index_scanned_page_count++;
Done
Please find v3 attached.

I am struggling to create maintenance work memory exhaustion. I did the following:
maintenance_work_mem=1MB.
Inserted 10 million records in tbl1 with 3 indexes. Deleted 5 million and vacuumed. So far no error. I could keep bumping up the records to say 100 million and try to get this error.
This seems a tedious manner to simulate maintenance work memory exhaustion. Is there a better way?
To insert I am using COPY (from a csv which has 10 million records) and building indexes after insert is complete.
Thank you
Sameer

Attachment

IndexScanProgress_v3.patch

Re: [PROPOSAL] VACUUM Progress Checker.

From

Alvaro Herrera

Date:

22 July 2015, 11:20:06

Thakur, Sameer wrote:
> Hello,
> >I think it'd be better to combine both numbers into one report:
> >It'd also be good to standardize on where the * 100 is happening.
> Done
> >can be replaced by
> >(itemptr->ipblkid != vacrelstats->last_scanned_page)
> Get compiler error : invalid operands to binary != (have ‘BlockIdData’ and ‘BlockIdData’)
> >vacrelstats->current_index_scanned_page_count++;
> Done
> Please find v3 attached.
> 
> I am struggling to create  maintenance work memory exhaustion.  Did the following
> maintenance_work_mem=1MB.
> Inserted 10 million records in tbl1 with 3 indexes. Deleted 5 million and vacuumed. So far no error. I could keep bumping up the records to say 100 million and try to get this error.
> This seems a tedious manner to simulate maintenance work memory exhaustion. Is there a better way?
> To insert I am using COPY (from a csv which has 10 million records) and building indexes after insert is complete.

I don't think there's any maintenance work exhaustion that results in an
error.  The system is designed to use all the memory it is allowed to,
and to have other strategies when it's not sufficient to do the whole
sort.

Not sure what Jim meant.  Maybe he meant to be aware of when spilling to
disk happens?  Obviously, things become slower, so maybe you need to
consider it for progress reporting purposes.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [PROPOSAL] VACUUM Progress Checker.

From

Amit Langote

Date:

22 July 2015, 11:58:51

On Wed, Jul 22, 2015 at 8:19 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
>
> Not sure what Jim meant.  Maybe he meant to be aware of when spilling to
> disk happens?  Obviously, things become slower, so maybe you need to
> consider it for progress reporting purposes.
>

Perhaps the m_w_m determines how many dead tuples lazy_scan_heap() can
keep track of before doing a lazy_vacuum_indexes() +
lazy_vacuum_heap() round. Smaller the m_w_m, more the number of index
scans, slower the progress?
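
Editorial sketch: the relationship suggested above can be approximated with a simple model in which each remembered dead tuple costs a fixed number of bytes and an index-vacuum round is triggered whenever the array fills. The 6-byte figure and the function names are assumptions for illustration, and the real code applies additional limits that this ignores.

```c
#include <assert.h>

#define MOCK_TID_BYTES 6    /* assumed size of one remembered dead-tuple pointer */

/* How many dead tuples fit in the given memory budget (in kilobytes). */
static long
max_dead_tuples(long work_mem_kb)
{
    return (work_mem_kb * 1024L) / MOCK_TID_BYTES;
}

/* Number of index-vacuum rounds needed for the given dead tuple count. */
static long
index_scan_rounds(long dead_tuples, long work_mem_kb)
{
    long cap = max_dead_tuples(work_mem_kb);

    assert(cap > 0);
    return (dead_tuples + cap - 1) / cap;   /* ceiling division */
}
```

Under this model, 1MB holds 174,762 dead tuples, so 5 million deleted rows would mean about 29 rounds of index scanning rather than an error, which matches the idea that a smaller setting shows up as more index scans and slower progress.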

Thanks,
Amit

Re: [PROPOSAL] VACUUM Progress Checker.

From

Robert Haas

Date:

22 July 2015, 12:00:27

On Wed, Jul 22, 2015 at 3:02 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> For me, the user workflow looks like these....
>
> Worried: "Task X is taking ages? When is it expected to finish?"
> Ops: 13:50
> <sometime later, about 14:00>
> Worried: "Task X is still running? But I thought its ETA was 13:50?"
> Ops: Now says 14:30
> Worried: "Is it stuck, or is it making progress?"
> Ops: Looks like its making progress
> Worried: "Can we have a look at it and find out what its doing?"

How does Ops know that it is making progress?  Just because the
completion percentage is changing?

> In terms of VACUUM specifically: VACUUM should be able to assess beforehand
> whether it will scan the indexes, or it can just assume that it will need to
> scan the indexes. Perhaps VACUUM can pre-scan the VM to decide how big a
> task it has before it starts.

Well, we can assume that it will scan the indexes exactly once, but
the actual number may be more or less; and the cost of rescanning the
heap in phase 2 is also hard to estimate.

Maybe I'm worrying over nothing, but I have a feeling that if we try
to do what you're proposing here, we're gonna end up with this:

https://xkcd.com/612/

Most of the progress estimators I have seen over the ~30 years that
I've been playing with computers have been unreliable, and many of
those have been unreliable to the point of being annoying.  I think
that's likely to happen with what you are proposing too, though of
course like all predictions of the future it could turn out to be
wrong.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [PROPOSAL] VACUUM Progress Checker.

From

Simon Riggs

Date:

22 July 2015, 12:24:29

On 22 July 2015 at 13:00, Robert Haas <robertmhaas@gmail.com> wrote:

> On Wed, Jul 22, 2015 at 3:02 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> > For me, the user workflow looks like these....
> >
> > Worried: "Task X is taking ages? When is it expected to finish?"
> > Ops: 13:50
> > <sometime later, about 14:00>
> > Worried: "Task X is still running? But I thought its ETA was 13:50?"
> > Ops: Now says 14:30
> > Worried: "Is it stuck, or is it making progress?"
> > Ops: Looks like its making progress
> > Worried: "Can we have a look at it and find out what its doing?"
>
> How does Ops know that it is making progress? Just because the
> completion percentage is changing?

You could, but that is not the way I suggested.

We need

* Some measure of actual progress (the definition of which may vary from action to action, e.g. blocks scanned)

* Some estimate of the total work required

* An estimate of the estimated time of completion - I liked your view that this prediction may be costly to request

>> In terms of VACUUM specifically: VACUUM should be able to assess beforehand
>> whether it will scan the indexes, or it can just assume that it will need to
>> scan the indexes. Perhaps VACUUM can pre-scan the VM to decide how big a
>> task it has before it starts.
>
> Well, we can assume that it will scan the indexes exactly once, but
> the actual number may be more or less; and the cost of rescanning the
> heap in phase 2 is also hard to estimate.
>
> Maybe I'm worrying over nothing, but I have a feeling that if we try
> to do what you're proposing here, we're gonna end up with this:
>
> https://xkcd.com/612/
>
> Most of the progress estimators I have seen over the ~30 years that
> I've been playing with computers have been unreliable, and many of
> those have been unreliable to the point of being annoying. I think
> that's likely to happen with what you are proposing too, though of
> course like all predictions of the future it could turn out to be
> wrong.

Almost like an Optimizer then. Important, often annoyingly wrong, needs more work.

I'm not proposing this feature, I'm merely asking for it to be defined in a way that makes it work for more than just VACUUM. Once we have a way of reporting useful information, other processes can be made to follow that mechanism, like REINDEX, ALTER TABLE etc.. I believe those things are important, even if we never get such information for user queries. But I hope we do.

I won't get in the way of your search for detailed information in more complex forms. Both things are needed.

--

Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [PROPOSAL] VACUUM Progress Checker.

From

Robert Haas

Date:

22 July 2015, 14:15:28

On Wed, Jul 22, 2015 at 8:24 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> * An estimate of the estimated time of completion - I liked your view that
> this prediction may be costly to request

I'm saying it may be massively unreliable, not that it may be costly.
(Someone else may have said that it would be costly, but I don't think
it was me.)

>> Most of the progress estimators I have seen over the ~30 years that
>> I've been playing with computers have been unreliable, and many of
>> those have been unreliable to the point of being annoying.  I think
>> that's likely to happen with what you are proposing too, though of
>> course like all predictions of the future it could turn out to be
>> wrong.
>
> Almost like an Optimizer then. Important, often annoyingly wrong, needs more
> work.

Yes, but with an important difference.  If the optimizer mis-estimates
the row count by 3x or 10x or 1000x, but the plan is OK anyway, it's
often the case that no one cares.  Except when the plan is bad, people
don't really care about the method used to derive it.  The same is not
true here: people will rely on the progress estimates directly, and
they will really care if they are not right.

> I'm not proposing this feature, I'm merely asking for it to be defined in a
> way that makes it work for more than just VACUUM. Once we have a way of
> reporting useful information, other processes can be made to follow that
> mechanism, like REINDEX, ALTER TABLE etc.. I believe those things are
> important, even if we never get such information for user queries. But I
> hope we do.
>
> I won't get in the way of your search for detailed information in more
> complex forms. Both things are needed.

OK.

One idea I have is to create a system where we expose a command tag
(i.e. VACUUM) plus a series of generic fields whose specific meanings
are dependent on the command tag.  Say, 6 bigint counters, 6 float8
counters, and 3 strings up to 80 characters each.  So we have a
fixed-size chunk of shared memory per backend, and each backend that
wants to expose progress information can fill in those fields however
it likes, and we expose the results.

This would be sorta like the way pg_statistic works: the same columns
can be used for different purposes depending on what estimator will be
used to access them.
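
As a rough sketch of the layout described above (a command tag plus 6 bigint counters, 6 float8 counters, and 3 strings of up to 80 characters), the following C fragment shows one possible fixed-size slot. All names here (`ProgressSlot`, `progress_set_string`, `progress_report_vacuum`, the field names, the 32-byte tag size) are hypothetical and not taken from any PostgreSQL source; the VACUUM meanings assigned to the counters are only an example of how a command might interpret the generic fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PROGRESS_NUM_COUNTERS 6
#define PROGRESS_NUM_STRINGS  3
#define PROGRESS_STRING_LEN   80    /* maximum characters, excluding the NUL */
#define PROGRESS_TAG_LEN      32    /* assumed size; not specified in the thread */

/* One fixed-size slot per backend; no pointers, so its size is known up front. */
typedef struct ProgressSlot
{
    char    command_tag[PROGRESS_TAG_LEN];          /* e.g. "VACUUM" */
    int64_t int_counters[PROGRESS_NUM_COUNTERS];    /* meaning depends on command_tag */
    double  float_counters[PROGRESS_NUM_COUNTERS];  /* meaning depends on command_tag */
    char    strings[PROGRESS_NUM_STRINGS][PROGRESS_STRING_LEN + 1];
} ProgressSlot;

/* Copy a message into string field idx, truncating it to 80 characters. */
void
progress_set_string(ProgressSlot *slot, int idx, const char *msg)
{
    assert(slot != NULL && msg != NULL);
    assert(idx >= 0 && idx < PROGRESS_NUM_STRINGS);
    strncpy(slot->strings[idx], msg, PROGRESS_STRING_LEN);
    slot->strings[idx][PROGRESS_STRING_LEN] = '\0';
}

/* Example: a VACUUM backend choosing its own meaning for the generic fields. */
void
progress_report_vacuum(ProgressSlot *slot, int64_t heap_scanned, int64_t heap_total)
{
    assert(slot != NULL);
    memset(slot, 0, sizeof(*slot));
    strncpy(slot->command_tag, "VACUUM", PROGRESS_TAG_LEN - 1);
    slot->int_counters[0] = heap_scanned;   /* assumed meaning: heap pages scanned */
    slot->int_counters[1] = heap_total;     /* assumed meaning: total heap pages */
    progress_set_string(slot, 0, "scanning heap");
}
```

Because the slot holds only inline arrays, an array of such slots (one per backend) could be sized once at startup, which is the property the fixed-size proposal depends on.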

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [PROPOSAL] VACUUM Progress Checker.

From

Jim Nasby

Date:

22 July 2015, 15:23:44

On 7/22/15 6:58 AM, Amit Langote wrote:
> On Wed, Jul 22, 2015 at 8:19 PM, Alvaro Herrera
> <alvherre@2ndquadrant.com> wrote:
>>
>> Not sure what Jim meant.  Maybe he meant to be aware of when spilling to
>> disk happens?  Obviously, things become slower, so maybe you need to
>> consider it for progress reporting purposes.
>>
>
> Perhaps the m_w_m determines how many dead tuples lazy_scan_heap() can
> keep track of before doing a lazy_vacuum_indexes() +
> lazy_vacuum_heap() round. Smaller the m_w_m, more the number of index
> scans, slower the progress?

Yes. Any percent completion calculation will have to account for the 
case of needing multiple passes through all the indexes.

Each dead tuple requires 6 bytes (IIRC) of maintenance work mem. So if 
you're deleting 5M rows with m_w_m=1MB you should be getting many passes 
through the indexes. Studying the output of VACUUM VERBOSE will confirm 
that (or just throw a temporary WARNING in the path where we start the 
scan).
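
To make the arithmetic above concrete, here is a minimal sketch that assumes the 6-bytes-per-dead-tuple figure quoted above and assumes every pass starts only when the memory is completely full. The function name `index_passes_needed` is hypothetical, and a real VACUUM run may show a different count.

```c
#include <assert.h>
#include <stdint.h>

#define BYTES_PER_DEAD_TUPLE 6   /* the "6 bytes (IIRC)" figure from the text above */

/*
 * Number of index passes needed if every dead tuple ID must be held in
 * maintenance_work_mem (given in bytes) before the indexes are scanned.
 */
int64_t
index_passes_needed(int64_t dead_tuples, int64_t m_w_m_bytes)
{
    int64_t tids_per_pass;

    assert(dead_tuples >= 0);
    assert(m_w_m_bytes >= BYTES_PER_DEAD_TUPLE);

    if (dead_tuples == 0)
        return 0;

    tids_per_pass = m_w_m_bytes / BYTES_PER_DEAD_TUPLE;

    /* Round up: a partially filled final batch still needs its own pass. */
    return (dead_tuples + tids_per_pass - 1) / tids_per_pass;
}
```

With m_w_m=1MB (1048576 bytes) each pass holds 174762 dead tuple IDs, so 5M dead rows work out to 29 passes under these assumptions.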
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Data in Trouble? Get it in Treble! http://BlueTreble.com

Re: [PROPOSAL] VACUUM Progress Checker.

From

Jim Nasby

Date:

22 July 2015, 15:29:14

On 7/22/15 9:15 AM, Robert Haas wrote:
>> I'm not proposing this feature, I'm merely asking for it to be defined in a
>> way that makes it work for more than just VACUUM. Once we have a way of
>> reporting useful information, other processes can be made to follow that
>> mechanism, like REINDEX, ALTER TABLE etc.. I believe those things are
>> important, even if we never get such information for user queries. But I
>> hope we do.
>>
>> I won't get in the way of your search for detailed information in more
>> complex forms. Both things are needed.
> OK.
>
> One idea I have is to create a system where we expose a command tag
> (i.e. VACUUM) plus a series of generic fields whose specific meanings
> are dependent on the command tag.  Say, 6 bigint counters, 6 float8
> counters, and 3 strings up to 80 characters each.  So we have a
> fixed-size chunk of shared memory per backend, and each backend that
> wants to expose progress information can fill in those fields however
> it likes, and we expose the results.
>
> This would be sorta like the way pg_statistic works: the same columns
> can be used for different purposes depending on what estimator will be
> used to access them.

If we want to expose that level of detail, I think either JSON or arrays 
would make more sense, so we're not stuck with a limited amount of info. 
Perhaps DDL would be OK with the numbers you suggested, but 
https://www.pgcon.org/2013/schedule/events/576.en.html would not, and I 
think wanting query progress is much more common.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Data in Trouble? Get it in Treble! http://BlueTreble.com

Re: [PROPOSAL] VACUUM Progress Checker.

From

"Thakur, Sameer"

Date:

23 July 2015, 09:44:31

Hello,
> Yes. Any percent completion calculation will have to account for the case of needing multiple passes through all the
> indexes.
>
> Each dead tuple requires 6 bytes (IIRC) of maintenance work mem. So if you're deleting 5M rows with m_w_m=1MB you
> should be getting many passes through the indexes. Studying the output of VACUUM VERBOSE will confirm that (or just
> throw a temporary WARNING in the path where we start the scan).

Yes, I see the problem now. I get the message "WARNING:  Overall index percentage completion 100.000000" logged > 25
times while vacuuming after 5 million records were deleted.
Figuring out the number of index passes beforehand, accurately, is the problem to solve. I clearly need to study
this some more.
Thank you,
Sameer Thakur | Senior Software Specialist | NTTDATA Global Delivery Services Private Ltd | w. +91.20.6641.7146 | VoIP:
8834.8146| m. +91 989.016.6656 | sameer.thakur@nttdata.com | Follow us on Twitter@NTTDATAAmericas 


______________________________________________________________________
Disclaimer: This email and any attachments are sent in strictest confidence
for the sole use of the addressee and may contain legally privileged,
confidential, and proprietary data. If you are not the intended recipient,
please advise the sender by replying promptly to this email and then delete
and destroy this email and any attachments without any further use, copying
or forwarding.

Re: [PROPOSAL] VACUUM Progress Checker.

From

"Thakur, Sameer"

Date:

23 July 2015, 10:24:06

Hello,
>logged > 25 times
Sorry, it is much lower at 7 times. Does not change overall point though
regards
Sameer Thakur | Senior Software Specialist | NTTDATA Global Delivery Services Private Ltd | w. +91.20.6641.7146 | VoIP:
8834.8146| m. +91 989.016.6656 | sameer.thakur@nttdata.com | Follow us on Twitter@NTTDATAAmericas 



Re: [PROPOSAL] VACUUM Progress Checker.

From

Robert Haas

Date:

23 July 2015, 19:43:42

On Wed, Jul 22, 2015 at 11:28 AM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
> If we want to expose that level of detail, I think either JSON or arrays
> would make more sense, so we're not stuck with a limited amount of info.
> Perhaps DDL would be OK with the numbers you suggested, but
> https://www.pgcon.org/2013/schedule/events/576.en.html would not, and I
> think wanting query progress is much more common.

You need to restrict the amount of info, because you've got to
preallocate enough shared memory to store all the data that somebody
might report.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [PROPOSAL] VACUUM Progress Checker.

From

Jim Nasby

Date:

24 July 2015, 18:00:59

On 7/23/15 2:43 PM, Robert Haas wrote:
> On Wed, Jul 22, 2015 at 11:28 AM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
>> If we want to expose that level of detail, I think either JSON or arrays
>> would make more sense, so we're not stuck with a limited amount of info.
>> Perhaps DDL would be OK with the numbers you suggested, but
>> https://www.pgcon.org/2013/schedule/events/576.en.html would not, and I
>> think wanting query progress is much more common.
>
> You need to restrict the amount of info, because you've got to
> preallocate enough shared memory to store all the data that somebody
> might report.

I was thinking your DSM stuff would come into play here. We wouldn't 
want to be reallocating during execution, but I'd expect we would know 
during setup how much memory we actually needed.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Data in Trouble? Get it in Treble! http://BlueTreble.com

Re: [PROPOSAL] VACUUM Progress Checker.

From

Robert Haas

Date:

24 July 2015, 18:03:42

On Fri, Jul 24, 2015 at 2:00 PM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
>> You need to restrict the amount of info, because you've got to
>> preallocate enough shared memory to store all the data that somebody
>> might report.
>
> I was thinking your DSM stuff would come into play here. We wouldn't want to
> be reallocating during execution, but I'd expect we would know during setup
> how much memory we actually needed.

You could make that work, but it would be a pretty significant amount
of new mechanism.  Also, if it's to be practical to report progress
frequently, it's got to be cheap, and that precludes reporting vast
volumes of data.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [PROPOSAL] VACUUM Progress Checker.

From

Jim Nasby

Date:

24 July 2015, 18:07:09

On 7/23/15 5:18 AM, Thakur, Sameer wrote:
> Hello,
>> >logged > 25 times
> Sorry, it is much lower at 7 times. Does not change overall point though

I think it's related to the problem of figuring out how many dead tuples 
you expect to find in the overall heap, which you need to do to have any 
hope of this being a comprehensive estimate.

My inclination at this point is to provide a simple means of providing 
the raw numbers and let users test it in the wild. A really crude method 
of doing that might be to trap SIGINFO (if we're not using it already) 
and elog current status.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Data in Trouble? Get it in Treble! http://BlueTreble.com

Re: [PROPOSAL] VACUUM Progress Checker.

From

Josh Berkus

Date:

24 July 2015, 22:41:47

On 07/24/2015 11:06 AM, Jim Nasby wrote:
> On 7/23/15 5:18 AM, Thakur, Sameer wrote:
>> Hello,
>>> >logged > 25 times
>> Sorry, it is much lower at 7 times. Does not change overall point though
> 
> I think it's related to the problem of figuring out how many dead tuples
> you expect to find in the overall heap, which you need to do to have any
> hope of this being a comprehensive estimate.

What about just reporting scanned pages/total pages ?  That would be
easy and cheap to track.  It would result in some herky-jerky
"progress", but would still be an improvement over the feedback we don't
have now.
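
A minimal sketch of that scanned/total report, using the percent_complete and remaining_time formulas from the original proposal at the top of this thread. The function names and the -1.0 "unknown" return value are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* percent_complete = scanned_pages * 100 / total_pages_to_be_scanned */
double
percent_complete(int64_t scanned_pages, int64_t total_pages)
{
    assert(scanned_pages >= 0 && scanned_pages <= total_pages);
    if (total_pages == 0)
        return 100.0;           /* nothing to scan */
    return (double) scanned_pages * 100.0 / (double) total_pages;
}

/*
 * remaining_time = elapsed_time * (total - scanned) / scanned
 * Returns -1.0 (assumed convention) while there is no basis for an estimate.
 */
double
remaining_seconds(double elapsed_seconds, int64_t scanned_pages, int64_t total_pages)
{
    assert(scanned_pages >= 0 && scanned_pages <= total_pages);
    if (scanned_pages == 0)
        return -1.0;
    return elapsed_seconds * (double) (total_pages - scanned_pages)
        / (double) scanned_pages;
}
```

Both values only track the heap scan, so they would stall whenever VACUUM is busy with index passes; that is the "herky-jerky" behaviour mentioned above.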

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

Re: [PROPOSAL] VACUUM Progress Checker.

From

Mike Blackwell

Date:

25 July 2015, 00:21:46

Something like that would be helpful. I just had to stop one after an hour and have no idea how much longer it would have taken.

__________________________________________________________________________________
Mike Blackwell | Technical Analyst, Distribution Services/Rollout Management | RR Donnelley
1750 Wallace Ave | St Charles, IL 60174-3401
Office: 630.313.7818
Mike.Blackwell@rrd.com
http://www.rrdonnelley.com

On Fri, Jul 24, 2015 at 5:41 PM, Josh Berkus <josh@agliodbs.com> wrote:

> On 07/24/2015 11:06 AM, Jim Nasby wrote:
>> On 7/23/15 5:18 AM, Thakur, Sameer wrote:
>>> Hello,
>>>> >logged > 25 times
>>> Sorry, it is much lower at 7 times. Does not change overall point though
>>
>> I think it's related to the problem of figuring out how many dead tuples
>> you expect to find in the overall heap, which you need to do to have any
>> hope of this being a comprehensive estimate.
>
> What about just reporting scanned pages/total pages ? That would be
> easy and cheap to track. It would result in some herky-jerky
> "progress", but would still be an improvement over the feedback we don't
> have now.
>
> --
> Josh Berkus
> PostgreSQL Experts Inc.
> http://pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [PROPOSAL] VACUUM Progress Checker.

From

Pavel Stehule

Date:

25 July 2015, 06:22:58

2015-07-25 0:41 GMT+02:00 Josh Berkus <josh@agliodbs.com>:

> On 07/24/2015 11:06 AM, Jim Nasby wrote:
>> On 7/23/15 5:18 AM, Thakur, Sameer wrote:
>>> Hello,
>>>> >logged > 25 times
>>> Sorry, it is much lower at 7 times. Does not change overall point though
>>
>> I think it's related to the problem of figuring out how many dead tuples
>> you expect to find in the overall heap, which you need to do to have any
>> hope of this being a comprehensive estimate.
>
> What about just reporting scanned pages/total pages ? That would be
> easy and cheap to track. It would result in some herky-jerky
> "progress", but would still be an improvement over the feedback we don't
> have now.

I like this idea.

Regards

Pavel

> --
> Josh Berkus
> PostgreSQL Experts Inc.
> http://pgexperts.com


Re: [PROPOSAL] VACUUM Progress Checker.

From

Rahila Syed

Date:

31 July 2015, 03:30:31

> I think it's related to the problem of figuring out how many dead tuples you expect to find in the overall heap, which you need to do to have any hope of this being a comprehensive estimate.

The number of index scans during a vacuum can be estimated from an estimate of the total dead tuples in the relation and maintenance_work_mem.

n_dead_tuples in pg_stat_all_tables can be used as an estimate of dead tuples.

The following is one way to estimate it:

if nindexes == 0
    index_scans = 0
else if pages_all_visible
    index_scans = 0
else
    index_scans = Max((n_dead_tuples * space occupied by single dead tuple) / m_w_m, 1)

This estimates index_scans = 1 if n_dead_tuples = 0, assuming lazy_scan_heap is likely to find some dead tuples.

If n_dead_tuples is non zero the above estimate gives a lower bound on number of index scans possible.
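
The estimate above could be written as a small C function along the following lines. This is only a sketch: the function name, the parameter names, and passing the per-tuple size and m_w_m in bytes are assumptions made for illustration, and the integer division deliberately rounds down so that the result stays a lower bound, as described.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Lower-bound estimate of the number of index scans for one VACUUM.
 * n_dead_tuples would come from pg_stat_all_tables; dead_tuple_bytes is
 * the space occupied by a single dead tuple entry; m_w_m_bytes is
 * maintenance_work_mem expressed in bytes.
 */
int64_t
estimate_index_scans(int nindexes, bool pages_all_visible,
                     int64_t n_dead_tuples, int64_t dead_tuple_bytes,
                     int64_t m_w_m_bytes)
{
    int64_t scans;

    assert(n_dead_tuples >= 0 && dead_tuple_bytes > 0 && m_w_m_bytes > 0);

    if (nindexes == 0)
        return 0;               /* no indexes, so nothing to scan */
    if (pages_all_visible)
        return 0;               /* no dead tuples expected */

    /* Integer division rounds down, keeping this a lower bound. */
    scans = (n_dead_tuples * dead_tuple_bytes) / m_w_m_bytes;

    /* Assume at least one scan, since the heap scan may still find dead tuples. */
    return scans > 1 ? scans : 1;
}
```

For example, 5 million dead tuples at 6 bytes each with 1MB of maintenance_work_mem gives 30000000 / 1048576, which rounds down to 28.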

Thank you,

Rahila Syed

Re: [PROPOSAL] VACUUM Progress Checker.

From

Alvaro Herrera

Date:

31 July 2015, 18:21:20

I think the only way to produce usable estimates is to report more than
one number.  And in the particular case of lazy vacuuming, ISTM the way
to do it is to consider heap scanning as one phase, index cleanup as
another phase; these two phases can be interleaved.  And there's a final
heap scan which is a third phase, which can only run after phases one
and two are complete.

So you would report either "we're in phases one/two" or "we're in phase
three".  If we're in phases one/two, then we need to report

1. what's the page number of heap scan (i.e. how much more do we have to go yet?)
2. how many index scans have we done so far
3. if phase two, how many index pages have we scanned (total, i.e. across all indexes).

The total number of heap pages is known, and the total number of index
pages is also known, so it's possible to derive a percentage out of
this part.  Maybe it would be useful to report how much time it's been
in phases one and two respectively; with that, I think it is possible to
extrapolate the total time.

If we're in third phase, we report the heap page number we're in.

This looks pretty complicated to understand from the user POV, but
anything other than this seems to me too simplistic to be of any use.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [PROPOSAL] VACUUM Progress Checker.

From

"Joshua D. Drake"

Date:

31 July 2015, 18:43:46

On 07/31/2015 11:21 AM, Alvaro Herrera wrote:

> This looks pretty complicated to understand from the user POV, but
> anything other than this seems to me too simplistic to be of any use.
>

I would agree and I don't think it is all that complicated. This is an 
RDBMS not a web browser downloading a file.

JD

-- 
Command Prompt, Inc. - http://www.commandprompt.com/  503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Announcing "I'm offended" is basically telling the world you can't
control your own emotions, so everyone else should do it for you.

Re: [PROPOSAL] VACUUM Progress Checker.

From

Robert Haas

Date:

31 July 2015, 18:54:45

On Fri, Jul 31, 2015 at 2:21 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
> I think the only way to produce usable estimates is to report more than
> one number.  And in the particular case of lazy vacuuming, ISTM the way
> to do it is to consider heap scanning as one phase, index cleanup as
> another phase; these two phases can be interleaved.  And there's a final
> heap scan which is a third phase, which can only run after phases one
> and two are complete.

That's not really right.  There's a phase three for each phase two.

Put in terms of the code, what we're calling phase one is
lazy_scan_heap(), which prunes all pages, sets hint bits, collects
dead TIDs, and maybe marks the page all-visible.

When lazy_scan_heap() fills up maintenance_work_mem, or when it
reaches the end of the heap, it does phase two, which is
lazy_vacuum_index(), and phase three, which is lazy_vacuum_heap().
Phase one - lazy_scan_heap() - then keeps going from where it left
off.
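
The control flow described above can be modelled with a short, deliberately simplified C function. It is not the actual vacuum code: `count_vacuum_cycles` and its parameters are hypothetical, memory is expressed as a count of dead TIDs, and the model lets the last page overshoot the capacity rather than checking for room before each page.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Returns how many times phases two and three (index vacuum followed by
 * heap vacuum) run while phase one walks the heap.
 * dead_per_page[i] is the number of dead tuples found on heap page i.
 */
int
count_vacuum_cycles(const int *dead_per_page, size_t npages, int tid_capacity)
{
    int     collected = 0;
    int     cycles = 0;
    size_t  page;

    assert(tid_capacity > 0);

    for (page = 0; page < npages; page++)
    {
        collected += dead_per_page[page];   /* phase one collects dead TIDs */

        if (collected >= tid_capacity)
        {
            cycles++;           /* memory full: run phases two and three */
            collected = 0;      /* then phase one continues where it left off */
        }
    }

    if (collected > 0)
        cycles++;               /* end of heap: one more round for the leftovers */

    return cycles;
}
```

In this model the number of cycles depends on where the dead tuples sit as well as how many there are, which is why a single phase-three counter cannot describe overall progress.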

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [PROPOSAL] VACUUM Progress Checker.

From

Alvaro Herrera

Date:

31 July 2015, 19:31:00

Robert Haas wrote:
> On Fri, Jul 31, 2015 at 2:21 PM, Alvaro Herrera
> <alvherre@2ndquadrant.com> wrote:
> > I think the only way to produce usable estimates is to report more than
> > one number.  And in the particular case of lazy vacuuming, ISTM the way
> > to do it is to consider heap scanning as one phase, index cleanup as
> > another phase; these two phases can be interleaved.  And there's a final
> > heap scan which is a third phase, which can only run after phases one
> > and two are complete.
> 
> That's not really right.  There's a phase three for each phase two.
> 
> Put in terms of the code, what we're calling phase one is
> lazy_scan_heap(), which prunes all pages, sets hint bits, collects
> dead TIDs, and maybe marks the page all-visible.
> 
> When lazy_scan_heap() fills up maintenance_work_mem, or when it
> reaches the end of the heap, it does phase two, which is
> lazy_vacuum_index(), and phase three, which is lazy_vacuum_heap().
> Phase one - lazy_scan_heap() - then keeps going from where it left
> off.

Hmm, you're right.  I don't think it changes the essence of what I
suggest, though.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [PROPOSAL] VACUUM Progress Checker.

From

Rahila Syed

Date:

01 August 2015, 14:25:51

>The total number of heap pages is known, and the total number of index
>pages is also known, so it's possible to derive a percentage out of
>this part.

The total number of index pages scanned during the entire vacuum will depend on the number of index scans that happen.

In order to extrapolate percent complete for phase two (index scan), we need number of index scans * total index pages.

The number of index scans can vary from 1 to n (n depending on maintenance_work_mem).

Summarizing the suggestions in previous mails, the following information can be reported:

Phase 1. Heap pages scanned / total heap pages

Phase 2. Index pages scanned / total index pages (across all indexes)

Phase 3. Count of heap pages vacuumed

Additional info, like the number of index scans so far and the number of dead tuples being vacuumed in one batch, can also be provided.

A combined estimate for vacuum percent complete can be provided by summing up heap pages scanned, index pages scanned against total heap pages, total index pages * number of index scans.
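
A minimal sketch of that combined figure follows. The function and parameter names are hypothetical, index_pages_scanned is assumed to accumulate across all index scans so far, and index_scans is the estimated number of index scans, so the result is only as good as that estimate and can exceed 100 if the estimate turns out too low.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Combined percent complete:
 *   (heap pages scanned + index pages scanned)
 *   / (total heap pages + total index pages * number of index scans)
 */
double
combined_percent_complete(int64_t heap_pages_scanned,
                          int64_t index_pages_scanned,
                          int64_t total_heap_pages,
                          int64_t total_index_pages,
                          int64_t index_scans)
{
    int64_t done;
    int64_t total;

    assert(heap_pages_scanned >= 0 && index_pages_scanned >= 0);
    assert(total_heap_pages >= 0 && total_index_pages >= 0 && index_scans >= 0);

    done = heap_pages_scanned + index_pages_scanned;
    total = total_heap_pages + total_index_pages * index_scans;

    if (total == 0)
        return 100.0;           /* empty relation: nothing to do */

    return (double) done * 100.0 / (double) total;
}
```

For instance, 50 of 100 heap pages scanned plus 100 index pages scanned, with 50 index pages and 2 expected index scans, gives 150 / 200, i.e. 75 percent.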

Thank you,

Rahila Syed

Re: [PROPOSAL] VACUUM Progress Checker.

From

Rahila Syed

Date:

10 August 2015, 04:36:12

Hello,

>Say, 6 bigint counters, 6 float8
>counters, and 3 strings up to 80 characters each. So we have a
>fixed-size chunk of shared memory per backend, and each backend that
>wants to expose progress information can fill in those fields however
>it likes, and we expose the results.
>This would be sorta like the way pg_statistic works: the same columns
>can be used for different purposes depending on what estimator will be
>used to access them.

After thinking more on this suggestion, I came up with the following generic structure, which can be used to store the progress of any command per backend in shared memory.

typedef struct PgBackendProgress
{
    int32   *counter[COMMAND_NUM_SLOTS];          /* integer counters, one set per phase */
    float8  *counter_float[COMMAND_NUM_SLOTS];    /* float counters, one set per phase */
    char    *progress_message[COMMAND_NUM_SLOTS]; /* progress message, one per phase */
} PgBackendProgress;

COMMAND_NUM_SLOTS will define the maximum number of slots (phases) for any command.

Progress of a command will be measured using the progress of each phase in the command.

Some commands may have only a single phase, in which case the rest of the slots will be NULL.

Each phase will report n integer counters, n float counters and a progress message.

For some phases, any of the above fields can be NULL.

For VACUUM, there can be 3 phases, as discussed in the earlier mails.

Phase 1. Report 2 integer counters: heap pages scanned and total heap pages, 1 float counter: percentage_complete, and a progress message.

Phase 2. Report 2 integer counters: index pages scanned and total index pages (across all indexes), and a progress message.

Phase 3. Report 1 integer counter: heap pages vacuumed.

This structure can be accessed by the statistics collector to display progress via a new view.

Thank you,

Rahila Syed

Re: [PROPOSAL] VACUUM Progress Checker.

From

Masahiko Sawada

Date:

10 August 2015, 13:12:21

On Mon, Aug 10, 2015 at 1:36 PM, Rahila Syed <rahilasyed90@gmail.com> wrote:
> Hello,
>
>>Say, 6 bigint counters, 6 float8
>>counters, and 3 strings up to 80 characters each.  So we have a
>>fixed-size chunk of shared memory per backend, and each backend that
>>wants to expose progress information can fill in those fields however
>>it likes, and we expose the results.
>>This would be sorta like the way pg_statistic works: the same columns
>>can be used for different purposes depending on what estimator will be
>>used to access them.
>
> After thinking more on this suggestion, I came up with following generic
> structure which can be used to store progress of any command per backend in
> shared memory.
>
> Struct PgBackendProgress
> {
> int32 *counter[COMMAND_NUM_SLOTS];
> float8 *counter_float[COMMAND_NUM_SLOTS];
>
> char *progress_message[COMMAND_NUM_SLOTS];
> }
>
> COMMAND_NUM_SLOTS will define maximum number of slots(phases)  for any
> command.
> Progress of command will be measured using progress of each phase in
> command.
> For some command the number of phases can be singular and rest of the slots
> will be NULL.
>
> Each phase will report n integer counters, n float counters and a progress
> message.
> For some phases , any of the above fields can be NULL.
>
> For VACUUM , there can 3 phases as discussed in the earlier mails.
>
> Phase 1. Report 2 integer counters: heap pages scanned and total heap pages,
> 1 float counter: percentage_complete and progress message.
> Phase 2. Report 2 integer counters: index pages scanned and total index
> pages(across all indexes) and progress message.
> Phase 3. 1 integer counter: heap pages vacuumed.
>
> This structure can be accessed by statistics collector to display progress
> via new view.

I have one question about this.

When we're in Phase2 or 3, don't we need to report the number of total
page scanned or percentage of how many table pages scanned, as well?
As Robert said, Phase2(means lazy_vacuum_index here) and 3(means
lazy_vacuum_heap here) could be called whenever lazy_scan_heap fills
up the maintenance_work_mem. And phase 3 could be called at the end of
scanning single page if table doesn't have index.
So if the vacuum progress checker reports only the current phase's
information, we would not be able to know where we are overall.

Regards,

--
Masahiko Sawada

Re: [PROPOSAL] VACUUM Progress Checker.

From

"Syed, Rahila"

Date:

10 August 2015, 15:00:16

Hello,

> When we're in Phase2 or 3, don't we need to report the number of total page scanned or percentage of how many table
> pages scanned, as well?
The total heap pages scanned need to be reported with phase 2 or 3. A complete progress report needs to have numbers from
each phase when applicable.

> Phase 1. Report 2 integer counters: heap pages scanned and total heap
> pages,
> 1 float counter: percentage_complete and progress message.
> Phase 2. Report 2 integer counters: index pages scanned and total
> index pages(across all indexes) and progress message.
> Phase 3. 1 integer counter: heap pages vacuumed.

Sorry for being unclear here. What I meant to say is that each phase of a command will correspond to a slot in
COMMAND_NUM_SLOTS. Each phase will be a separate array element and
will comprise n integers, n floats, and a string. So, in the view reporting progress, a VACUUM command can have 3 entries,
one for each phase.

Thank you,
Rahila Syed




Re: [PROPOSAL] VACUUM Progress Checker.

From

Simon Riggs

Date:

10 August 2015, 15:20:33

On 10 August 2015 at 15:59, Syed, Rahila <Rahila.Syed@nttdata.com> wrote:

> Hello,
>
> >When we're in Phase2 or 3, don't we need to report the number of total page scanned or percentage of how many table pages scanned, as well?
> The total heap pages scanned need to be reported with phase 2 or 3. Complete progress report need to have numbers from each phase when applicable.
>
> > Phase 1. Report 2 integer counters: heap pages scanned and total heap
> > pages,
> > 1 float counter: percentage_complete and progress message.
> > Phase 2. Report 2 integer counters: index pages scanned and total
> > index pages(across all indexes) and progress message.
> > Phase 3. 1 integer counter: heap pages vacuumed.
>
> Sorry for being unclear here. What I meant to say is, each phase of a command will correspond to a slot in COMMAND_NUM_SLOTS. Each phase will be a separate array element and
> will comprise of n integers, n floats, string. So , in the view reporting progress, VACUUM command can have 3 entries one for each phase.

VACUUM has 3 phases now, but since phases 2 and 3 repeat, you can have an unbounded number of phases. But that assumes that we don't count truncation as a 4th phase of VACUUM...

SELECT statements also have a variable number of phases, hash, materialize, sorts all act as blocking nodes where you cannot progress to next phase until it is complete and you don't know for certain how much data will come in later phases.

I think the best you'll do is an array of pairs of values [(current blocks, total blocks), ... ]

Knowing how many phases there are is a tough problem. I think the only way forwards is to admit that we will publish our best initial estimate of total workload size and then later we may realise it was wrong and publish a better number (do until complete). It's not wonderful, but la vida es loca.

--

Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [PROPOSAL] VACUUM Progress Checker.

From

Masahiko Sawada

Date:

10 August 2015, 16:00:22

On Tue, Aug 11, 2015 at 12:20 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On 10 August 2015 at 15:59, Syed, Rahila <Rahila.Syed@nttdata.com> wrote:
>>
>> Hello,
>>
>> >When we're in Phase2 or 3, don't we need to report the number of total
>> > page scanned or percentage of how many table pages scanned, as well?
>> The total heap pages scanned need to be reported with phase 2 or 3.
>> Complete progress report need to have numbers from each phase when
>> applicable.
>>
>> > Phase 1. Report 2 integer counters: heap pages scanned and total heap
>> > pages,
>> > 1 float counter: percentage_complete and progress message.
>> > Phase 2. Report 2 integer counters: index pages scanned and total
>> > index pages(across all indexes) and progress message.
>> > Phase 3. 1 integer counter: heap pages vacuumed.
>>
>> Sorry for being unclear here. What I meant to say is, each phase of a
>> command will correspond to a slot in COMMAND_NUM_SLOTS. Each phase will be a
>> separate array element and
>> will comprise of n integers, n floats, string. So , in the view reporting
>> progress, VACUUM command can have 3 entries one for each phase.
>
>
> VACUUM has 3 phases now, but since phases 2 and 3 repeat, you can have an
> unbounded number of phases. But that assumes that we don't count truncation
> as a 4th phase of VACUUM...

Yeah.
This topic may have been already discussed but, why don't we use just
total scanned pages and total pages?

The mechanism of VACUUM is complicated a bit today, and other
maintenance commands are as well.
It would be tough to trace all of this processing, and it might
change in the future.
But basically, we can easily trace the total scanned pages of the target
relation, and such information would be enough in many cases.

Regards,

--
Masahiko Sawada

Re: [PROPOSAL] VACUUM Progress Checker.

From

Alvaro Herrera

Date:

10 August 2015, 16:50:27

Masahiko Sawada wrote:

> This topic may have been already discussed but, why don't we use just
> total scanned pages and total pages?

Because those numbers don't extrapolate nicely.  If the density of dead
tuples is irregular across the table, such absolute numbers might be
completely meaningless: you could scan 90% of the table without seeing
any index scan, and then at the final 10% be hit by many index scans
cleaning dead tuples.  Thus you would see progress go up to 90% very
quickly and then take hours to have it go to 91%.  (Additionally, and a
comparatively minor point: since you don't know how many index scans are
going to happen, there's no way to know the total number of blocks
scanned, unless you don't count index blocks at all, and then the
numbers become a lie.)

If you instead track number of heap pages separately from index pages,
and indicate how many index scans have taken place, you have a chance of
actually figuring out how many heap pages are left to scan and how many
more index scans will occur.

> The mechanism of VACUUM is complicated a bit today,

Understatement of the week ...

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [PROPOSAL] VACUUM Progress Checker.

From

Masahiko Sawada

Date:

11 August 2015, 05:54:45

On Tue, Aug 11, 2015 at 1:50 AM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
> Masahiko Sawada wrote:
>
>> This topic may have been already discussed but, why don't we use just
>> total scanned pages and total pages?
>
> Because those numbers don't extrapolate nicely.  If the density of dead
> tuples is irregular across the table, such absolute numbers might be
> completely meaningless: you could scan 90% of the table without seeing
> any index scan, and then at the final 10% be hit by many index scans
> cleaning dead tuples.  Thus you would see progress go up to 90% very
> quickly and then take hours to have it go to 91%.  (Additionally, and a
> comparatively minor point: since you don't know how many index scans are
> going to happen, there's no way to know the total number of blocks
> scanned, unless you don't count index blocks at all, and then the
> numbers become a lie.)
> If you instead track number of heap pages separately from index pages,
> and indicate how many index scans have taken place, you have a chance of
> actually figuring out how many heap pages are left to scan and how many
> more index scans will occur.

Thank you for your explanation!
I understand now.

> VACUUM has 3 phases now, but since phases 2 and 3 repeat, you can have an unbounded number of phases. But that
> assumes that we don't count truncation as a 4th phase of VACUUM...

In case of vacuum, I think we need to track the number of scanned heap
pages at least, and the information about index scan is the additional
information.
Another idea for displaying progress is to have two kinds of
information: essential information and additional information.

Essential information has one numeric data, which is stored
essentially information regarding of its processing.
Additional information has two data: text and numeric. These data is
free-style data which is stored by each backend as it like.
All three values are output at the same time.

For example, In case of vacuum, essential information is the number of
total scanned heap page.

* When lazy_scan_heap starts, the two additional data are NULL.

* When lazy_vacuum_index starts, the backend sets the additional data as follows:
  - "Index vacuuming" into the text data, which describes what we're actually doing now.
  - "50" into the numeric data, which describes how many index pages we have scanned.

* And when lazy_vacuum_index is done, the backend sets the additional data to NULL again.

Regards,

--
Masahiko Sawada

Re: [PROPOSAL] VACUUM Progress Checker.

From

Simon Riggs

Date:

11 August 2015, 09:43:34

On 10 August 2015 at 17:50, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

> Masahiko Sawada wrote:
>
> > This topic may have been already discussed but, why don't we use just
> > total scanned pages and total pages?
>
> Because those numbers don't extrapolate nicely. If the density of dead
> tuples is irregular across the table, such absolute numbers might be
> completely meaningless: you could scan 90% of the table without seeing
> any index scan, and then at the final 10% be hit by many index scans
> cleaning dead tuples. Thus you would see progress go up to 90% very
> quickly and then take hours to have it go to 91%. (Additionally, and a
> comparatively minor point: since you don't know how many index scans are
> going to happen, there's no way to know the total number of blocks
> scanned, unless you don't count index blocks at all, and then the
> numbers become a lie.)
>
> If you instead track number of heap pages separately from index pages,
> and indicate how many index scans have taken place, you have a chance of
> actually figuring out how many heap pages are left to scan and how many
> more index scans will occur.

I think this overstates the difficulty.

Autovacuum knows what % of a table needs to be cleaned - that is how it is triggered. When a vacuum runs we should calculate how many TIDs we will collect and therefore how many trips to the indexes we need for given memory. We can use the VM to find out how many blocks we'll need to scan in the table. So overall, we know how many blocks we need to scan.

I think just storing (total num blocks, scanned blocks) is sufficiently accurate to be worth holding, rather than make it even more complex.
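
As a rough sketch of the up-front estimate described above (not code from any patch in this thread): the number of index passes follows from how many dead-tuple TIDs fit in memory, and the total block count is the heap blocks to scan plus one full index scan per pass. The function names, the memory-budget parameter, and the assumption that each TID costs 6 bytes (the size of an ItemPointerData) are illustrative only, and the second heap pass is left out for brevity.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: each dead-tuple TID costs 6 bytes (size of ItemPointerData). */
#define TID_BYTES 6

/*
 * Hypothetical helper: how many index passes are needed if the estimated
 * dead tuples do not all fit into the memory budget at once.
 */
static uint64_t
estimate_index_passes(uint64_t est_dead_tuples, uint64_t mem_bytes)
{
    uint64_t    tids_per_pass = mem_bytes / TID_BYTES;

    if (est_dead_tuples == 0)
        return 0;               /* nothing to clean, so no index scan at all */
    if (tids_per_pass == 0)
        tids_per_pass = 1;      /* guard against a degenerate memory budget */

    /* ceiling division */
    return (est_dead_tuples + tids_per_pass - 1) / tids_per_pass;
}

/*
 * Hypothetical helper: initial estimate of the total blocks VACUUM will
 * touch. heap_blocks_to_scan would come from the visibility map and
 * index_blocks is the combined size of all indexes, in blocks.
 */
static uint64_t
estimate_total_blocks(uint64_t heap_blocks_to_scan, uint64_t index_blocks,
                      uint64_t est_dead_tuples, uint64_t mem_bytes)
{
    return heap_blocks_to_scan +
        estimate_index_passes(est_dead_tuples, mem_bytes) * index_blocks;
}
```

With no dead tuples the estimate is just the heap blocks, so a freshly loaded table would not have index pages counted against it. The estimate can still be wrong if the dead-tuple count is stale, in which case a better number would have to be published later.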

--

Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [PROPOSAL] VACUUM Progress Checker.

From

"Syed, Rahila"

Date:

13 August 2015, 10:15:32

Hello,

>Autovacuum knows what % of a table needs to be cleaned - that is how it is triggered.
>When a vacuum runs we should calculate how many TIDs we will collect and therefore how many trips to the indexes we
>need for given memory.
>We can use the VM to find out how many blocks we'll need to scan in the table. So overall, we know how many blocks we
>need to scan.

Total heap pages to be scanned can be obtained from the VM as mentioned. To figure out the number of index scans we need an
estimate of dead tuples.

IIUC, autovacuum learns that a table has to be cleaned by looking at the pgstat entry for the table, i.e.
n_dead_tuples.
Hence, an initial estimate of dead tuple TIDs can be made using n_dead_tuples in pgstat.
n_dead_tuples in the pgstat table entry is the value updated by the last analyze and may not be up to date.
In cases where the pgstat entry for the table is NULL, the number of dead tuple TIDs cannot be estimated.
In such cases, we can start with an initial estimate of 1 index scan and later go on
adding the number of index pages to the total count of pages (heap+index) if the count of index scans exceeds the estimate.

Thank you,
Rahila Syed.


______________________________________________________________________
Disclaimer: This email and any attachments are sent in strictest confidence
for the sole use of the addressee and may contain legally privileged,
confidential, and proprietary data. If you are not the intended recipient,
please advise the sender by replying promptly to this email and then delete
and destroy this email and any attachments without any further use, copying
or forwarding.

Re: [PROPOSAL] VACUUM Progress Checker.

From

Rahila Syed

Date:

17 August 2015, 22:08:04

>In case of vacuum, I think we need to track the number of scanned heap
>pages at least, and the information about index scan is the additional
>information

Actually the progress of heap pages scan depend on index scans. So complete VACUUM progress needs to have a count of index pages scanned too. So, progress can be calculated by measuring index_pages_scanned + heap_pages_scanned against total_index_pages + total_heap_pages. This can make essential information.

This can be followed by additional individual phase information.

Following fields, common across different commands, can be used to display progress:

Command | work done | total work | percent complete | message
--------+-----------+------------+------------------+----------------
VACUUM  | x         | y          | z                | total progress
        | u         | v          | w                | phase 1

The command code can be divided into distinct phases, and the progress of each phase can be represented separately, with a summary of the entire command's progress as the first entry. The summary can be the summation of the individual phase entries.

If a phase repeats during command execution, the previous entry for that phase will be replaced (e.g. index scan in vacuum).

>Essential information has one numeric data, which is stored
>essentially information regarding of its processing.
We may need more than one numeric data as mentioned above to represent scanned blocks versus total blocks.

>Additional information has two data: text and numeric. These data is
>free-style data which is stored by each backend as it like.

If I understand your point correctly, I think you are missing the following:

The amount of additional information for each command can be different. We may need an array of text and numeric data to represent more additional information.

Thank you,

Rahila Syed

Re: [PROPOSAL] VACUUM Progress Checker.

From

Jim Nasby

Date:

18 August 2015, 05:15:56

On 8/17/15 5:07 PM, Rahila Syed wrote:
>>In case of vacuum, I think we need to track the number of scanned heap
>>pages at least, and the information about index scan is the additional
>>information
>
> Actually the progress of heap pages scan depend on index scans. So
> complete VACUUM progress
> needs to have a count of index pages scanned too. So, progress can be
> calculated by measuring index_pages_scanned + heap_pages_scanned
> against total_index_pages + total_heap_pages. This can make essential
> information.

There's absolutely no way to get a reasonable status report in the case 
of multiple index passes unless you somehow count the passes, especially 
since index cleanup is frequently MUCH longer than the heap cleanup.

What should work is exporting the number of index passes we've already 
made. If > 0 we know we're in a multiple scan situation. At the end of 
each index pass, do index_passes++; index_pages=0; index_pages_scanned=0.
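
To make that concrete, here is a minimal sketch of the bookkeeping, assuming a hypothetical per-backend struct. The struct and function names are made up for illustration; only the three counters and the reset at the end of a pass come from the paragraph above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-backend counters for index cleanup progress. */
typedef struct VacuumIndexProgress
{
    uint32_t    index_passes;           /* index passes completed so far */
    uint64_t    index_pages;            /* total index pages in the current pass */
    uint64_t    index_pages_scanned;    /* pages scanned in the current pass */
} VacuumIndexProgress;

/* Called when an index pass starts. */
static void
begin_index_pass(VacuumIndexProgress *p, uint64_t total_pages)
{
    p->index_pages = total_pages;
    p->index_pages_scanned = 0;
}

/* Called once per index page processed. */
static void
report_index_page(VacuumIndexProgress *p)
{
    p->index_pages_scanned++;
}

/* Called when an index pass ends: count the pass, reset per-pass counters. */
static void
end_index_pass(VacuumIndexProgress *p)
{
    p->index_passes++;
    p->index_pages = 0;
    p->index_pages_scanned = 0;
}
```

A reader that sees index_passes > 0 then knows it is looking at a multiple-scan vacuum, whatever the per-pass counters currently say.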

Personally, I think we should use SIGINFO to signal a backend to output 
status data to a file in pg_stat_tmp/ (but not the main stats file) and 
be done with it. That allows us to easily handle variable length stuff 
with minimal fuss. No normal user is going to hammer away at that, and 
anyone that's really worried about performance will have that directory 
sitting on a ramdisk anyway.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Data in Trouble? Get it in Treble! http://BlueTreble.com

Re: [PROPOSAL] VACUUM Progress Checker.

From

Robert Haas

Date:

18 August 2015, 15:05:53

On Mon, Aug 10, 2015 at 12:36 AM, Rahila Syed <rahilasyed90@gmail.com> wrote:
> Hello,
>
>>Say, 6 bigint counters, 6 float8
>>counters, and 3 strings up to 80 characters each.  So we have a
>>fixed-size chunk of shared memory per backend, and each backend that
>>wants to expose progress information can fill in those fields however
>>it likes, and we expose the results.
>>This would be sorta like the way pg_statistic works: the same columns
>>can be used for different purposes depending on what estimator will be
>>used to access them.
>
> After thinking more on this suggestion, I came up with following generic
> structure which can be used to store progress of any command per backend in
> shared memory.
>
> Struct PgBackendProgress
> {
> int32 *counter[COMMAND_NUM_SLOTS];
> float8 *counter_float[COMMAND_NUM_SLOTS];
>
> char *progress_message[COMMAND_NUM_SLOTS];
> }

This can't actually work, because we don't have a dynamic allocator
for shared memory.  What you need to do is something like this:

struct PgBackendProgress
{
    uint64 progress_integer[N_PROGRESS_INTEGER];
    float8 progress_float[N_PROGRESS_FLOAT];
    char progress_string[PROGRESS_STRING_LENGTH][N_PROGRESS_STRING];
};

You probably want to protect this with the st_changecount protocol, or
just put the fields in PgBackendStatus.
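
For readers unfamiliar with it, the general shape of a change-count protocol can be sketched roughly as below. This is a simplified, single-process illustration and not the pgstat.c code: the struct and function names are hypothetical, the array size is an assumption, and a real shared-memory version would also need memory barriers around the counter updates.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define N_PROGRESS_INTEGER 6    /* assumption: size chosen for illustration */

/* Hypothetical fixed-size progress slot, one per backend. */
typedef struct ProgressSlot
{
    volatile uint32_t changecount;  /* odd while a write is in progress */
    uint64_t    progress_integer[N_PROGRESS_INTEGER];
} ProgressSlot;

/* Writer: bump the counter before and after touching the payload. */
static void
progress_write(ProgressSlot *slot, int idx, uint64_t value)
{
    slot->changecount++;            /* now odd: readers must not trust data */
    slot->progress_integer[idx] = value;
    slot->changecount++;            /* even again: data is consistent */
}

/*
 * Reader: copy the payload, then check that the counter was even and did
 * not move during the copy. Returns 1 on a consistent copy, 0 if the
 * caller should retry.
 */
static int
progress_try_read(const ProgressSlot *slot, uint64_t *out)
{
    uint32_t    before;
    uint32_t    after;

    before = slot->changecount;
    memcpy(out, slot->progress_integer, sizeof(slot->progress_integer));
    after = slot->changecount;

    return before == after && (before & 1) == 0;
}
```

The writer never blocks on readers; a reader that catches a write in flight simply loops until it gets a clean copy.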

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [PROPOSAL] VACUUM Progress Checker.

From

Rahila Syed

Date:

31 August 2015, 14:40:01

Hello,

On Jul 16, 2015 1:48 AM, "Rahila Syed" <rahilasyed90@gmail.com> wrote:
>
> Hello,
>
> Please find attached updated patch with an interface to calculate command progress in pgstat.c. This interface
> currently implements VACUUM progress tracking.

I have added this patch to CommitFest 2015-09. It is marked as Waiting on author. I will post an updated patch as per
review comments soon.

Thank you,
Rahila Syed

Re: [PROPOSAL] VACUUM Progress Checker.

From

"Syed, Rahila"

Date:

11 September 2015, 15:24:57

Hello,

Please find attached updated VACUUM progress checker patch.
Following have been accomplished in the patch

1. Accounts for index pages count while calculating total progress of VACUUM.
2. Common location for storing progress parameters for any command. Idea is every command which needs to report
   progress can populate and interpret the shared variables in its own way.
   Each can display progress by implementing separate views.
3. Separate VACUUM progress view to display various progress parameters has been implemented. Progress of various
   phases like heap scan, index scan, total pages scanned along with
   completion percentage is reported.
4. This view can display progress for all active backends running VACUUM.

Basic testing has been performed. Thorough testing is yet to be done. Marking it as Needs Review in  Sept-Commitfest.

ToDo:
Display count of heap pages actually vacuumed(marking line pointers unused)
Display percentage of work_mem being used to store dead tuples.

Thank you,
Rahila Syed


Attachment

Vacuum_progress_checker_v2.patch

Re: [PROPOSAL] VACUUM Progress Checker.

From

Thom Brown

Date:

11 September 2015, 16:13:22

On 11 September 2015 at 15:43, Syed, Rahila <Rahila.Syed@nttdata.com> wrote:

> Hello,
>
> Please find attached updated VACUUM progress checker patch.
> Following have been accomplished in the patch
>
> 1. Accounts for index pages count while calculating total progress of VACUUM.
> 2. Common location for storing progress parameters for any command. Idea is every command which needs to report progress can populate and interpret the shared variables in its own way.
> Each can display progress by implementing separate views.
> 3. Separate VACUUM progress view to display various progress parameters has been implemented . Progress of various phases like heap scan, index scan, total pages scanned along with
> completion percentage is reported.
> 4.This view can display progress for all active backends running VACUUM.
>
> Basic testing has been performed. Thorough testing is yet to be done. Marking it as Needs Review in Sept-Commitfest.
>
> ToDo:
> Display count of heap pages actually vacuumed(marking line pointers unused)
> Display percentage of work_mem being used to store dead tuples.
>
> Thank you,
> Rahila Syed

This doesn't seem to compile:

make[4]: Leaving directory `/home/thom/Development/postgresql/src/backend'
make[3]: Leaving directory `/home/thom/Development/postgresql/src/common'
make -C catalog schemapg.h
make[3]: Entering directory `/home/thom/Development/postgresql/src/backend/catalog'
cd ../../../src/include/catalog && '/usr/bin/perl' ./duplicate_oids
3308
make[3]: *** [postgres.bki] Error 1
make[3]: Leaving directory `/home/thom/Development/postgresql/src/backend/catalog'
make[2]: *** [submake-schemapg] Error 2
make[2]: Leaving directory `/home/thom/Development/postgresql/src/backend'
make[1]: *** [install-backend-recurse] Error 2
make[1]: Leaving directory `/home/thom/Development/postgresql/src'
make: *** [install-src-recurse] Error 2
make: Leaving directory `/home/thom/Development/postgresql'

Thom

Re: [PROPOSAL] VACUUM Progress Checker.

From

Rahila Syed

Date:

11 September 2015, 20:10:26

>This doesn't seem to compile

Oh. It compiled successfully when applied on HEAD on my machine. Anyways, the OID is changed to 3309 in the attached patch. 3308 / 3309 both are part of OIDs in unused OID list.

Thank you,

Rahila Syed

Attachment

Vacuum_progress_checker_v2.patch

Re: [PROPOSAL] VACUUM Progress Checker.

From

Alvaro Herrera

Date:

11 September 2015, 21:35:12

Rahila Syed wrote:
> >This doesn't seem to compile
> Oh. It compiled successfully when applied on HEAD on my machine. Anyways,
> the OID is changed to 3309 in the attached patch. 3308 / 3309 both are part
> of OIDs in unused OID list.

I think Thom may have patched on top of some other patch.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [PROPOSAL] VACUUM Progress Checker.

From

Thom Brown

Date:

12 September 2015, 00:05:34

On 11 September 2015 at 22:34, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

> Rahila Syed wrote:
> > >This doesn't seem to compile
> > Oh. It compiled successfully when applied on HEAD on my machine. Anyways,
> > the OID is changed to 3309 in the attached patch. 3308 / 3309 both are part
> > of OIDs in unused OID list.
>
> I think Thom may have patched on top of some other patch.

I think you might be right. I had run "git stash" and thought that would be sufficient, but it seems "git clean -f" was necessary.

It builds fine now.

Thom

Re: [PROPOSAL] VACUUM Progress Checker.

From

Thom Brown

Date:

14 September 2015, 13:26:26

On 11 September 2015 at 15:43, Syed, Rahila <Rahila.Syed@nttdata.com> wrote:

> Hello,
>
> Please find attached updated VACUUM progress checker patch.
> Following have been accomplished in the patch
>
> 1. Accounts for index pages count while calculating total progress of VACUUM.
> 2. Common location for storing progress parameters for any command. Idea is every command which needs to report progress can populate and interpret the shared variables in its own way.
> Each can display progress by implementing separate views.
> 3. Separate VACUUM progress view to display various progress parameters has been implemented . Progress of various phases like heap scan, index scan, total pages scanned along with
> completion percentage is reported.
> 4.This view can display progress for all active backends running VACUUM.
>
> Basic testing has been performed. Thorough testing is yet to be done. Marking it as Needs Review in Sept-Commitfest.
>
> ToDo:
> Display count of heap pages actually vacuumed(marking line pointers unused)
> Display percentage of work_mem being used to store dead tuples.

Okay, I've just tested this with a newly-loaded table (1,252,973 of jsonb data), and it works fine during a vacuum. I can see the scanned_pages, scanned_heap_pages and percent_complete increasing, but after it's finished, I end up with this:

json=# select * from pg_stat_vacuum_progress;
-[ RECORD 1 ]-------+-------
pid                 | 5569
total_pages         | 217941
scanned_pages       | 175243
total_heap_pages    | 175243
scanned_heap_pages  | 175243
total_index_pages   | 42698
scanned_index_pages |
percent_complete    | 80

This was running with a VACUUM ANALYZE. This output seems to suggest that it didn't complete.

After, I ran VACUUM FULL. pg_stat_vacuum_progress didn't change from before, so that doesn't appear to show up in the view.

I then deleted 40,000 rows from my table, and ran VACUUM ANALYZE again. This time it progressed and percent_complete reached 100.

--

Thom

Re: [PROPOSAL] VACUUM Progress Checker.

From

"Syed, Rahila"

Date:

15 September 2015, 14:38:22

Hello Thom,

>Okay, I've just tested this with a newly-loaded table (1,252,973 of jsonb data),
Thanks a lot!

>but after it's finished, I end up with this:
>json=# select * from pg_stat_vacuum_progress;
>-[ RECORD 1 ]-------+-------
>pid                 | 5569
>total_pages         | 217941
>scanned_pages       | 175243
>total_heap_pages    | 175243
>scanned_heap_pages  | 175243
>total_index_pages   | 42698
>scanned_index_pages |
>percent_complete    | 80
>This was running with a VACUUM ANALYZE.  This output seems to suggest that it didn't complete.

Ok. The patch fails here because 'total pages to be scanned' takes into account index pages and no index pages are
actually scanned.
So the scanned pages count does not reach the total pages count. I will fix this.
It seems that no index pages were scanned during this because there were no dead tuples to be cleaned as the table was
newly loaded.

>After, I ran VACUUM FULL.  pg_stat_vacuum_progress didn't change from before, so that doesn't appear to show up in the
>view.
The scope of this patch is to report progress of basic VACUUM. It does not take into account VACUUM FULL yet. I think
this can be included after basic VACUUM progress is done.

>I then deleted 40,000 rows from my table, and ran VACUUM ANALYZE again.  This time it progressed and percent_complete
>reached 100
OK.

Thank you,
Rahila Syed.



Re: [PROPOSAL] VACUUM Progress Checker.

From

Masahiko Sawada

Date:

16 September 2015, 13:02:16

On Tue, Sep 15, 2015 at 11:35 PM, Syed, Rahila <Rahila.Syed@nttdata.com> wrote:
>
> Hello Thom,
>
>>Okay, I've just tested this with a newly-loaded table (1,252,973 of jsonb data),
> Thanks a lot!
>
>>but after it's finished, I end up with this:
>>json=# select * from pg_stat_vacuum_progress;
>>-[ RECORD 1 ]-------+-------
>>pid                 | 5569
>>total_pages         | 217941
>>scanned_pages       | 175243
>>total_heap_pages    | 175243
>>scanned_heap_pages  | 175243
>>total_index_pages   | 42698
>>scanned_index_pages |
>>percent_complete    | 80
>>This was running with a VACUUM ANALYZE.  This output seems to suggest that it didn't complete.
>
> Ok. The patch fails here because 'total pages to be scanned' takes into account index pages and no index pages are
> actually scanned.
> So the scanned pages count does not reach total pages count . I will fix this.
> It seems that no index pages were scanned during this  because there were no dead tuples to be cleaned as the table
> was newly loaded.
>
>>After, I ran VACUUM FULL.  pg_stat_vacuum_progress didn't change from before, so that doesn't appear to show up in
>>the view.
> The scope of this patch is to report progress of basic VACUUM . It does not take into account VACUUM FULL yet.  I
> think this can be included after basic VACUUM progress is done.
>
>>I then deleted 40,000 rows from my table, and ran VACUUM ANALYZE again.  This time it progressed and percent_complete
>>reached 100
> OK.
>

I tested this patch with some cases.
The following seem to be bugs.

* After running "pgbench -i -s 100" and "VACUUM FREEZE
pgbench_accounts", the pg_stat_vacuum_progress is,

-[ RECORD 1 ]-------+-------
pid                 | 2298
total_pages         | 27422
scanned_pages       | 163935
total_heap_pages    |
scanned_heap_pages  | 163935
total_index_pages   | 27422
scanned_index_pages |
percent_complete    | 597

The value of percent_complete column exceeds 100%.
And, why are the total_heap_pages and scanned_index_pages columns NULL?

* Also, after dropping the primary key of pgbench_accounts, I got an
assertion failure when I executed "VACUUM FREEZE pgbench_accounts".

=# VACUUM FREEZE pgbench_accounts;
ERROR:  floating-point exception
DETAIL:  An invalid floating-point operation was signaled. This
probably means an out-of-range result or an invalid operation, such as
division by zero.
STATEMENT:  vacuum freeze pgbench_accounts ;
TRAP: FailedAssertion("!((beentry->st_changecount & 1) == 0)", File:
"pgstat.c", Line: 2934)

* The progress of vacuum by autovacuum seems not to be displayed.

Regards,

--
Masahiko Sawada

Re: [PROPOSAL] VACUUM Progress Checker.

From

"Syed, Rahila"

Date:

22 September 2015, 15:24:49

Hello,

Please find attached patch with bugs reported by Thom and Sawada-san solved.

>* The progress of vacuum by autovacuum seems not to be displayed.
The progress is stored in shared variables during autovacuum. I guess the reason they are not visible is that the
entries are deleted as soon as the process exits.
But the progress can be viewed while autovacuum worker is running.

Thank you,
Rahila Syed


Attachment

Vacuum_progress_checker_v3.patch

Re: [PROPOSAL] VACUUM Progress Checker.

From

Masahiko Sawada

Date:

24 September 2015, 12:38:30

On Wed, Sep 23, 2015 at 12:24 AM, Syed, Rahila <Rahila.Syed@nttdata.com> wrote:
> Hello,
>
> Please find attached patch with bugs reported by Thom and Sawada-san solved.
>
>>* The progress of vacuum by autovacuum seems not to be displayed.
> The progress is stored in shared variables during autovacuum. I guess the reason they are not visible is that the
> entries are deleted as soon as the process exits.
> But the progress can be viewed while autovacuum worker is running.
>

Thank you for updating the patch.

I tested the latest version patch.
The followings are my review comments and questions.

* pg_stat_vacuum_progress should have the oid of relation being vacuumed.

When we run "VACUUM;", the all tables of current database will be vacuumed.
So pg_stat_vacuum_progress should have these oid in order to show
which table is vacuumed now.

* The progress_message variable in PgBackendStatus is not used at all.
IIRC, the progress_message variable is meant to hold a description of the processing.

* The progress of VACUUM FULL seems wrong.
When I run VACUUM FULL for a table, I got following progress.

postgres(1)=# select * from pg_stat_vacuum_progress ;
-[ RECORD 1 ]-------+------
pid                 | 19190
total_pages         | 1
scanned_pages       | 1
total_heap_pages    | 1
scanned_heap_pages  | 1
total_index_pages   |
scanned_index_pages |
percent_complete    | 100

The table being vacuumed is 400MB, so it's not a 1-page table.

* The vacuum by autovacuum is not displayed.
I tested about this by the executing the following queries in a row,
but the vacuum by autovacuum is not displayed,

postgres(1)=# select datname, pid, backend_start, query, state from pg_stat_activity ;
 datname  |  pid  |         backend_start         |                                  query                                   | state
----------+-------+-------------------------------+--------------------------------------------------------------------------+--------
 postgres | 20123 | 2015-09-24 17:44:26.467021+09 | autovacuum: VACUUM ANALYZE public.hoge                                   | active
 postgres | 19779 | 2015-09-24 17:42:31.57918+09  | select datname, pid, backend_start, query, state from pg_stat_activity ; | active
(3 rows)

postgres(1)=# selecT * from pg_stat_vacuum_progress ;
 pid | total_pages | scanned_pages | total_heap_pages | scanned_heap_pages | total_index_pages | scanned_index_pages | percent_complete
-----+-------------+---------------+------------------+--------------------+-------------------+---------------------+------------------
(0 rows)

postgres(1)=# select datname, pid, backend_start, query, state from pg_stat_activity ;
 datname  |  pid  |         backend_start         |                                  query                                   | state
----------+-------+-------------------------------+--------------------------------------------------------------------------+--------
 postgres | 20123 | 2015-09-24 17:44:26.467021+09 | autovacuum: VACUUM ANALYZE public.hoge                                   | active
 postgres | 19779 | 2015-09-24 17:42:31.57918+09  | select datname, pid, backend_start, query, state from pg_stat_activity ; | active
(3 rows)

The vacuuming for hoge table took about 2min, but the progress of
vacuum is never displayed.
Could you check this on your environment?

Regards,

--
Masahiko Sawada

Re: [PROPOSAL] VACUUM Progress Checker.

From

Fujii Masao

Date:

24 September 2015, 17:03:48

On Wed, Sep 23, 2015 at 12:24 AM, Syed, Rahila <Rahila.Syed@nttdata.com> wrote:
> Hello,
>
> Please find attached patch with bugs reported by Thom and Sawada-san solved.

The regression test failed on my machine, so you need to update the
regression test,
I think.

Regards,

-- 
Fujii Masao

Re: [PROPOSAL] VACUUM Progress Checker.

From

Jim Nasby

Date:

24 September 2015, 22:01:46

On 9/24/15 7:37 AM, Masahiko Sawada wrote:
> * The progress of VACUUM FULL seems wrong.
> When I run VACUUM FULL for a table, I got following progress.

It never occurred to me that this patch was attempting to measure the 
progress of a CLUSTER (aka VACUUM FULL). I'm not sure that's such a 
great idea, as the progress estimation presumably needs to be 
significantly different.

More to the point, you can't estimate a CLUSTER unless you can estimate 
the progress of an index build. That'd be a cool feature to have as 
well, but it seems like a bad idea to mix that in with this patch.

Keep in mind that running a VACUUM FULL is presumably a LOT less common 
than regular vacuums, so I don't think leaving it out for now is that 
big a deal.

> * The vacuum by autovacuum is not displayed.
> I tested about this by the executing the following queries in a row,
> but the vacuum by autovacuum is not displayed,

IIRC this is the second problem related to autovacuum... is there some 
way to regression test that? Maybe disable autovac on a table, dirty it, 
then re-enable (all with an absurdly low autovacuum naptime)?
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com

Re: [PROPOSAL] VACUUM Progress Checker.

From

Robert Haas

Date:

28 September 2015, 15:03:24

On Thu, Sep 24, 2015 at 8:37 AM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> When we run "VACUUM;", the all tables of current database will be vacuumed.
> So pg_stat_vacuum_progress should have these oid in order to show
> which table is vacuumed now.

Hmm, I would tend to instead show the schema & table name, like "foo"."bar".

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [PROPOSAL] VACUUM Progress Checker.

From

Masahiko Sawada

Date:

29 September 2015, 00:37:42

On Mon, Sep 28, 2015 at 11:03 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Thu, Sep 24, 2015 at 8:37 AM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>> When we run "VACUUM;", the all tables of current database will be vacuumed.
>> So pg_stat_vacuum_progress should have these oid in order to show
>> which table is vacuumed now.
>
> Hmm, I would tend to instead show the schema & table name, like "foo"."bar".
>

Yes, it looks better.

Regards,

--
Masahiko Sawada

Re: [PROPOSAL] VACUUM Progress Checker.

From

Fujii Masao

Date:

02 October 2015, 06:38:53

On Fri, Sep 25, 2015 at 2:03 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
> On Wed, Sep 23, 2015 at 12:24 AM, Syed, Rahila <Rahila.Syed@nttdata.com> wrote:
>> Hello,
>>
>> Please find attached patch with bugs reported by Thom and Sawada-san solved.
>
> The regression test failed on my machine, so you need to update the
> regression test,
> I think.

Here are some more review comments.

You removed some empty lines, for example in vacuum.h,
which seems useless to me.

+    uint32 progress_param[N_PROGRESS_PARAM];

Why did you use an array to store the progress information of VACUUM?
I think that it's better to use separate specific variables for them for
better code readability, for example, variables scanned_pages,
heap_total_pages, etc.

+    double    progress_param_float[N_PROGRESS_PARAM];

Currently only progress_param_float[0] is used. So there is no need to
use an array here.

progress_param_float[0] saves the percentage of VACUUM progress.
But why do we need to save that information into shared memory?
We can just calculate the percentage whenever pg_stat_get_vacuum_progress()
is executed, instead. There seems to be no need to save that information.

+    char progress_message[PROGRESS_MESSAGE_LENGTH][N_PROGRESS_PARAM];

As Sawada pointed out, there is no user of this variable.

+#define PG_STAT_GET_PROGRESS_COLS    30

Why did you use 30?

+    FROM pg_stat_get_vacuum_progress(NULL) AS S;

You defined pg_stat_get_vacuum_progress() so that it accepts one argument.
But as far as I read the function, it doesn't use any argument at all.
I think that pg_stat_get_vacuum_progress() should be defined as a function
having no argument.

+        /* Report values for only those backends which are running VACUUM */
+        if(!beentry || (strncmp(beentry->st_activity,"VACUUM",6)
+                        && strncmp(beentry->st_activity,"vacuum",6)))
+            continue;

This design looks bad to me. There is no guarantee that st_activity of
the backend running VACUUM displays "VACUUM" or "vacuum".
For example, st_activity of autovacuum worker displays "autovacuum ...".
So as Sawada reported, he could not find any entry for autovacuum in
pg_stat_vacuum_progress.

I think that you should add the flag or something which indicates
whether this backend is running VACUUM or not, into PgBackendStatus.
pg_stat_vacuum_progress should display the entries of only backends
with that flag set true. This design means that you need to set the flag
to true when starting VACUUM and reset at the end of VACUUM progressing.
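
A small standalone sketch of why the prefix test misses autovacuum, and of what a flag-based check could look like. The BackendEntry struct, its field names and the sample activity string are assumptions made up for illustration; they are not the real PgBackendStatus.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Simplified, hypothetical stand-in for a backend status entry. */
typedef struct BackendEntry
{
    const char *activity;        /* command string, like st_activity */
    bool        running_vacuum;  /* hypothetical explicit flag */
} BackendEntry;

/* The check quoted above: matches only a "VACUUM" or "vacuum" prefix. */
bool
matches_by_activity(const BackendEntry *be)
{
    assert(be != NULL && be->activity != NULL);
    return strncmp(be->activity, "VACUUM", 6) == 0 ||
           strncmp(be->activity, "vacuum", 6) == 0;
}

/* The suggested check: independent of how the command string is spelled. */
bool
matches_by_flag(const BackendEntry *be)
{
    assert(be != NULL);
    return be->running_vacuum;
}
```

An entry whose activity string starts with "autovacuum" fails the first check but passes the second, which is the behaviour reported by Sawada.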

Non-superuser cannot see some columns of the superuser's entry in
pg_stat_activity, for permission reason. We should treat
pg_stat_vacuum_progress in the same way? That is, non-superuser
should not be allowed to see the pg_stat_vacuum_progress entry
of superuser running VACUUM?

+                if(!scan_all)
+                {
+                    total_heap_pages = total_heap_pages - (next_not_all_visible_block - blkno);
+                    total_pages = total_pages - (next_not_all_visible_block - blkno);
+                }

This code may cause total_pages and total_heap_pages to decrease
while VACUUM is running. This sounds strange and confusing. I think
that total values basically should be fixed. And heap_scanned_pages
should increase, instead.

Regards,

-- 
Fujii Masao

Re: [PROPOSAL] VACUUM Progress Checker.

From

Amit Langote

Date:

02 October 2015, 07:14:59

On 2015/10/02 15:38, Fujii Masao wrote:
> 
> +    uint32 progress_param[N_PROGRESS_PARAM];
> 
> Why did you use an array to store the progress information of VACUUM?
> I think that it's better to use separate specific variables for them for
> better code readability, for example, variables scanned_pages,
> heap_total_pages, etc.
> 
> +    double    progress_param_float[N_PROGRESS_PARAM];
> 
> Currently only progress_param_float[0] is used. So there is no need to
> use an array here.

I think this kind of design may have come from the ideas expressed here
(especially the last paragraph):

http://www.postgresql.org/message-id/CA+TgmoYnWtNJRmVWAJ+wGLOB_x8vNOTrZnEDio=GaPi5HK73oQ@mail.gmail.com

Thanks,
Amit

Re: [PROPOSAL] VACUUM Progress Checker.

From

Robert Haas

Date:

02 October 2015, 20:15:39

On Fri, Oct 2, 2015 at 3:14 AM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
> On 2015/10/02 15:38, Fujii Masao wrote:
>>
>> +    uint32 progress_param[N_PROGRESS_PARAM];
>>
>> Why did you use an array to store the progress information of VACUUM?
>> I think that it's better to use separate specific variables for them for
>> better code readability, for example, variables scanned_pages,
>> heap_total_pages, etc.
>>
>> +    double    progress_param_float[N_PROGRESS_PARAM];
>>
>> Currently only progress_param_float[0] is used. So there is no need to
>> use an array here.
>
> I think this kind of design may have come from the ideas expressed here
> (especially the last paragraph):
>
> http://www.postgresql.org/message-id/CA+TgmoYnWtNJRmVWAJ+wGLOB_x8vNOTrZnEDio=GaPi5HK73oQ@mail.gmail.com

Right.  This design is obviously silly if we only care about exposing
VACUUM progress.  But if we want to be able to expose progress from
many utility commands, and slightly different kinds of information for
each one, then I think it could be quite useful.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [PROPOSAL] VACUUM Progress Checker.

From

Andres Freund

Date:

03 October 2015, 14:02:03

Hi!

On 2015-09-22 15:24:38 +0000, Syed, Rahila wrote:
> Please find attached patch with bugs reported by Thom and Sawada-san solved.

This thread has seen a bunch of reviews and new patch versions, but
doesn't yet seem to have arrived in a committable state. As the
commitfest ended and this patch has gotten attention, I'm moving the
entry to the next fest.

Greetings,

Andres Freund

Re: [PROPOSAL] VACUUM Progress Checker.

From

"Syed, Rahila"

Date:

06 October 2015, 09:35:21

Hello Fujii-san,

>Here are another review comments
Thank you for review. Please find attached an updated patch.

> You removed some empty lines, for example, in vacuum.h.
>Which seems useless to me.
Has been corrected in the attached.

>Why did you use an array to store the progress information of VACUUM?
>I think that it's better to use separate specific variables for them for better code readability, for example, variables scanned_pages, heap_total_pages, etc.
It is designed this way to keep it generic for all the commands which can store different progress parameters in shared memory.

>Currently only progress_param_float[0] is used. So there is no need to use an array here.
Same as before . This is for later use by other commands.

>progress_param_float[0] saves the percentage of VACUUM progress.
>But why do we need to save that information into shared memory?
>We can just calculate the percentage whenever pg_stat_get_vacuum_progress() is executed, instead. There seems to be no need to save that information.
This has been corrected in the attached.

>char progress_message[PROGRESS_MESSAGE_LENGTH][N_PROGRESS_PARAM];
>As Sawada pointed out, there is no user of this variable.
Have used it to store table name in the updated patch. It can also be used to display index names, current phase of VACUUM.
This has not been included in the patch yet to avoid cluttering the display with too much information.

>For example, st_activity of autovacuum worker displays "autovacuum ...".
>So as Sawada reported, he could not find any entry for autovacuum in pg_stat_vacuum_progress.
In the attached patch, I have added a check for autovacuum as well.

>I think that you should add the flag or something which indicates whether this backend is running VACUUM or not, into PgBackendStatus.
>pg_stat_vacuum_progress should display the entries of only backends with that flag set true. This design means that you need to set the flag to true when starting VACUUM and reset at the end of VACUUM progressing.
This design seems better in the sense that we don't rely on st_activity entry to display progress values.
A variable which stores flags for running commands can be created in PgBackendStatus. These flags can be used at the time of display of progress of particular command.

>That is, non-superuser should not be allowed to see the pg_stat_vacuum_progress entry of superuser running VACUUM?
This has been included in the updated patch.

>This code may cause total_pages and total_heap_pages to decrease while VACUUM is running.
Yes. This is because the initial count of total pages to be vacuumed and the pages which are actually vacuumed can vary depending on visibility of tuples.
The pages which are all visible are skipped and hence have been subtracted from total page count.


Thank you,
Rahila Syed

______________________________________________________________________
Disclaimer: This email and any attachments are sent in strictest confidence
for the sole use of the addressee and may contain legally privileged,
confidential, and proprietary data. If you are not the intended recipient,
please advise the sender by replying promptly to this email and then delete
and destroy this email and any attachments without any further use, copying
or forwarding.

Attachment

Vacuum_progress_checker_v4.patch

Re: [PROPOSAL] VACUUM Progress Checker.

From

"Syed, Rahila"

Date:

06 October 2015, 10:47:35

Hello,

Please check the attached patch, as the earlier one had a typo in the regression test output.

>+#define PG_STAT_GET_PROGRESS_COLS    30
>Why did you use 30?
That has come from N_PROGRESS_PARAM * 3, where N_PROGRESS_PARAM = 10 is the number of progress parameters of each type stored in shared memory.
There are three such types (int, float, string), hence the total number of progress parameters can be 30.

Thank you,
Rahila Syed



Attachment

Vacuum_progress_checker_v4.patch

Re: [PROPOSAL] VACUUM Progress Checker.

From

"Syed, Rahila"

Date:

20 October 2015, 09:03:00

Hello,

>I think that you should add the flag or something which indicates whether this backend is running VACUUM or not, into PgBackendStatus.
>pg_stat_vacuum_progress should display the entries of only backends with that flag set true. This design means that you need to set the flag to true when starting VACUUM and reset at the end of VACUUM progressing.
Please find attached an updated patch which adds a flag in PgBackendStatus which indicates whether this backend is running VACUUM.
Also, pgstat_report_progress function is changed to make it generic for all commands reporting progress.

Thank you,
Rahila Syed




Attachment

Vacuum_progress_checker_v5.patch

Re: [PROPOSAL] VACUUM Progress Checker.

From

Robert Haas

Date:

22 October 2015, 13:54:56

On Tue, Oct 20, 2015 at 4:58 AM, Syed, Rahila <Rahila.Syed@nttdata.com> wrote:
>>I think that you should add the flag or something which indicates whether this backend is running VACUUM or not, into PgBackendStatus.
>>pg_stat_vacuum_progress should display the entries of only backends with that flag set true. This design means that you need to set the flag to true when starting VACUUM and reset at the end of VACUUM progressing.
>
> Please find attached  updated patch which adds a flag in PgBackendStatus which indicates whether this backend in running VACUUM.

Flag isn't reset on error.
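
To illustrate the problem outside the backend, here is a standalone sketch that imitates an error unwinding with setjmp()/longjmp(). In PostgreSQL itself the reset would presumably live in a PG_TRY()/PG_CATCH() block or in the abort path; every name below is a hypothetical stand-in, not code from the patch.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>

static bool    vacuum_running = false;  /* hypothetical per-backend flag */
static jmp_buf error_context;           /* stand-in for the error stack */

/* Stand-in for raising an error: jumps back to the handler in run_vacuum(). */
static void
raise_error(void)
{
    longjmp(error_context, 1);
}

static void
do_vacuum_work(bool fail)
{
    if (fail)
        raise_error();
}

bool
vacuum_is_running(void)
{
    return vacuum_running;
}

/* Returns true on success, false if the work ended in an error. */
bool
run_vacuum(bool fail)
{
    /* A flag left over from an earlier failed run would trip this check. */
    assert(!vacuum_running);
    vacuum_running = true;

    if (setjmp(error_context) == 0)
    {
        do_vacuum_work(fail);
        vacuum_running = false;     /* normal exit */
        return true;
    }

    vacuum_running = false;         /* error exit: reset here as well */
    return false;
}
```

Without the second reset, a failed run would leave the flag set and the backend would keep showing up in the progress view.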

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [PROPOSAL] VACUUM Progress Checker.

From

Alvaro Herrera

Date:

22 October 2015, 16:40:49

Syed, Rahila wrote:


> @@ -355,6 +356,7 @@ vacuum(int options, RangeVar *relation, Oid relid, VacuumParams *params,
>          vac_update_datfrozenxid();
>      }
>  
> +    pgstat_reset_activityflag;
>      /*
>       * Clean up working storage --- note we must do this after
>       * StartTransactionCommand, else we might be trying to delete the active

Does this actually compile?


> @@ -596,11 +630,42 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
>              /* Log cleanup info before we touch indexes */
>              vacuum_log_cleanup_info(onerel, vacrelstats);
>  
> +            /*
> +             * If passes through indexes exceed 1 add
> +             * pages equal to rel_index_pages to the count of
> +             * total pages to be scanned.
> +             */
> +            if (vacrelstats->num_index_scans >= 1)
> +            {
> +                total_index_pages = total_index_pages + rel_index_pages;
> +                total_pages = total_heap_pages + total_index_pages;
> +            }

Having to keep total_pages updated each time you change one of the
summands seems tedious and error-prone.  Why can't it be computed
whenever it is going to be used instead?

> +                memcpy((char *) progress_message[0], schemaname, schemaname_len);
> +                progress_message[0][schemaname_len] = '\0';
> +                strcat(progress_message[0],".");
> +                strcat(progress_message[0],relname);

snprintf()?  I don't think you need to keep track of schemaname_len at
all.

> +            scanned_index_pages += RelationGetNumberOfBlocks(Irel[i]);
> +            scanned_total_pages = scanned_total_pages + RelationGetNumberOfBlocks(Irel[i]);
> +            /* Report progress to the statistics collector */
> +            progress_param[0] = total_pages;
> +            progress_param[1] = scanned_total_pages;
> +            progress_param[2] = total_heap_pages;
> +            progress_param[3] = vacrelstats->scanned_pages;
> +            progress_param[4] = total_index_pages;
> +            progress_param[5] = scanned_index_pages;

In fact, I wonder if you need to send total_pages at all -- surely the
client can add both total_heap_pages and total_index_pages by itself ...

> +            memcpy((char *) progress_message[0], schemaname, schemaname_len);
> +            progress_message[0][schemaname_len] = '\0';
> +            strcat(progress_message[0],".");
> +            strcat(progress_message[0],relname);

snprintf().

> diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
> index ab018c4..f97759e 100644
> --- a/src/backend/postmaster/pgstat.c
> +++ b/src/backend/postmaster/pgstat.c
> @@ -2851,6 +2851,55 @@ pgstat_report_activity(BackendState state, const char *cmd_str)
>      pgstat_increment_changecount_after(beentry);
>  }
>  
> +/* ---------------------------------------------
> + * Called from VACUUM  after every heap page scan or index scan
> + * to report progress
> + * ---------------------------------------------
> + */
> +
> +void
> +pgstat_report_progress(uint *param1, int num_of_int, double *param2, int num_of_float,
> +                        char param3[PROGRESS_MESSAGE_LENGTH][N_PROGRESS_PARAM],
> +                        int num_of_string)
> +{
> +    volatile PgBackendStatus *beentry = MyBEEntry;
> +    int i;
> +
> +    if (!beentry)
> +        return;
> +
> +    if (!pgstat_track_activities)
> +        return;
> +
> +    pgstat_increment_changecount_before(beentry);
> +
> +    for(i = 0; i < num_of_int; i++)
> +    {
> +        beentry->progress_param[i] = param1[i];
> +    }
> +    for (i = 0; i < num_of_float; i++)
> +    {
> +        beentry->progress_param_float[i] = param2[i];
> +    }
> +    for (i = 0; i < num_of_string; i++)
> +    {
> +        strcpy((char *)beentry->progress_message[i], param3[i]);
> +    }
> +    pgstat_increment_changecount_after(beentry);
> +}

It seems a bit strange that the remaining progress_param entries are not
initialized to anything.  Also, why aren't the number of params of each
type saved too?  In the receiving code you check whether each value
equals 0, and if it does then report NULL, but imagine vacuuming a table
with no indexes where the number of index pages is going to be zero.
Shouldn't we display zero there rather than null?  Maybe I'm missing
something and that does work fine.

This patch lacks a comment somewhere explaining how this whole thing
works.

> diff --git a/src/include/pgstat.h b/src/include/pgstat.h
> index 9ecc163..4214b3d 100644
> --- a/src/include/pgstat.h
> +++ b/src/include/pgstat.h
> @@ -20,6 +20,7 @@
>  #include "utils/hsearch.h"
>  #include "utils/relcache.h"
>  
> +#include "storage/block.h"
I believe you don't need this include.


> @@ -776,6 +779,12 @@ typedef struct PgBackendStatus
>  
>      /* current command string; MUST be null-terminated */
>      char       *st_activity;
> +
> +    uint32        flag_activity;
> +    uint32        progress_param[N_PROGRESS_PARAM];
> +    double        progress_param_float[N_PROGRESS_PARAM];
> +    char        progress_message[PROGRESS_MESSAGE_LENGTH][N_PROGRESS_PARAM];
> +
>  } PgBackendStatus;
This not only adds an unnecessary empty line at the end of the struct
declaration, but also fails to preserve the "st_" prefix used in all the
other fields.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [PROPOSAL] VACUUM Progress Checker.

From

"Syed, Rahila"

Date:

29 October 2015, 14:24:03

Hello,

Please find attached an updated patch.

>Flag isn't reset on error.
Corrected in the attached.

> +    pgstat_reset_activityflag;
>Does this actually compile?
It does compile but with no effect.  It has been corrected.

>snprintf()?  I don't think you need to keep track of schemaname_len at all.
memcpy() has been replaced by snprintf() to avoid calculating schemaname_len.

>In fact, I wonder if you need to send total_pages at all -- surely the client can add both total_heap_pages and total_index_pages by itself ...
This has  been corrected in the attached patch.

>It seems a bit strange that the remaining progress_param entries are not initialized to anything.  Also, why aren't the number of params of each type saved too?
The number of params for each command remains constant hence it has been hardcoded.

>In the receiving code you check whether each value equals 0, and if it does then report NULL, but imagine vacuuming a table with no indexes where the number of index pages is going to be zero.
>Shouldn't we display zero there rather than null?
Agree.  IIUC, NULL should rather be used when a value is invalid. But for valid values like 'zero index pages' it is clearer to display 0. It has been corrected in the attached.

>This patch lacks a comment somewhere explaining how this whole thing works.
I have added a few lines in pgstat.h inside the PgBackendStatus struct.

>I believe you don't need this include.
Corrected.

>This not only adds an unnecessary empty line at the end of the struct declaration, but also fails to preserve the "st_" prefix used in all the other fields
Corrected.

Thank you,
Rahila Syed



Attachment

Vacuum_progress_checker_v6.patch

Re: [PROPOSAL] VACUUM Progress Checker.

From

Amit Langote

Date:

10 November 2015, 05:44:55

On 2015/10/29 23:22, Syed, Rahila wrote:
> 
> Please find attached an updated patch.
> 

Thanks for the v6. A few quick comments:

- duplicate_oids error in HEAD.

- a compiler warning:

pgstat.c:2898: warning: no previous prototype for ‘pgstat_reset_activityflag’

To fix that use void for empty parameter list -

-extern void pgstat_reset_activityflag();
+extern void pgstat_reset_activityflag(void);

One more change you could do is 's/activityflag/activity_flag/g', which I
guess is a naming related guideline in place.

Thanks,
Amit

Re: [PROPOSAL] VACUUM Progress Checker.

From

Kyotaro HORIGUCHI

Date:

10 November 2015, 07:52:59

Hello, I have some random comments on this patch in addition to
Amit's comments.

- Type of the flag of vacuum activity.

ACTIVITY_IS_VACUUM is the only entry in the enum, and the
variable that stores it is named *flag. If you have no plan
to extend this information, it seems better to name this
variable something like pgstat_report_vacuum_running and to
make it a boolean.

- Type of st_progress_param and so.

The variable st_progress_param has a very generic name, but
looking at pg_stat_get_vacuum_progress, every element of it has
a definite role. If so, the variable should be a struct.

st_progress_param_float is currently totally useless.

- Definition of progress_message.

The definition of progress_message in lazy_scan_heap is "char
[PROGRESS_MESSAGE_LENGTH][N_PROGRESS_PARAM]" which looks to be
inverted. The following snprintf,

| snprintf(progress_message[0], PROGRESS_MESSAGE_LENGTH, "%s", schemaname);

certainly destroys the data already stored in it, if any.

- snprintf()

You are careful to use snprintf,

+    snprintf(progress_message[0], PROGRESS_MESSAGE_LENGTH, "%s", schemaname);
+    strcat(progress_message[0],".");
+    strcat(progress_message[0],relname);

but the strcats following ruin it.
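
Both points can be sketched together: rows sized PROGRESS_MESSAGE_LENGTH, and a single bounded snprintf() that writes schema and relation name in one call. PROGRESS_MESSAGE_LENGTH is given an assumed value of 20 here purely for illustration, and set_target_name is a hypothetical helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define N_PROGRESS_PARAM        10   /* value quoted in the thread */
#define PROGRESS_MESSAGE_LENGTH 20   /* assumed value, for illustration only */

/* One row per message, each row PROGRESS_MESSAGE_LENGTH bytes long. */
typedef char ProgressMessages[N_PROGRESS_PARAM][PROGRESS_MESSAGE_LENGTH];

/*
 * Write "schemaname.relname" into row 0 with a single bounded call.
 * Returns false if the name had to be truncated to fit the row.
 */
bool
set_target_name(ProgressMessages messages, const char *schemaname,
                const char *relname)
{
    int needed;

    assert(messages != NULL && schemaname != NULL && relname != NULL);

    /* sizeof(messages[0]) is the size of one row, not of the whole array. */
    needed = snprintf(messages[0], sizeof(messages[0]), "%s.%s",
                      schemaname, relname);
    return needed >= 0 && (size_t) needed < sizeof(messages[0]);
}
```

Because the whole string goes through one bounded call, a long relation name is truncated inside its own row instead of spilling into the neighbouring rows the way the trailing strcat() calls could.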


- Calculation of total_heap_pages in lazy_scan_heap.
The current code subtracts the number of blocks when skipping_all_visible_blocks is set in two places. But I think it
is enough to decrement when skipping.

I'll be happy if this can be of any help.

regards,


At Tue, 10 Nov 2015 14:44:23 +0900, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote in
<56418437.5080003@lab.ntt.co.jp>
> Thanks for the v6. A few quick comments:
> 
> - duplicate_oids error in HEAD.
> 
> - a compiler warning:
> 
> pgstat.c:2898: warning: no previous prototype for ‘pgstat_reset_activityflag’
> 
> To fix that use void for empty parameter list -
> 
> -extern void pgstat_reset_activityflag();
> +extern void pgstat_reset_activityflag(void);
> 
> One more change you could do is 's/activityflag/activity_flag/g', which I
> guess is a naming related guideline in place.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: [PROPOSAL] VACUUM Progress Checker.

From

Amit Langote

Date:

10 November 2015, 08:02:39

On 2015/10/29 23:22, Syed, Rahila wrote:
> Please find attached an updated patch.

A few more comments on v6:

>      relname = RelationGetRelationName(onerel);
> +    schemaname = get_namespace_name(RelationGetNamespace(onerel));
>      ereport(elevel,
>              (errmsg("vacuuming \"%s.%s\"",
>                      get_namespace_name(RelationGetNamespace(onerel)),
>                      relname)));
> +    snprintf(progress_message[0], PROGRESS_MESSAGE_LENGTH, "%s", schemaname);
> +    strcat(progress_message[0],".");
> +    strcat(progress_message[0],relname);

How about the following instead -

+ snprintf(progress_message[0], PROGRESS_MESSAGE_LENGTH, "%s",
+                     generate_relation_name(onerel));

>      if (next_not_all_visible_block >= SKIP_PAGES_THRESHOLD)
> +    {
>          skipping_all_visible_blocks = true;
> +        if(!scan_all)
> +            total_heap_pages = total_heap_pages - next_not_all_visible_block;
> +    }
>      else
>          skipping_all_visible_blocks = false;

...

>               */
>              if (next_not_all_visible_block - blkno > SKIP_PAGES_THRESHOLD)
> +            {
>                  skipping_all_visible_blocks = true;
> +                if(!scan_all)
> +                    total_heap_pages = total_heap_pages - (next_not_all_visible_block - blkno);
> +            }

Fujii-san's review comment about these code blocks does not seem to have
been addressed. He suggested keeping total_heap_pages fixed while adding the
number of skipped pages to that of scanned pages. For that, why not add a
scanned_heap_pages variable instead of using vacrelstats->scanned_pages?

> +        if (has_privs_of_role(GetUserId(), beentry->st_userid))
> +        {
> +            values[2] = UInt32GetDatum(beentry->st_progress_param[0]);
> +            values[3] = UInt32GetDatum(beentry->st_progress_param[1]);
> +            values[4] = UInt32GetDatum(beentry->st_progress_param[2]);
> +            values[5] = UInt32GetDatum(beentry->st_progress_param[3]);
> +            values[6] = UInt32GetDatum(total_pages);
> +            values[7] = UInt32GetDatum(scanned_pages);
> +
> +            if (total_pages != 0)
> +                values[8] = Float8GetDatum(scanned_pages * 100 / total_pages);
> +            else
> +                nulls[8] = true;
> +        }
> +        else
> +        {
> +            values[2] = CStringGetTextDatum("<insufficient privilege>");
> +            nulls[3] = true;
> +            nulls[4] = true;
> +            nulls[5] = true;
> +            nulls[6] = true;
> +            nulls[7] = true;
> +            nulls[8] = true;
> +        }

This is most likely not correct, that is, putting a text datum into a
supposedly int4 column. I see this when I switch to an unprivileged user:

pgbench=# \x
pgbench=# \c - other
pgbench=> SELECT * FROM pg_stat_vacuum_progress;
-[ RECORD 1 ]-------+------------------------
pid                 | 20395
table_name          | public.pgbench_accounts
total_heap_pages    | 44895488
scanned_heap_pages  |
total_index_pages   |
scanned_index_pages |
total_pages         |
scanned_pages       |
percent_complete    |

I'm not sure if applying the privilege check for columns of
pg_stat_vacuum_progress is necessary, but I may be wrong.

Thanks,
Amit

Re: [PROPOSAL] VACUUM Progress Checker.

From

Amit Langote

Date:

14 November 2015, 08:05:33

On Tue, Nov 10, 2015 at 5:02 PM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
> On 2015/10/29 23:22, Syed, Rahila wrote:
> How about the following instead -
>
> + snprintf(progress_message[0], PROGRESS_MESSAGE_LENGTH, "%s",
> +                                       generate_relation_name(onerel));

That was a useless suggestion, sorry. It still could be rewritten as -

+ snprintf(progress_message[0], PROGRESS_MESSAGE_LENGTH, "%s.%s",
+                          get_namespace_name(RelationGetNamespace(rel)),
+                          relname);

Thanks,
Amit

Re: [PROPOSAL] VACUUM Progress Checker.

From

Amit Langote

Date:

19 November 2015, 07:19:28

On 2015/11/10 17:02, Amit Langote wrote:
> On 2015/10/29 23:22, Syed, Rahila wrote:
>> Please find attached an updated patch.
> 
> A few more comments on v6:

I backed up a little and studied the proposal and the patch in a little
more detail. Here are still more comments -

Going through the thread, it seems the following problems are being solved -

1) General purpose interface for (maintenance?) commands to report a set
of internal values of different internal types using shared memory as IPC;
values are exposed to users as is and/or some derived values like
percent_done using view/functions

2) For a start, instrumenting lazy vacuum to report such internal values

And maybe,

3) Estimating the amount of work to be done and time required based on
historical statistics like n_dead_tup, visibility map and run-time
resources available like maintenance_work_mem

Latest version of the patch (v6) implements 1 and 2. The code is starting
to look good though see some comments below.

* Regarding (2): Some random thoughts on the patch and in general -

For lazy vacuum, lazy_scan_heap() seems like the best place which can
provide granular progress report in terms of the heap block number (of
total number of heap blocks in the relation) currently undergoing
per-block pass 1 processing. About pass 2, ie, lazy_index_vacuum() and
lazy_vacuum_heap(), I don't see how we can do better than reporting its
progress only after finishing all of it without any finer-grained
instrumentation. They are essentially a black box as far as the proposed
instrumentation approach is concerned. Being able to report progress per
index seems good but as a whole, a user would have to wait arbitrarily
long before numbers move forward. We might as well just report a bool
saying we're about to enter a potentially time-consuming index vacuum
round with possibly multiple indexes followed by lazy_vacuum_heap()
processing. Additionally, we can report the incremented count of the
vacuuming round (pass 2) once we are through it. So, we'd report two
values viz. waiting_vacuum_pass (bool) and num_vacuum_pass (int). The
former is reported twice - 'true' as we are about to begin the round and
'false' once done. We can keep the total_index_pages (counting all
indexes) and index_pages_done as the patch currently reports. The latter
moves forward for every index we finish processing, and also should be
reset for every pass 2 round. Note that we can leave them out of
percent_done of overall vacuum progress. Until we have a good solution for
number (3) above, it seems too difficult to incorporate index pages into
overall progress.

As someone pointed out upthread, the final heap truncate phase can take
arbitrarily long and is outside the scope of lazy_scan_heap() to
instrument. Perhaps a bool, say, waiting_heap_trunc could be reported for
the same. Note that, it would have to be reported from lazy_vacuum_rel().

I spotted a potential oversight regarding report of scanned_pages. It
seems pages that are skipped because of not getting a pin, being new,
being empty could be left out of the progress equation.

* Regarding (1): These are mostly code comments -

IMHO, float progress parameters (st_progress_param_float[]) can be taken
out. They are currently unused and it's unlikely that some command would
want to report them. OTOH, as suggested in above paragraph, why not have
bool parameters? In addition to a few I mentioned in the context of lazy
vacuum instrumentation, it seems likely that they would be useful for
other commands, too.

Instead of st_activity_flag, how about st_command and calling
ACTIVITY_IS_VACUUM, say, COMMAND_LAZY_VACUUM?
pgstat_report_activity_flag() then would become pgstat_report_command().

Like floats, I would think we could take out st_progress_message[][]. I
see that it is currently used to report table name. For that, we might as
well add a single st_command_target[NAMEDATALEN] string which is set at
the beginning of command processing using, say,
pgstat_report_command_target(). It stores the name of relation/object that
the command is going to work on.

Maybe, we don't need each command to proactively pgstat_reset_command().
That would be similar to how st_activity is not proactively cleared but is
rather reset by the next query/command or when some other backend uses the
shared memory slot. Also, we could have a SQL function
pg_stat_reset_local_progress() which clears the st_command after which the
backend is no longer shown in the progress view.

I think it would be better to report only changed parameter arrays when
performing pgstat_report_progress(). So, if we have following shared
memory progress parameters and the reporting function signature:

typedef struct PgBackendStatus
{
    ...
    uint16    st_command;
    char      st_command_target[NAMEDATALEN];
    uint32    st_progress_uint32_param[N_PROGRESS_PARAM];
    bool      st_progress_bool_param[N_PROGRESS_PARAM];
} PgBackendStatus;

void pgstat_report_progress(uint32 *uint32_param, int num_uint32_param,
                            bool *bool_param, int num_bool_param);

and if we need to report a bool parameter change, say, waiting_vacuum_pass
in lazy_scan_heap(), we do -

pgstat_report_progress(NULL, 0, progress_bool_param, 2);

That is, no need for pgstat_report_progress() to overwrite the shared
st_progress_uint32_param if none of its members have changed since the
last report.
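
A minimal sketch of the "report only what changed" idea, assuming a simplified stand-in for the progress fields of PgBackendStatus. ProgressSlot, report_progress() and the value of N_PROGRESS_PARAM are illustrative names and numbers, not taken from the patch, and the locking or change-count protocol that real shared-memory updates would need is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define N_PROGRESS_PARAM 10   /* assumed slot count, not taken from the patch */

/* Simplified stand-in for the progress part of PgBackendStatus. */
typedef struct ProgressSlot
{
    uint32_t uint32_param[N_PROGRESS_PARAM];
    bool     bool_param[N_PROGRESS_PARAM];
} ProgressSlot;

/*
 * Copy only the arrays the caller passes in. A NULL pointer (or a zero
 * count) means "unchanged since the last report", so that array in the
 * slot is left alone.
 */
static void
report_progress(ProgressSlot *slot,
                const uint32_t *uint32_param, int num_uint32_param,
                const bool *bool_param, int num_bool_param)
{
    assert(num_uint32_param >= 0 && num_uint32_param <= N_PROGRESS_PARAM);
    assert(num_bool_param >= 0 && num_bool_param <= N_PROGRESS_PARAM);

    if (uint32_param != NULL && num_uint32_param > 0)
        memcpy(slot->uint32_param, uint32_param,
               num_uint32_param * sizeof(uint32_t));

    if (bool_param != NULL && num_bool_param > 0)
        memcpy(slot->bool_param, bool_param,
               num_bool_param * sizeof(bool));
}
```

With this shape, a call such as report_progress(slot, NULL, 0, flags, 2) touches only the bool array, which mirrors the pgstat_report_progress(NULL, 0, progress_bool_param, 2) call above.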

Currently, ACTIVITY_IS_VACUUM is reported even for VACOPT_ANALYZE and
VACOPT_FULL commands. They are not covered by lazy_scan_heap(), so such
commands are needlessly shown in the progress view with 0s in most of the
fields.

Regarding pg_stat_get_vacuum_progress(): I think a backend can simply be
skipped if (!has_privs_of_role(GetUserId(), beentry->st_userid)) (cannot
put that in plain English, :))

Please add documentation for the newly added view and SQL functions, if any.

I'm marking this as "Waiting on author" in the commitfest app. Also, let's
hear from more people.

Thanks,
Amit

Re: [PROPOSAL] VACUUM Progress Checker.

From

Amit Langote

Date:

19 November 2015, 07:30:28

On 2015/11/19 16:18, Amit Langote wrote:
> I'm marking this as "Waiting on author" in the commitfest app. Also, let's
> hear from more people.

Well, it seems Michael beat me to it.

Thanks,
Amit

Re: [PROPOSAL] VACUUM Progress Checker.

From

Michael Paquier

Date:

19 November 2015, 07:39:20

On Thu, Nov 19, 2015 at 4:30 PM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
> On 2015/11/19 16:18, Amit Langote wrote:
>> I'm marking this as "Waiting on author" in the commitfest app. Also, let's
>> hear from more people.
>
> Well, it seems Michael beat me to it.

Yeah, some other folks provided feedback that's why I did it.
@Rahila: are you planning more work on this patch? It seems that at
this point we're waiting for you. If not, and because I have a couple
of times waited for long VACUUM jobs to finish on some relations
without having much information about their progress, I would be fine
to spend time hacking this thing. That's up to you.
Regards,
-- 
Michael

Re: [PROPOSAL] VACUUM Progress Checker.

From

Rahila Syed

Date:

19 November 2015, 10:38:20

Hello Michael,

I am planning to continue contributing to this feature in any way, be it by reviewing the patch or writing one. I haven't been able to reply to the comments or post an updated patch lately, but I plan to do so soon.

Thank you,

Rahila

Re: [PROPOSAL] VACUUM Progress Checker.

From

Jim Nasby

Date:

19 November 2015, 15:58:02

On 11/19/15 1:18 AM, Amit Langote wrote:
> 1) General purpose interface for (maintenance?) commands to report a set

I'm surprised no one has picked up on using this for DML. Certainly
anyone who works with ETL processes would love to be able to get some clue
on the status of a long running query...

> About pass 2, ie, lazy_index_vacuum() and
> lazy_vacuum_heap(), I don't see how we can do better than reporting its
> progress only after finishing all of it without any finer-grained
> instrumentation. They are essentially black-box as far as the proposed
> instrumentation approach is concerned. Being able to report progress per
> index seems good but as a whole, a user would have to wait arbitrarily
> long before numbers move forward. We might as well just report a bool
> saying we're about to enter a potentially time-consuming index vacuum
> round with possibly multiple indexes followed by lazy_vacuum_heap()
> processing. Additionally, we can report the incremented count of the
> vacuuming round (pass 2) once we are through it.

Another option is to provide the means for the index scan routines to 
report their progress. Maybe every index AM won't use it, but it'd 
certainly be a lot better than staring at a long_running boolean.

> Note that we can leave them out of
> percent_done of overall vacuum progress. Until we have a good solution for
> number (3) above, it seems too difficult to incorporate index pages into
> overall progress.

IMHO we need to either put a big caution sign on any % estimate that it 
could be wildly off, or just forgo it completely for now. I'll bet that 
if we don't provide it some enterprising users will figure out the best 
way to do this (similar to how the bloat estimate query has evolved over 
time).

Even if we never get a % done indicator, just being able to see what 
'position' a command is at will be very valuable.

> As someone pointed out upthread, the final heap truncate phase can take
> arbitrarily long and is outside the scope of lazy_scan_heap() to
> instrument. Perhaps a bool, say, waiting_heap_trunc could be reported for
> the same. Note that, it would have to be reported from lazy_vacuum_rel().

ISTM this is similar to the problem of reporting index status, namely 
that a progress reporting method needs to accept reports from multiple 
places in the code.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com

Re: [PROPOSAL] VACUUM Progress Checker.

From

Amit Langote

Date:

20 November 2015, 01:30:22

On 2015/11/20 0:57, Jim Nasby wrote:
> On 11/19/15 1:18 AM, Amit Langote wrote:
>> 1) General purpose interface for (maintenance?) commands to report a set
> 
> I'm surprised no one has picked up on using this for DML. Certainly anyone
> who works with ETL processes would love to be able to get some clue on the
> status of a long running query...

Instrumenting query execution for progress info would be a complex beast
though. What kind of reporting interface it would require is also not
clear, at least to me. Jan Urbanski's PGCon presentation[1], which I
discovered in this thread, is a good source on the matter, thanks! But
IMHO, for now, it would be worthwhile to focus our resources on the modest
goal of implementing a reporting interface for utility commands. Sure, it
would be nice to investigate how much the requirements of the two overlap.

> 
>> About pass 2, ie, lazy_index_vacuum() and
>> lazy_vacuum_heap(), I don't see how we can do better than reporting its
>> progress only after finishing all of it without any finer-grained
>> instrumentation. They are essentially black-box as far as the proposed
>> instrumentation approach is concerned. Being able to report progress per
>> index seems good but as a whole, a user would have to wait arbitrarily
>> long before numbers move forward. We might as well just report a bool
>> saying we're about to enter a potentially time-consuming index vacuum
>> round with possibly multiple indexes followed by lazy_vacuum_heap()
>> processing. Additionally, we can report the incremented count of the
>> vacuuming round (pass 2) once we are through it.
> 
> Another option is to provide the means for the index scan routines to
> report their progress. Maybe every index AM won't use it, but it'd
> certainly be a lot better than staring at a long_running boolean.

The boolean would be a workaround for sure. I'm also slightly tempted by
the idea of instrumenting vacuum scans of individual index AM's bulkdelete
methods. One precedent is how vacuum_delay_point() are sprinkled around in
the code. Another problem to solve would be to figure out how to pass
progress parameters around - via some struct or could they be globals just
like VacuumCost* variables are...

> 
>> Note that we can leave them out of
>> percent_done of overall vacuum progress. Until we have a good solution for
>> number (3) above, it seems too difficult to incorporate index pages into
>> overall progress.
> 
> IMHO we need to either put a big caution sign on any % estimate that it
> could be wildly off, or just forgo it completely for now. I'll bet that if
> we don't provide it some enterprising users will figure out the best way
> to do this (similar to how the bloat estimate query has evolved over time).
> 
> Even if we never get a % done indicator, just being able to see what
> 'position' a command is at will be very valuable.

Agreed. If we provide enough information in whatever view we choose to
expose, that would be a good start.

> 
>> As someone pointed out upthread, the final heap truncate phase can take
>> arbitrarily long and is outside the scope of lazy_scan_heap() to
>> instrument. Perhaps a bool, say, waiting_heap_trunc could be reported for
>> the same. Note that, it would have to be reported from lazy_vacuum_rel().
> 
> ISTM this is similar to the problem of reporting index status, namely that
> a progress reporting method needs to accept reports from multiple places
> in the code.

Yes.

Thanks,
Amit

[1] http://www.pgcon.org/2013/schedule/events/576.en.html

Re: [PROPOSAL] VACUUM Progress Checker.

From

Robert Haas

Date:

20 November 2015, 20:46:25

On Thu, Nov 19, 2015 at 2:18 AM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
> As someone pointed out upthread, the final heap truncate phase can take
> arbitrarily long and is outside the scope of lazy_scan_heap() to
> instrument. Perhaps a bool, say, waiting_heap_trunc could be reported for
> the same. Note that, it would have to be reported from lazy_vacuum_rel().

I don't think reporting booleans is a very good idea.  It's better to
report that some other way, like use one of the strings to report a
"phase" of processing that we're currently performing.

> IMHO, float progress parameters (st_progress_param_float[]) can be taken
> out. They are currently unused and it's unlikely that some command would
> want to report them.

If they are not used, they shouldn't be included in this patch, but we
should be open to adding them later if it proves useful.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [PROPOSAL] VACUUM Progress Checker.

From

Jim Nasby

Date:

21 November 2015, 05:38:55

On 11/19/15 7:29 PM, Amit Langote wrote:
>> Another option is to provide the means for the index scan routines to
>> report their progress. Maybe every index AM won't use it, but it'd
>> certainly be a lot better than staring at a long_running boolean.
> The boolean would be a workaround for sure. I'm also slightly tempted by
> the idea of instrumenting vacuum scans of individual index AM's bulkdelete
> methods. One precedent is how vacuum_delay_point() are sprinkled around in
> the code. Another problem to solve would be to figure out how to pass
> progress parameters around - via some struct or could they be globals just
> like VacuumCost* variables are...

It just occurred to me that we could do the instrumentation in 
lazy_tid_reaped(). It might seem bad to do an increment for every tuple 
in an index, but we're already doing a bsearch over the dead tuple list. 
Presumably that's going to be a lot more expensive than an increment 
operation.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com

Re: [PROPOSAL] VACUUM Progress Checker.

From

Robert Haas

Date:

23 November 2015, 17:44:00

On Sat, Nov 21, 2015 at 12:38 AM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
> On 11/19/15 7:29 PM, Amit Langote wrote:
>>>
>>> Another option is to provide the means for the index scan routines to
>>> report their progress. Maybe every index AM won't use it, but it'd
>>> certainly be a lot better than staring at a long_running boolean.
>>
>> The boolean would be a workaround for sure. I'm also slightly tempted by
>> the idea of instrumenting vacuum scans of individual index AM's bulkdelete
>> methods. One precedent is how vacuum_delay_point() are sprinkled around in
>> the code. Another problem to solve would be to figure out how to pass
>> progress parameters around - via some struct or could they be globals just
>> like VacuumCost* variables are...
>
> It just occurred to me that we could do the instrumentation in
> lazy_tid_reaped(). It might seem bad to do an increment for every tuple in
> an index, but we're already doing a bsearch over the dead tuple list.
> Presumably that's going to be a lot more expensive than an increment
> operation.

I think the cost of doing an increment there would be negligible.  I'm
not quite sure whether that's the right place to instrument - though
it looks like it might be - but I think the cost of ++something in
that function isn't gonna be a problem at all.
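
To make the cost argument concrete, here is a minimal sketch of a lazy_tid_reaped()-style lookup with a counter added. Tid, DeadTuples, tid_cmp() and the index_tuples_checked field are simplified stand-ins invented for illustration, not the backend's actual types; the point is only that the increment sits next to a bsearch() that is already paid for on every index tuple.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified stand-in for a heap tuple identifier: (block, offset). */
typedef struct Tid
{
    uint32_t block;
    uint16_t offset;
} Tid;

typedef struct DeadTuples
{
    const Tid *tids;                  /* sorted by (block, offset) */
    size_t     ntids;
    uint64_t   index_tuples_checked;  /* hypothetical progress counter */
} DeadTuples;

/* Order TIDs by block number first, then by offset within the block. */
static int
tid_cmp(const void *a, const void *b)
{
    const Tid *l = a;
    const Tid *r = b;

    if (l->block != r->block)
        return (l->block < r->block) ? -1 : 1;
    if (l->offset != r->offset)
        return (l->offset < r->offset) ? -1 : 1;
    return 0;
}

/*
 * Called once per index tuple: is this TID in the dead-tuple list?
 * The counter bump is a single add next to an O(log n) search.
 */
static bool
tid_reaped(const Tid *tid, DeadTuples *dead)
{
    dead->index_tuples_checked++;
    return bsearch(tid, dead->tids, dead->ntids,
                   sizeof(Tid), tid_cmp) != NULL;
}
```

The counter here is local to the struct; how often such a value would be copied out to shared memory is a separate question that this sketch does not address.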

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [PROPOSAL] VACUUM Progress Checker.

From

Amit Langote

Date:

24 November 2015, 08:03:24

On 2015/11/21 14:38, Jim Nasby wrote:
> On 11/19/15 7:29 PM, Amit Langote wrote:
>>> Another option is to provide the means for the index scan routines to
>>> report their progress. Maybe every index AM won't use it, but it'd
>>> certainly be a lot better than staring at a long_running boolean.
>> The boolean would be a workaround for sure. I'm also slightly tempted by
>> the idea of instrumenting vacuum scans of individual index AM's bulkdelete
>> methods. One precedent is how vacuum_delay_point() are sprinkled around in
>> the code. Another problem to solve would be to figure out how to pass
>> progress parameters around - via some struct or could they be globals just
>> like VacuumCost* variables are...
> 
> It just occurred to me that we could do the instrumentation in
> lazy_tid_reaped(). It might seem bad to do an increment for every tuple in
> an index, but we're already doing a bsearch over the dead tuple list.
> Presumably that's going to be a lot more expensive than an increment
> operation.

Just to clarify, does this mean we report index vacuum progress in terms
of index items processed (not pages)? If so, how do we get total number of
index items to process (presumably across all indexes) for a given phase 2
round? As a context, we'd report phase 1 progress in terms of heap pages
processed of total heap pages.

Thanks,
Amit

Re: [PROPOSAL] VACUUM Progress Checker.

From

Amit Langote

Date:

24 November 2015, 09:06:35

On 2015/11/21 5:46, Robert Haas wrote:
> On Thu, Nov 19, 2015 at 2:18 AM, Amit Langote
> <Langote_Amit_f8@lab.ntt.co.jp> wrote:
>> As someone pointed out upthread, the final heap truncate phase can take
>> arbitrarily long and is outside the scope of lazy_scan_heap() to
>> instrument. Perhaps a bool, say, waiting_heap_trunc could be reported for
>> the same. Note that, it would have to be reported from lazy_vacuum_rel().
> 
> I don't think reporting booleans is a very good idea.  It's better to
> report that some other way, like use one of the strings to report a
> "phase" of processing that we're currently performing.

Yeah, that might be better. One possible downside of booleans I didn't
foresee is that too many of them might clutter the progress view. What
would've been the names of boolean columns in the progress view are better
reported as strings as the value of a single column, as you seem to suggest.
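
A small sketch of what a single "phase" string could look like in place of several booleans. The enum, the function names and every phase label except "Scanning Heap" (which appears in sample output later in this thread) are hypothetical, and PROGRESS_MESSAGE_LENGTH is assumed to be 30 here.

```c
#include <stdio.h>

#define PROGRESS_MESSAGE_LENGTH 30  /* assumed width of one message slot */

/* Hypothetical phase list; only "Scanning Heap" is taken from the thread. */
typedef enum VacuumPhase
{
    PHASE_SCANNING_HEAP,
    PHASE_VACUUMING_INDEXES,
    PHASE_VACUUMING_HEAP,
    PHASE_TRUNCATING_HEAP
} VacuumPhase;

/* Map a phase to the text that would be shown in the view's phase column. */
static const char *
vacuum_phase_name(VacuumPhase phase)
{
    switch (phase)
    {
        case PHASE_SCANNING_HEAP:     return "Scanning Heap";
        case PHASE_VACUUMING_INDEXES: return "Vacuuming Indexes";
        case PHASE_VACUUMING_HEAP:    return "Vacuuming Heap";
        case PHASE_TRUNCATING_HEAP:   return "Truncating Heap";
    }
    return "Unknown";
}

/* Write the phase name into one fixed-width message slot. */
static void
report_phase(char *slot, VacuumPhase phase)
{
    snprintf(slot, PROGRESS_MESSAGE_LENGTH, "%s", vacuum_phase_name(phase));
}
```

One column then covers what would otherwise be waiting_vacuum_pass, waiting_heap_trunc and similar flags, and a new phase costs no new column.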

> 
>> IMHO, float progress parameters (st_progress_param_float[]) can be taken
>> out. They are currently unused and it's unlikely that some command would
>> want to report them.
> 
> If they are not used, they shouldn't be included in this patch, but we
> should be open to adding them later if it proves useful.

Certainly.

Thanks,
Amit

Re: [PROPOSAL] VACUUM Progress Checker.

From

Jim Nasby

Date:

25 November 2015, 00:33:07

On 11/24/15 2:02 AM, Amit Langote wrote:
>> It just occurred to me that we could do the instrumentation in
>> lazy_tid_reaped(). It might seem bad to do an increment for every tuple in
>> an index, but we're already doing a bsearch over the dead tuple list.
>> Presumably that's going to be a lot more expensive than an increment
>> operation.
> Just to clarify, does this mean we report index vacuum progress in terms
> of index items processed (not pages)? If so, how do we get total number of
> index items to process (presumably across all indexes) for a given phase 2
> round? As a context, we'd report phase 1 progress in terms of heap pages
> processed of total heap pages.

You'd get it from pg_class.reltuples for each index. Since all index 
vacuuming is done strictly on a per-index-tuple basis, that's probably 
the most accurate way to do it anyway.

Also, while it might be interesting to look at the total number of index 
tuples, I think it's probably best to always report on a per-index 
basis, as well as which index is being processed. I suspect there could 
be a very large variance of tuple processing speed for different index 
types. Eventually it might be worth it to allow index AMs to provide 
their own vacuuming feedback, but I think that's way out of scope for 
this patch. :)
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com

Re: [PROPOSAL] VACUUM Progress Checker.

From

Amit Langote

Date:

25 November 2015, 01:02:43

On 2015/11/25 9:32, Jim Nasby wrote:
> On 11/24/15 2:02 AM, Amit Langote wrote:
>> Just to clarify, does this mean we report index vacuum progress in terms
>> of index items processed (not pages)? If so, how do we get total number of
>> index items to process (presumably across all indexes) for a given phase 2
>> round? As a context, we'd report phase 1 progress in terms of heap pages
>> processed of total heap pages.
> 
> You'd get it from pg_class.reltuples for each index. Since all index
> vacuuming is done strictly on a per-index-tuple basis, that's probably the
> most accurate way to do it anyway.

Important to remember though that the reltuples would be latest as of the
last VACUUM/ANALYZE.

> Also, while it might be interesting to look at the total number of index
> tuples, I think it's probably best to always report on a per-index basis,
> as well as which index is being processed. I suspect there could be a very
> large variance of tuple processing speed for different index types.
> Eventually it might be worth it to allow index AMs to provide their own
> vacuuming feedback, but I think that's way out of scope for this patch. :)

Agreed.

Thanks,
Amit

Re: [PROPOSAL] VACUUM Progress Checker.

From

Jim Nasby

Date:

25 November 2015, 01:12:48

On 11/24/15 7:02 PM, Amit Langote wrote:
>> You'd get it from pg_class.reltuples for each index. Since all index
>> vacuuming is done strictly on a per-index-tuple basis, that's probably the
>> most accurate way to do it anyway.
> Important to remember though that the reltuples would be latest as of the
> last VACUUM/ANALYZE.

True, but in cases where you care about monitoring a vacuum I suspect 
it'll be close enough.

Might be worth a little extra effort to handle the 0 case though. If you 
really wanted to get fancy you could see how the current heap 
tuples/page count compares to reltuples/relpages from pg_class for the 
heap... but I suspect that's pretty serious overkill.
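
A sketch of the per-index estimate with the zero case handled, assuming the caller has already fetched reltuples for the index. index_percent_done() and its -1 "no estimate" convention are made up for illustration. Because reltuples is only as fresh as the last VACUUM/ANALYZE, the result is clamped rather than allowed to run past 100.

```c
#include <stdint.h>

/*
 * Estimate how far through an index the vacuum is, in percent.
 * Returns -1 when reltuples gives nothing to divide by (for example a
 * table that has never been vacuumed or analyzed).
 */
static int
index_percent_done(double reltuples, uint64_t tuples_checked)
{
    double pct;

    if (reltuples <= 0)
        return -1;              /* no usable estimate */

    pct = 100.0 * (double) tuples_checked / reltuples;
    if (pct > 100.0)
        pct = 100.0;            /* stale reltuples: index grew since then */

    return (int) pct;
}
```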
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com

Re: [PROPOSAL] VACUUM Progress Checker.

From

Rahila Syed

Date:

30 November 2015, 12:49:58

Hello,

Thank you for your comments.

Please find attached a patch addressing the following comments:

>- duplicate_oids error in HEAD.

Check.

>- a compiler warning:
>pgstat.c:2898: warning: no previous prototype for ‘pgstat_reset_activityflag’

Check.

>One more change you could do is 's/activityflag/activity_flag/g',

Check.

>Type of the flag of vacuum activity.

The flag variable is an integer to incorporate more commands in future.

>Type of st_progress_param and so.

st_progress_param is also given a generic name to incorporate different parameters reported from various commands.

>st_progress_param_float is currently totally useless.

Float parameter has currently been removed from the patch.

>Definition of progress_message.
>The definition of progress_message in lazy_scan_heap is "char
>[PROGRESS_MESSAGE_LENGTH][N_PROGRESS_PARAM]" which looks to be
>inversed.

Corrected.

>The current code subtracts the number of blocks when
>skipping_all_visible_blocks is set in two places. But I think
>it is enough to decrement when skipping.

In both the places, the pages are being skipped hence the total count was decremented.

>He suggested to keep total_heap_pages fixed while adding number
>of skipped pages to that of scanned pages.

This has been done in the attached.

>snprintf(progress_message[0], PROGRESS_MESSAGE_LENGTH, "%s.%s",
> get_namespace_name(RelationGetNamespace(rel)),
> relname);

Check.

The previous implementation added the total number of pages across all indexes of a relation to total_index_pages in every scan of the indexes, to account for total pages scanned. Thus, it was equal to number of scans * total_index_pages.

In the attached patch, total_index_pages reflects the total number of pages across all indexes of a relation.

A column reporting passes through the indexes (phase 2) has been added to account for the number of passes of index and heap vacuuming.

The number of scanned index pages is reset at the end of each pass.

This makes the reporting clearer.

The percent complete does not account for index pages. It just denotes percentage of heap scanned.
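
As a rough sketch of the heap-only calculation described above: the total stays fixed, and every page that is dealt with, whether scanned or skipped, advances the same counter. HeapProgress and the two functions are illustrative names, not the patch's code, and treating an empty table as 100% is an assumption.

```c
#include <stdint.h>

typedef struct HeapProgress
{
    uint32_t total_heap_pages;    /* fixed at the start of the scan */
    uint32_t scanned_heap_pages;  /* scanned plus skipped pages */
} HeapProgress;

/* Call once per heap page, whether it was scanned or skipped. */
static void
heap_page_done(HeapProgress *p)
{
    if (p->scanned_heap_pages < p->total_heap_pages)
        p->scanned_heap_pages++;
}

/* Integer percentage of the heap covered so far. */
static int
heap_percent_complete(const HeapProgress *p)
{
    if (p->total_heap_pages == 0)
        return 100;             /* assumption: an empty table is "done" */

    /* Widen before multiplying so large tables do not overflow. */
    return (int) ((uint64_t) p->scanned_heap_pages * 100 /
                  p->total_heap_pages);
}
```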

>Spotted a potential oversight regarding report of scanned_pages. It
>seems pages that are skipped because of not getting a pin, being new,
>being empty could be left out of the progress equation.

Corrected.

>It's better to
>report that some other way, like use one of the strings to report a
>"phase" of processing that we're currently performing.

This has been included in the attached.

Some more comments remain to be addressed: the name change of the activity flag, reporting only changed parameters to shared memory, the ACTIVITY_IS_VACUUM flag being set unnecessarily for ANALYZE and FULL commands, and documentation for the new view.

Also, finer-grained reporting from indexes and the heap truncate phase is yet to be incorporated into the patch.

Thank you,

Rahila Syed

Attachment

Vacuum_progress_checker_v7.patch

Re: [PROPOSAL] VACUUM Progress Checker.

From

Vinayak

Date:

01 December 2015, 02:10:50

Thanks for the v7.
Please check the comment below.
-Table name in the vacuum progress

+ snprintf(progress_message[0], PROGRESS_MESSAGE_LENGTH, "%s.%s",
schemaname,relname);

In the vacuum progress, column table_name is showing first 30 characters of
table name.
postgres=# create table test_vacuum_progress_in_postgresql(c1 int,c2 text);
postgres=# select * from pg_stat_vacuum_progress ;
-[ RECORD 1 ]-------+------------------------------
pid                 | 12284
table_name          | public.test_vacuum_progress_i
phase               | Scanning Heap
total_heap_pages    | 41667
scanned_heap_pages  | 25185
percent_complete    | 60
total_index_pages   | 0
scanned_index_pages | 0
index_scan_count    | 0
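
The truncation above can be reproduced with a minimal sketch, assuming PROGRESS_MESSAGE_LENGTH is 30 (inferred from the output, which shows 29 characters; the patch's actual value is not quoted in this thread). format_table_name() is a hypothetical helper that also reports whether snprintf() had to cut the name.

```c
#include <stdbool.h>
#include <stdio.h>

#define PROGRESS_MESSAGE_LENGTH 30  /* assumed; inferred from the output above */

/*
 * Format "schema.relname" into dst. snprintf() returns the length the
 * full string would have had, so a result >= dstlen means it was cut.
 * Returns true if the name was truncated.
 */
static bool
format_table_name(char *dst, size_t dstlen,
                  const char *schemaname, const char *relname)
{
    int needed = snprintf(dst, dstlen, "%s.%s", schemaname, relname);

    return needed >= 0 && (size_t) needed >= dstlen;
}
```

With a 30-byte buffer, "public" and "test_vacuum_progress_in_postgresql" come out as "public.test_vacuum_progress_i", matching the table_name column shown.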




-----
Regards,
Vinayak,

Re: [PROPOSAL] VACUUM Progress Checker.

From

Kyotaro HORIGUCHI

Date:

01 December 2015, 07:26:16

Hello,

At Mon, 30 Nov 2015 19:10:44 -0700 (MST), Vinayak <vinpokale@gmail.com> wrote in
<1448935844520-5875614.post@n5.nabble.com>
> Thanks for the v7.
> Please check the comment below.
> -Table name in the vacuum progress
> 
> + snprintf(progress_message[0], PROGRESS_MESSAGE_LENGTH, "%s.%s",
> schemaname,relname);
> 
> In the vacuum progress, column table_name is showing first 30 characters of
> table name.

Yeah, it is actually restricted to that length. But if we allow
the buffer to store the whole qualified name, it will need 64 *
2 + 1 + 1 = 130 bytes * 10 = 1300 bytes for each beentry... It might
be acceptable to others, but I don't think that is preferable.

Separating namespace and relation name as below reduces the
required length of the field but 62 bytes is still too long for
most of the information and in turn too short for longer messages
in some cases.

As a more drastic change in design, since these fields are
written/read in sequential manner, providing one free buffer of
the size of.. so.. about 128 bytes for each beentry and storing
strings delimiting with '\0' and numbers in binary format, as an
example, would do. Additional functions to write into/read from
this buffer safely would be needed but this gives both the
ability to store longer messages and relatively short total
buffer size, and allows arbitrary number of parameters limited
only by the length of the free buffer.

What do you think about this?
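
One way to picture the single free buffer: strings are appended with their terminating '\0' and numbers as raw bytes, and the reader walks the buffer in the same order the writer used. This is only a sketch under those assumptions; PackedBuf and the pack/unpack function names are invented here, and the 128-byte size is the figure floated above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PROGRESS_BUFFER_SIZE 128  /* size suggested above, per beentry */

typedef struct PackedBuf
{
    char   data[PROGRESS_BUFFER_SIZE];
    size_t used;
} PackedBuf;

/* Append a string including its '\0'; fail if it does not fit. */
static bool
pack_string(PackedBuf *buf, const char *s)
{
    size_t len = strlen(s) + 1;

    if (len > sizeof(buf->data) - buf->used)
        return false;
    memcpy(buf->data + buf->used, s, len);
    buf->used += len;
    return true;
}

/* Append a number in binary form; fail if it does not fit. */
static bool
pack_uint32(PackedBuf *buf, uint32_t value)
{
    if (sizeof(value) > sizeof(buf->data) - buf->used)
        return false;
    memcpy(buf->data + buf->used, &value, sizeof(value));
    buf->used += sizeof(value);
    return true;
}

/* Read the next string at *pos and advance; NULL if none is there. */
static const char *
unpack_string(const PackedBuf *buf, size_t *pos)
{
    const char *start;
    const char *end;

    if (*pos >= buf->used)
        return NULL;
    start = buf->data + *pos;
    end = memchr(start, '\0', buf->used - *pos);
    if (end == NULL)
        return NULL;
    *pos += (size_t) (end - start) + 1;
    return start;
}

/* Read the next number at *pos and advance; false if out of data. */
static bool
unpack_uint32(const PackedBuf *buf, size_t *pos, uint32_t *value)
{
    if (*pos > buf->used || sizeof(*value) > buf->used - *pos)
        return false;
    memcpy(value, buf->data + *pos, sizeof(*value));
    *pos += sizeof(*value);
    return true;
}
```

The format carries no type tags, so reader and writer have to agree on the field order per command. That is the trade-off against fixed-length slots: less memory and room for longer strings, at the cost of a stricter access protocol.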


By the way, how about giving separate columns for relname and
namespace? I think it is the more usual way to designate a relation
in this kind of view, and it makes the snprintf to concatenate
name and schema unnecessary (it's not significant, though). (The
following example is modeled after pg_stat_all_tables.)

> postgres=# create table test_vacuum_progress_in_postgresql(c1 int,c2 text);
> postgres=# select * from pg_stat_vacuum_progress ;
> pid                 | 12284
> schemaname          | public
> relname             | test_vacuum_progress_i...
> phase               | Scanning Heap
> total_heap_pages    | 41667
...


And I have some comments about code.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: [PROPOSAL] VACUUM Progress Checker.

From

Kyotaro HORIGUCHI

Date:

01 December 2015, 07:32:47

Sorry for the confusing description and the chopped sentence.

At Tue, 01 Dec 2015 16:25:57 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote in
<20151201.162557.184519961.horiguchi.kyotaro@lab.ntt.co.jp>
> Hello,
> 
> At Mon, 30 Nov 2015 19:10:44 -0700 (MST), Vinayak <vinpokale@gmail.com> wrote in
> <1448935844520-5875614.post@n5.nabble.com>
> > Thanks for the v7.
> > Please check the comment below.
> > -Table name in the vacuum progress
> > 
> > + snprintf(progress_message[0], PROGRESS_MESSAGE_LENGTH, "%s.%s",
> > schemaname,relname);
> > 
> > In the vacuum progress, column table_name is showing first 30 characters of
> > table name.
> 
> Yeah, it is actually restricted to that length. But if we allow
> the buffer to store the whole qualified name, it will need 64 *
> 2 + 1 + 1 = 130 bytes * 10 = 1300 bytes for each beentry... It might
> be acceptable to others, but I don't think that is preferable.
> 
> Separating namespace and relation name as below reduces the
> required length of the field but 62 bytes is still too long for
> most of the information and in turn too short for longer messages
> in some cases.
> 
> As a more drastic change in design, since these fields are
> written/read in sequential manner, providing one free buffer of
> the size of.. so.. about 128 bytes for each beentry and storing
> strings delimiting with '\0' and numbers in binary format, as an
> example, would do. 

This would fail to make sense.. I suppose this can be called
'packed format', as opposed to fixed-length format. Sorry for
poor wording.

> Additional functions to write into/read from
> this buffer safely would be needed but this gives both the
> ability to store longer messages and relatively short total
> buffer size, and allows arbitrary number of parameters limited
> only by the length of the free buffer.
> 
> What do you think about this?
> 
> 
> By the way, how about giving separate columns for relname and
> namespace? I think it is more usual way to designate a relation
> in this kind of view and it makes the snprintf to concatenate
> name and schema unnecessary(it's not significant, though). (The
> following example is after pg_stat_all_tables)
> 
> > postgres=# create table test_vacuum_progress_in_postgresql(c1 int,c2 text);
> > postgres=# select * from pg_stat_vacuum_progress ;
> > pid                 | 12284
> > schemaname          | public
> > relname             | test_vacuum_progress_i...
> > phase               | Scanning Heap
> > total_heap_pages    | 41667
> ...
> 
> 
> And I have some comments about code.

This is just what I forgot to delete. I'll mention them later if
necessary.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: [PROPOSAL] VACUUM Progress Checker.

From

Amit Langote

Date:

01 December 2015, 08:12:31

On 2015/12/01 16:25, Kyotaro HORIGUCHI wrote:
> At Mon, 30 Nov 2015 19:10:44 -0700 (MST), Vinayak <vinpokale@gmail.com> wrote
>> Thanks for the v7.
>> Please check the comment below.
>> -Table name in the vacuum progress
>>
>> + snprintf(progress_message[0], PROGRESS_MESSAGE_LENGTH, "%s.%s",
>> schemaname,relname);
>>
>> In the vacuum progress, column table_name is showing first 30 characters of
>> table name.
> 
> Yeah, it is actually restricted to that length. But if we allow
> the buffer to store the whole qualified name, it will need 64 *
> 2 + 1 + 1 = 130 bytes * 10 = 1300 bytes for each beentry... It might
> be acceptable to others, but I don't think that is preferable.
> 
> Separating namespace and relation name as below reduces the
> required length of the field but 62 bytes is still too long for
> most of the information and in turn too short for longer messages
> in some cases.

As done in the patch, the table name is stored in one of the slots of
st_progress_message which has the width limit of PROGRESS_MESSAGE_LENGTH
bytes. Whereas users of pgstat_report_progress interface could make sure
that strings of their choosing to be stored in st_progress_param slots are
within the PROGRESS_MESSAGE_LENGTH limit, the same cannot be ensured for
the table name. Maybe the table name is a different enough kind of
information from the other reported parameters that it could be treated
specially. How about a separate st_* member, say,
st_command_target[2*NAMEDATALEN+1] for the table name? It would be reported
using a separate interface, say, pgstat_report_command_target(), once the
name is determined. Moreover,
subsequent pgstat_report_progress() invocations need not copy the table
name needlessly as part of copying argument values to st_progress_param
(which is a separate suggestion in its own right though).
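
A sketch of the dedicated target field, assuming the default NAMEDATALEN of 64 (identifiers of at most 63 bytes). CommandStatus and report_command_target() are stand-ins for the proposed st_command_target member and pgstat_report_command_target(), neither of which exists in the patch as posted. With 2*NAMEDATALEN+1 bytes, the longest possible "schema.relname" (63 + 1 + 63 characters plus the terminator) fits without truncation.

```c
#include <stdbool.h>
#include <stdio.h>

#define NAMEDATALEN 64  /* PostgreSQL default; names are at most 63 bytes */

/* Stand-in for the proposed dedicated member of the backend status. */
typedef struct CommandStatus
{
    char st_command_target[2 * NAMEDATALEN + 1];
} CommandStatus;

/*
 * Store the schema-qualified name once, at the start of the command.
 * Returns true if the whole name fit.
 */
static bool
report_command_target(CommandStatus *status,
                      const char *schemaname, const char *relname)
{
    int needed = snprintf(status->st_command_target,
                          sizeof(status->st_command_target),
                          "%s.%s", schemaname, relname);

    return needed >= 0 &&
           (size_t) needed < sizeof(status->st_command_target);
}
```

Since the name is written once, later progress reports only need to touch the numeric slots.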

Thanks,
Amit

Re: [PROPOSAL] VACUUM Progress Checker.

From

Robert Haas

Date:

02 December 2015, 18:33:32

On Mon, Nov 30, 2015 at 9:10 PM, Vinayak <vinpokale@gmail.com> wrote:
> Thanks for the v7.
> Please check the comment below.
> -Table name in the vacuum progress
>
> + snprintf(progress_message[0], PROGRESS_MESSAGE_LENGTH, "%s.%s",
> schemaname,relname);

Uh, I hope that line doesn't appear in the patch.  We're scarcely
likely to commit anything that has such an obvious SQL-injection risk
built into it.

https://xkcd.com/327/

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [PROPOSAL] VACUUM Progress Checker.

From

Robert Haas

Date:

02 December 2015, 18:42:07

On Tue, Dec 1, 2015 at 2:25 AM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> Yeah, it is actually restricted in that length. But if we allow
> the buffer to store whole the qualified names, it will need 64 *
> 2 + 1 +1 = 130 bytes * 10 = 1300 bytes for each beentry... It might
> be acceptable by others, but I don't think that is preferable..

There's no such thing as a free lunch here, but we probably don't need
room for 10 strings.  If we allowed say 4 strings per beentry and
limited each one to, say, 140 characters for Twitter-compatibility,
that's 560 bytes per backend.  Throw in some int8 counters and you're
up to maybe 600 bytes per backend.  So that's ~60kB of memory for 100
backends.  That doesn't sound like a huge problem to me, but it
wouldn't be stupid to have a PGC_POSTMASTER GUC to turn this feature
on and off, for the benefit of people who may want to run this in
low-memory environments.
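
As a rough check of that arithmetic, here is a minimal sketch. The struct
and function names are hypothetical, and the number of counters (5) is an
assumption chosen only so that the total lands on the "maybe 600 bytes"
figure; "int8" is taken in the PostgreSQL sense of an 8-byte integer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_NUM_STRINGS   4    /* "4 strings per beentry" */
#define DEMO_STRING_LENGTH 140  /* bytes per string */
#define DEMO_NUM_COUNTERS  5    /* assumed number of 8-byte counters */

/* Hypothetical per-backend progress slot. */
typedef struct DemoProgressSlot
{
    int64_t counters[DEMO_NUM_COUNTERS];                    /* 5 * 8   =  40 bytes */
    char    strings[DEMO_NUM_STRINGS][DEMO_STRING_LENGTH];  /* 4 * 140 = 560 bytes */
} DemoProgressSlot;

/* Total shared memory needed for the given number of backends. */
static size_t
demo_progress_memory(size_t nbackends)
{
    return nbackends * sizeof(DemoProgressSlot);
}
```

On common platforms sizeof(DemoProgressSlot) comes to 600 bytes, so 100
backends need 60000 bytes, in line with the ~60kB estimate.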

> As a more drastic change in design, since these fields are
> written/read in sequential manner, providing one free buffer of
> the size of.. so.. about 128 bytes for each beentry and storing
> strings delimiting with '\0' and numbers in binary format, as an
> example, would do. Additional functions to write into/read from
> this buffer safely would be needed but this gives both the
> ability to store longer messages and relatively short total
> buffer size, and allows arbitrary number of parameters limited
> only by the length of the free buffer.
>
> What do you think about this?

I think it sounds like a mess with uncertain benefits.  Now instead of
having individual fields that maybe don't fit and have to be
truncated, you have to figure out what to leave out when the overall
message doesn't fit.  That's likely to lead to a lot of messy logic on
the server side, and even messier logic for any clients that read the
data and try to parse it programmatically.

> By the way, how about giving separate columns for relname and
> namespace? I think it is more usual way to designate a relation
> in this kind of view and it makes the snprintf to concatenate
> name and schema unnecessary(it's not significant, though). (The
> following example is after pg_stat_all_tables)

I could go either way on this.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [PROPOSAL] VACUUM Progress Checker.

From

Alvaro Herrera

Date:

02 December 2015, 18:48:32

Vinayak wrote:

> In the vacuum progress, column table_name is showing first 30 characters of
> table name.
> postgres=# create table test_vacuum_progress_in_postgresql(c1 int,c2 text);
> postgres=# select * from pg_stat_vacuum_progress ;
> -[ RECORD 1 ]-------+------------------------------
> pid                 | 12284
> table_name          | public.test_vacuum_progress_i

Actually, do we really need to have the table name as a string at all
here?  Why not just report the table OID?  Surely whoever wants to check
the progress can connect to the database in question to figure out the
table name.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [PROPOSAL] VACUUM Progress Checker.

From

Kyotaro HORIGUCHI

Date:

03 December 2015, 04:37:14

Hello, sorry for the clobbered CC list.

# I restored it manually from upthread..

At Wed, 2 Dec 2015 13:42:01 -0500, Robert Haas <robertmhaas@gmail.com> wrote in
<CA+TgmobcN=3qa9X7c8_G18x53HDCpEYbWP4tnR_es5d=tYvrkQ@mail.gmail.com>
> On Tue, Dec 1, 2015 at 2:25 AM, Kyotaro HORIGUCHI
> <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> > Yeah, it is actually restricted in that length. But if we allow
> > the buffer to store whole the qualified names, it will need 64 *
> > 2 + 1 +1 = 130 bytes * 10 = 1300 bytes for each beentry... It might
> > be acceptable by others, but I don't think that is preferable..
> 
> There's no such thing as a free lunch here, but we probably don't need
> room for 10 strings.  If we allowed say 4 strings per beentry and
> limited each one to, say, 140 characters for Twitter-compatibility,
> that's 560 bytes per backend.  Throw in some int8 counters and you're
> up to maybe 600 bytes per backend.  So that's ~60kB of memory for 100
> backends.  That doesn't sound like a huge problem to me, but it
> wouldn't be stupid to have a PGC_POSTMASTER GUC to turn this feature
> on and off, for the benefit of people who may want to run this in
> low-memory environments.

This is similar to Amit-L's proposal, and either sounds fair to me.

> > As a more drastic change in design, since these fields are
> > written/read in sequential manner, providing one free buffer of
> > the size of.. so.. about 128 bytes for each beentry and storing
> > strings delimiting with '\0' and numbers in binary format, as an
> > example, would do. Additional functions to write into/read from
> > this buffer safely would be needed but this gives both the
> > ability to store longer messages and relatively short total
> > buffer size, and allows arbitrary number of parameters limited
> > only by the length of the free buffer.
> >
> > What do you think about this?
> 
> I think it sounds like a mess with uncertain benefits.  Now instead of
> having individual fields that maybe don't fit and have to be
> truncated, you have to figure out what to leave out when the overall
> message doesn't fit.  That's likely to lead to a lot of messy logic on
> the server side, and even messier logic for any clients that read the
> data and try to parse it programmatically.

OK, I understand that the packed format itself is unacceptable.

> > By the way, how about giving separate columns for relname and
> > namespace? I think it is more usual way to designate a relation
> > in this kind of view and it makes the snprintf to concatenate
> > name and schema unnecessary(it's not significant, though). (The
> > following example is after pg_stat_all_tables)
> 
> I could go either way on this.

It would depend on the field length, but 140 bytes can hold a
whole qualified name.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: [PROPOSAL] VACUUM Progress Checker.

From

Kyotaro HORIGUCHI

Date:

03 December 2015, 04:48:19

Hello,

At Wed, 2 Dec 2015 15:48:20 -0300, Alvaro Herrera <alvherre@2ndquadrant.com> wrote in
<20151202184820.GL2763@alvherre.pgsql>
> Vinayak wrote:
> 
> > In the vacuum progress, column table_name is showing first 30 characters of
> > table name.
> > postgres=# create table test_vacuum_progress_in_postgresql(c1 int,c2 text);
> > postgres=# select * from pg_stat_vacuum_progress ;
> > -[ RECORD 1 ]-------+------------------------------
> > pid                 | 12284
> > table_name          | public.test_vacuum_progress_i
> 
> Actually, do we really need to have the table name as a string at all
> here?  Why not just report the table OID?  Surely whoever wants to check
> the progress can connect to the database in question to figure out the
> table name.

I thought the same thing but found that the same kind of view
(say, pg_stat_user_tables) has separate relname and schemaname as
strings (not a qualified name, though).

Apart from the representation of the relation, OID would be
better as a field in beentry.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: [PROPOSAL] VACUUM Progress Checker.

From

Amit Langote

Date:

03 December 2015, 05:19:35

On 2015/12/03 13:47, Kyotaro HORIGUCHI wrote:
> At Wed, 2 Dec 2015 15:48:20 -0300, Alvaro Herrera <alvherre@2ndquadrant.com> wrote
>>
>> Actually, do we really need to have the table name as a string at all
>> here?  Why not just report the table OID?  Surely whoever wants to check
>> the progress can connect to the database in question to figure out the
>> table name.
> 
> I thought the same thing but found that the same kind of view
> (say, pg_stat_user_tables) has separate relname and schemaname in
> string (not a qualified name, though).
> 
> Apart from the representation of the relation, OID would be
> better as a field in beentry.

I wonder if the field should be a standalone field or as yet another
st_progress_* array?

IMHO, there are some values that a command would report that should not be
mixed with pgstat_report_progress()'s interface. That is, things like
command ID/name, command target (table name or OID) should not be mixed
with actual progress parameters like num_pages, num_indexes (integers),
processing "phase" (string) that are shared via st_progress_* fields. The
first of them already has its own reporting interface in the proposed patch
in the form of pgstat_report_activity_flag(). That said, we must choose
these interfaces carefully.

Thanks,
Amit

Re: [PROPOSAL] VACUUM Progress Checker.

From

Kyotaro HORIGUCHI

Date:

03 December 2015, 06:28:07

Hello,

At Thu, 3 Dec 2015 14:18:50 +0900, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote in
<565FD0BA.5020202@lab.ntt.co.jp>
> On 2015/12/03 13:47, Kyotaro HORIGUCHI wrote:
> > At Wed, 2 Dec 2015 15:48:20 -0300, Alvaro Herrera <alvherre@2ndquadrant.com> wrote
> >>
> >> Actually, do we really need to have the table name as a string at all
> >> here?  Why not just report the table OID?  Surely whoever wants to check
> >> the progress can connect to the database in question to figure out the
> >> table name.
> > 
> > I thought the same thing but found that the same kind of view
> > (say, pg_stat_user_tables) has separate relname and schemaname in
> > string (not a qualified name, though).
> > 
> > Apart from the representation of the relation, OID would be
> > better as a field in beentry.
> 
> I wonder if the field should be a standalone field or as yet another
> st_progress_* array?
> 
> IMHO, there are some values that a command would report that should not be
> mixed with pgstat_report_progress()'s interface. That is, things like
> command ID/name, command target (table name or OID) should not be mixed
> with actual progress parameters like num_pages, num_indexes (integers),
> processing "phase" (string) that are shared via st_progress_* fields. The
> first of them  already has its own reporting interface in proposed patch
> in the form of pgstat_report_activity_flag(). Although, we must be careful
> to choose these interfaces carefully.

Sorry I misunderstood the patch.

Agreed. The patch already separates integer values and texts.
And on re-reviewing the patch, no field needs to be passed as a
string.

total_heap_pages, scanned_heap_pages, total_index_pages,
scanned_index_pages, vacrelstats->num_index_scans are currently
in int32.

Phase can be in integer, and schema_name and relname can be
represented by one OID, uint32.

Finally, *NO* text field is needed, at least for this usage. So
progress_message is totally useless regardless of other usages
unknown to us.

Am I missing something?

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: [PROPOSAL] VACUUM Progress Checker.

From

Amit Langote

Date:

03 December 2015, 07:25:58

Hi,

On 2015/12/03 15:27, Kyotaro HORIGUCHI wrote:
> At Thu, 3 Dec 2015 14:18:50 +0900, Amit Langote wrote
>> On 2015/12/03 13:47, Kyotaro HORIGUCHI wrote:
>>>
>>> Apart from the representation of the relation, OID would be
>>> better as a field in beentry.
>>
>> I wonder if the field should be a standalone field or as yet another
>> st_progress_* array?
>>
>> IMHO, there are some values that a command would report that should not be
>> mixed with pgstat_report_progress()'s interface. That is, things like
>> command ID/name, command target (table name or OID) should not be mixed
>> with actual progress parameters like num_pages, num_indexes (integers),
>> processing "phase" (string) that are shared via st_progress_* fields. The
>> first of them  already has its own reporting interface in proposed patch
>> in the form of pgstat_report_activity_flag(). Although, we must be careful
>> to choose these interfaces carefully.
> 
> Sorry I misunderstood the patch.
> 
> Agreed. The patch already separates integer values and texts.
> And re-reviewing the patch, there's no fields necessary to be
> passed as string.
> 
> total_heap_pages, scanned_heap_pages, total_index_pages,
> scanned_index_pages, vacrelstats->num_index_scans are currently
> in int32.
> 
> Phase can be in integer, and schema_name and relname can be
> represented by one OID, uint32.

AIUI, st_progress_message (strings) are to be used to share certain
messages as progress information. I think the latest vacuum-progress patch
uses it to report which phase lazy_scan_heap() is in, for example,
"Scanning heap" for phase 1 of its processing and "Vacuuming index and
heap" for phase 2. Those values are shown to the user in a text column
named "phase" of the pg_stat_vacuum_progress view. That said, reporting
phase as an integer value may also be worth a consideration. Some other
command might choose to do that.
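
A minimal sketch of the integer-phase variant, assuming the two phases
named above: the backend would report a small integer, and the view
function would translate it to text. The enum and demo_phase_name() are
hypothetical names, not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical phase numbers; the values follow "phase 1" and "phase 2". */
typedef enum DemoVacuumPhase
{
    DEMO_PHASE_SCAN_HEAP = 1,
    DEMO_PHASE_VACUUM_INDEX_AND_HEAP = 2
} DemoVacuumPhase;

/* Map a reported phase number to the text shown in the "phase" column. */
static const char *
demo_phase_name(int phase)
{
    switch (phase)
    {
        case DEMO_PHASE_SCAN_HEAP:
            return "Scanning heap";
        case DEMO_PHASE_VACUUM_INDEX_AND_HEAP:
            return "Vacuuming index and heap";
        default:
            return NULL;    /* unknown or not yet reported */
    }
}
```

The per-update cost then becomes a single integer store instead of a
string copy, with the text lookup paid only when someone reads the view.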

> Finally, *NO* text field is needed at least this usage. So
> progress_message is totally useless regardless of other usages
> unknown to us.

I think it may be okay at this point to add just those st_progress_*
fields which are required by lazy vacuum progress reporting. If someone
comes up with instrumentation ideas for some other command, they could
post patches to add more st_progress_* fields and to implement
instrumentation and a progress view for that command. This is essentially
what Robert said in [1] in relation to my suggestion of taking out
st_progress_param_float from this patch.

By the way, there are some non-st_progress_* fields that I was talking
about in my previous message. For example, st_activity_flag (which I have
suggested to rename to st_command instead). It needs to be set once at the
beginning of the command processing and need not be touched again. I think
it may be a better idea to do the same for table name or OID. It also
won't change over the duration of the command execution. So, we could set
it once at the beginning where that becomes known. Currently in the patch,
it's reported in st_progress_message[0] by every pgstat_report_progress()
invocation. So, the table name will be strcpy()'d to shared memory for
every scanned block that's reported.

Thanks,
Amit

[1]
http://www.postgresql.org/message-id/CA+TgmoYdZk9nPDtS+_kOt4S6fDRQLW+1jnJBmG0pkRT4ynxO=Q@mail.gmail.com

Re: [PROPOSAL] VACUUM Progress Checker.

From

Kyotaro HORIGUCHI

Date:

03 December 2015, 10:06:05

Hello,

At Thu, 3 Dec 2015 16:24:32 +0900, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote in
<565FEE30.8010906@lab.ntt.co.jp>
> > Agreed. The patch already separates integer values and texts.
> > And re-reviewing the patch, there's no fields necessary to be
> > passed as string.
> > 
> > total_heap_pages, scanned_heap_pages, total_index_pages,
> > scanned_index_pages, vacrelstats->num_index_scans are currently
> > in int32.
> > 
> > Phase can be in integer, and schema_name and relname can be
> > represented by one OID, uint32.
> 
> AIUI, st_progress_message (strings) are to be used to share certain
> messages as progress information. I think the latest vacuum-progress patch
> uses it to report which phase lazy_scan_heap() is in, for example,
> "Scanning heap" for phase 1 of its processing and "Vacuuming index and
> heap" for phase 2. Those values are shown to the user in a text column
> named "phase" of the pg_stat_vacuum_progress view. That said, reporting
> phase as an integer value may also be worth a consideration. Some other
> command might choose to do that.
> 
> > Finally, *NO* text field is needed at least this usage. So
> > progress_message is totally useless regardless of other usages
> > unknown to us.
> 
> I think it may be okay at this point to add just those st_progress_*
> fields which are required by lazy vacuum progress reporting. If someone
> comes up with instrumentation ideas for some other command, they could
> post patches to add more st_progress_* fields and to implement
> instrumentation and a progress view for that command. This is essentially
> what Robert said in [1] in relation to my suggestion of taking out
> st_progress_param_float from this patch.

Yes. After taking a detour, though.

> By the way, there are some non-st_progress_* fields that I was talking
> about in my previous message. For example, st_activity_flag (which I have
> suggested to rename to st_command instead). It needs to be set once at the
> beginning of the command processing and need not be touched again. I think
> it may be a better idea to do the same for table name or OID. It also
> won't change over the duration of the command execution. So, we could set
> it once at the beginning where that becomes known. Currently in the patch,
> it's reported in st_progress_message[0] by every pgstat_report_progress()
> invocation. So, the table name will be strcpy()'d to shared memory for
> every scanned block that's reported.

If we don't have dedicated reporting functions for each phase
(that is, the static data phase and the progress phase), the one function
pgstat_report_progress has to cover both, taking some instruction from
the caller, which would be a bitfield.

Reading the flags to decide whether to copy each field, and
branching on that, seems too much for int32 (or maybe
64?) fields. Although it would be effective when we have
progress_message fields, I don't think it is a good idea without
having progress_message.

> pgstat_report_progress(uint32 *param1,  uint16 param1_bitmap,
>                        char param2[][..], uint16 param2_bitmap)
> {
> ...
>       for(i = 0; i < 16 ; i++)
>       {
>           if (param1_bitmap & (1 << i))
>                beentry->st_progress_param[i] = param1[i];
>           if (param2_bitmap & (1 << i))
>                strcpy(beentry->..., param2[i]);
>       }

Thoughts?


> Thanks,
> Amit
> 
> [1]
> http://www.postgresql.org/message-id/CA+TgmoYdZk9nPDtS+_kOt4S6fDRQLW+1jnJBmG0pkRT4ynxO=Q@mail.gmail.com

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: [PROPOSAL] VACUUM Progress Checker.

From

Amit Langote

Date:

07 December 2015, 07:43:02

Hi,

On 2015/12/03 19:05, Kyotaro HORIGUCHI wrote:
> At Thu, 3 Dec 2015 16:24:32 +0900, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote
>> By the way, there are some non-st_progress_* fields that I was talking
>> about in my previous message. For example, st_activity_flag (which I have
>> suggested to rename to st_command instead). It needs to be set once at the
>> beginning of the command processing and need not be touched again. I think
>> it may be a better idea to do the same for table name or OID. It also
>> won't change over the duration of the command execution. So, we could set
>> it once at the beginning where that becomes known. Currently in the patch,
>> it's reported in st_progress_message[0] by every pgstat_report_progress()
>> invocation. So, the table name will be strcpy()'d to shared memory for
>> every scanned block that's reported.
> 
> If we don't have dedicate reporting functions for each phase
> (that is, static data phase and progress phase), the one function
> pgstat_report_progress does that by having some instruction from
> the caller and it would be a bitfield.
> 
> Reading the flags for making decision whether every field to copy
> or not and branching by that seems too much for int32 (or maybe
> 64?) fields. Alhtough it would be in effective when we have
> progress_message fields, I don't think it is a good idea without
> having progress_message.
> 
>> pgstat_report_progress(uint32 *param1,  uint16 param1_bitmap,
>>                        char param2[][..], uint16 param2_bitmap)
>> {
>> ...
>>       for(i = 0; i < 16 ; i++)
>>       {
>>           if (param1_bitmap & (1 << i))
>>                beentry->st_progress_param[i] = param1[i];
>>           if (param2_bitmap & (1 << i))
>>                strcpy(beentry->..., param2[i]);
>>       }
> 
> Thoughts?

Hm, I guess progress messages would not change that much over the course
of a long-running command. How about we pass only the array(s) that we
change and NULL for others:

With the following prototype for pgstat_report_progress:

void pgstat_report_progress(uint32 *uint32_param, int num_uint32_param,
                           bool *message_param[], int num_message_param);
 

If we just wanted to change, say scanned_heap_pages, then we report that
using:

uint32_param[1] = scanned_heap_pages;
pgstat_report_progress(uint32_param, 3, NULL, 0);

Also, pgstat_report_progress() should check each of its parameters for
NULL before iterating over to copy. So in most reports over the course of
a command, the message_param would be NULL and hence not copied.
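
A minimal sketch of how such a function might skip NULL arrays. The
struct, the array sizes and the demo_ names are hypothetical, and the
sketch assumes the message array holds C strings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define DEMO_NUM_PARAMS     3
#define DEMO_NUM_MESSAGES   2
#define DEMO_MESSAGE_LENGTH 32

/* Hypothetical, cut-down stand-in for the progress part of a beentry. */
typedef struct DemoBackendStatus
{
    uint32_t st_progress_param[DEMO_NUM_PARAMS];
    char     st_progress_message[DEMO_NUM_MESSAGES][DEMO_MESSAGE_LENGTH];
} DemoBackendStatus;

/* Copy only the arrays the caller actually passed; NULL means "unchanged". */
static void
demo_report_progress(DemoBackendStatus *beentry,
                     const uint32_t *uint32_param, int num_uint32_param,
                     const char *const *message_param, int num_message_param)
{
    int i;

    assert(beentry != NULL);

    if (uint32_param != NULL)
        for (i = 0; i < num_uint32_param && i < DEMO_NUM_PARAMS; i++)
            beentry->st_progress_param[i] = uint32_param[i];

    if (message_param != NULL)
        for (i = 0; i < num_message_param && i < DEMO_NUM_MESSAGES; i++)
            snprintf(beentry->st_progress_message[i], DEMO_MESSAGE_LENGTH,
                     "%s", message_param[i]);
}
```

A call that passes NULL for the messages leaves the stored strings as
they were and copies only the integer array.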

Thanks,
Amit

Re: [PROPOSAL] VACUUM Progress Checker.

From

Robert Haas

Date:

09 December 2015, 19:40:16

On Mon, Dec 7, 2015 at 2:41 AM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
> On 2015/12/03 19:05, Kyotaro HORIGUCHI wrote:
>> At Thu, 3 Dec 2015 16:24:32 +0900, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote
>>> By the way, there are some non-st_progress_* fields that I was talking
>>> about in my previous message. For example, st_activity_flag (which I have
>>> suggested to rename to st_command instead). It needs to be set once at the
>>> beginning of the command processing and need not be touched again. I think
>>> it may be a better idea to do the same for table name or OID. It also
>>> won't change over the duration of the command execution. So, we could set
>>> it once at the beginning where that becomes known. Currently in the patch,
>>> it's reported in st_progress_message[0] by every pgstat_report_progress()
>>> invocation. So, the table name will be strcpy()'d to shared memory for
>>> every scanned block that's reported.
>>
>> If we don't have dedicate reporting functions for each phase
>> (that is, static data phase and progress phase), the one function
>> pgstat_report_progress does that by having some instruction from
>> the caller and it would be a bitfield.
>>
>> Reading the flags for making decision whether every field to copy
>> or not and branching by that seems too much for int32 (or maybe
>> 64?) fields. Alhtough it would be in effective when we have
>> progress_message fields, I don't think it is a good idea without
>> having progress_message.
>>
>>> pgstat_report_progress(uint32 *param1,  uint16 param1_bitmap,
>>>                        char param2[][..], uint16 param2_bitmap)
>>> {
>>> ...
>>>       for(i = 0; i < 16 ; i++)
>>>       {
>>>           if (param1_bitmap & (1 << i))
>>>                beentry->st_progress_param[i] = param1[i];
>>>           if (param2_bitmap & (1 << i))
>>>                strcpy(beentry->..., param2[i]);
>>>       }
>>
>> Thoughts?
>
> Hm, I guess progress messages would not change that much over the course
> of a long-running command. How about we pass only the array(s) that we
> change and NULL for others:
>
> With the following prototype for pgstat_report_progress:
>
> void pgstat_report_progress(uint32 *uint32_param, int num_uint32_param,
>                            bool *message_param[], int num_message_param);
>
> If we just wanted to change, say scanned_heap_pages, then we report that
> using:
>
> uint32_param[1] = scanned_heap_pages;
> pgstat_report_progress(uint32_param, 3, NULL, 0);
>
> Also, pgstat_report_progress() should check each of its parameters for
> NULL before iterating over to copy. So in most reports over the course of
> a command, the message_param would be NULL and hence not copied.

It's going to be *really* important that this facility provides a
lightweight way of updating progress, so I think this whole API is
badly designed.  VACUUM, for example, is going to want a way to update
the individual counter for the number of pages it's scanned after
every page.  It should not have to pass all of the other information
that is part of a complete report.  It should just be able to say
pgstat_report_progress_update_counter(1, pages_scanned) or something
of this sort.  Don't marshal all of the data and then make
pgstat_report_progress figure out what's changed.  Provide a series of
narrow APIs where the caller tells you exactly what they want to
update, and you update only that.  It's fine to have a
pgstat_report_progress() function to update everything at once, for
the use at the beginning of a command, but the incremental updates
within the command should do something lighter-weight.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [PROPOSAL] VACUUM Progress Checker.

From

Amit Langote

Date:

10 December 2015, 06:11:07

On 2015/12/10 4:40, Robert Haas wrote:
> It's going to be *really* important that this facility provides a
> lightweight way of updating progress, so I think this whole API is
> badly designed.  VACUUM, for example, is going to want a way to update
> the individual counter for the number of pages it's scanned after
> every page.  It should not have to pass all of the other information
> that is part of a complete report.  It should just be able to say
> pgstat_report_progress_update_counter(1, pages_scanned) or something
> of this sort.  Don't marshal all of the data and then make
> pgstat_report_progress figure out what's changed.  Provide a series of
> narrow APIs where the caller tells you exactly what they want to
> update, and you update only that.  It's fine to have a
> pgstat_report_progress() function to update everything at once, for
> the use at the beginning of a command, but the incremental updates
> within the command should do something lighter-weight.

How about something like the following:

/*
 * index: in the array of uint32 counters in the beentry
 * counter: new value of the (index+1)th counter
 */
void pgstat_report_progress_update_counter(int index, uint32 counter);

/*
 * msg: new value of the (index+1)th message (with trailing null byte)
 */
void pgstat_report_progress_update_message(int index, const char *msg);

Actually updating a counter or message would look like:

pgstat_increment_changecount_before(beentry);
// update the counter or message at index in beentry->st_progress_*
pgstat_increment_changecount_after(beentry);
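
To illustrate the idea behind that bracketing, here is a self-contained
sketch of a change-count scheme written with C11 atomics. It is a
simplified stand-in and not the actual pgstat macros: the struct, the
counter array size and the demo_ functions are all hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define DEMO_NUM_COUNTERS 4

/* Hypothetical, cut-down stand-in for a backend status entry. */
typedef struct DemoBackendStatus
{
    atomic_uint      st_changecount;    /* odd while an update is in flight */
    _Atomic uint32_t st_progress_counter[DEMO_NUM_COUNTERS];
} DemoBackendStatus;

/* Writer side: touch exactly one counter, bracketed by change-count bumps. */
static void
demo_update_counter(DemoBackendStatus *beentry, int index, uint32_t value)
{
    assert(index >= 0 && index < DEMO_NUM_COUNTERS);

    atomic_fetch_add(&beentry->st_changecount, 1);      /* now odd */
    atomic_store(&beentry->st_progress_counter[index], value);
    atomic_fetch_add(&beentry->st_changecount, 1);      /* even again */
}

/* Reader side: retry until the count is even and unchanged across the read. */
static uint32_t
demo_read_counter(DemoBackendStatus *beentry, int index)
{
    assert(index >= 0 && index < DEMO_NUM_COUNTERS);

    for (;;)
    {
        unsigned before = atomic_load(&beentry->st_changecount);
        uint32_t value = atomic_load(&beentry->st_progress_counter[index]);
        unsigned after = atomic_load(&beentry->st_changecount);

        if (before == after && (before & 1) == 0)
            return value;
    }
}
```

The writer never takes a lock, which is what keeps a per-page update
cheap; the retry cost falls on whoever reads the progress view.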

Other interface functions which are called at the beginning:

void pgstat_report_progress_set_command(int commandId);
void pgstat_report_progress_set_command_target(const char *target_name);
or
void pgstat_report_progress_set_command_target(Oid target_oid);

And then a SQL-level,
void pgstat_reset_local_progress();

Which simply sets beentry->st_command to some invalid value which signals
a progress view function to ignore this backend.

Thanks,
Amit

Re: [PROPOSAL] VACUUM Progress Checker.

From

Michael Paquier

Date:

10 December 2015, 06:28:23

On Thu, Dec 10, 2015 at 4:40 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Mon, Dec 7, 2015 at 2:41 AM, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote:
>> Hm, I guess progress messages would not change that much over the course
>> of a long-running command. How about we pass only the array(s) that we
>> change and NULL for others:
>>
>> With the following prototype for pgstat_report_progress:
>>
>> void pgstat_report_progress(uint32 *uint32_param, int num_uint32_param,
>>                            bool *message_param[], int num_message_param);
>>
>> If we just wanted to change, say scanned_heap_pages, then we report that
>> using:
>>
>> uint32_param[1] = scanned_heap_pages;
>> pgstat_report_progress(uint32_param, 3, NULL, 0);
>>
>> Also, pgstat_report_progress() should check each of its parameters for
>> NULL before iterating over to copy. So in most reports over the course of
>> a command, the message_param would be NULL and hence not copied.
>
> It's going to be *really* important that this facility provides a
> lightweight way of updating progress, so I think this whole API is
> badly designed.  VACUUM, for example, is going to want a way to update
> the individual counter for the number of pages it's scanned after
> every page.  It should not have to pass all of the other information
> that is part of a complete report.  It should just be able to say
> pgstat_report_progress_update_counter(1, pages_scanned) or something
> of this sort.  Don't marshal all of the data and then make
> pgstat_report_progress figure out what's changed.  Provide a series of
> narrow APIs where the caller tells you exactly what they want to
> update, and you update only that.  It's fine to have a
> pgstat_report_progress() function to update everything at once, for
> the use at the beginning of a command, but the incremental updates
> within the command should do something lighter-weight.

[first time looking really at the patch and catching up with this thread]

Agreed. As far as I can guess from the code, those should be as
follows, to keep bloat of the pgstat message queue to a minimum:

+               pgstat_report_activity_flag(ACTIVITY_IS_VACUUM);
                /*
                 * Loop to process each selected relation.
                 */
 
Defining a new routine for this purpose is a bit surprising. Cannot we
just use pgstat_report_activity with a new BackendState STATE_INVACUUM
or similar if we pursue the progress tracking approach?

A couple of comments:
- The relation OID should be reported and not its name. In case of a
relation rename that would not be cool for tracking, and most users
are surely going to join with other system tables using it.
- The progress tracking facility adds a whole level of complexity for
very little gain, and IMO this should *not* be part of PgBackendStatus
because in most cases its data ends up wasted. We don't expect
backends to run such progress reports frequently, do we? My opinion on
the matter is that we should define a different collector data for
vacuum, with something like PgStat_StatVacuumEntry, then have on top
of it a couple of routines dedicated to feeding it with data when
some work is done on a vacuum job.

In short, it seems to me that this patch needs a rework, and should be
returned with feedback. Other opinions?
-- 
Michael

Re: [PROPOSAL] VACUUM Progress Checker.

From

Kyotaro HORIGUCHI

Date:

10 December 2015, 07:30:55

Hello,

At Thu, 10 Dec 2015 15:28:14 +0900, Michael Paquier <michael.paquier@gmail.com> wrote in
<CAB7nPqRNw=w4mt-W+gtq0ED0KTR=B8Qgu6D+4BN3XmzFRuAgXQ@mail.gmail.com>
> On Thu, Dec 10, 2015 at 4:40 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> > On Mon, Dec 7, 2015 at 2:41 AM, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote:
> >> Hm, I guess progress messages would not change that much over the course
> >> of a long-running command. How about we pass only the array(s) that we
> >> change and NULL for others:
> >>
> >> With the following prototype for pgstat_report_progress:
> >>
> >> void pgstat_report_progress(uint32 *uint32_param, int num_uint32_param,
> >>                            bool *message_param[], int num_message_param);
> >>
> >> If we just wanted to change, say scanned_heap_pages, then we report that
> >> using:
> >>
> >> uint32_param[1] = scanned_heap_pages;
> >> pgstat_report_progress(uint32_param, 3, NULL, 0);
> >>
> >> Also, pgstat_report_progress() should check each of its parameters for
> >> NULL before iterating over to copy. So in most reports over the course of
> >> a command, the message_param would be NULL and hence not copied.
> >
> > It's going to be *really* important that this facility provides a
> > lightweight way of updating progress, so I think this whole API is
> > badly designed.  VACUUM, for example, is going to want a way to update
> > the individual counter for the number of pages it's scanned after
> > every page.  It should not have to pass all of the other information
> > that is part of a complete report.  It should just be able to say
> > pgstat_report_progress_update_counter(1, pages_scanned) or something
> > of this sort.  Don't marshal all of the data and then make
> > pgstat_report_progress figure out what's changed.  Provide a series of
> > narrow APIs where the caller tells you exactly what they want to
> > update, and you update only that.  It's fine to have a
> > pgstat_report_progress() function to update everything at once, for
> > the use at the beginning of a command, but the incremental updates
> > within the command should do something lighter-weight.
> 
> [first time looking really at the patch and catching up with this thread]
> 
> Agreed. As far as I can guess from the code, those should be as
> followed to bloat pgstat message queue a minimum:
> 
> +               pgstat_report_activity_flag(ACTIVITY_IS_VACUUM);
>                 /*
>                  * Loop to process each selected relation.
>                  */
> Defining a new routine for this purpose is a bit surprising. Cannot we
> just use pgstat_report_activity with a new BackendState STATE_INVACUUM
> or similar if we pursue the progress tracking approach?

The author might want to know the vacuum status *after* it has
finished, but it is reset just after the end of a vacuum. One
concern is that this adds a new value to
pg_stat_activity.state, which might confuse someone using it, but
I don't see any other issue with it.

> A couple of comments:
> - The relation OID should be reported and not its name. In case of a
> relation rename that would not be cool for tracking, and most users
> are surely going to join with other system tables using it.

+1

> - The progress tracking facility adds a whole level of complexity for
> very little gain, and IMO this should *not* be part of PgBackendStatus
> because in most cases its data finishes wasted. We don't expect
> backends to run frequently such progress reports, do we?

I strongly thought the same thing, but I haven't come up with a
better place for it to be stored. But,

>  My opinion on
> the matter if that we should define a different collector data for
> vacuum, with something like PgStat_StatVacuumEntry, then have on top
> of it a couple of routines dedicated at feeding up data with it when
> some work is done on a vacuum job.

+many. But I can't guess the appropriate number of entries for
it, or a suitable replacement policy for an excessive number of
vacuums. Although sane users won't run vacuum on so many
backends.

> In short, it seems to me that this patch needs a rework, and should be
> returned with feedback. Other opinions?

This is an important feature for DBAs, so I agree with you.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: [PROPOSAL] VACUUM Progress Checker.

From

Amit Langote

Date:

10 December 2015, 10:24:05

On 2015/12/10 15:28, Michael Paquier wrote:
> On Thu, Dec 10, 2015 at 4:40 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> It's going to be *really* important that this facility provides a
>> lightweight way of updating progress, so I think this whole API is
>> badly designed.  VACUUM, for example, is going to want a way to update
>> the individual counter for the number of pages it's scanned after
>> every page.  It should not have to pass all of the other information
>> that is part of a complete report.  It should just be able to say
>> pgstat_report_progress_update_counter(1, pages_scanned) or something
>> of this sort.  Don't marshal all of the data and then make
>> pgstat_report_progress figure out what's changed.  Provide a series of
>> narrow APIs where the caller tells you exactly what they want to
>> update, and you update only that.  It's fine to have a
>> pgstat_report_progress() function to update everything at once, for
>> the use at the beginning of a command, but the incremental updates
>> within the command should do something lighter-weight.
> 
> [first time looking really at the patch and catching up with this thread]
> 
> Agreed. As far as I can guess from the code, those should be as
> followed to bloat pgstat message queue a minimum:
> 
> +               pgstat_report_activity_flag(ACTIVITY_IS_VACUUM);
>                 /*
>                  * Loop to process each selected relation.
>                  */
> Defining a new routine for this purpose is a bit surprising. Cannot we
> just use pgstat_report_activity with a new BackendState STATE_INVACUUM
> or similar if we pursue the progress tracking approach?

ISTM, PgBackendStatus.st_state is normally manipulated at quite different
sites (mostly tcop) than where a backend would be able to report that a
command like VACUUM is running. Also, the value 'active' of
pg_stat_activity.state has an established meaning as Horiguchi-san seems
to point out in his reply. IMHO, this patch shouldn't affect such meaning
of a pg_stat_activity field.

> A couple of comments:
> - The relation OID should be reported and not its name. In case of a
> relation rename that would not be cool for tracking, and most users
> are surely going to join with other system tables using it.

+1

> - The progress tracking facility adds a whole level of complexity for
> very little gain, and IMO this should *not* be part of PgBackendStatus
> because in most cases its data finishes wasted. We don't expect
> backends to run frequently such progress reports, do we? My opinion on
> the matter if that we should define a different collector data for
> vacuum, with something like PgStat_StatVacuumEntry, then have on top
> of it a couple of routines dedicated at feeding up data with it when
> some work is done on a vacuum job.

I assume your comment here means we should use the stats collector to
track/publish progress info, is that right?

AIUI, the counts published via stats collector are updated asynchronously
w.r.t. operations they count and mostly as aggregate figures. For example,
PgStat_StatTabEntry.blocks_fetched. IOW, we never see
pg_statio_all_tables.heap_blks_read updating as a scan reads blocks. Maybe
that helps keep traffic to the pgstat collector at sane levels. But that is
not to say that I think controlling stats collector traffic was the only
design consideration behind how such counters are published.

In case of reporting counters as progress info, it seems we might have to
send too many PgStat_Msg's, for example, for every block we finish
processing during vacuum. That kind of message traffic may swamp the
collector. Then we need to see the updated counters from other backends in
near real-time, though that may be possible with suitable (build?)
configuration.

Moreover, the counters reported as progress info seem to be of different
nature than those published via the stats collector. I would think of it
in terms of the distinction between track_activities and track_counts. I
find these lines on "The Statistics Collector" page somewhat related:

"PostgreSQL also supports reporting dynamic information about exactly what
is going on in the system right now, such as the exact command currently
being executed by other server processes, and which other connections
exist in the system. This facility is independent of the collector process."

Then,

"When using the statistics to monitor collected data, it is important to
realize that the information does not update instantaneously. Each
individual server process transmits new statistical counts to the
collector just before going idle; so a query or transaction still in
progress does not affect the displayed totals. Also, the collector itself
emits a new report at most once per PGSTAT_STAT_INTERVAL milliseconds (500
ms unless altered while building the server). So the displayed information
lags behind actual activity. However, current-query information collected
by track_activities is always up-to-date."

http://www.postgresql.org/docs/devel/static/monitoring-stats.html

Am I misunderstanding what you mean?

> 
> In short, it seems to me that this patch needs a rework, and should be
> returned with feedback. Other opinions?

Yeah, some more thought needs to be put into design of the general
reporting interface. Then we also need to pay attention to another
important aspect of this patch - lazy vacuum instrumentation.

Thanks,
Amit

Re: [PROPOSAL] VACUUM Progress Checker.

From

Michael Paquier

Date:

10 December 2015, 11:46:57

On Thu, Dec 10, 2015 at 7:23 PM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
> On 2015/12/10 15:28, Michael Paquier wrote:
>> - The progress tracking facility adds a whole level of complexity for
>> very little gain, and IMO this should *not* be part of PgBackendStatus
>> because in most cases its data finishes wasted. We don't expect
>> backends to run frequently such progress reports, do we? My opinion on
>> the matter if that we should define a different collector data for
>> vacuum, with something like PgStat_StatVacuumEntry, then have on top
>> of it a couple of routines dedicated at feeding up data with it when
>> some work is done on a vacuum job.
>
> I assume your comment here means we should use stats collector to the
> track/publish progress info, is that right?

Yep.

> AIUI, the counts published via stats collector are updated asynchronously
> w.r.t. operations they count and mostly as aggregate figures. For example,
> PgStat_StatTabEntry.blocks_fetched. IOW, we never see
> pg_statio_all_tables.heap_blks_read updating as a scan reads blocks. Maybe
> that helps keep traffic to pgstat collector to sane levels. But that is
> not to mean that I think controlling stats collector levels was the only
> design consideration behind how such counters are published.
>
> In case of reporting counters as progress info, it seems we might have to
> send too many PgStat_Msg's, for example, for every block we finish
> processing during vacuum. That kind of message traffic may swamp the
> collector. Then we need to see the updated counters from other counters in
> near real-time though that may be possible with suitable (build?)
> configuration.

As far as I understand it, the basic reason why this patch exists is
to allow a DBA to have a hint of the progress of a VACUUM that may be
taking minutes, or say hours, which is something we don't have now. So
it seems perfectly fine to me to report this information
asynchronously with a bit of lag. Why would we need so much precision
in the report?

>> In short, it seems to me that this patch needs a rework, and should be
>> returned with feedback. Other opinions?
>
> Yeah, some more thought needs to be put into design of the general
> reporting interface. Then we also need to pay attention to another
> important aspect of this patch - lazy vacuum instrumentation.

This patch has received a lot of feedback, and it is not in a
committable state, so I marked it as "Returned with feedback" for this
CF.
Regards,
-- 
Michael

Re: [PROPOSAL] VACUUM Progress Checker.

From

Robert Haas

Date:

10 December 2015, 14:39:09

On Thu, Dec 10, 2015 at 6:46 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Thu, Dec 10, 2015 at 7:23 PM, Amit Langote
> <Langote_Amit_f8@lab.ntt.co.jp> wrote:
>> On 2015/12/10 15:28, Michael Paquier wrote:
>>> - The progress tracking facility adds a whole level of complexity for
>>> very little gain, and IMO this should *not* be part of PgBackendStatus
>>> because in most cases its data finishes wasted. We don't expect
>>> backends to run frequently such progress reports, do we? My opinion on
>>> the matter if that we should define a different collector data for
>>> vacuum, with something like PgStat_StatVacuumEntry, then have on top
>>> of it a couple of routines dedicated at feeding up data with it when
>>> some work is done on a vacuum job.
>>
>> I assume your comment here means we should use stats collector to the
>> track/publish progress info, is that right?
>
> Yep.

Oh, please, no.  Gosh, this is supposed to be a lightweight facility!
Just have a chunk of shared memory and write the data in there.  If
you try to feed this through the stats collector you're going to
increase the overhead by 100x or more, and there's no benefit.  We've
got to do relation stats that way because there's no a priori bound on
the number of relations, so we can't just preallocate enough shared
memory for all of them.  But there's no similar restriction here: the
number of backends IS fixed at startup time.  As long as we limit the
amount of progress information that a backend can supply to some fixed
length, which IMHO we definitely should, there's no need to add the
expense of funneling this through the stats collector.
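
The fixed-size argument above can be sketched roughly as follows. This is
only an illustration under stated assumptions, not code from the patch:
MAX_BACKENDS, N_PROGRESS_COUNTERS, ProgressSlot and the progress_* function
names are all hypothetical, and a plain static array stands in for a region
that would really be carved out of shared memory at startup.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_BACKENDS        8   /* hypothetical; fixed at startup in a real server */
#define N_PROGRESS_COUNTERS 5   /* hypothetical fixed per-backend limit */

/* One fixed-length slot per backend, so nothing is allocated after startup. */
typedef struct ProgressSlot
{
    uint32_t command;                        /* 0 means no command is running */
    uint32_t target_relid;                   /* OID of the relation being processed */
    uint32_t counters[N_PROGRESS_COUNTERS];  /* meaning depends on the command */
} ProgressSlot;

/* Stand-in for an array that would live in shared memory. */
static ProgressSlot progress_slots[MAX_BACKENDS];

/* Full report: called once when a command starts; resets the whole slot. */
static void
progress_start_command(int backend, uint32_t command, uint32_t relid)
{
    assert(backend >= 0 && backend < MAX_BACKENDS);

    ProgressSlot *slot = &progress_slots[backend];

    memset(slot, 0, sizeof(*slot));
    slot->command = command;
    slot->target_relid = relid;
}

/* Narrow update: writes exactly one counter and touches nothing else. */
static void
progress_update_counter(int backend, int index, uint32_t value)
{
    assert(backend >= 0 && backend < MAX_BACKENDS);
    assert(index >= 0 && index < N_PROGRESS_COUNTERS);

    progress_slots[backend].counters[index] = value;
}

/* Clears the slot when the command finishes. */
static void
progress_end_command(int backend)
{
    assert(backend >= 0 && backend < MAX_BACKENDS);

    memset(&progress_slots[backend], 0, sizeof(ProgressSlot));
}
```

Because each backend writes only its own slot, the per-page cost in this
sketch is a single store; a reader in another backend would simply walk the
array. Synchronization between writer and reader is deliberately left out
here.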

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [PROPOSAL] VACUUM Progress Checker.

From

Tom Lane

Date:

10 December 2015, 14:49:37

Robert Haas <robertmhaas@gmail.com> writes:
> Oh, please, no.  Gosh, this is supposed to be a lightweight facility!
> Just have a chunk of shared memory and write the data in there.  If
> you try to feed this through the stats collector you're going to
> increase the overhead by 100x or more, and there's no benefit.  We've
> got to do relation stats that way because there's no a priori bound on
> the number of relations, so we can't just preallocate enough shared
> memory for all of them.  But there's no similar restriction here: the
> number of backends IS fixed at startup time.  As long as we limit the
> amount of progress information that a backend can supply to some fixed
> length, which IMHO we definitely should, there's no need to add the
> expense of funneling this through the stats collector.

I agree with this, and I'd further add that if we don't have a
fixed-length progress state, we've overdesigned the facility entirely.
People won't be able to make sense of anything that acts much more
complicated than "0% .. 100% done".  So you need to find a way of
approximating progress of a given command in terms more or less
like that, even if it's a pretty crude approximation.
        regards, tom lane

Re: [PROPOSAL] VACUUM Progress Checker.

From

Robert Haas

Date:

10 December 2015, 16:00:02

On Thu, Dec 10, 2015 at 9:49 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> Oh, please, no.  Gosh, this is supposed to be a lightweight facility!
>> Just have a chunk of shared memory and write the data in there.  If
>> you try to feed this through the stats collector you're going to
>> increase the overhead by 100x or more, and there's no benefit.  We've
>> got to do relation stats that way because there's no a priori bound on
>> the number of relations, so we can't just preallocate enough shared
>> memory for all of them.  But there's no similar restriction here: the
>> number of backends IS fixed at startup time.  As long as we limit the
>> amount of progress information that a backend can supply to some fixed
>> length, which IMHO we definitely should, there's no need to add the
>> expense of funneling this through the stats collector.
>
> I agree with this, and I'd further add that if we don't have a
> fixed-length progress state, we've overdesigned the facility entirely.
> People won't be able to make sense of anything that acts much more
> complicated than "0% .. 100% done".  So you need to find a way of
> approximating progress of a given command in terms more or less
> like that, even if it's a pretty crude approximation.

That I don't agree with.  Even for something like VACUUM, it's pretty
hard to approximate overall progress - because, for example, normally
we'll only have 1 index scan per index, but we might have multiple
index scans or none if maintenance_work_mem is too small or if there
aren't any dead tuples after all.  I don't want our progress reporting
facility to end up with this reputation:

https://xkcd.com/612/

This point has already been discussed rather extensively upthread, but
to reiterate, I think it's much better to report slightly more
detailed information and let the user figure out what to do with it.
For example, for a VACUUM, I think we should report something like
this:

1. The number of heap pages scanned thus far.
2. The number of dead tuples found thus far.
3. The number of dead tuples we can store before we run out of
maintenance_work_mem.
4. The number of index pages processed by the current index vac cycle
(or a sentinel value if none is in progress).
5. The number of heap pages for which the "second heap pass" has been completed.

Now, if the user wants to flatten this out to a progress meter, they
can write an SQL expression which does that easily enough, folding the
sizes of the table and its indices and whatever assumptions they want
to make about what will happen down the road.  If we all agree on how
that should be done, it can even ship as a built-in view.  But I
*don't* think we should build those assumptions into the core progress
reporting facility.  For one thing, that would make updating the
progress meter considerably more expensive - you'd have to recompute a
new completion percentage instead of just saying "heap pages processed
went up by one".  For another thing, there are definitely going to be
some people that want the detailed information - and I can practically
guarantee that if we don't make it available, at least one person will
write a tool that tries to reverse-engineer the detailed progress
information from whatever we do report.
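
As a rough illustration of keeping the raw counters in the facility and
deriving percentages only on the consumer side, something like the following
could be used. The struct layout, the field names, the NO_INDEX_VACUUM
sentinel and both helper functions are assumptions made for this sketch; the
"fill" figure additionally assumes the dead-tuple counter reflects tuples
currently held in memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NO_INDEX_VACUUM UINT32_MAX   /* hypothetical sentinel: no index cycle running */

/* Raw counters corresponding to the five items listed above (names illustrative). */
typedef struct VacuumCounters
{
    uint32_t heap_pages_scanned;
    uint32_t dead_tuples_found;
    uint32_t max_dead_tuples;        /* capacity allowed by maintenance_work_mem */
    uint32_t index_pages_processed;  /* NO_INDEX_VACUUM if no cycle is in progress */
    uint32_t heap_pages_second_pass;
} VacuumCounters;

/* Integer percentage of part/whole, clamped to 0..100; returns 0 if whole is 0. */
static unsigned
percent_of(uint64_t part, uint64_t whole)
{
    if (whole == 0)
        return 0;
    if (part >= whole)
        return 100;
    return (unsigned) (part * 100 / whole);
}

/* How far the first heap pass has got, given the table size known to the caller. */
static unsigned
heap_scan_percent(const VacuumCounters *c, uint32_t total_heap_pages)
{
    assert(c != NULL);
    return percent_of(c->heap_pages_scanned, total_heap_pages);
}

/* How full the dead-tuple store is, i.e. how close an index vacuum cycle may be. */
static unsigned
dead_tuple_fill_percent(const VacuumCounters *c)
{
    assert(c != NULL);
    return percent_of(c->dead_tuples_found, c->max_dead_tuples);
}
```

The division happens only when somebody asks, so the backend doing the
vacuum never pays for it; any policy for combining the figures into one
number stays outside the reporting facility.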

Heck, I might do it myself.  If I find a long-running vacuum on a
customer system that doesn't seem to be making progress, knowing that
it's 69% complete and that the completion percentage isn't rising
doesn't help me much.  What I want to know is are we stuck in a heap
vacuum phase, or an index vacuum phase, or a second heap pass.  Are we
making progress so slowly that 69% takes forever to get to 70%, or are
we making absolutely no progress at all?  I think if we don't report a
healthy amount of detail here people will still frequently have to
resort to what I do now, which is ask the customer to install strace
and attach it to the vacuum process.  For many customers, that's not
so easy; and it never inspires any confidence.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [PROPOSAL] VACUUM Progress Checker.

From

Amit Langote

Date:

11 December 2015, 00:42:01

On 2015/12/10 20:46, Michael Paquier wrote:
> On Thu, Dec 10, 2015 at 7:23 PM, Amit Langote
> <Langote_Amit_f8@lab.ntt.co.jp> wrote:
>> AIUI, the counts published via stats collector are updated asynchronously
>> w.r.t. operations they count and mostly as aggregate figures. For example,
>> PgStat_StatTabEntry.blocks_fetched. IOW, we never see
>> pg_statio_all_tables.heap_blks_read updating as a scan reads blocks. Maybe
>> that helps keep traffic to pgstat collector to sane levels. But that is
>> not to mean that I think controlling stats collector levels was the only
>> design consideration behind how such counters are published.
>>
>> In case of reporting counters as progress info, it seems we might have to
>> send too many PgStat_Msg's, for example, for every block we finish
>> processing during vacuum. That kind of message traffic may swamp the
>> collector. Then we need to see the updated counters from other counters in
>> near real-time though that may be possible with suitable (build?)
>> configuration.
> 
> As far as I understand it, the basic reason why this patch exists is
> to allow a DBA to have a hint of the progress of a VACUUM that may be
> taking minutes, or say hours, which is something we don't have now. So
> it seems perfectly fine to me to report this information
> asynchronously with a bit of lag. Why would we need so much precision
> in the report?

Sorry, I didn't mean to overstate this requirement. I agree precise
real-time reporting of progress info is not such a stringent requirement
from the patch. The point regarding whether we should storm the collector
with progress info messages still holds, IMHO.

Thanks,
Amit

Re: [PROPOSAL] VACUUM Progress Checker.

From

Kyotaro HORIGUCHI

Date:

11 December 2015, 05:42:07

Sorry, I misunderstood the meaning of PgStat_*.

At Fri, 11 Dec 2015 09:41:04 +0900, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote in
<566A1BA0.70707@lab.ntt.co.jp>
> > As far as I understand it, the basic reason why this patch exists is
> > to allow a DBA to have a hint of the progress of a VACUUM that may be
> > taking minutes, or say hours, which is something we don't have now. So
> > it seems perfectly fine to me to report this information
> > asynchronously with a bit of lag. Why would we need so much precision
> > in the report?
> 
> Sorry, I didn't mean to overstate this requirement. I agree precise
> real-time reporting of progress info is not such a stringent requirement
> from the patch. The point regarding whether we should storm the collector
> with progress info messages still holds, IMHO.

An interval of a few seconds between messages would be
sufficient. I personally think that calling gettimeofday() for
every buffer (or every few buffers) is not so heavyweight, but I
suppose there's no such consensus here. However,
IsCheckpointOnSchedule does that for every buffer written.

vacuum_delay_point() seems to be a reasonable point to check the
interval and send stats, since it is designed to be called
at intervals that are also appropriate for this purpose.
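
A minimal sketch of such an interval check, assuming the caller supplies the
current time in milliseconds (for example from gettimeofday()) each time the
delay point is reached. ReportThrottle and throttle_should_report are
hypothetical names, not part of the patch or of PostgreSQL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Per-backend state for rate-limiting progress messages. */
typedef struct ReportThrottle
{
    int64_t  interval_ms;     /* minimum gap between two reports */
    int64_t  last_report_ms;  /* time at which the last report was sent */
    bool     reported_once;   /* false until the first report goes out */
    uint64_t reports_sent;    /* number of reports allowed through so far */
} ReportThrottle;

/*
 * Returns true if a report should be sent now and records that it was.
 * The first call always reports; later calls report only once at least
 * interval_ms has elapsed since the previous report.
 */
static bool
throttle_should_report(ReportThrottle *t, int64_t now_ms)
{
    assert(t->interval_ms >= 0);

    if (t->reported_once && now_ms - t->last_report_ms < t->interval_ms)
        return false;

    t->reported_once = true;
    t->last_report_ms = now_ms;
    t->reports_sent++;
    return true;
}
```

With a two-second interval, calls at 0 ms and 2000 ms would send a report
while calls in between would not, regardless of how many buffers were
processed in that time.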

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: [PROPOSAL] VACUUM Progress Checker.

From

Amit Langote

Date:

11 December 2015, 06:06:19

On 2015/12/11 14:41, Kyotaro HORIGUCHI wrote:
> Sorry, I misunderstood the meaning of PgStat_*.

I should've just said "messages to the stats collector" instead of
"PgStat_Msg's".

> 
> At Fri, 11 Dec 2015 09:41:04 +0900, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote
>>> As far as I understand it, the basic reason why this patch exists is
>>> to allow a DBA to have a hint of the progress of a VACUUM that may be
>>> taking minutes, or say hours, which is something we don't have now. So
>>> it seems perfectly fine to me to report this information
>>> asynchronously with a bit of lag. Why would we need so much precision
>>> in the report?
>>
>> Sorry, I didn't mean to overstate this requirement. I agree precise
>> real-time reporting of progress info is not such a stringent requirement
>> from the patch. The point regarding whether we should storm the collector
>> with progress info messages still holds, IMHO.
> 
> Taking a few seconds interval between each messages would be
> sufficient. I personaly think that gettimeofday() per processing
> every buffer (or few buffers) is not so heavy-weight but I
> suppose there's not such a consensus here. However,
> IsCheckpointOnSchedule does that per writing one buffer.
> 
> vacuum_delay_point() seems to be a reasonable point to check the
> interval and send stats since it would be designed to be called
> with the interval also appropriate for this purpose.

Interesting, vacuum_delay_point() may be worth considering.

It seems though that, overall, the PgBackendStatus approach may be better
suited for progress tracking. Let's see what the author thinks.

Thanks,
Amit

Re: [PROPOSAL] VACUUM Progress Checker.

From

Michael Paquier

Date:

11 December 2015, 06:25:18

On Fri, Dec 11, 2015 at 12:59 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Thu, Dec 10, 2015 at 9:49 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Robert Haas <robertmhaas@gmail.com> writes:
>>> Oh, please, no.  Gosh, this is supposed to be a lightweight facility!
>>> Just have a chunk of shared memory and write the data in there.  If
>>> you try to feed this through the stats collector you're going to
>>> increase the overhead by 100x or more, and there's no benefit.  We've
>>> got to do relation stats that way because there's no a priori bound on
>>> the number of relations, so we can't just preallocate enough shared
>>> memory for all of them.  But there's no similar restriction here: the
>>> number of backends IS fixed at startup time.  As long as we limit the
>>> amount of progress information that a backend can supply to some fixed
>>> length, which IMHO we definitely should, there's no need to add the
>>> expense of funneling this through the stats collector.
>>
>> I agree with this, and I'd further add that if we don't have a
>> fixed-length progress state, we've overdesigned the facility entirely.
>> People won't be able to make sense of anything that acts much more
>> complicated than "0% .. 100% done".  So you need to find a way of
>> approximating progress of a given command in terms more or less
>> like that, even if it's a pretty crude approximation.

Check. My opinion is based on the fact that most of the backends are
not going to use the progress facility at all, and we actually do not
need a high level of precision for VACUUM reports: we could simply
send messages with a certain delay between two messages. And it looks
like a waste to allocate that for all the backends. But I am going to
withdraw here, two committers is by far too much pressure.

> That I don't agree with.  Even for something like VACUUM, it's pretty
> hard to approximate overall progress - because, for example, normally
> we'll only have 1 index scan per index, but we might have multiple
> index scans or none if maintenance_work_mem is too small or if there
> aren't any dead tuples after all.  I don't want our progress reporting
> facility to end up with this reputation:
>
> https://xkcd.com/612/

This brings memories. Who has never faced that...

> This point has already been discussed rather extensively upthread, but
> to reiterate, I think it's much better to report slightly more
> detailed information and let the user figure out what to do with it.
> For example, for a VACUUM, I think we should report something like
> this:
> 1. The number of heap pages scanned thus far.
> 2. The number of dead tuples found thus far.
> 3. The number of dead tuples we can store before we run out of
> maintenance_work_mem.
> 4. The number of index pages processed by the current index vac cycle
> (or a sentinel value if none is in progress).
> 5. The number of heap pages for which the "second heap pass" has been completed
> Now, if the user wants to flatten this out to a progress meter, they
> can write an SQL expression which does that easily enough, folding the
> sizes of the table and its indices and whatever assumptions they want
> to make about what will happen down the road.  If we all agree on how
> that should be done, it can even ship as a built-in view.  But I
> *don't* think we should build those assumptions into the core progress
> reporting facility.  For one thing, that would make updating the
> progress meter considerably more expensive - you'd have to recompute a
> new completion percentage instead of just saying "heap pages processed
> went up by one".

This part I agree with. Having global counters and letting the user compute
any kind of percentage or progress bar is definitely the way to go.

> For another thing, there are definitely going to be
> some people that want the detailed information - and I can practically
> guarantee that if we don't make it available, at least one person will
> write a tool that tries to reverse-engineer the detailed progress
> information from whatever we do report.

OK, so this justifies the fact of having detailed information, but
does it justify the fact of having precise and accurate data? ISTM
that having detailed information and precise information are two
different things. The level of details is defined depending on how
verbose we want the information to be, and the list you are giving
would fulfill this requirement nicely for VACUUM. The level of
precision/accuracy at which this information is provided, though,
depends on the frequency at which we want to send this information. For
a long-running VACUUM it does not seem necessary to update the fields of
the progress tracker each time a counter needs to be incremented.
CLUSTER has been mentioned as well as a potential target for the
progress facility, but it seems that it also falls into the category
of things that would need to be reported at a slower pace
than "each-time-a-counter-is-incremented".

My impression is just based on the needs of VACUUM and CLUSTER.
Perhaps I am lacking imagination regarding the potential use cases of
the progress facility though in cases where we'd want to provide
extremely precise progress information :)
It just seems to me that this is not a requirement for VACUUM or
CLUSTER. That's all.
-- 
Michael

Re: [PROPOSAL] VACUUM Progress Checker.

From

Robert Haas

Date:

11 December 2015, 17:08:03

On Fri, Dec 11, 2015 at 1:25 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:
>> For another thing, there are definitely going to be
>> some people that want the detailed information - and I can practically
>> guarantee that if we don't make it available, at least one person will
>> write a tool that tries to reverse-engineer the detailed progress
>> information from whatever we do report.
>
> OK, so this justifies the fact of having detailed information, but
> does it justify the fact of having precise and accurate data? ISTM
> that having detailed information and precise information are two
> different things. The level of details is defined depending on how
> verbose we want the information to be, and the list you are giving
> would fulfill this requirement nicely for VACUUM. The level of
> precision/accuracy at which this information is provided though
> depends at which frequency we want to send this information. For
> long-running VACUUM it does not seem necessary to update the fields of
> the progress tracker each time a counter needs to be incremented.
> CLUSTER has been mentioned as well as a potential target for the
> progress facility, but it seems that it enters as well in the category
> of things that would need to be reported on a slower frequency pace
> than "each-time-a-counter-is-incremented".
>
> My impression is just based on the needs of VACUUM and CLUSTER.
> Perhaps I am lacking imagination regarding the potential use cases of
> the progress facility though in cases where we'd want to provide
> extremely precise progress information :)
> It just seems to me that this is not a requirement for VACUUM or
> CLUSTER. That's all.

It's not a hard requirement, but it should be quite easy to do without
adding any significant overhead.  All you need to do is something
like:

foo->changecount++;
pg_write_barrier();
foo->count_of_blocks++;
pg_write_barrier();
foo->changecount++;

I suspect that's plenty cheap enough to do for every block.
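
For completeness, here is a sketch of how the writer above could pair with a
reader that retries until it sees a stable, even change count. It uses C11
atomics and fences as a stand-in for pg_write_barrier() and its read-side
counterpart; that substitution, the ProgressCounter type and the function
names are assumptions made for illustration, not the backend's actual code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

typedef struct ProgressCounter
{
    atomic_uint changecount;      /* odd while an update is in flight */
    atomic_uint count_of_blocks;  /* the value being published */
} ProgressCounter;

static void
progress_init(ProgressCounter *p)
{
    assert(p != NULL);
    atomic_init(&p->changecount, 0);
    atomic_init(&p->count_of_blocks, 0);
}

/* Writer side: only the backend that owns the slot calls this. */
static void
progress_increment(ProgressCounter *p)
{
    assert(p != NULL);
    atomic_fetch_add_explicit(&p->changecount, 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);   /* stands in for pg_write_barrier() */
    atomic_fetch_add_explicit(&p->count_of_blocks, 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);   /* stands in for pg_write_barrier() */
    atomic_fetch_add_explicit(&p->changecount, 1, memory_order_relaxed);
}

/* Reader side: retry until the change count is even and unchanged across the read. */
static unsigned
progress_read(ProgressCounter *p)
{
    assert(p != NULL);
    for (;;)
    {
        unsigned before = atomic_load_explicit(&p->changecount, memory_order_acquire);
        unsigned value = atomic_load_explicit(&p->count_of_blocks, memory_order_relaxed);

        atomic_thread_fence(memory_order_acquire);

        unsigned after = atomic_load_explicit(&p->changecount, memory_order_relaxed);

        if (before == after && (before & 1) == 0)
            return value;
    }
}
```

The writer never waits and takes no lock; only a reader that happens to
overlap an update pays for a retry. With a single counter the change count
is not strictly needed, but the same pattern extends to slots holding
several fields that must be read consistently.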

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [PROPOSAL] VACUUM Progress Checker.

From

Vinayak Pokale

Date:

25 December 2015, 12:46:13

Hi,
Please find attached a patch addressing the following comments.

>The relation OID should be reported and not its name. In case of a
>relation rename that would not be cool for tracking, and most users
>are surely going to join with other system tables using it.
The relation OID is now reported instead of the relation name.
The following interface function is called once at the beginning to report the relation OID:
void pgstat_report_command_target(Oid relid)

>Regarding pg_stat_get_vacuum_progress(): I think a backend can simply be
>skipped if (!has_privs_of_role(GetUserId(), beentry->st_userid)) (cannot
>put that in plain English, :))
Updated in the attached patch.

In the previous patch, ACTIVITY_IS_VACUUM was set unnecessarily for VACOPT_FULL, which is not covered by lazy_scan_heap().
I have fixed that in the attached patch and also renamed ACTIVITY_IS_VACUUM to COMMAND_LAZY_VACUUM.

Added documentation for the view.
Some more comments still need to be addressed.

Regards,

Vinayak Pokale

Attachment

Vacuum_progress_checker_v8.patch

Re: [PROPOSAL] VACUUM Progress Checker.

From

Amit Langote

Date:

28 December 2015, 00:32:38

Hi Vinayak,

On 2015/12/25 21:46, Vinayak Pokale wrote:
> Hi,
> Please find attached patch addressing following comments.
> 
>> The relation OID should be reported and not its name. In case of a
>> relation rename that would not be cool for tracking, and most users
>> are surely going to join with other system tables using it.
> The relation OID is reported instead of relation name.
> The following interface function is called at the beginning to report the
> relation OID once.
> void pgstat_report_command_target(Oid relid)
> 
>> Regarding pg_stat_get_vacuum_progress(): I think a backend can simply be
>> skipped if (!has_privs_of_role(GetUserId(), beentry->st_userid)) (cannot
>> put that in plain English, :))
> Updated in the attached patch.
> 
> In the previous patch, ACTIVITY_IS_VACUUM is set unnecessarily for
> VACOPT_FULL and they are not covered by lazy_scan_heap().
> I have updated it in attached patch and also renamed ACTIVITY_IS_VACUUM to
> COMMAND_LAZY_VACUUM.
> 
> Added documentation for view.
> Some more comments need to be addressed.

I suspect you need to create a new CF entry for this patch in CF 2016-01.

Thanks,
Amit

Re: [PROPOSAL] VACUUM Progress Checker.

From

Sudhir Lonkar-2

Date:

06 January 2016, 18:01:48

Hello,
>Please find attached patch addressing following comments
I have tested this patch.
It shows an empty (null) value in the phase column of pg_stat_vacuum_progress when
I switch to an unprivileged user.
With the previous patch, it showed <insufficient privilege> in the phase
column.

Thanks and Regards,
Sudhir Lonkar




Re: [PROPOSAL] VACUUM Progress Checker.

From

Rahila Syed

Date:

08 January 2016, 12:20:52

>I suspect you need to create a new CF entry for this patch in CF 2016-01.

Unless I am missing something, there seems to be no entry for this patch on the CF 2016-01 page: https://commitfest.postgresql.org/8/.

Regrettably, we have missed the deadline for adding the patch to this commitfest. Is there still some way to add it to commitfest 2016-01? As this feature received a lot of feedback in the previous commitfest, adding it to this one will surely help move it forward and get it ready for PostgreSQL 9.6.

Thank you,

Rahila Syed

Re: [PROPOSAL] VACUUM Progress Checker.

From

Amit Langote

Date:

12 January 2016, 01:31:07

On 2016/01/08 21:20, Rahila Syed wrote:
>> I suspect you need to create a new CF entry for this patch in CF 2016-01.
> 
> Unless I am missing something, there seems to be no entry for this patch
> into CF 2016-01 page: https://commitfest.postgresql.org/8/.
> Regrettably, we have exceeded the deadline to add the patch into this
> commitfest. Is there still some way to add it to the commitfest 2016-01? As
> this feature has received lot of feedback in previous commitfest , adding
> it to this commitfest will surely help in progressing it in order to make
> it ready for PostgreSQL 9.6.

I see that the patch has been added to the CF.

I'm slightly concerned that the latest patch doesn't incorporate any
revisions to the original pgstat interface per Robert's comments in [1].

Thanks,
Amit

[1]
http://www.postgresql.org/message-id/CA+TgmoZ5q4N4T0c0_-XKTencEWOAbfdKtoPPT8NUjjjV5OHMFQ@mail.gmail.com

Re: [PROPOSAL] VACUUM Progress Checker.

From

Amit Langote

Date:

12 January 2016, 02:22:14

On 2016/01/12 10:30, Amit Langote wrote:
> I'm slightly concerned that the latest patch doesn't incorporate any
> revisions to the original pgstat interface per Robert's comments in [1].

I meant to say "originally proposed pgstat interface on this thread".

Thanks,
Amit

Re: [PROPOSAL] VACUUM Progress Checker.

From

Vinayak Pokale

Date:

12 January 2016, 02:29:03

On Jan 12, 2016 11:22 AM, "Amit Langote" <Langote_Amit_f8@lab.ntt.co.jp> wrote:
>
> On 2016/01/12 10:30, Amit Langote wrote:
> > I'm slightly concerned that the latest patch doesn't incorporate any
> > revisions to the original pgstat interface per Robert's comments in [1].
>
> I meant to say "originally proposed pgstat interface on this thread".

Yes.
Robert's comments related to the pgstat interface need to be addressed.
I will update it.

Regards,
Vinayak Pokale

Re: [PROPOSAL] VACUUM Progress Checker.

From

Vinayak Pokale

Date:

12 January 2016, 02:35:27

Hi Sudhir,

On Jan 7, 2016 3:02 AM, "Sudhir Lonkar-2" <sudhir.lonkar@nttdata.com> wrote:
>
> Hello,
> >Please find attached patch addressing following comments
> I have tested this patch.
> It is showing empty (null) in phase column of pg_stat_vacuum_progress, when
> I switched to a unprivileged user.
> In the previous patch, it is showing <insufficient privilege> in phase
> column.

Yes. I will update the patch.

Regards,
Vinayak Pokale

Re: [PROPOSAL] VACUUM Progress Checker.

From

Amit Langote

Date:

13 January 2016, 06:17:40

On 2016/01/12 11:28, Vinayak Pokale wrote:
> On Jan 12, 2016 11:22 AM, "Amit Langote" <Langote_Amit_f8@lab.ntt.co.jp>
> wrote:
>>
>> On 2016/01/12 10:30, Amit Langote wrote:
>>> I'm slightly concerned that the latest patch doesn't incorporate any
>>> revisions to the original pgstat interface per Robert's comments in [1].
>>
>> I meant to say "originally proposed pgstat interface on this thread".
> 
> Yes.
> Robert's comments related to pgstat interface needs to be address.
> I will update it.

So, I updated the patch status to "Waiting on Author".

Thanks,
Amit

Re: [PROPOSAL] VACUUM Progress Checker.

From

Vinayak Pokale

Date:

25 January 2016, 11:58:13

Hi,

Please find attached updated patch with an updated interface.

I added the interface below to update only scanned_heap_pages, scanned_index_pages and index_scan_count.

void pgstat_report_progress_scanned_pages(int num_of_int, uint32 *progress_scanned_pages_param)

Another interface function, called once at the beginning:
void pgstat_report_progress_set_command_target(Oid relid)

Regards,

Vinayak

Attachment

Vacuum_progress_checker_v9.patch

Re: [PROPOSAL] VACUUM Progress Checker.

From

Amit Langote

Date:

26 January 2016, 00:51:57

Hi Vinayak,

On 2016/01/25 20:58, Vinayak Pokale wrote:
> Hi,
> 
> Please find attached updated patch with an updated interface.
> 

Thanks for updating the patch.

> I added the below interface to update the
> scanned_heap_pages,scanned_index_pages and index_scan_count only.
> void pgstat_report_progress_scanned_pages(int num_of_int, uint32
> *progress_scanned_pages_param)

I think it's still the same interface with the names changed. IIRC, what
was suggested was to provide a way to not have to pass the entire array
for the update of a single member of it. Just pass the index of the
updated member and its new value. Maybe, something like:

void pgstat_progress_update_counter(int index, uint32 newval);

The above function would presumably update the value of
beentry.st_progress_counter[index] or something like that.
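
To make the suggested shape concrete, here is a minimal sketch of a single-counter update against a mock backend entry. Everything in it is an assumption for illustration: MockBackendEntry, N_PROGRESS_COUNTERS, my_beentry and progress_update_counter are hypothetical names, and the real backend entry lives in shared memory and would need the changecount protocol discussed earlier in the thread.

```c
#include <assert.h>
#include <stdint.h>

#define N_PROGRESS_COUNTERS 8   /* hypothetical array size */

/* Hypothetical, stripped-down stand-in for a backend status entry. */
typedef struct MockBackendEntry
{
    uint32_t st_progress_counter[N_PROGRESS_COUNTERS];
} MockBackendEntry;

/* One entry standing in for this backend's slot. */
MockBackendEntry my_beentry;

/* Overwrite exactly one counter; all other slots are left untouched. */
void
progress_update_counter(int index, uint32_t newval)
{
    assert(index >= 0 && index < N_PROGRESS_COUNTERS);
    my_beentry.st_progress_counter[index] = newval;
}
```

The caller passes only what changed, so the cost of an update does not grow with the number of counters a command reports.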

Thanks,
Amit

Re: [PROPOSAL] VACUUM Progress Checker.

From

Vinayak Pokale

Date:

26 January 2016, 02:22:54

Hi Amit,

Thank you for reviewing the patch.

On Jan 26, 2016 9:51 AM, "Amit Langote" <Langote_Amit_f8@lab.ntt.co.jp> wrote:
>
>
> Hi Vinayak,
>
> On 2016/01/25 20:58, Vinayak Pokale wrote:
> > Hi,
> >
> > Please find attached updated patch with an updated interface.
> >
>
> Thanks for updating the patch.
>
> > I added the below interface to update the
> > scanned_heap_pages,scanned_index_pages and index_scan_count only.
> > void pgstat_report_progress_scanned_pages(int num_of_int, uint32
> > *progress_scanned_pages_param)
>
> I think it's still the same interface with the names changed. IIRC, what
> was suggested was to provide a way to not have to pass the entire array
> for the update of a single member of it. Just pass the index of the
> updated member and its new value. Maybe, something like:
>
> void pgstat_progress_update_counter(int index, uint32 newval);
>
> The above function would presumably update the value of
> beentry.st_progress_counter[index] or something like that.

Understood. I will update the patch.

Regards,
Vinayak

Re: [PROPOSAL] VACUUM Progress Checker.

From

Vinayak Pokale

Date:

27 January 2016, 04:38:00

Hi,

Please find attached updated patch with an updated interface.

On Jan 26, 2016 11:22 AM, "Vinayak Pokale" <vinpokale@gmail.com> wrote:
>
> Hi Amit,
>
> Thank you for reviewing the patch.
>
> On Jan 26, 2016 9:51 AM, "Amit Langote" <Langote_Amit_f8@lab.ntt.co.jp> wrote:
> >
> >
> > Hi Vinayak,
> >
> > On 2016/01/25 20:58, Vinayak Pokale wrote:
> > > Hi,
> > >
> > > Please find attached updated patch with an updated interface.
> > >
> >
> > Thanks for updating the patch.
> >
> > > I added the below interface to update the
> > > scanned_heap_pages,scanned_index_pages and index_scan_count only.
> > > void pgstat_report_progress_scanned_pages(int num_of_int, uint32
> > > *progress_scanned_pages_param)
> >
> > I think it's still the same interface with the names changed. IIRC, what
> > was suggested was to provide a way to not have to pass the entire array
> > for the update of a single member of it. Just pass the index of the
> > updated member and its new value. Maybe, something like:
> >
> > void pgstat_progress_update_counter(int index, uint32 newval);
> >
> > The above function would presumably update the value of
> > beentry.st_progress_counter[index] or something like that.

The following interface functions are added:

/*
 * Called to update a single progress counter.
 * index: index into the array of uint32 counters in the beentry
 * counter: new value of the (index)th counter
 */
void
pgstat_report_progress_update_counter(int index, uint32 counter)

/*
 * Called to update the VACUUM progress phase.
 * index: index of the message to update
 * msg: new value of the (index)th message
 */
void
pgstat_report_progress_update_message(int index, char msg[N_PROGRESS_PARAM][PROGRESS_MESSAGE_LENGTH])

Regards,
Vinayak

Attachment

Vacuum_progress_checker_v10.patch

Re: [PROPOSAL] VACUUM Progress Checker.

From

Robert Haas

Date:

27 January 2016, 20:57:56

On Tue, Jan 26, 2016 at 11:37 PM, Vinayak Pokale <vinpokale@gmail.com> wrote:
> Hi,
>
> Please find attached updated patch with an updated interface.

Well, this isn't right.  You've got this sort of thing:

+            scanned_index_pages += RelationGetNumberOfBlocks(Irel[i]);
+            /* Report progress to the statistics collector */
+            pgstat_report_progress_update_message(0, progress_message);
+            pgstat_report_progress_update_counter(1, scanned_heap_pages);
+            pgstat_report_progress_update_counter(3, scanned_index_pages);
+            pgstat_report_progress_update_counter(4, vacrelstats->num_index_scans + 1);

The point of having pgstat_report_progress_update_counter() is so that
you can efficiently update a single counter without having to update
everything, when only one counter has changed.  But here you are
calling this function a whole bunch of times in a row, which
completely misses the point - if you are updating all the counters,
it's more efficient to use an interface that does them all at once
instead of one at a time.
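
As an illustration of "an interface that does them all at once", the sketch below publishes several counters inside one changecount cycle, so the bookkeeping cost is paid once per call rather than once per counter. It is a hypothetical shape, not the patch's API: MockBackendEntry, my_beentry and progress_update_counters are made-up names, and the memory barriers a real shared-memory version would need are only marked in comments.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define N_PROGRESS_COUNTERS 8   /* hypothetical array size */

/* Hypothetical stand-in for a backend status entry. */
typedef struct MockBackendEntry
{
    unsigned changecount;       /* bumped twice per published update */
    uint32_t st_progress_counter[N_PROGRESS_COUNTERS];
} MockBackendEntry;

MockBackendEntry my_beentry;

/* Publish n (index, value) pairs as one update. */
void
progress_update_counters(size_t n, const int *index, const uint32_t *newval)
{
    my_beentry.changecount++;   /* a write barrier would follow here */
    for (size_t i = 0; i < n; i++)
    {
        assert(index[i] >= 0 && index[i] < N_PROGRESS_COUNTERS);
        my_beentry.st_progress_counter[index[i]] = newval[i];
    }
    /* a write barrier would precede this second bump */
    my_beentry.changecount++;
}
```

With this split, the single-counter call is the cheap path for a value that just changed, and the batch call covers the rare points where several values change together.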

But there's a second problem here, too, which is that I think you've
got this code in the wrong place.  The second point of the
pgstat_report_progress_update_counter interface is that this should be
cheap enough that we can do it every time the counter changes.  That's
not what you are doing here.  You're updating the counters at various
points in the code and just pushing new values for all of them
regardless of which ones have changed.  I think you should find a way
that you can update the value immediately at the exact moment it
changes.  If that seems like too much of a performance hit we can talk
about it, but I think the value of this feature will be greatly
weakened if users can't count on it to be fully and continuously up to
the moment.  When something gets stuck, you want to know where it's
stuck, not approximately kinda where it's stuck.

+                if(!scan_all)
+                    scanned_heap_pages = scanned_heap_pages + next_not_all_visible_block;

I don't want to be too much of a stickler for details here, but it
seems to me that this is an outright lie.  The number of scanned pages
does not go up when we decide to skip some pages, because scanning and
skipping aren't the same thing.  If we're only going to report one
number here, it needs to be called something like "current heap page",
and then you can just report blkno at the top of each iteration of
lazy_scan_heap's main loop.  If you want to report the numbers of
scanned and skipped pages separately that'd be OK too, but you can't
call it the number of scanned pages and then actually report a value
that is not that.
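
The distinction can be shown with a toy loop. This is a simplified sketch, not lazy_scan_heap itself: it skips every all-visible page individually, whereas the real code decides on skipping in a more involved way, and ScanProgress and mock_scan_heap are hypothetical names. The point is only that the current block number advances on every iteration while the scanned count does not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical progress fields for a heap scan. */
typedef struct ScanProgress
{
    uint32_t current_heap_blkno;  /* block the loop is positioned at */
    uint32_t scanned_pages;       /* pages actually read */
    uint32_t skipped_pages;       /* pages bypassed as all-visible */
} ScanProgress;

/* Walk nblocks pages; all_visible[blkno] marks pages that may be skipped. */
void
mock_scan_heap(const bool *all_visible, uint32_t nblocks, ScanProgress *p)
{
    assert(all_visible != NULL && p != NULL);
    for (uint32_t blkno = 0; blkno < nblocks; blkno++)
    {
        /* reported at the top of every iteration, skipped or not */
        p->current_heap_blkno = blkno;

        if (all_visible[blkno])
        {
            p->skipped_pages++;
            continue;
        }
        p->scanned_pages++;
    }
}
```

Only the sum of scanned and skipped pages reaches the relation size, so a field that is meant to track position should be named for position.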

+        /*
+         * Reporting vacuum progress to statistics collector
+         */

This patch doesn't report anything to the statistics collector, nor should it.

Instead of making the SQL-visible function
pg_stat_get_vacuum_progress(), I think it should be something more
generic like pg_stat_get_command_progress().  Maybe VACUUM will be the
only command that reports into that feature for right now, but I'd
hope for us to change that pretty soon after we get the first patch
committed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [PROPOSAL] VACUUM Progress Checker.

From

Rahila Syed

Date:

28 January 2016, 10:38:20

>+ if(!scan_all)
>+ scanned_heap_pages = scanned_heap_pages +
>next_not_all_visible_block;

>I don't want to be too much of a stickler for details here, but it
>seems to me that this is an outright lie.

Initially, scanned_heap_pages was meant to report only the pages actually scanned; skipped pages were not added to the count. Instead, the skipped pages were deducted from the total number of heap pages to be scanned, so that the number of scanned pages would eventually add up to the total. Following later comments, the total was kept constant and the skipped-page count was added to the scanned pages, so that the count adds up to the total heap pages at the end of the scan. That said, as suggested, scanned_heap_pages should be renamed to current_heap_page and report the current blkno in the lazy_scan_heap loop, which reaches the total number of heap pages (nblocks) at the end of the scan. scanned_heap_pages can then be reported as a separate number that won't include skipped pages.

>+ /*
>+ * Reporting vacuum progress to statistics collector
>+ */

>This patch doesn't report anything to the statistics collector, nor should it.

Yes. This was incorrectly added initially by referring to similar pgstat_report interface functions.

Thank you,

Rahila Syed

Re: [PROPOSAL] VACUUM Progress Checker.

From

Amit Langote

Date:

28 January 2016, 13:42:08

Hi,

On Thu, Jan 28, 2016 at 7:38 PM, Rahila Syed <rahilasyed90@gmail.com> wrote:
>>+                if(!scan_all)
>>+                    scanned_heap_pages = scanned_heap_pages +
>>next_not_all_visible_block;
>
>>I don't want to be too much of a stickler for details here, but it
>>seems to me that this is an outright lie.
>
> Initially the scanned_heap_pages were meant to report just the scanned pages
> and skipped pages were not added to the count.  Instead the skipped pages
> were deduced from number of total heap pages to be scanned to make the
> number of scanned pages eventually add up to total heap pages.   As per
> comments received later total heap pages were kept constant and skipped
> pages count was added to scanned pages to make the count add up to total
> heap pages at the end of scan. That said, as suggested, scanned_heap_pages
> should be renamed to current_heap_page to report current blkno in
> lazy_scan_heap loop which will add up to total heap pages(nblocks) at the
> end of scan. And scanned_heap_pages can be reported as a separate number
> which wont contain skipped pages.

Or keep scanned_heap_pages as is and add a skipped_pages (or
skipped_heap_pages). I guess the latter would be updated not only for
all visible skipped pages but also pin skipped pages. That is,
updating its counter right after vacrelstats->pinskipped_pages++ which
there are a couple of instances of. Likewise a good (and only?) time
to update the former's counter would be right after
vacrelstats->scanned_pages++. Although, I see at least one place where
both are incremented so maybe I'm not entirely correct about the last
two sentences.

Thanks,
Amit

Re: [PROPOSAL] VACUUM Progress Checker.

From

Robert Haas

Date:

28 January 2016, 14:53:28

On Thu, Jan 28, 2016 at 8:41 AM, Amit Langote <amitlangote09@gmail.com> wrote:
> Or keep scanned_heap_pages as is and add a skipped_pages (or
> skipped_heap_pages). I guess the latter would be updated not only for
> all visible skipped pages but also pin skipped pages. That is,
> updating its counter right after vacrelstats->pinskipped_pages++ which
> there are a couple of instances of. Likewise a good (and only?) time
> to update the former's counter would be right after
> vacrelstats->scanned_pages++. Although, I see at least one place where
> both are incremented so maybe I'm not entirely correct about the last
> two sentences.

So I've spent a fair amount of time debugging really-long-running
VACUUM processes with customers, and generally what I really want to
know is:

>>> What block number are we at? <<<

Because, if I know that, and I can see how fast that's increasing,
then I can estimate whether the VACUUM is going to end in a reasonable
period of time or not.  So my preference is to not bother breaking out
skipped pages, but just report the block number and call it good.  I
will defer to a strong consensus on something else, but reporting the
block number has the advantage of being dead simple and, in my
experience, that would answer the question that I typically have.
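
The estimate described above amounts to a linear extrapolation from two samples of the block number. The sketch below shows that arithmetic under stated assumptions: estimate_remaining_seconds is a hypothetical helper, the timestamps are plain seconds supplied by the caller, and the result covers only the heap pass (any index or heap vacuuming passes are not accounted for).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Extrapolate the time left in the heap pass from two observations of the
 * current block number.  Returns -1.0 when no forward progress was seen.
 */
double
estimate_remaining_seconds(uint32_t total_blocks,
                           uint32_t blkno_then, double time_then,
                           uint32_t blkno_now, double time_now)
{
    assert(blkno_now <= total_blocks);

    /* stuck, or samples out of order: there is no rate to extrapolate */
    if (blkno_now <= blkno_then || time_now <= time_then)
        return -1.0;

    double blocks_per_second = (blkno_now - blkno_then) / (time_now - time_then);

    return (total_blocks - blkno_now) / blocks_per_second;
}
```

For example, going from block 100 to block 300 in 10 seconds on a 1000-block relation gives 20 blocks per second and 35 seconds remaining; a result of -1.0 is the "stuck" case that is worth investigating.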

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [PROPOSAL] VACUUM Progress Checker.

From

Amit Langote

Date:

29 January 2016, 01:04:06

On 2016/01/28 23:53, Robert Haas wrote:
> On Thu, Jan 28, 2016 at 8:41 AM, Amit Langote <amitlangote09@gmail.com> wrote:
>> Or keep scanned_heap_pages as is and add a skipped_pages (or
>> skipped_heap_pages). I guess the latter would be updated not only for
>> all visible skipped pages but also pin skipped pages. That is,
>> updating its counter right after vacrelstats->pinskipped_pages++ which
>> there are a couple of instances of. Likewise a good (and only?) time
>> to update the former's counter would be right after
>> vacrelstats->scanned_pages++. Although, I see at least one place where
>> both are incremented so maybe I'm not entirely correct about the last
>> two sentences.
> 
> So I've spent a fair amount of time debugging really-long-running
> VACUUM processes with customers, and generally what I really want to
> know is:
> 
>>>> What block number are we at? <<<
> 
> Because, if I know that, and I can see how fast that's increasing,
> then I can estimate whether the VACUUM is going to end in a reasonable
> period of time or not.  So my preference is to not bother breaking out
> skipped pages, but just report the block number and call it good.  I
> will defer to a strong consensus on something else, but reporting the
> block number has the advantage of being dead simple and, in my
> experience, that would answer the question that I typically have.

Okay, I agree that reporting just the current blkno is simple and good
enough. How about numbers of what we're going to report as the "Vacuuming
Index and Heap" phase? I guess we can still keep the scanned_index_pages
and index_scan_count. So we have:

+CREATE VIEW pg_stat_vacuum_progress AS
+    SELECT
+              S.pid,
+              S.relid,
+              S.phase,
+              S.total_heap_blks,
+              S.current_heap_blkno,
+              S.total_index_pages,
+              S.scanned_index_pages,
+              S.index_scan_count,
+              S.percent_complete
+    FROM pg_stat_get_vacuum_progress() AS S;

I guess it won't remain pg_stat_get_"vacuum"_progress(), though.
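
How percent_complete would be computed is not spelled out here. One plausible reading, in line with the formula in the original proposal, is the current heap block over the total heap blocks. The sketch below assumes exactly that; heap_percent_complete is a hypothetical helper, and it ignores the index phases entirely.

```c
#include <assert.h>
#include <stdint.h>

/* Integer percentage of the heap pass completed, under the assumption above. */
uint32_t
heap_percent_complete(uint32_t current_heap_blkno, uint32_t total_heap_blks)
{
    assert(current_heap_blkno <= total_heap_blks);

    if (total_heap_blks == 0)
        return 100;             /* empty relation: nothing left to scan */

    /* widen before multiplying so large relations cannot overflow 32 bits */
    return (uint32_t) (((uint64_t) current_heap_blkno * 100) / total_heap_blks);
}
```

The 64-bit intermediate matters because block numbers are 32-bit values, and multiplying a large one by 100 would otherwise wrap around.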

Thanks,
Amit

Re: [PROPOSAL] VACUUM Progress Checker.

From

Rahila Syed

Date:

29 January 2016, 12:02:11

>Okay, I agree that reporting just the current blkno is simple and good
>enough. How about numbers of what we're going to report as the "Vacuuming
>Index and Heap" phase? I guess we can still keep the scanned_index_pages
>and index_scan_count So we have:
>+CREATE VIEW pg_stat_vacuum_progress AS
>+ SELECT
>+ S.pid,
>+ S.relid,
>+ S.phase,
>+ S.total_heap_blks,
>+ S.current_heap_blkno,
>+ S.total_index_pages,
>+ S.scanned_index_pages,
>+ S.index_scan_count,
>+ S.percent_complete
>+ FROM pg_stat_get_vacuum_progress() AS S;
>I guess it won't remain pg_stat_get_"vacuum"_progress(), though.

Apart from these, as suggested in [1], finer-grained reporting from the index vacuuming phase can provide better insight. Currently we report the number of blocks processed only once, at the end of vacuuming each index.

IIUC, what was suggested in [1] was to instrument lazy_tid_reaped with a counter of the index tuples processed so far, since lazy_tid_reaped is called for every index tuple to check whether it matches any of the dead tuple TIDs.

So the additional parameters for each index could be:
scanned_index_tuples
total_index_tuples (from the pg_class.reltuples entry)

Thank you,

Rahila Syed

[1]. http://www.postgresql.org/message-id/56500356.4070101@BlueTreble.com

Re: [PROPOSAL] VACUUM Progress Checker.

From

Robert Haas

Date:

29 January 2016, 15:01:40

On Fri, Jan 29, 2016 at 7:02 AM, Rahila Syed <rahilasyed90@gmail.com> wrote:
> Apart from these, as suggested in [1] , finer grained reporting from index
> vacuuming phase can provide better insight. Currently we report number of
> blocks processed once at the end of vacuuming of each index.
> IIUC, what was suggested in [1] was instrumenting lazy_tid_reaped with a
> counter to count number of index tuples processed so far as lazy_tid_reaped
> is called for every index tuple to see if it matches any of the dead tuple
> tids.
>
> So additional parameters for each index can be,
> scanned_index_tuples
> total_index_tuples (from pg_class.reltuples entry)

Let's report blocks, not tuples.  The reason is that
pg_class.reltuples is only an estimate and might be wildly wrong on
occasion, but the length of the relation in blocks can be known with
certainty.

But other than that I agree with this.  Fine-grained is key.  If it's
not fine grained, then people really won't be able to tell what's
going on when VACUUM doesn't finish in a timely fashion.  And the
whole point is we want to be able to know that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [PROPOSAL] VACUUM Progress Checker.

From

Amit Langote

Date:

01 February 2016, 01:28:59

On 2016/01/29 21:02, Rahila Syed wrote:
>> Okay, I agree that reporting just the current blkno is simple and good
>> enough. How about numbers of what we're going to report as the "Vacuuming
>> Index and Heap" phase? I guess we can still keep the scanned_index_pages
>> and index_scan_count So we have:
>> +CREATE VIEW pg_stat_vacuum_progress AS
>> +       SELECT
>> +              S.pid,
>> +              S.relid,
>> +              S.phase,
>> +              S.total_heap_blks,
>> +              S.current_heap_blkno,
>> +              S.total_index_pages,
>> +              S.scanned_index_pages,
>> +              S.index_scan_count,
>> +              S.percent_complete
>> +       FROM pg_stat_get_vacuum_progress() AS S;
>> I guess it won't remain pg_stat_get_"vacuum"_progress(), though.
> 
> Apart from these, as suggested in [1] , finer grained reporting from index
> vacuuming phase can provide better insight. Currently we report number of
> blocks processed once at the end of vacuuming of each index.
> IIUC, what was suggested in [1] was instrumenting lazy_tid_reaped with a
> counter to count number of index tuples processed so far as lazy_tid_reaped
> is called for every index tuple to see if it matches any of the dead tuple
> tids.

Agreed. Although, as Robert already suggested, I too would vote for
counting pages, not tuples. I think there was an independent patch doing
something of that sort somewhere upthread.

Thanks,
Amit

Re: [PROPOSAL] VACUUM Progress Checker.

From

Date:

05 February 2016, 08:17:36

Hello,

Please find attached updated patch.
>The point of having pgstat_report_progress_update_counter() is so that 
>you can efficiently update a single counter without having to update 
>everything, when only one counter has changed.  But here you are 
>calling this function a whole bunch of times in a row, which 
>completely misses the point - if you are updating all the counters, 
>it's more efficient to use an interface that does them all at once 
>instead of one at a time.

The pgstat_report_progress_update_counter() is called at appropriate places in the attached patch.

>So I've spent a fair amount of time debugging really-long-running 
>VACUUM processes with customers, and generally what I really want to 
>know is:
>>>> What block number are we at? <<<

Agreed. The attached patch reports the current block number.

Regards,
Vinayak

Attachment

Vacuum_progress_checker_v11.patch

Re: [PROPOSAL] VACUUM Progress Checker.

From

Amit Langote

Date:

08 February 2016, 02:37:43

Hi Vinayak,

Thanks for updating the patch, a couple of comments:

On 2016/02/05 17:15, pokurev@pm.nttdata.co.jp wrote:
> Hello,
> 
> Please find attached updated patch.
>> The point of having pgstat_report_progress_update_counter() is so that 
>> you can efficiently update a single counter without having to update 
>> everything, when only one counter has changed.  But here you are 
>> calling this function a whole bunch of times in a row, which 
>> completely misses the point - if you are updating all the counters, 
>> it's more efficient to use an interface that does them all at once 
>> instead of one at a time.
> 
> The pgstat_report_progress_update_counter() is called at appropriate places in the attached patch.

+    char    progress_message[N_PROGRESS_PARAM][PROGRESS_MESSAGE_LENGTH];

[ ... ]

+    snprintf(progress_message[0], PROGRESS_MESSAGE_LENGTH, "%s", phase1);
+    pgstat_report_progress_update_message(0, progress_message);

[ ... ]

+            snprintf(progress_message[0], PROGRESS_MESSAGE_LENGTH, "%s", phase2);
+            pgstat_report_progress_update_message(0, progress_message);

Instead of passing the array of char *'s, why not just pass a single char
*, because that's what it's doing - updating a single message. So,
something like:

+ char progress_message[PROGRESS_MESSAGE_LENGTH];

[ ... ]

+ snprintf(progress_message, PROGRESS_MESSAGE_LENGTH, "%s", phase1);
+ pgstat_report_progress_update_message(0, progress_message);

[ ... ]

+ snprintf(progress_message, PROGRESS_MESSAGE_LENGTH, "%s", phase2);
+ pgstat_report_progress_update_message(0, progress_message);

And also:

+/*-----------
+ * pgstat_report_progress_update_message()-
+ *
+ *Called to update phase of VACUUM progress
+ *-----------
+ */
+void
+pgstat_report_progress_update_message(int index, char *msg)
+{

[ ... ]

+    pgstat_increment_changecount_before(beentry);
+    strncpy((char *)beentry->st_progress_message[index], msg,
PROGRESS_MESSAGE_LENGTH);
+    pgstat_increment_changecount_after(beentry);
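
A side note on the strncpy() call above: strncpy() does not NUL-terminate the destination when the source is at least as long as the limit. Below is a minimal sketch of a single-message update that always terminates the slot. The array progress_slots is only a stand-in for beentry->st_progress_message, the two sizes are assumed values, and the changecount protocol is left out.

```c
#include <string.h>

#define PROGRESS_MESSAGE_LENGTH 32   /* assumed value, for illustration only */
#define N_PROGRESS_PARAM        10   /* assumed value, for illustration only */

/* Stand-in for beentry->st_progress_message in the patch. */
static char progress_slots[N_PROGRESS_PARAM][PROGRESS_MESSAGE_LENGTH];

/* Copy one message into one slot; the slot is always NUL-terminated. */
static void
update_progress_message(int index, const char *msg)
{
    if (index < 0 || index >= N_PROGRESS_PARAM)
        return;                     /* ignore out-of-range slots */

    strncpy(progress_slots[index], msg, PROGRESS_MESSAGE_LENGTH - 1);
    progress_slots[index][PROGRESS_MESSAGE_LENGTH - 1] = '\0';
}
```

A message longer than the slot is silently truncated here; whether that is acceptable depends on how the messages end up being represented.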


One more comment:

@@ -1120,14 +1157,23 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
         /* Log cleanup info before we touch indexes */
         vacuum_log_cleanup_info(onerel, vacrelstats);

+        snprintf(progress_message[0], PROGRESS_MESSAGE_LENGTH, "%s", phase2);
+        pgstat_report_progress_update_message(0, progress_message);
         /* Remove index entries */
         for (i = 0; i < nindexes; i++)
+        {
             lazy_vacuum_index(Irel[i],
                               &indstats[i],
                               vacrelstats);
+            scanned_index_pages += RelationGetNumberOfBlocks(Irel[i]);
+            /* Update the scanned index pages and number of index scan */
+            pgstat_report_progress_update_counter(3, scanned_index_pages);
+            pgstat_report_progress_update_counter(4, vacrelstats->num_index_scans + 1);
+        }
         /* Remove tuples from heap */
         lazy_vacuum_heap(onerel, vacrelstats);
         vacrelstats->num_index_scans++;
+        scanned_index_pages = 0;

I guess num_index_scans could better be reported after all the indexes are
done, that is, after the for loop ends.
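
To illustrate the ordering I have in mind, here is a rough standalone sketch. report_counter() and progress_counter[] are stand-ins for pgstat_report_progress_update_counter() and the shared counters, and the slot numbers 3 and 4 simply follow the hunk above. Where exactly the pass count is bumped relative to lazy_vacuum_heap() is left open.

```c
#include <stdint.h>

#define N_COUNTERS 6                            /* assumed slot count */
static uint32_t progress_counter[N_COUNTERS];   /* stand-in for shared counters */

/* Stand-in for pgstat_report_progress_update_counter() in the patch. */
static void
report_counter(int index, uint32_t value)
{
    progress_counter[index] = value;
}

/*
 * One index-vacuuming pass: slot 3 is updated after every index,
 * slot 4 (number of completed passes) only once, after the loop.
 */
static void
vacuum_all_indexes(const uint32_t *index_nblocks, int nindexes,
                   uint32_t *num_index_scans)
{
    uint32_t    scanned_index_pages = 0;

    for (int i = 0; i < nindexes; i++)
    {
        /* lazy_vacuum_index() would run here */
        scanned_index_pages += index_nblocks[i];
        report_counter(3, scanned_index_pages);
    }

    (*num_index_scans)++;
    report_counter(4, *num_index_scans);
}
```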

Thanks,
Amit

Re: [PROPOSAL] VACUUM Progress Checker.

From

Alvaro Herrera

Date:

08 February 2016, 19:13:16

Since things are clearly still moving here, I closed it as
returned-with-feedback.  Please submit to the next CF so that we don't
lose it.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [PROPOSAL] VACUUM Progress Checker.

From

Kyotaro HORIGUCHI

Date:

15 February 2016, 11:21:49

Hello,

At Mon, 8 Feb 2016 11:37:17 +0900, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote in
<56B7FF5D.7030108@lab.ntt.co.jp>
> 
> Hi Vinayak,
> 
> Thanks for updating the patch, a couple of comments:
> 
> On 2016/02/05 17:15, pokurev@pm.nttdata.co.jp wrote:
> > Hello,
> > 
> > Please find attached updated patch.
> >> The point of having pgstat_report_progress_update_counter() is so that 
> >> you can efficiently update a single counter without having to update 
> >> everything, when only one counter has changed.  But here you are 
> >> calling this function a whole bunch of times in a row, which 
> >> completely misses the point - if you are updating all the counters, 
> >> it's more efficient to use an interface that does them all at once 
> >> instead of one at a time.
> > 
> > The pgstat_report_progress_update_counter() is called at appropriate places in the attached patch.
> 
> +     char    progress_message[N_PROGRESS_PARAM][PROGRESS_MESSAGE_LENGTH];
> 
> [ ... ]
> 
> +     snprintf(progress_message[0], PROGRESS_MESSAGE_LENGTH, "%s", phase1);
> +     pgstat_report_progress_update_message(0, progress_message);
> 
> [ ... ]
> 
> +                     snprintf(progress_message[0], PROGRESS_MESSAGE_LENGTH, "%s", phase2);
> +                     pgstat_report_progress_update_message(0, progress_message);
> 
> Instead of passing the array of char *'s, why not just pass a single char
> *, because that's what it's doing - updating a single message. So,
> something like:
> 
> + char progress_message[PROGRESS_MESSAGE_LENGTH];
> 
> [ ... ]
> 
> + snprintf(progress_message, PROGRESS_MESSAGE_LENGTH, "%s", phase1);
> + pgstat_report_progress_update_message(0, progress_message);
> 
> [ ... ]
> 
> + snprintf(progress_message, PROGRESS_MESSAGE_LENGTH, "%s", phase2);
> + pgstat_report_progress_update_message(0, progress_message);
> 
> And also:
> 
> +/*-----------
> + * pgstat_report_progress_update_message()-
> + *
> + *Called to update phase of VACUUM progress
> + *-----------
> + */
> +void
> +pgstat_report_progress_update_message(int index, char *msg)
> +{
> 
> [ ... ]
> 
> +     pgstat_increment_changecount_before(beentry);
> +     strncpy((char *)beentry->st_progress_message[index], msg,
> PROGRESS_MESSAGE_LENGTH);
> +     pgstat_increment_changecount_after(beentry);


As I might have written upthread, transferring the whole string
as a progress message is useless at least in this scenario. Since
they are a set of fixed messages, each of them can be represented
by an identifier, an integer number. I don't see a reason for
sending the whole of a string beyond a backend.

Next, the function pg_stat_get_command_progress() has a somewhat
generic name, but it seems to return the data only for the
backends with beentry->st_command = COMMAND_LAZY_VACUUM and has
column names specific to VACUUM-like processes. If the
function is intended to be generic, it might be better to return
a set of integer[] for a given type. Otherwise it should have a
name that represents its objective.

CREATE FUNCTION
 pg_stat_get_command_progress(IN cmdtype integer)
 RETURNS SETOF integer[] as $$....

SELECT * from pg_stat_get_command_progress(PROGRESS_COMMAND_VACUUM) as x
         x
----------------------
{1233, 16233, 1, ....}
{3244, 16236, 2, ....}
....

CREATE VIEW pg_stat_vacuum_progress AS
  SELECT S.s[1] as pid,
         S.s[2] as relid,
         CASE S.s[3]
           WHEN 1 THEN 'Scanning Heap'
           WHEN 2 THEN 'Vacuuming Index and Heap'
           ELSE 'Unknown phase'
         END,
   ....
  FROM pg_stat_get_command_progress(PROGRESS_COMMAND_VACUUM) as S;
 

# The name of the function could be other than *_command_progress.
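
On the C side, the generic function would roughly amount to a filter over the backend entries. The following is only a sketch under assumptions: BackendEntry is a cut-down stand-in for the real backend status entry, the enum values and N_PROGRESS_PARAM are made up, and locking and permission checks are ignored.

```c
#include <stdint.h>

#define N_PROGRESS_PARAM 6              /* assumed number of counters */

typedef enum
{
    COMMAND_INVALID = 0,
    COMMAND_LAZY_VACUUM = 1             /* assumed numeric value */
} CommandType;

/* Cut-down stand-in for a backend status entry. */
typedef struct BackendEntry
{
    int         pid;
    CommandType st_command;
    uint32_t    st_progress_param[N_PROGRESS_PARAM];
} BackendEntry;

/*
 * Collect pointers to the entries running the given command type.
 * Returns the number of matches written to out[] (at most max_out).
 */
static int
collect_progress(const BackendEntry *entries, int nentries,
                 CommandType cmdtype,
                 const BackendEntry **out, int max_out)
{
    int         n = 0;

    for (int i = 0; i < nentries && n < max_out; i++)
    {
        if (entries[i].st_command == cmdtype)
            out[n++] = &entries[i];
    }
    return n;
}
```

Each matching entry would then become one integer[] row, and the per-command view gives the array elements their names.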

Any thoughts or opinions?


> One more comment:
> 
> @@ -1120,14 +1157,23 @@ lazy_scan_heap(Relation onerel, LVRelStats
> *vacrelstats,
>               /* Log cleanup info before we touch indexes */
>               vacuum_log_cleanup_info(onerel, vacrelstats);
> 
> +             snprintf(progress_message[0], PROGRESS_MESSAGE_LENGTH, "%s", phase2);
> +             pgstat_report_progress_update_message(0, progress_message);
>               /* Remove index entries */
>               for (i = 0; i < nindexes; i++)
> +             {
>                       lazy_vacuum_index(Irel[i],
>                                                         &indstats[i],
>                                                         vacrelstats);
> +                     scanned_index_pages += RelationGetNumberOfBlocks(Irel[i]);
> +                     /* Update the scanned index pages and number of index scan */
> +                     pgstat_report_progress_update_counter(3, scanned_index_pages);
> +                     pgstat_report_progress_update_counter(4, vacrelstats->num_index_scans
> + 1);
> +             }
>               /* Remove tuples from heap */
>               lazy_vacuum_heap(onerel, vacrelstats);
>               vacrelstats->num_index_scans++;
> +             scanned_index_pages = 0;
> 
> I guess num_index_scans could better be reported after all the indexes are
> done, that is, after the for loop ends.

Precise reporting would be valuable if vacuuming indexes takes a
long time. It seems to me to be fine as it is since updating of
stat counters wouldn't add any significant overhead.


regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: [PROPOSAL] VACUUM Progress Checker.

From

Amit Langote

Date:

16 February 2016, 01:39:51

Hello,

On 2016/02/15 20:21, Kyotaro HORIGUCHI wrote:
> At Mon, 8 Feb 2016 11:37:17 +0900, Amit Langote wrote:
>> On 2016/02/05 17:15, pokurev@pm.nttdata.co.jp wrote:
>>> Please find attached updated patch.

[ ... ]

>>
>> Instead of passing the array of char *'s, why not just pass a single char
>> *, because that's what it's doing - updating a single message. So,
>> something like:
> 
> As I might have written upthread, transferring the whole string
> as a progress message is useless at least in this scenario. Since
> they are a set of fixed messages, each of them can be represented
> by an identifier, an integer number. I don't see a reason for
> sending the whole of a string beyond a backend.

This tends to make sense. Perhaps, they could be macros:

#define VACUUM_PHASE_SCAN_HEAP        1
#define VACUUM_PHASE_VACUUM_INDEX_HEAP    2

> Next, the function pg_stat_get_command_progress() has a somewhat
> generic name, but it seems to return the data only for the
> backends with beentry->st_command = COMMAND_LAZY_VACUUM and has
> column names specific to VACUUM-like processes. If the
> function is intended to be generic, it might be better to return
> a set of integer[] for a given type. Otherwise it should have a
> name that represents its objective.

Agreed.

> 
> CREATE FUNCTION
>  pg_stat_get_command_progress(IN cmdtype integer)
>  RETURNS SETOF integer[] as $$....
> 
> SELECT * from pg_stat_get_command_progress(PROGRESS_COMMAND_VACUUM) as x
>          x
> ---------------------_
> {1233, 16233, 1, ....}
> {3244, 16236, 2, ....}
> ....

I am not sure what we would pass as argument to the (SQL) function
pg_stat_get_command_progress() in the system view definition for
individual commands - what is PROGRESS_COMMAND_VACUUM exactly? Would
string literals like "vacuum", "cluster", etc. to represent command names
work?

> 
> CREATE VIEW pg_stat_vacuum_progress AS
>   SELECT S.s[1] as pid,
>          S.s[2] as relid,
>          CASE S.s[3] 
>            WHEN 1 THEN 'Scanning Heap'
>            WHEN 2 THEN 'Vacuuming Index and Heap'
>            ELSE 'Unknown phase'
>          END,
>    ....
>   FROM pg_stat_get_command_progress(PROGRESS_COMMAND_VACUUM) as S;
> 
> # The name of the function could be other than *_command_progress.
> 
> Any thoughts or opinions?

How about pg_stat_get_progress_info()?

>> +             snprintf(progress_message[0], PROGRESS_MESSAGE_LENGTH, "%s", phase2);
>> +             pgstat_report_progress_update_message(0, progress_message);
>>               /* Remove index entries */
>>               for (i = 0; i < nindexes; i++)
>> +             {
>>                       lazy_vacuum_index(Irel[i],
>>                                                         &indstats[i],
>>                                                         vacrelstats);
>> +                     scanned_index_pages += RelationGetNumberOfBlocks(Irel[i]);
>> +                     /* Update the scanned index pages and number of index scan */
>> +                     pgstat_report_progress_update_counter(3, scanned_index_pages);
>> +                     pgstat_report_progress_update_counter(4, vacrelstats->num_index_scans
>> + 1);
>> +             }
>>               /* Remove tuples from heap */
>>               lazy_vacuum_heap(onerel, vacrelstats);
>>               vacrelstats->num_index_scans++;
>> +             scanned_index_pages = 0;
>>
>> I guess num_index_scans could better be reported after all the indexes are
>> done, that is, after the for loop ends.
> 
> Precise reporting would be valuable if vacuuming indexes takes a
> long time. It seems to me to be fine as it is since updating of
> stat counters wouldn't add any significant overhead.

Sorry, my comment may be a bit unclear. vacrelstats->num_index_scans
doesn't count individual indexes vacuumed but rather the number of times
"all" the indexes of a table are vacuumed, IOW, the number of times the
vacuum phase runs. Purpose of counter #4 there seems to be to report the
latter. OTOH, reporting scanned_index_pages per index as is done in the
patch is alright.

That said, there is discussion upthread about more precise reporting on
index vacuuming by utilizing the lazy_tid_reaped() (the index bulk delete
callback) as a place where we can report what index block number we are
at.  I think that would mean the current IndexBulkDeleteCallback signature
is insufficient, which is the following:

/* Typedef for callback function to determine if a tuple is bulk-deletable */
typedef bool (*IndexBulkDeleteCallback) (ItemPointer itemptr, void *state);

One more parameter would be necessary:

typedef bool (*IndexBulkDeleteCallback) (ItemPointer itemptr, BlockNumber
current_index_blkno, void *state);

That would also require changing all the am specific vacuumpage routines
(like btvacuumpage) to also pass the new argument. Needless to say, some
bookkeeping information would also need to be kept in LVRelStats (the
"state" in above signature).

Am I missing something?
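
To make the idea concrete, here is a minimal standalone sketch of such an extended callback. IndexProgressState is hypothetical bookkeeping (in the real code it would presumably live in LVRelStats), the types are simplified stand-ins, the dead-TID lookup is omitted, and it assumes that all calls for one index page arrive consecutively.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t BlockNumber;
#define InvalidBlockNumber ((BlockNumber) 0xFFFFFFFF)

typedef struct ItemPointerData *ItemPointer;    /* opaque in this sketch */

/* Hypothetical bookkeeping kept in the callback state. */
typedef struct IndexProgressState
{
    BlockNumber last_index_blkno;       /* page seen by the previous call */
    uint32_t    scanned_index_pages;    /* distinct pages seen so far */
} IndexProgressState;

/* Extended callback: counts a page the first time a tuple from it is seen. */
static bool
tid_reaped_with_progress(ItemPointer itemptr,
                         BlockNumber current_index_blkno,
                         void *state)
{
    IndexProgressState *st = (IndexProgressState *) state;

    (void) itemptr;                     /* lookup of the TID is omitted */

    if (current_index_blkno != st->last_index_blkno)
    {
        st->last_index_blkno = current_index_blkno;
        st->scanned_index_pages++;
    }
    return false;
}
```

Pages for which the callback is never invoked are not counted by this scheme.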

Thanks,
Amit

Re: [PROPOSAL] VACUUM Progress Checker.

From

Kyotaro HORIGUCHI

Date:

16 February 2016, 09:25:45

Hello,

At Tue, 16 Feb 2016 10:39:27 +0900, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote in
<56C27DCF.7020705@lab.ntt.co.jp>
> 
> Hello,
> 
> On 2016/02/15 20:21, Kyotaro HORIGUCHI wrote:
> > At Mon, 8 Feb 2016 11:37:17 +0900, Amit Langote wrote:
> >> On 2016/02/05 17:15, pokurev@pm.nttdata.co.jp wrote:
> >>> Please find attached updated patch.
> 
> [ ... ]
> 
> >>
> >> Instead of passing the array of char *'s, why not just pass a single char
> >> *, because that's what it's doing - updating a single message. So,
> >> something like:
> > 
> > As I might have written upthread, transferring the whole string
> > as a progress message is useless at least in this scenario. Since
> > they are a set of fixed messages, each of them can be represented
> > by an identifier, an integer number. I don't see a reason for
> > sending the whole of a string beyond a backend.
> 
> This tends to make sense. Perhaps, they could be macros:
> 
> #define VACUUM_PHASE_SCAN_HEAP        1
> #define VACUUM_PHASE_VACUUM_INDEX_HEAP    2

Exactly. Or an enum.
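
For example, something like the following sketch. The enum name and the lookup function are hypothetical; the values mirror the macros quoted above and the strings are the ones used in the view.

```c
/* Hypothetical enum; values mirror the macros quoted above. */
typedef enum VacuumPhase
{
    VACUUM_PHASE_SCAN_HEAP = 1,
    VACUUM_PHASE_VACUUM_INDEX_HEAP = 2
} VacuumPhase;

/* Map a phase identifier to the text shown to the user. */
static const char *
vacuum_phase_name(int phase)
{
    switch (phase)
    {
        case VACUUM_PHASE_SCAN_HEAP:
            return "Scanning Heap";
        case VACUUM_PHASE_VACUUM_INDEX_HEAP:
            return "Vacuuming Index and Heap";
        default:
            return "Unknown phase";
    }
}
```

Only the integer crosses the backend boundary; the text is produced where it is displayed, be it in C as here or in the CASE expression of the view.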

> > Next, the function pg_stat_get_command_progress() has a somewhat
> > generic name, but it seems to return the data only for the
> > backends with beentry->st_command = COMMAND_LAZY_VACUUM and has
> > column names specific to VACUUM-like processes. If the
> > function is intended to be generic, it might be better to return
> > a set of integer[] for a given type. Otherwise it should have a
> > name that represents its objective.
> 
> Agreed.
> 
> > 
> > CREATE FUNCTION
> >  pg_stat_get_command_progress(IN cmdtype integer)
> >  RETURNS SETOF integer[] as $$....
> > 
> > SELECT * from pg_stat_get_command_progress(PROGRESS_COMMAND_VACUUM) as x
> >          x
> > ---------------------_
> > {1233, 16233, 1, ....}
> > {3244, 16236, 2, ....}
> > ....
> 
> I am not sure what we would pass as argument to the (SQL) function
> pg_stat_get_command_progress() in the system view definition for
> individual commands - what is PROGRESS_COMMAND_VACUUM exactly? Would
> string literals like "vacuum", "cluster", etc. to represent command names
> work?

Sorry, it is a symbol to tell pg_stat_get_command_progress() to
return stats numbers of backends running VACUUM. It should have
been COMMAND_LAZY_VACUUM for this patch. If we want progress of
CREATE INDEX, it would be COMMAND_CREATE_INDEX.

> > 
> > CREATE VIEW pg_stat_vacuum_progress AS
> >   SELECT S.s[1] as pid,
> >          S.s[2] as relid,
> >          CASE S.s[3] 
> >            WHEN 1 THEN 'Scanning Heap'
> >            WHEN 2 THEN 'Vacuuming Index and Heap'
> >            ELSE 'Unknown phase'
> >          END,
> >    ....
> >   FROM pg_stat_get_command_progress(PROGRESS_COMMAND_VACUUM) as S;
> > 
> > # The name of the function could be other than *_command_progress.
> > 
> > Any thoughts or opinions?
> 
> How about pg_stat_get_progress_info()?

I think it's good.

> >> +             snprintf(progress_message[0], PROGRESS_MESSAGE_LENGTH, "%s", phase2);
> >> +             pgstat_report_progress_update_message(0, progress_message);
> >>               /* Remove index entries */
> >>               for (i = 0; i < nindexes; i++)
> >> +             {
> >>                       lazy_vacuum_index(Irel[i],
> >>                                                         &indstats[i],
> >>                                                         vacrelstats);
> >> +                     scanned_index_pages += RelationGetNumberOfBlocks(Irel[i]);
> >> +                     /* Update the scanned index pages and number of index scan */
> >> +                     pgstat_report_progress_update_counter(3, scanned_index_pages);
> >> +                     pgstat_report_progress_update_counter(4, vacrelstats->num_index_scans
> >> + 1);
> >> +             }
> >>               /* Remove tuples from heap */
> >>               lazy_vacuum_heap(onerel, vacrelstats);
> >>               vacrelstats->num_index_scans++;
> >> +             scanned_index_pages = 0;
> >>
> >> I guess num_index_scans could better be reported after all the indexes are
> >> done, that is, after the for loop ends.
> > 
> > Precise reporting would be valuable if vacuuming indexes takes a
> > long time. It seems to me to be fine as it is since updating of
> > stat counters wouldn't add any significant overhead.
> 
> Sorry, my comment may be a bit unclear. vacrelstats->num_index_scans
> doesn't count individual indexes vacuumed but rather the number of times
> "all" the indexes of a table are vacuumed, IOW, the number of times the
> vacuum phase runs. Purpose of counter #4 there seems to be to report the
> latter. OTOH, reporting scanned_index_pages per index as is done in the
> patch is alright.

I got it. Sorry for my misreading. Yes, you're
right. index_scan_count can be at most 1 with the code as it is.
That's odd.

> That said, there is discussion upthread about more precise reporting on
> index vacuuming by utilizing the lazy_tid_reaped() (the index bulk delete
> callback) as a place where we can report what index block number we are
> at.  I think that would mean the current IndexBulkDeleteCallback signature
> is insufficient, which is the following:
> 
> /* Typedef for callback function to determine if a tuple is bulk-deletable */
> typedef bool (*IndexBulkDeleteCallback) (ItemPointer itemptr, void *state);
> 
> One more parameter would be necessary:
> 
> typedef bool (*IndexBulkDeleteCallback) (ItemPointer itemptr, BlockNumber
> current_index_blkno, void *state);

It could work for btree but doesn't for, for example,
gin. ginbulkdelete finds the next page in the following way.

>  blkno = GinPageGetOpaque(page)->rightlink;

We should use another value to figure out the progress. If the
callback is certainly called the same, or about the same, number
of times as the total number of pages, the callback should just
increment a static counter for processed pages.
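
As a toy illustration of the point (this is not GIN's actual page layout; ToyPage and the walk function are made up), pages reached through right links are not visited in block-number order, so the current block number says little about progress, while a plain counter of processed pages still works:

```c
#include <stdint.h>

typedef uint32_t BlockNumber;
#define InvalidBlockNumber ((BlockNumber) 0xFFFFFFFF)

/* Toy page: only the link to the right sibling matters here. */
typedef struct ToyPage
{
    BlockNumber rightlink;
} ToyPage;

/* Follow right links from 'start' and count the pages processed. */
static uint32_t
walk_chain_counting(const ToyPage *pages, BlockNumber start)
{
    uint32_t    processed = 0;

    for (BlockNumber blkno = start;
         blkno != InvalidBlockNumber;
         blkno = pages[blkno].rightlink)
    {
        processed++;            /* page-level work would happen here */
    }
    return processed;
}
```

With the chain 0 -> 2 -> 1, the last block visited is 1, although all three pages have been processed.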

> That would also require changing all the am specific vacuumpage routines
> (like btvacuumpage) to also pass the new argument. Needless to say, some
> bookkeeping information would also need to be kept in LVRelStats (the
> "state" in above signature).
> 
> Am I missing something?

So, what may be missing is the case of access methods other than btree.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: [PROPOSAL] VACUUM Progress Checker.

From

Amit Langote

Date:

17 February 2016, 07:46:52

Hi,

On 2016/02/16 18:25, Kyotaro HORIGUCHI wrote:
> At Tue, 16 Feb 2016 10:39:27 +0900, Amit Langote wrote:
>> On 2016/02/15 20:21, Kyotaro HORIGUCHI wrote:
>>> CREATE FUNCTION
>>>  pg_stat_get_command_progress(IN cmdtype integer)
>>>  RETURNS SETOF integer[] as $$....
>>>
>>> SELECT * from pg_stat_get_command_progress(PROGRESS_COMMAND_VACUUM) as x
>>>          x
>>> ---------------------_
>>> {1233, 16233, 1, ....}
>>> {3244, 16236, 2, ....}
>>> ....
>>
>> I am not sure what we would pass as argument to the (SQL) function
>> pg_stat_get_command_progress() in the system view definition for
>> individual commands - what is PROGRESS_COMMAND_VACUUM exactly? Would
>> string literals like "vacuum", "cluster", etc. to represent command names
>> work?
> 
> Sorry, it is a symbol to tell pg_stat_get_command_progress() to
> return stats numbers of backends running VACUUM. It should have
> been COMMAND_LAZY_VACUUM for this patch. If we want progress of
> CREATE INDEX, it would be COMMAND_CREATE_INDEX.

Oh I see:

CREATE VIEW pg_stat_vacuum_progress AS SELECT * from pg_stat_get_command_progress(PROGRESS_COMMAND_VACUUM) as x

is actually:

CREATE VIEW pg_stat_vacuum_progress AS SELECT * from pg_stat_get_command_progress(1) as x

where PROGRESS_COMMAND_VACUUM is 1 in backend code (macro, enum,
whatever).  I was confused because we never say relkind = RELKIND_INDEX in
SQL queries, :)

>> That said, there is discussion upthread about more precise reporting on
>> index vacuuming by utilizing the lazy_tid_reaped() (the index bulk delete
>> callback) as a place where we can report what index block number we are
>> at.  I think that would mean the current IndexBulkDeleteCallback signature
>> is insufficient, which is the following:
>>
>> /* Typedef for callback function to determine if a tuple is bulk-deletable */
>> typedef bool (*IndexBulkDeleteCallback) (ItemPointer itemptr, void *state);
>>
>> One more parameter would be necessary:
>>
>> typedef bool (*IndexBulkDeleteCallback) (ItemPointer itemptr, BlockNumber
>> current_index_blkno, void *state);
> 
> It could work for btree but doesn't for, for example,
> gin. ginbulkdelete finds the next page in the following way.
> 
>>  blkno = GinPageGetOpaque(page)->rightlink;
> 
> We should use another value to figure out the progress. If the
> callback is certainly called the same, or about the same, number
> of times as the total number of pages, the callback should just
> increment a static counter for processed pages.
> 
>> That would also require changing all the am specific vacuumpage routines
>> (like btvacuumpage) to also pass the new argument. Needless to say, some
>> bookkeeping information would also need to be kept in LVRelStats (the
>> "state" in above signature).
>>
>> Am I missing something?
> 
> So, maybe missing the case of other than btree..

More or less, the callback is called maxoffset number of times for all
index pages containing pointers to heap tuples. Robert said upthread that
counting in granularity lower than pages may not be useful:

"Let's report blocks, not tuples. The reason is that
pg_class.reltuples is only an estimate and might be wildly wrong on
occasion, but the length of the relation in blocks can be known with
certainty."

With the existing interface of the callback, it's difficult to keep the
count of pages, hence a proposal to enhance the interface. Also, now I
wonder whether scanned_index_pages will always converge to whatever
total_index_pages we get from RelationGetNumberOfBlocks(index), because
the callback is not called for *every* index page and its behavior tends
to differ per index access method (am). Thanks for prompting me to check this.

Thanks,
Amit

Re: [PROPOSAL] VACUUM Progress Checker.

From

Date:

26 February 2016, 08:29:38

Hello,

Thank you for your comments.
Please find attached a patch addressing the following comments.

>As I might have written upthread, transferring the whole string 
>as a progress message is useless at least in this scenario. Since 
>they are a set of fixed messages, each of them can be represented 
>by an identifier, an integer number. I don't see a reason for 
>sending the whole of a string beyond a backend. 
Agreed. I used the following macros.
#define VACUUM_PHASE_SCAN_HEAP    1 
#define VACUUM_PHASE_VACUUM_INDEX_HEAP    2

>I guess num_index_scans could better be reported after all the indexes are 
>done, that is, after the for loop ends.
Agreed.  I have corrected it.

> CREATE VIEW pg_stat_vacuum_progress AS 
>   SELECT S.s[1] as pid, 
>          S.s[2] as relid, 
>          CASE S.s[3] 
>            WHEN 1 THEN 'Scanning Heap' 
>            WHEN 2 THEN 'Vacuuming Index and Heap' 
>            ELSE 'Unknown phase' 
>          END, 
>    .... 
>   FROM pg_stat_get_command_progress(PROGRESS_COMMAND_VACUUM) as S; 
> 
> # The name of the function could be other than *_command_progress.
The function has been renamed to pg_stat_get_progress_info(), and its implementation has been updated as well.
The pg_stat_vacuum_progress view has been updated as suggested.

Regards,
Vinayak

Attachment

Vacuum_progress_checker_v12.patch

Re: [PROPOSAL] VACUUM Progress Checker.

From

Amit Langote

Date:

26 February 2016, 09:19:30

Hi Vinayak,

Thanks for updating the patch! A quick comment:

On 2016/02/26 17:28, pokurev@pm.nttdata.co.jp wrote:
>> CREATE VIEW pg_stat_vacuum_progress AS 
>>   SELECT S.s[1] as pid, 
>>          S.s[2] as relid, 
>>          CASE S.s[3] 
>>            WHEN 1 THEN 'Scanning Heap' 
>>            WHEN 2 THEN 'Vacuuming Index and Heap' 
>>            ELSE 'Unknown phase' 
>>          END, 
>>    .... 
>>   FROM pg_stat_get_command_progress(PROGRESS_COMMAND_VACUUM) as S; 
>>
>> # The name of the function could be other than *_command_progress.
> The name of function is updated as pg_stat_get_progress_info() and also updated the function.
> Updated the pg_stat_vacuum_progress view as suggested.

So, pg_stat_get_progress_info() now accepts a parameter to distinguish
different commands.  I see the following in its definition:

+        /*  Report values for only those backends which are running VACUUM
command */
+        if (cmdtype == COMMAND_LAZY_VACUUM)
+        {
+            /*Progress can only be viewed by role member.*/
+            if (has_privs_of_role(GetUserId(), beentry->st_userid))
+            {
+                values[2] = UInt32GetDatum(beentry->st_progress_param[0]);
+                values[3] = UInt32GetDatum(beentry->st_progress_param[1]);
+                values[4] = UInt32GetDatum(beentry->st_progress_param[2]);
+                values[5] = UInt32GetDatum(beentry->st_progress_param[3]);
+                values[6] = UInt32GetDatum(beentry->st_progress_param[4]);
+                values[7] = UInt32GetDatum(beentry->st_progress_param[5]);
+                if (beentry->st_progress_param[1] != 0)
+                    values[8] = Float8GetDatum(beentry->st_progress_param[2] * 100 /
beentry->st_progress_param[1]);
+                else
+                    nulls[8] = true;
+            }
+            else
+            {
+                nulls[2] = true;
+                nulls[3] = true;
+                nulls[4] = true;
+                nulls[5] = true;
+                nulls[6] = true;
+                nulls[7] = true;
+                nulls[8] = true;
+            }
+        }

How about doing this in a separate function which takes the command id as
parameter and returns an array of values and the number of values (per
command id). pg_stat_get_progress_info() then creates values[] and nulls[]
arrays from that and returns that as result set.  It will be a cleaner
separation of activities, perhaps.
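
Roughly what I mean, as a standalone sketch. The helper names are hypothetical, the numeric value of COMMAND_LAZY_VACUUM and N_PROGRESS_PARAM are assumed, and the Datum handling is left to the caller. Incidentally, if st_progress_param holds 32-bit unsigned integers, as the UInt32GetDatum() calls suggest, then param[2] * 100 / param[1] in the hunk above is evaluated in integer arithmetic before Float8GetDatum() sees it, so the result is truncated and the multiplication can wrap for very large tables. The sketch computes the percentage in floating point instead.

```c
#include <stdbool.h>
#include <stdint.h>

#define N_PROGRESS_PARAM    6   /* assumed number of counters */
#define COMMAND_LAZY_VACUUM 1   /* assumed numeric value */

/*
 * Hypothetical helper: copy the counters that are meaningful for
 * cmdtype into out[] and return how many were written (0 if the
 * command type is not known).
 */
static int
progress_values_for_command(int cmdtype, const uint32_t *params,
                            uint32_t *out)
{
    if (cmdtype != COMMAND_LAZY_VACUUM)
        return 0;

    for (int i = 0; i < N_PROGRESS_PARAM; i++)
        out[i] = params[i];
    return N_PROGRESS_PARAM;
}

/* Percentage in floating point; returns false when NULL should be shown. */
static bool
progress_percent(uint32_t current, uint32_t total, double *percent)
{
    if (total == 0)
        return false;

    *percent = 100.0 * (double) current / (double) total;
    return true;
}
```

pg_stat_get_progress_info() would then only turn the returned values into values[] and nulls[].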

Thanks,
Amit

Re: [PROPOSAL] VACUUM Progress Checker.

From

Vinayak Pokale

Date:

27 February 2016, 04:54:06

Hello,

On Fri, Feb 26, 2016 at 6:19 PM, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote:

> Hi Vinayak,
>
> Thanks for updating the patch! A quick comment:
>
> On 2016/02/26 17:28, pokurev@pm.nttdata.co.jp wrote:
> >> CREATE VIEW pg_stat_vacuum_progress AS
> >>   SELECT S.s[1] as pid,
> >>          S.s[2] as relid,
> >>          CASE S.s[3]
> >>            WHEN 1 THEN 'Scanning Heap'
> >>            WHEN 2 THEN 'Vacuuming Index and Heap'
> >>            ELSE 'Unknown phase'
> >>          END,
> >>    ....
> >>   FROM pg_stat_get_command_progress(PROGRESS_COMMAND_VACUUM) as S;
> >>
> >> # The name of the function could be other than *_command_progress.
> > The name of function is updated as pg_stat_get_progress_info() and also updated the function.
> > Updated the pg_stat_vacuum_progress view as suggested.
>
> So, pg_stat_get_progress_info() now accepts a parameter to distinguish
> different commands. I see the following in its definition:
>
> +        /* Report values for only those backends which are running VACUUM command */
> +        if (cmdtype == COMMAND_LAZY_VACUUM)
> +        {
> +            /*Progress can only be viewed by role member.*/
> +            if (has_privs_of_role(GetUserId(), beentry->st_userid))
> +            {
> +                values[2] = UInt32GetDatum(beentry->st_progress_param[0]);
> +                values[3] = UInt32GetDatum(beentry->st_progress_param[1]);
> +                values[4] = UInt32GetDatum(beentry->st_progress_param[2]);
> +                values[5] = UInt32GetDatum(beentry->st_progress_param[3]);
> +                values[6] = UInt32GetDatum(beentry->st_progress_param[4]);
> +                values[7] = UInt32GetDatum(beentry->st_progress_param[5]);
> +                if (beentry->st_progress_param[1] != 0)
> +                    values[8] = Float8GetDatum(beentry->st_progress_param[2] * 100 /
> beentry->st_progress_param[1]);
> +                else
> +                    nulls[8] = true;
> +            }
> +            else
> +            {
> +                nulls[2] = true;
> +                nulls[3] = true;
> +                nulls[4] = true;
> +                nulls[5] = true;
> +                nulls[6] = true;
> +                nulls[7] = true;
> +                nulls[8] = true;
> +            }
> +        }
>
> How about doing this in a separate function which takes the command id as
> parameter and returns an array of values and the number of values (per
> command id). pg_stat_get_progress_info() then creates values[] and nulls[]
> arrays from that and returns that as result set. It will be a cleaner
> separation of activities, perhaps.

+1


Re: [PROPOSAL] VACUUM Progress Checker.

From

大山真実

Date:

27 February 2016, 17:19:21

Hi!

I'm interested in this patch and tested it. I found two strange things.

* Incorrect counting

Steps to reproduce:

1. Client1 executes "VACUUM".
2. Client2 executes "VACUUM".
3. Client3 executes "SELECT * FROM pg_stat_vacuum_progress".

 pid  | relid |     phase     | total_heap_blks | current_heap_blkno | total_index_pages | scanned_index_pages | index_scan_count | percent_complete
------+-------+---------------+-----------------+--------------------+-------------------+---------------------+------------------+------------------
 9267 | 16551 | Scanning Heap |          164151 |                316 |             27422 |                   7 |                1 |                0
 9764 | 16554 | Scanning Heap |               2 |                  2 |                 2 |               27422 |                1 |              100
(2 rows)

Client2 is waiting for Client1's "VACUUM", but percent_complete of Client2's "VACUUM" is 100.

* VACUUM ANALYZE does not finish even though percent_complete = 100

Client_1 executes "VACUUM ANALYZE", then Client_2 executes "SELECT * FROM pg_stat_vacuum_progress".

 pid  | relid |     phase     | total_heap_blks | current_heap_blkno | total_index_pages | scanned_index_pages | index_scan_count | percent_complete
------+-------+---------------+-----------------+--------------------+-------------------+---------------------+------------------+------------------
 9277 | 16551 | Scanning Heap |          163935 |             163935 |             27422 |                   7 |                1 |              100
(1 row)

percent_complete is 100, but Client_1's "VACUUM ANALYZE" has not returned yet.

Of course, Client_1 is executing ANALYZE after VACUUM, but it seems to me that this would confuse users.

Once percent_complete reaches 100, that row should be removed promptly.

Regards,

Masanori Ohyama

NTT Open Source Software Center

Re: [PROPOSAL] VACUUM Progress Checker.

From

Kyotaro HORIGUCHI

Date:

01 March 2016, 09:02:26

Hello, thanks for testing this.

At Sat, 27 Feb 2016 17:19:05 +0000, 大山真実 <oyama.masanori.1987@gmail.com> wrote in
<CAJ_V8TmJG0Z8RPpN9DhTLEnfDxWEUoBQXQRQvQ7mbE47y3X9+Q@mail.gmail.com>
> Hi!
> 
> I'm interesting this patch and tested it. I found two strange thing.
> 
> * Incorrect counting
> 
> Reproduce:
>   1. Client1 execute "VACUUM"
>   2. Client2 execute "VACUUM"
>   3. Client3 execute "SELECT * FROM pg_stat_vacuum_progress".
>  pid  | relid |     phase     | total_heap_blks | current_heap_blkno | total_index_pages | scanned_index_pages | index_scan_count | percent_complete
> ------+-------+---------------+-----------------+--------------------+-------------------+---------------------+------------------+------------------
>  9267 | 16551 | Scanning Heap |          164151 |                316 |             27422 |                   7 |                1 |                0
>  9764 | 16554 | Scanning Heap |               2 |                  2 |                 2 |               27422 |                1 |              100
> (2 rows)
> 
>   Client2 is waiting for Clinet1 "VACUUM" but percent_complete of Client2
> "VACUUM" is 100.
> * Not end VACUUM ANALYZE in spite of "percent_complete=100"

Each record describes *one* relation currently being vacuumed
(or one that has just been processed), not all of the relations
to be vacuumed as a whole. That is how this patch is specified
for now. It therefore cannot tell the invoker how long to wait
for the whole vacuum to end, and calculating statistics over all
the relations to be processed seems quite difficult.

In any case, other status messages such as "Waiting for XXXX"
would be necessary.

>   Client_1 execute "VACUUM ANALYZE", then Client_2 execute "SELECT * FROM
> pg_stat_vacuum_progress".
> 
>  pid  | relid |     phase     | total_heap_blks | current_heap_blkno | total_index_pages | scanned_index_pages | index_scan_count | percent_complete
> ------+-------+---------------+-----------------+--------------------+-------------------+---------------------+------------------+------------------
>  9277 | 16551 | Scanning Heap |          163935 |             163935 |             27422 |                   7 |                1 |              100
> (1 row)
> 
>   percent_complete is 100 but Client_1 "VACUUM ANALYZE" do not response yet.
> 
>   Of course, Client_1 is executing analyze after vacuum. But it seem to me
> that this confuses users.
>   If percent_complete becomes 100 that row should be deleted quickly.

Maybe some work other than vacuuming pages is being performed, or
the backend is waiting to acquire a lock. If it is a matter of
progress, it should be counted in the progress, but something
like waiting for a lock should not be. That is a matter for
status messages.

> Regards,
> Masanori Ohyama
> NTT Open Source Software Center
> 
> 2016年2月27日(土) 13:54 Vinayak Pokale <vinpokale@gmail.com>:
> 
> > Hello,
> >
> > On Fri, Feb 26, 2016 at 6:19 PM, Amit Langote <
> > Langote_Amit_f8@lab.ntt.co.jp> wrote:
> >
> >>
> >> Hi Vinayak,
> >>
> >> Thanks for updating the patch! A quick comment:
> >>
> >> On 2016/02/26 17:28, pokurev@pm.nttdata.co.jp wrote:
> >> >> CREATE VIEW pg_stat_vacuum_progress AS
> >> >>   SELECT S.s[1] as pid,
> >> >>          S.s[2] as relid,
> >> >>          CASE S.s[3]
> >> >>            WHEN 1 THEN 'Scanning Heap'
> >> >>            WHEN 2 THEN 'Vacuuming Index and Heap'
> >> >>            ELSE 'Unknown phase'
> >> >>          END,
> >> >>    ....
> >> >>   FROM pg_stat_get_command_progress(PROGRESS_COMMAND_VACUUM) as S;
> >> >>
> >> >> # The name of the function could be other than *_command_progress.
> >> > The name of function is updated as pg_stat_get_progress_info() and also
> >> updated the function.
> >> > Updated the pg_stat_vacuum_progress view as suggested.
> >>
> >> So, pg_stat_get_progress_info() now accepts a parameter to distinguish
> >> different commands.  I see the following in its definition:
> >>
> >> +               /*  Report values for only those backends which are running VACUUM command */
> >> +               if (cmdtype == COMMAND_LAZY_VACUUM)
> >> +               {
> >> +                       /*Progress can only be viewed by role member.*/
> >> +                       if (has_privs_of_role(GetUserId(), beentry->st_userid))
> >> +                       {
> >> +                               values[2] = UInt32GetDatum(beentry->st_progress_param[0]);
> >> +                               values[3] = UInt32GetDatum(beentry->st_progress_param[1]);
> >> +                               values[4] = UInt32GetDatum(beentry->st_progress_param[2]);
> >> +                               values[5] = UInt32GetDatum(beentry->st_progress_param[3]);
> >> +                               values[6] = UInt32GetDatum(beentry->st_progress_param[4]);
> >> +                               values[7] = UInt32GetDatum(beentry->st_progress_param[5]);
> >> +                               if (beentry->st_progress_param[1] != 0)
> >> +                                       values[8] = Float8GetDatum(beentry->st_progress_param[2] * 100 / beentry->st_progress_param[1]);
> >> +                               else
> >> +                                       nulls[8] = true;
> >> +                       }
> >> +                       else
> >> +                       {
> >> +                               nulls[2] = true;
> >> +                               nulls[3] = true;
> >> +                               nulls[4] = true;
> >> +                               nulls[5] = true;
> >> +                               nulls[6] = true;
> >> +                               nulls[7] = true;
> >> +                               nulls[8] = true;
> >> +                       }
> >> +               }
> >>
> >> How about doing this in a separate function which takes the command id as
> >> parameter and returns an array of values and the number of values (per
> >> command id). pg_stat_get_progress_info() then creates values[] and nulls[]
> >> arrays from that and returns that as result set.  It will be a cleaner
> >> separation of activities, perhaps.
> >>
> >> +1

In SQL, accessing an element beyond the end of an array safely
yields NULL, and the caller should know how many elements it
needs, so I would prefer that a single integer (or bigint?) array
be returned. Or, since the internal array has a finite number of
elements anyway, the function could return an array that exactly
reflects the internal one.

Lastly, I found one small bug in the code quoted above.

+        if (beentry->st_progress_param[1] != 0)
+          values[8] = Float8GetDatum(beentry->st_progress_param[2] * 100 / beentry->st_progress_param[1]);

The argument is an integer division (int/int), so the value passed
to Float8GetDatum() can never have a fractional part.
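
To illustrate the division issue outside the backend: with two unsigned integer operands, C performs the division in integer arithmetic, and only the already-truncated result is converted to double. The following is a minimal standalone sketch, not code from the patch. The helper names percent_truncated() and percent_exact() are hypothetical, and plain uint32_t arguments stand in for the st_progress_param[] elements.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the expression in the patch: the division truncates. */
double
percent_truncated(uint32_t done, uint32_t total)
{
	assert(total != 0);			/* the patch guards this with an if */
	return done * 100 / total;
}

/* Convert to double before dividing so the fraction survives. */
double
percent_exact(uint32_t done, uint32_t total)
{
	assert(total != 0);
	return (double) done * 100.0 / (double) total;
}
```

With the numbers from the report upthread (block 316 of 164151), the first form yields 0 while the second yields a value a little above 0.19. Converting first also avoids the 32-bit wraparound that done * 100 would hit for very large block counts.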


regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: [PROPOSAL] VACUUM Progress Checker.

From

Robert Haas

Date:

04 March 2016, 22:11:23

On Fri, Feb 26, 2016 at 3:28 AM,  <pokurev@pm.nttdata.co.jp> wrote:
> Thank you for your comments.
> Please find attached patch addressing following comments.
>
>>As I might have written upthread, transferring the whole string
>>as a progress message is useless at least in this scenario. Since
>>they are a set of fixed messages, each of them can be represented
>>by an identifier, an integer number. I don't see a reason for
>>sending the whole of a string beyond a backend.
> Agreed. I used following macros.
> #define VACUUM_PHASE_SCAN_HEAP  1
> #define VACUUM_PHASE_VACUUM_INDEX_HEAP  2
>
>>I guess num_index_scans could better be reported after all the indexes are
>>done, that is, after the for loop ends.
> Agreed.  I have corrected it.
>
>> CREATE VIEW pg_stat_vacuum_progress AS
>>   SELECT S.s[1] as pid,
>>          S.s[2] as relid,
>>          CASE S.s[3]
>>            WHEN 1 THEN 'Scanning Heap'
>>            WHEN 2 THEN 'Vacuuming Index and Heap'
>>            ELSE 'Unknown phase'
>>          END,
>>    ....
>>   FROM pg_stat_get_command_progress(PROGRESS_COMMAND_VACUUM) as S;
>>
>> # The name of the function could be other than *_command_progress.
> The name of function is updated as pg_stat_get_progress_info() and also updated the function.
> Updated the pg_stat_vacuum_progress view as suggested.

I'm positive I've said this at least once before while reviewing this
patch, and I think more than once: we should be trying to build a
general progress-reporting facility here with vacuum as the first
user.  Therefore, for example, pg_stat_get_progress_info's output
columns should have generic names, not names specific to VACUUM.
pg_stat_vacuum_progress can alias them to a vacuum-specific name.  See
for example the relationship between pg_stats and pg_statistic.

I think VACUUM should have three phases, not two.  lazy_vacuum_index()
and lazy_vacuum_heap() are lumped together right now, but I think they
shouldn't be.

Please create named constants for the first argument to
pgstat_report_progress_update_counter(), maybe with names like
PROGRESS_VACUUM_WHATEVER.
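
As a rough illustration of the two suggestions above (three phases, and named constants instead of bare slot numbers), something along these lines would do. Everything here is a sketch under assumptions: VACUUM_PHASE_SCAN_HEAP = 1 and the 'Scanning Heap' label come from the quoted patch, while the other two phase names, their labels, the PROGRESS_VACUUM_* names, and the slot numbers (inferred from the quoted hunks) are hypothetical.

```c
/* Phase identifiers; the second and third are hypothetical. */
typedef enum VacuumPhase
{
	VACUUM_PHASE_SCAN_HEAP = 1,
	VACUUM_PHASE_VACUUM_INDEX = 2,
	VACUUM_PHASE_VACUUM_HEAP = 3
} VacuumPhase;

/* Hypothetical names for the counter slots, replacing bare 0, 1, 2. */
#define PROGRESS_VACUUM_PHASE			0
#define PROGRESS_VACUUM_TOTAL_HEAP_BLKS	1
#define PROGRESS_VACUUM_CUR_HEAP_BLKNO	2

/* Map a phase identifier to the label a progress view could show. */
const char *
vacuum_phase_name(int phase)
{
	switch (phase)
	{
		case VACUUM_PHASE_SCAN_HEAP:
			return "Scanning Heap";
		case VACUUM_PHASE_VACUUM_INDEX:
			return "Vacuuming Index";
		case VACUUM_PHASE_VACUUM_HEAP:
			return "Vacuuming Heap";
		default:
			return "Unknown phase";
	}
}
```

A call site would then read pgstat_report_progress_update_counter(PROGRESS_VACUUM_CUR_HEAP_BLKNO, ...) rather than passing a literal 2.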

+               /* Update current block number of the relation */
+               pgstat_report_progress_update_counter(2, blkno + 1);

Why + 1?

I thought we had a plan to update the counter of scanned index pages
after each index page was vacuumed by the AM.  Doing it only after
vacuuming the entire index is much less granular and generally less
useful.   See http://www.postgresql.org/message-id/56500356.4070101@BlueTreble.com

+               if (blkno == nblocks - 1 && vacrelstats->num_dead_tuples == 0 && nindexes != 0
+                       && vacrelstats->num_index_scans == 0)
+                       total_index_pages = 0;

I'm not sure what this is trying to do, perhaps because there is no
comment explaining it.  Whatever the intent, I suspect that such a
complex test is likely to be fragile.  Perhaps there is a better way?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [PROPOSAL] VACUUM Progress Checker.

From

Amit Langote

Date:

05 March 2016, 07:24:41

On Sat, Mar 5, 2016 at 7:11 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, Feb 26, 2016 at 3:28 AM,  <pokurev@pm.nttdata.co.jp> wrote:
>> Thank you for your comments.
>> Please find attached patch addressing following comments.
>
> I'm positive I've said this at least once before while reviewing this
> patch, and I think more than once: we should be trying to build a
> general progress-reporting facility here with vacuum as the first
> user.  Therefore, for example, pg_stat_get_progress_info's output
> columns should have generic names, not names specific to VACUUM.
> pg_stat_vacuum_progress can alias them to a vacuum-specific name.  See
> for example the relationship between pg_stats and pg_statistic.
>
> I think VACUUM should have three phases, not two.  lazy_vacuum_index()
> and lazy_vacuum_heap() are lumped together right now, but I think they
> shouldn't be.
>
> Please create named constants for the first argument to
> pgstat_report_progress_update_counter(), maybe with names like
> PROGRESS_VACUUM_WHATEVER.
>
> +               /* Update current block number of the relation */
> +               pgstat_report_progress_update_counter(2, blkno + 1);
>
> Why + 1?
>
> I thought we had a plan to update the counter of scanned index pages
> after each index page was vacuumed by the AM.  Doing it only after
> vacuuming the entire index is much less granular and generally less
> useful.   See http://www.postgresql.org/message-id/56500356.4070101@BlueTreble.com
>
> +               if (blkno == nblocks - 1 &&
> vacrelstats->num_dead_tuples == 0 && nindexes != 0
> +                       && vacrelstats->num_index_scans == 0)
> +                       total_index_pages = 0;
>
> I'm not sure what this is trying to do, perhaps because there is no
> comment explaining it.  Whatever the intent, I suspect that such a
> complex test is likely to be fragile.  Perhaps there is a better way?

So, I took Vinayak's latest patch and rewrote it a little, keeping
the original idea but modifying the code to some degree.  I hope the
original author(s) are okay with it.  Vinayak, please check whether the
rewritten patch is alright and improve it any way you want.

I broke it into two:

0001-Provide-a-way-for-utility-commands-to-report-progres.patch
0002-Implement-progress-reporting-for-VACUUM-command.patch

The code review comments received recently (including mine) have been
incorporated.

However, I didn't implement the report-per-index-page-vacuumed bit, but
it should be easy to code once the details are finalized (questions like
whether it requires modifying any existing interfaces, etc.).

Thanks,
Amit

Attachment

Re: [PROPOSAL] VACUUM Progress Checker.

From

Amit Langote

Date:

05 March 2016, 07:41:57

On Sat, Mar 5, 2016 at 4:24 PM, Amit Langote <amitlangote09@gmail.com> wrote:
> So, I took the Vinayak's latest patch and rewrote it a little
...
> I broke it into two:
>
> 0001-Provide-a-way-for-utility-commands-to-report-progres.patch
> 0002-Implement-progress-reporting-for-VACUUM-command.patch

Oops, unamended commit messages in those patches are misleading.  So,
please find attached corrected versions.

Thanks,
Amit

Attachment

Re: [PROPOSAL] VACUUM Progress Checker.

From

Date:

07 March 2016, 01:45:29

Hi Amit,

Thank you for updating the patch. I am testing it and I will try to improve it.

Regards,
Vinayak
> -----Original Message-----
> From: Amit Langote [mailto:amitlangote09@gmail.com]
> Sent: Saturday, March 05, 2016 4:41 PM
> To: Robert Haas <robertmhaas@gmail.com>
> Cc: SPS ポクレ ヴィナヤック(三技術) <pokurev@pm.nttdata.co.jp>;
> Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>; Kyotaro HORIGUCHI
> <horiguchi.kyotaro@lab.ntt.co.jp>; pgsql-hackers@postgresql.org; SPS 坂野
> 昌平(三技術) <bannos@nttdata.co.jp>
> Subject: Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.
> 
> On Sat, Mar 5, 2016 at 4:24 PM, Amit Langote <amitlangote09@gmail.com>
> wrote:
> > So, I took the Vinayak's latest patch and rewrote it a little
> ...
> > I broke it into two:
> >
> > 0001-Provide-a-way-for-utility-commands-to-report-progres.patch
> > 0002-Implement-progress-reporting-for-VACUUM-command.patch
> 
> Oops, unamended commit messages in those patches are misleading.  So,
> please find attached corrected versions.
> 
> Thanks,
> Amit

Re: [PROPOSAL] VACUUM Progress Checker.

From

Kyotaro HORIGUCHI

Date:

07 March 2016, 04:03:34

Hi, thank you for the patch.

At Sat, 5 Mar 2016 16:41:29 +0900, Amit Langote <amitlangote09@gmail.com> wrote in
<CA+HiwqHTeuqWMc+ktneGqFdJMRXD=syncgU0914TVXaahOF56g@mail.gmail.com>
> On Sat, Mar 5, 2016 at 4:24 PM, Amit Langote <amitlangote09@gmail.com> wrote:
> > So, I took the Vinayak's latest patch and rewrote it a little
> ...
> > I broke it into two:
> >
> > 0001-Provide-a-way-for-utility-commands-to-report-progres.patch
> > 0002-Implement-progress-reporting-for-VACUUM-command.patch
> 
> Oops, unamended commit messages in those patches are misleading.  So,
> please find attached corrected versions.

The 0001 patch adds the following interface functions.

+extern void pgstat_progress_set_command(BackendCommandType cmdtype);
+extern void pgstat_progress_set_command_target(Oid objid);
+extern void pgstat_progress_update_param(int index, uint32 val);
+extern void pgstat_reset_local_progress(void);
+extern int    pgstat_progress_get_num_param(BackendCommandType cmdtype);

I'd rather not treat the target object ID differently from the
other parameters. Some commands may not need it at all, while
others may need two or more. Although OIDs are not guaranteed to
fit in uint32, we already store BlockNumber there.

# I think integer arrays might need to be passed as a parameter,
# but that would be another issue.

pg_stat_get_progress_info returns a tuple with 10 integer columns
(plus an object ID). The reason I suggested using an integer
array is that it lets the API serve an arbitrary number of
parameters without modifying the API, and array indexes are
more neutral than any concrete names. However, I won't insist on
that if we agree that a fixed number of parameters is OK.

pgstat_progress_get_num_param does not look good from the
standpoint of generality. If it is needed at all, I'd rather
define it as an integer array indexed by the command type.
However, it seems enough to me that pg_stat_get_progress_info
always returns 10 integers regardless of what the numbers mean.
The user-facing SQL view, pg_stat_vacuum_progress being the
first, knows how many numbers it should read for its work. It
safely reads zeroes even if it reads more than the producer side
supplied (unless it tries to divide something by them).
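
The idea can be sketched in a few lines of standalone C. This is only an illustration under assumptions, not code from the patch: N_PROGRESS_PARAM, copy_progress_params(), and heap_percent_complete() are hypothetical names, and a plain uint32_t array stands in for the shared-memory slots.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define N_PROGRESS_PARAM 10		/* assumed fixed number of slots */

/* Producer-agnostic copy: always hand back every slot, set or not. */
void
copy_progress_params(const uint32_t *slots, uint32_t *result)
{
	memcpy(result, slots, N_PROGRESS_PARAM * sizeof(uint32_t));
}

/*
 * View-side computation. Unset slots read as zero, which is harmless
 * unless used as a divisor, so report "no value" (SQL NULL) in that case.
 */
bool
heap_percent_complete(const uint32_t *result, int total_slot, int done_slot,
					  double *percent)
{
	if (result[total_slot] == 0)
		return false;
	*percent = (double) result[done_slot] * 100.0 / (double) result[total_slot];
	return true;
}
```

The generic function never needs to know how many slots a command uses. Only the command-specific view does, and the only hazard it has to guard against is a zero divisor.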

What do you think about this?

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: [PROPOSAL] VACUUM Progress Checker.

From

Amit Langote

Date:

07 March 2016, 07:17:11

Horiguchi-san,

Thanks a lot for taking a look!

On 2016/03/07 13:02, Kyotaro HORIGUCHI wrote:
> At Sat, 5 Mar 2016 16:41:29 +0900, Amit Langote wrote:
>> On Sat, Mar 5, 2016 at 4:24 PM, Amit Langote <amitlangote09@gmail.com> wrote:
>>> So, I took the Vinayak's latest patch and rewrote it a little
>> ...
>>> I broke it into two:
>>>
>>> 0001-Provide-a-way-for-utility-commands-to-report-progres.patch
>>> 0002-Implement-progress-reporting-for-VACUUM-command.patch
>>
>> Oops, unamended commit messages in those patches are misleading.  So,
>> please find attached corrected versions.
>
> The 0001-P.. adds the following interface functions.
>
> +extern void pgstat_progress_set_command(BackendCommandType cmdtype);
> +extern void pgstat_progress_set_command_target(Oid objid);
> +extern void pgstat_progress_update_param(int index, uint32 val);
> +extern void pgstat_reset_local_progress(void);
> +extern int    pgstat_progress_get_num_param(BackendCommandType cmdtype);
>
> I don't like to treat the target object id differently from other
> parameters. It could not be needed at all, or could be needed two
> or more in contrast. Although oids are not guaranteed to fit
> uint32, we have already stored BlockNumber there.

I thought giving cmdtype and objid each its own slot would make things a
little bit clearer than stuffing them into st_progress_param[0] and
st_progress_param[1], respectively.  Is that what you are suggesting?
Although, as I've done it, a separate field st_command_objid may be a bit
too much.

If they are not special fields, I think we don't need the special interface
functions *set_command() and *set_command_target().  But I am still
inclined toward keeping the former.

> # I think that integer arrays might be needed to be passed as a
> # parameter, but it would be the another issue.

Didn't really think about it.  Maybe we should consider a scenario that
would require it.

> pg_stat_get_progress_info returns a tuple with 10 integer columns
> (plus an object id). The reason why I suggested use of an integer
> array is that it allows the API to serve arbitrary number of
> parmeters without a modification of API, and array indexes are
> coloreless than any concrete names. Howerver I don't stick to
> that if we agree that it is ok to have fixed number of paremters.

I think the fixed number of parameters, in the form of a fixed-size array,
follows from st_progress_param[] being part of a shared memory structure, as
discussed before.  Although the interface has been roughly modeled on how the
pg_statistic catalog and the pg_stats view, or the get_attstatsslot() function,
work, shared memory structures take the place of the catalog here, so there
are some restrictions (the fixed-size array being one).

Regarding the index into st_progress_param[], pgstat.c/pgstatfuncs.c should
not care what it means.  As exemplified in patch 0002, individual index
numbers can be defined as macros by the individual command modules (as
Robert suggested recently), following some convention for readability, such
as the following in vacuumlazy.c:

#define PROG_PAR_VAC_RELID                     0
#define PROG_PAR_VAC_PHASE_ID                  1
#define PROG_PAR_VAC_HEAP_BLKS                 2
#define PROG_PAR_VAC_CUR_HEAP_BLK              3
... so on.

Then, to report a changed parameter:

pgstat_progress_update_param(PROG_PAR_VAC_PHASE_ID, LV_PHASE_SCAN_HEAP);
...
pgstat_progress_update_param(PROG_PAR_VAC_CUR_HEAP_BLK, blkno);
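
For readers who want to see the mechanism end to end, here is a simplified standalone mock of the reporting side. It is a sketch under assumptions and not the patch itself: MockBackendStatus, my_entry, mock_progress_update_param(), and mock_scan_heap() are hypothetical stand-ins, the slot count of 10 and the value of LV_PHASE_SCAN_HEAP are assumed, and the real function would also have to follow the shared-memory update protocol, which is omitted here.

```c
#include <assert.h>
#include <stdint.h>

#define N_PROGRESS_PARAM			10	/* assumed slot count */

#define PROG_PAR_VAC_RELID			0
#define PROG_PAR_VAC_PHASE_ID		1
#define PROG_PAR_VAC_HEAP_BLKS		2
#define PROG_PAR_VAC_CUR_HEAP_BLK	3

#define LV_PHASE_SCAN_HEAP			1	/* assumed value */

/* Stand-in for the per-backend entry that lives in shared memory. */
typedef struct MockBackendStatus
{
	uint32_t	st_progress_param[N_PROGRESS_PARAM];
} MockBackendStatus;

static MockBackendStatus my_entry;

/* Simplified stand-in for pgstat_progress_update_param(). */
void
mock_progress_update_param(int index, uint32_t val)
{
	assert(index >= 0 && index < N_PROGRESS_PARAM);
	my_entry.st_progress_param[index] = val;
}

/* Report progress the way a heap-scan loop might. */
void
mock_scan_heap(uint32_t nblocks)
{
	uint32_t	blkno;

	mock_progress_update_param(PROG_PAR_VAC_PHASE_ID, LV_PHASE_SCAN_HEAP);
	mock_progress_update_param(PROG_PAR_VAC_HEAP_BLKS, nblocks);
	for (blkno = 0; blkno < nblocks; blkno++)
		mock_progress_update_param(PROG_PAR_VAC_CUR_HEAP_BLK, blkno);
}
```

The generic layer only ever sees an index and a value. What each slot means stays entirely within the command module that defines the macros.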

By the way, the following is proargnames[] for pg_stat_get_progress_info():

cmdtype integer,
OUT pid integer,
OUT param1 integer,
OUT param2 integer,
...
OUT param10 integer

So, it is the responsibility of a command-specific progress view definition
to interpret the values of param1..param10 appropriately.  In fact, it is the
implementer of progress reporting for a command who determines what goes
into which slot of st_progress_param[] to begin with.

> pgstat_progress_get_num_param looks not good in the aspect of
> genericity. I'd like to define it as an integer array by idexed
> by the command type if it is needed. However it seems to me to be
> enough that pg_stat_get_progress_info always returns 10 integers
> regardless of what the numbers are for. The user sql function,
> pg_stat_vacuum_progress as the first user, knows how many numbers
> should be read for its work. It reads zeroes safely even if it
> reads more than what the producer side offered (unless it tries
> to divide something with it).

Thinking about it a bit, perhaps we don't need the num_param(cmdtype)
function or array at all, as you seem to suggest.  It serves no useful
purpose now that I look at it. pg_stat_get_progress_info() should simply copy
st_progress_param[0...PG_STAT_GET_PROGRESS_COLS-1] to the result, and the
view definer knows what's what.

Attached are updated patches which incorporate the above-mentioned changes.
If Vinayak has something else in mind about anything, he can weigh in.

Thanks,
Amit

Attachment

Re: [PROPOSAL] VACUUM Progress Checker.

From

Kyotaro HORIGUCHI

Date:

07 March 2016, 09:19:16

Hi, Amit.

At Mon, 7 Mar 2016 16:16:30 +0900, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote in
<56DD2ACE.5050208@lab.ntt.co.jp>
> 
> Horiguchi-san,
> 
> Thanks a lot for taking a look!
> 
> On 2016/03/07 13:02, Kyotaro HORIGUCHI wrote:
> > At Sat, 5 Mar 2016 16:41:29 +0900, Amit Langote wrote:
> >> On Sat, Mar 5, 2016 at 4:24 PM, Amit Langote <amitlangote09@gmail.com> wrote:
> >>> So, I took the Vinayak's latest patch and rewrote it a little
> >> ...
> >>> I broke it into two:
> >>>
> >>> 0001-Provide-a-way-for-utility-commands-to-report-progres.patch
> >>> 0002-Implement-progress-reporting-for-VACUUM-command.patch
> >>
> >> Oops, unamended commit messages in those patches are misleading.  So,
> >> please find attached corrected versions.
> > 
> > The 0001-P.. adds the following interface functions.
> > 
> > +extern void pgstat_progress_set_command(BackendCommandType cmdtype);
> > +extern void pgstat_progress_set_command_target(Oid objid);
> > +extern void pgstat_progress_update_param(int index, uint32 val);
> > +extern void pgstat_reset_local_progress(void);
> > +extern int    pgstat_progress_get_num_param(BackendCommandType cmdtype);
> > 
> > I don't like to treat the target object id differently from other
> > parameters. It could not be needed at all, or could be needed two
> > or more in contrast. Although oids are not guaranteed to fit
> > uint32, we have already stored BlockNumber there.
> 
> I thought giving cmdtype and objid each its own slot would make things a
> little bit clearer than stuffing them into st_progress_param[0] and
> st_progress_param[1], respectively.  Is that what you are suggesting?
> Although as I've don, a separate field st_command_objid may be a bit too much.

I was referring only to the object ID, as you seem to have
understood. The command type is essential, unlike the target
object IDs: it is needed by all statistics views of this kind to
filter for the relevant backends.

> If they are not special fields, I think we don't need special interface
> functions *set_command() and *set_command_target().  But I am still
> inclined toward keeping the former.
> 
> > # I think that integer arrays might be needed to be passed as a
> > # parameter, but it would be the another issue.
> 
> Didn't really think about it.  Maybe we should consider a scenario that
> would require it.

Imagine providing statistics for a vacuum command as a
whole. It vacuums several relations at once, so the view could
look like the following.

select * from pg_stat_vacuum_command;
- [ Record 1 ]
worker_pid     : 3243
command        : vacuum full
rels_scheduled : {16387, 16390, 16393}
rels_finished  : {16384}
status         : Processing 16384, waiting for a lock.
..

This would need arrays if we wanted it, but as I said, that is
another issue.


> > pg_stat_get_progress_info returns a tuple with 10 integer columns
> > (plus an object id). The reason why I suggested use of an integer
> > array is that it allows the API to serve arbitrary number of
> > parmeters without a modification of API, and array indexes are
> > coloreless than any concrete names. Howerver I don't stick to
> > that if we agree that it is ok to have fixed number of paremters.
> 
> I think the fixed number of parameters in the form of a fixed-size array
> is because st_progress_param[] is part of a shared memory structure as
> discussed before.  Although such interface has been roughly modeled on how
> pg_statistic catalog and pg_stats view or get_attstatsslot() function
> work, shared memory structures take the place of the catalog, so there are
> some restrictions (fixed size array being one).

It depends on how easily we are willing to widen the parameter
slots in shared memory :p  Anyway, I won't insist on it since it
doesn't make a significant difference.

> Regarding index into st_progress_param[], pgstat.c/pgstatfuncs.c should
> not bother what it is.  As exemplified in patch 0002, individual index
> numbers can be defined as macros by individual command modules (suggested
> by Robert recently) with certain convention for readability such as the
> following in lazyvacuum.c:
> 
> #define PROG_PAR_VAC_RELID                     0
> #define PROG_PAR_VAC_PHASE_ID                  1
> #define PROG_PAR_VAC_HEAP_BLKS                 2
> #define PROG_PAR_VAC_CUR_HEAP_BLK              3
> ... so on.
> 
> Then, to report a changed parameter:
> 
> pgstat_progress_update_param(PROG_PAR_VAC_PHASE_ID, LV_PHASE_SCAN_HEAP);
> ...
> pgstat_progress_update_param(PROG_PAR_VAC_CUR_HEAP_BLK, blkno);

Yeah, that seems fine to me.

> by the way, following is proargnames[] for pg_stat_get_progress_info():
> 
> cmdtype integer,
> OUT pid integer,
> OUT param1 integer,
> OUT param2 integer,
> ...
> OUT param10 integer
> 
> So, it is a responsibility of a command specific progress view definition
> that it interprets values of param1..param10 appropriately.  In fact, the
> implementer of the progress reporting for a command determines what goes
> into which slot of st_progress_param[], to begin with.

It seems quite fine, too.

> > pgstat_progress_get_num_param looks not good in the aspect of
> > genericity. I'd like to define it as an integer array by idexed
> > by the command type if it is needed. However it seems to me to be
> > enough that pg_stat_get_progress_info always returns 10 integers
> > regardless of what the numbers are for. The user sql function,
> > pg_stat_vacuum_progress as the first user, knows how many numbers
> > should be read for its work. It reads zeroes safely even if it
> > reads more than what the producer side offered (unless it tries
> > to divide something with it).
> 
> Thinking a bit, perhaps we don't need num_param(cmdtpye) function or array
> at all as you seem to suggest.  It serves no useful purpose now that I see

^^; Sorry for the cryptic description..

> it. pg_stat_get_progress_info() should simply copy
> st_progress_param[0...PG_STAT_GET_PROGRESS_COLS-1] to the result and view
> definer knows what's what.
> 
> Attached updated patches which incorporate above mentioned changes.  If
> Vinayak has something else in mind about anything, he can weigh in.


regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: [PROPOSAL] VACUUM Progress Checker.

From

Amit Langote

Date:

07 March 2016, 10:11:25

Horiguchi-san,

Thanks for a quick reply, :)

On 2016/03/07 18:18, Kyotaro HORIGUCHI wrote:
> At Mon, 7 Mar 2016 16:16:30 +0900, Amit Langote wrote:
>> On 2016/03/07 13:02, Kyotaro HORIGUCHI wrote:
>>> The 0001-P.. adds the following interface functions.
>>>
>>> I don't like to treat the target object id differently from other
>>> parameters. It could not be needed at all, or could be needed two
>>> or more in contrast. Although oids are not guaranteed to fit
>>> uint32, we have already stored BlockNumber there.
>>
>> I thought giving cmdtype and objid each its own slot would make things a
>> little bit clearer than stuffing them into st_progress_param[0] and
>> st_progress_param[1], respectively.  Is that what you are suggesting?
>> Although as I've done, a separate field st_command_objid may be a bit too much.
> 
> I mentioned only of object id as you seem to take me. The command
> type is essential unlike the target object ids. It is needed by
> all statistics views of this kind to filter required backends.

Yep.

>>> # I think that integer arrays might be needed to be passed as a
>>> # parameter, but it would be the another issue.
>>
>> Didn't really think about it.  Maybe we should consider a scenario that
>> would require it.
> 
> Imagine providing statistics for a vacuum command as a
> whole. It will vacuum several relations at once so the view could
> be like the following.
> 
> select * from pg_stat_vacuum_command;
> - [ Record 1 ]
> worker_pid     : 3243
> command        : vacuum full
> rels_scheduled : {16387, 16390, 16393}
> rels_finished  : {16384}
> status         : Processing 16384, waiting for a lock.
> ..
> 
> This needs arrays if we want this but it would be another issue
> as I said.

Oh, I see. This does call for at least some consideration of how to
support variable size parameter values.

By the way, looking at the "status" message in your example, it doesn't
seem like a candidate for evaluation in a CASE..WHEN expression?  Maybe,
we should re-introduce[1] a fixed-size char st_progress_message[] field.
Since, ISTM, such a command's internal code is in better position to
compute that kind of message string.  IIRC, a reason that was given to not
have such a field was, among other things, the copy overhead of message
strings.  But commands like the one in your example, could afford that
much overhead since the frequency of message change would be less and less
compared with the time elapsed between the changes anyway.

>> I think the fixed number of parameters in the form of a fixed-size array
>> is because st_progress_param[] is part of a shared memory structure as
>> discussed before.  Although such interface has been roughly modeled on how
>> pg_statistic catalog and pg_stats view or get_attstatsslot() function
>> work, shared memory structures take the place of the catalog, so there are
>> some restrictions (fixed size array being one).
> 
> It depends on how easy we take it to widen the parameter slots in
> shared memory :p Anyway I won't insist on that since it doesn't make
> a significant difference.

Your above example makes me wonder how we can provide for it.

Thanks,
Amit

Re: [PROPOSAL] VACUUM Progress Checker.

From

Amit Langote

Date:

07 March 2016, 10:18:33

On 2016/03/07 19:11, Amit Langote wrote:
> we should re-introduce[1] a fixed-size char st_progress_message[] field.

Sorry, that [1] does not refer to anything, just a leftover from my draft.
I thought I had a link handy for an email where some sort of
justification was given as to why st_progress_message field was removed
from the patch.  I couldn't find it.

Thanks,
Amit

Re: [PROPOSAL] VACUUM Progress Checker.

From

Robert Haas

Date:

07 March 2016, 14:48:43

On Sun, Mar 6, 2016 at 11:02 PM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> At Sat, 5 Mar 2016 16:41:29 +0900, Amit Langote <amitlangote09@gmail.com> wrote in
<CA+HiwqHTeuqWMc+ktneGqFdJMRXD=syncgU0914TVXaahOF56g@mail.gmail.com>
>> On Sat, Mar 5, 2016 at 4:24 PM, Amit Langote <amitlangote09@gmail.com> wrote:
>> > So, I took the Vinayak's latest patch and rewrote it a little
>> ...
>> > I broke it into two:
>> >
>> > 0001-Provide-a-way-for-utility-commands-to-report-progres.patch
>> > 0002-Implement-progress-reporting-for-VACUUM-command.patch
>>
>> Oops, unamended commit messages in those patches are misleading.  So,
>> please find attached corrected versions.
>
> The 0001-P.. adds the following interface functions.
>
> +extern void pgstat_progress_set_command(BackendCommandType cmdtype);
> +extern void pgstat_progress_set_command_target(Oid objid);
> +extern void pgstat_progress_update_param(int index, uint32 val);
> +extern void pgstat_reset_local_progress(void);
> +extern int     pgstat_progress_get_num_param(BackendCommandType cmdtype);
>
> I don't like to treat the target object id differently from other
> parameters. It could not be needed at all, or could be needed two
> or more in contrast. Although oids are not guaranteed to fit
> uint32, we have already stored BlockNumber there.

Well...

There's not much point in deciding that the parameters are uint32,
because we don't have that type at the SQL level.
pgstat_progress_update_param() really ought to take either int32 or
int64 as an argument, because that's what we can actually handle from
SQL, and it seems pretty clear that int64 is better since otherwise we
can't fit, among other things, a block number.
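
A minimal sketch of the width argument, assuming only that BlockNumber is an unsigned 32-bit integer as in PostgreSQL. The helper names block_to_param64 and fits_in_int32 are made up for illustration and are not part of any patch in this thread.

```c
#include <stdint.h>

/* Stand-in for PostgreSQL's BlockNumber, an unsigned 32-bit integer. */
typedef uint32_t BlockNumber;

/* Hypothetical helper: widen a block number into a signed 64-bit slot. */
static int64_t
block_to_param64(BlockNumber blkno)
{
    /* Every uint32 value is representable in int64, so nothing is lost. */
    return (int64_t) blkno;
}

/*
 * Hypothetical helper: nonzero if blkno could be stored in a signed 32-bit
 * slot (SQL "integer") without changing its value.
 */
static int
fits_in_int32(BlockNumber blkno)
{
    return blkno <= (BlockNumber) INT32_MAX;
}
```

Block numbers above 2^31 - 1 fail the second check, which is the case that a signed 32-bit SQL column cannot represent.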

Given that, I tend to think that treating the command target specially
and passing that as an OID is reasonable.  We're not going to be able
to pass variable-sized arrays through this mechanism, ever, because
our shared memory segment doesn't work like that.  And it seems to me
that nearly every command somebody might want to report progress on
will touch, basically, one relation at a time.  So I don't see the harm
in hardcoding that idea into the facility.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [PROPOSAL] VACUUM Progress Checker.

From

Amit Langote

Date:

08 March 2016, 08:02:55

On 2016/03/07 23:48, Robert Haas wrote:
> On Sun, Mar 6, 2016 at 11:02 PM, Kyotaro HORIGUCHI
> <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
>> The 0001-P.. adds the following interface functions.
>>
>> +extern void pgstat_progress_set_command(BackendCommandType cmdtype);
>> +extern void pgstat_progress_set_command_target(Oid objid);
>> +extern void pgstat_progress_update_param(int index, uint32 val);
>> +extern void pgstat_reset_local_progress(void);
>> +extern int     pgstat_progress_get_num_param(BackendCommandType cmdtype);
>>
>> I don't like to treat the target object id differently from other
>> parameters. It could not be needed at all, or could be needed two
>> or more in contrast. Although oids are not guaranteed to fit
>> uint32, we have already stored BlockNumber there.
>
> Well...
>
> There's not much point in deciding that the parameters are uint32,
> because we don't have that type at the SQL level.
> pgstat_progress_update_param() really ought to take either int32 or
> int64 as an argument, because that's what we can actually handle from
> SQL, and it seems pretty clear that int64 is better since otherwise we
> can't fit, among other things, a block number.
>
> Given that, I tend to think that treating the command target specially
> and passing that as an OID is reasonable.  We're not going to be able
> to pass variable-sized arrays through this mechanism, ever, because
> our shared memory segment doesn't work like that.  And it seems to me
> that nearly every command somebody might want to report progress on
> will touch, basically, one relation at a time.  So I don't see the harm
> in hardcoding that idea into the facility.

Updated versions attached.

* changed st_progress_param to int64 and so did the argument of
pgstat_progress_update_param().  Likewise changed param1..param10 of
pg_stat_get_progress_info()'s output columns to bigint.

* Added back the Oid field st_command_target and corresponding function
pgstat_progress_set_command_target(Oid).

* I attempted to implement a method to report index blocks done from
lazy_tid_reaped() albeit with some limitations. Patch 0003 is that
attempt.  In summary, I modified the index bulk delete callback interface
to receive a BlockNumber argument index_blkno:

 /* Typedef for callback function to determine if a tuple is bulk-deletable */
-typedef bool (*IndexBulkDeleteCallback) (ItemPointer itemptr, void *state);
+typedef bool (*IndexBulkDeleteCallback) (ItemPointer itemptr,
+                                         BlockNumber index_blkno,
+                                         void *state);

Then added 2 more fields to LVRelStats:

@@ -143,6 +143,8 @@ typedef struct LVRelStats
     int         num_index_scans;
     TransactionId latestRemovedXid;
     bool        lock_waiter_detected;
+    BlockNumber last_index_blkno;
+    BlockNumber index_blks_vacuumed;

Then in lazy_tid_reaped(), if the index block number received in the
index_blkno argument has changed from the previous call, increment the
count of index blocks processed and
pgstat_report_update_param(index_blks_done). I wonder if we should reset
the saved block number and the count for every index vacuumed by
lazy_vacuum_index(). Right now, total_index_blks counts all indexes and
counting blocks using the rough method mentioned above is sensible only
for one index at a time.  Actually, the method has different kinds of
problems to deal with anyway. For example, with a btree index, one can
expect that the final count does not match total_index_blks obtained using
RelationGetNumberOfBlocks().  Moreover, each AM has its own idiosyncratic
way of traversing the index pages. I dared only touch the btree case to
make it pass current block number to the callback. It finishes with
index_blks_done << total_index_blks since I guess the callback is called
only on the leaf pages. Any ideas?

* I am also tempted to add num_dead_tuples and dead_tuples_vacuumed to add
granularity to 'vacuuming heap' phase but didn't in this version. Should we?

Thanks,
Amit

Attachment

Re: [PROPOSAL] VACUUM Progress Checker.

From

Kyotaro HORIGUCHI

Date:

08 March 2016, 09:19:59

You're so quick.

At Tue, 8 Mar 2016 17:02:24 +0900, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote in
<56DE8710.4070202@lab.ntt.co.jp>
> On 2016/03/07 23:48, Robert Haas wrote:
> >> I don't like to treat the target object id differently from other
> >> parameters. It could not be needed at all, or could be needed two
> >> or more in contrast. Although oids are not guaranteed to fit
> >> uint32, we have already stored BlockNumber there.
> > 
> > Well...
> > 
> > There's not much point in deciding that the parameters are uint32,
> > because we don't have that type at the SQL level.
> > pgstat_progress_update_param() really ought to take either int32 or
> > int64 as an argument, because that's what we can actually handle from
> > SQL, and it seems pretty clear that int64 is better since otherwise we
> > can't fit, among other things, a block number.
> > 
> > Given that, I tend to think that treating the command target specially
> > and passing that as an OID is reasonable.  We're not going to be able
> > to pass variable-sized arrays through this mechanism, ever, because
> > our shared memory segment doesn't work like that.  And it seems to me
> > that nearly every command somebody might want to report progress on
> > will touch, basically, one relation at a time.  So I don't see the harm
> > in hardcoding that idea into the facility.

We could concatenate two int32s into an int64, but widening each
parameter to int64 would be preferable. An additional 4 bytes times
the default number of maxbackends (100) times 10 parameters = 4kB, and
40kB for 1000 backends is not so big for modern machines?
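
A quick check of that arithmetic, as a sketch that assumes 10 slots per backend (param1..param10) and that only the slot width changes from 4 to 8 bytes. The function name extra_bytes_for_int64 is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Slot count per backend, taken from param1..param10 in this thread. */
#define NUM_PROGRESS_PARAM 10

/*
 * Extra shared memory, in bytes, needed when each progress slot grows
 * from int32 to int64, for the given number of backends.
 */
static size_t
extra_bytes_for_int64(size_t max_backends)
{
    return (sizeof(int64_t) - sizeof(int32_t))
        * NUM_PROGRESS_PARAM * max_backends;
}
```

Under these assumptions the cost is 4000 bytes for 100 backends and 40000 bytes for 1000 backends.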

> Updated versions attached.
> 
> * changed st_progress_param to int64 and so did the argument of
> pgstat_progress_update_param().  Likewise changed param1..param10 of
> pg_stat_get_progress_info()'s output columns to bigint.
> 
> * Added back the Oid field st_command_target and corresponding function
> pgstat_progress_set_command_target(Oid).

+    beentry->st_command = COMMAND_INVALID;
+    MemSet(&beentry->st_progress_param, 0, sizeof(beentry->st_progress_param));

The MemSet seems useless since it gets the same initialization on
setting st_command.

+        /*
+         * Report values for only those backends which are running the given
+         * command.  XXX - privilege check is maybe dubious.
+         */
+        if (!beentry ||
+            beentry->st_command != cmdtype ||
+            !has_privs_of_role(GetUserId(), beentry->st_userid))
+            continue;

We can simply ignore unprivileged backends, or return all zeros or
nulls to signal that the caller has no privilege.

0002

+   FROM pg_stat_get_progress_info(1) AS S;

Ah... This magic number seems quite bad. The function should
take the command type, maybe as a string.

+   FROM pg_stat_get_progress_info('lazy vacuum') AS S;

Using an array of the names would be acceptable, maybe.

| char *progress_command_names[] = {"lazy vacuum", NULL};

However, the numbers for the phases ('scanning heap' and so on) are
acceptable to me, for reasons I am not sure of. They could also be
represented as names, but that might be rather bothersome.

+              WHEN 0 THEN 100::numeric(5, 2)
+              ELSE ((S.param3 + 1)::numeric / S.param2 * 100)::numeric(5, 2)

This usage of numeric seems overkill to me.



> * I attempted to implement a method to report index blocks done from
> lazy_tid_reaped() albeit with some limitations. Patch 0003 is that
> attempt.  In summary, I modified the index bulk delete callback interface
> to receive a BlockNumber argument index_blkno:
> 
>  /* Typedef for callback function to determine if a tuple is bulk-deletable */
> -typedef bool (*IndexBulkDeleteCallback) (ItemPointer itemptr, void *state);
> +typedef bool (*IndexBulkDeleteCallback) (ItemPointer itemptr,
> +                                         BlockNumber index_blkno,
> +                                         void *state);
> 
> Then added 2 more fields to LVRelStats:
> 
> @@ -143,6 +143,8 @@ typedef struct LVRelStats
>      int         num_index_scans;
>      TransactionId latestRemovedXid;
>      bool        lock_waiter_detected;
> +    BlockNumber last_index_blkno;
> +    BlockNumber index_blks_vacuumed;
> 
> Then in lazy_tid_reaped(), if the index block number received in the
> index_blkno argument has changed from the previous call, increment the
> count of index blocks processed and
> pgstat_report_update_param(index_blks_done). I wonder if we should reset
> the saved block number and the count for every index vacuumed by
> lazy_vacuum_index(). Right now, total_index_blks counts all indexes and
> counting blocks using the rough method mentioned above is sensible only
> for one index at a time.  Actually, the method has different kinds of
> problems to deal with anyway. For example, with a btree index, one can
> expect that the final count does not match total_index_blks obtained using
> RelationGetNumberOfBlocks().  Moreover, each AM has its own idiosyncratic
> way of traversing the index pages. I dared only touch the btree case to
> make it pass current block number to the callback. It finishes with
> index_blks_done << total_index_blks since I guess the callback is called
> only on the leaf pages. Any ideas?

On the contrary, I suppose it counts one index page more than
once in the case of uncorrelated heaps. index_blks_vacuumed can
exceed RelationGetNumberOfBlocks() in extreme cases. If I'm not
missing something, it stands on quite fragile ground.
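
The counting method under discussion, and the concern that a revisited page is counted again, can be sketched as follows. This is only an illustration: IndexBlkCounter mirrors the two proposed LVRelStats fields, while counter_note_block and count_visits are hypothetical names, and the visit sequences are made up rather than taken from a real index scan.

```c
#include <stdint.h>

typedef uint32_t BlockNumber;
#define InvalidBlockNumber ((BlockNumber) 0xFFFFFFFF)

/* Hypothetical subset of LVRelStats holding the two proposed fields. */
typedef struct IndexBlkCounter
{
    BlockNumber last_index_blkno;
    BlockNumber index_blks_vacuumed;
} IndexBlkCounter;

/* Bump the count whenever the block differs from the previous call. */
static void
counter_note_block(IndexBlkCounter *c, BlockNumber index_blkno)
{
    if (c->last_index_blkno != index_blkno)
    {
        c->last_index_blkno = index_blkno;
        c->index_blks_vacuumed++;
    }
}

/* Feed a sequence of visited index blocks and return the final count. */
static BlockNumber
count_visits(const BlockNumber *visits, int nvisits)
{
    IndexBlkCounter c = {InvalidBlockNumber, 0};
    int         i;

    for (i = 0; i < nvisits; i++)
        counter_note_block(&c, visits[i]);
    return c.index_blks_vacuumed;
}
```

If every page is visited in one contiguous run, the count equals the number of distinct pages. If the callback alternates between two pages, each switch is counted, so the total can exceed the number of pages in the index.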

> * I am also tempted to add num_dead_tuples and dead_tuples_vacuumed to add
> granularity to 'vacuuming heap' phase but didn't in this version. Should we?

How do you think they are to be used?

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: [PROPOSAL] VACUUM Progress Checker.

From

Robert Haas

Date:

08 March 2016, 15:24:51

On Tue, Mar 8, 2016 at 3:02 AM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
> Updated versions attached.
>
> * changed st_progress_param to int64 and so did the argument of
> pgstat_progress_update_param().  Likewise changed param1..param10 of
> pg_stat_get_progress_info()'s output columns to bigint.
>
> * Added back the Oid field st_command_target and corresponding function
> pgstat_progress_set_command_target(Oid).

What the heck do we have an SQL-visible pg_stat_reset_local_progress()
for?  Surely if we ever need that, it's a bug.

I think pgstat_progress_update_param() should Assert(index >= 0 &&
index < N_PROGRESS_PARAM).  But I'd rename N_PROGRESS_PARAM to
PGSTAT_NUM_PROGRESS_PARAM.
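
A minimal sketch of the suggested bounds check, assuming a file-local array in place of the real per-backend shared memory entry. The standard assert() stands in for PostgreSQL's Assert() macro, and progress_update_param and progress_get_param are illustrative names rather than the committed functions.

```c
#include <assert.h>
#include <stdint.h>

#define PGSTAT_NUM_PROGRESS_PARAM 10

/* Hypothetical stand-in for the progress slots of the current backend. */
static int64_t st_progress_param[PGSTAT_NUM_PROGRESS_PARAM];

/* Store val in the given slot, rejecting out-of-range indexes. */
static void
progress_update_param(int index, int64_t val)
{
    /* assert() plays the role of PostgreSQL's Assert() here. */
    assert(index >= 0 && index < PGSTAT_NUM_PROGRESS_PARAM);
    st_progress_param[index] = val;
}

/* Read a slot back, with the same bounds check. */
static int64_t
progress_get_param(int index)
{
    assert(index >= 0 && index < PGSTAT_NUM_PROGRESS_PARAM);
    return st_progress_param[index];
}
```

With the check in place, a caller passing an index of 10 or more fails loudly in assert-enabled builds instead of silently writing past the array.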

Regarding "XXX - privilege check is maybe dubious" - I think the
privilege check here should match pg_stat_activity.  If it does,
there's nothing dubious about that IMHO.

This patch has been worked on by so many people and reviewed by so
many people that I can't keep track of who should be credited when it
gets committed.  Could someone provide a list of author(s) and
reviewer(s)?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [PROPOSAL] VACUUM Progress Checker.

From

Amit Langote

Date:

08 March 2016, 16:16:51

On Wed, Mar 9, 2016 at 12:24 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> This patch has been worked on by so many people and reviewed by so
> many people that I can't keep track of who should be credited when it
> gets committed.  Could someone provide a list of author(s) and
> reviewer(s)?

Original authors are Rahila Syed and Vinayak Pokale.

I have been reviewing this for last few CFs. I sent in last few
revisions as well.

Thanks,
Amit

Re: [PROPOSAL] VACUUM Progress Checker.

From

Kyotaro HORIGUCHI

Date:

09 March 2016, 00:22:46

At Wed, 9 Mar 2016 01:16:26 +0900, Amit Langote <amitlangote09@gmail.com> wrote in
<CA+HiwqGP3MzhvhVQf5EEzMPUk13BjxFaDGd1AXxZdQDaan30Ow@mail.gmail.com>
> On Wed, Mar 9, 2016 at 12:24 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> > This patch has been worked on by so many people and reviewed by so
> > many people that I can't keep track of who should be credited when it
> > gets committed.  Could someone provide a list of author(s) and
> > reviewer(s)?
> 
> Original authors are Rahila Syed and Vinayak Pokale.
> 
> I have been reviewing this for last few CFs. I sent in last few
> revisions as well.

The owner of this is Vinayak and, ah, I forgot to add myself as a
reviewer. I have also reviewed this for last few CFs. 

So, looking at the CF app, the list seems consistent with the
people who appear in this thread over these three CFs.

Authors: Vinayak Pokale, Rahila Syed, Amit Langote
Reviewers: Amit Langote, Kyotaro Horiguchi

Is there anyone who should be added to this list?

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: [PROPOSAL] VACUUM Progress Checker.

From

Amit Langote

Date:

09 March 2016, 01:12:04

On 2016/03/09 0:24, Robert Haas wrote:
> On Tue, Mar 8, 2016 at 3:02 AM, Amit Langote
> <Langote_Amit_f8@lab.ntt.co.jp> wrote:
>> Updated versions attached.
>>
>> * changed st_progress_param to int64 and so did the argument of
>> pgstat_progress_update_param().  Likewise changed param1..param10 of
>> pg_stat_get_progress_info()'s output columns to bigint.
>>
>> * Added back the Oid field st_command_target and corresponding function
>> pgstat_progress_set_command_target(Oid).
>
> What the heck do we have an SQL-visible pg_stat_reset_local_progress()
> for?  Surely if we ever need that, it's a bug.

OK, now I am not sure what I was thinking adding that function. Removed.

> I think pgstat_progress_update_param() should Assert(index >= 0 &&
> index < N_PROGRESS_PARAM).  But I'd rename N_PROGRESS_PARAM to
> PGSTAT_NUM_PROGRESS_PARAM.

Agreed, done.

> Regarding "XXX - privilege check is maybe dubious" - I think the
> privilege check here should match pg_stat_activity.  If it does,
> there's nothing dubious about that IMHO.

OK, done.  So, it shows pid column to all, while rest of the values -
relid, param1..param10 are only shown to role members.  Unlike
pg_stat_activity, there is no text column to stash a "<insufficient
privilege>" message into, so all that's done is to output null values.

The attached revision addresses above and one of Horiguchi-san's comments
in his email yesterday.

Thanks a lot for the review.

Thanks,
Amit

Attachment

Re: [PROPOSAL] VACUUM Progress Checker.

From

Amit Langote

Date:

09 March 2016, 01:30:02

On 2016/03/09 9:22, Kyotaro HORIGUCHI wrote:
>> On Wed, Mar 9, 2016 at 12:24 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>>> This patch has been worked on by so many people and reviewed by so
>>> many people that I can't keep track of who should be credited when it
>>> gets committed.  Could someone provide a list of author(s) and
>>> reviewer(s)?
>>
>> Original authors are Rahila Syed and Vinayak Pokale.
>>
>> I have been reviewing this for last few CFs. I sent in last few
>> revisions as well.
> 
> The owner of this is Vinayak and, ah, I forgot to add myself as a
> reviewer. I have also reviewed this for last few CFs. 
> 
> So, looking at the CF app, the list seems consistent with the
> people who appear in this thread over these three CFs.
> 
> Authors: Vinayak Pokale, Rahila Syed, Amit Langote
> Reviewers: Amit Langote, Kyotaro Horiguchi
> 
> Is there anyone who should be added to this list?

Jim Nasby, Thom Brown, Masahiko Sawada, Fujii Masao, Masanori Oyama and of
course, Robert himself.

Thanks,
Amit

Re: [PROPOSAL] VACUUM Progress Checker.

From

Amit Langote

Date:

09 March 2016, 02:01:28

Horiguchi-san,

Thanks for the review!

On 2016/03/08 18:19, Kyotaro HORIGUCHI wrote:
>> Updated versions attached.
>>
>> * changed st_progress_param to int64 and so did the argument of
>> pgstat_progress_update_param().  Likewise changed param1..param10 of
>> pg_stat_get_progress_info()'s output columns to bigint.
>>
>> * Added back the Oid field st_command_target and corresponding function
>> pgstat_progress_set_command_target(Oid).
> 
> +    beentry->st_command = COMMAND_INVALID;
> +    MemSet(&beentry->st_progress_param, 0, sizeof(beentry->st_progress_param));
> 
> The MemSet seems useless since it gets the same initialization on
> setting st_command.

Right, every backend start should not have to pay that price.  Fixed in
the latest version.

> 
> +        /*
> +         * Report values for only those backends which are running the given
> +         * command.  XXX - privilege check is maybe dubious.
> +         */
> +        if (!beentry ||
> +            beentry->st_command != cmdtype ||
> +            !has_privs_of_role(GetUserId(), beentry->st_userid))
> +            continue;
> 
> We can simply ignore unprivileged backends, or return all zeros or
> nulls to signal that the caller has no privilege.

As suggested by Robert, used pg_stat_get_activity() style.  In this case,
show 'pid' to all but the rest only to role members.

> 0002
> 
> +   FROM pg_stat_get_progress_info(1) AS S;
> 
> Ah... This magic number seems quite bad. The function should
> take the command type, maybe as a string.
> 
> +   FROM pg_stat_get_progress_info('lazy vacuum') AS S;
> 
> Using an array of the names would be acceptable, maybe.
> 
> | char *progress_command_names[] = {"lazy vacuum", NULL};

Hm, I think the way it is *may* be OK, but...

As done in the patch, the way we identify commands is with the enum
BackendCommandType:

+typedef enum BackendCommandType
+{
+    COMMAND_INVALID = 0,
+    COMMAND_LAZY_VACUUM
+} BackendCommandType;

Perhaps we could create a struct:

typedef struct PgStatProgressCommand
{
    char                 *cmd_name;
    BackendCommandType    cmd_type;
} PgStatProgressCommand;

static const struct PgStatProgressCommand commands[] = {
    {"vacuum", COMMAND_LAZY_VACUUM},
    {NULL, COMMAND_INVALID}
};
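
A sketch of how such a table might be used to map a command name to its enum value. The lookup function progress_command_from_name is hypothetical and not part of the patch; the enum and struct are repeated so that the example stands on its own, with cmd_name declared const char * for the string literals.

```c
#include <stddef.h>
#include <string.h>

typedef enum BackendCommandType
{
    COMMAND_INVALID = 0,
    COMMAND_LAZY_VACUUM
} BackendCommandType;

typedef struct PgStatProgressCommand
{
    const char *cmd_name;
    BackendCommandType cmd_type;
} PgStatProgressCommand;

/* Table terminated by a NULL name, as in the struct proposal above. */
static const PgStatProgressCommand commands[] = {
    {"vacuum", COMMAND_LAZY_VACUUM},
    {NULL, COMMAND_INVALID}
};

/* Hypothetical lookup: returns COMMAND_INVALID for unknown names. */
static BackendCommandType
progress_command_from_name(const char *name)
{
    const PgStatProgressCommand *cmd;

    for (cmd = commands; cmd->cmd_name != NULL; cmd++)
    {
        if (strcmp(cmd->cmd_name, name) == 0)
            return cmd->cmd_type;
    }
    return COMMAND_INVALID;
}
```

A view definition could then pass a name instead of a bare number, at the cost of one string comparison per table entry on each call.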

> However, the numbers for the phases ('scanning heap' and so on) are
> acceptable to me, for reasons I am not sure of. They could also be
> represented as names, but that might be rather bothersome.

In initial versions of the patch, it used to be char * that was passed for
identifying phases.  But, then we got rid of char * progress parameters
altogether. So, there are no longer any text columns in
pg_stat_get_progress_info()'s result.  It may not work out well in long
run to forever not have those (your recent example comes to mind).

> 
> +              WHEN 0 THEN 100::numeric(5, 2)
> +              ELSE ((S.param3 + 1)::numeric / S.param2 * 100)::numeric(5, 2)
> 
> This usage of numeric seems overkill to me.

Hmm, how could this rather be written?

>> * I attempted to implement a method to report index blocks done from
>> lazy_tid_reaped() albeit with some limitations. Patch 0003 is that
>> attempt.  In summary, I modified the index bulk delete callback interface
>> to receive a BlockNumber argument index_blkno:

[ snip ]

>> way of traversing the index pages. I dared only touch the btree case to
>> make it pass current block number to the callback. It finishes with
>> index_blks_done << total_index_blks since I guess the callback is called
>> only on the leaf pages. Any ideas?
> 
> On the contrary, I suppose it counts one index page more than
> once in the case of uncorrelated heaps. index_blks_vacuumed can
> exceed RelationGetNumberOfBlocks() in extreme cases. If I'm not
> missing something, it stands on quite fragile ground.

Yeah, the method is not entirely foolproof yet.

>> * I am also tempted to add num_dead_tuples and dead_tuples_vacuumed to add
>> granularity to 'vacuuming heap' phase but didn't in this version. Should we?
> 
> How do you think they are to be used?

I just realized there are objections to some columns being counters for pages
and others counting tuples.  So, I guess I withdraw.  I am just worried
that 'vacuuming heap' phase may take arbitrarily long if dead tuples array
is big.  Since we were thinking of adding more granularity to 'vacuuming
indexes' phase, I thought we should do for the former too.

Thanks,
Amit

Re: [PROPOSAL] VACUUM Progress Checker.

From

Amit Langote

Date:

09 March 2016, 07:29:16

> On 2016/03/08 18:19, Kyotaro HORIGUCHI wrote:
>> +              WHEN 0 THEN 100::numeric(5, 2)
>> +              ELSE ((S.param3 + 1)::numeric / S.param2 * 100)::numeric(5, 2)
>>
>> This usage of numeric seems overkill to me.
> 
> Hmm, how could this rather be written?

OK, agreed about the overkill. Following might be better:

+ WHEN 0 THEN round(100.0, 2)
+ ELSE round((S.param3 + 1) * 100.0 / S.param2, 2)

Will update that patch.

Thanks,
Amit

Re: [PROPOSAL] VACUUM Progress Checker.

From

Amit Langote

Date:

09 March 2016, 07:37:52

On 2016/03/09 10:11, Amit Langote wrote:
> The attached revision addresses above and one of Horiguchi-san's comments
> in his email yesterday.

I fixed one more issue in 0002 per Horiguchi-san's comment.  Sorry about
so many versions.

Thanks,
Amit

Attachment

Re: [PROPOSAL] VACUUM Progress Checker.

From

Date:

09 March 2016, 07:55:55

Hi Amit,

> -----Original Message-----
> From: Amit Langote [mailto:Langote_Amit_f8@lab.ntt.co.jp]
> Sent: Wednesday, March 09, 2016 4:29 PM
> To: Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>
> Cc: robertmhaas@gmail.com; amitlangote09@gmail.com; SPS ポクレ ヴィ
> ナヤック(三技術) <pokurev@pm.nttdata.co.jp>; pgsql-
> hackers@postgresql.org; SPS 坂野 昌平(三技術) <bannos@nttdata.co.jp>
> Subject: Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.
> 
> > On 2016/03/08 18:19, Kyotaro HORIGUCHI wrote:
> >> +              WHEN 0 THEN 100::numeric(5, 2)
> > >> +              ELSE ((S.param3 + 1)::numeric / S.param2 * 100)::numeric(5, 2)
> >>
> >> This usage of numeric seems overkill to me.
> >
> > Hmm, how could this rather be written?
> 
> OK, agreed about the overkill. Following might be better:
> 
> + WHEN 0 THEN round(100.0, 2)
> + ELSE round((S.param3 + 1) * 100.0 / S.param2, 2)
+1

> Will update that patch.
> 
> Thanks,
> Amit
>

Re: [PROPOSAL] VACUUM Progress Checker.

From

Robert Haas

Date:

09 March 2016, 17:17:04

On Wed, Mar 9, 2016 at 2:37 AM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
> On 2016/03/09 10:11, Amit Langote wrote:
>> The attached revision addresses above and one of Horiguchi-san's comments
>> in his email yesterday.
>
> I fixed one more issue in 0002 per Horiguchi-san's comment.  Sorry about
> so many versions.

I've committed 0001 with heavy revisions.  Just because we don't need
an SQL-visible function to clear the command progress doesn't mean we
don't need to clear it at all; rather, it has to happen automatically.
I also did a bunch of identifier renaming, added datid to the view
output, adjusted the comments, and so on.  Please rebase the remainder
of the series.  Thanks.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [PROPOSAL] VACUUM Progress Checker.

From

Tomas Vondra

Date:

09 March 2016, 20:38:18

Hi,


On Wed, 2016-03-09 at 12:16 -0500, Robert Haas wrote:
> On Wed, Mar 9, 2016 at 2:37 AM, Amit Langote
> <Langote_Amit_f8@lab.ntt.co.jp> wrote:
> > On 2016/03/09 10:11, Amit Langote wrote:
> >> The attached revision addresses above and one of Horiguchi-san's comments
> >> in his email yesterday.
> >
> > I fixed one more issue in 0002 per Horiguchi-san's comment.  Sorry about
> > so many versions.
> 
> I've committed 0001 with heavy revisions.  Just because we don't need
> an SQL-visible function to clear the command progress doesn't mean we
> don't need to clear it at all; rather, it has to happen automatically.
> I also did a bunch of identifier renaming, added datid to the view
> output, adjusted the comments, and so on.  Please rebase the remainder
> of the series.  Thanks.

I'm pretty sure this piece of code ends up accessing subscripts above
array bounds (and gcc 4.6.4 complains about that):
   #define PG_STAT_GET_PROGRESS_COLS PGSTAT_NUM_PROGRESS_PARAM + 3
   ...
   bool    nulls[PG_STAT_GET_PROGRESS_COLS];
   ...
   nulls[2] = true;
   for (i = 1; i < PGSTAT_NUM_PROGRESS_PARAM + 1; i++)
       nulls[i+3] = true;

Now let's say PARAM=10, which means COLS=13. The last index accessed by
the loop will be i=10, which means we'll do this:
       nulls[13] = true;

which is above bounds.
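
One possible correction is sketched below. It assumes the first three columns are fixed and the parameters occupy indexes 3 through 12; this is not necessarily the fix that was applied. The helper name mask_unprivileged_columns is hypothetical, and the macro is parenthesized so that it expands safely inside larger expressions.

```c
#include <stdbool.h>

#define PGSTAT_NUM_PROGRESS_PARAM 10
/* Parenthesized so the macro is safe inside larger expressions. */
#define PG_STAT_GET_PROGRESS_COLS (PGSTAT_NUM_PROGRESS_PARAM + 3)

/*
 * Hypothetical helper: mark column 2 and every parameter column as NULL.
 * Returns the highest index written so the bound can be checked.
 */
static int
mask_unprivileged_columns(bool nulls[PG_STAT_GET_PROGRESS_COLS])
{
    int         i;
    int         last = 2;

    nulls[2] = true;
    /* i runs 0..PARAM-1, so i + 3 stays within 3..COLS-1. */
    for (i = 0; i < PGSTAT_NUM_PROGRESS_PARAM; i++)
    {
        last = i + 3;
        nulls[last] = true;
    }
    return last;
}
```

With PARAM=10 the last index written is 12, the final valid slot of a 13-element array. Starting the loop at 0 also covers nulls[3], which the loop starting at 1 never set.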


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: [PROPOSAL] VACUUM Progress Checker.

From

Date:

10 March 2016, 01:53:02

Hi,
Thank you very much for committing this feature.
> -----Original Message-----
> From: Robert Haas [mailto:robertmhaas@gmail.com]
> Sent: Thursday, March 10, 2016 2:17 AM
> To: Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>
> Cc: Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>; Amit Langote
> <amitlangote09@gmail.com>; SPS ポクレ ヴィナヤック(三技術)
> <pokurev@pm.nttdata.co.jp>; pgsql-hackers@postgresql.org; SPS 坂野 昌
> 平(三技術) <bannos@nttdata.co.jp>
> Subject: Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.
> 
> On Wed, Mar 9, 2016 at 2:37 AM, Amit Langote
> <Langote_Amit_f8@lab.ntt.co.jp> wrote:
> > On 2016/03/09 10:11, Amit Langote wrote:
> >> The attached revision addresses above and one of Horiguchi-san's
> >> comments in his email yesterday.
> >
> > I fixed one more issue in 0002 per Horiguchi-san's comment.  Sorry
> > about so many versions.
> 
> I've committed 0001 with heavy revisions.  Just because we don't need an
> SQL-visible function to clear the command progress doesn't mean we don't
> need to clear it at all; rather, it has to happen automatically.
> I also did a bunch of identifier renaming, added datid to the view output,
> adjusted the comments, and so on.  Please rebase the remainder of the
> series.  Thanks.
Some minor typos need fixing.

+/*-----------
+ * pgstat_progress_start_command() -
+ *
+ * Set st_command in own backend entry.  Also, zero-initialize
+ * st_progress_param array.
+ *-----------
+ */

In the description we need to use st_progress_command instead of st_command.

+/*-----------
+ * pgstat_progress_end_command() -
+ *
+ * Update index'th member in st_progress_param[] of own backend entry.
+ *-----------
+ */

Here the description also needs to change.

Regards,
Vinayak Pokale

Re: [PROPOSAL] VACUUM Progress Checker.

From

Amit Langote

Date:

10 March 2016, 05:30:20

On 2016/03/10 2:16, Robert Haas wrote:
> On Wed, Mar 9, 2016 at 2:37 AM, Amit Langote
> <Langote_Amit_f8@lab.ntt.co.jp> wrote:
>> On 2016/03/09 10:11, Amit Langote wrote:
>>> The attached revision addresses above and one of Horiguchi-san's comments
>>> in his email yesterday.
>>
>> I fixed one more issue in 0002 per Horiguchi-san's comment.  Sorry about
>> so many versions.
>
> I've committed 0001 with heavy revisions.  Just because we don't need
> an SQL-visible function to clear the command progress doesn't mean we
> don't need to clear it at all; rather, it has to happen automatically.
> I also did a bunch of identifier renaming, added datid to the view
> output, adjusted the comments, and so on.  Please rebase the remainder
> of the series.  Thanks.

Great, thanks a lot for the review and committing in much better shape!

I rebased remainder patches (attached).

0001 is a small patch to fix issues reported by Tomas and Vinayak.  0002
and 0003 are WIP patches to implement progress reporting for vacuum.

Thanks,
Amit

Attachment

Re: [PROPOSAL] VACUUM Progress Checker.

From

Amit Langote

Date:

10 March 2016, 06:37:00

On 2016/03/10 14:29, Amit Langote wrote:
> I rebased remainder patches (attached).
>
> 0001 is a small patch to fix issues reported by Tomas and Vinayak.  0002
> and 0003 are WIP patches to implement progress reporting for vacuum.

Oops, in 0002, I wrongly joined with pg_class in the definition of
pg_stat_progress_vacuum to output the schema-qualified name of the table
being vacuumed.  That means we need to connect to the correct database,
which is undesirable. Updated version fixes that (shows database name and
relid).  You may also have noticed that I said pg_stat_progress_vacuum,
not pg_stat_vacuum_progress (IMHO, the former is a better name).

Updated patches attached.

Thanks,
Amit

Attachment

Re: [PROPOSAL] VACUUM Progress Checker.

From

Vinayak Pokale

Date:

10 March 2016, 07:24:02

Hi Amit,

Thank you for updating the patch.

> -----Original Message-----
> From: Amit Langote [mailto:Langote_Amit_f8@lab.ntt.co.jp]
> Sent: Thursday, March 10, 2016 3:36 PM
> To: Robert Haas <robertmhaas@gmail.com>
> Cc: Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>; Amit Langote
> <amitlangote09@gmail.com>; SPS ポクレ ヴィナヤック(三技術)
> <pokurev@pm.nttdata.co.jp>; pgsql-hackers@postgresql.org; SPS 坂野 昌
> 平(三技術) <bannos@nttdata.co.jp>
> Subject: Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.
> 
> On 2016/03/10 14:29, Amit Langote wrote:
> > I rebased remainder patches (attached).
> >
> > 0001 is a small patch to fix issues reported by Tomas and Vinayak.
> > 0002 and 0003 are WIP patches to implement progress reporting for
> > vacuum.
> 
> Oops, in 0002, I wrongly joined with pg_class in the definition of
> pg_stat_progress_vacuum to output the schema-qualified name of the table
> being vacuumed.  That means we need to connect to the correct database,
> which is undesirable. Updated version fixes that (shows database name and
> relid).  You may also have noticed that I said pg_stat_progress_vacuum, not
> pg_stat_vacuum_progress (IMHO, the former is a better name).
> 
> Updated patches attached.
In 0002-
+CREATE VIEW pg_stat_progress_vacuum AS
+    SELECT
+            S.pid AS pid,
+            D.datname AS database,
+            S.relid AS relid,
.
.
.
.
+    FROM pg_database D, pg_stat_get_progress_info('VACUUM') AS S
+    WHERE S.datid = D.oid;
I think we need to use datid instead of datname.
Robert added datid in pg_stat_get_progress_info() and we are using that function here.
+values[1] = ObjectIdGetDatum(beentry->st_databaseid);

+DATA(insert OID = 3318 (  pg_stat_get_progress_info           PGNSP PGUID 12 1 100 0 0 f f f f f t s r 1 0 2249 "25" "{25,23,26,26,20,20,20,20,20,20,20,20,20,20}" "{i,o,o,o,o,o,o,o,o,o,o,o,o,o}" "{cmdtype,pid,datid,relid,param1,param2,param3,param4,param5,param6,param7,param8,param9,param10}" _null_ _null_ pg_stat_get_progress_info _null_ _null_ _null_ ));
 

So I think it's better to report datid, not datname.
The view definition would then simply be:
+CREATE VIEW pg_stat_progress_vacuum AS
+    SELECT
+            S.pid AS pid,
+            S.datid AS datid,
+            S.relid AS relid,
+            CASE S.param1
+                WHEN 1 THEN 'scanning heap'
+                WHEN 2 THEN 'vacuuming indexes'
+                WHEN 3 THEN 'vacuuming heap'
+                WHEN 4 THEN 'cleanup'
+                ELSE 'unknown phase'
+            END AS processing_phase,
+            S.param2 AS total_heap_blocks,
+            S.param3 AS current_heap_block,
+            S.param4 AS total_index_blocks,
+            S.param5 AS index_blocks_done,
+            S.param6 AS index_vacuum_count,
+            CASE S.param2
+                WHEN 0 THEN round(100.0, 2)
+                ELSE round((S.param3 + 1) * 100.0 / S.param2, 2)
+            END AS percent_done
+    FROM pg_stat_get_progress_info('VACUUM') AS S;
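
The arithmetic behind percent_done can be checked with a small standalone
sketch. It assumes that param3 (current_heap_block) is a zero-based block
number, which is what the "+ 1" in the view suggests; the function name is
hypothetical, and the two-decimal rounding done by round() in the view is
left out:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Mirrors the CASE expression in the view: an empty table (0 total blocks)
 * is reported as 100 percent done; otherwise the assumed zero-based current
 * block is converted to a count of blocks processed.
 */
double
percent_done(int64_t total_heap_blocks, int64_t current_heap_block)
{
    assert(total_heap_blocks >= 0);
    if (total_heap_blocks == 0)
        return 100.0;
    return (double) (current_heap_block + 1) * 100.0 /
        (double) total_heap_blocks;
}
```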

In the pg_stat_activity view, datid and datname are separate columns. So maybe we can add datname as a separate
column in pg_stat_progress_vacuum, but I don't think it's required; datid alone is sufficient.
 
Any comment?

Regards,
Vinayak Pokale

Re: [PROPOSAL] VACUUM Progress Checker.

From

Amit Langote

Date:

10 March 2016, 08:09:16

Hi Vinayak,

Thanks for the quick review!

On 2016/03/10 16:22, pokurev@pm.nttdata.co.jp wrote:
>> On 2016/03/10 14:29, Amit Langote wrote:
>> Updated patches attached.
> In 0002-

[ snip ]

> I think we need to use datid instead of datname.
> Robert added datid in pg_stat_get_progress_info() and we are using that function here.
> +values[1] = ObjectIdGetDatum(beentry->st_databaseid);

[ snip ]

> So I think it's better to report datid not datname.
> The definition of view is simply like:
> +CREATE VIEW pg_stat_progress_vacuum AS
> +    SELECT
> +            S.pid AS pid,
> +            S.datid AS datid,
> +            S.relid AS relid,
> +            CASE S.param1
> +                WHEN 1 THEN 'scanning heap'
> +                WHEN 2 THEN 'vacuuming indexes'
> +                WHEN 3 THEN 'vacuuming heap'
> +                WHEN 4 THEN 'cleanup'
> +                ELSE 'unknown phase'
> +            END AS processing_phase,
> +            S.param2 AS total_heap_blocks,
> +            S.param3 AS current_heap_block,
> +            S.param4 AS total_index_blocks,
> +            S.param5 AS index_blocks_done,
> +            S.param6 AS index_vacuum_count,
> +            CASE S.param2
> +                WHEN 0 THEN round(100.0, 2)
> +                ELSE round((S.param3 + 1) * 100.0 / S.param2, 2)
> +            END AS percent_done
> +    FROM pg_stat_get_progress_info('VACUUM') AS S;
>
> So maybe we can add datname as separate column in pg_stat_progress_vacuum, I think it's not required only datid is
> sufficient.
> Any comment?

Why do you think showing the name may be unacceptable?  Wouldn't that be a
little more user-friendly?  Though maybe, we can follow the
pg_stat_activity style and have both instead, as you suggest.  Attached
updated version does that.

Thanks,
Amit

Attachment

Re: [PROPOSAL] VACUUM Progress Checker.

From

Vinayak Pokale

Date:

10 March 2016, 08:22:58

Hi Amit,

Thank you for updating the patch.
> -----Original Message-----
> From: Amit Langote [mailto:Langote_Amit_f8@lab.ntt.co.jp]
> Sent: Thursday, March 10, 2016 5:09 PM
> To: SPS ポクレ ヴィナヤック(三技術) <pokurev@pm.nttdata.co.jp>;
> robertmhaas@gmail.com
> Cc: horiguchi.kyotaro@lab.ntt.co.jp; amitlangote09@gmail.com; pgsql-
> hackers@postgresql.org; SPS 坂野 昌平(三技術) <bannos@nttdata.co.jp>
> Subject: Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.
> 
> 
> Hi Vinayak,
> 
> Thanks for the quick review!
> 
> On 2016/03/10 16:22, pokurev@pm.nttdata.co.jp wrote:
> >> On 2016/03/10 14:29, Amit Langote wrote:
> >> Updated patches attached.
> > In 0002-
> 
> [ snip ]
> 
> > I think we need to use datid instead of datname.
> > Robert added datid in pg_stat_get_progress_info() and we are using that
> > function here.
> > +values[1] = ObjectIdGetDatum(beentry->st_databaseid);
> 
> [ snip ]
> 
> > So I think it's better to report datid not datname.
> > The definition of view is simply like:
> > +CREATE VIEW pg_stat_progress_vacuum AS
> > +    SELECT
> > +            S.pid AS pid,
> > +            S.datid AS datid,
> > +            S.relid AS relid,
> > +            CASE S.param1
> > +                WHEN 1 THEN 'scanning heap'
> > +                WHEN 2 THEN 'vacuuming indexes'
> > +                WHEN 3 THEN 'vacuuming heap'
> > +                WHEN 4 THEN 'cleanup'
> > +                ELSE 'unknown phase'
> > +            END AS processing_phase,
> > +            S.param2 AS total_heap_blocks,
> > +            S.param3 AS current_heap_block,
> > +            S.param4 AS total_index_blocks,
> > +            S.param5 AS index_blocks_done,
> > +            S.param6 AS index_vacuum_count,
> > +            CASE S.param2
> > +                WHEN 0 THEN round(100.0, 2)
> > +                ELSE round((S.param3 + 1) * 100.0 / S.param2, 2)
> > +            END AS percent_done
> > +    FROM pg_stat_get_progress_info('VACUUM') AS S;
> >
> > So maybe we can add datname as separate column in
> > pg_stat_progress_vacuum, I think it's not required only datid is sufficient.
> > Any comment?
> 
> Why do you think showing the name may be unacceptable?  Wouldn't that
> be a little more user-friendly?  Though maybe, we can follow the
> pg_stat_activity style and have both instead, as you suggest.  Attached
> updated version does that.
+1
I think reporting both (datid and datname) is more user-friendly.
Thank you.

Regards,
Vinayak Pokale

Re: [PROPOSAL] VACUUM Progress Checker.

From

Kyotaro HORIGUCHI

Date:

10 March 2016, 11:10:56

Hi,

At Thu, 10 Mar 2016 08:21:36 +0000, <pokurev@pm.nttdata.co.jp> wrote in
<8e09c2fe530d4008aa0019e38c1d5453@MP-MSGSS-MBX007.msg.nttdata.co.jp>
> > > So maybe we can add datname as separate column in
> > pg_stat_progress_vacuum, I think it's not required only datid is sufficient.
> > > Any comment?
> > 
> > Why do you think showing the name may be unacceptable?  Wouldn't that
> > be a little more user-friendly?  Though maybe, we can follow the
> > pg_stat_activity style and have both instead, as you suggest.  Attached
> > updated version does that.
> +1
> I think reporting both (datid and datname) is more user-friendly.
> Thank you.

I don't like showing both the oid and the name; being "user friendly"
alone doesn't seem to justify adding columns that are, in a sense,
redundant.

So I looked into system_views.sql and picked out which catalogs/views
show objects in this way, that is, by both object id and name.

Show by name: pg_policies, pg_rules, pg_tables, pg_matviews,
              pg_indexes, pg_stats, pg_prepared_xacts, pg_seclabels,
              pg_stat(io)_*_tables/indexes.schemaname
              pg_stat_*_functions.schemaname

Show by oid : pg_locks, pg_user_mappings.umid

Both        : pg_stat(io)_*_tables/indexes.relid/relname, indexrelid/indexname;
              pg_stat_activity.datid/datname, usesysid/usename
              pg_replication_slots.datoid/database
              pg_stat_database(_conflicts).datid/datname
              pg_stat_*_functions.funcid/funcname
              pg_user_mappings.srvid/srvname, umuser/usename

This result surprised me. This view is similar in nature to the
pg_stat* views, so it is proper to show *both the database and the
relation* by both oid and name.

Thoughts?

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: [PROPOSAL] VACUUM Progress Checker.

From

Robert Haas

Date:

10 March 2016, 13:48:53

On Thu, Mar 10, 2016 at 6:10 AM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> So, I have looked into system_views.sql and picked up what
> catalogs/views shows objects in such way, that is, showing both
> object id and its name.
>
> Show by name: pg_policies, pg_rules, pg_tables, pg_matviews,
>               pg_indexes, pg_stats, pg_prepared_xacts, pg_seclabels,
>               pg_stat(io)_*_tables/indexes.schemaname
>               pg_stat_*_functions.schemaname
>
> Show by oid : pg_locks, pg_user_mappings.umid
>
> Both        : pg_stat(io)_*_tables/indexes.relid/relname, indexrelid/indexname;
>               pg_stat_activity.datid/datname, usesysid/usename
>               pg_replication_slots.datoid/database
>               pg_stat_database(_conflicts).datid/datname
>               pg_stat_*_functions.funcid/funcname
>               pg_user_mappings.srvid/srvname,umuser/usename
>
> It's surprising to see this result for me. The nature of this
> view is near to pg_stat* views so it is proper to show *both of
> database and relation* in both of oid and name.
>
> Thoughts?

I think the problem is that you can't show the name of a non-global
SQL object (such as a relation) unless the object is in the current
database.  Many of the views in the first group are database-local
views, while things like pg_locks span all databases.  We can show the
datid/relid always, but if we have a relname column it will have to be
NULL unless the datid is our database.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [PROPOSAL] VACUUM Progress Checker.

From

Robert Haas

Date:

10 March 2016, 14:29:36

On Thu, Mar 10, 2016 at 3:08 AM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
> Hi Vinayak,
>
> Thanks for the quick review!

Committed 0001 earlier this morning.

On 0002:

+       /* total_index_blks */
+       current_index_blks = (BlockNumber *) palloc(nindexes *
sizeof(BlockNumber));
+       total_index_blks = 0;
+       for (i = 0; i < nindexes; i++)
+       {
+               BlockNumber             nblocks =
RelationGetNumberOfBlocks(Irel[i]);
+
+               current_index_blks[i] = nblocks;
+               total_index_blks += nblocks;
+       }
+       pgstat_progress_update_param(PROG_PARAM_VAC_IDX_BLKS, total_index_blks);

I think this is a bad idea.  The value calculated here isn't
necessarily accurate, because the number of index blocks can change
between the time this is calculated and the time the indexes are
actually vacuumed.  If a client just wants the length of the indexes
in round figures, that's already SQL-visible, and there's little
reason to make VACUUM do it all the time whether anyone is looking at
the progress information or not.  Note that I'm not complaining about
the fact that you exposed the heap block count, because in that case
you are exposing the actual value that VACUUM is using to guide its
work.  The client can get the *current* length of the relation, but
the value you are exposing gives you the number of blocks *this
particular VACUUM intends to scan*.  That has some incremental value -
but the index information doesn't have the same thing going for it.

On 0003:

I think you should make this work for all AMs, not just btree, and
then consolidate it with 0002.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [PROPOSAL] VACUUM Progress Checker.

From

Amit Langote

Date:

11 March 2016, 02:05:17

On 2016/03/10 23:29, Robert Haas wrote:
> On Thu, Mar 10, 2016 at 3:08 AM, Amit Langote
> <Langote_Amit_f8@lab.ntt.co.jp> wrote:
>> Hi Vinayak,
>>
>> Thanks for the quick review!
> 
> Committed 0001 earlier this morning.

Thanks!

> On 0002:
> 
> +       /* total_index_blks */
> +       current_index_blks = (BlockNumber *) palloc(nindexes *
> sizeof(BlockNumber));
> +       total_index_blks = 0;
> +       for (i = 0; i < nindexes; i++)
> +       {
> +               BlockNumber             nblocks =
> RelationGetNumberOfBlocks(Irel[i]);
> +
> +               current_index_blks[i] = nblocks;
> +               total_index_blks += nblocks;
> +       }
> +       pgstat_progress_update_param(PROG_PARAM_VAC_IDX_BLKS, total_index_blks);
> 
> I think this is a bad idea.  The value calculated here isn't
> necessarily accurate, because the number of index blocks can change
> between the time this is calculated and the time the indexes are
> actually vacuumed.  If a client just wants the length of the indexes
> in round figures, that's already SQL-visible, and there's little
> reason to make VACUUM do it all the time whether anyone is looking at
> the progress information or not.  Note that I'm not complaining about
> the fact that you exposed the heap block count, because in that case
> you are exposing the actual value that VACUUM is using to guide its
> work.  The client can get the *current* length of the relation, but
> the value you are exposing gives you the number of blocks *this
> particular VACUUM intends to scan*.  That has some incremental value -
> but the index information doesn't have the same thing going for it.

So, from what I understand here, we should not put the total count of index
pages into st_progress_param; rather, the client (reading
pg_stat_progress_vacuum) should derive it using pg_indexes_size() (?), as and
when necessary.  However, only the server can tell the current position
within an index vacuuming round (that is, how many pages into the round it
is), so that should be reported using some not-yet-existent mechanism.

> On 0003:
> 
> I think you should make this work for all AMs, not just btree, and
> then consolidate it with 0002.

OK.

Thanks,
Amit

Re: [PROPOSAL] VACUUM Progress Checker.

From

Robert Haas

Date:

11 March 2016, 04:16:30

On Thu, Mar 10, 2016 at 9:04 PM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
> So, from what I understand here, we should not put total count of index
> pages into st_progress_param; rather, have the client (reading
> pg_stat_progress_vacuum) derive it using pg_indexes_size() (?), as and
> when necessary.  However, only server is able to tell the current position
> within an index vacuuming round (or how many pages into a given index
> vacuuming round), so report that using some not-yet-existent mechanism.

Isn't that mechanism what you are trying to create in 0003?  But
otherwise, yes, you've accurately summarized what I think we should do.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [PROPOSAL] VACUUM Progress Checker.

From

Amit Langote

Date:

11 March 2016, 05:32:19

On 2016/03/11 13:16, Robert Haas wrote:
> On Thu, Mar 10, 2016 at 9:04 PM, Amit Langote
> <Langote_Amit_f8@lab.ntt.co.jp> wrote:
>> So, from what I understand here, we should not put total count of index
>> pages into st_progress_param; rather, have the client (reading
>> pg_stat_progress_vacuum) derive it using pg_indexes_size() (?), as and
>> when necessary.  However, only server is able to tell the current position
>> within an index vacuuming round (or how many pages into a given index
>> vacuuming round), so report that using some not-yet-existent mechanism.
> 
> Isn't that mechanism what you are trying to create in 0003?

Right, 0003 should hopefully become that mechanism.

Thanks,
Amit

Re: [PROPOSAL] VACUUM Progress Checker.

From

Amit Langote

Date:

12 March 2016, 12:49:48

On Fri, Mar 11, 2016 at 2:31 PM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
> On 2016/03/11 13:16, Robert Haas wrote:
>> On Thu, Mar 10, 2016 at 9:04 PM, Amit Langote
>> <Langote_Amit_f8@lab.ntt.co.jp> wrote:
>>> So, from what I understand here, we should not put total count of index
>>> pages into st_progress_param; rather, have the client (reading
>>> pg_stat_progress_vacuum) derive it using pg_indexes_size() (?), as and
>>> when necessary.  However, only server is able to tell the current position
>>> within an index vacuuming round (or how many pages into a given index
>>> vacuuming round), so report that using some not-yet-existent mechanism.
>>
>> Isn't that mechanism what you are trying to create in 0003?
>
> Right, 0003 should hopefully become that mechanism.

About 0003:

Earlier, it was trying to report the vacuumed index block count using the
lazy_tid_reaped() callback, for which I had added an index_blkno
argument to IndexBulkDeleteCallback. It turns out that's not such a good
place to do what we are trying to do.  This callback is called for
every heap pointer in an index. Not all index pages contain heap
pointers, which means the existing callback cannot count
all the index blocks that an AM would read to finish a given index vacuum
run.

Instead, the attached patch adds an IndexBulkDeleteProgressCallback
which AMs should call for every block that's read (say, right before a
call to ReadBufferExtended) as part of a given vacuum run. The
callback, with the help of some bookkeeping state, can count each block and
report it to the pgstat_progress API. Now, I am not sure if all AMs read 1..N
blocks for every vacuum or if it's possible that some blocks are read
more than once in a single vacuum, etc.  IOW, some AMs' processing may
be non-linear, and counting blocks 1..N (where N is the reported total
index blocks) may not be possible.  However, this is the best I could
think of for doing what we are trying to do here. Maybe index AM
experts can chime in on that.
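
To make the shape of that proposal concrete, here is a minimal standalone
sketch. Every name in it (IndexProgressCallback, ProgressState,
count_index_block, fake_am_bulkdelete) is a hypothetical stand-in and is not
taken from the attached patch; a real callback would report through the
pgstat_progress API rather than bump a plain counter:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t BlockNumber;   /* stand-in for the backend typedef */

/* Hypothetical callback type: invoked by the AM once per block it reads. */
typedef void (*IndexProgressCallback) (BlockNumber blkno, void *state);

/* Bookkeeping kept by VACUUM for the current index vacuuming round. */
typedef struct ProgressState
{
    uint64_t    index_blks_read;    /* summed across all indexes */
} ProgressState;

void
count_index_block(BlockNumber blkno, void *state)
{
    ProgressState *ps = (ProgressState *) state;

    (void) blkno;               /* position is not needed for a plain count */
    assert(ps != NULL);
    ps->index_blks_read++;
    /* a real implementation would publish the counter to shared memory here */
}

/* Toy AM that reads blocks 0..nblocks-1 linearly, calling back before each. */
void
fake_am_bulkdelete(BlockNumber nblocks, IndexProgressCallback cb, void *state)
{
    BlockNumber blkno;

    for (blkno = 0; blkno < nblocks; blkno++)
        cb(blkno, state);       /* i.e. right before the block would be read */
}
```

An AM that revisits blocks would simply invoke the callback more than
nblocks times, which is the counting problem described above.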

Thoughts?

Thanks,
Amit

Attachment

0001-WIP-Implement-progress-reporting-for-VACUUM-command-v11.patch

Re: [PROPOSAL] VACUUM Progress Checker.

From

Jim Nasby

Date:

12 March 2016, 16:56:25

On 3/10/16 7:48 AM, Robert Haas wrote:
> I think the problem is that you can't show the name of a non-global
> SQL object (such as a relation) unless the object is in the current
> database.  Many of the views in the first group are database-local
> views, while things like pg_locks span all databases.  We can show the
> datid/relid always, but if we have a relname column it will have to be
> NULL unless the datid is our database.

I would prefer that if the object is in another database we at least 
display the OID. That way, if you're logging this info you can go back 
later and figure out what was going on.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com

Re: [PROPOSAL] VACUUM Progress Checker.

From

Rahila Syed

Date:

14 March 2016, 09:55:22

Hello,

While I am still looking at this WIP patch, I had one suggestion.

Instead of making changes in the index AM API can we have a call to update
the shared state using pgstat_progress* API directly from specific index
level code?

Like  pgstat_count_index_scan(rel) call from _bt_first does. Though this
function basically updates local structures and sends the count to stat
collector via messages we can have a function which will instead modify the
shared state using the progress API committed recently.

Thank you,
Rahila Syed

Re: [PROPOSAL] VACUUM Progress Checker.

From

Amit Langote

Date:

14 March 2016, 11:05:12

Hi,

Thanks for taking a look at the patch.

On Mon, Mar 14, 2016 at 6:55 PM, Rahila Syed <rahilasyed90@gmail.com> wrote:
> Hello,
>
> While I am still looking at this WIP patch, I had one suggestion.
>
> Instead of making changes in the index AM API can we have a call to update
> the shared state using pgstat_progress* API
>
> directly from specific index level code?
>
> Like  pgstat_count_index_scan(rel) call from _bt_first does. Though this
> function basically updates local structures and sends the count to stat
> collector via messages we can have a function which will instead modify the
> shared state using the progress API committed recently.

I chose the callback approach because we need to count the index
blocks within the context of a given vacuum run.  For example, as
proposed, progress_callback_state (in this case, a pointer to the
LVRelStats struct for a given vacuum run) keeps the block count for a
given index vacuum run.  It is reset when the next index vacuuming round
starts.  Also, remember that the count is across all indexes.

If we call pgstat_progress API directly from within AM, what I just
described above seems difficult to achieve modularly. But maybe, I'm
missing something.

Aside from whether we should use one of the above two methods, I think
we also have to figure out, for each AM, how to count correctly
considering non-linearity (tree traversal, recursion and such) of most
AMs' vacuum scans.

Thanks,
Amit

Re: [PROPOSAL] VACUUM Progress Checker.

From

Robert Haas

Date:

14 March 2016, 18:41:33

On Sat, Mar 12, 2016 at 7:49 AM, Amit Langote <amitlangote09@gmail.com> wrote:
> On Fri, Mar 11, 2016 at 2:31 PM, Amit Langote
> <Langote_Amit_f8@lab.ntt.co.jp> wrote:
>> On 2016/03/11 13:16, Robert Haas wrote:
>>> On Thu, Mar 10, 2016 at 9:04 PM, Amit Langote
>>> <Langote_Amit_f8@lab.ntt.co.jp> wrote:
>>>> So, from what I understand here, we should not put total count of index
>>>> pages into st_progress_param; rather, have the client (reading
>>>> pg_stat_progress_vacuum) derive it using pg_indexes_size() (?), as and
>>>> when necessary.  However, only server is able to tell the current position
>>>> within an index vacuuming round (or how many pages into a given index
>>>> vacuuming round), so report that using some not-yet-existent mechanism.
>>>
>>> Isn't that mechanism what you are trying to create in 0003?
>>
>> Right, 0003 should hopefully become that mechanism.
>
> About 0003:
>
> Earlier, it was trying to report vacuumed index block count using
> lazy_tid_reaped() callback for which I had added a index_blkno
> argument to IndexBulkDeleteCallback. Turns out it's not such a good
> place to do what we are trying to do.  This callback is called for
> every heap pointer in an index. Not all index pages contain heap
> pointers, which means the existing callback does not allow to count
> all the index blocks that AM would read to finish a given index vacuum
> run.
>
> Instead, the attached patch adds a IndexBulkDeleteProgressCallback
> which AMs should call for every block that's read (say, right before a
> call to ReadBufferExtended) as part of a given vacuum run. The
> callback with help of some bookkeeping state can count each block and
> report to pgstat_progress API. Now, I am not sure if all AMs read 1..N
> blocks for every vacuum or if it's possible that some blocks are read
> more than once in single vacuum, etc.  IOW, some AM's processing may
> be non-linear and counting blocks 1..N (where N is reported total
> index blocks) may not be possible.  However, this is the best I could
> think of as doing what we are trying to do here. Maybe index AM
> experts can chime in on that.
>
> Thoughts?

Well, I think you need to study the index AMs and figure this out.

But I think for starters you should write a patch that reports the following:

1. phase
2. number of heap blocks scanned
3. number of heap blocks vacuumed
4. number of completed index vac cycles
5. number of dead tuples collected since the last index vac cycle
6. number of dead tuples that we can store before needing to perform
an index vac cycle

All of that should be pretty straightforward, and then we'd have
something we can ship.  We can add the detailed index reporting later,
when we get to it, perhaps for 9.7.
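
The six items above map naturally onto slots of the st_progress_param
array. A minimal sketch follows; the enum names, the array size of 10, and
both helpers are assumptions for illustration, not the identifiers used in
any posted patch:

```c
#include <assert.h>
#include <stdint.h>

#define NUM_PROGRESS_PARAM 10   /* assumed size of st_progress_param[] */

/* Hypothetical slot assignment for the six proposed counters. */
typedef enum VacProgressParam
{
    VACPROG_PHASE = 0,
    VACPROG_HEAP_BLKS_SCANNED,
    VACPROG_HEAP_BLKS_VACUUMED,
    VACPROG_INDEX_VAC_CYCLES,
    VACPROG_DEAD_TUPLES,        /* collected since the last index vac cycle */
    VACPROG_MAX_DEAD_TUPLES     /* capacity before a cycle is forced */
} VacProgressParam;

/* Stand-in for one backend's progress slots. */
typedef struct ProgressEntry
{
    int64_t     param[NUM_PROGRESS_PARAM];
} ProgressEntry;

void
progress_update_param(ProgressEntry *entry, VacProgressParam index, int64_t val)
{
    assert((int) index >= 0 && (int) index < NUM_PROGRESS_PARAM);
    entry->param[index] = val;
}

/* An index vac cycle finished: bump the cycle count, reset the dead tuples. */
void
progress_end_index_cycle(ProgressEntry *entry)
{
    entry->param[VACPROG_INDEX_VAC_CYCLES]++;
    entry->param[VACPROG_DEAD_TUPLES] = 0;
}
```

A client could then compare the dead tuple count against the maximum to
estimate how soon the next index vac cycle will start.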

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [PROPOSAL] VACUUM Progress Checker.

From

Amit Langote

Date:

15 March 2016, 05:17:04

On 2016/03/15 3:41, Robert Haas wrote:
> On Sat, Mar 12, 2016 at 7:49 AM, Amit Langote <amitlangote09@gmail.com> wrote:
>> Instead, the attached patch adds a IndexBulkDeleteProgressCallback
>> which AMs should call for every block that's read (say, right before a
>> call to ReadBufferExtended) as part of a given vacuum run. The
>> callback with help of some bookkeeping state can count each block and
>> report to pgstat_progress API. Now, I am not sure if all AMs read 1..N
>> blocks for every vacuum or if it's possible that some blocks are read
>> more than once in single vacuum, etc.  IOW, some AM's processing may
>> be non-linear and counting blocks 1..N (where N is reported total
>> index blocks) may not be possible.  However, this is the best I could
>> think of as doing what we are trying to do here. Maybe index AM
>> experts can chime in on that.
>>
>> Thoughts?
>
> Well, I think you need to study the index AMs and figure this out.

OK.  I tried to put calls to the callback in appropriate places, but
couldn't get the resulting progress numbers to look sane.  So I ended up
concluding that any attempt to do so is futile unless I analyze each AM's
vacuum code carefully to be able to determine in advance the max bound on
the count of blocks that the callback will report.  Anyway, as you
suggest, we can improve it later.

> But I think for starters you should write a patch that reports the following:
>
> 1. phase
> 2. number of heap blocks scanned
> 3. number of heap blocks vacuumed
> 4. number of completed index vac cycles
> 5. number of dead tuples collected since the last index vac cycle
> 6. number of dead tuples that we can store before needing to perform
> an index vac cycle
>
> All of that should be pretty straightforward, and then we'd have
> something we can ship.  We can add the detailed index reporting later,
> when we get to it, perhaps for 9.7.

OK, I agree with this plan.  Attached updated patch implements this.

Thanks,
Amit

Attachment

0001-WIP-Implement-progress-reporting-for-VACUUM-command-v12.patch

Re: [PROPOSAL] VACUUM Progress Checker.

From

Robert Haas

Date:

15 March 2016, 17:33:40

On Tue, Mar 15, 2016 at 1:16 AM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
> On 2016/03/15 3:41, Robert Haas wrote:
>> On Sat, Mar 12, 2016 at 7:49 AM, Amit Langote <amitlangote09@gmail.com> wrote:
>>> Instead, the attached patch adds a IndexBulkDeleteProgressCallback
>>> which AMs should call for every block that's read (say, right before a
>>> call to ReadBufferExtended) as part of a given vacuum run. The
>>> callback with help of some bookkeeping state can count each block and
>>> report to pgstat_progress API. Now, I am not sure if all AMs read 1..N
>>> blocks for every vacuum or if it's possible that some blocks are read
>>> more than once in single vacuum, etc.  IOW, some AM's processing may
>>> be non-linear and counting blocks 1..N (where N is reported total
>>> index blocks) may not be possible.  However, this is the best I could
>>> think of as doing what we are trying to do here. Maybe index AM
>>> experts can chime in on that.
>>>
>>> Thoughts?
>>
>> Well, I think you need to study the index AMs and figure this out.
>
> OK.  I tried to put calls to the callback in appropriate places, but
> couldn't get the resulting progress numbers to look sane.  So I ended up
> concluding that any attempt to do so is futile unless I analyze each AM's
> vacuum code carefully to be able to determine in advance the max bound on
> the count of blocks that the callback will report.  Anyway, as you
> suggest, we can improve it later.

I don't think there is any way to bound that, because new blocks can
get added to the index concurrently, and we might end up needing to
scan them.  Reporting the number of blocks scanned can still be
useful, though - any chance you can just implement that part of it?

>> But I think for starters you should write a patch that reports the following:
>>
>> 1. phase
>> 2. number of heap blocks scanned
>> 3. number of heap blocks vacuumed
>> 4. number of completed index vac cycles
>> 5. number of dead tuples collected since the last index vac cycle
>> 6. number of dead tuples that we can store before needing to perform
>> an index vac cycle
>>
>> All of that should be pretty straightforward, and then we'd have
>> something we can ship.  We can add the detailed index reporting later,
>> when we get to it, perhaps for 9.7.
>
> OK, I agree with this plan.  Attached updated patch implements this.

Sorta.  Committed after renaming what you called heap blocks vacuumed
back to heap blocks scanned, adding heap blocks vacuumed, removing the
overall progress meter which I don't believe will be anything close to
accurate, fixing some stylistic stuff, arranging to update multiple
counters automatically where it could otherwise produce confusion,
moving the new view near similar ones in the file, reformatting it to
follow the style of the rest of the file, exposing the counter
#defines via a header file in case extensions want to use them, and
overhauling and substantially expanding the documentation.
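
To illustrate why several counters may need to change as one unit, here is a minimal standalone sketch. It is not the committed code: ProgressSlot, progress_update_multi and progress_read are hypothetical names, and the change counter is an assumed single-writer scheme that lets a reader notice it caught a half-finished update.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define NUM_PARAMS 10           /* hypothetical number of progress counters */

/* Hypothetical per-backend progress slot (a stand-in, not PostgreSQL's). */
typedef struct ProgressSlot
{
    uint32_t    changecount;    /* odd while an update is in flight */
    int64_t     param[NUM_PARAMS];
} ProgressSlot;

/* Update several counters as one unit, bracketed by changecount bumps. */
void
progress_update_multi(ProgressSlot *slot, int nparam,
                      const int *index, const int64_t *val)
{
    slot->changecount++;        /* now odd: update in progress */
    for (int i = 0; i < nparam; i++)
    {
        assert(index[i] >= 0 && index[i] < NUM_PARAMS);
        slot->param[index[i]] = val[i];
    }
    slot->changecount++;        /* even again: update complete */
}

/*
 * Copy the counters out.  Returns false if the snapshot may be torn,
 * in which case the caller should retry.
 */
bool
progress_read(const ProgressSlot *slot, int64_t *out)
{
    uint32_t    before = slot->changecount;

    memcpy(out, slot->param, sizeof(slot->param));
    return (before % 2 == 0) && (before == slot->changecount);
}
```

A reader that gets false back simply retries. Real shared-memory code would also need memory barriers around the counter bumps, which this sketch leaves out.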

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [PROPOSAL] VACUUM Progress Checker.

From

Amit Langote

Date:

16 March 2016, 00:56:59

On 2016/03/16 2:33, Robert Haas wrote:
> On Tue, Mar 15, 2016 at 1:16 AM, Amit Langote
> <Langote_Amit_f8@lab.ntt.co.jp> wrote:
>> On 2016/03/15 3:41, Robert Haas wrote:
>>> Well, I think you need to study the index AMs and figure this out.
>>
>> OK.  I tried to put calls to the callback in appropriate places, but
>> couldn't get the resulting progress numbers to look sane.  So I ended up
>> concluding that any attempt to do so is futile unless I analyze each AM's
>> vacuum code carefully to be able to determine in advance the max bound on
>> the count of blocks that the callback will report.  Anyway, as you
>> suggest, we can improve it later.
> 
> I don't think there is any way to bound that, because new blocks can
> get added to the index concurrently, and we might end up needing to
> scan them.  Reporting the number of blocks scanned can still be
> useful, though - any chance you can just implement that part of it?

Do you mean the changes I made to the index bulk delete API for this purpose
in the last few versions of this patch?  With them, I have observed that the
reported scanned-blocks count (which is incremented for every
ReadBufferExtended() call across an index vacuum cycle using the new
callback) can be non-deterministic.  Would there be an accompanying
index_blocks_total (pg_indexes_size()-based) in the view, as well?
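
A rough standalone sketch of the counting scheme, to show how the count can overshoot the total. All names here (BlockReadCallback, IndexScanProgress, toy_bulk_delete) are made up for illustration and are not the patch's API; the "AM" is just a loop over a visit order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback type: invoked once per block read. */
typedef void (*BlockReadCallback) (void *state);

/* Hypothetical bookkeeping state owned by the caller. */
typedef struct IndexScanProgress
{
    int64_t     blocks_total;   /* index size in blocks when the cycle began */
    int64_t     blocks_read;    /* reads reported through the callback */
} IndexScanProgress;

/* Callback body: count one more block read. */
void
count_block_read(void *state)
{
    IndexScanProgress *p = (IndexScanProgress *) state;

    p->blocks_read++;
}

/*
 * Toy stand-in for an AM's bulk-delete pass: walks the given visit order
 * and calls the callback before each simulated read.  The visit order may
 * repeat a block or include blocks added after the pass began.
 */
void
toy_bulk_delete(const uint32_t *visit_order, size_t nvisits,
                BlockReadCallback callback, void *callback_state)
{
    for (size_t i = 0; i < nvisits; i++)
    {
        (void) visit_order[i];  /* the read itself is elided */
        callback(callback_state);
    }
}
```

With blocks_total = 4 and a visit order such as {0, 1, 2, 1, 3, 4} (one revisit, one newly added block), blocks_read ends at 6, so a ratio against the total is not bounded by 100%.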

>>> But I think for starters you should write a patch that reports the following:
>>>
>>> 1. phase
>>> 2. number of heap blocks scanned
>>> 3. number of heap blocks vacuumed
>>> 4. number of completed index vac cycles
>>> 5. number of dead tuples collected since the last index vac cycle
>>> 6. number of dead tuples that we can store before needing to perform
>>> an index vac cycle
>>>
>>> All of that should be pretty straightforward, and then we'd have
>>> something we can ship.  We can add the detailed index reporting later,
>>> when we get to it, perhaps for 9.7.
>>
>> OK, I agree with this plan.  Attached updated patch implements this.
> 
> Sorta.  Committed after renaming what you called heap blocks vacuumed
> back to heap blocks scanned, adding heap blocks vacuumed, removing the
> overall progress meter which I don't believe will be anything close to
> accurate, fixing some stylistic stuff, arranging to update multiple
> counters automatically where it could otherwise produce confusion,
> moving the new view near similar ones in the file, reformatting it to
> follow the style of the rest of the file, exposing the counter
> #defines via a header file in case extensions want to use them, and
> overhauling and substantially expanding the documentation.

Thanks a lot for committing this.  The committed version is much better.
Some of the things therein should really have been part of the final
*submitted* patch; I will try to improve in that area in my submissions
henceforth.

Thanks,
Amit

Re: [PROPOSAL] VACUUM Progress Checker.

From

Vinayak Pokale

Date:

16 March 2016, 06:10:06

Hi,
> -----Original Message-----
> From: Robert Haas [mailto:robertmhaas@gmail.com]
> Sent: Wednesday, March 16, 2016 2:34 AM
> To: Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>
> Cc: Amit Langote <amitlangote09@gmail.com>; SPS ポクレ ヴィナヤック(
> 三技術) <pokurev@pm.nttdata.co.jp>; Kyotaro HORIGUCHI
> <horiguchi.kyotaro@lab.ntt.co.jp>; pgsql-hackers@postgresql.org; SPS 坂野
> 昌平(三技術) <bannos@nttdata.co.jp>
> Subject: Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.
> 
> On Tue, Mar 15, 2016 at 1:16 AM, Amit Langote
> <Langote_Amit_f8@lab.ntt.co.jp> wrote:
> > On 2016/03/15 3:41, Robert Haas wrote:
> >> On Sat, Mar 12, 2016 at 7:49 AM, Amit Langote
> <amitlangote09@gmail.com> wrote:
> >>> Instead, the attached patch adds a IndexBulkDeleteProgressCallback
> >>> which AMs should call for every block that's read (say, right before
> >>> a call to ReadBufferExtended) as part of a given vacuum run. The
> >>> callback with help of some bookkeeping state can count each block
> >>> and report to pgstat_progress API. Now, I am not sure if all AMs
> >>> read 1..N blocks for every vacuum or if it's possible that some
> >>> blocks are read more than once in single vacuum, etc.  IOW, some
> >>> AM's processing may be non-linear and counting blocks 1..N (where N
> >>> is reported total index blocks) may not be possible.  However, this
> >>> is the best I could think of as doing what we are trying to do here.
> >>> Maybe index AM experts can chime in on that.
> >>>
> >>> Thoughts?
> >>
> >> Well, I think you need to study the index AMs and figure this out.
> >
> > OK.  I tried to put calls to the callback in appropriate places, but
> > couldn't get the resulting progress numbers to look sane.  So I ended
> > up concluding that any attempt to do so is futile unless I analyze
> > each AM's vacuum code carefully to be able to determine in advance the
> > max bound on the count of blocks that the callback will report.
> > Anyway, as you suggest, we can improve it later.
> 
> I don't think there is any way to bound that, because new blocks can get
> added to the index concurrently, and we might end up needing to scan
> them.  Reporting the number of blocks scanned can still be useful, though -
> any chance you can just implement that part of it?
> 
> >> But I think for starters you should write a patch that reports the following:
> >>
> >> 1. phase
> >> 2. number of heap blocks scanned
> >> 3. number of heap blocks vacuumed
> >> 4. number of completed index vac cycles
> >> 5. number of dead tuples collected since the last index vac cycle
> >> 6. number of dead tuples that we can store before needing to perform
> >> an index vac cycle
> >>
> >> All of that should be pretty straightforward, and then we'd have
> >> something we can ship.  We can add the detailed index reporting
> >> later, when we get to it, perhaps for 9.7.
> >
> > OK, I agree with this plan.  Attached updated patch implements this.
> 
> Sorta.  Committed after renaming what you called heap blocks vacuumed
> back to heap blocks scanned, adding heap blocks vacuumed, removing the
> overall progress meter which I don't believe will be anything close to
> accurate, fixing some stylistic stuff, arranging to update multiple counters
> automatically where it could otherwise produce confusion, moving the new
> view near similar ones in the file, reformatting it to follow the style of the
> rest of the file, exposing the counter #defines via a header file in case
> extensions want to use them, and overhauling and substantially expanding
> the documentation.

Thank you for committing this feature.
There is one minor bug: a function name is misspelled.
s/pgstat_progress_update_params/pgstat_progress_update_multi_param/g
The attached patch fixes it.

Regards,
Vinayak Pokale

Attachment

pgstat_progress_function-typo.patch

Re: [PROPOSAL] VACUUM Progress Checker.

From

Rahila Syed

Date:

16 March 2016, 10:44:17

>Sorta. Committed after renaming what you called heap blocks vacuumed
>back to heap blocks scanned, adding heap blocks vacuumed, removing the
>overall progress meter which I don't believe will be anything close to
>accurate, fixing some stylistic stuff, arranging to update multiple
>counters automatically where it could otherwise produce confusion,
>moving the new view near similar ones in the file, reformatting it to
>follow the style of the rest of the file, exposing the counter
>#defines via a header file in case extensions want to use them, and
>overhauling and substantially expanding the documentation

We have the following lines,

        /* report that everything is scanned and vacuumed */
        pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, blkno);
        pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_VACUUMED, blkno);

which appear before the final vacuum cycle happens for any remaining dead tuples, which may span a few pages, if I am not mistaken.

IMO, reporting the final count of heap_blks_scanned is correct here, but reporting the final heap_blks_vacuumed can happen after the final VACUUM cycle for more accuracy.
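
To make the ordering concrete, here is a toy standalone model of the scan loop. It is only a sketch under my own assumptions: ToyProgress, report, toy_vacuum_cycle and toy_lazy_scan are hypothetical names, and the dead-tuple bookkeeping is reduced to a counter.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical counter slots, loosely following the list upthread. */
enum
{
    HEAP_BLKS_SCANNED = 0,
    HEAP_BLKS_VACUUMED,
    INDEX_VACUUM_COUNT,
    NUM_DEAD_TUPLES,
    NUM_COUNTERS
};

typedef struct ToyProgress
{
    int64_t     counter[NUM_COUNTERS];
} ToyProgress;

/* Stand-in for a single-counter progress update. */
void
report(ToyProgress *p, int which, int64_t value)
{
    p->counter[which] = value;
}

/* One index + heap vacuum cycle covering heap blocks [0, upto_blk). */
void
toy_vacuum_cycle(ToyProgress *p, int64_t upto_blk)
{
    report(p, NUM_DEAD_TUPLES, 0);
    report(p, INDEX_VACUUM_COUNT, p->counter[INDEX_VACUUM_COUNT] + 1);
    report(p, HEAP_BLKS_VACUUMED, upto_blk);
}

void
toy_lazy_scan(ToyProgress *p, const int *dead_per_block,
              int64_t nblocks, int64_t max_dead)
{
    int64_t     dead = 0;

    for (int64_t blkno = 0; blkno < nblocks; blkno++)
    {
        /* No room for this block's dead tuples: run a cycle first. */
        if (dead > 0 && dead + dead_per_block[blkno] > max_dead)
        {
            toy_vacuum_cycle(p, blkno);
            dead = 0;
        }
        dead += dead_per_block[blkno];
        report(p, NUM_DEAD_TUPLES, dead);
        report(p, HEAP_BLKS_SCANNED, blkno + 1);
    }

    /* The scan is over, so the scanned count is final here. */
    report(p, HEAP_BLKS_SCANNED, nblocks);

    /* Remaining dead tuples still need one last cycle ... */
    if (dead > 0)
        toy_vacuum_cycle(p, nblocks);

    /* ... and only after it is the vacuumed count final. */
    report(p, HEAP_BLKS_VACUUMED, nblocks);
}
```

If the final HEAP_BLKS_VACUUMED report were moved above the last cycle, an observer could see every block marked vacuumed while dead tuples were still waiting to be removed.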

Thank you,

Rahila Syed

Re: [PROPOSAL] VACUUM Progress Checker.

From

Robert Haas

Date:

16 March 2016, 18:00:17

On Wed, Mar 16, 2016 at 6:44 AM, Rahila Syed <rahilasyed90@gmail.com> wrote:
>>Sorta.  Committed after renaming what you called heap blocks vacuumed
>>back to heap blocks scanned, adding heap blocks vacuumed, removing the
>>overall progress meter which I don't believe will be anything close to
>>accurate, fixing some stylistic stuff, arranging to update multiple
>>counters automatically where it could otherwise produce confusion,
>>moving the new view near similar ones in the file, reformatting it to
>>follow the style of the rest of the file, exposing the counter
>>#defines via a header file in case extensions want to use them, and
>>overhauling and substantially expanding the documentation
>
> We have following lines,
>
>         /* report that everything is scanned and vacuumed */
>         pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_SCANNED, blkno);
>         pgstat_progress_update_param(PROGRESS_VACUUM_HEAP_BLKS_VACUUMED, blkno);
>
>
> which appear before final vacuum cycle happens for any remaining dead tuples
> which may span few pages if I am not mistaken.
>
> IMO, reporting final count of heap_blks_scanned is correct here, but
> reporting final heap_blks_vacuumed can happen after the final VACUUM cycle
> for more accuracy.

You are quite right.  Good catch.  Fixed that, and applied Vinayak's
patch too, and fixed another mistake I saw while I was at it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [PROPOSAL] VACUUM Progress Checker.

From

Rahila Syed

Date:

24 March 2016, 12:45:38

Hello,

A server crash was reported when querying the vacuum progress checker view on a 32-bit machine.

Please find attached a fix for the same.

The crash occurs because a 32-bit machine treats int8 as pass-by-reference when the tuple descriptor is created. When filling the tuple store, the code (heap_fill_tuple) checks this tuple descriptor before inserting the value. It finds that the attribute type is pass-by-reference, so it treats the value as a pointer when it is not, and it fails at the memcpy.

This happens because the appropriate conversion function is not used when storing the value of that attribute into the values array before it is copied into the tuple store.

- values[i+3] = UInt32GetDatum(beentry->st_progress_param[i]);
+ values[i+3] = Int64GetDatum(beentry->st_progress_param[i]);

Attached patch fixes this.
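
The mismatch can be modelled in a few lines of standalone C. This is only a toy, not PostgreSQL code: ToyDatum, ToyArena and the toy_* functions are hypothetical, a "reference" is an offset into a small arena rather than a real pointer, and the bad dereference is turned into a failed bounds check so the sketch does not crash.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Toy model of a build where a datum is only 32 bits wide. */
typedef uint32_t ToyDatum;

#define ARENA_SIZE 64

/* Storage that by-reference datums point into (by offset). */
typedef struct ToyArena
{
    unsigned char bytes[ARENA_SIZE];
    uint32_t    used;
} ToyArena;

/* By-value conversion: the datum holds the number itself. */
ToyDatum
toy_uint32_get_datum(uint32_t v)
{
    return v;
}

/* By-reference conversion: store the 8 bytes, return where they live. */
ToyDatum
toy_int64_get_datum(ToyArena *arena, int64_t v)
{
    uint32_t    off = arena->used;

    assert(off + sizeof(v) <= ARENA_SIZE);
    memcpy(arena->bytes + off, &v, sizeof(v));
    arena->used += (uint32_t) sizeof(v);
    return off;
}

/*
 * Consumer side: the column is declared by-reference, so the datum is
 * always treated as a reference.  Returns false where the real code
 * would read through a bogus pointer.
 */
bool
toy_fetch_int64(const ToyArena *arena, ToyDatum d, int64_t *out)
{
    if (d > arena->used || arena->used - d < sizeof(*out))
        return false;
    memcpy(out, arena->bytes + d, sizeof(*out));
    return true;
}
```

A datum made with toy_int64_get_datum() is fetched back correctly, while one made with toy_uint32_get_datum(1000) is rejected because 1000 is a number, not a location. In the real server there is no such check, hence the crash in memcpy.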

Thank you,
Rahila Syed

Attachment

vacuum_progress_checker_bugfix.patch

Re: [PROPOSAL] VACUUM Progress Checker.

From

Robert Haas

Date:

24 March 2016, 13:01:19

On Thu, Mar 24, 2016 at 8:45 AM, Rahila Syed <rahilasyed90@gmail.com> wrote:
> Server crash was reported on running  vacuum progress checker view on 32-bit
> machine.
> Please find attached a fix for the same.
>
> Crash was because 32 bit machine considers int8 as being passed by reference
> while creating the tuple descriptor. At the time of filling the tuple store,
> the code (heap_fill_tuple) checks this tuple descriptor before inserting the
> value into the tuple store. It finds the attribute type pass by reference
> and hence it treats the value as a pointer when it is not and thus it fails
> at the time of memcpy.
>
> This happens because appropriate conversion function is not employed while
> storing the value of that particular attribute into the values array before
> copying it into tuple store.
>
> - values[i+3] = UInt32GetDatum(beentry->st_progress_param[i]);
> + values[i+3] = Int64GetDatum(beentry->st_progress_param[i]);
>
>
> Attached patch fixes this.

Uggh, what a stupid mistake on my part.

Committed.  Thanks for the patch.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [PROPOSAL] VACUUM Progress Checker.

From

Robert Haas

Date:

24 March 2016, 13:01:57

On Thu, Mar 24, 2016 at 9:01 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Thu, Mar 24, 2016 at 8:45 AM, Rahila Syed <rahilasyed90@gmail.com> wrote:
>> Server crash was reported on running  vacuum progress checker view on 32-bit
>> machine.
>> Please find attached a fix for the same.
>>
>> Crash was because 32 bit machine considers int8 as being passed by reference
>> while creating the tuple descriptor. At the time of filling the tuple store,
>> the code (heap_fill_tuple) checks this tuple descriptor before inserting the
>> value into the tuple store. It finds the attribute type pass by reference
>> and hence it treats the value as a pointer when it is not and thus it fails
>> at the time of memcpy.
>>
>> This happens because appropriate conversion function is not employed while
>> storing the value of that particular attribute into the values array before
>> copying it into tuple store.
>>
>> - values[i+3] = UInt32GetDatum(beentry->st_progress_param[i]);
>> + values[i+3] = Int64GetDatum(beentry->st_progress_param[i]);
>>
>>
>> Attached patch fixes this.
>
> Uggh, what a stupid mistake on my part.
>
> Committed.  Thanks for the patch.

Oops.  I forgot to credit you in the commit message.  Sorry about that.  :-(

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: [PROPOSAL] VACUUM Progress Checker.

From

Amit Langote

Date:

25 March 2016, 00:24:11

On 2016/03/24 22:01, Robert Haas wrote:
> On Thu, Mar 24, 2016 at 8:45 AM, Rahila Syed <rahilasyed90@gmail.com> wrote:
>>
>> - values[i+3] = UInt32GetDatum(beentry->st_progress_param[i]);
>> + values[i+3] = Int64GetDatum(beentry->st_progress_param[i]);
>>
>>
>> Attached patch fixes this.
> 
> Uggh, what a stupid mistake on my part.
> 
> Committed.  Thanks for the patch.

Thanks Rahila and Robert.

- Amit

Re: [PROPOSAL] VACUUM Progress Checker.

From

Rahila Syed

Date:

25 March 2016, 08:51:56

>Oops. I forgot to credit you in the commit message. Sorry about that. :-(

No problem :). Thanks for the commit.

Thank you,

Rahila Syed