Thread: [HACKERS] pg_stat_wal_write statistics view

[HACKERS] pg_stat_wal_write statistics view

From
Haribabu Kommi
Date:
Hi Hackers,

I just want to discuss adding of a new statistics view that provides
the information of wal writing details as follows

postgres=# \d pg_stat_wal_writer 
                        View "pg_catalog.pg_stat_wal_writer"
        Column         |           Type           | Collation | Nullable | Default 
-----------------------+--------------------------+-----------+----------+---------
 num_backend_writes       | bigint                   |           |          | 
 num_total_writes  | bigint                   |           |          | 
 num_blocks  | bigint                   |           |          | 
 total_write_time   | bigint|           |          | 
 stats_reset           | timestamp with time zone |           |          | 

The columns of the view are
1. Total number of xlog writes that are called from the backend.
2. Total number of xlog writes that are called from both backend
 and background workers. (This column can be changed to just
 display on the background writes).
3. The number of the blocks that are written.
4. Total write_time of the IO operation it took, this variable data is
filled only when the track_io_timing GUC is enabled.
5. Last time when the stats are reset.

I feel this view information may be useful in finding out how much
time does backend may spend in writing the xlog, based on this
information, it may be possible to tune wal_writer_delay GUC.

Or it is possible to integrate the new columns into the existing
pg_stat_bgwriter view also.

Opinions?
 
Regards,
Hari Babu
Fujitsu Australia

Re: [HACKERS] pg_stat_wal_write statistics view

From
Amit Kapila
Date:
On Tue, Feb 7, 2017 at 11:47 AM, Haribabu Kommi
<kommi.haribabu@gmail.com> wrote:
> Hi Hackers,
>
> I just want to discuss adding of a new statistics view that provides
> the information of wal writing details as follows
>

+1.  I think it will be useful to observe WAL activity.

> postgres=# \d pg_stat_wal_writer
>                         View "pg_catalog.pg_stat_wal_writer"
>         Column         |           Type           | Collation | Nullable |
> Default
> -----------------------+--------------------------+-----------+----------+---------
>  num_backend_writes       | bigint                   |           |
> |
>  num_total_writes  | bigint                   |           |          |
>  num_blocks  | bigint                   |           |          |
>  total_write_time   | bigint|           |          |
>  stats_reset           | timestamp with time zone |           |          |
>
> The columns of the view are
> 1. Total number of xlog writes that are called from the backend.
> 2. Total number of xlog writes that are called from both backend
>  and background workers. (This column can be changed to just
>  display on the background writes).
> 3. The number of the blocks that are written.
> 4. Total write_time of the IO operation it took, this variable data is
> filled only when the track_io_timing GUC is enabled.

So, here is *write_time* the total time system has spent in WAL
writing before the last reset?

I think there should be a separate column for write and sync time.


> Or it is possible to integrate the new columns into the existing
> pg_stat_bgwriter view also.
>

I feel separate view is better.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: [HACKERS] pg_stat_wal_write statistics view

From
Haribabu Kommi
Date:


On Wed, Feb 8, 2017 at 9:36 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Feb 7, 2017 at 11:47 AM, Haribabu Kommi
<kommi.haribabu@gmail.com> wrote:
> Hi Hackers,
>
> I just want to discuss adding of a new statistics view that provides
> the information of wal writing details as follows
>

+1.  I think it will be useful to observe WAL activity.

Thanks for your opinion. 

> postgres=# \d pg_stat_wal_writer
>                         View "pg_catalog.pg_stat_wal_writer"
>         Column         |           Type           | Collation | Nullable |
> Default
> -----------------------+--------------------------+-----------+----------+---------
>  num_backend_writes       | bigint                   |           |
> |
>  num_total_writes  | bigint                   |           |          |
>  num_blocks  | bigint                   |           |          |
>  total_write_time   | bigint|           |          |
>  stats_reset           | timestamp with time zone |           |          |
>
> The columns of the view are
> 1. Total number of xlog writes that are called from the backend.
> 2. Total number of xlog writes that are called from both backend
>  and background workers. (This column can be changed to just
>  display on the background writes).
> 3. The number of the blocks that are written.
> 4. Total write_time of the IO operation it took, this variable data is
> filled only when the track_io_timing GUC is enabled.

So, here is *write_time* the total time system has spent in WAL
writing before the last reset?

total write_time spent in WAL writing "after" the last reset in milliseconds. 

I think there should be a separate column for write and sync time.


Added. 


> Or it is possible to integrate the new columns into the existing
> pg_stat_bgwriter view also.
>

I feel separate view is better.

Ok.

Following the sample out of the view after regress run.

postgres=# select * from pg_stat_walwrites;
-[ RECORD 1 ]--+------------------------------
backend_writes | 19092
writes         | 663
write_blocks   | 56116
write_time     | 0
sync_time      | 3064
stats_reset    | 2017-02-15 13:37:09.454314+11

Currently, writer, walwriter and checkpointer processes
are considered as background processes that can do
the wal write mainly.

Here I attached patch that implements the view.
I will add this patch to next commitfest.
 
Regards,
Hari Babu
Fujitsu Australia
Attachment

Re: [HACKERS] pg_stat_wal_write statistics view

From
Julien Rouhaud
Date:
On Wed, Feb 15, 2017 at 02:53:44PM +1100, Haribabu Kommi wrote:
> Here I attached patch that implements the view.
> I will add this patch to next commitfest.

Hello,

I just reviewed the patch.

First, there are some whitespace issues that make git-apply complaining (at
least lines 218 and 396).

Patch is rather straightforward and works as expected, doc compiles without
issue.

I only found some minor issues:

+      <entry>One row only, showing statistics about the
+       wal writing activity. See

+      <entry>Number of wal writes that are carried out by the backend process</entry>


WAL should be uppercase (and for some more occurences).

+      <entry>
+      Number of wal writes that are carried out by background workers such as checkpointer,
+      writer and walwriter.

I guess you meant backgroung processes?

>+      This field data will be populated only when the track_io_timing GUC is enabled
(and similar occurences)

track_io_timing should link to <xref linkend="guc-track-io-timing"> instead of
mentionning GUC.

I think you also need to update the track_io_timing description in
sgml/config.sgml to mention this new view.


+           else
+           {
+               LocalWalWritesStats.m_wal_total_write_time = 0;
+           }
(and similar ones)

The brackets seem unnecessary.

I marked the commitfest entry as waiting on author.

--
Julien Rouhaud
http://dalibo.com - http://dalibo.org



Re: [HACKERS] pg_stat_wal_write statistics view

From
Haribabu Kommi
Date:


On Thu, Mar 16, 2017 at 9:55 AM, Julien Rouhaud <julien.rouhaud@dalibo.com> wrote:
On Wed, Feb 15, 2017 at 02:53:44PM +1100, Haribabu Kommi wrote:
> Here I attached patch that implements the view.
> I will add this patch to next commitfest.

Hello,

I just reviewed the patch.

Thanks for the review. 

First, there are some whitespace issues that make git-apply complaining (at
least lines 218 and 396).

Removed. 

Patch is rather straightforward and works as expected, doc compiles without
issue.

I only found some minor issues:

+      <entry>One row only, showing statistics about the
+       wal writing activity. See

+      <entry>Number of wal writes that are carried out by the backend process</entry>


WAL should be uppercase (and for some more occurences).

Fixed.
 
+      <entry>
+      Number of wal writes that are carried out by background workers such as checkpointer,
+      writer and walwriter.

I guess you meant backgroung processes?

Yes, it is background processes. Updated.
 
>+      This field data will be populated only when the track_io_timing GUC is enabled
(and similar occurences)

track_io_timing should link to <xref linkend="guc-track-io-timing"> instead of
mentionning GUC.

Updated accordingly. 

I think you also need to update the track_io_timing description in
sgml/config.sgml to mention this new view.

Added the reference of pg_stat_walwrites in the GUC description.
 

+           else
+           {
+               LocalWalWritesStats.m_wal_total_write_time = 0;
+           }
(and similar ones)

The brackets seem unnecessary.

Corrected. 

Updated patch attached.

Regards,
Hari Babu
Fujitsu Australia
Attachment

Re: [HACKERS] pg_stat_wal_write statistics view

From
vinayak
Date:


On 2017/03/16 10:34, Haribabu Kommi wrote:

Updated patch attached.
The patch looks good to me.

How about rename the view as "pg_stat_walwriter"?
The columns of view :
backend_writes -> backend_wal_writes
writes-> background_wal_writes
write_blocks-> wal_write_blocks
write_time->wal_write_time
sync_time->wal_sync_time

Regards,
Vinayak Pokale
NTT Open Source Software Center

Re: [HACKERS] pg_stat_wal_write statistics view

From
Haribabu Kommi
Date:


On Thu, Mar 16, 2017 at 4:15 PM, vinayak <Pokale_Vinayak_q3@lab.ntt.co.jp> wrote:
On 2017/03/16 10:34, Haribabu Kommi wrote:

Updated patch attached.
The patch looks good to me.
 
Thanks for the review.

How about rename the view as "pg_stat_walwriter"?

With the use of name "walwriter" instead of "walwrites", the
user may confuse that this view is used for displaying 
walwriter processes statistics. But actually it is showing
the WAL writes activity in the instance. Because of this
reason, I went with the name of "walwrites".
 
The columns of view :
backend_writes -> backend_wal_writes
writes-> background_wal_writes
write_blocks-> wal_write_blocks
write_time->wal_write_time
sync_time->wal_sync_time

As the view name already contains WAL, I am not sure
whether is it required to include WAL in every column?
I am fine to change if others have the same opinion of
adding WAL to column names.

Regards,
Hari Babu
Fujitsu Australia

Re: [HACKERS] pg_stat_wal_write statistics view

From
vinayak
Date:

On 2017/03/16 14:46, Haribabu Kommi wrote:


On Thu, Mar 16, 2017 at 4:15 PM, vinayak <Pokale_Vinayak_q3@lab.ntt.co.jp> wrote:
On 2017/03/16 10:34, Haribabu Kommi wrote:

Updated patch attached.
The patch looks good to me.
 
Thanks for the review.

How about rename the view as "pg_stat_walwriter"?

With the use of name "walwriter" instead of "walwrites", the
user may confuse that this view is used for displaying 
walwriter processes statistics. But actually it is showing
the WAL writes activity in the instance. Because of this
reason, I went with the name of "walwrites".
Understood. Thanks.
 
The columns of view :
backend_writes -> backend_wal_writes
writes-> background_wal_writes
write_blocks-> wal_write_blocks
write_time->wal_write_time
sync_time->wal_sync_time

As the view name already contains WAL, I am not sure
whether is it required to include WAL in every column?
I am fine to change if others have the same opinion of
adding WAL to column names.

Ok.

Regards,
Vinayak Pokale
NTT Open Source Software Center

Re: pg_stat_wal_write statistics view

From
David Steele
Date:
On 3/16/17 1:54 AM, vinayak wrote:
>
> On 2017/03/16 14:46, Haribabu Kommi wrote:
>>
>> As the view name already contains WAL, I am not sure
>> whether is it required to include WAL in every column?
>> I am fine to change if others have the same opinion of
>> adding WAL to column names.
>>
> Ok.

So what is the status of this patch now?  Should it be marked "Ready for 
Committer"?

-- 
-David
david@pgmasters.net



Re: pg_stat_wal_write statistics view

From
Fujii Masao
Date:
On Wed, Feb 15, 2017 at 12:53 PM, Haribabu Kommi
<kommi.haribabu@gmail.com> wrote:
>
>
> On Wed, Feb 8, 2017 at 9:36 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>> On Tue, Feb 7, 2017 at 11:47 AM, Haribabu Kommi
>> <kommi.haribabu@gmail.com> wrote:
>> > Hi Hackers,
>> >
>> > I just want to discuss adding of a new statistics view that provides
>> > the information of wal writing details as follows
>> >
>>
>> +1.  I think it will be useful to observe WAL activity.
>
>
> Thanks for your opinion.
>
>> > postgres=# \d pg_stat_wal_writer
>> >                         View "pg_catalog.pg_stat_wal_writer"
>> >         Column         |           Type           | Collation | Nullable
>> > |
>> > Default
>> >
>> > -----------------------+--------------------------+-----------+----------+---------
>> >  num_backend_writes       | bigint                   |           |
>> > |
>> >  num_total_writes  | bigint                   |           |          |
>> >  num_blocks  | bigint                   |           |          |
>> >  total_write_time   | bigint|           |          |
>> >  stats_reset           | timestamp with time zone |           |
>> > |
>> >
>> > The columns of the view are
>> > 1. Total number of xlog writes that are called from the backend.
>> > 2. Total number of xlog writes that are called from both backend
>> >  and background workers. (This column can be changed to just
>> >  display on the background writes).
>> > 3. The number of the blocks that are written.
>> > 4. Total write_time of the IO operation it took, this variable data is
>> > filled only when the track_io_timing GUC is enabled.
>>
>> So, here is *write_time* the total time system has spent in WAL
>> writing before the last reset?
>
>
> total write_time spent in WAL writing "after" the last reset in
> milliseconds.
>
>> I think there should be a separate column for write and sync time.
>>
>
> Added.
>
>>
>> > Or it is possible to integrate the new columns into the existing
>> > pg_stat_bgwriter view also.
>> >
>>
>> I feel separate view is better.
>
>
> Ok.
>
> Following the sample out of the view after regress run.
>
> postgres=# select * from pg_stat_walwrites;
> -[ RECORD 1 ]--+------------------------------
> backend_writes | 19092
> writes         | 663
> write_blocks   | 56116
> write_time     | 0
> sync_time      | 3064
> stats_reset    | 2017-02-15 13:37:09.454314+11
>
> Currently, writer, walwriter and checkpointer processes
> are considered as background processes that can do
> the wal write mainly.

I'm not sure if this categorization is good. You told that this view is useful
to tune walwriter parameters at the top of this thread. If so, ISTM that
the information about walwriter's activity should be separated from others.

What about other processes which *can* write WAL, for example walsender
(e.g., BASE_BACKUP can cause WAL record), startup process (e.g., end-of-
recovery checkpoint) and logical replication worker (Not sure if it always
works with synchronous_commit=off, though). There might be other processes
which can write WAL.

Why didn't you separate "write_blocks", "write_time" and "sync_time" per
the process category, like "backend_writes" and "writes"?

This view doesn't seem to take into consideration the WAL writing and flushing
during creating new WAL segment file.

I think that it's better to make this view report also the number of WAL pages
which are written when wal_buffer is full. This information is useful to
tune the size of wal_buffers. This was proposed by Nagayasu before.
https://www.postgresql.org/message-id/4FF824F3.5090407@uptime.jp

Regards,

-- 
Fujii Masao



Re: pg_stat_wal_write statistics view

From
Haribabu Kommi
Date:


On Sat, Mar 25, 2017 at 6:40 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
On Wed, Feb 15, 2017 at 12:53 PM, Haribabu Kommi
<kommi.haribabu@gmail.com> wrote:
>
>
> On Wed, Feb 8, 2017 at 9:36 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>> On Tue, Feb 7, 2017 at 11:47 AM, Haribabu Kommi
>> <kommi.haribabu@gmail.com> wrote:
>> > Hi Hackers,
>> >
>> > I just want to discuss adding of a new statistics view that provides
>> > the information of wal writing details as follows
>> >
>>
>> +1.  I think it will be useful to observe WAL activity.
>
>
> Thanks for your opinion.
>
>> > postgres=# \d pg_stat_wal_writer
>> >                         View "pg_catalog.pg_stat_wal_writer"
>> >         Column         |           Type           | Collation | Nullable
>> > |
>> > Default
>> >
>> > -----------------------+--------------------------+-----------+----------+---------
>> >  num_backend_writes       | bigint                   |           |
>> > |
>> >  num_total_writes  | bigint                   |           |          |
>> >  num_blocks  | bigint                   |           |          |
>> >  total_write_time   | bigint|           |          |
>> >  stats_reset           | timestamp with time zone |           |
>> > |
>> >
>> > The columns of the view are
>> > 1. Total number of xlog writes that are called from the backend.
>> > 2. Total number of xlog writes that are called from both backend
>> >  and background workers. (This column can be changed to just
>> >  display on the background writes).
>> > 3. The number of the blocks that are written.
>> > 4. Total write_time of the IO operation it took, this variable data is
>> > filled only when the track_io_timing GUC is enabled.
>>
>> So, here is *write_time* the total time system has spent in WAL
>> writing before the last reset?
>
>
> total write_time spent in WAL writing "after" the last reset in
> milliseconds.
>
>> I think there should be a separate column for write and sync time.
>>
>
> Added.
>
>>
>> > Or it is possible to integrate the new columns into the existing
>> > pg_stat_bgwriter view also.
>> >
>>
>> I feel separate view is better.
>
>
> Ok.
>
> Following the sample out of the view after regress run.
>
> postgres=# select * from pg_stat_walwrites;
> -[ RECORD 1 ]--+------------------------------
> backend_writes | 19092
> writes         | 663
> write_blocks   | 56116
> write_time     | 0
> sync_time      | 3064
> stats_reset    | 2017-02-15 13:37:09.454314+11
>
> Currently, writer, walwriter and checkpointer processes
> are considered as background processes that can do
> the wal write mainly.

Thanks for the review. 

I'm not sure if this categorization is good. You told that this view is useful
to tune walwriter parameters at the top of this thread. If so, ISTM that
the information about walwriter's activity should be separated from others.

Yes, that's correct. First I thought of providing the statistics of walwriter, but
later in development, it turned into showing statistics of all wal write activity
of background processes also to differentiate the actual write by the backends.
 
What about other processes which *can* write WAL, for example walsender
(e.g., BASE_BACKUP can cause WAL record), startup process (e.g., end-of-
recovery checkpoint) and logical replication worker (Not sure if it always
works with synchronous_commit=off, though). There might be other processes
which can write WAL

It is possible to add the walsender, stratup and other processes easily, but not 
background workers that does some wal write operations until unless they
report the stats with pgstat_report_stat(). Is it fine to ignore the workers that
does not report the stats?

 
Why didn't you separate "write_blocks", "write_time" and "sync_time" per
the process category, like "backend_writes" and "writes"?

Ok. I will add those columns.
 
This view doesn't seem to take into consideration the WAL writing and flushing
during creating new WAL segment file.

I think that it's better to make this view report also the number of WAL pages
which are written when wal_buffer is full. This information is useful to
tune the size of wal_buffers. This was proposed by Nagayasu before.
https://www.postgresql.org/message-id/4FF824F3.5090407@uptime.jp

Ok. But this new column just shows how many times the WAL buffers are flushed
because of wal buffers are full. Not the WAL pages that are actually flushed because
of wal buffers full as a separate column.


Regards,
Hari Babu
Fujitsu Australia

Re: pg_stat_wal_write statistics view

From
Haribabu Kommi
Date:


On Mon, Mar 27, 2017 at 1:27 PM, Haribabu Kommi <kommi.haribabu@gmail.com> wrote:


On Sat, Mar 25, 2017 at 6:40 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
On Wed, Feb 15, 2017 at 12:53 PM, Haribabu Kommi
<kommi.haribabu@gmail.com> wrote:
>
>
> On Wed, Feb 8, 2017 at 9:36 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>> On Tue, Feb 7, 2017 at 11:47 AM, Haribabu Kommi
>> <kommi.haribabu@gmail.com> wrote:
>> > Hi Hackers,
>> >
>> > I just want to discuss adding of a new statistics view that provides
>> > the information of wal writing details as follows
>> >
>>
>> +1.  I think it will be useful to observe WAL activity.
>
>
> Thanks for your opinion.
>
>> > postgres=# \d pg_stat_wal_writer
>> >                         View "pg_catalog.pg_stat_wal_writer"
>> >         Column         |           Type           | Collation | Nullable
>> > |
>> > Default
>> >
>> > -----------------------+--------------------------+-----------+----------+---------
>> >  num_backend_writes       | bigint                   |           |
>> > |
>> >  num_total_writes  | bigint                   |           |          |
>> >  num_blocks  | bigint                   |           |          |
>> >  total_write_time   | bigint|           |          |
>> >  stats_reset           | timestamp with time zone |           |
>> > |
>> >
>> > The columns of the view are
>> > 1. Total number of xlog writes that are called from the backend.
>> > 2. Total number of xlog writes that are called from both backend
>> >  and background workers. (This column can be changed to just
>> >  display on the background writes).
>> > 3. The number of the blocks that are written.
>> > 4. Total write_time of the IO operation it took, this variable data is
>> > filled only when the track_io_timing GUC is enabled.
>>
>> So, here is *write_time* the total time system has spent in WAL
>> writing before the last reset?
>
>
> total write_time spent in WAL writing "after" the last reset in
> milliseconds.
>
>> I think there should be a separate column for write and sync time.
>>
>
> Added.
>
>>
>> > Or it is possible to integrate the new columns into the existing
>> > pg_stat_bgwriter view also.
>> >
>>
>> I feel separate view is better.
>
>
> Ok.
>
> Following the sample out of the view after regress run.
>
> postgres=# select * from pg_stat_walwrites;
> -[ RECORD 1 ]--+------------------------------
> backend_writes | 19092
> writes         | 663
> write_blocks   | 56116
> write_time     | 0
> sync_time      | 3064
> stats_reset    | 2017-02-15 13:37:09.454314+11
>
> Currently, writer, walwriter and checkpointer processes
> are considered as background processes that can do
> the wal write mainly.

Thanks for the review. 

I'm not sure if this categorization is good. You told that this view is useful
to tune walwriter parameters at the top of this thread. If so, ISTM that
the information about walwriter's activity should be separated from others.

Yes, that's correct. First I thought of providing the statistics of walwriter, but
later in development, it turned into showing statistics of all wal write activity
of background processes also to differentiate the actual write by the backends.
 
What about other processes which *can* write WAL, for example walsender
(e.g., BASE_BACKUP can cause WAL record), startup process (e.g., end-of-
recovery checkpoint) and logical replication worker (Not sure if it always
works with synchronous_commit=off, though). There might be other processes
which can write WAL

It is possible to add the walsender, stratup and other processes easily, but not 
background workers that does some wal write operations until unless they
report the stats with pgstat_report_stat(). Is it fine to ignore the workers that
does not report the stats?

Added stats collection for walsender, statrup and autovacuum processes.
The background workers that call pgstat_report_stat() function will automatically
included.
 
 
Why didn't you separate "write_blocks", "write_time" and "sync_time" per
the process category, like "backend_writes" and "writes"?

Ok. I will add those columns.

Added the new columns to differentiate backend and background processes. 

This view doesn't seem to take into consideration the WAL writing and flushing
during creating new WAL segment file.

I think that it's better to make this view report also the number of WAL pages
which are written when wal_buffer is full. This information is useful to
tune the size of wal_buffers. This was proposed by Nagayasu before.
https://www.postgresql.org/message-id/4FF824F3.5090407@uptime.jp

Ok. But this new column just shows how many times the WAL buffers are flushed
because of wal buffers are full. Not the WAL pages that are actually flushed because
of wal buffers full as a separate column.

Added a new column to display the dirty writes from both backend and
background processes.

update patch attached.
 
Regards,
Hari Babu
Fujitsu Australia
Attachment

Re: pg_stat_wal_write statistics view

From
Haribabu Kommi
Date:


On Tue, Mar 28, 2017 at 3:40 PM, Haribabu Kommi <kommi.haribabu@gmail.com> wrote:

update patch attached.

Forgot to execute git commit, new patch attached.

Regards,
Hari Babu
Fujitsu Australia
Attachment

Re: pg_stat_wal_write statistics view

From
Fujii Masao
Date:
On Tue, Mar 28, 2017 at 1:40 PM, Haribabu Kommi
<kommi.haribabu@gmail.com> wrote:
>
>
> On Mon, Mar 27, 2017 at 1:27 PM, Haribabu Kommi <kommi.haribabu@gmail.com>
> wrote:
>>
>>
>>
>> On Sat, Mar 25, 2017 at 6:40 AM, Fujii Masao <masao.fujii@gmail.com>
>> wrote:
>>>
>>> On Wed, Feb 15, 2017 at 12:53 PM, Haribabu Kommi
>>> <kommi.haribabu@gmail.com> wrote:
>>> >
>>> >
>>> > On Wed, Feb 8, 2017 at 9:36 PM, Amit Kapila <amit.kapila16@gmail.com>
>>> > wrote:
>>> >>
>>> >> On Tue, Feb 7, 2017 at 11:47 AM, Haribabu Kommi
>>> >> <kommi.haribabu@gmail.com> wrote:
>>> >> > Hi Hackers,
>>> >> >
>>> >> > I just want to discuss adding of a new statistics view that provides
>>> >> > the information of wal writing details as follows
>>> >> >
>>> >>
>>> >> +1.  I think it will be useful to observe WAL activity.
>>> >
>>> >
>>> > Thanks for your opinion.
>>> >
>>> >> > postgres=# \d pg_stat_wal_writer
>>> >> >                         View "pg_catalog.pg_stat_wal_writer"
>>> >> >         Column         |           Type           | Collation |
>>> >> > Nullable
>>> >> > |
>>> >> > Default
>>> >> >
>>> >> >
>>> >> > -----------------------+--------------------------+-----------+----------+---------
>>> >> >  num_backend_writes       | bigint                   |           |
>>> >> > |
>>> >> >  num_total_writes  | bigint                   |           |
>>> >> > |
>>> >> >  num_blocks  | bigint                   |           |          |
>>> >> >  total_write_time   | bigint|           |          |
>>> >> >  stats_reset           | timestamp with time zone |           |
>>> >> > |
>>> >> >
>>> >> > The columns of the view are
>>> >> > 1. Total number of xlog writes that are called from the backend.
>>> >> > 2. Total number of xlog writes that are called from both backend
>>> >> >  and background workers. (This column can be changed to just
>>> >> >  display on the background writes).
>>> >> > 3. The number of the blocks that are written.
>>> >> > 4. Total write_time of the IO operation it took, this variable data
>>> >> > is
>>> >> > filled only when the track_io_timing GUC is enabled.
>>> >>
>>> >> So, here is *write_time* the total time system has spent in WAL
>>> >> writing before the last reset?
>>> >
>>> >
>>> > total write_time spent in WAL writing "after" the last reset in
>>> > milliseconds.
>>> >
>>> >> I think there should be a separate column for write and sync time.
>>> >>
>>> >
>>> > Added.
>>> >
>>> >>
>>> >> > Or it is possible to integrate the new columns into the existing
>>> >> > pg_stat_bgwriter view also.
>>> >> >
>>> >>
>>> >> I feel separate view is better.
>>> >
>>> >
>>> > Ok.
>>> >
>>> > Following the sample out of the view after regress run.
>>> >
>>> > postgres=# select * from pg_stat_walwrites;
>>> > -[ RECORD 1 ]--+------------------------------
>>> > backend_writes | 19092
>>> > writes         | 663
>>> > write_blocks   | 56116
>>> > write_time     | 0
>>> > sync_time      | 3064
>>> > stats_reset    | 2017-02-15 13:37:09.454314+11
>>> >
>>> > Currently, writer, walwriter and checkpointer processes
>>> > are considered as background processes that can do
>>> > the wal write mainly.
>>
>>
>> Thanks for the review.
>>
>>> I'm not sure if this categorization is good. You told that this view is
>>> useful
>>> to tune walwriter parameters at the top of this thread. If so, ISTM that
>>> the information about walwriter's activity should be separated from
>>> others.
>>
>>
>> Yes, that's correct. First I thought of providing the statistics of
>> walwriter, but
>> later in development, it turned into showing statistics of all wal write
>> activity
>> of background processes also to differentiate the actual write by the
>> backends.
>>
>>>
>>> What about other processes which *can* write WAL, for example walsender
>>> (e.g., BASE_BACKUP can cause WAL record), startup process (e.g., end-of-
>>> recovery checkpoint) and logical replication worker (Not sure if it
>>> always
>>> works with synchronous_commit=off, though). There might be other
>>> processes
>>> which can write WAL
>>
>>
>> It is possible to add the walsender, stratup and other processes easily,
>> but not
>> background workers that does some wal write operations until unless they
>> report the stats with pgstat_report_stat(). Is it fine to ignore the
>> workers that
>> does not report the stats?
>
>
> Added stats collection for walsender, statrup and autovacuum processes.
> The background workers that call pgstat_report_stat() function will
> automatically
> included.

So you added pgstat_send_walwrites() into several places. But WAL activity
by the background workers which don't call pgstat_report_stat() is still not
considered in pg_stat_walwrites view. This design looks fragile to me.
I think that we should capture all the WAL activities (even by such background
workers) in XLogWrite(), instead.

For example, we can add the "shared" counters (protected by WALWriteLock)
into XLogCtl and make XLogWrite() update those counters if the process is
NOT walwriter. OTOH, if it's walwriter, XLogWrite() updates its local counters.
Then walwriter periodically packs those counters in the message and sends
it to the stats collector.

>
>>
>>
>>>
>>> Why didn't you separate "write_blocks", "write_time" and "sync_time" per
>>> the process category, like "backend_writes" and "writes"?
>>
>>
>> Ok. I will add those columns.
>
>
> Added the new columns to differentiate backend and background processes.
>
>>> This view doesn't seem to take into consideration the WAL writing and
>>> flushing
>>> during creating new WAL segment file.
>>>
>>> I think that it's better to make this view report also the number of WAL
>>> pages
>>> which are written when wal_buffer is full. This information is useful to
>>> tune the size of wal_buffers. This was proposed by Nagayasu before.
>>> https://www.postgresql.org/message-id/4FF824F3.5090407@uptime.jp
>>
>>
>> Ok. But this new column just shows how many times the WAL buffers are
>> flushed
>> because of wal buffers are full. Not the WAL pages that are actually
>> flushed because
>> of wal buffers full as a separate column.
>
>
> Added a new column to display the dirty writes from both backend and
> background processes.
>
> update patch attached.

Thanks for updating the patch!

+      <entry><structfield>writes</></entry>
+      <entry><type>bigint</type></entry>
+      <entry>
+      Number of WAL writes that are carried out by background
processes and workers.
+      </entry>
<snip>
+      <entry><structfield>write_blocks</></entry>
+      <entry><type>bigint</type></entry>
+      <entry>Number of WAL pages written to the disk by the
background processes/workers</entry>
+     </row>

Could you tell me why both numbers of WAL writes and pages written need to
be reported? How can we tune the system by using those information?

Instead of (or in addition to) the number of WAL pages, we should report
the number of *bytes* of WAL records actually written?

Anyway, we need more and more opinions about what information of WAL activity
should be exposed in the view and how. But we have only a few days left for
this CommitFest...

Regards,

-- 
Fujii Masao



Re: pg_stat_wal_write statistics view

From
Haribabu Kommi
Date:


On Wed, Mar 29, 2017 at 5:10 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
On Tue, Mar 28, 2017 at 1:40 PM, Haribabu Kommi
<kommi.haribabu@gmail.com> wrote:
>

> Added stats collection for walsender, statrup and autovacuum processes.
> The background workers that call pgstat_report_stat() function will
> automatically
> included.

So you added pgstat_send_walwrites() into several places. But WAL activity
by the background workers which don't call pgstat_report_stat() is still not
considered in pg_stat_walwrites view. This design looks fragile to me.
I think that we should capture all the WAL activities (even by such background
workers) in XLogWrite(), instead.

For example, we can add the "shared" counters (protected by WALWriteLock)
into XLogCtl and make XLogWrite() update those counters if the process is
NOT walwriter. OTOH, if it's walwriter, XLogWrite() updates its local counters.
Then walwriter periodically packs those counters in the message and sends
it to the stats collector.

Updated patch to use shared counter instead of adding pg_stat_ calls to send
the statistics from each background process/worker.

All the processes updates the shared counters, the walwriter process fill the
local structure with the shared counters and send it to the stats collector.
 
>
> update patch attached.

Thanks for updating the patch!

+      <entry><structfield>writes</></entry>
+      <entry><type>bigint</type></entry>
+      <entry>
+      Number of WAL writes that are carried out by background
processes and workers.
+      </entry>
<snip>
+      <entry><structfield>write_blocks</></entry>
+      <entry><type>bigint</type></entry>
+      <entry>Number of WAL pages written to the disk by the
background processes/workers</entry>
+     </row>

Could you tell me why both numbers of WAL writes and pages written need to
be reported? How can we tune the system by using those information?

With the column of write_blocks, it is useful to find out how many number of blocks
that gets written from backend along with the number of writes. I feel 
1. If we only get the count of number of writes and the write amount is very less,
so it may not need any tuning.
2. If the amount of write is good even for less number of writes, then it needs
some tuning.

Updated patch attached.

Regards,
Hari Babu
Fujitsu Australia
Attachment

Re: pg_stat_wal_write statistics view

From
Andres Freund
Date:
Hi,

On 2017-03-30 13:10:41 +1100, Haribabu Kommi wrote:
> diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
> index 5d58f09..a29c108 100644
> --- a/src/backend/access/transam/xlog.c
> +++ b/src/backend/access/transam/xlog.c
> @@ -600,6 +600,9 @@ typedef struct XLogCtlData
>       */
>      XLogwrtResult LogwrtResult;
>  
> +    /* Protected by WALWriteLock */
> +    PgStat_WalWritesCounts stats;

>  /*
> + * Check whether the current process is a normal backend or not.
> + * This function checks for the background processes that does
> + * some WAL write activity only and other background processes
> + * are not considered. It considers all the background workers
> + * as WAL write activity workers.
> + *
> + * Returns FALSE - when the current process is a normal backend
> + *           TRUE - when the current process a background process/worker
> + */

I don't think we commonly capitalize true/false these days.

> +static bool
> +am_background_process()
> +{
> +    /* check whether current process is a background process/worker? */
> +    if (AmBackgroundWriterProcess() ||
> +        AmWalWriterProcess() ||
> +        AmCheckpointerProcess() ||
> +        AmStartupProcess() ||
> +        IsBackgroundWorker ||
> +        am_walsender ||
> +        am_autovacuum_worker)
> +    {
> +        return true;
> +    }
> +
> +    return false;
> +}

Uhm, wouldn't inverting this be a lot easier and future proof?  There's
far fewer non-background processes.

> +/*
>   * Write and/or fsync the log at least as far as WriteRqst indicates.
>   *
>   * If flexible == TRUE, we don't have to write as far as WriteRqst, but
> @@ -2341,6 +2377,9 @@ XLogWrite(XLogwrtRqst WriteRqst, bool flexible)
>      int            npages;
>      int            startidx;
>      uint32        startoffset;
> +    instr_time    io_start,
> +                io_time;
> +    bool        is_background_process = am_background_process();
>  
>      /* We should always be inside a critical section here */
>      Assert(CritSectionCount > 0);
> @@ -2458,6 +2497,11 @@ XLogWrite(XLogwrtRqst WriteRqst, bool flexible)
>              /* OK to write the page(s) */
>              from = XLogCtl->pages + startidx * (Size) XLOG_BLCKSZ;
>              nbytes = npages * (Size) XLOG_BLCKSZ;
> +
> +            /* Start timer to acquire start time of the wal write */
> +            if (track_io_timing)
> +                INSTR_TIME_SET_CURRENT(io_start);
> +

I'm more than a bit hesitant adding overhead to WAL writing - it's
already quite a bit of a bottleneck.  Normally track_io_timing just
makes things a tiny bit slower, but doesn't cause a scalability issue,
here it does.  This is all done under an already highly contended lock.


>              nleft = nbytes;
>              do
>              {
> @@ -2480,6 +2524,34 @@ XLogWrite(XLogwrtRqst WriteRqst, bool flexible)
>                  from += written;
>              } while (nleft > 0);
>  
> +            /* calculate the total time spent for wal writing */
> +            if (track_io_timing)
> +            {
> +                INSTR_TIME_SET_CURRENT(io_time);
> +                INSTR_TIME_SUBTRACT(io_time, io_start);
> +
> +                if (is_background_process)
> +                    XLogCtl->stats.total_write_time += INSTR_TIME_GET_MILLISEC(io_time);
> +                else
> +                    XLogCtl->stats.backend_total_write_time += INSTR_TIME_GET_MILLISEC(io_time);
> +            }
> +            else
> +            {
> +                XLogCtl->stats.total_write_time = 0;
> +                XLogCtl->stats.backend_total_write_time = 0;
> +            }
> +
> +            /* check whether writer is a normal backend or not? */
> +            if (is_background_process)
> +                XLogCtl->stats.writes++;
> +            else
> +                XLogCtl->stats.backend_writes++;
> +
> +            if (is_background_process)
> +                XLogCtl->stats.write_blocks += npages;
> +            else
> +                XLogCtl->stats.backend_write_blocks += npages;
> +
>              /* Update state for write */
>              openLogOff += nbytes;
>              npages = 0;
> @@ -2499,8 +2571,29 @@ XLogWrite(XLogwrtRqst WriteRqst, bool flexible)
>               */
>              if (finishing_seg)
>              {
> +                /* Start timer to acquire start time of the wal sync */
> +                if (track_io_timing)
> +                    INSTR_TIME_SET_CURRENT(io_start);
> +
>                  issue_xlog_fsync(openLogFile, openLogSegNo);
>  
> +                /* calculate the total time spent for wal sync */
> +                if (track_io_timing)
> +                {
> +                    INSTR_TIME_SET_CURRENT(io_time);
> +                    INSTR_TIME_SUBTRACT(io_time, io_start);
> +
> +                    if (is_background_process)
> +                        XLogCtl->stats.total_sync_time += INSTR_TIME_GET_MILLISEC(io_time);
> +                    else
> +                        XLogCtl->stats.backend_total_sync_time += INSTR_TIME_GET_MILLISEC(io_time);
> +                }
> +                else
> +                {
> +                    XLogCtl->stats.total_sync_time = 0;
> +                    XLogCtl->stats.backend_total_sync_time = 0;
> +                }
> +
>                  /* signal that we need to wakeup walsenders later */
>                  WalSndWakeupRequest();
>  
> @@ -2568,7 +2661,28 @@ XLogWrite(XLogwrtRqst WriteRqst, bool flexible)
>                  openLogOff = 0;
>              }
>  
> +            /* Start timer to acquire start time of the wal sync */
> +            if (track_io_timing)
> +                INSTR_TIME_SET_CURRENT(io_start);
> +
>              issue_xlog_fsync(openLogFile, openLogSegNo);
> +
> +            /* calculate the total time spent for wal sync */
> +            if (track_io_timing)
> +            {
> +                INSTR_TIME_SET_CURRENT(io_time);
> +                INSTR_TIME_SUBTRACT(io_time, io_start);
> +
> +                if (is_background_process)
> +                    XLogCtl->stats.total_sync_time += INSTR_TIME_GET_MILLISEC(io_time);
> +                else
> +                    XLogCtl->stats.backend_total_sync_time += INSTR_TIME_GET_MILLISEC(io_time);
> +            }
> +            else
> +            {
> +                XLogCtl->stats.total_sync_time = 0;
> +                XLogCtl->stats.backend_total_sync_time = 0;
> +            }
>          }

I'm *very* doubtful about this, but even if we do it, this needs careful
benchmarking.

>          /* signal that we need to wakeup walsenders later */
> @@ -3043,6 +3157,9 @@ XLogBackgroundFlush(void)
>      {
>          XLogWrite(WriteRqst, flexible);
>      }
> +
> +    /* Collect all the wal write shared counters into local, and report it to stats collector */
> +    memcpy(&LocalWalWritesStats.stats, &XLogCtl->stats, sizeof(PgStat_WalWritesCounts));
>      LWLockRelease(WALWriteLock);

Hm. I think in a good number of workloads hti sis never reached, because
we return early.


I think this is an interesting feature, but I don't think it's ready,
and it was submitted very late in the v10 release cycle. Therefore I
think this should be moved to the next CF.

Greetings,

Andres Freund



Re: [HACKERS] pg_stat_wal_write statistics view

From
Fujii Masao
Date:
On Thu, Apr 6, 2017 at 5:51 AM, Andres Freund <andres@anarazel.de> wrote:
> Hi,
>
> On 2017-03-30 13:10:41 +1100, Haribabu Kommi wrote:
>> diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
>> index 5d58f09..a29c108 100644
>> --- a/src/backend/access/transam/xlog.c
>> +++ b/src/backend/access/transam/xlog.c
>> @@ -600,6 +600,9 @@ typedef struct XLogCtlData
>>        */
>>       XLogwrtResult LogwrtResult;
>>
>> +     /* Protected by WALWriteLock */
>> +     PgStat_WalWritesCounts stats;
>
>>  /*
>> + * Check whether the current process is a normal backend or not.
>> + * This function checks for the background processes that does
>> + * some WAL write activity only and other background processes
>> + * are not considered. It considers all the background workers
>> + * as WAL write activity workers.
>> + *
>> + * Returns FALSE - when the current process is a normal backend
>> + *              TRUE - when the current process a background process/worker
>> + */
>
> I don't think we commonly capitalize true/false these days.
>
>> +static bool
>> +am_background_process()
>> +{
>> +     /* check whether current process is a background process/worker? */
>> +     if (AmBackgroundWriterProcess() ||
>> +             AmWalWriterProcess() ||
>> +             AmCheckpointerProcess() ||
>> +             AmStartupProcess() ||
>> +             IsBackgroundWorker ||
>> +             am_walsender ||
>> +             am_autovacuum_worker)
>> +     {
>> +             return true;
>> +     }
>> +
>> +     return false;
>> +}
>
> Uhm, wouldn't inverting this be a lot easier and future proof?  There's
> far fewer non-background processes.
>
>> +/*
>>   * Write and/or fsync the log at least as far as WriteRqst indicates.
>>   *
>>   * If flexible == TRUE, we don't have to write as far as WriteRqst, but
>> @@ -2341,6 +2377,9 @@ XLogWrite(XLogwrtRqst WriteRqst, bool flexible)
>>       int                     npages;
>>       int                     startidx;
>>       uint32          startoffset;
>> +     instr_time      io_start,
>> +                             io_time;
>> +     bool            is_background_process = am_background_process();
>>
>>       /* We should always be inside a critical section here */
>>       Assert(CritSectionCount > 0);
>> @@ -2458,6 +2497,11 @@ XLogWrite(XLogwrtRqst WriteRqst, bool flexible)
>>                       /* OK to write the page(s) */
>>                       from = XLogCtl->pages + startidx * (Size) XLOG_BLCKSZ;
>>                       nbytes = npages * (Size) XLOG_BLCKSZ;
>> +
>> +                     /* Start timer to acquire start time of the wal write */
>> +                     if (track_io_timing)
>> +                             INSTR_TIME_SET_CURRENT(io_start);
>> +
>
> I'm more than a bit hesitant adding overhead to WAL writing - it's
> already quite a bit of a bottleneck.  Normally track_io_timing just
> makes things a tiny bit slower, but doesn't cause a scalability issue,
> here it does.  This is all done under an already highly contended lock.
>
>
>>                       nleft = nbytes;
>>                       do
>>                       {
>> @@ -2480,6 +2524,34 @@ XLogWrite(XLogwrtRqst WriteRqst, bool flexible)
>>                               from += written;
>>                       } while (nleft > 0);
>>
>> +                     /* calculate the total time spent for wal writing */
>> +                     if (track_io_timing)
>> +                     {
>> +                             INSTR_TIME_SET_CURRENT(io_time);
>> +                             INSTR_TIME_SUBTRACT(io_time, io_start);
>> +
>> +                             if (is_background_process)
>> +                                     XLogCtl->stats.total_write_time += INSTR_TIME_GET_MILLISEC(io_time);
>> +                             else
>> +                                     XLogCtl->stats.backend_total_write_time += INSTR_TIME_GET_MILLISEC(io_time);
>> +                     }
>> +                     else
>> +                     {
>> +                             XLogCtl->stats.total_write_time = 0;
>> +                             XLogCtl->stats.backend_total_write_time = 0;
>> +                     }
>> +
>> +                     /* check whether writer is a normal backend or not? */
>> +                     if (is_background_process)
>> +                             XLogCtl->stats.writes++;
>> +                     else
>> +                             XLogCtl->stats.backend_writes++;
>> +
>> +                     if (is_background_process)
>> +                             XLogCtl->stats.write_blocks += npages;
>> +                     else
>> +                             XLogCtl->stats.backend_write_blocks += npages;
>> +
>>                       /* Update state for write */
>>                       openLogOff += nbytes;
>>                       npages = 0;
>> @@ -2499,8 +2571,29 @@ XLogWrite(XLogwrtRqst WriteRqst, bool flexible)
>>                        */
>>                       if (finishing_seg)
>>                       {
>> +                             /* Start timer to acquire start time of the wal sync */
>> +                             if (track_io_timing)
>> +                                     INSTR_TIME_SET_CURRENT(io_start);
>> +
>>                               issue_xlog_fsync(openLogFile, openLogSegNo);
>>
>> +                             /* calculate the total time spent for wal sync */
>> +                             if (track_io_timing)
>> +                             {
>> +                                     INSTR_TIME_SET_CURRENT(io_time);
>> +                                     INSTR_TIME_SUBTRACT(io_time, io_start);
>> +
>> +                                     if (is_background_process)
>> +                                             XLogCtl->stats.total_sync_time += INSTR_TIME_GET_MILLISEC(io_time);
>> +                                     else
>> +                                             XLogCtl->stats.backend_total_sync_time +=
INSTR_TIME_GET_MILLISEC(io_time);
>> +                             }
>> +                             else
>> +                             {
>> +                                     XLogCtl->stats.total_sync_time = 0;
>> +                                     XLogCtl->stats.backend_total_sync_time = 0;
>> +                             }
>> +
>>                               /* signal that we need to wakeup walsenders later */
>>                               WalSndWakeupRequest();
>>
>> @@ -2568,7 +2661,28 @@ XLogWrite(XLogwrtRqst WriteRqst, bool flexible)
>>                               openLogOff = 0;
>>                       }
>>
>> +                     /* Start timer to acquire start time of the wal sync */
>> +                     if (track_io_timing)
>> +                             INSTR_TIME_SET_CURRENT(io_start);
>> +
>>                       issue_xlog_fsync(openLogFile, openLogSegNo);
>> +
>> +                     /* calculate the total time spent for wal sync */
>> +                     if (track_io_timing)
>> +                     {
>> +                             INSTR_TIME_SET_CURRENT(io_time);
>> +                             INSTR_TIME_SUBTRACT(io_time, io_start);
>> +
>> +                             if (is_background_process)
>> +                                     XLogCtl->stats.total_sync_time += INSTR_TIME_GET_MILLISEC(io_time);
>> +                             else
>> +                                     XLogCtl->stats.backend_total_sync_time += INSTR_TIME_GET_MILLISEC(io_time);
>> +                     }
>> +                     else
>> +                     {
>> +                             XLogCtl->stats.total_sync_time = 0;
>> +                             XLogCtl->stats.backend_total_sync_time = 0;
>> +                     }
>>               }
>
> I'm *very* doubtful about this, but even if we do it, this needs careful
> benchmarking.
>
>>               /* signal that we need to wakeup walsenders later */
>> @@ -3043,6 +3157,9 @@ XLogBackgroundFlush(void)
>>       {
>>               XLogWrite(WriteRqst, flexible);
>>       }
>> +
>> +     /* Collect all the wal write shared counters into local, and report it to stats collector */
>> +     memcpy(&LocalWalWritesStats.stats, &XLogCtl->stats, sizeof(PgStat_WalWritesCounts));
>>       LWLockRelease(WALWriteLock);
>
> Hm. I think in a good number of workloads hti sis never reached, because
> we return early.
>
>
> I think this is an interesting feature, but I don't think it's ready,
> and it was submitted very late in the v10 release cycle. Therefore I
> think this should be moved to the next CF.

+1

I think that there is no consensus yet about what values should be exposed.
For example, with the patch, WAL writing activity by backends is reported
separately from that by the other processes. But I'm not sure if this grouping
is good. It seems better to report walwriter activity separately from
the others, for tuning of walwriter.

Regards,

-- 
Fujii Masao



Re: [HACKERS] pg_stat_wal_write statistics view

From
Peter Eisentraut
Date:
On 3/29/17 22:10, Haribabu Kommi wrote:
> Updated patch to use shared counter instead of adding pg_stat_ calls to send
> the statistics from each background process/worker.

Your patch needs to be rebased and the OID assignments updated.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] pg_stat_wal_write statistics view

From
Haribabu Kommi
Date:


On Tue, Aug 15, 2017 at 7:39 AM, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:
On 3/29/17 22:10, Haribabu Kommi wrote:
> Updated patch to use shared counter instead of adding pg_stat_ calls to send
> the statistics from each background process/worker.

Your patch needs to be rebased and the OID assignments updated.

Rebased patch is attached. And also it has taken care of the comments provided
by Andres and Fuji Masao in the upthread.

I separated the walwriter statistics from the other background processes to check
how much load does the other processes are really contributing to wal writing.

Following is the test results during my performance test.

 writes | walwriter_writes | backend_writes | dirty_writes | walwriter_dirty_writes | backend_dirty_writes | write_blocks | walwriter_write_blocks | backend_write_blocks | write_time | walwriter_write_time | backend_write_time | sync_time | walwriter_sync_time | backend_sync_time |          stats_reset          
--------+------------------+----------------+--------------+------------------------+----------------------+--------------+------------------------+----------------------+------------+----------------------+--------------------+-----------+---------------------+-------------------+-------------------------------
      0 |         17748394 |     1383789657 |            0 |                      0 |                    0 |            0 |               21153194 |           3039806652 |          0 |                    0 |                  0 |         0 |           259250230 |       17262560725 | 2017-09-05 18:22:41.087955+10
(1 row)

I didn't find any other background processes contribution to the WAL writing activity.
May be we can combine them with backend stats?

I ran the performance test on this patch with pgbench to find out the impact of these
changes. Because of my low end machine, the performance results are varying too
much, I will try to get the performance figures from an high end machine by that time.

Attached the latest patch and performance report.

Regards,
Hari Babu
Fujitsu Australia
Attachment

Re: [HACKERS] pg_stat_wal_write statistics view

From
Kuntal Ghosh
Date:
On Wed, Sep 6, 2017 at 9:16 AM, Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
>
> Attached the latest patch and performance report.
>
While looking into the patch, I realized that a normal backend has to
check almost 10 if conditions at worst case inside XLogWrite(6 in
am_background_process method, 1 for write, 1 for blocks, 2 for
fsyncs), just to decide whether it is a normal backend or not. The
count is more for walwriter process. Although I've not done any
performance tests, IMHO, it might add further overhead on the
XLogWrite Lock.

I was thinking whether it is possible to collect all the stats in
XLogWrite() irrespective of the type of backend and update the shared
counters at once at the end of the function. Thoughts?


-- 
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_stat_wal_write statistics view

From
Haribabu Kommi
Date:


On Tue, Sep 12, 2017 at 2:04 AM, Kuntal Ghosh <kuntalghosh.2007@gmail.com> wrote:
On Wed, Sep 6, 2017 at 9:16 AM, Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
>
> Attached the latest patch and performance report.
>
While looking into the patch, I realized that a normal backend has to
check almost 10 if conditions at worst case inside XLogWrite(6 in
am_background_process method, 1 for write, 1 for blocks, 2 for
fsyncs), just to decide whether it is a normal backend or not. The
count is more for walwriter process. Although I've not done any
performance tests, IMHO, it might add further overhead on the
XLogWrite Lock.

I was thinking whether it is possible to collect all the stats in
XLogWrite() irrespective of the type of backend and update the shared
counters at once at the end of the function. Thoughts?

Thanks for the review.
Yes, I agree with you that the stats update can be done at the end
of the function to avoid some overhead. Updated patch is attached.

There is an overhead with IO time calculation. Is the view is good 
enough without IO columns?

And also during my tests, I didn't observe any other background 
processes performing the xlogwrite operation, the values are always
zero. Is it fine to merge them with backend columns?

Regards,
Hari Babu
Fujitsu Australia
Attachment

Re: [HACKERS] pg_stat_wal_write statistics view

From
Kuntal Ghosh
Date:
On Tue, Sep 12, 2017 at 9:06 AM, Haribabu Kommi
<kommi.haribabu@gmail.com> wrote:
>
>
> On Tue, Sep 12, 2017 at 2:04 AM, Kuntal Ghosh <kuntalghosh.2007@gmail.com>
> wrote:
>>
>> On Wed, Sep 6, 2017 at 9:16 AM, Haribabu Kommi <kommi.haribabu@gmail.com>
>> wrote:
>> >
>> > Attached the latest patch and performance report.
>> >
>> While looking into the patch, I realized that a normal backend has to
>> check almost 10 if conditions at worst case inside XLogWrite(6 in
>> am_background_process method, 1 for write, 1 for blocks, 2 for
>> fsyncs), just to decide whether it is a normal backend or not. The
>> count is more for walwriter process. Although I've not done any
>> performance tests, IMHO, it might add further overhead on the
>> XLogWrite Lock.
>>
>> I was thinking whether it is possible to collect all the stats in
>> XLogWrite() irrespective of the type of backend and update the shared
>> counters at once at the end of the function. Thoughts?
>
>
> Thanks for the review.
> Yes, I agree with you that the stats update can be done at the end
> of the function to avoid some overhead. Updated patch is attached.
>
Thanks for the patch.
+ * Check whether the current process is a normal backend or not.
+ * This function checks for the background processes that does
+ * some WAL write activity only and other background processes
+ * are not considered. It considers all the background workers
+ * as WAL write activity workers.
+ *
+ * Returns false - when the current process is a normal backend
+ *        true - when the current process a background process/worker
+ */
+static bool
+am_background_process()
+{
+   /* check whether current process is a background process/worker? */
+   if (!AmBackgroundWriterProcess() ||
+       !AmCheckpointerProcess() ||
+       !AmStartupProcess() ||
+       !IsBackgroundWorker ||
+       !am_walsender ||
+       !am_autovacuum_worker)
+   {
+       return false;
+   }
+
+   return true;
+}
I think you've to do AND operation here instead of OR. Isn't it?
Another point is that, the function description could start with
'Check whether the current process is a background process/worker.'

> There is an overhead with IO time calculation. Is the view is good
> enough without IO columns?
I'm not sure how IO columns are useful for tuning the WAL write/fsync
performance from an user's perspective. But, it's definitely useful
for developing/improving stuffs in XLogWrite.

>
> And also during my tests, I didn't observe any other background
> processes performing the xlogwrite operation, the values are always
> zero. Is it fine to merge them with backend columns?
>
Apart from wal writer process, I don't see any reason why you should
track other background processes separately from normal backends.
However, I may be missing some important point.



-- 
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_stat_wal_write statistics view

From
Haribabu Kommi
Date:


On Tue, Sep 12, 2017 at 3:14 PM, Kuntal Ghosh <kuntalghosh.2007@gmail.com> wrote:
On Tue, Sep 12, 2017 at 9:06 AM, Haribabu Kommi
<kommi.haribabu@gmail.com> wrote:
>
>
> On Tue, Sep 12, 2017 at 2:04 AM, Kuntal Ghosh <kuntalghosh.2007@gmail.com>
> wrote:
>
Thanks for the patch.
+ * Check whether the current process is a normal backend or not.
+ * This function checks for the background processes that does
+ * some WAL write activity only and other background processes
+ * are not considered. It considers all the background workers
+ * as WAL write activity workers.
+ *
+ * Returns false - when the current process is a normal backend
+ *        true - when the current process a background process/worker
+ */
+static bool
+am_background_process()
+{
+   /* check whether current process is a background process/worker? */
+   if (!AmBackgroundWriterProcess() ||
+       !AmCheckpointerProcess() ||
+       !AmStartupProcess() ||
+       !IsBackgroundWorker ||
+       !am_walsender ||
+       !am_autovacuum_worker)
+   {
+       return false;
+   }
+
+   return true;
+}
I think you've to do AND operation here instead of OR. Isn't it?
Another point is that, the function description could start with
'Check whether the current process is a background process/worker.'

Yes, it should be AND, while correcting a review comment, I messed
up that function.

> There is an overhead with IO time calculation. Is the view is good
> enough without IO columns?
I'm not sure how IO columns are useful for tuning the WAL write/fsync
performance from an user's perspective. But, it's definitely useful
for developing/improving stuffs in XLogWrite.

I ran the latest performance tests with and without IO times, there is an
overhead involved with IO time calculation and didn't observe any performance
overhead with normal stats. May be we can enable the IO stats only in the
development environment to find out the IO stats? 
 
>
> And also during my tests, I didn't observe any other background
> processes performing the xlogwrite operation, the values are always
> zero. Is it fine to merge them with backend columns?
>
Apart from wal writer process, I don't see any reason why you should
track other background processes separately from normal backends.
However, I may be missing some important point.

I added the other background stats to find out how much WAL write is
carried out by the other background processes. Now I am able to collect
the stats for the other background processes also after the pgbench test.
So I feel now the separate background stats may be useful.

Attached latest patch, performance test results and stats details with
separate background stats and also combine them with backend including
the IO stats also.


Regards,
Hari Babu
Fujitsu Australia
Attachment

Re: [HACKERS] pg_stat_wal_write statistics view

From
Julien Rouhaud
Date:
Hello,

On Wed, Sep 13, 2017 at 3:01 AM, Haribabu Kommi
<kommi.haribabu@gmail.com> wrote:
> I ran the latest performance tests with and without IO times, there is an
> overhead involved with IO time calculation and didn't observe any
> performance
> overhead with normal stats. May be we can enable the IO stats only in the
> development environment to find out the IO stats?
>

-1 for me to have these columns depending on a GUC *and* wether it's a
debug or assert build (unless this behaviour already exists for other
functions, but AFAIK that's not the case).

> I added the other background stats to find out how much WAL write is
> carried out by the other background processes. Now I am able to collect
> the stats for the other background processes also after the pgbench test.
> So I feel now the separate background stats may be useful.
>
> Attached latest patch, performance test results and stats details with
> separate background stats and also combine them with backend including
> the IO stats also.
>

The results seem to vary too much to have a strong opinion, but it
looks like the timing instrumentation can be too expensive, so I'd be
-1 on adding this overhead to track_io_timing.

I have some minor comments on the last patch:

+    <row>
+      <entry><structfield>backend_writes</></entry>
+      <entry><type>bigint</type></entry>
+      <entry>Number of WAL writes that are carried out by the backend
process</entry>

it should be backend processes

+#define NUM_PG_STAT_WALWRITE_ATTS 16
+
+Datum
+pg_stat_get_walwrites(PG_FUNCTION_ARGS)
+{
+   TupleDesc   tupdesc;
+   Datum       values[NUM_PG_STAT_WALWRITE_ATTS];
+   bool        nulls[NUM_PG_STAT_WALWRITE_ATTS];

For consistency, the #define should be just before the tupdesc
declaration, and be named PG_STAT_GET_WALWRITE_COLS


+   PgStat_Counter backend_writes;      /* No of writes by backend */

+   PgStat_Counter backend_dirty_writes;        /* No of dirty writes by
+                                                * backend when WAL buffers
+                                                * full */

+   PgStat_Counter backend_write_blocks;    /* Total no of pages
written by backend */

+   PgStat_Counter backend_total_write_time;    /* Total write time in
+                                                * milliseconds by backend */

+   PgStat_Counter backend_total_sync_time;     /* Total write time in
+                                                * milliseconds by backend */

these comments should all be say "backends" for consistency

+-- There will surely and maximum one record
+select count(*) > 0 as ok from pg_stat_walwrites;

The test should be count(*) = 1


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_stat_wal_write statistics view

From
Haribabu Kommi
Date:


On Fri, Sep 22, 2017 at 5:46 PM, Julien Rouhaud <rjuju123@gmail.com> wrote:
Hello,

On Wed, Sep 13, 2017 at 3:01 AM, Haribabu Kommi
<kommi.haribabu@gmail.com> wrote:
> I ran the latest performance tests with and without IO times, there is an
> overhead involved with IO time calculation and didn't observe any
> performance
> overhead with normal stats. May be we can enable the IO stats only in the
> development environment to find out the IO stats?
>

-1 for me to have these columns depending on a GUC *and* wether it's a
debug or assert build (unless this behaviour already exists for other
functions, but AFAIK that's not the case).

> I added the other background stats to find out how much WAL write is
> carried out by the other background processes. Now I am able to collect
> the stats for the other background processes also after the pgbench test.
> So I feel now the separate background stats may be useful.
>
> Attached latest patch, performance test results and stats details with
> separate background stats and also combine them with backend including
> the IO stats also.
>

The results seem to vary too much to have a strong opinion, but it
looks like the timing instrumentation can be too expensive, so I'd be
-1 on adding this overhead to track_io_timing.

Thanks for the review.
I removed the time related columns from the view. As these columns are adding
an overhead and GUC controlled behavior is different to the other views.

Apart from above time columns, I removed walwriter_dirty_writes, as the
walwriter writers cannot be treated as dirty writes. 

I have some minor comments on the last patch:

+    <row>
+      <entry><structfield>backend_writes</></entry>
+      <entry><type>bigint</type></entry>
+      <entry>Number of WAL writes that are carried out by the backend
process</entry>

it should be backend processes

Changed.
 
+#define NUM_PG_STAT_WALWRITE_ATTS 16
+
+Datum
+pg_stat_get_walwrites(PG_FUNCTION_ARGS)
+{
+   TupleDesc   tupdesc;
+   Datum       values[NUM_PG_STAT_WALWRITE_ATTS];
+   bool        nulls[NUM_PG_STAT_WALWRITE_ATTS];

For consistency, the #define should be just before the tupdesc
declaration, and be named PG_STAT_GET_WALWRITE_COLS
 
Changed.


+   PgStat_Counter backend_writes;      /* No of writes by backend */

+   PgStat_Counter backend_dirty_writes;        /* No of dirty writes by
+                                                * backend when WAL buffers
+                                                * full */

+   PgStat_Counter backend_write_blocks;    /* Total no of pages
written by backend */

these comments should all be say "backends" for consistency

Changed.
 
+-- There will surely and maximum one record
+select count(*) > 0 as ok from pg_stat_walwrites;

The test should be count(*) = 1

Changed.

Updated patch attached.

Regards,
Hari Babu
Fujitsu Australia
Attachment

Re: [HACKERS] pg_stat_wal_write statistics view

From
Haribabu Kommi
Date:


On Wed, Sep 27, 2017 at 6:58 PM, Haribabu Kommi <kommi.haribabu@gmail.com> wrote:

Updated patch attached.

Patch rebased.

Regards,
Hari Babu
Fujitsu Australia
Attachment

Re: [HACKERS] pg_stat_wal_write statistics view

From
Robert Haas
Date:
On Tue, Nov 7, 2017 at 4:31 AM, Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
>> Updated patch attached.
> Patch rebased.

I think the earlier concerns about the performance impact of this are
probably very valid concerns, and I don't see how the new version of
the patch gets us much closer to solving them.

I am also not sure I understand how the backend_write_blocks column is
intended to work.  The only call to pgstat_send_walwrites() is in
WalWriterMain, so where do the other backends report anything?

Also, if there's only ever one global set of counters (as opposed to
one per table, say) then why use the stats collector machinery for
this at all, vs. having a structure in shared memory that can be
updated directly?  It seems like adding a lot of overhead for no
functional benefit.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_stat_wal_write statistics view

From
Haribabu Kommi
Date:
On Wed, Nov 8, 2017 at 8:46 AM, Robert Haas wrote: > On Tue, Nov 7, 2017 at 4:31 AM, Haribabu Kommi > wrote: > >> Updated patch attached. > > Patch rebased. > > I think the earlier concerns about the performance impact of this are > probably very valid concerns, and I don't see how the new version of > the patch gets us much closer to solving them. > I will check the performance with the changes of removing the stats collector usage and provide the details. > I am also not sure I understand how the backend_write_blocks column is > intended to work. The only call to pgstat_send_walwrites() is in > WalWriterMain, so where do the other backends report anything? > With the current patch, All the backends update the stats in shared memory structure and only WAL writer process gathers the stats and share with the stats collector. > Also, if there's only ever one global set of counters (as opposed to > one per table, say) then why use the stats collector machinery for > this at all, vs. having a structure in shared memory that can be > updated directly? It seems like adding a lot of overhead for no > functional benefit. > Yes, I agree that using stats collector for these stats is an overhead. I change the patch to use just the shared memory structure and gather the performance results and post it to the next commitfest. Currently I marked the patch as "returned with feedback" in the ongoing commitfest. Regards, Hari Babu Fujitsu Australia

Re: [HACKERS] pg_stat_wal_write statistics view

From
Haribabu Kommi
Date:
On Mon, Nov 27, 2017 at 2:12 PM, Haribabu Kommi <kommi.haribabu@gmail.com> wrote:

On Wed, Nov 8, 2017 at 8:46 AM, Robert Haas <robertmhaas@gmail.com> wrote:
On Tue, Nov 7, 2017 at 4:31 AM, Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
>> Updated patch attached.
> Patch rebased.

I think the earlier concerns about the performance impact of this are
probably very valid concerns, and I don't see how the new version of
the patch gets us much closer to solving them.

I will check the performance with the changes of removing the stats collector
usage and provide the details.

I ran the pgbench performance after removing stats collector usage and moving
the stats into shared memory. I find it that this stats addition doesn't have any
overhead while collecting the stats. I ran these tests on a small machine, these may
need to be retested on an high end machine to confirm that there is no impact.

Threads Clients HEAD PATCH Diff
1 1 88.72981533 88.527789 -0.23%
2 2 88.17999           88.823842 0.73%
4 4 169.430813 167.897937 -0.90%
8 8 311.790313 315.6659117 1.24%
16 16 558.6014777 562.5883583 0.71%
32 32 874.0996587 899.855634 2.95%

Attached is the updated patch accordingly to use the shared memory stats.

Regards,
Hari Babu
Fujitsu Australia
Attachment

Re: [HACKERS] pg_stat_wal_write statistics view

From
Robert Haas
Date:
On Tue, Dec 12, 2017 at 4:18 AM, Haribabu Kommi
<kommi.haribabu@gmail.com> wrote:
> I ran the pgbench performance after removing stats collector usage and
> moving
> the stats into shared memory. I find it that this stats addition doesn't
> have any
> overhead while collecting the stats. I ran these tests on a small machine,
> these may
> need to be retested on an high end machine to confirm that there is no
> impact.
>
> Threads Clients HEAD PATCH Diff
> 1 1 88.72981533 88.527789 -0.23%
> 2 2 88.17999           88.823842 0.73%
> 4 4 169.430813 167.897937 -0.90%
> 8 8 311.790313 315.6659117 1.24%
> 16 16 558.6014777 562.5883583 0.71%
> 32 32 874.0996587 899.855634 2.95%
>
> Attached is the updated patch accordingly to use the shared memory stats.

I think the concerns have more to do with systems that have a huge
number of tables.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company