Re: New statistics for WAL buffer dirty writes - Mailing list pgsql-hackers

From Satoshi Nagayasu
Subject Re: New statistics for WAL buffer dirty writes
Date
Msg-id 5061F7E1.7040100@uptime.jp
In response to Re: New statistics for WAL buffer dirty writes  (Jeff Janes <jeff.janes@gmail.com>)
Responses Re: New statistics for WAL buffer dirty writes
List pgsql-hackers
Hi,

2012/08/12 7:11, Jeff Janes wrote:
> On Sat, Jul 28, 2012 at 3:33 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
>> On Sat, Jul 7, 2012 at 9:17 PM, Satoshi Nagayasu <snaga@uptime.jp> wrote:
>>> Hi,
>>>
>>> Jeff Janes has pointed out that my previous patch could hold
>>> a number of the dirty writes only in single local backend, and
>>> it could not hold all over the cluster, because the counter
>>> was allocated in the local process memory.
>>>
>>> That's true, and I have fixed it with moving the counter into
>>> the shared memory, as a member of XLogCtlWrite, to keep total
>>> dirty writes in the cluster.
>>
>
> ...
>
>> The comment "XLogCtrlWrite must be protected with WALWriteLock"
>> mis-spells XLogCtlWrite.
>>
>> The final patch will need to add a sections to the documentation.
>
> Thanks to Robert and Tom for addressing my concerns about the pointer
> volatility.
>
> I think there is enough consensus that this is useful without adding
> more things to it, like histograms or high water marks.
>
> However, I do think we will want to add a way to query for the time of
> the last reset, as other monitoring features are going that way.
>
> Is it OK that the count is reset upon a server restart?
> pg_stat_bgwriter, for example, does not do that.  Unfortunately I
> think fixing this in an acceptable way will be harder than the entire
> rest of the patch was.
>
>
> The coding looks OK to me, it applies and builds, and passes make
> check, and does what it says.  I didn't do performance testing, as it
> is hard to believe it would have a meaningful effect.
>
> I'll marked it as waiting on author, for the documentation and reset
> time.  I'd ask a more senior hacker to comment on the durability over
> restarts.

I have rewritten the patch to handle the dirty write statistics
through the pgstat collector, as bgwriter does.
Yeah, it's a somewhat bigger rewrite.

With this patch, the walwriter process and each backend process
sum up their own dirty writes and send the count to the stats
collector. The value is therefore saved in the stats file and
preserved across restarts.
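
A minimal sketch of that flow, not the code from the patch: each process
keeps a pending count, packs it into a message, and the collector adds it
to a running total. All names here (DirtyWriteMsg, CollectorStats,
count_dirty_write, flush_dirty_writes, collector_receive) are hypothetical
and only illustrate the accumulate-then-report pattern.

```c
#include <stdint.h>

/* Hypothetical message carrying one process's delta to the collector. */
typedef struct DirtyWriteMsg
{
    uint64_t dirty_writes;
} DirtyWriteMsg;

/* Hypothetical collector-side totals; a real collector would also
 * write these to the stats file so they survive a restart. */
typedef struct CollectorStats
{
    uint64_t xlog_dirty_writes;
} CollectorStats;

/* Per-process pending count (process-local, not shared memory). */
static uint64_t pending_dirty_writes = 0;

/* Called wherever a dirty WAL buffer write is detected. */
void
count_dirty_write(void)
{
    pending_dirty_writes++;
}

/* Fill msg with the pending delta and clear the local count.
 * Returns 1 if there is something to send, 0 otherwise. */
int
flush_dirty_writes(DirtyWriteMsg *msg)
{
    if (pending_dirty_writes == 0)
        return 0;
    msg->dirty_writes = pending_dirty_writes;
    pending_dirty_writes = 0;
    return 1;
}

/* Collector side: add the reported delta to the cluster-wide total. */
void
collector_receive(CollectorStats *stats, const DirtyWriteMsg *msg)
{
    stats->xlog_dirty_writes += msg->dirty_writes;
}
```

Sending deltas rather than absolute values means the collector only has
to add them up, whichever process they come from.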

The statistics can be retrieved using the
pg_stat_get_xlog_dirty_writes() function, and can be reset
by calling pg_stat_reset_shared('walwriter').

Now, I have one concern.

The reset time can be captured in globalStats.stat_reset_timestamp,
but this value is shared with bgwriter.

So, once pg_stat_reset_shared('walwriter') is called, the
stats_reset column in pg_stat_bgwriter shows the reset time
for walwriter, not for bgwriter.
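
To illustrate the concern, here is a small sketch, assuming a simplified
stand-in for the global stats struct with a single reset timestamp. The
struct layout, the counter names, and the reset_shared() helper are
hypothetical; only the idea of one timestamp shared by two reset targets
comes from the description above.

```c
#include <stdint.h>
#include <string.h>

/* Simplified, hypothetical stand-in for the global stats struct. */
typedef struct GlobalStats
{
    uint64_t buf_written_clean;    /* stand-in for bgwriter counters */
    uint64_t xlog_dirty_writes;    /* stand-in for walwriter counters */
    int64_t  stat_reset_timestamp; /* one timestamp for both targets */
} GlobalStats;

/* Reset the counters of one target and record the reset time.
 * Unknown targets are ignored. */
void
reset_shared(GlobalStats *g, const char *target, int64_t now)
{
    if (strcmp(target, "bgwriter") == 0)
        g->buf_written_clean = 0;
    else if (strcmp(target, "walwriter") == 0)
        g->xlog_dirty_writes = 0;
    else
        return;

    /* Both targets overwrite the same field, so a walwriter reset
     * also changes the time that a bgwriter view would report. */
    g->stat_reset_timestamp = now;
}
```

Splitting the value would mean one timestamp field per target, so that
each reset updates only its own field.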

How should we handle this?  Should we split this value?
And should we have a new system view for walwriter?

Of course, I will work on documentation next.

Regards,

>
> Cheers,
>
> Jeff
>
--
Satoshi Nagayasu <snaga@uptime.jp>
Uptime Technologies, LLC. http://www.uptime.jp
