Re: [Proposal] More Vacuum Statistics - Mailing list pgsql-hackers

From Naoya Anzai
Subject Re: [Proposal] More Vacuum Statistics
Date
Msg-id 116262CF971C844FB6E793F8809B51C6EA6E21@BPXM02GP.gisp.nec.co.jp
Whole thread Raw
In response to Re: [Proposal] More Vacuum Statistics  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses Re: [Proposal] More Vacuum Statistics  ("Syed, Rahila" <Rahila.Syed@nttdata.com>)
List pgsql-hackers
Hi,

Thank you for comments. and Sorry for my late response.

>> ====================
>> pg_stat_vacuum view
>> ====================
>>
>> I understand it is not good to simply add more counters in
>> pg_stat_*_tables. For now, I'd like to suggest an extension
>> which can confirm vacuum statistics like pg_stat_statements.
>>
>> Similar feature has been already provided by pg_statsinfo package.
>> But it is a full-stack package for PG-stats and it needs to
>> redesign pg_log and design a repository database for introduce.
>> And it is not a core-extension for PostgreSQL.
>> (I don't intend to hate pg_statsinfo,
>>   I think this package is a very convinient tool)
>>
>> Everyone will be able to do more easily tuning of VACUUM.
>> That's all I want.
>
>I'm still wondering whether these stats will really make the tuning any
>easier. What I do right now is looking at pg_stat_all_tables.n_deat_tup
>and if it exceeds some threshold, it's a sign that vacuum may need a bit
>of tuning. Sometimes it really requires tuning vacuum itself, but more
>often than not it's due to something else (a large bulk delete,
>autovacuum getting stuck on another table, ...). I don't see how the new
>stats would make this any easier.
>
>Can you give some examples on how the new stats might be used (and where
>the current stats are insufficient)? What use cases do you imagine for
>those stats?

pg_stat_vacuum can keep histories of vacuum statistics for each
tables/indices into shared memory.(They are not only last vacuum.
This is already able to confirm using pg_stat_all_tables.) It makes
easier analysis of vacuum histories because this view can sort or
aggregate or filter.

My use cases for those stats are following.

- examine TRANSITION of vacuum execution time on any table(you can predict the future vacuum execution time)
- examine EXECUTION INTERVAL of vacuum for each table(if too frequent, it should make vacuum-threshold tuning to up)
- examine REST of dead-tuples just after vacuum(if dead-tuples remain, it may be due to any idle in transaction
sessions)

>
>It might help differentiate the autovacuum activity from the rest of the
>system (e.g. there's a lot of I/O going on - how much of that is coming
>from autovacuum workers?). This would however require a more
>fine-grained reporting, because often the vacuums run for a very long
>time, especially on very large tables (which is exactly the case when
>this might be handy) - I just had a VACUUM that ran for 12 hours. These
>jobs should report the stats incrementally, not just once at the very
>end, because that makes it rather useless IMNSHO.

+1

Certainly, VACUUM have often much execution time, I just had too.
At present, we cannot predict when this vacuum finishes, what this vacuum
is doing now, and whether this vacuum have any problem or not.

Maybe, For DBAs,
It might be better to show vacuum progress in pg_stat_activity.
(if we'd do, add a free-style column like "progress" ?)
This column might also be able to use for other long time commands
like ANALYZE, CREATE/RE INDEX and COPY. To realize this feature,
we certainly need to properly change pgstat_report_activity,
use it more and add a new track-activity parameter.

Regards,

Anzai Naoya
---
Naoya Anzai
Engineering Department
NEC Solution Inovetors, Ltd.
E-Mail: nao-anzai@xc.jp.nec.com
---




pgsql-hackers by date:

Previous
From: Xiaoyulei
Date:
Subject: does tuple store subtransaction id in it?
Next
From: Prakash Itnal
Date:
Subject: Re: Auto-vacuum is not running in 9.1.12