Re: [Proposal] More Vacuum Statistics - Mailing list pgsql-hackers

From Naoya Anzai
Subject Re: [Proposal] More Vacuum Statistics
Date
Msg-id 116262CF971C844FB6E793F8809B51C6E95F3A@BPXM02GP.gisp.nec.co.jp
Whole thread Raw
In response to Re: [Proposal] More Vacuum Statistics  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [Proposal] More Vacuum Statistics  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List pgsql-hackers
Thank you for quick feedback, and I'm sorry for slow response.
All of your opinions were very helpful for me.

I have confirmed Greg's Idea "Timing events".
http://www.postgresql.org/message-id/509300F7.5000803@2ndQuadrant.com

Greg said at first,
"Parsing log files for commonly needed performance data is no fun."
Yes, I completely agree with him.

That looks a nice idea but I don't know why this idea has
not been commited yet. Anybody knows?

I have reworked my idea since I heard dear hacker's opinions.

====================
pg_stat_vacuum view
====================

I understand it is not good to simply add more counters in
pg_stat_*_tables. For now, I'd like to suggest an extension
which can confirm vacuum statistics like pg_stat_statements.

VACUUM is a most important feature in PostgreSQL, but a
special view for vacuum does not exist. Don't you think
the fact is inconvenience? At least, I am disgruntled with
that we need to parse pg_log for tune VACUUM.

My first design of pg_stat_vacuum view is following.
(There are two views.)

pg_stat_vacuum_table
---------------
dbid
schemaname
relid
relname
elapsed
page_removed
page_remain
page_skipped
tuple_removed
tuple_remain
tuple_notremovable
buffer_hit
buffer_miss
buffer_dirty
avg_read
avg_write
vm_count
vac_start
vac_end
is_autovacuum

pg_stat_vacuum_index
---------------
dbid
shemaname
relid
indexrelid
indexname
elapsed
num_index_tuples
num_pages
tuples_removed
pages_deleted
pages_free
is_autovacuum

At present, I think memory design of pg_stat_statements can
divert into this feature.And I think this module needs to
prepare following parameters like pg_stat_statements.

pg_stat_vacuum.max(integer)
pg_stat_vacuum.save(boolean)
pg_stat_vacuum.excluded_dbnames(text)
pg_stat_vacuum.excluded_schemas(text)
pg_stat_vacuum.min_duration(integer)
... and so on.

To implement this feature, I have to collect each vacuum-stats
every lazy_vacuum_* and I need to embed a hook function point
where needed. (probably last point of lazy_vacuum_rel).
Do you hesitate to add the hook only for this function?

Similar feature has been already provided by pg_statsinfo package.
But it is a full-stack package for PG-stats and it needs to
redesign pg_log and design a repository database for introduce.
And it is not a core-extension for PostgreSQL.
(I don't intend to hate pg_statsinfo, I think this package is a very convinient tool)

Everyone will be able to do more easily tuning of VACUUM.
That's all I want.

Any comments are welcome!

Best Regards,

Naoya Anzai

---
Naoya Anzai
Engineering Department
NEC Solution Inovetors, Ltd.
E-Mail: nao-anzai@xc.jp.nec.com
---



pgsql-hackers by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: error message diff with Perl 5.22.0
Next
From: "Joshua D. Drake"
Date:
Subject: Re: [CORE] Restore-reliability mode