Thread: Add statistics refresh materialized view
Hi. This is a proposal for a new feature in statistics collector. I think we need to add statistics about refresh matview to pg_stat_all_tables view. When the "REFRESH MATERIALIZED VIEW" was executed, the number of times it was executed and date it took were not recorded anywhere. "pg_stat_statements" can be used to get the number of executions and the date and time of execution, but this information is statement-based, not view-based. Also, that method requires the high cost of "pg_stat_statements". This patch will add statistics(count, last time) about "REFRESH MATERIALIZED VIEW" to pg_stat_all_tables(pg_stat_user_tables, [pg_stat_sys_tables]). What do you think? Regards, Seino Yuki
Attachment
On 2021/07/09 1:39, Seino Yuki wrote: > Hi. > > This is a proposal for a new feature in statistics collector. > I think we need to add statistics about refresh matview to pg_stat_all_tables view. Why do you want to treat only REFRESH MATERIALIZED VIEW command special? What about other utility commands like TRUNCATE, CLUSTER, etc? It's not good design to add new columns per utility command into pg_stat_all_tables. Otherwise pg_stat_all_tables will have to have lots of columns to expose the stats of many utility commands at last. Which is ugly and very user-unfriendly. Most entries in pg_stat_all_tables are basically for tables. So the columns about REFRESH MATERIALIZED VIEW are useless for those most entries. This is another reason why I think the design is not good. > > When the "REFRESH MATERIALIZED VIEW" was executed, the number of times it was executed > and date it took were not recorded anywhere. pg_stat_statements and log_statement would help? > > "pg_stat_statements" can be used to get the number of executions and the date and time of execution, > but this information is statement-based, not view-based. pg_stat_statements reports different records for REFRESH MATERIALIZED VIEW commands on different views. So ISTM that we can aggregate the information per view, from pg_stat_statements. No? Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
On 2021-09-01 23:15, Fujii Masao wrote: > Why do you want to treat only REFRESH MATERIALIZED VIEW command > special? > What about other utility commands like TRUNCATE, CLUSTER, etc? First of all, knowing the update date and time of the MATVIEW is essential for actual operation. Without that information, users will not be able to trust the MATVIEW. In terms of the reliability of the information in the table, I think the priority of the REFRESHED MATVIEW is higher than that of TRUNCATE and CLUSTER. > It's not good design to add new columns per utility command into > pg_stat_all_tables. Otherwise pg_stat_all_tables will have to have lots > of > columns to expose the stats of many utility commands at last. Which is > ugly and very user-unfriendly. > Most entries in pg_stat_all_tables are basically for tables. So the > columns > about REFRESH MATERIALIZED VIEW are useless for those most entries. > This is another reason why I think the design is not good. I agree with this opinion. Initially, I thought about storing this information in pg_matviews, but decided against it because of the overhead of adding it to the system catalog. > pg_stat_statements reports different records for REFRESH MATERIALIZED > VIEW > commands on different views. So ISTM that we can aggregate the > information > per view, from pg_stat_statements. No? I made this suggestion based on the premise that the last update date and time of the Mateview should always be retained. I think the same concept applies to Oracle Database. https://docs.oracle.com/cd/F19136_01/refrn/ALL_MVIEWS.html#GUID-8B9432B5-6B66-411A-936E-590D9D7671E9 I thought it would be useless to enable pg_stat_statements and log_statement to see this information. However, as you said, for most use cases, pg_stat_statements and log_statement may be sufficient. I would like to withdraw this proposal. Regards,
On Tue, Sep 07, 2021 at 06:11:14PM +0900, Seino Yuki wrote: > I would like to withdraw this proposal. This was registered in the CF, so marked as RwF. -- Michael
Attachment
Hi, > However, as you said, for most use cases, pg_stat_statements and > log_statement may be sufficient. > I would like to withdraw this proposal. > Well, they either require extensions or parameters to be set properly. One advantage I see to store those kind of information is that it can be queried by application developers (users are reporting old data for example). We currently have to rely on other ways to figure out if materialized views were properly refreshed.