Thread: [PATCH] Add features to pg_stat_statements
This is a proposal to add some features to pg_stat_statements. Attached is the patch of this. pg_stat_statements uses a hash table to hold statistics, and the maximum number of its entries can be configured through pg_stat_statements.max. When the number of entries exceeds the pg_stat_statements.max, pg_stat_statements deallocate existing entries. Currently, it is impossible to know how many times/when this deallocation happened. But, with this information, more detailed performance analysis would be possible. So, this patch provides access to this information. This patch provides a view (pg_stat_statements_ctl) and a function (pg_stat_statements_ctl). The pg_stat_statements_ctl view is defined in terms of a function also named pg_stat_statements_ctl. The entry of pg_stat_statements_ctl view is removed when pg_stat_statements_reset(0,0,0) is called. Maybe, it is convenient to provide a function named pg_stat_statements_ctl_reset that removes only the entry of pg_stat_statements_ctl. The following is an example of this feature. Here, a deallocation is shown with the INFO message. pg_stat_statements_ctl tracks the number of deallocations and timestamp of last deallocation. testdb=# select * from pg_stat_statements_ctl; dealloc | last_dealloc ---------+------------------------------- 2 | 2020-09-18 16:55:31.128256+09 (1 row) testdb=# select count(*) from test1; 2020-09-18 16:59:20.745 JST [3920] INFO: deallocate 2020-09-18 16:59:20.745 JST [3920] STATEMENT: select count(*) from test1; INFO: deallocate count ------- 90 (1 row) testdb=# select * from pg_stat_statements_ctl; dealloc | last_dealloc ---------+------------------------------- 3 | 2020-09-18 16:59:20.745652+09 (1 row) Regards, Katsuragi Yuta
Attachment
Hi, On Fri, Sep 18, 2020 at 10:54 AM Katsuragi Yuta <btkatsuragiyu@oss.nttdata.com> wrote: > > This is a proposal to add some features to pg_stat_statements. > Attached is the patch of this. > > pg_stat_statements uses a hash table to hold statistics, > and the maximum number of its entries can be configured through > pg_stat_statements.max. > When the number of entries exceeds the pg_stat_statements.max, > pg_stat_statements deallocate existing entries. > > Currently, it is impossible to know how many times/when this > deallocation happened. > But, with this information, more detailed performance analysis would be > possible. > So, this patch provides access to this information. > > This patch provides a view (pg_stat_statements_ctl) and > a function (pg_stat_statements_ctl). > The pg_stat_statements_ctl view is defined > in terms of a function also named pg_stat_statements_ctl. > > The entry of pg_stat_statements_ctl view is removed > when pg_stat_statements_reset(0,0,0) is called. > Maybe, it is convenient to provide a function named > pg_stat_statements_ctl_reset > that removes only the entry of pg_stat_statements_ctl. > > The following is an example of this feature. > Here, a deallocation is shown with the INFO message. > pg_stat_statements_ctl tracks the number of deallocations > and timestamp of last deallocation. > > > testdb=# select * from pg_stat_statements_ctl; > dealloc | last_dealloc > ---------+------------------------------- > 2 | 2020-09-18 16:55:31.128256+09 > (1 row) > > testdb=# select count(*) from test1; > 2020-09-18 16:59:20.745 JST [3920] INFO: deallocate > 2020-09-18 16:59:20.745 JST [3920] STATEMENT: select count(*) from > test1; > INFO: deallocate > count > ------- > 90 > (1 row) > > testdb=# select * from pg_stat_statements_ctl; > dealloc | last_dealloc > ---------+------------------------------- > 3 | 2020-09-18 16:59:20.745652+09 > (1 row) I like it, this is especially important since this can lead to quite huge overhead. Did you consider also adding the cumulated number of evicted entries? This could be useful to know how to configure pg_stat_statements.max.
+1 ! An other way is to log evictions, it provides informations about time and amount : for (i = 0; i < nvictims; i++) { hash_search(pgssp_hash, &entries[i]->key, HASH_REMOVE, NULL); } pfree(entries); /* trace when evicting entries, if appening too often this can slow queries ... * increasing pg_stat_sql_plans.max value could help */ ereport(LOG, (errmsg("pg_stat_sql_plans evicting %d entries", nvictims), errhidecontext(true), errhidestmt(true))); -- Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html
On 2020-09-18 18:49, Julien Rouhaud wrote: > Did you consider also adding the cumulated number of > evicted entries? This could be useful to know how to configure > pg_stat_statements.max. Thank you for your comments! I overlooked the cumulated number of evicted entries. This statistic looks important. But, I am not sure if I should add this statistic to a view. This is because I am not sure how to utilize the cumulated number of evicted entries for configuring pg_stat_statements.max. Not only providing a view but also logging evictions along with the number of evicted entries might be a choice. This idea is from legrand legrand [1]. [1] https://www.postgresql.org/message-id/1600500942767-0.post%40n3.nabble.com Regards, Katsuragi Yuta
On Wed, Sep 23, 2020 at 2:48 PM Katsuragi Yuta <btkatsuragiyu@oss.nttdata.com> wrote: > > On 2020-09-18 18:49, Julien Rouhaud wrote: > > Did you consider also adding the cumulated number of > > evicted entries? This could be useful to know how to configure > > pg_stat_statements.max. > > Thank you for your comments! > I overlooked the cumulated number of evicted entries. > This statistic looks important. But, I am not sure > if I should add this statistic to a view. > This is because I am not sure how to utilize the cumulated > number of evicted entries for configuring pg_stat_statements.max. You're right, as the number of evicted entries isn't depending on the number of entries that wouls been needed to contain the entirely workload. > Not only providing a view but also logging evictions > along with the number of evicted entries might be a choice. > This idea is from legrand legrand [1]. +1. I'm wondering if logging each evicted entry, with its queryid, would help to estimate the actual size of the normalised queries set.
On 2020-09-19 16:35, legrand legrand wrote: > +1 ! > > An other way is to log evictions, it provides informations about time > and > amount : > > for (i = 0; i < nvictims; i++) > { > hash_search(pgssp_hash, &entries[i]->key, HASH_REMOVE, NULL); > } > > pfree(entries); > > /* trace when evicting entries, if appening too often this can slow > queries > ... > * increasing pg_stat_sql_plans.max value could help */ > ereport(LOG, > (errmsg("pg_stat_sql_plans evicting %d entries", nvictims), > errhidecontext(true), errhidestmt(true))); > > > > -- > Sent from: > https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html Thank you for your comments! In addition to providing a view that is like a summary of history, logging might be a good choice. Regards, Katsuragi Yuta
On 2020-09-23 16:01, Julien Rouhaud wrote: >> Not only providing a view but also logging evictions >> along with the number of evicted entries might be a choice. >> This idea is from legrand legrand [1]. > > +1. I'm wondering if logging each evicted entry, with its queryid, > would help to estimate the actual size of the normalised queries set. I agree with the estimation of the actual size of the query set is important. It looks difficult to estimate the actual size of the query set from queryid because queryid is a 64bit hash value. Regards, Katsuragi Yuta
Hi, Both are probably not needed. I would prefer it accessible in a view live Event | date | victims Eviction... Reset... Part reset ... As there are other needs to track reset times. Regards PAscal -- Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html
On 2020-09-24 14:15, legrand legrand wrote: > Hi, > Both are probably not needed. > I would prefer it accessible in a view live > > Event | date | victims > Eviction... > Reset... > Part reset ... > > As there are other needs to track reset times. Let me confirm one thing. Is the number of records of this view fixed to three? Or, will a new record be appended every time an event (Eviction or Reset or Part Reset) happens? Regards, Katsuragi Yuta
Not limited to 3, Like an history table. Will have to think if data is saved at shutdown. -- Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html
On 2020-09-24 18:55, legrand legrand wrote: > Not limited to 3, Like an history table. > > Will have to think if data is saved at shutdown. Thanks. This design might introduce some additional complexity and things to be considered. For example, the maximum size of "history table", how to evict entries from the history table and how to manage the information maintained by evicted entries. Also, providing a history table looks similar to logging. Providing the original view (# of dealloc and last_dealloc ts) and optional logging looks a simple and better way. Regards, Katsuragi Yuta
The following review has been posted through the commitfest application: make installcheck-world: tested, passed Implements feature: tested, passed Spec compliant: tested, passed Documentation: tested, passed The patch applies cleanly and looks fine to me. It's a small detail, However wouldn't it be better if the following changeswere made. 1.There are unnecessary difference lines 707 and 863 in "pg_stat_statements.c". 2.There is no comment on "slock_t mutex" in "struct pgssCtlCounter" in "pg_stat_statements.c". 3."update_ctl" and "reset_ctl" are generic and illegible name.You might want to rename it something like "pgss_ctl_update". 4.To improve the readability of the source, why not change the function declaration option in "pg_stat_statements_ctl" from "AS 'MODULE_PATHNAME'" to "AS 'MODULE_PATHNAME', 'pg_stat_statements_ctl'"? in "pg_stat_statements--1.8--1.9.sql". The new status of this patch is: Waiting on Author
On 2020/10/12 21:18, Yuki Seino wrote: > The following review has been posted through the commitfest application: > make installcheck-world: tested, passed > Implements feature: tested, passed > Spec compliant: tested, passed > Documentation: tested, passed > > The patch applies cleanly and looks fine to me. It's a small detail, However wouldn't it be better if the following changeswere made. > > 1.There are unnecessary difference lines 707 and 863 in "pg_stat_statements.c". > 2.There is no comment on "slock_t mutex" in "struct pgssCtlCounter" in "pg_stat_statements.c". > 3."update_ctl" and "reset_ctl" are generic and illegible name.You might want to rename it something like "pgss_ctl_update". > 4.To improve the readability of the source, why not change the function declaration option in "pg_stat_statements_ctl"from > "AS 'MODULE_PATHNAME'" to "AS 'MODULE_PATHNAME', 'pg_stat_statements_ctl'"? in "pg_stat_statements--1.8--1.9.sql". > > The new status of this patch is: Waiting on Author Here are other comments from me. -DATA = pg_stat_statements--1.4.sql \ +DATA = pg_stat_statements--1.4.sql pg_stat_statements--1.8--1.9.sql\ One space character needs to be added just before the character "\". +--- Define pg_stat_statements_ctl ISTM that this is not good name. What about pg_stat_statements_info, _stats, _activity or something? + OUT dealloc integer, The type of "dealloc" should be bigint, instead? + OUT last_dealloc TIMESTAMP WITH TIME ZONE Is this information really useful? If there is no valid use case for this, I'd like to drop it. Thought? +LANGUAGE C STRICT VOLATILE PARALLEL SAFE; There are two space characters just after "LANGUAGE". One space character should be removed from there. +CREATE VIEW pg_stat_statements_ctl AS + SELECT * FROM pg_stat_statements_ctl(); If we rename the function, this view name also should be changed. +GRANT SELECT ON pg_stat_statements TO PUBLIC; "pg_stat_statements" should be "pg_stat_statement_xxx"? Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
2020-10-22 01:31 に Fujii Masao さんは書きました: > On 2020/10/12 21:18, Yuki Seino wrote: >> The following review has been posted through the commitfest >> application: >> make installcheck-world: tested, passed >> Implements feature: tested, passed >> Spec compliant: tested, passed >> Documentation: tested, passed >> >> The patch applies cleanly and looks fine to me. It's a small detail, >> However wouldn't it be better if the following changes were made. >> >> 1.There are unnecessary difference lines 707 and 863 in >> "pg_stat_statements.c". >> 2.There is no comment on "slock_t mutex" in "struct pgssCtlCounter" in >> "pg_stat_statements.c". >> 3."update_ctl" and "reset_ctl" are generic and illegible name.You >> might want to rename it something like "pgss_ctl_update". >> 4.To improve the readability of the source, why not change the >> function declaration option in "pg_stat_statements_ctl" from >> "AS 'MODULE_PATHNAME'" to "AS 'MODULE_PATHNAME', >> 'pg_stat_statements_ctl'"? in "pg_stat_statements--1.8--1.9.sql". >> >> The new status of this patch is: Waiting on Author > > Here are other comments from me. > > -DATA = pg_stat_statements--1.4.sql \ > +DATA = pg_stat_statements--1.4.sql pg_stat_statements--1.8--1.9.sql\ > > One space character needs to be added just before the character "\". > > +--- Define pg_stat_statements_ctl > > ISTM that this is not good name. > What about pg_stat_statements_info, _stats, _activity or something? > > + OUT dealloc integer, > > The type of "dealloc" should be bigint, instead? > > + OUT last_dealloc TIMESTAMP WITH TIME ZONE > > Is this information really useful? > If there is no valid use case for this, I'd like to drop it. > Thought? > > +LANGUAGE C STRICT VOLATILE PARALLEL SAFE; > > There are two space characters just after "LANGUAGE". > One space character should be removed from there. > > +CREATE VIEW pg_stat_statements_ctl AS > + SELECT * FROM pg_stat_statements_ctl(); > > If we rename the function, this view name also should be changed. > > +GRANT SELECT ON pg_stat_statements TO PUBLIC; > > "pg_stat_statements" should be "pg_stat_statement_xxx"? > > Regards, Thanks for the comment, Fujii-san. I will post the patch again in the future to reflect your and my points. However, let me confirm the following. > Is this information really useful? > If there is no valid use case for this, I'd like to drop it. > Thought? I thought it would be easy for users to see at a glance that if there is a case I assumed, if the last modified date and time is old, there is no need to adjust at all, and if the last modified date and time is recent, it would be easy for users to understand that the parameters need to be adjusted. What do you think? Regards, Seino Yuki
On 2020/10/28 17:31, seinoyu wrote: > 2020-10-22 01:31 に Fujii Masao さんは書きました: >> On 2020/10/12 21:18, Yuki Seino wrote: >>> The following review has been posted through the commitfest application: >>> make installcheck-world: tested, passed >>> Implements feature: tested, passed >>> Spec compliant: tested, passed >>> Documentation: tested, passed >>> >>> The patch applies cleanly and looks fine to me. It's a small detail, However wouldn't it be better if the following changeswere made. >>> >>> 1.There are unnecessary difference lines 707 and 863 in "pg_stat_statements.c". >>> 2.There is no comment on "slock_t mutex" in "struct pgssCtlCounter" in "pg_stat_statements.c". >>> 3."update_ctl" and "reset_ctl" are generic and illegible name.You might want to rename it something like "pgss_ctl_update". >>> 4.To improve the readability of the source, why not change the function declaration option in "pg_stat_statements_ctl"from >>> "AS 'MODULE_PATHNAME'" to "AS 'MODULE_PATHNAME', 'pg_stat_statements_ctl'"? in "pg_stat_statements--1.8--1.9.sql". >>> >>> The new status of this patch is: Waiting on Author >> >> Here are other comments from me. >> >> -DATA = pg_stat_statements--1.4.sql \ >> +DATA = pg_stat_statements--1.4.sql pg_stat_statements--1.8--1.9.sql\ >> >> One space character needs to be added just before the character "\". >> >> +--- Define pg_stat_statements_ctl >> >> ISTM that this is not good name. >> What about pg_stat_statements_info, _stats, _activity or something? >> >> + OUT dealloc integer, >> >> The type of "dealloc" should be bigint, instead? >> >> + OUT last_dealloc TIMESTAMP WITH TIME ZONE >> >> Is this information really useful? >> If there is no valid use case for this, I'd like to drop it. >> Thought? >> >> +LANGUAGE C STRICT VOLATILE PARALLEL SAFE; >> >> There are two space characters just after "LANGUAGE". >> One space character should be removed from there. >> >> +CREATE VIEW pg_stat_statements_ctl AS >> + SELECT * FROM pg_stat_statements_ctl(); >> >> If we rename the function, this view name also should be changed. >> >> +GRANT SELECT ON pg_stat_statements TO PUBLIC; >> >> "pg_stat_statements" should be "pg_stat_statement_xxx"? >> >> Regards, > > > Thanks for the comment, Fujii-san. > > I will post the patch again in the future to reflect your and my points. Thanks! > > However, let me confirm the following. >> Is this information really useful? >> If there is no valid use case for this, I'd like to drop it. >> Thought? > > I thought it would be easy for users to see at a glance that if there is a case I assumed, > if the last modified date and time is old, there is no need to adjust at all, and if the > last modified date and time is recent, it would be easy for users to understand that the > parameters need to be adjusted. > What do you think? Even without the last deallocation timestamp, you can presume when the deallocation of entries happened by periodically monitoring the total number of deallocations and seeing those history. Or IMO it's better to log whenever the deallocation happens as proposed upthread. Then you can easily check the history of occurrences of deallocations from the log file. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
>> >> However, let me confirm the following. >>> Is this information really useful? >>> If there is no valid use case for this, I'd like to drop it. >>> Thought? >> >> I thought it would be easy for users to see at a glance that if there >> is a case I assumed, >> if the last modified date and time is old, there is no need to adjust >> at all, and if the >> last modified date and time is recent, it would be easy for users to >> understand that the >> parameters need to be adjusted. >> What do you think? > > Even without the last deallocation timestamp, you can presume > when the deallocation of entries happened by periodically monitoring > the total number of deallocations and seeing those history. Or IMO > it's better to log whenever the deallocation happens as proposed > upthread. > Then you can easily check the history of occurrences of deallocations > from the log file. > > Regards, +1.I think you should output the deallocation history to the log as well. In light of the above, I've posted a patch that reflects the points made. Regards,
Attachment
2020-11-09 15:39 に Seino Yuki さんは書きました: >>> >>> However, let me confirm the following. >>>> Is this information really useful? >>>> If there is no valid use case for this, I'd like to drop it. >>>> Thought? >>> >>> I thought it would be easy for users to see at a glance that if there >>> is a case I assumed, >>> if the last modified date and time is old, there is no need to adjust >>> at all, and if the >>> last modified date and time is recent, it would be easy for users to >>> understand that the >>> parameters need to be adjusted. >>> What do you think? >> >> Even without the last deallocation timestamp, you can presume >> when the deallocation of entries happened by periodically monitoring >> the total number of deallocations and seeing those history. Or IMO >> it's better to log whenever the deallocation happens as proposed >> upthread. >> Then you can easily check the history of occurrences of deallocations >> from the log file. >> >> Regards, > > +1.I think you should output the deallocation history to the log as > well. > > In light of the above, I've posted a patch that reflects the points > made. > > Regards, Sorry. There was a typo in the documentation correction. I'll post a patch to fix it. Regards,
Attachment
On 2020/11/09 18:02, Seino Yuki wrote: > 2020-11-09 15:39 に Seino Yuki さんは書きました: >>>> >>>> However, let me confirm the following. >>>>> Is this information really useful? >>>>> If there is no valid use case for this, I'd like to drop it. >>>>> Thought? >>>> >>>> I thought it would be easy for users to see at a glance that if there is a case I assumed, >>>> if the last modified date and time is old, there is no need to adjust at all, and if the >>>> last modified date and time is recent, it would be easy for users to understand that the >>>> parameters need to be adjusted. >>>> What do you think? >>> >>> Even without the last deallocation timestamp, you can presume >>> when the deallocation of entries happened by periodically monitoring >>> the total number of deallocations and seeing those history. Or IMO >>> it's better to log whenever the deallocation happens as proposed upthread. >>> Then you can easily check the history of occurrences of deallocations >>> from the log file. >>> >>> Regards, >> >> +1.I think you should output the deallocation history to the log as well. >> >> In light of the above, I've posted a patch that reflects the points made. >> >> Regards, > > Sorry. There was a typo in the documentation correction. > I'll post a patch to fix it. Thanks for updating the patch! + pgss_info->dealloc = 0; + SpinLockInit(&pgss_info->mutex); + Assert(pgss_info->dealloc == 0); Why is this assertion check necessary? It seems not necessary. + { + Assert(found == found_info); Having pgssSharedState and pgssInfoCounters separately might make the code a bit more complicated like the above? If this is true, what about including pgssInfoCounters in pgssSharedState? PGSS_FILE_HEADER needs to be changed since the patch changes the format of pgss file? + /* Read pgss_info */ + if (feof(file) == 0) + if (fread(pgss_info, sizeof(pgssInfoCounters), 1, file) != 1) + goto read_error; Why does feof(file) need to be called here? +pgss_info_update(void) +{ + { Why is the second "{" necessary? It seems redundant. +pgss_info_reset(void) +{ + { Same as above. +pg_stat_statements_info(PG_FUNCTION_ARGS) +{ + int64 d_count = 0; + { Same as above. + SpinLockAcquire(&c->mutex); + d_count = Int64GetDatum(c->dealloc); + SpinLockRelease(&c->mutex); Why does Int64GetDatum() need to be called here? It seems not necessary. + <varlistentry> + <term> + <function>pg_stat_statements_info() returns bigint</function> + <indexterm> + <primary>pg_stat_statements_info</primary> + </indexterm> + </term> Isn't it better not to expose pg_stat_statements_info() function in the document because pg_stat_statements_info view is enough and there seems no use case for the function? Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
> Thanks for updating the patch! > > + pgss_info->dealloc = 0; > + SpinLockInit(&pgss_info->mutex); > + Assert(pgss_info->dealloc == 0); > > Why is this assertion check necessary? It seems not necessary. > > + { > + Assert(found == found_info); > > Having pgssSharedState and pgssInfoCounters separately might make > the code a bit more complicated like the above? If this is true, what > about > including pgssInfoCounters in pgssSharedState? > > PGSS_FILE_HEADER needs to be changed since the patch changes > the format of pgss file? > > + /* Read pgss_info */ > + if (feof(file) == 0) > + if (fread(pgss_info, sizeof(pgssInfoCounters), 1, file) != 1) > + goto read_error; > > Why does feof(file) need to be called here? > > +pgss_info_update(void) > +{ > + { > > Why is the second "{" necessary? It seems redundant. > > +pgss_info_reset(void) > +{ > + { > > Same as above. > > +pg_stat_statements_info(PG_FUNCTION_ARGS) > +{ > + int64 d_count = 0; > + { > > Same as above. > > + SpinLockAcquire(&c->mutex); > + d_count = Int64GetDatum(c->dealloc); > + SpinLockRelease(&c->mutex); > > Why does Int64GetDatum() need to be called here? It seems not > necessary. > > + <varlistentry> > + <term> > + <function>pg_stat_statements_info() returns bigint</function> > + <indexterm> > + <primary>pg_stat_statements_info</primary> > + </indexterm> > + </term> > > Isn't it better not to expose pg_stat_statements_info() function in the > document because pg_stat_statements_info view is enough and there > seems no use case for the function? > > Regards, Thanks for the comment. I'll post a fixed patch. Due to similar fixed, we have also merged the patches discussed in the following thread. https://commitfest.postgresql.org/30/2738/ > Why is this assertion check necessary? It seems not necessary. As indicated, it is unnecessary and will be removed. > Having pgssSharedState and pgssInfoCounters separately might make > the code a bit more complicated like the above? If this is true, what > about > including pgssInfoCounters in pgssSharedState? Fix pgssSharedState to include pgssInfoCounters . The related parts were also corrected accordingly. > PGSS_FILE_HEADER needs to be changed since the patch changes > the format of pgss file? The value of PGSS_FILE_HEADER has been updated. > Why does feof(file) need to be called here? As indicated, it is unnecessary and will be removed. > Why is the second "{" necessary? It seems redundant. As indicated, it is unnecessary and will be removed. But I left the {} in pg_stat_statements_info() to make the shared memory edit part explicit. > Why does Int64GetDatum() need to be called here? It seems not > necessary. As indicated, it is unnecessary and will be removed. > Isn't it better not to expose pg_stat_statements_info() function in the > document because pg_stat_statements_info view is enough and there > seems no use case for the function? As indicated, it is unnecessary and will be removed. Regards.
Attachment
On 2020/11/16 12:22, Seino Yuki wrote: >> Thanks for updating the patch! >> >> + pgss_info->dealloc = 0; >> + SpinLockInit(&pgss_info->mutex); >> + Assert(pgss_info->dealloc == 0); >> >> Why is this assertion check necessary? It seems not necessary. >> >> + { >> + Assert(found == found_info); >> >> Having pgssSharedState and pgssInfoCounters separately might make >> the code a bit more complicated like the above? If this is true, what about >> including pgssInfoCounters in pgssSharedState? >> >> PGSS_FILE_HEADER needs to be changed since the patch changes >> the format of pgss file? >> >> + /* Read pgss_info */ >> + if (feof(file) == 0) >> + if (fread(pgss_info, sizeof(pgssInfoCounters), 1, file) != 1) >> + goto read_error; >> >> Why does feof(file) need to be called here? >> >> +pgss_info_update(void) >> +{ >> + { >> >> Why is the second "{" necessary? It seems redundant. >> >> +pgss_info_reset(void) >> +{ >> + { >> >> Same as above. >> >> +pg_stat_statements_info(PG_FUNCTION_ARGS) >> +{ >> + int64 d_count = 0; >> + { >> >> Same as above. >> >> + SpinLockAcquire(&c->mutex); >> + d_count = Int64GetDatum(c->dealloc); >> + SpinLockRelease(&c->mutex); >> >> Why does Int64GetDatum() need to be called here? It seems not necessary. >> >> + <varlistentry> >> + <term> >> + <function>pg_stat_statements_info() returns bigint</function> >> + <indexterm> >> + <primary>pg_stat_statements_info</primary> >> + </indexterm> >> + </term> >> >> Isn't it better not to expose pg_stat_statements_info() function in the >> document because pg_stat_statements_info view is enough and there >> seems no use case for the function? >> >> Regards, > > Thanks for the comment. > I'll post a fixed patch. > Due to similar fixed, we have also merged the patches discussed in the following thread. > https://commitfest.postgresql.org/30/2738/ I agree that these two patches should use the same infrastructure because they both try to add the global stats for pg_stat_statements. But IMO they should not be merged to one patch. It's better to develop them one by one for ease of review. Thought? So I extracted the "dealloc" part from the merged version of your patch. Also I refactored the code and applied some cosmetic changes into the patch. Attached is the updated version of the patch that implements only "dealloc" part. Could you review this version? Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
Attachment
2020-11-17 01:46 に Fujii Masao さんは書きました: > On 2020/11/16 12:22, Seino Yuki wrote: >>> Thanks for updating the patch! >>> >>> + pgss_info->dealloc = 0; >>> + SpinLockInit(&pgss_info->mutex); >>> + Assert(pgss_info->dealloc == 0); >>> >>> Why is this assertion check necessary? It seems not necessary. >>> >>> + { >>> + Assert(found == found_info); >>> >>> Having pgssSharedState and pgssInfoCounters separately might make >>> the code a bit more complicated like the above? If this is true, what >>> about >>> including pgssInfoCounters in pgssSharedState? >>> >>> PGSS_FILE_HEADER needs to be changed since the patch changes >>> the format of pgss file? >>> >>> + /* Read pgss_info */ >>> + if (feof(file) == 0) >>> + if (fread(pgss_info, sizeof(pgssInfoCounters), 1, file) != >>> 1) >>> + goto read_error; >>> >>> Why does feof(file) need to be called here? >>> >>> +pgss_info_update(void) >>> +{ >>> + { >>> >>> Why is the second "{" necessary? It seems redundant. >>> >>> +pgss_info_reset(void) >>> +{ >>> + { >>> >>> Same as above. >>> >>> +pg_stat_statements_info(PG_FUNCTION_ARGS) >>> +{ >>> + int64 d_count = 0; >>> + { >>> >>> Same as above. >>> >>> + SpinLockAcquire(&c->mutex); >>> + d_count = Int64GetDatum(c->dealloc); >>> + SpinLockRelease(&c->mutex); >>> >>> Why does Int64GetDatum() need to be called here? It seems not >>> necessary. >>> >>> + <varlistentry> >>> + <term> >>> + <function>pg_stat_statements_info() returns bigint</function> >>> + <indexterm> >>> + <primary>pg_stat_statements_info</primary> >>> + </indexterm> >>> + </term> >>> >>> Isn't it better not to expose pg_stat_statements_info() function in >>> the >>> document because pg_stat_statements_info view is enough and there >>> seems no use case for the function? >>> >>> Regards, >> >> Thanks for the comment. >> I'll post a fixed patch. >> Due to similar fixed, we have also merged the patches discussed in the >> following thread. >> https://commitfest.postgresql.org/30/2738/ > > I agree that these two patches should use the same infrastructure > because they both try to add the global stats for pg_stat_statements. > But IMO they should not be merged to one patch. It's better to > develop them one by one for ease of review. Thought? > > So I extracted the "dealloc" part from the merged version of your > patch. > Also I refactored the code and applied some cosmetic changes into > the patch. Attached is the updated version of the patch that implements > only "dealloc" part. Could you review this version? > > Regards, Thank you for posting the new patch. I checked "regression test" and "document" and "operation of the view". No particular problems were found. I just want to check one thing: will the log output be unnecessary this time? Quotes from v2.patch > > + { > entry_dealloc(); > + /* Update pgss_info */ > + { > + volatile pgssSharedState *s = (volatile pgssSharedState *) pgss; > + SpinLockAcquire(&s->mutex); > + s->pgss_info.dealloc += 1; /* increment dealloc count */ > + SpinLockRelease(&s->mutex); > + } > + ereport(LOG, > + (errmsg("The information in pg_stat_statements has been > deallocated."))); > + } Regards.
On 2020/11/25 12:02, Seino Yuki wrote: > 2020-11-17 01:46 に Fujii Masao さんは書きました: >> On 2020/11/16 12:22, Seino Yuki wrote: >>>> Thanks for updating the patch! >>>> >>>> + pgss_info->dealloc = 0; >>>> + SpinLockInit(&pgss_info->mutex); >>>> + Assert(pgss_info->dealloc == 0); >>>> >>>> Why is this assertion check necessary? It seems not necessary. >>>> >>>> + { >>>> + Assert(found == found_info); >>>> >>>> Having pgssSharedState and pgssInfoCounters separately might make >>>> the code a bit more complicated like the above? If this is true, what about >>>> including pgssInfoCounters in pgssSharedState? >>>> >>>> PGSS_FILE_HEADER needs to be changed since the patch changes >>>> the format of pgss file? >>>> >>>> + /* Read pgss_info */ >>>> + if (feof(file) == 0) >>>> + if (fread(pgss_info, sizeof(pgssInfoCounters), 1, file) != 1) >>>> + goto read_error; >>>> >>>> Why does feof(file) need to be called here? >>>> >>>> +pgss_info_update(void) >>>> +{ >>>> + { >>>> >>>> Why is the second "{" necessary? It seems redundant. >>>> >>>> +pgss_info_reset(void) >>>> +{ >>>> + { >>>> >>>> Same as above. >>>> >>>> +pg_stat_statements_info(PG_FUNCTION_ARGS) >>>> +{ >>>> + int64 d_count = 0; >>>> + { >>>> >>>> Same as above. >>>> >>>> + SpinLockAcquire(&c->mutex); >>>> + d_count = Int64GetDatum(c->dealloc); >>>> + SpinLockRelease(&c->mutex); >>>> >>>> Why does Int64GetDatum() need to be called here? It seems not necessary. >>>> >>>> + <varlistentry> >>>> + <term> >>>> + <function>pg_stat_statements_info() returns bigint</function> >>>> + <indexterm> >>>> + <primary>pg_stat_statements_info</primary> >>>> + </indexterm> >>>> + </term> >>>> >>>> Isn't it better not to expose pg_stat_statements_info() function in the >>>> document because pg_stat_statements_info view is enough and there >>>> seems no use case for the function? >>>> >>>> Regards, >>> >>> Thanks for the comment. >>> I'll post a fixed patch. >>> Due to similar fixed, we have also merged the patches discussed in the following thread. >>> https://commitfest.postgresql.org/30/2738/ >> >> I agree that these two patches should use the same infrastructure >> because they both try to add the global stats for pg_stat_statements. >> But IMO they should not be merged to one patch. It's better to >> develop them one by one for ease of review. Thought? >> >> So I extracted the "dealloc" part from the merged version of your patch. >> Also I refactored the code and applied some cosmetic changes into >> the patch. Attached is the updated version of the patch that implements >> only "dealloc" part. Could you review this version? >> >> Regards, > > Thank you for posting the new patch. > > I checked "regression test" and "document" and "operation of the view". > No particular problems were found. Thanks for the review and test! So you think that this patch can be marked as ready for committer? > > I just want to check one thing: will the log output be unnecessary this time? > Quotes from v2.patch I'm not sure if it's really good idea to add this log message. If we adopt that logging, in the system where pgss entries are deallocated very frequently, that message also would be logged very frequently. Such too many log messages might be noisy to users. To address this issue, we may want to add new parameter that controls whether log message is emitted or not when entries are deallocated. But that parameter sounds too specific... Thought? Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
2020-11-25 13:13 に Fujii Masao さんは書きました: > On 2020/11/25 12:02, Seino Yuki wrote: >> 2020-11-17 01:46 に Fujii Masao さんは書きました: >>> On 2020/11/16 12:22, Seino Yuki wrote: >>>>> Thanks for updating the patch! >>>>> >>>>> + pgss_info->dealloc = 0; >>>>> + SpinLockInit(&pgss_info->mutex); >>>>> + Assert(pgss_info->dealloc == 0); >>>>> >>>>> Why is this assertion check necessary? It seems not necessary. >>>>> >>>>> + { >>>>> + Assert(found == found_info); >>>>> >>>>> Having pgssSharedState and pgssInfoCounters separately might make >>>>> the code a bit more complicated like the above? If this is true, >>>>> what about >>>>> including pgssInfoCounters in pgssSharedState? >>>>> >>>>> PGSS_FILE_HEADER needs to be changed since the patch changes >>>>> the format of pgss file? >>>>> >>>>> + /* Read pgss_info */ >>>>> + if (feof(file) == 0) >>>>> + if (fread(pgss_info, sizeof(pgssInfoCounters), 1, file) != >>>>> 1) >>>>> + goto read_error; >>>>> >>>>> Why does feof(file) need to be called here? >>>>> >>>>> +pgss_info_update(void) >>>>> +{ >>>>> + { >>>>> >>>>> Why is the second "{" necessary? It seems redundant. >>>>> >>>>> +pgss_info_reset(void) >>>>> +{ >>>>> + { >>>>> >>>>> Same as above. >>>>> >>>>> +pg_stat_statements_info(PG_FUNCTION_ARGS) >>>>> +{ >>>>> + int64 d_count = 0; >>>>> + { >>>>> >>>>> Same as above. >>>>> >>>>> + SpinLockAcquire(&c->mutex); >>>>> + d_count = Int64GetDatum(c->dealloc); >>>>> + SpinLockRelease(&c->mutex); >>>>> >>>>> Why does Int64GetDatum() need to be called here? It seems not >>>>> necessary. >>>>> >>>>> + <varlistentry> >>>>> + <term> >>>>> + <function>pg_stat_statements_info() returns bigint</function> >>>>> + <indexterm> >>>>> + <primary>pg_stat_statements_info</primary> >>>>> + </indexterm> >>>>> + </term> >>>>> >>>>> Isn't it better not to expose pg_stat_statements_info() function in >>>>> the >>>>> document because pg_stat_statements_info view is enough and there >>>>> seems no use case for the function? >>>>> >>>>> Regards, >>>> >>>> Thanks for the comment. >>>> I'll post a fixed patch. >>>> Due to similar fixed, we have also merged the patches discussed in >>>> the following thread. >>>> https://commitfest.postgresql.org/30/2738/ >>> >>> I agree that these two patches should use the same infrastructure >>> because they both try to add the global stats for pg_stat_statements. >>> But IMO they should not be merged to one patch. It's better to >>> develop them one by one for ease of review. Thought? >>> >>> So I extracted the "dealloc" part from the merged version of your >>> patch. >>> Also I refactored the code and applied some cosmetic changes into >>> the patch. Attached is the updated version of the patch that >>> implements >>> only "dealloc" part. Could you review this version? >>> >>> Regards, >> >> Thank you for posting the new patch. >> >> I checked "regression test" and "document" and "operation of the >> view". >> No particular problems were found. > > Thanks for the review and test! > So you think that this patch can be marked as ready for committer? > > >> >> I just want to check one thing: will the log output be unnecessary >> this time? >> Quotes from v2.patch > > I'm not sure if it's really good idea to add this log message. > If we adopt that logging, in the system where pgss entries are > deallocated > very frequently, that message also would be logged very frequently. > Such too many log messages might be noisy to users. To address this > issue, > we may want to add new parameter that controls whether log message is > emitted or not when entries are deallocated. But that parameter sounds > too specific... Thought? > > Regards, > Thanks for the review and test! > So you think that this patch can be marked as ready for committer? Updated status to ready for committer. > I'm not sure if it's really good idea to add this log message. > If we adopt that logging, in the system where pgss entries are > deallocated > very frequently, that message also would be logged very frequently. > Such too many log messages might be noisy to users. To address this > issue, > we may want to add new parameter that controls whether log message is > emitted or not when entries are deallocated. But that parameter sounds > too specific... Thought? I think it's no problem to add the log after the user requests it. Usually I think you should tune pg_stat_statements.max if you have frequent deallocated. However, most users may not be that aware of it. Let's not add a log this time. Regards.
On 2020/11/25 15:40, Seino Yuki wrote: > 2020-11-25 13:13 に Fujii Masao さんは書きました: >> On 2020/11/25 12:02, Seino Yuki wrote: >>> 2020-11-17 01:46 に Fujii Masao さんは書きました: >>>> On 2020/11/16 12:22, Seino Yuki wrote: >>>>>> Thanks for updating the patch! >>>>>> >>>>>> + pgss_info->dealloc = 0; >>>>>> + SpinLockInit(&pgss_info->mutex); >>>>>> + Assert(pgss_info->dealloc == 0); >>>>>> >>>>>> Why is this assertion check necessary? It seems not necessary. >>>>>> >>>>>> + { >>>>>> + Assert(found == found_info); >>>>>> >>>>>> Having pgssSharedState and pgssInfoCounters separately might make >>>>>> the code a bit more complicated like the above? If this is true, what about >>>>>> including pgssInfoCounters in pgssSharedState? >>>>>> >>>>>> PGSS_FILE_HEADER needs to be changed since the patch changes >>>>>> the format of pgss file? >>>>>> >>>>>> + /* Read pgss_info */ >>>>>> + if (feof(file) == 0) >>>>>> + if (fread(pgss_info, sizeof(pgssInfoCounters), 1, file) != 1) >>>>>> + goto read_error; >>>>>> >>>>>> Why does feof(file) need to be called here? >>>>>> >>>>>> +pgss_info_update(void) >>>>>> +{ >>>>>> + { >>>>>> >>>>>> Why is the second "{" necessary? It seems redundant. >>>>>> >>>>>> +pgss_info_reset(void) >>>>>> +{ >>>>>> + { >>>>>> >>>>>> Same as above. >>>>>> >>>>>> +pg_stat_statements_info(PG_FUNCTION_ARGS) >>>>>> +{ >>>>>> + int64 d_count = 0; >>>>>> + { >>>>>> >>>>>> Same as above. >>>>>> >>>>>> + SpinLockAcquire(&c->mutex); >>>>>> + d_count = Int64GetDatum(c->dealloc); >>>>>> + SpinLockRelease(&c->mutex); >>>>>> >>>>>> Why does Int64GetDatum() need to be called here? It seems not necessary. >>>>>> >>>>>> + <varlistentry> >>>>>> + <term> >>>>>> + <function>pg_stat_statements_info() returns bigint</function> >>>>>> + <indexterm> >>>>>> + <primary>pg_stat_statements_info</primary> >>>>>> + </indexterm> >>>>>> + </term> >>>>>> >>>>>> Isn't it better not to expose pg_stat_statements_info() function in the >>>>>> document because pg_stat_statements_info view is enough and there >>>>>> seems no use case for the function? >>>>>> >>>>>> Regards, >>>>> >>>>> Thanks for the comment. >>>>> I'll post a fixed patch. >>>>> Due to similar fixed, we have also merged the patches discussed in the following thread. >>>>> https://commitfest.postgresql.org/30/2738/ >>>> >>>> I agree that these two patches should use the same infrastructure >>>> because they both try to add the global stats for pg_stat_statements. >>>> But IMO they should not be merged to one patch. It's better to >>>> develop them one by one for ease of review. Thought? >>>> >>>> So I extracted the "dealloc" part from the merged version of your patch. >>>> Also I refactored the code and applied some cosmetic changes into >>>> the patch. Attached is the updated version of the patch that implements >>>> only "dealloc" part. Could you review this version? >>>> >>>> Regards, >>> >>> Thank you for posting the new patch. >>> >>> I checked "regression test" and "document" and "operation of the view". >>> No particular problems were found. >> >> Thanks for the review and test! >> So you think that this patch can be marked as ready for committer? >> >> >>> >>> I just want to check one thing: will the log output be unnecessary this time? >>> Quotes from v2.patch >> >> I'm not sure if it's really good idea to add this log message. >> If we adopt that logging, in the system where pgss entries are deallocated >> very frequently, that message also would be logged very frequently. >> Such too many log messages might be noisy to users. To address this issue, >> we may want to add new parameter that controls whether log message is >> emitted or not when entries are deallocated. But that parameter sounds >> too specific... Thought? >> >> Regards, > > >> Thanks for the review and test! >> So you think that this patch can be marked as ready for committer? > > Updated status to ready for committer. Thanks! > >> I'm not sure if it's really good idea to add this log message. >> If we adopt that logging, in the system where pgss entries are deallocated >> very frequently, that message also would be logged very frequently. >> Such too many log messages might be noisy to users. To address this issue, >> we may want to add new parameter that controls whether log message is >> emitted or not when entries are deallocated. But that parameter sounds >> too specific... Thought? > > I think it's no problem to add the log after the user requests it. > Usually I think you should tune pg_stat_statements.max if you have frequent deallocated. Yes, but I guess that there are some users who cannot increase pgss.max for some reasons (e.g., lack of memory). It seems annoying for them to *always* log the deallocation of pgss entries. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
On 2020/11/25 17:07, Fujii Masao wrote: > > > On 2020/11/25 15:40, Seino Yuki wrote: >> 2020-11-25 13:13 に Fujii Masao さんは書きました: >>> On 2020/11/25 12:02, Seino Yuki wrote: >>>> 2020-11-17 01:46 に Fujii Masao さんは書きました: >>>>> On 2020/11/16 12:22, Seino Yuki wrote: >>>>>>> Thanks for updating the patch! >>>>>>> >>>>>>> + pgss_info->dealloc = 0; >>>>>>> + SpinLockInit(&pgss_info->mutex); >>>>>>> + Assert(pgss_info->dealloc == 0); >>>>>>> >>>>>>> Why is this assertion check necessary? It seems not necessary. >>>>>>> >>>>>>> + { >>>>>>> + Assert(found == found_info); >>>>>>> >>>>>>> Having pgssSharedState and pgssInfoCounters separately might make >>>>>>> the code a bit more complicated like the above? If this is true, what about >>>>>>> including pgssInfoCounters in pgssSharedState? >>>>>>> >>>>>>> PGSS_FILE_HEADER needs to be changed since the patch changes >>>>>>> the format of pgss file? >>>>>>> >>>>>>> + /* Read pgss_info */ >>>>>>> + if (feof(file) == 0) >>>>>>> + if (fread(pgss_info, sizeof(pgssInfoCounters), 1, file) != 1) >>>>>>> + goto read_error; >>>>>>> >>>>>>> Why does feof(file) need to be called here? >>>>>>> >>>>>>> +pgss_info_update(void) >>>>>>> +{ >>>>>>> + { >>>>>>> >>>>>>> Why is the second "{" necessary? It seems redundant. >>>>>>> >>>>>>> +pgss_info_reset(void) >>>>>>> +{ >>>>>>> + { >>>>>>> >>>>>>> Same as above. >>>>>>> >>>>>>> +pg_stat_statements_info(PG_FUNCTION_ARGS) >>>>>>> +{ >>>>>>> + int64 d_count = 0; >>>>>>> + { >>>>>>> >>>>>>> Same as above. >>>>>>> >>>>>>> + SpinLockAcquire(&c->mutex); >>>>>>> + d_count = Int64GetDatum(c->dealloc); >>>>>>> + SpinLockRelease(&c->mutex); >>>>>>> >>>>>>> Why does Int64GetDatum() need to be called here? It seems not necessary. >>>>>>> >>>>>>> + <varlistentry> >>>>>>> + <term> >>>>>>> + <function>pg_stat_statements_info() returns bigint</function> >>>>>>> + <indexterm> >>>>>>> + <primary>pg_stat_statements_info</primary> >>>>>>> + </indexterm> >>>>>>> + </term> >>>>>>> >>>>>>> Isn't it better not to expose pg_stat_statements_info() function in the >>>>>>> document because pg_stat_statements_info view is enough and there >>>>>>> seems no use case for the function? >>>>>>> >>>>>>> Regards, >>>>>> >>>>>> Thanks for the comment. >>>>>> I'll post a fixed patch. >>>>>> Due to similar fixed, we have also merged the patches discussed in the following thread. >>>>>> https://commitfest.postgresql.org/30/2738/ >>>>> >>>>> I agree that these two patches should use the same infrastructure >>>>> because they both try to add the global stats for pg_stat_statements. >>>>> But IMO they should not be merged to one patch. It's better to >>>>> develop them one by one for ease of review. Thought? >>>>> >>>>> So I extracted the "dealloc" part from the merged version of your patch. >>>>> Also I refactored the code and applied some cosmetic changes into >>>>> the patch. Attached is the updated version of the patch that implements >>>>> only "dealloc" part. Could you review this version? >>>>> >>>>> Regards, >>>> >>>> Thank you for posting the new patch. >>>> >>>> I checked "regression test" and "document" and "operation of the view". >>>> No particular problems were found. >>> >>> Thanks for the review and test! >>> So you think that this patch can be marked as ready for committer? >>> >>> >>>> >>>> I just want to check one thing: will the log output be unnecessary this time? >>>> Quotes from v2.patch >>> >>> I'm not sure if it's really good idea to add this log message. >>> If we adopt that logging, in the system where pgss entries are deallocated >>> very frequently, that message also would be logged very frequently. >>> Such too many log messages might be noisy to users. To address this issue, >>> we may want to add new parameter that controls whether log message is >>> emitted or not when entries are deallocated. But that parameter sounds >>> too specific... Thought? >>> >>> Regards, >> >> >>> Thanks for the review and test! >>> So you think that this patch can be marked as ready for committer? >> >> Updated status to ready for committer. > > Thanks! I modified the docs a bit and pushed the patch. Thanks! Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION