Thread: New function pg_stat_statements_reset_query() to reset statistics ofa specific query

The pg_stat_statements contains the statistics of the queries that are cumulative.
I find that any optimizations that are done to improve the performance of a query
are not be visible clearly until the stats are reset. Currently there is a function to 
reset all the statistics, I find it will be useful if we a function that resets the stats of
a single query, instead of reseting all the queries.

Attached is a simple patch with implementation. Comments?

Regards,
Haribabu Kommi
Fujitsu Australia
Attachment
> On 20 Jun 2018, at 09:30, Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
>
> The pg_stat_statements contains the statistics of the queries that are cumulative.
> I find that any optimizations that are done to improve the performance of a query
> are not be visible clearly until the stats are reset. Currently there is a function to
> reset all the statistics, I find it will be useful if we a function that resets the stats of
> a single query, instead of reseting all the queries.

I’ve found myself wanting this in the past, so +1 on the idea.

> Attached is a simple patch with implementation. Comments?

From only skimming it, the patch seems to lack documentation updates to
doc/src/sgml/pgstatstatements.sgml.

cheers ./daniel

> On Jun 20, 2018, at 12:30 AM, Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
>
> Attached is a simple patch with implementation. Comments?

Seems useful to me too! What are the odds we could have a column telling the timestamp when a particular query was last
reset?Would that be complicated to add? 

-Jeremy

Hello

>    key.userid = GetUserId();
We can remove query id only from current user, right? Maybe better provide optional argument for userid? Typically user
withpg_read_all_stats and user for application queries are different users.
 
At least it should be documented.

regards, Sergei


2018-06-20 4:30 GMT-03:00 Haribabu Kommi <kommi.haribabu@gmail.com>:
> Attached is a simple patch with implementation. Comments?
>
Why don't you extend the existing function pg_stat_statements_reset()?


-- 
   Euler Taveira                                   Timbira -
http://www.timbira.com.br/
   PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento


On Wed, Jun 20, 2018 at 1:00 PM, Haribabu Kommi
<kommi.haribabu@gmail.com> wrote:
> The pg_stat_statements contains the statistics of the queries that are
> cumulative.
> I find that any optimizations that are done to improve the performance of a
> query
> are not be visible clearly until the stats are reset. Currently there is a
> function to
> reset all the statistics, I find it will be useful if we a function that
> resets the stats of
> a single query, instead of reseting all the queries.
>
> Attached is a simple patch with implementation. Comments?

The idea looks interesting to me. But, as Euler Taveira mentioned,
can't we extend an existing function pg_stat_statements_reset()
instead of adding a new one and update the documentation for it. Also,
in the test-case it would be good to display the output of
pg_stat_statements before and after deleting the query. Thanks.

-- 
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com


On Wed, Jun 20, 2018 at 10:19 AM, Euler Taveira <euler@timbira.com.br> wrote:
> 2018-06-20 4:30 GMT-03:00 Haribabu Kommi <kommi.haribabu@gmail.com>:
>> Attached is a simple patch with implementation. Comments?
>>
> Why don't you extend the existing function pg_stat_statements_reset()?

Well, the existing function doesn't take any arguments.  We could add
an additional version of it that takes an argument, or we could
replace the existing version with one that has an optional argument.
But are either of those things any better than just adding a new
function with a different name, like
pg_stat_statements_reset_statement()?

I have not had such good experiences with function overloading, either
in PostgreSQL or elsewhere, that I'm ready to say reusing the same
name is definitely the right approach.  For example, suppose we
eventually end up with a function that resets all the statements, a
function that resets just one statement, a function that resets all
statements for one user, and a function that resets all statements
where the statement text matches a certain regexp.  If those all have
separate names, everything is fine.  If they all have the same name,
there's no way that's not confusing.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Why not just parameterize it with the three key fields; userid, dbid, and queryid?

i.e It would then allow it be limited to only records associated with a given user and/or database as well.

pg_stat_statements_reset(dbid oid, userid oid, queryid bigint)

pg_stat_statements_reset(null, null, 3569076157)    — all for a given query
pg_stat_statements_reset(6384, null, 3569076157)
pg_stat_statements_reset(null, 16429, 3569076157)
pg_stat_statements_reset(6384, 6384, 3569076157)
pg_stat_statements_reset(6384, null, null) — all for a given database.
.
.
.

>> pg_stat_statements_reset()

> On Jun 22, 2018, at 11:06 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Wed, Jun 20, 2018 at 10:19 AM, Euler Taveira <euler@timbira.com.br> wrote:
>> 2018-06-20 4:30 GMT-03:00 Haribabu Kommi <kommi.haribabu@gmail.com>:
>>> Attached is a simple patch with implementation. Comments?
>>>
>> Why don't you extend the existing function pg_stat_statements_reset()?
>
> Well, the existing function doesn't take any arguments.  We could add
> an additional version of it that takes an argument, or we could
> replace the existing version with one that has an optional argument.
> But are either of those things any better than just adding a new
> function with a different name, like
> pg_stat_statements_reset_statement()?
>
> I have not had such good experiences with function overloading, either
> in PostgreSQL or elsewhere, that I'm ready to say reusing the same
> name is definitely the right approach.  For example, suppose we
> eventually end up with a function that resets all the statements, a
> function that resets just one statement, a function that resets all
> statements for one user, and a function that resets all statements
> where the statement text matches a certain regexp.  If those all have
> separate names, everything is fine.  If they all have the same name,
> there's no way that's not confusing.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>



2018-06-22 12:06 GMT-03:00 Robert Haas <robertmhaas@gmail.com>:
> On Wed, Jun 20, 2018 at 10:19 AM, Euler Taveira <euler@timbira.com.br> wrote:
>> 2018-06-20 4:30 GMT-03:00 Haribabu Kommi <kommi.haribabu@gmail.com>:
>>> Attached is a simple patch with implementation. Comments?
>>>
>> Why don't you extend the existing function pg_stat_statements_reset()?
>
> Well, the existing function doesn't take any arguments.  We could add
> an additional version of it that takes an argument, or we could
> replace the existing version with one that has an optional argument.
> But are either of those things any better than just adding a new
> function with a different name, like
> pg_stat_statements_reset_statement()?
>
From the user's POV, overloading is a good goal. It is better to
remember one function name than 3 different function names to do the
same task (reset statement statistics).

> I have not had such good experiences with function overloading, either
> in PostgreSQL or elsewhere, that I'm ready to say reusing the same
> name is definitely the right approach.  For example, suppose we
> eventually end up with a function that resets all the statements, a
> function that resets just one statement, a function that resets all
> statements for one user, and a function that resets all statements
> where the statement text matches a certain regexp.  If those all have
> separate names, everything is fine.  If they all have the same name,
> there's no way that's not confusing.
>
I think part of your frustration with overloading is because there are
so many arguments and/or argument type combinations to remember.
Frankly, I never remember the order / type of arguments when a
function has more than 2 arguments (unless I use that function a lot).
If we want to consider overloading I think we should put all
possibilities for that function on the table and decide between
overload it or not. Bad overloading decisions tend to be very
frustrating for the user. In this case, all possibilities (queryid,
user, regex, database, statement regex) can be summarized as (i) 0
arguments that means all statements and (ii) 1 argument (queryid is
unique for all statements) -- because queryid can be obtain using
SELECT pg_stat_statements_reset(queryid) FROM pg_stat_statements WHERE
my-condition-here.


-- 
   Euler Taveira                                   Timbira -
http://www.timbira.com.br/
   PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento


On Sat, Jun 23, 2018 at 5:45 AM Euler Taveira <euler@timbira.com.br> wrote:
2018-06-22 12:06 GMT-03:00 Robert Haas <robertmhaas@gmail.com>:
> On Wed, Jun 20, 2018 at 10:19 AM, Euler Taveira <euler@timbira.com.br> wrote:
>> 2018-06-20 4:30 GMT-03:00 Haribabu Kommi <kommi.haribabu@gmail.com>:
>>> Attached is a simple patch with implementation. Comments?
>>>
>> Why don't you extend the existing function pg_stat_statements_reset()?
>
> Well, the existing function doesn't take any arguments.  We could add
> an additional version of it that takes an argument, or we could
> replace the existing version with one that has an optional argument.
> But are either of those things any better than just adding a new
> function with a different name, like
> pg_stat_statements_reset_statement()?
>
From the user's POV, overloading is a good goal. It is better to
remember one function name than 3 different function names to do the
same task (reset statement statistics).

I agree that function overloading is beneficial until unless it doesn't introduce
confusion and difficulty in using it.


> I have not had such good experiences with function overloading, either
> in PostgreSQL or elsewhere, that I'm ready to say reusing the same
> name is definitely the right approach.  For example, suppose we
> eventually end up with a function that resets all the statements, a
> function that resets just one statement, a function that resets all
> statements for one user, and a function that resets all statements
> where the statement text matches a certain regexp.  If those all have
> separate names, everything is fine.  If they all have the same name,
> there's no way that's not confusing.
>
I think part of your frustration with overloading is because there are
so many arguments and/or argument type combinations to remember.
Frankly, I never remember the order / type of arguments when a
function has more than 2 arguments (unless I use that function a lot).
If we want to consider overloading I think we should put all
possibilities for that function on the table and decide between
overload it or not. Bad overloading decisions tend to be very
frustrating for the user. In this case, all possibilities (queryid,
user, regex, database, statement regex) can be summarized as (i) 0
arguments that means all statements and (ii) 1 argument (queryid is
unique for all statements) -- because queryid can be obtain using
SELECT pg_stat_statements_reset(queryid) FROM pg_stat_statements WHERE
my-condition-here.

In pg_stat_statements the uniqueness of the query is decided by (userid, dbid and queryid).
I feel the reset function should take maximum of 3 input arguments and rest of the
conditions can be filtered using the WHERE condition.

if we are going to support a single function that wants to reset all the statement
statistics of a "user" or specific to a "database" and "specific queries" based on
other input parameter values, I am not sure that the overloading function can cause
confusion in using it? And also if the specified input query data doesn't found, better
to ignore or raise an error?

Regards,
Haribabu Kommi
Fujitsu Australia
On Sat, Jun 23, 2018 at 7:36 AM, Haribabu Kommi
<kommi.haribabu@gmail.com> wrote:
> On Sat, Jun 23, 2018 at 5:45 AM Euler Taveira <euler@timbira.com.br> wrote:
>>
>> 2018-06-22 12:06 GMT-03:00 Robert Haas <robertmhaas@gmail.com>:
>> > On Wed, Jun 20, 2018 at 10:19 AM, Euler Taveira <euler@timbira.com.br>
>> > wrote:
>> >> 2018-06-20 4:30 GMT-03:00 Haribabu Kommi <kommi.haribabu@gmail.com>:
>> >>> Attached is a simple patch with implementation. Comments?
>> >>>
>> >> Why don't you extend the existing function pg_stat_statements_reset()?
>> >
>> > Well, the existing function doesn't take any arguments.  We could add
>> > an additional version of it that takes an argument, or we could
>> > replace the existing version with one that has an optional argument.
>> > But are either of those things any better than just adding a new
>> > function with a different name, like
>> > pg_stat_statements_reset_statement()?
>> >
>> From the user's POV, overloading is a good goal. It is better to
>> remember one function name than 3 different function names to do the
>> same task (reset statement statistics).
>
>
> I agree that function overloading is beneficial until unless it doesn't
> introduce
> confusion and difficulty in using it.
>
>
>> > I have not had such good experiences with function overloading, either
>> > in PostgreSQL or elsewhere, that I'm ready to say reusing the same
>> > name is definitely the right approach.  For example, suppose we
>> > eventually end up with a function that resets all the statements, a
>> > function that resets just one statement, a function that resets all
>> > statements for one user, and a function that resets all statements
>> > where the statement text matches a certain regexp.  If those all have
>> > separate names, everything is fine.  If they all have the same name,
>> > there's no way that's not confusing.
>> >
>> I think part of your frustration with overloading is because there are
>> so many arguments and/or argument type combinations to remember.
>> Frankly, I never remember the order / type of arguments when a
>> function has more than 2 arguments (unless I use that function a lot).
>> If we want to consider overloading I think we should put all
>> possibilities for that function on the table and decide between
>> overload it or not. Bad overloading decisions tend to be very
>> frustrating for the user. In this case, all possibilities (queryid,
>> user, regex, database, statement regex) can be summarized as (i) 0
>> arguments that means all statements and (ii) 1 argument (queryid is
>> unique for all statements) -- because queryid can be obtain using
>> SELECT pg_stat_statements_reset(queryid) FROM pg_stat_statements WHERE
>> my-condition-here.
>
>
> In pg_stat_statements the uniqueness of the query is decided by (userid,
> dbid and queryid).
> I feel the reset function should take maximum of 3 input arguments and rest
> of the
> conditions can be filtered using the WHERE condition.
>

I'm slightly confused about the function overloading part here (not
sure whether we should try using the same function name or introduce a
function with new name), however, I think, passing userid, dbid and
queryid as input arguments to the user function (may be
pg_stat_statements_reset_statement or pg_stat_statements_reset) looks
like a good option as it would allow users to remove an entry for a
particular query across the databases not just in the database that
the user is currently connected to. The current patch definitely lacks
this flexibility.

> if we are going to support a single function that wants to reset all the
> statement
> statistics of a "user" or specific to a "database" and "specific queries"
> based on
> other input parameter values, I am not sure that the overloading function
> can cause
> confusion in using it? And also if the specified input query data doesn't
> found, better
> to ignore or raise an error?
>
> Regards,
> Haribabu Kommi
> Fujitsu Australia


On Sat, Jun 23, 2018 at 2:52 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
On Sat, Jun 23, 2018 at 7:36 AM, Haribabu Kommi
<kommi.haribabu@gmail.com> wrote:
> On Sat, Jun 23, 2018 at 5:45 AM Euler Taveira <euler@timbira.com.br> wrote:
>>
>> 2018-06-22 12:06 GMT-03:00 Robert Haas <robertmhaas@gmail.com>:
>> > On Wed, Jun 20, 2018 at 10:19 AM, Euler Taveira <euler@timbira.com.br>
>> > wrote:
>> >> Why don't you extend the existing function pg_stat_statements_reset()?
>> >
>> > Well, the existing function doesn't take any arguments.  We could add
>> > an additional version of it that takes an argument, or we could
>> > replace the existing version with one that has an optional argument.
>> > But are either of those things any better than just adding a new
>> > function with a different name, like
>> > pg_stat_statements_reset_statement()?
>> >
>> From the user's POV, overloading is a good goal. It is better to
>> remember one function name than 3 different function names to do the
>> same task (reset statement statistics).
>
> I agree that function overloading is beneficial until unless it doesn't
> introduce
> confusion and difficulty in using it.
>
>> > I have not had such good experiences with function overloading, either
>> > in PostgreSQL or elsewhere, that I'm ready to say reusing the same
>> > name is definitely the right approach.  For example, suppose we
>> > eventually end up with a function that resets all the statements, a
>> > function that resets just one statement, a function that resets all
>> > statements for one user, and a function that resets all statements
>> > where the statement text matches a certain regexp.  If those all have
>> > separate names, everything is fine.  If they all have the same name,
>> > there's no way that's not confusing.
>> >

I'm slightly confused about the function overloading part here (not
sure whether we should try using the same function name or introduce a
function with new name), however, I think, passing userid, dbid and
queryid as input arguments to the user function (may be
pg_stat_statements_reset_statement or pg_stat_statements_reset) looks
like a good option as it would allow users to remove an entry for a
particular query across the databases not just in the database that
the user is currently connected to. The current patch definitely lacks
this flexibility.

Thanks for all your comments on the and approach. Patch is changed
to use the existing _reset function and it takes 3 arguments userid, dbid and queryid
and their default values are 0.

If all values are passed, it resets only a single query statement, otherwise it resets
all the statements that matches with the other parameters. If no values are passed
or all values are invalid, all the statistics are reset.

Updated patch attached.

Regards,
Haribabu Kommi
Fujitsu Australia
Attachment
Hello
Thank you for update

It may be better to use NULL as the default value at sql level.

> ereport(LOG, (errmsg("userid %u, dbid %u, queryid %ld does not exist", userid, dbid, queryid)));
I think LOG level is not useful here. In common case this is server log only. How about WARNING? Or just ignore. Want
removerow? Here is no such row anymore, all fine.
 
Also we can return num_remove instead of void. I think this is even better. But this break backward compatibility and
weneed something like pg_stat_statements_reset_1_6
 

> By default, this function can only be executed by superusers.
Can you also update this phrase in pg_stat_statements_reset documentation? Beginning from 1.5 version this is not true,
resetcan be used by any user with pg_read_all_stats role.
 

regards, Sergei


On 29/06/18 10:14, Sergei Kornilov wrote:
> It may be better to use NULL as the default value at sql level.

Absolutely.  +1 from me.
-- 
Vik Fearing                                          +33 6 46 75 15 36
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


On Fri, Jun 29, 2018 at 6:14 PM Sergei Kornilov <sk@zsrv.org> wrote:
Hello
Thank you for update

Thanks for the review.
 
It may be better to use NULL as the default value at sql level.

Changed.
 
> ereport(LOG, (errmsg("userid %u, dbid %u, queryid %ld does not exist", userid, dbid, queryid)));
I think LOG level is not useful here. In common case this is server log only. How about WARNING? Or just ignore. Want remove row? Here is no such row anymore, all fine.

I went with removing of ereport.
 
Also we can return num_remove instead of void. I think this is even better. But this break backward compatibility and we need something like pg_stat_statements_reset_1_6

Changed. Returning removed rows is nice than just void.
 
> By default, this function can only be executed by superusers.
Can you also update this phrase in pg_stat_statements_reset documentation? Beginning from 1.5 version this is not true, reset can be used by any user with pg_read_all_stats role.

Doc is updated and also separate thread is started for the doc fix [1].
Updated patch attached.


Regards,
Haribabu Kommi
Fujitsu Australia
Attachment
Hello

I found SELECT pg_stat_statements_reset(NULL,NULL,s.queryid) in tests and it pass tests, but i wonder how it works.
Shouldnot we check the NULL through PG_ARGISNULL macro before any PG_GETARG_*? According src/include/fmgr.h 
 
> * If function is not marked "proisstrict" in pg_proc, it must check for
> * null arguments using this macro.  Do not try to GETARG a null argument!

> pg_stat_statements_reset(userid Oid, dbid Oid, queryid bigint) returns void
And you forgot to change return type in docs (and description of return value)

regards, Sergei




On Mon, Jul 2, 2018 at 6:42 PM Sergei Kornilov <sk@zsrv.org> wrote:
Hello

Thanks for the review.
 
I found SELECT pg_stat_statements_reset(NULL,NULL,s.queryid) in tests and it pass tests, but i wonder how it works. Should not we check the NULL through PG_ARGISNULL macro before any PG_GETARG_*? According src/include/fmgr.h
> * If function is not marked "proisstrict" in pg_proc, it must check for
> * null arguments using this macro.  Do not try to GETARG a null argument!

Thanks for checking, Added.
 
> pg_stat_statements_reset(userid Oid, dbid Oid, queryid bigint) returns void
And you forgot to change return type in docs (and description of return value)

Corrected and also added details of the returns value.

Update patch attached.

Regards,
Haribabu Kommi
Fujitsu Australia
Attachment
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:       tested, passed
Spec compliant:           tested, passed
Documentation:            tested, passed

I reviewed and tested patch one more times, including check with previous extension version. I have no more notes on
thecode. Patch has tests and appropriate documentation changes.
 
I think the patch is ready for committer.

regards, Sergei

The new status of this patch is: Ready for Committer

On Wed, Jul 4, 2018 at 7:12 PM, Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
>
>
> On Mon, Jul 2, 2018 at 6:42 PM Sergei Kornilov <sk@zsrv.org> wrote:
>>
>> Hello
>
>
> Thanks for the review.
>
>>
>> I found SELECT pg_stat_statements_reset(NULL,NULL,s.queryid) in tests and
>> it pass tests, but i wonder how it works. Should not we check the NULL
>> through PG_ARGISNULL macro before any PG_GETARG_*? According
>> src/include/fmgr.h
>> > * If function is not marked "proisstrict" in pg_proc, it must check for
>> > * null arguments using this macro.  Do not try to GETARG a null
>> > argument!
>
>
> Thanks for checking, Added.
>
>>
>> > pg_stat_statements_reset(userid Oid, dbid Oid, queryid bigint) returns
>> > void
>> And you forgot to change return type in docs (and description of return
>> value)
>
>
> Corrected and also added details of the returns value.
>
> Update patch attached.

+ if (userid != 0 && dbid != 0 && queryid != 0)

UINT64CONST() should be used for the constant for queryid?

It's rare case, but 0 can be assigned as valid queryid. Right?
If yes, it's problematic to use 0 as an invalid queryid here.

+      By default, this function can only be executed by superusers and members
+      of the <literal>pg_read_all_stats</literal> role.

Currently pg_stat_statements_reset() and other stats reset functions like
pg_stat_reset() can be executed only by superusers.
But why did you allow pg_read_all_stats role to execute that function,
by default? That change looks strange to me.

Regards,

-- 
Fujii Masao


Hello

> Currently pg_stat_statements_reset() and other stats reset functions like
> pg_stat_reset() can be executed only by superusers.
> But why did you allow pg_read_all_stats role to execute that function,
> by default? That change looks strange to me.
This is not behavior change, only fix documentation to current state. Commit
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=25fff40798fc4ac11a241bfd9ab0c45c085e2212creates 1.5
versionand grant execute to pg_read_all_stats
 

Backpatch proposal was started in separate thread
https://www.postgresql.org/message-id/CAJrrPGcim%3DQ-7ewhuKr1n3mkeELySC-QeVHGZJJYwaaKMSJRkg%40mail.gmail.com

regards, Sergei


On Fri, Jul 6, 2018 at 3:34 AM, Sergei Kornilov <sk@zsrv.org> wrote:
> Hello
>
>> Currently pg_stat_statements_reset() and other stats reset functions like
>> pg_stat_reset() can be executed only by superusers.
>> But why did you allow pg_read_all_stats role to execute that function,
>> by default? That change looks strange to me.
> This is not behavior change, only fix documentation to current state. Commit
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=25fff40798fc4ac11a241bfd9ab0c45c085e2212creates 1.5
versionand grant execute to pg_read_all_stats
 

Hmm... so pg_stat_statements_reset() is allowed to be executed by
pg_read_all_stats role while other stats reset functions like
pg_stat_reset() can be executed only by superusers. Which looks
strange and inconsistent to me.

Regards,

-- 
Fujii Masao


On Fri, Jul 6, 2018 at 1:26 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
> Hmm... so pg_stat_statements_reset() is allowed to be executed by
> pg_read_all_stats role while other stats reset functions like
> pg_stat_reset() can be executed only by superusers. Which looks
> strange and inconsistent to me.

Yeah, why would a pg_READ_all_stats role let you change stuff?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


06.07.2018, 22:35, "Robert Haas" <robertmhaas@gmail.com>:
> On Fri, Jul 6, 2018 at 1:26 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
>>  Hmm... so pg_stat_statements_reset() is allowed to be executed by
>>  pg_read_all_stats role while other stats reset functions like
>>  pg_stat_reset() can be executed only by superusers. Which looks
>>  strange and inconsistent to me.
>
> Yeah, why would a pg_READ_all_stats role let you change stuff?

+1, personally i was surprised first time. But I thought that it was discussed before committing version 1.5
I read original thread [1] and, as far i see, pg_stat_statements_reset change was not discussed.

Let's remove this grant?
Or grant to pg_monitor role instead of pg_read_all_stats?

regards, Sergei

[1] https://www.postgresql.org/message-id/flat/CA+OCxoyRdsc1xyLfF9s698gUGyPXBs4CvJ+0Gwo8U65NmYJ7pw@mail.gmail.com


On 2018-Jul-06, Sergei Kornilov wrote:

> 06.07.2018, 22:35, "Robert Haas" <robertmhaas@gmail.com>:
> > On Fri, Jul 6, 2018 at 1:26 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
> >>  Hmm... so pg_stat_statements_reset() is allowed to be executed by
> >>  pg_read_all_stats role while other stats reset functions like
> >>  pg_stat_reset() can be executed only by superusers. Which looks
> >>  strange and inconsistent to me.
> >
> > Yeah, why would a pg_READ_all_stats role let you change stuff?
> 
> +1, personally i was surprised first time. But I thought that it was discussed before committing version 1.5
> I read original thread [1] and, as far i see, pg_stat_statements_reset change was not discussed.

Ugh, it's true :-(
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=25fff40798fc4ac11a241bfd9ab0c45c085e2212#patch8

Dave, Simon, any comments?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



On Fri, Jul 6, 2018 at 3:22 AM Fujii Masao <masao.fujii@gmail.com> wrote:
On Wed, Jul 4, 2018 at 7:12 PM, Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
>
> Update patch attached.

+ if (userid != 0 && dbid != 0 && queryid != 0)

UINT64CONST() should be used for the constant for queryid?

OK.
 
It's rare case, but 0 can be assigned as valid queryid. Right?

But for normal queries, in post parse analyze function, the queryID
is calculated and it set to 1, in case if the calculation becomes 0.
But for the utility statements, the calculation is done using the
pgss_hash_string() function. I am not sure whether this function 
can return 0.

If yes, then we may need same handling to utility statements 
similar like normal statements but with queryID as 2 for utility statements.

comments?

Regards,
Haribabu Kommi
Fujitsu Australia
On Fri, Jul 06, 2018 at 05:10:18PM -0400, Alvaro Herrera wrote:
> Ugh, it's true :-(
> https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=25fff40798fc4ac11a241bfd9ab0c45c085e2212#patch8
>
> Dave, Simon, any comments?

The offending line:
contrib/pg_stat_statements/pg_stat_statements--1.4--1.5.sql:
GRANT EXECUTE ON FUNCTION pg_stat_statements_reset() TO pg_read_all_stats;

This will need a new version bump down to REL_10_STABLE...
--
Michael

Attachment

On Mon, Jul 9, 2018 at 12:24 PM Michael Paquier <michael@paquier.xyz> wrote:
On Fri, Jul 06, 2018 at 05:10:18PM -0400, Alvaro Herrera wrote:
> Ugh, it's true :-(
> https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=25fff40798fc4ac11a241bfd9ab0c45c085e2212#patch8
>
> Dave, Simon, any comments?

The offending line:
contrib/pg_stat_statements/pg_stat_statements--1.4--1.5.sql:
GRANT EXECUTE ON FUNCTION pg_stat_statements_reset() TO pg_read_all_stats;

This will need a new version bump down to REL_10_STABLE...

Hearing no objections, attached patch removes all permissions from PUBLIC
as before this change went in. Or do we need to add command for revoke only
from pg_read_all_stats?

Regards,
Haribabu Kommi
Fujitsu Australia
Attachment
On Mon, Jul 9, 2018 at 3:39 PM Haribabu Kommi <kommi.haribabu@gmail.com> wrote:

On Mon, Jul 9, 2018 at 12:24 PM Michael Paquier <michael@paquier.xyz> wrote:
On Fri, Jul 06, 2018 at 05:10:18PM -0400, Alvaro Herrera wrote:
> Ugh, it's true :-(
> https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=25fff40798fc4ac11a241bfd9ab0c45c085e2212#patch8
>
> Dave, Simon, any comments?

The offending line:
contrib/pg_stat_statements/pg_stat_statements--1.4--1.5.sql:
GRANT EXECUTE ON FUNCTION pg_stat_statements_reset() TO pg_read_all_stats;

This will need a new version bump down to REL_10_STABLE...

Hearing no objections, attached patch removes all permissions from PUBLIC
as before this change went in. Or do we need to add command for revoke only
from pg_read_all_stats?

Revoke all doesn't work, so patch updated with revoke from pg_read_all_stats.

Regards,
Haribabu Kommi
Fujitsu Australia
Attachment
On Sun, Jul 8, 2018 at 11:48 AM, Haribabu Kommi
<kommi.haribabu@gmail.com> wrote:
>
> On Fri, Jul 6, 2018 at 3:22 AM Fujii Masao <masao.fujii@gmail.com> wrote:
>>
>> On Wed, Jul 4, 2018 at 7:12 PM, Haribabu Kommi <kommi.haribabu@gmail.com>
>> wrote:
>> >
>> > Update patch attached.
>>
>> + if (userid != 0 && dbid != 0 && queryid != 0)
>>
>> UINT64CONST() should be used for the constant for queryid?
>
>
> OK.
>
>>
>> It's rare case, but 0 can be assigned as valid queryid. Right?
>
>
> But for normal queries, in post parse analyze function, the queryID
> is calculated and it set to 1, in case if the calculation becomes 0.
> But for the utility statements, the calculation is done using the
> pgss_hash_string() function. I am not sure whether this function
> can return 0.

Though I've not read the whole code of pgss_hash_string(), ISTM that
the function can return 0. Otherwise, the following code is unnecessary
after queryid is assigned by hash_any_extended(),
in pgss_post_parse_analyze().

    /*
    * If we are unlucky enough to get a hash of zero, use 1 instead, to
    * prevent confusion with the utility-statement case.
    */
    if (query->queryId == UINT64CONST(0))
    query->queryId = UINT64CONST(1);

> If yes, then we may need same handling to utility statements
> similar like normal statements but with queryID as 2 for utility statements.

That's possible, but I think that it's better to get rid of such corner
case at all.

Regards,

-- 
Fujii Masao



On Tue, Jul 10, 2018 at 12:26 AM Fujii Masao <masao.fujii@gmail.com> wrote:
On Sun, Jul 8, 2018 at 11:48 AM, Haribabu Kommi
<kommi.haribabu@gmail.com> wrote:
>
> On Fri, Jul 6, 2018 at 3:22 AM Fujii Masao <masao.fujii@gmail.com> wrote:
>>
>> On Wed, Jul 4, 2018 at 7:12 PM, Haribabu Kommi <kommi.haribabu@gmail.com>
>> wrote:
>> >
>> > Update patch attached.
>>
>> + if (userid != 0 && dbid != 0 && queryid != 0)
>>
>> UINT64CONST() should be used for the constant for queryid?
>
>
> OK.
>
>>
>> It's rare case, but 0 can be assigned as valid queryid. Right?
>
>
> But for normal queries, in post parse analyze function, the queryID
> is calculated and it set to 1, in case if the calculation becomes 0.
> But for the utility statements, the calculation is done using the
> pgss_hash_string() function. I am not sure whether this function
> can return 0.

Though I've not read the whole code of pgss_hash_string(), ISTM that
the function can return 0. Otherwise, the following code is unnecessary
after queryid is assigned by hash_any_extended(),
in pgss_post_parse_analyze().

    /*
    * If we are unlucky enough to get a hash of zero, use 1 instead, to
    * prevent confusion with the utility-statement case.
    */
    if (query->queryId == UINT64CONST(0))
    query->queryId = UINT64CONST(1);

> If yes, then we may need same handling to utility statements
> similar like normal statements but with queryID as 2 for utility statements.

That's possible, but I think that it's better to get rid of such corner
case at all.

QueryID 2 is used in case if the generated hash value is 0 for utility statements
to give difference with normal statements and also used the UINT64CONST
macro as per the earlier comment.

updated patch attached. This patch needs to be applied on top of the ACL
permissions revert patch. ACL revert patch also in this thread.

Regards,
Haribabu Kommi
Fujitsu Australia
Attachment
On Mon, Jul 9, 2018 at 11:28 AM Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
>
> On Mon, Jul 9, 2018 at 3:39 PM Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
>>
>>
>> On Mon, Jul 9, 2018 at 12:24 PM Michael Paquier <michael@paquier.xyz> wrote:
>>>
>>> On Fri, Jul 06, 2018 at 05:10:18PM -0400, Alvaro Herrera wrote:
>>> > Ugh, it's true :-(
>>> >
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=25fff40798fc4ac11a241bfd9ab0c45c085e2212#patch8
>>> >
>>> > Dave, Simon, any comments?
>>>
>>> The offending line:
>>> contrib/pg_stat_statements/pg_stat_statements--1.4--1.5.sql:
>>> GRANT EXECUTE ON FUNCTION pg_stat_statements_reset() TO pg_read_all_stats;
>>>
>>> This will need a new version bump down to REL_10_STABLE...
>>
>>
>> Hearing no objections, attached patch removes all permissions from PUBLIC
>> as before this change went in. Or do we need to add command for revoke only
>> from pg_read_all_stats?
>
>
> Revoke all doesn't work, so patch updated with revoke from pg_read_all_stats.
>

The other questionable part of that commit (25fff40798) is that it
changes permissions for function pg_stat_statements_reset at SQL level
and for C function it changes the permission check for
pg_stat_statements, refer below change.

@@ -1391,7 +1392,7 @@ pg_stat_statements_internal(FunctionCallInfo fcinfo,
    MemoryContext per_query_ctx;
    MemoryContext oldcontext;
    Oid         userid = GetUserId();
-   bool        is_superuser = superuser();
+   bool        is_allowed_role = false;
    char       *qbuffer = NULL;
    Size        qbuffer_size = 0;
    Size        extent = 0;
@@ -1399,6 +1400,9 @@ pg_stat_statements_internal(FunctionCallInfo fcinfo,
    HASH_SEQ_STATUS hash_seq;
    pgssEntry  *entry;

+   /* Superusers or members of pg_read_all_stats members are allowed */
+   is_allowed_role = is_member_of_role(GetUserId(),
DEFAULT_ROLE_READ_ALL_STATS);

Am I confused here?  In any case, I think it is better to start a
separate thread to discuss this issue.  It might help us in getting
more attention on this issue and we can focus on your proposed patch
in this thread.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


On Sat, Sep 22, 2018 at 11:23 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Jul 9, 2018 at 11:28 AM Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
>
> On Mon, Jul 9, 2018 at 3:39 PM Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
>>
>>
>> On Mon, Jul 9, 2018 at 12:24 PM Michael Paquier <michael@paquier.xyz> wrote:
>>>
>>> On Fri, Jul 06, 2018 at 05:10:18PM -0400, Alvaro Herrera wrote:
>>> > Ugh, it's true :-(
>>> > https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=25fff40798fc4ac11a241bfd9ab0c45c085e2212#patch8
>>> >
>>> > Dave, Simon, any comments?
>>>
>>> The offending line:
>>> contrib/pg_stat_statements/pg_stat_statements--1.4--1.5.sql:
>>> GRANT EXECUTE ON FUNCTION pg_stat_statements_reset() TO pg_read_all_stats;
>>>
>>> This will need a new version bump down to REL_10_STABLE...
>>
>>
>> Hearing no objections, attached patch removes all permissions from PUBLIC
>> as before this change went in. Or do we need to add command for revoke only
>> from pg_read_all_stats?
>
>
> Revoke all doesn't work, so patch updated with revoke from pg_read_all_stats.
>

Thanks for the review.
 
The other questionable part of that commit (25fff40798) is that it
changes permissions for function pg_stat_statements_reset at SQL level
and for C function it changes the permission check for
pg_stat_statements, refer below change.

The below changes are to support the read access of particular columns
of the pg_stat_statements view to pg_read_all_stats role. These changes
should exist and only the permissions of pg_stat_statements_reset() function
needs to be removed.
 
@@ -1391,7 +1392,7 @@ pg_stat_statements_internal(FunctionCallInfo fcinfo,
    MemoryContext per_query_ctx;
    MemoryContext oldcontext;
    Oid         userid = GetUserId();
-   bool        is_superuser = superuser();
+   bool        is_allowed_role = false;
    char       *qbuffer = NULL;
    Size        qbuffer_size = 0;
    Size        extent = 0;
@@ -1399,6 +1400,9 @@ pg_stat_statements_internal(FunctionCallInfo fcinfo,
    HASH_SEQ_STATUS hash_seq;
    pgssEntry  *entry;

+   /* Superusers or members of pg_read_all_stats members are allowed */
+   is_allowed_role = is_member_of_role(GetUserId(),
DEFAULT_ROLE_READ_ALL_STATS);

Am I confused here?  In any case, I think it is better to start a
separate thread to discuss this issue.  It might help us in getting
more attention on this issue and we can focus on your proposed patch
in this thread.

Started a new thread to discuss about the revoke the execute permissions
of the pg_stat_statements_reset() from pg_read_all_stats role at [1]


Regards,
Haribabu Kommi
Fujitsu Australia

On Mon, Sep 24, 2018 at 11:13 AM Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
On Sat, Sep 22, 2018 at 11:23 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Jul 9, 2018 at 11:28 AM Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
>
> On Mon, Jul 9, 2018 at 3:39 PM Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
>>
>>
>> On Mon, Jul 9, 2018 at 12:24 PM Michael Paquier <michael@paquier.xyz> wrote:
>>>
>>> On Fri, Jul 06, 2018 at 05:10:18PM -0400, Alvaro Herrera wrote:
>>> > Ugh, it's true :-(
>>> > https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=25fff40798fc4ac11a241bfd9ab0c45c085e2212#patch8
>>> >
>>> > Dave, Simon, any comments?
>>>
>>> The offending line:
>>> contrib/pg_stat_statements/pg_stat_statements--1.4--1.5.sql:
>>> GRANT EXECUTE ON FUNCTION pg_stat_statements_reset() TO pg_read_all_stats;
>>>
>>> This will need a new version bump down to REL_10_STABLE...
>>
>>
>> Hearing no objections, attached patch removes all permissions from PUBLIC
>> as before this change went in. Or do we need to add command for revoke only
>> from pg_read_all_stats?
>
>
> Revoke all doesn't work, so patch updated with revoke from pg_read_all_stats.
>

Thanks for the review.
 
The other questionable part of that commit (25fff40798) is that it
changes permissions for function pg_stat_statements_reset at SQL level
and for C function it changes the permission check for
pg_stat_statements, refer below change.

The below changes are to support the read access of particular columns
of the pg_stat_statements view to pg_read_all_stats role. These changes
should exist and only the permissions of pg_stat_statements_reset() function
needs to be removed.
 
@@ -1391,7 +1392,7 @@ pg_stat_statements_internal(FunctionCallInfo fcinfo,
    MemoryContext per_query_ctx;
    MemoryContext oldcontext;
    Oid         userid = GetUserId();
-   bool        is_superuser = superuser();
+   bool        is_allowed_role = false;
    char       *qbuffer = NULL;
    Size        qbuffer_size = 0;
    Size        extent = 0;
@@ -1399,6 +1400,9 @@ pg_stat_statements_internal(FunctionCallInfo fcinfo,
    HASH_SEQ_STATUS hash_seq;
    pgssEntry  *entry;

+   /* Superusers or members of pg_read_all_stats members are allowed */
+   is_allowed_role = is_member_of_role(GetUserId(),
DEFAULT_ROLE_READ_ALL_STATS);

Am I confused here?  In any case, I think it is better to start a
separate thread to discuss this issue.  It might help us in getting
more attention on this issue and we can focus on your proposed patch
in this thread.

Started a new thread to discuss about the revoke the execute permissions
of the pg_stat_statements_reset() from pg_read_all_stats role at [1]
 
Attached new rebased version of the patch that enhances the pg_stat_statements_reset()
function. This needs to be applied on top of the patch that is posted in [1].


Regards,
Haribabu Kommi
Fujitsu Australia
Attachment
On Mon, Sep 24, 2018 at 12:19:44PM +1000, Haribabu Kommi wrote:
> Attached new rebased version of the patch that enhances the
> pg_stat_statements_reset()
> function. This needs to be applied on top of the patch that is posted in
> [1].

+CREATE ROLE stats_regress_user1;
+CREATE ROLE stats_regress_user2;
Just a short note: regression tests creating roles should use regress_
as prefix.
--
Michael

Attachment

On Tue, Sep 25, 2018 at 1:39 AM Michael Paquier <michael@paquier.xyz> wrote:
On Mon, Sep 24, 2018 at 12:19:44PM +1000, Haribabu Kommi wrote:
> Attached new rebased version of the patch that enhances the
> pg_stat_statements_reset()
> function. This needs to be applied on top of the patch that is posted in
> [1].

+CREATE ROLE stats_regress_user1;
+CREATE ROLE stats_regress_user2;
Just a short note: regression tests creating roles should use regress_
as prefix.

Thanks for the review.
Fixed in the attached patch as per your suggestion.

Regards,
Haribabu Kommi
Fujitsu Australia
Attachment
On Tue, Sep 25, 2018 at 01:49:09PM +1000, Haribabu Kommi wrote:
> Thanks for the review.
> Fixed in the attached patch as per your suggestion.

Hmm.  I see a problem with the tests and the stability of what
pg_stat_statements_reset() can return.  Normally installcheck is
disabled in contrib/pg_stat_statements/Makefile but if you remove this
barrier and run the tests with a server loading the module in
shared_preload_libraries then things are not stable.  We don't have this
kind of instability on HEAD.  Some call to pg_stat_statements_reset()
system-wide is visibly missing.

+   if (!pgss || !pgss_hash)
+       ereport(ERROR,
+               (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+                errmsg("pg_stat_statements must be loaded via shared_preload_libraries")));
This check can be within entry_reset().

+      the specified userid, dbid and queryid. Returns the total number of
+      statement statistics that are reset based on the specified input.
+      If any of the parameter is not specified, the default value NULL(invalid)
Missing some markups for the three field names here, as well as for NULL
which is a value.

I can buy the compatibility breakage with the return result of
pg_stat_statements_reset when specified without arguments.

Some nannyism: If all entries are removed and a new file needs to be
written, you could save a bit of indentation by returning immediately
when (num_entries != num_remove).
--
Michael

Attachment
On Tue, Sep 25, 2018 at 3:09 PM Michael Paquier <michael@paquier.xyz> wrote:
On Tue, Sep 25, 2018 at 01:49:09PM +1000, Haribabu Kommi wrote:
> Thanks for the review.
> Fixed in the attached patch as per your suggestion.

Thanks for the review. 

Hmm.  I see a problem with the tests and the stability of what
pg_stat_statements_reset() can return.  Normally installcheck is
disabled in contrib/pg_stat_statements/Makefile but if you remove this
barrier and run the tests with a server loading the module in
shared_preload_libraries then things are not stable.  We don't have this
kind of instability on HEAD.  Some call to pg_stat_statements_reset()
system-wide is visibly missing.

The difference in results of the output of the pg_stat_statements_reset()
function is based on the earlier statements that are stored in the pg_stat_statements
table, this varies based on the environment. So I created a wrapper function
that masks the return value of the first reset and then the test is stable.

check whether is it fine or any better approach to handle it?
 
+   if (!pgss || !pgss_hash)
+       ereport(ERROR,
+               (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+                errmsg("pg_stat_statements must be loaded via shared_preload_libraries")));
This check can be within entry_reset().

Moved.
 
+      the specified userid, dbid and queryid. Returns the total number of
+      statement statistics that are reset based on the specified input.
+      If any of the parameter is not specified, the default value NULL(invalid)
Missing some markups for the three field names here, as well as for NULL
which is a value.

Corrected.
 
I can buy the compatibility breakage with the return result of
pg_stat_statements_reset when specified without arguments.

Some nannyism: If all entries are removed and a new file needs to be
written, you could save a bit of indentation by returning immediately
when (num_entries != num_remove).

Corrected.

Attached is the updated patch.

Regards,
Haribabu Kommi
Fujitsu Australia
Attachment
On Fri, Sep 28, 2018 at 7:45 AM Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
>
> On Tue, Sep 25, 2018 at 3:09 PM Michael Paquier <michael@paquier.xyz> wrote:
>>
>> On Tue, Sep 25, 2018 at 01:49:09PM +1000, Haribabu Kommi wrote:
>> Hmm.  I see a problem with the tests and the stability of what
>> pg_stat_statements_reset() can return.  Normally installcheck is
>> disabled in contrib/pg_stat_statements/Makefile but if you remove this
>> barrier and run the tests with a server loading the module in
>> shared_preload_libraries then things are not stable.  We don't have this
>> kind of instability on HEAD.  Some call to pg_stat_statements_reset()
>> system-wide is visibly missing.
>
>
> The difference in results of the output of the pg_stat_statements_reset()
> function is based on the earlier statements that are stored in the pg_stat_statements
> table, this varies based on the environment. So I created a wrapper function
> that masks the return value of the first reset and then the test is stable.
>
> check whether is it fine or any better approach to handle it?
>

Before trying out any solution or deciding which is better, I think we
want to understand why the variability in results occurred only after
your patch?  Without the patch, it works just fine.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


On Sat, Nov 03, 2018 at 03:56:14PM +0530, Amit Kapila wrote:
> Before trying out any solution or deciding which is better, I think we
> want to understand why the variability in results occurred only after
> your patch?  Without the patch, it works just fine.

Good point.  We surely want to have a stable feature, which gets tested
without triggering random failures in the builfarm.
--
Michael

Attachment
On Sun, Nov 4, 2018 at 11:17 AM Michael Paquier <michael@paquier.xyz> wrote:
On Sat, Nov 03, 2018 at 03:56:14PM +0530, Amit Kapila wrote:
> Before trying out any solution or deciding which is better, I think we
> want to understand why the variability in results occurred only after
> your patch?  Without the patch, it works just fine.

Good point.  We surely want to have a stable feature, which gets tested
without triggering random failures in the builfarm.

Thanks for the review.

This patch has changed the pg_stat_statements_reset() function from returning void
to number statements that it reset. The regression test contains pg_stat_statements_reset()
as first query to reset any of the query stats that are already tracked to let the test to
provide the proper results. But with this feature, if we test this regression test on an
already running server, the first query result is varying and it leads to test failure.

So to fix this problem, I added a wrapper function that masks the result of the 
pg_stat_statements_reset() function and just return as void, with this wrapper function
used a first statement, the test is stable, as this function takes care of resetting already
existing statements from the already running server.

With the above change, the regression test is stable. Comments?

Regards,
Haribabu Kommi
Fujitsu Australia
On Sun, Nov 4, 2018 at 6:37 AM Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
>
> On Sun, Nov 4, 2018 at 11:17 AM Michael Paquier <michael@paquier.xyz> wrote:
>>
>> On Sat, Nov 03, 2018 at 03:56:14PM +0530, Amit Kapila wrote:
>> > Before trying out any solution or deciding which is better, I think we
>> > want to understand why the variability in results occurred only after
>> > your patch?  Without the patch, it works just fine.
>>
>> Good point.  We surely want to have a stable feature, which gets tested
>> without triggering random failures in the builfarm.
>
>
> Thanks for the review.
>
> This patch has changed the pg_stat_statements_reset() function from returning void
> to number statements that it reset.
>

What is the motivation of that change?  It seems to be
backward-incompatible change.  I am not telling we can't do it, but do
we have strong justification?  One reason, I could see is to allow the
user to get the exact value of statements that got reset and maybe
that is more helpful as we have now additional parameters in the API,
but I am not sure if that is a good justification.

> The regression test contains pg_stat_statements_reset()
> as first query to reset any of the query stats that are already tracked to let the test to
> provide the proper results. But with this feature, if we test this regression test on an
> already running server, the first query result is varying and it leads to test failure.
>
> So to fix this problem, I added a wrapper function that masks the result of the
> pg_stat_statements_reset() function and just return as void, with this wrapper function
> used a first statement, the test is stable, as this function takes care of resetting already
> existing statements from the already running server.
>

Or you could change the query to something like (Select NULL from
pg_stat_statements_reset())

> With the above change, the regression test is stable. Comments?
>

In the comment message, you wrote: "Backward Compatible change, Now
pg_stat_statements_reset() function returns the number of statement
entries that are reset."

Do you want to say incompatible instead of compatible?

+      If no parameter is specified or specify everything as
<literal>NULL</literal>(invalid),
+      it will discard all statistics.
+      By default, this function can only be executed by superusers. Access may
+      be granted to others using <command>GRANT</command>.

I think it will look better if you can continue the "By default, ..."
line from where the previous line ends rather than starting a new
line.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



On Thu, Nov 8, 2018 at 4:53 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Sun, Nov 4, 2018 at 6:37 AM Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
>
> On Sun, Nov 4, 2018 at 11:17 AM Michael Paquier <michael@paquier.xyz> wrote:
>>
>> On Sat, Nov 03, 2018 at 03:56:14PM +0530, Amit Kapila wrote:
>> > Before trying out any solution or deciding which is better, I think we
>> > want to understand why the variability in results occurred only after
>> > your patch?  Without the patch, it works just fine.
>>
>> Good point.  We surely want to have a stable feature, which gets tested
>> without triggering random failures in the builfarm.
>
>
> Thanks for the review.
>
> This patch has changed the pg_stat_statements_reset() function from returning void
> to number statements that it reset.
>

What is the motivation of that change?  It seems to be
backward-incompatible change.  I am not telling we can't do it, but do
we have strong justification?  One reason, I could see is to allow the
user to get the exact value of statements that got reset and maybe
that is more helpful as we have now additional parameters in the API,
but I am not sure if that is a good justification.

Yes, as you said that is the main reason to modify the function to return
the number of statements that it reset. Without having any output from
the function, there is no way that how many statements that the above
function reset. Earlier it used to reset every statement, so I feel there is 
no need of any output, but I feel giving the number of statements with
this approach is good.
 
> The regression test contains pg_stat_statements_reset()
> as first query to reset any of the query stats that are already tracked to let the test to
> provide the proper results. But with this feature, if we test this regression test on an
> already running server, the first query result is varying and it leads to test failure.
>
> So to fix this problem, I added a wrapper function that masks the result of the
> pg_stat_statements_reset() function and just return as void, with this wrapper function
> used a first statement, the test is stable, as this function takes care of resetting already
> existing statements from the already running server.
>

Or you could change the query to something like (Select NULL from
pg_stat_statements_reset())

Thanks. I changed it accordingly.
 
> With the above change, the regression test is stable. Comments?
>

In the comment message, you wrote: "Backward Compatible change, Now
pg_stat_statements_reset() function returns the number of statement
entries that are reset."

Do you want to say incompatible instead of compatible?

Yes, your are correct. 
 
+      If no parameter is specified or specify everything as
<literal>NULL</literal>(invalid),
+      it will discard all statistics.
+      By default, this function can only be executed by superusers. Access may
+      be granted to others using <command>GRANT</command>.

I think it will look better if you can continue the "By default, ..."
line from where the previous line ends rather than starting a new
line.

Updated as per your suggestion.
Attached an updated with above comments correction.

Regards,
Haribabu Kommi
Fujitsu Australia
Attachment
On Thu, Nov 8, 2018 at 1:52 PM Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
> On Thu, Nov 8, 2018 at 4:53 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>> > This patch has changed the pg_stat_statements_reset() function from returning void
>> > to number statements that it reset.
>> >
>>
>> What is the motivation of that change?  It seems to be
>> backward-incompatible change.  I am not telling we can't do it, but do
>> we have strong justification?  One reason, I could see is to allow the
>> user to get the exact value of statements that got reset and maybe
>> that is more helpful as we have now additional parameters in the API,
>> but I am not sure if that is a good justification.
>
>
> Yes, as you said that is the main reason to modify the function to return
> the number of statements that it reset. Without having any output from
> the function, there is no way that how many statements that the above
> function reset. Earlier it used to reset every statement, so I feel there is
> no need of any output, but I feel giving the number of statements with
> this approach is good.
>

Sure, but what are we going to achieve with that number?  What
information user is going to get by that?  If it can help us to ensure
that it has reset the expected number of statements, then I can see
the clear usage, but without that, the return value doesn't seem to
have any clear purpose.  So, I don't see much value in breaking
compatibility.

Does anyone else have an opinion on this matter?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


On 2018-Nov-08, Amit Kapila wrote:

> Sure, but what are we going to achieve with that number?  What
> information user is going to get by that?  If it can help us to ensure
> that it has reset the expected number of statements, then I can see
> the clear usage, but without that, the return value doesn't seem to
> have any clear purpose.  So, I don't see much value in breaking
> compatibility.
> 
> Does anyone else have an opinion on this matter?

This was proposed by Sergei Kornilov in
https://postgr.es/m/3368121530260059@web21g.yandex.ru saying that "it
would be nice" to return it.  Maybe he has an use case in mind?  I don't
see one myself.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Hi

>>  Sure, but what are we going to achieve with that number? What
>>  information user is going to get by that? If it can help us to ensure
>>  that it has reset the expected number of statements, then I can see
>>  the clear usage, but without that, the return value doesn't seem to
>>  have any clear purpose. So, I don't see much value in breaking
>>  compatibility.
>>
>>  Does anyone else have an opinion on this matter?
>
> This was proposed by Sergei Kornilov in
> https://postgr.es/m/3368121530260059@web21g.yandex.ru saying that "it
> would be nice" to return it. Maybe he has an use case in mind? I don't
> see one myself.
No, i have no specific usecase for this. Silently remove all matching rows and return void is ok for me. But i still
thinkLOG ereport is not useful.
 

regards, Sergei


On Thu, Nov 8, 2018 at 3:53 PM Sergei Kornilov <sk@zsrv.org> wrote:
Hi

>>  Sure, but what are we going to achieve with that number? What
>>  information user is going to get by that? If it can help us to ensure
>>  that it has reset the expected number of statements, then I can see
>>  the clear usage, but without that, the return value doesn't seem to
>>  have any clear purpose. So, I don't see much value in breaking
>>  compatibility.
>>
>>  Does anyone else have an opinion on this matter?
>
> This was proposed by Sergei Kornilov in
> https://postgr.es/m/3368121530260059@web21g.yandex.ru saying that "it
> would be nice" to return it. Maybe he has an use case in mind? I don't
> see one myself.
No, i have no specific usecase for this. Silently remove all matching rows and return void is ok for me. But i still think LOG ereport is not useful.

I would much prefer it to be a return code rather than a forced LOG message. Log message spam is very much a thing, and things that are logged as LOG will always be there.

It could also be made to take a parameter saying log yes/no with a default value, but that seems like possible overengineering of a fairly simple functionality. 

--
On Thu, Nov 8, 2018 at 10:56 PM Magnus Hagander <magnus@hagander.net> wrote:
>
> On Thu, Nov 8, 2018 at 3:53 PM Sergei Kornilov <sk@zsrv.org> wrote:
>>
>> Hi
>>
>> >>  Sure, but what are we going to achieve with that number? What
>> >>  information user is going to get by that? If it can help us to ensure
>> >>  that it has reset the expected number of statements, then I can see
>> >>  the clear usage, but without that, the return value doesn't seem to
>> >>  have any clear purpose. So, I don't see much value in breaking
>> >>  compatibility.
>> >>
>> >>  Does anyone else have an opinion on this matter?
>> >
>> > This was proposed by Sergei Kornilov in
>> > https://postgr.es/m/3368121530260059@web21g.yandex.ru saying that "it
>> > would be nice" to return it. Maybe he has an use case in mind? I don't
>> > see one myself.
>> No, i have no specific usecase for this. Silently remove all matching rows and return void is ok for me. But i still
thinkLOG ereport is not useful.
 
>
>
> I would much prefer it to be a return code rather than a forced LOG message. Log message spam is very much a thing,
andthings that are logged as LOG will always be there.
 
>

Is any such LOG message present in the latest patch?  I agree that the
return code might be better, but there doesn't exist any such (LOG)
thing.  I see that it can be helpful for some users if we have any
such return code, but don't know if it doesn't already exist, why that
should be a requirement for this patch?  Do you have any strong
opinion about introducing return code with this patch?

> It could also be made to take a parameter saying log yes/no with a default value, but that seems like possible
overengineeringof a fairly simple functionality.
 
>

Right.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



On Fri, Nov 9, 2018 at 2:12 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Nov 8, 2018 at 10:56 PM Magnus Hagander <magnus@hagander.net> wrote:
>
> On Thu, Nov 8, 2018 at 3:53 PM Sergei Kornilov <sk@zsrv.org> wrote:
>>
>> Hi
>>
>> >>  Sure, but what are we going to achieve with that number? What
>> >>  information user is going to get by that? If it can help us to ensure
>> >>  that it has reset the expected number of statements, then I can see
>> >>  the clear usage, but without that, the return value doesn't seem to
>> >>  have any clear purpose. So, I don't see much value in breaking
>> >>  compatibility.
>> >>
>> >>  Does anyone else have an opinion on this matter?
>> >
>> > This was proposed by Sergei Kornilov in
>> > https://postgr.es/m/3368121530260059@web21g.yandex.ru saying that "it
>> > would be nice" to return it. Maybe he has an use case in mind? I don't
>> > see one myself.
>> No, i have no specific usecase for this. Silently remove all matching rows and return void is ok for me. But i still think LOG ereport is not useful.
>
>
> I would much prefer it to be a return code rather than a forced LOG message. Log message spam is very much a thing, and things that are logged as LOG will always be there.
>

Is any such LOG message present in the latest patch?  I agree that the
return code might be better, but there doesn't exist any such (LOG)
thing.  I see that it can be helpful for some users if we have any
such return code, but don't know if it doesn't already exist, why that
should be a requirement for this patch?  Do you have any strong
opinion about introducing return code with this patch?

I thought that returning the affected number of statements with the change
of adding new parameters to the reset function will be helpful to find out
how many statements are affected?

I can revert it back to void, if I am the only one interested with that change.

Regards,
Haribabu Kommi
Fujitsu Australia
On Mon, Nov 12, 2018 at 10:55 AM Haribabu Kommi
<kommi.haribabu@gmail.com> wrote:
>
> On Fri, Nov 9, 2018 at 2:12 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>> On Thu, Nov 8, 2018 at 10:56 PM Magnus Hagander <magnus@hagander.net> wrote:
>> >
>> > On Thu, Nov 8, 2018 at 3:53 PM Sergei Kornilov <sk@zsrv.org> wrote:
>> >>
>> >> Hi
>> >>
>> >> >>  Sure, but what are we going to achieve with that number? What
>> >> >>  information user is going to get by that? If it can help us to ensure
>> >> >>  that it has reset the expected number of statements, then I can see
>> >> >>  the clear usage, but without that, the return value doesn't seem to
>> >> >>  have any clear purpose. So, I don't see much value in breaking
>> >> >>  compatibility.
>> >> >>
>> >> >>  Does anyone else have an opinion on this matter?
>> >> >
>> >> > This was proposed by Sergei Kornilov in
>> >> > https://postgr.es/m/3368121530260059@web21g.yandex.ru saying that "it
>> >> > would be nice" to return it. Maybe he has an use case in mind? I don't
>> >> > see one myself.
>> >> No, i have no specific usecase for this. Silently remove all matching rows and return void is ok for me. But i
stillthink LOG ereport is not useful.
 
>> >
>> >
>> > I would much prefer it to be a return code rather than a forced LOG message. Log message spam is very much a
thing,and things that are logged as LOG will always be there.
 
>> >
>>
>> Is any such LOG message present in the latest patch?  I agree that the
>> return code might be better, but there doesn't exist any such (LOG)
>> thing.  I see that it can be helpful for some users if we have any
>> such return code, but don't know if it doesn't already exist, why that
>> should be a requirement for this patch?  Do you have any strong
>> opinion about introducing return code with this patch?
>
>
> I thought that returning the affected number of statements with the change
> of adding new parameters to the reset function will be helpful to find out
> how many statements are affected?
>

It is not clear how will user make use of that information.

> I can revert it back to void,
>

+1, as we don't see any good reason to break backward compatibility.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



On Mon, Nov 12, 2018 at 6:34 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Nov 12, 2018 at 10:55 AM Haribabu Kommi
<kommi.haribabu@gmail.com> wrote:
>
> I can revert it back to void,
>

+1, as we don't see any good reason to break backward compatibility.

Thanks for the review.
Attached the updated patch with return type as void.

Regards,
Haribabu Kommi
Fujitsu Australia
Attachment
On Tue, Nov 13, 2018 at 11:32 AM Haribabu Kommi
<kommi.haribabu@gmail.com> wrote:
>
> On Mon, Nov 12, 2018 at 6:34 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>> > I can revert it back to void,
>> >
>>
>> +1, as we don't see any good reason to break backward compatibility.
>
>
> Thanks for the review.
> Attached the updated patch with return type as void.
>

With this patch, we are intending to remove the entries matching
userid, dbid, queryid from hash table (pgss_hash), but the contents of
the file (
pgss_query_texts.stat) will remain unchanged unless all of the entries
are removed from hash table.  This appears fine to me, especially
because there is no easy way to remove the contents from the file.
Does anybody see any problem with this behavior?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



On Wed, Nov 14, 2018 at 12:26 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Nov 13, 2018 at 11:32 AM Haribabu Kommi
<kommi.haribabu@gmail.com> wrote:
>
> On Mon, Nov 12, 2018 at 6:34 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>> > I can revert it back to void,
>> >
>>
>> +1, as we don't see any good reason to break backward compatibility.
>
>
> Thanks for the review.
> Attached the updated patch with return type as void.
>

With this patch, we are intending to remove the entries matching
userid, dbid, queryid from hash table (pgss_hash), but the contents of
the file (
pgss_query_texts.stat) will remain unchanged unless all of the entries
are removed from hash table.  This appears fine to me, especially
because there is no easy way to remove the contents from the file.
Does anybody see any problem with this behavior?

Adding more info to the above point, even if the file contents are not 
removed, later if the file size and number of pg_stat_statements entries
satisfy the garbage collection, the file will be truncated. So I feel not
removing of the contents when the query stats are reset using specific
parameters is fine.

Regards,
Haribabu Kommi
Fujitsu Australia
On Wed, Nov 14, 2018 at 7:04 AM Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
>
> On Wed, Nov 14, 2018 at 12:26 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>> On Tue, Nov 13, 2018 at 11:32 AM Haribabu Kommi
>> <kommi.haribabu@gmail.com> wrote:
>> >
>> > On Mon, Nov 12, 2018 at 6:34 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>> >> > I can revert it back to void,
>> >> >
>> >>
>> >> +1, as we don't see any good reason to break backward compatibility.
>> >
>> >
>> > Thanks for the review.
>> > Attached the updated patch with return type as void.
>> >
>>
>> With this patch, we are intending to remove the entries matching
>> userid, dbid, queryid from hash table (pgss_hash), but the contents of
>> the file (
>> pgss_query_texts.stat) will remain unchanged unless all of the entries
>> are removed from hash table.  This appears fine to me, especially
>> because there is no easy way to remove the contents from the file.
>> Does anybody see any problem with this behavior?
>
>
> Adding more info to the above point, even if the file contents are not
> removed, later if the file size and number of pg_stat_statements entries
> satisfy the garbage collection, the file will be truncated. So I feel not
> removing of the contents when the query stats are reset using specific
> parameters is fine.
>

I have further reviewed this patch and below are my comments:

1.
+ if ((!userid || (userid && entry->key.userid == userid)) &&
+ (!dbid || (dbid && entry->key.dbid == dbid)) &&
+ (!queryid || (queryid && entry->key.queryid == queryid)))

I think this check can be simplified as:
+ if ((!userid || entry->key.userid == userid) &&
+ (!dbid || entry->key.dbid == dbid) &&
+ (!queryid || entry->key.queryid == queryid))

2.
+ else
+ {
+ hash_seq_init(&hash_seq, pgss_hash);
+ while ((entry = hash_seq_search(&hash_seq)) != NULL)
+ {
+ if ((!userid || (userid && entry->key.userid == userid)) &&
+ (!dbid || (dbid && entry->key.dbid == dbid)) &&
+ (!queryid || (queryid && entry->key.queryid == queryid)))
+ {
+ hash_search(pgss_hash, &entry->key, HASH_REMOVE, NULL);
+ num_remove++;
+ }
+ }
+ }

I think this 'if check' is redundant when none of the parameters are
specified.  We can easily avoid it, see attached.

I have fixed above two observations in the attached patch and edited
few comments and doc changes.  Kindly review the same.

Apart from the above, I think we should add a test where all the
parameters are valid as the corresponding code is not covered by any
existing tests.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachment

On Wed, Nov 14, 2018 at 8:45 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Nov 14, 2018 at 7:04 AM Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
>
> On Wed, Nov 14, 2018 at 12:26 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>> On Tue, Nov 13, 2018 at 11:32 AM Haribabu Kommi
>> <kommi.haribabu@gmail.com> wrote:
>> >
>> > On Mon, Nov 12, 2018 at 6:34 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>> >> > I can revert it back to void,
>> >> >
>> >>
>> >> +1, as we don't see any good reason to break backward compatibility.
>> >
>> >
>> > Thanks for the review.
>> > Attached the updated patch with return type as void.
>> >
>>
>> With this patch, we are intending to remove the entries matching
>> userid, dbid, queryid from hash table (pgss_hash), but the contents of
>> the file (
>> pgss_query_texts.stat) will remain unchanged unless all of the entries
>> are removed from hash table.  This appears fine to me, especially
>> because there is no easy way to remove the contents from the file.
>> Does anybody see any problem with this behavior?
>
>
> Adding more info to the above point, even if the file contents are not
> removed, later if the file size and number of pg_stat_statements entries
> satisfy the garbage collection, the file will be truncated. So I feel not
> removing of the contents when the query stats are reset using specific
> parameters is fine.
>

I have further reviewed this patch and below are my comments:

Thanks for the review.
 
1.
+ if ((!userid || (userid && entry->key.userid == userid)) &&
+ (!dbid || (dbid && entry->key.dbid == dbid)) &&
+ (!queryid || (queryid && entry->key.queryid == queryid)))

I think this check can be simplified as:
+ if ((!userid || entry->key.userid == userid) &&
+ (!dbid || entry->key.dbid == dbid) &&
+ (!queryid || entry->key.queryid == queryid))

Yes, the second check is not necessary.
 
2.
+ else
+ {
+ hash_seq_init(&hash_seq, pgss_hash);
+ while ((entry = hash_seq_search(&hash_seq)) != NULL)
+ {
+ if ((!userid || (userid && entry->key.userid == userid)) &&
+ (!dbid || (dbid && entry->key.dbid == dbid)) &&
+ (!queryid || (queryid && entry->key.queryid == queryid)))
+ {
+ hash_search(pgss_hash, &entry->key, HASH_REMOVE, NULL);
+ num_remove++;
+ }
+ }
+ }

I think this 'if check' is redundant when none of the parameters are
specified.  We can easily avoid it, see attached.

Yes, that removes the unnecessary if check if none of the parameters
are available. 

I have fixed above two observations in the attached patch and edited
few comments and doc changes.  Kindly review the same.

Thanks for the correction, all are fine.
 
Apart from the above, I think we should add a test where all the
parameters are valid as the corresponding code is not covered by any
existing tests.

Added another test with all the 3 parameters are valid. 
Updated patch attached. 

Regards,
Haribabu Kommi
Fujitsu Australia
Attachment
On Thu, Nov 15, 2018 at 2:05 AM Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
>
> On Wed, Nov 14, 2018 at 8:45 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>> Apart from the above, I think we should add a test where all the
>> parameters are valid as the corresponding code is not covered by any
>> existing tests.
>
>
> Added another test with all the 3 parameters are valid.
> Updated patch attached.
>

+--
+-- remove query ('SELECT $1 + $2 AS "TWO"') executed by
regress_stats_user2 in the current_database
+--
+SELECT pg_stat_statements_reset((SELECT r.oid FROM pg_roles AS r
WHERE r.rolname = 'regress_stats_user2'),
+ (SELECT d.oid from pg_database As d where datname = current_database()),
+ (SELECT s.queryid FROM pg_stat_statements AS s WHERE s.query =
'SELECT $1 AS "ONE"'));

The query in comments is different than what is actually used?  And
how is able to remove the correct statement from hash (it seems you
intended to remove 'SELECT $1 AS "ONE"', but it removed 'SELECT $1 +
$2 AS "TWO"')?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


On Thu, Nov 15, 2018 at 11:38 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Nov 15, 2018 at 2:05 AM Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
> >
> > On Wed, Nov 14, 2018 at 8:45 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >> Apart from the above, I think we should add a test where all the
> >> parameters are valid as the corresponding code is not covered by any
> >> existing tests.
> >
> >
> > Added another test with all the 3 parameters are valid.
> > Updated patch attached.
> >
>
> +--
> +-- remove query ('SELECT $1 + $2 AS "TWO"') executed by
> regress_stats_user2 in the current_database
> +--
> +SELECT pg_stat_statements_reset((SELECT r.oid FROM pg_roles AS r
> WHERE r.rolname = 'regress_stats_user2'),
> + (SELECT d.oid from pg_database As d where datname = current_database()),
> + (SELECT s.queryid FROM pg_stat_statements AS s WHERE s.query =
> 'SELECT $1 AS "ONE"'));
>
> The query in comments is different than what is actually used?  And
> how is able to remove the correct statement from hash (it seems you
> intended to remove 'SELECT $1 AS "ONE"', but it removed 'SELECT $1 +
> $2 AS "TWO"')?
>

One more point, the length of each line is too long in this statement,
try to reduce it by starting parameters for pg_stat_statements_reset
from next line or something like that.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



On Thu, Nov 15, 2018 at 5:18 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Nov 15, 2018 at 11:38 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Nov 15, 2018 at 2:05 AM Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
> >
> > On Wed, Nov 14, 2018 at 8:45 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >> Apart from the above, I think we should add a test where all the
> >> parameters are valid as the corresponding code is not covered by any
> >> existing tests.
> >
> >
> > Added another test with all the 3 parameters are valid.
> > Updated patch attached.
> >
>
> +--
> +-- remove query ('SELECT $1 + $2 AS "TWO"') executed by
> regress_stats_user2 in the current_database
> +--
> +SELECT pg_stat_statements_reset((SELECT r.oid FROM pg_roles AS r
> WHERE r.rolname = 'regress_stats_user2'),
> + (SELECT d.oid from pg_database As d where datname = current_database()),
> + (SELECT s.queryid FROM pg_stat_statements AS s WHERE s.query =
> 'SELECT $1 AS "ONE"'));
>
> The query in comments is different than what is actually used?  And
> how is able to remove the correct statement from hash (it seems you
> intended to remove 'SELECT $1 AS "ONE"', but it removed 'SELECT $1 +
> $2 AS "TWO"')?
>

One more point, the length of each line is too long in this statement,
try to reduce it by starting parameters for pg_stat_statements_reset
from next line or something like that.

Sorry for the mistake.
Attached patch synced the comment and SQL statement. And also I
rearranged the all options test as first compared to others.

Regards,
Haribabu Kommi
Fujitsu Australia
Attachment
On Fri, Nov 16, 2018 at 9:43 AM Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
>
> On Thu, Nov 15, 2018 at 5:18 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>> > +--
>> > +-- remove query ('SELECT $1 + $2 AS "TWO"') executed by
>> > regress_stats_user2 in the current_database
>> > +--
>> > +SELECT pg_stat_statements_reset((SELECT r.oid FROM pg_roles AS r
>> > WHERE r.rolname = 'regress_stats_user2'),
>> > + (SELECT d.oid from pg_database As d where datname = current_database()),
>> > + (SELECT s.queryid FROM pg_stat_statements AS s WHERE s.query =
>> > 'SELECT $1 AS "ONE"'));
>> >
>> > The query in comments is different than what is actually used?  And
>> > how is able to remove the correct statement from hash (it seems you
>> > intended to remove 'SELECT $1 AS "ONE"', but it removed 'SELECT $1 +
>> > $2 AS "TWO"')?
>> >
>>
>> One more point, the length of each line is too long in this statement,
>> try to reduce it by starting parameters for pg_stat_statements_reset
>> from next line or something like that.
>
>
> Sorry for the mistake.
> Attached patch synced the comment and SQL statement.

Okay, but you haven't answered my question:
"how is able to remove the correct statement from hash (it seems
statement intended to remove 'SELECT $1 AS "ONE"', but it removed
'SELECT $1 + $2 AS "TWO"')"?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



On Fri, Nov 16, 2018 at 11:00 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Fri, Nov 16, 2018 at 9:43 AM Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
>
> On Thu, Nov 15, 2018 at 5:18 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>> > +--
>> > +-- remove query ('SELECT $1 + $2 AS "TWO"') executed by
>> > regress_stats_user2 in the current_database
>> > +--
>> > +SELECT pg_stat_statements_reset((SELECT r.oid FROM pg_roles AS r
>> > WHERE r.rolname = 'regress_stats_user2'),
>> > + (SELECT d.oid from pg_database As d where datname = current_database()),
>> > + (SELECT s.queryid FROM pg_stat_statements AS s WHERE s.query =
>> > 'SELECT $1 AS "ONE"'));
>> >
>> > The query in comments is different than what is actually used?  And
>> > how is able to remove the correct statement from hash (it seems you
>> > intended to remove 'SELECT $1 AS "ONE"', but it removed 'SELECT $1 +
>> > $2 AS "TWO"')?
>> >
>>
>> One more point, the length of each line is too long in this statement,
>> try to reduce it by starting parameters for pg_stat_statements_reset
>> from next line or something like that.
>
>
> Sorry for the mistake.
> Attached patch synced the comment and SQL statement.

Okay, but you haven't answered my question:
"how is able to remove the correct statement from hash (it seems
statement intended to remove 'SELECT $1 AS "ONE"', but it removed
'SELECT $1 + $2 AS "TWO"')"?

I missed to check that question.

SELECT s.queryid FROM pg_stat_statements AS s WHERE s.query =
'SELECT $1 AS "ONE"'

With the above query to get the queryid, but there are no such queries
present in the pg_stat_statements, so the above query returns NULL as
output, and with NULL as input to the _reset() function, it takes the default
value of 0, and the reset query compares with rest of the two parameters
of userid and dbid. There is only one query that is present with that
combination, and that query is deleted.

In order to avoid such wrong results, i moved this as the first reset query,
so that there will be many queries that satisfy the condition in case if such
mistakes happens.

Regards,
Haribabu Kommi
Fujitsu Australia
On Sat, Nov 17, 2018 at 4:47 AM Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
>
> With the above query to get the queryid, but there are no such queries
> present in the pg_stat_statements, so the above query returns NULL as
> output, and with NULL as input to the _reset() function, it takes the default
> value of 0, and the reset query compares with rest of the two parameters
> of userid and dbid. There is only one query that is present with that
> combination, and that query is deleted.
>
> In order to avoid such wrong results, i moved this as the first reset query,
> so that there will be many queries that satisfy the condition in case if such
> mistakes happens.
>

Okay, that makes sense.

+--
+-- remove query ('SELECT $1 AS "ONE"') executed by two users
+--
+SELECT pg_stat_statements_reset(NULL,NULL,s.queryid)
+ FROM pg_stat_statements AS s WHERE s.query = 'SELECT $1 AS "ONE"' LIMIT 1;

In the above query, I don't see any advantage of adding LIMIT 1, so
removed in the attached patch.  I have also written a commit message
based on what you have written, kindly see if that looks okay to you
especially the credits section.

I am happy with the attached patch and will commit the same by
Wednesday if you or others don't have any further comments.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachment
On 2018-Nov-17, Haribabu Kommi wrote:

> > Okay, but you haven't answered my question:
> > "how is able to remove the correct statement from hash (it seems
> > statement intended to remove 'SELECT $1 AS "ONE"', but it removed
> > 'SELECT $1 + $2 AS "TWO"')"?
> 
> I missed to check that question.
> 
> SELECT s.queryid FROM pg_stat_statements AS s WHERE s.query =
> 'SELECT $1 AS "ONE"'
> 
> With the above query to get the queryid, but there are no such queries
> present in the pg_stat_statements, so the above query returns NULL as
> output, and with NULL as input to the _reset() function, it takes the
> default value of 0, and the reset query compares with rest of the two
> parameters of userid and dbid.

Uh, ouch!  Seems to me that this is a high-caliber foot-gun, and will
cause nasty suprises when production stats data disappear inadvertently!
Are you sure we wouldn't like to have this functionality not rely on
NULL values to mean "yeah reset everything!"  Seems to me that for safe
production use we would have to tell users to add COALESCE to parameter
values, and it'll be very tedious.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


On Sat, Nov 17, 2018 at 4:46 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>
> On 2018-Nov-17, Haribabu Kommi wrote:
>
> > > Okay, but you haven't answered my question:
> > > "how is able to remove the correct statement from hash (it seems
> > > statement intended to remove 'SELECT $1 AS "ONE"', but it removed
> > > 'SELECT $1 + $2 AS "TWO"')"?
> >
> > I missed to check that question.
> >
> > SELECT s.queryid FROM pg_stat_statements AS s WHERE s.query =
> > 'SELECT $1 AS "ONE"'
> >
> > With the above query to get the queryid, but there are no such queries
> > present in the pg_stat_statements, so the above query returns NULL as
> > output, and with NULL as input to the _reset() function, it takes the
> > default value of 0, and the reset query compares with rest of the two
> > parameters of userid and dbid.
>
> Uh, ouch!  Seems to me that this is a high-caliber foot-gun, and will
> cause nasty suprises when production stats data disappear inadvertently!
>

What is the alternative in your mind?  AFAICS, we have below alternatives:

a) Return an error for the NULL value of any argument?
b) Silently ignore if there is any NULL value argument and don't
remove any stats.
c) Just ignore the NULL value parameter and remove the stats based on
other parameters.

Currently, patch implements option (c), but we can do (a) or (b) as
well, however, that can make the API usage bit awkward.  I mean the
user has to pass the valid value for all parameters or alternatively
we need to keep some other default value of parameters which say that
ignore this particular parameter and remove stats based on other
parameters.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


On 2018-Nov-17, Amit Kapila wrote:

> On Sat, Nov 17, 2018 at 4:46 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

> > Uh, ouch!  Seems to me that this is a high-caliber foot-gun, and will
> > cause nasty suprises when production stats data disappear inadvertently!
> 
> What is the alternative in your mind?

Well, as I see this interface, it is as if
DELETE FROM ... WHERE queryid = <value>
where passing a NULL value meant to delete *all* rows regardless of
queryid, rather than only those where queryid is NULL.

> AFAICS, we have below alternatives:
> 
> a) Return an error for the NULL value of any argument?
> b) Silently ignore if there is any NULL value argument and don't
> remove any stats.
> c) Just ignore the NULL value parameter and remove the stats based on
> other parameters.

> Currently, patch implements option (c), but we can do (a) or (b) as
> well, however, that can make the API usage bit awkward.

I think option d) is to have more functions (seven functions total), and
have them all be strict (i.e. don't do anything if one argument is
NULL).  One function takes three arguments, and removes all entries
matching the three.  Other functions take two arguments, and remove
everything matching those, ignoring the third (so behaving like the
current NULL).  Others take one argument.

Maybe it doesn't make sense to provide all combinations, so it's less
than seven.  But having an interface that's inherently easy to misuse
makes me a bit nervous.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



On Sun, Nov 18, 2018 at 1:11 AM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
On 2018-Nov-17, Amit Kapila wrote:

> On Sat, Nov 17, 2018 at 4:46 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

> > Uh, ouch!  Seems to me that this is a high-caliber foot-gun, and will
> > cause nasty suprises when production stats data disappear inadvertently!
>
> What is the alternative in your mind?

Well, as I see this interface, it is as if
DELETE FROM ... WHERE queryid = <value>
where passing a NULL value meant to delete *all* rows regardless of
queryid, rather than only those where queryid is NULL.

> AFAICS, we have below alternatives:
>
> a) Return an error for the NULL value of any argument?
> b) Silently ignore if there is any NULL value argument and don't
> remove any stats.
> c) Just ignore the NULL value parameter and remove the stats based on
> other parameters.

> Currently, patch implements option (c), but we can do (a) or (b) as
> well, however, that can make the API usage bit awkward.

I think option d) is to have more functions (seven functions total), and
have them all be strict (i.e. don't do anything if one argument is
NULL).  One function takes three arguments, and removes all entries
matching the three.  Other functions take two arguments, and remove
everything matching those, ignoring the third (so behaving like the
current NULL).  Others take one argument.

Maybe it doesn't make sense to provide all combinations, so it's less
than seven.  But having an interface that's inherently easy to misuse
makes me a bit nervous.

Following are the combinations that are possible and function name as
also pg_stat_statements_reset_by_***

userid, 
dbid
queryid

userid, dbid
userid, queryid
dbid, queryid

userid, dbid, queryid -- Existing function name with arguments can work.

So 6 new functions needs to be added to cover all the above cases,
IMO, we may need functions for all combinations, because I feel some
user may have the requirement of left out combination, in case if we choose
only some combinations.

Regards,
Haribabu Kommi
Fujitsu Australia
On Mon, Nov 19, 2018 at 10:41:22AM +1100, Haribabu Kommi wrote:
> So 6 new functions needs to be added to cover all the above cases,
> IMO, we may need functions for all combinations, because I feel some
> user may have the requirement of left out combination, in case if we choose
> only some combinations.

That's bloating the interface in my opinion.

Another choice #e would be to keep the function as not strict, but
defaulting to the current database if NULL is used for the database ID
and defaulting to the current user if NULL is used for the user ID.  For
the query ID, if NULL is given in output, process all queries matching
with (database,user), removing nothing if there is no match.  If the
caller has set up a query ID the rest of the stats should not be
touched.
--
Michael

Attachment
On 2018-Nov-19, Michael Paquier wrote:

> On Mon, Nov 19, 2018 at 10:41:22AM +1100, Haribabu Kommi wrote:
> > So 6 new functions needs to be added to cover all the above cases,
> > IMO, we may need functions for all combinations, because I feel some
> > user may have the requirement of left out combination, in case if we choose
> > only some combinations.
> 
> That's bloating the interface in my opinion.

I understand.

Let's call for a vote from a larger audience.  It's important to get
this interface right, ISTM.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


On Sat, Nov 17, 2018 at 7:41 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>
> On 2018-Nov-17, Amit Kapila wrote:
>
> > On Sat, Nov 17, 2018 at 4:46 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>
> > > Uh, ouch!  Seems to me that this is a high-caliber foot-gun, and will
> > > cause nasty suprises when production stats data disappear inadvertently!
> >
> > What is the alternative in your mind?
>
> Well, as I see this interface, it is as if
> DELETE FROM ... WHERE queryid = <value>
> where passing a NULL value meant to delete *all* rows regardless of
> queryid, rather than only those where queryid is NULL.
>

If one want to understand in query form, it will be like:
DELETE FROM ... WHERE userid = <value> AND dbid = <value> AND queryid = <value>;

Now, if any of the value is NULL, we ignore it, so the case in
question would be:
DELETE FROM ... WHERE userid = <value> AND dbid = <value> OR queryid = NULL;

Now, I think it is quite possible that one can interpret it as you
have written above and then this can lead to some unintended behavior.
I think one thing we can do to avoid such cases is to make this
function strict and make the default values as InvalidOid or -1.  So,
on NULL input, we don't do anything, but for -1 or an invalid value
for the parameter, we ignore only that parameter as we are doing now.
Do you think such an interface will address your concern?

> > AFAICS, we have below alternatives:
> >
> > a) Return an error for the NULL value of any argument?
> > b) Silently ignore if there is any NULL value argument and don't
> > remove any stats.
> > c) Just ignore the NULL value parameter and remove the stats based on
> > other parameters.
>
> > Currently, patch implements option (c), but we can do (a) or (b) as
> > well, however, that can make the API usage bit awkward.
>
> I think option d) is to have more functions (seven functions total), and
> have them all be strict (i.e. don't do anything if one argument is
> NULL).  One function takes three arguments, and removes all entries
> matching the three.  Other functions take two arguments, and remove
> everything matching those, ignoring the third (so behaving like the
> current NULL).  Others take one argument.
>
> Maybe it doesn't make sense to provide all combinations, so it's less
> than seven.  But having an interface that's inherently easy to misuse
> makes me a bit nervous.
>

Yeah, we can go in that direction if the one interface idea doesn't
work.  IIRC, we have discussed having multiple interfaces for this API
and majority people seems to be in favor of single interface.  Let us
first see if there is some reasonable way to provide this
functionality with a single interface, if not, then surely we can
explore multiple interface idea.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



On Mon, Nov 19, 2018 at 1:37 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
On 2018-Nov-19, Michael Paquier wrote:

> On Mon, Nov 19, 2018 at 10:41:22AM +1100, Haribabu Kommi wrote:
> > So 6 new functions needs to be added to cover all the above cases,
> > IMO, we may need functions for all combinations, because I feel some
> > user may have the requirement of left out combination, in case if we choose
> > only some combinations.
>
> That's bloating the interface in my opinion.

I understand.

Let's call for a vote from a larger audience.  It's important to get
this interface right, ISTM.

Amit suggested another option in another mail, so total viable 
solutions that are discussed as of now are,

1. Single API with NULL input treat as invalid value
2. Multiple API to deal with NULL input of other values
3. Single API with NULL value to treat them as current user, current database
 and NULL queryid.
4. Single API with -1 as invalid value, treat NULL as no matching. (Only problem
 with this approach is till now -1 is also a valid queryid, but setting -1 as queryid
needs to be avoided.

I prefer single API. I can go with either 3 or 4.

opinion from others?

Regards,
Haribabu Kommi
Fujitsu Australia
On Mon, Nov 19, 2018 at 10:48 AM Haribabu Kommi
<kommi.haribabu@gmail.com> wrote:
> On Mon, Nov 19, 2018 at 1:37 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>>
>> On 2018-Nov-19, Michael Paquier wrote:
>>
>> > On Mon, Nov 19, 2018 at 10:41:22AM +1100, Haribabu Kommi wrote:
>> > > So 6 new functions needs to be added to cover all the above cases,
>> > > IMO, we may need functions for all combinations, because I feel some
>> > > user may have the requirement of left out combination, in case if we choose
>> > > only some combinations.
>> >
>> > That's bloating the interface in my opinion.
>>
>> I understand.
>>
>> Let's call for a vote from a larger audience.  It's important to get
>> this interface right, ISTM.
>
> 4. Single API with -1 as invalid value, treat NULL as no matching. (Only problem
>  with this approach is till now -1 is also a valid queryid, but setting -1 as queryid
> needs to be avoided.
>

Hmm, can we use 0 as default value without any such caveat?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



On Tue, Nov 20, 2018 at 2:06 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Nov 19, 2018 at 10:48 AM Haribabu Kommi
<kommi.haribabu@gmail.com> wrote:
> On Mon, Nov 19, 2018 at 1:37 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>>
>> On 2018-Nov-19, Michael Paquier wrote:
>>
>> > On Mon, Nov 19, 2018 at 10:41:22AM +1100, Haribabu Kommi wrote:
>> > > So 6 new functions needs to be added to cover all the above cases,
>> > > IMO, we may need functions for all combinations, because I feel some
>> > > user may have the requirement of left out combination, in case if we choose
>> > > only some combinations.
>> >
>> > That's bloating the interface in my opinion.
>>
>> I understand.
>>
>> Let's call for a vote from a larger audience.  It's important to get
>> this interface right, ISTM.
>
> 4. Single API with -1 as invalid value, treat NULL as no matching. (Only problem
>  with this approach is till now -1 is also a valid queryid, but setting -1 as queryid
> needs to be avoided.
>

Hmm, can we use 0 as default value without any such caveat?

Yes, with strict and 0 as default value can work. 
If there is no problem, I can go ahead with the above changes?

Regards,
Haribabu Kommi
Fujitsu Australia
On Tue, Nov 20, 2018 at 10:56 AM Haribabu Kommi
<kommi.haribabu@gmail.com> wrote:
>
> On Tue, Nov 20, 2018 at 2:06 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>> On Mon, Nov 19, 2018 at 10:48 AM Haribabu Kommi
>> <kommi.haribabu@gmail.com> wrote:
>> > On Mon, Nov 19, 2018 at 1:37 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>> >>
>> >> On 2018-Nov-19, Michael Paquier wrote:
>> >>
>> >> > On Mon, Nov 19, 2018 at 10:41:22AM +1100, Haribabu Kommi wrote:
>> >> > > So 6 new functions needs to be added to cover all the above cases,
>> >> > > IMO, we may need functions for all combinations, because I feel some
>> >> > > user may have the requirement of left out combination, in case if we choose
>> >> > > only some combinations.
>> >> >
>> >> > That's bloating the interface in my opinion.
>> >>
>> >> I understand.
>> >>
>> >> Let's call for a vote from a larger audience.  It's important to get
>> >> this interface right, ISTM.
>> >
>> > 4. Single API with -1 as invalid value, treat NULL as no matching. (Only problem
>> >  with this approach is till now -1 is also a valid queryid, but setting -1 as queryid
>> > needs to be avoided.
>> >
>>
>> Hmm, can we use 0 as default value without any such caveat?
>
>
> Yes, with strict and 0 as default value can work.
> If there is no problem, I can go ahead with the above changes?
>

I would say wait for a few days (at least 2 days) to see if Alvaro or
anybody else has an opinion on this.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


On Mon, Nov 19, 2018 at 10:48 AM Haribabu Kommi
<kommi.haribabu@gmail.com> wrote:
>
> On Mon, Nov 19, 2018 at 1:37 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>>
>> On 2018-Nov-19, Michael Paquier wrote:
>>
>> > On Mon, Nov 19, 2018 at 10:41:22AM +1100, Haribabu Kommi wrote:
>> > > So 6 new functions needs to be added to cover all the above cases,
>> > > IMO, we may need functions for all combinations, because I feel some
>> > > user may have the requirement of left out combination, in case if we choose
>> > > only some combinations.
>> >
>> > That's bloating the interface in my opinion.
>>
>> I understand.
>>
>> Let's call for a vote from a larger audience.  It's important to get
>> this interface right, ISTM.
>
>
> Amit suggested another option in another mail, so total viable
> solutions that are discussed as of now are,
>
> 1. Single API with NULL input treat as invalid value
> 2. Multiple API to deal with NULL input of other values
> 3. Single API with NULL value to treat them as current user, current database
>  and NULL queryid.
> 4. Single API with -1 as invalid value, treat NULL as no matching. (Only problem
>  with this approach is till now -1 is also a valid queryid, but setting -1 as queryid
> needs to be avoided.
>

As discussed above the problem mentioned by Hari in point-4 won't be
there if we use a default value as 0.

> I prefer single API. I can go with either 3 or 4.
>
> opinion from others?

We don't see many opinions from others, but what I can read here is the count:
Option-3: Michael, Hari
Option-4: Amit, Hari
Option-2: Alvaro

As Hari seems to be a bit more inclined towards option-4, I think we
can proceed with it.

Alvaro, we understand your concerns and it seems there is some genuine
way to address that.  Kindly let us know if you still see the issue or
you are still not comfortable?  If you are not comfortable with what
is being proposed, kindly suggest a way forward for this patch?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


On Thu, Nov 22, 2018 at 4:10 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Mon, Nov 19, 2018 at 10:48 AM Haribabu Kommi
<kommi.haribabu@gmail.com> wrote:
>
> On Mon, Nov 19, 2018 at 1:37 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>>
>> On 2018-Nov-19, Michael Paquier wrote:
>>
>> > On Mon, Nov 19, 2018 at 10:41:22AM +1100, Haribabu Kommi wrote:
>> > > So 6 new functions needs to be added to cover all the above cases,
>> > > IMO, we may need functions for all combinations, because I feel some
>> > > user may have the requirement of left out combination, in case if we choose
>> > > only some combinations.
>> >
>> > That's bloating the interface in my opinion.
>>
>> I understand.
>>
>> Let's call for a vote from a larger audience.  It's important to get
>> this interface right, ISTM.
>
>
> Amit suggested another option in another mail, so total viable
> solutions that are discussed as of now are,
>
> 1. Single API with NULL input treat as invalid value
> 2. Multiple API to deal with NULL input of other values
> 3. Single API with NULL value to treat them as current user, current database
>  and NULL queryid.
> 4. Single API with -1 as invalid value, treat NULL as no matching. (Only problem
>  with this approach is till now -1 is also a valid queryid, but setting -1 as queryid
> needs to be avoided.
>

As discussed above the problem mentioned by Hari in point-4 won't be
there if we use a default value as 0.

> I prefer single API. I can go with either 3 or 4.
>
> opinion from others?

We don't see many opinions from others, but what I can read here is the count:
Option-3: Michael, Hari
Option-4: Amit, Hari
Option-2: Alvaro

As Hari seems to be a bit more inclined towards option-4, I think we
can proceed with it.

If you want more input ont it, I'd vote for either 2 or 4. Between those two I think I have a slight favor in the direction of 2, but really not enough to voice strongly. I do feel more strongly that 1 and 3 are not good options.

--
On 2018-Nov-20, Haribabu Kommi wrote:

> > > 4. Single API with -1 as invalid value, treat NULL as no matching. (Only
> > problem
> > >  with this approach is till now -1 is also a valid queryid, but setting
> > -1 as queryid
> > > needs to be avoided.
> >
> > Hmm, can we use 0 as default value without any such caveat?
> 
> Yes, with strict and 0 as default value can work.
> If there is no problem, I can go ahead with the above changes?

I'm not sure I understand this proposal.  Does this mean that passing -1
as databaseid / userid would match all databases/users, and passing 0 as
queryid would match all queries?  This would be suprising, since OID is
unsigned so -1 is a possibly valid value ...

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


On Thu, Nov 22, 2018 at 7:02 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>
> On 2018-Nov-20, Haribabu Kommi wrote:
>
> > > > 4. Single API with -1 as invalid value, treat NULL as no matching. (Only
> > > problem
> > > >  with this approach is till now -1 is also a valid queryid, but setting
> > > -1 as queryid
> > > > needs to be avoided.
> > >
> > > Hmm, can we use 0 as default value without any such caveat?
> >
> > Yes, with strict and 0 as default value can work.
> > If there is no problem, I can go ahead with the above changes?
>
> I'm not sure I understand this proposal.  Does this mean that passing -1
> as databaseid / userid would match all databases/users, and passing 0 as
> queryid would match all queries?
>

No, for userid/databaseid also it will be 0 (aka InvalidOid).


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


On Thu, Nov 22, 2018 at 7:48 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Nov 22, 2018 at 7:02 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> >
> > On 2018-Nov-20, Haribabu Kommi wrote:
> >
> > > > > 4. Single API with -1 as invalid value, treat NULL as no matching. (Only
> > > > problem
> > > > >  with this approach is till now -1 is also a valid queryid, but setting
> > > > -1 as queryid
> > > > > needs to be avoided.
> > > >
> > > > Hmm, can we use 0 as default value without any such caveat?
> > >
> > > Yes, with strict and 0 as default value can work.
> > > If there is no problem, I can go ahead with the above changes?
> >
> > I'm not sure I understand this proposal.  Does this mean that passing -1
> > as databaseid / userid would match all databases/users, and passing 0 as
> > queryid would match all queries?
> >
>
> No, for userid/databaseid also it will be 0 (aka InvalidOid).
>

Not sure if the above statement is clear,  but what I wanted to say
was "for all the three userid/databaseid/queryid, default will be 0
(aka InvalidOid)."



-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


On 19/11/2018 06:18, Haribabu Kommi wrote:
> Amit suggested another option in another mail, so total viable 
> solutions that are discussed as of now are,
> 
> 1. Single API with NULL input treat as invalid value
> 2. Multiple API to deal with NULL input of other values
> 3. Single API with NULL value to treat them as current user, current
> database
>  and NULL queryid.
> 4. Single API with -1 as invalid value, treat NULL as no matching. (Only
> problem
>  with this approach is till now -1 is also a valid queryid, but setting
> -1 as queryid
> needs to be avoided.

Can you show examples of what these would look like?

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



On Fri, Nov 23, 2018 at 1:54 AM Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:
On 19/11/2018 06:18, Haribabu Kommi wrote:
> Amit suggested another option in another mail, so total viable 
> solutions that are discussed as of now are,
>
> 1. Single API with NULL input treat as invalid value
> 2. Multiple API to deal with NULL input of other values
> 3. Single API with NULL value to treat them as current user, current
> database
>  and NULL queryid.
> 4. Single API with -1 as invalid value, treat NULL as no matching. (Only
> problem
>  with this approach is till now -1 is also a valid queryid, but setting
> -1 as queryid
> needs to be avoided.

Can you show examples of what these would look like?

Following are some of the examples how the various options
work.

Option -1:

select pg_stat_statements_reset(NULL,NULL,NULL);
-- No change, just return. Because the function is strict.

select pg_stat_statements_reset(10,10,NULL);
-- Resets all the statements that are executed by userid 10 on database id 10.

select pg_stat_statements_reset(NULL,10,NULL);
-- Resets all the statements executed on database id 10


Option - 2:

select pg_stat_statements_reset(NULL,NULL,NULL);
-- No change, just return. Function are of type strict.

select pg_stat_statements_reset(10,10,1234)
-- Resets the query statement 1234 executed by userid 10 on database id 10

select pg_stat_statements_reset_by_userid(10)
-- Resets all the statements executed by userid 10

select pg_stat_statements_reset_by_dbid(10);
-- Rests all the statements executed on database id 10.

select pg_stat_statements_reset_by_userid_and_dbid(10, 10);
-- Resets all the statements executed by userid 10 on datbase id 10

Likewise total 7 API's.


Option -3:

select pg_stat_statements_reset(NULL,NULL,NULL);
--Similar like option-1, but when the userid or databaseid are NULL, it uses
the current_role and current_database as the options. But for the queryid
when the options is NULL, it deletes all the queries.

Rest of the details are same as option-1.

Option -4:

select pg_stat_statements_reset(NULL,NULL,NULL);
-- No change, just return. Because the function is strict.

select pg_stat_statements_reset(0,0,0);
-- Resets all the statements

select pg_stat_statements_reset(0,10,0);
-- Resets all the statements executed on database id 10

select pg_stat_statements_reset(10,0,0);
-- Resets all the statements executed on userid 10.

 
Regards,
Haribabu Kommi
Fujitsu Australia
On Thu, Nov 22, 2018 at 7:51 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Nov 22, 2018 at 7:48 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Thu, Nov 22, 2018 at 7:02 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> > >
> > > On 2018-Nov-20, Haribabu Kommi wrote:
> > >
> > > > > > 4. Single API with -1 as invalid value, treat NULL as no matching. (Only
> > > > > problem
> > > > > >  with this approach is till now -1 is also a valid queryid, but setting
> > > > > -1 as queryid
> > > > > > needs to be avoided.
> > > > >
> > > > > Hmm, can we use 0 as default value without any such caveat?
> > > >
> > > > Yes, with strict and 0 as default value can work.
> > > > If there is no problem, I can go ahead with the above changes?
> > >
> > > I'm not sure I understand this proposal.  Does this mean that passing -1
> > > as databaseid / userid would match all databases/users, and passing 0 as
> > > queryid would match all queries?
> > >
> >
> > No, for userid/databaseid also it will be 0 (aka InvalidOid).
> >
>
> Not sure if the above statement is clear,  but what I wanted to say
> was "for all the three userid/databaseid/queryid, default will be 0
> (aka InvalidOid)."
>

Is the proposal clear?  If so, can you please share your opinion even
if it is the same as previous, because, I think that will help us in
moving forward with this patch.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


On Fri, Nov 23, 2018 at 4:40 AM Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
>
>
> On Fri, Nov 23, 2018 at 1:54 AM Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:
>>
>> On 19/11/2018 06:18, Haribabu Kommi wrote:
>> > Amit suggested another option in another mail, so total viable
>> > solutions that are discussed as of now are,
>> >
>> > 1. Single API with NULL input treat as invalid value
>> > 2. Multiple API to deal with NULL input of other values
>> > 3. Single API with NULL value to treat them as current user, current
>> > database
>> >  and NULL queryid.
>> > 4. Single API with -1 as invalid value, treat NULL as no matching. (Only
>> > problem
>> >  with this approach is till now -1 is also a valid queryid, but setting
>> > -1 as queryid
>> > needs to be avoided.
>>
>> Can you show examples of what these would look like?
>
>
> Following are some of the examples how the various options
> work.
>

Do the examples provided by Haribabu helps in understanding the proposal?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


On 23/11/2018 00:09, Haribabu Kommi wrote:
> 
> On Fri, Nov 23, 2018 at 1:54 AM Peter Eisentraut
> <peter.eisentraut@2ndquadrant.com
> <mailto:peter.eisentraut@2ndquadrant.com>> wrote:
> 
>     On 19/11/2018 06:18, Haribabu Kommi wrote:
>     > Amit suggested another option in another mail, so total viable 
>     > solutions that are discussed as of now are,
>     >
>     > 1. Single API with NULL input treat as invalid value
>     > 2. Multiple API to deal with NULL input of other values
>     > 3. Single API with NULL value to treat them as current user, current
>     > database
>     >  and NULL queryid.
>     > 4. Single API with -1 as invalid value, treat NULL as no matching.
>     (Only
>     > problem
>     >  with this approach is till now -1 is also a valid queryid, but
>     setting
>     > -1 as queryid
>     > needs to be avoided.
> 
>     Can you show examples of what these would look like?
> 
> 
> Following are some of the examples how the various options
> work.
> 
> Option -1:
> 
> select pg_stat_statements_reset(NULL,NULL,NULL);
> -- No change, just return. Because the function is strict.
> 
> select pg_stat_statements_reset(10,10,NULL);
> -- Resets all the statements that are executed by userid 10 on database
> id 10.
> 
> select pg_stat_statements_reset(NULL,10,NULL);
> -- Resets all the statements executed on database id 10

If the function is strict, none of these will do anything.

My preference is for NULL to mean *all* so this is my favorite option,
except that the first query should reset everything.

+1 with that change.

> Option - 2:
> 
> select pg_stat_statements_reset(NULL,NULL,NULL);
> -- No change, just return. Function are of type strict.
> 
> select pg_stat_statements_reset(10,10,1234)
> -- Resets the query statement 1234 executed by userid 10 on database id 10
> 
> select pg_stat_statements_reset_by_userid(10)
> -- Resets all the statements executed by userid 10
> 
> select pg_stat_statements_reset_by_dbid(10);
> -- Rests all the statements executed on database id 10.
> 
> select pg_stat_statements_reset_by_userid_and_dbid(10, 10);
> -- Resets all the statements executed by userid 10 on datbase id 10
> 
> Likewise total 7 API's.

I could live with this, but I prefer as described above.

I vote +0.

> Option -3:
> 
> select pg_stat_statements_reset(NULL,NULL,NULL);
> --Similar like option-1, but when the userid or databaseid are NULL, it uses
> the current_role and current_database as the options. But for the queryid
> when the options is NULL, it deletes all the queries.
> 
> Rest of the details are same as option-1.

I don't like this one at all.  Something like resetting stats don't need
shortcuts and the dba can easily specify current_user and current_catalog.

-1 from me.

> Option -4:
> 
> select pg_stat_statements_reset(NULL,NULL,NULL);
> -- No change, just return. Because the function is strict.
> 
> select pg_stat_statements_reset(0,0,0);
> -- Resets all the statements
> 
> select pg_stat_statements_reset(0,10,0);
> -- Resets all the statements executed on database id 10
> 
> select pg_stat_statements_reset(10,0,0);
> -- Resets all the statements executed on userid 10.

This one is my very least favorite.

-1 from me.
-- 
Vik Fearing                                          +33 6 46 75 15 36
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


On Mon, Nov 26, 2018 at 5:40 PM Vik Fearing <vik.fearing@2ndquadrant.com> wrote:
>
> On 23/11/2018 00:09, Haribabu Kommi wrote:
> >
> > On Fri, Nov 23, 2018 at 1:54 AM Peter Eisentraut
> > <peter.eisentraut@2ndquadrant.com
> > <mailto:peter.eisentraut@2ndquadrant.com>> wrote:
> >
> >     On 19/11/2018 06:18, Haribabu Kommi wrote:
> >     > Amit suggested another option in another mail, so total viable
> >     > solutions that are discussed as of now are,
> >     >
> >     > 1. Single API with NULL input treat as invalid value
> >     > 2. Multiple API to deal with NULL input of other values
> >     > 3. Single API with NULL value to treat them as current user, current
> >     > database
> >     >  and NULL queryid.
> >     > 4. Single API with -1 as invalid value, treat NULL as no matching.
> >     (Only
> >     > problem
> >     >  with this approach is till now -1 is also a valid queryid, but
> >     setting
> >     > -1 as queryid
> >     > needs to be avoided.
> >
> >     Can you show examples of what these would look like?
> >
> >
> > Following are some of the examples how the various options
> > work.
> >
> > Option -1:
> >
> > select pg_stat_statements_reset(NULL,NULL,NULL);
> > -- No change, just return. Because the function is strict.
> >
> > select pg_stat_statements_reset(10,10,NULL);
> > -- Resets all the statements that are executed by userid 10 on database
> > id 10.
> >
> > select pg_stat_statements_reset(NULL,10,NULL);
> > -- Resets all the statements executed on database id 10
>
> If the function is strict, none of these will do anything.
>
> My preference is for NULL to mean *all* so this is my favorite option,
> except that the first query should reset everything.
>
> +1 with that change.
>

Thanks, Vik for sharing feedback.  So, here is a summarization of what
different people think:

Option-1: Vik
Option-2: Alvaro, Magnus
Option-3: Michael, Hari
Option-4: Amit, Hari, Magnus


We still can't conclude clearly, but I think Option-2 and Option-4 are
the ones which appear to be more popular.  Does anyone else want to
weigh in here?  If no more people share an opinion, I will go with one
among Option-2 or Option-4.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Hello

> My preference is for NULL to mean *all* so this is my favorite option,
> except that the first query should reset everything.

I am +1 for this option. NULL treat as "any" and pg_stat_statements_reset(NULL,NULL,NULL) to reset everything

We can use named notation?
https://www.postgresql.org/docs/current/sql-syntax-calling-funcs.html#SQL-SYNTAX-CALLING-FUNCS-NAMED

> select pg_stat_statements_reset(dbid := 10);

So i am -1 for multiple function options.

regards, Sergei


On Wed, Nov 28, 2018 at 4:45 PM Sergei Kornilov <sk@zsrv.org> wrote:
>
> Hello
>
> > My preference is for NULL to mean *all* so this is my favorite option,
> > except that the first query should reset everything.
>
> I am +1 for this option. NULL treat as "any" and pg_stat_statements_reset(NULL,NULL,NULL) to reset everything
>

The problem with this idea is that if someone specifies a particular
parameter using query and the query doesn't return any parameters,
then it can lead to inadvertent behavior.  For example, if user uses
something like pg_stat_statements_reset(<valid_user_id>,
<valid_db_id>, SELECT s.queryid FROM pg_stat_statements AS s WHERE
s.query = 'SELECT $1 AS "ONE"');  now, if the query doesn't return any
row, we will remove the stats for all queries that belong to
(userid,dbid).  It could be surprising for some users, that's why we
came up with option-4 where we keep the default value of parameters as
0.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


On 2018-Nov-28, Amit Kapila wrote:

> The problem with this idea is that if someone specifies a particular
> parameter using query and the query doesn't return any parameters,
> then it can lead to inadvertent behavior.  For example, if user uses
> something like pg_stat_statements_reset(<valid_user_id>,
> <valid_db_id>, SELECT s.queryid FROM pg_stat_statements AS s WHERE
> s.query = 'SELECT $1 AS "ONE"');  now, if the query doesn't return any
> row, we will remove the stats for all queries that belong to
> (userid,dbid).  It could be surprising for some users, that's why we
> came up with option-4 where we keep the default value of parameters as
> 0.

Right, I think option 4 is a clear improvement over option 1.  I can get
behind that one.  Since not many people care to vote, I think this tips
the scales enough to that side.

Thanks for your patience.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


On Wed, Nov 28, 2018 at 7:13 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>
> On 2018-Nov-28, Amit Kapila wrote:
>
> > The problem with this idea is that if someone specifies a particular
> > parameter using query and the query doesn't return any parameters,
> > then it can lead to inadvertent behavior.  For example, if user uses
> > something like pg_stat_statements_reset(<valid_user_id>,
> > <valid_db_id>, SELECT s.queryid FROM pg_stat_statements AS s WHERE
> > s.query = 'SELECT $1 AS "ONE"');  now, if the query doesn't return any
> > row, we will remove the stats for all queries that belong to
> > (userid,dbid).  It could be surprising for some users, that's why we
> > came up with option-4 where we keep the default value of parameters as
> > 0.
>
> Right, I think option 4 is a clear improvement over option 1.  I can get
> behind that one.  Since not many people care to vote, I think this tips
> the scales enough to that side.
>

Agreed.  Hari, can you change the patch as per the requirements of option-4.

> Thanks for your patience.
>

No problem.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


On Wed, Nov 28, 2018 at 10:43:35AM -0300, Alvaro Herrera wrote:
> On 2018-Nov-28, Amit Kapila wrote:
>> The problem with this idea is that if someone specifies a particular
>> parameter using query and the query doesn't return any parameters,
>> then it can lead to inadvertent behavior.  For example, if user uses
>> something like pg_stat_statements_reset(<valid_user_id>,
>> <valid_db_id>, SELECT s.queryid FROM pg_stat_statements AS s WHERE
>> s.query = 'SELECT $1 AS "ONE"');  now, if the query doesn't return any
>> row, we will remove the stats for all queries that belong to
>> (userid,dbid).  It could be surprising for some users, that's why we
>> came up with option-4 where we keep the default value of parameters as
>> 0.
>
> Right, I think option 4 is a clear improvement over option 1.  I can get
> behind that one.  Since not many people care to vote, I think this tips
> the scales enough to that side.

With option 4 (as defined in [1]), the database ID or the user ID set to
0 means that we don't apply any filter on them, which is sensible.  Now,
we have query IDs which can be set to 0, like parseable utility
statements.  When a query ID is set to 0, does this mean that all query
IDs matching 0 are reset?  Or does that mean that we don't filter out by
query ID?

[1]: https://www.postgresql.org/message-id/CAJrrPGfxtCUnbtWsyp_qLWb=U4diVrsA9RiEepffOm9_KOms3Q@mail.gmail.com),
--
Michael

Attachment
On Thu, Nov 29, 2018 at 8:46 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Wed, Nov 28, 2018 at 10:43:35AM -0300, Alvaro Herrera wrote:
> > On 2018-Nov-28, Amit Kapila wrote:
> >> The problem with this idea is that if someone specifies a particular
> >> parameter using query and the query doesn't return any parameters,
> >> then it can lead to inadvertent behavior.  For example, if user uses
> >> something like pg_stat_statements_reset(<valid_user_id>,
> >> <valid_db_id>, SELECT s.queryid FROM pg_stat_statements AS s WHERE
> >> s.query = 'SELECT $1 AS "ONE"');  now, if the query doesn't return any
> >> row, we will remove the stats for all queries that belong to
> >> (userid,dbid).  It could be surprising for some users, that's why we
> >> came up with option-4 where we keep the default value of parameters as
> >> 0.
> >
> > Right, I think option 4 is a clear improvement over option 1.  I can get
> > behind that one.  Since not many people care to vote, I think this tips
> > the scales enough to that side.
>
> With option 4 (as defined in [1]), the database ID or the user ID set to
> 0 means that we don't apply any filter on them, which is sensible.  Now,
> we have query IDs which can be set to 0, like parseable utility
> statements.
>

I think we ensure that queryId won't be 0 for utility statements.
Check below part of patch:
* For utility statements, we just hash the query string to get an ID.
  */
  if (queryId == UINT64CONST(0))
+ {
  queryId = pgss_hash_string(query, query_len);

+ /*
+ * If we are unlucky enough to get a hash of zero(invalid), use queryID
+ * as 2 instead, queryID 1 is already in use for normal statements.
+ */
+ if (queryId == UINT64CONST(0))
+ queryId = UINT64CONST(2);
+ }
+

>  When a query ID is set to 0, does this mean that all query
> IDs matching 0 are reset?  Or does that mean that we don't filter out by
> query ID?
>

It means the later one "We don't apply any filter for queryId.".

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


On Thu, Nov 29, 2018 at 09:09:48AM +0530, Amit Kapila wrote:
> On Thu, Nov 29, 2018 at 8:46 AM Michael Paquier <michael@paquier.xyz> wrote:
>> With option 4 (as defined in [1]), the database ID or the user ID set to
>> 0 means that we don't apply any filter on them, which is sensible.  Now,
>> we have query IDs which can be set to 0, like parseable utility
>> statements.
>
> I think we ensure that queryId won't be 0 for utility statements.

Oh, pgss_store() makes sure to enforce the query ID which has been
calculated in the post-analyze hook.  That's a bit weird to be honest.

>>  When a query ID is set to 0, does this mean that all query
>> IDs matching 0 are reset?  Or does that mean that we don't filter out by
>> query ID?
>>
>
> It means the later one "We don't apply any filter for queryId.".

Seems like it will be possible to live happily with option 4 then.
--
Michael

Attachment
On 28/11/2018 14:43, Alvaro Herrera wrote:
> On 2018-Nov-28, Amit Kapila wrote:
> 
>> The problem with this idea is that if someone specifies a particular
>> parameter using query and the query doesn't return any parameters,
>> then it can lead to inadvertent behavior.  For example, if user uses
>> something like pg_stat_statements_reset(<valid_user_id>,
>> <valid_db_id>, SELECT s.queryid FROM pg_stat_statements AS s WHERE
>> s.query = 'SELECT $1 AS "ONE"');  now, if the query doesn't return any
>> row, we will remove the stats for all queries that belong to
>> (userid,dbid).  It could be surprising for some users, that's why we
>> came up with option-4 where we keep the default value of parameters as
>> 0.
> 
> Right, I think option 4 is a clear improvement over option 1.  I can get
> behind that one.  Since not many people care to vote, I think this tips
> the scales enough to that side.

I think option 4 is the right choice, and with named parameters and
default values it will also be very convenient.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



On Thu, Nov 29, 2018 at 1:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Nov 28, 2018 at 7:13 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>
> On 2018-Nov-28, Amit Kapila wrote:
>
> > The problem with this idea is that if someone specifies a particular
> > parameter using query and the query doesn't return any parameters,
> > then it can lead to inadvertent behavior.  For example, if user uses
> > something like pg_stat_statements_reset(<valid_user_id>,
> > <valid_db_id>, SELECT s.queryid FROM pg_stat_statements AS s WHERE
> > s.query = 'SELECT $1 AS "ONE"');  now, if the query doesn't return any
> > row, we will remove the stats for all queries that belong to
> > (userid,dbid).  It could be surprising for some users, that's why we
> > came up with option-4 where we keep the default value of parameters as
> > 0.
>
> Right, I think option 4 is a clear improvement over option 1.  I can get
> behind that one.  Since not many people care to vote, I think this tips
> the scales enough to that side.
>

Agreed.  Hari, can you change the patch as per the requirements of option-4.

Apologies for delay. Thanks to all your opinions.

Attached is the updated patch as per the option-4 and added a test also to ensure
the function strictness.

Regards,
Haribabu Kommi
Fujitsu Australia
Attachment
On Wed, Dec 12, 2018 at 3:54 PM Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
>
>
> On Thu, Nov 29, 2018 at 1:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>> On Wed, Nov 28, 2018 at 7:13 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>> >
>> > On 2018-Nov-28, Amit Kapila wrote:
>> >
>> > > The problem with this idea is that if someone specifies a particular
>> > > parameter using query and the query doesn't return any parameters,
>> > > then it can lead to inadvertent behavior.  For example, if user uses
>> > > something like pg_stat_statements_reset(<valid_user_id>,
>> > > <valid_db_id>, SELECT s.queryid FROM pg_stat_statements AS s WHERE
>> > > s.query = 'SELECT $1 AS "ONE"');  now, if the query doesn't return any
>> > > row, we will remove the stats for all queries that belong to
>> > > (userid,dbid).  It could be surprising for some users, that's why we
>> > > came up with option-4 where we keep the default value of parameters as
>> > > 0.
>> >
>> > Right, I think option 4 is a clear improvement over option 1.  I can get
>> > behind that one.  Since not many people care to vote, I think this tips
>> > the scales enough to that side.
>> >
>>
>> Agreed.  Hari, can you change the patch as per the requirements of option-4.
>
>
> Apologies for delay. Thanks to all your opinions.
>
> Attached is the updated patch as per the option-4 and added a test also to ensure
> the function strictness.
>

Thanks for the updated patch. I will look into it in the next few days.



-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


On Wed, Nov 28, 2018 at 1:43 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> Right, I think option 4 is a clear improvement over option 1.  I can get
> behind that one.  Since not many people care to vote, I think this tips
> the scales enough to that side.

I'm showing up very late to the party here, but I like option 1 best.
I feel like the SQL standard has a pretty clear idea that NULL is how
you represent a value is unknown, which shows up in a lot of places.
Deciding that we're going to use a different sentinel value in this
one case because NULL is a confusing concept in general seems pretty
strange to me.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


On Sat, Dec 15, 2018 at 12:14 AM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Wed, Nov 28, 2018 at 1:43 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> > Right, I think option 4 is a clear improvement over option 1.  I can get
> > behind that one.  Since not many people care to vote, I think this tips
> > the scales enough to that side.
>
> I'm showing up very late to the party here,
>

Not a problem, people like you are always welcome.

> but I like option 1 best.
>

You are not alone in this camp, so, IIUC, below are voting by different people

Option-1: Vik, Sergei, Robert
Option-2: Alvaro, Magnus
Option-3: Michael, Hari
Option-4: Amit, Hari, Magnus, Alvaro, Michael, Peter

Some people's name is present in two options as they are okay with
either one of those.  I see that now more people are in favor of
option-1, but still, the tilt is more towards option-4.

> I feel like the SQL standard has a pretty clear idea that NULL is how
> you represent a value is unknown, which shows up in a lot of places.
> Deciding that we're going to use a different sentinel value in this
> one case because NULL is a confusing concept in general seems pretty
> strange to me.
>

I agree that NULL seems to be the attractive choice for a default
value, but we have to also see what it means?  In this case, NULL will
mean 'all' which doesn't go with generally what NULL means (aka
unknown, only NULL can be compared with other NULL).  There are a few
people on this thread who feel that having NULL can lead to misuse of
this API [1] as explained here and probably we need to use some
workaround for it to be used in production [2].

[1] - https://www.postgresql.org/message-id/CAA4eK1LicqWY55XxmahQXti4RjQ28iuASAk1X8%2ByKX0J051_VQ%40mail.gmail.com
[2] - https://www.postgresql.org/message-id/20181117111653.cetidngkgol5e5xn%40alvherre.pgsql

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


On 14/12/2018 19:44, Robert Haas wrote:
> I'm showing up very late to the party here, but I like option 1 best.
> I feel like the SQL standard has a pretty clear idea that NULL is how
> you represent a value is unknown, which shows up in a lot of places.
> Deciding that we're going to use a different sentinel value in this
> one case because NULL is a confusing concept in general seems pretty
> strange to me.

[The difference between #1 and #4 is whether "any" is represented as
NULL or 0.]

An example:

select pg_stat_statements_reset(
10,
(select min(oid) from pg_database where datname like 'test%'),
1234);

(This is obviously a weird example, but it illustrates the
language-theoretic point.)

If you have no matching databases, under #1 this would then apply to all
databases.  Under #4 it would not.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


On 2018-Dec-15, Peter Eisentraut wrote:

> An example:
> 
> select pg_stat_statements_reset(
> 10,
> (select min(oid) from pg_database where datname like 'test%'),
> 1234);
> 
> (This is obviously a weird example, but it illustrates the
> language-theoretic point.)

This is not at all weird.  Lack of clarity on this point is the reason
that spawned this whole sub-thread, because a reset was removing data
that was not about the query that was intended.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: New function pg_stat_statements_reset_query() to resetstatistics of a specific query

From
Kyotaro HORIGUCHI
Date:
Hello. One more weight comes here.

At Sat, 15 Dec 2018 08:03:46 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in
<CAA4eK1JHaibcX_Sf9eboDFV-OhHAw93XiD46rK=hnzSNHRD3oA@mail.gmail.com>
> On Sat, Dec 15, 2018 at 12:14 AM Robert Haas <robertmhaas@gmail.com> wrote:
> >
> > On Wed, Nov 28, 2018 at 1:43 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> > > Right, I think option 4 is a clear improvement over option 1.  I can get
> > > behind that one.  Since not many people care to vote, I think this tips
> > > the scales enough to that side.
> >
> > I'm showing up very late to the party here,
> >
> 
> Not a problem, people like you are always welcome.
> 
> > but I like option 1 best.
> >
> 
> You are not alone in this camp, so, IIUC, below are voting by different people
> 
> Option-1: Vik, Sergei, Robert
> Option-2: Alvaro, Magnus
> Option-3: Michael, Hari
> Option-4: Amit, Hari, Magnus, Alvaro, Michael, Peter
> 
> Some people's name is present in two options as they are okay with
> either one of those.  I see that now more people are in favor of
> option-1, but still, the tilt is more towards option-4.
> 
> > I feel like the SQL standard has a pretty clear idea that NULL is how
> > you represent a value is unknown, which shows up in a lot of places.
> > Deciding that we're going to use a different sentinel value in this
> > one case because NULL is a confusing concept in general seems pretty
> > strange to me.
> >
> 
> I agree that NULL seems to be the attractive choice for a default
> value, but we have to also see what it means?  In this case, NULL will
> mean 'all' which doesn't go with generally what NULL means (aka
> unknown, only NULL can be compared with other NULL).  There are a few
> people on this thread who feel that having NULL can lead to misuse of
> this API [1] as explained here and probably we need to use some
> workaround for it to be used in production [2].
> [1] - https://www.postgresql.org/message-id/CAA4eK1LicqWY55XxmahQXti4RjQ28iuASAk1X8%2ByKX0J051_VQ%40mail.gmail.com
> [2] - https://www.postgresql.org/message-id/20181117111653.cetidngkgol5e5xn%40alvherre.pgsql


Although #1 seems cleaner to me, the fear of inadvertent removal
of all queries with quite-common usage wins.  So I vote for #4
for now, supposing pg_stat_statements_reset(0, 0, 0) as
all-clear.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



On Wed, Dec 12, 2018 at 3:54 PM Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
>
> On Thu, Nov 29, 2018 at 1:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>> >
>>
>> Agreed.  Hari, can you change the patch as per the requirements of option-4.
>
>
> Apologies for delay. Thanks to all your opinions.
>
> Attached is the updated patch as per the option-4 and added a test also to ensure
> the function strictness.
>

Can you add a few examples in the documentation?  See if it is
possible to extend the existing documentation section (F.29.4),
otherwise, add a new section.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


On Sat, Dec 15, 2018 at 12:14 AM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Wed, Nov 28, 2018 at 1:43 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> > Right, I think option 4 is a clear improvement over option 1.  I can get
> > behind that one.  Since not many people care to vote, I think this tips
> > the scales enough to that side.
>
> I'm showing up very late to the party here, but I like option 1 best.
> I feel like the SQL standard has a pretty clear idea that NULL is how
> you represent a value is unknown, which shows up in a lot of places.
> Deciding that we're going to use a different sentinel value in this
> one case because NULL is a confusing concept in general seems pretty
> strange to me.
>

The proposed patch has used option-4.  I am not sure if you are
completely convinced or not, but I hope that you won't mind too much
if we pursue with option-4.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


On 2018-Dec-17, Kyotaro HORIGUCHI wrote:

> Although #1 seems cleaner to me, the fear of inadvertent removal
> of all queries with quite-common usage wins.  So I vote for #4
> for now, supposing pg_stat_statements_reset(0, 0, 0) as
> all-clear.

Right -- also keeping in mind that 0 is the default value too, so the
old spelling of pg_stat_statements_reset() has the same behavior as
previously, ie., remove everything.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


On Tue, Dec 18, 2018 at 3:53 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Dec 12, 2018 at 3:54 PM Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
>
> On Thu, Nov 29, 2018 at 1:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>> >
>>
>> Agreed.  Hari, can you change the patch as per the requirements of option-4.
>
>
> Apologies for delay. Thanks to all your opinions.
>
> Attached is the updated patch as per the option-4 and added a test also to ensure
> the function strictness.
>

Can you add a few examples in the documentation?  See if it is
possible to extend the existing documentation section (F.29.4),
otherwise, add a new section.

Apologies for the delay.
Attached the updated patch with addition of few examples usage
of pg_stat_statements_reset() function with new parameters to the
existing sample output.

Regards,
Haribabu Kommi
Fujitsu Australia
Attachment
On Tue, Jan 8, 2019 at 6:00 AM Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
>
> On Tue, Dec 18, 2018 at 3:53 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>> Can you add a few examples in the documentation?  See if it is
>> possible to extend the existing documentation section (F.29.4),
>> otherwise, add a new section.
>
>
> Apologies for the delay.
>

No problem.

> Attached the updated patch with addition of few examples usage
> of pg_stat_statements_reset() function with new parameters to the
> existing sample output.
>

Few minor comments:
1.
bench=# SELECT query, calls, total_time, rows, 100.0 * shared_blks_hit /
               nullif(shared_blks_hit + shared_blks_read, 0) AS hit_percent
          FROM pg_stat_statements ORDER BY total_time DESC LIMIT 5;

-[ RECORD 5 ]--------------------------------------------------------------------
query       | vacuum analyze pgbench_accounts
calls       | 1
total_time  | 136.448116
rows        | 0
hit_percent | 99.9201915403032721

There is nothing between the first and second time you ran this query,
so where did this vacuum analyze .. record came?

2.
bench=# SELECT pg_stat_statements_reset(0,0,s.queryid) FROM
pg_stat_statements AS s
bench-# WHERE s.query = 'UPDATE pgbench_branches SET bbalance =
bbalance + $1 WHERE bid = $2';

bench=#

I think it would be good if you show the output of
pg_stat_statements_reset statement instead of showing empty line.
Another minor point is that in the second statement
(pg_stat_statements_reset), you seem to have made it a two-line
statement whereas first one looks to be a single-line statement, it
would be good from the readability point of view if both looks same.
I would prefer the second to look similar to the first one.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



On Wed, Jan 9, 2019 at 2:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
Few minor comments:

Thanks for the review.
 
1.
bench=# SELECT query, calls, total_time, rows, 100.0 * shared_blks_hit /
               nullif(shared_blks_hit + shared_blks_read, 0) AS hit_percent
          FROM pg_stat_statements ORDER BY total_time DESC LIMIT 5;

-[ RECORD 5 ]--------------------------------------------------------------------
query       | vacuum analyze pgbench_accounts
calls       | 1
total_time  | 136.448116
rows        | 0
hit_percent | 99.9201915403032721

There is nothing between the first and second time you ran this query,
so where did this vacuum analyze .. record came?

Because of the pg_stat_statements_reset() function call to reset the first
query, when the query executed second time the above statement is appeared
as the select query is limiting the result to 5.
 
2.
bench=# SELECT pg_stat_statements_reset(0,0,s.queryid) FROM
pg_stat_statements AS s
bench-# WHERE s.query = 'UPDATE pgbench_branches SET bbalance =
bbalance + $1 WHERE bid = $2';

bench=#

I think it would be good if you show the output of
pg_stat_statements_reset statement instead of showing empty line.

The pg_stat_statements_reset() function just returns void, because 
of this reason, already existing _reset() call also just lists the empty
line, I also followed the same.

I feel it is better to add the return data for all the _reset() function calls
or leave it as empty.
 
Another minor point is that in the second statement
(pg_stat_statements_reset), you seem to have made it a two-line
statement whereas first one looks to be a single-line statement, it
would be good from the readability point of view if both looks same.
I would prefer the second to look similar to the first one.

OK. Corrected.

Updated patch attached.

Regards,
Haribabu Kommi
Fujitsu Australia
Attachment
On Wed, Jan 9, 2019 at 10:26 AM Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
>
>
> On Wed, Jan 9, 2019 at 2:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>> Another minor point is that in the second statement
>> (pg_stat_statements_reset), you seem to have made it a two-line
>> statement whereas first one looks to be a single-line statement, it
>> would be good from the readability point of view if both looks same.
>> I would prefer the second to look similar to the first one.
>
>
> OK. Corrected.
>
> Updated patch attached.
>

I am happy with this patch now, attached is the new version of the
patch where I have slightly modified the commit message, see if that
looks fine to you.  I will push this patch tomorrow morning (after
again going through it) unless someone has any objections or see any
other issues.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachment

On Thu, Jan 10, 2019 at 2:32 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Jan 9, 2019 at 10:26 AM Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
>
>
> On Wed, Jan 9, 2019 at 2:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>> Another minor point is that in the second statement
>> (pg_stat_statements_reset), you seem to have made it a two-line
>> statement whereas first one looks to be a single-line statement, it
>> would be good from the readability point of view if both looks same.
>> I would prefer the second to look similar to the first one.
>
>
> OK. Corrected.
>
> Updated patch attached.
>

I am happy with this patch now, attached is the new version of the
patch where I have slightly modified the commit message, see if that
looks fine to you.  I will push this patch tomorrow morning (after
again going through it) unless someone has any objections or see any
other issues.

Thanks for changes. LGTM.

Regards,
Haribabu Kommi
Fujitsu Australia
On Thu, Jan 10, 2019 at 12:11 PM Haribabu Kommi
<kommi.haribabu@gmail.com> wrote:
> On Thu, Jan 10, 2019 at 2:32 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>> I am happy with this patch now, attached is the new version of the
>> patch where I have slightly modified the commit message, see if that
>> looks fine to you.  I will push this patch tomorrow morning (after
>> again going through it) unless someone has any objections or see any
>> other issues.
>
>
> Thanks for changes. LGTM.
>

Pushed.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com