Thread: New SQL counter statistics view (pg_stat_sql)

New SQL counter statistics view (pg_stat_sql)

From
Haribabu Kommi
Date:
This is a new statistics view that is used to provide the number of
SQL operations that are
happened on a particular interval of time. This view is useful for the
system to find out the
pattern of the operations that are happening in the instance during
particular interval of
time.

Following is the more or less columns and their details of the pg_stat_sql view.

postgres=# \d pg_stat_sql          View "pg_catalog.pg_stat_sql"  Column    |           Type           | Modifiers
-------------+--------------------------+-----------selects     | bigint                   |inserts     | bigint
          |deletes     | bigint                   |updates     | bigint                   |declares    | bigint
         |fetches     | bigint                   |copies      | bigint                   |reindexes   | bigint
        |truncates   | bigint                   |stats_reset | timestamp with time zone |
 

The SQL counters gets updated only for the external SQL queries, not for
the SQL queries that are generated internally. The counters gets updated
at exec_simple_query and exec_execute_message functions.

User can reset the SQL counter statistics using an exposed function.
The Stats collection can be turned on/off using a GUC also.

Any comments/objections in providing a patch for the same?

Regards,
Hari Babu
Fujitsu Australia



Re: New SQL counter statistics view (pg_stat_sql)

From
Tom Lane
Date:
Haribabu Kommi <kommi.haribabu@gmail.com> writes:
> This is a new statistics view that is used to provide the number of
> SQL operations that are
> happened on a particular interval of time. This view is useful for the
> system to find out the
> pattern of the operations that are happening in the instance during
> particular interval of
> time.

> Following is the more or less columns and their details of the pg_stat_sql view.

> postgres=# \d pg_stat_sql
>            View "pg_catalog.pg_stat_sql"
>    Column    |           Type           | Modifiers
> -------------+--------------------------+-----------
>  selects     | bigint                   |
>  inserts     | bigint                   |
>  deletes     | bigint                   |
>  updates     | bigint                   |
>  declares    | bigint                   |
>  fetches     | bigint                   |
>  copies      | bigint                   |
>  reindexes   | bigint                   |
>  truncates   | bigint                   |
>  stats_reset | timestamp with time zone |


1. This set of counters seems remarkably random.  Why, for instance,
count reindexes but not original index creations?

2. What will you do with cases such as SELECTs containing modifying CTEs?
Or EXPLAIN ANALYZE?  Does CLUSTER count as a REINDEX?  Does REFRESH
MATERIALIZED VIEW count as a SELECT?  (Or maybe it's an INSERT?)

If you're going to be selective about what you count, you're forever
going to be fielding complaints from users who are unhappy that you
didn't count some statement type they care about, or feel that you
misclassified some complex case.

I'm inclined to suggest you forget this approach and propose a single
counter for "SQL commands executed", which avoids all of the above
definitional problems.  People who need more detail than that are
probably best advised to look to contrib/pg_stat_statements, anyway.

Also, if you do that, there hardly seems a need for a whole new view.
You could just add a column to pg_stat_database.  The incremental
maintenance effort doesn't seem enough to justify its own GUC, either.
        regards, tom lane



Re: New SQL counter statistics view (pg_stat_sql)

From
Haribabu Kommi
Date:
On Sun, Aug 21, 2016 at 1:17 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Haribabu Kommi <kommi.haribabu@gmail.com> writes:
>> This is a new statistics view that is used to provide the number of
>> SQL operations that are
>> happened on a particular interval of time. This view is useful for the
>> system to find out the
>> pattern of the operations that are happening in the instance during
>> particular interval of
>> time.
>
> 1. This set of counters seems remarkably random.  Why, for instance,
> count reindexes but not original index creations?

I thought of adding the columns to the view that are mostly used(excluding
all DDL commands), otherwise the view columns fill with all the commands
that are supported by PostgreSQL.

> 2. What will you do with cases such as SELECTs containing modifying CTEs?
> Or EXPLAIN ANALYZE?  Does CLUSTER count as a REINDEX?  Does REFRESH
> MATERIALIZED VIEW count as a SELECT?  (Or maybe it's an INSERT?)

Cluster commands will be calculated as clusters, explain needs some
logic to identify what SQL operation exactly. This is because of using
nodeTag data from the parseTree structure to identify what type of SQL
statement it is.

The counters will not updated for any SQL type that is not as part of the view.

> If you're going to be selective about what you count, you're forever
> going to be fielding complaints from users who are unhappy that you
> didn't count some statement type they care about, or feel that you
> misclassified some complex case.

Yes, I agree that users may complain regarding with limited columns that are not
sufficient for their use. But with the mostly used columns, this view
may be useful for
some of the users to find out the pattern of the SQL operations that
are happening
on the instance.

> I'm inclined to suggest you forget this approach and propose a single
> counter for "SQL commands executed", which avoids all of the above
> definitional problems.  People who need more detail than that are
> probably best advised to look to contrib/pg_stat_statements, anyway.

I will add a new column to the pg_stat_database, to track the number of SQL
operations that are happened on this particular database.

Regards,
Hari Babu
Fujitsu Australia



Re: New SQL counter statistics view (pg_stat_sql)

From
Robert Haas
Date:
On Sat, Aug 20, 2016 at 11:17 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I'm inclined to suggest you forget this approach and propose a single
> counter for "SQL commands executed", which avoids all of the above
> definitional problems.  People who need more detail than that are
> probably best advised to look to contrib/pg_stat_statements, anyway.

I disagree.  I think SQL commands executed, lumping absolutely
everything together, really isn't much use.  Haribabu's categorization
scheme seems to need some work, but the idea of categorizing
statements by type and counting executions per type seems very
reasonable.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: New SQL counter statistics view (pg_stat_sql)

From
Andres Freund
Date:
On 2016-08-22 13:54:43 -0400, Robert Haas wrote:
> On Sat, Aug 20, 2016 at 11:17 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > I'm inclined to suggest you forget this approach and propose a single
> > counter for "SQL commands executed", which avoids all of the above
> > definitional problems.  People who need more detail than that are
> > probably best advised to look to contrib/pg_stat_statements, anyway.
> 
> I disagree.  I think SQL commands executed, lumping absolutely
> everything together, really isn't much use.

I'm inclined to agree. I think that's a quite useful stat when looking
at an installation one previously didn't have a lot of interaction with.


> Haribabu's categorization
> scheme seems to need some work, but the idea of categorizing
> statements by type and counting executions per type seems very
> reasonable.

I'd consider instead using something like COALESCE(commandType,
nodeTag(Query->utilityStmt)) as the categories. Not sure if I'd even
pivot that.

Andres



Re: New SQL counter statistics view (pg_stat_sql)

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> On 2016-08-22 13:54:43 -0400, Robert Haas wrote:
>> On Sat, Aug 20, 2016 at 11:17 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> I'm inclined to suggest you forget this approach and propose a single
>>> counter for "SQL commands executed", which avoids all of the above
>>> definitional problems.  People who need more detail than that are
>>> probably best advised to look to contrib/pg_stat_statements, anyway.

>> I disagree.  I think SQL commands executed, lumping absolutely
>> everything together, really isn't much use.

> I'm inclined to agree. I think that's a quite useful stat when looking
> at an installation one previously didn't have a lot of interaction with.

Well, let's at least have an "other" category so you can add up the
counters and get a meaningful total.
        regards, tom lane



Re: New SQL counter statistics view (pg_stat_sql)

From
Gavin Flower
Date:
On 23/08/16 08:27, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
>> On 2016-08-22 13:54:43 -0400, Robert Haas wrote:
>>> On Sat, Aug 20, 2016 at 11:17 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>>> I'm inclined to suggest you forget this approach and propose a single
>>>> counter for "SQL commands executed", which avoids all of the above
>>>> definitional problems.  People who need more detail than that are
>>>> probably best advised to look to contrib/pg_stat_statements, anyway.
>>> I disagree.  I think SQL commands executed, lumping absolutely
>>> everything together, really isn't much use.
>> I'm inclined to agree. I think that's a quite useful stat when looking
>> at an installation one previously didn't have a lot of interaction with.
> Well, let's at least have an "other" category so you can add up the
> counters and get a meaningful total.
>
>             regards, tom lane
>
>
Initially I thought of something I thought was factious, but then 
realized it might actually be both practicable & useful...

How about 2 extra categories (if appropriate!!!):
1. Things that actually might be sort of as fitting in 2 or more of the   existing categories, or there is an ambiguity
asto which category   is appropriate.
 
2. Things that don't fit into any existing category

This was inspired by a real use case, in a totally unrelated area - but 
where I attempted to ensure counts were in the most useful categories I 
was able to provide.  The user had listed categories, but I found that 
the data didn't always fit neatly into them as specified.


Cheers,

Gavin





Re: New SQL counter statistics view (pg_stat_sql)

From
neha khatri
Date:
>Andres Freund <andres@anarazel.de> writes:
>> On 2016-08-22 13:54:43 -0400, Robert Haas wrote:
>> On Sat, Aug 20, 2016 at 11:17 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>>> I'm inclined to suggest you forget this approach and propose a single
>>>> counter for "SQL commands executed", which avoids all of the above
>>>> definitional problems.  People who need more detail than that are
>>>> probably best advised to look to contrib/pg_stat_statements, anyway.

>>> I disagree.  I think SQL commands executed, lumping absolutely
>>> everything together, really isn't much use.

>> I'm inclined to agree. I think that's a quite useful stat when looking
>> at an installation one previously didn't have a lot of interaction with.

>Well, let's at least have an "other" category so you can add up the
>counters and get a meaningful total.

How would that meaningful total might help a user. What can a user might analyse with the counter in 'other' category.


Neha

Re: New SQL counter statistics view (pg_stat_sql)

From
Gavin Flower
Date:
On 24/08/16 12:02, neha khatri wrote:
> >Andres Freund <andres@anarazel.de <mailto:andres@anarazel.de>> writes:
> >> On 2016-08-22 13:54:43 -0400, Robert Haas wrote:
> >> On Sat, Aug 20, 2016 at 11:17 AM, Tom Lane <tgl@sss.pgh.pa.us 
> <mailto:tgl@sss.pgh.pa.us>> wrote:
> >>>> I'm inclined to suggest you forget this approach and propose a single
> >>>> counter for "SQL commands executed", which avoids all of the above
> >>>> definitional problems.  People who need more detail than that are
> >>>> probably best advised to look to contrib/pg_stat_statements, anyway.
>
> >>> I disagree.  I think SQL commands executed, lumping absolutely
> >>> everything together, really isn't much use.
>
> >> I'm inclined to agree. I think that's a quite useful stat when looking
> >> at an installation one previously didn't have a lot of interaction 
> with.
>
> >Well, let's at least have an "other" category so you can add up the
> >counters and get a meaningful total.
>
> How would that meaningful total might help a user. What can a user 
> might analyse with the counter in 'other' category.
>
>
> Neha
>
The user could then judge if there were a significant number of examples 
not covered in the other categories - this may, or may not, be a 
problem; depending on the use case.


Cheers,
Gavin




Re: New SQL counter statistics view (pg_stat_sql)

From
neha khatri
Date:
<div dir="ltr"><div class="gmail_extra"><br /><div class="gmail_quote">On Wed, Aug 24, 2016 at 10:07 AM, Gavin Flower
<spandir="ltr"><<a href="mailto:GavinFlower@archidevsys.co.nz"
target="_blank">GavinFlower@archidevsys.co.nz</a>></span>wrote:<br /><blockquote class="gmail_quote" style="margin:0
00 .8ex;border-left:1px #ccc solid;padding-left:1ex">On 24/08/16 12:02, neha khatri wrote:<br /><blockquote
class="gmail_quote"style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class=""> >Andres
Freund<<a href="mailto:andres@anarazel.de" target="_blank">andres@anarazel.de</a> <mailto:<a
href="mailto:andres@anarazel.de"target="_blank">andres@anarazel.de</a>>> writes:<br /> >> On 2016-08-22
13:54:43-0400, Robert Haas wrote:<br /></span><span class=""> >> On Sat, Aug 20, 2016 at 11:17 AM, Tom Lane
<<ahref="mailto:tgl@sss.pgh.pa.us" target="_blank">tgl@sss.pgh.pa.us</a> <mailto:<a
href="mailto:tgl@sss.pgh.pa.us"target="_blank">tgl@sss.pgh.pa.us</a>>> wrote:<br /> >>>> I'm inclined
tosuggest you forget this approach and propose a single<br /> >>>> counter for "SQL commands executed",
whichavoids all of the above<br /> >>>> definitional problems.  People who need more detail than that
are<br/> >>>> probably best advised to look to contrib/pg_stat_statements, anyway.<br /><br /> >>>
Idisagree.  I think SQL commands executed, lumping absolutely<br /> >>> everything together, really isn't much
use.<br/><br /> >> I'm inclined to agree. I think that's a quite useful stat when looking<br /> >> at an
installationone previously didn't have a lot of interaction with.<br /><br /> >Well, let's at least have an "other"
categoryso you can add up the<br /> >counters and get a meaningful total.<br /><br /> How would that meaningful
totalmight help a user. What can a user might analyse with the counter in 'other' category.<br /><br /><br /><br
/></span></blockquote>The user could then judge if there were a significant number of examples not covered in the other
categories- this may, or may not, be a problem; depending on the use case.<br /><br /><br /> Cheers,<br /> Gavin<br
/><br/></blockquote></div>For the user to be able to judge that whether the number in the 'other' category is a problem
ornot, the user is also required to know what all might fall under the 'other' category. It may not be good to say that
_anything_that is not part of the already defined category is part of 'other'. Probably, 'other' should also be a set
ofpredefined operations.</div><div class="gmail_extra"><br /></div><div class="gmail_extra">Neha</div></div> 

Re: New SQL counter statistics view (pg_stat_sql)

From
Haribabu Kommi
Date:
On Tue, Aug 23, 2016 at 3:59 AM, Andres Freund <andres@anarazel.de> wrote:
>> Haribabu's categorization
>> scheme seems to need some work, but the idea of categorizing
>> statements by type and counting executions per type seems very
>> reasonable.
>
> I'd consider instead using something like COALESCE(commandType,
> nodeTag(Query->utilityStmt)) as the categories. Not sure if I'd even
> pivot that.

CommandType means to use select, insert, update, delete and utility
as the categorization to display the counters? I think I am not getting
your point correctly.

Apart from the above, here are the following list of command tags that
are generated in the code, I took only the first word of the command tag
just to see how many categories present. The number indicates the
subset of operations or number of types it is used. Like create table,
create function and etc.

insert
delete
update
select (6)
transaction (10)
declare
close (2)
move
fetch
create (37)
drop (36)
Alter (56)
import
truncate
comment
security
copy
grant
revoke
notify
listen
unlisten
load
cluster
vacuum
analyze
explain
refresh
set (2)
reset
show
discard (4)
reassign
lock
checkpoint
reindex
prepare
execute
deallocate

we can try to pick something from the above list also for main
categories and rest will fall under
other.


Regards,
Hari Babu
Fujitsu Australia



Re: New SQL counter statistics view (pg_stat_sql)

From
Alvaro Herrera
Date:
Haribabu Kommi wrote:

> Apart from the above, here are the following list of command tags that
> are generated in the code, I took only the first word of the command tag
> just to see how many categories present. The number indicates the
> subset of operations or number of types it is used. Like create table,
> create function and etc.

Sounds about right.  I suppose all those cases that you aggregated here
would expand to full tags in the actual code.  I furthermore suppose
that some of these could be ignored, such as the transaction ones and
things like load, lock, move, fetch, discard, deallocate (maybe lump
them all together under "other", or some other rough categorization, as
Tom suggests).  Also, for many of these commands it's probably relevant
whether they are acting on a temporary object or not; we should either
count these separately, or not count the temp ones at all.

> insert
> delete
> update
> select (6)
> transaction (10)
> declare
> close (2)
> move
> fetch
> create (37)
> drop (36)
> Alter (56)
> import
> truncate
> comment
> security
> copy
> grant
> revoke
> notify
> listen
> unlisten
> load
> cluster
> vacuum
> analyze
> explain
> refresh
> set (2)
> reset
> show
> discard (4)
> reassign
> lock
> checkpoint
> reindex
> prepare
> execute
> deallocate

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: New SQL counter statistics view (pg_stat_sql)

From
Haribabu Kommi
Date:


On Wed, Aug 31, 2016 at 3:19 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> Haribabu Kommi wrote:
>
>> Apart from the above, here are the following list of command tags that
>> are generated in the code, I took only the first word of the command tag
>> just to see how many categories present. The number indicates the
>> subset of operations or number of types it is used. Like create table,
>> create function and etc.
>
> Sounds about right.  I suppose all those cases that you aggregated here
> would expand to full tags in the actual code.  I furthermore suppose
> that some of these could be ignored, such as the transaction ones and
> things like load, lock, move, fetch, discard, deallocate (maybe lump
> them all together under "other", or some other rough categorization, as
> Tom suggests).

Following is the pg_stat_sql view with the SQL categories that I considered
that are important. Rest of the them will be shown under others category.

postgres=# \d pg_stat_sql
             View "pg_catalog.pg_stat_sql"
     Column      |           Type           | Modifiers 
-----------------+--------------------------+-----------
 inserts         | bigint                   | 
 deletes         | bigint                   | 
 updates         | bigint                   | 
 selects         | bigint                   | 
 declare_cursors | bigint                   | 
 closes          | bigint                   | 
 creates         | bigint                   | 
 drops           | bigint                   | 
 alters          | bigint                   | 
 imports         | bigint                   | 
 truncates       | bigint                   | 
 copies          | bigint                   | 
 grants          | bigint                   | 
 revokes         | bigint                   | 
 clusters        | bigint                   | 
 vacuums         | bigint                   | 
 analyzes        | bigint                   | 
 refreshs        | bigint                   | 
 locks           | bigint                   | 
 checkpoints     | bigint                   | 
 reindexes       | bigint                   | 
 deallocates     | bigint                   | 
 others          | bigint                   | 
 stats_reset     | timestamp with time zone | 


If any additions/deletions, I can accommodate them.

The stats data gets updated in exec_simple_query and exec_execute_message
functions and the collected stats will be sent to stats collector similar
like function usage stats in pgstat_report_stat function.

These SQL statistics data is stored in the stats file similar like global
statistics. The STAT file format is updated to accommodate the new stats.

A new GUC "track_sql" is added to track the SQL statement
statistics, by default this is off. Only superuser can change this
parameter.

Attached a patch for the same.

>  Also, for many of these commands it's probably relevant
> whether they are acting on a temporary object or not; we should either
> count these separately, or not count the temp ones at all.

Currently the SQL stats are not checking any object level that is a temp
one or not? The temp objects are specific to a backend only. But what
I feel, any way that is an SQL query that was executed on a temp object,
so we need to count that operation.

I feel this SQL stats should not worry about the object type. May be I am
wrong.

Regards,
Hari Babu
Fujitsu Australia
Attachment

Re: New SQL counter statistics view (pg_stat_sql)

From
Robert Haas
Date:
On Fri, Sep 2, 2016 at 2:33 AM, Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
> On Wed, Aug 31, 2016 at 3:19 AM, Alvaro Herrera <alvherre@2ndquadrant.com>
> wrote:
>> Haribabu Kommi wrote:
>>
>>> Apart from the above, here are the following list of command tags that
>>> are generated in the code, I took only the first word of the command tag
>>> just to see how many categories present. The number indicates the
>>> subset of operations or number of types it is used. Like create table,
>>> create function and etc.
>>
>> Sounds about right.  I suppose all those cases that you aggregated here
>> would expand to full tags in the actual code.  I furthermore suppose
>> that some of these could be ignored, such as the transaction ones and
>> things like load, lock, move, fetch, discard, deallocate (maybe lump
>> them all together under "other", or some other rough categorization, as
>> Tom suggests).
>
> Following is the pg_stat_sql view with the SQL categories that I considered
> that are important. Rest of the them will be shown under others category.
>
> postgres=# \d pg_stat_sql
>              View "pg_catalog.pg_stat_sql"
>      Column      |           Type           | Modifiers
> -----------------+--------------------------+-----------
>  inserts         | bigint                   |
>  deletes         | bigint                   |
>  updates         | bigint                   |
>  selects         | bigint                   |
>  declare_cursors | bigint                   |
>  closes          | bigint                   |
>  creates         | bigint                   |
>  drops           | bigint                   |
>  alters          | bigint                   |
>  imports         | bigint                   |
>  truncates       | bigint                   |
>  copies          | bigint                   |
>  grants          | bigint                   |
>  revokes         | bigint                   |
>  clusters        | bigint                   |
>  vacuums         | bigint                   |
>  analyzes        | bigint                   |
>  refreshs        | bigint                   |
>  locks           | bigint                   |
>  checkpoints     | bigint                   |
>  reindexes       | bigint                   |
>  deallocates     | bigint                   |
>  others          | bigint                   |
>  stats_reset     | timestamp with time zone |
>
>
> If any additions/deletions, I can accommodate them.

I think it is not a good idea to make the command names used here the
plural forms of the command tags.  Instead of "inserts", "updates",
"imports", etc. just use "INSERT", "UPDATE", "IMPORT".  That's simpler
and less error prone - e.g. you won't end up with things like
"refreshs", which is not a word.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: New SQL counter statistics view (pg_stat_sql)

From
Haribabu Kommi
Date:


On Thu, Sep 15, 2016 at 6:01 AM, Robert Haas <robertmhaas@gmail.com> wrote:
On Fri, Sep 2, 2016 at 2:33 AM, Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
> On Wed, Aug 31, 2016 at 3:19 AM, Alvaro Herrera <alvherre@2ndquadrant.com>
> wrote:
>> Haribabu Kommi wrote:
>>
>>> Apart from the above, here are the following list of command tags that
>>> are generated in the code, I took only the first word of the command tag
>>> just to see how many categories present. The number indicates the
>>> subset of operations or number of types it is used. Like create table,
>>> create function and etc.
>>
>> Sounds about right.  I suppose all those cases that you aggregated here
>> would expand to full tags in the actual code.  I furthermore suppose
>> that some of these could be ignored, such as the transaction ones and
>> things like load, lock, move, fetch, discard, deallocate (maybe lump
>> them all together under "other", or some other rough categorization, as
>> Tom suggests).
>
> Following is the pg_stat_sql view with the SQL categories that I considered
> that are important. Rest of the them will be shown under others category.
>
> postgres=# \d pg_stat_sql
>              View "pg_catalog.pg_stat_sql"
>      Column      |           Type           | Modifiers
> -----------------+--------------------------+-----------
>  inserts         | bigint                   |
>  deletes         | bigint                   |
>  updates         | bigint                   |
>  selects         | bigint                   |
>  declare_cursors | bigint                   |
>  closes          | bigint                   |
>  creates         | bigint                   |
>  drops           | bigint                   |
>  alters          | bigint                   |
>  imports         | bigint                   |
>  truncates       | bigint                   |
>  copies          | bigint                   |
>  grants          | bigint                   |
>  revokes         | bigint                   |
>  clusters        | bigint                   |
>  vacuums         | bigint                   |
>  analyzes        | bigint                   |
>  refreshs        | bigint                   |
>  locks           | bigint                   |
>  checkpoints     | bigint                   |
>  reindexes       | bigint                   |
>  deallocates     | bigint                   |
>  others          | bigint                   |
>  stats_reset     | timestamp with time zone |
>
>
> If any additions/deletions, I can accommodate them.

I think it is not a good idea to make the command names used here the
plural forms of the command tags.  Instead of "inserts", "updates",
"imports", etc. just use "INSERT", "UPDATE", "IMPORT".  That's simpler
and less error prone - e.g. you won't end up with things like
"refreshs", which is not a word.

Thanks for your suggestion. I also thought of changing the name while writing
"refreshs" as a column name of the view originally. A small restriction with the
change of names to command names is that, user needs to execute the query
as follows.

select pg_stat_sql.select from pg_stat_sql;

Updated patch is attached with the corrections. Apart from name change, 
added documentation and tests.
 
Regards,
Hari Babu
Fujitsu Australia
Attachment

Re: New SQL counter statistics view (pg_stat_sql)

From
Peter Eisentraut
Date:
On 9/14/16 4:01 PM, Robert Haas wrote:
> I think it is not a good idea to make the command names used here the
> plural forms of the command tags.  Instead of "inserts", "updates",
> "imports", etc. just use "INSERT", "UPDATE", "IMPORT".  That's simpler
> and less error prone - e.g. you won't end up with things like
> "refreshs", which is not a word.

How about having the tag not be a column name but a row entry.  So you'd
do something like

SELECT * FROM pg_stat_sql WHERE tag = 'ALTER VIEW';

That way, we don't have to keep updating (and re-debating) this when new
command types or subtypes are added.  And queries written for future
versions will not fail when run against old servers.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: New SQL counter statistics view (pg_stat_sql)

From
David Fetter
Date:
On Wed, Sep 21, 2016 at 11:25:14AM -0400, Peter Eisentraut wrote:
> On 9/14/16 4:01 PM, Robert Haas wrote:
> > I think it is not a good idea to make the command names used here the
> > plural forms of the command tags.  Instead of "inserts", "updates",
> > "imports", etc. just use "INSERT", "UPDATE", "IMPORT".  That's simpler
> > and less error prone - e.g. you won't end up with things like
> > "refreshs", which is not a word.
> 
> How about having the tag not be a column name but a row entry.  So you'd
> do something like
> 
> SELECT * FROM pg_stat_sql WHERE tag = 'ALTER VIEW';

+1 for this.  It's MUCH easier to deal with changes in row counts than
changes in row type.

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: New SQL counter statistics view (pg_stat_sql)

From
Alvaro Herrera
Date:
Peter Eisentraut wrote:

> How about having the tag not be a column name but a row entry.  So you'd
> do something like
> 
> SELECT * FROM pg_stat_sql WHERE tag = 'ALTER VIEW';
> 
> That way, we don't have to keep updating (and re-debating) this when new
> command types or subtypes are added.  And queries written for future
> versions will not fail when run against old servers.

Yeah, good idea.

Let's also discuss the interface from the stats collector.  Currently we
have some 20 new SQL functions, all alike, each loading the whole data
and returning a single counter, and then the view invokes each function
separately.  That doesn't seem great to me.  How about having a single C
function that returns the whole thing as a SRF instead, and the view is
just a single function invocation -- something like pg_lock_status
filling pg_locks in one go.

Another consideration is that the present patch lumps together all ALTER
cases in a single counter.  This isn't great, but at the same time we
don't want to bloat the stat files by having hundreds of counters per
database, do we?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: New SQL counter statistics view (pg_stat_sql)

From
David Fetter
Date:
On Wed, Sep 21, 2016 at 02:05:24PM -0300, Alvaro Herrera wrote:
> Another consideration is that the present patch lumps together all
> ALTER cases in a single counter.  This isn't great, but at the same
> time we don't want to bloat the stat files by having hundreds of
> counters per database, do we?

I count 37 documented versions of ALTER as of git master.  Is there
some multiplier I'm missing?

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: New SQL counter statistics view (pg_stat_sql)

From
Haribabu Kommi
Date:


On Thu, Sep 22, 2016 at 3:05 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
Peter Eisentraut wrote:

> How about having the tag not be a column name but a row entry.  So you'd
> do something like
>
> SELECT * FROM pg_stat_sql WHERE tag = 'ALTER VIEW';
>
> That way, we don't have to keep updating (and re-debating) this when new
> command types or subtypes are added.  And queries written for future
> versions will not fail when run against old servers.

Yeah, good idea.

Yes, Having it as a row entry is good.
 
 
Let's also discuss the interface from the stats collector.  Currently we
have some 20 new SQL functions, all alike, each loading the whole data
and returning a single counter, and then the view invokes each function
separately.  That doesn't seem great to me.  How about having a single C
function that returns the whole thing as a SRF instead, and the view is
just a single function invocation -- something like pg_lock_status
filling pg_locks in one go.

Another consideration is that the present patch lumps together all ALTER
cases in a single counter.  This isn't great, but at the same time we
don't want to bloat the stat files by having hundreds of counters per
database, do we?

Currently, The SQL stats is a fixed size counter to track the all the ALTER 
cases as single counter. So while sending the stats from the backend to
stats collector at the end of the transaction, the cost is same, because of
it's fixed size. This approach adds overhead to send and read the stats
is minimal.

With the following approach, I feel it is possible to support the counter at
command tag level.

Add a Global and local Hash to keep track of the counters by using the
command tag as the key, this hash table increases dynamically whenever
a new type of SQL command gets executed. The Local Hash data is passed
to stats collector whenever the transaction gets committed.

The problem I am thinking is that, Sending data from Hash and populating
the Hash from stats file for all the command tags adds some overhead.


Regards,
Hari Babu
Fujitsu Australia

Re: New SQL counter statistics view (pg_stat_sql)

From
Haribabu Kommi
Date:


On Thu, Sep 29, 2016 at 3:45 PM, Haribabu Kommi <kommi.haribabu@gmail.com> wrote:


On Thu, Sep 22, 2016 at 3:05 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
Peter Eisentraut wrote:

> How about having the tag not be a column name but a row entry.  So you'd
> do something like
>
> SELECT * FROM pg_stat_sql WHERE tag = 'ALTER VIEW';
>
> That way, we don't have to keep updating (and re-debating) this when new
> command types or subtypes are added.  And queries written for future
> versions will not fail when run against old servers.

Yeah, good idea.

Yes, Having it as a row entry is good.
 
 
Let's also discuss the interface from the stats collector.  Currently we
have some 20 new SQL functions, all alike, each loading the whole data
and returning a single counter, and then the view invokes each function
separately.  That doesn't seem great to me.  How about having a single C
function that returns the whole thing as a SRF instead, and the view is
just a single function invocation -- something like pg_lock_status
filling pg_locks in one go.

Another consideration is that the present patch lumps together all ALTER
cases in a single counter.  This isn't great, but at the same time we
don't want to bloat the stat files by having hundreds of counters per
database, do we?

Currently, The SQL stats is a fixed size counter to track the all the ALTER 
cases as single counter. So while sending the stats from the backend to
stats collector at the end of the transaction, the cost is same, because of
it's fixed size. This approach adds overhead to send and read the stats
is minimal.

With the following approach, I feel it is possible to support the counter at
command tag level.

Add a Global and local Hash to keep track of the counters by using the
command tag as the key, this hash table increases dynamically whenever
a new type of SQL command gets executed. The Local Hash data is passed
to stats collector whenever the transaction gets committed.

The problem I am thinking is that, Sending data from Hash and populating
the Hash from stats file for all the command tags adds some overhead.


I tried changing the pg_stat_sql into row mode and ran the regress suite to 
add different type of SQL commands to the view and ran the pgbench test
on my laptop to find out any performance impact with this patch.

                                HEAD      PATCH
pgbench - select      828          816

Here I attached the pg_stat_sql patch to keep track of all SQL commands
based on the commandTag and their counts. I attached the result of this
view that is run on the database after "make installcheck" just for reference.

Regards,
Hari Babu
Fujitsu Australia
Attachment

Re: New SQL counter statistics view (pg_stat_sql)

From
vinayak
Date:



On 2016/10/12 12:21, Haribabu Kommi wrote:


On Thu, Sep 29, 2016 at 3:45 PM, Haribabu Kommi <kommi.haribabu@gmail.com> wrote:


On Thu, Sep 22, 2016 at 3:05 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
Peter Eisentraut wrote:

> How about having the tag not be a column name but a row entry.  So you'd
> do something like
>
> SELECT * FROM pg_stat_sql WHERE tag = 'ALTER VIEW';
>
> That way, we don't have to keep updating (and re-debating) this when new
> command types or subtypes are added.  And queries written for future
> versions will not fail when run against old servers.

Yeah, good idea.

Yes, Having it as a row entry is good.
 
 
Let's also discuss the interface from the stats collector.  Currently we
have some 20 new SQL functions, all alike, each loading the whole data
and returning a single counter, and then the view invokes each function
separately.  That doesn't seem great to me.  How about having a single C
function that returns the whole thing as a SRF instead, and the view is
just a single function invocation -- something like pg_lock_status
filling pg_locks in one go.

Another consideration is that the present patch lumps together all ALTER
cases in a single counter.  This isn't great, but at the same time we
don't want to bloat the stat files by having hundreds of counters per
database, do we?

Currently, The SQL stats is a fixed size counter to track the all the ALTER 
cases as single counter. So while sending the stats from the backend to
stats collector at the end of the transaction, the cost is same, because of
it's fixed size. This approach adds overhead to send and read the stats
is minimal.

With the following approach, I feel it is possible to support the counter at
command tag level.

Add a Global and local Hash to keep track of the counters by using the
command tag as the key, this hash table increases dynamically whenever
a new type of SQL command gets executed. The Local Hash data is passed
to stats collector whenever the transaction gets committed.

The problem I am thinking is that, Sending data from Hash and populating
the Hash from stats file for all the command tags adds some overhead.


I tried changing the pg_stat_sql into row mode and ran the regress suite to 
add different type of SQL commands to the view and ran the pgbench test
on my laptop to find out any performance impact with this patch.

                                HEAD      PATCH
pgbench - select      828          816

Here I attached the pg_stat_sql patch to keep track of all SQL commands
based on the commandTag and their counts. I attached the result of this
view that is run on the database after "make installcheck" just for reference.

Thank you for the patch.

Test: Commands with uppercase and lowercase
====
If the tag='select' then it returns the 0 rows but count is actually increment by 1.

tag='select' vs tag='SELECT'

postgres=# SET track_sql TO ON;
SET
postgres=# SELECT * FROM pg_stat_sql where tag='SELECT';
  tag   | count
--------+-------
 SELECT |    12
(1 row)

postgres=# SELECT * FROM pg_stat_sql where tag='SELECT';
  tag   | count
--------+-------
 SELECT |    13
(1 row)

postgres=# SELECT * FROM pg_stat_sql where tag='select';
 tag | count
-----+-------
(0 rows)

postgres=# SELECT * FROM pg_stat_sql where tag='SELECT';
  tag   | count
--------+-------
 SELECT |    15
(1 row)

I think all command works same as above.

Regards,
Vinayak Pokale
NTT Open Source Software Center

Re: New SQL counter statistics view (pg_stat_sql)

From
vinayak
Date:

On 2016/10/12 12:21, Haribabu Kommi wrote:


On Thu, Sep 29, 2016 at 3:45 PM, Haribabu Kommi <kommi.haribabu@gmail.com> wrote:


On Thu, Sep 22, 2016 at 3:05 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
Peter Eisentraut wrote:

> How about having the tag not be a column name but a row entry.  So you'd
> do something like
>
> SELECT * FROM pg_stat_sql WHERE tag = 'ALTER VIEW';
>
> That way, we don't have to keep updating (and re-debating) this when new
> command types or subtypes are added.  And queries written for future
> versions will not fail when run against old servers.

Yeah, good idea.

Yes, Having it as a row entry is good.
 
 
Let's also discuss the interface from the stats collector.  Currently we
have some 20 new SQL functions, all alike, each loading the whole data
and returning a single counter, and then the view invokes each function
separately.  That doesn't seem great to me.  How about having a single C
function that returns the whole thing as a SRF instead, and the view is
just a single function invocation -- something like pg_lock_status
filling pg_locks in one go.

Another consideration is that the present patch lumps together all ALTER
cases in a single counter.  This isn't great, but at the same time we
don't want to bloat the stat files by having hundreds of counters per
database, do we?

Currently, The SQL stats is a fixed size counter to track the all the ALTER 
cases as single counter. So while sending the stats from the backend to
stats collector at the end of the transaction, the cost is same, because of
it's fixed size. This approach adds overhead to send and read the stats
is minimal.

With the following approach, I feel it is possible to support the counter at
command tag level.

Add a Global and local Hash to keep track of the counters by using the
command tag as the key, this hash table increases dynamically whenever
a new type of SQL command gets executed. The Local Hash data is passed
to stats collector whenever the transaction gets committed.

The problem I am thinking is that, Sending data from Hash and populating
the Hash from stats file for all the command tags adds some overhead.


I tried changing the pg_stat_sql into row mode and ran the regress suite to 
add different type of SQL commands to the view and ran the pgbench test
on my laptop to find out any performance impact with this patch.

                                HEAD      PATCH
pgbench - select      828          816

Here I attached the pg_stat_sql patch to keep track of all SQL commands
based on the commandTag and their counts. I attached the result of this
view that is run on the database after "make installcheck" just for reference.
Some comments:
I think we can use pgstat_* instead of pgstat* for code consistency.
 
+static HTAB *pgStatBackendSql = NULL;
How about *pgstat_backend_sql

+HTAB       *pgStatSql = NULL;
How about *pgstat_sql

+static void pgstat_recv_sqlstmt(PgStat_MsgSqlstmt * msg, int len);
How about PgStat_MsgSqlstmt *msg instead of PgStat_MsgSqlstmt * msg

+typedef struct PgStatSqlstmtEntry
How about PgStat_SqlstmtEntry

+typedef struct PgStatMsgSqlstmt
How about PgStat_MsgSqlstmt

I have observed below behavior.
I have SET track_sql to ON and then execute the SELECT command and it return 0 rows.
It start counting from the second command.
postgres=# SET track_sql TO ON;
SET
postgres=# SELECT * FROM pg_stat_sql where tag='SELECT';
 tag | count
-----+-------
(0 rows)

postgres=# SELECT * FROM pg_stat_sql where tag='SELECT';
  tag   | count
--------+-------
 SELECT |     1
(1 row)
Is this a correct behavior?

Regards,
Vinayak Pokale
NTT Open Source Software Center


Re: New SQL counter statistics view (pg_stat_sql)

From
Haribabu Kommi
Date:


On Wed, Oct 12, 2016 at 4:06 PM, vinayak <Pokale_Vinayak_q3@lab.ntt.co.jp> wrote:

Thank you for the patch.

Test: Commands with uppercase and lowercase
====
If the tag='select' then it returns the 0 rows but count is actually increment by 1.

tag='select' vs tag='SELECT'

postgres=# SET track_sql TO ON;
SET
postgres=# SELECT * FROM pg_stat_sql where tag='SELECT';
  tag   | count
--------+-------
 SELECT |    12
(1 row)

postgres=# SELECT * FROM pg_stat_sql where tag='SELECT';
  tag   | count
--------+-------
 SELECT |    13
(1 row)

postgres=# SELECT * FROM pg_stat_sql where tag='select';
 tag | count
-----+-------
(0 rows)

postgres=# SELECT * FROM pg_stat_sql where tag='SELECT';
  tag   | count
--------+-------
 SELECT |    15
(1 row)

I think all command works same as above.

Thanks for checking the patch.

Yes, that's correct. Currently the CAPS letters are used as tag names
that are getting displayed whenever any SQL command is executed.
So I used the same names as entries to store the details of stats of
SQL statements.

If anyone feels the other way is better, I am fine for it.


Regards,
Hari Babu
Fujitsu Australia

Re: New SQL counter statistics view (pg_stat_sql)

From
Haribabu Kommi
Date:


On Fri, Oct 14, 2016 at 7:48 PM, vinayak <Pokale_Vinayak_q3@lab.ntt.co.jp> wrote:

On 2016/10/12 12:21, Haribabu Kommi wrote:
I tried changing the pg_stat_sql into row mode and ran the regress suite to 
add different type of SQL commands to the view and ran the pgbench test
on my laptop to find out any performance impact with this patch.

                                HEAD      PATCH
pgbench - select      828          816

Here I attached the pg_stat_sql patch to keep track of all SQL commands
based on the commandTag and their counts. I attached the result of this
view that is run on the database after "make installcheck" just for reference.
Some comments:
I think we can use pgstat_* instead of pgstat* for code consistency.
 
+static HTAB *pgStatBackendSql = NULL;
How about *pgstat_backend_sql

+HTAB       *pgStatSql = NULL;
How about *pgstat_sql

Changed. 
 
+static void pgstat_recv_sqlstmt(PgStat_MsgSqlstmt * msg, int len);
How about PgStat_MsgSqlstmt *msg instead of PgStat_MsgSqlstmt * msg

Added the typdef into typdef.list file so this problem never occurs.
 
+typedef struct PgStatSqlstmtEntry
How about PgStat_SqlstmtEntry

+typedef struct PgStatMsgSqlstmt
How about PgStat_MsgSqlstmt


 changed.
 
I have observed below behavior.
I have SET track_sql to ON and then execute the SELECT command and it return 0 rows.
It start counting from the second command.
postgres=# SET track_sql TO ON;
SET
postgres=# SELECT * FROM pg_stat_sql where tag='SELECT';
 tag | count
-----+-------
(0 rows)

postgres=# SELECT * FROM pg_stat_sql where tag='SELECT';
  tag   | count
--------+-------
 SELECT |     1
(1 row)
Is this a correct behavior?
 

Yes. This is because the stats of the SQL statements that are collected are sent to
the stats collector at the end of the SQL statement execution. The current SQL statement
is not counted in the result.

Updated patch is attached.

Regards,
Hari Babu
Fujitsu Australia
Attachment

Re: New SQL counter statistics view (pg_stat_sql)

From
vinayak
Date:

On 2016/10/17 10:22, Haribabu Kommi wrote:


On Fri, Oct 14, 2016 at 7:48 PM, vinayak <Pokale_Vinayak_q3@lab.ntt.co.jp> wrote:

On 2016/10/12 12:21, Haribabu Kommi wrote:
I tried changing the pg_stat_sql into row mode and ran the regress suite to 
add different type of SQL commands to the view and ran the pgbench test
on my laptop to find out any performance impact with this patch.

                                HEAD      PATCH
pgbench - select      828          816

Here I attached the pg_stat_sql patch to keep track of all SQL commands
based on the commandTag and their counts. I attached the result of this
view that is run on the database after "make installcheck" just for reference.
Some comments:
I think we can use pgstat_* instead of pgstat* for code consistency.
 
+static HTAB *pgStatBackendSql = NULL;
How about *pgstat_backend_sql

+HTAB       *pgStatSql = NULL;
How about *pgstat_sql

Changed. 
 
+static void pgstat_recv_sqlstmt(PgStat_MsgSqlstmt * msg, int len);
How about PgStat_MsgSqlstmt *msg instead of PgStat_MsgSqlstmt * msg

Added the typdef into typdef.list file so this problem never occurs.
 
+typedef struct PgStatSqlstmtEntry
How about PgStat_SqlstmtEntry

+typedef struct PgStatMsgSqlstmt
How about PgStat_MsgSqlstmt


 changed.

Updated patch is attached.
The updated patch gives following error.
Error:
../../../../src/include/pgstat.h:557: error: ‘PgStatSqlstmtEntry’ undeclared here (not in a function)

-    ((PGSTAT_MSG_PAYLOAD - sizeof(int)) / sizeof(PgStatSqlstmtEntry))
+    ((PGSTAT_MSG_PAYLOAD - sizeof(int)) / sizeof(PgStat_SqlstmtEntry))
The attached patch fixed the error.

Regards,
Vinayak Pokale
Attachment

Re: New SQL counter statistics view (pg_stat_sql)

From
Robert Haas
Date:
On Thu, Sep 29, 2016 at 1:45 AM, Haribabu Kommi
<kommi.haribabu@gmail.com> wrote:
> Currently, The SQL stats is a fixed size counter to track the all the ALTER
> cases as single counter. So while sending the stats from the backend to
> stats collector at the end of the transaction, the cost is same, because of
> it's fixed size. This approach adds overhead to send and read the stats
> is minimal.
>
> With the following approach, I feel it is possible to support the counter at
> command tag level.
>
> Add a Global and local Hash to keep track of the counters by using the
> command tag as the key, this hash table increases dynamically whenever
> a new type of SQL command gets executed. The Local Hash data is passed
> to stats collector whenever the transaction gets committed.
>
> The problem I am thinking is that, Sending data from Hash and populating
> the Hash from stats file for all the command tags adds some overhead.

Yeah, I'm not very excited about that overhead.  This seems useful as
far as it goes, but I don't really want to incur measurable overhead
when it's in use.  Having a hash table rather than a fixed array of
slots means that you have to pass this through the stats collector
rather than updating shared memory directly, which is fairly heavy
weight.  If each backend could have its own copy of the slot array and
just update that, and readers added up the values across the whole
array, this could be done without any locking at all, and it would
generally be much lighter-weight than this approach.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: New SQL counter statistics view (pg_stat_sql)

From
Haribabu Kommi
Date:


On Wed, Oct 19, 2016 at 5:11 AM, Robert Haas <robertmhaas@gmail.com> wrote:
On Thu, Sep 29, 2016 at 1:45 AM, Haribabu Kommi
<kommi.haribabu@gmail.com> wrote:
> Currently, The SQL stats is a fixed size counter to track the all the ALTER
> cases as single counter. So while sending the stats from the backend to
> stats collector at the end of the transaction, the cost is same, because of
> it's fixed size. This approach adds overhead to send and read the stats
> is minimal.
>
> With the following approach, I feel it is possible to support the counter at
> command tag level.
>
> Add a Global and local Hash to keep track of the counters by using the
> command tag as the key, this hash table increases dynamically whenever
> a new type of SQL command gets executed. The Local Hash data is passed
> to stats collector whenever the transaction gets committed.
>
> The problem I am thinking is that, Sending data from Hash and populating
> the Hash from stats file for all the command tags adds some overhead.

Yeah, I'm not very excited about that overhead.  This seems useful as
far as it goes, but I don't really want to incur measurable overhead
when it's in use.  Having a hash table rather than a fixed array of
slots means that you have to pass this through the stats collector
rather than updating shared memory directly, which is fairly heavy
weight.  If each backend could have its own copy of the slot array and
just update that, and readers added up the values across the whole
array, this could be done without any locking at all, and it would
generally be much lighter-weight than this approach.

Using limited information like combining all ALTER XXX into Alter counter,
the fixed array logic works fine without having any problems. But people
are suggesting to provide more details like ALTER VIEW and etc, so I
checked the following ways,

1. Using of nodetag as an array of index to update the counter. But this
approach doesn't work for some cases like T_DropStmt where the tag
varies based on the removeType of the DropStmt. So I decide to drop
this approach.

2. Using of tag name for searching the fixed size array that is sorted with
all command tags that are possible in PostgreSQL. Using a binary search,
find out the location and update the counter. In this approach, first the array
needs to be filed with TAG information in the sorted order.

3. Using of Hash table to store the counters information based on the TAG
name as key.

I choose the approach-3 as it can scale for any additional commands.

Any better alternatives with that directly it can point to the array index?

if we are going with ALTER XXX into single Alter counter is approach, then
I will change the code with a fixed array approach.

Regards,
Hari Babu
Fujitsu Australia

Re: New SQL counter statistics view (pg_stat_sql)

From
Haribabu Kommi
Date:
Hi Vinayak,

This is a gentle reminder.

you assigned as reviewer to the current patch in the 11-2016 commitfest.
If you have any more review comments / performance results regarding the
approach of the patch, please share it. Otherwise, you can mark the patch
as "Ready for committer" for committer feedback. This will help us in smoother
operation of the commitfest.

 
Regards,
Hari Babu
Fujitsu Australia

Re: New SQL counter statistics view (pg_stat_sql)

From
Haribabu Kommi
Date:


On Tue, Nov 22, 2016 at 5:57 PM, Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
Hi Vinayak,

This is a gentle reminder.

you assigned as reviewer to the current patch in the 11-2016 commitfest.
If you have any more review comments / performance results regarding the
approach of the patch, please share it. Otherwise, you can mark the patch
as "Ready for committer" for committer feedback. This will help us in smoother
operation of the commitfest.

Moved to next CF with "needs review" status.


Regards,
Hari Babu
Fujitsu Australia

Re: [HACKERS] New SQL counter statistics view (pg_stat_sql)

From
Dilip Kumar
Date:
On Mon, Dec 5, 2016 at 8:01 AM, Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
> Moved to next CF with "needs review" status.

I have started reviewing the patch, Some initial comments.

$ git apply ../pg_stat_sql_row_mode_2.patch
error: patch failed: src/include/catalog/pg_proc.h:2893
error: src/include/catalog/pg_proc.h: patch does not apply

Patch is not applying on the head, I think it needs to be rebased.

+void
+pgstat_count_sqlstmt(const char *commandTag)
+{
+ PgStat_SqlstmtEntry *htabent;
+ bool found;
+
+ if (!pgstat_track_sql)
+ return

Callers of pgstat_count_sqlstmt are always ensuring that
pgstat_track_sql is true, seems it's redundant here.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



Re: [HACKERS] New SQL counter statistics view (pg_stat_sql)

From
Robert Haas
Date:
On Wed, Jan 11, 2017 at 9:17 AM, Dilip Kumar <dilipbalaut@gmail.com> wrote:
> On Mon, Dec 5, 2016 at 8:01 AM, Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
>> Moved to next CF with "needs review" status.
>
> I have started reviewing the patch, Some initial comments.
>
> $ git apply ../pg_stat_sql_row_mode_2.patch
> error: patch failed: src/include/catalog/pg_proc.h:2893
> error: src/include/catalog/pg_proc.h: patch does not apply

I suggest using 'patch' directly.  'git apply' is far too picky.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] New SQL counter statistics view (pg_stat_sql)

From
Dilip Kumar
Date:
On Wed, Jan 11, 2017 at 7:47 PM, Dilip Kumar <dilipbalaut@gmail.com> wrote:
> +void
> +pgstat_count_sqlstmt(const char *commandTag)
> +{
> + PgStat_SqlstmtEntry *htabent;
> + bool found;
> +
> + if (!pgstat_track_sql)
> + return
>
> Callers of pgstat_count_sqlstmt are always ensuring that
> pgstat_track_sql is true, seems it's redundant here.

I have done testing of the feature, it's mostly working as per the expectation.

I have few comments/questions.

--------
If you execute the query with EXECUTE then commandTag will be EXECUTE
that way we will not show the actual query type, I mean all the
statements will get the common tag "EXECUTE".

You can try this.
PREPARE fooplan (int) AS INSERT INTO t VALUES($1);
EXECUTE fooplan(1);

------

+ /* Destroy the SQL statement counter stats HashTable */
+ hash_destroy(pgstat_sql);
+
+ /* Create SQL statement counter Stats HashTable */

I think in the patch we have used mixed of "Stats HashTable" and
"stats HashTable", I think better
to be consistent throughout the patch. Check other similar instances.

----------------

@@ -1102,6 +1102,12 @@ exec_simple_query(const char *query_string)
 PortalDrop(portal, false);

+ /*
+ * Count SQL statement for pg_stat_sql view
+ */
+ if (pgstat_track_sql)
+ pgstat_count_sqlstmt(commandTag);

We are only counting the successful SQL statement, Is that intentional?

------
I have noticed that pgstat_count_sqlstmt is called from
exec_simple_query and exec_execute_message. Don't we want to track the
statement executed from functions?
-------

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



Re: [HACKERS] New SQL counter statistics view (pg_stat_sql)

From
Michael Paquier
Date:
On Sat, Jan 14, 2017 at 12:58 AM, Dilip Kumar <dilipbalaut@gmail.com> wrote:
> On Wed, Jan 11, 2017 at 7:47 PM, Dilip Kumar <dilipbalaut@gmail.com> wrote:
>> +void
>> +pgstat_count_sqlstmt(const char *commandTag)
>> +{
>> + PgStat_SqlstmtEntry *htabent;
>> + bool found;
>> +
>> + if (!pgstat_track_sql)
>> + return
>>
>> Callers of pgstat_count_sqlstmt are always ensuring that
>> pgstat_track_sql is true, seems it's redundant here.
>
> I have done testing of the feature, it's mostly working as per the expectation.
>
> I have few comments/questions.
>
> --------
> If you execute the query with EXECUTE then commandTag will be EXECUTE
> that way we will not show the actual query type, I mean all the
> statements will get the common tag "EXECUTE".
>
> You can try this.
> PREPARE fooplan (int) AS INSERT INTO t VALUES($1);
> EXECUTE fooplan(1);
>
> ------
>
> + /* Destroy the SQL statement counter stats HashTable */
> + hash_destroy(pgstat_sql);
> +
> + /* Create SQL statement counter Stats HashTable */
>
> I think in the patch we have used mixed of "Stats HashTable" and
> "stats HashTable", I think better
> to be consistent throughout the patch. Check other similar instances.
>
> ----------------
>
> @@ -1102,6 +1102,12 @@ exec_simple_query(const char *query_string)
>
>   PortalDrop(portal, false);
>
> + /*
> + * Count SQL statement for pg_stat_sql view
> + */
> + if (pgstat_track_sql)
> + pgstat_count_sqlstmt(commandTag);
>
> We are only counting the successful SQL statement, Is that intentional?
>
> ------
> I have noticed that pgstat_count_sqlstmt is called from
> exec_simple_query and exec_execute_message. Don't we want to track the
> statement executed from functions?
> -------

The latest patch available has rotten, could you rebase please?
-- 
Michael



Re: [HACKERS] New SQL counter statistics view (pg_stat_sql)

From
Robert Haas
Date:
On Fri, Jan 13, 2017 at 10:58 AM, Dilip Kumar <dilipbalaut@gmail.com> wrote:
> If you execute the query with EXECUTE then commandTag will be EXECUTE
> that way we will not show the actual query type, I mean all the
> statements will get the common tag "EXECUTE".

Somebody might say that's a feature, not a bug.  But then again,
someone else might say the opposite.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] New SQL counter statistics view (pg_stat_sql)

From
Haribabu Kommi
Date:


On Sat, Jan 14, 2017 at 2:58 AM, Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Wed, Jan 11, 2017 at 7:47 PM, Dilip Kumar <dilipbalaut@gmail.com> wrote:
> +void
> +pgstat_count_sqlstmt(const char *commandTag)
> +{
> + PgStat_SqlstmtEntry *htabent;
> + bool found;
> +
> + if (!pgstat_track_sql)
> + return
>
> Callers of pgstat_count_sqlstmt are always ensuring that
> pgstat_track_sql is true, seems it's redundant here.

I have done testing of the feature, it's mostly working as per the expectation.

Thanks for the review and test.

The use case for this view is to find out the number of different SQL statements
that are getting executed successfully on an instance by the application/user. 

I have few comments/questions.

--------
If you execute the query with EXECUTE then commandTag will be EXECUTE
that way we will not show the actual query type, I mean all the
statements will get the common tag "EXECUTE".

You can try this.
PREPARE fooplan (int) AS INSERT INTO t VALUES($1);
EXECUTE fooplan(1);

------

Yes, that's correct. Currently the count is incremented based on the base command,
because of this reason, the EXECUTE is increased, instead of the actual operation.

 
+ /* Destroy the SQL statement counter stats HashTable */
+ hash_destroy(pgstat_sql);
+
+ /* Create SQL statement counter Stats HashTable */

I think in the patch we have used mixed of "Stats HashTable" and
"stats HashTable", I think better
to be consistent throughout the patch. Check other similar instances.

----------------

Corrected.
 

@@ -1102,6 +1102,12 @@ exec_simple_query(const char *query_string)

  PortalDrop(portal, false);

+ /*
+ * Count SQL statement for pg_stat_sql view
+ */
+ if (pgstat_track_sql)
+ pgstat_count_sqlstmt(commandTag);

We are only counting the successful SQL statement, Is that intentional?

Yes, I thought of counting the successful SQL operations that changed the
database over a period of time. I am not sure whether there will be many
failure operations that can occur on an instance to be counted.
 
------
I have noticed that pgstat_count_sqlstmt is called from
exec_simple_query and exec_execute_message. Don't we want to track the
statement executed from functions?
-------
 
The view is currently designed to count user/application initiated SQL statements,
but not the internal SQL queries that are getting executed from functions and etc.

>>+void
>>+pgstat_count_sqlstmt(const char *commandTag)
>>+{
>>+ PgStat_SqlstmtEntry *htabent;
>>+ bool found;
>>+
>>+ if (!pgstat_track_sql)
>>+ return
>
>Callers of pgstat_count_sqlstmt are always ensuring that
>pgstat_track_sql is true, seems it's redundant here.

Removed the check in pgstat_count_sqlstmt statement and add it exec_execute_message
function where the check is missed.

Updated patch attached.

Regards,
Hari Babu
Fujitsu Australia
Attachment

Re: [HACKERS] New SQL counter statistics view (pg_stat_sql)

From
vinayak
Date:

On 2017/01/18 14:15, Haribabu Kommi wrote:


On Sat, Jan 14, 2017 at 2:58 AM, Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Wed, Jan 11, 2017 at 7:47 PM, Dilip Kumar <dilipbalaut@gmail.com> wrote:
> +void
> +pgstat_count_sqlstmt(const char *commandTag)
> +{
> + PgStat_SqlstmtEntry *htabent;
> + bool found;
> +
> + if (!pgstat_track_sql)
> + return
>
> Callers of pgstat_count_sqlstmt are always ensuring that
> pgstat_track_sql is true, seems it's redundant here.

I have done testing of the feature, it's mostly working as per the expectation.

Thanks for the review and test.

The use case for this view is to find out the number of different SQL statements
that are getting executed successfully on an instance by the application/user. 

I have few comments/questions.

--------
If you execute the query with EXECUTE then commandTag will be EXECUTE
that way we will not show the actual query type, I mean all the
statements will get the common tag "EXECUTE".

You can try this.
PREPARE fooplan (int) AS INSERT INTO t VALUES($1);
EXECUTE fooplan(1);

------

Yes, that's correct. Currently the count is incremented based on the base command,
because of this reason, the EXECUTE is increased, instead of the actual operation.

 
+ /* Destroy the SQL statement counter stats HashTable */
+ hash_destroy(pgstat_sql);
+
+ /* Create SQL statement counter Stats HashTable */

I think in the patch we have used mixed of "Stats HashTable" and
"stats HashTable", I think better
to be consistent throughout the patch. Check other similar instances.

----------------

Corrected.
 

@@ -1102,6 +1102,12 @@ exec_simple_query(const char *query_string)

  PortalDrop(portal, false);

+ /*
+ * Count SQL statement for pg_stat_sql view
+ */
+ if (pgstat_track_sql)
+ pgstat_count_sqlstmt(commandTag);

We are only counting the successful SQL statement, Is that intentional?

Yes, I thought of counting the successful SQL operations that changed the
database over a period of time. I am not sure whether there will be many
failure operations that can occur on an instance to be counted.
 
------
I have noticed that pgstat_count_sqlstmt is called from
exec_simple_query and exec_execute_message. Don't we want to track the
statement executed from functions?
-------
 
The view is currently designed to count user/application initiated SQL statements,
but not the internal SQL queries that are getting executed from functions and etc.

>>+void
>>+pgstat_count_sqlstmt(const char *commandTag)
>>+{
>>+ PgStat_SqlstmtEntry *htabent;
>>+ bool found;
>>+
>>+ if (!pgstat_track_sql)
>>+ return
>
>Callers of pgstat_count_sqlstmt are always ensuring that
>pgstat_track_sql is true, seems it's redundant here.

Removed the check in pgstat_count_sqlstmt statement and add it exec_execute_message
function where the check is missed.

Updated patch attached.
I have reviewed the patch. All the test cases works fine.

Regards,
Vinayak Pokale
NTT Open Source Software Center

Re: [HACKERS] New SQL counter statistics view (pg_stat_sql)

From
Dilip Kumar
Date:
On Wed, Jan 18, 2017 at 10:45 AM, Haribabu Kommi
<kommi.haribabu@gmail.com> wrote:
> The use case for this view is to find out the number of different SQL
> statements
> that are getting executed successfully on an instance by the
> application/user.
>
>> I have few comments/questions.
>>
>> --------
>> If you execute the query with EXECUTE then commandTag will be EXECUTE
>> that way we will not show the actual query type, I mean all the
>> statements will get the common tag "EXECUTE".
>>
>> You can try this.
>> PREPARE fooplan (int) AS INSERT INTO t VALUES($1);
>> EXECUTE fooplan(1);
>>
>> ------
>
>
> Yes, that's correct. Currently the count is incremented based on the base
> command,
> because of this reason, the EXECUTE is increased, instead of the actual
> operation.

Okay, As per your explaination, and Robert also pointed out that it
can be a feature for someone and bug for others. I would have
preferred it shows actual SQL statement. But I have no objection to
what you are doing.
>
>
>>
>> + /* Destroy the SQL statement counter stats HashTable */
>> + hash_destroy(pgstat_sql);
>> +
>> + /* Create SQL statement counter Stats HashTable */
>>
>> I think in the patch we have used mixed of "Stats HashTable" and
>> "stats HashTable", I think better
>> to be consistent throughout the patch. Check other similar instances.
>>
>> ----------------
>
>
> Corrected.
>
>>
>>
>> @@ -1102,6 +1102,12 @@ exec_simple_query(const char *query_string)
>>
>>   PortalDrop(portal, false);
>>
>> + /*
>> + * Count SQL statement for pg_stat_sql view
>> + */
>> + if (pgstat_track_sql)
>> + pgstat_count_sqlstmt(commandTag);
>>
>> We are only counting the successful SQL statement, Is that intentional?
>
>
> Yes, I thought of counting the successful SQL operations that changed the
> database over a period of time. I am not sure whether there will be many
> failure operations that can occur on an instance to be counted.

Okay..
>
>>
>> ------
>> I have noticed that pgstat_count_sqlstmt is called from
>> exec_simple_query and exec_execute_message. Don't we want to track the
>> statement executed from functions?
>> -------
> The view is currently designed to count user/application initiated SQL
> statements,
> but not the internal SQL queries that are getting executed from functions
> and etc.

Okay..
>
>>>+void
>>>+pgstat_count_sqlstmt(const char *commandTag)
>>>+{
>>>+ PgStat_SqlstmtEntry *htabent;
>>>+ bool found;
>>>+
>>>+ if (!pgstat_track_sql)
>>>+ return
>>
>>Callers of pgstat_count_sqlstmt are always ensuring that
>>pgstat_track_sql is true, seems it's redundant here.
>
> Removed the check in pgstat_count_sqlstmt statement and add it
> exec_execute_message
> function where the check is missed.
>
> Updated patch attached.

Overall patch looks fine to me and marking it "ready for committer".

There is two design decision, which I leave it for committer's decision.

1.  EXECUTE statement should show only as EXECUTE count, not the
actual SQL query.
2.  should we count statement execute from Functions or only statement
what is executed directly by the user as it's doing now?


-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



Re: [HACKERS] New SQL counter statistics view (pg_stat_sql)

From
Haribabu Kommi
Date:


On Tue, Jan 24, 2017 at 3:59 PM, Dilip Kumar <dilipbalaut@gmail.com> wrote:

Overall patch looks fine to me and marking it "ready for committer".

There is two design decision, which I leave it for committer's decision.

1.  EXECUTE statement should show only as EXECUTE count, not the
actual SQL query.
2.  should we count statement execute from Functions or only statement
what is executed directly by the user as it's doing now?

Thanks for the review.
Let's wait for the committer's opinion.

Regards,
Hari Babu
Fujitsu Australia

Re: [HACKERS] New SQL counter statistics view (pg_stat_sql)

From
Michael Paquier
Date:
On Fri, Jan 27, 2017 at 10:26 AM, Haribabu Kommi
<kommi.haribabu@gmail.com> wrote:
> Thanks for the review.
> Let's wait for the committer's opinion.

I have moved this patch to CF 2017-03 to wait for this to happen.
-- 
Michael



Re: [HACKERS] New SQL counter statistics view (pg_stat_sql)

From
Haribabu Kommi
Date:


On Wed, Feb 1, 2017 at 3:13 PM, Michael Paquier <michael.paquier@gmail.com> wrote:
On Fri, Jan 27, 2017 at 10:26 AM, Haribabu Kommi
<kommi.haribabu@gmail.com> wrote:
> Thanks for the review.
> Let's wait for the committer's opinion.

I have moved this patch to CF 2017-03 to wait for this to happen.

Attached a rebased patch to resolve the OID conflict.

Regards,
Hari Babu
Fujitsu Australia
Attachment

Re: New SQL counter statistics view (pg_stat_sql)

From
Andres Freund
Date:
Hi,


I'm somewhat inclined to think that this really would be better done in
an extension like pg_stat_statements.


On 2017-03-08 14:39:23 +1100, Haribabu Kommi wrote:
> On Wed, Feb 1, 2017 at 3:13 PM, Michael Paquier <michael.paquier@gmail.com>

> +     <varlistentry id="guc-track-sql" xreflabel="track_sql">
> +      <term><varname>track_sql</varname> (<type>boolean</type>)
> +      <indexterm>
> +       <primary><varname>track_sql</> configuration parameter</primary>
> +      </indexterm>
> +      </term>
> +      <listitem>
> +       <para>
> +        Enables collection of different SQL statement statistics that are
> +        executed on the instance. This parameter is off by default. Only
> +        superusers can change this setting.
> +       </para>
> +      </listitem>
> +     </varlistentry>
> +

I don't like this name much, seems a bit too generic to
me. 'track_activities', 'track_io_timings' are at least a bit clearer.
How about track_statement_statistics + corresponding view/function renaming?


> +  <table id="pg-stat-sql-view" xreflabel="pg_stat_sql">
> +   <title><structname>pg_stat_sql</structname> View</title>
> +   <tgroup cols="3">
> +    <thead>
> +    <row>
> +      <entry>Column</entry>
> +      <entry>Type</entry>
> +      <entry>Description</entry>
> +     </row>
> +    </thead>
> +
> +   <tbody>
> +    <row>
> +     <entry><structfield>tag</></entry>
> +     <entry><type>text</></entry>
> +     <entry>Name of the SQL statement</entry>
> +    </row>

It's not really the "name" of a statement. Internally and IIRC elsewhere
in the docs we describe something like this as tag?


> +/* ----------
> + * pgstat_send_sqlstmt(void)
> + *
> + *        Send SQL statement statistics to the collector
> + * ----------
> + */
> +static void
> +pgstat_send_sqlstmt(void)
> +{
> +    PgStat_MsgSqlstmt msg;
> +    PgStat_SqlstmtEntry *entry;
> +    HASH_SEQ_STATUS hstat;
> +
> +    if (pgstat_backend_sql == NULL)
> +        return;
> +
> +    pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SQLSTMT);
> +    msg.m_nentries = 0;
> +
> +    hash_seq_init(&hstat, pgstat_backend_sql);
> +    while ((entry = (PgStat_SqlstmtEntry *) hash_seq_search(&hstat)) != NULL)
> +    {
> +        PgStat_SqlstmtEntry *m_ent;
> +
> +        /* Skip it if no counts accumulated since last time */
> +        if (entry->count == 0)
> +            continue;
> +
> +        /* need to convert format of time accumulators */
> +        m_ent = &msg.m_entry[msg.m_nentries];
> +        memcpy(m_ent, entry, sizeof(PgStat_SqlstmtEntry));
> +
> +        if (++msg.m_nentries >= PGSTAT_NUM_SQLSTMTS)
> +        {
> +            pgstat_send(&msg, offsetof(PgStat_MsgSqlstmt, m_entry[0]) +
> +                        msg.m_nentries * sizeof(PgStat_SqlstmtEntry));
> +            msg.m_nentries = 0;
> +        }
> +
> +        /* reset the entry's count */
> +        entry->count = 0;
> +    }
> +
> +    if (msg.m_nentries > 0)
> +        pgstat_send(&msg, offsetof(PgStat_MsgSqlstmt, m_entry[0]) +
> +                    msg.m_nentries * sizeof(PgStat_SqlstmtEntry));

Hm. So pgstat_backend_sql is never deleted, which'll mean that if a
backend lives long we'll continually iterate over a lot of 0 entries.  I
think performance evaluation of this feature should take that into
account.


> +}
> +
> +/*
> + * Count SQL statement for pg_stat_sql view
> + */
> +void
> +pgstat_count_sqlstmt(const char *commandTag)
> +{
> +    PgStat_SqlstmtEntry *htabent;
> +    bool        found;
> +
> +    if (!pgstat_backend_sql)
> +    {
> +        /* First time through - initialize SQL statement stat table */
> +        HASHCTL        hash_ctl;
> +
> +        memset(&hash_ctl, 0, sizeof(hash_ctl));
> +        hash_ctl.keysize = NAMEDATALEN;
> +        hash_ctl.entrysize = sizeof(PgStat_SqlstmtEntry);
> +        pgstat_backend_sql = hash_create("SQL statement stat entries",
> +                                         PGSTAT_SQLSTMT_HASH_SIZE,
> +                                         &hash_ctl,
> +                                         HASH_ELEM | HASH_BLOBS);
> +    }
> +
> +    /* Get the stats entry for this SQL statement, create if necessary */
> +    htabent = hash_search(pgstat_backend_sql, commandTag,
> +                          HASH_ENTER, &found);
> +    if (!found)
> +        htabent->count = 1;
> +    else
> +        htabent->count++;
> +}


That's not really something in this patch, but a lot of this would be
better if we didn't internally have command tags as strings, but as an
enum.  We should really only convert to a string with needed.  That
we're doing string comparisons on Portal->commandTag is just plain bad.

If so, we could also make this whole patch a lot cheaper - have a fixed
size array that has an entry for every possible tag (possibly even in
shared memory, and then use atomics there).

> +++ b/src/backend/tcop/postgres.c
> @@ -1109,6 +1109,12 @@ exec_simple_query(const char *query_string)
>  
>          PortalDrop(portal, false);
>  
> +        /*
> +         * Count SQL statement for pg_stat_sql view
> +         */
> +        if (pgstat_track_sql)
> +            pgstat_count_sqlstmt(commandTag);
> +

Isn't doing this at the message level quite wrong?  What about
statements executed from functions and such?  Shouldn't this integrate
at the executor level instead?


>          if (IsA(parsetree->stmt, TransactionStmt))
>          {
>              /*
> @@ -1991,6 +1997,29 @@ exec_execute_message(const char *portal_name, long max_rows)
>  
>      (*receiver->rDestroy) (receiver);
>  
> +    /*
> +     * Count SQL Statement for pgx_stat_sql
> +     */
> +    if (pgstat_track_sql)
> +    {
> +        CachedPlanSource *psrc = NULL;
> +
> +        if (portal->prepStmtName)
> +        {
> +            PreparedStatement *pstmt;
> +
> +            pstmt = FetchPreparedStatement(portal->prepStmtName, false);
> +            if (pstmt)
> +                psrc = pstmt->plansource;
> +        }
> +        else
> +            psrc = unnamed_stmt_psrc;
> +
> +        /* psrc should not be NULL here */
> +        if (psrc && psrc->commandTag && !execute_is_fetch && pgstat_track_sql)
> +            pgstat_count_sqlstmt(psrc->commandTag);

Wait, we're re-fetching the statement here?  That doesn't sound
alright.


> +Datum
> +pg_stat_sql(PG_FUNCTION_ARGS)
> +{
> +    TupleDesc    tupdesc;
> +    Datum        values[2];
> +    bool        nulls[2];
> +    ReturnSetInfo *rsi;
> +    MemoryContext old_cxt;
> +    Tuplestorestate *tuple_store;
> +    PgStat_SqlstmtEntry *sqlstmtentry;
> +    HASH_SEQ_STATUS hstat;
> +
> +    /* Function call to let the backend read the stats file */
> +    pgstat_fetch_global();
> +
> +    /* Initialize values and nulls arrays */
> +    MemSet(values, 0, sizeof(values));
> +    MemSet(nulls, 0, sizeof(nulls));

why not instead declare values[2] = {0}?  Obviously not important.


> +    tuple_store =
> +        tuplestore_begin_heap(rsi->allowedModes & SFRM_Materialize_Random,
> +                              false, work_mem);

Huh, why do we need random?



Greetings,

Andres Freund



Re: [HACKERS] New SQL counter statistics view (pg_stat_sql)

From
Haribabu Kommi
Date:


On Thu, Apr 6, 2017 at 5:17 AM, Andres Freund <andres@anarazel.de> wrote:
Hi,


I'm somewhat inclined to think that this really would be better done in
an extension like pg_stat_statements.

Thanks for the review.

 

> +}
> +
> +/*
> + * Count SQL statement for pg_stat_sql view
> + */
> +void
> +pgstat_count_sqlstmt(const char *commandTag)
> +{
> +     PgStat_SqlstmtEntry *htabent;
> +     bool            found;
> +
> +     if (!pgstat_backend_sql)
> +     {
> +             /* First time through - initialize SQL statement stat table */
> +             HASHCTL         hash_ctl;
> +
> +             memset(&hash_ctl, 0, sizeof(hash_ctl));
> +             hash_ctl.keysize = NAMEDATALEN;
> +             hash_ctl.entrysize = sizeof(PgStat_SqlstmtEntry);
> +             pgstat_backend_sql = hash_create("SQL statement stat entries",
> +                                                                              PGSTAT_SQLSTMT_HASH_SIZE,
> +                                                                              &hash_ctl,
> +                                                                              HASH_ELEM | HASH_BLOBS);
> +     }
> +
> +     /* Get the stats entry for this SQL statement, create if necessary */
> +     htabent = hash_search(pgstat_backend_sql, commandTag,
> +                                               HASH_ENTER, &found);
> +     if (!found)
> +             htabent->count = 1;
> +     else
> +             htabent->count++;
> +}


That's not really something in this patch, but a lot of this would be
better if we didn't internally have command tags as strings, but as an
enum.  We should really only convert to a string with needed.  That
we're doing string comparisons on Portal->commandTag is just plain bad.

If so, we could also make this whole patch a lot cheaper - have a fixed
size array that has an entry for every possible tag (possibly even in
shared memory, and then use atomics there).

During the development, I thought of using an array of all command tags
and update the counters using the tag name, but not like the enum values.
Now I have to create a mapping array with enums and tag names for easier
counter updates. The problem in this approach is, whenever any addition
of new commands, this mapping array needs to be updated with both
new enum and new tag name, whereas with hash table approach, it works
future command additions also, but there is some performance overhead
compared to the array approach.

I will modify the patch to use the array approach and provide it to the next
commitfest.

 
> +++ b/src/backend/tcop/postgres.c
> @@ -1109,6 +1109,12 @@ exec_simple_query(const char *query_string)
>
>               PortalDrop(portal, false);
>
> +             /*
> +              * Count SQL statement for pg_stat_sql view
> +              */
> +             if (pgstat_track_sql)
> +                     pgstat_count_sqlstmt(commandTag);
> +

Isn't doing this at the message level quite wrong?  What about
statements executed from functions and such?  Shouldn't this integrate
at the executor level instead?

Currently the patch is designed to find out only the user executed statements
that are successfully finished, (no internal statements that are executed from
functions and etc).

The same question was asked by dilip also in earlier mails, may be now it is
the time that we can decide the approach of statement counts.


Regards,
Hari Babu
Fujitsu Australia

RE: New SQL counter statistics view (pg_stat_sql)

From
"Smith, Peter"
Date:
Dear Hackers,

We have resurrected this 2 year old "SQL statements statistics counter" proposal from Hari.

The attached patch addresses the previous review issues.

The commit-fest entry has been moved forward to the next commit-fest
https://commitfest.postgresql.org/25/790/

Please review again, and consider if this is OK for "Ready for Committer" status.

Kind Regards
--
Peter Smith
Fujitsu Australia



-----Original Message-----
From: pgsql-hackers-owner@postgresql.org <pgsql-hackers-owner@postgresql.org> On Behalf Of Andres Freund
Sent: Thursday, 6 April 2017 5:18 AM
To: Haribabu Kommi <kommi.haribabu@gmail.com>
Cc: Michael Paquier <michael.paquier@gmail.com>; Dilip Kumar <dilipbalaut@gmail.com>; vinayak
<Pokale_Vinayak_q3@lab.ntt.co.jp>;pgsql-hackers@postgresql.org 
Subject: Re: New SQL counter statistics view (pg_stat_sql)

Hi,


I'm somewhat inclined to think that this really would be better done in an extension like pg_stat_statements.


On 2017-03-08 14:39:23 +1100, Haribabu Kommi wrote:
> On Wed, Feb 1, 2017 at 3:13 PM, Michael Paquier
> <michael.paquier@gmail.com>

> +     <varlistentry id="guc-track-sql" xreflabel="track_sql">
> +      <term><varname>track_sql</varname> (<type>boolean</type>)
> +      <indexterm>
> +       <primary><varname>track_sql</> configuration parameter</primary>
> +      </indexterm>
> +      </term>
> +      <listitem>
> +       <para>
> +        Enables collection of different SQL statement statistics that are
> +        executed on the instance. This parameter is off by default. Only
> +        superusers can change this setting.
> +       </para>
> +      </listitem>
> +     </varlistentry>
> +

I don't like this name much, seems a bit too generic to me. 'track_activities', 'track_io_timings' are at least a bit
clearer.
How about track_statement_statistics + corresponding view/function renaming?


> +  <table id="pg-stat-sql-view" xreflabel="pg_stat_sql">
> +   <title><structname>pg_stat_sql</structname> View</title>
> +   <tgroup cols="3">
> +    <thead>
> +    <row>
> +      <entry>Column</entry>
> +      <entry>Type</entry>
> +      <entry>Description</entry>
> +     </row>
> +    </thead>
> +
> +   <tbody>
> +    <row>
> +     <entry><structfield>tag</></entry>
> +     <entry><type>text</></entry>
> +     <entry>Name of the SQL statement</entry>
> +    </row>

It's not really the "name" of a statement. Internally and IIRC elsewhere in the docs we describe something like this as
tag?


> +/* ----------
> + * pgstat_send_sqlstmt(void)
> + *
> + *        Send SQL statement statistics to the collector
> + * ----------
> + */
> +static void
> +pgstat_send_sqlstmt(void)
> +{
> +    PgStat_MsgSqlstmt msg;
> +    PgStat_SqlstmtEntry *entry;
> +    HASH_SEQ_STATUS hstat;
> +
> +    if (pgstat_backend_sql == NULL)
> +        return;
> +
> +    pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SQLSTMT);
> +    msg.m_nentries = 0;
> +
> +    hash_seq_init(&hstat, pgstat_backend_sql);
> +    while ((entry = (PgStat_SqlstmtEntry *) hash_seq_search(&hstat)) != NULL)
> +    {
> +        PgStat_SqlstmtEntry *m_ent;
> +
> +        /* Skip it if no counts accumulated since last time */
> +        if (entry->count == 0)
> +            continue;
> +
> +        /* need to convert format of time accumulators */
> +        m_ent = &msg.m_entry[msg.m_nentries];
> +        memcpy(m_ent, entry, sizeof(PgStat_SqlstmtEntry));
> +
> +        if (++msg.m_nentries >= PGSTAT_NUM_SQLSTMTS)
> +        {
> +            pgstat_send(&msg, offsetof(PgStat_MsgSqlstmt, m_entry[0]) +
> +                        msg.m_nentries * sizeof(PgStat_SqlstmtEntry));
> +            msg.m_nentries = 0;
> +        }
> +
> +        /* reset the entry's count */
> +        entry->count = 0;
> +    }
> +
> +    if (msg.m_nentries > 0)
> +        pgstat_send(&msg, offsetof(PgStat_MsgSqlstmt, m_entry[0]) +
> +                    msg.m_nentries * sizeof(PgStat_SqlstmtEntry));

Hm. So pgstat_backend_sql is never deleted, which'll mean that if a backend lives long we'll continually iterate over a
lotof 0 entries.  I think performance evaluation of this feature should take that into account. 


> +}
> +
> +/*
> + * Count SQL statement for pg_stat_sql view  */ void
> +pgstat_count_sqlstmt(const char *commandTag) {
> +    PgStat_SqlstmtEntry *htabent;
> +    bool        found;
> +
> +    if (!pgstat_backend_sql)
> +    {
> +        /* First time through - initialize SQL statement stat table */
> +        HASHCTL        hash_ctl;
> +
> +        memset(&hash_ctl, 0, sizeof(hash_ctl));
> +        hash_ctl.keysize = NAMEDATALEN;
> +        hash_ctl.entrysize = sizeof(PgStat_SqlstmtEntry);
> +        pgstat_backend_sql = hash_create("SQL statement stat entries",
> +                                         PGSTAT_SQLSTMT_HASH_SIZE,
> +                                         &hash_ctl,
> +                                         HASH_ELEM | HASH_BLOBS);
> +    }
> +
> +    /* Get the stats entry for this SQL statement, create if necessary */
> +    htabent = hash_search(pgstat_backend_sql, commandTag,
> +                          HASH_ENTER, &found);
> +    if (!found)
> +        htabent->count = 1;
> +    else
> +        htabent->count++;
> +}


That's not really something in this patch, but a lot of this would be better if we didn't internally have command tags
asstrings, but as an enum.  We should really only convert to a string with needed.  That we're doing string comparisons
onPortal->commandTag is just plain bad. 

If so, we could also make this whole patch a lot cheaper - have a fixed size array that has an entry for every possible
tag(possibly even in shared memory, and then use atomics there). 

> +++ b/src/backend/tcop/postgres.c
> @@ -1109,6 +1109,12 @@ exec_simple_query(const char *query_string)
>
>          PortalDrop(portal, false);
>
> +        /*
> +         * Count SQL statement for pg_stat_sql view
> +         */
> +        if (pgstat_track_sql)
> +            pgstat_count_sqlstmt(commandTag);
> +

Isn't doing this at the message level quite wrong?  What about statements executed from functions and such?  Shouldn't
thisintegrate at the executor level instead? 


>          if (IsA(parsetree->stmt, TransactionStmt))
>          {
>              /*
> @@ -1991,6 +1997,29 @@ exec_execute_message(const char *portal_name,
> long max_rows)
>
>      (*receiver->rDestroy) (receiver);
>
> +    /*
> +     * Count SQL Statement for pgx_stat_sql
> +     */
> +    if (pgstat_track_sql)
> +    {
> +        CachedPlanSource *psrc = NULL;
> +
> +        if (portal->prepStmtName)
> +        {
> +            PreparedStatement *pstmt;
> +
> +            pstmt = FetchPreparedStatement(portal->prepStmtName, false);
> +            if (pstmt)
> +                psrc = pstmt->plansource;
> +        }
> +        else
> +            psrc = unnamed_stmt_psrc;
> +
> +        /* psrc should not be NULL here */
> +        if (psrc && psrc->commandTag && !execute_is_fetch && pgstat_track_sql)
> +            pgstat_count_sqlstmt(psrc->commandTag);

Wait, we're re-fetching the statement here?  That doesn't sound alright.


> +Datum
> +pg_stat_sql(PG_FUNCTION_ARGS)
> +{
> +    TupleDesc    tupdesc;
> +    Datum        values[2];
> +    bool        nulls[2];
> +    ReturnSetInfo *rsi;
> +    MemoryContext old_cxt;
> +    Tuplestorestate *tuple_store;
> +    PgStat_SqlstmtEntry *sqlstmtentry;
> +    HASH_SEQ_STATUS hstat;
> +
> +    /* Function call to let the backend read the stats file */
> +    pgstat_fetch_global();
> +
> +    /* Initialize values and nulls arrays */
> +    MemSet(values, 0, sizeof(values));
> +    MemSet(nulls, 0, sizeof(nulls));

why not instead declare values[2] = {0}?  Obviously not important.


> +    tuple_store =
> +        tuplestore_begin_heap(rsi->allowedModes & SFRM_Materialize_Random,
> +                              false, work_mem);

Huh, why do we need random?



Greetings,

Andres Freund


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers



Attachment

Re: New SQL counter statistics view (pg_stat_sql)

From
Thomas Munro
Date:
On Thu, Oct 17, 2019 at 3:22 PM Smith, Peter <peters@fast.au.fujitsu.com> wrote:
> We have resurrected this 2 year old "SQL statements statistics counter" proposal from Hari.
>
> The attached patch addresses the previous review issues.

Hi,

No comment on the patch but I noticed that the documentation changes
don't build.  Please make sure you can "make docs" successfully,
having installed the documentation tools[1].

[1] https://www.postgresql.org/docs/devel/docguide-toolsets.html



RE: New SQL counter statistics view (pg_stat_sql)

From
"Smith, Peter"
Date:
From: Thomas Munro <thomas.munro@gmail.com> Sent: Monday, 4 November 2019 1:43 PM

> No comment on the patch but I noticed that the documentation changes don't build.  Please make sure you can "make
docs"successfully, having installed the documentation tools[1].
 

Thanks for the feedback. An updated patch which fixes the docs issue is attached.

Kind Regards.
Peter Smith
---
Fujitsu Australia


Attachment

Re: New SQL counter statistics view (pg_stat_sql)

From
Alvaro Herrera
Date:
On 2019-Nov-13, Smith, Peter wrote:

> From: Thomas Munro <thomas.munro@gmail.com> Sent: Monday, 4 November 2019 1:43 PM
> 
> > No comment on the patch but I noticed that the documentation changes don't build.  Please make sure you can "make
docs"successfully, having installed the documentation tools[1].
 
> 
> Thanks for the feedback. An updated patch which fixes the docs issue is attached.

So, who is updating this patch for the new cmdtaglist.h stuff?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: New SQL counter statistics view (pg_stat_sql)

From
Mark Dilger
Date:

> On Mar 3, 2020, at 6:50 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>
> On 2019-Nov-13, Smith, Peter wrote:
>
>> From: Thomas Munro <thomas.munro@gmail.com> Sent: Monday, 4 November 2019 1:43 PM
>>
>>> No comment on the patch but I noticed that the documentation changes don't build.  Please make sure you can "make
docs"successfully, having installed the documentation tools[1]. 
>>
>> Thanks for the feedback. An updated patch which fixes the docs issue is attached.
>
> So, who is updating this patch for the new cmdtaglist.h stuff?

I have a much altered (competing?) patch inspired by pg_stat_sql.  I am working to integrate the changes you made to my
commandtagpatch into my commandstats patch before posting it. 

It might be good if somebody else tackled the pg_stat_sql patch in case my version is rejected.

I don't want to highjack this thread to talk in detail about the other patch, so I'll just give an overview that might
beuseful for anybody thinking about putting effort into committing this patch first.  

The motivation for the commandtag patch that you've committed (thanks again!) was to make the commandstats patch
possible. I didn't like the way pg_stat_sql stored the tags as strings, but I also still don't like the way it needs to
takea lock every time a backend increments the counter for a command.  That is bad, I think, because the lock overhead
foran otherwise fast command (like moving a cursor) may be high enough to discourage users from turning this feature on
inproduction. 

I solve that by trading off speed and memory.  For most commandtags, there is a single uint64 counter, and incrementing
thecounter requires taking a lock.  This is very similar to how the other patch works.  For the few commandtags that we
deemworthy of the special treatment, there is an additional uint32 counter *per backend* reserved in shared memory.
Backendsupdate this counter using atomics, but do not require locks.  When the counter is in danger of overflow, locks
aretaken to roll the count into the uint64 counters and zero out the backend-specific uint32 counter.  I have roughly
20command tags receiving this special treatment, but the design of the patch doesn't make it too hard to change the
exactlist of tags worthy of it, so I don't plan to stress too much about the exact set of tags.  I'm sure you'll all
tellme which ones you think do or do not deserve the treatment. 

If it isn't obvious, the logic of having this treatment for only a subset of tags is that 192 tags * 4 bytes per
counter* MaxBackends gets expensive for folks running a high number of backends.  Taking that from 192 tags down to one
ortwo dozen seems worthwhile. 

I expect to post on a separate thread before the day is over.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Command statistics system (cmdstats)

From
Mark Dilger
Date:
Hackers,

as mentioned in [1], I have created an implementation of command counter statistics very similar in purpose to the one
alreadypending in the commitfest going by the name "pg_stat_sql".  I don't really care if this implementation is seen
asbuilding on that one or as separate, but I was worried about hijacking that thread, so I'm starting this thead.
Thereare elements of this patch that borrowed from that one, so if this is accepted, authorship should reflect that. 

See the work by Haribabu Kommi (haribabu) at https://commitfest.postgresql.org/27/790/

The two main differences are that

(1) This implementation is based on commandtags as enums, not strings and
(2) This implementation uses techniques to reduce lock contention

I think (2) is the more important part.

[1] https://www.postgresql.org/message-id/28F46159-3764-421B-904B-004DEA339310%40enterprisedb.com


—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Attachment

Re: Command statistics system (cmdstats)

From
Mark Dilger
Date:
> On Mar 4, 2020, at 7:43 PM, Mark Dilger <mark.dilger@enterprisedb.com> wrote:
>
> Hackers,
>
> as mentioned in [1], I have created an implementation of command counter statistics very similar in purpose to the
onealready pending in the commitfest going by the name "pg_stat_sql".  I don't really care if this implementation is
seenas building on that one or as separate, but I was worried about hijacking that thread, so I'm starting this thead.
Thereare elements of this patch that borrowed from that one, so if this is accepted, authorship should reflect that. 
>
> See the work by Haribabu Kommi (haribabu) at https://commitfest.postgresql.org/27/790/
>
> The two main differences are that
>
> (1) This implementation is based on commandtags as enums, not strings and
> (2) This implementation uses techniques to reduce lock contention
>
> I think (2) is the more important part.
>
> [1] https://www.postgresql.org/message-id/28F46159-3764-421B-904B-004DEA339310%40enterprisedb.com
>
> <v1-0001-Implementing-the-cmdstats-subsystem.patch>
> —

Rebased patch against master caa3c4242cf86322e2ed0c86199e6462a2c41565.



—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Attachment

Re: Command statistics system (cmdstats)

From
Mark Dilger
Date:

> On Jun 2, 2020, at 9:58 AM, Mark Dilger <mark.dilger@enterprisedb.com> wrote:
>
>>
>> On Mar 4, 2020, at 7:43 PM, Mark Dilger <mark.dilger@enterprisedb.com> wrote:
>>
>> Hackers,
>>
>> as mentioned in [1], I have created an implementation of command counter statistics very similar in purpose to the
onealready pending in the commitfest going by the name "pg_stat_sql".  I don't really care if this implementation is
seenas building on that one or as separate, but I was worried about hijacking that thread, so I'm starting this thead.
Thereare elements of this patch that borrowed from that one, so if this is accepted, authorship should reflect that. 
>>
>> See the work by Haribabu Kommi (haribabu) at https://commitfest.postgresql.org/27/790/
>>
>> The two main differences are that
>>
>> (1) This implementation is based on commandtags as enums, not strings and
>> (2) This implementation uses techniques to reduce lock contention
>>
>> I think (2) is the more important part.
>>
>> [1] https://www.postgresql.org/message-id/28F46159-3764-421B-904B-004DEA339310%40enterprisedb.com
>>
>> <v1-0001-Implementing-the-cmdstats-subsystem.patch>
>> —
>
> Rebased patch against master caa3c4242cf86322e2ed0c86199e6462a2c41565.



—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Attachment

Re: Command statistics system (cmdstats)

From
Robert Haas
Date:
On Wed, Mar 4, 2020 at 10:43 PM Mark Dilger
<mark.dilger@enterprisedb.com> wrote:
> The two main differences are that
>
> (1) This implementation is based on commandtags as enums, not strings and
> (2) This implementation uses techniques to reduce lock contention
>
> I think (2) is the more important part.

My spidey sense is tingling here, telling me that we need some actual
benchmarking. Like, suppose we test the two patches under normal cases
and under cases that are constructed to be as bad as possible for each
of them. Or suppose we test this patch with the lock mitigation
strategies and then remove the mitigations for some inexpensive
command (e.g. SHOW) and then use pgbench to spam that command. Like
you, I suspect that the locking mitigations are important in some
workloads, but it would be good to have some figures to back that out,
as well as to find out whether there's still too much overhead.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Command statistics system (cmdstats)

From
Michael Paquier
Date:
On Wed, Jun 10, 2020 at 01:45:02PM -0400, Robert Haas wrote:
> My spidey sense is tingling here, telling me that we need some actual
> benchmarking. Like, suppose we test the two patches under normal cases
> and under cases that are constructed to be as bad as possible for each
> of them. Or suppose we test this patch with the lock mitigation
> strategies and then remove the mitigations for some inexpensive
> command (e.g. SHOW) and then use pgbench to spam that command. Like
> you, I suspect that the locking mitigations are important in some
> workloads, but it would be good to have some figures to back that out,
> as well as to find out whether there's still too much overhead.

This patch has not received any replies after this comment for three
months, so I am marking it as returned with feedback.  I agree that
this should be benchmarked carefully.
--
Michael

Attachment

Re: New SQL counter statistics view (pg_stat_sql)

From
Michael Paquier
Date:
On Tue, Mar 03, 2020 at 11:50:04PM -0300, Alvaro Herrera wrote:
> So, who is updating this patch for the new cmdtaglist.h stuff?

This patch had no activity for months, so I am marking it as returned
with feedback.
--
Michael

Attachment

Re: Command statistics system (cmdstats)

From
Alvaro Herrera
Date:
On 2020-Sep-17, Michael Paquier wrote:

> On Wed, Jun 10, 2020 at 01:45:02PM -0400, Robert Haas wrote:
> > My spidey sense is tingling here, telling me that we need some actual
> > benchmarking.
> 
> This patch has not received any replies after this comment for three
> months, so I am marking it as returned with feedback.  I agree that
> this should be benchmarked carefully.

It seems fine to mark the patch as RwF at the end of commitfest, but
that's two weeks away.  I don't understand what is accomplished by doing
it ahead of time, other than alienating the patch authors.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Command statistics system (cmdstats)

From
Mark Dilger
Date:

> On Sep 17, 2020, at 5:40 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>
> On 2020-Sep-17, Michael Paquier wrote:
>
>> On Wed, Jun 10, 2020 at 01:45:02PM -0400, Robert Haas wrote:
>>> My spidey sense is tingling here, telling me that we need some actual
>>> benchmarking.
>>
>> This patch has not received any replies after this comment for three
>> months, so I am marking it as returned with feedback.  I agree that
>> this should be benchmarked carefully.

Yes, I have prioritized a couple other patches over this one, with benchmarking this patch lower down my priority list.
Thank you, Michael, for the reminder! 

>
> It seems fine to mark the patch as RwF at the end of commitfest, but
> that's two weeks away.  I don't understand what is accomplished by doing
> it ahead of time, other than alienating the patch authors.

Thanks, Álvaro!  I do not feel alienated, and I hope Haribabu Kommi does not either.

Haribabu, do you have an opinion on which of the two patches should proceed?  I don't recall seeing a response from you
aboutwhether you liked what I did with your patch.  Perhaps you were waiting on the benchmarking results? 

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: Command statistics system (cmdstats)

From
Michael Paquier
Date:
On Thu, Sep 17, 2020 at 08:03:55AM -0700, Mark Dilger wrote:
> Yes, I have prioritized a couple other patches over this one, with
> benchmarking this patch lower down my priority list.  Thank you,
> Michael, for the reminder!

Based on the low level of activity and the fact that the patch was
marked as waiting on author for a couple of weeks, it looks like
little could be achieved by the end of the CF, and the attention was
elsewhere, so it looked better (and it still does IMO) to give more
attention to the remaining 170~ patches that are still lying on the CF
app.
--
Michael

Attachment

Re: Command statistics system (cmdstats)

From
Alvaro Herrera
Date:
On 2020-Sep-18, Michael Paquier wrote:

> Based on the low level of activity and the fact that the patch was
> marked as waiting on author for a couple of weeks, it looks like
> little could be achieved by the end of the CF, and the attention was
> elsewhere, so it looked better (and it still does IMO) to give more
> attention to the remaining 170~ patches that are still lying on the CF
> app.

"A couple of weeks" of inactivity is not sufficient, in my view, to boot
a patch out of the commitfest process.  Whenever the patch is
resurrected, it will be a new entry which won't have the history that it
had accumulated in the long time since it was created -- which biases
it against other new patches.

I'm not really arguing about this one patch only, but more generally
about the process you're following.  The problem is that you as CFM are
imposing your personal priorities on the whole process by punting
patches ahead of time -- you are saying that nobody else should be
looking at updating this patch because it is now closed.  It seems fair
to do so at the end of the commitfest, but it does not seem fair to do
it when it's still mid-cycle.

Putting also in perspective the history that the patch had prior to your
unfairly closing it, there was a lot of effort in getting it reviewed to
a good state and updated by the author --  yet a single unsubstantiated
comment, without itself a lot of effort on the reviewer's part, that
"maybe it has a perf drawback" is sufficient to get it booted?  It
doesn't seem appropriate.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Command statistics system (cmdstats)

From
Robert Haas
Date:
On Mon, Sep 21, 2020 at 9:41 AM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> "A couple of weeks" of inactivity is not sufficient, in my view, to boot
> a patch out of the commitfest process.  Whenever the patch is
> resurrected, it will be a new entry which won't have the history that it
> had accumulated in the long time since it was created -- which biases
> it against other new patches.

As Michael said, it had been inactive for three *months*. I don't
think there's a big problem here. I think that the redesign of the
CommitFest application encourages carrying things along from CF to CF
forever, instead of getting rid of things that aren't wanted or aren't
making any progress. That's work we don't need. There ought to be a
way to mark a patch RwF when nothing's happen that lets it be revived
later if and when someone gets around to resubmitting, but I think
that's not possible now.

Then, too, since we have email integration now, maybe we ought to do
that automatically if the thread isn't getting updated and the patch
is setting there waiting on author. It's a real waste of CFM time to
chase down and kick out obviously-inactive patches, and if the CFM is
going to get flack for it then that's even worse. Like, do we have 170
patches now because we have more activity than a few years ago, or
just because we've become more reluctant to boot things? I don't know,
but to the extent it's from unwillingness to boot things, I think
that's bad. Discouraging contributors is not good, but wasting CFM and
commiter time is also bad. You've got to draw a line someplace or
you'll go nuts.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Command statistics system (cmdstats)

From
Michael Paquier
Date:
On Mon, Sep 21, 2020 at 10:41:46AM -0300, Alvaro Herrera wrote:
> "A couple of weeks" of inactivity is not sufficient, in my view, to boot
> a patch out of the commitfest process.  Whenever the patch is
> resurrected, it will be a new entry which won't have the history that it
> had accumulated in the long time since it was created -- which biases
> it against other new patches.

Any of the patches that have been marked as RwF last week have been
waiting on author since at least the end of the commit fest of July,
with zero actions taken on them.  In my experience, this points
strongly to the fact that these are unlikely going to be worked on
actively in a short time frame.

> I'm not really arguing about this one patch only, but more generally
> about the process you're following.  The problem is that you as CFM are
> imposing your personal priorities on the whole process by punting
> patches ahead of time -- you are saying that nobody else should be
> looking at updating this patch because it is now closed.  It seems fair
> to do so at the end of the commitfest, but it does not seem fair to do
> it when it's still mid-cycle.

I think that this remark is a bit unfair.  When it comes to any patch
in the commit fest, and I have managed a couple of them over the
years, I have tried to keep a neutral view for all of them, meaning
that I only let the numbers speak by themselves, mostly:
- When has a patch lastly been updated.
- What's the current state in the CF app  If the patch had no reviews
and still in "Needs review", simply switch it to "waiting on author".
If the patch has been waiting on author for more than two weeks, it
would get marked as RwF at the end of the CF.  There are also cases
where the status of the patch is incorrect, mostly that a patch
waiting on author because of a review was not updated as such in the
CF app.  In this case, and in the middle of a CF, this would get get
marked as Rwf, but the author would get a reminder of that.

Any of the patches that have been discarded by me last week were
waiting in the stack for much, much, longer than that, and I think
that it does not help reviewers and committers in keeping this stuff
longer than necessary because this mainly bloats the CF app in a
non-necessary way, hiding patches that should deserve more attention.
Some of those patches have been left idle for half a year.

Another point that I'd like to make here is that my intention was to
lift the curve here (a term quite fashionable this year particularly,
isn't it?), due to the high number of patches to handle in a single
CF, and the shortage of hands actually doing this kind of work.

Then, let's be clear about one last thing.  If at *any* moment people
here feel that at least the state of *one* patch of this CF has been
changed based on some of my "personal" priority views, I will
immediately stop any action as CFM and resign from it.  So, if that's
the case, please feel free to speak up here or on any of the threads
related to those patches.

> Putting also in perspective the history that the patch had prior to your
> unfairly closing it, there was a lot of effort in getting it reviewed to
> a good state and updated by the author --  yet a single unsubstantiated
> comment, without itself a lot of effort on the reviewer's part, that
> "maybe it has a perf drawback" is sufficient to get it booted?  It
> doesn't seem appropriate.

I agree on the fact that not being able to recreate a new CF entry
that keeps the history of an old one, with all the past threads or
annotations, could be helpful in some cases.  Now, I have seen as well
cases where a patch got around for so long with a very long thread
made things confusing, and that people also like starting a new thread
with a summary of the past state as a first message to attempt
clarifying things.  So both have advantages.
--
Michael

Attachment

Re: Command statistics system (cmdstats)

From
Michael Paquier
Date:
On Wed, Sep 23, 2020 at 10:41:10AM +0900, Michael Paquier wrote:
> I think that this remark is a bit unfair.  When it comes to any patch
> in the commit fest, and I have managed a couple of them over the
> years, I have tried to keep a neutral view for all of them, meaning
> that I only let the numbers speak by themselves, mostly:
> - When has a patch lastly been updated.
> - What's the current state in the CF app  If the patch had no reviews
> and still in "Needs review", simply switch it to "waiting on author".

Sorry, forgot to mention here that this is in the case where a patch
does not apply anymore or that the CF bot complains, requiring a check
of the patch from the author.

> If the patch has been waiting on author for more than two weeks, it
> would get marked as RwF at the end of the CF.  There are also cases
> where the status of the patch is incorrect, mostly that a patch
> waiting on author because of a review was not updated as such in the
> CF app.  In this case, and in the middle of a CF, this would get get
> marked as Rwf, but the author would get a reminder of that.

Last sentence here is incorrect as well: s/get get/not get/, and the
author would be reminded to update his/her patch.
--
Michael

Attachment

Re: Command statistics system (cmdstats)

From
Alvaro Herrera
Date:
> >> On Mar 4, 2020, at 7:43 PM, Mark Dilger <mark.dilger@enterprisedb.com> wrote:

> >> as mentioned in [1], I have created an implementation of command
> >> counter statistics very similar in purpose to the one already
> >> pending in the commitfest going by the name "pg_stat_sql".  I don't
> >> really care if this implementation is seen as building on that one
> >> or as separate, but I was worried about hijacking that thread, so
> >> I'm starting this thead. There are elements of this patch that
> >> borrowed from that one, so if this is accepted, authorship should
> >> reflect that.

Hello, was this abandoned or is there another thread somewhere?  The
patch in the email I'm replying to doesn't apply, but the conflicts
don't look very bad.

-- 
Álvaro Herrera       Valdivia, Chile



Re: Command statistics system (cmdstats)

From
Mark Dilger
Date:

> On May 28, 2021, at 6:42 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>
>>>> On Mar 4, 2020, at 7:43 PM, Mark Dilger <mark.dilger@enterprisedb.com> wrote:
>
>>>> as mentioned in [1], I have created an implementation of command
>>>> counter statistics very similar in purpose to the one already
>>>> pending in the commitfest going by the name "pg_stat_sql".  I don't
>>>> really care if this implementation is seen as building on that one
>>>> or as separate, but I was worried about hijacking that thread, so
>>>> I'm starting this thead. There are elements of this patch that
>>>> borrowed from that one, so if this is accepted, authorship should
>>>> reflect that.
>
> Hello, was this abandoned or is there another thread somewhere?  The
> patch in the email I'm replying to doesn't apply, but the conflicts
> don't look very bad.

I abandoned it for v14, choosing instead to work on amcheck features.  Would you like me to take this up again for v15?

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company