Thread: New SQL counter statistics view (pg_stat_sql)
This is a new statistics view that provides the number of SQL operations that happened in a particular interval of time. This view is useful for finding out the pattern of the operations happening in the instance during a given interval. Following are, more or less, the columns and details of the pg_stat_sql view:

postgres=# \d pg_stat_sql
          View "pg_catalog.pg_stat_sql"
   Column    |           Type           | Modifiers
-------------+--------------------------+-----------
 selects     | bigint                   |
 inserts     | bigint                   |
 deletes     | bigint                   |
 updates     | bigint                   |
 declares    | bigint                   |
 fetches     | bigint                   |
 copies      | bigint                   |
 reindexes   | bigint                   |
 truncates   | bigint                   |
 stats_reset | timestamp with time zone |

The SQL counters get updated only for external SQL queries, not for SQL queries that are generated internally. The counters are updated in the exec_simple_query and exec_execute_message functions. The user can reset the SQL counter statistics using an exposed function. Stats collection can also be turned on/off using a GUC.

Any comments/objections to providing a patch for the same?

Regards,
Hari Babu
Fujitsu Australia
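The per-interval pattern described above could be obtained by periodically snapshotting the view and diffing the counters. A sketch, assuming the proposed pg_stat_sql view exists with the columns listed above:

```sql
-- Snapshot the counters into a table; repeat at each sampling interval.
CREATE TABLE sql_stat_snapshot AS
SELECT now() AS taken_at, * FROM pg_stat_sql;

-- ... some time later ...
INSERT INTO sql_stat_snapshot
SELECT now(), * FROM pg_stat_sql;

-- Operations per interval: subtract consecutive snapshots.
SELECT taken_at,
       selects - lag(selects) OVER (ORDER BY taken_at) AS selects_in_interval,
       inserts - lag(inserts) OVER (ORDER BY taken_at) AS inserts_in_interval
FROM sql_stat_snapshot
ORDER BY taken_at;
```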
Haribabu Kommi <kommi.haribabu@gmail.com> writes:
> This is a new statistics view that is used to provide the number of
> SQL operations that happened in a particular interval of time. This
> view is useful for the system to find out the pattern of the
> operations that are happening in the instance during a particular
> interval of time.

1. This set of counters seems remarkably random. Why, for instance, count reindexes but not original index creations?

2. What will you do with cases such as SELECTs containing modifying CTEs? Or EXPLAIN ANALYZE? Does CLUSTER count as a REINDEX? Does REFRESH MATERIALIZED VIEW count as a SELECT? (Or maybe it's an INSERT?)

If you're going to be selective about what you count, you're forever going to be fielding complaints from users who are unhappy that you didn't count some statement type they care about, or feel that you misclassified some complex case.

I'm inclined to suggest you forget this approach and propose a single counter for "SQL commands executed", which avoids all of the above definitional problems. People who need more detail than that are probably best advised to look to contrib/pg_stat_statements, anyway.

Also, if you do that, there hardly seems a need for a whole new view. You could just add a column to pg_stat_database. The incremental maintenance effort doesn't seem enough to justify its own GUC, either.

			regards, tom lane
On Sun, Aug 21, 2016 at 1:17 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> 1. This set of counters seems remarkably random. Why, for instance,
> count reindexes but not original index creations?

I thought of adding to the view the columns that are mostly used (excluding all DDL commands); otherwise the view columns would fill with all the commands that are supported by PostgreSQL.

> 2. What will you do with cases such as SELECTs containing modifying CTEs?
> Or EXPLAIN ANALYZE? Does CLUSTER count as a REINDEX? Does REFRESH
> MATERIALIZED VIEW count as a SELECT? (Or maybe it's an INSERT?)

CLUSTER commands will be counted as clusters; EXPLAIN needs some logic to identify which SQL operation it is exactly. This is because the nodeTag data from the parseTree structure is used to identify what type of SQL statement it is. The counters are not updated for any SQL type that is not part of the view.

> If you're going to be selective about what you count, you're forever
> going to be fielding complaints from users who are unhappy that you
> didn't count some statement type they care about, or feel that you
> misclassified some complex case.

Yes, I agree that users may complain that the limited columns are not sufficient for their use. But with the mostly used columns, this view may be useful for some users to find out the pattern of the SQL operations that are happening on the instance.

> I'm inclined to suggest you forget this approach and propose a single
> counter for "SQL commands executed", which avoids all of the above
> definitional problems. People who need more detail than that are
> probably best advised to look to contrib/pg_stat_statements, anyway.

I will add a new column to pg_stat_database to track the number of SQL operations that happened on that particular database.

Regards,
Hari Babu
Fujitsu Australia
On Sat, Aug 20, 2016 at 11:17 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I'm inclined to suggest you forget this approach and propose a single
> counter for "SQL commands executed", which avoids all of the above
> definitional problems. People who need more detail than that are
> probably best advised to look to contrib/pg_stat_statements, anyway.

I disagree. I think SQL commands executed, lumping absolutely everything together, really isn't much use. Haribabu's categorization scheme seems to need some work, but the idea of categorizing statements by type and counting executions per type seems very reasonable.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 2016-08-22 13:54:43 -0400, Robert Haas wrote:
> On Sat, Aug 20, 2016 at 11:17 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > I'm inclined to suggest you forget this approach and propose a single
> > counter for "SQL commands executed", which avoids all of the above
> > definitional problems. People who need more detail than that are
> > probably best advised to look to contrib/pg_stat_statements, anyway.
>
> I disagree. I think SQL commands executed, lumping absolutely
> everything together, really isn't much use.

I'm inclined to agree. I think that's a quite useful stat when looking at an installation one previously didn't have a lot of interaction with.

> Haribabu's categorization scheme seems to need some work, but the idea
> of categorizing statements by type and counting executions per type
> seems very reasonable.

I'd consider instead using something like COALESCE(commandType, nodeTag(Query->utilityStmt)) as the categories. Not sure if I'd even pivot that.

Andres
Andres Freund <andres@anarazel.de> writes:
> On 2016-08-22 13:54:43 -0400, Robert Haas wrote:
>> On Sat, Aug 20, 2016 at 11:17 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> I'm inclined to suggest you forget this approach and propose a single
>>> counter for "SQL commands executed", which avoids all of the above
>>> definitional problems. People who need more detail than that are
>>> probably best advised to look to contrib/pg_stat_statements, anyway.

>> I disagree. I think SQL commands executed, lumping absolutely
>> everything together, really isn't much use.

> I'm inclined to agree. I think that's a quite useful stat when looking
> at an installation one previously didn't have a lot of interaction with.

Well, let's at least have an "other" category so you can add up the counters and get a meaningful total.

			regards, tom lane
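To make the point concrete: with an "other" bucket, the per-category counters always sum to the total number of statements executed. A sketch, assuming the originally proposed column-per-category view plus a hypothetical `others` column:

```sql
-- Every statement lands in exactly one bucket, so the sum is a
-- meaningful "total commands executed" figure.
SELECT selects + inserts + updates + deletes
     + declares + fetches + copies + reindexes + truncates
     + others AS total_commands
FROM pg_stat_sql;
```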
On 23/08/16 08:27, Tom Lane wrote:
> Well, let's at least have an "other" category so you can add up the
> counters and get a meaningful total.

Initially I thought of something I considered facetious, but then realized it might actually be both practicable and useful... How about 2 extra categories (if appropriate!!!):

1. Things that might fit in 2 or more of the existing categories, or where there is an ambiguity as to which category is appropriate.

2. Things that don't fit into any existing category.

This was inspired by a real use case, in a totally unrelated area - but where I attempted to ensure counts were in the most useful categories I was able to provide. The user had listed categories, but I found that the data didn't always fit neatly into them as specified.

Cheers,
Gavin
>Well, let's at least have an "other" category so you can add up the
>counters and get a meaningful total.

How would that meaningful total help a user? What could a user analyse with the counter in the 'other' category?
Neha
On 24/08/16 12:02, neha khatri wrote:
> How would that meaningful total help a user? What could a user analyse
> with the counter in the 'other' category?

The user could then judge if there were a significant number of examples not covered in the other categories - this may, or may not, be a problem; depending on the use case.

Cheers,
Gavin
On Wed, Aug 24, 2016 at 10:07 AM, Gavin Flower <GavinFlower@archidevsys.co.nz> wrote:
> The user could then judge if there were a significant number of
> examples not covered in the other categories - this may, or may not,
> be a problem; depending on the use case.

For the user to be able to judge whether the number in the 'other' category is a problem or not, the user is also required to know what might fall under the 'other' category. It may not be good to say that _anything_ that is not part of an already defined category is part of 'other'. Probably, 'other' should also be a set of predefined operations.

Neha
On Tue, Aug 23, 2016 at 3:59 AM, Andres Freund <andres@anarazel.de> wrote:
>> Haribabu's categorization scheme seems to need some work, but the
>> idea of categorizing statements by type and counting executions per
>> type seems very reasonable.
>
> I'd consider instead using something like COALESCE(commandType,
> nodeTag(Query->utilityStmt)) as the categories. Not sure if I'd even
> pivot that.

Does CommandType mean using select, insert, update, delete and utility as the categorization to display the counters? I think I am not getting your point correctly.

Apart from the above, here is the list of command tags that are generated in the code. I took only the first word of each command tag just to see how many categories are present. The number in parentheses indicates the size of the subset of operations, i.e. the number of tags it covers, like create table, create function, etc.

insert
delete
update
select (6)
transaction (10)
declare
close (2)
move
fetch
create (37)
drop (36)
alter (56)
import
truncate
comment
security
copy
grant
revoke
notify
listen
unlisten
load
cluster
vacuum
analyze
explain
refresh
set (2)
reset
show
discard (4)
reassign
lock
checkpoint
reindex
prepare
execute
deallocate

We can pick some of the above for the main categories, and the rest will fall under "other".

Regards,
Hari Babu
Fujitsu Australia
Haribabu Kommi wrote:
> Apart from the above, here is the list of command tags that are
> generated in the code. I took only the first word of each command tag
> just to see how many categories are present. The number in parentheses
> indicates the size of the subset of operations, like create table,
> create function, etc.

Sounds about right. I suppose all those cases that you aggregated here would expand to full tags in the actual code. I furthermore suppose that some of these could be ignored, such as the transaction ones and things like load, lock, move, fetch, discard, deallocate (maybe lump them all together under "other", or some other rough categorization, as Tom suggests).

Also, for many of these commands it's probably relevant whether they are acting on a temporary object or not; we should either count these separately, or not count the temp ones at all.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Wed, Aug 31, 2016 at 3:19 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> Haribabu Kommi wrote:
>
>> Apart from the above, here are the following list of command tags that
>> are generated in the code, I took only the first word of the command tag
>> just to see how many categories present. The number indicates the
>> subset of operations or number of types it is used. Like create table,
>> create function and etc.
>
> Sounds about right. I suppose all those cases that you aggregated here
> would expand to full tags in the actual code. I furthermore suppose
> that some of these could be ignored, such as the transaction ones and
> things like load, lock, move, fetch, discard, deallocate (maybe lump
> them all together under "other", or some other rough categorization, as
> Tom suggests).
Following is the pg_stat_sql view with the SQL categories that I considered important. The rest of them will be shown under the "others" category.
postgres=# \d pg_stat_sql
View "pg_catalog.pg_stat_sql"
Column | Type | Modifiers
-----------------+--------------------------+-----------
inserts | bigint |
deletes | bigint |
updates | bigint |
selects | bigint |
declare_cursors | bigint |
closes | bigint |
creates | bigint |
drops | bigint |
alters | bigint |
imports | bigint |
truncates | bigint |
copies | bigint |
grants | bigint |
revokes | bigint |
clusters | bigint |
vacuums | bigint |
analyzes | bigint |
refreshs | bigint |
locks | bigint |
checkpoints | bigint |
reindexes | bigint |
deallocates | bigint |
others | bigint |
stats_reset | timestamp with time zone |
If any additions/deletions, I can accommodate them.
The stats data gets updated in the exec_simple_query and exec_execute_message functions, and the collected stats are sent to the stats collector similar to the function usage stats in the pgstat_report_stat function. This SQL statistics data is stored in the stats file similar to the global statistics; the stats file format is updated to accommodate the new stats.

A new GUC "track_sql" is added to track the SQL statement statistics; by default this is off. Only a superuser can change this parameter.
Attached a patch for the same.
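Assuming the patch as described above (the track_sql GUC plus an exposed reset function; the reset function name below is illustrative, not taken from the patch), usage would look roughly like:

```sql
-- Enable SQL statement counting (superuser only in this proposal).
SET track_sql = on;

INSERT INTO test_table VALUES (1);
SELECT * FROM test_table;

-- Inspect the per-category counters.
SELECT inserts, selects, stats_reset FROM pg_stat_sql;

-- Reset the counters via the exposed function
-- (hypothetical name for illustration).
SELECT pg_stat_sql_reset();
```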
> Also, for many of these commands it's probably relevant
> whether they are acting on a temporary object or not; we should either
> count these separately, or not count the temp ones at all.
Currently the SQL stats do not check whether the object involved is a temporary one or not. Temp objects are specific to a backend only, but either way it is an SQL query that was executed on a temp object, so we need to count that operation. I feel these SQL stats should not worry about the object type. Maybe I am wrong.
Regards,
Hari Babu
Fujitsu Australia
On Fri, Sep 2, 2016 at 2:33 AM, Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
> Following is the pg_stat_sql view with the SQL categories that I
> considered important. The rest of them will be shown under the
> "others" category.
>
> If any additions/deletions, I can accommodate them.

I think it is not a good idea to make the command names used here the plural forms of the command tags. Instead of "inserts", "updates", "imports", etc. just use "INSERT", "UPDATE", "IMPORT". That's simpler and less error prone - e.g. you won't end up with things like "refreshs", which is not a word.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Thu, Sep 15, 2016 at 6:01 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> I think it is not a good idea to make the command names used here the
> plural forms of the command tags. Instead of "inserts", "updates",
> "imports", etc. just use "INSERT", "UPDATE", "IMPORT". That's simpler
> and less error prone - e.g. you won't end up with things like
> "refreshs", which is not a word.
Thanks for your suggestion. I also thought of changing the name while originally writing "refreshs" as a column name of the view. A small restriction with the change of names to command names is that the user needs to execute the query as follows (because "select" is a reserved keyword, the column must be qualified or quoted):

select pg_stat_sql.select from pg_stat_sql;

Updated patch is attached with the corrections. Apart from the name change, documentation and tests are added.
Regards,
Hari Babu
Fujitsu Australia
On 9/14/16 4:01 PM, Robert Haas wrote:
> I think it is not a good idea to make the command names used here the
> plural forms of the command tags. Instead of "inserts", "updates",
> "imports", etc. just use "INSERT", "UPDATE", "IMPORT". That's simpler
> and less error prone - e.g. you won't end up with things like
> "refreshs", which is not a word.

How about having the tag not be a column name but a row entry. So you'd do something like

    SELECT * FROM pg_stat_sql WHERE tag = 'ALTER VIEW';

That way, we don't have to keep updating (and re-debating) this when new command types or subtypes are added. And queries written for future versions will not fail when run against old servers.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
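Under that scheme the view would need only two or three columns. A sketch of the proposed shape (hypothetical layout, not a committed definition):

```sql
-- Hypothetical row-per-tag layout for pg_stat_sql:
--   tag   | text   (command tag, e.g. 'ALTER VIEW')
--   count | bigint (number of executions)

SELECT * FROM pg_stat_sql WHERE tag = 'ALTER VIEW';

-- New tags simply appear as new rows, so queries keep working across
-- server versions, e.g. the five most-executed command types:
SELECT tag, count FROM pg_stat_sql ORDER BY count DESC LIMIT 5;
```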
On Wed, Sep 21, 2016 at 11:25:14AM -0400, Peter Eisentraut wrote:
> How about having the tag not be a column name but a row entry. So you'd
> do something like
>
> SELECT * FROM pg_stat_sql WHERE tag = 'ALTER VIEW';

+1 for this. It's MUCH easier to deal with changes in row counts than changes in row type.

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
Peter Eisentraut wrote:
> How about having the tag not be a column name but a row entry. So you'd
> do something like
>
> SELECT * FROM pg_stat_sql WHERE tag = 'ALTER VIEW';
>
> That way, we don't have to keep updating (and re-debating) this when new
> command types or subtypes are added. And queries written for future
> versions will not fail when run against old servers.

Yeah, good idea.

Let's also discuss the interface from the stats collector. Currently we have some 20 new SQL functions, all alike, each loading the whole data and returning a single counter, and then the view invokes each function separately. That doesn't seem great to me. How about having a single C function that returns the whole thing as a SRF instead, and the view is just a single function invocation -- something like pg_lock_status filling pg_locks in one go.

Another consideration is that the present patch lumps together all ALTER cases in a single counter. This isn't great, but at the same time we don't want to bloat the stat files by having hundreds of counters per database, do we?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
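With a single set-returning function, the view definition collapses to one call, much like pg_locks over pg_lock_status(). A sketch; the function name and signature here are hypothetical:

```sql
-- One C-level SRF loads the stats once and returns every
-- (tag, count) pair; the view is a single function invocation.
CREATE VIEW pg_stat_sql AS
    SELECT s.tag, s.count, s.stats_reset
    FROM pg_stat_sql_rows()
         AS s(tag text, count bigint, stats_reset timestamptz);
```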
On Wed, Sep 21, 2016 at 02:05:24PM -0300, Alvaro Herrera wrote:
> Another consideration is that the present patch lumps together all
> ALTER cases in a single counter. This isn't great, but at the same
> time we don't want to bloat the stat files by having hundreds of
> counters per database, do we?

I count 37 documented versions of ALTER as of git master. Is there some multiplier I'm missing?

Best,
David.
On Thu, Sep 22, 2016 at 3:05 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> Peter Eisentraut wrote:
> > How about having the tag not be a column name but a row entry. So you'd
> > do something like
> >
> > SELECT * FROM pg_stat_sql WHERE tag = 'ALTER VIEW';
> >
> > That way, we don't have to keep updating (and re-debating) this when new
> > command types or subtypes are added. And queries written for future
> > versions will not fail when run against old servers.
>
> Yeah, good idea.

Yes, having it as a row entry is good.

> Let's also discuss the interface from the stats collector. Currently we
> have some 20 new SQL functions, all alike, each loading the whole data
> and returning a single counter, and then the view invokes each function
> separately. That doesn't seem great to me. How about having a single C
> function that returns the whole thing as a SRF instead, and the view is
> just a single function invocation -- something like pg_lock_status
> filling pg_locks in one go.
>
> Another consideration is that the present patch lumps together all ALTER
> cases in a single counter. This isn't great, but at the same time we
> don't want to bloat the stat files by having hundreds of counters per
> database, do we?

Currently the SQL stats use a fixed-size set of counters, tracking all the ALTER cases as a single counter. So when sending the stats from the backend to the stats collector at the end of the transaction, the cost is the same because of the fixed size; the overhead of this approach to send and read the stats is minimal.

With the following approach, I feel it is possible to support counters at the command tag level:

Add a global and a local hash to keep track of the counters, using the command tag as the key. This hash table grows dynamically whenever a new type of SQL command gets executed. The local hash data is passed to the stats collector whenever the transaction gets committed.

The problem I am thinking of is that sending data from the hash, and populating the hash from the stats file for all the command tags, adds some overhead.
Regards,
Hari Babu
Fujitsu Australia
On Thu, Sep 29, 2016 at 3:45 PM, Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
On Thu, Sep 22, 2016 at 3:05 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:Peter Eisentraut wrote:
> How about having the tag not be a column name but a row entry. So you'd
> do something like
>
> SELECT * FROM pg_stat_sql WHERE tag = 'ALTER VIEW';
>
> That way, we don't have to keep updating (and re-debating) this when new
> command types or subtypes are added. And queries written for future
> versions will not fail when run against old servers.
Yeah, good idea.Yes, Having it as a row entry is good.Let's also discuss the interface from the stats collector. Currently we
have some 20 new SQL functions, all alike, each loading the whole data
and returning a single counter, and then the view invokes each function
separately. That doesn't seem great to me. How about having a single C
function that returns the whole thing as a SRF instead, and the view is
just a single function invocation -- something like pg_lock_status
filling pg_locks in one go.
Another consideration is that the present patch lumps together all ALTER
cases in a single counter. This isn't great, but at the same time we
don't want to bloat the stat files by having hundreds of counters per
database, do we?

Currently, the SQL stats is a fixed-size counter that tracks all the ALTER cases as a single counter. So when sending the stats from the backend to the stats collector at the end of the transaction, the cost is the same because it is a fixed size; the overhead this approach adds to sending and reading the stats is minimal.

With the following approach, I feel it is possible to support counters at the command-tag level.

Add a global and a local hash to keep track of the counters, using the command tag as the key; this hash table grows dynamically whenever a new type of SQL command gets executed. The local hash data is passed to the stats collector whenever the transaction commits.

The problem I am thinking of is that sending data from the hash, and populating the hash from the stats file, for all the command tags adds some overhead.
I tried changing the pg_stat_sql into row mode and ran the regress suite to
add different type of SQL commands to the view and ran the pgbench test
on my laptop to find out any performance impact with this patch.
                   HEAD    PATCH
pgbench - select    828      816
Here I attached the pg_stat_sql patch to keep track of all SQL commands
based on the commandTag and their counts. I attached the result of this
view that is run on the database after "make installcheck" just for reference.
Regards,
Hari Babu
Fujitsu Australia
Attachment
On 2016/10/12 12:21, Haribabu Kommi wrote:
Thank you for the patch.
Test: Commands with uppercase and lowercase
====
If tag='select' then it returns 0 rows, but the count is actually incremented by 1.
tag='select' vs tag='SELECT'
postgres=# SET track_sql TO ON;
SET
postgres=# SELECT * FROM pg_stat_sql where tag='SELECT';
tag | count
--------+-------
SELECT | 12
(1 row)
postgres=# SELECT * FROM pg_stat_sql where tag='SELECT';
tag | count
--------+-------
SELECT | 13
(1 row)
postgres=# SELECT * FROM pg_stat_sql where tag='select';
tag | count
-----+-------
(0 rows)
postgres=# SELECT * FROM pg_stat_sql where tag='SELECT';
tag | count
--------+-------
SELECT | 15
(1 row)
I think all commands work the same as above.
Regards,
Vinayak Pokale
NTT Open Source Software Center
On 2016/10/12 12:21, Haribabu Kommi wrote:
Some comments:
I think we can use pgstat_* instead of pgstat* for code consistency.
+static HTAB *pgStatBackendSql = NULL;
How about *pgstat_backend_sql
+HTAB *pgStatSql = NULL;
How about *pgstat_sql
+static void pgstat_recv_sqlstmt(PgStat_MsgSqlstmt * msg, int len);
How about PgStat_MsgSqlstmt *msg instead of PgStat_MsgSqlstmt * msg
+typedef struct PgStatSqlstmtEntry
How about PgStat_SqlstmtEntry
+typedef struct PgStatMsgSqlstmt
How about PgStat_MsgSqlstmt
I have observed the following behavior.
I SET track_sql to ON and then executed a SELECT command, and it returned 0 rows.
It starts counting from the second command.
postgres=# SET track_sql TO ON;
SET
postgres=# SELECT * FROM pg_stat_sql where tag='SELECT';
tag | count
-----+-------
(0 rows)
postgres=# SELECT * FROM pg_stat_sql where tag='SELECT';
tag | count
--------+-------
SELECT | 1
(1 row)
Is this a correct behavior?
Regards,
Vinayak Pokale
NTT Open Source Software Center
On Wed, Oct 12, 2016 at 4:06 PM, vinayak <Pokale_Vinayak_q3@lab.ntt.co.jp> wrote:
Thank you for the patch.
Test: Commands with uppercase and lowercase
====
If tag='select' then it returns 0 rows, but the count is actually incremented by 1.
tag='select' vs tag='SELECT'
postgres=# SET track_sql TO ON;
SET
postgres=# SELECT * FROM pg_stat_sql where tag='SELECT';
tag | count
--------+-------
SELECT | 12
(1 row)
postgres=# SELECT * FROM pg_stat_sql where tag='SELECT';
tag | count
--------+-------
SELECT | 13
(1 row)
postgres=# SELECT * FROM pg_stat_sql where tag='select';
tag | count
-----+-------
(0 rows)
postgres=# SELECT * FROM pg_stat_sql where tag='SELECT';
tag | count
--------+-------
SELECT | 15
(1 row)
I think all commands work the same as above.
Yes, that's correct. Currently, uppercase letters are used for the tag names
that are displayed whenever any SQL command is executed, so I used the same
names as the entries under which the SQL statement stats are stored.
If anyone feels the other way is better, I am fine for it.
Regards,
Hari Babu
Fujitsu Australia
On Fri, Oct 14, 2016 at 7:48 PM, vinayak <Pokale_Vinayak_q3@lab.ntt.co.jp> wrote:
On 2016/10/12 12:21, Haribabu Kommi wrote:

Some comments:
I think we can use pgstat_* instead of pgstat* for code consistency.
+static HTAB *pgStatBackendSql = NULL;
How about *pgstat_backend_sql
+HTAB *pgStatSql = NULL;
How about *pgstat_sql
Changed.
+static void pgstat_recv_sqlstmt(PgStat_MsgSqlstmt * msg, int len);
How about PgStat_MsgSqlstmt *msg instead of PgStat_MsgSqlstmt * msg
Added the typedef to the typedefs.list file so this problem never occurs.
+typedef struct PgStatSqlstmtEntry
How about PgStat_SqlstmtEntry
+typedef struct PgStatMsgSqlstmt
How about PgStat_MsgSqlstmt
Changed.
I have observed the following behavior.
I SET track_sql to ON and then executed a SELECT command, and it returned 0 rows.
It starts counting from the second command.
postgres=# SET track_sql TO ON;
SET
postgres=# SELECT * FROM pg_stat_sql where tag='SELECT';
tag | count
-----+-------
(0 rows)
postgres=# SELECT * FROM pg_stat_sql where tag='SELECT';
tag | count
--------+-------
SELECT | 1
(1 row)
Is this a correct behavior?
The statistics are sent to the stats collector at the end of the SQL
statement's execution, so the current SQL statement is not counted in the
result.
Updated patch is attached.
Regards,
Hari Babu
Fujitsu Australia
Attachment
On 2016/10/17 10:22, Haribabu Kommi wrote:
The updated patch gives the following error.
Error:
../../../../src/include/pgstat.h:557: error: ‘PgStatSqlstmtEntry’ undeclared here (not in a function)
- ((PGSTAT_MSG_PAYLOAD - sizeof(int)) / sizeof(PgStatSqlstmtEntry))
+ ((PGSTAT_MSG_PAYLOAD - sizeof(int)) / sizeof(PgStat_SqlstmtEntry))
The attached patch fixes the error.
Regards,
Vinayak Pokale
Attachment
On Thu, Sep 29, 2016 at 1:45 AM, Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
> Currently, The SQL stats is a fixed size counter to track the all the ALTER
> cases as single counter. So while sending the stats from the backend to
> stats collector at the end of the transaction, the cost is same, because of
> it's fixed size. This approach adds overhead to send and read the stats
> is minimal.
>
> With the following approach, I feel it is possible to support the counter at
> command tag level.
>
> Add a Global and local Hash to keep track of the counters by using the
> command tag as the key, this hash table increases dynamically whenever
> a new type of SQL command gets executed. The Local Hash data is passed
> to stats collector whenever the transaction gets committed.
>
> The problem I am thinking is that, Sending data from Hash and populating
> the Hash from stats file for all the command tags adds some overhead.

Yeah, I'm not very excited about that overhead. This seems useful as far as
it goes, but I don't really want to incur measurable overhead when it's in
use. Having a hash table rather than a fixed array of slots means that you
have to pass this through the stats collector rather than updating shared
memory directly, which is fairly heavy weight. If each backend could have
its own copy of the slot array and just update that, and readers added up
the values across the whole array, this could be done without any locking at
all, and it would generally be much lighter-weight than this approach.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Wed, Oct 19, 2016 at 5:11 AM, Robert Haas <robertmhaas@gmail.com> wrote:
On Thu, Sep 29, 2016 at 1:45 AM, Haribabu Kommi
<kommi.haribabu@gmail.com> wrote:
> Currently, The SQL stats is a fixed size counter to track the all the ALTER
> cases as single counter. So while sending the stats from the backend to
> stats collector at the end of the transaction, the cost is same, because of
> it's fixed size. This approach adds overhead to send and read the stats
> is minimal.
>
> With the following approach, I feel it is possible to support the counter at
> command tag level.
>
> Add a Global and local Hash to keep track of the counters by using the
> command tag as the key, this hash table increases dynamically whenever
> a new type of SQL command gets executed. The Local Hash data is passed
> to stats collector whenever the transaction gets committed.
>
> The problem I am thinking is that, Sending data from Hash and populating
> the Hash from stats file for all the command tags adds some overhead.
Yeah, I'm not very excited about that overhead. This seems useful as
far as it goes, but I don't really want to incur measurable overhead
when it's in use. Having a hash table rather than a fixed array of
slots means that you have to pass this through the stats collector
rather than updating shared memory directly, which is fairly heavy
weight. If each backend could have its own copy of the slot array and
just update that, and readers added up the values across the whole
array, this could be done without any locking at all, and it would
generally be much lighter-weight than this approach.
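Robert's suggestion, a per-backend slot array updated only by its owning backend and summed by readers, can be sketched roughly like this. It is a standalone toy with invented names: in PostgreSQL the slots would live in shared memory, and concurrent reads would ideally use atomic or at least tearing-tolerant loads rather than the plain integers used here.

```c
#include <assert.h>
#include <stdint.h>

/*
 * One fixed array of counter slots per backend. Each slot has exactly one
 * writer (the owning backend), so increments need no lock; a reader simply
 * sums the same slot across all backends to get the instance-wide total.
 */
#define MAX_BACKENDS 8
#define N_TAG_SLOTS  4          /* e.g. SELECT, INSERT, UPDATE, DELETE */

static uint64_t stmt_counts[MAX_BACKENDS][N_TAG_SLOTS];

/* Called only by backend 'backend_id': single writer, so no locking. */
static void
count_stmt(int backend_id, int tag_slot)
{
    stmt_counts[backend_id][tag_slot]++;
}

/* Reader side: total for one statement type across all backends. */
static uint64_t
total_for_tag(int tag_slot)
{
    uint64_t    sum = 0;

    for (int b = 0; b < MAX_BACKENDS; b++)
        sum += stmt_counts[b][tag_slot];
    return sum;
}
```

The trade-off versus the hash approach is visible here: lock-free updates require a fixed, agreed-upon set of slots, which is why this scheme fits a small fixed tag set better than an open-ended per-tag breakdown.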
Using limited information, like combining all the ALTER XXX commands into a
single Alter counter, the fixed-array logic works fine without any problems.
But people are suggesting providing more detail, like ALTER VIEW and so on,
so I checked the following ways:

1. Use the nodetag as an array index to update the counter. But this
approach doesn't work for some cases, like T_DropStmt, where the tag varies
based on the removeType of the DropStmt, so I decided to drop this approach.

2. Use the tag name to binary-search a fixed-size array sorted over all the
command tags that are possible in PostgreSQL, find the location, and update
the counter. In this approach, the array first needs to be filled with the
tag information in sorted order.

3. Use a hash table to store the counter information, with the tag name as
the key.

I chose approach 3 as it can scale to any additional commands. Are there
better alternatives that can point directly to an array index?

If we go with combining ALTER XXX into a single Alter counter, then I will
change the code to the fixed-array approach.
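Approach 2, the sorted fixed-size array with binary search, might look roughly like this. This is a standalone sketch with an invented, abbreviated tag list; the real array would have to hold every command tag PostgreSQL can emit.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * A fixed array of counters indexed via binary search over an
 * alphabetically sorted list of command tags.
 */
static const char *const sorted_tags[] = {
    "COPY", "DELETE", "INSERT", "SELECT", "UPDATE"
};
#define N_TAGS (sizeof(sorted_tags) / sizeof(sorted_tags[0]))

static uint64_t tag_counts[N_TAGS];

/* Return the counter slot for a tag, or -1 if the tag is unknown. */
static int
tag_index(const char *tag)
{
    int         lo = 0,
                hi = (int) N_TAGS - 1;

    while (lo <= hi)
    {
        int         mid = (lo + hi) / 2;
        int         cmp = strcmp(tag, sorted_tags[mid]);

        if (cmp == 0)
            return mid;
        else if (cmp < 0)
            hi = mid - 1;
        else
            lo = mid + 1;
    }
    return -1;
}

/* Bump the counter for one command tag; unknown tags are ignored. */
static void
count_tag(const char *tag)
{
    int         idx = tag_index(tag);

    if (idx >= 0)
        tag_counts[idx]++;
}
```

Compared with the hash of approach 3, this gives O(log n) lookups with no dynamic growth, but any tag missing from the pre-sorted list is silently dropped, which is exactly the maintenance burden discussed above.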
Regards,
Hari Babu
Fujitsu Australia
Hi Vinayak,
This is a gentle reminder: you are assigned as a reviewer of the current
patch in the 11-2016 commitfest. If you have any more review comments /
performance results regarding the approach of the patch, please share them.
Otherwise, you can mark the patch as "Ready for committer" for committer
feedback. This will help the smoother operation of the commitfest.
Regards,
Hari Babu
Fujitsu Australia
On Tue, Nov 22, 2016 at 5:57 PM, Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
Moved to next CF with "needs review" status.
Regards,
Hari Babu
Fujitsu Australia
On Mon, Dec 5, 2016 at 8:01 AM, Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
> Moved to next CF with "needs review" status.

I have started reviewing the patch. Some initial comments.

$ git apply ../pg_stat_sql_row_mode_2.patch
error: patch failed: src/include/catalog/pg_proc.h:2893
error: src/include/catalog/pg_proc.h: patch does not apply

The patch is not applying on head; I think it needs to be rebased.

+void
+pgstat_count_sqlstmt(const char *commandTag)
+{
+   PgStat_SqlstmtEntry *htabent;
+   bool        found;
+
+   if (!pgstat_track_sql)
+       return

Callers of pgstat_count_sqlstmt always ensure that pgstat_track_sql is true,
so it seems redundant here.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Wed, Jan 11, 2017 at 9:17 AM, Dilip Kumar <dilipbalaut@gmail.com> wrote:
> On Mon, Dec 5, 2016 at 8:01 AM, Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
>> Moved to next CF with "needs review" status.
>
> I have started reviewing the patch, Some initial comments.
>
> $ git apply ../pg_stat_sql_row_mode_2.patch
> error: patch failed: src/include/catalog/pg_proc.h:2893
> error: src/include/catalog/pg_proc.h: patch does not apply

I suggest using 'patch' directly. 'git apply' is far too picky.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Wed, Jan 11, 2017 at 7:47 PM, Dilip Kumar <dilipbalaut@gmail.com> wrote:
> +void
> +pgstat_count_sqlstmt(const char *commandTag)
> +{
> +   PgStat_SqlstmtEntry *htabent;
> +   bool        found;
> +
> +   if (!pgstat_track_sql)
> +       return
>
> Callers of pgstat_count_sqlstmt are always ensuring that
> pgstat_track_sql is true, seems it's redundant here.

I have done testing of the feature; it's mostly working as per the
expectation. I have a few comments/questions.

--------
If you execute the query with EXECUTE, then the commandTag will be EXECUTE;
that way we will not show the actual query type. I mean, all the statements
will get the common tag "EXECUTE".

You can try this:
PREPARE fooplan (int) AS INSERT INTO t VALUES($1);
EXECUTE fooplan(1);
------

+   /* Destroy the SQL statement counter stats HashTable */
+   hash_destroy(pgstat_sql);
+
+   /* Create SQL statement counter Stats HashTable */

I think the patch uses a mix of "Stats HashTable" and "stats HashTable"; I
think it's better to be consistent throughout the patch. Check other
similar instances.

----------------

@@ -1102,6 +1102,12 @@ exec_simple_query(const char *query_string)

    PortalDrop(portal, false);

+   /*
+    * Count SQL statement for pg_stat_sql view
+    */
+   if (pgstat_track_sql)
+       pgstat_count_sqlstmt(commandTag);

We are only counting successful SQL statements. Is that intentional?

------
I have noticed that pgstat_count_sqlstmt is called from exec_simple_query
and exec_execute_message. Don't we want to track statements executed from
functions?
-------

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Sat, Jan 14, 2017 at 12:58 AM, Dilip Kumar <dilipbalaut@gmail.com> wrote:
> I have done testing of the feature, it's mostly working as per the expectation.

The latest patch available has rotten, could you rebase please?

--
Michael
On Fri, Jan 13, 2017 at 10:58 AM, Dilip Kumar <dilipbalaut@gmail.com> wrote:
> If you execute the query with EXECUTE then commandTag will be EXECUTE
> that way we will not show the actual query type, I mean all the
> statements will get the common tag "EXECUTE".

Somebody might say that's a feature, not a bug. But then again, someone
else might say the opposite.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Sat, Jan 14, 2017 at 2:58 AM, Dilip Kumar <dilipbalaut@gmail.com> wrote:
On Wed, Jan 11, 2017 at 7:47 PM, Dilip Kumar <dilipbalaut@gmail.com> wrote:
> +void
> +pgstat_count_sqlstmt(const char *commandTag)
> +{
> + PgStat_SqlstmtEntry *htabent;
> + bool found;
> +
> + if (!pgstat_track_sql)
> + return
>
> Callers of pgstat_count_sqlstmt are always ensuring that
> pgstat_track_sql is true, seems it's redundant here.
I have done testing of the feature, it's mostly working as per the expectation.
Thanks for the review and test.
The use case for this view is to find out the number of different SQL statements
that are getting executed successfully on an instance by the application/user.
I have few comments/questions.
--------
If you execute the query with EXECUTE then commandTag will be EXECUTE
that way we will not show the actual query type, I mean all the
statements will get the common tag "EXECUTE".
You can try this.
PREPARE fooplan (int) AS INSERT INTO t VALUES($1);
EXECUTE fooplan(1);
------
Yes, that's correct. Currently the count is incremented based on the base
command; for this reason, the EXECUTE counter is incremented instead of the
one for the actual operation.
+ /* Destroy the SQL statement counter stats HashTable */
+ hash_destroy(pgstat_sql);
+
+ /* Create SQL statement counter Stats HashTable */
I think in the patch we have used mixed of "Stats HashTable" and
"stats HashTable", I think better
to be consistent throughout the patch. Check other similar instances.
----------------
Corrected.
@@ -1102,6 +1102,12 @@ exec_simple_query(const char *query_string)
PortalDrop(portal, false);
+ /*
+ * Count SQL statement for pg_stat_sql view
+ */
+ if (pgstat_track_sql)
+ pgstat_count_sqlstmt(commandTag);
We are only counting the successful SQL statement, Is that intentional?
Yes, I thought of counting the successful SQL operations that changed the
database over a period of time. I am not sure whether there will be many
failed operations on an instance worth counting.
------
I have noticed that pgstat_count_sqlstmt is called from
exec_simple_query and exec_execute_message. Don't we want to track the
statement executed from functions?
-------
The view is currently designed to count user/application-initiated SQL
statements, but not the internal SQL queries that get executed from
functions and so on.

Removed the check in pgstat_count_sqlstmt and added it in the
exec_execute_message function, where the check was missing.
Updated patch attached.
Regards,
Hari Babu
Fujitsu Australia
Attachment
On 2017/01/18 14:15, Haribabu Kommi wrote:
I have reviewed the patch. All the test cases work fine.
Regards,
Vinayak Pokale
NTT Open Source Software Center
On Wed, Jan 18, 2017 at 10:45 AM, Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
> Yes, that's correct. Currently the count is incremented based on the base
> command, because of this reason, the EXECUTE is increased, instead of the
> actual operation.

Okay. As per your explanation, and as Robert also pointed out, it can be a
feature for someone and a bug for others. I would have preferred it to show
the actual SQL statement, but I have no objection to what you are doing.

> Yes, I thought of counting the successful SQL operations that changed the
> database over a period of time. I am not sure whether there will be many
> failure operations that can occur on an instance to be counted.

Okay.

> The view is currently designed to count user/application initiated SQL
> statements, but not the internal SQL queries that are getting executed
> from functions and etc.

Okay.

> Removed the check in pgstat_count_sqlstmt statement and add it
> exec_execute_message function where the check is missed.
>
> Updated patch attached.

Overall the patch looks fine to me and I am marking it "ready for
committer". There are two design decisions, which I leave for the committer:

1. EXECUTE statements show only as the EXECUTE count, not as the actual SQL
query.
2. Should we count statements executed from functions, or only statements
executed directly by the user, as it does now?

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Tue, Jan 24, 2017 at 3:59 PM, Dilip Kumar <dilipbalaut@gmail.com> wrote:
> Overall the patch looks fine to me and I am marking it "ready for committer".
> There are two design decisions, which I leave for the committer:
> 1. The EXECUTE statement shows only as an EXECUTE count, not as the
> actual SQL query.
> 2. Should we count statements executed from functions, or only statements
> executed directly by the user, as it does now?
Thanks for the review.
Let's wait for the committer's opinion.
Regards,
Hari Babu
Fujitsu Australia
On Fri, Jan 27, 2017 at 10:26 AM, Haribabu Kommi <kommi.haribabu@gmail.com> wrote: > Thanks for the review. > Let's wait for the committer's opinion. I have moved this patch to CF 2017-03 to wait for this to happen. -- Michael
On Wed, Feb 1, 2017 at 3:13 PM, Michael Paquier <michael.paquier@gmail.com> wrote:
> On Fri, Jan 27, 2017 at 10:26 AM, Haribabu Kommi
> <kommi.haribabu@gmail.com> wrote:
>> Thanks for the review.
>> Let's wait for the committer's opinion.
> I have moved this patch to CF 2017-03 to wait for this to happen.
Attached a rebased patch to resolve the OID conflict.
Regards,
Hari Babu
Fujitsu Australia
Hi, I'm somewhat inclined to think that this really would be better done in an extension like pg_stat_statements. On 2017-03-08 14:39:23 +1100, Haribabu Kommi wrote: > On Wed, Feb 1, 2017 at 3:13 PM, Michael Paquier <michael.paquier@gmail.com> > + <varlistentry id="guc-track-sql" xreflabel="track_sql"> > + <term><varname>track_sql</varname> (<type>boolean</type>) > + <indexterm> > + <primary><varname>track_sql</> configuration parameter</primary> > + </indexterm> > + </term> > + <listitem> > + <para> > + Enables collection of different SQL statement statistics that are > + executed on the instance. This parameter is off by default. Only > + superusers can change this setting. > + </para> > + </listitem> > + </varlistentry> > + I don't like this name much, seems a bit too generic to me. 'track_activities', 'track_io_timings' are at least a bit clearer. How about track_statement_statistics + corresponding view/function renaming? > + <table id="pg-stat-sql-view" xreflabel="pg_stat_sql"> > + <title><structname>pg_stat_sql</structname> View</title> > + <tgroup cols="3"> > + <thead> > + <row> > + <entry>Column</entry> > + <entry>Type</entry> > + <entry>Description</entry> > + </row> > + </thead> > + > + <tbody> > + <row> > + <entry><structfield>tag</></entry> > + <entry><type>text</></entry> > + <entry>Name of the SQL statement</entry> > + </row> It's not really the "name" of a statement. Internally and IIRC elsewhere in the docs we describe something like this as tag? 
> +/* ---------- > + * pgstat_send_sqlstmt(void) > + * > + * Send SQL statement statistics to the collector > + * ---------- > + */ > +static void > +pgstat_send_sqlstmt(void) > +{ > + PgStat_MsgSqlstmt msg; > + PgStat_SqlstmtEntry *entry; > + HASH_SEQ_STATUS hstat; > + > + if (pgstat_backend_sql == NULL) > + return; > + > + pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_SQLSTMT); > + msg.m_nentries = 0; > + > + hash_seq_init(&hstat, pgstat_backend_sql); > + while ((entry = (PgStat_SqlstmtEntry *) hash_seq_search(&hstat)) != NULL) > + { > + PgStat_SqlstmtEntry *m_ent; > + > + /* Skip it if no counts accumulated since last time */ > + if (entry->count == 0) > + continue; > + > + /* need to convert format of time accumulators */ > + m_ent = &msg.m_entry[msg.m_nentries]; > + memcpy(m_ent, entry, sizeof(PgStat_SqlstmtEntry)); > + > + if (++msg.m_nentries >= PGSTAT_NUM_SQLSTMTS) > + { > + pgstat_send(&msg, offsetof(PgStat_MsgSqlstmt, m_entry[0]) + > + msg.m_nentries * sizeof(PgStat_SqlstmtEntry)); > + msg.m_nentries = 0; > + } > + > + /* reset the entry's count */ > + entry->count = 0; > + } > + > + if (msg.m_nentries > 0) > + pgstat_send(&msg, offsetof(PgStat_MsgSqlstmt, m_entry[0]) + > + msg.m_nentries * sizeof(PgStat_SqlstmtEntry)); Hm. So pgstat_backend_sql is never deleted, which'll mean that if a backend lives long we'll continually iterate over a lot of 0 entries. I think performance evaluation of this feature should take that into account. 
> +} > + > +/* > + * Count SQL statement for pg_stat_sql view > + */ > +void > +pgstat_count_sqlstmt(const char *commandTag) > +{ > + PgStat_SqlstmtEntry *htabent; > + bool found; > + > + if (!pgstat_backend_sql) > + { > + /* First time through - initialize SQL statement stat table */ > + HASHCTL hash_ctl; > + > + memset(&hash_ctl, 0, sizeof(hash_ctl)); > + hash_ctl.keysize = NAMEDATALEN; > + hash_ctl.entrysize = sizeof(PgStat_SqlstmtEntry); > + pgstat_backend_sql = hash_create("SQL statement stat entries", > + PGSTAT_SQLSTMT_HASH_SIZE, > + &hash_ctl, > + HASH_ELEM | HASH_BLOBS); > + } > + > + /* Get the stats entry for this SQL statement, create if necessary */ > + htabent = hash_search(pgstat_backend_sql, commandTag, > + HASH_ENTER, &found); > + if (!found) > + htabent->count = 1; > + else > + htabent->count++; > +} That's not really something in this patch, but a lot of this would be better if we didn't internally have command tags as strings, but as an enum. We should really only convert to a string with needed. That we're doing string comparisons on Portal->commandTag is just plain bad. If so, we could also make this whole patch a lot cheaper - have a fixed size array that has an entry for every possible tag (possibly even in shared memory, and then use atomics there). > +++ b/src/backend/tcop/postgres.c > @@ -1109,6 +1109,12 @@ exec_simple_query(const char *query_string) > > PortalDrop(portal, false); > > + /* > + * Count SQL statement for pg_stat_sql view > + */ > + if (pgstat_track_sql) > + pgstat_count_sqlstmt(commandTag); > + Isn't doing this at the message level quite wrong? What about statements executed from functions and such? Shouldn't this integrate at the executor level instead? 
> if (IsA(parsetree->stmt, TransactionStmt)) > { > /* > @@ -1991,6 +1997,29 @@ exec_execute_message(const char *portal_name, long max_rows) > > (*receiver->rDestroy) (receiver); > > + /* > + * Count SQL Statement for pgx_stat_sql > + */ > + if (pgstat_track_sql) > + { > + CachedPlanSource *psrc = NULL; > + > + if (portal->prepStmtName) > + { > + PreparedStatement *pstmt; > + > + pstmt = FetchPreparedStatement(portal->prepStmtName, false); > + if (pstmt) > + psrc = pstmt->plansource; > + } > + else > + psrc = unnamed_stmt_psrc; > + > + /* psrc should not be NULL here */ > + if (psrc && psrc->commandTag && !execute_is_fetch && pgstat_track_sql) > + pgstat_count_sqlstmt(psrc->commandTag); Wait, we're re-fetching the statement here? That doesn't sound alright. > +Datum > +pg_stat_sql(PG_FUNCTION_ARGS) > +{ > + TupleDesc tupdesc; > + Datum values[2]; > + bool nulls[2]; > + ReturnSetInfo *rsi; > + MemoryContext old_cxt; > + Tuplestorestate *tuple_store; > + PgStat_SqlstmtEntry *sqlstmtentry; > + HASH_SEQ_STATUS hstat; > + > + /* Function call to let the backend read the stats file */ > + pgstat_fetch_global(); > + > + /* Initialize values and nulls arrays */ > + MemSet(values, 0, sizeof(values)); > + MemSet(nulls, 0, sizeof(nulls)); why not instead declare values[2] = {0}? Obviously not important. > + tuple_store = > + tuplestore_begin_heap(rsi->allowedModes & SFRM_Materialize_Random, > + false, work_mem); Huh, why do we need random? Greetings, Andres Freund
On Thu, Apr 6, 2017 at 5:17 AM, Andres Freund <andres@anarazel.de> wrote:
> Hi,
> I'm somewhat inclined to think that this really would be better done in
> an extension like pg_stat_statements.
Thanks for the review.
> +}
> +
> +/*
> + * Count SQL statement for pg_stat_sql view
> + */
> +void
> +pgstat_count_sqlstmt(const char *commandTag)
> +{
> + PgStat_SqlstmtEntry *htabent;
> + bool found;
> +
> + if (!pgstat_backend_sql)
> + {
> + /* First time through - initialize SQL statement stat table */
> + HASHCTL hash_ctl;
> +
> + memset(&hash_ctl, 0, sizeof(hash_ctl));
> + hash_ctl.keysize = NAMEDATALEN;
> + hash_ctl.entrysize = sizeof(PgStat_SqlstmtEntry);
> + pgstat_backend_sql = hash_create("SQL statement stat entries",
> + PGSTAT_SQLSTMT_HASH_SIZE,
> + &hash_ctl,
> + HASH_ELEM | HASH_BLOBS);
> + }
> +
> + /* Get the stats entry for this SQL statement, create if necessary */
> + htabent = hash_search(pgstat_backend_sql, commandTag,
> + HASH_ENTER, &found);
> + if (!found)
> + htabent->count = 1;
> + else
> + htabent->count++;
> +}
> That's not really something in this patch, but a lot of this would be
> better if we didn't internally have command tags as strings, but as an
> enum. We should really only convert to a string when needed. That
> we're doing string comparisons on Portal->commandTag is just plain bad.
> If so, we could also make this whole patch a lot cheaper - have a fixed
> size array that has an entry for every possible tag (possibly even in
> shared memory, and then use atomics there).
During development I considered using an array of all command tags and
updating the counters by tag name, rather than by enum value. Now I
would have to create a mapping array from enums to tag names to make the
counter updates easier. The problem with this approach is that whenever
a new command is added, the mapping array needs to be updated with both
the new enum and the new tag name, whereas the hash-table approach keeps
working for future command additions as well, at some performance cost
compared to the array approach.

I will modify the patch to use the array approach and provide it for the next
commitfest.
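The trade-off described above can be illustrated with a rough, self-contained sketch (all names here are invented for illustration and are not code from the patch): counters indexed directly by a command-tag enum, with a parallel name table consulted only when reporting. Adding a new command means touching both the enum and the name table, which is exactly the maintenance cost weighed against the hash table's lookup overhead.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical sketch of the fixed-size array approach: O(1) counter
 * updates with no hash lookup and no string comparison.  The enum and
 * the name table must be kept in sync by hand when commands are added.
 */
typedef enum CommandTagDemo
{
    DEMO_TAG_SELECT = 0,
    DEMO_TAG_INSERT,
    DEMO_TAG_UPDATE,
    DEMO_TAG_DELETE,
    DEMO_NUM_TAGS               /* must stay last */
} CommandTagDemo;

static const char *const demo_tag_names[DEMO_NUM_TAGS] = {
    "SELECT", "INSERT", "UPDATE", "DELETE"
};

static uint64_t demo_stmt_counts[DEMO_NUM_TAGS];

/* Fast path: a plain array increment indexed by the enum value */
static inline void
demo_count_stmt(CommandTagDemo tag)
{
    demo_stmt_counts[tag]++;
}

/* Convert to tag names only when the counters are read out */
static void
demo_report(void)
{
    for (int i = 0; i < DEMO_NUM_TAGS; i++)
        printf("%-8s %llu\n", demo_tag_names[i],
               (unsigned long long) demo_stmt_counts[i]);
}
```

In a real implementation the counts would live in backend-local or shared memory and be drained to the statistics collector, but the update path stays this cheap.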
> +++ b/src/backend/tcop/postgres.c
> @@ -1109,6 +1109,12 @@ exec_simple_query(const char *query_string)
>
> PortalDrop(portal, false);
>
> + /*
> + * Count SQL statement for pg_stat_sql view
> + */
> + if (pgstat_track_sql)
> + pgstat_count_sqlstmt(commandTag);
> +
> Isn't doing this at the message level quite wrong? What about
> statements executed from functions and such? Shouldn't this integrate
> at the executor level instead?
Currently the patch is designed to count only the user-executed statements
that finish successfully (not the internal statements that are executed from
functions and the like).

Dilip asked the same question in earlier mails as well; maybe now is the
time to decide on the approach for counting statements.
Regards,
Hari Babu
Fujitsu Australia
Dear Hackers,

We have resurrected this 2 year old "SQL statements statistics counter" proposal from Hari. The attached patch addresses the previous review issues.

The commit-fest entry has been moved forward to the next commit-fest https://commitfest.postgresql.org/25/790/

Please review again, and consider if this is OK for "Ready for Committer" status.

Kind Regards
--
Peter Smith
Fujitsu Australia
On Thu, Oct 17, 2019 at 3:22 PM Smith, Peter <peters@fast.au.fujitsu.com> wrote: > We have resurrected this 2 year old "SQL statements statistics counter" proposal from Hari. > > The attached patch addresses the previous review issues. Hi, No comment on the patch but I noticed that the documentation changes don't build. Please make sure you can "make docs" successfully, having installed the documentation tools[1]. [1] https://www.postgresql.org/docs/devel/docguide-toolsets.html
From: Thomas Munro <thomas.munro@gmail.com> Sent: Monday, 4 November 2019 1:43 PM

> No comment on the patch but I noticed that the documentation changes don't build. Please make sure you can "make docs" successfully, having installed the documentation tools[1].

Thanks for the feedback. An updated patch which fixes the docs issue is attached.

Kind Regards.
Peter Smith
---
Fujitsu Australia
On 2019-Nov-13, Smith, Peter wrote:

> From: Thomas Munro <thomas.munro@gmail.com> Sent: Monday, 4 November 2019 1:43 PM
>
>> No comment on the patch but I noticed that the documentation changes don't build. Please make sure you can "make docs" successfully, having installed the documentation tools[1].
>
> Thanks for the feedback. An updated patch which fixes the docs issue is attached.

So, who is updating this patch for the new cmdtaglist.h stuff?

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
> On Mar 3, 2020, at 6:50 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>
> On 2019-Nov-13, Smith, Peter wrote:
>
>> From: Thomas Munro <thomas.munro@gmail.com> Sent: Monday, 4 November 2019 1:43 PM
>>
>>> No comment on the patch but I noticed that the documentation changes don't build. Please make sure you can "make docs" successfully, having installed the documentation tools[1].
>>
>> Thanks for the feedback. An updated patch which fixes the docs issue is attached.
>
> So, who is updating this patch for the new cmdtaglist.h stuff?

I have a much altered (competing?) patch inspired by pg_stat_sql. I am working to integrate the changes you made to my commandtag patch into my commandstats patch before posting it. It might be good if somebody else tackled the pg_stat_sql patch in case my version is rejected.

I don't want to highjack this thread to talk in detail about the other patch, so I'll just give an overview that might be useful for anybody thinking about putting effort into committing this patch first.

The motivation for the commandtag patch that you've committed (thanks again!) was to make the commandstats patch possible. I didn't like the way pg_stat_sql stored the tags as strings, but I also still don't like the way it needs to take a lock every time a backend increments the counter for a command. That is bad, I think, because the lock overhead for an otherwise fast command (like moving a cursor) may be high enough to discourage users from turning this feature on in production.

I solve that by trading off speed and memory. For most commandtags, there is a single uint64 counter, and incrementing the counter requires taking a lock. This is very similar to how the other patch works. For the few commandtags that we deem worthy of the special treatment, there is an additional uint32 counter *per backend* reserved in shared memory. Backends update this counter using atomics, but do not require locks.
When the counter is in danger of overflow, locks are taken to roll the count into the uint64 counters and zero out the backend-specific uint32 counter. I have roughly 20 command tags receiving this special treatment, but the design of the patch doesn't make it too hard to change the exact list of tags worthy of it, so I don't plan to stress too much about the exact set of tags. I'm sure you'll all tell me which ones you think do or do not deserve the treatment.

If it isn't obvious, the logic of having this treatment for only a subset of tags is that 192 tags * 4 bytes per counter * MaxBackends gets expensive for folks running a high number of backends. Taking that from 192 tags down to one or two dozen seems worthwhile.

I expect to post on a separate thread before the day is over.

— Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
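The two-tier scheme described above can be sketched outside the server roughly as follows. All names, sizes, and the threshold are invented for illustration; the real patch would use PostgreSQL's shared-memory, LWLock, and atomics APIs rather than C11 stdatomic and a spinlock.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical sketch: each backend owns a 32-bit atomic counter it can
 * bump without taking a lock; a shared 64-bit total is only touched
 * under a lock, when a backend's counter risks overflow or when the
 * totals are read out.
 */
#define DEMO_MAX_BACKENDS 8
#define DEMO_ROLLUP_THRESHOLD (UINT32_MAX / 2)

static _Atomic uint32_t demo_backend_counts[DEMO_MAX_BACKENDS];
static uint64_t demo_total;                 /* protected by demo_lock */
static atomic_flag demo_lock = ATOMIC_FLAG_INIT;

/* Stand-in for an LWLock: a trivial test-and-set spinlock */
static void demo_lock_acquire(void) { while (atomic_flag_test_and_set(&demo_lock)) ; }
static void demo_lock_release(void) { atomic_flag_clear(&demo_lock); }

/* Slow path: move one backend's pending count into the 64-bit total */
static void
demo_rollup(int backend)
{
    uint32_t pending = atomic_exchange(&demo_backend_counts[backend], 0);

    demo_lock_acquire();
    demo_total += pending;
    demo_lock_release();
}

/* Fast path: lock-free unless the per-backend counter is nearly full */
static void
demo_count(int backend)
{
    uint32_t old = atomic_fetch_add(&demo_backend_counts[backend], 1);

    if (old >= DEMO_ROLLUP_THRESHOLD)
        demo_rollup(backend);
}

/* Reader: fold all per-backend counters into the locked total */
static uint64_t
demo_read_total(void)
{
    uint64_t sum;

    demo_lock_acquire();
    sum = demo_total;
    demo_lock_release();
    for (int i = 0; i < DEMO_MAX_BACKENDS; i++)
        sum += atomic_load(&demo_backend_counts[i]);
    return sum;
}
```

A concurrent reader racing with a rollup could momentarily double-count in this sketch; a production version would need to read the total and the per-backend counters more carefully.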
Hackers,

as mentioned in [1], I have created an implementation of command counter statistics very similar in purpose to the one already pending in the commitfest going by the name "pg_stat_sql". I don't really care if this implementation is seen as building on that one or as separate, but I was worried about hijacking that thread, so I'm starting this thread. There are elements of this patch that borrowed from that one, so if this is accepted, authorship should reflect that.

See the work by Haribabu Kommi (haribabu) at https://commitfest.postgresql.org/27/790/

The two main differences are that

(1) This implementation is based on commandtags as enums, not strings and
(2) This implementation uses techniques to reduce lock contention

I think (2) is the more important part.

[1] https://www.postgresql.org/message-id/28F46159-3764-421B-904B-004DEA339310%40enterprisedb.com

— Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
> On Mar 4, 2020, at 7:43 PM, Mark Dilger <mark.dilger@enterprisedb.com> wrote:
>
> Hackers,
>
> as mentioned in [1], I have created an implementation of command counter statistics very similar in purpose to the one already pending in the commitfest going by the name "pg_stat_sql". I don't really care if this implementation is seen as building on that one or as separate, but I was worried about hijacking that thread, so I'm starting this thread. There are elements of this patch that borrowed from that one, so if this is accepted, authorship should reflect that.
>
> See the work by Haribabu Kommi (haribabu) at https://commitfest.postgresql.org/27/790/
>
> The two main differences are that
>
> (1) This implementation is based on commandtags as enums, not strings and
> (2) This implementation uses techniques to reduce lock contention
>
> I think (2) is the more important part.
>
> [1] https://www.postgresql.org/message-id/28F46159-3764-421B-904B-004DEA339310%40enterprisedb.com
>
> <v1-0001-Implementing-the-cmdstats-subsystem.patch>

Rebased patch against master caa3c4242cf86322e2ed0c86199e6462a2c41565.

— Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
> On Jun 2, 2020, at 9:58 AM, Mark Dilger <mark.dilger@enterprisedb.com> wrote:
>
>> On Mar 4, 2020, at 7:43 PM, Mark Dilger <mark.dilger@enterprisedb.com> wrote:
>>
>> Hackers,
>>
>> as mentioned in [1], I have created an implementation of command counter statistics very similar in purpose to the one already pending in the commitfest going by the name "pg_stat_sql". I don't really care if this implementation is seen as building on that one or as separate, but I was worried about hijacking that thread, so I'm starting this thread. There are elements of this patch that borrowed from that one, so if this is accepted, authorship should reflect that.
>>
>> See the work by Haribabu Kommi (haribabu) at https://commitfest.postgresql.org/27/790/
>>
>> The two main differences are that
>>
>> (1) This implementation is based on commandtags as enums, not strings and
>> (2) This implementation uses techniques to reduce lock contention
>>
>> I think (2) is the more important part.
>>
>> [1] https://www.postgresql.org/message-id/28F46159-3764-421B-904B-004DEA339310%40enterprisedb.com
>>
>> <v1-0001-Implementing-the-cmdstats-subsystem.patch>
>
> Rebased patch against master caa3c4242cf86322e2ed0c86199e6462a2c41565.

— Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Wed, Mar 4, 2020 at 10:43 PM Mark Dilger <mark.dilger@enterprisedb.com> wrote:
> The two main differences are that
>
> (1) This implementation is based on commandtags as enums, not strings and
> (2) This implementation uses techniques to reduce lock contention
>
> I think (2) is the more important part.

My spidey sense is tingling here, telling me that we need some actual benchmarking. Like, suppose we test the two patches under normal cases and under cases that are constructed to be as bad as possible for each of them. Or suppose we test this patch with the lock mitigation strategies and then remove the mitigations for some inexpensive command (e.g. SHOW) and then use pgbench to spam that command. Like you, I suspect that the locking mitigations are important in some workloads, but it would be good to have some figures to back that up, as well as to find out whether there's still too much overhead.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Wed, Jun 10, 2020 at 01:45:02PM -0400, Robert Haas wrote: > My spidey sense is tingling here, telling me that we need some actual > benchmarking. Like, suppose we test the two patches under normal cases > and under cases that are constructed to be as bad as possible for each > of them. Or suppose we test this patch with the lock mitigation > strategies and then remove the mitigations for some inexpensive > command (e.g. SHOW) and then use pgbench to spam that command. Like > you, I suspect that the locking mitigations are important in some > workloads, but it would be good to have some figures to back that out, > as well as to find out whether there's still too much overhead. This patch has not received any replies after this comment for three months, so I am marking it as returned with feedback. I agree that this should be benchmarked carefully. -- Michael
On Tue, Mar 03, 2020 at 11:50:04PM -0300, Alvaro Herrera wrote: > So, who is updating this patch for the new cmdtaglist.h stuff? This patch had no activity for months, so I am marking it as returned with feedback. -- Michael
On 2020-Sep-17, Michael Paquier wrote: > On Wed, Jun 10, 2020 at 01:45:02PM -0400, Robert Haas wrote: > > My spidey sense is tingling here, telling me that we need some actual > > benchmarking. > > This patch has not received any replies after this comment for three > months, so I am marking it as returned with feedback. I agree that > this should be benchmarked carefully. It seems fine to mark the patch as RwF at the end of commitfest, but that's two weeks away. I don't understand what is accomplished by doing it ahead of time, other than alienating the patch authors. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
> On Sep 17, 2020, at 5:40 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>
> On 2020-Sep-17, Michael Paquier wrote:
>
>> On Wed, Jun 10, 2020 at 01:45:02PM -0400, Robert Haas wrote:
>>> My spidey sense is tingling here, telling me that we need some actual
>>> benchmarking.
>>
>> This patch has not received any replies after this comment for three
>> months, so I am marking it as returned with feedback. I agree that
>> this should be benchmarked carefully.

Yes, I have prioritized a couple other patches over this one, with benchmarking this patch lower down my priority list. Thank you, Michael, for the reminder!

> It seems fine to mark the patch as RwF at the end of commitfest, but
> that's two weeks away. I don't understand what is accomplished by doing
> it ahead of time, other than alienating the patch authors.

Thanks, Álvaro! I do not feel alienated, and I hope Haribabu Kommi does not either.

Haribabu, do you have an opinion on which of the two patches should proceed? I don't recall seeing a response from you about whether you liked what I did with your patch. Perhaps you were waiting on the benchmarking results?

— Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Thu, Sep 17, 2020 at 08:03:55AM -0700, Mark Dilger wrote: > Yes, I have prioritized a couple other patches over this one, with > benchmarking this patch lower down my priority list. Thank you, > Michael, for the reminder! Based on the low level of activity and the fact that the patch was marked as waiting on author for a couple of weeks, it looks like little could be achieved by the end of the CF, and the attention was elsewhere, so it looked better (and it still does IMO) to give more attention to the remaining 170~ patches that are still lying on the CF app. -- Michael
Attachment
On 2020-Sep-18, Michael Paquier wrote: > Based on the low level of activity and the fact that the patch was > marked as waiting on author for a couple of weeks, it looks like > little could be achieved by the end of the CF, and the attention was > elsewhere, so it looked better (and it still does IMO) to give more > attention to the remaining 170~ patches that are still lying on the CF > app. "A couple of weeks" of inactivity is not sufficient, in my view, to boot a patch out of the commitfest process. Whenever the patch is resurrected, it will be a new entry which won't have the history that it had accumulated in the long time since it was created -- which biases it against other new patches. I'm not really arguing about this one patch only, but more generally about the process you're following. The problem is that you as CFM are imposing your personal priorities on the whole process by punting patches ahead of time -- you are saying that nobody else should be looking at updating this patch because it is now closed. It seems fair to do so at the end of the commitfest, but it does not seem fair to do it when it's still mid-cycle. Putting also in perspective the history that the patch had prior to your unfairly closing it, there was a lot of effort in getting it reviewed to a good state and updated by the author -- yet a single unsubstantiated comment, without itself a lot of effort on the reviewer's part, that "maybe it has a perf drawback" is sufficient to get it booted? It doesn't seem appropriate. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Mon, Sep 21, 2020 at 9:41 AM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> "A couple of weeks" of inactivity is not sufficient, in my view, to boot
> a patch out of the commitfest process. Whenever the patch is
> resurrected, it will be a new entry which won't have the history that it
> had accumulated in the long time since it was created -- which biases
> it against other new patches.

As Michael said, it had been inactive for three *months*. I don't think there's a big problem here.

I think that the redesign of the CommitFest application encourages carrying things along from CF to CF forever, instead of getting rid of things that aren't wanted or aren't making any progress. That's work we don't need. There ought to be a way to mark a patch RwF when nothing's happening that lets it be revived later if and when someone gets around to resubmitting, but I think that's not possible now. Then, too, since we have email integration now, maybe we ought to do that automatically if the thread isn't getting updated and the patch is sitting there waiting on author. It's a real waste of CFM time to chase down and kick out obviously-inactive patches, and if the CFM is going to get flack for it then that's even worse.

Like, do we have 170 patches now because we have more activity than a few years ago, or just because we've become more reluctant to boot things? I don't know, but to the extent it's from unwillingness to boot things, I think that's bad. Discouraging contributors is not good, but wasting CFM and committer time is also bad. You've got to draw a line someplace or you'll go nuts.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Mon, Sep 21, 2020 at 10:41:46AM -0300, Alvaro Herrera wrote:
> "A couple of weeks" of inactivity is not sufficient, in my view, to boot
> a patch out of the commitfest process. Whenever the patch is
> resurrected, it will be a new entry which won't have the history that it
> had accumulated in the long time since it was created -- which biases
> it against other new patches.

Any of the patches that have been marked as RwF last week had been waiting on author since at least the end of the commit fest of July, with zero action taken on them. In my experience, this points strongly to the fact that they are unlikely to be worked on actively in a short time frame.

> I'm not really arguing about this one patch only, but more generally
> about the process you're following. The problem is that you as CFM are
> imposing your personal priorities on the whole process by punting
> patches ahead of time -- you are saying that nobody else should be
> looking at updating this patch because it is now closed. It seems fair
> to do so at the end of the commitfest, but it does not seem fair to do
> it when it's still mid-cycle.

I think that this remark is a bit unfair. When it comes to any patch in the commit fest, and I have managed a couple of them over the years, I have tried to keep a neutral view for all of them, meaning that I only let the numbers speak by themselves, mostly:
- When has a patch lastly been updated.
- What's the current state in the CF app.

If the patch had no reviews and still in "Needs review", simply switch it to "waiting on author". If the patch has been waiting on author for more than two weeks, it would get marked as RwF at the end of the CF. There are also cases where the status of the patch is incorrect, mostly that a patch waiting on author because of a review was not updated as such in the CF app. In this case, and in the middle of a CF, this would get get marked as Rwf, but the author would get a reminder of that.
Any of the patches that have been discarded by me last week were waiting in the stack for much, much longer than that, and I think that it does not help reviewers and committers to keep this stuff around longer than necessary, because it mainly bloats the CF app, hiding patches that deserve more attention. Some of those patches have been left idle for half a year.

Another point that I'd like to make here is that my intention was to flatten the curve (a term quite fashionable this year particularly, isn't it?), due to the high number of patches to handle in a single CF and the shortage of hands actually doing this kind of work.

Then, let's be clear about one last thing. If at *any* moment people here feel that the state of even *one* patch of this CF has been changed based on some of my "personal" priority views, I will immediately stop any action as CFM and resign from it. So, if that's the case, please feel free to speak up here or on any of the threads related to those patches.

> Putting also in perspective the history that the patch had prior to your
> unfairly closing it, there was a lot of effort in getting it reviewed to
> a good state and updated by the author -- yet a single unsubstantiated
> comment, without itself a lot of effort on the reviewer's part, that
> "maybe it has a perf drawback" is sufficient to get it booted? It
> doesn't seem appropriate.

I agree that being able to recreate a new CF entry that keeps the history of an old one, with all the past threads or annotations, could be helpful in some cases. Now, I have also seen cases where a patch had been around for so long, with a very long thread, that things became confusing, and people also like starting a new thread with a summary of the past state as a first message to attempt to clarify things. So both approaches have advantages.
--
Michael
On Wed, Sep 23, 2020 at 10:41:10AM +0900, Michael Paquier wrote:
> I think that this remark is a bit unfair. When it comes to any patch
> in the commit fest, and I have managed a couple of them over the
> years, I have tried to keep a neutral view for all of them, meaning
> that I only let the numbers speak by themselves, mostly:
> - When has a patch lastly been updated.
> - What's the current state in the CF app.
> If the patch had no reviews and still in "Needs review", simply
> switch it to "waiting on author".

Sorry, forgot to mention here that this is in the case where a patch does not apply anymore or that the CF bot complains, requiring a check of the patch from the author.

> If the patch has been waiting on author for more than two weeks, it
> would get marked as RwF at the end of the CF. There are also cases
> where the status of the patch is incorrect, mostly that a patch
> waiting on author because of a review was not updated as such in the
> CF app. In this case, and in the middle of a CF, this would get get
> marked as Rwf, but the author would get a reminder of that.

The last sentence here is incorrect as well: s/get get/not get/, and the author would be reminded to update his/her patch.
--
Michael
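[Editorial note: the triage rules Michael describes in the two messages above amount to a small decision procedure, which can be sketched as follows. This is a hypothetical illustration only; the function and parameter names are invented and none of this code is part of the actual commitfest application.]

```python
from datetime import datetime, timedelta

def triage(status, applies, cf_bot_happy, waiting_since, now, cf_end):
    """Sketch of the CFM triage rules described above (hypothetical helper)."""
    # A "Needs review" patch that no longer applies, or that the CF bot
    # complains about, needs action from its author first.
    if status == "Needs review" and not (applies and cf_bot_happy):
        return "Waiting on author"
    # A patch waiting on author for more than two weeks is marked
    # "Returned with feedback" at the end of the commitfest.
    if (status == "Waiting on author"
            and now >= cf_end
            and now - waiting_since > timedelta(weeks=2)):
        return "Returned with feedback"
    return status
```

Per the correction in the follow-up message, a mislabeled patch caught mid-CF would not be closed outright; the author would instead be reminded to update it.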
>> On Mar 4, 2020, at 7:43 PM, Mark Dilger <mark.dilger@enterprisedb.com> wrote:
>>
>> as mentioned in [1], I have created an implementation of command
>> counter statistics very similar in purpose to the one already
>> pending in the commitfest going by the name "pg_stat_sql". I don't
>> really care if this implementation is seen as building on that one
>> or as separate, but I was worried about hijacking that thread, so
>> I'm starting this thread. There are elements of this patch that
>> borrowed from that one, so if this is accepted, authorship should
>> reflect that.

Hello, was this abandoned or is there another thread somewhere? The patch in the email I'm replying to doesn't apply, but the conflicts don't look very bad.

--
Álvaro Herrera               Valdivia, Chile
> On May 28, 2021, at 6:42 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>
>>>> On Mar 4, 2020, at 7:43 PM, Mark Dilger <mark.dilger@enterprisedb.com> wrote:
>>>>
>>>> as mentioned in [1], I have created an implementation of command
>>>> counter statistics very similar in purpose to the one already
>>>> pending in the commitfest going by the name "pg_stat_sql". I don't
>>>> really care if this implementation is seen as building on that one
>>>> or as separate, but I was worried about hijacking that thread, so
>>>> I'm starting this thread. There are elements of this patch that
>>>> borrowed from that one, so if this is accepted, authorship should
>>>> reflect that.
>
> Hello, was this abandoned or is there another thread somewhere? The
> patch in the email I'm replying to doesn't apply, but the conflicts
> don't look very bad.

I abandoned it for v14, choosing instead to work on amcheck features. Would you like me to take this up again for v15?

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company