Thread: [HACKERS] Adding new output parameter of pg_stat_statements to identify operation of the query.

 Hi PG hackers:
         When using pg_stat_statements to collect running SQL of PG, we find it is hard for our program to get exact operation type of the SQL,  such as SELECT, DELETE, UPDATE, INSERT, and so on.
   So we modify the the source code of pg_stat_statements and add another output parameter to tell us the operation type.
Of course some application know their operation type, but for us and many public databases, doing this is hard. 
The only way is to reparse the SQL, obviously it is too expensive for a monitoring or diagnosis system.
We have done the job and are willing to post a patch.
I sent one through my work mail, but it seems that my mail didn't reach the maillist, so I try again by using my personal mail account.

jasonysli
From: pgsql-hackers-owner@postgresql.org
> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of
> husttripper@vip.sina.com
>          When using pg_stat_statements to collect running SQL of PG, we
> find it is hard for our program to get exact operation type of the SQL,
> such as SELECT, DELETE, UPDATE, INSERT, and so on.
>    So we modify the the source code of pg_stat_statements and add another
> output parameter to tell us the operation type.
> Of course some application know their operation type, but for us and many
> public databases, doing this is hard.
> The only way is to reparse the SQL, obviously it is too expensive for a
> monitoring or diagnosis system.
> We have done the job and are willing to post a patch.
> I sent one through my work mail, but it seems that my mail didn't reach
> the maillist, so I try again by using my personal mail account.

A view for counting the number of executions per operation type is being developed for PostgreSQL 10, which is expected
tobe released this year.
 

https://commitfest.postgresql.org/13/790/

Would this fit your need?  If not, what's the benefit of getting the operation type via pg_stat_statements?

Regards
Takayuki Tsunakawa






On 2/19/17 6:34 PM, Tsunakawa, Takayuki wrote:
>> We have done the job and are willing to post a patch.
>> I sent one through my work mail, but it seems that my mail didn't reach
>> the maillist, so I try again by using my personal mail account.
> A view for counting the number of executions per operation type is being developed for PostgreSQL 10, which is
expectedto be released this year.
 
>
> https://commitfest.postgresql.org/13/790/
>
> Would this fit your need?  If not, what's the benefit of getting the operation type via pg_stat_statements?

Something that needs to be considered with doing this in  
pg_stat_statement is that a query that's reported there can contain  
multiple SQL statements. I don't remember offhand if all statements get  
parsed as a whole before anything else happens; if that's the case then  
you could potentially have an array in pg_stat_statements indicating  
what the command tags are.

Short of that, I'm not sure it would be a good idea to only support a  
single tag being visible at a time; it would be certain to induce users  
to create code that's going to be buggy as soon as someone starts using  
multiple statements.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)



Jim Nasby <Jim.Nasby@BlueTreble.com> writes:
> Something that needs to be considered with doing this in  
> pg_stat_statement is that a query that's reported there can contain  
> multiple SQL statements. I don't remember offhand if all statements get  
> parsed as a whole before anything else happens; if that's the case then  
> you could potentially have an array in pg_stat_statements indicating  
> what the command tags are.

I think that's been addressed as of 83f2061dd.

My own concern here is that pg_stat_statements shared hashtable entries
(pgssEntry) are currently 200 bytes, if I counted correctly.  It's hard
to see how to implement this feature without adding COMPLETION_TAG_BUFSIZE
(64 bytes) to that, which is kind of a large percentage bump for a feature
request that AFAIR nobody else has ever made.

I suppose one way to avoid a large fixed-size field would be to store
the tag string out-of-line, similarly to what we do with the query text.
But then you've traded off a shared-memory-bloat worry for a performance
worry, ie what's the added overhead for dealing with another external
string.
        regards, tom lane



Yes, it seems the pg_stat_sql function can fit the individual need of collecting tags of query.  However the new
functioncan not return other values of query  at the same time, such as block number info, run time and so on.
Returningthese values at the same time are very important.   
So I think it is still needed by pg_stat_statement when monitoring a database system.
________________________________________
发件人: pgsql-hackers-owner@postgresql.org [pgsql-hackers-owner@postgresql.org] 代表 Tsunakawa, Takayuki
[tsunakawa.takay@jp.fujitsu.com]
发送时间: 2017年2月20日 8:34
收件人: 'husttripper@vip.sina.com'; pgsql-hackers
主题: Re: [HACKERS] Adding new output parameter of pg_stat_statements toidentify operation of the query.(Internet mail)

From: pgsql-hackers-owner@postgresql.org
> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of
> husttripper@vip.sina.com
>          When using pg_stat_statements to collect running SQL of PG, we
> find it is hard for our program to get exact operation type of the SQL,
> such as SELECT, DELETE, UPDATE, INSERT, and so on.
>    So we modify the the source code of pg_stat_statements and add another
> output parameter to tell us the operation type.
> Of course some application know their operation type, but for us and many
> public databases, doing this is hard.
> The only way is to reparse the SQL, obviously it is too expensive for a
> monitoring or diagnosis system.
> We have done the job and are willing to post a patch.
> I sent one through my work mail, but it seems that my mail didn't reach
> the maillist, so I try again by using my personal mail account.

A view for counting the number of executions per operation type is being developed for PostgreSQL 10, which is expected
tobe released this year. 

https://commitfest.postgresql.org/13/790/

Would this fit your need?  If not, what's the benefit of getting the operation type via pg_stat_statements?

Regards
Takayuki Tsunakawa





--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

jasonysli(李跃森) <jasonysli@tencent.com> writes:
> Yes, it seems the pg_stat_sql function can fit the individual need of collecting tags of query.  However the new
functioncan not return other values of query  at the same time, such as block number info, run time and so on.
Returningthese values at the same time are very important.   

Meh ... that seems like a justification that was made up on the fly,
rather than being a demonstrable requirement.  What's the specific use
case that requires it?


After thinking about this some more I'm really pretty dubious that storing
the command tag will get you anything you can't get as well or better by
looking at the first word or two of the query text.  I don't believe the
original claim that doing so is "too expensive for a monitoring system".
It does not take that much to pull out a couple of whitespace-separated
keywords and perhaps case-fold them, and by the time you've fetched the
string out of the database you've spent way more cycles than that.
A slightly better argument is that WITH will confuse matters, but really
it does anyway: consider
WITH x AS (INSERT INTO ... RETURNING *)SELECT * FROM x;

This will get a command tag of "SELECT", but any reasonable person would
deem that this would be better characterized as an INSERT.  So I think
if you want useful results for WITH cases you're going to need to spend
effort looking at the querystring anyway.
        regards, tom lane



On 2/19/17 10:02 PM, Tom Lane wrote:
> Jim Nasby <Jim.Nasby@BlueTreble.com> writes:
>> Something that needs to be considered with doing this in
>> pg_stat_statement is that a query that's reported there can contain
>> multiple SQL statements. I don't remember offhand if all statements get
>> parsed as a whole before anything else happens; if that's the case then
>> you could potentially have an array in pg_stat_statements indicating
>> what the command tags are.
> I think that's been addressed as of 83f2061dd.
>
> My own concern here is that pg_stat_statements shared hashtable entries
> (pgssEntry) are currently 200 bytes, if I counted correctly.  It's hard
> to see how to implement this feature without adding COMPLETION_TAG_BUFSIZE
> (64 bytes) to that, which is kind of a large percentage bump for a feature
> request that AFAIR nobody else has ever made.

AFAIK the only variable part of any tag is the rowcount from SELECT (if 
that's even part of the tag?)... so couldn't tags be switched over to an 
enum, at least internally?
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)