Thread: [HACKERS] Adding new output parameter of pg_stat_statements to identify operation of the query.
[HACKERS] Adding new output parameter of pg_stat_statements to identify operation of the query.
From
Date:
Hi PG hackers:
When using pg_stat_statements to collect running SQL of PG, we find it is hard for our program to get exact operation type of the SQL, such as SELECT, DELETE, UPDATE, INSERT, and so on.
So we modify the the source code of pg_stat_statements and add another output parameter to tell us the operation type.
Of course some application know their operation type, but for us and many public databases, doing this is hard.
The only way is to reparse the SQL, obviously it is too expensive for a monitoring or diagnosis system.
We have done the job and are willing to post a patch.
I sent one through my work mail, but it seems that my mail didn't reach the maillist, so I try again by using my personal mail account.
jasonysli
Re: [HACKERS] Adding new output parameter of pg_stat_statements toidentify operation of the query.
From
"Tsunakawa, Takayuki"
Date:
From: pgsql-hackers-owner@postgresql.org > [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of > husttripper@vip.sina.com > When using pg_stat_statements to collect running SQL of PG, we > find it is hard for our program to get exact operation type of the SQL, > such as SELECT, DELETE, UPDATE, INSERT, and so on. > So we modify the the source code of pg_stat_statements and add another > output parameter to tell us the operation type. > Of course some application know their operation type, but for us and many > public databases, doing this is hard. > The only way is to reparse the SQL, obviously it is too expensive for a > monitoring or diagnosis system. > We have done the job and are willing to post a patch. > I sent one through my work mail, but it seems that my mail didn't reach > the maillist, so I try again by using my personal mail account. A view for counting the number of executions per operation type is being developed for PostgreSQL 10, which is expected tobe released this year. https://commitfest.postgresql.org/13/790/ Would this fit your need? If not, what's the benefit of getting the operation type via pg_stat_statements? Regards Takayuki Tsunakawa
Re: [HACKERS] Adding new output parameter of pg_stat_statements toidentify operation of the query.
From
Jim Nasby
Date:
On 2/19/17 6:34 PM, Tsunakawa, Takayuki wrote: >> We have done the job and are willing to post a patch. >> I sent one through my work mail, but it seems that my mail didn't reach >> the maillist, so I try again by using my personal mail account. > A view for counting the number of executions per operation type is being developed for PostgreSQL 10, which is expectedto be released this year. > > https://commitfest.postgresql.org/13/790/ > > Would this fit your need? If not, what's the benefit of getting the operation type via pg_stat_statements? Something that needs to be considered with doing this in pg_stat_statement is that a query that's reported there can contain multiple SQL statements. I don't remember offhand if all statements get parsed as a whole before anything else happens; if that's the case then you could potentially have an array in pg_stat_statements indicating what the command tags are. Short of that, I'm not sure it would be a good idea to only support a single tag being visible at a time; it would be certain to induce users to create code that's going to be buggy as soon as someone starts using multiple statements. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532)
Re: [HACKERS] Adding new output parameter of pg_stat_statements to identify operation of the query.
From
Tom Lane
Date:
Jim Nasby <Jim.Nasby@BlueTreble.com> writes: > Something that needs to be considered with doing this in > pg_stat_statement is that a query that's reported there can contain > multiple SQL statements. I don't remember offhand if all statements get > parsed as a whole before anything else happens; if that's the case then > you could potentially have an array in pg_stat_statements indicating > what the command tags are. I think that's been addressed as of 83f2061dd. My own concern here is that pg_stat_statements shared hashtable entries (pgssEntry) are currently 200 bytes, if I counted correctly. It's hard to see how to implement this feature without adding COMPLETION_TAG_BUFSIZE (64 bytes) to that, which is kind of a large percentage bump for a feature request that AFAIR nobody else has ever made. I suppose one way to avoid a large fixed-size field would be to store the tag string out-of-line, similarly to what we do with the query text. But then you've traded off a shared-memory-bloat worry for a performance worry, ie what's the added overhead for dealing with another external string. regards, tom lane
[HACKERS] 答复: [HACKERS] Adding new output parameter of pg_stat_statements toidentify operation of the query.(Internet mail)
From
jasonysli(李跃森)
Date:
Yes, it seems the pg_stat_sql function can fit the individual need of collecting tags of query. However the new functioncan not return other values of query at the same time, such as block number info, run time and so on. Returningthese values at the same time are very important. So I think it is still needed by pg_stat_statement when monitoring a database system. ________________________________________ 发件人: pgsql-hackers-owner@postgresql.org [pgsql-hackers-owner@postgresql.org] 代表 Tsunakawa, Takayuki [tsunakawa.takay@jp.fujitsu.com] 发送时间: 2017年2月20日 8:34 收件人: 'husttripper@vip.sina.com'; pgsql-hackers 主题: Re: [HACKERS] Adding new output parameter of pg_stat_statements toidentify operation of the query.(Internet mail) From: pgsql-hackers-owner@postgresql.org > [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of > husttripper@vip.sina.com > When using pg_stat_statements to collect running SQL of PG, we > find it is hard for our program to get exact operation type of the SQL, > such as SELECT, DELETE, UPDATE, INSERT, and so on. > So we modify the the source code of pg_stat_statements and add another > output parameter to tell us the operation type. > Of course some application know their operation type, but for us and many > public databases, doing this is hard. > The only way is to reparse the SQL, obviously it is too expensive for a > monitoring or diagnosis system. > We have done the job and are willing to post a patch. > I sent one through my work mail, but it seems that my mail didn't reach > the maillist, so I try again by using my personal mail account. A view for counting the number of executions per operation type is being developed for PostgreSQL 10, which is expected tobe released this year. https://commitfest.postgresql.org/13/790/ Would this fit your need? If not, what's the benefit of getting the operation type via pg_stat_statements? Regards Takayuki Tsunakawa -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 答复: [HACKERS] Adding new output parameter of pg_stat_statements toidentify operation of the query.(Internet mail)
From
Tom Lane
Date:
jasonysli(李跃森) <jasonysli@tencent.com> writes: > Yes, it seems the pg_stat_sql function can fit the individual need of collecting tags of query. However the new functioncan not return other values of query at the same time, such as block number info, run time and so on. Returningthese values at the same time are very important. Meh ... that seems like a justification that was made up on the fly, rather than being a demonstrable requirement. What's the specific use case that requires it? After thinking about this some more I'm really pretty dubious that storing the command tag will get you anything you can't get as well or better by looking at the first word or two of the query text. I don't believe the original claim that doing so is "too expensive for a monitoring system". It does not take that much to pull out a couple of whitespace-separated keywords and perhaps case-fold them, and by the time you've fetched the string out of the database you've spent way more cycles than that. A slightly better argument is that WITH will confuse matters, but really it does anyway: consider WITH x AS (INSERT INTO ... RETURNING *)SELECT * FROM x; This will get a command tag of "SELECT", but any reasonable person would deem that this would be better characterized as an INSERT. So I think if you want useful results for WITH cases you're going to need to spend effort looking at the querystring anyway. regards, tom lane
Re: [HACKERS] Adding new output parameter of pg_stat_statements toidentify operation of the query.
From
Jim Nasby
Date:
On 2/19/17 10:02 PM, Tom Lane wrote: > Jim Nasby <Jim.Nasby@BlueTreble.com> writes: >> Something that needs to be considered with doing this in >> pg_stat_statement is that a query that's reported there can contain >> multiple SQL statements. I don't remember offhand if all statements get >> parsed as a whole before anything else happens; if that's the case then >> you could potentially have an array in pg_stat_statements indicating >> what the command tags are. > I think that's been addressed as of 83f2061dd. > > My own concern here is that pg_stat_statements shared hashtable entries > (pgssEntry) are currently 200 bytes, if I counted correctly. It's hard > to see how to implement this feature without adding COMPLETION_TAG_BUFSIZE > (64 bytes) to that, which is kind of a large percentage bump for a feature > request that AFAIR nobody else has ever made. AFAIK the only variable part of any tag is the rowcount from SELECT (if that's even part of the tag?)... so couldn't tags be switched over to an enum, at least internally? -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532)