Thread: display previous query string of idle-in-transaction
Hi. Now, we can check the running query string by pg_stat_activity.current_query. If we can also check previous query_string of idle-in-transaction, it is useful for analysis of long transaction problem. Long-transaction is a trouble, because it prevents defragmentation of HOT and VACUUM. And long-transaction tends to be it in a state of "idle in transaction". (BEGIN -> SOME SQL -> .... (long-transactin) ....) So, I sometimes want to know what query (main cause) was done before transaction which have been practiced for a long time. I think that we are glad when we can confirm it in the following form. # We will be able to use debug_query_string in postgres.c for this purpose. ================================================================= =# SELECT current_query FROM pg_stat_activity WHERE procpid <> pg_backend_pid(); current_query ---------------------------------------------------------------<IDLE> in transaction [prev]: SELECT * FROM pg_class limit1; ================================================================= Thoughts? Best regards. -- Tatsuhito Kasahara kasahara.tatsuhito@oss.ntt.co.jp
This sure is a desirable feature. I have seen quite a few instances, where the app is in 'IDLE in Transaction' state, andwe are left with the only choice of killing such processes from OS. (Remember pg_cancel_backend() does not work for sessionsin IDLE or IDLE in transaction state)<br /><br />Also, it should be introduced as a new column, rather than tackingon the existing string. Although this column will be of little use in cases where current query is visible, but havinga separate column looks like a cleaner and simpler implementation.<br /><br />Best regards,<br /><br /><div class="gmail_quote">2009/3/25Tatsuhito Kasahara <span dir="ltr"><<a href="mailto:kasahara.tatsuhito@oss.ntt.co.jp">kasahara.tatsuhito@oss.ntt.co.jp</a>></span><br/><blockquote class="gmail_quote"style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"> Hi.<br/><br /> Now, we can check the running query string by pg_stat_activity.current_query.<br /> If we can also check previousquery_string of idle-in-transaction,<br /> it is useful for analysis of long transaction problem.<br /><br /> Long-transactionis a trouble, because it prevents defragmentation of HOT and VACUUM.<br /> And long-transaction tends tobe it in a state of "idle in transaction".<br /> (BEGIN -> SOME SQL -> .... (long-transactin) ....)<br /><br /> So,I sometimes want to know what query (main cause) was done before<br /> transaction which have been practiced for a longtime.<br /><br /> I think that we are glad when we can confirm it in the following form.<br /> # We will be able to usedebug_query_string in postgres.c for this purpose.<br /><br /> =================================================================<br/> =# SELECT current_query FROM pg_stat_activity<br /> WHERE procpid <> pg_backend_pid();<br /><br /> current_query<br /> ---------------------------------------------------------------<br/> <IDLE> in transaction [prev]: SELECT * FROM pg_classlimit 1;<br /><br /> =================================================================<br /><br /> Thoughts?<br /><br/> Best regards.<br /><br /> --<br /> Tatsuhito Kasahara<br /><a href="mailto:kasahara.tatsuhito@oss.ntt.co.jp">kasahara.tatsuhito@oss.ntt.co.jp</a><br/><font color="#888888"><br /> --<br/> Sent via pgsql-hackers mailing list (<a href="mailto:pgsql-hackers@postgresql.org">pgsql-hackers@postgresql.org</a>)<br/> To make changes to your subscription:<br/><a href="http://www.postgresql.org/mailpref/pgsql-hackers" target="_blank">http://www.postgresql.org/mailpref/pgsql-hackers</a><br/></font></blockquote></div><br /><br clear="all"/><br />-- <br />gurjeet[.singh]@EnterpriseDB.com<br />singh.gurjeet@{ gmail | hotmail | indiatimes | yahoo }.com<br/><br />EnterpriseDB <a href="http://www.enterprisedb.com">http://www.enterprisedb.com</a><br /><br />Mail sentfrom my BlackLaptop device<br />
On Wed, Mar 25, 2009 at 06:08:43PM +0900, Tatsuhito Kasahara wrote: > So, I sometimes want to know what query (main cause) was done before > transaction which have been practiced for a long time. > Thoughts? I would love to get it, but when I suggested it some time in the past Tom shot it down as bad idea. http://archives.postgresql.org/message-id/20071016132131.GA4438@depesz.com To be honest - I have yet to see case described by Ton (commit; begin;). Best regards, depesz -- Linkedin: http://www.linkedin.com/in/depesz / blog: http://www.depesz.com/ jid/gtalk: depesz@depesz.com / aim:depeszhdl / skype:depesz_hdl / gg:6749007
On Wed, Mar 25, 2009 at 5:48 PM, hubert depesz lubaczewski <depesz@depesz.com> wrote: > I would love to get it, but when I suggested it some time in the past > Tom shot it down as bad idea. > > http://archives.postgresql.org/message-id/20071016132131.GA4438@depesz.com I agree with Tom here. I tracked a lot of idle in transaction problems and you need the whole sequence of queries of all the backends to really understand what happens. The last query executed is mostly useless to solve this sort of problem. At least, it was for me in every case I had to deal with. -- Guillaume
Guillaume Smet wrote: > On Wed, Mar 25, 2009 at 5:48 PM, hubert depesz lubaczewski > <depesz@depesz.com> wrote: >> I would love to get it, but when I suggested it some time in the past >> Tom shot it down as bad idea. >> >> http://archives.postgresql.org/message-id/20071016132131.GA4438@depesz.com > > I agree with Tom here. I tracked a lot of idle in transaction problems > and you need the whole sequence of queries of all the backends to > really understand what happens. > The last query executed is mostly useless to solve this sort of > problem. At least, it was for me in every case I had to deal with. I think so too. But last-query-string may be a useful *hint*. It is a hard work to trace the whole sequence of queries. In most cases, last query string is enough information to solve the long transaction problem. And I want a easy way to get the informarion. # last query string is also a key to trace queries. "COMMIT;BEGIN;idle in transaction" is not a problem on 8.3 or later, but "(BEGIN;)SOME SQL;idle in transaction" is a problem. So, main purpose of displaying the last query string is .. - check whether "idle in transaction (running long time) process after SOME SQL" is exists or not. - check the content of "SOME SQL". best regards, -- Tatsuhito Kasahara kasahara.tatsuhito@oss.ntt.co.jp
Tatsuhito Kasahara <kasahara.tatsuhito@oss.ntt.co.jp> wrote: > So, main purpose of displaying the last query string is .. > - check whether "idle in transaction (running long time) process > after SOME SQL" is exists or not. > - check the content of "SOME SQL". The feature could be achieved by an extension module using new executor hooks in 8.4. It is just like contrib/pg_stat_statements; Backends store their queries history in process-local or shared memory. If you store queries in local memory, you could dump them at some of callback routines called at the end of transaction. On the other hand, if you use shared memory, it might be possitble to define a SRF function which return history of queries: CREATE FUNCTION query_history(backend_pid integer) RETURNS SETOF text; Required memory for query history is not so much. "1kB of query text * 100 connection * 10 queries" consumes just 1MB. You can discard old queries at the end of transaction or out of memory. If there are some interesting queries in it, you can dump them into server logs. For example, logging configurations something like 'idle_in_transaction_min_duration' and 'total_transaction_min_duration' will be possible. I'm not sure this feature should be in the core or not. Regards, --- ITAGAKI Takahiro NTT Open Source Software Center
ITAGAKI Takahiro wrote: > The feature could be achieved by an extension module using new executor > hooks in 8.4. It is just like contrib/pg_stat_statements; Well, it is a good idea. Displaying last-query-string may be useful, but it is not a feature for general purpose. So, it may be an external module. Now, I have two choice. 1. Displaying last-query-string by pg_stat_activity (suggested by Gurjeet): Extends the pg_stat_activity to display "previous_query". We can check only a previous query. 2. Displaying query-string-history by external module (suggested by Itagaki): Using executor hook to get a query stringand stores it in shared memory. We can check specified number of query string history. I think I will try mainly 2. # Or this feature may merge to pg_stat_statements. Any comments welcome. Best regards, -- Tatsuhito Kasahara kasahara.tatsuhito@oss.ntt.co.jp
On Wed, 2009-03-25 at 18:08 +0900, Tatsuhito Kasahara wrote: > If we can also check previous query_string of idle-in-transaction, > it is useful for analysis of long transaction problem. I'm more interested in the problem itself. Why do you think there is a problem and why does knowing this help you? I had similar problems recently, so I'm interested in thoughts around this. -- Simon Riggs www.2ndQuadrant.comPostgreSQL Training, Services and Support
Simon Riggs wrote: >> If we can also check previous query_string of idle-in-transaction, >> it is useful for analysis of long transaction problem. > > I'm more interested in the problem itself. Why do you think there is a > problem and why does knowing this help you? I had similar problems > recently, so I'm interested in thoughts around this. In many case, applications don't know what queries they are doing. So, if a long transaction problem (unworking defragmnetation by HOT and VACUUM) occured, I don't understand which application is the cause with only "<IDLE> in transaction" message. But if I can also check last query string, I guess which apllication do that and point out the problem point. # As I said, I think a long transaction is not a problem itself. Does that answer your question ? Best regards, -- Tatsuhito Kasahara kasahara.tatsuhito@oss.ntt.co.jp
On Fri, 2009-03-27 at 15:44 +0900, Tatsuhito Kasahara wrote: > Simon Riggs wrote: > >> If we can also check previous query_string of idle-in-transaction, > >> it is useful for analysis of long transaction problem. > > > > I'm more interested in the problem itself. Why do you think there is a > > problem and why does knowing this help you? I had similar problems > > recently, so I'm interested in thoughts around this. > In many case, applications don't know what queries they are doing. > So, if a long transaction problem (unworking defragmnetation by HOT and > VACUUM) occured, I don't understand which application is the cause with > only "<IDLE> in transaction" message. > > But if I can also check last query string, I guess which apllication > do that and point out the problem point. > # As I said, I think a long transaction is not a problem itself. > > Does that answer your question ? Not really. I want to understand the actual problem with idle-in-transaction so we can consider all ways to solve it, rather than just focus on one method. -- Simon Riggs www.2ndQuadrant.comPostgreSQL Training, Services and Support
2009/3/27 Tatsuhito Kasahara <kasahara.tatsuhito@oss.ntt.co.jp>: > But if I can also check last query string, I guess which apllication > do that and point out the problem point. Oh, I just understand why you want this patch. I usually have one database per server so I didn't see your point. Considering this usage, +1 for me. -- Guillaume
On Fri, 2009-03-27 at 16:49 +0900, Tatsuhito Kasahara wrote: > Simon Riggs wrote: > >> Does that answer your question ? > > > > Not really. I want to understand the actual problem with > > idle-in-transaction so we can consider all ways to solve it, rather than > > just focus on one method. > "idle in transaction timeout" feature may be one of the ways. > But I have no specific idea about it now. Or take it further back still and think about why "idle in transaction" occurs at all and fix *that*. Maybe not in Postgres at all, possibly in the driver or even higher up client stack. -- Simon Riggs www.2ndQuadrant.comPostgreSQL Training, Services and Support
Simon Riggs wrote: >> Does that answer your question ? > > Not really. I want to understand the actual problem with > idle-in-transaction so we can consider all ways to solve it, rather than > just focus on one method. "idle in transaction timeout" feature may be one of the ways. But I have no specific idea about it now. -- Tatsuhito Kasahara kasahara.tatsuhito@oss.ntt.co.jp
On Fri, Mar 27, 2009 at 9:07 AM, Simon Riggs <simon@2ndquadrant.com> wrote: > Or take it further back still and think about why "idle in transaction" > occurs at all and fix *that*. Maybe not in Postgres at all, possibly in > the driver or even higher up client stack. From my experience, the main reasons for this problem are: - a poorly coded connection pool (a lot of people think they really should write their connection pooling solution themselves...); - a poorly coded application; - bugs in the code. So you really should try to fix them in your application first. That said, I have one case in mind where I wasn't able to fix completely the connection pool and the application and we still encounter idle in transaction connections from time to time. For this sort of case, a timeout would be a nice solution. -- Guillaume
On Fri, Mar 27, 2009 at 8:27 AM, Guillaume Smet <guillaume.smet@gmail.com> wrote: > 2009/3/27 Tatsuhito Kasahara <kasahara.tatsuhito@oss.ntt.co.jp>: >> But if I can also check last query string, I guess which apllication >> do that and point out the problem point. > > Oh, I just understand why you want this patch. I usually have one > database per server so I didn't see your point. Thinking a bit more about it: the datname column in the pg_stat_activity view gives you the database concerned and usename the user used. So I still don't see your point: you can use different user to distinguish the applications. Moreover, if you're using connection pooling (which is more and more common) and the same user for connecting to the database, you won't be able to know if it's really the last query which causes the problem (from my experience, it's usually not). Being able to detect which application is running which query on the very same database with the very same user seems like something not so obvious and the use case seems to be pretty narrow. And IMHO, even if we suppose you can make the difference between the applications with only one query, you won't be able to limit your investigation to this application. So, in fact, I'm still not convinced. Could you detail a bit more how you plan to use it? -- Guillaume
(Sorry for delay..) Guillaume Smet wrote: > Being able to detect which application is running which query on the > very same database with the very same user seems like something not so > obvious and the use case seems to be pretty narrow. And IMHO, even if > we suppose you can make the difference between the applications with > only one query, you won't be able to limit your investigation to this > application. Yes, I won't be able to *completely* detect which application is running long transaction with a last query. But, as I said, I can get a hint for guessing causes from it. And, as Simon said, I can detect a problem point with collaboration from other information (app's log, app's source, operation procedure, and so on). > So, in fact, I'm still not convinced. Could you detail a bit more how > you plan to use it? Well, Now, I can't get enough information from pg_stat_activity. So, I have to check huge logs or reproduce the same problem. (They are annoying works.) If I can check last and more queries, I can use it as a hint for narrowing down problem points with app's log and so on. # And search the point and fix (or suggesting action) it. I hope it would be able to narrowing down problem points more easily. Best regards, -- Tatsuhito Kasahara kasahara.tatsuhito@oss.ntt.co.jp
On Mar 27, 2009, at 2:36 AM, Simon Riggs wrote: > Not really. I want to understand the actual problem with > idle-in-transaction so we can consider all ways to solve it, rather > than > just focus on one method. I have to distinct problems with idle in transaction. One is reporting users / the tools they're using. I'll often find transactions that have been open for minutes or hours. But, that's not a big deal for me, because that's only impacting londiste slaves, and I have no problem just killing those backends. What does concern me is seeing idle in transaction from our web servers that lasts anything more than a few fractions of a second. Those cases worry me because I have to wonder if that's happening due to bad code. Right now I can't think of any way to figure out if that's the case other than a lot of complex logfile processing (assuming that would even work). But if I knew what the previous query was, I'd at least have half a chance to know what portion of the code was responsible, and could then look at the code to see if the idle state was expected or not. -- Decibel!, aka Jim C. Nasby, Database Architect decibel@decibel.org Give your computer some brain candy! www.distributed.net Team #1828
After taking look at our monitoring system i think some hint about previous SQL might be useful.
dba db70 db_name WARNING 1 long transactions, duration > 2690min user=postgres pid=7887 waiting=False query=<IDLE> in transaction
Currently i have no idea what exactly did i kill without digging in logs which might have rotated anyway by now.
regards,
Asko
dba db70 db_name WARNING 1 long transactions, duration > 2690min user=postgres pid=7887 waiting=False query=<IDLE> in transaction
Currently i have no idea what exactly did i kill without digging in logs which might have rotated anyway by now.
regards,
Asko
On Tue, May 12, 2009 at 6:37 PM, decibel <decibel@decibel.org> wrote:
On Mar 27, 2009, at 2:36 AM, Simon Riggs wrote:I have to distinct problems with idle in transaction. One is reporting users / the tools they're using. I'll often find transactions that have been open for minutes or hours. But, that's not a big deal for me, because that's only impacting londiste slaves, and I have no problem just killing those backends.Not really. I want to understand the actual problem with
idle-in-transaction so we can consider all ways to solve it, rather than
just focus on one method.
What does concern me is seeing idle in transaction from our web servers that lasts anything more than a few fractions of a second. Those cases worry me because I have to wonder if that's happening due to bad code. Right now I can't think of any way to figure out if that's the case other than a lot of complex logfile processing (assuming that would even work). But if I knew what the previous query was, I'd at least have half a chance to know what portion of the code was responsible, and could then look at the code to see if the idle state was expected or not.
--
Decibel!, aka Jim C. Nasby, Database Architect decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Should this patch be on the commitfest page for 8.5? Or is there a consensus already that it's a bad idea? Personally I actually think this makes a lot of sense to do. -- greg
Greg Stark <stark@enterprisedb.com> wrote: > Should this patch be on the commitfest page for 8.5? Or is there a > consensus already that it's a bad idea? > > Personally I actually think this makes a lot of sense to do. +1 It at least gives one a reasonable chance to get a useful clue.... -Kevin
Kevin Grittner wrote: > Greg Stark <stark@enterprisedb.com> wrote: >> Should this patch be on the commitfest page for 8.5? Or is there a >> consensus already that it's a bad idea? >> >> Personally I actually think this makes a lot of sense to do. > > +1 > > It at least gives one a reasonable chance to get a useful clue.... +1 # And I'm going to register this patch on the next commitfest page. -- Tatsuhito Kasahara kasahara.tatsuhito@oss.ntt.co.jp
On Thu, Jun 4, 2009 at 9:54 PM, Tatsuhito Kasahara<kasahara.tatsuhito@oss.ntt.co.jp> wrote: > Kevin Grittner wrote: >> >> Greg Stark <stark@enterprisedb.com> wrote: >>> >>> Should this patch be on the commitfest page for 8.5? Or is there a >>> consensus already that it's a bad idea? >>> >>> Personally I actually think this makes a lot of sense to do. >> >> +1 >> It at least gives one a reasonable chance to get a useful clue.... > > +1 > # And I'm going to register this patch on the next commitfest page. I don't actually remember seeing actual code to implement this - is this a patch, or just an idea? (I guess if this gets added to the CommitFest page there will be a pointer to the code, but somehow I can't find it at the moment.) The only thing I don't like about this is that I think it's kind of a hack to shove the <IDLE> in transaction designation and the query string into the same database column. I've never liked having to write: select sum(1) from pg_stat_activity where current_query = '<IDLE> in transaction'; ...and I like it even less if I now have to modify that query to use "like". We should find some way to represent this as structured data... maybe make a separate column called "idle" that's a boolean, or something, and let the query column contain the most recent query (whether or not it's still executing). ...Robert
On Thu, Jun 04, 2009 at 10:22:41PM -0400, Robert Haas wrote: > The only thing I don't like about this is that I think it's kind of a > hack to shove the <IDLE> in transaction designation and the query > string into the same database column. I've never liked having to > write: > > select sum(1) from pg_stat_activity where current_query = '<IDLE> in > transaction'; > > ...and I like it even less if I now have to modify that query to use > "like". We should find some way to represent this as structured > data... maybe make a separate column called "idle" that's a boolean, > or something, and let the query column contain the most recent query > (whether or not it's still executing). I like this idea a lot. Possibly it would be useful to have the end time of the last query too, then one could find idle sessions that were old and truly idle rather than just waiting for a busy client to send the next query. select ... from pg_stat_activity where idle and last_statement_endtime < now() - interval '1 minute'; -dg -- David Gould daveg@sonic.net 510 536 1443 510 282 0869 If simplicity worked, the world would be overrun with insects.
daveg <daveg@sonic.net> wrote: > On Thu, Jun 04, 2009 at 10:22:41PM -0400, Robert Haas wrote: >> maybe make a separate column called "idle" that's a boolean, >> or something, and let the query column contain the most recent >> query (whether or not it's still executing). +1 > I like this idea a lot. Possibly it would be useful to have the end > time of the last query too, then one could find idle sessions that > were old and truly idle rather than just waiting for a busy client > to send the next query. > > select ... from pg_stat_activity > where idle > and last_statement_endtime < now() - interval '1 minute'; +1 Of course, you might be more interested in those which are idle in a transaction, but that's easily done with these changes -- just throw in xact_start IS NULL. -Kevin
On Fri, Jul 24, 2009 at 10:47 AM, Kevin Grittner<Kevin.Grittner@wicourts.gov> wrote: > daveg <daveg@sonic.net> wrote: >> On Thu, Jun 04, 2009 at 10:22:41PM -0400, Robert Haas wrote: > >>> maybe make a separate column called "idle" that's a boolean, >>> or something, and let the query column contain the most recent >>> query (whether or not it's still executing). > > +1 > >> I like this idea a lot. Possibly it would be useful to have the end >> time of the last query too, then one could find idle sessions that >> were old and truly idle rather than just waiting for a busy client >> to send the next query. >> >> select ... from pg_stat_activity >> where idle >> and last_statement_endtime < now() - interval '1 minute'; > > +1 Hmm, I don't think we'd need two columns for this, actually. You could just have one column last_statement_endtime (not sure if it's the best name, but something along those lines) which would be NULL if the statement was still in progress and the appropriate timestamp if not. You could infer idle from whether or not that column was NULL. > Of course, you might be more interested in those which are idle in a > transaction, but that's easily done with these changes -- just throw > in xact_start IS NULL. Surely if xact_start is NULL it is not in a transaction at all? ...Robert
Robert Haas <robertmhaas@gmail.com> wrote: > Hmm, I don't think we'd need two columns for this, actually. You > could just have one column last_statement_endtime (not sure if it's > the best name, but something along those lines) which would be NULL > if the statement was still in progress and the appropriate timestamp > if not. You could infer idle from whether or not that column was > NULL. That would lose the ability to tell what the idle time was before the latest statement began, but maybe that's not interesting enough to justify another column.... >> Of course, you might be more interested in those which are idle in >> a transaction, but that's easily done with these changes -- just >> throw in xact_start IS NULL. > > Surely if xact_start is NULL it is not in a transaction at all? That's exactly the point I was trying to make. Sorry if that appeared to be saying anything else. -Kevin
Robert Haas <robertmhaas@gmail.com> writes: > Hmm, I don't think we'd need two columns for this, actually. You > could just have one column last_statement_endtime (not sure if it's > the best name, but something along those lines) which would be NULL if > the statement was still in progress and the appropriate timestamp if > not. You could infer idle from whether or not that column was NULL. Yeah, but "where idle" or "where not idle" is a lot easier to type. I think the extra column is justified on usability grounds. I'm also not entirely convinced that we want last_statement_endtime, because introducing that will cost us an extra kernel call per query in a lot of scenarios. And gettimeofday() is not cheap everywhere. Another question is that this proposal effectively redefines the current_query column as not the "current" query, but something that might be better be described as "latest_query". Should we change the name? We'd probably break some client code if we did, but on the other hand the semantics change might break such code anyway. Intentional breakage might not be such a bad thing if it forces people to take a fresh look at their code. regards, tom lane
On Fri, Jul 24, 2009 at 11:15 AM, Tom Lane<tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> Hmm, I don't think we'd need two columns for this, actually. You >> could just have one column last_statement_endtime (not sure if it's >> the best name, but something along those lines) which would be NULL if >> the statement was still in progress and the appropriate timestamp if >> not. You could infer idle from whether or not that column was NULL. > > Yeah, but "where idle" or "where not idle" is a lot easier to type. > I think the extra column is justified on usability grounds. I'm also > not entirely convinced that we want last_statement_endtime, because > introducing that will cost us an extra kernel call per query in a lot of > scenarios. And gettimeofday() is not cheap everywhere. I hate redundancy, but I don't care enough to argue about it. > Another question is that this proposal effectively redefines the > current_query column as not the "current" query, but something that > might be better be described as "latest_query". Should we change the > name? We'd probably break some client code if we did, but on the other > hand the semantics change might break such code anyway. Intentional > breakage might not be such a bad thing if it forces people to take a > fresh look at their code. +1 for intentional breakage. I like the name, too. ...Robert
On Friday 24 July 2009 18:15:00 Tom Lane wrote: > Another question is that this proposal effectively redefines the > current_query column as not the "current" query, but something that > might be better be described as "latest_query". Should we change the > name? We'd probably break some client code if we did, but on the other > hand the semantics change might break such code anyway. Intentional > breakage might not be such a bad thing if it forces people to take a > fresh look at their code. That breakage could be pretty widespread, though. Maybe have current_query and last_query.