Thread: transaction idle timeout in 7.4.5 and 8.0.0beta2
Hi all, I was implement the "transaction idle timeout" function in PostgreSQL (version 7.4.5 and 8.0.0beta2) test=# SET trans_idle_timeout=10; SET test=# BEGIN; BEGIN (ps) ... postgres: sygma test [local] idle in transaction (3) postgres: sygma test [local] idle in transaction (2) postgres: sygma test [local] idle in transaction (1) postgres: sygma test [local] idle in transaction (aborted) test=# INSERT INTO test (a,b) VALUES (1,'a'); ERROR: current transaction is aborted, commands ignored until end of transaction block The code is very simple and stable. Changed files: include/storage/proc.h backend/libpq/pqcomm.c backend/libpq/be-secure.c backend/utils/misc/guc.c TODO: SSL connection, manual, psql help & tab-competition Do you interest it? Regards, Sygma
On Fri, Sep 17, 2004 at 08:47:02AM +0200, Szima G?bor wrote: > I was implement the "transaction idle timeout" function in PostgreSQL > (version 7.4.5 and 8.0.0beta2) It sounds interesting to me (for use in libpqxx, the C++ API), but perhaps for a slightly unusual reason. When a connection to the backend is lost just as you're waiting for the result of a COMMIT, you can't be sure if the transaction was rolled back or not. If I could know (not influence, just "know") when a transaction times out, then I could wait for this amount of time, reconnect to the backend, and check for some record left in a special table by the transaction. If it's not there, I'll know "for sure" (insofar as anything can ever be sure) that the transaction was not committed. This is still guesswork in the current situation. There's one assumption, however, and that is that the transaction will time out even if (part of) the timeout time was spent processing rather than waiting for a command. I'm not sure how many people would be willing to pay that price. Perhaps a connection timeout would be more appropriate for this sort of thing. Jeroen
"Jeroen T. Vermeulen" <jtv@xs4all.nl> writes: > It sounds interesting to me (for use in libpqxx, the C++ API), but perhaps > for a slightly unusual reason. When a connection to the backend is lost > just as you're waiting for the result of a COMMIT, you can't be sure if the > transaction was rolled back or not. > If I could know (not influence, just "know") when a transaction times out, > then I could wait for this amount of time, reconnect to the backend, and > check for some record left in a special table by the transaction. If it's > not there, I'll know "for sure" (insofar as anything can ever be sure) that > the transaction was not committed. This is still guesswork in the current > situation. I don't see any reason for guesswork. Remember the PID of the backend you were connected to. On reconnect, look in pg_stat_activity to see if that backend is still alive; if so, sleep till it's not. Then check to see if your transaction committed or not. No need for anything so dangerous as a timeout. regards, tom lane
On Sat, Sep 18, 2004 at 12:43:05PM -0400, Tom Lane wrote: > I don't see any reason for guesswork. Remember the PID of the backend > you were connected to. On reconnect, look in pg_stat_activity to see if > that backend is still alive; if so, sleep till it's not. Then check to > see if your transaction committed or not. No need for anything so > dangerous as a timeout. I didn't know you could do that! I'll look into it. Thanks. Jeroen
On Sat, Sep 18, 2004 at 12:43:05PM -0400, Tom Lane wrote: > I don't see any reason for guesswork. Remember the PID of the backend > you were connected to. On reconnect, look in pg_stat_activity to see if > that backend is still alive; if so, sleep till it's not. Then check to > see if your transaction committed or not. No need for anything so > dangerous as a timeout. Looks like that only works if stats_command_string is set (and backend version is at least 7.2), correct? I couldn't find this table in the online documentation, but can I assume that the query will have finished executing (whether for good or for bad) when its current_query is either empty or null? Jeroen
"Jeroen T. Vermeulen" <jtv@xs4all.nl> writes: > On Sat, Sep 18, 2004 at 12:43:05PM -0400, Tom Lane wrote: >> I don't see any reason for guesswork. Remember the PID of the backend >> you were connected to. On reconnect, look in pg_stat_activity to see if >> that backend is still alive; if so, sleep till it's not. > Looks like that only works if stats_command_string is set (and backend > version is at least 7.2), correct? No, stats_command_string need not be set, only stats_start_collector. > I couldn't find this table in the > online documentation, http://www.postgresql.org/docs/7.4/static/monitoring-stats.html#MONITORING-STATS-VIEWS-TABLE > but can I assume that the query will have finished > executing (whether for good or for bad) when its current_query is either > empty or null? No, because of the reporting delay. I would recommend waiting for the backend's row in pg_stat_activity to disappear entirely. Under normal circumstances that should occur quickly. If there's a communications problem, it might take awhile to detect connection loss ... but if there's a communications problem, you probably aren't going to be able to open a new connection, either. regards, tom lane
On Sat, Sep 18, 2004 at 02:32:32PM -0400, Tom Lane wrote: > No, because of the reporting delay. I would recommend waiting for the > backend's row in pg_stat_activity to disappear entirely. Under normal > circumstances that should occur quickly. If there's a communications > problem, it might take awhile to detect connection loss ... but if > there's a communications problem, you probably aren't going to be able > to open a new connection, either. Unfortunately, a communications problem is exactly the kind of scenario we were talking about in the first place! Might be a misguided firewall, for instance. (In which case we'd want the TCP connection to time out quickly on the server as well to avoid piling up dead backends, but that's another matter). BTW is there really no other way to see if a query (for the same user who's looking) is still executing? It'd be really helpful if the check could be done from the same client process that lost the connection in the first place. Jeroen
On Sat, Sep 18, 2004 at 02:32:32PM -0400, Tom Lane wrote: > No, stats_command_string need not be set, only stats_start_collector. BTW, I've got this set (I'm even running as "postgres") but still I get the "<command string not enabled>" message instead of current_query. :( Jeroen
"Jeroen T. Vermeulen" <jtv@xs4all.nl> writes: > On Sat, Sep 18, 2004 at 02:32:32PM -0400, Tom Lane wrote: >> No, stats_command_string need not be set, only stats_start_collector. > BTW, I've got this set (I'm even running as "postgres") but still I get > the "<command string not enabled>" message instead of current_query. :( It has to be set in the backend you want the info from ... regards, tom lane
On Sat, Sep 18, 2004 at 03:41:24PM -0400, Tom Lane wrote: > >> No, stats_command_string need not be set, only stats_start_collector. > > > BTW, I've got this set (I'm even running as "postgres") but still I get > > the "<command string not enabled>" message instead of current_query. :( > > It has to be set in the backend you want the info from ... I set it in /etc/postgresql/postgresql.conf; then I restarted the postmaster just to be sure, and checked from psql. Same message, even about the same session: jtv@bulletproof:~$ sudo su postgres -c 'postgresql-8.0/bin/psql template1' Password: Welcome to psql 8.0.0beta1, the PostgreSQL interactive terminal. Type: \copyright for distribution terms \h for help with SQL commands \? for help with psql commands \g orterminate with semicolon to execute query \q to quit template1=# select * from pg_stat_activity;datid | datname | procpid | usesysid | usename | current_query | query_start -------+-----------+---------+----------+----------+------------------------------+------------- 1 | template1 | 11559| 1 | postgres | <command string not enabled> | (1 row) template1=# show stats_start_collector;stats_start_collector -----------------------on (1 row) Jeroen
"Jeroen T. Vermeulen" <jtv@xs4all.nl> writes: > On Sat, Sep 18, 2004 at 02:32:32PM -0400, Tom Lane wrote: >> If there's a communications >> problem, it might take awhile to detect connection loss ... but if >> there's a communications problem, you probably aren't going to be able >> to open a new connection, either. > Unfortunately, a communications problem is exactly the kind of scenario > we were talking about in the first place! Might be a misguided firewall, > for instance. (In which case we'd want the TCP connection to time out > quickly on the server as well to avoid piling up dead backends, but that's > another matter). Well, I think it would time out quickly --- anyway on the order of minutes not hours. By hypothesis, the situation you're worried about is where the backend was unable to send you a COMMIT acknowledgement message. The kernel is going to realize that it didn't get an ACK back, and is going to retry a few times, and is then going to declare the connection lost. The case where you may have a very long delay before detection of connection loss is where the backend is sitting idle with nothing to send. regards, tom lane
"Jeroen T. Vermeulen" <jtv@xs4all.nl> writes: > On Sat, Sep 18, 2004 at 03:41:24PM -0400, Tom Lane wrote: >> No, stats_command_string need not be set, only stats_start_collector. >> > BTW, I've got this set (I'm even running as "postgres") but still I get > the "<command string not enabled>" message instead of current_query. :( Oh, we're talking at cross-purposes. I was saying that you only need stats_start_collector in order to have a row in pg_stat_activity indicating that the backend is still there. If you're going to insist on examining the query column rather than just the PID, then yes you need stats_command_string. regards, tom lane
On Sat, Sep 18, 2004 at 04:05:26PM -0400, Tom Lane wrote: > Well, I think it would time out quickly --- anyway on the order of > minutes not hours. By hypothesis, the situation you're worried about is > where the backend was unable to send you a COMMIT acknowledgement > message. The kernel is going to realize that it didn't get an ACK back, > and is going to retry a few times, and is then going to declare the > connection lost. The case where you may have a very long delay before > detection of connection loss is where the backend is sitting idle with > nothing to send. That's one load off my mind, thanks. Jeroen
Interesting. So it times out a transaction to release its locks. Does it time from the start of the transaction? I assume so. We do have statment_timeout so maybe transaction_timeout would be the proper name. Timing out on just the idle state seems strange to me. --------------------------------------------------------------------------- Szima G�bor wrote: > > Hi all, > > I was implement the "transaction idle timeout" function in PostgreSQL > (version 7.4.5 and 8.0.0beta2) > > test=# SET trans_idle_timeout=10; > SET > test=# BEGIN; > BEGIN > > (ps) > ... > postgres: sygma test [local] idle in transaction (3) > postgres: sygma test [local] idle in transaction (2) > postgres: sygma test [local] idle in transaction (1) > postgres: sygma test [local] idle in transaction (aborted) > > test=# INSERT INTO test (a,b) VALUES (1,'a'); > ERROR: current transaction is aborted, commands ignored until end of transaction block > > > The code is very simple and stable. > > Changed files: > > include/storage/proc.h > backend/libpq/pqcomm.c > backend/libpq/be-secure.c > backend/utils/misc/guc.c > > TODO: SSL connection, manual, psql help & tab-competition > > Do you interest it? > > > Regards, Sygma > > ---------------------------(end of broadcast)--------------------------- > TIP 7: don't forget to increase your free space map settings > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Bruce Momjian wrote: > Interesting. So it times out a transaction to release its locks. Does > it time from the start of the transaction? I assume so. We do have > statment_timeout so maybe transaction_timeout would be the proper name. > Timing out on just the idle state seems strange to me. Timing out on the start of the transaction isn't a good idea; what about long running transactions? You'd have to set the timeout very high (lets say to have nightly imports) which would make the "idle in transaction and not responding any more" errors detected very late in online situations. So the timeout should start after each statement's completion. Regards, Andreas