Thread: Getting rid of extra gettimeofday() calls
As of CVS tip, PG does up to four separate gettimeofday() calls upon the arrival of a new client command. This is because the statement_timestamp, stats_command_string, log_duration, and statement_timeout features each independently save an indication of statement start time. Given what we've found out recently about gettimeofday() being unduly expensive on some hardware, this cries out to get fixed. I propose that we do SetCurrentStatementStartTimestamp() immediately upon receiving a client message, and then make the other features copy that value instead of fetching their own. Another gettimeofday() call that I would like to get rid of is the one currently done at the end of statement when stats_command_string is enabled: we record current time when resetting the activity_string to <IDLE>. Would anyone be terribly upset if this used statement_timestamp instead? The effect would be that for an idle backend, pg_stat_activity.query_start would reflect the start time of its latest query instead of the time at which it finished the query. I can see some use for the current behavior but I don't really think it's worth the overhead of a gettimeofday() call. Preliminary tests say that after the shared-memory change I committed yesterday, the overhead of stats_command_string consists *entirely* of the two added gettimeofday() calls. If we get rid of both, the difference between having stats_command_string on and off is barely measurable (using Bruce's test case of 10000 "SELECT 1;" statements). regards, tom lane
On Mon, Jun 19, 2006 at 11:17:48AM -0400, Tom Lane wrote: > instead? The effect would be that for an idle backend, > pg_stat_activity.query_start would reflect the start time of its latest > query instead of the time at which it finished the query. I can see > some use for the current behavior but I don't really think it's worth > the overhead of a gettimeofday() call. Perhaps make it a compile-time option... I suspect that there's people making use of that info in their monitoring tools. Though, those people are probably also likely to have log_duration=true, so maybe the same trick of gettimeofday() once at statement end and copying it as needed would work. -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
On Mon, 2006-06-19 at 11:17 -0400, Tom Lane wrote: > As of CVS tip, PG does up to four separate gettimeofday() calls upon the > arrival of a new client command. This is because the statement_timestamp, > stats_command_string, log_duration, and statement_timeout features each > independently save an indication of statement start time. Given what > we've found out recently about gettimeofday() being unduly expensive on > some hardware, this cries out to get fixed. I propose that we do > SetCurrentStatementStartTimestamp() immediately upon receiving a client > message, and then make the other features copy that value instead of > fetching their own. Yes. Well spotted. That should make each timed aspect more accurate also since its the same value. Presumably you don't mean *every* client message, just stmt start ones. Can we set that in GetTransactionSnapshot() - that way a serializable transaction won't need to update the time after each statement. We can then record this as the SetCurrentSnapshotStartTimestamp(). > Another gettimeofday() call that I would like to get rid of is the one > currently done at the end of statement when stats_command_string is > enabled: we record current time when resetting the activity_string to > <IDLE>. Would anyone be terribly upset if this used statement_timestamp > instead? The effect would be that for an idle backend, > pg_stat_activity.query_start would reflect the start time of its latest > query instead of the time at which it finished the query. I can see > some use for the current behavior but I don't really think it's worth > the overhead of a gettimeofday() call. Presumably we have to do at least one at the end when doing statement logging? I notice there is also one in elog.c for when we have %t set. We probably don't need to do both when statement logging. > Preliminary tests say that after the shared-memory change I committed > yesterday, the overhead of stats_command_string consists *entirely* > of the two added gettimeofday() calls. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com
Simon Riggs <simon@2ndquadrant.com> writes: > Presumably you don't mean *every* client message, just stmt start ones. At the moment I've got it setting the statement_timestamp on receipt of any message that could lead to execution of user-defined code; that includes Query, Parse, Bind, Execute, FunctionCall. Possibly we could dispense with the Bind one but I'm unconvinced. > Can we set that in GetTransactionSnapshot() - that way a serializable > transaction won't need to update the time after each statement. No, that's much too late, unless you want to do major rearrangement of the times at which reporting actions occur. Furthermore the entire point of statement_timestamp is that it advances for new commands within the same xact, so your proposal amounts to removing statement_timestamp entirely. The actual behavior of CVS tip is that transaction_timestamp copies from statement_timestamp, not vice versa; that seems fine to me. > Presumably we have to do at least one at the end when doing statement > logging? Only if you've got log_duration on. Per Jim's suggestion, maybe we could have the <IDLE> activity report advance activity_timestamp only if log_duration is true, ie, only if it's free to do so. > I notice there is also one in elog.c for when we have %t set. We > probably don't need to do both when statement logging. I'm inclined to think that that one is worth its keep. Sometimes you really wanna know exactly when a log message was emitted ... regards, tom lane
Ühel kenal päeval, E, 2006-06-19 kell 11:17, kirjutas Tom Lane: > As of CVS tip, PG does up to four separate gettimeofday() calls upon the > arrival of a new client command. This is because the statement_timestamp, > stats_command_string, log_duration, and statement_timeout features each > independently save an indication of statement start time. Given what > we've found out recently about gettimeofday() being unduly expensive on > some hardware, this cries out to get fixed. I propose that we do > SetCurrentStatementStartTimestamp() immediately upon receiving a client > message, and then make the other features copy that value instead of > fetching their own. > > Another gettimeofday() call that I would like to get rid of is the one > currently done at the end of statement when stats_command_string is > enabled: we record current time when resetting the activity_string to > <IDLE>. Is it just <IDLE> or also <IDLE in transaction> ? If we are going to change things anyway, I'd like the latter to show the time since start of transaction, so that I Would at least have an easy way to write a transaction timeout script :) I don't really care about what plain <IDLE> uses. -- ---------------- Hannu Krosing Database Architect Skype Technologies OÜ Akadeemia tee 21 F, Tallinn, 12618, Estonia Skype me: callto:hkrosing Get Skype for free: http://www.skype.com