Thread: Getting rid of extra gettimeofday() calls

Getting rid of extra gettimeofday() calls

From
Tom Lane
Date:
As of CVS tip, PG does up to four separate gettimeofday() calls upon the
arrival of a new client command.  This is because the statement_timestamp,
stats_command_string, log_duration, and statement_timeout features each
independently save an indication of statement start time.  Given what
we've found out recently about gettimeofday() being unduly expensive on
some hardware, this cries out to get fixed.  I propose that we do
SetCurrentStatementStartTimestamp() immediately upon receiving a client
message, and then make the other features copy that value instead of
fetching their own.

Another gettimeofday() call that I would like to get rid of is the one
currently done at the end of statement when stats_command_string is
enabled: we record current time when resetting the activity_string to
<IDLE>.  Would anyone be terribly upset if this used statement_timestamp
instead?  The effect would be that for an idle backend,
pg_stat_activity.query_start would reflect the start time of its latest
query instead of the time at which it finished the query.  I can see
some use for the current behavior but I don't really think it's worth
the overhead of a gettimeofday() call.

Preliminary tests say that after the shared-memory change I committed
yesterday, the overhead of stats_command_string consists *entirely*
of the two added gettimeofday() calls.  If we get rid of both, the
difference between having stats_command_string on and off is barely
measurable (using Bruce's test case of 10000 "SELECT 1;" statements).
        regards, tom lane


Re: Getting rid of extra gettimeofday() calls

From
"Jim C. Nasby"
Date:
On Mon, Jun 19, 2006 at 11:17:48AM -0400, Tom Lane wrote:
> instead?  The effect would be that for an idle backend,
> pg_stat_activity.query_start would reflect the start time of its latest
> query instead of the time at which it finished the query.  I can see
> some use for the current behavior but I don't really think it's worth
> the overhead of a gettimeofday() call.

Perhaps make it a compile-time option... I suspect that there's people
making use of that info in their monitoring tools. Though, those people
are probably also likely to have log_duration=true, so maybe the same
trick of gettimeofday() once at statement end and copying it as needed
would work.
-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461


Re: Getting rid of extra gettimeofday() calls

From
Simon Riggs
Date:
On Mon, 2006-06-19 at 11:17 -0400, Tom Lane wrote:
> As of CVS tip, PG does up to four separate gettimeofday() calls upon the
> arrival of a new client command.  This is because the statement_timestamp,
> stats_command_string, log_duration, and statement_timeout features each
> independently save an indication of statement start time.  Given what
> we've found out recently about gettimeofday() being unduly expensive on
> some hardware, this cries out to get fixed.  I propose that we do
> SetCurrentStatementStartTimestamp() immediately upon receiving a client
> message, and then make the other features copy that value instead of
> fetching their own.

Yes. Well spotted. That should make each timed aspect more accurate also
since its the same value.

Presumably you don't mean *every* client message, just stmt start ones. 

Can we set that in GetTransactionSnapshot() - that way a serializable
transaction won't need to update the time after each statement. We can
then record this as the SetCurrentSnapshotStartTimestamp().

> Another gettimeofday() call that I would like to get rid of is the one
> currently done at the end of statement when stats_command_string is
> enabled: we record current time when resetting the activity_string to
> <IDLE>.  Would anyone be terribly upset if this used statement_timestamp
> instead?  The effect would be that for an idle backend,
> pg_stat_activity.query_start would reflect the start time of its latest
> query instead of the time at which it finished the query.  I can see
> some use for the current behavior but I don't really think it's worth
> the overhead of a gettimeofday() call.

Presumably we have to do at least one at the end when doing statement
logging?

I notice there is also one in elog.c for when we have %t set. We
probably don't need to do both when statement logging.

> Preliminary tests say that after the shared-memory change I committed
> yesterday, the overhead of stats_command_string consists *entirely*
> of the two added gettimeofday() calls. 
--  Simon Riggs EnterpriseDB          http://www.enterprisedb.com



Re: Getting rid of extra gettimeofday() calls

From
Tom Lane
Date:
Simon Riggs <simon@2ndquadrant.com> writes:
> Presumably you don't mean *every* client message, just stmt start ones. 

At the moment I've got it setting the statement_timestamp on receipt of
any message that could lead to execution of user-defined code; that
includes Query, Parse, Bind, Execute, FunctionCall.  Possibly we could
dispense with the Bind one but I'm unconvinced.

> Can we set that in GetTransactionSnapshot() - that way a serializable
> transaction won't need to update the time after each statement.

No, that's much too late, unless you want to do major rearrangement of
the times at which reporting actions occur.  Furthermore the entire
point of statement_timestamp is that it advances for new commands within
the same xact, so your proposal amounts to removing statement_timestamp
entirely.

The actual behavior of CVS tip is that transaction_timestamp copies from
statement_timestamp, not vice versa; that seems fine to me.

> Presumably we have to do at least one at the end when doing statement
> logging?

Only if you've got log_duration on.  Per Jim's suggestion, maybe we
could have the <IDLE> activity report advance activity_timestamp only
if log_duration is true, ie, only if it's free to do so.

> I notice there is also one in elog.c for when we have %t set. We
> probably don't need to do both when statement logging.

I'm inclined to think that that one is worth its keep.  Sometimes you
really wanna know exactly when a log message was emitted ...
        regards, tom lane


Re: Getting rid of extra gettimeofday() calls

From
Hannu Krosing
Date:
Ühel kenal päeval, E, 2006-06-19 kell 11:17, kirjutas Tom Lane:
> As of CVS tip, PG does up to four separate gettimeofday() calls upon the
> arrival of a new client command.  This is because the statement_timestamp,
> stats_command_string, log_duration, and statement_timeout features each
> independently save an indication of statement start time.  Given what
> we've found out recently about gettimeofday() being unduly expensive on
> some hardware, this cries out to get fixed.  I propose that we do
> SetCurrentStatementStartTimestamp() immediately upon receiving a client
> message, and then make the other features copy that value instead of
> fetching their own.
> 
> Another gettimeofday() call that I would like to get rid of is the one
> currently done at the end of statement when stats_command_string is
> enabled: we record current time when resetting the activity_string to
> <IDLE>. 

Is it just <IDLE> or also <IDLE in transaction> ?

If we are going to change things anyway, I'd like the latter to show the
time since start of transaction, so that I Would at least have an easy
way to write a transaction timeout script :)


I don't really care about what plain <IDLE> uses.

-- 
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me:  callto:hkrosing
Get Skype for free:  http://www.skype.com