Re: Reducing stats collection overhead - Mailing list pgsql-hackers

From: Alvaro Herrera
Subject: Re: Reducing stats collection overhead
Date:
Msg-id: 20070429160040.GE18593@alvh.no-ip.org
In response to: Reducing stats collection overhead  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses: Re: Reducing stats collection overhead  (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers
Tom Lane wrote:

> The first design that comes to mind is that at transaction end
> (pgstat_report_tabstat() time) we send a stats message only if at least
> X milliseconds have elapsed since we last sent one, where X is
> PGSTAT_STAT_INTERVAL or closely related to it.  We also make sure to
> flush stats out before process exit.  This approach ensures that in a
> lots-of-short-transactions scenario, we only need to send one stats
> message every X msec, not one per query.

If you're going to make it depend on the timestamp set by transaction
start, I'm all for it.

> The cost is possible delay of stats reports.  I claim that any
> transaction that makes a really sizable change in the stats will run
> longer than X msec and therefore will send its stats immediately.

I agree with this, particularly if it means we don't have to add another
gettimeofday() call.
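
To make sure we're talking about the same thing, here is a rough sketch
of the throttling logic as I understand it -- the names and the interval
value are made up, this is not the actual pgstat code:

#include <stdbool.h>
#include <sys/time.h>

#define STATS_REPORT_INTERVAL_MS 500        /* stand-in for PGSTAT_STAT_INTERVAL */

static struct timeval last_report_time;     /* zeroed at backend start */

/* milliseconds elapsed from b to a */
static long
msec_diff(const struct timeval *a, const struct timeval *b)
{
    return (a->tv_sec - b->tv_sec) * 1000L +
           (a->tv_usec - b->tv_usec) / 1000L;
}

/*
 * Called at transaction end, reusing the transaction-start timestamp
 * so no extra gettimeofday() is needed; "force" is for process exit.
 */
static void
report_tabstat_if_due(const struct timeval *xact_start, bool force)
{
    if (!force &&
        msec_diff(xact_start, &last_report_time) < STATS_REPORT_INTERVAL_MS)
        return;                 /* too soon -- keep accumulating locally */

    /* ... send accumulated table stats to the collector here ... */

    last_report_time = *xact_start;
}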


FWIW, am I reading the code wrong, or do we send the xact commit and
rollback counts multiple times in pgstat_report_one_tabstat, with only
the first message carrying non-zero counts?  Maybe we could put these
counters in a separate message to reduce the size of the tabstat
messages themselves.  (Then again, the total impact in bytes may be
minimal, and the overhead of an extra message may be greater.)
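
In any case, what I'm vaguely imagining for the separate message is
something like the struct below -- purely illustrative, the real pgstat
message layout would of course differ:

typedef struct XactTallyMsg
{
    int     m_type;             /* hypothetical message type tag */
    int     m_databaseid;       /* database these counts apply to */
    long    m_xact_commit;      /* commits since the last report */
    long    m_xact_rollback;    /* rollbacks since the last report */
} XactTallyMsg;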

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

