On Tue, Jan 28, 2014 at 10:51 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> KONDO Mitsumasa <kondo.mitsumasa@lab.ntt.co.jp> writes:
>> By the way, latest pg_stat_statement might affect performance in Windows system.
>> Because it uses fflush() system call every creating new entry in
>> pg_stat_statements, and it calls many fread() to warm file cache.
>
> This statement doesn't seem to have much to do with the patch as
> committed.
You could make a strong case for the patch improving performance in
practice for many users, by allowing us to increase the
pg_stat_statements.max default value to 5,000, 5 times the previous
value. The real expense of creating a new entry is the exclusive
locking of the hash table, which blocks *everything* in
pg_stat_statements. That isn't expanded at all, since queries are only
written with a shared lock, which only blocks the creation of new
entries which was already relatively expensive. In particular, it does
not block the maintenance of costs for all already observed entries in
the hash table. It's obvious that simply reducing the pressure on the
cache will improve matters a lot, which for many users the external
texts patch does.
Since Mitsumasa has compared that patch and at least one of his
proposed pg_stat_statements patches on several occasions, I will
re-iterate the difference: any increased overhead from the external
texts patch is paid only when a query is first entered into the hash
table. Then, there is obviously and self-evidently no additional
overhead from contention for any future execution of the same query,
no matter what, indefinitely (the exclusive locking section of
creating a new entry does not do I/O, except in fantastically unlikely
circumstances). So for many of the busy production systems I work
with, that's the difference between a cost paid perhaps tens of
thousands of times a second, and a cost paid every few days or weeks.
Even if he is right about a write() taking an unreasonably large
amount of time on Windows, the consequences felt as pg_stat_statements
LWLock contention would be very limited.
I am not opposed in principle to adding new things to the counters
struct in pg_stat_statements. I just think that the fact that the
overhead of installing the module on a busy production system is
currently so low is of *major* value, and therefore any person that
proposes to expand that struct should be required to very conclusively
demonstrate that there is no appreciably increase in overhead. Having
a standard deviation column would be nice, but it's still not that
important. Maybe when we have portable atomic addition we can afford
to be much more inclusive of that kind of thing.
--
Peter Geoghegan