Re: Add min and max execute statement time in pg_stat_statement - Mailing list pgsql-hackers
From | KONDO Mitsumasa |
---|---|
Subject | Re: Add min and max execute statement time in pg_stat_statement |
Date | |
Msg-id | 52E9D98A.4000007@lab.ntt.co.jp Whole thread Raw |
In response to | Re: Add min and max execute statement time in pg_stat_statement (Peter Geoghegan <pg@heroku.com>) |
Responses |
Re: Add min and max execute statement time in pg_stat_statement
|
List | pgsql-hackers |
(2014/01/29 16:58), Peter Geoghegan wrote: > On Tue, Jan 28, 2014 at 10:51 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> KONDO Mitsumasa <kondo.mitsumasa@lab.ntt.co.jp> writes: >>> By the way, latest pg_stat_statement might affect performance in Windows system. >>> Because it uses fflush() system call every creating new entry in >>> pg_stat_statements, and it calls many fread() to warm file cache. >> >> This statement doesn't seem to have much to do with the patch as >> committed. > > You could make a strong case for the patch improving performance in > practice for many users, by allowing us to increase the > pg_stat_statements.max default value to 5,000, 5 times the previous > value. The real expense of creating a new entry is the exclusive > locking of the hash table, which blocks *everything* in > pg_stat_statements. That isn't expanded at all, since queries are only > written with a shared lock, which only blocks the creation of new > entries which was already relatively expensive. In particular, it does > not block the maintenance of costs for all already observed entries in > the hash table. It's obvious that simply reducing the pressure on the > cache will improve matters a lot, which for many users the external > texts patch does. > > Since Mitsumasa has compared that patch and at least one of his > proposed pg_stat_statements patches on several occasions, I will > re-iterate the difference: any increased overhead from the external > texts patch is paid only when a query is first entered into the hash > table. Then, there is obviously and self-evidently no additional > overhead from contention for any future execution of the same query, > no matter what, indefinitely (the exclusive locking section of > creating a new entry does not do I/O, except in fantastically unlikely > circumstances). So for many of the busy production systems I work > with, that's the difference between a cost paid perhaps tens of > thousands of times a second, and a cost paid every few days or weeks. > Even if he is right about a write() taking an unreasonably large > amount of time on Windows, the consequences felt as pg_stat_statements > LWLock contention would be very limited. > > I am not opposed in principle to adding new things to the counters > struct in pg_stat_statements. I just think that the fact that the > overhead of installing the module on a busy production system is > currently so low is of *major* value, and therefore any person that > proposes to expand that struct should be required to very conclusively > demonstrate that there is no appreciably increase in overhead. Having > a standard deviation column would be nice, but it's still not that > important. Maybe when we have portable atomic addition we can afford > to be much more inclusive of that kind of thing. I'd like to know the truth and the fact in your patch, rather than your argument:-) So I create detail pg_stat_statements benchmark tool using pgbench. This tool can create 10000 pattern unique sqls in a file, and it is only for measuring pg_stat_statements performance. Because it only updates pg_stat_statements data and doesn't write to disk at all. It's fair benchmark. [usage] perl create_sql.pl >file.sql psql -h -h xxx.xxx.xxx.xxx mitsu-ko -c 'SELECT pg_stat_statements_reset()' pgbench -h xxx.xxx.xxx.xxx mitsu-ko -c64 -j32 -n -T 180 -f file.sql [part of sample sqls file, they are executed in pgbench] SELECT 1-1/9+5*8*6+5+9-6-3-4/9%7-2%7/5-9/7+2+9/7-1%5%9/3-4/4-9/1+5+5/6/5%2+1*2*3-8/8+5-3-8-4/8+5%2*2-2-5%6+4 SELECT 1%9%7-8/5%6-1%2*2-7+9*6-2*6-9+1-2*9+6+7*8-4-2*1%3/7-1%4%9+4+7/5+4/2-3-5%8/3/7*6-1%8*6*1/7/2%5*6/2-3-9 SELECT 1%3*2/8%6%5-3%1+1/6*6*5/9-2*5+6/6*5-1/2-6%4%8/7%2*7%5%9%5/9%1%1-3-9/2*1+1*6%8/2*4/1+6*7*1+5%6+9*1-9*6 ... I test some methods that include old pgss, old pgss with my patch, and new pgss. Test server and postgresql.conf are same in I tested last day in this ML-thread. And test methods and test results are here, [methods] method 1: with old pgss, pg_stat_statements.max=1000(default) method 2: with old pgss with my patch, pg_stat_statements.max=1000(default) peter 1 : with new pgss(Peter's patch), pg_stat_statements.max=5000(default) peter 2 : with new pgss(Peter's patch), pg_stat_statements.max=1000 [for reference] method 5:with old pgss, pg_stat_statements.max=5000 method 6:with old pgss with my patch, pg_stat_statements.max=5000 [results] method | try1 | try2 | try3 | degrade performance ratio ----------------------------------------------------------------- method 1 | 6.546 | 6.558 | 6.638 | 0% (reference score) method 2 | 6.527 | 6.556 | 6.574 | 1% peter 1 | 5.204 | 5.203 | 5.216 |20% peter 2 | 4.241 | 4.236 | 4.262 |35% method 5 | 5.931 | 5.848 | 5.872 |11% method 6 | 5.794 | 5.792 | 5.776 |12% New pgss is extremely degrade performance than old pgss, and I cannot see performance scaling. I understand that his argument was "My patch reduces time of LWLock in pg_stat_statements, it is good for performance. However, Kondo's patch will be degrade performance in pg_stat_statements. so I give it -1.". But unless seeing this test results, his claim seems to contradict it even in Linux OS. I think that his patch's strong point is only storing long query, at the sacrifice of significant performance. Does it hope in community? At least, I don't hope it. Regards, -- Mitsumasa KONDO NTT Open Source Software Center
Attachment
pgsql-hackers by date: