Thread: [GENERAL] Performance degradation when using auto_explain
Hi,
I've found a performance degradation when using auto_explain with the log_analyze option enabled.
It automatically logs the EXPLAIN ANALYZE results of each query.
But when there are many concurrent sessions, performance degrades in proportion to the number of sessions.
These queries are all read-only.
I also found that the time reported for a plan node such as Index Scan grows as the session count increases.
At first I suspected that CPU scheduling contention was causing the degradation.
But there are more CPU cores than sessions.
Eventually I narrowed the cause down to auto_explain itself.
Now I am curious: why does auto_explain cause contention between sessions?
I run Ubuntu Linux. Could this be a problem with gettimeofday()?
Thank you.
Kisung Kim
--
Bitnine Global Inc., Kisung Kim, Ph.D
https://sites.google.com/site/kisungresearch/
E-mail : kskim@bitnine.net
Office phone : 070-4800-5890, 408-606-8602
US Mobile phone : 408-805-2192
Kisung Kim <kskim@bitnine.net> wrote:

> And finally I found that auto_explain is the cause of the problem.

Real hardware or virtual hardware? On virtual hardware there are sometimes
problems with exact timings; please read:

https://www.postgresql.org/docs/current/static/pgtesttiming.html

Regards, Andreas Kretschmer

--
Andreas Kretschmer
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Wed, Jan 4, 2017 at 1:21 AM, Andreas Kretschmer <akretschmer@spamfence.net> wrote:
Kisung Kim <kskim@bitnine.net> wrote:
> And finally I found that auto_explain is the cause of the problem.
real hardware or virtual hardware? On virtual there are sometimes
problems with exact timings, please read:
https://www.postgresql.org/docs/current/static/pgtesttiming.html
Thank you for your reply.
I use real hardware.
I am curious whether timing functions contend when multiple processes call them concurrently.
On 01/04/2017 08:54 PM, Kisung Kim wrote:

> I am curious timing functions have contentions when multi-threads call them.

It's not so much about contention between threads/processes, but mostly about the cost of actually reading data from the clock source. So even if you run on physical hardware, the clock source may be slow.

Actually, there may be multiple clock sources available, differing in precision and overhead. See:

/sys/devices/system/clocksource/*/available_clocksource

On my machine I see 3 different sources "tsc hpet acpi_pm", and after running the pg_test_timing tool linked by Andreas, I get this for the 'tsc' clock source:

Testing timing overhead for 3 seconds.
Per loop time including overhead: 29.87 ns
Histogram of timing durations:
  < us   % of total      count
     1     97.09159   97499400
     2      2.90085    2913031
     4      0.00193       1936
     8      0.00089        891
    16      0.00140       1405
    32      0.00335       3366
    64      0.00000          2

suggesting that 97% of calls took less than 1 usec, which is quite good. For comparison, using 'hpet' gives me this:

Testing timing overhead for 3 seconds.
Per loop time including overhead: 766.92 ns
Histogram of timing durations:
  < us   % of total      count
     1     27.69558    1083389
     2     71.28437    2788485
     4      0.75748      29631
     8      0.02886       1129
    16      0.06578       2573
    32      0.16755       6554
    64      0.00033         13
   128      0.00003          1
   256      0.00003          1

which is clearly much worse (it increased the per-loop cost from 30 ns to 767 ns, roughly 25x).
So check which clock source you have selected, and test how expensive it is. But even with a fast clock source, the additional timing overhead may make EXPLAIN ANALYZE considerably slower. There's not much we can do about it :-(

And to make matters worse, it may affect different plans differently (e.g. nested loop joins do many more gettimeofday calls than other join types, amplifying the timing costs).

But the question is whether you actually need the timing - the total duration + row counts are far more important in my experience, so by setting

auto_explain.log_analyze = on
auto_explain.log_timing = off

you may significantly reduce the impact on query performance, while retaining the most valuable information.

regards

--
Tomas Vondra
http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
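
Editor's note: in postgresql.conf, the settings Tomas suggests would look roughly like this (a sketch; the shared_preload_libraries line and the 100ms threshold are illustrative assumptions, not from the thread):

```
# load the module at server start (required for auto_explain)
shared_preload_libraries = 'auto_explain'

# log plans only for queries slower than this threshold (example value)
auto_explain.log_min_duration = '100ms'

# collect EXPLAIN ANALYZE counters (row counts etc.),
# but skip the expensive per-node timing calls
auto_explain.log_analyze = on
auto_explain.log_timing = off
```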