Re: Timing overhead and Linux clock sources - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: Timing overhead and Linux clock sources
Date
Msg-id 20120817022814.GR30286@momjian.us
Whole thread Raw
In response to Re: Timing overhead and Linux clock sources  (Greg Smith <greg@2ndQuadrant.com>)
Responses Re: Timing overhead and Linux clock sources
List pgsql-hackers
FYI, I am planning to go ahead and package this tool in /contrib for PG
9.3.

---------------------------------------------------------------------------

On Fri, Dec  9, 2011 at 08:26:12PM -0500, Greg Smith wrote:
> On 12/09/2011 06:48 PM, Ants Aasma wrote:
> >The attached test program (test_gettimeofday_monotonic) shows that one
> >test loop iteration takes 34ns with tsc and 1270ns with hpet.
> 
> This test program is great, I've wanted this exact sort of
> visibility into this problem for years.  I've toyed with writing
> something like this and then seeing what results it returns on all
> of the build farm animals.  For now I can just run it on all the
> hardware I have access to, which is not a small list.
> 
> I think we should bundle this up similarly to test_fsync, document
> some best practices on time sources, and then point the vague
> warning about EXPLAIN overhead toward that.  Then new sources of
> timing overhead can point there too.   Much like low-level fsync
> timing, there's nothing PostgreSQL can do about it, the best we can
> do is provide a program to help users classify a system as likely or
> unlikely to run to have high-overhead timing.  I can make the needed
> docs changes, this is resolving a long standing issue impacting code
> I wanted to add.
> 
> Rough guideline I'm seeing shape up is that <50ns is unlikely to
> cause clock timing to be a significant problem, >500ns certainly is,
> and values in the middle should concern people but not necessarily
> invalidate the timing data collected.
> 
> I just confirmed that switching the clock source by echoing a new
> value into
> /sys/devices/system/clocksource/clocksource0/current_clocksource
> works (on the 2.6.32 kernel at least).  What we want to see on a
> good server is a result that looks like this, from my Intel i7-860
> system:
> 
> $ cat /sys/devices/system/clocksource/clocksource0/current_clocksource
> tsc
> $ ./test_gettimeofday_monotonic 5
> Per loop: 39.30 ns
>      usec:      count   percent
>         4:          6  0.00000%
>         2:        104  0.00008%
>         1:    4999760  3.92983%
>         0:  122226109 96.07009%
> 
> Here's how badly that degrades if I use one of the alternate time sources:
> 
> # echo acpi_pm >
> /sys/devices/system/clocksource/clocksource0/current_clocksource
> $ ./test_gettimeofday_monotonic 5
> Per loop: 727.65 ns
>      usec:      count   percent
>        16:          1  0.00001%
>         8:          0  0.00000%
>         4:       1233  0.01794%
>         2:        699  0.01017%
>         1:    4992622 72.65764%
>         0:    1876879 27.31423%
> 
> echo hpet > /sys/devices/system/clocksource/clocksource0/current_clocksource
> $ ./test_gettimeofday_monotonic 5
> Per loop: 576.96 ns
>      usec:      count   percent
>         8:          2  0.00002%
>         4:       1273  0.01469%
>         2:        767  0.00885%
>         1:    4993028 57.61598%
>         0:    3670977 42.36046%
> 
> Switching to the Intel T7200 CPU in my T60 laptop only provides the
> poor quality time sources, not TSC, and results show timing is
> really slow there:
> 
> $ cat /sys/devices/system/clocksource/clocksource0/current_clocksource
> hpet
> $ ./test_gettimeofday_monotonic 5
> Per loop: 1019.60 ns
>      usec:      count   percent
>       256:          2  0.00004%
>       128:          3  0.00006%
>        64:         90  0.00184%
>        32:         23  0.00047%
>        16:         92  0.00188%
>         8:       1246  0.02541%
>         4:         34  0.00069%
>         2:     136154  2.77645%
>         1:    4700772 95.85818%
>         0:      65466  1.33498%
> 
> # echo acpi_pm >
> /sys/devices/system/clocksource/clocksource0/current_clocksource
> $ ./test_gettimeofday_monotonic 5
> Per loop: 1864.66 ns
>      usec:      count   percent
>       256:          2  0.00007%
>       128:          0  0.00000%
>        64:          3  0.00011%
>        32:          6  0.00022%
>        16:         90  0.00336%
>         8:       1741  0.06493%
>         4:       2062  0.07690%
>         2:    2260601 84.30489%
>         1:     416954 15.54952%
>         0:          0  0.00000%
> 
> This seems to be measuring exactly the problem I only had a
> hand-wave "it's bad on this hardware" explanation of before.
> 
> -- 
> Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
> PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
> 
> 
> -- 
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + It's impossible for everything to be true. +



pgsql-hackers by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: RFC: list API / memory allocations
Next
From: Stephen Frost
Date:
Subject: Slow tab completion w/ lots of tables