gettimeofday is at the end of its usefulness? - Mailing list pgsql-hackers

From Tom Lane
Subject gettimeofday is at the end of its usefulness?
Date
Msg-id 31856.1400021891@sss.pgh.pa.us
Whole thread Raw
Responses Re: gettimeofday is at the end of its usefulness?  (Peter Geoghegan <pg@heroku.com>)
Re: gettimeofday is at the end of its usefulness?  (Greg Stark <stark@mit.edu>)
Re: gettimeofday is at the end of its usefulness?  (Jeff Janes <jeff.janes@gmail.com>)
Re: gettimeofday is at the end of its usefulness?  (Andres Freund <andres@2ndquadrant.com>)
Re: gettimeofday is at the end of its usefulness?  (Bruce Momjian <bruce@momjian.us>)
List pgsql-hackers
A recent question from Tim Kane prompted me to measure the overhead
costs of EXPLAIN ANALYZE, which I'd not checked in awhile.  Things
are far worse than I thought.  On my current server (by no means
lavish hardware: Xeon E5-2609 @2.40GHz) a simple seqscan can run
at something like 110 nsec per row:

regression=# create table foo as select x as f1 from generate_series(1,1000000) x;
SELECT 1000000
regression=# vacuum foo;
VACUUM
regression=# explain analyze select * from foo;
                                                  QUERY PLAN
---------------------------------------------------------------------------------------------------------------
 Seq Scan on foo  (cost=0.00..14425.00 rows=1000000 width=4) (actual time=0.053..111.720 rows=1000000 loops=1)
 Planning time: 0.222 ms
 Execution time: 166.682 ms
(3 rows)

(and, btw, this is a debug build --- without assert and memory
context checks it'd be faster.)

The problem with this number is that a simple test program shows that
gettimeofday() requires about 40 nsec on this hardware.  That means
that two-thirds of the above timing measurement is overhead.

To add insult to injury, gettimeofday's output cannot be more precise than
1 microsecond, making its use for measuring sub-microsecond intervals at
best stochastic.

I looked around a bit and found that recent versions of POSIX have a
function clock_gettime() that's a bit more modern than gettimeofday():
at least, the output struct provides nsec rather than usec available
precision.  I benchmarked this, with the CLOCK_REALTIME selector, and
found that it also requires about 40nsec, while the output is actually
good to perhaps 10nsec precision.  (I base this on seeing no duplicate
readings in a tight loop, so that the value is certainly getting advanced
more often than once every 40 nsec.)

There's also a CLOCK_REALTIME_COARSE selector, which is noticeably faster
--- about 10nsec for me --- but the output appears to only advance once
every millisecond, so it's probably useless for our purposes.  The other
selectors mentioned in the Linux man page are considerably slower than
CLOCK_REALTIME for me, suggesting that they actually call into the kernel.

I also tried a loop around a bare "rdtsc" assembly instruction, finding
that that instruction takes about 10nsec.  That would be a nice
improvement over gettimeofday, except that using that directly would
involve dealing with cross-CPU skew, which seems like no fun at all.
And I don't really want to get into finding equivalents for non-Intel
architectures, either.

Anyway it looks like clock_gettime() might be worth using on Linux
just for the more precise output.  It doesn't seem to exist on OS X
though, and I have no idea about elsewhere.

I'm curious if anybody has ideas about other things we might do for
portable high-precision timing.

            regards, tom lane

#include <stdio.h>
#include <time.h>
#include <sys/time.h>

int main(int argc, char **argv)
{
    long thistime, lasttime = 0;
    int nbogus = 0;
    int ndistinct = 0;
    int i;

    for (i=0; i<1000000000; i++)
    {
#if 1
        struct timespec tv;
        clock_gettime(CLOCK_REALTIME, &tv);
        thistime = (long) tv.tv_sec * 1000000000 + tv.tv_nsec;
#else
        struct timeval tv;
        gettimeofday(&tv, NULL);
        thistime = (long) tv.tv_sec * 1000000 + tv.tv_usec;
#endif

        if (thistime < lasttime)
            nbogus++;
        else if (thistime > lasttime)
            ndistinct++;
        lasttime = thistime;
    }

    printf("%d bogus readings\n", nbogus);
    printf("%d distinct readings\n", ndistinct);

    return 0;
}

pgsql-hackers by date:

Previous
From: Peter Geoghegan
Date:
Subject: Re: Error in running DBT2
Next
From: Thomas Munro
Date:
Subject: SKIP LOCKED DATA (work in progress)