Re: [HACKERS] gettimeofday is at the end of its usefulness? - Mailing list pgsql-hackers

From Tom Lane
Subject Re: [HACKERS] gettimeofday is at the end of its usefulness?
Date
Msg-id 17524.1483063359@sss.pgh.pa.us
Whole thread Raw
In response to Re: gettimeofday is at the end of its usefulness?  (Haribabu Kommi <kommi.haribabu@gmail.com>)
Responses Re: [HACKERS] gettimeofday is at the end of its usefulness?  (Haribabu Kommi <kommi.haribabu@gmail.com>)
List pgsql-hackers
Haribabu Kommi <kommi.haribabu@gmail.com> writes:
> Attached a patch that replaces most of the getimeofday function calls,
> except timeofday(user callable) and GetCurrentTimestamp functions.

I looked at this for awhile and could not convince myself that it's
a good idea.  Trying to do s/gettimeofday/clock_gettime/g is not going
to do much for us except create portability headaches.  According
to my tests, clock_gettime is not noticeably faster than gettimeofday
on any platform, except that if you use nonstandard clockids like
CLOCK_REALTIME_COARSE then on *some* platforms it's a little bit quicker,
at the cost of being a great deal less precise.  But we'd have to research
the existence and effects of nonstandard clockids on every platform.
So AFAICS the only clear advantage to switching is the extra precision
available from clock_gettime.

But ... most of the places you've touched in this patch have neither any
need for sub-microsecond precision nor any great need to worry about
shaving a few ns off the time taken by the call.  As far as I can find,
the only place where it's actually worth our trouble to deal with it is
instr_time.h (ie, EXPLAIN ANALYZE and a few other uses).

So I think we should do something more like the attached.


One issue I did not resolve in this WIP patch is what to do with this
gem of abstraction violation in pgbench:

        /* no, print raw transactions */
#ifndef WIN32

        /* This is more than we really ought to know about instr_time */
        if (skipped)
            fprintf(logfile, "%d " INT64_FORMAT " skipped %d %ld %ld",
                    st->id, st->cnt, st->use_file,
                    (long) now->tv_sec, (long) now->tv_usec);
        else
            fprintf(logfile, "%d " INT64_FORMAT " %.0f %d %ld %ld",
                    st->id, st->cnt, latency, st->use_file,
                    (long) now->tv_sec, (long) now->tv_usec);
#else

        /* On Windows, instr_time doesn't provide a timestamp anyway */
        if (skipped)
            fprintf(logfile, "%d " INT64_FORMAT " skipped %d 0 0",
                    st->id, st->cnt, st->use_file);
        else
            fprintf(logfile, "%d " INT64_FORMAT " %.0f %d 0 0",
                    st->id, st->cnt, latency, st->use_file);
#endif

We could either rip out the non-Windows code path entirely, or do
something about providing an honest elapsed-time measurement, perhaps
by doing INSTR_TIME_GET_DOUBLE() on the diff from run start to "now".
Given that we're calling fprintf, I doubt that the extra arithmetic
needed for that is a big problem.

            regards, tom lane

diff --git a/configure b/configure
index 0f143a0..e5dd6fb 100755
*** a/configure
--- b/configure
*************** if test "$ac_res" != no; then :
*** 9055,9060 ****
--- 9055,9116 ----

  fi

+ { $as_echo "$as_me:${as_lineno-$LINENO}: checking for library containing clock_gettime" >&5
+ $as_echo_n "checking for library containing clock_gettime... " >&6; }
+ if ${ac_cv_search_clock_gettime+:} false; then :
+   $as_echo_n "(cached) " >&6
+ else
+   ac_func_search_save_LIBS=$LIBS
+ cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+ /* end confdefs.h.  */
+
+ /* Override any GCC internal prototype to avoid an error.
+    Use char because int might match the return type of a GCC
+    builtin and then its argument prototype would still apply.  */
+ #ifdef __cplusplus
+ extern "C"
+ #endif
+ char clock_gettime ();
+ int
+ main ()
+ {
+ return clock_gettime ();
+   ;
+   return 0;
+ }
+ _ACEOF
+ for ac_lib in '' rt posix4; do
+   if test -z "$ac_lib"; then
+     ac_res="none required"
+   else
+     ac_res=-l$ac_lib
+     LIBS="-l$ac_lib  $ac_func_search_save_LIBS"
+   fi
+   if ac_fn_c_try_link "$LINENO"; then :
+   ac_cv_search_clock_gettime=$ac_res
+ fi
+ rm -f core conftest.err conftest.$ac_objext \
+     conftest$ac_exeext
+   if ${ac_cv_search_clock_gettime+:} false; then :
+   break
+ fi
+ done
+ if ${ac_cv_search_clock_gettime+:} false; then :
+
+ else
+   ac_cv_search_clock_gettime=no
+ fi
+ rm conftest.$ac_ext
+ LIBS=$ac_func_search_save_LIBS
+ fi
+ { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_search_clock_gettime" >&5
+ $as_echo "$ac_cv_search_clock_gettime" >&6; }
+ ac_res=$ac_cv_search_clock_gettime
+ if test "$ac_res" != no; then :
+   test "$ac_res" = "none required" || LIBS="$ac_res $LIBS"
+
+ fi
+
  # Solaris:
  { $as_echo "$as_me:${as_lineno-$LINENO}: checking for library containing fdatasync" >&5
  $as_echo_n "checking for library containing fdatasync... " >&6; }
*************** fi
*** 12520,12526 ****
  LIBS_including_readline="$LIBS"
  LIBS=`echo "$LIBS" | sed -e 's/-ledit//g' -e 's/-lreadline//g'`

! for ac_func in cbrt dlopen fdatasync getifaddrs getpeerucred getrlimit mbstowcs_l memmove poll pstat
pthread_is_threaded_npreadlink setproctitle setsid shm_open symlink sync_file_range towlower utime utimes wcstombs
wcstombs_l
  do :
    as_ac_var=`$as_echo "ac_cv_func_$ac_func" | $as_tr_sh`
  ac_fn_c_check_func "$LINENO" "$ac_func" "$as_ac_var"
--- 12576,12582 ----
  LIBS_including_readline="$LIBS"
  LIBS=`echo "$LIBS" | sed -e 's/-ledit//g' -e 's/-lreadline//g'`

! for ac_func in cbrt clock_gettime dlopen fdatasync getifaddrs getpeerucred getrlimit mbstowcs_l memmove poll pstat
pthread_is_threaded_npreadlink setproctitle setsid shm_open symlink sync_file_range towlower utime utimes wcstombs
wcstombs_l
  do :
    as_ac_var=`$as_echo "ac_cv_func_$ac_func" | $as_tr_sh`
  ac_fn_c_check_func "$LINENO" "$ac_func" "$as_ac_var"
diff --git a/configure.in b/configure.in
index b9831bc..77aa3b4 100644
*** a/configure.in
--- b/configure.in
*************** AC_SEARCH_LIBS(getopt_long, [getopt gnug
*** 1016,1021 ****
--- 1016,1022 ----
  AC_SEARCH_LIBS(crypt, crypt)
  AC_SEARCH_LIBS(shm_open, rt)
  AC_SEARCH_LIBS(shm_unlink, rt)
+ AC_SEARCH_LIBS(clock_gettime, [rt posix4])
  # Solaris:
  AC_SEARCH_LIBS(fdatasync, [rt posix4])
  # Required for thread_test.c on Solaris
*************** PGAC_FUNC_WCSTOMBS_L
*** 1415,1421 ****
  LIBS_including_readline="$LIBS"
  LIBS=`echo "$LIBS" | sed -e 's/-ledit//g' -e 's/-lreadline//g'`

! AC_CHECK_FUNCS([cbrt dlopen fdatasync getifaddrs getpeerucred getrlimit mbstowcs_l memmove poll pstat
pthread_is_threaded_npreadlink setproctitle setsid shm_open symlink sync_file_range towlower utime utimes wcstombs
wcstombs_l])

  AC_REPLACE_FUNCS(fseeko)
  case $host_os in
--- 1416,1422 ----
  LIBS_including_readline="$LIBS"
  LIBS=`echo "$LIBS" | sed -e 's/-ledit//g' -e 's/-lreadline//g'`

! AC_CHECK_FUNCS([cbrt clock_gettime dlopen fdatasync getifaddrs getpeerucred getrlimit mbstowcs_l memmove poll pstat
pthread_is_threaded_npreadlink setproctitle setsid shm_open symlink sync_file_range towlower utime utimes wcstombs
wcstombs_l])

  AC_REPLACE_FUNCS(fseeko)
  case $host_os in
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index 42a3fc8..b9dfdd4 100644
*** a/src/include/pg_config.h.in
--- b/src/include/pg_config.h.in
***************
*** 105,110 ****
--- 105,113 ----
  /* Define to 1 if you have the `class' function. */
  #undef HAVE_CLASS

+ /* Define to 1 if you have the `clock_gettime' function. */
+ #undef HAVE_CLOCK_GETTIME
+
  /* Define to 1 if you have the <crtdefs.h> header file. */
  #undef HAVE_CRTDEFS_H

diff --git a/src/include/pg_config.h.win32 b/src/include/pg_config.h.win32
index ceb8b79..199668c 100644
*** a/src/include/pg_config.h.win32
--- b/src/include/pg_config.h.win32
***************
*** 75,80 ****
--- 75,83 ----
  /* Define to 1 if you have the `class' function. */
  /* #undef HAVE_CLASS */

+ /* Define to 1 if you have the `clock_gettime' function. */
+ /* #undef HAVE_CLOCK_GETTIME */
+
  /* Define to 1 if you have the `crypt' function. */
  /* #undef HAVE_CRYPT */

diff --git a/src/include/portability/instr_time.h b/src/include/portability/instr_time.h
index 16caf6e..b10d6f0 100644
*** a/src/include/portability/instr_time.h
--- b/src/include/portability/instr_time.h
***************
*** 4,13 ****
   *      portable high-precision interval timing
   *
   * This file provides an abstraction layer to hide portability issues in
!  * interval timing.  On Unix we use gettimeofday(), but on Windows that
!  * gives a low-precision result so we must use QueryPerformanceCounter()
!  * instead.  These macros also give some breathing room to use other
!  * high-precision-timing APIs on yet other platforms.
   *
   * The basic data type is instr_time, which all callers should treat as an
   * opaque typedef.  instr_time can store either an absolute time (of
--- 4,13 ----
   *      portable high-precision interval timing
   *
   * This file provides an abstraction layer to hide portability issues in
!  * interval timing.  On Unix we use clock_gettime() if available, else
!  * gettimeofday().  On Windows, gettimeofday() gives a low-precision result
!  * so we must use QueryPerformanceCounter() instead.  These macros also give
!  * some breathing room to use other high-precision-timing APIs.
   *
   * The basic data type is instr_time, which all callers should treat as an
   * opaque typedef.  instr_time can store either an absolute time (of
***************
*** 54,59 ****
--- 54,147 ----

  #ifndef WIN32

+ #ifdef HAVE_CLOCK_GETTIME
+
+ /* Use clock_gettime() */
+
+ #include <time.h>
+
+ /*
+  * The best clockid to use according to the POSIX spec is CLOCK_MONOTONIC,
+  * since that will give reliable interval timing even in the face of changes
+  * to the system clock.  However, POSIX doesn't require implementations to
+  * provide anything except CLOCK_REALTIME, so fall back to that if we don't
+  * find CLOCK_MONOTONIC.
+  *
+  * Also, some implementations have nonstandard clockids with better properties
+  * than CLOCK_MONOTONIC.  In particular, as of macOS 10.12, Apple provides
+  * CLOCK_MONOTONIC_RAW which is both faster to read and higher resolution than
+  * their version of CLOCK_MONOTONIC.
+  */
+ #if defined(__darwin__) && defined(CLOCK_MONOTONIC_RAW)
+ #define PG_INSTR_CLOCK    CLOCK_MONOTONIC_RAW
+ #elif defined(CLOCK_MONOTONIC)
+ #define PG_INSTR_CLOCK    CLOCK_MONOTONIC
+ #else
+ #define PG_INSTR_CLOCK    CLOCK_REALTIME
+ #endif
+
+ typedef struct timespec instr_time;
+
+ #define INSTR_TIME_IS_ZERO(t)    ((t).tv_nsec == 0 && (t).tv_sec == 0)
+
+ #define INSTR_TIME_SET_ZERO(t)    ((t).tv_sec = 0, (t).tv_nsec = 0)
+
+ #define INSTR_TIME_SET_CURRENT(t)    ((void) clock_gettime(PG_INSTR_CLOCK, &(t)))
+
+ #define INSTR_TIME_ADD(x,y) \
+     do { \
+         (x).tv_sec += (y).tv_sec; \
+         (x).tv_nsec += (y).tv_nsec; \
+         /* Normalize */ \
+         while ((x).tv_nsec >= 1000000000) \
+         { \
+             (x).tv_nsec -= 1000000000; \
+             (x).tv_sec++; \
+         } \
+     } while (0)
+
+ #define INSTR_TIME_SUBTRACT(x,y) \
+     do { \
+         (x).tv_sec -= (y).tv_sec; \
+         (x).tv_nsec -= (y).tv_nsec; \
+         /* Normalize */ \
+         while ((x).tv_nsec < 0) \
+         { \
+             (x).tv_nsec += 1000000000; \
+             (x).tv_sec--; \
+         } \
+     } while (0)
+
+ #define INSTR_TIME_ACCUM_DIFF(x,y,z) \
+     do { \
+         (x).tv_sec += (y).tv_sec - (z).tv_sec; \
+         (x).tv_nsec += (y).tv_nsec - (z).tv_nsec; \
+         /* Normalize after each add to avoid overflow/underflow of tv_nsec */ \
+         while ((x).tv_nsec < 0) \
+         { \
+             (x).tv_nsec += 1000000000; \
+             (x).tv_sec--; \
+         } \
+         while ((x).tv_nsec >= 1000000000) \
+         { \
+             (x).tv_nsec -= 1000000000; \
+             (x).tv_sec++; \
+         } \
+     } while (0)
+
+ #define INSTR_TIME_GET_DOUBLE(t) \
+     (((double) (t).tv_sec) + ((double) (t).tv_nsec) / 1000000000.0)
+
+ #define INSTR_TIME_GET_MILLISEC(t) \
+     (((double) (t).tv_sec * 1000.0) + ((double) (t).tv_nsec) / 1000000.0)
+
+ #define INSTR_TIME_GET_MICROSEC(t) \
+     (((uint64) (t).tv_sec * (uint64) 1000000) + (uint64) ((t).tv_nsec / 1000))
+
+ #else                            /* !HAVE_CLOCK_GETTIME */
+
+ /* Use gettimeofday() */
+
  #include <sys/time.h>

  typedef struct timeval instr_time;
*************** typedef struct timeval instr_time;
*** 113,120 ****
--- 201,213 ----

  #define INSTR_TIME_GET_MICROSEC(t) \
      (((uint64) (t).tv_sec * (uint64) 1000000) + (uint64) (t).tv_usec)
+
+ #endif   /* HAVE_CLOCK_GETTIME */
+
  #else                            /* WIN32 */

+ /* Use QueryPerformanceCounter() */
+
  typedef LARGE_INTEGER instr_time;

  #define INSTR_TIME_IS_ZERO(t)    ((t).QuadPart == 0)
*************** GetTimerFrequency(void)
*** 149,154 ****
--- 242,248 ----
      QueryPerformanceFrequency(&f);
      return (double) f.QuadPart;
  }
+
  #endif   /* WIN32 */

  #endif   /* INSTR_TIME_H */

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

pgsql-hackers by date:

Previous
From: Jim Nasby
Date:
Subject: Re: [HACKERS] Faster methods for getting SPI results
Next
From: Haribabu Kommi
Date:
Subject: [HACKERS] [WIP]Vertical Clustered Index (columnar store extension)