Avoiding roundoff error in pg_sleep() - Mailing list pgsql-hackers

From Tom Lane
Subject Avoiding roundoff error in pg_sleep()
Date
Msg-id 3879137.1758825752@sss.pgh.pa.us
Whole thread Raw
Responses Re: Avoiding roundoff error in pg_sleep()
Re: Avoiding roundoff error in pg_sleep()
Re: Avoiding roundoff error in pg_sleep()
List pgsql-hackers
I chanced to notice that if you ask pg_sleep for 1ms delay,
what you actually get is 2ms, for example

regression=# \timing on
Timing is on.
regression=# do $$ begin
for i in 1..1000 loop
perform pg_sleep(0.001);
end loop; end $$;
DO
Time: 2081.175 ms (00:02.081)
regression=# do $$ begin
for i in 1..1000 loop
perform pg_sleep(0.002);
end loop; end $$;
DO
Time: 2177.407 ms (00:02.177)

A bit of gdb-ing confirms that the delay passed to WaitLatch() is 2ms,
so the problem is fundamentally one of floating-point roundoff error
in pg_sleep's calculation of delay_ms.  I didn't try to figure out
exactly why that's happening.  It may well vary depending on the
absolute magnitude of current values of GetCurrentTimestamp(),
because I don't recall having noticed any such behavior back when
this code was written.

Anyway, I propose trying to get rid of this misbehavior by avoiding
floating point in the delay computation, as attached.  With this
patch I get less surprising behavior:

regression=# \timing on
Timing is on.
regression=# do $$ begin
for i in 1..1000 loop
perform pg_sleep(0.001);
end loop; end $$;
DO
Time: 1063.997 ms (00:01.064)
regression=# do $$ begin
for i in 1..1000 loop
perform pg_sleep(0.002);
end loop; end $$;
DO
Time: 2172.849 ms (00:02.173)

The code is a little more tied to TimestampTz being measured in
microseconds than it was before, but it wouldn't really be much
harder to fix if we ever change that.

            regards, tom lane

diff --git a/src/backend/utils/adt/misc.c b/src/backend/utils/adt/misc.c
index 6c5e3438447..bc01209cf20 100644
--- a/src/backend/utils/adt/misc.c
+++ b/src/backend/utils/adt/misc.c
@@ -369,7 +369,20 @@ Datum
 pg_sleep(PG_FUNCTION_ARGS)
 {
     float8        secs = PG_GETARG_FLOAT8(0);
-    float8        endtime;
+    int64        usecs;
+    TimestampTz endtime;
+
+    /*
+     * Convert the delay to int64 microseconds, rounding up any fraction, and
+     * silently limiting it to PG_INT64_MAX/2 microseconds (about 150K years)
+     * to ensure the computation of endtime won't overflow.  Historically
+     * we've treated NaN as "no wait", not an error, so keep that behavior.
+     */
+    if (isnan(secs) || secs <= 0.0)
+        PG_RETURN_VOID();
+    secs *= 1e6;                /* we assume overflow will produce +Inf */
+    secs = ceil(secs);            /* round up */
+    usecs = (int64) Min(secs, (float8) (PG_INT64_MAX / 2));
 
     /*
      * We sleep using WaitLatch, to ensure that we'll wake up promptly if an
@@ -383,22 +396,20 @@ pg_sleep(PG_FUNCTION_ARGS)
      * less than the specified time when WaitLatch is terminated early by a
      * non-query-canceling signal such as SIGHUP.
      */
-#define GetNowFloat()    ((float8) GetCurrentTimestamp() / 1000000.0)
-
-    endtime = GetNowFloat() + secs;
+    endtime = GetCurrentTimestamp() + usecs;
 
     for (;;)
     {
-        float8        delay;
+        TimestampTz delay;
         long        delay_ms;
 
         CHECK_FOR_INTERRUPTS();
 
-        delay = endtime - GetNowFloat();
-        if (delay >= 600.0)
+        delay = endtime - GetCurrentTimestamp();
+        if (delay >= 600 * USECS_PER_SEC)
             delay_ms = 600000;
-        else if (delay > 0.0)
-            delay_ms = (long) ceil(delay * 1000.0);
+        else if (delay > 0)
+            delay_ms = (long) ((delay + 999) / 1000);
         else
             break;


pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: Making pg_rewind faster
Next
From: Nathan Bossart
Date:
Subject: Re: [PATCH] Hex-coding optimizations using SVE on ARM.