Re: [HACKERS] Interval aggregate regression failure (expected seems - Mailing list pgsql-patches

From Michael Glaesemann
Subject Re: [HACKERS] Interval aggregate regression failure (expected seems
Date
Msg-id 16F3B056-8929-4F68-AB72-929480B1F576@seespotcode.net
Whole thread Raw
In response to Re: [HACKERS] Interval aggregate regression failure (expected  (Bruce Momjian <bruce@momjian.us>)
Responses Re: [HACKERS] Interval aggregate regression failure (expected seems
List pgsql-patches
On Aug 26, 2006, at 11:40 , Bruce Momjian wrote:

>
> I used your ideas to make a patch to fix your example:
>
>     test=> select '41 months'::interval  / 10;
>        ?column?
>     ---------------
>      4 mons 3 days
>     (1 row)
>
> and
>
>     test=> select '41 months'::interval  * 0.3;
>        ?column?
>     ---------------
>      1 year 9 days
>     (1 row)
>
> The trick was not to play with the division, but to check if the
> number
> of seconds cascaded into full days and/or months.

While this does provide a fix for the example, I don't believe it's a
complete solution. For example, with your patch, you also get the
following results:

select '41 mon 360:00'::interval / 10 as "pos"
     , '-41 mon -360:00'::interval / 10 as "neg";
           pos           |             neg
------------------------+------------------------------
4 mons 2 days 60:00:00 | -4 mons -2 days -59:59:60.00
(1 row)

If I've done the math right, this should be:
4 mons 3 days 36:00:00 | -4 mons -3 days -36:00:00

select '41 mon -360:00'::interval / 10 as "pos"
     , '-41 mon 360:00'::interval / 10 as "neg";
            pos           |            neg
-------------------------+---------------------------
4 mons 2 days -12:00:00 | -4 mons -2 days +12:00:00
(1 row)

Should be:
4 mons 3 days -36:00:00 | -4 mons -3 days +36:00:00

What we want to do is check just the month contribution to the day
component to see if it is greater than 24 hours. Perhaps the simplest
way to accomplish this is something like (psuedo code):

if (abs(tsround(month_remainder * SECS_PER_DAY)) == SECS_PER_DAY)
{
    if (month_remainder > 0)
    {
       result->month++;
    }
    else
    {
       result->month--;
    }
}

I'm going to try something along these lines this evening.

FWIW, I've included the patch of for what I'm working on. It's pretty
heavily commented right now with expected results as I think through
what the code's doing. (It also includes the DecodeInterval patch I
sent to -patches earlier today.) I'm still getting overflow warnings
in the lines including USECS_PER_DAY and USECS_PER_MONTH, and my
inexperience with C and gdb is getting the best of me right now
(though I'm still plugging away: ).

Michael Glaesemann
grzm seespotcode net

----8<-------------------
? CONFIGURE_ARGS
? datetime.patch
? timestamp.patch
? src/backend/.DS_Store
? src/include/.DS_Store
? src/test/.DS_Store
Index: src/backend/utils/adt/datetime.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/utils/adt/datetime.c,v
retrieving revision 1.169
diff -c -r1.169 datetime.c
*** src/backend/utils/adt/datetime.c    25 Jul 2006 03:51:21 -0000    1.169
--- src/backend/utils/adt/datetime.c    28 Aug 2006 07:08:46 -0000
***************
*** 2920,2935 ****
                           tm->tm_mday += val * 7;
                           if (fval != 0)
                           {
!                             int            sec;
!
!                             fval *= 7 * SECS_PER_DAY;
!                             sec = fval;
!                             tm->tm_sec += sec;
   #ifdef HAVE_INT64_TIMESTAMP
!                             *fsec += (fval - sec) * 1000000;
   #else
!                             *fsec += fval - sec;
   #endif
                           }
                           tmask = (fmask & DTK_M(DAY)) ? 0 : DTK_M(DAY);
                           break;
--- 2920,2942 ----
                           tm->tm_mday += val * 7;
                           if (fval != 0)
                           {
!                             int extra_days;
!                             fval *= 7;
!                             extra_days = (int32) fval;
!                             tm->tm_mday += extra_days;
!                             fval -= extra_days;
!                             if (fval != 0)
!                             {
!                                 int            sec;
!                                 fval *= SECS_PER_DAY;
!                                 sec = fval;
!                                 tm->tm_sec += sec;
   #ifdef HAVE_INT64_TIMESTAMP
!                                 *fsec += (fval - sec) * 1000000;
   #else
!                                 *fsec += fval - sec;
   #endif
+                             }
                           }
                           tmask = (fmask & DTK_M(DAY)) ? 0 : DTK_M(DAY);
                           break;
***************
*** 2938,2953 ****
                           tm->tm_mon += val;
                           if (fval != 0)
                           {
!                             int            sec;
!
!                             fval *= DAYS_PER_MONTH * SECS_PER_DAY;
!                             sec = fval;
!                             tm->tm_sec += sec;
   #ifdef HAVE_INT64_TIMESTAMP
!                             *fsec += (fval - sec) * 1000000;
   #else
!                             *fsec += fval - sec;
   #endif
                           }
                           tmask = DTK_M(MONTH);
                           break;
--- 2945,2967 ----
                           tm->tm_mon += val;
                           if (fval != 0)
                           {
!                             int         day;
!                             fval *= DAYS_PER_MONTH;
!                             day = fval;
!                             tm->tm_mday += day;
!                             fval -= day;
!                             if (fval != 0)
!                             {
!                                 int            sec;
!                                 fval *= SECS_PER_DAY;
!                                 sec = fval;
!                                 tm->tm_sec += sec;
   #ifdef HAVE_INT64_TIMESTAMP
!                                 *fsec += (fval - sec) * 1000000;
   #else
!                                 *fsec += fval - sec;
   #endif
+                             }
                           }
                           tmask = DTK_M(MONTH);
                           break;
Index: src/backend/utils/adt/timestamp.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/utils/adt/timestamp.c,v
retrieving revision 1.165
diff -c -r1.165 timestamp.c
*** src/backend/utils/adt/timestamp.c    13 Jul 2006 16:49:16 -0000    1.165
--- src/backend/utils/adt/timestamp.c    28 Aug 2006 07:08:48 -0000
***************
*** 2547,2556 ****
       Interval   *span = PG_GETARG_INTERVAL_P(0);
       float8        factor = PG_GETARG_FLOAT8(1);
       double        month_remainder,
                   day_remainder,
!                 month_remainder_days;
       Interval   *result;

       result = (Interval *) palloc(sizeof(Interval));

       if (factor == 0.0)
--- 2547,2572 ----
       Interval   *span = PG_GETARG_INTERVAL_P(0);
       float8        factor = PG_GETARG_FLOAT8(1);
       double        month_remainder,
+                 month_remainder_usecs = 0.,
                   day_remainder,
!                 day_remainder_usecs = 0.;
!     int32        extra_days;
       Interval   *result;

+ /*
+    example a:
+    select '1.5 mon'::interval / 10;
+    span = { time = 0, day = 15, month = 1 }
+    factor = 10.0
+    result: 4 days 12:00 { time = 43200.0 (int64 43_200_000_000),
day = 4,  month = 0 }
+
+    example b:
+    select '41 mon'::interval / 10;
+    span = { time = 0, day = 0, month = 41 }
+    factor = 10.0
+    result: 4 mon 3 days { time = 0, day = 3, month = 4 }
+  */
+
       result = (Interval *) palloc(sizeof(Interval));

       if (factor == 0.0)
***************
*** 2559,2584 ****
                    errmsg("division by zero")));

       month_remainder = span->month / factor;
!     day_remainder = span->day / factor;
       result->month = (int32) month_remainder;
!     result->day = (int32) day_remainder;
!     month_remainder -= result->month;
!     day_remainder -= result->day;

       /*
!      * Handle any fractional parts the same way as in interval_mul.
        */

       /* fractional months full days into days */
!     month_remainder_days = month_remainder * DAYS_PER_MONTH;
!     result->day += (int32) month_remainder_days;
!     /* fractional months partial days into time */
!     day_remainder += month_remainder_days - (int32) month_remainder_days;

   #ifdef HAVE_INT64_TIMESTAMP
!     result->time = rint(span->time / factor + day_remainder *
USECS_PER_DAY);
   #else
!     result->time = span->time / factor + day_remainder * SECS_PER_DAY;
   #endif

       PG_RETURN_INTERVAL_P(result);
--- 2575,2682 ----
                    errmsg("division by zero")));

       month_remainder = span->month / factor;
!     /*
!        a: month_remainder = 1 / 10.0 = 0.1
!        b: month_remainder = 41 / 10.0 = 4.1
!      */
       result->month = (int32) month_remainder;
!     /*
!        a: result->month = 0
!        b: result->month = 4
!      */
!      /* FIXME integer overflow */
!     month_remainder_usecs = rint( (double) (month_remainder - result-
 >month)
!                                         * USECS_PER_MONTH);
!     /*
!         a: month_remainder_usecs = rint((0.1 - 0) * ((double)
DAYS_PER_MONTH * (double) USECS_PER_DAY))
!                                  = rint(0.1 * (30.0 * (SECS_PER_DAY * 1000000)))
!                                  = rint(0.1 * 30.0 * 86400 * 1000000)
!                                  = 259_200_000_000
!         b: month_remainder_usecs = rint((4.1 - 4) * ((double)
DAYS_PER_MONTH * (double) USECS_PER_DAY))
!                                  = rint( 0.1 * 30.0 * 86400 * 1000000)
!                                  = 259_200_000_000
!      */
!
!     /*
!      * Due to the inherent inaccuracy of floating point division,
round to
!      * microsecond accuracy. Scale up intermediate results to
microseconds
!      * and scale back down for the final result.
!      */

+     day_remainder = span->day / factor;
       /*
!         a: day_remainder = 15 / 10.0 = 1.5
!         b: day_remainder = 0 / 10.0 = 0.0
!      */
!     result->day = (int32) day_remainder;
!     /*
!         a: result->day = (int32) 1.5 = 1
!         b: result->day = (int32) 0.0 = 0
!      */
!      /* FIXME integer overflow */
!     day_remainder_usecs = rint( (double) (day_remainder - result->day)
* (double) USECS_PER_DAY);
!     /*
!         a: day_remainder_usecs = rint((1.5 - 1) * (SECS_PER_DAY * 1000000))
!                                = rint( 0.5 * 86400 * 1_000_000 )
!                                = 43_200_000_000
!         b: day_remainder_usecs = rint((0.0 - 0 ) * 86400 * 1_000_000)
!                                = 0
!      */
!
!     /*
!      * Handle any fractional parts the same way as in interval_mul.
!      * Fractional months and fractional days are added to the result
separately
!      * to prevent their time fractional components from contributing
to the day
!      * component.
        */

       /* fractional months full days into days */
!     /* FIXME integer overflow */
!     extra_days = (int32) (month_remainder_usecs / (double)
USECS_PER_DAY);
!     /*
!         a: extra_days = (int32) (259_200_000_000 / (SECS_PER_DAY * 1000000))
!                       = (int32) (259_200_000_000 / (86400 * 1000000))
!                       = (int32) 3.0
!                       = 3
!         b: extra_days = (int32) (259_200_000_000 / (86400 * 1000000) )
!                       = (int32) 3.0
!                       = 3
!      */
!     result->day += extra_days;
!     /*
!         a: result->day = 1 + 3 = 4
!         b: result->day = 0 + 3 = 3
!      */
!      /* FIXME integer overflow */
!     month_remainder_usecs -= extra_days * (double) USECS_PER_DAY;
!     /*
!         a: month_remainder_usecs = 259_200_000_000 - (3 * (86400 * 1000000))
!                                  = 259_200_000_000 - (3 * (86400 * 1000000))
!                                  = 259_200_000_000 - 259_200_000_000
!                                  = 0
!         b: month_remainder_usecs = 259_200_000_000 - (3 * (86400 * 1000000))
!                                  = 259_200_000_000 - 259_200_000_000
!                                  = 0
!      */

   #ifdef HAVE_INT64_TIMESTAMP
!     result->time = rint(span->time / factor +
!                             day_remainder_usecs + month_remainder_usecs);
!     /*
!         a: result->time = rint(0.0 / 10.0 + 43_200_000_000 + 0.0)
!                         = 43_200_000_000
!         b: result->time = rint(0.0 / 10.0 + 0 + 0.0)
!                         = 0
!      */
   #else
!     result->time = rint(span->time / factor * 1.0e6 * +
!                             day_remainder_usecs + month_remainder_usecs) / 1.0e6;
!     /*
!         a: result->time = rint(0.0 / 10.0 * 1.0e6 + 43_200_000_000 +
0.0) / 1.0e6
!                         = 43200.0
!         b: result->time = rint(0.0 / 10.0 * 1.0e6 + 0.0 + 0.0) / 1.0e6
!                         = 0.0
!      */
   #endif

       PG_RETURN_INTERVAL_P(result);
Index: src/include/utils/timestamp.h
===================================================================
RCS file: /projects/cvsroot/pgsql/src/include/utils/timestamp.h,v
retrieving revision 1.62
diff -c -r1.62 timestamp.h
*** src/include/utils/timestamp.h    13 Jul 2006 16:49:20 -0000    1.62
--- src/include/utils/timestamp.h    28 Aug 2006 07:08:51 -0000
***************
*** 79,92 ****
   #define SECS_PER_YEAR    (36525 * 864)    /* avoid floating-point
computation */
   #define SECS_PER_DAY    86400
   #define SECS_PER_HOUR    3600
! #define SECS_PER_MINUTE 60
   #define MINS_PER_HOUR    60

   #ifdef HAVE_INT64_TIMESTAMP
   #define USECS_PER_DAY    INT64CONST(86400000000)
   #define USECS_PER_HOUR    INT64CONST(3600000000)
! #define USECS_PER_MINUTE INT64CONST(60000000)
   #define USECS_PER_SEC    INT64CONST(1000000)
   #endif

   /*
--- 79,96 ----
   #define SECS_PER_YEAR    (36525 * 864)    /* avoid floating-point
computation */
   #define SECS_PER_DAY    86400
   #define SECS_PER_HOUR    3600
! #define SECS_PER_MINUTE    60
   #define MINS_PER_HOUR    60

   #ifdef HAVE_INT64_TIMESTAMP
+ #define USECS_PER_MONTH    INT64CONST(2592000000000)
   #define USECS_PER_DAY    INT64CONST(86400000000)
   #define USECS_PER_HOUR    INT64CONST(3600000000)
! #define USECS_PER_MINUTE    INT64CONST(60000000)
   #define USECS_PER_SEC    INT64CONST(1000000)
+ #else
+ #define USECS_PER_DAY    (SECS_PER_DAY * 1000000)
+ #define USECS_PER_MONTH ((double) DAYS_PER_MONTH * (double)
USECS_PER_DAY)
   #endif

   /*



pgsql-patches by date:

Previous
From: "Pavel Stehule"
Date:
Subject: plpgsql, return can contains any expression
Next
From: Böszörményi Zoltán
Date:
Subject: Re: [HACKERS] Performance testing of COPY (SELECT) TO