Thread: Re: [HACKERS] Interval aggregate regression failure (expected seems

Re: [HACKERS] Interval aggregate regression failure (expected seems

From
Michael Glaesemann
Date:
On Sep 3, 2006, at 12:34 , Bruce Momjian wrote:

> OK, I worked with Michael and I think this is the best we are going to
> do to fix this.  It has one TSROUND call for Powerpc, and that is
> documented.  Applied.

As I was working up regression tests, I found a case that this patch
doesn't handle.

select interval '4 mon' * .3 as product_h;
        product_h
-----------------------
1 mon 5 days 24:00:00
(1 row)

This should be 1 mon 6 days. It fails for any number of months
greater than 3 that is not evenly divisible by 10, greater than 3
months. Do we need to look at the month remainder separately?

Michael Glaesemann
grzm seespotcode net




Re: [HACKERS] Interval aggregate regression failure

From
Bruce Momjian
Date:
Is this non-datetime integer only or both?  I cannot reproduce the
failure here.

---------------------------------------------------------------------------

Michael Glaesemann wrote:
>
> On Sep 3, 2006, at 12:34 , Bruce Momjian wrote:
>
> > OK, I worked with Michael and I think this is the best we are going to
> > do to fix this.  It has one TSROUND call for Powerpc, and that is
> > documented.  Applied.
>
> As I was working up regression tests, I found a case that this patch
> doesn't handle.
>
> select interval '4 mon' * .3 as product_h;
>         product_h
> -----------------------
> 1 mon 5 days 24:00:00
> (1 row)
>
> This should be 1 mon 6 days. It fails for any number of months
> greater than 3 that is not evenly divisible by 10, greater than 3
> months. Do we need to look at the month remainder separately?
>
> Michael Glaesemann
> grzm seespotcode net

--
  Bruce Momjian   bruce@momjian.us
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

Re: [HACKERS] Interval aggregate regression failure (expected seems

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> Is this non-datetime integer only or both?  I cannot reproduce the
> failure here.

On HPPA with float datetimes with today's code, Michael's case works
but it took me less than two minutes to find one that doesn't:

regression=# select interval '14 mon' * 8.2 as product_h;
            product_h
---------------------------------
 9 years 6 mons 23 days 24:00:00
(1 row)

I reiterate my comment that this approach will never work; any small
amount of experimentation will turn up cases that don't round correctly
on one platform or another.  Float arithmetic is inherently inexact.

            regards, tom lane

Re: [HACKERS] Interval aggregate regression failure

From
Bruce Momjian
Date:
Michael Glaesemann wrote:
>
> On Sep 3, 2006, at 12:34 , Bruce Momjian wrote:
>
> > OK, I worked with Michael and I think this is the best we are going to
> > do to fix this.  It has one TSROUND call for Powerpc, and that is
> > documented.  Applied.
>
> As I was working up regression tests, I found a case that this patch
> doesn't handle.
>
> select interval '4 mon' * .3 as product_h;
>         product_h
> -----------------------
> 1 mon 5 days 24:00:00
> (1 row)
>
> This should be 1 mon 6 days. It fails for any number of months
> greater than 3 that is not evenly divisible by 10, greater than 3
> months. Do we need to look at the month remainder separately?

Another question.  Is this result correct?

    test=> select '999 months 999 days'::interval / 100;
            ?column?
    -------------------------
     9 mons 38 days 40:33:36
    (1 row)

Should that be:

     9 mons 39 days 16:33:36

The core problem is that the combined remainder seconds of months and
days is > 24 hours.

--
  Bruce Momjian   bruce@momjian.us
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

Re: [HACKERS] Interval aggregate regression failure (expected seems

From
Michael Glaesemann
Date:
On Sep 4, 2006, at 4:45 , Bruce Momjian wrote:

> Another question.  Is this result correct?
>
>     test=> select '999 months 999 days'::interval / 100;
>             ?column?
>     -------------------------
>      9 mons 38 days 40:33:36
>     (1 row)
>
> Should that be:
>
>      9 mons 39 days 16:33:36

Yeah, I think it should be. I had been thinking of treating the month
and day component as having separate time contributions, but it makes
more sense to think of month as a collection of days first, integral
or no, and then cascade down the fractional portion of the combined
days component to time. I.e., 9.99 mon is 9 mon 29.7 days, rather
than 9 mon 29 days 60480 sec.

Michael Glaesemann
grzm seespotcode net





Re: [HACKERS] Interval aggregate regression failure

From
Bruce Momjian
Date:
Michael Glaesemann wrote:
>
> On Sep 4, 2006, at 4:45 , Bruce Momjian wrote:
>
> > Another question.  Is this result correct?
> >
> >     test=> select '999 months 999 days'::interval / 100;
> >             ?column?
> >     -------------------------
> >      9 mons 38 days 40:33:36
> >     (1 row)
> >
> > Should that be:
> >
> >      9 mons 39 days 16:33:36
>
> Yeah, I think it should be. I had been thinking of treating the month
> and day component as having separate time contributions, but it makes
> more sense to think of month as a collection of days first, integral
> or no, and then cascade down the fractional portion of the combined
> days component to time. I.e., 9.99 mon is 9 mon 29.7 days, rather
> than 9 mon 29 days 60480 sec.

No, I don't think so.  If we do that, there is no reason to cascade at
all.  Why not just say 9.1 months?

I am going to work on a patch to fix the >24 hours case, which will fix
your 24:00:00 case at the same time.  Will post in an hour.

--
  Bruce Momjian   bruce@momjian.us
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

Re: [HACKERS] Interval aggregate regression failure

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > Is this non-datetime integer only or both?  I cannot reproduce the
> > failure here.
>
> On HPPA with float datetimes with today's code, Michael's case works
> but it took me less than two minutes to find one that doesn't:
>
> regression=# select interval '14 mon' * 8.2 as product_h;
>             product_h
> ---------------------------------
>  9 years 6 mons 23 days 24:00:00
> (1 row)
>
> I reiterate my comment that this approach will never work; any small
> amount of experimentation will turn up cases that don't round correctly
> on one platform or another.  Float arithmetic is inherently inexact.

Working on a new patch to round; will post.

--
  Bruce Momjian   bruce@momjian.us
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

Re: [HACKERS] Interval aggregate regression failure

From
Bruce Momjian
Date:
Bruce Momjian wrote:
> Michael Glaesemann wrote:
> >
> > On Sep 3, 2006, at 12:34 , Bruce Momjian wrote:
> >
> > > OK, I worked with Michael and I think this is the best we are going to
> > > do to fix this.  It has one TSROUND call for Powerpc, and that is
> > > documented.  Applied.
> >
> > As I was working up regression tests, I found a case that this patch
> > doesn't handle.
> >
> > select interval '4 mon' * .3 as product_h;
> >         product_h
> > -----------------------
> > 1 mon 5 days 24:00:00
> > (1 row)
> >
> > This should be 1 mon 6 days. It fails for any number of months
> > greater than 3 that is not evenly divisible by 10, greater than 3
> > months. Do we need to look at the month remainder separately?
>
> Another question.  Is this result correct?
>
>     test=> select '999 months 999 days'::interval / 100;
>             ?column?
>     -------------------------
>      9 mons 38 days 40:33:36
>     (1 row)
>
> Should that be:
>
>      9 mons 39 days 16:33:36
>
> The core problem is that the combined remainder seconds of months and
> days is > 24 hours.

OK, updated patch.  It will fix the >=24:00:00 case because it cascades
up if the remainder number of seconds is greater or equal to one day.
One open item is that it still might show >24 hours if the seconds
computation combined with the remaning seconds >24 hours.  Not sure if
that should be handled or not.  If you fix that, you really are
cascading up because the resulting seconds might be less than the
computed value, e.g. result is 23:00:00, remainder is 02:00:00, cascade
up would be 1 day, 01:00:00.  I am unsure we want to do that.  Right
now, this will show 25:00:00.

--
  Bruce Momjian   bruce@momjian.us
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
Index: src/backend/utils/adt/timestamp.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/utils/adt/timestamp.c,v
retrieving revision 1.166
diff -c -c -r1.166 timestamp.c
*** src/backend/utils/adt/timestamp.c    3 Sep 2006 03:34:04 -0000    1.166
--- src/backend/utils/adt/timestamp.c    4 Sep 2006 03:49:21 -0000
***************
*** 2525,2541 ****
      sec_remainder = (orig_day * (double)SECS_PER_DAY) * factor -
              result->day * (double)SECS_PER_DAY +
              (month_remainder_days - (int32) month_remainder_days) * SECS_PER_DAY;

      /* cascade units down */
      result->day += (int32) month_remainder_days;
  #ifdef HAVE_INT64_TIMESTAMP
      result->time = rint(span->time * factor + sec_remainder * USECS_PER_SEC);
  #else
!     /*
!      *    TSROUND() needed to prevent -146:23:60.00 output on PowerPC for
!      *    SELECT interval '-41 mon -12 days -360:00' * 0.3;
!      */
!     result->time = span->time * factor + TSROUND(sec_remainder);
  #endif

      PG_RETURN_INTERVAL_P(result);
--- 2524,2547 ----
      sec_remainder = (orig_day * (double)SECS_PER_DAY) * factor -
              result->day * (double)SECS_PER_DAY +
              (month_remainder_days - (int32) month_remainder_days) * SECS_PER_DAY;
+     /*
+      *    Might have 24:00:00 hours due to rounding, or >24 hours because of
+      *    time cascade from months and days.  It might still be >24 if the
+      *    combination of cascade and the seconds factor operation itself.
+      */
+     sec_remainder = TSROUND(sec_remainder);
+     if (Abs(sec_remainder) >= SECS_PER_DAY)
+     {
+         sec_remainder -= (int)(sec_remainder / SECS_PER_DAY) * SECS_PER_DAY;
+         result->day += (int)(sec_remainder / SECS_PER_DAY);
+     }

      /* cascade units down */
      result->day += (int32) month_remainder_days;
  #ifdef HAVE_INT64_TIMESTAMP
      result->time = rint(span->time * factor + sec_remainder * USECS_PER_SEC);
  #else
!     result->time = span->time * factor + sec_remainder;
  #endif

      PG_RETURN_INTERVAL_P(result);
***************
*** 2579,2584 ****
--- 2585,2596 ----
      sec_remainder = (orig_day * (double)SECS_PER_DAY) / factor -
              result->day * (double)SECS_PER_DAY +
              (month_remainder_days - (int32) month_remainder_days) * SECS_PER_DAY;
+     sec_remainder = TSROUND(sec_remainder);
+     if (Abs(sec_remainder) >= SECS_PER_DAY)
+     {
+         sec_remainder -= (int)(sec_remainder / SECS_PER_DAY) * SECS_PER_DAY;
+         result->day += (int)(sec_remainder / SECS_PER_DAY);
+     }

      /* cascade units down */
      result->day += (int32) month_remainder_days;
***************
*** 2586,2592 ****
      result->time = rint(span->time / factor + sec_remainder * USECS_PER_SEC);
  #else
      /* See TSROUND comment in interval_mul(). */
!     result->time = span->time / factor + TSROUND(sec_remainder);
  #endif

      PG_RETURN_INTERVAL_P(result);
--- 2598,2604 ----
      result->time = rint(span->time / factor + sec_remainder * USECS_PER_SEC);
  #else
      /* See TSROUND comment in interval_mul(). */
!     result->time = span->time / factor + sec_remainder;
  #endif

      PG_RETURN_INTERVAL_P(result);
Index: src/include/utils/timestamp.h
===================================================================
RCS file: /cvsroot/pgsql/src/include/utils/timestamp.h,v
retrieving revision 1.62
diff -c -c -r1.62 timestamp.h
*** src/include/utils/timestamp.h    13 Jul 2006 16:49:20 -0000    1.62
--- src/include/utils/timestamp.h    4 Sep 2006 03:49:22 -0000
***************
*** 161,171 ****

  typedef double fsec_t;

! /* round off to MAX_TIMESTAMP_PRECISION decimal places */
! /* note: this is also used for rounding off intervals */
  #define TS_PREC_INV 1000000.0
  #define TSROUND(j) (rint(((double) (j)) * TS_PREC_INV) / TS_PREC_INV)
- #endif

  #define TIMESTAMP_MASK(b) (1 << (b))
  #define INTERVAL_MASK(b) (1 << (b))
--- 161,174 ----

  typedef double fsec_t;

! #endif
!
! /*
!  *    Round off to MAX_TIMESTAMP_PRECISION decimal places.
!  *    Note: this is also used for rounding off intervals.
!  */
  #define TS_PREC_INV 1000000.0
  #define TSROUND(j) (rint(((double) (j)) * TS_PREC_INV) / TS_PREC_INV)

  #define TIMESTAMP_MASK(b) (1 << (b))
  #define INTERVAL_MASK(b) (1 << (b))

Re: [HACKERS] Interval aggregate regression failure

From
Bruce Momjian
Date:
Bruce Momjian wrote:
> OK, updated patch.  It will fix the >=24:00:00 case because it cascades
> up if the remainder number of seconds is greater or equal to one day.
> One open item is that it still might show >24 hours if the seconds
> computation combined with the remaning seconds >24 hours.  Not sure if
> that should be handled or not.  If you fix that, you really are
> cascading up because the resulting seconds might be less than the
> computed value, e.g. result is 23:00:00, remainder is 02:00:00, cascade
> up would be 1 day, 01:00:00.  I am unsure we want to do that.  Right
> now, this will show 25:00:00.

Updated patch that uses TSROUND for partial month and days.  Michael
says the tests look good on his system.  I think we are done with this
bug.  :-)

--
  Bruce Momjian   bruce@momjian.us
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
Index: src/backend/utils/adt/timestamp.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/utils/adt/timestamp.c,v
retrieving revision 1.166
diff -c -c -r1.166 timestamp.c
*** src/backend/utils/adt/timestamp.c    3 Sep 2006 03:34:04 -0000    1.166
--- src/backend/utils/adt/timestamp.c    4 Sep 2006 17:54:25 -0000
***************
*** 2514,2541 ****
      /*
       *    Fractional months full days into days.
       *
!      *    The remainders suffer from float rounding, so instead of
!      *    doing the computation using just the remainder, we calculate
!      *    the total number of days and subtract.  Specifically, we are
!      *    multipling by DAYS_PER_MONTH before dividing by factor.
!      *    This greatly reduces rounding errors.
       */
!     month_remainder_days = (orig_month * (double)DAYS_PER_MONTH) * factor -
!             result->month * (double)DAYS_PER_MONTH;
!     sec_remainder = (orig_day * (double)SECS_PER_DAY) * factor -
!             result->day * (double)SECS_PER_DAY +
!             (month_remainder_days - (int32) month_remainder_days) * SECS_PER_DAY;

      /* cascade units down */
      result->day += (int32) month_remainder_days;
  #ifdef HAVE_INT64_TIMESTAMP
      result->time = rint(span->time * factor + sec_remainder * USECS_PER_SEC);
  #else
!     /*
!      *    TSROUND() needed to prevent -146:23:60.00 output on PowerPC for
!      *    SELECT interval '-41 mon -12 days -360:00' * 0.3;
!      */
!     result->time = span->time * factor + TSROUND(sec_remainder);
  #endif

      PG_RETURN_INTERVAL_P(result);
--- 2514,2547 ----
      /*
       *    Fractional months full days into days.
       *
!      *    Floating point calculation are inherently inprecise, so these
!      *    calculations are crafted to produce the most reliable result
!      *    possible.  TSROUND() is needed to more accurately produce whole
!      *    numbers where appropriate.
       */
!     month_remainder_days = (orig_month * factor - result->month) * DAYS_PER_MONTH;
!     month_remainder_days = TSROUND(month_remainder_days);
!     sec_remainder = (orig_day * factor - result->day +
!             month_remainder_days - (int)month_remainder_days) * SECS_PER_DAY;
!     sec_remainder = TSROUND(sec_remainder);
!
!     /*
!      *    Might have 24:00:00 hours due to rounding, or >24 hours because of
!      *    time cascade from months and days.  It might still be >24 if the
!      *    combination of cascade and the seconds factor operation itself.
!      */
!     if (Abs(sec_remainder) >= SECS_PER_DAY)
!     {
!         result->day += (int)(sec_remainder / SECS_PER_DAY);
!         sec_remainder -= (int)(sec_remainder / SECS_PER_DAY) * SECS_PER_DAY;
!     }

      /* cascade units down */
      result->day += (int32) month_remainder_days;
  #ifdef HAVE_INT64_TIMESTAMP
      result->time = rint(span->time * factor + sec_remainder * USECS_PER_SEC);
  #else
!     result->time = span->time * factor + sec_remainder;
  #endif

      PG_RETURN_INTERVAL_P(result);
***************
*** 2574,2584 ****
       *    Fractional months full days into days.  See comment in
       *    interval_mul().
       */
!     month_remainder_days = (orig_month * (double)DAYS_PER_MONTH) / factor -
!             result->month * (double)DAYS_PER_MONTH;
!     sec_remainder = (orig_day * (double)SECS_PER_DAY) / factor -
!             result->day * (double)SECS_PER_DAY +
!             (month_remainder_days - (int32) month_remainder_days) * SECS_PER_DAY;

      /* cascade units down */
      result->day += (int32) month_remainder_days;
--- 2580,2595 ----
       *    Fractional months full days into days.  See comment in
       *    interval_mul().
       */
!     month_remainder_days = (orig_month / factor - result->month) * DAYS_PER_MONTH;
!     month_remainder_days = TSROUND(month_remainder_days);
!     sec_remainder = (orig_day / factor - result->day +
!             month_remainder_days - (int)month_remainder_days) * SECS_PER_DAY;
!     sec_remainder = TSROUND(sec_remainder);
!     if (Abs(sec_remainder) >= SECS_PER_DAY)
!     {
!         result->day += (int)(sec_remainder / SECS_PER_DAY);
!         sec_remainder -= (int)(sec_remainder / SECS_PER_DAY) * SECS_PER_DAY;
!     }

      /* cascade units down */
      result->day += (int32) month_remainder_days;
***************
*** 2586,2592 ****
      result->time = rint(span->time / factor + sec_remainder * USECS_PER_SEC);
  #else
      /* See TSROUND comment in interval_mul(). */
!     result->time = span->time / factor + TSROUND(sec_remainder);
  #endif

      PG_RETURN_INTERVAL_P(result);
--- 2597,2603 ----
      result->time = rint(span->time / factor + sec_remainder * USECS_PER_SEC);
  #else
      /* See TSROUND comment in interval_mul(). */
!     result->time = span->time / factor + sec_remainder;
  #endif

      PG_RETURN_INTERVAL_P(result);
Index: src/include/utils/timestamp.h
===================================================================
RCS file: /cvsroot/pgsql/src/include/utils/timestamp.h,v
retrieving revision 1.62
diff -c -c -r1.62 timestamp.h
*** src/include/utils/timestamp.h    13 Jul 2006 16:49:20 -0000    1.62
--- src/include/utils/timestamp.h    4 Sep 2006 17:54:26 -0000
***************
*** 161,171 ****

  typedef double fsec_t;

! /* round off to MAX_TIMESTAMP_PRECISION decimal places */
! /* note: this is also used for rounding off intervals */
  #define TS_PREC_INV 1000000.0
  #define TSROUND(j) (rint(((double) (j)) * TS_PREC_INV) / TS_PREC_INV)
- #endif

  #define TIMESTAMP_MASK(b) (1 << (b))
  #define INTERVAL_MASK(b) (1 << (b))
--- 161,174 ----

  typedef double fsec_t;

! #endif
!
! /*
!  *    Round off to MAX_TIMESTAMP_PRECISION decimal places.
!  *    Note: this is also used for rounding off intervals.
!  */
  #define TS_PREC_INV 1000000.0
  #define TSROUND(j) (rint(((double) (j)) * TS_PREC_INV) / TS_PREC_INV)

  #define TIMESTAMP_MASK(b) (1 << (b))
  #define INTERVAL_MASK(b) (1 << (b))

Re: [HACKERS] Interval aggregate regression failure

From
Michael Glaesemann
Date:
On Sep 5, 2006, at 10:14 , Bruce Momjian wrote:

> Bruce Momjian wrote:
>> OK, updated patch.  It will fix the >=24:00:00 case because it
>> cascades
>> up if the remainder number of seconds is greater or equal to one day.
>> One open item is that it still might show >24 hours if the seconds
>> computation combined with the remaning seconds >24 hours.  Not
>> sure if
>> that should be handled or not.  If you fix that, you really are
>> cascading up because the resulting seconds might be less than the
>> computed value, e.g. result is 23:00:00, remainder is 02:00:00,
>> cascade
>> up would be 1 day, 01:00:00.  I am unsure we want to do that.  Right
>> now, this will show 25:00:00.
>
> Updated patch that uses TSROUND for partial month and days.  Michael
> says the tests look good on his system.  I think we are done with this
> bug.  :-)

Please find attached regression tests for this patch.

Michael Glaesemann
grzm seespotcode net



Attachment

Re: [HACKERS] Interval aggregate regression failure

From
Bruce Momjian
Date:
Patch applied.  Thanks.

---------------------------------------------------------------------------


Michael Glaesemann wrote:
>
> On Sep 5, 2006, at 10:14 , Bruce Momjian wrote:
>
> > Bruce Momjian wrote:
> >> OK, updated patch.  It will fix the >=24:00:00 case because it
> >> cascades
> >> up if the remainder number of seconds is greater or equal to one day.
> >> One open item is that it still might show >24 hours if the seconds
> >> computation combined with the remaning seconds >24 hours.  Not
> >> sure if
> >> that should be handled or not.  If you fix that, you really are
> >> cascading up because the resulting seconds might be less than the
> >> computed value, e.g. result is 23:00:00, remainder is 02:00:00,
> >> cascade
> >> up would be 1 day, 01:00:00.  I am unsure we want to do that.  Right
> >> now, this will show 25:00:00.
> >
> > Updated patch that uses TSROUND for partial month and days.  Michael
> > says the tests look good on his system.  I think we are done with this
> > bug.  :-)
>
> Please find attached regression tests for this patch.
>
> Michael Glaesemann
> grzm seespotcode net
>

[ Attachment, skipping... ]

>

--
  Bruce Momjian   bruce@momjian.us
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +