Thread: Re: [HACKERS] Interval aggregate regression failure (expected seems
On Sep 3, 2006, at 12:34 , Bruce Momjian wrote: > OK, I worked with Michael and I think this is the best we are going to > do to fix this. It has one TSROUND call for Powerpc, and that is > documented. Applied. As I was working up regression tests, I found a case that this patch doesn't handle. select interval '4 mon' * .3 as product_h; product_h ----------------------- 1 mon 5 days 24:00:00 (1 row) This should be 1 mon 6 days. It fails for any number of months greater than 3 that is not evenly divisible by 10, greater than 3 months. Do we need to look at the month remainder separately? Michael Glaesemann grzm seespotcode net
Is this non-datetime integer only or both? I cannot reproduce the failure here. --------------------------------------------------------------------------- Michael Glaesemann wrote: > > On Sep 3, 2006, at 12:34 , Bruce Momjian wrote: > > > OK, I worked with Michael and I think this is the best we are going to > > do to fix this. It has one TSROUND call for Powerpc, and that is > > documented. Applied. > > As I was working up regression tests, I found a case that this patch > doesn't handle. > > select interval '4 mon' * .3 as product_h; > product_h > ----------------------- > 1 mon 5 days 24:00:00 > (1 row) > > This should be 1 mon 6 days. It fails for any number of months > greater than 3 that is not evenly divisible by 10, greater than 3 > months. Do we need to look at the month remainder separately? > > Michael Glaesemann > grzm seespotcode net -- Bruce Momjian bruce@momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Bruce Momjian <bruce@momjian.us> writes: > Is this non-datetime integer only or both? I cannot reproduce the > failure here. On HPPA with float datetimes with today's code, Michael's case works but it took me less than two minutes to find one that doesn't: regression=# select interval '14 mon' * 8.2 as product_h; product_h --------------------------------- 9 years 6 mons 23 days 24:00:00 (1 row) I reiterate my comment that this approach will never work; any small amount of experimentation will turn up cases that don't round correctly on one platform or another. Float arithmetic is inherently inexact. regards, tom lane
Michael Glaesemann wrote: > > On Sep 3, 2006, at 12:34 , Bruce Momjian wrote: > > > OK, I worked with Michael and I think this is the best we are going to > > do to fix this. It has one TSROUND call for Powerpc, and that is > > documented. Applied. > > As I was working up regression tests, I found a case that this patch > doesn't handle. > > select interval '4 mon' * .3 as product_h; > product_h > ----------------------- > 1 mon 5 days 24:00:00 > (1 row) > > This should be 1 mon 6 days. It fails for any number of months > greater than 3 that is not evenly divisible by 10, greater than 3 > months. Do we need to look at the month remainder separately? Another question. Is this result correct? test=> select '999 months 999 days'::interval / 100; ?column? ------------------------- 9 mons 38 days 40:33:36 (1 row) Should that be: 9 mons 39 days 16:33:36 The core problem is that the combined remainder seconds of months and days is > 24 hours. -- Bruce Momjian bruce@momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
On Sep 4, 2006, at 4:45 , Bruce Momjian wrote: > Another question. Is this result correct? > > test=> select '999 months 999 days'::interval / 100; > ?column? > ------------------------- > 9 mons 38 days 40:33:36 > (1 row) > > Should that be: > > 9 mons 39 days 16:33:36 Yeah, I think it should be. I had been thinking of treating the month and day component as having separate time contributions, but it makes more sense to think of month as a collection of days first, integral or no, and then cascade down the fractional portion of the combined days component to time. I.e., 9.99 mon is 9 mon 29.7 days, rather than 9 mon 29 days 60480 sec. Michael Glaesemann grzm seespotcode net
Michael Glaesemann wrote: > > On Sep 4, 2006, at 4:45 , Bruce Momjian wrote: > > > Another question. Is this result correct? > > > > test=> select '999 months 999 days'::interval / 100; > > ?column? > > ------------------------- > > 9 mons 38 days 40:33:36 > > (1 row) > > > > Should that be: > > > > 9 mons 39 days 16:33:36 > > Yeah, I think it should be. I had been thinking of treating the month > and day component as having separate time contributions, but it makes > more sense to think of month as a collection of days first, integral > or no, and then cascade down the fractional portion of the combined > days component to time. I.e., 9.99 mon is 9 mon 29.7 days, rather > than 9 mon 29 days 60480 sec. No, I don't think so. If we do that, there is no reason to cascade at all. Why not just say 9.1 months? I am going to work on a patch to fix the >24 hours case, which will fix your 24:00:00 case at the same time. Will post in an hour. -- Bruce Momjian bruce@momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Tom Lane wrote: > Bruce Momjian <bruce@momjian.us> writes: > > Is this non-datetime integer only or both? I cannot reproduce the > > failure here. > > On HPPA with float datetimes with today's code, Michael's case works > but it took me less than two minutes to find one that doesn't: > > regression=# select interval '14 mon' * 8.2 as product_h; > product_h > --------------------------------- > 9 years 6 mons 23 days 24:00:00 > (1 row) > > I reiterate my comment that this approach will never work; any small > amount of experimentation will turn up cases that don't round correctly > on one platform or another. Float arithmetic is inherently inexact. Working on a new patch to round; will post. -- Bruce Momjian bruce@momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Bruce Momjian wrote: > Michael Glaesemann wrote: > > > > On Sep 3, 2006, at 12:34 , Bruce Momjian wrote: > > > > > OK, I worked with Michael and I think this is the best we are going to > > > do to fix this. It has one TSROUND call for Powerpc, and that is > > > documented. Applied. > > > > As I was working up regression tests, I found a case that this patch > > doesn't handle. > > > > select interval '4 mon' * .3 as product_h; > > product_h > > ----------------------- > > 1 mon 5 days 24:00:00 > > (1 row) > > > > This should be 1 mon 6 days. It fails for any number of months > > greater than 3 that is not evenly divisible by 10, greater than 3 > > months. Do we need to look at the month remainder separately? > > Another question. Is this result correct? > > test=> select '999 months 999 days'::interval / 100; > ?column? > ------------------------- > 9 mons 38 days 40:33:36 > (1 row) > > Should that be: > > 9 mons 39 days 16:33:36 > > The core problem is that the combined remainder seconds of months and > days is > 24 hours. OK, updated patch. It will fix the >=24:00:00 case because it cascades up if the remainder number of seconds is greater or equal to one day. One open item is that it still might show >24 hours if the seconds computation combined with the remaning seconds >24 hours. Not sure if that should be handled or not. If you fix that, you really are cascading up because the resulting seconds might be less than the computed value, e.g. result is 23:00:00, remainder is 02:00:00, cascade up would be 1 day, 01:00:00. I am unsure we want to do that. Right now, this will show 25:00:00. -- Bruce Momjian bruce@momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + Index: src/backend/utils/adt/timestamp.c =================================================================== RCS file: /cvsroot/pgsql/src/backend/utils/adt/timestamp.c,v retrieving revision 1.166 diff -c -c -r1.166 timestamp.c *** src/backend/utils/adt/timestamp.c 3 Sep 2006 03:34:04 -0000 1.166 --- src/backend/utils/adt/timestamp.c 4 Sep 2006 03:49:21 -0000 *************** *** 2525,2541 **** sec_remainder = (orig_day * (double)SECS_PER_DAY) * factor - result->day * (double)SECS_PER_DAY + (month_remainder_days - (int32) month_remainder_days) * SECS_PER_DAY; /* cascade units down */ result->day += (int32) month_remainder_days; #ifdef HAVE_INT64_TIMESTAMP result->time = rint(span->time * factor + sec_remainder * USECS_PER_SEC); #else ! /* ! * TSROUND() needed to prevent -146:23:60.00 output on PowerPC for ! * SELECT interval '-41 mon -12 days -360:00' * 0.3; ! */ ! result->time = span->time * factor + TSROUND(sec_remainder); #endif PG_RETURN_INTERVAL_P(result); --- 2524,2547 ---- sec_remainder = (orig_day * (double)SECS_PER_DAY) * factor - result->day * (double)SECS_PER_DAY + (month_remainder_days - (int32) month_remainder_days) * SECS_PER_DAY; + /* + * Might have 24:00:00 hours due to rounding, or >24 hours because of + * time cascade from months and days. It might still be >24 if the + * combination of cascade and the seconds factor operation itself. + */ + sec_remainder = TSROUND(sec_remainder); + if (Abs(sec_remainder) >= SECS_PER_DAY) + { + sec_remainder -= (int)(sec_remainder / SECS_PER_DAY) * SECS_PER_DAY; + result->day += (int)(sec_remainder / SECS_PER_DAY); + } /* cascade units down */ result->day += (int32) month_remainder_days; #ifdef HAVE_INT64_TIMESTAMP result->time = rint(span->time * factor + sec_remainder * USECS_PER_SEC); #else ! result->time = span->time * factor + sec_remainder; #endif PG_RETURN_INTERVAL_P(result); *************** *** 2579,2584 **** --- 2585,2596 ---- sec_remainder = (orig_day * (double)SECS_PER_DAY) / factor - result->day * (double)SECS_PER_DAY + (month_remainder_days - (int32) month_remainder_days) * SECS_PER_DAY; + sec_remainder = TSROUND(sec_remainder); + if (Abs(sec_remainder) >= SECS_PER_DAY) + { + sec_remainder -= (int)(sec_remainder / SECS_PER_DAY) * SECS_PER_DAY; + result->day += (int)(sec_remainder / SECS_PER_DAY); + } /* cascade units down */ result->day += (int32) month_remainder_days; *************** *** 2586,2592 **** result->time = rint(span->time / factor + sec_remainder * USECS_PER_SEC); #else /* See TSROUND comment in interval_mul(). */ ! result->time = span->time / factor + TSROUND(sec_remainder); #endif PG_RETURN_INTERVAL_P(result); --- 2598,2604 ---- result->time = rint(span->time / factor + sec_remainder * USECS_PER_SEC); #else /* See TSROUND comment in interval_mul(). */ ! result->time = span->time / factor + sec_remainder; #endif PG_RETURN_INTERVAL_P(result); Index: src/include/utils/timestamp.h =================================================================== RCS file: /cvsroot/pgsql/src/include/utils/timestamp.h,v retrieving revision 1.62 diff -c -c -r1.62 timestamp.h *** src/include/utils/timestamp.h 13 Jul 2006 16:49:20 -0000 1.62 --- src/include/utils/timestamp.h 4 Sep 2006 03:49:22 -0000 *************** *** 161,171 **** typedef double fsec_t; ! /* round off to MAX_TIMESTAMP_PRECISION decimal places */ ! /* note: this is also used for rounding off intervals */ #define TS_PREC_INV 1000000.0 #define TSROUND(j) (rint(((double) (j)) * TS_PREC_INV) / TS_PREC_INV) - #endif #define TIMESTAMP_MASK(b) (1 << (b)) #define INTERVAL_MASK(b) (1 << (b)) --- 161,174 ---- typedef double fsec_t; ! #endif ! ! /* ! * Round off to MAX_TIMESTAMP_PRECISION decimal places. ! * Note: this is also used for rounding off intervals. ! */ #define TS_PREC_INV 1000000.0 #define TSROUND(j) (rint(((double) (j)) * TS_PREC_INV) / TS_PREC_INV) #define TIMESTAMP_MASK(b) (1 << (b)) #define INTERVAL_MASK(b) (1 << (b))
Bruce Momjian wrote: > OK, updated patch. It will fix the >=24:00:00 case because it cascades > up if the remainder number of seconds is greater or equal to one day. > One open item is that it still might show >24 hours if the seconds > computation combined with the remaning seconds >24 hours. Not sure if > that should be handled or not. If you fix that, you really are > cascading up because the resulting seconds might be less than the > computed value, e.g. result is 23:00:00, remainder is 02:00:00, cascade > up would be 1 day, 01:00:00. I am unsure we want to do that. Right > now, this will show 25:00:00. Updated patch that uses TSROUND for partial month and days. Michael says the tests look good on his system. I think we are done with this bug. :-) -- Bruce Momjian bruce@momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + Index: src/backend/utils/adt/timestamp.c =================================================================== RCS file: /cvsroot/pgsql/src/backend/utils/adt/timestamp.c,v retrieving revision 1.166 diff -c -c -r1.166 timestamp.c *** src/backend/utils/adt/timestamp.c 3 Sep 2006 03:34:04 -0000 1.166 --- src/backend/utils/adt/timestamp.c 4 Sep 2006 17:54:25 -0000 *************** *** 2514,2541 **** /* * Fractional months full days into days. * ! * The remainders suffer from float rounding, so instead of ! * doing the computation using just the remainder, we calculate ! * the total number of days and subtract. Specifically, we are ! * multipling by DAYS_PER_MONTH before dividing by factor. ! * This greatly reduces rounding errors. */ ! month_remainder_days = (orig_month * (double)DAYS_PER_MONTH) * factor - ! result->month * (double)DAYS_PER_MONTH; ! sec_remainder = (orig_day * (double)SECS_PER_DAY) * factor - ! result->day * (double)SECS_PER_DAY + ! (month_remainder_days - (int32) month_remainder_days) * SECS_PER_DAY; /* cascade units down */ result->day += (int32) month_remainder_days; #ifdef HAVE_INT64_TIMESTAMP result->time = rint(span->time * factor + sec_remainder * USECS_PER_SEC); #else ! /* ! * TSROUND() needed to prevent -146:23:60.00 output on PowerPC for ! * SELECT interval '-41 mon -12 days -360:00' * 0.3; ! */ ! result->time = span->time * factor + TSROUND(sec_remainder); #endif PG_RETURN_INTERVAL_P(result); --- 2514,2547 ---- /* * Fractional months full days into days. * ! * Floating point calculation are inherently inprecise, so these ! * calculations are crafted to produce the most reliable result ! * possible. TSROUND() is needed to more accurately produce whole ! * numbers where appropriate. */ ! month_remainder_days = (orig_month * factor - result->month) * DAYS_PER_MONTH; ! month_remainder_days = TSROUND(month_remainder_days); ! sec_remainder = (orig_day * factor - result->day + ! month_remainder_days - (int)month_remainder_days) * SECS_PER_DAY; ! sec_remainder = TSROUND(sec_remainder); ! ! /* ! * Might have 24:00:00 hours due to rounding, or >24 hours because of ! * time cascade from months and days. It might still be >24 if the ! * combination of cascade and the seconds factor operation itself. ! */ ! if (Abs(sec_remainder) >= SECS_PER_DAY) ! { ! result->day += (int)(sec_remainder / SECS_PER_DAY); ! sec_remainder -= (int)(sec_remainder / SECS_PER_DAY) * SECS_PER_DAY; ! } /* cascade units down */ result->day += (int32) month_remainder_days; #ifdef HAVE_INT64_TIMESTAMP result->time = rint(span->time * factor + sec_remainder * USECS_PER_SEC); #else ! result->time = span->time * factor + sec_remainder; #endif PG_RETURN_INTERVAL_P(result); *************** *** 2574,2584 **** * Fractional months full days into days. See comment in * interval_mul(). */ ! month_remainder_days = (orig_month * (double)DAYS_PER_MONTH) / factor - ! result->month * (double)DAYS_PER_MONTH; ! sec_remainder = (orig_day * (double)SECS_PER_DAY) / factor - ! result->day * (double)SECS_PER_DAY + ! (month_remainder_days - (int32) month_remainder_days) * SECS_PER_DAY; /* cascade units down */ result->day += (int32) month_remainder_days; --- 2580,2595 ---- * Fractional months full days into days. See comment in * interval_mul(). */ ! month_remainder_days = (orig_month / factor - result->month) * DAYS_PER_MONTH; ! month_remainder_days = TSROUND(month_remainder_days); ! sec_remainder = (orig_day / factor - result->day + ! month_remainder_days - (int)month_remainder_days) * SECS_PER_DAY; ! sec_remainder = TSROUND(sec_remainder); ! if (Abs(sec_remainder) >= SECS_PER_DAY) ! { ! result->day += (int)(sec_remainder / SECS_PER_DAY); ! sec_remainder -= (int)(sec_remainder / SECS_PER_DAY) * SECS_PER_DAY; ! } /* cascade units down */ result->day += (int32) month_remainder_days; *************** *** 2586,2592 **** result->time = rint(span->time / factor + sec_remainder * USECS_PER_SEC); #else /* See TSROUND comment in interval_mul(). */ ! result->time = span->time / factor + TSROUND(sec_remainder); #endif PG_RETURN_INTERVAL_P(result); --- 2597,2603 ---- result->time = rint(span->time / factor + sec_remainder * USECS_PER_SEC); #else /* See TSROUND comment in interval_mul(). */ ! result->time = span->time / factor + sec_remainder; #endif PG_RETURN_INTERVAL_P(result); Index: src/include/utils/timestamp.h =================================================================== RCS file: /cvsroot/pgsql/src/include/utils/timestamp.h,v retrieving revision 1.62 diff -c -c -r1.62 timestamp.h *** src/include/utils/timestamp.h 13 Jul 2006 16:49:20 -0000 1.62 --- src/include/utils/timestamp.h 4 Sep 2006 17:54:26 -0000 *************** *** 161,171 **** typedef double fsec_t; ! /* round off to MAX_TIMESTAMP_PRECISION decimal places */ ! /* note: this is also used for rounding off intervals */ #define TS_PREC_INV 1000000.0 #define TSROUND(j) (rint(((double) (j)) * TS_PREC_INV) / TS_PREC_INV) - #endif #define TIMESTAMP_MASK(b) (1 << (b)) #define INTERVAL_MASK(b) (1 << (b)) --- 161,174 ---- typedef double fsec_t; ! #endif ! ! /* ! * Round off to MAX_TIMESTAMP_PRECISION decimal places. ! * Note: this is also used for rounding off intervals. ! */ #define TS_PREC_INV 1000000.0 #define TSROUND(j) (rint(((double) (j)) * TS_PREC_INV) / TS_PREC_INV) #define TIMESTAMP_MASK(b) (1 << (b)) #define INTERVAL_MASK(b) (1 << (b))
On Sep 5, 2006, at 10:14 , Bruce Momjian wrote: > Bruce Momjian wrote: >> OK, updated patch. It will fix the >=24:00:00 case because it >> cascades >> up if the remainder number of seconds is greater or equal to one day. >> One open item is that it still might show >24 hours if the seconds >> computation combined with the remaning seconds >24 hours. Not >> sure if >> that should be handled or not. If you fix that, you really are >> cascading up because the resulting seconds might be less than the >> computed value, e.g. result is 23:00:00, remainder is 02:00:00, >> cascade >> up would be 1 day, 01:00:00. I am unsure we want to do that. Right >> now, this will show 25:00:00. > > Updated patch that uses TSROUND for partial month and days. Michael > says the tests look good on his system. I think we are done with this > bug. :-) Please find attached regression tests for this patch. Michael Glaesemann grzm seespotcode net
Attachment
Patch applied. Thanks. --------------------------------------------------------------------------- Michael Glaesemann wrote: > > On Sep 5, 2006, at 10:14 , Bruce Momjian wrote: > > > Bruce Momjian wrote: > >> OK, updated patch. It will fix the >=24:00:00 case because it > >> cascades > >> up if the remainder number of seconds is greater or equal to one day. > >> One open item is that it still might show >24 hours if the seconds > >> computation combined with the remaning seconds >24 hours. Not > >> sure if > >> that should be handled or not. If you fix that, you really are > >> cascading up because the resulting seconds might be less than the > >> computed value, e.g. result is 23:00:00, remainder is 02:00:00, > >> cascade > >> up would be 1 day, 01:00:00. I am unsure we want to do that. Right > >> now, this will show 25:00:00. > > > > Updated patch that uses TSROUND for partial month and days. Michael > > says the tests look good on his system. I think we are done with this > > bug. :-) > > Please find attached regression tests for this patch. > > Michael Glaesemann > grzm seespotcode net > [ Attachment, skipping... ] > -- Bruce Momjian bruce@momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +