Thread: Re: Have I found an interval arithmetic bug?

Re: Have I found an interval arithmetic bug?

From
Bruce Momjian
Date:
On Thu, Apr  1, 2021 at 09:46:58PM -0700, Bryn Llewellyn wrote:
> Or am I misunderstanding something?
> 
> Try this. The result of each “select” is shown as the trailing comment on the
> same line. I added whitespace by hand to line up the fields.
> 
> select interval '-1.7 years';                          -- -1 years -8 mons
> 
> select interval '29.4 months';                         --  2 years  5 mons 12 days
> 
> select interval '-1.7 years 29.4 months';              --           8 mons 12 days << wrong
> select interval '29.4 months -1.7 years';              --           9 mons 12 days
> 
> select interval '-1.7 years' + interval '29.4 months'; --           9 mons 12 days
> select interval '29.4 months' + interval '-1.7 years'; --           9 mons 12 days
> 
> As I reason it, the last four “select” statements are all semantically the
> same. They’re just different syntaxes to add the two intervals that the first
> two “select” statements use separately. There’s one odd man out. And I reason
> this one to be wrong. Is there a flaw in my reasoning?

[Thread moved to hackers.]

Looking at your report, I thought I could easily find the cause, but it
wasn't obvious.  What is happening is that when you cast a float to an
integer in C, it rounds toward zero, meaning that 8.4 rounds to 8 and
-8.4 rounds to -8.  The big problem here is that -8.4 is rounding in a
positive direction, while 8.4 rounds in a negative direction.  See this:

    int(int(-8.4) + 29)
            21
    int(int(29) + -8.4)
            20

When you do '-1.7 years' first, it becomes -8, and then adding 29 yields
21.  In the other order, it is 29 - 8.4, which yields 20.6, which
becomes 20.  I honestly had never studied this interaction, though you
would think I would have seen it before.  One interesting issue is that
it only happens when the truncations happen to values with different
signs --- if they are both positive or negative, it is fine.

The best fix, I think, is to use rint() to round to the closest
integer, not toward zero.  The attached patch does this in a few places,
but the code needs more research if we are happy with this approach,
since there are probably other cases.  Using rint() does help to produce
more accurate results, though the regression tests show no change from
this patch.

> Further… there’s a notable asymmetry. The fractional part of “1.7 years” is 8.4
> months. But the fractional part of the months value doesn’t spread further down
> into days. However, the fractional part of “29.4 months” (12 days) _does_
> spread further down into days. What’s the rationale for this asymmetry?

Yes, looking at the code, it seems we only spill down to one unit, not
more.  I think we need to have a discussion if we want to change that. 
I think the idea was that if you specify a non-whole number, you
probably want to spill down one level, but don't want it spilling all
the way to milliseconds, which is certainly possible.

> I can’t see that my observations here can be explained by the difference
> between calendar time and clock time. Here I’m just working with non-metric
> units like feet and inches. One year is just defined as 12 months. And one
> month is just defined as 30 days. All that stuff about adding a month to
> 3-Feb-2020 taking you to 3-Mar-2020 (same for leap years and non-leap years),
> and that other stuff about adding one day to 23:00 on the day before the
> “spring forward” moment taking you to 23:00 on the next day (i.e. when
> intervals are added to timestamps) is downstream of simply adding two
> intervals.

Ah, seems you have done some research.  ;-)

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.



Re: Have I found an interval arithmetic bug?

From
Zhihong Yu
Date:
Bruce:
Thanks for tackling this issue.

The patch looks good to me.
When you have time, can you include the places which were not covered by the current diff?

Cheers

On Fri, Apr 2, 2021 at 11:06 AM Bruce Momjian <bruce@momjian.us> wrote:

Re: Have I found an interval arithmetic bug?

From
John W Higgins
Date:

On Fri, Apr 2, 2021 at 11:05 AM Bruce Momjian <bruce@momjian.us> wrote:
While maybe there is an argument for fixing the negative/positive rounding issue, there is no way this gets solved without breaking the current implementation.

select interval '0.3 years' + interval '0.4 years' - interval '0.7 years' + interval '0.1 years' should not equal 0 but it certainly does.

Unless we take the concept of 0.3 years = 3 months and move to something along the lines of 

1 year = 360 days
1 month = 30 days 

so therefore 

0.3 years = 360 days * 0.3 = 108 days = 3 months 18 days 
0.4 years = 360 days * 0.4 = 144 days = 4 months 24 days
0.7 years = 360 days * 0.7 = 252 days = 8 months 12 days

Then, and only then, if we don't go to any more than tenths of a year, does the math work. Probably this should resolve down to seconds and then work backwards, but unless we're looking at breaking the entire way it currently resolves things, I don't think this is of much value.

Doing math on intervals is like doing math on rounded numbers - there is always going to be a pile of issues because the level of precision just is not good enough.

John

Re: Have I found an interval arithmetic bug?

From
Bruce Momjian
Date:
On Fri, Apr  2, 2021 at 01:05:42PM -0700, Bryn Llewellyn wrote:
> I should come clean about the larger context. I work for Yugabyte, Inc. We have
> a distributed SQL database that uses the Version 11.2 PostgreSQL C code for SQL
> processing “as is”.
> 
> https://blog.yugabyte.com/distributed-postgresql-on-a-google-spanner-architecture-query-layer/
> 
> The founders decided to document YugabyteDB’s SQL functionality explicitly
> rather than just to point to the published PostgreSQL doc. (There are some DDL
> differences that reflect the storage layer differences.) I’m presently
> documenting date-time functionality. This is why I’m so focused on
> understanding the semantics exactly and on understanding the requirements that
> the functionality was designed to meet. I’m struggling with interval
> functionality. I read this:

[Sorry, also moved this to hackers.  I might normally do all the
discussion on general, with patches, and then move it to hackers, but
our PG 14 deadline is next week, so it is best to move it now in hopes
it can be addressed in PG 14.]

Sure, seems like a good idea.

> https://www.postgresql.org/docs/current/datatype-datetime.html#DATATYPE-INTERVAL-INPUT
> 
> « …field values can have fractional parts; for example '1.5 week' or
> '01:02:03.45'. Such input is converted to the appropriate number of months,
> days, and seconds for storage. When this would result in a fractional number of
> months or days, the fraction is added to the lower-order fields using the
> conversion factors 1 month = 30 days and 1 day = 24 hours. For example, '1.5
> month' becomes 1 month and 15 days. Only seconds will ever be shown as
> fractional on output. »
> 
> Notice that the doc says that spill-down goes all the way to seconds and not
> just one unit. This simple test is consistent with the doc (output follows the
> dash-dash comment):
> 
> select ('6.54321 months'::interval)::text as i; --  6 mons 16 days 07:06:40.32
> 
> You see similar spill-down with this:
> 
> select ('876.54321 days'::interval)::text as i; -- 876 days 13:02:13.344
> 
> And so on down through the remaining smaller units. It’s only this test that
> doesn’t spill down one unit:
> 
> select ('6.54321 years'::interval)::text as i; --  6 years 6 mons
> 
> This does suggest a straight bug rather than a case for committee debate about
> what might have been intended. What do you think, Bruce?

So, that gets into more detail.  When I said "spill down one unit", I
was not talking about _visible_ units, but rather the three internal
units used by Postgres:

    https://www.postgresql.org/docs/13/datatype-datetime.html#DATATYPE-INTERVAL-INPUT
    Internally interval values are stored as months, days, and seconds.
                                             -------------------------

However, while that explains why years don't spill beyond months, it
doesn't explain why months would spill beyond days.  This certainly
seems inconsistent.

I have modified the patch to prevent partial months from creating
partial hours/minutes/seconds, so the output is now at least consistent
based on the three units:

    SELECT ('6.54321 years'::interval)::text as i;
           i
    ----------------
     6 years 7 mons
    
    SELECT ('6.54321 months'::interval)::text as i;
           i
    ----------------
     6 mons 16 days
    
    SELECT ('876.54321 days'::interval)::text as i;
               i
    -----------------------
     876 days 13:02:13.344

Partial years keeps it in months, partial months takes it to days, and
partial days take it to hours/minutes/seconds.  This seems like an
improvement.

This also changes the regression test output, I think for the better:

     SELECT INTERVAL '1.5 weeks';
              i
     ------------------
    - 10 days 12:00:00
    + 10 days

The new output is less precise, but probably closer to what the user
wanted.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.



Re: Have I found an interval arithmetic bug?

From
Bruce Momjian
Date:
On Fri, Apr  2, 2021 at 01:27:33PM -0700, Zhihong Yu wrote:
> Bruce:
> Thanks for tackling this issue.
> 
> The patch looks good to me.
> When you have time, can you include the places which were not covered by the
> current diff ?

I have just posted a new version of the patch which I think covers all
the right areas.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.




Re: Have I found an interval arithmetic bug?

From
Bruce Momjian
Date:
On Fri, Apr  2, 2021 at 02:00:03PM -0700, John W Higgins wrote:
> On Fri, Apr 2, 2021 at 11:05 AM Bruce Momjian <bruce@momjian.us> wrote:
> While maybe there is an argument to fixing the negative/positive rounding issue
> - there is no way this gets solved without breaking the current implementation
> 
> select interval '0.3 years' + interval '0.4 years' - interval '0.7 years' +
> interval '0.1 years' should not equal 0 but it certainly does.

My new code returns 0.2 months for this, not zero:

    SELECT  interval '0.3 years' + interval '0.4 years' -
        interval '0.7 years' + interval '0.1 years';
     ?column?
    ----------
     2 mons

which is also wrong since:

    SELECT interval '0.1 years';
     interval
    ----------
     1 mon

> Unless we take the concept of 0.3 years = 3 months and move to something along
> the lines of 
> 
> 1 year = 360 days
> 1 month = 30 days 
> 
> so therefore 
> 
> 0.3 years = 360 days * 0.3 = 108 days = 3 months 18 days 
> 0.4 years = 360 days * 0.4 = 144 days = 4 months 24 days
> 0.7 years = 360 days * 0.7 = 252 days = 8 months 12 days
> 
> Then, and only if we don't go to any more than tenths of a year, does the math
> work. Probably this should resolve down to seconds and then work backwards -
> but unless we're looking at breaking the entire way it currently resolves
> things - I don't think this is of much value.
> 
> Doing math on intervals is like doing math on rounded numbers - there is always
> going to be a pile of issues because the level of precision just is not good
> enough.

I think the big question is what units do people want with fractional
values.  I have posted a follow-up email that spills only for one unit,
which I think is the best approach.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.




Re: Have I found an interval arithmetic bug?

From
Zhihong Yu
Date:
Hi,
bq. My new code returns 0.2 months for this, not zero

Can you clarify? The output below shows 2 mons, not 0.2.

Thanks

On Fri, Apr 2, 2021 at 4:58 PM Bruce Momjian <bruce@momjian.us> wrote:



Re: Have I found an interval arithmetic bug?

From
Bruce Momjian
Date:
On Fri, Apr  2, 2021 at 07:47:32PM -0400, Bruce Momjian wrote:
> I have modified the patch to prevent partial months from creating
> partial hours/minutes/seconds, so the output is now at least consistent
> based on the three units:
> 
>     SELECT ('6.54321 years'::interval)::text as i;
>            i
>     ----------------
>      6 years 7 mons
>     
>     SELECT ('6.54321 months'::interval)::text as i;
>            i
>     ----------------
>      6 mons 16 days
>     
>     SELECT ('876.54321 days'::interval)::text as i;
>                i
>     -----------------------
>      876 days 13:02:13.344
> 
> Partial years keeps it in months, partial months takes it to days, and
> partial days take it to hours/minutes/seconds.  This seems like an
> improvement.
> 
> This also changes the regression test output, I think for the better:
> 
>      SELECT INTERVAL '1.5 weeks';
>               i
>      ------------------
>     - 10 days 12:00:00
>     + 10 days
> 
> The new output is less precise, but probably closer to what the user
> wanted.

Thinking some more about this, the connection between months and days is
very inaccurate, 30 days/month, but the connection between days and
hours/minutes/seconds is pretty accurate, except for leap days. 
Therefore, returning "10 days 12:00:00" is in many ways better, but
returning hours/minutes/seconds for fractional months is very arbitrary
and suggests an accuracy that doesn't exist.  However, I am afraid that
trying to enforce that distinction in the Postgres behavior would appear
very arbitrary, so what I did above is probably the best I can do.
Here is another example of what we have:

    SELECT INTERVAL '1.5 years';
       interval
    ---------------
     1 year 6 mons
    
    SELECT INTERVAL '1.5 months';
       interval
    ---------------
     1 mon 15 days
    
    SELECT INTERVAL '1.5 weeks';
     interval
    ----------
     10 days
    
    SELECT INTERVAL '1.5 days';
        interval
    ----------------
     1 day 12:00:00
    
    SELECT INTERVAL '1.5 hours';
     interval
    ----------
     01:30:00

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.




Re: Have I found an interval arithmetic bug?

From
Bryn Llewellyn
Date:
bruce@momjian.us wrote:

I have just posted a new version of the patch which I think covers all the right areas.

I found the relevant email from you to pgsql-hackers here:


You said:

I have modified the patch to prevent partial months from creating partial hours/minutes/seconds… Partial years keeps it in months, partial months takes it to days, and partial days take it to hours/minutes/seconds. This seems like an improvement.

I have written some PL/pgSQL code that faithfully emulates the behavior that I see in my present vanilla PostgreSQL Version 13.2 system in a wide range of tests. This is the key part:

  m1           int     not null := trunc(p.mo);
  m_remainder  numeric not null := p.mo - m1::numeric;
  m            int     not null := trunc(p.yy*12) + m1;

  d_real       numeric not null := p.dd + m_remainder*30.0;
  d            int     not null := floor(d_real);
  d_remainder  numeric not null := d_real - d::numeric;

  s            numeric not null := d_remainder*24.0*60.0*60.0 +
                                   p.hh*60.0*60.0 +
                                   p.mi*60.0 +
                                   p.ss;
begin
  return (m, d, s)::modeled_interval_t;
end;

These quantities:

p.yy, p.mo, p.dd, p.hh, p.mi, and p.ss

are the user’s parameterization. All are real numbers. Because non-integral values for years, months, days, hours, and minutes are allowed when you specify a value using the ::interval typecast, my reference doc must state the rules. I would have struggled to express these rules in prose—especially given the use both of trunc() and floor(). I would have struggled more to explain what requirements these rules meet.

I gather that by the time YugabyteDB has adopted your patch, my PL/pgSQL will no longer be a correct emulation. So I’ll re-write it then.

I intend to advise users always to constrain the values, when they specify an interval value explicitly, so that the years, months, days, hours, and minutes are integers. This is, after all, the discipline that the make_interval() built-in imposes. So I might recommend using only that.









Re: Have I found an interval arithmetic bug?

From
Bruce Momjian
Date:
On Fri, Apr  2, 2021 at 05:50:59PM -0700, Bryn Llewellyn wrote:
> are the user’s parameterization. All are real numbers. Because non-integral
> values for years, months, days, hours, and minutes are allowed when you specify
> a value using the ::interval typecast, my reference doc must state the rules. I
> would have struggled to express these rules in prose—especially given the use
> both of trunc() and floor(). I would have struggled more to explain what
> requirements these rules meet.

The fundamental issue is that while months, days, and seconds are
consistent in their own units, when you have to cross from one unit to
another, it is by definition imprecise, since the interval is not tied
to a specific date, with its own days-of-the-month and leap days and
daylight savings time changes.  It feels like it is going to be
imprecise no matter what we do.

Adding to this is the fact that interval values are stored in C 'struct
tm' defined in libc's ctime(), where months are integers, so carrying
around non-integer month values until we get a final result would add a
lot of complexity, and complexity to a system that is by definition
imprecise, which doesn't seem worth it.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.




Re: Have I found an interval arithmetic bug?

From
Zhihong Yu
Date:
Hi,
I got a local build with second patch where:

yugabyte=# SELECT  interval '0.3 years' + interval '0.4 years' -
                interval '0.7 years';
 ?column?
----------
 1 mon

I think the outcome is a bit unintuitive (I would expect result close to 0).

Cheers

On Fri, Apr 2, 2021 at 5:07 PM Zhihong Yu <zyu@yugabyte.com> wrote:



Re: Have I found an interval arithmetic bug?

From
Isaac Morland
Date:
On Fri, 2 Apr 2021 at 21:08, Zhihong Yu <zyu@yugabyte.com> wrote:
Hi,
I got a local build with second patch where:

yugabyte=# SELECT  interval '0.3 years' + interval '0.4 years' -
                interval '0.7 years';
 ?column?
----------
 1 mon

I think the outcome is a bit unintuitive (I would expect result close to 0).

That's not fundamentally different from this:

odyssey=> select 12 * 3/10 + 12 * 4/10 - 12 * 7/10;
 ?column? 
----------
       -1
(1 row)

odyssey=> 

And actually the result is pretty close to 0. I mean it’s less than 0.1 year.

I wonder if it might have been better if only integers had been accepted for the components? If you want 0.3 years write 0.3 * '1 year'::interval. But changing it now would be a pretty significant backwards compatibility break.

There's really no avoiding counterintuitive behaviour though. Look at this:

odyssey=> select 0.3 * '1 year'::interval + 0.4 * '1 year'::interval - 0.7 * '1 year'::interval;
     ?column?     
------------------
 -1 mons +30 days
(1 row)

odyssey=> select 0.3 * '1 year'::interval + 0.4 * '1 year'::interval - 0.7 * '1 year'::interval = '0';
 ?column? 
----------
 t
(1 row)

odyssey=> 

In other words, doing the “same” calculation but with multiplying 1 year intervals by floats to get the values to add, you end up with an interval that while not identical to 0 does compare equal to 0. So very close to 0; in fact, as close to 0 as you can get without actually being identically 0.

Re: Have I found an interval arithmetic bug?

From
Zhihong Yu
Date:
Hi,
The mix of interval arithmetic and float comparison is not easy to interpret. See the following (I got 0.0833 since the result of the interval '0.3 years' + interval '0.4 years' - ... query was 1 month, and 1/12 ~= 0.0833).

yugabyte=# select 0.3 * '1 year'::interval + 0.4 * '1 year'::interval - 0.7 * '1 year'::interval = '0.0833 year'::interval;
 ?column?
----------
 f

As long as Bruce's patch makes improvements over the current behavior, I think that's fine.

Cheers

On Fri, Apr 2, 2021 at 6:24 PM Isaac Morland <isaac.morland@gmail.com> wrote:

Re: Have I found an interval arithmetic bug?

From
Bruce Momjian
Date:
On Fri, Apr  2, 2021 at 06:11:08PM -0700, Zhihong Yu wrote:
> Hi,
> I got a local build with second patch where:
> 
> yugabyte=# SELECT  interval '0.3 years' + interval '0.4 years' -
>                 interval '0.7 years';
>  ?column?
> ----------
>  1 mon
> 
> I think the outcome is a bit unintuitive (I would expect result close to 0).

Uh, the current code returns:

    SELECT  interval '0.3 years' + interval '0.4 years' - interval '0.7 years';
     ?column?
    ----------
     -1 mon

and with the patch it is:
    
    SELECT  interval '0.3 years' + interval '0.4 years' - interval '0.7 years';
     ?column?
    ----------
     1 mon

What it isn't, is zero months, which is the obviously ideal answer.
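Both answers fall out of how each fractional-year literal is converted to whole months on its own, before the interval addition. A small Python sketch of the two rounding modes (math.trunc mimics the C cast in the current code; Python's round mimics rint(), which the patch uses; this is an illustration, not the server code):

```python
import math

# Each literal is converted to whole months independently, before the
# interval addition, so the rounding mode decides the final answer.
years = [0.3, 0.4, -0.7]
trunc_months = sum(math.trunc(y * 12) for y in years)  # C cast: toward zero
rint_months = sum(round(y * 12) for y in years)        # rint(): to nearest

print(trunc_months)  # -1  (current behavior: '-1 mon')
print(rint_months)   #  1  (patched behavior: '1 mon')
```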


-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.




Re: Have I found an interval arithmetic bug?

From
Bruce Momjian
Date:
On Fri, Apr  2, 2021 at 07:06:08PM -0700, Zhihong Yu wrote:
> Hi,
> The mix of interval and comparison with float is not easy to interpret. See the
> following (I got 0.0833 since the result for interval '0.3 years' + interval
> '0.4 years' - ... query was 1 month and 1/12 ~= 0.0833).
> 
> yugabyte=# select 0.3 * '1 year'::interval + 0.4 * '1 year'::interval - 0.7 *
> '1 year'::interval = '0.0833 year'::interval;
>  ?column?
> ----------
>  f
> 
> As long as Bruce's patch makes improvements over the current behavior, I think
> that's fine.

I wish I could figure out how to improve it any further.  What is odd is
that I have never seen this reported as a problem before.  I plan to
apply this early next week for PG 14.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.




Re: Have I found an interval arithmetic bug?

From
Zhihong Yu
Date:
Bruce:
In src/interfaces/ecpg/pgtypeslib/interval.c, how about the following places ?

Around line 158:
                case 'Y':
                    tm->tm_year += val;
                    tm->tm_mon += (fval * MONTHS_PER_YEAR);

Around line 194:
                    tm->tm_year += val;
                    tm->tm_mon += (fval * MONTHS_PER_YEAR);

Is rint() needed for these two cases ?

Cheers

On Fri, Apr 2, 2021 at 7:21 PM Bruce Momjian <bruce@momjian.us> wrote:
On Fri, Apr  2, 2021 at 07:06:08PM -0700, Zhihong Yu wrote:
> Hi,
> The mix of interval and comparison with float is not easy to interpret. See the
> following (I got 0.0833 since the result for interval '0.3 years' + interval
> '0.4 years' - ... query was 1 month and 1/12 ~= 0.0833).
>
> yugabyte=# select 0.3 * '1 year'::interval + 0.4 * '1 year'::interval - 0.7 *
> '1 year'::interval = '0.0833 year'::interval;
>  ?column?
> ----------
>  f
>
> As long as Bruce's patch makes improvements over the current behavior, I think
> that's fine.

I wish I could figure out how to improve it any further.  What is odd is
that I have never seen this reported as a problem before.  I plan to
apply this early next week for PG 14.

--
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.

Re: Have I found an interval arithmetic bug?

From
Bruce Momjian
Date:
On Fri, Apr  2, 2021 at 07:53:35PM -0700, Zhihong Yu wrote:
> Bruce:
> In src/interfaces/ecpg/pgtypeslib/interval.c, how about the following places ?
> 
> Around line 158:
>                 case 'Y':
>                     tm->tm_year += val;
>                     tm->tm_mon += (fval * MONTHS_PER_YEAR);
> 
> Around line 194:
>                     tm->tm_year += val;
>                     tm->tm_mon += (fval * MONTHS_PER_YEAR);
> 
> Is rint() needed for these two cases ?

Ah, yes, good point.  Updated patch attached.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.


Attachment

Re: Have I found an interval arithmetic bug?

From
Justin Pryzby
Date:
On Fri, Apr 02, 2021 at 10:21:26PM -0400, Bruce Momjian wrote:
> I wish I could figure out how to improve it any further.  What is odd is
> that I have never seen this reported as a problem before.  I plan to
> apply this early next week for PG 14.

On Fri, Apr 02, 2021 at 01:05:42PM -0700, Bryn Llewellyn wrote:
> bruce@momjian.us wrote:
> > Yes, looking at the code, it seems we only spill down to one unit, not more. I think we need to have a discussion
> > if we want to change that.
 

If this is a bug, then there's no deadline - and if you're proposing a behavior
change, then I don't think it's a good time to begin the discussion.

> https://www.postgresql.org/docs/current/datatype-datetime.html#DATATYPE-INTERVAL-INPUT
> « …field values can have fractional parts; for example '1.5 week' or '01:02:03.45'. Such input is converted to the
> appropriate number of months, days, and seconds for storage. When this would result in a fractional number of months or
> days, the fraction is added to the lower-order fields using the conversion factors 1 month = 30 days and 1 day = 24
> hours. For example, '1.5 month' becomes 1 month and 15 days. Only seconds will ever be shown as fractional on output. »
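The quoted rule can be sketched directly; this is only an illustration of the documented conversion factors (1 month = 30 days, 1 day = 24 hours), not the actual parser code:

```python
import math

def cascade(months_in, days_in, seconds_in=0.0):
    # Documented spill-down: fractional months -> days (x 30),
    # fractional days -> seconds (x 86400).
    months = math.trunc(months_in)
    days_f = days_in + (months_in - months) * 30
    days = math.trunc(days_f)
    seconds = seconds_in + (days_f - days) * 86400
    return months, days, seconds

print(cascade(1.5, 0))  # (1, 15, 0.0): '1.5 month' becomes 1 mon 15 days
```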
 

Your patch changes what seems to be the intended behavior, with the test case
added by:

|commit 57bfb27e60055c10e42b87e68a894720c2f78e70
|Author: Tom Lane <tgl@sss.pgh.pa.us>
|Date:   Mon Sep 4 01:26:28 2006 +0000
|
|    Fix interval input parser so that fractional weeks and months are
|    cascaded first to days and only what is leftover into seconds.  This

And documented by:

|commit dbf57d31f8d7bf5c058a9fab2a1ccb4a336864ce
|Author: Tom Lane <tgl@sss.pgh.pa.us>
|Date:   Sun Nov 9 17:09:48 2008 +0000
|
|    Add some documentation about handling of fractions in interval input.
|    (It's always worked like this, but we never documented it before.)

If you were to change the behavior, I think you'd have to update the
documentation, too - but I think that's not a desirable change.

I *am* curious why the YEAR, DECADE, CENTURY, and MILLENNIUM cases only handle
fractional intervals down to the next smaller unit, and not down to
seconds/milliseconds.  I wrote a patch to handle that by adding
AdjustFractMons(), if we agree that it's desirable to change.

-- 
Justin



Re: Have I found an interval arithmetic bug?

From
Bruce Momjian
Date:
On Mon, Apr  5, 2021 at 11:33:10AM -0500, Justin Pryzby wrote:
> On Fri, Apr 02, 2021 at 10:21:26PM -0400, Bruce Momjian wrote:
> > I wish I could figure out how to improve it any further.  What is odd is
> > that I have never seen this reported as a problem before.  I plan to
> > apply this early next week for PG 14.
> 
> On Fri, Apr 02, 2021 at 01:05:42PM -0700, Bryn Llewellyn wrote:
> > bruce@momjian.us wrote:
> > > Yes, looking at the code, it seems we only spill down to one unit, not more. I think we need to have a discussion
> > > if we want to change that.
 
> 
> If this is a bug, then there's no deadline - and if you're proposing a behavior
> change, then I don't think it's a good time to begin the discussion.

Well, bug or not, we are not going to change back branches for this, and
if you want a larger discussion, it will have to wait for PG 15.

> > https://www.postgresql.org/docs/current/datatype-datetime.html#DATATYPE-INTERVAL-INPUT
> > « …field values can have fractional parts; for example '1.5 week' or '01:02:03.45'. Such input is converted to the
> > appropriate number of months, days, and seconds for storage. When this would result in a fractional number of months
> > or days, the fraction is added to the lower-order fields using the conversion factors 1 month = 30 days and 1 day =
> > 24 hours. For example, '1.5 month' becomes 1 month and 15 days. Only seconds will ever be shown as fractional on output. »
 

I see that.  What is not clear here is how far we flow down.  I was
looking at adding documentation or regression tests for that, but was
unsure.  I adjusted the docs slightly in the attached patch.

> Your patch changes what seems to be the intended behavior, with the test case
> added by:
> 
> |commit 57bfb27e60055c10e42b87e68a894720c2f78e70
> |Author: Tom Lane <tgl@sss.pgh.pa.us>
> |Date:   Mon Sep 4 01:26:28 2006 +0000
> |
> |    Fix interval input parser so that fractional weeks and months are
> |    cascaded first to days and only what is leftover into seconds.  This
> 
> And documented by:
> 
> |commit dbf57d31f8d7bf5c058a9fab2a1ccb4a336864ce
> |Author: Tom Lane <tgl@sss.pgh.pa.us>
> |Date:   Sun Nov 9 17:09:48 2008 +0000
> |
> |    Add some documentation about handling of fractions in interval input.
> |    (It's always worked like this, but we never documented it before.)
> 
> If you were to change the behavior, I think you'd have to update the
> documentation, too - but I think that's not a desirable change.

> I *am* curious why the YEAR, DECADE, CENTURY, and MILLENNIUM cases only handle
> fractional intervals down to the next smaller unit, and not down to
> seconds/milliseconds.  I wrote a patch to handle that by adding
> AdjustFractMons(), if we agree that it's desirable to change.

The interaction of months/days/seconds is so imprecise that passing it
further down doesn't make much sense, and suggests a precision that
doesn't exist, but if people prefer that we can do it.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.


Attachment

Re: Have I found an interval arithmetic bug?

From
Justin Pryzby
Date:
On Mon, Apr 05, 2021 at 02:01:58PM -0400, Bruce Momjian wrote:
> On Mon, Apr  5, 2021 at 11:33:10AM -0500, Justin Pryzby wrote:
> > > https://www.postgresql.org/docs/current/datatype-datetime.html#DATATYPE-INTERVAL-INPUT
> > > « …field values can have fractional parts; for example '1.5 week' or '01:02:03.45'. Such input is converted to
> > > the appropriate number of months, days, and seconds for storage. When this would result in a fractional number of
> > > months or days, the fraction is added to the lower-order fields using the conversion factors 1 month = 30 days and
> > > 1 day = 24 hours. For example, '1.5 month' becomes 1 month and 15 days. Only seconds will ever be shown as fractional on output. »
 
> 
> I see that.  What is not clear here is how far we flow down.  I was
> looking at adding documentation or regression tests for that, but was
> unsure.  I adjusted the docs slightly in the attached patch.

I should have adjusted the quote to include context:

| In the verbose input format, and in SOME FIELDS of the more compact input formats, field values can have fractional
parts[...]

I don't know what "some fields" means - more clarity here would help indicate
the intended behavior.

> The interaction of months/days/seconds is so imprecise that passing it
> futher down doesn't make much sense, and suggests a precision that
> doesn't exist, but if people prefer that we can do it.

I agree on its face that "months" is imprecise (31, 30, 29, 28 days),
especially fractional months, and same for "years" (leap years), and hours per
day (DST), but even minutes ("leap seconds").  But the documentation seems to
be clear about the behavior:

| .. using the conversion factors 1 month = 30 days and 1 day = 24 hours

I think the most obvious/consistent change is for years and greater to "cascade
down" to seconds, and not just months.
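What Justin proposes, years and greater cascading all the way down to seconds, would look something like the sketch below. The function name and shape are illustrative assumptions, not the patch itself:

```python
import math

def cascade_all(years, months=0.0, days=0.0, hours=0.0):
    # Every fractional remainder spills to the next lower unit, using
    # 1 year = 12 months, 1 month = 30 days, 1 day = 24 hours.
    months_f = years * 12 + months
    m = math.trunc(months_f)
    days_f = (months_f - m) * 30 + days
    d = math.trunc(days_f)
    seconds = ((days_f - d) * 24 + hours) * 3600
    return m, d, seconds

print(cascade_all(0, 2.5))     # (2, 15, 0.0)
print(cascade_all(0, 0, 1.5))  # (0, 1, 43200.0): half a day of seconds
```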

-- 
Justin



Re: Have I found an interval arithmetic bug?

From
Bruce Momjian
Date:
On Mon, Apr  5, 2021 at 01:15:22PM -0500, Justin Pryzby wrote:
> On Mon, Apr 05, 2021 at 02:01:58PM -0400, Bruce Momjian wrote:
> > On Mon, Apr  5, 2021 at 11:33:10AM -0500, Justin Pryzby wrote:
> > > > https://www.postgresql.org/docs/current/datatype-datetime.html#DATATYPE-INTERVAL-INPUT
> > > > « …field values can have fractional parts; for example '1.5 week' or '01:02:03.45'. Such input is converted to
> > > > the appropriate number of months, days, and seconds for storage. When this would result in a fractional number of
> > > > months or days, the fraction is added to the lower-order fields using the conversion factors 1 month = 30 days
> > > > and 1 day = 24 hours. For example, '1.5 month' becomes 1 month and 15 days. Only seconds will ever be shown as fractional on output. »
 
> > 
> > I see that.  What is not clear here is how far we flow down.  I was
> > looking at adding documentation or regression tests for that, but was
> > unsure.  I adjusted the docs slightly in the attached patch.
> 
> I should have adjusted the quote to include context:
> 
> > | In the verbose input format, and in SOME FIELDS of the more compact input formats, field values can have
> > fractional parts[...]
> 
> I don't know what "some fields" means - more clarity here would help indicate
> the intended behavior.

I assume it is comparing the verbose format to the ISO 8601 time
intervals format, which I have not looked at.  Interestingly, I see this as
a C comment at the top of DecodeISO8601Interval():

     *  A couple exceptions from the spec:
     *   - a week field ('W') may coexist with other units
-->     *   - allows decimals in fields other than the least significant unit.

I don't actually see anything in our code that doesn't support fractional
values, so maybe the docs are wrong and need to be fixed.

Actually, according to our regression tests, this fails:

    SELECT '5.5 seconds 3 milliseconds'::interval;
    ERROR:  invalid input syntax for type interval: "5.5 seconds 3 milliseconds"

but that is the verbose format, I think.

> > The interaction of months/days/seconds is so imprecise that passing it
> > futher down doesn't make much sense, and suggests a precision that
> > doesn't exist, but if people prefer that we can do it.
> 
> I agree on its face that "months" is imprecise (30, 31, 27, 28 days),
> especially fractional months, and same for "years" (leap years), and hours per
> day (DST), but even minutes ("leap seconds").  But the documentation seems to
> be clear about the behavior:
> 
> | .. using the conversion factors 1 month = 30 days and 1 day = 24 hours
> 
> I think the most obvious/consistent change is for years and greater to "cascade
> down" to seconds, and not just months.

Wow, well, that is _an_ option.  Would people like that?  It is certainly
easier to explain.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.




Re: Have I found an interval arithmetic bug?

From
Bryn Llewellyn
Date:
> On 05-Apr-2021, at 11:37, Bruce Momjian <bruce@momjian.us> wrote:
>
> On Mon, Apr  5, 2021 at 01:15:22PM -0500, Justin Pryzby wrote:
>> On Mon, Apr 05, 2021 at 02:01:58PM -0400, Bruce Momjian wrote:
>>> On Mon, Apr  5, 2021 at 11:33:10AM -0500, Justin Pryzby wrote:
>>>>> https://www.postgresql.org/docs/current/datatype-datetime.html#DATATYPE-INTERVAL-INPUT
>>>>> « …field values can have fractional parts; for example '1.5 week' or '01:02:03.45'. Such input is converted to
>>>>> the appropriate number of months, days, and seconds for storage. When this would result in a fractional number of
>>>>> months or days, the fraction is added to the lower-order fields using the conversion factors 1 month = 30 days
>>>>> and 1 day = 24 hours. For example, '1.5 month' becomes 1 month and 15 days. Only seconds will ever be shown as fractional on output. »
>>>
>>> I see that.  What is not clear here is how far we flow down.  I was
>>> looking at adding documentation or regression tests for that, but was
>>> unsure.  I adjusted the docs slightly in the attached patch.
>>
>> I should have adjusted the quote to include context:
>>
>> | In the verbose input format, and in SOME FIELDS of the more compact input formats, field values can have
>> fractional parts[...]
>>
>> I don't know what "some fields" means - more clarity here would help indicate
>> the intended behavior.
>
> I assume it is comparing the verbose format to the ISO 8601 time
> intervals format, which I have not looked at.  Interesting I see this as
> a C comment at the top of DecodeISO8601Interval();
>
>      *  A couple exceptions from the spec:
>      *   - a week field ('W') may coexist with other units
> -->     *   - allows decimals in fields other than the least significant unit.
>
> I don't actually see anything in our code that doesn't support fractional
> values, so maybe the docs are wrong and need to be fixed.
>
> Actually, according to our regression tests, this fails:
>
>     SELECT '5.5 seconds 3 milliseconds'::interval;
>     ERROR:  invalid input syntax for type interval: "5.5 seconds 3 milliseconds"
>
> but that is the verbose format, I think.
>
>>> The interaction of months/days/seconds is so imprecise that passing it
>>> futher down doesn't make much sense, and suggests a precision that
>>> doesn't exist, but if people prefer that we can do it.
>>
>> I agree on its face that "months" is imprecise (30, 31, 27, 28 days),
>> especially fractional months, and same for "years" (leap years), and hours per
>> day (DST), but even minutes ("leap seconds").  But the documentation seems to
>> be clear about the behavior:
>>
>> | .. using the conversion factors 1 month = 30 days and 1 day = 24 hours
>>
>> I think the most obvious/consistent change is for years and greater to "cascade
>> down" to seconds, and not just months.
>
> Wow, well, that is _an_ option.  Would people like that?  It is certainly
> easier to explain.

It seems to me that this whole business is an irrevocable mess. The original design could have brought three
overload-distinguishable types, "interval month", "interval day", and "interval second"—each represented internally as
a scalar. There could have been built-ins to convert between them using conventionally specified rules. Then interval
arithmetic would have been clear. For example, an attempt to assign the difference between two timestamps to anything
but "interval second" would cause an error (as it does in Oracle Database, even though there are only two interval
kinds). But we can only deal with what we have and accept the fact that the doc will inevitably be tortuous.

Given this, I agree that fractional years should simply convert to fractional months (to be then added to
verbatim-given fractional months) just before representing the months as the trunc() of the value and cascading the
remainder down to days. Units like century would fall out naturally in the same way.


ABOUT LEAP SECONDS

Look at this (from Feb 2005):

«
PostgreSQL does not support leap seconds
https://www.postgresql.org/message-id/1162319515.20050202141132@mail.ru
»

I don't know if the title reports a state of affairs in the hope that this be changed to bring such support—or whether
it simply states what obtains and always will. Anyway, a simple test (below) shows that PG version 13.2 doesn't honor
leap seconds.

DETAIL

First, it helps me to demonstrate, using leap years, that this is a base phenomenon of the proleptic Gregorian calendar
that PG uses—and has nothing to do with time zones. (If it did, then the leap year notion could be time zone dependent.)
Do this:

select
  '2020-02-29'::date as "date",
  '2020-02-29 23:59:59.99999'::timestamp as "plain timestamp";

This is the result:

    date    |      plain timestamp
------------+---------------------------
 2020-02-29 | 2020-02-29 23:59:59.99999

Changing the year to 2021 brings the 22008 error "date/time field value out of range". (Of course, you have to split
the test into two pieces to be sure that you get the same error with both data types.)

This suggests a test that uses '23:59:60.000000' for the time. However, try this first:

select
  '23:59:60.000000'::time as "time",
  '2021-04-05 23:59:60.000000'::timestamp as "plain timestamp";

   time   |   plain timestamp
----------+---------------------
 24:00:00 | 2021-04-06 00:00:00

This is annoying. It reflects what seems to me to be an unfortunate design choice. Anyway, this behavior will never
change. But it means that a precise discussion needs more words than had one minute been taken as a closed-open
interval—[0, 60) seconds—(with 59.99999 legal and 60.000000 illegal). It's too boring to type all those words here.
Just do this:

select '2021-04-05 23:59:60.5'::timestamp as "plain timestamp";

This is the result:

    plain timestamp
-----------------------
 2021-04-06 00:00:00.5

Of course, there was no leap second (on the planet—never mind databases) at this moment. The most recent leap second
was 2016-12-31 at 23:59:60 (UTC), meaning that '60.000000' through '60.999999' were all meaningful times on 31-Dec that
year. So try this:

select '2016-12-31 23:59:60.5'::timestamp as "should be leap second";

This is the result:

 should be leap second
-----------------------
 2017-01-01 00:00:00.5

This tells me that the subject line of the email from 2005 remains correct: PostgreSQL does not support leap seconds.
Given this, we can safely say that one minute is exactly 60 seconds (and that one hour is exactly 60 minutes) and never
mention leap seconds ever again. I assume that it's this that must have informed the decision to represent an interval
value as the three fields months, days, and seconds.
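Bryn's conclusion (no leap seconds, so a minute is always exactly 60 seconds) matches how ordinary calendar libraries behave. For instance, Python's datetime makes the same assumption and simply rolls the would-be leap second into the next day:

```python
from datetime import datetime, timedelta

# No leap-second support: seconds at or past 60 overflow into the next
# minute, so 2016-12-31 23:59:60.5 lands on Jan 1 rather than on the
# real-world leap second.
t = datetime(2016, 12, 31, 23, 59, 0) + timedelta(seconds=60.5)
print(t)  # 2017-01-01 00:00:00.500000
```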




Re: Have I found an interval arithmetic bug?

From
Bruce Momjian
Date:
On Mon, Apr 5, 2021 at 01:06:36PM -0700, Bryn Llewellyn wrote:
> > On 05-Apr-2021, at 11:37, Bruce Momjian <bruce@momjian.us> wrote:
> > > On Mon, Apr  5, 2021 at 01:15:22PM -0500, Justin Pryzby wrote:
>
> It seems to me that this whole business is an irrevocable mess. The
> original design could have brought three overload-distinguishable
> types, "interval month", "interval day", and "interval second"—each
> represented internally as a scalar. There could have been built-ins
> to convert between them using conventionally specified rules. Then
> interval arithmetic would have been clear. For example, an attempt to
> assign the difference between two timestamps to anything but "interval
> second" would cause an error (as it does in Oracle database, even
> though there there are only two interval kinds). But we can only deal
> with what we have and accept the fact that the doc will inevitably be
> tortuous.

The problem with making three data types is that someone is going to
want to use a mixture of those, so I am not sure it is a win.

> Given this, I agree that fractional years should simply convert to
> fractional months (to be then added to verbatim-given fractional
> months) just before representing the months as the trunc() of the
> value and cascading the remainder down to days. Units like century
> would fall out naturally in the same way.

I am confused --- are you saying we should do the interval addition,
then truncate, because we don't do that now, and it would be hard to do?

> ABOUT LEAP SECONDS
>
> Look at this (from Feb 2005):
>
> « PostgreSQL does not support leap seconds
> https://www.postgresql.org/message-id/1162319515.20050202141132@mail.r
> u »
>
> I don't know if the title reports a state of affairs in the hope that
> this be changed to bring such support—or whether it simply states
> what obtains and always will. Anyway, a simple test (below) shows that
> PG Version 13.2 doesn't honor leap seconds.

Postgres is documented as not supporting leap seconds:

    https://www.postgresql.org/docs/13/functions-datetime.html#FUNCTIONS-DATETIME-EXTRACT
    
    timezone
    
        The time zone offset from UTC, measured in seconds. Positive values
    correspond to time zones east of UTC, negative values to zones west of
    UTC. (Technically, PostgreSQL does not use UTC because leap seconds are
    not handled.)

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.




Re: Have I found an interval arithmetic bug?

From
Bryn Llewellyn
Date:
> On 05-Apr-2021, at 13:35, Bruce Momjian <bruce@momjian.us> wrote:
>
> On Mon, Apr 5, 2021 at 01:06:36PM -0700, Bryn Llewellyn wrote:
>>> On 05-Apr-2021, at 11:37, Bruce Momjian <bruce@momjian.us> wrote:
>>>> On Mon, Apr  5, 2021 at 01:15:22PM -0500, Justin Pryzby wrote:
>>
>> It seems to me that this whole business is an irrevocable mess. The
>> original design could have brought three overload-distinguishable
>> types, "interval month", "interval day", and "interval second"—each
>> represented internally as a scalar. There could have been built-ins
>> to convert between them using conventionally specified rules. Then
>> interval arithmetic would have been clear. For example, an attempt to
>> assign the difference between two timestamps to anything but "interval
>> second" would cause an error (as it does in Oracle database, even
>> though there there are only two interval kinds). But we can only deal
>> with what we have and accept the fact that the doc will inevitably be
>> tortuous.
>
> The problem with making three data types is that someone is going to
> want to use a mixture of those, so I am not sure it is a win.
>
>> Given this, I agree that fractional years should simply convert to
>> fractional months (to be then added to verbatim-given fractional
>> months) just before representing the months as the trunc() of the
>> value and cascading the remainder down to days. Units like century
>> would fall out naturally in the same way.
>
> I am confused --- are you saying we should do the interval addition,
> then truncate, because we don't do that now, and it would be hard to do.
>
>> ABOUT LEAP SECONDS
>>
>> Look at this (from Feb 2005):
>>
>> « PostgreSQL does not support leap seconds
>> https://www.postgresql.org/message-id/1162319515.20050202141132@mail.ru »
>>
>> I don't know if the title reports a state of affairs in the hope that
>> this be changed to bring such support—or whether it simply states
>> what obtains and always will. Anyway, a simple test (below) shows that
>> PG Version 13.2 doesn't honor leap seconds.
>
> Postgres is documented as not supporting leap seconds:
>
>     https://www.postgresql.org/docs/13/functions-datetime.html#FUNCTIONS-DATETIME-EXTRACT
>
>     timezone
>
>         The time zone offset from UTC, measured in seconds. Positive values
>     correspond to time zones east of UTC, negative values to zones west of
>     UTC. (Technically, PostgreSQL does not use UTC because leap seconds are
>     not handled.)

Thanks for the “leap seconds not supported” link. Google’s search within site refused to find that for me. (Talk about
well hidden.)

About “three data [interval] types”, it’s too late anyway. So I’ll say no more.

Re “are you saying we should do the interval addition, then truncate, because we don't do that now, and it would be
hard to do.” I wasn’t thinking of interval addition at all. Simply how the three values that make up the internal
representation are computed from a specified interval value. Like the PL/pgSQL simulation I showed you in an earlier
reply. I can't find that in the archive now. So here it is again. Sorry for the repetition.

p.yy, p.mo, p.dd, p.hh, p.mi, and p.ss are the input values.

m, d, and s are the internal representation

  m1           int     not null := trunc(p.mo);
  m_remainder  numeric not null := p.mo - m1::numeric;
  m            int     not null := trunc(p.yy*12) + m1;

  d_real       numeric not null := p.dd + m_remainder*30.0;
  d            int     not null := floor(d_real);
  d_remainder  numeric not null := d_real - d::numeric;

  s            numeric not null := d_remainder*24.0*60.0*60.0 +
                                   p.hh*60.0*60.0 +
                                   p.mi*60.0 +
                                   p.ss;
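A direct Python translation of Bryn's PL/pgSQL sketch, using exact decimal arithmetic so the illustration isn't muddied by float rounding. The variable names mirror his; this is his simulation, not the server code:

```python
from decimal import Decimal
import math

def internal_representation(yy=0, mo=0, dd=0, hh=0, mi=0, ss=0):
    # Exact decimal arithmetic, mirroring Bryn's PL/pgSQL variables.
    yy, mo, dd, hh, mi, ss = (Decimal(str(v)) for v in (yy, mo, dd, hh, mi, ss))
    m1 = int(mo)                      # trunc(p.mo): int() truncates toward zero
    m_remainder = mo - m1
    m = int(yy * 12) + m1             # trunc(p.yy * 12) + m1
    d_real = dd + m_remainder * 30
    d = math.floor(d_real)            # floor(d_real)
    d_remainder = d_real - d
    s = d_remainder * 24 * 60 * 60 + hh * 60 * 60 + mi * 60 + ss
    return m, d, s

print(internal_representation(yy=-1.7))  # -20 months, i.e. '-1 years -8 mons'
print(internal_representation(mo=29.4))  # 29 mons 12 days, i.e. '2 years 5 mons 12 days'
```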

I have a harness to supply years, months, days, hours, minutes, and seconds values (like the literal does them), and to
get them back (as "extract" gets them) using the actual implementation and my simulation. The two approaches have never
disagreed using a wide range of inputs.

The algorithm that my code shows (esp. with both trunc() and floor() in play) is too hard to describe in words.






Re: Have I found an interval arithmetic bug?

From
Bruce Momjian
Date:
On Mon, Apr  5, 2021 at 02:01:58PM -0400, Bruce Momjian wrote:
> On Mon, Apr  5, 2021 at 11:33:10AM -0500, Justin Pryzby wrote:
> Well, bug or not, we are not going to change back branches for this, and
> if you want a larger discussion, it will have to wait for PG 15.
> 
> > > https://www.postgresql.org/docs/current/datatype-datetime.html#DATATYPE-INTERVAL-INPUT
> > > « …field values can have fractional parts; for example '1.5 week' or '01:02:03.45'. Such input is converted to
theappropriate number of months, days, and seconds for storage. When this would result in a fractional number of months
ordays, the fraction is added to the lower-order fields using the conversion factors 1 month = 30 days and 1 day = 24
hours.For example, '1.5 month' becomes 1 month and 15 days. Only seconds will ever be shown as fractional on output. »
 
> 
> I see that.  What is not clear here is how far we flow down.  I was
> looking at adding documentation or regression tests for that, but was
> unsure.  I adjusted the docs slightly in the attached patch.

Here is an updated patch, which will be for PG 15.  It updates the
documentation to state:

    The fractional parts are used to compute appropriate values for the next
    lower-order internal fields (months, days, seconds).

It removes the flow from fractional months/weeks to
hours-minutes-seconds, and adds missing rounding for fractional
computations.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.


Attachment

Re: Have I found an interval arithmetic bug?

From
Bryn Llewellyn
Date:
> On 08-Apr-2021, at 10:24, Bruce Momjian <bruce@momjian.us> wrote:
>
> On Mon, Apr  5, 2021 at 02:01:58PM -0400, Bruce Momjian wrote:
>> On Mon, Apr  5, 2021 at 11:33:10AM -0500, Justin Pryzby wrote:
>> Well, bug or not, we are not going to change back branches for this, and
>> if you want a larger discussion, it will have to wait for PG 15.
>>
>>>> https://www.postgresql.org/docs/current/datatype-datetime.html#DATATYPE-INTERVAL-INPUT
>>>> « …field values can have fractional parts; for example '1.5 week' or '01:02:03.45'. Such input is converted to the
>>>> appropriate number of months, days, and seconds for storage. When this would result in a fractional number of months
>>>> or days, the fraction is added to the lower-order fields using the conversion factors 1 month = 30 days and 1 day =
>>>> 24 hours. For example, '1.5 month' becomes 1 month and 15 days. Only seconds will ever be shown as fractional on output. »
>>
>> I see that.  What is not clear here is how far we flow down.  I was
>> looking at adding documentation or regression tests for that, but was
>> unsure.  I adjusted the docs slightly in the attached patch.
>
> Here is an updated patch, which will be for PG 15.  It updates the
> documentation to state:
>
>     The fractional parts are used to compute appropriate values for the next
>     lower-order internal fields (months, days, seconds).
>
> It removes the flow from fractional months/weeks to
> hours-minutes-seconds, and adds missing rounding for fractional
> computations.

Thank you Bruce. I look forward to documenting this new algorithm for YugabyteDB. The algorithm implements the
transformation from this:

[
  yy_in numeric,
  mo_in numeric,
  dd_in numeric,
  hh_in numeric,
  mi_in numeric,
  ss_in numeric
]

to this:

[
  mo_internal_representation int,
  dd_internal_representation int,
  ss_internal_representation numeric(1000,6)
]

I am convinced that a prose account of the algorithm, by itself, is not the best way to tell the reader the rules that
the algorithm implements. Rather, pseudocode is needed. I mentioned before that, better still, is actual executable
PL/pgSQL code. (I can expect readers to be fluent in PL/pgSQL.) Given this executable simulation, an informal prose
sketch of what it does will definitely add value.

May I ask you to fill in the body of this stub by translating the C that you have in hand?

create type internal_representation_t as(
  mo_internal_representation int,
  dd_internal_representation int,
  ss_internal_representation numeric(1000,6));

create function internal_representation(
  yy_in numeric default 0,
  mo_in numeric default 0,
  dd_in numeric default 0,
  hh_in numeric default 0,
  mi_in numeric default 0,
  ss_in numeric default 0)
  returns internal_representation_t
  language plpgsql
as $body$
declare
  mo_internal_representation  int     not null := 0;
  dd_internal_representation  int     not null := 0;
  ss_internal_representation  numeric not null := 0.0;

  ok constant boolean :=
    (yy_in is not null) and
    (mo_in is not null) and
    (dd_in is not null) and
    (hh_in is not null) and
    (mi_in is not null) and
    (ss_in is not null);
begin
  assert ok, 'No actual argument, when provided, may be null';

  -- The algorithm.

  return (mo_internal_representation, dd_internal_representation,
ss_internal_representation)::internal_representation_t;
end;
$body$;
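In the meantime, here is a rough Python model of the rule as I read the patch and this thread. This is a hypothetical sketch, not a translation of the actual C; the real implementation may differ in detail (in particular around exact rounding at half-way points):

```python
import math

def interval_internal(yy=0.0, mo=0.0, dd=0.0, hh=0.0, mi=0.0, ss=0.0):
    # Hypothetical model of the patched rule, NOT the actual server C:
    # each fractional part spills only into the next lower-order internal
    # field (months, days, seconds), using 1 month = 30 days and
    # 1 day = 24 hours, and rounds rather than truncating toward zero.
    yy_i = math.trunc(yy)
    mo_i = math.trunc(mo)
    dd_i = math.trunc(dd)
    # fractional years flow to months, and no further
    months = yy_i * 12 + round((yy - yy_i) * 12) + mo_i
    # fractional months flow to days, and no further
    days = dd_i + round((mo - mo_i) * 30)
    # fractional days flow on into the seconds field
    seconds = (dd - dd_i) * 86400 + hh * 3600 + mi * 60 + ss
    return months, days, round(seconds, 6)
```

With this model, `interval_internal(yy=-1.7)` gives `(-20, 0, 0.0)` (i.e. "-1 years -8 mons"), and `interval_internal(yy=-1.7, mo=29.4)` gives `(9, 12, 0.0)`, matching the "9 mons 12 days" reported earlier in the thread.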

By the way, I believe that a user might well decide always to supply all the fields in a "from text to interval" typecast, except for the seconds, as integral values. This, after all, is what the "make_interval()" function enforces. But, because the typecast approach allows non-integral values, the reference documentation must explain the rules unambiguously so that the reader can predict the outcome of any ad hoc test that they might try.

It's a huge pity that the three values of the internal representation cannot be observed directly using SQL because each behaves with different semantics when an interval value is added to a timestamptz value. However, as a second best (and knowing the algorithm above), a user can create interval values where only one of the three fields is populated and test their understanding of the semantic rules that way.


Re: Have I found an interval arithmetic bug?

From
Zhihong Yu
Date:


On Thu, Apr 8, 2021 at 10:24 AM Bruce Momjian <bruce@momjian.us> wrote:
On Mon, Apr  5, 2021 at 02:01:58PM -0400, Bruce Momjian wrote:
> On Mon, Apr  5, 2021 at 11:33:10AM -0500, Justin Pryzby wrote:
> Well, bug or not, we are not going to change back branches for this, and
> if you want a larger discussion, it will have to wait for PG 15.
>
> > > https://www.postgresql.org/docs/current/datatype-datetime.html#DATATYPE-INTERVAL-INPUT
> > > « …field values can have fractional parts; for example '1.5 week' or '01:02:03.45'. Such input is converted to the appropriate number of months, days, and seconds for storage. When this would result in a fractional number of months or days, the fraction is added to the lower-order fields using the conversion factors 1 month = 30 days and 1 day = 24 hours. For example, '1.5 month' becomes 1 month and 15 days. Only seconds will ever be shown as fractional on output. »
>
> I see that.  What is not clear here is how far we flow down.  I was
> looking at adding documentation or regression tests for that, but was
> unsure.  I adjusted the docs slightly in the attached patch.

Here is an updated patch, which will be for PG 15.  It updates the
documentation to state:

        The fractional parts are used to compute appropriate values for the next
        lower-order internal fields (months, days, seconds).

It removes the flow from fractional months/weeks to
hours-minutes-seconds, and adds missing rounding for fractional
computations.

--
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.


+1 to this patch. 

Re: Have I found an interval arithmetic bug?

From
Zhihong Yu
Date:


On Sun, Apr 11, 2021 at 12:57 PM Zhihong Yu <zyu@yugabyte.com> wrote:


On Thu, Apr 8, 2021 at 10:24 AM Bruce Momjian <bruce@momjian.us> wrote:
On Mon, Apr  5, 2021 at 02:01:58PM -0400, Bruce Momjian wrote:
> On Mon, Apr  5, 2021 at 11:33:10AM -0500, Justin Pryzby wrote:
> Well, bug or not, we are not going to change back branches for this, and
> if you want a larger discussion, it will have to wait for PG 15.
>
> > > https://www.postgresql.org/docs/current/datatype-datetime.html#DATATYPE-INTERVAL-INPUT
> > > « …field values can have fractional parts; for example '1.5 week' or '01:02:03.45'. Such input is converted to the appropriate number of months, days, and seconds for storage. When this would result in a fractional number of months or days, the fraction is added to the lower-order fields using the conversion factors 1 month = 30 days and 1 day = 24 hours. For example, '1.5 month' becomes 1 month and 15 days. Only seconds will ever be shown as fractional on output. »
>
> I see that.  What is not clear here is how far we flow down.  I was
> looking at adding documentation or regression tests for that, but was
> unsure.  I adjusted the docs slightly in the attached patch.

Here is an updated patch, which will be for PG 15.  It updates the
documentation to state:

        The fractional parts are used to compute appropriate values for the next
        lower-order internal fields (months, days, seconds).

It removes the flow from fractional months/weeks to
hours-minutes-seconds, and adds missing rounding for fractional
computations.

--
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.


+1 to this patch. 

Bryn reminded me, off list, about the flowing down from fractional day after the patch.

Before Bruce confirms the removal of the flowing down from fractional day, I withhold my previous +1.

Bryn would respond with more details.

Cheers

Re: Have I found an interval arithmetic bug?

From
Zhihong Yu
Date:


On Sun, Apr 11, 2021 at 4:33 PM Zhihong Yu <zyu@yugabyte.com> wrote:


On Sun, Apr 11, 2021 at 12:57 PM Zhihong Yu <zyu@yugabyte.com> wrote:


On Thu, Apr 8, 2021 at 10:24 AM Bruce Momjian <bruce@momjian.us> wrote:
On Mon, Apr  5, 2021 at 02:01:58PM -0400, Bruce Momjian wrote:
> On Mon, Apr  5, 2021 at 11:33:10AM -0500, Justin Pryzby wrote:
> Well, bug or not, we are not going to change back branches for this, and
> if you want a larger discussion, it will have to wait for PG 15.
>
> > > https://www.postgresql.org/docs/current/datatype-datetime.html#DATATYPE-INTERVAL-INPUT
> > > « …field values can have fractional parts; for example '1.5 week' or '01:02:03.45'. Such input is converted to the appropriate number of months, days, and seconds for storage. When this would result in a fractional number of months or days, the fraction is added to the lower-order fields using the conversion factors 1 month = 30 days and 1 day = 24 hours. For example, '1.5 month' becomes 1 month and 15 days. Only seconds will ever be shown as fractional on output. »
>
> I see that.  What is not clear here is how far we flow down.  I was
> looking at adding documentation or regression tests for that, but was
> unsure.  I adjusted the docs slightly in the attached patch.

Here is an updated patch, which will be for PG 15.  It updates the
documentation to state:

        The fractional parts are used to compute appropriate values for the next
        lower-order internal fields (months, days, seconds).

It removes the flow from fractional months/weeks to
hours-minutes-seconds, and adds missing rounding for fractional
computations.

--
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.


+1 to this patch. 

Bryn reminded me, off list, about the flowing down from fractional day after the patch.

Before Bruce confirms the removal of the flowing down from fractional day, I withhold my previous +1.

Bryn would respond with more details.

Cheers

Among previous examples given by Bryn, the following produces correct result based on Bruce's patch.

# select interval '-1.7 years 29.4 months';
    interval
----------------
 9 mons 12 days
(1 row)
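The sign behavior behind this fix can be sketched in Python. This is a hypothetical illustration of truncation versus rounding, not the server code:

```python
# A hypothetical illustration (not the server code) of why the old
# truncate-toward-zero arithmetic was order-dependent and rounding is not.
trunc = int  # C-style float -> int conversion truncates toward zero

# old behavior: each step truncates, so the order of the terms matters
old_years_first  = trunc(trunc(-8.4) + 29)   # -8 + 29, then trunc -> 21
old_months_first = trunc(29 + -8.4)          # trunc(20.6)        -> 20

# patched behavior: each fractional contribution is rounded symmetrically
# before being combined, so both orders agree
new_years_first  = round(-8.4) + 29          # -8 + 29            -> 21
new_months_first = 29 + round(-8.4)          # 29 + -8            -> 21
```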

Cheers

Re: Have I found an interval arithmetic bug?

From
Bruce Momjian
Date:
On Sun, Apr 11, 2021 at 07:33:34PM -0700, Zhihong Yu wrote:
> Among previous examples given by Bryn, the following produces correct result
> based on Bruce's patch.
> 
> # select interval '-1.7 years 29.4 months';
>     interval
> ----------------
>  9 mons 12 days

Yes, that change is caused by the rounding fixes, and not by the unit
pushdown adjustments.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.




Re: Have I found an interval arithmetic bug?

From
Bryn Llewellyn
Date:
bruce@momjian.us wrote:
>
> zyu@yugabyte.com wrote:
>> Among previous examples given by Bryn, the following produces correct result based on Bruce's patch.
>>
>> # select interval '-1.7 years 29.4 months';
>>     interval
>> ----------------
>>  9 mons 12 days
>
> Yes, that change is caused by the rounding fixes, and not by the unit pushdown adjustments.

I showed you all this example a long time ago:

select (
    '
      3.853467 years
    '::interval
  )::text as i;

This behavior is the same in the env. of Bruce’s patch as in unpatched PG 13.2. This is the result.

3 years 10 mons

Notice that "3.853467 years" is "3 years" plus "10.241604 months". This explains the "10 mons" in the result. But the 0.241604 months remainder did not spill down into days.
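The arithmetic behind the quirk can be checked directly. This is a sketch of my reading; the server's C may differ in detail:

```python
# '3.853467 years': the year fraction becomes months, which comes out as 10,
# and (in both 13.2 and the patched code) the months remainder is then
# discarded rather than spilling on into days.
frac_years = 3.853467 - 3
months_exact = frac_years * 12      # 10.241604 months
months_kept = round(months_exact)   # -> 10; the 0.241604 remainder is dropped
```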

Can anybody defend this quirk? An extension of this example with a real number of months in the user input is correspondingly yet more quirky. The rules can be written down. But they’re too tortuous to allow an ordinary mortal confidently to design code that relies on them.

(I was unable to find any rule statement that lets the user predict this in the doc. But maybe that’s because of my feeble searching skills.)

If there is no defense (and I cannot imagine one) might Bruce’s patch normalize this too to follow this rule:

— convert 'y years m months' to the real number y*12 + m.

— record trunc(y*12 + m) in the "months" field of the internal representation

— flow the remainder down to days (but no further)

After all, you've bitten the bullet now and changed the behavior. This means that the semantics of some extant applications will change. So... in for a penny, in for a pound?


Re: Have I found an interval arithmetic bug?

From
Bruce Momjian
Date:
On Mon, Apr 12, 2021 at 03:09:48PM -0700, Bryn Llewellyn wrote:
> I showed you all this example a long time ago:
> 
> select (
>     '
>       3.853467 years
>     '::interval
>   )::text as i;
> 
> This behavior is the same in the env. of Bruce’s patch as in unpatched PG 13.2. This is the result.
> 
> 3 years 10 mons
> 
> Notice that "3.853467 years" is "3 years" plus "10.241604 months". This explains the "10 mons" in the result. But the 0.241604 months remainder did not spill down into days.
 
> 
> Can anybody defend this quirk? An extension of this example with a real number of months in the user input is correspondingly yet more quirky. The rules can be written down. But they’re too tortuous to allow an ordinary mortal confidently to design code that relies on them.
 
> 
> (I was unable to find any rule statement that lets the user predict this in the doc. But maybe that’s because of my feeble searching skills.)
 
> 
> If there is no defense (and I cannot imagine one) might Bruce’s patch normalize this too to follow this rule:
> 
> — convert 'y years m months' to the real number y*12 + m.
> 
> — record trunc(y*12 + m) in the "months" field of the internal representation
> 
> — flow the remainder down to days (but no further)
> 
> After all, you've bitten the bullet now and changed the behavior. This means that the semantics of some extant applications will change. So... in for a penny, in for a pound?
 

The docs now say:

     Field values can have fractional parts;  for example, <literal>'1.5
     weeks'</literal> or <literal>'01:02:03.45'</literal>.  The fractional
-->  parts are used to compute appropriate values for the next lower-order
     internal fields (months, days, seconds).

meaning fractional years flows to the next lower internal unit, months,
and no further.  Fractional months would flow to days.  The idea of not
flowing past the next lower-order internal field is that the
approximations between units are not precise enough to flow accurately.

With my patch, the output is now:

    SELECT INTERVAL '3 years 10.241604 months';
            interval
    ------------------------
     3 years 10 mons 7 days

It used to flow to seconds.
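The single-step flow-down can be sketched like this (my reading of the rule; the server code may round differently at half-way points):

```python
# The fractional part of 10.241604 months spills into days
# (1 month = 30 days) and stops there.
frac_months = 10.241604 - 10        # 0.241604 months left over
days = round(frac_months * 30)      # -> 7, hence "3 years 10 mons 7 days"
```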

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.




Re: Have I found an interval arithmetic bug?

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> On Mon, Apr 12, 2021 at 03:09:48PM -0700, Bryn Llewellyn wrote:
>> After all, you've bitten the bullet now and changed the behavior. This means that the semantics of some extant applications will change. So... in for a penny, in for a pound?

> The docs now say:

>      Field values can have fractional parts;  for example, <literal>'1.5
>      weeks'</literal> or <literal>'01:02:03.45'</literal>.  The fractional
> -->  parts are used to compute appropriate values for the next lower-order
>      internal fields (months, days, seconds).

> meaning fractional years flows to the next lower internal unit, months,
> and no further.  Fractional months would flow to days.  The idea of not
> flowing past the next lower-order internal field is that the
> approximations between units are not precise enough to flow accurately.

Um, what's the argument for down-converting AT ALL?  The problem is
precisely that any such conversion is mostly fictional.

> With my patch, the output is now:

>     SELECT INTERVAL '3 years 10.241604 months';
>             interval
>     ------------------------
>      3 years 10 mons 7 days

> It used to flow to seconds.

Yeah, that's better than before, but I don't see any principled argument
for it not to be "3 years 10 months", full stop.

            regards, tom lane



Re: Have I found an interval arithmetic bug?

From
Bruce Momjian
Date:
On Mon, Apr 12, 2021 at 07:38:21PM -0400, Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > On Mon, Apr 12, 2021 at 03:09:48PM -0700, Bryn Llewellyn wrote:
> >> After all, you've bitten the bullet now and changed the behavior. This means that the semantics of some extant applications will change. So... in for a penny, in for a pound?
 
> 
> > The docs now say:
> 
> >      Field values can have fractional parts;  for example, <literal>'1.5
> >      weeks'</literal> or <literal>'01:02:03.45'</literal>.  The fractional
> > -->  parts are used to compute appropriate values for the next lower-order
> >      internal fields (months, days, seconds).
> 
> > meaning fractional years flows to the next lower internal unit, months,
> > and no further.  Fractional months would flow to days.  The idea of not
> > flowing past the next lower-order internal field is that the
> > approximations between units are not precise enough to flow accurately.
> 
> Um, what's the argument for down-converting AT ALL?  The problem is
> precisely that any such conversion is mostly fictional.

True.

> > With my patch, the output is now:
> 
> >     SELECT INTERVAL '3 years 10.241604 months';
> >             interval
> >     ------------------------
> >      3 years 10 mons 7 days
> 
> > It used to flow to seconds.
> 
> Yeah, that's better than before, but I don't see any principled argument
> for it not to be "3 years 10 months", full stop.

Well, the case was:

    SELECT INTERVAL '0.1 months';
     interval
    ----------
     3 days
    
    SELECT INTERVAL '0.1 months' + interval '0.9 months';
     ?column?
    ----------
     30 days

If you always truncate, you basically lose the ability to specify
fractional internal units, which I think is useful.  I would say if you
use fractional units of one of the internal units, you are basically
knowing you are asking for an approximation --- that is not true of '3.5
years', for example.
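The trade-off between the two proposals can be sketched for '0.1 months' (a hypothetical illustration, assuming 1 month = 30 days as in the docs):

```python
import math

# Contrast of the two proposals for '0.1 months':
no_flow_down = math.trunc(0.1)      # "no down-conversion": fraction lost -> 0
one_step     = round(0.1 * 30)      # patched rule: fraction becomes 3 days

# and the sum discussed above: fractional months add up to days, not a month
total_days = round(0.1 * 30) + round(0.9 * 30)   # -> 30 days, not '1 month'
```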

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.




Re: Have I found an interval arithmetic bug?

From
Bryn Llewellyn
Date:
tgl@sss.pgh.pa.us wrote:

bruce@momjian.us writes:
bryn@yugabyte.com wrote:
After all, you've bitten the bullet now and changed the behavior. This means that the semantics of some extant applications will change. So... in for a penny, in for a pound?

The docs now say:

    Field values can have fractional parts;  for example, <literal>'1.5
    weeks'</literal> or <literal>'01:02:03.45'</literal>.  The fractional
-->  parts are used to compute appropriate values for the next lower-order
    internal fields (months, days, seconds).

meaning fractional years flows to the next lower internal unit, months, and no further.  Fractional months would flow to days.  The idea of not flowing past the next lower-order internal field is that the approximations between units are not precise enough to flow accurately.

Um, what's the argument for down-converting AT ALL?  The problem is precisely that any such conversion is mostly fictional.

With my patch, the output is now:

SELECT INTERVAL '3 years 10.241604 months';
       interval
------------------------
3 years 10 mons 7 days

It used to flow to seconds.

Yeah, that's better than before, but I don't see any principled argument for it not to be "3 years 10 months", full stop.

Tom, I fully support your suggestion to have no flow down at all. Please may this happen! However, the new rule must be described in terms of the three fields of the internal representation: [months, days, seconds]. This representation is already documented.

Don’t forget that '731.42587 hours’ is read back as "731:25:33.132" (or, if you prefer, 731 hours, 25 minutes, and 33.132 seconds if you use "extract" and your own pretty print). But we don’t think of this as “flow down”. Rather, it’s just a conventional representation of the seconds field of the internal representation. I could go on. But you all know what I mean.

By the way, I made a nice little demo (for my doc project). It shows that:

(1) if you pick the right date-time just before a DST change, and do the addition in the right time zone, then adding 24 hours gets a different answer than adding one day.

(2) if you pick the right date-time just before 29-Feb in a leap year, then adding 30 days gets a different answer than adding one month.

You all know why. And though the doc could explain and illustrate this better, it does tell you to expect this. It also says that the difference in semantics that these examples show is the reason for the three-field internal representation.

It seems to me that both the age-old behavior that vanilla 13.2 exhibits, and the behavior in the regime of Bruce’s patch are like adding 2.2 oranges to 1.3 oranges and getting 3 oranges and 21 apples (because 1 orange is conventionally the same as 42 apples). Bruce made a step in the right direction by stopping oranges convert all the way down to bananas. But it would be so much better to get rid of this false equivalence business altogether.


Re: Have I found an interval arithmetic bug?

From
Bryn Llewellyn
Date:
> On 12-Apr-2021, at 17:00, Bruce Momjian <bruce@momjian.us> wrote:
>
> On Mon, Apr 12, 2021 at 07:38:21PM -0400, Tom Lane wrote:
>> Bruce Momjian <bruce@momjian.us> writes:
>>> On Mon, Apr 12, 2021 at 03:09:48PM -0700, Bryn Llewellyn wrote:
>>>> After all, you've bitten the bullet now and changed the behavior. This means that the semantics of some extant applications will change. So... in for a penny, in for a pound?
>>
>>> The docs now say:
>>
>>>     Field values can have fractional parts;  for example, <literal>'1.5
>>>     weeks'</literal> or <literal>'01:02:03.45'</literal>.  The fractional
>>> -->  parts are used to compute appropriate values for the next lower-order
>>>     internal fields (months, days, seconds).
>>
>>> meaning fractional years flows to the next lower internal unit, months,
>>> and no further.  Fractional months would flow to days.  The idea of not
>>> flowing past the next lower-order internal field is that the
>>> approximations between units are not precise enough to flow accurately.
>>
>> Um, what's the argument for down-converting AT ALL?  The problem is
>> precisely that any such conversion is mostly fictional.
>
> True.
>
>>> With my patch, the output is now:
>>
>>>     SELECT INTERVAL '3 years 10.241604 months';
>>>             interval
>>>     ------------------------
>>>      3 years 10 mons 7 days
>>
>>> It used to flow to seconds.
>>
>> Yeah, that's better than before, but I don't see any principled argument
>> for it not to be "3 years 10 months", full stop.
>
> Well, the case was:
>
>     SELECT INTERVAL '0.1 months';
>      interval
>     ----------
>      3 days
>
>     SELECT INTERVAL '0.1 months' + interval '0.9 months';
>      ?column?
>     ----------
>      30 days
>
> If you always truncate, you basically lose the ability to specify
> fractional internal units, which I think is useful.  I would say if you
> use fractional units of one of the internal units, you are basically
> knowing you are asking for an approximation --- that is not true of '3.5
> years', for example.

I’d argue that the fact that this:

('0.3 months'::interval) + ('0.7 months'::interval)

Is reported as '30 days' and not '1 month' is yet another bug—precisely because of what I said in my previous email (sorry that I forked the thread) where I referred to the fact that, in the right test, adding 1 month gets a different answer than adding 30 days. Yet another convincing reason to get rid of this flow down business altogether.

If some application wants to model flow-down, then it can do so with trivial programming and full control over its own definition of the rules.




Re: Have I found an interval arithmetic bug?

From
Bruce Momjian
Date:
On Mon, Apr 12, 2021 at 05:20:43PM -0700, Bryn Llewellyn wrote:
> I’d argue that the fact that this:
>
> ('0.3 months'::interval) + ('0.7 months'::interval)
>
> Is reported as '30 days' and not '1 month' is yet another
> bug—precisely because of what I said in my previous email (sorry
> that I forked the thread) where I referred to the fact that, in the
> right test, adding 1 month gets a different answer than adding 30
> days. 

Flowing _up_ is what these functions do:

    \df *justify*
                                   List of functions
       Schema   |       Name       | Result data type | Argument data types | Type
    ------------+------------------+------------------+---------------------+------
     pg_catalog | justify_days     | interval         | interval            | func
     pg_catalog | justify_hours    | interval         | interval            | func
     pg_catalog | justify_interval | interval         | interval            | func

> Yet another convincing reason to get rid of this flow down
> business altogether.

We can certainly get rid of all downflow, which in the current patch is
only when fractional internal units are specified.

> If some application wants to model flow-down, then it can do so with
> trivial programming and full control over its own definition of the
> rules.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.




Re: Have I found an interval arithmetic bug?

From
Zhihong Yu
Date:


On Mon, Apr 12, 2021 at 4:22 PM Bruce Momjian <bruce@momjian.us> wrote:
On Mon, Apr 12, 2021 at 03:09:48PM -0700, Bryn Llewellyn wrote:
> I showed you all this example a long time ago:
>
> select (
>     '
>       3.853467 years
>     '::interval
>   )::text as i;
>
> This behavior is the same in the env. of Bruce’s patch as in unpatched PG 13.2. This is the result.
>
> 3 years 10 mons
>
> Notice that "3.853467 years" is "3 years" plus "10.241604 months". This explains the "10 mons" in the result. But the 0.241604 months remainder did not spill down into days.
>
> Can anybody defend this quirk? An extension of this example with a real number of months in the user input is correspondingly yet more quirky. The rules can be written down. But they’re too tortuous to allow an ordinary mortal confidently to design code that relies on them.
>
> (I was unable to find any rule statement that lets the user predict this in the doc. But maybe that’s because of my feeble searching skills.)
>
> If there is no defense (and I cannot imagine one) might Bruce’s patch normalize this too to follow this rule:
>
> — convert 'y years m months' to the real number y*12 + m.
>
> — record trunc(y*12 + m) in the "months" field of the internal representation
>
> — flow the remainder down to days (but no further)
>
> After all, you've bitten the bullet now and changed the behavior. This means that the semantics of some extant applications will change. So... in for a penny, in for a pound?

The docs now say:

     Field values can have fractional parts;  for example, <literal>'1.5
     weeks'</literal> or <literal>'01:02:03.45'</literal>.  The fractional
-->  parts are used to compute appropriate values for the next lower-order
     internal fields (months, days, seconds).

meaning fractional years flows to the next lower internal unit, months,
and no further.  Fractional months would flow to days.  The idea of not
flowing past the next lower-order internal field is that the
approximations between units are not precise enough to flow accurately.

With my patch, the output is now:

        SELECT INTERVAL '3 years 10.241604 months';
                interval
        ------------------------
         3 years 10 mons 7 days

It used to flow to seconds.

--
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.


Based on the results of more samples, I restore +1 to Bruce's latest patch.

Cheers

Re: Have I found an interval arithmetic bug?

From
Bryn Llewellyn
Date:
On 12-Apr-2021, at 17:25, Bruce Momjian <bruce@momjian.us> wrote:

On Mon, Apr 12, 2021 at 05:20:43PM -0700, Bryn Llewellyn wrote:
I’d argue that the fact that this:

('0.3 months'::interval) + ('0.7 months'::interval)

Is reported as '30 days' and not '1 month' is yet another
bug—precisely because of what I said in my previous email (sorry
that I forked the thread) where I referred to the fact that, in the
right test, adding 1 month gets a different answer than adding 30
days.

Flowing _up_ is what these functions do:

\df *justify*
                              List of functions
  Schema   |       Name       | Result data type | Argument data types | Type
------------+------------------+------------------+---------------------+------
pg_catalog | justify_days     | interval         | interval            | func
pg_catalog | justify_hours    | interval         | interval            | func
pg_catalog | justify_interval | interval         | interval            | func

Yet another convincing reason to get rid of this flow down
business altogether.

We can certainly get rid of all downflow, which in the current patch is
only when fractional internal units are specified.

If some application wants to model flow-down, then it can do so with
trivial programming and full control over its own definition of the
rules.

“Yes please!” re Bruce’s “We can certainly get rid of all downflow, which in the current patch is only when fractional internal units are specified.”

Notice that a user who creates interval values explicitly only by using “make_interval()” will see no behavior change.

There’s another case of up-flow. When you subtract one timestamp value from another, and when they’re far enough apart, then the (internal representation of the) resulting interval value has both a seconds component and a days component. (But never, in my tests, a months component.) I assume that the implementation first gets the difference between the two timestamp values in seconds using (the moral equivalent of) “extract epoch”. And then, if this is greater than 24*60*60, it implements up-flow using the “rule-of-24”—never mind that this means that if you add the answer back to the timestamp value that you subtracted, then you might not get the timestamp value from which you subtracted. (This happens around a DST change and has been discussed earlier in the thread.)

The purpose of these three “justify” functions is dubious. I think that it’s best to think of the 3-component interval vector like an [x, y, z] vector in 3d geometric space, where the three coordinates are not mutually convertible because each has unique semantics. This point has been rehearsed many times in this long thread.

Having said this, it isn’t hard to understand the rules that the functions implement. And, most importantly, their use is voluntary. They are, though, no more than shipped and documented wrappers for a few special cases. A user could so easily write their own function like this:

1. Compute the values of the three components of the internal representation of the passed-in interval value using the “extract” feature and some simple arithmetic.

2. Derive the [minutes, days, seconds] values of a new representation using whatever rules you feel for.

3. Use these new values to create the return interval value.

For example, I might follow a discipline to use interval values that have only one of the three components of the internal representation non-zero. It’s easy to use the “domain” feature for this. (I can use these in any context where I can use the shipped interval.) I could then write a function to convert a “pure seconds” value of my_interval to a “pure years” value. And I could implement my own rules:

— Makes sense only for a large number of seconds that comes out to at least five years (else assert failure).

— Converts seconds to years using the rule that 1 year is, on average, 365.25*24*60*60 seconds, and then truncates it.
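The two rules above can be sketched as a user-written function. This is a hypothetical illustration in Python rather than PL/pgSQL, with the names my own invention:

```python
import math

# Hypothetical user-written conversion implementing the two rules above:
# a "pure seconds" interval of at least five years becomes a truncated
# "pure years" value, using 1 year = 365.25 * 24 * 60 * 60 seconds on average.
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60

def pure_seconds_to_pure_years(seconds):
    years = seconds / SECONDS_PER_YEAR
    assert years >= 5, 'makes sense only for at least five years of seconds'
    return math.trunc(years)
```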

There’s no shipped function that does this, and this makes me suspect that I’d prefer to roll my own for any serious purpose.

The existence of the three “justify” functions is, therefore, harmless.

Re: Have I found an interval arithmetic bug?

From
Bruce Momjian
Date:
On Fri, May  7, 2021 at 07:23:42PM -0700, Zhihong Yu wrote:
> On Tue, Apr 13, 2021 at 10:55 AM Bryn Llewellyn <bryn@yugabyte.com> wrote:
>     There’s no shipped function that does this, and this makes me suspect that
>     I’d prefer to roll my own for any serious purpose.
> 
>     The existence of the three “justify” functions is, therefore, harmless.
> 
> Bruce / Tom:
> Can we revisit this topic ?

I thought we agreed that the attached patch will be applied to PG 15.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.


Attachment

Re: Have I found an interval arithmetic bug?

From
Zhihong Yu
Date:


On Tue, Apr 13, 2021 at 10:55 AM Bryn Llewellyn <bryn@yugabyte.com> wrote:
On 12-Apr-2021, at 17:25, Bruce Momjian <bruce@momjian.us> wrote:

On Mon, Apr 12, 2021 at 05:20:43PM -0700, Bryn Llewellyn wrote:
I’d argue that the fact that this:

('0.3 months'::interval) + ('0.7 months'::interval)

Is reported as '30 days' and not '1 month' is yet another
bug—precisely because of what I said in my previous email (sorry
that I forked the thread) where I referred to the fact that, in the
right test, adding 1 month gets a different answer than adding 30
days.

Flowing _up_ is what these functions do:

\df *justify*
                              List of functions
   Schema   |       Name       | Result data type | Argument data types | Type
------------+------------------+------------------+---------------------+------
 pg_catalog | justify_days     | interval         | interval            | func
 pg_catalog | justify_hours    | interval         | interval            | func
 pg_catalog | justify_interval | interval         | interval            | func

Yet another convincing reason to get rid of this flow down
business altogether.

We can certainly get rid of all downflow, which in the current patch is
only when fractional internal units are specified.

If some application wants to model flow-down, then it can do so with
trivial programming and full control over its own definition of the
rules.

“Yes please!” re Bruce’s “We can certainly get rid of all downflow, which in the current patch is only when fractional internal units are specified.”

Notice that a user who creates interval values explicitly only by using “make_interval()” will see no behavior change.

There’s another case of up-flow. When you subtract one timestamp value from another, and when they’re far enough apart, then the (internal representation of the) resulting interval value has both a seconds component and a days component. (But never, in my tests, a months component.) I assume that the implementation first gets the difference between the two timestamp values in seconds using (the moral equivalent of) “extract epoch”. And then, if this is greater than 24*60*60, it implements up-flow using the “rule-of-24”—never mind that this means that if you add the answer back to the timestamp value that you subtracted, then you might not get the timestamp value from which you subtracted. (This happens around a DST change and has been discussed earlier in the thread.)
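The up-flow described above can be sketched like this (`epoch_diff_secs` is a stand-in for what "extract epoch" would return; this models the "rule-of-24", not PostgreSQL's actual code path):

```python
SECS_PER_DAY = 24 * 60 * 60

def upflow_days(epoch_diff_secs):
    """Split a raw seconds difference into (days, seconds) by the rule-of-24."""
    days, secs = divmod(epoch_diff_secs, SECS_PER_DAY)
    # Around a DST change this is exactly why adding the result back to the
    # subtrahend timestamp may not reproduce the minuend timestamp.
    return int(days), secs

# Two days and three hours' worth of seconds:
print(upflow_days(2 * SECS_PER_DAY + 3 * 3600))  # prints (2, 10800)
```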

The purpose of these three “justify” functions is dubious. I think that it’s best to think of the 3-component interval vector like an [x, y, z] vector in 3d geometric space, where the three coordinates are not mutually convertible because each has unique semantics. This point has been rehearsed many times in this long thread.

Having said this, it isn’t hard to understand the rules that the functions implement. And, most importantly, their use is voluntary. They are, though, no more than shipped and documented wrappers for a few special cases. A user could so easily write their own function like this:

1. Compute the values of the three components of the internal representation of the passed-in interval value using the “extract” feature and some simple arithmetic.

2. Derive the [minutes, days, seconds] values of a new representation using whatever rules you feel for.

3. Use these new values to create the return interval value.

For example, I might follow a discipline to use interval values that have only one of the three components of the internal representation non-zero. It’s easy to use the “domain” feature for this. (I can use these in any context where I can use the shipped interval.) I could then write a function to convert a “pure seconds” value of my_interval to a “pure years” value. And I could implement my own rules:

— Makes sense only for a large number of seconds that comes out to at least five years (else assert failure).

— Converts seconds to years using the rule that 1 year is, on average, 365.25*24*60*60 seconds, and then truncates it.

There’s no shipped function that does this, and this makes me suspect that I’d prefer to roll my own for any serious purpose.

The existence of the three “justify” functions is, therefore, harmless.


Bruce / Tom:
Can we revisit this topic ?

Cheers 

Re: Have I found an interval arithmetic bug?

From
Zhihong Yu
Date:


On Fri, May 7, 2021 at 7:23 PM Bruce Momjian <bruce@momjian.us> wrote:
On Fri, May  7, 2021 at 07:23:42PM -0700, Zhihong Yu wrote:
> On Tue, Apr 13, 2021 at 10:55 AM Bryn Llewellyn <bryn@yugabyte.com> wrote:
>     There’s no shipped function that does this, and this makes me suspect that
>     I’d prefer to roll my own for any serious purpose.
>
>     The existence of the three “justify” functions is, therefore, harmless.
>
> Bruce / Tom:
> Can we revisit this topic ?

I thought we agreed that the attached patch will be applied to PG 15.

Good to know. 

Hopefully it lands soon.


--
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.

Re: Have I found an interval arithmetic bug?

From
Bruce Momjian
Date:
On Fri, May  7, 2021 at 07:39:31PM -0700, Zhihong Yu wrote:
> 
> 
> On Fri, May 7, 2021 at 7:23 PM Bruce Momjian <bruce@momjian.us> wrote:
> 
>     On Fri, May  7, 2021 at 07:23:42PM -0700, Zhihong Yu wrote:
>     > On Tue, Apr 13, 2021 at 10:55 AM Bryn Llewellyn <bryn@yugabyte.com>
>     wrote:
>     >     There’s no shipped function that does this, and this makes me suspect
>     that
>     >     I’d prefer to roll my own for any serious purpose.
>     >
>     >     The existence of the three “justify” functions is, therefore,
>     harmless.
>     >
>     > Bruce / Tom:
>     > Can we revisit this topic ?
> 
>     I thought we agreed that the attached patch will be applied to PG 15.
> 
> 
> Good to know. 
> 
> Hopefully it lands soon.

It will be applied in June/July, but will not appear in a release until
Sept/Oct, 2022.  Sorry.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.




Re: Have I found an interval arithmetic bug?

From
Daniel Gustafsson
Date:
> On 29 Jun 2021, at 18:50, Zhihong Yu <zyu@yugabyte.com> wrote:

> Now that PG 15 is open for commit, do you think the patch can land ?

Adding it to the commitfest patch tracker is a good way to ensure it's not
forgotten about:

    https://commitfest.postgresql.org/33/

--
Daniel Gustafsson        https://vmware.com/




Re: Have I found an interval arithmetic bug?

From
Zhihong Yu
Date:


On Fri, May 7, 2021 at 7:42 PM Bruce Momjian <bruce@momjian.us> wrote:
On Fri, May  7, 2021 at 07:39:31PM -0700, Zhihong Yu wrote:
>
>
> On Fri, May 7, 2021 at 7:23 PM Bruce Momjian <bruce@momjian.us> wrote:
>
>     On Fri, May  7, 2021 at 07:23:42PM -0700, Zhihong Yu wrote:
>     > On Tue, Apr 13, 2021 at 10:55 AM Bryn Llewellyn <bryn@yugabyte.com>
>     wrote:
>     >     There’s no shipped function that does this, and this makes me suspect
>     that
>     >     I’d prefer to roll my own for any serious purpose.
>     >
>     >     The existence of the three “justify” functions is, therefore,
>     harmless.
>     >
>     > Bruce / Tom:
>     > Can we revisit this topic ?
>
>     I thought we agreed that the attached patch will be applied to PG 15.
>
>
> Good to know. 
>
> Hopefully it lands soon.

It will be applied in June/July, but will not appear in a release until
Sept/Oct, 2022.  Sorry.

--
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.

Bruce:
Now that PG 15 is open for commit, do you think the patch can land ?

Cheers 

Re: Have I found an interval arithmetic bug?

From
Bruce Momjian
Date:
On Tue, Jun 29, 2021 at 06:49:45PM +0200, Daniel Gustafsson wrote:
> > On 29 Jun 2021, at 18:50, Zhihong Yu <zyu@yugabyte.com> wrote:
> 
> > Now that PG 15 is open for commit, do you think the patch can land ?
> 
> Adding it to the commitfest patch tracker is a good way to ensure it's not
> forgotten about:
> 
>     https://commitfest.postgresql.org/33/

OK, I have been keeping it in my git tree since I wrote it and will
apply it in the next few days.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.




Re: Have I found an interval arithmetic bug?

From
Zhihong Yu
Date:


On Wed, Jun 30, 2021 at 9:35 AM Bruce Momjian <bruce@momjian.us> wrote:
On Tue, Jun 29, 2021 at 06:49:45PM +0200, Daniel Gustafsson wrote:
> > On 29 Jun 2021, at 18:50, Zhihong Yu <zyu@yugabyte.com> wrote:
>
> > Now that PG 15 is open for commit, do you think the patch can land ?
>
> Adding it to the commitfest patch tracker is a good way to ensure it's not
> forgotten about:
>
>       https://commitfest.postgresql.org/33/

OK, I have been keeping it in my git tree since I wrote it and will
apply it in the next few days.

--
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.

Thanks, Bruce.

Hopefully you can get to this soon. 

Re: Have I found an interval arithmetic bug?

From
Zhihong Yu
Date:


On Thu, Jul 8, 2021 at 10:22 AM Zhihong Yu <zyu@yugabyte.com> wrote:


On Wed, Jun 30, 2021 at 9:35 AM Bruce Momjian <bruce@momjian.us> wrote:
On Tue, Jun 29, 2021 at 06:49:45PM +0200, Daniel Gustafsson wrote:
> > On 29 Jun 2021, at 18:50, Zhihong Yu <zyu@yugabyte.com> wrote:
>
> > Now that PG 15 is open for commit, do you think the patch can land ?
>
> Adding it to the commitfest patch tracker is a good way to ensure it's not
> forgotten about:
>
>       https://commitfest.postgresql.org/33/

OK, I have been keeping it in my git tree since I wrote it and will
apply it in the next few days.

--
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.

Thanks, Bruce.

Hopefully you can get to this soon. 

Bruce: 
Please see if the patch can be integrated now.

Cheers

Re: Have I found an interval arithmetic bug?

From
Bruce Momjian
Date:
On Wed, Jul 14, 2021 at 09:03:21AM -0700, Zhihong Yu wrote:
> On Thu, Jul 8, 2021 at 10:22 AM Zhihong Yu <zyu@yugabyte.com> wrote:
>     On Wed, Jun 30, 2021 at 9:35 AM Bruce Momjian <bruce@momjian.us> wrote:
> 
>         On Tue, Jun 29, 2021 at 06:49:45PM +0200, Daniel Gustafsson wrote:
>         > > On 29 Jun 2021, at 18:50, Zhihong Yu <zyu@yugabyte.com> wrote:
>         >
>         > > Now that PG 15 is open for commit, do you think the patch can land
>         ?
>         >
>         > Adding it to the commitfest patch tracker is a good way to ensure
>         it's not
>         > forgotten about:
>         >
>         >       https://commitfest.postgresql.org/33/
> 
>         OK, I have been keeping it in my git tree since I wrote it and will
>         apply it in the next few days.
>     Thanks, Bruce.
> 
>     Hopefully you can get to this soon. 
> 
> Bruce: 
> Please see if the patch can be integrated now.

I found a mistake in my most recent patch.  For example, in master we
see this output:

    SELECT INTERVAL '1.8594 months';
             interval
    --------------------------
     1 mon 25 days 18:46:04.8

Obviously this should return '1 mon 26 days', but with my most recent
patch, it returned '1 mon 25 days'.  Turns out I had not properly used
rint() in AdjustFractDays, and in fact the function is no longer needed
because it is just a multiplication and an rint().
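The difference is easy to see with a toy model of the fractional-month spill (assuming the documented 1 month = 30 days; an illustrative sketch, not the C code):

```python
import math

DAYS_PER_MONTH = 30

def spill_month_fraction(months, use_rint):
    """Split fractional months into (whole months, days), truncating or rounding."""
    whole = math.trunc(months)
    frac_days = (months - whole) * DAYS_PER_MONTH
    # Python's round() is round-half-even, like C's rint() in its default mode.
    days = round(frac_days) if use_rint else math.trunc(frac_days)
    return whole, days

# '1.8594 months': 0.8594 * 30 = 25.782 days
print(spill_month_fraction(1.8594, use_rint=False))  # truncates: (1, 25)
print(spill_month_fraction(1.8594, use_rint=True))   # rounds:    (1, 26)
```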

Updated patch attached.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.


Attachment

Re: Have I found an interval arithmetic bug?

From
Zhihong Yu
Date:


On Mon, Jul 19, 2021 at 9:14 PM Bruce Momjian <bruce@momjian.us> wrote:
On Wed, Jul 14, 2021 at 09:03:21AM -0700, Zhihong Yu wrote:
> On Thu, Jul 8, 2021 at 10:22 AM Zhihong Yu <zyu@yugabyte.com> wrote:
>     On Wed, Jun 30, 2021 at 9:35 AM Bruce Momjian <bruce@momjian.us> wrote:
>
>         On Tue, Jun 29, 2021 at 06:49:45PM +0200, Daniel Gustafsson wrote:
>         > > On 29 Jun 2021, at 18:50, Zhihong Yu <zyu@yugabyte.com> wrote:
>         >
>         > > Now that PG 15 is open for commit, do you think the patch can land
>         ?
>         >
>         > Adding it to the commitfest patch tracker is a good way to ensure
>         it's not
>         > forgotten about:
>         >
>         >       https://commitfest.postgresql.org/33/
>
>         OK, I have been keeping it in my git tree since I wrote it and will
>         apply it in the next few days.
>     Thanks, Bruce.
>
>     Hopefully you can get to this soon. 
>
> Bruce: 
> Please see if the patch can be integrated now.

I found a mistake in my most recent patch.  For example, in master we
see this output:

        SELECT INTERVAL '1.8594 months';
                 interval
        --------------------------
         1 mon 25 days 18:46:04.8

Obviously this should return '1 mon 26 days', but with my most recent
patch, it returned '1 mon 25 days'.  Turns out I had not properly used
rint() in AdjustFractDays, and in fact the function is no longer needed
because it is just a multiplication and an rint().

Updated patch attached.

--
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.

Hi,
Patch looks good.
Maybe add the statement above as a test case:

SELECT INTERVAL '1.8594 months' 

Cheers

Re: Have I found an interval arithmetic bug?

From
Bruce Momjian
Date:
On Tue, Jul 20, 2021 at 02:33:07PM -0700, Zhihong Yu wrote:
> On Mon, Jul 19, 2021 at 9:14 PM Bruce Momjian <bruce@momjian.us> wrote:
> >     Obviously this should return '1 mon 26 days', but with my most recent
> >     patch, it returned '1 mon 25 days'.  Turns out I had not properly used
> >     rint() in AdjustFractDays, and in fact the function is no longer needed
> >     because it is just a multiplication and an rint().
>
> Patch looks good.
> Maybe add the statement above as a test case :
> 
> SELECT INTERVAL '1.8594 months' 

Good idea --- updated patch attached.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.


Attachment

Re: Have I found an interval arithmetic bug?

From
Zhihong Yu
Date:


On Tue, Jul 20, 2021 at 3:53 PM Bruce Momjian <bruce@momjian.us> wrote:
On Tue, Jul 20, 2021 at 02:33:07PM -0700, Zhihong Yu wrote:
> On Mon, Jul 19, 2021 at 9:14 PM Bruce Momjian <bruce@momjian.us> wrote:
> >     Obviously this should return '1 mon 26 days', but with my most recent
> >     patch, it returned '1 mon 25 days'.  Turns out I had not properly used
> >     rint() in AdjustFractDays, and in fact the function is no longer needed
> >     because it is just a multiplication and an rint().
>
> Patch looks good.
> Maybe add the statement above as a test case :
>
> SELECT INTERVAL '1.8594 months' 

Good idea --- updated patch attached.

--
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.

Hi,
With your patch, the following example (courtesy of Bryn) still shows a fraction:

# select (interval '1 month')*1.123;
       ?column?
-----------------------
 1 mon 3 days 16:33:36
(1 row) 

Do you think the output can be improved (by getting rid of the fraction)?

Thanks

Re: Have I found an interval arithmetic bug?

From
Bruce Momjian
Date:
On Tue, Jul 20, 2021 at 05:13:37PM -0700, Zhihong Yu wrote:
> On Tue, Jul 20, 2021 at 3:53 PM Bruce Momjian <bruce@momjian.us> wrote:
> With your patch, the following example (courtesy of Bryn) still shows a fraction:
> 
> # select (interval '1 month')*1.123;
>        ?column?
> -----------------------
>  1 mon 3 days 16:33:36
> (1 row) 
> 
> Do you think the output can be improved (by getting rid of fraction) ?

Well, I focused on how fractional units were processed inside of
interval values.  I never considered how multiplication should be
handled.  I have not really thought about how to handle that, but this
example now gives me concern:

    SELECT INTERVAL '1.06 months 1 hour';
           interval
    -----------------------
     1 mon 2 days 01:00:00

Notice that it rounds the '1.06 months' to '1 mon 2 days', rather than
spilling to hours/minutes/seconds, even though hours is already
specified.  I don't see a better way to handle this than the current
code already does, but it is something odd.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.




Re: Have I found an interval arithmetic bug?

From
Dean Rasheed
Date:
On Wed, 21 Jul 2021 at 03:48, Bruce Momjian <bruce@momjian.us> wrote:
>
> this example now gives me concern:
>
>         SELECT INTERVAL '1.06 months 1 hour';
>                interval
>         -----------------------
>          1 mon 2 days 01:00:00
>
> Notice that it rounds the '1.06 months' to '1 mon 2 days', rather than
> spilling to hours/minutes/seconds, even though hours is already
> specified.  I don't see a better way to handle this than the current
> code already does, but it is something odd.

Hmm, looking at this whole thread, I have to say that I prefer the old
behaviour of spilling down to lower units.

For example, with this patch:

SELECT '0.5 weeks'::interval;
 interval
----------
 4 days

which I don't think is really an improvement. My expectation is that
half a week is 3.5 days, and I prefer what it used to return, namely
'3 days 12:00:00'.

It's true that that leads to odd-looking results when the field value
has lots of fractional digits, but that was at least explainable, and
followed the documentation.

Looking for a general principle to follow, how about this -- the
result of specifying a fractional value should be the same as
multiplying an interval of 1 unit by that value. In other words,
'1.8594 months'::interval should be the same as '1 month'::interval *
1.8594. (Actually, it probably can't easily be made exactly the same
in all cases, due to differences in the floating point computations in
the two cases, and rounding errors, but it's hopefully not far off,
unlike the results obtained by not spilling down to lower units on
input.)

The cases that are broken in master, in my opinion, are the larger
units (year and above), which don't propagate down in the same way as
fractional months and below. So, for example, '0.7 years' should be
8.4 months (with the conversion factor of 1 year = 12 months), giving
'8 months 12 days', which is what '1 year'::interval * 0.7 produces.
Sure, there are arguably more accurate ways of computing that.
However, that's the result obtained using the documented conversion
factors, so it's justifiable in those terms.
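Dean's principle can be sketched numerically, using the documented conversion factors (1 year = 12 months, 1 month = 30 days; an illustrative helper, not the patch):

```python
import math

def frac_years_to_interval(years):
    """Spill a fractional year to months, and the month fraction on to days."""
    months_total = years * 12
    months = math.trunc(months_total)
    days = round((months_total - months) * 30)
    return months, days

# '0.7 years' -> 8.4 months -> 8 months 12 days,
# matching '1 year'::interval * 0.7.
print(frac_years_to_interval(0.7))  # prints (8, 12)
```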

It's worth noting another case that is broken in master:

SELECT '1.7 decades'::interval;
     interval
------------------
 16 years 11 mons

which is surely not what anyone would expect. The current patch fixes
this, but it would also be fixed by handling the fractional digits for
these units in the same way as for smaller units. There was an earlier
patch doing that, I think, though I didn't test it.

Regards,
Dean



Re: Have I found an interval arithmetic bug?

From
Tom Lane
Date:
Dean Rasheed <dean.a.rasheed@gmail.com> writes:
> Hmm, looking at this whole thread, I have to say that I prefer the old
> behaviour of spilling down to lower units.

> For example, with this patch:

> SELECT '0.5 weeks'::interval;
>  interval
> ----------
>  4 days

> which I don't think is really an improvement. My expectation is that
> half a week is 3.5 days, and I prefer what it used to return, namely
> '3 days 12:00:00'.

Yeah, that is clearly a significant dis-improvement.

In general, considering that (most of?) the existing behavior has stood
for decades, I think we need to tread VERY carefully about changing it.
I don't want to see this patch changing any case that is not indisputably
broken.

            regards, tom lane



Re: Have I found an interval arithmetic bug?

From
Bryn Llewellyn
Date:
On 21-Jul-2021, at 02:58, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Dean Rasheed <dean.a.rasheed@gmail.com> writes:
Hmm, looking at this whole thread, I have to say that I prefer the old
behaviour of spilling down to lower units.

For example, with this patch:

SELECT '0.5 weeks'::interval;
interval
----------
4 days

which I don't think is really an improvement. My expectation is that
half a week is 3.5 days, and I prefer what it used to return, namely
'3 days 12:00:00'.

Yeah, that is clearly a significant dis-improvement.

In general, considering that (most of?) the existing behavior has stood
for decades, I think we need to tread VERY carefully about changing it.
I don't want to see this patch changing any case that is not indisputably
broken.

It was me that started the enormous thread with the title “Have I found an interval arithmetic bug?” on 01-Apr-2021. I presented this testcase:

select interval '-1.7 years';                          -- -1 years -8 mons

select interval '29.4 months';                         --  2 years  5 mons 12 days

select interval '-1.7 years 29.4 months';              --           8 mons 12 days << wrong
select interval '29.4 months -1.7 years';              --           9 mons 12 days

select interval '-1.7 years' + interval '29.4 months'; --           9 mons 12 days
select interval '29.4 months' + interval '-1.7 years'; --           9 mons 12 days

The consensus was that the outcome that I flagged with “wrong” does indeed have that status. After all, it’s hard to see how anybody could intend this rule (that anyway holds in only some cases):

-a + b <> b - a
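Bruce traced this earlier in the thread to C's truncate-toward-zero float-to-int casts: the order in which the fractional quantity gets truncated changes the answer. A Python sketch of the effect, reusing the numbers from his diagnosis (illustrative only, not the parser code):

```python
import math

frac = -8.4   # fractional months contributed beyond the whole years
other = 29    # whole months contributed by the other term

# Truncate -8.4 first, then add: trunc(-8.4) = -8, and -8 + 29 = 21.
first = math.trunc(math.trunc(frac) + other)
# Add first, then truncate: 29 - 8.4 = 20.6, which truncates to 20.
second = math.trunc(other + frac)

print(first, second)  # prints 21 20 -- the order-dependent asymmetry
```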

It seems odd that there’s been no recent reference to my testcase and how it behaves in the environment of Bruce’s patch.

I don’t recall the history of the thread. But Bruce took on the task of fixing this narrow issue. Anyway, somehow, the whole question of “spill down” came up for discussion. The rules aren’t documented and I’ve been unable to find any reference even to the phenomenon. I have managed to implement a model, in PL/pgSQL, that gets the same results as the native implementation in every one of many tests that I’ve done. I appreciate that this doesn’t prove that my model is correct. But it would seem that it must be on the right track. The rules that my PL/pgSQL uses are self-evidently whimsical—but they were needed precisely to get the same outcomes as the native implementation. There was some discussion of all this somewhere in this thread.

If memory serves, it was Tom who suggested changing the spill-down rules. This was possibly meant entirely rhetorically. But it seems that Bruce did set about implementing a change here. (I was unable to find a clear prose functional spec for the new behavior. Probably I didn’t know where to look.)

There’s no doubt that a change in these rules would change the behavior of extant code. But then, in a purist sense, this is the case with any bug fix.

I’m simply waiting on a final ruling and final outcome.

Meanwhile, I’ve worked out a way to tame all this business (by using domain types and associated functionality) so that application code can deal confidently with only pure months, pure days, and pure seconds interval values (thinking of the internal [mm, dd, ss] representation). The scheme ensures that spill-down never occurs by rounding the years or the months field to integral values. If you want to add a “mixed” interval to a timestamp, then you simply add the different kinds of interval in the one expression. And you use parentheses to assert, visibly, the priority rule that you intend.

Because this is ordinary application code, there are no compatibility issues for me. My approach won’t see a change in behavior no matter what is decided about the present patch.

Re: Have I found an interval arithmetic bug?

From
Tom Lane
Date:
Bryn Llewellyn <bryn@yugabyte.com> writes:
> On 21-Jul-2021, at 02:58, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> In general, considering that (most of?) the existing behavior has stood
>> for decades, I think we need to tread VERY carefully about changing it.
>> I don't want to see this patch changing any case that is not indisputably
>> broken.

> It was me that started the enormous thread with the title “Have I found an interval arithmetic bug?” on 01-Apr-2021. I presented this testcase:

>> select interval '-1.7 years';                          -- -1 years -8 mons
>>
>> select interval '29.4 months';                         --  2 years  5 mons 12 days
>>
>> select interval '-1.7 years 29.4 months';              --           8 mons 12 days << wrong
>> select interval '29.4 months -1.7 years';              --           9 mons 12 days
>>
>> select interval '-1.7 years' + interval '29.4 months'; --           9 mons 12 days
>> select interval '29.4 months' + interval '-1.7 years'; --           9 mons 12 days

> The consensus was that the outcome that I flagged with “wrong” does indeed have that status.

Yeah, I think it's self-evident that your last four cases should
produce the same results.  Whether '9 mons 12 days' is the best
possible result is debatable --- in a perfect world, maybe we'd
produce '9 mons' exactly --- but given that the first two cases
produce what they do, that does seem self-consistent.  I think
we should be setting out to fix that outlier without causing
any of the other five results to change.

            regards, tom lane



Re: Have I found an interval arithmetic bug?

From
Bryn Llewellyn
Date:
> On 21-Jul-2021, at 01:23, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
>
> On Wed, 21 Jul 2021 at 03:48, Bruce Momjian <bruce@momjian.us> wrote:
>>
>> this example now gives me concern:
>>
>>        SELECT INTERVAL '1.06 months 1 hour';
>>               interval
>>        -----------------------
>>         1 mon 2 days 01:00:00
>>
>> Notice that it rounds the '1.06 months' to '1 mon 2 days', rather than
>> spilling to hours/minutes/seconds, even though hours is already
>> specified.  I don't see a better way to handle this than the current
>> code already does, but it is something odd.
>
> Hmm, looking at this whole thread, I have to say that I prefer the old
> behaviour of spilling down to lower units.
>
> For example, with this patch:
>
> SELECT '0.5 weeks'::interval;
> interval
> ----------
> 4 days
>
> which I don't think is really an improvement. My expectation is that
> half a week is 3.5 days, and I prefer what it used to return, namely
> '3 days 12:00:00'.
>
> It's true that that leads to odd-looking results when the field value
> has lots of fractional digits, but that was at least explainable, and
> followed the documentation.
>
> Looking for a general principle to follow, how about this -- the
> result of specifying a fractional value should be the same as
> multiplying an interval of 1 unit by that value. In other words,
> '1.8594 months'::interval should be the same as '1 month'::interval *
> 1.8594. (Actually, it probably can't easily be made exactly the same
> in all cases, due to differences in the floating point computations in
> the two cases, and rounding errors, but it's hopefully not far off,
> unlike the results obtained by not spilling down to lower units on
> input.)
>
> The cases that are broken in master, in my opinion, are the larger
> units (year and above), which don't propagate down in the same way as
> fractional months and below. So, for example, '0.7 years' should be
> 8.4 months (with the conversion factor of 1 year = 12 months), giving
> '8 months 12 days', which is what '1 year'::interval * 0.7 produces.
> Sure, there are arguably more accurate ways of computing that.
> However, that's the result obtained using the documented conversion
> factors, so it's justifiable in those terms.
>
> It's worth noting another case that is broken in master:
>
> SELECT '1.7 decades'::interval;
>     interval
> ------------------
> 16 years 11 mons
>
> which is surely not what anyone would expect. The current patch fixes
> this, but it would also be fixed by handling the fractional digits for
> these units in the same way as for smaller units. There was an earlier
> patch doing that, I think, though I didn't test it.
>
> Regards,
> Dean

And try these two tests. (I’m using Version 13.3 on current macOS.)

select
  '1.7 decades'::interval as i1,
  ('1 decades'::interval)*1.7 as i2,
  ('10 years'::interval)*1.7 as i3;

        i1        |    i2    |    i3
------------------+----------+----------
 16 years 11 mons | 17 years | 17 years

select
  '1.7345 decades'::interval as i4,
  ('1 decades'::interval)*1.7345 as i5,
  ('10 years'::interval)*1.7345 as i6;

       i4        |               i5                |               i6
-----------------+---------------------------------+---------------------------------
 17 years 4 mons | 17 years 4 mons 4 days 04:48:00 | 17 years 4 mons 4 days 04:48:00

Shows only what we know already: mixed interval arithmetic is fishy.

Seems to me that units like “weeks”, “centuries”, “millennia”, and so on are a solution (broken in some cases) looking for a problem. Try this (and variants like I showed above):

select
  '1.7345 millennia'::interval as i7,
  '1.7345 centuries'::interval as i8,
  '1.7345 weeks'::interval as i9;

        i7         |        i8        |         i9
-------------------+------------------+--------------------
 1734 years 6 mons | 173 years 5 mons | 12 days 03:23:45.6




Re: Have I found an interval arithmetic bug?

From
Bruce Momjian
Date:
On Wed, Jul 21, 2021 at 01:29:49PM -0400, Tom Lane wrote:
> Bryn Llewellyn <bryn@yugabyte.com> writes:
> > It was me that started the enormous thread with the title “Have I found an interval arithmetic bug?” on 01-Apr-2021. I presented this testcase:
> 
> >> select interval '-1.7 years';                          -- -1 years -8 mons
> >> 
> >> select interval '29.4 months';                         --  2 years  5 mons 12 days
> >> 
> >> select interval '-1.7 years 29.4 months';              --           8 mons 12 days << wrong
> >> select interval '29.4 months -1.7 years';              --           9 mons 12 days
> >> 
> >> select interval '-1.7 years' + interval '29.4 months'; --           9 mons 12 days
> >> select interval '29.4 months' + interval '-1.7 years'; --           9 mons 12 days
> 
> > The consensus was that the outcome that I flagged with “wrong” does indeed have that status.
> 
> Yeah, I think it's self-evident that your last four cases should
> produce the same results.  Whether '9 mons 12 days' is the best
> possible result is debatable --- in a perfect world, maybe we'd
> produce '9 mons' exactly --- but given that the first two cases
> produce what they do, that does seem self-consistent.  I think
> we should be setting out to fix that outlier without causing
> any of the other five results to change.

OK, I decided to reverse some of the changes I was proposing once I
started to think about the inaccuracy of not spilling down from 'weeks'
to seconds when hours also appear.  The fundamental issue is that the
months-to-days conversion is almost always an approximation, while the
days to seconds conversion is almost always accurate.  This means we are
never going to have consistent spill-down that is useful.

Therefore, I went ahead and accepted that years and larger units spill
only to months, months spill only to days, and weeks and lower spill all
the way down to seconds.  I also spelled this out in the docs, and
explained why we have this behavior.
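That policy can be sketched in Python. This is a model of the behavior described above, not the patch's C code: it assumes the conventional 12 mons/year and 30 days/mon factors and round-to-nearest at the last spilled unit.

```python
import math

def spill(years=0.0, months=0.0, days=0.0, hours=0.0):
    """Model of the spill policy described above: years spill only to
    months, fractional months spill only to whole days, and fractional
    days spill all the way down to seconds."""
    mons = round(years * 12) + math.trunc(months)
    d = round((months - math.trunc(months)) * 30) + math.trunc(days)
    secs = round((days - math.trunc(days)) * 86400 + hours * 3600, 6)
    return mons, d, secs

print(spill(months=29.4))              # (29, 12, 0.0)  -> '2 years 5 mons 12 days'
print(spill(years=-1.7))               # (-20, 0, 0.0)  -> '-1 years -8 mons'
print(spill(years=-1.7, months=29.4))  # (9, 12, 0.0)   -> '9 mons 12 days'
```

Note that both field orderings of '-1.7 years 29.4 months' reduce to the same sum of months here, which is the consistency the patch aims for.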

Also, with my patch, the last four queries return the same result
because of the proper rounding also added by the patch, attached.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.


Attachment

Re: Have I found an interval arithmetic bug?

From
Bryn Llewellyn
Date:
On 21-Jul-2021, at 17:07, Bruce Momjian <bruce@momjian.us> wrote:

On Wed, Jul 21, 2021 at 01:29:49PM -0400, Tom Lane wrote:
Bryn Llewellyn <bryn@yugabyte.com> writes:
It was me that started the enormous thread with the title “Have I found an interval arithmetic bug?” on 01-Apr-2021. I presented this testcase:

select interval '-1.7 years';                          -- -1 years -8 mons

select interval '29.4 months';                         --  2 years  5 mons 12 days

select interval '-1.7 years 29.4 months';              --           8 mons 12 days << wrong
select interval '29.4 months -1.7 years';              --           9 mons 12 days

select interval '-1.7 years' + interval '29.4 months'; --           9 mons 12 days
select interval '29.4 months' + interval '-1.7 years'; --           9 mons 12 days

The consensus was that the outcome that I flagged with “wrong” does indeed have that status.

Yeah, I think it's self-evident that your last four cases should
produce the same results.  Whether '9 mons 12 days' is the best
possible result is debatable --- in a perfect world, maybe we'd
produce '9 mons' exactly --- but given that the first two cases
produce what they do, that does seem self-consistent.  I think
we should be setting out to fix that outlier without causing
any of the other five results to change.

OK, I decided to reverse some of the changes I was proposing once I
started to think about the inaccuracy of not spilling down from 'weeks'
to seconds when hours also appear.  The fundamental issue is that the
months-to-days conversion is almost always an approximation, while the
days to seconds conversion is almost always accurate.  This means we are
never going to have consistent spill-down that is useful.

Therefore, I went ahead and accepted that years and larger units spill
only to months, months spill only to days, and weeks and lower spill all
the way down to seconds.  I also spelled this out in the docs, and
explained why we have this behavior.

Also, with my patch, the last four queries return the same result
because of the proper rounding also added by the patch, attached.

Your statement

“months-to-days conversion is almost always an approximation, while the days to seconds conversion is almost always accurate.” 

is misleading. Conversions like these (and also the “spill up” conversions that the justify_hours(), justify_days(), and justify_interval() built-in functions bring) are semantically dangerous because of the different rules for adding a pure months, a pure days, or a pure seconds interval to a timestamptz value.

Unless you avoid mixed interval values, it’s very hard (though possible) to predict the outcomes of interval arithmetic. Rather, all you get is emergent behavior that I fail to see how deliberately designed application code can rely upon. Here’s a telling example:

set timezone = 'America/Los_Angeles';
with
  c as (
    select
      '2021-03-13 19:00:00 America/Los_Angeles'::timestamptz as d,
      '25 hours'::interval                                   as i)
select
  d +               i  as "d + i",
  d + justify_hours(i) as "d + justify_hours(i)"
from c;

This is the result:

         d + i          |  d + justify_hours(i)  
------------------------+------------------------
 2021-03-14 21:00:00-07 | 2021-03-14 20:00:00-07

The two results are different, even though the native equality test shows that the two different interval values are the same:

with
  c as (select '25 hours'::interval as i)
select (i = justify_hours(i))::text
from c;

The result is TRUE.
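The mechanism behind the one-hour discrepancy is visible outside PostgreSQL too: a pure seconds-field interval is added as absolute elapsed time, while a days-field interval is added as calendar (wall-clock) arithmetic, and the two differ across the 2021-03-14 spring-forward gap. A stdlib Python sketch of the two addition rules (this simulates them; it does not call PostgreSQL):

```python
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo

la = ZoneInfo("America/Los_Angeles")
d = datetime(2021, 3, 13, 19, 0, tzinfo=la)   # evening before the spring-forward gap

# A pure seconds-field interval ('25 hours') is added as absolute elapsed
# time: convert to UTC, add, convert back.
absolute = (d.astimezone(ZoneInfo("UTC")) + timedelta(hours=25)).astimezone(la)

# A days-field interval ('1 day 1 hour', which justify_hours produces) is
# added as calendar arithmetic; Python's naive aware-datetime addition
# behaves the same way.
calendar = d + timedelta(days=1, hours=1)

print(absolute)   # 2021-03-14 21:00:00-07:00
print(calendar)   # 2021-03-14 20:00:00-07:00
```

The two results straddle the hour lost to daylight savings, exactly as in the SQL above.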

The only route to sanity is to use only pure interval values (i.e. where only one of the fields of the internal [mm, dd, ss] representation is non-zero).

I mentioned that you can use a set of three domain types to enforce your intended practice here.

In other words, by programming application code defensively, it’s possible to insulate oneself entirely from the emergent behavior of the decades-old PG code that implements the unconstrained native interval functionality and that brings what can only be considered unpredictable results.

Moreover, this defensive approach insulates you from any changes that Bruce’s patch might make.

Re: Have I found an interval arithmetic bug?

From
Bruce Momjian
Date:
On Wed, Jul 21, 2021 at 06:39:26PM -0700, Bryn Llewellyn wrote:
> Your statement
> 
> 
>     “months-to-days conversion is almost always an approximation, while the
>     days to seconds conversion is almost always accurate.” 
> 
> 
> is misleading. Any conversion like these (and also the “spill up” conversions
> that the justify_hours(), justify_days(), and justify_interval() built-in
> functions bring) are semantically dangerous because of the different rules for
> adding a pure months, a pure days, or a pure seconds interval to a timestamptz
> value.

We are trying to get the most reasonable output for fractional values
--- I stand by my statements.

> Unless you avoid mixed interval values, then it’s so hard (even though it is
> possible) to predict the outcomes of interval arithmetic. Rather, all you get
> is emergent behavior that I fail to see can be relied upon in deliberately
> designed application code. Here’s a telling example:

The point is that we will get unusual values, so we should do the best
we can.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.




Re: Have I found an interval arithmetic bug?

From
Zhihong Yu
Date:


On Wed, Jul 21, 2021 at 6:43 PM Bruce Momjian <bruce@momjian.us> wrote:
On Wed, Jul 21, 2021 at 06:39:26PM -0700, Bryn Llewellyn wrote:
> Your statement
>
>
>     “months-to-days conversion is almost always an approximation, while the
>     days to seconds conversion is almost always accurate.”
>
>
> is misleading. Any conversion like these (and also the “spill up” conversions
> that the justify_hours(), justify_days(), and justify_interval() built-in
> functions bring) are semantically dangerous because of the different rules for
> adding a pure months, a pure days, or a pure seconds interval to a timestamptz
> value.

We are trying to get the most reasonable output for fractional values
--- I stand by my statements.

> Unless you avoid mixed interval values, then it’s so hard (even though it is
> possible) to predict the outcomes of interval arithmetic. Rather, all you get
> is emergent behavior that I fail to see can be relied upon in deliberately
> designed application code. Here’s a telling example:

The point is that we will get unusual values, so we should do the best
we can.

--
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.

Hi,

-                   tm->tm_mon += (fval * MONTHS_PER_YEAR);
+                   tm->tm_mon += rint(fval * MONTHS_PER_YEAR);

Should the handling for year use the same check as that for month ?

-                   AdjustFractDays(fval, tm, fsec, DAYS_PER_MONTH);
+                   /* round to a full month? */
+                   if (rint(fval * DAYS_PER_MONTH) == DAYS_PER_MONTH)

Cheers 

Re: Have I found an interval arithmetic bug?

From
Zhihong Yu
Date:


On Thu, Jul 22, 2021 at 2:59 PM Zhihong Yu <zyu@yugabyte.com> wrote:


On Wed, Jul 21, 2021 at 6:43 PM Bruce Momjian <bruce@momjian.us> wrote:
On Wed, Jul 21, 2021 at 06:39:26PM -0700, Bryn Llewellyn wrote:
> Your statement
>
>
>     “months-to-days conversion is almost always an approximation, while the
>     days to seconds conversion is almost always accurate.”
>
>
> is misleading. Any conversion like these (and also the “spill up” conversions
> that the justify_hours(), justify_days(), and justify_interval() built-in
> functions bring) are semantically dangerous because of the different rules for
> adding a pure months, a pure days, or a pure seconds interval to a timestamptz
> value.

We are trying to get the most reasonable output for fractional values
--- I stand by my statements.

> Unless you avoid mixed interval values, then it’s so hard (even though it is
> possible) to predict the outcomes of interval arithmetic. Rather, all you get
> is emergent behavior that I fail to see can be relied upon in deliberately
> designed application code. Here’s a telling example:

The point is that we will get unusual values, so we should do the best
we can.

--
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.

Hi,

-                   tm->tm_mon += (fval * MONTHS_PER_YEAR);
+                   tm->tm_mon += rint(fval * MONTHS_PER_YEAR);

Should the handling for year use the same check as that for month ?

-                   AdjustFractDays(fval, tm, fsec, DAYS_PER_MONTH);
+                   /* round to a full month? */
+                   if (rint(fval * DAYS_PER_MONTH) == DAYS_PER_MONTH)

Cheers 
Hi,
I guess the reason for the current patch was that year-to-months conversion is accurate.
On the new test:

+SELECT INTERVAL '1.16 months 01:00:00' AS "One mon 5 days one hour";

0.16 * 31 = 4.96 < 5

I wonder why 5 days were chosen in the test output.

Re: Have I found an interval arithmetic bug?

From
Bruce Momjian
Date:
On Thu, Jul 22, 2021 at 03:17:52PM -0700, Zhihong Yu wrote:
> On Thu, Jul 22, 2021 at 2:59 PM Zhihong Yu <zyu@yugabyte.com> wrote:
>     Hi,
> 
>     -                   tm->tm_mon += (fval * MONTHS_PER_YEAR);
>     +                   tm->tm_mon += rint(fval * MONTHS_PER_YEAR);
> 
>     Should the handling for year use the same check as that for month ?
> 
>     -                   AdjustFractDays(fval, tm, fsec, DAYS_PER_MONTH);
>     +                   /* round to a full month? */
>     +                   if (rint(fval * DAYS_PER_MONTH) == DAYS_PER_MONTH)
> 
>     Cheers 
> 
> Hi,
> I guess the reason for current patch was that year to months conversion is
> accurate.

Our internal units are hours/days/seconds, so the spill _up_ from months
to years happens automatically:

    SELECT INTERVAL '23.99 months';
     interval
    ----------
     2 years

> On the new test:
> 
> +SELECT INTERVAL '1.16 months 01:00:00' AS "One mon 5 days one hour";
> 
> 0.16 * 31 = 4.96 < 5
> 
> I wonder why 5 days were chosen in the test output.

We use 30 days/month, not 31.  However, I think you are missing the
changes in the patch and I am just understanding them fully now.  There
are two big changes:

1.  The amount of spill from months only to days
2.  The _rounding_ of the result once we stop spilling at months or days

#2 is the part I think you missed.

One thing missing from my previous patch was the handling of negative
units, which is now handled properly in the attached patch:

    SELECT INTERVAL '-1.99 years';
     interval
    ----------
     -2 years

    SELECT INTERVAL '-1.99 months';
     interval
    ----------
     -2 mons

I ended up creating a function to handle this, which allowed me to
simplify some of the surrounding code.
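The symmetric treatment of negative fractions comes from rint(), which rounds to nearest (ties to even), where C's float-to-int cast truncates toward zero. Python's built-in round() uses the same ties-to-even rule, so the old-versus-new difference can be sketched as:

```python
# Old behavior: C's (int) cast truncates toward zero, so -1.99 years
# became -23 months (-1 years -11 mons).  New behavior: rint() rounds
# to nearest, so -1.99 years becomes -24 months (-2 years).
def months_trunc(years):
    return int(years * 12)        # truncation toward zero, like a C cast

def months_rint(years):
    return round(years * 12)      # round-to-nearest, ties-to-even, like rint()

print(months_trunc(-1.99), months_rint(-1.99))  # -23 -24
print(months_trunc(1.99), months_rint(1.99))    # 23 24
```

With truncation, the two signs round in opposite directions; with rint-style rounding they behave symmetrically.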

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.


Attachment

Re: Have I found an interval arithmetic bug?

From
Bryn Llewellyn
Date:
On 23-Jul-2021, at 08:05, Bruce Momjian <bruce@momjian.us> wrote:

On Thu, Jul 22, 2021 at 03:17:52PM -0700, Zhihong Yu wrote:
On Thu, Jul 22, 2021 at 2:59 PM Zhihong Yu <zyu@yugabyte.com> wrote:
   Hi,

   -                   tm->tm_mon += (fval * MONTHS_PER_YEAR);
   +                   tm->tm_mon += rint(fval * MONTHS_PER_YEAR);

   Should the handling for year use the same check as that for month ?

   -                   AdjustFractDays(fval, tm, fsec, DAYS_PER_MONTH);
   +                   /* round to a full month? */
   +                   if (rint(fval * DAYS_PER_MONTH) == DAYS_PER_MONTH)

   Cheers 

Hi,
I guess the reason for current patch was that year to months conversion is
accurate.

Our internal units are hours/days/seconds, so the spill _up_ from months
to years happens automatically:

SELECT INTERVAL '23.99 months';
interval
----------
2 years

On the new test:

+SELECT INTERVAL '1.16 months 01:00:00' AS "One mon 5 days one hour";

0.16 * 31 = 4.96 < 5

I wonder why 5 days were chosen in the test output.

We use 30 days/month, not 31.  However, I think you are missing the
changes in the patch and I am just understanding them fully now.  There
are two big changes:

1.  The amount of spill from months only to days
2.  The _rounding_ of the result once we stop spilling at months or days

#2 is the part I think you missed.

One thing missing from my previous patch was the handling of negative
units, which is now handled properly in the attached patch:

SELECT INTERVAL '-1.99 years';
interval
----------
-2 years

SELECT INTERVAL '-1.99 months';
interval
----------
-2 mons

I ended up creating a function to handle this, which allowed me to
simplify some of the surrounding code.

--
 Bruce Momjian  <bruce@momjian.us>        https://momjian.us
 EDB                                      https://enterprisedb.com

 If only the physical world exists, free will is an illusion.

<interval.diff.gz>


Will the same new spilldown rules hold in the same way for interval multiplication and division as they will for the interpretation of an interval literal?

The semantics here are (at least as far as my limited search skills have shown me) simply undocumented. But my tests in 13.3 have to date not disproved this hypothesis:

* considering "new_i ◄— i * f"

* (note that the internal representation is _months_, days, and seconds, at odds with “Our internal units are hours/days/seconds” above)
* let i’s internal representation be [mm, dd, ss] 

* new_i’s “intermediate” internal representation is [mm*f, dd*f, ss*f]

* input these values to the same spilldown algorithm that is applied when these same intermediate values are used in an interval literal

* so the result is [new_mm, new_dd, new_ss]

Here’s an example:

select
  '1.2345 months 1.2345 days 1.2345 seconds'::interval = 
  '1 month 1 day 1 second'::interval*1.2345;

In 13.3, the result is TRUE. (I know that this doesn’t guarantee that the internal representations of the two compared interval values are the same. But it’s a necessary condition for the outcome that I’m referring to and serves to indicate the point I’m making. A more careful test can be made.)
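The hypothesis above (scale each field of [mm, dd, ss] by f, then apply the literal-conversion spilldown) can be sketched in Python. The spill constants (30 days/mon, 86400 seconds/day) and the truncation-toward-zero steps are assumptions inferred from the 13.3 observation, not documented semantics:

```python
import math

def interval_mul(mm, dd, ss, f):
    """Hypothesized semantics of interval * number in 13.3: scale each
    field, then spill fractional months to days and fractional days to
    seconds, exactly as for an interval literal with fractional fields."""
    mm_f, dd_f, ss_f = mm * f, dd * f, ss * f
    mm_out = math.trunc(mm_f)
    dd_f += (mm_f - mm_out) * 30              # fractional months spill to days
    dd_out = math.trunc(dd_f)
    ss_out = ss_f + (dd_f - dd_out) * 86400   # fractional days spill to seconds
    return mm_out, dd_out, ss_out

# '1 month 1 day 1 second' * 1.2345
print(tuple(round(x, 4) for x in interval_mul(1, 1, 1, 1.2345)))  # (1, 8, 23286.0345)
```

Feeding the same scaled field values through the same spilldown is why the two sides of the equality test above agree.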

Re: Have I found an interval arithmetic bug?

From
Bruce Momjian
Date:
On Fri, Jul 23, 2021 at 10:55:11AM -0700, Bryn Llewellyn wrote:
> SELECT
>   '1.2345 months 1.2345 days 1.2345 seconds'::interval = 
>   '1 month 1 day 1 second'::interval*1.2345;
> 
> In 13.3, the result is TRUE. (I know that this doesn’t guarantee that the
> internal representations of the two compared interval values are the same. But
> it’s a necessary condition for the outcome that I’m referring to and serves to
> indicate the point I’m making. A more careful test can be made.)

So you are saying fractional unit output should match multiplication
output?  It doesn't now for all units:

    SELECT interval '1.3443 years';
       interval
    ---------------
     1 year 4 mons
    
    SELECT interval '1 years' * 1.3443;
                ?column?
    ---------------------------------
     1 year 4 mons 3 days 22:45:07.2

It is true this patch is further reducing that matching.  Do people
think I should make them match as part of this patch?

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.




Re: Have I found an interval arithmetic bug?

From
Bryn Llewellyn
Date:
On 23-Jul-2021, bruce@momjian.us wrote:

On Fri, Jul 23, 2021 at 10:55:11AM -0700, Bryn Llewellyn wrote:
SELECT
 '1.2345 months 1.2345 days 1.2345 seconds'::interval =
 '1 month 1 day 1 second'::interval*1.2345;

In 13.3, the result is TRUE. (I know that this doesn’t guarantee that the
internal representations of the two compared interval values are the same. But
it’s a necessary condition for the outcome that I’m referring to and serves to
indicate the point I’m making. A more careful test can be made.)

So you are saying fractional unit output should match multiplication
output?  It doesn't now for all units:

SELECT interval '1.3443 years';
  interval
---------------
1 year 4 mons

SELECT interval '1 years' * 1.3443;
           ?column?
---------------------------------
1 year 4 mons 3 days 22:45:07.2

It is true this patch is further reducing that matching.  Do people
think I should make them match as part of this patch?

Summary:
--------

It seems to me that the best thing to do is fix the coarse bug about which there is unanimous agreement and to leave everything else (quirks and all) untouched.

Detail:
-------

My previous email (to which Bruce replies here) was muddled. Sorry. The challenge is that there are a number of mechanisms at work. Their effects are conflated. And it’s very hard to unscramble them.

The underlying problem is that the internal representation of an interval value is a [mm, dd, ss] tuple. This fact is documented. The representation is key to understanding the definitions of these operations:

— defining an interval value from a text literal that uses real number values for its various fields.

— defining an interval value from make_interval(). (This one is easy because the API requires integral values for all but the seconds argument. It would be interesting to know why this asymmetrical definition was implemented. It seems to imply that somebody thought that spilldown was a bad idea and should be prevented before it might happen.)

— creating the text typecast of an extant interval value.

— creating an interval value by adding/subtracting an extant interval value to/from another

— creating an interval value by multiplying or dividing an extant interval value by a (real) number

— creating an interval value by subtracting a pair of moments of the same data type (timestamptz, plain timestamp, or time)

— creating a new moment value by adding or subtracting an extant interval value to an extant moment value of the same data type.

— creating an interval value by applying  justify_hours(i), justify_days(i), and justify_interval(i) to an extant interval value, i.

— creating a double precision value by applying extract(epoch from i) 
to an extant interval value, i.

— evaluating inequality and equality tests to compare two extant interval values.

Notice that, for example, this test:

select
  interval  '1.3443 years' as i1,
  interval '1 years' * 1.3443 as i2;

conflates three things: converting an interval literal to a [mm, dd, ss] tuple; multiplying a [mm, dd, ss] tuple by a real number; and converting a [mm, dd, ss] tuple to a text representation. Similarly, this test:

select
  interval  '1.3443 years' <
  interval '1 years' * 1.3443;

conflates three things: converting an interval literal to a [mm, dd, ss] tuple; multiplying a [mm, dd, ss] tuple by a real number; and inequality comparison of two [mm, dd, ss] tuples.

As far as I’ve been able to determine, the PG documentation doesn’t do a good job of defining the semantics of any of these operations. Some (like the “justify” functions) are sketched reasonably well. Others, like interval multiplication, are entirely undefined.

This makes discussion of simple tests like the two I showed immediately above hard. It also makes any discussion of correctness, possible bugs, and proposed implementation changes very difficult.

Further, it also makes it hard to see how tests for application code that uses any of these operations can be designed. The normal approach relies on testing that you get what you expect. But here, you don't know what to expect—unless (as I’ve done) you first assert hypotheses for the undefined operations and test them with programmed simulations. Of course, this is, in general, an unreliable approach. The only way to know what code is intended to do is to read the prose specification that informs the implementation.

I had forgotten one piece of the long history of this thread. Soon after I presented the testcase that folks agree shows a clear bug, I asked about the rules for creating the internal [mm, dd, ss] tuple from a text literal that uses real numbers for the fields. My tests showed two things: (1) an intuitively clear model for the spilldown of nonintegral months to days and then, in turn, of nonintegral days to seconds; and (2) a quirky rule for deriving intermediate months from fractional years and fractional months before then using the more obvious rules to spill to days. (This defies terse expression in prose. I copied my PL/pgSQL implementation below.)

There was initially some discussion about changing the implementation of the spill-down from [years, months] in the interval literal to the ultimate [mm, dd, ss] representation. This is what Bruce is asking about, and it’s what I was muddled about.

As I’ve said, my conclusion is that the only safe approach is to create and use only “pure” interval values (where just one of the internal fields is non-zero). For this reason (and having seen what I decided were impossibly unmemorable rules in my modeled implementation) I didn’t look at the rules for the other fields that the interval literal allows (weeks, centuries, millennia, and so on).

--------------------------------------------------------------------------------
mm_trunc                constant int              not null := trunc(p.mm);
mm_remainder            constant double precision not null := p.mm - mm_trunc::double precision;

-- This is a quirk.
mm_out                  constant int              not null := trunc(p.yy*mm_per_yy) + mm_trunc;

dd_real_from_mm         constant double precision not null := mm_remainder*dd_per_mm;

dd_int_from_mm          constant int              not null := trunc(dd_real_from_mm);
dd_remainder_from_mm    constant double precision not null := dd_real_from_mm - dd_int_from_mm::double precision;

dd_int_from_user        constant int              not null := trunc(p.dd);
dd_remainder_from_user  constant double precision not null := p.dd - dd_int_from_user::double precision;

dd_out                  constant int              not null := dd_int_from_mm + dd_int_from_user;

d_remainder             constant double precision not null := dd_remainder_from_mm + dd_remainder_from_user;

ss_out                  constant double precision not null := d_remainder*ss_per_dd +
                                                              p.hh*ss_per_hh +
                                                              p.mi*ss_per_mi +
                                                              p.ss;
--------------------------------------------------------------------------------
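The PL/pgSQL fragment above can be ported to runnable Python. This is a reverse-engineered hypothesis of the 13.x literal-conversion behavior (including the quirk), not the actual PostgreSQL source:

```python
import math

def to_mm_dd_ss(yy, mm, dd, hh, mi, ss):
    """Port of the PL/pgSQL model above: how a 13.x interval literal's
    real-number fields appear to map to the internal [mm, dd, ss] tuple."""
    MM_PER_YY, DD_PER_MM = 12, 30
    SS_PER_DD, SS_PER_HH, SS_PER_MI = 86400, 3600, 60

    mm_trunc = math.trunc(mm)
    mm_remainder = mm - mm_trunc
    # The quirk: the fractional part of yy*12 is simply discarded.
    mm_out = math.trunc(yy * MM_PER_YY) + mm_trunc

    dd_real_from_mm = mm_remainder * DD_PER_MM
    dd_int_from_mm = math.trunc(dd_real_from_mm)
    dd_remainder_from_mm = dd_real_from_mm - dd_int_from_mm

    dd_int_from_user = math.trunc(dd)
    dd_remainder_from_user = dd - dd_int_from_user

    dd_out = dd_int_from_mm + dd_int_from_user
    ss_out = ((dd_remainder_from_mm + dd_remainder_from_user) * SS_PER_DD
              + hh * SS_PER_HH + mi * SS_PER_MI + ss)
    return mm_out, dd_out, ss_out

# '1.3443 years' -> 16 months ('1 year 4 mons'): the 0.1316 months is dropped.
print(to_mm_dd_ss(1.3443, 0, 0, 0, 0, 0))  # (16, 0, 0.0)
```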
Re: Have I found an interval arithmetic bug?

From
Bruce Momjian
Date:
On Sun, Jul 25, 2021 at 11:56:54AM -0700, Bryn Llewellyn wrote:
> As far as I’ve been able to determine, the PG documentation doesn’t do a good job of
> defining the semantics of any of these operations. Some (like the “justify”

This is because fractional interval values are not used or asked about
often.

> functions) are sketched reasonably well. Others, like interval multiplication,
> are entirely undefined.

Yes, the “justify” functions were requested and implemented because they
met a frequently-requested need unrelated to fractional values, though
they do have spill-up uses.

> This makes discussion of simple test like the two I showed immediately above
> hard. It also makes any discussion of correctness, possible bugs, and proposed
> implementation changes very difficult.

Agreed.  With fractional values an edge use-case, we are trying to find
the most useful implementation.

> As I’ve said, my conclusion is that the only safe approach is to create and use
> only “pure” interval values (where just one of the internal fields is
> non-zero). For this reason (and having seen what I decided was the impossibly
> unmemorable rules that my modeled implementation uses) I didn’t look at the
> rules for the other fields that the interval literal allows (weeks, centuries,
> millennia, and so on).

I think the current page is clear about _specifying_ fractional units,
but you are right that multiplication/division of fractional values is
not covered.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.




Re: Have I found an interval arithmetic bug?

From
Bruce Momjian
Date:
On Fri, Apr  2, 2021 at 09:02:29PM -0400, Bruce Momjian wrote:
> On Fri, Apr  2, 2021 at 05:50:59PM -0700, Bryn Llewellyn wrote:
> > are the user’s parameterization. All are real numbers. Because non-integral
> > values for years, months, days, hours, and minutes are allowed when you specify
> > a value using the ::interval typecast, my reference doc must state the rules. I
> > would have struggled to express these rules in prose—especially given the use
> > both of trunc() and floor(). I would have struggled more to explain what
> > requirements these rules meet.
> 
> The fundamental issue is that while months, days, and seconds are
> consistent in their own units, when you have to cross from one unit to
> another, it is by definition imprecise, since the interval is not tied
> to a specific date, with its own days-of-the-month and leap days and
> daylight savings time changes.  It feels like it is going to be
> imprecise no matter what we do.
> 
> Adding to this is the fact that interval values are stored in C 'struct
> tm' defined in libc's ctime(), where months are integers, so carrying
> around non-integer month values until we get a final result would add a
> lot of complexity, and complexity to a system that is by definition
> imprecise, which doesn't seem worth it.

I went ahead and modified the interval multiplication/division functions
to use the same logic as fractional interval units:

    SELECT interval '23 mons';
        interval
    ----------------
     1 year 11 mons
    
    SELECT interval '23 mons' / 2;
        ?column?
    -----------------
     11 mons 15 days
    
    SELECT interval '23.5 mons';
            interval
    ------------------------
     1 year 11 mons 15 days
    
    SELECT interval '23.5 mons' / 2;
             ?column?
    --------------------------
     11 mons 22 days 12:00:00
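The arithmetic behind the '23.5 mons' / 2 example can be sketched directly (assuming the 30 days/mon and 24 hours/day conventions):

```python
import math

# '23.5 mons' / 2 = 11.75 months; the 0.75 month spills to 22.5 days,
# and the 0.5 day spills to 12:00:00.
total = 23.5 / 2                      # 11.75 months
mons = math.trunc(total)              # 11
days_f = (total - mons) * 30          # 22.5 days
days = math.trunc(days_f)             # 22
hours = (days_f - days) * 24          # 12.0 hours
print(mons, days, hours)              # 11 22 12.0
```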

I think the big issue is that casting to interval flattens the value into integer
mons/days/secs, so we can no longer distinguish units greater than months
from months themselves.

Using Bryn's example, the master branch output is:

    SELECT
       interval  '1.3443 years' as i1,
       interval '1 years' * 1.3443 as i2;
          i1       |               i2
    ---------------+---------------------------------
     1 year 4 mons | 1 year 4 mons 3 days 22:45:07.2

and the attached patch output is:

    SELECT
      interval  '1.3443 years' as i1,
      interval '1 years' * 1.3443 as i2;
          i1       |          i2
    ---------------+----------------------
     1 year 4 mons | 1 year 4 mons 4 days

which looks like an improvement.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.


Attachment

Re: Have I found an interval arithmetic bug?

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> I went ahead and modified the interval multiplication/division functions
> to use the same logic as fractional interval units:

Wait. A. Minute.

What I think we have consensus on is that interval_in is doing the
wrong thing in a particular corner case.  I have heard nobody but
you suggesting that we should start undertaking behavioral changes
in other interval functions, and I don't believe that that's a good
road to start going down.  These behaviors have stood for many years.
Moreover, since the whole thing is by definition operating with
inadequate information, it is inevitable that for every case you
make better there will be another one you make worse.

I'm really not on board with changing anything except interval_in,
and even there, we had better be certain that everything we change
is a case that is certainly being made better.

BTW, please do not post patches as gzipped attachments, unless
they're enormous.  You're just adding another step making it
harder for people to look at them.

            regards, tom lane



Re: Have I found an interval arithmetic bug?

From
Bruce Momjian
Date:
On Tue, Jul 27, 2021 at 04:01:54PM -0400, Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > I went ahead and modified the interval multiplication/division functions
> > to use the same logic as fractional interval units:
> 
> Wait. A. Minute.
> 
> What I think we have consensus on is that interval_in is doing the
> wrong thing in a particular corner case.  I have heard nobody but
> you suggesting that we should start undertaking behavioral changes
> in other interval functions, and I don't believe that that's a good
> road to start going down.  These behaviors have stood for many years.
> Moreover, since the whole thing is by definition operating with
> inadequate information, it is inevitable that for every case you
> make better there will be another one you make worse.

Bryn mentioned this so I thought I would see what the result looks like.
I am fine to skip them.

> I'm really not on board with changing anything except interval_in,
> and even there, we had better be certain that everything we change
> is a case that is certainly being made better.

Well, I think what I had before the multiply/divide changes were
acceptable to everyone except Bryn, who was looking for more
consistency.
 
> BTW, please do not post patches as gzipped attachments, unless
> they're enormous.  You're just adding another step making it
> harder for people to look at them.

OK, what is large for you?  100k bytes?  I was using 10k bytes.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.




Re: Have I found an interval arithmetic bug?

From
Bryn Llewellyn
Date:
> On 27-Jul-2021, at 14:13, Bruce Momjian <bruce@momjian.us> wrote:
>
> On Tue, Jul 27, 2021 at 04:01:54PM -0400, Tom Lane wrote:
>> Bruce Momjian <bruce@momjian.us> writes:
>>> I went ahead and modified the interval multiplication/division functions
>>> to use the same logic as fractional interval units:
>>
>> Wait. A. Minute.
>>
>> What I think we have consensus on is that interval_in is doing the
>> wrong thing in a particular corner case.  I have heard nobody but
>> you suggesting that we should start undertaking behavioral changes
>> in other interval functions, and I don't believe that that's a good
>> road to start going down.  These behaviors have stood for many years.
>> Moreover, since the whole thing is by definition operating with
>> inadequate information, it is inevitable that for every case you
>> make better there will be another one you make worse.
>
> Bryn mentioned this so I thought I would see what the result looks like.
> I am fine to skip them.
>
>> I'm really not on board with changing anything except interval_in,
>> and even there, we had better be certain that everything we change
>> is a case that is certainly being made better.
>
> Well, I think what I had before the multiply/divide changes was
> acceptable to everyone except Bryn, who was looking for more
> consistency.
>
>> BTW, please do not post patches as gzipped attachments, unless
>> they're enormous.  You're just adding another step making it
>> harder for people to look at them.
>
> OK, what is large for you?  100k bytes?  I was using 10k bytes.

Before I say anything else, I’ll stress what I wrote recently (under the heading “summary”). I support Tom’s idea that
the only appropriate change to make is to fix only the exactly self-evident bug that I reported at the start of this
thread.

I fear that Bruce doesn’t understand my point about interval multiplication (which includes multiplying by a number
whose absolute value lies between 0 and 1). Here it is. I believe that the semantics are (and should be) defined like
this:

[mm, dd, ss]*n == post_spilldown([mm*n, dd*n, ss*n])

where the function post_spilldown() applies the rules that are used when an interval literal that specifies only values
for months, days, and seconds is converted to the internal [mm, dd, ss] representation, where mm and dd are 4-byte
integers and ss is an 8-byte integer that represents microseconds.
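
A minimal Python sketch of this conjectured post_spilldown() model (my own illustration, not PostgreSQL source code; the 30-day month and 24-hour day are the documented approximations):

```python
MONTH_DAYS = 30          # assumed: 1 month = 30 days
DAY_SECONDS = 24 * 3600  # assumed: 1 day = 24 hours

def post_spilldown(mm, dd, ss):
    # Spill the fractional part of months into days, then the
    # fractional part of days into seconds.
    m = int(mm)
    dd += (mm - m) * MONTH_DAYS
    d = int(dd)
    ss += (dd - d) * DAY_SECONDS
    return m, d, ss

# '1 month 1 day 1 second' * 1.234 under this model gives
# 1 mon, 8 days, and roughly 21946.834 s (i.e. 06:05:46.834).
m, d, s = post_spilldown(1 * 1.234, 1 * 1.234, 1 * 1.234)
```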

Here’s a simple test that’s consistent with that hypothesis:

with
  c1 as (
    select
      '1 month 1 day 1 second'::interval as i1,
      '1.234 month 1.234 day 1.234 second'::interval as i3),

  c2 as (
    select i1*1.234 as i2, i3 from c1)

select i2::text as i2_txt, i3::text as i3_txt from c2;

Here’s the result:

          i2_txt           |            i3
---------------------------+---------------------------
 1 mon 8 days 06:05:46.834 | 1 mon 8 days 06:05:46.834

So I’m so far happy.

But, as I said, I’d forgotten an orthogonal quirk. This test shows it. It’s informed by the fact that 1.2345*12.0 is
14.8140.

select
  ('1.2345 years'  ::interval)::text as i1_txt,
  ('14.8140 months'::interval)::text as i2_txt;

Here’s the result:

    i1_txt     |             i2_txt
---------------+--------------------------------
 1 year 2 mons | 1 year 2 mons 24 days 10:04:48

It seems to me to be crazy behavior. I haven’t found any account of it in the PG docs. Others have argued that it’s a
sensible result. Anyway, I don’t believe that I’ve ever argued that it’s a bug. I wanted only to know what rationale
informed the design. I agree that changing the behavior here would be problematic for extant code.

This quirk explains the outcome of this test:

select
  ('1.2345 years'::interval)::text as i1_txt,
  ('14.8140 months'::interval)::text as i2_txt,
  (1.2345*('1 years'::interval))::text as i3_txt;

This is the result:

    i1_txt     |             i2_txt             |             i3_txt
---------------+--------------------------------+--------------------------------
 1 year 2 mons | 1 year 2 mons 24 days 10:04:48 | 1 year 2 mons 24 days 10:04:48

Notice that the same text is reported for i2_txt as for i3_txt.
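
The 14.8140-months figure can be decomposed by hand under the usual 30-day/24-hour assumptions (plain arithmetic, not PostgreSQL code):

```python
frac_months = 14.8140                 # 1.2345 * 12.0
mons = int(frac_months)               # 14 -> shown as "1 year 2 mons"
days = (frac_months - mons) * 30      # fractional months spill to days
d = int(days)                         # 24 days
secs = round((days - d) * 86400)      # 36288 s -> shown as 10:04:48
```

which reproduces the reported "1 year 2 mons 24 days 10:04:48".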




Re: Have I found an interval arithmetic bug?

From
John W Higgins
Date:


On Tue, Jul 27, 2021 at 3:36 PM Bryn Llewellyn <bryn@yugabyte.com> wrote:

with
  c1 as (
    select
      '1 month 1 day 1 second'::interval as i1,
      '1.234 month 1.234 day 1.234 second'::interval as i3),

  c2 as (
    select i1*1.234 as i2, i3 from c1)

select i2::text as i2_txt, i3::text as i3_txt from c2;


It's nice to envision all forms of fancy calculations. But the fact is that 

'1.5 month'::interval * 2 != '3 month'::interval 

with any of these patches - and if that doesn't work - the rest of the strange numbers really seem to be irrelevant.

If there is a desire to handle fractional cases - then all pieces need to be held as provided until they are transformed into something. In other words - 1.5 month needs to be held as 1.5 month until we ask for it to be reduced to 1 month and 15 days at some point. If the interval data type immediately casts 1.5 months to 1 month 15 days then all subsequent calculations are going to be wrong.

I appreciate there is generally no way to accomplish this right now - but that means walking away from things like 1 month * 1.234 as being not calculable as opposed to trying to piece something together that fails pretty quickly.
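
John’s idea of deferring the spill can be sketched as a hypothetical wrapper type (an illustration of the suggestion, not proposed code): keep the fields as given and reduce to integer units only on demand.

```python
from dataclasses import dataclass

@dataclass
class LazyInterval:
    # Hypothetical type: fields stay fractional until reduce() is called.
    months: float = 0.0
    days: float = 0.0
    secs: float = 0.0

    def __mul__(self, k):
        return LazyInterval(self.months * k, self.days * k, self.secs * k)

    def reduce(self):
        # Spill only at the end, using the 30-day/24-hour assumptions.
        m = int(self.months)
        days = self.days + (self.months - m) * 30
        d = int(days)
        secs = self.secs + (days - d) * 86400
        return m, d, secs

# With deferred spilling, '1.5 month' * 2 reduces to exactly 3 months,
# whereas spilling eagerly gives 1 mon 15 days, then 2 mons 30 days.
```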

John

Re: Have I found an interval arithmetic bug?

From
Dean Rasheed
Date:
On Wed, 28 Jul 2021 at 00:08, John W Higgins <wishdev@gmail.com> wrote:
>
> It's nice to envision all forms of fancy calculations. But the fact is that
>
> '1.5 month'::interval * 2 != '3 month'::interval
>

That's not exactly true. Even without the patch:

SELECT '1.5 month'::interval * 2 AS product,
       '3 month'::interval AS expected,
       justify_interval('1.5 month'::interval * 2) AS justified_product,
       '1.5 month'::interval * 2 = '3 month'::interval AS equal;

    product     | expected | justified_product | equal
----------------+----------+-------------------+-------
 2 mons 30 days | 3 mons   | 3 mons            | t
(1 row)

So it's equal even without calling justify_interval() on the result.

FWIW, I remain of the opinion that the interval literal code should
just spill down to lower units in all cases, just like the
multiplication and division code, so that the results are consistent
(barring floating point rounding errors) and explainable.

Regards,
Dean



Re: Have I found an interval arithmetic bug?

From
John W Higgins
Date:


On Wed, Jul 28, 2021 at 12:42 AM Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
On Wed, 28 Jul 2021 at 00:08, John W Higgins <wishdev@gmail.com> wrote:
>
> It's nice to envision all forms of fancy calculations. But the fact is that
>
> '1.5 month'::interval * 2 != '3 month'::interval
>

That's not exactly true. Even without the patch:

SELECT '1.5 month'::interval * 2 AS product,
       '3 month'::interval AS expected,
       justify_interval('1.5 month'::interval * 2) AS justified_product,
       '1.5 month'::interval * 2 = '3 month'::interval AS equal;

    product     | expected | justified_product | equal
----------------+----------+-------------------+-------
 2 mons 30 days | 3 mons   | 3 mons            | t
(1 row)


That's viewing something via the mechanism that is incorrectly (technically speaking) doing the work in the first place. It believes they are the same - but they are clearly not when actually used.

select '1/1/2001'::date + (interval '3 month');
      ?column?      
---------------------
 2001-04-01 00:00:00
(1 row)

vs

select '1/1/2001'::date + (interval '1.5 month' * 2);
      ?column?      
---------------------
 2001-03-31 00:00:00
(1 row)

That's the flaw in this entire body of work - we keep taking fractional amounts, doing round-offs, and then trying to add or multiply the pieces back, and ending up with weird floating-point-math style errors. That's not to complain about it - but we shouldn't be looking at edge cases with things like 1 month * 1.234 when 1.5 months * 2 doesn't work properly.

John

P.S. Finally we have items like this

select '12/1/2001'::date + (interval '1.5 months' * 2);
      ?column?      
---------------------
 2002-03-03 00:00:00
(1 row)

postgres=# select '1/1/2001'::date + (interval '1.5 months' * 2);
      ?column?      
---------------------
 2001-03-31 00:00:00
(1 row)

Which only has a 28 day gap because of the length of February - clearly this is not working quite right.
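
The differing gaps follow from '1.5 months' * 2 being stored as 2 mons 30 days: the month step lands on a different base date before the 30 days are added. A stdlib sketch (my own helper approximating PG-style month addition with day clamping, not PG code):

```python
import calendar
from datetime import date, timedelta

def add_months(d, n):
    # Advance the month, clamping the day to the target month's length.
    y, m = divmod((d.month - 1) + n, 12)
    y, m = d.year + y, m + 1
    day = min(d.day, calendar.monthrange(y, m)[1])
    return date(y, m, day)

# interval '1.5 months' * 2 multiplies fieldwise: 2 mons 30 days.
dec_start = add_months(date(2001, 12, 1), 2) + timedelta(days=30)
jan_start = add_months(date(2001, 1, 1), 2) + timedelta(days=30)
# dec_start passes through February (28 days); jan_start does not.
```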

Re: Have I found an interval arithmetic bug?

From
Bruce Momjian
Date:
On Wed, Jul 28, 2021 at 08:42:31AM +0100, Dean Rasheed wrote:
> On Wed, 28 Jul 2021 at 00:08, John W Higgins <wishdev@gmail.com> wrote:
> >
> > It's nice to envision all forms of fancy calculations. But the fact is that
> >
> > '1.5 month'::interval * 2 != '3 month'::interval
> >
> 
> That's not exactly true. Even without the patch:
> 
> SELECT '1.5 month'::interval * 2 AS product,
>        '3 month'::interval AS expected,
>        justify_interval('1.5 month'::interval * 2) AS justified_product,
>        '1.5 month'::interval * 2 = '3 month'::interval AS equal;
> 
>     product     | expected | justified_product | equal
> ----------------+----------+-------------------+-------
>  2 mons 30 days | 3 mons   | 3 mons            | t
> (1 row)
> 
> So it's equal even without calling justify_interval() on the result.
> 
> FWIW, I remain of the opinion that the interval literal code should
> just spill down to lower units in all cases, just like the
> multiplication and division code, so that the results are consistent
> (barring floating point rounding errors) and explainable.

Here is a more minimal patch that doesn't change the spill-down units at
all, but merely documents it, and changes the spilldown to months to
round instead of truncate.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.


Attachment

Re: Have I found an interval arithmetic bug?

From
Robert Haas
Date:
On Tue, Jul 27, 2021 at 4:02 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> What I think we have consensus on is that interval_in is doing the
> wrong thing in a particular corner case.  I have heard nobody but
> you suggesting that we should start undertaking behavioral changes
> in other interval functions, and I don't believe that that's a good
> road to start going down.  These behaviors have stood for many years.
> Moreover, since the whole thing is by definition operating with
> inadequate information, it is inevitable that for every case you
> make better there will be another one you make worse.

I agree that we need to be really conservative here. I think Tom is
right that if we start changing behaviors that "seem wrong," we will
probably make some things better and other things worse. The overall
amount of stuff that "seems wrong" will probably not go down, but a
lot of people's applications will break when they try to upgrade to
v15. That's not going to be a win overall.

I think a lot of the discussion on this thread consists of people
hoping for things that are not very realistic. The interval type
represents the number of months as an integer, and the number of days
as an integer. That means that an interval like '0.7 months' does not
really exist. If you ask for that interval what you get is actually 21
days, which is a reasonable approximation of 0.7 months but not the
same thing, except in April, June, September, and November. So when
you then say that you want 0.7 months + 0.3 months to equal 1.0
months, what you're really requesting is that 21 days + 9 days = 1
month. That system has been tried in the past, but it was abandoned
roughly around the time of Julius Caesar for the very good reason that
the orbital period of the earth about the sun is noticeably greater
than 360 days.
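
Robert's arithmetic can be checked directly (the 30-day spill constant is PostgreSQL's documented approximation; rounding is used here as in the recent patches):

```python
import calendar

# '0.7 months' is stored as round(0.7 * 30) = 21 days;
# '0.3 months' as 9 days.  Their sum is 30 days, not "1 month".
days = round(0.7 * 30) + round(0.3 * 30)

# 30 days equals a calendar month only in four months of the year:
thirty_day_months = [m for m in range(1, 13)
                     if calendar.monthrange(2021, m)[1] == 30]
# i.e. April, June, September, November
```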

It would be entirely possible to design a data type that could
represent such values more exactly. A data type that had a
representation similar to interval but with double values for the
numbers of years and months would be able to compute 0.7 months + 0.3
months and get 1.0 months with no problem.

If we were doing this over again, I would argue that, with this
on-disk representation, 0.7 months ought to be rejected as invalid
input, because it's generally not a good idea to have a data type that
silently converts a value into a different value that is not
equivalent for all purposes. It is confusing and causes people to
expect behavior different from what they will actually get. Now, it
seems far too late to consider such a change at this point ... and it
is also no good considering a change to the on-disk representation of
the existing data type at this point ... but it is also no good
pretending like we have a floating-point representation of months and
days when we actually do not.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: Have I found an interval arithmetic bug?

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> If we were doing this over again, I would argue that, with this
> on-disk representation, 0.7 months ought to be rejected as invalid
> input, because it's generally not a good idea to have a data type that
> silently converts a value into a different value that is not
> equivalent for all purposes. It is confusing and causes people to
> expect behavior different from what they will actually get. Now, it
> seems far too late to consider such a change at this point ... and it
> is also no good considering a change to the on-disk representation of
> the existing data type at this point ... but it is also no good
> pretending like we have a floating-point representation of months and
> days when we actually do not.

You know, I was thinking exactly that thing earlier.  Changing the
on-disk representation is certainly a nonstarter, but the problem
here stems from expecting interval_in to do something sane with inputs
that do not correspond to any representable value.  I do not think we
have any other datatypes where we expect the input function to make
choices like that.

Is it really too late to say "that was a damfool idea" and drop fractional
years/months/etc from interval_in's lexicon?  By definition, this will not
create any dump/reload problems, because interval_out will never emit any
such thing.  It will break applications that are expecting such syntax to
do something sane.  But that expectation is fundamentally not meetable,
so maybe we should just make a clean break.  (Without back-patching it,
of course.)

I'm not entirely sure about whether to reject fractional days, though.
Converting those on the assumption of 1 day == 24 hours is not quite
theoretically right.  But it's unsurprising, which is not something
we can say about fractions of the larger units.  So maybe it's OK to
continue accepting that.

            regards, tom lane



Re: Have I found an interval arithmetic bug?

From
Robert Haas
Date:
On Wed, Jul 28, 2021 at 11:52 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> You know, I was thinking exactly that thing earlier.  Changing the
> on-disk representation is certainly a nonstarter, but the problem
> here stems from expecting interval_in to do something sane with inputs
> that do not correspond to any representable value.  I do not think we
> have any other datatypes where we expect the input function to make
> choices like that.

It's not exactly the same issue, but the input function whose behavior
most regularly trips people up is bytea, because they try something
like 'x'::bytea and it seems to DWTW so then they try it on all their
data and discover that, for example, '\'::bytea fails outright, or
that ''::bytea = '\x'::bytea, contrary to expectations. People often
seem to think that casting to bytea should work like convert_to(), but
it doesn't. As in the case at hand, byteain() has to guess whether the
input is intended to be the 'hex' or 'escape' format, and because the
'escape' format looks a lot like plain old text, confusion ensues.
Now, guessing between two input formats that are both legal for the
data type is not exactly the same as guessing what to do with a value
that's not directly representable, but it has the same ultimate effect
i.e. the human beings perceive the system as buggy.

A case that is perhaps more theoretically similar to the instance at
hand is rounding during the construction of floating point values. My
system thinks '1.00000000000000000000000001'::float = '1'::float, so
in that case, as in this one, we've decided that it's OK to build an
inexact representation of the input value. I don't really see what can
be done about this considering that the textual representation uses
base 10 and the internal representation uses base 2, but I think this
doesn't cause us as many problems in practice because people
understand how it works, which doesn't seem to be the case with the
interval data type, at least if this thread is any indication.
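
Robert's float example is easy to reproduce, since Python floats are IEEE-754 doubles:

```python
import math

# Both literals construct the same double:
same = float('1.00000000000000000000000001') == float('1')

# The nearest representable double above 1.0 differs by 2**-52,
# so anything closer to 1.0 than that collapses onto it.
gap = math.nextafter(1.0, 2.0) - 1.0
```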

I am dubious that it's worth the pain of making the input function
reject cases involving fractional units. It's true that some people
here aren't happy with the current behavior, but they may be no happier
if we reject those cases with an error, and other people may then be
unhappy too. I think your previous idea was the best one so far: fix
the input function so that 'X years Y months' and 'Y months X years'
always produce the same answer, and call it good.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: Have I found an interval arithmetic bug?

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Wed, Jul 28, 2021 at 11:52 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> You know, I was thinking exactly that thing earlier.  Changing the
>> on-disk representation is certainly a nonstarter, but the problem
>> here stems from expecting interval_in to do something sane with inputs
>> that do not correspond to any representable value.  I do not think we
>> have any other datatypes where we expect the input function to make
>> choices like that.

> A case that is perhaps more theoretically similar to the instance at
> hand is rounding during the construction of floating point values. My
> system thinks '1.00000000000000000000000001'::float = '1'::float, so
> in that case, as in this one, we've decided that it's OK to build an
> inexact representation of the input value.

Fair point, but you decided when you chose to use float that you don't
care about the differences between numbers that only differ at the
seventeenth or so decimal place.  (Maybe, if you don't understand what
float is, you didn't make that choice intentionally ... but that's
a documentation issue not a code shortcoming.)  However, it's fairly
hard to believe that somebody who writes '1.4 years'::interval doesn't
care about the 0.4 year.  The fact that we silently convert that to,
effectively, 1.33333333... years seems like a bigger roundoff error
than one would expect.

> I am dubious that it's worth the pain of making the input function
> reject cases involving fractional units. It's true that some people
> here aren't happy with the current behavior, but they may be no happier
> if we reject those cases with an error, and other people may then be
> unhappy too.

Maybe.  A possible compromise is to accept only exactly-representable
fractions.  Then, for instance, we'd take 1.5 years (resulting in 18
months) but not 1.4 years.  Now, this might fall foul of your point about
not wanting to mislead people into expecting the system to do things it
can't; but I'd argue that the existing behavior misleads them much more.
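
Tom's compromise amounts to an acceptance test like this (a sketch of the rule, not proposed PostgreSQL code):

```python
def exactly_representable_years(frac_years):
    # Accept a fractional year only if it is a whole number of months.
    months = frac_years * 12
    return months == int(months)

# 1.5 years = 18 months: accepted; 1.4 years = 16.8 months: rejected.
```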

> I think your previous idea was the best one so far: fix
> the input function so that 'X years Y months' and 'Y months X years'
> always produce the same answer, and call it good.

That would clearly be a bug fix.  I'm just troubled that there are
still behaviors that people will see as bugs.

            regards, tom lane



Re: Have I found an interval arithmetic bug?

From
Robert Haas
Date:
On Wed, Jul 28, 2021 at 1:05 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Fair point, but you decided when you chose to use float that you don't
> care about the differences between numbers that only differ at the
> seventeenth or so decimal place.  (Maybe, if you don't understand what
> float is, you didn't make that choice intentionally ... but that's
> a documentation issue not a code shortcoming.)  However, it's fairly
> hard to believe that somebody who writes '1.4 years'::interval doesn't
> care about the 0.4 year.  The fact that we silently convert that to,
> effectively, 1.33333333... years seems like a bigger roundoff error
> than one would expect.

Yeah, that's definitely a fair point!

> > I am dubious that it's worth the pain of making the input function
> > reject cases involving fractional units. It's true that some people
> > here aren't happy with the current behavior, but they may be no happier
> > if we reject those cases with an error, and other people may then be
> > unhappy too.
>
> Maybe.  A possible compromise is to accept only exactly-representable
> fractions.  Then, for instance, we'd take 1.5 years (resulting in 18
> months) but not 1.4 years.  Now, this might fall foul of your point about
> not wanting to mislead people into expecting the system to do things it
> can't; but I'd argue that the existing behavior misleads them much more.

Well, let's see what other people think.

> > I think your previous idea was the best one so far: fix
> > the input function so that 'X years Y months' and 'Y months X years'
> > always produce the same answer, and call it good.
>
> That would clearly be a bug fix.  I'm just troubled that there are
> still behaviors that people will see as bugs.

That's a reasonable thing to be troubled about, but the date and time
related datatypes have so many odd and crufty behaviors that I have a
hard time believing that there's another possible outcome. If somebody
showed up today and proposed a new data type and told us that the way
to format values of that data type was to say to_char(my_value,
alphabet_soup) I think they would not be taken very seriously. A lot
of this code, and the associated interfaces, date back to a time when
PostgreSQL was far more primitive than today, and when databases in
general were as well. At least we didn't end up with a datatype called
varchar2 ... or not yet, anyway.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: Have I found an interval arithmetic bug?

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Wed, Jul 28, 2021 at 1:05 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> That would clearly be a bug fix.  I'm just troubled that there are
>> still behaviors that people will see as bugs.

> That's a reasonable thing to be troubled about, but the date and time
> related datatypes have so many odd and crufty behaviors that I have a
> hard time believing that there's another possible outcome.

There's surely a ton of cruft there, but I think most of it stems from
western civilization's received rules for timekeeping, which we can do
little about.  But the fact that interval_in accepts '1.4 years' when
it cannot do anything very reasonable with that input is entirely
self-inflicted damage.

BTW, I don't have a problem with the "interval * float8" operator
doing equally strange things, because if you don't like what it
does you can always write your own multiplication function that
you like better.  There can be only one interval_in, though,
so I don't think it should be misrepresenting the fundamental
capabilities of the datatype.

            regards, tom lane



Re: Have I found an interval arithmetic bug?

From
Bruce Momjian
Date:
On Wed, Jul 28, 2021 at 11:19:16AM -0400, Bruce Momjian wrote:
> On Wed, Jul 28, 2021 at 08:42:31AM +0100, Dean Rasheed wrote:
> > So it's equal even without calling justify_interval() on the result.
> > 
> > FWIW, I remain of the opinion that the interval literal code should
> > just spill down to lower units in all cases, just like the
> > multiplication and division code, so that the results are consistent
> > (barring floating point rounding errors) and explainable.
> 
> Here is a more minimal patch that doesn't change the spill-down units at
> all, but merely documents it, and changes the spilldown to months to
> round instead of truncate.

Unless I hear more feedback, I plan to apply this doc patch to all
branches with the word "rounded" changed to "truncated" in the back
branches, and apply the rounded code changes to master.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.




Re: Have I found an interval arithmetic bug?

From
Bruce Momjian
Date:
On Fri, Jul 30, 2021 at 12:04:39PM -0400, Bruce Momjian wrote:
> On Wed, Jul 28, 2021 at 11:19:16AM -0400, Bruce Momjian wrote:
> > On Wed, Jul 28, 2021 at 08:42:31AM +0100, Dean Rasheed wrote:
> > > So it's equal even without calling justify_interval() on the result.
> > > 
> > > FWIW, I remain of the opinion that the interval literal code should
> > > just spill down to lower units in all cases, just like the
> > > multiplication and division code, so that the results are consistent
> > > (barring floating point rounding errors) and explainable.
> > 
> > Here is a more minimal patch that doesn't change the spill-down units at
> > all, but merely documents it, and changes the spilldown to months to
> > round instead of truncate.
> 
> Unless I hear more feedback, I plan to apply this doc patch to all
> branches with the word "rounded" changed to "truncated" in the back
> branches, and apply the rounded code changes to master.

Now that I think of it, I will just remove the word "rounded" from the
back branch docs so we are technically breaking the documented API less
in PG 15.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.




Re: Have I found an interval arithmetic bug?

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> On Fri, Jul 30, 2021 at 12:04:39PM -0400, Bruce Momjian wrote:
>> Unless I hear more feedback, I plan to apply this doc patch to all
>> branches with the word "rounded" changed to "truncated" in the back
>> branches, and apply the rounded code changes to master.

> Now that I think of it, I will just remove the word "rounded" from the
> back branch docs so we are technically breaking the documented API less
> in PG 15.

I think your first idea was better.  Not documenting the behavior
doesn't make this not an API change; it just makes it harder for
people to understand what changed.

The doc patch itself is not exactly fine:

+     Field values can have fractional parts;  for example, <literal>'1.5
+     weeks'</literal> or <literal>'01:02:03.45'</literal>.  However,

I think "some field values", as it was worded previously, was better.
If you try to write 01.5:02:03, that is not going to be interpreted
as 1.5 hours.  (Hmm, I get something that seems quite insane:

regression=# select '01.5:02:03'::interval;
    interval    
----------------
 1 day 14:03:00
(1 row)

I wonder what it thinks it's doing there.)

This is wrong:

+     because interval internally stores only three integer units (months,
+     days, seconds), fractional units must be spilled to smaller units.

s/seconds/microseconds/ is probably enough to fix that.

+     For example, because months are approximated to equal 30 days,
+     fractional values of units greater than months is rounded to be the
+     nearest integer number of months.  Fractional units of months or less
+     are computed to be an integer number of days and seconds, assuming
+     24 hours per day.  For example, <literal>'1.5 months'</literal>
+     becomes <literal>1 month 15 days</literal>.

This entire passage is vague, and grammatically shaky too.  Perhaps
more like

      Fractional parts of units larger than months are rounded to the
      nearest integer number of months; for example '1.5 years'
      becomes '1 year 6 mons'.  Fractional parts of months are rounded
      to the nearest integer number of days, using the assumption that
      one month equals 30 days; for example '1.5 months'
      becomes '1 mon 15 days'.  Fractional parts of days and weeks
      are converted to microseconds, using the assumption that one day
      equals 24 hours.

      On output, the months field is shown as an appropriate number of
      years and months; the days field is shown as-is; the microseconds
      field is converted to hours, minutes, and possibly-fractional
      seconds.

            regards, tom lane



Re: Have I found an interval arithmetic bug?

From
Bruce Momjian
Date:
On Fri, Jul 30, 2021 at 12:49:34PM -0400, Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > Now that I think of it, I will just remove the word "rounded" from the
> > back branch docs so we are technically breaking the documented API less
> > in PG 15.
> 
> I think your first idea was better.  Not documenting the behavior
> doesn't make this not an API change; it just makes it harder for
> people to understand what changed.

OK.  However, I thought we were more worried about changing documented
APIs than undocumented ones.  Anyway, I will do as you suggested.

> The doc patch itself is not exactly fine:
> 
> +     Field values can have fractional parts;  for example, <literal>'1.5
> +     weeks'</literal> or <literal>'01:02:03.45'</literal>.  However,
> 
> I think "some field values", as it was worded previously, was better.
> If you try to write 01.5:02:03, that is not going to be interpreted
> as 1.5 hours.  (Hmm, I get something that seems quite insane:
> 
> regression=# select '01.5:02:03'::interval;
>     interval    
> ----------------
>  1 day 14:03:00
> (1 row)
> 
> I wonder what it thinks it's doing there.)

It thinks 01.5:02:03 is Days:Hours:Minutes, so I think all fields can use
fractions:

    SELECT interval '1.5 minutes';
     interval
    ----------
     00:01:30
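
That Days:Hours:Minutes reading with a fractional days field can be modeled directly (a sketch of the hypothesized parse, not the datetime.c logic):

```python
def parse_dhm(s):
    # Treat 'DD:HH:MM' with a possibly fractional days field.
    days, hours, mins = (float(x) for x in s.split(':'))
    total_min = int(days * 24 * 60 + hours * 60 + mins)
    d, rem = divmod(total_min, 24 * 60)
    h, m = divmod(rem, 60)
    return d, h, m

# '01.5:02:03' -> 1.5 days + 2 h + 3 min = 1 day 14:03:00
```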

> This is wrong:
> 
> +     because interval internally stores only three integer units (months,
> +     days, seconds), fractional units must be spilled to smaller units.
> 
> s/seconds/microseconds/ is probably enough to fix that.

OK, there were a few place that said "seconds" so I fixed those too.

> +     For example, because months are approximated to equal 30 days,
> +     fractional values of units greater than months is rounded to be the
> +     nearest integer number of months.  Fractional units of months or less
> +     are computed to be an integer number of days and seconds, assuming
> +     24 hours per day.  For example, <literal>'1.5 months'</literal>
> +     becomes <literal>1 month 15 days</literal>.
> 
> This entire passage is vague, and grammatically shaky too.  Perhaps
> more like
> 
>       Fractional parts of units larger than months are rounded to the
>       nearest integer number of months; for example '1.5 years'
>       becomes '1 year 6 mons'.  Fractional parts of months are rounded
>       to the nearest integer number of days, using the assumption that
>       one month equals 30 days; for example '1.5 months'

The newest patch actually doesn't work as explained above --- fractional
months now continue to spill to microseconds.  I think you are looking
at a previous version.

>       becomes '1 mon 15 days'.  Fractional parts of days and weeks
>       are converted to microseconds, using the assumption that one day
>       equals 24 hours.

Uh, fractional weeks can be integer days.

>       On output, the months field is shown as an appropriate number of
>       years and months; the days field is shown as-is; the microseconds
>       field is converted to hours, minutes, and possibly-fractional
>       seconds.

Here is an updated patch that includes some of your ideas.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.


Attachment

Re: Have I found an interval arithmetic bug?

From
Robert Haas
Date:
On Fri, Jul 30, 2021 at 3:03 PM Bruce Momjian <bruce@momjian.us> wrote:
> Here is an updated patch that includes some of your ideas.

Just to be clear, I am against this patch. I don't think it's a
minimal change for the reported problem, and I think some people will
be unhappy about the behavior changes.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: Have I found an interval arithmetic bug?

From
Bruce Momjian
Date:
On Fri, Jul 30, 2021 at 03:08:56PM -0400, Robert Haas wrote:
> On Fri, Jul 30, 2021 at 3:03 PM Bruce Momjian <bruce@momjian.us> wrote:
> > Here is an updated patch that includes some of your ideas.
> 
> Just to be clear, I am against this patch. I don't think it's a
> minimal change for the reported problem, and I think some people will
> be unhappy about the behavior changes.

Uh, what do you suggest then?  You wanted the years/months fixed, and
rounding at spill stop time makes sense, and fixes the problem.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.




Re: Have I found an interval arithmetic bug?

From
Robert Haas
Date:
On Fri, Jul 30, 2021 at 3:20 PM Bruce Momjian <bruce@momjian.us> wrote:
> Uh, what do you suggest then?  You wanted the years/months fixed, and
> rounding at spill stop time makes sense, and fixes the problem.

Hmm, maybe I misunderstood. Are you saying that you think the patch
will fix cases like interval '-1.7 years 29.4 months' and interval
'29.4 months -1.7 years' to produce the same answer without changing
any other cases? I had the impression that you were proposing a bigger
change to the rules for converting fractional units to units of lower
type, particularly because Tom called it an "API change".

For some reason I can't apply the patch locally.

[rhaas pgsql]$ patch -p1 < ~/Downloads/interval.diff
(Stripping trailing CRs from patch.)
patching file doc/src/sgml/datatype.sgml
(Stripping trailing CRs from patch.)
patching file src/backend/utils/adt/datetime.c
patch: **** malformed patch at line 90: @@ -3601,7 +3597,7 @@
DecodeISO8601Interval(char *str,

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: Have I found an interval arithmetic bug?

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> On Fri, Jul 30, 2021 at 03:08:56PM -0400, Robert Haas wrote:
>> Just to be clear, I am against this patch. I don't think it's a
>> minimal change for the reported problem, and I think some people will
>> be unhappy about the behavior changes.

> Uh, what do you suggest then?  You wanted the years/months fixed, and
> rounding at spill stop time makes sense, and fixes the problem.

The complained-of bug is that 'X years Y months' isn't always
identical to 'Y months X years'.  I do not believe that this patch
fixes that, though it may obscure the problem for some values of
X and Y.  After a quick look at the code, I am suspicious that
the actual problem is that AdjustFractDays is applied at the wrong
time, before we've collected all the input.  We probably need to
collect up all of the contributing input as floats and then do the
fractional spilling once at the end.

Having said that, I also agree that it's not great that '1.4 years'
is converted to '1 year 4 mons' (1.33333... years) rather than
'1 year 5 mons' (1.41666... years).  The latter seems like a clearly
saner translation.  I would really rather that we reject such input
altogether, but if we're going to accept it, we should round not
truncate.

            regards, tom lane



Re: Have I found an interval arithmetic bug?

From
Bruce Momjian
Date:
On Fri, Jul 30, 2021 at 03:47:53PM -0400, Robert Haas wrote:
> On Fri, Jul 30, 2021 at 3:20 PM Bruce Momjian <bruce@momjian.us> wrote:
> > Uh, what do you suggest then?  You wanted the years/months fixed, and
> > rounding at spill stop time makes sense, and fixes the problem.
> 
> Hmm, maybe I misunderstood. Are you saying that you think the patch
> will fix cases like interval '-1.7 years 29.4 months' and interval
> '29.4 months -1.7 years' to produce the same answer without changing
> any other cases? I had the impression that you were proposing a bigger

Yes, tests from the original email:

    SELECT interval '-1.7 years 29.4 months';
        interval
    ----------------
     9 mons 12 days
    (1 row)
    
    SELECT interval '29.4 months -1.7 years';
        interval
    ----------------
     9 mons 12 days
    (1 row)
    
    SELECT interval '-1.7 years' + interval '29.4 months';
        ?column?
    ----------------
     9 mons 12 days
    (1 row)
    
    SELECT interval '29.4 months' + interval '-1.7 years';
        ?column?
    ----------------
     9 mons 12 days

> change to the rules for converting fractional units to units of lower
> type, particularly because Tom called it an "API change".

The API change is to _round_ units greater than months to integral
month values;  we currently truncate.  Changing the spill behavior has
been rejected.

> For some reason I can't apply the patch locally.
> 
> [rhaas pgsql]$ patch -p1 < ~/Downloads/interval.diff
> (Stripping trailing CRs from patch.)
> patching file doc/src/sgml/datatype.sgml
> (Stripping trailing CRs from patch.)
> patching file src/backend/utils/adt/datetime.c
> patch: **** malformed patch at line 90: @@ -3601,7 +3597,7 @@
> DecodeISO8601Interval(char *str,

Uh, here is the patch again, in case that helps.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.


Attachment

Re: Have I found an interval arithmetic bug?

From
Bruce Momjian
Date:
On Fri, Jul 30, 2021 at 03:54:42PM -0400, Tom Lane wrote:
> Having said that, I also agree that it's not great that '1.4 years'
> is converted to '1 year 4 mons' (1.33333... years) rather than
> '1 year 5 mons' (1.41666... years).  The latter seems like a clearly
> saner translation.  I would really rather that we reject such input
> altogether, but if we're going to accept it, we should round not
> truncate.

My patch returns what you want:

    SELECT interval '1.4 years';
       interval
    ---------------
     1 year 5 mons

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.




Re: Have I found an interval arithmetic bug?

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> On Fri, Jul 30, 2021 at 03:54:42PM -0400, Tom Lane wrote:
>> Having said that, I also agree that it's not great that '1.4 years'
>> is converted to '1 year 4 mons' (1.33333... years) rather than
>> '1 year 5 mons' (1.41666... years).  The latter seems like a clearly
>> saner translation.  I would really rather that we reject such input
>> altogether, but if we're going to accept it, we should round not
>> truncate.

> My patch returns what you want:

Yeah, as far as that point goes, I was replying to Robert's
objection not your patch.

            regards, tom lane



Re: Have I found an interval arithmetic bug?

From
Bruce Momjian
Date:
On Fri, Jul 30, 2021 at 03:03:13PM -0400, Bruce Momjian wrote:
> >       On output, the months field is shown as an appropriate number of
> >       years and months; the days field is shown as-is; the microseconds
> >       field is converted to hours, minutes, and possibly-fractional
> >       seconds.
> 
> Here is an updated patch that includes some of your ideas.

"Rounding" patch applied to master, and back branches got only the
adjusted "truncated" doc patch.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  If only the physical world exists, free will is an illusion.