Re: Have I found an interval arithmetic bug? - Mailing list pgsql-hackers

From Zhihong Yu
Subject Re: Have I found an interval arithmetic bug?
Date
Msg-id CALNJ-vRf8MUEhTMRDUJ5o8b_g-pFr8n99-yxaw8qWZX3Bae5XQ@mail.gmail.com
Whole thread Raw
In response to Re: Have I found an interval arithmetic bug?  (Bryn Llewellyn <bryn@yugabyte.com>)
Responses Re: Have I found an interval arithmetic bug?
List pgsql-hackers


On Tue, Apr 13, 2021 at 10:55 AM Bryn Llewellyn <bryn@yugabyte.com> wrote:
On 12-Apr-2021, at 17:25, Bruce Momjian <bruce@momjian.us> wrote:

On Mon, Apr 12, 2021 at 05:20:43PM -0700, Bryn Llewellyn wrote:
I’d argue that the fact that this:

('0.3 months'::interval) + ('0.7 months'::interval)

Is reported as '30 days' and not '1 month' is yet another
bug—precisely because of what I said in my previous email (sorry
that I forked the thread) where I referred to the fact that, in the
right test, adding 1 month gets a different answer than adding 30
days.

Flowing _up_ is what these functions do:

\df *justify*
                              List of functions
  Schema   |       Name       | Result data type | Argument data types | Type
------------+------------------+------------------+---------------------+------
pg_catalog | justify_days     | interval         | interval            | func
pg_catalog | justify_hours    | interval         | interval            | func
pg_catalog | justify_interval | interval         | interval            | func

Yet another convincing reason to get rid of this flow down
business altogether.

We can certainly get rid of all downflow, which in the current patch is
only when fractional internal units are specified.

If some application wants to model flow-down, then it can do so with
trivial programming and full control over its own definition of the
rules.

“Yes please!” re Bruce’s “We can certainly get rid of all downflow, which in the current patch is only when fractional internal units are specified.”

Notice that a user who creates interval values explicitly only by using “make_interval()” will see no behavior change.

There’s another case of up-flow. When you subtract one timestamp value from another, and when they’re far enough apart, then the (internal representation of the) resulting interval value has both a seconds component and a days component. (But never, in my tests, a months component.) I assume that the implementation first gets the difference between the two timestamp values in seconds using (the moral equivalent of) “extract epoch”. And then, if this is greater than 24*60*60, it implements up-flow using the “rule-of-24”—never mind that this means that if you add the answer back to the timestamp value that you subtracted, then you might not get the timestamp value from which you subtracted. (This happens around a DST change and has been discussed earlier in the thread.)

The purpose of these three “justify” functions is dubious. I think that it’s best to think of the 3-component interval vector like an [x, y, z] vector in 3d geometric space, where the three coordinates are not mutually convertible because each has unique semantics. This point has been rehearsed many times in this long thread.

Having said this, it isn’t hard to understand the rules that the functions implement. And, most importantly, their use is voluntary. They are, though, no more than shipped and documented wrappers for a few special cases. A user could so easily write their own function like this:

1. Compute the values of the three components of the internal representation of the passed-in interval value using the “extract” feature and some simple arithmetic.

2. Derive the [minutes, days, seconds] values of a new representation using whatever rules you feel for.

3. Use these new values to create the return interval value.

For example, I might follow a discipline to use interval values that have only one of the three components of the internal representation non-zero. It’s easy to use the “domain” feature for this. (I can use these in any context where I can use the shipped interval.) I could then write a function to convert a “pure seconds” value of my_interval to a “pure years” value. And I could implement my own rules:

— Makes sense only for a large number of seconds that comes out to at least five years (else assert failure).

— Converts seconds to years using the rule that 1 year is, on average, 365.25*24*60*60 seconds, and then truncates it.

There’s no shipped function that does this, and this makes me suspect that I’d prefer to roll my own for any serious purpose.

The existence of the three “justify” functions is, therefore, harmless.


Bruce / Tom:
Can we revisit this topic ?

Cheers 

pgsql-hackers by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: Have I found an interval arithmetic bug?
Next
From: David Rowley
Date:
Subject: Re: plan with result cache is very slow when work_mem is not enough