Re: = t1 - t0 but t0 + i <> t1 when t1 and t2 timestamptz values and i is an interval value - Mailing list pgsql-general

From Bryn Llewellyn
Subject Re: = t1 - t0 but t0 + i <> t1 when t1 and t2 timestamptz values and i is an interval value
Date
Msg-id DAABCBE6-6DFF-4C28-9CED-0AA8053FE12C@yugabyte.com
Whole thread Raw
In response to Re: i := t1 - t0 but t0 + i <> t1 when t1 and t2 timestamptz values and i is an interval value  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: = t1 - t0 but t0 + i <> t1 when t1 and t2 timestamptz values and i is an interval value  (Adrian Klaver <adrian.klaver@aklaver.com>)
List pgsql-general
Tom Lane wrote:

bryn@yugabyte.com writes:
There's more. We see that while the two "interval second” values '1 day 01:00:00' and '25 hours' test as equal, the results of adding each to the same timestamptz value are different.

You're misunderstanding how it works…

PLEASE STATE THE RULES THAT ALLOW THE BEHAVIOR OF THE BLOCK TO BE PREDICTED.

…there is absolutely nothing about either the Gregorian calendar or daylight-savings time that you will like.

(I searched the doc but found nothing.)

There's a specific discussion of the DST-boundary issue on… https://www.postgresql.org/docs/11/functions-datetime.html.

Thanks for the prompt reply Tom.

and for the doc ref. (I used the Version 11 page ‘cos that’s my particular focus.)

« When adding an interval value to (or subtracting an interval value from) a timestamp with time zone value, the days component advances or decrements the date of the timestamp with time zone by the indicated number of days, keeping the time of day the same. Across daylight saving time changes (when the session time zone is set to a time zone that recognizes DST), this means interval '1 day' does not necessarily equal interval '24 hours’.. »

I also noted this from folarte@peoplecall.com:

« It seems you want to use them as simple seconds count, like “I'm starting a batch job which will take 25 hours to be done, when will it end?". Like unix timestamps. If you want that, just avoid interval, use extract epoch to convert timestamptz to a simple number of seconds, do arithmetic there (where intervals are just plain numbers) and convert back using  to_timestamp(). »

Thanks Francisco. That’s a useful tip.

I’d deduced the conceptual background that both Tom and Francisco referred to. And I’ve coined the terms “horological interval” and “cultural interval” to capture the distinction. I’d also noticed that it seems that, to first order, the “interval day to second” flavor maps to “horological” and the "interval year to month” flavor maps to “cultural”. However, as my testcase shows, “day” is an oddity because subtracting two timestamptz values treats “day” in the “horological” sense but adding an interval value to a timestamptz treats “day” in the cultural sense. This was the central point of my question. Neither of you referred to this.

I’m going to conclude just that it is what it is and it won’t change.


« Internally interval values are stored as months, days, and seconds. This is done because the number of days in a month varies, and a day can have 23 or 25 hours if a daylight savings time adjustment is involved. The months and days fields are integers while the seconds field can store fractions. Because intervals are usually created from constant strings or timestamp subtraction, this storage method works well in most cases, but can cause unexpected results… »

(Francisco referred to this.) Consider this:

create table t(k int primary key, i interval day to second not null);
insert into t(k, i) values(1, '1 day 1 hour'), (2, '25 hours');
select k, i from t order by k;

The “select” shows that the difference in spelling the two values is preserved in what’s persisted. This is consistent with months, days, and seconds being persisted separately. And it’s consistent with the different semantic effect that the two values can have on addition. I therefore find it confusing that wrong semantics are imposed on the represented value here:

create unique index on t(i);

It causes the “could not create unique index” error. This is the same effect that the “assert” for equality in my testcase shows. The two values can be defined to be equal only if “day” is given a strict horological meaning. But the present discussion shows that it’s taken culturally on interval addition.

This brings me to another strange observation. Do this:

set time zone 'US/Pacific';
select (
    ('2021-02-15 12:00:00'::timestamptz + (interval '9 months')::interval month)::timestamptz at time zone 'US/Pacific'
  )::text as t;

This is the result:

2021-11-15 12:00:00

In plain English, 9 months after 15-Feb is 15-Nov—consistent with what has been said about “month” always being taken culturally. Now try this:

select (
    ('2021-11-15 12:00:00'::timestamptz - '2021-02-15 12:00:00'::timestamptz)::interval month
  )::text as i;

It silently produces this result:

00:00:00

In plain English, you can’t produce a cultural interval value by subtraction; subtraction can only populate a “day to second” flavor interval. Moreover, the computed “hours” component never exceeds 24 and the computed “days” component is always the horological value.

I failed to find an explanation of this in the doc—but I dare say that my searching skills are too feeble.

Finally, I discovered that this is OK:

create table t(i interval);

But I can’t find a definition of the semantics of a bare interval. However, I did find a column headed “Mixed Interval”  at https://www.postgresql.org/docs/11/datatype-datetime.html#INTERVAL-STYLE-OUTPUT-TABLE. But the example values in the column are consistent with this:

select ((interval '2 years, 3 months, 4 days, 5 hours, 6 minutes 7.8 seconds')::interval)::text as i;

This is the result:

2 years 3 mons 4 days 05:06:07.8

If you repeat the “select” using the typecast “::interval month” then the other components are silently thrown away. But if you repeat it using the typecast “::interval second” then all components are preserved just as with bare “interval”. This muddies my idea that there were three distinct interval flavors: horological, cultural, and hybrid. Is the behavior that I’ve just shown intended?

I can guess the rules for the outcome when such a hybrid is added to a timestamptz value. It’s possible to design edge case tests where you’d get different outcomes if: (a) the cultural component is added first and only then the horological component is added; or (b) the components are added in the other order. It seems to me that the outcome is governed by rule (a). Am I right?

B.t.w., I think that the specific complexities of the proleptic Gregorian calendar are cleanly separable from the basic idea that (considering only the requirements statement space) there is a real distinction to be drawn between “horological” and “cultural”—no matter what calendar rules might be used.

pgsql-general by date:

Previous
From: Benedict Holland
Date:
Subject: Re: Hello - About how to install PgAdmin4 on Debian 10
Next
From: Adrian Klaver
Date:
Subject: Re: Hello - About how to install PgAdmin4 on Debian 10