Thread: BUG #18348: Inconsistency with EXTRACT([field] from INTERVAL);

BUG #18348: Inconsistency with EXTRACT([field] from INTERVAL);

From
PG Bug reporting form
Date:
The following bug has been logged on the website:

Bug reference:      18348
Logged by:          Michael Bondarenko
Email address:      work.michael.2956@gmail.com
PostgreSQL version: 14.10
Operating system:   macOS
Description:

Hello,

I'm building a random semantically-correct SQL code generator for PostgreSQL
and I stumbled upon an inconsistency:

tpch=# select extract(year from interval '3 years');
 extract 
---------
       3
(1 row)

tpch=# select extract(week from interval '3 weeks');
ERROR:  interval units "week" not supported

In the documentation it's mentioned that 'week' is an ISO 8601 week, so it
makes sense why it's not applicable to INTERVAL, which is the same for
isoyear. However, the field is named week and not isoweek, so I expect it to
work like the `select extract(year from interval '3 years');` does.
Moreover, the documentation does not mention that the field cannot be
extracted from INTERVAL, like it does for isoyear:
https://www.postgresql.org/docs/current/functions-datetime.html#FUNCTIONS-DATETIME-EXTRACT
.


Re: BUG #18348: Inconsistency with EXTRACT([field] from INTERVAL);

From
Michael Bondarenko
Date:
Adding another inconsistency I found in the docs to this thread (https://www.postgresql.org/docs/current/functions-datetime.html#FUNCTIONS-DATETIME-EXTRACT):

The docs say: "source must be a value expression of type timestamp, time, or interval. (Expressions of type date are cast to timestamp and can therefore be used as well.)"

Which implies that the following two results must be the same:

tpch=# select extract(microseconds from date '1924.01.01');
ERROR:  date units "microseconds" not supported

tpch=# select extract(microseconds from (date '1924.01.01')::timestamp);
 extract
---------
       0
(1 row)

However, the behaviour is different, which suggests that the date is indeed treated as its own type in EXTRACT, and not cast to timestamp.

On Fri, Feb 16, 2024 at 2:07 PM PG Bug reporting form <noreply@postgresql.org> wrote:
The following bug has been logged on the website:

Bug reference:      18348
Logged by:          Michael Bondarenko
Email address:      work.michael.2956@gmail.com
PostgreSQL version: 14.10
Operating system:   macOS
Description:       

Hello,

I'm building a random semantically-correct SQL code generator for PostgreSQL
and I stumbled upon an inconsistency:

tpch=# select extract(year from interval '3 years');
 extract
---------
       3
(1 row)

tpch=# select extract(week from interval '3 weeks');
ERROR:  interval units "week" not supported

In the documentation it's mentioned that 'week' is an ISO 8601 week, so it
makes sense why it's not applicable to INTERVAL, which is the same for
isoyear. However, the field is named week and not isoweek, so I expect it to
work like the `select extract(year from interval '3 years');` does.
Moreover, the documentation does not mention that the field cannot be
extracted from INTERVAL, like it does for isoyear:
https://www.postgresql.org/docs/current/functions-datetime.html#FUNCTIONS-DATETIME-EXTRACT
.

Re: BUG #18348: Inconsistency with EXTRACT([field] from INTERVAL);

From
David Rowley
Date:
On Sat, 17 Feb 2024 at 01:27, Michael Bondarenko
<work.michael.2956@gmail.com> wrote:
>
> Adding another inconsistency I found in the docs to this thread
(https://www.postgresql.org/docs/current/functions-datetime.html#FUNCTIONS-DATETIME-EXTRACT):
>
> The docs say: "source must be a value expression of type timestamp, time, or interval. (Expressions of type date are
castto timestamp and can therefore be used as well.)"
 
>
> Which implies that the following two results must be the same:
>
> tpch=# select extract(microseconds from date '1924.01.01');
> ERROR:  date units "microseconds" not supported
>
> tpch=# select extract(microseconds from (date '1924.01.01')::timestamp);
>  extract
> ---------
>        0

It looks like a2da77cdb should have updated the documentation for this.

David



Re: BUG #18348: Inconsistency with EXTRACT([field] from INTERVAL);

From
David Rowley
Date:
On Sat, 17 Feb 2024 at 01:27, PG Bug reporting form
<noreply@postgresql.org> wrote:
> tpch=# select extract(week from interval '3 weeks');
> ERROR:  interval units "week" not supported
>
> In the documentation it's mentioned that 'week' is an ISO 8601 week, so it
> makes sense why it's not applicable to INTERVAL, which is the same for
> isoyear. However, the field is named week and not isoweek, so I expect it to
> work like the `select extract(year from interval '3 years');` does.
> Moreover, the documentation does not mention that the field cannot be
> extracted from INTERVAL, like it does for isoyear:
> https://www.postgresql.org/docs/current/functions-datetime.html#FUNCTIONS-DATETIME-EXTRACT

Maybe that table should specify which type(s) each of the items listed
is applicable to. Seems better than mentioning which types they're not
applicable to.

David



in `9.9.1. EXTRACT, date_part`
EXTRACT(field FROM source)

I saw more inconsistencies with the doc when `source` is an interval.

the `minute` field
select extract(minute from interval '2011 year 16 month 35 day 48 hour
1005 min 71 sec 11 ms');
select extract(minute from interval '2011 year 16 month 35 day 48 hour
1005 min 2 sec 11 ms');
select extract(minute from interval '2011 year 16 month 35 day 48 hour
1005 min 2 sec 11 ms');

the `hour` field:
select extract(hour from interval '2011 year 16 month 35 day 48 hour
1005 min 71 sec 11 ms');
select extract(hour from interval '2011 year 16 month 35 day 48 hour
1005 min 2 sec 11 ms');
select extract(hour from interval '2011 year 16 month 35 day 48 hour
1005 min 71 sec 11111111111 ms');

the `quarter` field:
select extract(quarter from interval '2011 year 12 month 48 hour 1005
min 2 sec 11 ms');
SELECT EXTRACT(QUARTER FROM TIMESTAMP '2001-12-16 20:38:40');



Re: BUG #18348: Inconsistency with EXTRACT([field] from INTERVAL);

From
Michael Bondarenko
Date:
When testing I stumbled upon that too, but I thought no calculation was happening in the interval field. However, it's different with the days and months etc. It seems no calculation for day and month and more:

tpch=# select extract(day from interval '86400000 seconds');
 extract
---------
       0
(1 row)

tpch=# select extract(month from interval '86400000 seconds');
 extract
---------
       0
(1 row)

tpch=# select extract(year from interval '86400000 seconds');
 extract
---------
       0
(1 row)

But calculation is present for hour, and minutes and seconds (90061 sec is 1 day 1 hour 1 minute 1 second):

tpch=# select extract(minute from interval '90061 seconds');
 extract
---------
       1
(1 row)

tpch=# select extract(hour from interval '90061 seconds');
 extract
---------
      25
(1 row)

tpch=# select extract(second from interval '90061 seconds');
 extract  
----------
 1.000000
(1 row)

The docs mention The hour field (0–23) for the hours, which is not true because it's not the field at all, but the calculated amount, and the value is not 0-23.

On Sat, Feb 17, 2024 at 3:48 AM jian he <jian.universality@gmail.com> wrote:
in `9.9.1. EXTRACT, date_part`
EXTRACT(field FROM source)

I saw more inconsistencies with the doc when `source` is an interval.

the `minute` field
select extract(minute from interval '2011 year 16 month 35 day 48 hour
1005 min 71 sec 11 ms');
select extract(minute from interval '2011 year 16 month 35 day 48 hour
1005 min 2 sec 11 ms');
select extract(minute from interval '2011 year 16 month 35 day 48 hour
1005 min 2 sec 11 ms');

the `hour` field:
select extract(hour from interval '2011 year 16 month 35 day 48 hour
1005 min 71 sec 11 ms');
select extract(hour from interval '2011 year 16 month 35 day 48 hour
1005 min 2 sec 11 ms');
select extract(hour from interval '2011 year 16 month 35 day 48 hour
1005 min 71 sec 11111111111 ms');

the `quarter` field:
select extract(quarter from interval '2011 year 12 month 48 hour 1005
min 2 sec 11 ms');
SELECT EXTRACT(QUARTER FROM TIMESTAMP '2001-12-16 20:38:40');

Re: BUG #18348: Inconsistency with EXTRACT([field] from INTERVAL);

From
Francisco Olarte
Date:
On Sat, 17 Feb 2024 at 09:01, Michael Bondarenko
<work.michael.2956@gmail.com> wrote:
> When testing I stumbled upon that too, but I thought no calculation was happening in the interval field. However,
it'sdifferent with the days and months etc. It seems no calculation for day and month and more:
 
...
> But calculation is present for hour, and minutes and seconds (90061 sec is 1 day 1 hour 1 minute 1 second):

No, intervals have seconds, days and months. This is because not all
days have 24 hours, due to DST they can have 23 or 25, or even more
extreme values if some country decides to change its time zone
definition. And not all months have 30 days, so 90061 is 0 months, 0
days, 25 hours, 1 minute, 1 second ( IIRC leap second are not handled
).

It is done that way so when you add one day across a dst jump you get
the same hour on the next day, and when you add one month you get the
same day in the next month independent of how many days the month has.
This is great for things like "schedule a meeting one month and one
week from now", but it bites you sometimes, like when you need a
duration to bill for a long event like a phone call, where I always
end up extracting epoch and substracting them.

Francisco Olarte.



Re: BUG #18348: Inconsistency with EXTRACT([field] from INTERVAL);

From
Tom Lane
Date:
David Rowley <dgrowleyml@gmail.com> writes:
> On Sat, 17 Feb 2024 at 01:27, PG Bug reporting form
> <noreply@postgresql.org> wrote:
>> Moreover, the documentation does not mention that the field cannot be
>> extracted from INTERVAL, like it does for isoyear:
>> https://www.postgresql.org/docs/current/functions-datetime.html#FUNCTIONS-DATETIME-EXTRACT

> Maybe that table should specify which type(s) each of the items listed
> is applicable to. Seems better than mentioning which types they're not
> applicable to.

The thing's not laid out as a table though, and converting it seems
like more trouble than this is worth.  The rejected cases hardly seem
surprising.  I propose just mentioning that not all fields apply for
all data types, as in 0001 attached.

(Parenthetically, one case that perhaps is surprising is
    ERROR:  unit "week" not supported for type interval
Why not just return the day field divided by 7?)

Unrelated but adjacent, the discussion of the century field seems
more than a bit flippant when I read it now.  In other places we
are typically content to use examples to make similar points.
I propose doing so here too, as in 0002 attached.

Lastly, the entire page is quite schizophrenic about whether to leave
a blank line between adjacent examples.  I could go either way on
whether to have that whitespace or not, but I do think it would be
better to make it uniform.  Any votes on what to do there?

            regards, tom lane

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index cf3de80394..fc8017f2f3 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -10040,13 +10040,19 @@ EXTRACT(<replaceable>field</replaceable> FROM <replaceable>source</replaceable>)
     The <function>extract</function> function retrieves subfields
     such as year or hour from date/time values.
     <replaceable>source</replaceable> must be a value expression of
-    type <type>timestamp</type>, <type>time</type>, or <type>interval</type>.
-    (Expressions of type <type>date</type> are
-    cast to <type>timestamp</type> and can therefore be used as
-    well.)  <replaceable>field</replaceable> is an identifier or
+    type <type>timestamp</type>, <type>date</type>, <type>time</type>,
+    or <type>interval</type>.  (Timestamps and times can be with or
+    without time zone.)
+    <replaceable>field</replaceable> is an identifier or
     string that selects what field to extract from the source value.
+    Not all fields are valid for every input data type; for example, fields
+    smaller than a day cannot be extracted from a <type>date</type>, while
+    fields of a day or more cannot be extracted from a <type>time</type>.
     The <function>extract</function> function returns values of type
     <type>numeric</type>.
+   </para>
+
+   <para>
     The following are valid field names:

     <!-- alphabetical -->
@@ -10228,7 +10228,7 @@ SELECT EXTRACT(ISODOW FROM TIMESTAMP '2001-02-18 20:38:40');
       <listitem>
        <para>
         The <acronym>ISO</acronym> 8601 week-numbering year that the date
-        falls in (not applicable to intervals)
+        falls in
        </para>

 <screen>
@@ -10256,7 +10256,7 @@ SELECT EXTRACT(ISOYEAR FROM DATE '2006-01-02');
       <listitem>
        <para>
         The <firstterm>Julian Date</firstterm> corresponding to the
-        date or timestamp (not applicable to intervals).  Timestamps
+        date or timestamp.  Timestamps
         that are not local midnight result in a fractional value.  See
         <xref linkend="datetime-julian-dates"/> for more information.
        </para>
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index cf3de80394..5d215d218c 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -10063,17 +10069,11 @@ SELECT EXTRACT(CENTURY FROM TIMESTAMP '2000-12-16 12:21:13');
 <lineannotation>Result: </lineannotation><computeroutput>20</computeroutput>
 SELECT EXTRACT(CENTURY FROM TIMESTAMP '2001-02-16 20:38:40');
 <lineannotation>Result: </lineannotation><computeroutput>21</computeroutput>
+SELECT EXTRACT(CENTURY FROM DATE '0001-01-01 AD');
+<lineannotation>Result: </lineannotation><computeroutput>1</computeroutput>
+SELECT EXTRACT(CENTURY FROM DATE '0001-12-31 BC');
+<lineannotation>Result: </lineannotation><computeroutput>-1</computeroutput>
 </screen>
-
-       <para>
-        The first century starts at 0001-01-01 00:00:00 AD, although
-        they did not know it at the time. This definition applies to all
-        Gregorian calendar countries. There is no century number 0,
-        you go from -1 century to 1 century.
-
-        If you disagree with this, please write your complaint to:
-        Pope, Cathedral Saint-Peter of Roma, Vatican.
-       </para>
       </listitem>
      </varlistentry>


Re: BUG #18348: Inconsistency with EXTRACT([field] from INTERVAL);

From
Tom Lane
Date:
Francisco Olarte <folarte@peoplecall.com> writes:
> On Sat, 17 Feb 2024 at 09:01, Michael Bondarenko
> <work.michael.2956@gmail.com> wrote:
>> When testing I stumbled upon that too, but I thought no calculation was happening in the interval field. However,
it'sdifferent with the days and months etc. It seems no calculation for day and month and more: 
>> ...
>> But calculation is present for hour, and minutes and seconds (90061 sec is 1 day 1 hour 1 minute 1 second):

> No, intervals have seconds, days and months.

Yeah.  I think much of the confusion here comes from starting with
non-normalized interval input.  Sure you can write "2011 year 12 month
48 hour 1005 min 2 sec 11 ms", but that's not how it's stored:

regression=# select interval '2011 year 12 month 48 hour 1005 min 2 sec 11 ms';
        interval
-------------------------
 2012 years 64:45:02.011
(1 row)

(Actually, what's stored is 2012*12 months, 0 days, and some number
of microseconds that I don't feel like working out.  Conversion of
the microseconds to HH:MM:SS.SSS happens on output.)

Once you look at the normalized value, the results of extract()
are far less surprising.

Probably the right place to enlarge on this point is not in the
extract() section at all, but in 8.5.4. Interval Input.  That does
mention the months/days/microseconds representation, but it doesn't
follow through by illustrating how other input is combined.  Perhaps
we'd want to adopt something like the attached (this is separate from
the other patches I posted in the thread).

            regards, tom lane

diff --git a/doc/src/sgml/datatype.sgml b/doc/src/sgml/datatype.sgml
index 66510ee031..1c568e5022 100644
--- a/doc/src/sgml/datatype.sgml
+++ b/doc/src/sgml/datatype.sgml
@@ -2869,10 +2869,46 @@ P <optional> <replaceable>years</replaceable>-<replaceable>months</replaceable>-
     </para>

     <para>
-     Field values can have fractional parts:  for example, <literal>'1.5
+     Internally <type>interval</type> values are stored as three integral
+     fields: months, days, and microseconds.  These fields are kept
+     separate because the number of days in a month varies, while a day
+     can have 23 or 25 hours if a daylight savings time transition is
+     involved.  An interval input string that uses other units is
+     normalized into this format, and then reconstructed in a standardized
+     way for output, for example:
+
+<programlisting>
+SELECT '2 years 15 months 100 weeks 99 hours 123456789 milliseconds'::interval;
+               interval
+---------------------------------------
+ 3 years 3 mons 700 days 133:17:36.789
+</programlisting>
+
+     Here weeks, which are understood as <quote>7 days</quote>, have been
+     kept separate, while the smaller and larger time units were
+     combined and normalized.  It is possible to use the
+     functions <function>justify_days</function>
+     and <function>justify_hours</function> to convert large days or
+     hours values into the next higher field:
+
+<programlisting>
+SELECT justify_days('2 years 15 months 100 weeks 99 hours 123456789 milliseconds'::interval);
+             justify_days
+--------------------------------------
+ 5 years 2 mons 10 days 133:17:36.789
+
+SELECT justify_hours('2 years 15 months 100 weeks 99 hours 123456789 milliseconds'::interval);
+            justify_hours
+--------------------------------------
+ 3 years 3 mons 705 days 13:17:36.789
+</programlisting>
+    </para>
+
+    <para>
+     Input field values can have fractional parts, for example <literal>'1.5
      weeks'</literal> or <literal>'01:02:03.45'</literal>.  However,
-     because interval internally stores only three integer units (months,
-     days, microseconds), fractional units must be spilled to smaller
+     because interval internally stores only integer fields,
+     fractional units must be spilled to smaller
      units.  Fractional parts of units greater than months are rounded to
      be an integer number of months, e.g. <literal>'1.5 years'</literal>
      becomes <literal>'1 year 6 mons'</literal>.  Fractional parts of
@@ -2922,33 +2958,6 @@ P <optional> <replaceable>years</replaceable>-<replaceable>months</replaceable>-
       </tgroup>
      </table>

-    <para>
-     Internally <type>interval</type> values are stored as months, days,
-     and microseconds. This is done because the number of days in a month
-     varies, and a day can have 23 or 25 hours if a daylight savings
-     time adjustment is involved.  The months and days fields are integers
-     while the microseconds field can store fractional seconds.  Because intervals are
-     usually created from constant strings or <type>timestamp</type> subtraction,
-     this storage method works well in most cases, but can cause unexpected
-     results:
-
-<programlisting>
-SELECT EXTRACT(hours from '80 minutes'::interval);
- date_part
------------
-         1
-
-SELECT EXTRACT(days from '80 hours'::interval);
- date_part
------------
-         0
-</programlisting>
-
-     Functions <function>justify_days</function> and
-     <function>justify_hours</function> are available for adjusting days
-     and hours that overflow their normal ranges.
-    </para>
-
    </sect2>

    <sect2 id="datatype-interval-output">
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index cf3de80394..480a8dcb60 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -10461,6 +10461,20 @@ SELECT EXTRACT(YEAR FROM TIMESTAMP '2001-02-16 20:38:40');
     </variablelist>
    </para>

+   <para>
+    When processing <type>interval</type> input,
+    the <function>extract</function> function produces field values that
+    match the interpretation used by the interval output function.  This
+    can produce surprising results if one starts with a non-normalized
+    interval representation, for example:
+<screen>
+SELECT INTERVAL '80 minutes';
+<lineannotation>Result: </lineannotation><computeroutput>01:20:00</computeroutput>
+SELECT EXTRACT(MINUTES FROM INTERVAL '80 minutes');
+<lineannotation>Result: </lineannotation><computeroutput>20</computeroutput>
+</screen>
+   </para>
+
    <note>
     <para>
      When the input value is +/-Infinity, <function>extract</function> returns

On Sun, Feb 18, 2024 at 4:30 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
>
>
> Once you look at the normalized value, the results of extract()
> are far less surprising.
>
> Probably the right place to enlarge on this point is not in the
> extract() section at all, but in 8.5.4. Interval Input.  That does
> mention the months/days/microseconds representation, but it doesn't
> follow through by illustrating how other input is combined.  Perhaps
> we'd want to adopt something like the attached (this is separate from
> the other patches I posted in the thread).
>

--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -10040,13 +10040,19 @@ EXTRACT(<replaceable>field</replaceable>
FROM <replaceable>source</replaceable>)
     The <function>extract</function> function retrieves subfields
     such as year or hour from date/time values.
     <replaceable>source</replaceable> must be a value expression of
-    type <type>timestamp</type>, <type>time</type>, or <type>interval</type>.
-    (Expressions of type <type>date</type> are
-    cast to <type>timestamp</type> and can therefore be used as
-    well.)  <replaceable>field</replaceable> is an identifier or
+    type <type>timestamp</type>, <type>date</type>, <type>time</type>,
+    or <type>interval</type>.  (Timestamps and times can be with or
+    without time zone.)
+    <replaceable>field</replaceable> is an identifier or
     string that selects what field to extract from the source value.
+    Not all fields are valid for every input data type; for example, fields
+    smaller than a day cannot be extracted from a <type>date</type>, while
+    fields of a day or more cannot be extracted from a <type>time</type>.
     The <function>extract</function> function returns values of type
     <type>numeric</type>.
+   </para>

you already mentioned "Not all fields are valid for every input data type".
interval data type don't even have a unit "quarter",
so the following should generate an error?
select extract(quarter from interval '2011 year 12 month 48 hour
1005min 2 sec 11 ms');

9.9.1. EXTRACT, date_part
hour field description as
`
The hour field (0–23)
`
Do we need to update for the EXTRACT(INTERVAL) case?



Re: BUG #18348: Inconsistency with EXTRACT([field] from INTERVAL);

From
Tom Lane
Date:
jian he <jian.universality@gmail.com> writes:
> you already mentioned "Not all fields are valid for every input data type".
> interval data type don't even have a unit "quarter",
> so the following should generate an error?
> select extract(quarter from interval '2011 year 12 month 48 hour
> 1005min 2 sec 11 ms');

I'm not especially persuaded by that reasoning.  Intervals don't have
century or millisecond fields either, but we allow extracting those.

If your argument is that we shouldn't allow it because we don't take
the input INTERVAL '1 quarter', I'd be much more inclined to add that
as valid input than to take away existing extract functionality.
But I'm dubious about the proposition that extract's list of valid
fields should exactly match the set of allowed input units.  The
semantics aren't really the same (as per the '80 minutes' example)
so such a restriction doesn't seem to have much basis in reality.

            regards, tom lane



Re: BUG #18348: Inconsistency with EXTRACT([field] from INTERVAL);

From
Tom Lane
Date:
jian he <jian.universality@gmail.com> writes:
> 9.9.1. EXTRACT, date_part
> hour field description as
> `
> The hour field (0–23)
> `
> Do we need to update for the EXTRACT(INTERVAL) case?

Yeah, probably.  I did a bit more wordsmithing too.
Here's a rolled-up patch.

            regards, tom lane

diff --git a/doc/src/sgml/datatype.sgml b/doc/src/sgml/datatype.sgml
index 66510ee031..73e51b0b11 100644
--- a/doc/src/sgml/datatype.sgml
+++ b/doc/src/sgml/datatype.sgml
@@ -2869,10 +2869,31 @@ P <optional> <replaceable>years</replaceable>-<replaceable>months</replaceable>-
     </para>

     <para>
-     Field values can have fractional parts:  for example, <literal>'1.5
+     Internally, <type>interval</type> values are stored as three integral
+     fields: months, days, and microseconds.  These fields are kept
+     separate because the number of days in a month varies, while a day
+     can have 23 or 25 hours if a daylight savings time transition is
+     involved.  An interval input string that uses other units is
+     normalized into this format, and then reconstructed in a standardized
+     way for output, for example:
+
+<programlisting>
+SELECT '2 years 15 months 100 weeks 99 hours 123456789 milliseconds'::interval;
+               interval
+---------------------------------------
+ 3 years 3 mons 700 days 133:17:36.789
+</programlisting>
+
+     Here weeks, which are understood as <quote>7 days</quote>, have been
+     kept separate, while the smaller and larger time units were
+     combined and normalized.
+    </para>
+
+    <para>
+     Input field values can have fractional parts, for example <literal>'1.5
      weeks'</literal> or <literal>'01:02:03.45'</literal>.  However,
-     because interval internally stores only three integer units (months,
-     days, microseconds), fractional units must be spilled to smaller
+     because <type>interval</type> internally stores only integral fields,
+     fractional values must be converted into smaller
      units.  Fractional parts of units greater than months are rounded to
      be an integer number of months, e.g. <literal>'1.5 years'</literal>
      becomes <literal>'1 year 6 mons'</literal>.  Fractional parts of
@@ -2922,33 +2943,6 @@ P <optional> <replaceable>years</replaceable>-<replaceable>months</replaceable>-
       </tgroup>
      </table>

-    <para>
-     Internally <type>interval</type> values are stored as months, days,
-     and microseconds. This is done because the number of days in a month
-     varies, and a day can have 23 or 25 hours if a daylight savings
-     time adjustment is involved.  The months and days fields are integers
-     while the microseconds field can store fractional seconds.  Because intervals are
-     usually created from constant strings or <type>timestamp</type> subtraction,
-     this storage method works well in most cases, but can cause unexpected
-     results:
-
-<programlisting>
-SELECT EXTRACT(hours from '80 minutes'::interval);
- date_part
------------
-         1
-
-SELECT EXTRACT(days from '80 hours'::interval);
- date_part
------------
-         0
-</programlisting>
-
-     Functions <function>justify_days</function> and
-     <function>justify_hours</function> are available for adjusting days
-     and hours that overflow their normal ranges.
-    </para>
-
    </sect2>

    <sect2 id="datatype-interval-output">
@@ -2960,6 +2954,23 @@ SELECT EXTRACT(days from '80 hours'::interval);
      <seealso>formatting</seealso>
     </indexterm>

+    <para>
+     As previously explained, <productname>PostgreSQL</productname>
+     stores <type>interval</type> values as months, days, and
+     microseconds.  For output, the months field is converted to years and
+     months by dividing by 12.  The days field is shown as-is.  The
+     microseconds field is converted to hours, minutes, seconds, and
+     fractional seconds.  Thus months, minutes, and seconds will never be
+     shown as exceeding the ranges 0–11, 0–59, and 0–59
+     respectively, while the displayed years, days, and hours fields can
+     be quite large.  (The <link
+     linkend="function-justify-days"><function>justify_days</function></link>
+     and <link
+     linkend="function-justify-hours"><function>justify_hours</function></link>
+     functions can be used if it is desirable to transpose large days or
+     hours values into the next higher field.)
+    </para>
+
     <para>
      The output format of the interval type can be set to one of the
      four styles <literal>sql_standard</literal>, <literal>postgres</literal>,
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index cf3de80394..3de13dae44 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -9166,7 +9166,8 @@ SELECT regexp_match('abc01234xyz', '(?:(.*?)(\d+)(.*)){1,1}');
         </para>
         <para>
          Subtract timestamps (converting 24-hour intervals into days,
-         similarly to <function>justify_hours()</function>)
+         similarly to <link
+         linkend="function-justify-hours"><function>justify_hours()</function></link>)
         </para>
         <para>
          <literal>timestamp '2001-09-29 03:00' - timestamp '2001-07-27 12:00'</literal>
@@ -9579,35 +9580,35 @@ SELECT regexp_match('abc01234xyz', '(?:(.*?)(\d+)(.*)){1,1}');

        <row>
         <entry role="func_table_entry"><para role="func_signature">
-         <indexterm>
+         <indexterm id="function-justify-days">
           <primary>justify_days</primary>
          </indexterm>
          <function>justify_days</function> ( <type>interval</type> )
          <returnvalue>interval</returnvalue>
         </para>
         <para>
-         Adjust interval so 30-day time periods are represented as months
+         Adjust interval, converting 30-day time periods to months
         </para>
         <para>
-         <literal>justify_days(interval '35 days')</literal>
-         <returnvalue>1 mon 5 days</returnvalue>
+         <literal>justify_days(interval '1 year 65 days')</literal>
+         <returnvalue>1 year 2 mons 5 days</returnvalue>
         </para></entry>
        </row>

        <row>
         <entry role="func_table_entry"><para role="func_signature">
-         <indexterm>
+         <indexterm id="function-justify-hours">
           <primary>justify_hours</primary>
          </indexterm>
          <function>justify_hours</function> ( <type>interval</type> )
          <returnvalue>interval</returnvalue>
         </para>
         <para>
-         Adjust interval so 24-hour time periods are represented as days
+         Adjust interval, converting 24-hour time periods to days
         </para>
         <para>
-         <literal>justify_hours(interval '27 hours')</literal>
-         <returnvalue>1 day 03:00:00</returnvalue>
+         <literal>justify_hours(interval '50 hours 10 minutes')</literal>
+         <returnvalue>2 days 02:10:00</returnvalue>
         </para></entry>
        </row>

@@ -10040,13 +10041,19 @@ EXTRACT(<replaceable>field</replaceable> FROM <replaceable>source</replaceable>)
     The <function>extract</function> function retrieves subfields
     such as year or hour from date/time values.
     <replaceable>source</replaceable> must be a value expression of
-    type <type>timestamp</type>, <type>time</type>, or <type>interval</type>.
-    (Expressions of type <type>date</type> are
-    cast to <type>timestamp</type> and can therefore be used as
-    well.)  <replaceable>field</replaceable> is an identifier or
+    type <type>timestamp</type>, <type>date</type>, <type>time</type>,
+    or <type>interval</type>.  (Timestamps and times can be with or
+    without time zone.)
+    <replaceable>field</replaceable> is an identifier or
     string that selects what field to extract from the source value.
+    Not all fields are valid for every input data type; for example, fields
+    smaller than a day cannot be extracted from a <type>date</type>, while
+    fields of a day or more cannot be extracted from a <type>time</type>.
     The <function>extract</function> function returns values of type
     <type>numeric</type>.
+   </para>
+
+   <para>
     The following are valid field names:

     <!-- alphabetical -->
@@ -10063,17 +10070,11 @@ SELECT EXTRACT(CENTURY FROM TIMESTAMP '2000-12-16 12:21:13');
 <lineannotation>Result: </lineannotation><computeroutput>20</computeroutput>
 SELECT EXTRACT(CENTURY FROM TIMESTAMP '2001-02-16 20:38:40');
 <lineannotation>Result: </lineannotation><computeroutput>21</computeroutput>
+SELECT EXTRACT(CENTURY FROM DATE '0001-01-01 AD');
+<lineannotation>Result: </lineannotation><computeroutput>1</computeroutput>
+SELECT EXTRACT(CENTURY FROM DATE '0001-12-31 BC');
+<lineannotation>Result: </lineannotation><computeroutput>-1</computeroutput>
 </screen>
-
-       <para>
-        The first century starts at 0001-01-01 00:00:00 AD, although
-        they did not know it at the time. This definition applies to all
-        Gregorian calendar countries. There is no century number 0,
-        you go from -1 century to 1 century.
-
-        If you disagree with this, please write your complaint to:
-        Pope, Cathedral Saint-Peter of Roma, Vatican.
-       </para>
       </listitem>
      </varlistentry>

@@ -10088,7 +10089,6 @@ SELECT EXTRACT(CENTURY FROM TIMESTAMP '2001-02-16 20:38:40');
 <screen>
 SELECT EXTRACT(DAY FROM TIMESTAMP '2001-02-16 20:38:40');
 <lineannotation>Result: </lineannotation><computeroutput>16</computeroutput>
-
 SELECT EXTRACT(DAY FROM INTERVAL '40 days 1 minute');
 <lineannotation>Result: </lineannotation><computeroutput>40</computeroutput>
 </screen>
@@ -10162,10 +10162,8 @@ SELECT EXTRACT(DOY FROM TIMESTAMP '2001-02-16 20:38:40');
 <screen>
 SELECT EXTRACT(EPOCH FROM TIMESTAMP WITH TIME ZONE '2001-02-16 20:38:40.12-08');
 <lineannotation>Result: </lineannotation><computeroutput>982384720.120000</computeroutput>
-
 SELECT EXTRACT(EPOCH FROM TIMESTAMP '2001-02-16 20:38:40.12');
 <lineannotation>Result: </lineannotation><computeroutput>982355920.120000</computeroutput>
-
 SELECT EXTRACT(EPOCH FROM INTERVAL '5 days 3 hours');
 <lineannotation>Result: </lineannotation><computeroutput>442800.000000</computeroutput>
 </screen>
@@ -10193,7 +10191,8 @@ SELECT to_timestamp(982384720.12);
       <term><literal>hour</literal></term>
       <listitem>
        <para>
-        The hour field (0–23)
+        The hour field (0–23 in timestamps, unrestricted for
+        intervals)
        </para>

 <screen>
@@ -10228,7 +10227,7 @@ SELECT EXTRACT(ISODOW FROM TIMESTAMP '2001-02-18 20:38:40');
       <listitem>
        <para>
         The <acronym>ISO</acronym> 8601 week-numbering year that the date
-        falls in (not applicable to intervals)
+        falls in
        </para>

 <screen>
@@ -10256,7 +10255,7 @@ SELECT EXTRACT(ISOYEAR FROM DATE '2006-01-02');
       <listitem>
        <para>
         The <firstterm>Julian Date</firstterm> corresponding to the
-        date or timestamp (not applicable to intervals).  Timestamps
+        date or timestamp.  Timestamps
         that are not local midnight result in a fractional value.  See
         <xref linkend="datetime-julian-dates"/> for more information.
        </para>
@@ -10345,10 +10344,8 @@ SELECT EXTRACT(MINUTE FROM TIMESTAMP '2001-02-16 20:38:40');
 <screen>
 SELECT EXTRACT(MONTH FROM TIMESTAMP '2001-02-16 20:38:40');
 <lineannotation>Result: </lineannotation><computeroutput>2</computeroutput>
-
 SELECT EXTRACT(MONTH FROM INTERVAL '2 years 3 months');
 <lineannotation>Result: </lineannotation><computeroutput>3</computeroutput>
-
 SELECT EXTRACT(MONTH FROM INTERVAL '2 years 13 months');
 <lineannotation>Result: </lineannotation><computeroutput>1</computeroutput>
 </screen>
@@ -10379,7 +10376,6 @@ SELECT EXTRACT(QUARTER FROM TIMESTAMP '2001-02-16 20:38:40');
 <screen>
 SELECT EXTRACT(SECOND FROM TIMESTAMP '2001-02-16 20:38:40');
 <lineannotation>Result: </lineannotation><computeroutput>40.000000</computeroutput>
-
 SELECT EXTRACT(SECOND FROM TIME '17:12:28.5');
 <lineannotation>Result: </lineannotation><computeroutput>28.500000</computeroutput>
 </screen>
@@ -10461,6 +10457,20 @@ SELECT EXTRACT(YEAR FROM TIMESTAMP '2001-02-16 20:38:40');
     </variablelist>
    </para>

+   <para>
+    When processing <type>interval</type> input,
+    the <function>extract</function> function produces field values that
+    match the interpretation used by the interval output function.  This
+    can produce surprising results if one starts with a non-normalized
+    interval representation, for example:
+<screen>
+SELECT INTERVAL '80 minutes';
+<lineannotation>Result: </lineannotation><computeroutput>01:20:00</computeroutput>
+SELECT EXTRACT(MINUTES FROM INTERVAL '80 minutes');
+<lineannotation>Result: </lineannotation><computeroutput>20</computeroutput>
+</screen>
+   </para>
+
    <note>
     <para>
      When the input value is +/-Infinity, <function>extract</function> returns
@@ -10502,7 +10512,6 @@ date_part('<replaceable>field</replaceable>', <replaceable>source</replaceable>)
 <screen>
 SELECT date_part('day', TIMESTAMP '2001-02-16 20:38:40');
 <lineannotation>Result: </lineannotation><computeroutput>16</computeroutput>
-
 SELECT date_part('hour', INTERVAL '4 hours 3 minutes');
 <lineannotation>Result: </lineannotation><computeroutput>4</computeroutput>
 </screen>
@@ -10580,16 +10589,12 @@ date_trunc(<replaceable>field</replaceable>, <replaceable>source</replaceable> [
 <screen>
 SELECT date_trunc('hour', TIMESTAMP '2001-02-16 20:38:40');
 <lineannotation>Result: </lineannotation><computeroutput>2001-02-16 20:00:00</computeroutput>
-
 SELECT date_trunc('year', TIMESTAMP '2001-02-16 20:38:40');
 <lineannotation>Result: </lineannotation><computeroutput>2001-01-01 00:00:00</computeroutput>
-
 SELECT date_trunc('day', TIMESTAMP WITH TIME ZONE '2001-02-16 20:38:40+00');
 <lineannotation>Result: </lineannotation><computeroutput>2001-02-16 00:00:00-05</computeroutput>
-
 SELECT date_trunc('day', TIMESTAMP WITH TIME ZONE '2001-02-16 20:38:40+00', 'Australia/Sydney');
 <lineannotation>Result: </lineannotation><computeroutput>2001-02-16 08:00:00-05</computeroutput>
-
 SELECT date_trunc('hour', INTERVAL '3 days 02:47:33');
 <lineannotation>Result: </lineannotation><computeroutput>3 days 02:00:00</computeroutput>
 </screen>
@@ -10628,7 +10633,6 @@ date_bin(<replaceable>stride</replaceable>, <replaceable>source</replaceable>, <
 <screen>
 SELECT date_bin('15 minutes', TIMESTAMP '2020-02-11 15:44:17', TIMESTAMP '2001-01-01');
 <lineannotation>Result: </lineannotation><computeroutput>2020-02-11 15:30:00</computeroutput>
-
 SELECT date_bin('15 minutes', TIMESTAMP '2020-02-11 15:44:17', TIMESTAMP '2001-01-01 00:02:30');
 <lineannotation>Result: </lineannotation><computeroutput>2020-02-11 15:32:30</computeroutput>
 </screen>
@@ -10814,16 +10818,12 @@ SELECT date_bin('15 minutes', TIMESTAMP '2020-02-11 15:44:17', TIMESTAMP '2001-0
 <screen>
 SELECT TIMESTAMP '2001-02-16 20:38:40' AT TIME ZONE 'America/Denver';
 <lineannotation>Result: </lineannotation><computeroutput>2001-02-16 19:38:40-08</computeroutput>
-
 SELECT TIMESTAMP WITH TIME ZONE '2001-02-16 20:38:40-05' AT TIME ZONE 'America/Denver';
 <lineannotation>Result: </lineannotation><computeroutput>2001-02-16 18:38:40</computeroutput>
-
 SELECT TIMESTAMP '2001-02-16 20:38:40' AT TIME ZONE 'Asia/Tokyo' AT TIME ZONE 'America/Chicago';
 <lineannotation>Result: </lineannotation><computeroutput>2001-02-16 05:38:40</computeroutput>
-
 SELECT TIMESTAMP WITH TIME ZONE '2001-02-16 20:38:40-05' AT LOCAL;
 <lineannotation>Result: </lineannotation><computeroutput>2001-02-16 17:38:40</computeroutput>
-
 SELECT TIME WITH TIME ZONE '20:38:40-05' AT LOCAL;
 <lineannotation>Result: </lineannotation><computeroutput>17:38:40</computeroutput>
 </screen>
@@ -10928,16 +10928,12 @@ LOCALTIMESTAMP(<replaceable>precision</replaceable>)
 <screen>
 SELECT CURRENT_TIME;
 <lineannotation>Result: </lineannotation><computeroutput>14:39:53.662522-05</computeroutput>
-
 SELECT CURRENT_DATE;
 <lineannotation>Result: </lineannotation><computeroutput>2019-12-23</computeroutput>
-
 SELECT CURRENT_TIMESTAMP;
 <lineannotation>Result: </lineannotation><computeroutput>2019-12-23 14:39:53.662522-05</computeroutput>
-
 SELECT CURRENT_TIMESTAMP(2);
 <lineannotation>Result: </lineannotation><computeroutput>2019-12-23 14:39:53.66-05</computeroutput>
-
 SELECT LOCALTIMESTAMP;
 <lineannotation>Result: </lineannotation><computeroutput>2019-12-23 14:39:53.662522</computeroutput>
 </screen>

On Sun, Feb 18, 2024 at 2:14 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
>
> (Parenthetically, one case that perhaps is surprising is
>         ERROR:  unit "week" not supported for type interval
> Why not just return the day field divided by 7?)
>
seems pretty simple?
diff --git a/src/backend/utils/adt/timestamp.c
b/src/backend/utils/adt/timestamp.c
index ed03c50a..5e69e258 100644
--- a/src/backend/utils/adt/timestamp.c
+++ b/src/backend/utils/adt/timestamp.c
@@ -5992,6 +5992,10 @@ interval_part_common(PG_FUNCTION_ARGS, bool retnumeric)
                                intresult = tm->tm_mday;
                                break;

+                       case DTK_WEEK:
+                               intresult = (tm->tm_mday  - 1) / 7 + 1;
+                               break;
but I am not sure not sure how to write the doc.

On Sun, Feb 18, 2024 at 10:19 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> jian he <jian.universality@gmail.com> writes:
> > you already mentioned "Not all fields are valid for every input data type".
> > interval data type don't even have a unit "quarter",
> > so the following should generate an error?
> > select extract(quarter from interval '2011 year 12 month 48 hour
> > 1005min 2 sec 11 ms');
>
> I'm not especially persuaded by that reasoning.  Intervals don't have
> century or millisecond fields either, but we allow extracting those.
>
> If your argument is that we shouldn't allow it because we don't take
> the input INTERVAL '1 quarter', I'd be much more inclined to add that
> as valid input than to take away existing extract functionality.
> But I'm dubious about the proposition that extract's list of valid
> fields should exactly match the set of allowed input units.  The
> semantics aren't really the same (as per the '80 minutes' example)
> so such a restriction doesn't seem to have much basis in reality.
>

in interval_part_common:
case DTK_QUARTER:
intresult = (tm->tm_mon / 3) + 1;
break;

in timestamp_part_common:
case DTK_QUARTER:
intresult = (tm->tm_mon - 1) / 3 + 1;
break;

So in section 9.9.1. EXTRACT, date_part
we may need to document extract(quarter from interval) case.
intervals can be negative, which will make the issue more complicated.
except the "quarter" field , EXTRACT other fields from intervals, the
output seems sane.

for example:
drop table s;
create table s(a interval);
insert into s select ( g * 1000 || 'year ' ||  g || 'month ' || g || '
day ' || g || 'hour ' || g || 'min ' || g || 'sec' )::interval
from generate_series(-20, 20) g;

select
        extract(century from a) as century,
        extract(millennium from a) as millennium,
        extract(decade from a) as decade,
        extract(year from a) as year,
        extract(quarter from a) as quarter,
        extract(month from a) as mon,
        extract(day from a) as day,
        extract(hour from a) as hour,
        extract(min from a) as min,
        extract(second from a) as sec,
        extract(microseconds from a) as microseconds
        -- a
from s order by 2 asc;



Re: BUG #18348: Inconsistency with EXTRACT([field] from INTERVAL);

From
Tom Lane
Date:
jian he <jian.universality@gmail.com> writes:
> On Sun, Feb 18, 2024 at 2:14 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> (Parenthetically, one case that perhaps is surprising is
>> ERROR:  unit "week" not supported for type interval
>> Why not just return the day field divided by 7?)

> seems pretty simple?

Hm, maybe, but does this behave desirably for zero or negative days?

> So in section 9.9.1. EXTRACT, date_part
> we may need to document extract(quarter from interval) case.
> intervals can be negative, which will make the issue more complicated.
> except the "quarter" field , EXTRACT other fields from intervals, the
> output seems sane.

Yeah, I see what you mean: the output for negative month counts is
very bizarre, whereas other fields seem to all produce the negative
of what they'd produce for the absolute value of the interval.
We could either try to fix that or decide that rejecting "quarter"
for intervals is the saner answer.

I went ahead and pushed the docs changes after adding more explicit
descriptions of interval's behavior for the field types where it
seemed important.  If we make any changes to the behavior for
week or quarter fields, ISTM that should be a HEAD-only change.

            regards, tom lane



Re: BUG #18348: Inconsistency with EXTRACT([field] from INTERVAL);

From
Tom Lane
Date:
I wrote:
> jian he <jian.universality@gmail.com> writes:
>> On Sun, Feb 18, 2024 at 2:14 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> (Parenthetically, one case that perhaps is surprising is
>>> ERROR:  unit "week" not supported for type interval
>>> Why not just return the day field divided by 7?)

>> seems pretty simple?

> Hm, maybe, but does this behave desirably for zero or negative days?

>> So in section 9.9.1. EXTRACT, date_part
>> we may need to document extract(quarter from interval) case.
>> intervals can be negative, which will make the issue more complicated.
>> except the "quarter" field , EXTRACT other fields from intervals, the
>> output seems sane.

> Yeah, I see what you mean: the output for negative month counts is
> very bizarre, whereas other fields seem to all produce the negative
> of what they'd produce for the absolute value of the interval.
> We could either try to fix that or decide that rejecting "quarter"
> for intervals is the saner answer.

After fooling with these cases for a little I'm inclined to think
we should do it as attached (no test or docs changes yet).

            regards, tom lane

diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c
index ed03c50a6d..7177c1a62f 100644
--- a/src/backend/utils/adt/timestamp.c
+++ b/src/backend/utils/adt/timestamp.c
@@ -5992,12 +5992,19 @@ interval_part_common(PG_FUNCTION_ARGS, bool retnumeric)
                 intresult = tm->tm_mday;
                 break;

+            case DTK_WEEK:
+                intresult = tm->tm_mday / 7;
+                break;
+
             case DTK_MONTH:
                 intresult = tm->tm_mon;
                 break;

             case DTK_QUARTER:
-                intresult = (tm->tm_mon / 3) + 1;
+                if (tm->tm_year >= 0)
+                    intresult = (tm->tm_mon / 3) + 1;
+                else
+                    intresult = -((-tm->tm_mon / 3) + 1);
                 break;

             case DTK_YEAR:

On Wed, Feb 21, 2024 at 4:56 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> I wrote:
> > jian he <jian.universality@gmail.com> writes:
> >> On Sun, Feb 18, 2024 at 2:14 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >>> (Parenthetically, one case that perhaps is surprising is
> >>> ERROR:  unit "week" not supported for type interval
> >>> Why not just return the day field divided by 7?)
>
> >> seems pretty simple?
>
> > Hm, maybe, but does this behave desirably for zero or negative days?
>
> >> So in section 9.9.1. EXTRACT, date_part
> >> we may need to document extract(quarter from interval) case.
> >> intervals can be negative, which will make the issue more complicated.
> >> except the "quarter" field , EXTRACT other fields from intervals, the
> >> output seems sane.
>
> > Yeah, I see what you mean: the output for negative month counts is
> > very bizarre, whereas other fields seem to all produce the negative
> > of what they'd produce for the absolute value of the interval.
> > We could either try to fix that or decide that rejecting "quarter"
> > for intervals is the saner answer.
>
> After fooling with these cases for a little I'm inclined to think
> we should do it as attached (no test or docs changes yet).
>
>                         regards, tom lane
>

for `week`, we can do following for the doc:

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index e5fa82c1..a21eb9f8 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -10422,7 +10422,7 @@ SELECT EXTRACT(SECOND FROM TIME '17:12:28.5');
         The number of the <acronym>ISO</acronym> 8601 week-numbering week of
         the year.  By definition, ISO weeks start on Mondays and the first
         week of a year contains January 4 of that year.  In other words, the
-        first Thursday of a year is in week 1 of that year.
+        first Thursday of a year is in week 1 of that year. For
<type>interval</type> values, divide the number of days by 7.

Actually, it's not totally correct, since "the number of days is a
numeric value. need to cast "the number of days" to int.

for positive interval value, we can
+ For positive <type>interval</type> values, divide the number of days
by 3 then plus 1.
I don't know how to write the documentation for the `quarter` when
it's negative.



jian he <jian.universality@gmail.com> writes:
> On Wed, Feb 21, 2024 at 4:56 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Yeah, I see what you mean: the output for negative month counts is
>>> very bizarre, whereas other fields seem to all produce the negative
>>> of what they'd produce for the absolute value of the interval.
>>> We could either try to fix that or decide that rejecting "quarter"
>>> for intervals is the saner answer.

>> After fooling with these cases for a little I'm inclined to think
>> we should do it as attached (no test or docs changes yet).

> ... I don't know how to write the documentation for the `quarter` when
> it's negative.

After poking at it some more, I realized that my draft patch was still
wrong about that.  We really have to look at interval->month if we
want to behave plausibly for negative months.

Here's a more fleshed-out patch.  I don't think we really need to
document the behavior for negative intervals; at least, we haven't
done that so far for any other fields.  I did add testing of such
cases though.

            regards, tom lane

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 17c44bc338..e7792fbae1 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -10392,12 +10392,16 @@ SELECT EXTRACT(MONTH FROM INTERVAL '2 years 13 months');
       <term><literal>quarter</literal></term>
       <listitem>
        <para>
-        The quarter of the year (1–4) that the date is in
+        The quarter of the year (1–4) that the date is in;
+        for <type>interval</type> values, the month field divided by 3
+        plus 1
        </para>

 <screen>
 SELECT EXTRACT(QUARTER FROM TIMESTAMP '2001-02-16 20:38:40');
 <lineannotation>Result: </lineannotation><computeroutput>1</computeroutput>
+SELECT EXTRACT(QUARTER FROM INTERVAL '1 year 6 months');
+<lineannotation>Result: </lineannotation><computeroutput>3</computeroutput>
 </screen>
       </listitem>
      </varlistentry>
@@ -10468,9 +10472,16 @@ SELECT EXTRACT(SECOND FROM TIME '17:12:28.5');
         <literal>week</literal> to get consistent results.
        </para>

+       <para>
+        For <type>interval</type> values, the week field is simply the number
+        of integral days divided by 7.
+       </para>
+
 <screen>
 SELECT EXTRACT(WEEK FROM TIMESTAMP '2001-02-16 20:38:40');
 <lineannotation>Result: </lineannotation><computeroutput>7</computeroutput>
+SELECT EXTRACT(WEEK FROM INTERVAL '13 days 24 hours');
+<lineannotation>Result: </lineannotation><computeroutput>1</computeroutput>
 </screen>
       </listitem>
      </varlistentry>
diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c
index e4715605a2..3e65046117 100644
--- a/src/backend/utils/adt/timestamp.c
+++ b/src/backend/utils/adt/timestamp.c
@@ -5918,6 +5918,7 @@ NonFiniteIntervalPart(int type, int unit, char *lowunits, bool isNegative)
         case DTK_MILLISEC:
         case DTK_SECOND:
         case DTK_MINUTE:
+        case DTK_WEEK:
         case DTK_MONTH:
         case DTK_QUARTER:
             return 0.0;
@@ -6037,12 +6038,27 @@ interval_part_common(PG_FUNCTION_ARGS, bool retnumeric)
                 intresult = tm->tm_mday;
                 break;

+            case DTK_WEEK:
+                intresult = tm->tm_mday / 7;
+                break;
+
             case DTK_MONTH:
                 intresult = tm->tm_mon;
                 break;

             case DTK_QUARTER:
-                intresult = (tm->tm_mon / 3) + 1;
+
+                /*
+                 * We want to maintain the rule that a field extracted from a
+                 * negative interval is the negative of the field's value for
+                 * the sign-reversed interval.  The broken-down tm_year and
+                 * tm_mon aren't very helpful for that, so work from
+                 * interval->month.
+                 */
+                if (interval->month >= 0)
+                    intresult = (tm->tm_mon / 3) + 1;
+                else
+                    intresult = -(((-interval->month % MONTHS_PER_YEAR) / 3) + 1);
                 break;

             case DTK_YEAR:
diff --git a/src/test/regress/expected/interval.out b/src/test/regress/expected/interval.out
index 51ae010c7b..e5d919d0cf 100644
--- a/src/test/regress/expected/interval.out
+++ b/src/test/regress/expected/interval.out
@@ -1834,6 +1834,7 @@ SELECT f1,
     EXTRACT(MINUTE FROM f1) AS MINUTE,
     EXTRACT(HOUR FROM f1) AS HOUR,
     EXTRACT(DAY FROM f1) AS DAY,
+    EXTRACT(WEEK FROM f1) AS WEEK,
     EXTRACT(MONTH FROM f1) AS MONTH,
     EXTRACT(QUARTER FROM f1) AS QUARTER,
     EXTRACT(YEAR FROM f1) AS YEAR,
@@ -1842,20 +1843,52 @@ SELECT f1,
     EXTRACT(MILLENNIUM FROM f1) AS MILLENNIUM,
     EXTRACT(EPOCH FROM f1) AS EPOCH
     FROM INTERVAL_TBL;
-              f1               | microsecond | millisecond |   second   | minute |   hour    |    day    | month |
quarter|   year    |  decade   |  century  | millennium |       epoch        

--------------------------------+-------------+-------------+------------+--------+-----------+-----------+-------+---------+-----------+-----------+-----------+------------+-------------------
- @ 1 min                       |           0 |       0.000 |   0.000000 |      1 |         0 |         0 |     0 |
 1 |         0 |         0 |         0 |          0 |         60.000000 
- @ 5 hours                     |           0 |       0.000 |   0.000000 |      0 |         5 |         0 |     0 |
 1 |         0 |         0 |         0 |          0 |      18000.000000 
- @ 10 days                     |           0 |       0.000 |   0.000000 |      0 |         0 |        10 |     0 |
 1 |         0 |         0 |         0 |          0 |     864000.000000 
- @ 34 years                    |           0 |       0.000 |   0.000000 |      0 |         0 |         0 |     0 |
 1 |        34 |         3 |         0 |          0 | 1072958400.000000 
- @ 3 mons                      |           0 |       0.000 |   0.000000 |      0 |         0 |         0 |     3 |
 2 |         0 |         0 |         0 |          0 |    7776000.000000 
- @ 14 secs ago                 |   -14000000 |  -14000.000 | -14.000000 |      0 |         0 |         0 |     0 |
 1 |         0 |         0 |         0 |          0 |        -14.000000 
- @ 1 day 2 hours 3 mins 4 secs |     4000000 |    4000.000 |   4.000000 |      3 |         2 |         1 |     0 |
 1 |         0 |         0 |         0 |          0 |      93784.000000 
- @ 6 years                     |           0 |       0.000 |   0.000000 |      0 |         0 |         0 |     0 |
 1 |         6 |         0 |         0 |          0 |  189345600.000000 
- @ 5 mons                      |           0 |       0.000 |   0.000000 |      0 |         0 |         0 |     5 |
 2 |         0 |         0 |         0 |          0 |   12960000.000000 
- @ 5 mons 12 hours             |           0 |       0.000 |   0.000000 |      0 |        12 |         0 |     5 |
 2 |         0 |         0 |         0 |          0 |   13003200.000000 
- infinity                      |             |             |            |        |  Infinity |  Infinity |       |
   |  Infinity |  Infinity |  Infinity |   Infinity |          Infinity 
- -infinity                     |             |             |            |        | -Infinity | -Infinity |       |
   | -Infinity | -Infinity | -Infinity |  -Infinity |         -Infinity 
+              f1               | microsecond | millisecond |   second   | minute |   hour    |    day    | week |
month| quarter |   year    |  decade   |  century  | millennium |       epoch        

+-------------------------------+-------------+-------------+------------+--------+-----------+-----------+------+-------+---------+-----------+-----------+-----------+------------+-------------------
+ @ 1 min                       |           0 |       0.000 |   0.000000 |      1 |         0 |         0 |    0 |
0|       1 |         0 |         0 |         0 |          0 |         60.000000 
+ @ 5 hours                     |           0 |       0.000 |   0.000000 |      0 |         5 |         0 |    0 |
0|       1 |         0 |         0 |         0 |          0 |      18000.000000 
+ @ 10 days                     |           0 |       0.000 |   0.000000 |      0 |         0 |        10 |    1 |
0|       1 |         0 |         0 |         0 |          0 |     864000.000000 
+ @ 34 years                    |           0 |       0.000 |   0.000000 |      0 |         0 |         0 |    0 |
0|       1 |        34 |         3 |         0 |          0 | 1072958400.000000 
+ @ 3 mons                      |           0 |       0.000 |   0.000000 |      0 |         0 |         0 |    0 |
3|       2 |         0 |         0 |         0 |          0 |    7776000.000000 
+ @ 14 secs ago                 |   -14000000 |  -14000.000 | -14.000000 |      0 |         0 |         0 |    0 |
0|       1 |         0 |         0 |         0 |          0 |        -14.000000 
+ @ 1 day 2 hours 3 mins 4 secs |     4000000 |    4000.000 |   4.000000 |      3 |         2 |         1 |    0 |
0|       1 |         0 |         0 |         0 |          0 |      93784.000000 
+ @ 6 years                     |           0 |       0.000 |   0.000000 |      0 |         0 |         0 |    0 |
0|       1 |         6 |         0 |         0 |          0 |  189345600.000000 
+ @ 5 mons                      |           0 |       0.000 |   0.000000 |      0 |         0 |         0 |    0 |
5|       2 |         0 |         0 |         0 |          0 |   12960000.000000 
+ @ 5 mons 12 hours             |           0 |       0.000 |   0.000000 |      0 |        12 |         0 |    0 |
5|       2 |         0 |         0 |         0 |          0 |   13003200.000000 
+ infinity                      |             |             |            |        |  Infinity |  Infinity |      |
|         |  Infinity |  Infinity |  Infinity |   Infinity |          Infinity 
+ -infinity                     |             |             |            |        | -Infinity | -Infinity |      |
|         | -Infinity | -Infinity | -Infinity |  -Infinity |         -Infinity 
+(12 rows)
+
+SELECT -f1,
+    EXTRACT(MICROSECOND FROM -f1) AS MICROSECOND,
+    EXTRACT(MILLISECOND FROM -f1) AS MILLISECOND,
+    EXTRACT(SECOND FROM -f1) AS SECOND,
+    EXTRACT(MINUTE FROM -f1) AS MINUTE,
+    EXTRACT(HOUR FROM -f1) AS HOUR,
+    EXTRACT(DAY FROM -f1) AS DAY,
+    EXTRACT(WEEK FROM -f1) AS WEEK,
+    EXTRACT(MONTH FROM -f1) AS MONTH,
+    EXTRACT(QUARTER FROM -f1) AS QUARTER,
+    EXTRACT(YEAR FROM -f1) AS YEAR,
+    EXTRACT(DECADE FROM -f1) AS DECADE,
+    EXTRACT(CENTURY FROM -f1) AS CENTURY,
+    EXTRACT(MILLENNIUM FROM -f1) AS MILLENNIUM,
+    EXTRACT(EPOCH FROM -f1) AS EPOCH
+    FROM INTERVAL_TBL;
+             ?column?              | microsecond | millisecond |  second   | minute |   hour    |    day    | week |
month| quarter |   year    |  decade   |  century  | millennium |       epoch         

+-----------------------------------+-------------+-------------+-----------+--------+-----------+-----------+------+-------+---------+-----------+-----------+-----------+------------+--------------------
+ @ 1 min ago                       |           0 |       0.000 |  0.000000 |     -1 |         0 |         0 |    0 |
 0 |       1 |         0 |         0 |         0 |          0 |         -60.000000 
+ @ 5 hours ago                     |           0 |       0.000 |  0.000000 |      0 |        -5 |         0 |    0 |
 0 |       1 |         0 |         0 |         0 |          0 |      -18000.000000 
+ @ 10 days ago                     |           0 |       0.000 |  0.000000 |      0 |         0 |       -10 |   -1 |
 0 |       1 |         0 |         0 |         0 |          0 |     -864000.000000 
+ @ 34 years ago                    |           0 |       0.000 |  0.000000 |      0 |         0 |         0 |    0 |
 0 |      -1 |       -34 |        -3 |         0 |          0 | -1072958400.000000 
+ @ 3 mons ago                      |           0 |       0.000 |  0.000000 |      0 |         0 |         0 |    0 |
-3 |      -2 |         0 |         0 |         0 |          0 |    -7776000.000000 
+ @ 14 secs                         |    14000000 |   14000.000 | 14.000000 |      0 |         0 |         0 |    0 |
 0 |       1 |         0 |         0 |         0 |          0 |          14.000000 
+ @ 1 day 2 hours 3 mins 4 secs ago |    -4000000 |   -4000.000 | -4.000000 |     -3 |        -2 |        -1 |    0 |
 0 |       1 |         0 |         0 |         0 |          0 |      -93784.000000 
+ @ 6 years ago                     |           0 |       0.000 |  0.000000 |      0 |         0 |         0 |    0 |
 0 |      -1 |        -6 |         0 |         0 |          0 |  -189345600.000000 
+ @ 5 mons ago                      |           0 |       0.000 |  0.000000 |      0 |         0 |         0 |    0 |
-5 |      -2 |         0 |         0 |         0 |          0 |   -12960000.000000 
+ @ 5 mons 12 hours ago             |           0 |       0.000 |  0.000000 |      0 |       -12 |         0 |    0 |
-5 |      -2 |         0 |         0 |         0 |          0 |   -13003200.000000 
+ -infinity                         |             |             |           |        | -Infinity | -Infinity |      |
   |         | -Infinity | -Infinity | -Infinity |  -Infinity |          -Infinity 
+ infinity                          |             |             |           |        |  Infinity |  Infinity |      |
   |         |  Infinity |  Infinity |  Infinity |   Infinity |           Infinity 
 (12 rows)

 SELECT EXTRACT(FORTNIGHT FROM INTERVAL '2 days');  -- error
diff --git a/src/test/regress/sql/interval.sql b/src/test/regress/sql/interval.sql
index fbf6e064d6..55054ae65d 100644
--- a/src/test/regress/sql/interval.sql
+++ b/src/test/regress/sql/interval.sql
@@ -592,6 +592,7 @@ SELECT f1,
     EXTRACT(MINUTE FROM f1) AS MINUTE,
     EXTRACT(HOUR FROM f1) AS HOUR,
     EXTRACT(DAY FROM f1) AS DAY,
+    EXTRACT(WEEK FROM f1) AS WEEK,
     EXTRACT(MONTH FROM f1) AS MONTH,
     EXTRACT(QUARTER FROM f1) AS QUARTER,
     EXTRACT(YEAR FROM f1) AS YEAR,
@@ -601,6 +602,23 @@ SELECT f1,
     EXTRACT(EPOCH FROM f1) AS EPOCH
     FROM INTERVAL_TBL;

+SELECT -f1,
+    EXTRACT(MICROSECOND FROM -f1) AS MICROSECOND,
+    EXTRACT(MILLISECOND FROM -f1) AS MILLISECOND,
+    EXTRACT(SECOND FROM -f1) AS SECOND,
+    EXTRACT(MINUTE FROM -f1) AS MINUTE,
+    EXTRACT(HOUR FROM -f1) AS HOUR,
+    EXTRACT(DAY FROM -f1) AS DAY,
+    EXTRACT(WEEK FROM -f1) AS WEEK,
+    EXTRACT(MONTH FROM -f1) AS MONTH,
+    EXTRACT(QUARTER FROM -f1) AS QUARTER,
+    EXTRACT(YEAR FROM -f1) AS YEAR,
+    EXTRACT(DECADE FROM -f1) AS DECADE,
+    EXTRACT(CENTURY FROM -f1) AS CENTURY,
+    EXTRACT(MILLENNIUM FROM -f1) AS MILLENNIUM,
+    EXTRACT(EPOCH FROM -f1) AS EPOCH
+    FROM INTERVAL_TBL;
+
 SELECT EXTRACT(FORTNIGHT FROM INTERVAL '2 days');  -- error
 SELECT EXTRACT(TIMEZONE FROM INTERVAL '2 days');  -- error


BUG #18348: Inconsistency with EXTRACT([field] from INTERVAL);

From
"Wetmore, Matthew (CTR)"
Date:
Devils advocating here, feel free to ignore.

Is there a real need for a negative month?  Sounds like high level this could be disastrous if I screw up the syntax.
(Ah,memories of DD)
 

I have done this in data warehousing with dimensions tables.

Just process on the INT and translate into the name.

I was thinking on how a negative month could impact this side (data warehousing) side of querying. 

I could be chicken little on this, but wanted it in the conversation.

workaround for negative months:

CREATE TABLE dim_biz_hours( year INT(4)
, doy INT(3)
, dow INT(7)
, month INT(2)
, day INT(2)
, hour INT(2)
, minute INT(2)
, second INT(2)
, utc_offset INT(2)
, utc_offset_dst INT(2)
);

INSERT INTO biz_hours (year)
SELECT * FROM generate_series(2000, 2099);

INSERT INTO biz_hours (doy)
SELECT * FROM generate_series(1, 366);

INSERT INTO biz_hours (dow)
SELECT * FROM generate_series(1, 7);

INSERT INTO biz_hours (month)
SELECT * FROM generate_series(1, 12);

INSERT INTO biz_hours (day)
SELECT * FROM generate_series(1, 31) ;

INSERT INTO biz_hours (hour)
SELECT * FROM generate_series(1, 24);

INSERT INTO biz_hours (minute)
SELECT * FROM generate_series(1, 60);

INSERT INTO biz_hours (second
SELECT * FROM generate_series(1, 60);

INSERT INTO biz_hours (utc_offset)
SELECT * FROM generate_series(1, 24);

INSERT INTO biz_hours (utc_offset_dst)
SELECT * FROM generate_series(1, 24);


-----Original Message-----
From: Tom Lane <tgl@sss.pgh.pa.us> 
Sent: Tuesday, May 7, 2024 2:27 PM
To: jian he <jian.universality@gmail.com>
Cc: Francisco Olarte <folarte@peoplecall.com>; Michael Bondarenko <work.michael.2956@gmail.com>;
pgsql-bugs@lists.postgresql.org;dgrowleyml@gmail.com; Peter Eisentraut <peter.eisentraut@enterprisedb.com>
 
Subject: [EXTERNAL] Re: BUG #18348: Inconsistency with EXTRACT([field] from INTERVAL);

jian he <jian.universality@gmail.com> writes:
> On Wed, Feb 21, 2024 at 4:56 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Yeah, I see what you mean: the output for negative month counts is 
>>> very bizarre, whereas other fields seem to all produce the negative 
>>> of what they'd produce for the absolute value of the interval.
>>> We could either try to fix that or decide that rejecting "quarter"
>>> for intervals is the saner answer.

>> After fooling with these cases for a little I'm inclined to think we 
>> should do it as attached (no test or docs changes yet).

> ... I don't know how to write the documentation for the `quarter` when 
> it's negative.

After poking at it some more, I realized that my draft patch was still wrong about that.  We really have to look at
interval->monthif we want to behave plausibly for negative months.
 

Here's a more fleshed-out patch.  I don't think we really need to document the behavior for negative intervals; at
least,we haven't done that so far for any other fields.  I did add testing of such cases though.
 

            regards, tom lane


On Wed, May 8, 2024 at 5:27 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Here's a more fleshed-out patch.  I don't think we really need to
> document the behavior for negative intervals; at least, we haven't
> done that so far for any other fields.  I did add testing of such
> cases though.
>

the doc looks good to me.
extract quarter from the interval makes sense to me.

but in real life, for week, we generally begin with 1?
like "the first week", "second week"

so should
select extract(week from interval '1 day');
return 1
?



"Wetmore, Matthew  (CTR)" <Matthew.Wetmore@evernorth.com> writes:
> Devils advocating here, feel free to ignore.
> Is there a real need for a negative month?  Sounds like high level this could be disastrous if I screw up the syntax.
(Ah,memories of DD) 

What are you objecting to the "need for"?  That intervals can store
negative months at all?  I think that ship sailed a couple decades
ago.  It's hard to use interval as the output of, say,
timestamp minus timestamp if it refuses to allow negative values.

The next fallback position perhaps could be that extract(quarter ...)
could throw error for negative input, but that seems like mostly a
foot-gun.  We've striven elsewhere to not have it throw error, even
if there's not any very sane choice to make.  For instance, these
are pre-existing behaviors:

regression=# select extract(quarter from interval 'infinity');
 extract
---------

(1 row)

regression=# select extract(quarter from interval '-infinity');
 extract
---------

(1 row)

Maybe there's a case for returning null for "quarter" for any negative
months value, but that seems inconsistent with other behaviors of
extract().  The pattern I see for finite values is that negating
the input interval negates each output of extract().

            regards, tom lane



jian he <jian.universality@gmail.com> writes:
> but in real life, for week, we generally begin with 1?
> like "the first week", "second week"
> so should
> select extract(week from interval '1 day');
> return 1
> ?

Hmm, I read it as being "the number of (whole) weeks in the
interval".  Starting with week 1 is what happens in the timestamp
case, true, but I don't find that appropriate for interval.
By analogy,

regression=# select extract(day from interval '23 hours');
 extract 
---------
       0
(1 row)

There's no such thing as "day 0" in the timestamp case,
but that doesn't make this wrong.

In any case, I'm starting to wonder why this issue is on the v17
open items list.  These are hardly new bugs in 17.  If there's
still differences of opinion about what the definition should be,
I think cramming in a change post-feature-freeze is not appropriate.
Let's just queue the issue for the next commitfest (already done
at [1]) and take it off the open items list.

            regards, tom lane

[1] https://commitfest.postgresql.org/48/4979/