Re: BUG #18348: Inconsistency with EXTRACT([field] from INTERVAL); - Mailing list pgsql-bugs

From Tom Lane
Subject Re: BUG #18348: Inconsistency with EXTRACT([field] from INTERVAL);
Date
Msg-id 4116086.1708363353@sss.pgh.pa.us
Whole thread Raw
In response to Re: BUG #18348: Inconsistency with EXTRACT([field] from INTERVAL);  (jian he <jian.universality@gmail.com>)
List pgsql-bugs
jian he <jian.universality@gmail.com> writes:
> 9.9.1. EXTRACT, date_part
> hour field description as
> `
> The hour field (0–23)
> `
> Do we need to update for the EXTRACT(INTERVAL) case?

Yeah, probably.  I did a bit more wordsmithing too.
Here's a rolled-up patch.

            regards, tom lane

diff --git a/doc/src/sgml/datatype.sgml b/doc/src/sgml/datatype.sgml
index 66510ee031..73e51b0b11 100644
--- a/doc/src/sgml/datatype.sgml
+++ b/doc/src/sgml/datatype.sgml
@@ -2869,10 +2869,31 @@ P <optional> <replaceable>years</replaceable>-<replaceable>months</replaceable>-
     </para>

     <para>
-     Field values can have fractional parts:  for example, <literal>'1.5
+     Internally, <type>interval</type> values are stored as three integral
+     fields: months, days, and microseconds.  These fields are kept
+     separate because the number of days in a month varies, while a day
+     can have 23 or 25 hours if a daylight savings time transition is
+     involved.  An interval input string that uses other units is
+     normalized into this format, and then reconstructed in a standardized
+     way for output, for example:
+
+<programlisting>
+SELECT '2 years 15 months 100 weeks 99 hours 123456789 milliseconds'::interval;
+               interval
+---------------------------------------
+ 3 years 3 mons 700 days 133:17:36.789
+</programlisting>
+
+     Here weeks, which are understood as <quote>7 days</quote>, have been
+     kept separate, while the smaller and larger time units were
+     combined and normalized.
+    </para>
+
+    <para>
+     Input field values can have fractional parts, for example <literal>'1.5
      weeks'</literal> or <literal>'01:02:03.45'</literal>.  However,
-     because interval internally stores only three integer units (months,
-     days, microseconds), fractional units must be spilled to smaller
+     because <type>interval</type> internally stores only integral fields,
+     fractional values must be converted into smaller
      units.  Fractional parts of units greater than months are rounded to
      be an integer number of months, e.g. <literal>'1.5 years'</literal>
      becomes <literal>'1 year 6 mons'</literal>.  Fractional parts of
@@ -2922,33 +2943,6 @@ P <optional> <replaceable>years</replaceable>-<replaceable>months</replaceable>-
       </tgroup>
      </table>

-    <para>
-     Internally <type>interval</type> values are stored as months, days,
-     and microseconds. This is done because the number of days in a month
-     varies, and a day can have 23 or 25 hours if a daylight savings
-     time adjustment is involved.  The months and days fields are integers
-     while the microseconds field can store fractional seconds.  Because intervals are
-     usually created from constant strings or <type>timestamp</type> subtraction,
-     this storage method works well in most cases, but can cause unexpected
-     results:
-
-<programlisting>
-SELECT EXTRACT(hours from '80 minutes'::interval);
- date_part
------------
-         1
-
-SELECT EXTRACT(days from '80 hours'::interval);
- date_part
------------
-         0
-</programlisting>
-
-     Functions <function>justify_days</function> and
-     <function>justify_hours</function> are available for adjusting days
-     and hours that overflow their normal ranges.
-    </para>
-
    </sect2>

    <sect2 id="datatype-interval-output">
@@ -2960,6 +2954,23 @@ SELECT EXTRACT(days from '80 hours'::interval);
      <seealso>formatting</seealso>
     </indexterm>

+    <para>
+     As previously explained, <productname>PostgreSQL</productname>
+     stores <type>interval</type> values as months, days, and
+     microseconds.  For output, the months field is converted to years and
+     months by dividing by 12.  The days field is shown as-is.  The
+     microseconds field is converted to hours, minutes, seconds, and
+     fractional seconds.  Thus months, minutes, and seconds will never be
+     shown as exceeding the ranges 0–11, 0–59, and 0–59
+     respectively, while the displayed years, days, and hours fields can
+     be quite large.  (The <link
+     linkend="function-justify-days"><function>justify_days</function></link>
+     and <link
+     linkend="function-justify-hours"><function>justify_hours</function></link>
+     functions can be used if it is desirable to transpose large days or
+     hours values into the next higher field.)
+    </para>
+
     <para>
      The output format of the interval type can be set to one of the
      four styles <literal>sql_standard</literal>, <literal>postgres</literal>,
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index cf3de80394..3de13dae44 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -9166,7 +9166,8 @@ SELECT regexp_match('abc01234xyz', '(?:(.*?)(\d+)(.*)){1,1}');
         </para>
         <para>
          Subtract timestamps (converting 24-hour intervals into days,
-         similarly to <function>justify_hours()</function>)
+         similarly to <link
+         linkend="function-justify-hours"><function>justify_hours()</function></link>)
         </para>
         <para>
          <literal>timestamp '2001-09-29 03:00' - timestamp '2001-07-27 12:00'</literal>
@@ -9579,35 +9580,35 @@ SELECT regexp_match('abc01234xyz', '(?:(.*?)(\d+)(.*)){1,1}');

        <row>
         <entry role="func_table_entry"><para role="func_signature">
-         <indexterm>
+         <indexterm id="function-justify-days">
           <primary>justify_days</primary>
          </indexterm>
          <function>justify_days</function> ( <type>interval</type> )
          <returnvalue>interval</returnvalue>
         </para>
         <para>
-         Adjust interval so 30-day time periods are represented as months
+         Adjust interval, converting 30-day time periods to months
         </para>
         <para>
-         <literal>justify_days(interval '35 days')</literal>
-         <returnvalue>1 mon 5 days</returnvalue>
+         <literal>justify_days(interval '1 year 65 days')</literal>
+         <returnvalue>1 year 2 mons 5 days</returnvalue>
         </para></entry>
        </row>

        <row>
         <entry role="func_table_entry"><para role="func_signature">
-         <indexterm>
+         <indexterm id="function-justify-hours">
           <primary>justify_hours</primary>
          </indexterm>
          <function>justify_hours</function> ( <type>interval</type> )
          <returnvalue>interval</returnvalue>
         </para>
         <para>
-         Adjust interval so 24-hour time periods are represented as days
+         Adjust interval, converting 24-hour time periods to days
         </para>
         <para>
-         <literal>justify_hours(interval '27 hours')</literal>
-         <returnvalue>1 day 03:00:00</returnvalue>
+         <literal>justify_hours(interval '50 hours 10 minutes')</literal>
+         <returnvalue>2 days 02:10:00</returnvalue>
         </para></entry>
        </row>

@@ -10040,13 +10041,19 @@ EXTRACT(<replaceable>field</replaceable> FROM <replaceable>source</replaceable>)
     The <function>extract</function> function retrieves subfields
     such as year or hour from date/time values.
     <replaceable>source</replaceable> must be a value expression of
-    type <type>timestamp</type>, <type>time</type>, or <type>interval</type>.
-    (Expressions of type <type>date</type> are
-    cast to <type>timestamp</type> and can therefore be used as
-    well.)  <replaceable>field</replaceable> is an identifier or
+    type <type>timestamp</type>, <type>date</type>, <type>time</type>,
+    or <type>interval</type>.  (Timestamps and times can be with or
+    without time zone.)
+    <replaceable>field</replaceable> is an identifier or
     string that selects what field to extract from the source value.
+    Not all fields are valid for every input data type; for example, fields
+    smaller than a day cannot be extracted from a <type>date</type>, while
+    fields of a day or more cannot be extracted from a <type>time</type>.
     The <function>extract</function> function returns values of type
     <type>numeric</type>.
+   </para>
+
+   <para>
     The following are valid field names:

     <!-- alphabetical -->
@@ -10063,17 +10070,11 @@ SELECT EXTRACT(CENTURY FROM TIMESTAMP '2000-12-16 12:21:13');
 <lineannotation>Result: </lineannotation><computeroutput>20</computeroutput>
 SELECT EXTRACT(CENTURY FROM TIMESTAMP '2001-02-16 20:38:40');
 <lineannotation>Result: </lineannotation><computeroutput>21</computeroutput>
+SELECT EXTRACT(CENTURY FROM DATE '0001-01-01 AD');
+<lineannotation>Result: </lineannotation><computeroutput>1</computeroutput>
+SELECT EXTRACT(CENTURY FROM DATE '0001-12-31 BC');
+<lineannotation>Result: </lineannotation><computeroutput>-1</computeroutput>
 </screen>
-
-       <para>
-        The first century starts at 0001-01-01 00:00:00 AD, although
-        they did not know it at the time. This definition applies to all
-        Gregorian calendar countries. There is no century number 0,
-        you go from -1 century to 1 century.
-
-        If you disagree with this, please write your complaint to:
-        Pope, Cathedral Saint-Peter of Roma, Vatican.
-       </para>
       </listitem>
      </varlistentry>

@@ -10088,7 +10089,6 @@ SELECT EXTRACT(CENTURY FROM TIMESTAMP '2001-02-16 20:38:40');
 <screen>
 SELECT EXTRACT(DAY FROM TIMESTAMP '2001-02-16 20:38:40');
 <lineannotation>Result: </lineannotation><computeroutput>16</computeroutput>
-
 SELECT EXTRACT(DAY FROM INTERVAL '40 days 1 minute');
 <lineannotation>Result: </lineannotation><computeroutput>40</computeroutput>
 </screen>
@@ -10162,10 +10162,8 @@ SELECT EXTRACT(DOY FROM TIMESTAMP '2001-02-16 20:38:40');
 <screen>
 SELECT EXTRACT(EPOCH FROM TIMESTAMP WITH TIME ZONE '2001-02-16 20:38:40.12-08');
 <lineannotation>Result: </lineannotation><computeroutput>982384720.120000</computeroutput>
-
 SELECT EXTRACT(EPOCH FROM TIMESTAMP '2001-02-16 20:38:40.12');
 <lineannotation>Result: </lineannotation><computeroutput>982355920.120000</computeroutput>
-
 SELECT EXTRACT(EPOCH FROM INTERVAL '5 days 3 hours');
 <lineannotation>Result: </lineannotation><computeroutput>442800.000000</computeroutput>
 </screen>
@@ -10193,7 +10191,8 @@ SELECT to_timestamp(982384720.12);
       <term><literal>hour</literal></term>
       <listitem>
        <para>
-        The hour field (0–23)
+        The hour field (0–23 in timestamps, unrestricted for
+        intervals)
        </para>

 <screen>
@@ -10228,7 +10227,7 @@ SELECT EXTRACT(ISODOW FROM TIMESTAMP '2001-02-18 20:38:40');
       <listitem>
        <para>
         The <acronym>ISO</acronym> 8601 week-numbering year that the date
-        falls in (not applicable to intervals)
+        falls in
        </para>

 <screen>
@@ -10256,7 +10255,7 @@ SELECT EXTRACT(ISOYEAR FROM DATE '2006-01-02');
       <listitem>
        <para>
         The <firstterm>Julian Date</firstterm> corresponding to the
-        date or timestamp (not applicable to intervals).  Timestamps
+        date or timestamp.  Timestamps
         that are not local midnight result in a fractional value.  See
         <xref linkend="datetime-julian-dates"/> for more information.
        </para>
@@ -10345,10 +10344,8 @@ SELECT EXTRACT(MINUTE FROM TIMESTAMP '2001-02-16 20:38:40');
 <screen>
 SELECT EXTRACT(MONTH FROM TIMESTAMP '2001-02-16 20:38:40');
 <lineannotation>Result: </lineannotation><computeroutput>2</computeroutput>
-
 SELECT EXTRACT(MONTH FROM INTERVAL '2 years 3 months');
 <lineannotation>Result: </lineannotation><computeroutput>3</computeroutput>
-
 SELECT EXTRACT(MONTH FROM INTERVAL '2 years 13 months');
 <lineannotation>Result: </lineannotation><computeroutput>1</computeroutput>
 </screen>
@@ -10379,7 +10376,6 @@ SELECT EXTRACT(QUARTER FROM TIMESTAMP '2001-02-16 20:38:40');
 <screen>
 SELECT EXTRACT(SECOND FROM TIMESTAMP '2001-02-16 20:38:40');
 <lineannotation>Result: </lineannotation><computeroutput>40.000000</computeroutput>
-
 SELECT EXTRACT(SECOND FROM TIME '17:12:28.5');
 <lineannotation>Result: </lineannotation><computeroutput>28.500000</computeroutput>
 </screen>
@@ -10461,6 +10457,20 @@ SELECT EXTRACT(YEAR FROM TIMESTAMP '2001-02-16 20:38:40');
     </variablelist>
    </para>

+   <para>
+    When processing <type>interval</type> input,
+    the <function>extract</function> function produces field values that
+    match the interpretation used by the interval output function.  This
+    can produce surprising results if one starts with a non-normalized
+    interval representation, for example:
+<screen>
+SELECT INTERVAL '80 minutes';
+<lineannotation>Result: </lineannotation><computeroutput>01:20:00</computeroutput>
+SELECT EXTRACT(MINUTES FROM INTERVAL '80 minutes');
+<lineannotation>Result: </lineannotation><computeroutput>20</computeroutput>
+</screen>
+   </para>
+
    <note>
     <para>
      When the input value is +/-Infinity, <function>extract</function> returns
@@ -10502,7 +10512,6 @@ date_part('<replaceable>field</replaceable>', <replaceable>source</replaceable>)
 <screen>
 SELECT date_part('day', TIMESTAMP '2001-02-16 20:38:40');
 <lineannotation>Result: </lineannotation><computeroutput>16</computeroutput>
-
 SELECT date_part('hour', INTERVAL '4 hours 3 minutes');
 <lineannotation>Result: </lineannotation><computeroutput>4</computeroutput>
 </screen>
@@ -10580,16 +10589,12 @@ date_trunc(<replaceable>field</replaceable>, <replaceable>source</replaceable> [
 <screen>
 SELECT date_trunc('hour', TIMESTAMP '2001-02-16 20:38:40');
 <lineannotation>Result: </lineannotation><computeroutput>2001-02-16 20:00:00</computeroutput>
-
 SELECT date_trunc('year', TIMESTAMP '2001-02-16 20:38:40');
 <lineannotation>Result: </lineannotation><computeroutput>2001-01-01 00:00:00</computeroutput>
-
 SELECT date_trunc('day', TIMESTAMP WITH TIME ZONE '2001-02-16 20:38:40+00');
 <lineannotation>Result: </lineannotation><computeroutput>2001-02-16 00:00:00-05</computeroutput>
-
 SELECT date_trunc('day', TIMESTAMP WITH TIME ZONE '2001-02-16 20:38:40+00', 'Australia/Sydney');
 <lineannotation>Result: </lineannotation><computeroutput>2001-02-16 08:00:00-05</computeroutput>
-
 SELECT date_trunc('hour', INTERVAL '3 days 02:47:33');
 <lineannotation>Result: </lineannotation><computeroutput>3 days 02:00:00</computeroutput>
 </screen>
@@ -10628,7 +10633,6 @@ date_bin(<replaceable>stride</replaceable>, <replaceable>source</replaceable>, <
 <screen>
 SELECT date_bin('15 minutes', TIMESTAMP '2020-02-11 15:44:17', TIMESTAMP '2001-01-01');
 <lineannotation>Result: </lineannotation><computeroutput>2020-02-11 15:30:00</computeroutput>
-
 SELECT date_bin('15 minutes', TIMESTAMP '2020-02-11 15:44:17', TIMESTAMP '2001-01-01 00:02:30');
 <lineannotation>Result: </lineannotation><computeroutput>2020-02-11 15:32:30</computeroutput>
 </screen>
@@ -10814,16 +10818,12 @@ SELECT date_bin('15 minutes', TIMESTAMP '2020-02-11 15:44:17', TIMESTAMP '2001-0
 <screen>
 SELECT TIMESTAMP '2001-02-16 20:38:40' AT TIME ZONE 'America/Denver';
 <lineannotation>Result: </lineannotation><computeroutput>2001-02-16 19:38:40-08</computeroutput>
-
 SELECT TIMESTAMP WITH TIME ZONE '2001-02-16 20:38:40-05' AT TIME ZONE 'America/Denver';
 <lineannotation>Result: </lineannotation><computeroutput>2001-02-16 18:38:40</computeroutput>
-
 SELECT TIMESTAMP '2001-02-16 20:38:40' AT TIME ZONE 'Asia/Tokyo' AT TIME ZONE 'America/Chicago';
 <lineannotation>Result: </lineannotation><computeroutput>2001-02-16 05:38:40</computeroutput>
-
 SELECT TIMESTAMP WITH TIME ZONE '2001-02-16 20:38:40-05' AT LOCAL;
 <lineannotation>Result: </lineannotation><computeroutput>2001-02-16 17:38:40</computeroutput>
-
 SELECT TIME WITH TIME ZONE '20:38:40-05' AT LOCAL;
 <lineannotation>Result: </lineannotation><computeroutput>17:38:40</computeroutput>
 </screen>
@@ -10928,16 +10928,12 @@ LOCALTIMESTAMP(<replaceable>precision</replaceable>)
 <screen>
 SELECT CURRENT_TIME;
 <lineannotation>Result: </lineannotation><computeroutput>14:39:53.662522-05</computeroutput>
-
 SELECT CURRENT_DATE;
 <lineannotation>Result: </lineannotation><computeroutput>2019-12-23</computeroutput>
-
 SELECT CURRENT_TIMESTAMP;
 <lineannotation>Result: </lineannotation><computeroutput>2019-12-23 14:39:53.662522-05</computeroutput>
-
 SELECT CURRENT_TIMESTAMP(2);
 <lineannotation>Result: </lineannotation><computeroutput>2019-12-23 14:39:53.66-05</computeroutput>
-
 SELECT LOCALTIMESTAMP;
 <lineannotation>Result: </lineannotation><computeroutput>2019-12-23 14:39:53.662522</computeroutput>
 </screen>

pgsql-bugs by date:

Previous
From: Amit Langote
Date:
Subject: Re: BUG #18344: Pruning tables partitioned by bool range fails with invalid strategy
Next
From: Alexander Lakhin
Date:
Subject: Re: BUG #18344: Pruning tables partitioned by bool range fails with invalid strategy