Re: to_timestamp TZH and TZM format specifiers - Mailing list pgsql-hackers

From Pavel Stehule
Subject Re: to_timestamp TZH and TZM format specifiers
Date
Msg-id CAFj8pRDgc-ti1zM3bvnUNt+tznO-0TGoUWSnt5nthLYtewv3Gg@mail.gmail.com
Whole thread Raw
In response to Re: to_timestamp TZH and TZM format specifiers  (Nikita Glukhov <n.gluhov@postgrespro.ru>)
Responses Re: to_timestamp TZH and TZM format specifiers
Re: to_timestamp TZH and TZM format specifiers
List pgsql-hackers
Hi

2018-01-08 1:22 GMT+01:00 Nikita Glukhov <n.gluhov@postgrespro.ru>:
On 03.01.2018 21:34, Tom Lane wrote:

Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:
This small and simple standalone patch extracted from the SQL/JSON work
would allow the user to supply a string with a time zone specified as
hh:mm thus:
     SELECT to_timestamp('2011-12-18 11:38 -05:20', 'YYYY-MM-DD HH12:MI
     TZH:TZM');
              to_timestamp
     ------------------------------
      Sun Dec 18 08:58:00 2011 PST
I see that Oracle's to_timestamp supports these format codes, so +1
if you've checked that the behavior is compatible with Oracle.  The
most obvious possible gotcha is whether + is east or west of GMT,
but also there's formatting questions like what the field width is
and whether leading zeroes are printed.

Also, I'm unimpressed that you've not bothered to implement the
to_char direction.  That moves this from a feature addition to
a kluge, IMO, especially since that ought to be the easier direction.


BTW, I had not known this before, but according to the page I'm
looking at

https://docs.oracle.com/database/121/SQLRF/sql_elements004.htm#SQLRF00212

Oracle also supports "TZD" to mean a time zone abbreviation (their
example is "PDT") and "TZR" to mean a time zone name (their example
is "US/Pacific", so yes they mean the IANA zone names).  Those seem
remarkably useful, so I'm surprised we've not added support for them.


The patch seems pretty straightforward to me, and it's required for the
jsonpath patches which would be the next cab off the rank in the
SQL/JSON work.
I'm quite confused as to why a patch that alleges to be implementing
SQL-standard behavior would be depending on an Oracle-ism.  That's
not an argument against this patch, but it is a question about the
SQL/JSON work.

                        regards, tom lane

TZH and TZM specifiers are required by standard for SQL/JSON item method
.datetime() (Feature F411, “Time zone specification”).  To be fully
compliant, we should also support RR, RRRR and FF1-FF9 specifiers.

.datetime() item method is used for conversion of JSON string items to
SQL/JSON datetime items.  Its optional argument "format" determines
target datetime type:


=# SELECT jsonb '"10-03-2017 12:34 +05:20"' @* '$.datetime("DD-MM-YYYY").type()';
 ?column?
----------
 "date"
(1 row)

=# SELECT jsonb '"10-03-2017 12:34 +05:20"' @* '$.datetime("DD-MM-YYYY HH24:MI").type()';
           ?column?
-------------------------------
 "timestamp without time zone"
(1 row)

=# SELECT jsonb '"10-03-2017 12:34 +05:20"' @* '$.datetime("DD-MM-YYYY HH24:MI TZH:TZM").type()';
          ?column?
----------------------------
 "timestamp with time zone"
(1 row)

-- automatic datetime type recognition for ISO-formatted strings
=# SELECT jsonb '"2017-10-03 12:34:56 +05:20"' @* '$.datetime().type()';
          ?column?
----------------------------
 "timestamp with time zone"
(1 row)



Here are corresponding excerpts from the SQL-2016 standard:

9.44 Datetime templates

<datetime template> ::=
  { <datetime template part> }...

<datetime template part> ::=
  <datetime template field>
  | <datetime template delimiter>

<datetime template field> ::=
  <datetime template year>
  | <datetime template rounded year>
  | <datetime template month>
  | <datetime template day of month>
  | <datetime template day of year>
  | <datetime template 12-hour>
  | <datetime template 24-hour>
  | <datetime template minute>
  | <datetime template second of minute>
  | <datetime template second of day>
  | <datetime template fraction>
  | <datetime template am/pm>
  | <datetime template time zone hour>
  | <datetime template time zone minute>

<datetime template delimiter> ::=
  <minus sign>
  | <period>
  | <solidus>
  | <comma>
  | <apostrophe>
  | <semicolon>
  | <colon>
  | <space>

<datetime template year> ::= YYYY | YYY | YY | Y
<datetime template rounded year> ::= RRRR | RR
<datetime template month> ::= MM
<datetime template day of month> ::= DD
<datetime template day of year> ::= DDD
<datetime template 12-hour> ::= HH | HH12
<datetime template 24-hour> ::= HH24
<datetime template minute> ::= MI
<datetime template second of minute> ::= SS
<datetime template second of day> ::= SSSSS
<datetime template fraction> ::=
  FF1 | FF2 | FF3 | FF4 | FF5 | FF6 | FF7 | FF8 | FF9
<datetime template am/pm> ::= A.M. | P.M.
<datetime template time zone hour> ::= TZH
<datetime template time zone minute> ::= TZM


9.39 SQL/JSON path language: syntax and semantics
...
10) If <JSON datetime template> JDT is specified, then the value of JDT shall
conform to the lexical grammar of a <datetime template> in the Format of
Subclause 9.44, “Datetime templates”.

  a) If JDT contains
     <datetime template year>,
     <datetime template rounded year>,
     <datetime template month>,
     <datetime template day of month>, or
     <datetime template day of year>,
     then JDT is dated.

  b) If JDT contains
     <datetime template 12-hour>,
     <datetime template 24-hour>,
     <datetime template minute>,
     <datetime template second of minute>,
     <datetime template second of day>,
     <datetime template fraction>, or
     <datetime template am/pm>,
     then JDT is timed.

     The fractional seconds precision FSP of JDT is
     Case:

     i) If JDT contains <datetime template fraction>
        FF1, FF2, FF3, FF4, FF5, FF6, FF7, FF8, or FF9,
        then 1 (one), 2, 3, 4, 5, 6, 7, 8, or 9, respectively.

     ii) Otherwise, 0 (zero).

  c) If JDT contains
     <datetime template time zone hour> or
     <datetime template time zone minute>,
     then JDT is zoned.

  d) If JDT is zoned, then JDT shall be timed.

  e) JDT shall be dated or timed or both.
    f) The implicit datetime data type IDT of JDT is
     Case:
     i) If JDT is dated, timed, and zoned, then TIMESTAMP (FSP) WITH TIME ZONE.
     ii) If JDT is dated, timed, and not zoned, then
         TIMESTAMP (FSP) WITHOUT TIME ZONE.
     iii) If JDT is timed and zoned, then TIME (FSP) WITH TIME ZONE.
     iv) If JDT is timed and not zoned, then TIME (FSP) WITHOUT TIME ZONE.
     v) If JDT is dated but not timed and not zoned, then DATE.
  ...


(RR/RRRR specifiers explanation)

9.43 Converting a formatted character string to a datetime
 ...
  5) Let NOW be the value of CURRENT_TIMESTAMP.
     Let CY be the YEAR field of NOW.
     Let CYLIT be an <unsigned integer> of four <digit>s whose value is CY.
     Let CM be the MONTH field of NOW.
     Let CMLIT be an <unsigned integer> of two <digit>s whose value is CM.

  6) Case:
     a) If CT contains a <datetime template year> YY, then:
        i) Let YYPOS be an <exact numeric literal> whose value is the regular
           expression position of YY.
        ii) Let YYSTR be the result of SUBSTRING_REGEX ( RX IN FCS GROUP YYPOS )
        iii) Let YYLEN be the length of YYSTR.
        iv) Let YYPREFIX be the first (4 – YYLEN) digits of CYLIT.
            NOTE 471 — If the length of YYSTR is 4, then YYPREFIX is a
            zero-length string.
        v) Let YYYY be the result of YYPREFIX || YYSTR
        vi) Let YEAR be the value of YYYY interpreted as an <unsigned integer>.

     b) If CT contains a <datetime template rounded year> RR, then:
        i) Let RRPOS be an <exact numeric literal> whose value is the regular
           expression position of RR.
        ii) Let RRSTR be the result of SUBSTRING_REGEX ( RX IN FCS GROUP RRPOS )
        iii) Let RRLEN be the length of RRSTR.
        iv) Let RY be an implementation-defined exact numeric value of scale 0
            (zero) that is between CY–100 and CY+100, inclusive. Let RYLIT be
            an <unsigned integer> of four <digit>s whose value is RY.
        v) Let RRPREFIX be the first (4 – RRLEN) digits of RYLIT.
           NOTE 472 — If the length of RRSTR is 4, then RRPREFIX is a zero-length
           string.
        vi) Let RRRR be the result of RRPREFIX || RRSTR
        vii) Let YEAR be the value of RRRR interpreted as an <unsigned integer>.
     c) Otherwise, let YEAR be CY.


I checked this patch and I think so it is correct.

1. all tests passed
2. no problems with patching and compilation
3. the doc is good enough
4. I can confirm so Oracle 12c supports these formats, but I have not possibility to test it
5. the behave is consistent with timestamp with time zone

I'll mark this patch as ready for commiter

Regards

Pavel




 
--
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


pgsql-hackers by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: [HACKERS] [PATCH] Assert that the correct locks are held whencalling PageGetLSN()
Next
From: Pavel Stehule
Date:
Subject: Re: to_timestamp TZH and TZM format specifiers