Thread: Re: [GENERAL] to_timestamp() and quarters

Re: [GENERAL] to_timestamp() and quarters

From
Bruce Momjian
Date:
Scott Bailey wrote:
> Tom Lane wrote:
> > Asher Hoskins <asher@piceur.co.uk> writes:
> >> I can't seem to get to_timestamp() or to_date() to work with quarters,
> >
> > The source code says
> >
> >                  * We ignore Q when converting to date because it is not
> >                  * normative.
> >                  *
> >                  * We still parse the source string for an integer, but it
> >                  * isn't stored anywhere in 'out'.
> >
> > That might be a reasonable position, but it seems like it'd be better to
> > throw an error than silently do nothing.  Anybody know what Oracle does
> > with this?
>
> +1 for throwing error.
> Oracle 10g throws ORA-01820: format code cannot appear in date input format.

Well, I can easily make it do what you expect, and I don't see many
error returns in that area of the code, so I just wrote a patch that
does what you would expect rather than throw an error.

    test=> select to_date('2010-1', 'YYYY-Q');
      to_date
    ------------
     2010-01-01
    (1 row)

    test=> select to_date('2010-3', 'YYYY-Q');
      to_date
    ------------
     2010-07-01
    (1 row)

    test=> select to_date('2010-7', 'YYYY-Q');
      to_date
    ------------
     2011-07-04
    (1 row)
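
The month arithmetic behind these three results can be looked at in isolation. This is only a sketch: the helper name is hypothetical, and the one thing taken from the patch is the expression (q - 1) * 3 + 1. Nothing here range-checks the quarter, which is how a quarter of 7 ends up as month 19.

```c
#include <assert.h>

/* Hypothetical helper: maps a quarter number to the first month of that
 * quarter, using the same expression as the patch.  There is no range
 * check, so quarters outside 1..4 yield months outside 1..12. */
int first_month_of_quarter(int quarter)
{
    return (quarter - 1) * 3 + 1;
}

/* Self-check mirroring the three queries above. */
void quarter_demo(void)
{
    assert(first_month_of_quarter(1) == 1);   /* '2010-1': January */
    assert(first_month_of_quarter(3) == 7);   /* '2010-3': July */
    assert(first_month_of_quarter(7) == 19);  /* '2010-7': month 19, out of range */
}
```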

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  PG East:  http://www.enterprisedb.com/community/nav-pg-east-2010.do
Index: src/backend/utils/adt/formatting.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/utils/adt/formatting.c,v
retrieving revision 1.168
diff -c -c -r1.168 formatting.c
*** src/backend/utils/adt/formatting.c    26 Feb 2010 02:01:08 -0000    1.168
--- src/backend/utils/adt/formatting.c    3 Mar 2010 03:29:05 -0000
***************
*** 2671,2685 ****
                  s += SKIP_THth(n->suffix);
                  break;
              case DCH_Q:
!
!                 /*
!                  * We ignore Q when converting to date because it is not
!                  * normative.
!                  *
!                  * We still parse the source string for an integer, but it
!                  * isn't stored anywhere in 'out'.
!                  */
!                 from_char_parse_int((int *) NULL, &s, n);
                  s += SKIP_THth(n->suffix);
                  break;
              case DCH_CC:
--- 2671,2678 ----
                  s += SKIP_THth(n->suffix);
                  break;
              case DCH_Q:
!                 from_char_parse_int(&out->mm, &s, n);
!                 out->mm = (out->mm - 1) * 3 + 1;
                  s += SKIP_THth(n->suffix);
                  break;
              case DCH_CC:

Re: [GENERAL] to_timestamp() and quarters

From
"A. Kretschmer"
Date:
In response to Bruce Momjian :
> Well, I can easily make it do what you expect, and I don't see many
> error returns in that area of the code, so I just wrote a patch that
> does what you would expect rather than throw an error.

Well, that's great and better than an error, thx.

>     test=> select to_date('2010-7', 'YYYY-Q');
>       to_date
>     ------------
>      2011-07-04
>     (1 row)

Is this per SQL-Spec? I would expect an error for a quarter not in
(1,2,3,4).

But stop, now i see:

test=*# select to_date('2010-02-29', 'YYYY-MM-DD');
  to_date
------------
 2010-03-01
(1 row)

So maybe it is consistent behavior ;-)


Regards, Andreas
-- 
Andreas Kretschmer
Kontakt:  Heynitz: 035242/47150,   D1: 0160/7141639 (mehr: -> Header)
GnuPG: 0x31720C99, 1006 CCB4 A326 1D42 6431  2EB0 389D 1DC2 3172 0C99


Re: [GENERAL] to_timestamp() and quarters

From
"Albe Laurenz"
Date:
A. Kretschmer wrote:
> > Well, I can easily make it do what you expect, and I don't see many
> > error returns in that area of the code, so I just wrote a patch that
> > does what you would expect rather than throw an error.
>
> Well, that's great and better than an error, thx.
>
> >     test=> select to_date('2010-7', 'YYYY-Q');
> >       to_date
> >     ------------
> >      2011-07-04
> >     (1 row)
>
> Is this per SQL-Spec? I would expect an error for a quarter not in
> (1,2,3,4).
>
> But stop, now i see:
>
> test=*# select to_date('2010-02-29', 'YYYY-MM-DD');
>   to_date
> ------------
>  2010-03-01
> (1 row)
>
> So maybe it is consistent behavior ;-)

Ugh. I thought that to_date was an Oracle compatibility function.

SQL> select to_date('2010-02-29', 'YYYY-MM-DD') from dual;
select to_date('2010-02-29', 'YYYY-MM-DD') from dual
               *
ERROR at line 1:
ORA-01839: date not valid for month specified

And for that matter:

SQL> select to_date('2010-7', 'YYYY-Q') from dual;
select to_date('2010-7', 'YYYY-Q') from dual
                         *
ERROR at line 1:
ORA-01820: format code cannot appear in date input format

Oracle allows Q only when converting date to string.
So this can be seen as an extension.

But allowing 2010-02-29 is incompatible and smacks of MySQL...
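
For comparison, a strict check of the kind the Oracle error implies could look like the sketch below. The function names are hypothetical, and this is neither PostgreSQL nor Oracle code; it only shows what rejecting an impossible day-of-month involves.

```c
#include <assert.h>
#include <stdbool.h>

/* Gregorian leap-year rule. */
bool is_leap_year(int year)
{
    return (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
}

/* Hypothetical strict check: true only if (year, month, day) names a real
 * calendar date.  A parser calling this before building the date would
 * reject '2010-02-29' instead of rolling it over to March 1. */
bool is_valid_date(int year, int month, int day)
{
    static const int days_in_month[] =
        {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};
    int limit;

    if (month < 1 || month > 12 || day < 1)
        return false;
    limit = days_in_month[month - 1];
    if (month == 2 && is_leap_year(year))
        limit = 29;
    return day <= limit;
}

/* Self-check using the dates from this thread. */
void strict_demo(void)
{
    assert(!is_valid_date(2010, 2, 29));  /* 2010 is not a leap year */
    assert(is_valid_date(2012, 2, 29));   /* 2012 is */
}
```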

Yours,
Laurenz Albe


Re: [GENERAL] to_timestamp() and quarters

From
Bruce Momjian
Date:
Albe Laurenz wrote:
> > But stop, now i see:
> > 
> > test=*# select to_date('2010-02-29', 'YYYY-MM-DD');
> >   to_date
> > ------------
> >  2010-03-01
> > (1 row)
> > 
> > So maybe it is consistent behavior ;-)
> 
> Ugh. I thought that to_date was an Oracle compatibility function.
> 
> SQL> select to_date('2010-02-29', 'YYYY-MM-DD') from dual;
> select to_date('2010-02-29', 'YYYY-MM-DD') from dual
>                *
> ERROR at line 1:
> ORA-01839: date not valid for month specified
> 
> And for that matter:
> 
> SQL> select to_date('2010-7', 'YYYY-Q') from dual;
> select to_date('2010-7', 'YYYY-Q') from dual
>                          *
> ERROR at line 1:
> ORA-01820: format code cannot appear in date input format
> 
> Oracle allows Q only when converting date to string.
> So this can be seen as an extension.
> 
> But allowing 2010-02-29 is incompatible and smacks of MySQL...

Yea, we had a similar issue with to_timestamp():
test=> SELECT to_timestamp('20096040','YYYYMMDD');
      to_timestamp
------------------------
 2014-01-17 00:00:00-05
(1 row)

If we are going to tighten these up, we should do them all.  Right now
we allow it and for consistency should allow the Q=7 value too.
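
The roll-over in these examples is consistent with day-number arithmetic that never range-checks its inputs. The sketch below is a Gregorian-to-Julian-day conversion written from memory of the formula used in PostgreSQL's date code; the function name is hypothetical and the constants should be treated as an assumption, not a copy of the source. Out-of-range months and days simply produce the day number of some later, valid date.

```c
#include <assert.h>

/* Hypothetical sketch of a Gregorian-to-Julian-day conversion.
 * Assumption: the constants follow the formula recalled from PostgreSQL's
 * date code.  No field is range-checked. */
int sketch_date2j(int y, int m, int d)
{
    int julian;
    int century;

    /* Shift the year so that it effectively starts in March. */
    if (m > 2)
    {
        m += 1;
        y += 4800;
    }
    else
    {
        m += 13;
        y += 4799;
    }

    century = y / 100;
    julian = y * 365 - 32167;
    julian += y / 4 - century + century / 4;
    /* 7834 * m / 256 approximates the days contributed by the month. */
    julian += 7834 * m / 256 + d;

    return julian;
}

/* Each invalid input yields the same day number as the date PostgreSQL
 * printed for it in this thread. */
void rollover_demo(void)
{
    assert(sketch_date2j(2010, 2, 29) == sketch_date2j(2010, 3, 1));
    assert(sketch_date2j(2009, 60, 40) == sketch_date2j(2014, 1, 17));
    assert(sketch_date2j(2010, 19, 1) == sketch_date2j(2011, 7, 4));
}
```

If the formula is recalled correctly, the last assertion would also account for the 2011-07-04 result for Q=7: the month term is only accurate for months in the valid range, so month 19 lands a few days past the first of July.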



Re: [GENERAL] to_timestamp() and quarters

From
Brendan Jurd
Date:
On 3 March 2010 14:34, Bruce Momjian <bruce@momjian.us> wrote:
> Scott Bailey wrote:
>> Tom Lane wrote:
>> > Asher Hoskins <asher@piceur.co.uk> writes:
>> >> I can't seem to get to_timestamp() or to_date() to work with quarters,
>> >
>> > The source code says
>> >
>> >                  * We ignore Q when converting to date because it is not
>> >                  * normative.
>> >                  *
>> >                  * We still parse the source string for an integer, but it
>> >                  * isn't stored anywhere in 'out'.
>> >
>> > That might be a reasonable position, but it seems like it'd be better to
>> > throw an error than silently do nothing.  Anybody know what Oracle does
>> > with this?
>>
>> +1 for throwing error.
>> Oracle 10g throws ORA-01820: format code cannot appear in date input format.
>
> Well, I can easily make it do what you expect, and I don't see many
> error returns in that area of the code, so I just wrote a patch that
> does what you would expect rather than throw an error.
>
>        test=> select to_date('2010-1', 'YYYY-Q');
>          to_date
>        ------------
>         2010-01-01
>        (1 row)

I don't think this is the way to go.  Why should the "date" for
quarter 1, 2010 be the first date of that quarter?  Why not the last
date?  Why not some date in between?

A quarter on its own doesn't assist us in producing a *date* result,
which is after all the purpose of the to_date() function.

I first proposed ignoring the Q field back in 2007 [1].  My motivation
for not throwing an error was that I think the main use-case for
to_date() would be importing data from another system where dates are
in a predictable but non-standard format.

If such a date included the quarter, the user might expect to be able
to include the quarter in his format string.

For example, you're trying to import a date that is written as "Wed
3rd March, Q1 2010".  You might give to_date a format string like 'Dy
FMDDTH Month, "Q"Q YYYY' and expect to get the correct answer.  If we
start throwing an error on the Q field, then users would have to
resort to some strange circumlocution to get around it.

Having said all of that, it's been pointed out to me in the past that
Oracle compatibility is the main goal of these functions, so if we're
going to change the behaviour of Q in to_date(), I think it should be
in order to move closer to Oracle's treatment.  I certainly don't
think we should get back into the business of delivering an exact
answer to an inexact question.  So a +1 for throwing the error per Tom
Lane and Scott Bailey.

Cheers,
BJ

[1] http://archives.postgresql.org/message-id/37ed240d0707170747p4f5c26ffx63fff2b5750c62e5@mail.gmail.com


Re: [GENERAL] to_timestamp() and quarters

From
Bruce Momjian
Date:
Brendan Jurd wrote:
> > Well, I can easily make it do what you expect, and I don't see many
> > error returns in that area of the code, so I just wrote a patch that
> > does what you would expect rather than throw an error.
> >
> >        test=> select to_date('2010-1', 'YYYY-Q');
> >          to_date
> >        ------------
> >         2010-01-01
> >        (1 row)
>
> I don't think this is the way to go.  Why should the "date" for
> quarter 1, 2010 be the first date of that quarter?  Why not the last
> date?  Why not some date in between?
>
> A quarter on its own doesn't assist us in producing a *date* result,
> which is after all the purpose of the to_date() function.
>
> I first proposed ignoring the Q field back in 2007 [1].  My motivation
> for not throwing an error was that I think the main use-case for
> to_date() would be importing data from another system where dates are
> in a predictable but non-standard format.
>
> If such a date included the quarter, the user might expect to be able
> to include the quarter in his format string.
>
> For example, you're trying to import a date that is written as "Wed
> 3rd March, Q1 2010".  You might give to_date a format string like 'Dy
> FMDDTH Month, "Q"Q YYYY' and expect to get the correct answer.  If we
> start throwing an error on the Q field, then users would have to
> resort to some strange circumlocution to get around it.
>
> Having said all of that, it's been pointed out to me in the past that
> Oracle compatibility is the main goal of these functions, so if we're
> going to change the behaviour of Q in to_date(), I think it should be
> in order to move closer to Oracle's treatment.  I certainly don't
> think we should get back into the business of delivering an exact
> answer to an inexact question.  So a +1 for throwing the error per Tom
> Lane and Scott Bailey.

OK, patch attached that throws an error:

    test=> SELECT to_date('2010-7', 'YYYY-Q');
    ERROR:  "Q" format is not supported in to_date

Index: src/backend/utils/adt/formatting.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/utils/adt/formatting.c,v
retrieving revision 1.168
diff -c -c -r1.168 formatting.c
*** src/backend/utils/adt/formatting.c    26 Feb 2010 02:01:08 -0000    1.168
--- src/backend/utils/adt/formatting.c    3 Mar 2010 17:06:59 -0000
***************
*** 2671,2686 ****
                  s += SKIP_THth(n->suffix);
                  break;
              case DCH_Q:
!
!                 /*
!                  * We ignore Q when converting to date because it is not
!                  * normative.
!                  *
!                  * We still parse the source string for an integer, but it
!                  * isn't stored anywhere in 'out'.
!                  */
!                 from_char_parse_int((int *) NULL, &s, n);
!                 s += SKIP_THth(n->suffix);
                  break;
              case DCH_CC:
                  from_char_parse_int(&out->cc, &s, n);
--- 2671,2680 ----
                  s += SKIP_THth(n->suffix);
                  break;
              case DCH_Q:
!                 /* It is unclear which date in the quarter to return. */
!                 ereport(ERROR,
!                         (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
!                          errmsg("\"Q\" format is not supported in to_date")));
                  break;
              case DCH_CC:
                  from_char_parse_int(&out->cc, &s, n);

Re: [GENERAL] to_timestamp() and quarters

From
Tom Lane
Date:
Brendan Jurd <direvus@gmail.com> writes:
> For example, you're trying to import a date that is written as "Wed
> 3rd March, Q1 2010".  You might give to_date a format string like 'Dy
> FMDDTH Month, "Q"Q YYYY' and expect to get the correct answer.  If we
> start throwing an error on the Q field, then users would have to
> resort to some strange circumlocution to get around it.

Hmm.  That's an interesting test case: if Q throws an error, there
doesn't seem to be any way to do it at all, because there is no format
spec for ignoring non-constant text.  Conversely, Bruce's proposed
patch would actually break it, because the Q code would overwrite the
(correct) month information with the first-month-of-the-quarter.

So at the moment my vote is "leave it alone".  If we want to throw
error for Q then we should provide a substitute method of ignoring
a field.  But we could just document Q as ignoring an integer for
input.
        regards, tom lane
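
The breakage described above can be shown with a toy model of the parser's output. Everything in this sketch is hypothetical (struct and function names included); it mimics only the order of assignments for "March" followed by "Q1", not the real parser.

```c
#include <assert.h>

/* Toy stand-in for the parser's result struct; only the month is modelled. */
struct toy_tm
{
    int mm;
};

/* First proposed patch: Q unconditionally overwrites the month. */
void apply_q_overwrite(struct toy_tm *out, int quarter)
{
    out->mm = (quarter - 1) * 3 + 1;
}

/* Existing behaviour: Q is parsed but the value is not stored. */
void apply_q_ignore(struct toy_tm *out, int quarter)
{
    (void) out;
    (void) quarter;
}

void overwrite_demo(void)
{
    struct toy_tm patched = {3};   /* "March" already parsed */
    struct toy_tm existing = {3};

    apply_q_overwrite(&patched, 1);  /* "Q1" follows the month */
    apply_q_ignore(&existing, 1);

    assert(patched.mm == 1);   /* March has been replaced by January */
    assert(existing.mm == 3);  /* March survives */
}
```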


Re: [GENERAL] to_timestamp() and quarters

From
Bruce Momjian
Date:
Tom Lane wrote:
> Brendan Jurd <direvus@gmail.com> writes:
> > For example, you're trying to import a date that is written as "Wed
> > 3rd March, Q1 2010".  You might give to_date a format string like 'Dy
> > FMDDTH Month, "Q"Q YYYY' and expect to get the correct answer.  If we
> > start throwing an error on the Q field, then users would have to
> > resort to some strange circumlocution to get around it.
>
> Hmm.  That's an interesting test case: if Q throws an error, there
> doesn't seem to be any way to do it at all, because there is no format
> spec for ignoring non-constant text.  Conversely, Bruce's proposed
> patch would actually break it, because the Q code would overwrite the
> (correct) month information with the first-month-of-the-quarter.
>
> So at the moment my vote is "leave it alone".  If we want to throw
> error for Q then we should provide a substitute method of ignoring
> a field.  But we could just document Q as ignoring an integer for
> input.

Here is an updated patch that honors 'Q' only if the month has not been
previously supplied:

    test=> SELECT to_date('2010-3', 'YYYY-Q');
      to_date
    ------------
     2010-07-01
    (1 row)

    test=> SELECT to_date('2010-04-3', 'YYYY-MM-Q');
      to_date
    ------------
     2010-04-01
    (1 row)

but it fails if a later month is specified:

    test=> select to_date('2010-3-05', 'YYYY-Q-MM');
    ERROR:  conflicting values for "MM" field in formatting string
    DETAIL:  This value contradicts a previous setting for the same field type.

even if the month is in that quarter but not the first month of the
quarter.
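
The order dependence can be reproduced with a small model. This is a sketch under stated assumptions: the struct and function names are hypothetical, a month of 0 is taken to mean "not set yet" as in the patch, and set_month() imitates the conflict check that produces the error shown above.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy parser state; mm == 0 means no month has been set yet. */
struct month_state
{
    int mm;
};

/* Imitates the conflict check: a second, different value is rejected. */
bool set_month(struct month_state *out, int month)
{
    if (out->mm != 0 && out->mm != month)
        return false;
    out->mm = month;
    return true;
}

/* Updated patch: honor Q only when no month has been set so far. */
void apply_q_if_unset(struct month_state *out, int quarter)
{
    if (out->mm == 0)
        out->mm = (quarter - 1) * 3 + 1;
}

void order_demo(void)
{
    struct month_state mm_first = {0};
    struct month_state q_first = {0};

    /* 'YYYY-MM-Q' with '2010-04-3': the month wins, Q is ignored. */
    assert(set_month(&mm_first, 4));
    apply_q_if_unset(&mm_first, 3);
    assert(mm_first.mm == 4);

    /* 'YYYY-Q-MM' with '2010-3-05': Q sets month 7, then MM=5 conflicts. */
    apply_q_if_unset(&q_first, 3);
    assert(q_first.mm == 7);
    assert(!set_month(&q_first, 5));
}
```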

Index: src/backend/utils/adt/formatting.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/utils/adt/formatting.c,v
retrieving revision 1.168
diff -c -c -r1.168 formatting.c
*** src/backend/utils/adt/formatting.c    26 Feb 2010 02:01:08 -0000    1.168
--- src/backend/utils/adt/formatting.c    3 Mar 2010 17:18:43 -0000
***************
*** 2671,2685 ****
                  s += SKIP_THth(n->suffix);
                  break;
              case DCH_Q:
!
!                 /*
!                  * We ignore Q when converting to date because it is not
!                  * normative.
!                  *
!                  * We still parse the source string for an integer, but it
!                  * isn't stored anywhere in 'out'.
!                  */
!                 from_char_parse_int((int *) NULL, &s, n);
                  s += SKIP_THth(n->suffix);
                  break;
              case DCH_CC:
--- 2671,2684 ----
                  s += SKIP_THth(n->suffix);
                  break;
              case DCH_Q:
!                 /* Honor "Q" only if a month has not previously been set */
!                 if (out->mm == 0)
!                 {
!                     from_char_parse_int(&out->mm, &s, n);
!                     out->mm = (out->mm - 1) * 3 + 1;
!                 }
!                 else    /* ignore */
!                     from_char_parse_int((int *) NULL, &s, n);
                  s += SKIP_THth(n->suffix);
                  break;
              case DCH_CC:

Re: [GENERAL] to_timestamp() and quarters

From
Brendan Jurd
Date:
On 4 March 2010 04:08, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Brendan Jurd <direvus@gmail.com> writes:
>> For example, you're trying to import a date that is written as "Wed
>> 3rd March, Q1 2010".  You might give to_date a format string like 'Dy
>> FMDDTH Month, "Q"Q YYYY' and expect to get the correct answer.  If we
>> start throwing an error on the Q field, then users would have to
>> resort to some strange circumlocution to get around it.
>
> Hmm.  That's an interesting test case: if Q throws an error, there
> doesn't seem to be any way to do it at all, because there is no format
> spec for ignoring non-constant text.

Not entirely true.  It's possible, it's just not at all obvious:

=# select to_date('Wed 3rd March, Q1 2010', 'Dy FMDDTH Month, "QQ" YYYY'); to_date
------------2010-03-03
(1 row)

Anything in a format string which is quoted is ignored.  Or to put it
another way, putting stuff in quotes is telling to_date() that the
characters in those positions are not important to you and should not
be used to help construct the date result.  It doesn't actually check
that the characters in the source string match what you have put
inside the quotes, it just skips over the quoted number of characters.
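
A minimal model of that skip, assuming it behaves as described here (the function name is hypothetical and this is not the code from formatting.c): the input pointer advances by the length of the quoted literal and no characters are compared.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical model of the quoted-text skip: advance the input by the
 * length of the literal without comparing characters.  Stops early if the
 * input runs out. */
const char *skip_quoted(const char *input, const char *literal)
{
    size_t n = strlen(literal);

    while (n > 0 && *input != '\0')
    {
        input++;
        n--;
    }
    return input;
}

void quoted_demo(void)
{
    /* "QQ" skips two characters, so "Q1" is consumed. */
    assert(strcmp(skip_quoted("Q1 2010", "QQ"), " 2010") == 0);
    /* The literal's content is irrelevant; only its length matters. */
    assert(strcmp(skip_quoted("Q1 2010", "XX"), " 2010") == 0);
}
```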

I doubt anyone unfamiliar with the source code of the function would
ever devise the above solution, and it's an ugly hack reliant on a
quirk anyway.  So a user in-the-field would probably just resort to
running a regexp_replace() over the text in order to strip out the
quarter before passing it to to_date().

> So at the moment my vote is "leave it alone".  If we want to throw
> error for Q then we should provide a substitute method of ignoring
> a field.  But we could just document Q as ignoring an integer for
> input.

Sounds good to me.

Cheers,
BJ


Re: [GENERAL] to_timestamp() and quarters

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> Here is an updated patch that honors 'Q' only if the month has not been
> previously supplied:

That's just weird.  It's not even self-consistent much less
unsurprising --- having the behavior be dependent on field order is
really horrid.

I think what people would actually want for this type of situation is
a way to specify "there is an integer here but I want to ignore it".
Q as it's presently constituted accomplishes that, though it is not
documented as doing so.  Brendan's comment about quoted text is
interesting, but it doesn't really solve the problem because of the
possibility of the integer field being variable width.
        regards, tom lane
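
The variable-width concern can be illustrated by contrasting two ways of skipping a field. Both helpers below are hypothetical; skip_integer() uses strtol() as a stand-in for "parse an integer and discard it" (strtol also accepts leading whitespace and a sign, which the real parser may not), while skip_chars() stands in for a quoted-text skip of fixed length.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Parse a decimal integer at the start of the input and discard it,
 * returning the position just past it. */
const char *skip_integer(const char *input)
{
    char *end;

    (void) strtol(input, &end, 10);
    return end;
}

/* Skip a fixed number of characters, stopping early at end of input. */
const char *skip_chars(const char *input, size_t n)
{
    while (n > 0 && *input != '\0')
    {
        input++;
        n--;
    }
    return input;
}

void width_demo(void)
{
    /* Discarding an integer works for any width. */
    assert(strcmp(skip_integer("1 2010"), " 2010") == 0);
    assert(strcmp(skip_integer("12 2010"), " 2010") == 0);

    /* A one-character skip is right for "1" but leaves a stray digit
     * behind when the field is two digits wide. */
    assert(strcmp(skip_chars("1 2010", 1), " 2010") == 0);
    assert(strcmp(skip_chars("12 2010", 1), "2 2010") == 0);
}
```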


Re: [GENERAL] to_timestamp() and quarters

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > Here is an updated patch that honors 'Q' only if the month has not been
> > previously supplied:
>
> That's just weird.  It's not even self-consistent much less
> unsurprising --- having the behavior be dependent on field order is
> really horrid.
>
> I think what people would actually want for this type of situation is
> a way to specify "there is an integer here but I want to ignore it".
> Q as it's presently constituted accomplishes that, though it is not
> documented as doing so.  Brendan's comment about quoted text is
> interesting, but it doesn't really solve the problem because of the
> possibility of the integer field being variable width.

I have updated the documentation to say that "Q" is ignored by to_date and
to_timestamp, and expanded the C comment to explain why.

I also documented the double-quote input-skip behavior of to_timestamp,
to_number, and to_date.

Applied patch attached.

Index: doc/src/sgml/func.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/func.sgml,v
retrieving revision 1.506
diff -c -c -r1.506 func.sgml
*** doc/src/sgml/func.sgml    23 Feb 2010 16:14:25 -0000    1.506
--- doc/src/sgml/func.sgml    3 Mar 2010 22:27:36 -0000
***************
*** 5089,5095 ****
         </row>
         <row>
          <entry><literal>Q</literal></entry>
!         <entry>quarter</entry>
         </row>
         <row>
          <entry><literal>RM</literal></entry>
--- 5089,5095 ----
         </row>
         <row>
          <entry><literal>Q</literal></entry>
!         <entry>quarter (ignored by <function>to_date</> and <function>to_timestamp</>)</entry>
         </row>
         <row>
          <entry><literal>RM</literal></entry>
***************
*** 5209,5215 ****
         even if it contains pattern key words.  For example, in
         <literal>'"Hello Year "YYYY'</literal>, the <literal>YYYY</literal>
         will be replaced by the year data, but the single <literal>Y</literal> in <literal>Year</literal>
!        will not be.
        </para>
       </listitem>

--- 5209,5218 ----
         even if it contains pattern key words.  For example, in
         <literal>'"Hello Year "YYYY'</literal>, the <literal>YYYY</literal>
         will be replaced by the year data, but the single <literal>Y</literal> in <literal>Year</literal>
!        will not be.  In <function>to_date</>, <function>to_number</>,
!        and <function>to_timestamp</>, double-quoted strings skip the number of
!        input characters contained in the string, e.g. <literal>"XX"</>
!        skips two input characters.
        </para>
       </listitem>

Index: src/backend/utils/adt/formatting.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/utils/adt/formatting.c,v
retrieving revision 1.168
diff -c -c -r1.168 formatting.c
*** src/backend/utils/adt/formatting.c    26 Feb 2010 02:01:08 -0000    1.168
--- src/backend/utils/adt/formatting.c    3 Mar 2010 22:27:38 -0000
***************
*** 2671,2680 ****
                  s += SKIP_THth(n->suffix);
                  break;
              case DCH_Q:
-
                  /*
!                  * We ignore Q when converting to date because it is not
!                  * normative.
                   *
                   * We still parse the source string for an integer, but it
                   * isn't stored anywhere in 'out'.
--- 2671,2682 ----
                  s += SKIP_THth(n->suffix);
                  break;
              case DCH_Q:
                  /*
!                  * We ignore 'Q' when converting to date because it is
!                  * unclear which date in the quarter to use, and some
!                  * people specify both quarter and month, so if it was
!                  * honored it might conflict with the supplied month.
!                  * That is also why we don't throw an error.
                   *
                   * We still parse the source string for an integer, but it
                   * isn't stored anywhere in 'out'.