Thread: [HACKERS] Inconsistencies in from_char_parse_int_len()

[HACKERS] Inconsistencies in from_char_parse_int_len()

From: Douglas Doole

I was playing with TO_TIMESTAMP() and I noticed a weird result:

postgres=# select to_timestamp('20170-07-24 21:59:57.12345678', 'yyyy-mm-dd hh24:mi:ss.us');
          to_timestamp          
--------------------------------
 20170-07-24 22:00:09.345678+00
(1 row)

Even though the "us" token is supposed to be restricted to 000000-999999, it looks like the microseconds field was read as 12345678 and treated as 12.345678 seconds, which were then added to 21:59:57.
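
To spell out that arithmetic, here is a minimal sketch. It is not code from formatting.c; demo_time and demo_fold_usec() are hypothetical names used only for illustration. It folds an oversized microsecond count into whole seconds plus a remainder, which reproduces the displayed time of day.

```c
#include <stdio.h>

/* Hypothetical illustration type, not a PostgreSQL structure. */
typedef struct
{
    int hour;
    int min;
    int sec;
    int usec;
} demo_time;

/*
 * Hypothetical helper: carry whole seconds out of an oversized microsecond
 * field into the time of day. Assumes the result stays within the same day.
 */
static demo_time
demo_fold_usec(int hour, int min, int sec, long usec)
{
    long      total_sec = hour * 3600L + min * 60L + sec + usec / 1000000L;
    demo_time t;

    t.usec = (int) (usec % 1000000L);
    t.hour = (int) (total_sec / 3600L);
    t.min = (int) ((total_sec % 3600L) / 60L);
    t.sec = (int) (total_sec % 60L);
    return t;
}
```

Calling demo_fold_usec(21, 59, 57, 12345678L) gives 22:00:09 with 345678 microseconds left over, matching the output above.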

Digging into the code, I found inconsistencies in from_char_parse_int_len(). From formatting.c:

!    /*
!     * Read a single integer from the source string, into the int pointed to by
!     * 'dest'. If 'dest' is NULL, the result is discarded.
!     *
!     * In fixed-width mode (the node does not have the FM suffix), consume at most
!     * 'len' characters.  However, any leading whitespace isn't counted in 'len'.
!     *
<snip>
!    static int
!    from_char_parse_int_len(int *dest, char **src, const int len, FormatNode *node)
!    {
<snip>
!        if (S_FM(node->suffix) || is_next_separator(node))
!        {
!            /*
!             * This node is in Fill Mode, or the next node is known to be a
!             * non-digit value, so we just slurp as many characters as we can get.
!             */
!            errno = 0;
!            result = strtol(init, src, 10);
!        }
!        else
!        {
!            /*
!             * We need to pull exactly the number of characters given in 'len' out
!             * of the string, and convert those.
!             */
!            char       *last;
!
!            if (used < len)
!                ereport(ERROR,
!                        (errcode(ERRCODE_INVALID_DATETIME_FORMAT),

So the function's header comment disagrees with the code. In the first branch, strtol() can consume more than 'len' digits. In the else branch, we error out unless we have exactly 'len' characters.
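
A minimal standalone sketch of the two behaviours, for comparison. This is not the formatting.c code: demo_parse_unbounded() and demo_parse_fixed() are hypothetical names, errors are reported through a return code rather than ereport(), and 'len' is assumed to be smaller than 32.

```c
#include <stdlib.h>

/*
 * Hypothetical stand-in for the first branch: strtol() reads every digit it
 * finds, so nothing limits the field to 'len' characters.
 */
static long
demo_parse_unbounded(const char *s, int *used)
{
    char *end;
    long  value = strtol(s, &end, 10);

    *used = (int) (end - s);
    return value;
}

/*
 * Hypothetical stand-in for the else branch: copy at most 'len' characters
 * and convert only those. Returns 0 on success, -1 if fewer than 'len'
 * characters are available (where the real code raises an error).
 * Assumes len < 32.
 */
static int
demo_parse_fixed(const char *s, int len, long *value, int *used)
{
    char  buf[32];
    char *end;
    int   n = 0;

    while (n < len && n < (int) sizeof(buf) - 1 && s[n] != '\0')
    {
        buf[n] = s[n];
        n++;
    }
    buf[n] = '\0';

    if (n < len)
        return -1;

    *value = strtol(buf, &end, 10);
    *used = (int) (end - buf);
    return 0;
}
```

With the input "12345678" and len = 6, the unbounded version consumes all eight digits, while the fixed-width version stops after "123456". With the input "123" and len = 6, the fixed-width version fails.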

What's the proper behaviour here? 

- Doug
Salesforce

Re: [HACKERS] Inconsistencies in from_char_parse_int_len()

From: Tom Lane

Douglas Doole <dougdoole@gmail.com> writes:
> I was playing with TO_TIMESTAMP() and I noticed a weird result:
> postgres=# select to_timestamp('20170-07-24 21:59:57.12345678', 'yyyy-mm-dd hh24:mi:ss.us');
>           to_timestamp
> --------------------------------
>  20170-07-24 22:00:09.345678+00
> (1 row)

FWIW, we already tightened that up in v10:

regression=#  select to_timestamp('20170-07-24 21:59:57.12345678', 'yyyy-mm-dd hh24:mi:ss.us');
ERROR:  date/time field value out of range: "20170-07-24 21:59:57.12345678"

There may well be some discrepancies left.
        regards, tom lane