Re: [PATCHES] to_date() validation - Mailing list pgsql-hackers

From Brendan Jurd
Subject Re: [PATCHES] to_date() validation
Date
Msg-id 37ed240d0809080124y3403224fpe3c34ca90df798a3@mail.gmail.com
Whole thread Raw
Responses Re: [PATCHES] to_date() validation  ("Brendan Jurd" <direvus@gmail.com>)
Re: [PATCHES] to_date() validation  (Martijn van Oosterhout <kleptog@svana.org>)
Re: [PATCHES] to_date() validation  ("Alex Hunsaker" <badalex@gmail.com>)
List pgsql-hackers
On Sun, Sep 7, 2008 at 5:58 AM, Alex Hunsaker <badalex@gmail.com> wrote:
> Im just following this:
> http://wiki.postgresql.org/wiki/Reviewing_a_Patch so lets get started.
>

Hi Alex.  Thanks for taking the time to review my patch.

> Feature test: Everything seems to work as advertised. However before
> via sscanf() most limited the max length of the input.  Before they
> were just silently ignored now they get added to the result:
>
> patched:
> # SELECT to_timestamp('11111', 'HHMM');
>         to_timestamp
> ------------------------------
>  0009-03-19 11:00:00-06:59:56
>
> 8.3.3:
> # SELECT to_timestamp('11111', 'HHMM');
>       to_timestamp
> ---------------------------
>  0001-11-01 11:00:00-07 BC
>

It's an interesting point.  I had considered what would happen if the
input string was too short, but hadn't given much thought to what
would happen if it was too long.

In your example case, it isn't clear whether the user wanted to
specify 11 hours and 11 months (and the final '1' is just junk), or if
they really wanted to specify 11 hours and 111 months.

But consider a case like the following:

# SELECT to_date('07-09-20008', 'DD-MM-YYYY');

The documentation says that 'YYYY' is "4 and more digits", so you have
to assume that the user meant to say the year 20,008.  That's why the
code in the patch parses all the remaining digits in the input string
on the final node.

HEAD actually gets this one wrong; in defiance of the documentation it
returns 2000-09-07.  So, it seems to me that the patch shifts the
behaviour in the right direction.

Barring actually teaching the code that some nodes (like YYYY) can
take an open-ended number of characters, while others (like MM) must
take an exact number of characters, I'm not sure what can be done to
improve this.  Perhaps something for a later patch?

> Code review: one minor nit
> *** a/src/backend/utils/adt/formatting.c
> --- b/src/backend/utils/adt/formatting.c
> ***************
> *** 781,787 **** static const KeyWord DCH_keywords[] = {
>        {"y", 1, DCH_Y, TRUE, FROM_CHAR_DATE_GREGORIAN},
>
>        /* last */
> !       {NULL, 0, 0, 0}
>  };
>
>  /* ----------
> --- 781,787 ----
>        {"y", 1, DCH_Y, TRUE, FROM_CHAR_DATE_GREGORIAN},
>
>        /* last */
> !       {NULL, 0, 0, 0, 0}
>  };

Ah, thank you for picking that up.  I will correct this and send in a
new patch version in the next 24 hours.

Cheers,
BJ


pgsql-hackers by date:

Previous
From: Zdenek Kotala
Date:
Subject: Re: Prototype: In-place upgrade v02
Next
From: Zdenek Kotala
Date:
Subject: Re: Prototype: In-place upgrade v02