Re: Meridiem markers (was: [BUGS] Incorrect "invalid AM/PM string" error from to_timestamp) - Mailing list pgsql-hackers

From Brendan Jurd
Subject Re: Meridiem markers (was: [BUGS] Incorrect "invalid AM/PM string" error from to_timestamp)
Date
Msg-id 37ed240d0901180324k7c07445dlcffb41eb25f17cca@mail.gmail.com
In response to Meridiem markers (was: [BUGS] Incorrect "invalid AM/PM string" error from to_timestamp)  ("Brendan Jurd" <direvus@gmail.com>)
List pgsql-hackers
On Sat, Sep 27, 2008 at 4:25 AM, Brendan Jurd <direvus@gmail.com> wrote:
> Currently, Postgres accepts four separate flavours for specifying
> meridiem markers, given by uppercase/lowercase and with/without
> periods:
>
>  * am/pm
>  * AM/PM
>  * a.m./p.m.
>  * A.M./P.M.
>
> I would go so far as to say that we should accept any of the 8 valid
> meridiem markers, regardless of which flavour is indicated by the
> formatting keyword.
>
> Day and month names already work this way.  We don't throw an error if
> a user specifies a mixed-case month name like "Sep" but uses the
> uppercase formatting keyword "MON".

I've been thinking further about this lately, and whilst the month and
day name tokens aren't fussy about *case*, they do make a distinction
about *length*.

So, while MON will match "Sep", "SEP" and "sep" just fine, it will
have issues with "September" (it will match the first three characters
as "Sep" and then leave the remaining characters "tember" to bork up
the next token).

Likewise, MONTH will not match "Sep"; it needs the full month name.
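
To make the length behaviour concrete, here is a minimal Python sketch.
It is not the code in PostgreSQL's formatting.c; the function name
match_mon and the month table are made up for this illustration.  It
assumes MON consumes exactly three characters, ignores case, and hands
whatever is left to the next token.

```python
# Hypothetical illustration only; not PostgreSQL's implementation.
MONTHS = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]


def match_mon(text):
    """Consume a three-letter month abbreviation from the start of text.

    Returns (month_number, remaining_text).  Raises ValueError when the
    first three characters are not a month abbreviation in any letter case.
    """
    head, rest = text[:3], text[3:]
    try:
        # Case is ignored, but the length is fixed at three characters.
        month = MONTHS.index(head.lower()) + 1
    except ValueError:
        raise ValueError("invalid month abbreviation: %r" % head) from None
    return month, rest
```

Under these assumptions, match_mon("SEP") and match_mon("sep") both
succeed, while match_mon("September") returns month 9 and leaves
"tember" behind for the next token to trip over.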

I think, for to_timestamp(), it's important that the user have a solid
idea of how many characters each formatting token wants to consume.
With the am/pm and bc/ad markers, we've got two possibilities for
length: without periods (2 characters) and with periods (4
characters).  Having the 2-character token match against a 4-character
string might cause more confusion than convenience.

It may make more sense to keep the different lengths separate, so that
a 2-character token will match any of "am", "pm", "AM", "PM", and a
4-character token will match any of "a.m.", "p.m.", "A.M.", "P.M.".
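
A rough Python sketch of that idea, purely as an illustration of the
proposal and not of any existing PostgreSQL code: the names
match_meridiem, SHORT_MARKERS and LONG_MARKERS are hypothetical.  The
token's flavour fixes how many characters are consumed, and case is
ignored within that length.

```python
# Hypothetical illustration of the proposal; not PostgreSQL's implementation.
SHORT_MARKERS = {"am": "am", "pm": "pm"}        # 2-character flavour
LONG_MARKERS = {"a.m.": "am", "p.m.": "pm"}     # 4-character flavour


def match_meridiem(text, with_periods):
    """Consume a meridiem marker from the start of text.

    with_periods selects the 4-character flavour (True) or the
    2-character flavour (False).  Returns (marker, remaining_text), where
    marker is "am" or "pm".  Raises ValueError if the input does not start
    with a marker of the expected length, in any letter case.
    """
    table = LONG_MARKERS if with_periods else SHORT_MARKERS
    width = 4 if with_periods else 2
    head, rest = text[:width], text[width:]
    marker = table.get(head.lower())
    if marker is None:
        raise ValueError("invalid AM/PM string: %r" % head)
    return marker, rest
```

Here a 2-character token accepts "am", "pm", "AM" or "PM" but rejects
"a.m.", and a 4-character token does the reverse, so the number of
characters consumed is always predictable from the format string.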

Comments?

Cheers,
BJ

