Thread: T is a mandatory date time separator in RFC3339 but documentation states differently

T is a mandatory date time separator in RFC3339 but documentation states differently

From
PG Doc comments form
Date:
The following documentation comment has been logged on the website:

Page: https://www.postgresql.org/docs/16/git.html
Description:

>ISO 8601 specifies the use of uppercase letter T to separate the date and
time. PostgreSQL accepts that format on input, but on output it uses a space
rather than T, as shown above. This is for readability and for consistency
with RFC 3339 as well as some other database systems.

https://www.postgresql.org/docs/current/datatype-datetime.html

Short answer: T (or t as discouraged alternative).

After reading on this as much as I could, it turns out the time separator
must be a T or t. What has made think this way is first of all this thread
in the GNU lists where F. Alexander Njemz contacted the authors of RFC3339
Graham Klyne and Chris Newman asking if T is mandatory and got this response
from Mr. Klyne:

> In short: "yes"
> 
> Per section 5.5, the intent in this draft was to specify a timestamp
format using elements from and compatible with 8601, but eliminating as far
as reasonable any variations that could make timestamp data harder to
process. This includes making the 'T' mandatory in date+time values.

Just for clarity's sake, this is stated in the section 5.5:

> Simplicity is achieved by making most fields and punctuation mandatory.

This clearly clashes with a non-mandatory T and strongly makes me think that
the this syntax in that problematic passage refers to ISO8601 and not
RFC3339.

https://stackoverflow.com/questions/63783868/what-are-valid-date-time-separators-in-rfc3339-strings/63882162#63882162

On 2023-11-11 23:45 +0100, PG Doc comments form wrote:
> The following documentation comment has been logged on the website:
> 
> Page: https://www.postgresql.org/docs/16/git.html
> Description:
> 
> >ISO 8601 specifies the use of uppercase letter T to separate the date and
> time. PostgreSQL accepts that format on input, but on output it uses a space
> rather than T, as shown above. This is for readability and for consistency
> with RFC 3339 as well as some other database systems.

This note probably refers to section 5.6. of RFC 3339 [1] which allows
applications to choose space over "T".

> https://www.postgresql.org/docs/current/datatype-datetime.html
> 
> Short answer: T (or t as discouraged alternative).
> 
> After reading on this as much as I could, it turns out the time separator
> must be a T or t. What has made think this way is first of all this thread
> in the GNU lists where F. Alexander Njemz contacted the authors of RFC3339
> Graham Klyne and Chris Newman asking if T is mandatory and got this response
> from Mr. Klyne:
> 
> > In short: "yes"
> > 
> > Per section 5.5, the intent in this draft was to specify a timestamp
> format using elements from and compatible with 8601, but eliminating as far
> as reasonable any variations that could make timestamp data harder to
> process. This includes making the 'T' mandatory in date+time values.
> 
> Just for clarity's sake, this is stated in the section 5.5:
> 
> > Simplicity is achieved by making most fields and punctuation mandatory.

But the word "most" certainly leaves some wiggle room.

> This clearly clashes with a non-mandatory T and strongly makes me think that
> the this syntax in that problematic passage refers to ISO8601 and not
> RFC3339.
> 
>
https://stackoverflow.com/questions/63783868/what-are-valid-date-time-separators-in-rfc3339-strings/63882162#63882162

[1] https://datatracker.ietf.org/doc/html/rfc3339#section-5.6

-- 
Erik



So what should be changed?

postgresql docs say that it is consistent with rfc3339 using space.

Also look at this:

> In addition, an uppercase "T" character MUST be used to separate date and time

https://validator.w3.org/feed/docs/error/InvalidRFC3339Date.html

I would say that postgresql is not consistent with rfc3339, but rather with ISO8601 which is very flexible in regard to separator etc.

On Mon, Nov 13, 2023 at 11:23 AM Erik Wienhold <ewie@ewie.name> wrote:
On 2023-11-11 23:45 +0100, PG Doc comments form wrote:
> The following documentation comment has been logged on the website:
>
> Page: https://www.postgresql.org/docs/16/git.html
> Description:
>
> >ISO 8601 specifies the use of uppercase letter T to separate the date and
> time. PostgreSQL accepts that format on input, but on output it uses a space
> rather than T, as shown above. This is for readability and for consistency
> with RFC 3339 as well as some other database systems.

This note probably refers to section 5.6. of RFC 3339 [1] which allows
applications to choose space over "T".

> https://www.postgresql.org/docs/current/datatype-datetime.html
>
> Short answer: T (or t as discouraged alternative).
>
> After reading on this as much as I could, it turns out the time separator
> must be a T or t. What has made think this way is first of all this thread
> in the GNU lists where F. Alexander Njemz contacted the authors of RFC3339
> Graham Klyne and Chris Newman asking if T is mandatory and got this response
> from Mr. Klyne:
>
> > In short: "yes"
> >
> > Per section 5.5, the intent in this draft was to specify a timestamp
> format using elements from and compatible with 8601, but eliminating as far
> as reasonable any variations that could make timestamp data harder to
> process. This includes making the 'T' mandatory in date+time values.
>
> Just for clarity's sake, this is stated in the section 5.5:
>
> > Simplicity is achieved by making most fields and punctuation mandatory.

But the word "most" certainly leaves some wiggle room.

> This clearly clashes with a non-mandatory T and strongly makes me think that
> the this syntax in that problematic passage refers to ISO8601 and not
> RFC3339.
>
> https://stackoverflow.com/questions/63783868/what-are-valid-date-time-separators-in-rfc3339-strings/63882162#63882162

[1] https://datatracker.ietf.org/doc/html/rfc3339#section-5.6

--
Erik

On Mon, Nov 13, 2023 at 12:03 PM Roman Frołow <rofrol@gmail.com> wrote:
So what should be changed?

postgresql docs say that it is consistent with rfc3339 using space.

Also look at this:

> In addition, an uppercase "T" character MUST be used to separate date and time

https://validator.w3.org/feed/docs/error/InvalidRFC3339Date.html

I would say that postgresql is not consistent with rfc3339, but rather with ISO8601 which is very flexible in regard to separator etc.

On Mon, Nov 13, 2023 at 11:23 AM Erik Wienhold <ewie@ewie.name> wrote:
On 2023-11-11 23:45 +0100, PG Doc comments form wrote:
> The following documentation comment has been logged on the website:
>
> Page: https://www.postgresql.org/docs/16/git.html
> Description:
>
> >ISO 8601 specifies the use of uppercase letter T to separate the date and
> time. PostgreSQL accepts that format on input, but on output it uses a space
> rather than T, as shown above. This is for readability and for consistency
> with RFC 3339 as well as some other database systems.

This note probably refers to section 5.6. of RFC 3339 [1] which allows
applications to choose space over "T".

> https://www.postgresql.org/docs/current/datatype-datetime.html
>
> Short answer: T (or t as discouraged alternative).
>
> After reading on this as much as I could, it turns out the time separator
> must be a T or t. What has made think this way is first of all this thread
> in the GNU lists where F. Alexander Njemz contacted the authors of RFC3339
> Graham Klyne and Chris Newman asking if T is mandatory and got this response
> from Mr. Klyne:
>
> > In short: "yes"
> >
> > Per section 5.5, the intent in this draft was to specify a timestamp
> format using elements from and compatible with 8601, but eliminating as far
> as reasonable any variations that could make timestamp data harder to
> process. This includes making the 'T' mandatory in date+time values.
>
> Just for clarity's sake, this is stated in the section 5.5:
>
> > Simplicity is achieved by making most fields and punctuation mandatory.

But the word "most" certainly leaves some wiggle room.

> This clearly clashes with a non-mandatory T and strongly makes me think that
> the this syntax in that problematic passage refers to ISO8601 and not
> RFC3339.
>
> https://stackoverflow.com/questions/63783868/what-are-valid-date-time-separators-in-rfc3339-strings/63882162#63882162

[1] https://datatracker.ietf.org/doc/html/rfc3339#section-5.6

--
Erik
On 2023-11-13 12:27 +0100, Roman Frołow wrote:
> It seems I was wrong.
> Now T is mandatory in ISO8601 and from RFC3339 it seems this requirement is
> relaxed.

Right.  RFC 3339 is from 2002 and it references ISO 8601:1988 where T
was optional.  (Until today I didn't even know about the 2019 edition.)
RFC 3339 made the separator mandatory and also permitted any character.

Also SQL only uses part of ISO 8601 (without specifying a particular
edition), namely the definition of UTC and date according to 3.1.4 in
Part 2 of the standard:

> 3.1.4 Definitions taken from ISO 8601
>
> For the purposes of this part of ISO/IEC 9075, the definitions of the
> following terms given in ISO 8601 apply:
>
> 3.1.4.1 Coordinated Universal Time (UTC)
> 3.1.4.2 date (date, calendar in ISO 8601)

I think the docs should clarify that "ISO 8601" in this case refers to a
subset of definitions as layed out in the SQL standard.  For example,
week date (SELECT '2023-W46-1'::date) is not accepted by Postgres, but
ordinal date (SELECT '2023-317'::date) is.

I also noticed that when people say "ISO 8601" they usually mean RFC
3389 or some subset of ISO 8601. [1]

[1] https://robinpokorny.com/blog/why-developers-should-stop-using-iso-8601-for-date-time/

-- 
Erik



On 2023-11-13 15:24 +0100, Erik Wienhold wrote:
> I also noticed that when people say "ISO 8601" they usually mean RFC
> 3389 or some subset of ISO 8601.

Forgot this fine visualization of the differences:
https://ijmacd.github.io/rfc3339-iso8601/

-- 
Erik



Erik Wienhold <ewie@ewie.name> writes:
> On 2023-11-13 15:24 +0100, Erik Wienhold wrote:
>> I also noticed that when people say "ISO 8601" they usually mean RFC
>> 3389 or some subset of ISO 8601.

> Forgot this fine visualization of the differences:
> https://ijmacd.github.io/rfc3339-iso8601/

I'm inclined not to change anything here, for a couple of reasons:

1. PG accepts a fairly large number of ISO 8601 variants (not all);
not only the RFC 3339 format.  So s/ISO 8601/RFC 3339/g would be
incorrect.  Besides, I think more people know what ISO 8601 is than
know what RFC 3339 is, so that change would also be confusing.

2. If ijmacd's pretty graphic is correct, then what we say about
'T' versus space is correct, even if it isn't the whole truth.
I'm not quite sure that ijmacd is correct, though, because of
this bit in 3339:

      NOTE: Per [ABNF] and ISO8601, the "T" and "Z" characters in this
      syntax may alternatively be lower case "t" or "z" respectively.

which suggests that 8601 is also case-insensitive.  I don't plan
to go buy a copy of that spec to find out, though.  In any case,
we accept 'T', 't', '_', and most other punctuation there, so
we should be able to read nearly any plausible variant.

            regards, tom lane



On 2023-11-13 17:23 +0100, Tom Lane wrote:
> Erik Wienhold <ewie@ewie.name> writes:
> > On 2023-11-13 15:24 +0100, Erik Wienhold wrote:
> >> I also noticed that when people say "ISO 8601" they usually mean RFC
> >> 3389 or some subset of ISO 8601.
> 
> > Forgot this fine visualization of the differences:
> > https://ijmacd.github.io/rfc3339-iso8601/
> 
> I'm inclined not to change anything here, for a couple of reasons:
> 
> 1. PG accepts a fairly large number of ISO 8601 variants (not all);
> not only the RFC 3339 format.  So s/ISO 8601/RFC 3339/g would be
> incorrect.

I did not say that occurrences of "ISO 8601" should be replaced with
"RFC 3339".  Just that the docs should have a cautionary note about the
SQL standard using a subset of ISO 8601 and that Postgres does not
implement (at the moment) all of its date formats.

> Besides, I think more people know what ISO 8601 is than
> know what RFC 3339 is, so that change would also be confusing.

Fair point.

> 2. If ijmacd's pretty graphic is correct, then what we say about
> 'T' versus space is correct, even if it isn't the whole truth.
> I'm not quite sure that ijmacd is correct, though, because of
> this bit in 3339:
> 
>       NOTE: Per [ABNF] and ISO8601, the "T" and "Z" characters in this
>       syntax may alternatively be lower case "t" or "z" respectively.
> 
> which suggests that 8601 is also case-insensitive.  I don't plan
> to go buy a copy of that spec to find out, though.

I dug up my old copies (1988/2000/2004) from uni (let me know if you're
interested ;)  Lower case is allowed if upper case is not available.
And it doesn't allow space in any edition.  Quote from the 2004 edition:

> 3.4.1 Introduction
>
> NOTE 1   In date and time representations lower case characters may
> be used when upper case characters are not available.
>
> NOTE 2   Encoding of characters for the interchange of dates and times
> is not in the scope of this International Standard.
>
> Unless explicitly allowed by this International Standard the character
> "space" shall not be used in the representations.

Probably still the same in the current edition but I won't pay for that
either.

-- 
Erik



On 14.11.23 19:21, Erik Wienhold wrote:
> I did not say that occurrences of "ISO 8601" should be replaced with
> "RFC 3339".  Just that the docs should have a cautionary note about the
> SQL standard using a subset of ISO 8601 and that Postgres does not
> implement (at the moment) all of its date formats.

The SQL standard does not refer to ISO 8601 to define date formats, it 
has its own definitions.  In fact, PostgreSQL implements more date 
formats than the SQL standard requires.



On 2023-11-15 08:16 +0100, Peter Eisentraut wrote:
> The SQL standard does not refer to ISO 8601 to define date formats, it has
> its own definitions.  In fact, PostgreSQL implements more date formats than
> the SQL standard requires.

Really?  Then what does the standard mean with section "Definitions
taken from ISO 8601" which I quoted in [1]?  Just using the term "date"
without adopting its syntax?

And the Postgres docs also say "The SQL standard requires the use of the
ISO 8601 format." [2]

[1] https://www.postgresql.org/message-id/piavtdd7mhmkpzpgvxaek3hz3e2kan3c2fitn5iqta6nyrpgyl%40txongxshxkxw
[2] https://www.postgresql.org/docs/current/datatype-datetime.html#DATATYPE-DATETIME-OUTPUT

-- 
Erik



On 15.11.23 09:37, Erik Wienhold wrote:
> On 2023-11-15 08:16 +0100, Peter Eisentraut wrote:
>> The SQL standard does not refer to ISO 8601 to define date formats, it has
>> its own definitions.  In fact, PostgreSQL implements more date formats than
>> the SQL standard requires.
> 
> Really?  Then what does the standard mean with section "Definitions
> taken from ISO 8601" which I quoted in [1]?  Just using the term "date"
> without adopting its syntax?

Exactly, it just imports the definitions of those terms.

> And the Postgres docs also say "The SQL standard requires the use of the
> ISO 8601 format." [2]
> [2] https://www.postgresql.org/docs/current/datatype-datetime.html#DATATYPE-DATETIME-OUTPUT

Yeah, that isn't correct.

I think we should reframe "ISO" to mean "ISO 9075" and remove all claims 
of alignment with ISO 8601 and RFC 3339.




On 2023-11-15 12:53 +0100, Peter Eisentraut wrote:
> On 15.11.23 09:37, Erik Wienhold wrote:
> > On 2023-11-15 08:16 +0100, Peter Eisentraut wrote:
> > > The SQL standard does not refer to ISO 8601 to define date formats, it has
> > > its own definitions.  In fact, PostgreSQL implements more date formats than
> > > the SQL standard requires.
> > 
> > Really?  Then what does the standard mean with section "Definitions
> > taken from ISO 8601" which I quoted in [1]?  Just using the term "date"
> > without adopting its syntax?
> 
> Exactly, it just imports the definitions of those terms.

Thanks, now I see.  SQL only defines date format 'YYYY-MM-DD' (YYYY, MM,
and DD can be any unsigned integer) with this BNF:

> <date literal> ::=
>   DATE <date string>
> 
> <date string> ::=
>   <quote> <unquoted date string> <quote>
> 
> <unquoted date string> ::=
>   <date value>
> 
> <date value> ::=
>   <years value> <minus sign> <months value> <minus sign> <days value>

And timestamp is only defined with a space separator which is clearly
not ISO 8601:

> <unquoted timestamp string> ::=
>   <unquoted date string> <space> <unquoted time string>

> > And the Postgres docs also say "The SQL standard requires the use of the
> > ISO 8601 format." [2]
> > [2] https://www.postgresql.org/docs/current/datatype-datetime.html#DATATYPE-DATETIME-OUTPUT
> 
> Yeah, that isn't correct.
> 
> I think we should reframe "ISO" to mean "ISO 9075" and remove all claims of
> alignment with ISO 8601 and RFC 3339.

Agree.  So just list the example inputs without any reference to a
particular standard, except for ISO 9075 to show that Postgres is
SQL-standard-compliant?

-- 
Erik



Erik Wienhold <ewie@ewie.name> writes:
> On 2023-11-15 12:53 +0100, Peter Eisentraut wrote:
>> I think we should reframe "ISO" to mean "ISO 9075" and remove all claims of
>> alignment with ISO 8601 and RFC 3339.

> Agree.  So just list the example inputs without any reference to a
> particular standard, except for ISO 9075 to show that Postgres is
> SQL-standard-compliant?

I think that would remove useful context without actually improving
anything.  (The datetime input code would be far simpler if it
meant only to read the exact format mentioned in the SQL spec.)

            regards, tom lane



On 2023-11-15 15:46 +0100, Tom Lane wrote:
> Erik Wienhold <ewie@ewie.name> writes:
> > On 2023-11-15 12:53 +0100, Peter Eisentraut wrote:
> >> I think we should reframe "ISO" to mean "ISO 9075" and remove all claims of
> >> alignment with ISO 8601 and RFC 3339.
> 
> > Agree.  So just list the example inputs without any reference to a
> > particular standard, except for ISO 9075 to show that Postgres is
> > SQL-standard-compliant?
> 
> I think that would remove useful context without actually improving
> anything.  (The datetime input code would be far simpler if it
> meant only to read the exact format mentioned in the SQL spec.)

I wrote the attached patch to hopefully clarify the ISO 8601 references.

The two main changes are:

* Making explicit references to ISO 8601:2004 where section numbers are
  referenced.  Mostly in source comments but also a couple of places in
  the docs.  This is about avoiding confusion as ISO 8601:2019 has been
  published since then, with different section numbers[1].  The pre-2004
  editions also have different section numbers.  References to general
  ISO 8601 concepts (e.g. week numbers) are left unchanged because those
  are not tied to any particular edition.

* Remove the claim that the SQL standard requires ISO 8601 formats as
  clarified by Peter Eisentraut.  I left the general references to ISO
  8601 and RFC 3339 because those relate to the date format that
  Postgres implements in addition to the standard SQL formats.  Also
  change time zone input samples that are described as ISO 8601 but do
  not match the standard format.

[1] https://www.iso.org/obp/ui/en/#iso:std:iso:8601:-1:ed-1:v1:en

-- 
Erik

Attachment
On 19.11.23 21:34, Erik Wienhold wrote:
> * Making explicit references to ISO 8601:2004 where section numbers are
>    referenced.  Mostly in source comments but also a couple of places in
>    the docs.  This is about avoiding confusion as ISO 8601:2019 has been
>    published since then, with different section numbers[1].  The pre-2004
>    editions also have different section numbers.  References to general
>    ISO 8601 concepts (e.g. week numbers) are left unchanged because those
>    are not tied to any particular edition.

Maybe we should change the references to the 2019 edition instead?




On 2023-11-20 08:14 +0100, Peter Eisentraut wrote:
> On 19.11.23 21:34, Erik Wienhold wrote:
> > * Making explicit references to ISO 8601:2004 where section numbers are
> >    referenced.  Mostly in source comments but also a couple of places in
> >    the docs.  This is about avoiding confusion as ISO 8601:2019 has been
> >    published since then, with different section numbers[1].  The pre-2004
> >    editions also have different section numbers.  References to general
> >    ISO 8601 concepts (e.g. week numbers) are left unchanged because those
> >    are not tied to any particular edition.
> 
> Maybe we should change the references to the 2019 edition instead?

I wouldn't do that without knowing the full text of the standard.  Maybe
I can eyeball which 2004 sections map to 2019 sections but I wouldn't be
confident in that.

Also 8601:2019 removed 24:00 as midnight for some reason but it is
allowed again with an ammendment from 2022.  Not sure what else changed
besides the "main changes" summarized in the foreword of [1].  And I
can't find the 2022 ammendment on iso.org.

[1] https://www.iso.org/obp/ui/en/#iso:std:iso:8601:-1:ed-1:v1:en

-- 
Erik