Thread: BUG #17240: at time zone ... ; wrong result

BUG #17240: at time zone ... ; wrong result

From
PG Bug reporting form
Date:
The following bug has been logged on the website:

Bug reference:      17240
Logged by:          Marek Läll
Email address:      marek.lall@eesti.ee
PostgreSQL version: 13.4
Operating system:   Debian
Description:

Hello!

# SELECT version();
                                                        version
                                           

------------------------------------------------------------------------------------------------------------------------
 PostgreSQL 13.4 (Debian 13.4-1.pgdg100+1) on aarch64-unknown-linux-gnu,
compiled by gcc (Debian 8.3.0-6) 8.3.0, 64-bit
(1 row)


# SET TimeZone='UTC';

# select * from pg_timezone_names where name like 'America/Los_Angeles';
        name         | abbrev | utc_offset | is_dst 
---------------------+--------+------------+--------
 America/Los_Angeles | PDT    | -07:00:00  | t
(1 row)

# select now()
 , now() at time zone 'America/Los_Angeles' as correct
 , now() at time zone '-07:00:00' as wrong;

              now              |          correct           |
wrong            
-------------------------------+----------------------------+----------------------------
 2021-10-21 09:29:07.591962+00 | 2021-10-21 02:29:07.591962 | 2021-10-21
16:29:07.591962
(1 row)

--> COLUMN "WRONG" IS EXPECTED TO HAVE THE SAME VALUE AS COLUMN "CORRECT"

# select timestamptz'2021-10-01 00:00:00 UTC' at time zone
'America/Los_Angeles' as correct_1
 , timestamptz'2021-10-01 00:00:00 UTC' at time zone '-07:00' as wrong_1
 , timestamptz'2021-10-01 00:00:00 America/Los_Angeles' at time zone
'America/Los_Angeles' as correct_2
 , timestamptz'2021-10-01 00:00:00 -07:00:00' at time zone '-07:00' as
wrong_2
 , timestamptz'2021-10-01 00:00:00 -07:00:00' at time zone
'America/Los_Angeles' as correct_3
 , timestamptz'2021-10-01 00:00:00 America/Los_Angeles' at time zone
'-07:00' as wrong_3;
 
      correct_1      |       wrong_1       |      correct_2      |
wrong_2       |      correct_3      |       wrong_3       

---------------------+---------------------+---------------------+---------------------+---------------------+---------------------
 2021-09-30 17:00:00 | 2021-10-01 07:00:00 | 2021-10-01 00:00:00 |
2021-10-01 14:00:00 | 2021-10-01 00:00:00 | 2021-10-01 14:00:00
(1 row)

--> COLUMNS "WRONG_<N>" ARE EXPECTED TO HAVE THE SAME VALUE AS COLUMN
"CORRECT_<N>"


Re: BUG #17240: at time zone ... ; wrong result

From
Tom Lane
Date:
PG Bug reporting form <noreply@postgresql.org> writes:
> # select now()
>  , now() at time zone 'America/Los_Angeles' as correct
>  , now() at time zone '-07:00:00' as wrong;

Unfortunately, the pure-numeric syntax for time zone names follows the
POSIX sign convention, which is opposite to the ISO convention used
in pg_timezone_names.utc_offset (and in most other places in Postgres).
So "at time zone '+07:00:00'" is what you needed to write to duplicate
the 'America/Los_Angeles' result.  See

https://www.postgresql.org/docs/current/datatype-datetime.html#DATATYPE-TIMEZONES

Now, if you'd done this:

select ... now() at time zone interval '-07:00:00' as fine

you'd have gotten the ISO sign interpretation.  But an undecorated
literal string defaults to being of type text, meaning you get
the time-zone-name logic path.

The great thing about standards is there are so many to choose from :-(

            regards, tom lane



Re: BUG #17240: at time zone ... ; wrong result

From
Marek Läll
Date:
Hi!

Thank you for your response, Tom.

I have multiple questions/comments but let's start one-by-one.


POSIX string (syntax) is defined as:
 --> stdoffset[dst[offset][,start-date[/time],end-date[/time]]]

The std string specifies the name of the time zone.
It must be three or more characters long and must not contain a leading colon, embedded digits, commas, nor plus and minus signs.
There is no space character separating the time zone name from the offset, so these restrictions are necessary to parse the specification correctly.

For "Pacific Time (Canada & US)" aka "America/Los_Angeles" the POSIX version looks like: "PST8PDT,M3.2.0,M11.1.0"


Here are results of experiment:

$ export TZ="PST8PDT,M3.2.0,M11.1.0" ; date --iso-8601=seconds
2021-10-24T12:10:51-07:00   <-- CORRECT

$ export TZ="UTC" ; date --iso-8601=seconds
2021-10-24T19:11:32+00:00   <-- CORRECT

$ export TZ="-07:00" ; date --iso-8601=seconds
2021-10-24T19:11:42+00:00   <-- TZ is IGNORED, because "-07:00" is not valid POSIX syntax

$ export TZ="-0700" ; date --iso-8601=seconds
2021-10-24T19:11:46+00:00   <-- TZ is IGNORED, because "-0700" is not valid POSIX syntax

$ export TZ="-07" ; date --iso-8601=seconds
2021-10-24T19:55:08+00:00   <-- TZ is IGNORED, because "-07" is not valid POSIX syntax

$ export TZ="-07RandomMeaninglessString" ; date --iso-8601=seconds
2021-10-24T19:14:55+00:00   <-- TZ is IGNORED, because "-07RandomMeaninglessString" is not valid POSIX syntax

$ uname -a
Linux 5.10.0-8-amd64 #1 SMP Debian 5.10.46-4 (2021-08-03) x86_64 GNU/Linux


Back to POSTGRES case:
> postgres> select timestamptz'2021-10-01 07:00:00 UTC' at time zone '-07:00' as revers_res2;
>  2021-10-01 14:00:00

The value '-07:00' is not a valid POSIX value but ... Postgres performs the following (as concept):
1) first: reads in the string '-07:00'
2) then: ignores the fact that string '-07:00' is invalid POSIX value
3) then: decides to follow "POSIX sign convention" rules (still ignoring previous fact)
4) and then: applies "POSIX sign convention" rules using invalid POSIX value as input?


Regards
Marek


Kontakt Tom Lane (<tgl@sss.pgh.pa.us>) kirjutas kuupäeval N, 21. oktoober 2021 kell 16:41:
PG Bug reporting form <noreply@postgresql.org> writes:
> # select now()
>  , now() at time zone 'America/Los_Angeles' as correct
>  , now() at time zone '-07:00:00' as wrong;

Unfortunately, the pure-numeric syntax for time zone names follows the
POSIX sign convention, which is opposite to the ISO convention used
in pg_timezone_names.utc_offset (and in most other places in Postgres).
So "at time zone '+07:00:00'" is what you needed to write to duplicate
the 'America/Los_Angeles' result.  See

https://www.postgresql.org/docs/current/datatype-datetime.html#DATATYPE-TIMEZONES

Now, if you'd done this:

select ... now() at time zone interval '-07:00:00' as fine

you'd have gotten the ISO sign interpretation.  But an undecorated
literal string defaults to being of type text, meaning you get
the time-zone-name logic path.

The great thing about standards is there are so many to choose from :-(

                        regards, tom lane


Re: BUG #17240: at time zone ... ; wrong result

From
Tom Lane
Date:
=?UTF-8?Q?Marek_L=C3=A4ll?= <lall.marek@gmail.com> writes:
> POSIX string (syntax) is defined as:
>  --> stdoffset[dst[offset][,start-date[/time],end-date[/time]]]

> The std string specifies the name of the time zone.
> It must be three or more characters long and must not contain a leading
> colon, embedded digits, commas, nor plus and minus signs.

Hmm, you're reading the POSIX spec I guess, because our docs don't
say that ;-).  The IANA tzdb code only enforces that STD not be empty,
and Postgres has modified it to allow empty STD as well.  That's
an ancient backwards-compatibility decision that we likely ought
to change sometime, so I've intentionally not documented it in
appendix B.5 [1].  The text in B.5 actually says that you need angle
brackets if you want any non-letters in STD or DST, which is more
conservative than what the IANA code will accept.

> 2) then: ignores the fact that string '-07:00' is invalid POSIX value

It's valid according to our interpretation of POSIX.  Some experimentation
suggests that GNU date(1) enforces the POSIX definition exactly, which is
that STD be at least three characters, all alphabetic.  That implies that
they wrote their own TZ parser, because the IANA reference code doesn't
act that way.

            regards, tom lane

[1] https://www.postgresql.org/docs/current/datetime-posix-timezone-specs.html



Re: BUG #17240: at time zone ... ; wrong result

From
Marek Läll
Date:
Hi!

Thank you for your response.


Next question.

What about the following principle?

  if there is a function
    y = func( x, z)
  and it is linear function (which is the case) then there is inverse function
    x = inverse_func( y, z)
  and then the following is always true:
    x = inverse_func( func( x, z), z)

Example with plus and minus:
  y = x + z
  x = y - z
  x = (x + z) - z


Postgres follows the principle for valid POSIX values:
  func():         timestamptz'2021-10-01 00:00:00 UTC-07:00' at time zone 'UTC'  --> timestamp'2021-09-30 17:00:00'
  inverse_func(): timestamptz'2021-09-30 17:00:00 UTC' at time zone 'UTC-07:00'  --> timestamp'2021-10-01 00:00:00'

And Postgres follows the principle for values like 'America/Los_Angeles'.


But Postgres starts ignoring the principle for invalid POSIX (or should we say non-POSIX) values:
  func():         timestamptz'2021-10-01 00:00:00 -07:00' at time zone 'UTC'  --> timestamp'2021-10-01 07:00:00'
  inverse_func(): timestamptz'2021-10-01 07:00:00 UTC' at time zone '-07:00'  --> timestamp'2021-10-01 14:00:00'


Please note!
1) Values 'UTC-07:00' and '-07:00' are treated as different values if they are specified inside a timestamp literal.
   And this is expected behaviour. POSIX value and non-POSIX value are recognized and handled properly.

2) Values 'UTC-07:00' and '-07:00' are treated as equal values if they are specified as arguments for "at time zone".
   This is NOT expected behaviour. Non-POSIX values must not be recognized and handled as POSIX values.
   Also, this behaviour breaks the simple principle described in the beginning.


To be clear, once again, same examples, different angle.
Next 2 examples produce different result (expected):
    timestamptz'2021-10-01 00:00:00 -07:00' at time zone 'UTC'  --> timestamp'2021-10-01 07:00:00'
    timestamptz'2021-10-01 00:00:00 UTC-07:00' at time zone 'UTC'  --> timestamp'2021-09-30 17:00:00'  

And next 2 examples produce the same result (not expected):
    timestamptz'2021-10-01 07:00:00 UTC' at time zone '-07:00'  -->  timestamp'2021-10-01 14:00:00'
    timestamptz'2021-10-01 07:00:00 UTC' at time zone 'UTC-07:00'  -->  timestamp'2021-10-01 14:00:00'


Summary:
I reported this case as the "at time zone" argument is not properly handled.
But you may say that the timestamptz literal is not properly handled.
Anyway, there is no consistent handling as you can see in examples above.


Regards
Marek

Kontakt Tom Lane (<tgl@sss.pgh.pa.us>) kirjutas kuupäeval E, 25. oktoober 2021 kell 19:26:
Marek Läll <lall.marek@gmail.com> writes:
> POSIX string (syntax) is defined as:
>  --> stdoffset[dst[offset][,start-date[/time],end-date[/time]]]

> The std string specifies the name of the time zone.
> It must be three or more characters long and must not contain a leading
> colon, embedded digits, commas, nor plus and minus signs.

Hmm, you're reading the POSIX spec I guess, because our docs don't
say that ;-).  The IANA tzdb code only enforces that STD not be empty,
and Postgres has modified it to allow empty STD as well.  That's
an ancient backwards-compatibility decision that we likely ought
to change sometime, so I've intentionally not documented it in
appendix B.5 [1].  The text in B.5 actually says that you need angle
brackets if you want any non-letters in STD or DST, which is more
conservative than what the IANA code will accept.

> 2) then: ignores the fact that string '-07:00' is invalid POSIX value

It's valid according to our interpretation of POSIX.  Some experimentation
suggests that GNU date(1) enforces the POSIX definition exactly, which is
that STD be at least three characters, all alphabetic.  That implies that
they wrote their own TZ parser, because the IANA reference code doesn't
act that way.

                        regards, tom lane

[1] https://www.postgresql.org/docs/current/datetime-posix-timezone-specs.html