The contents of the pg_timezone_names view bring some surprises - Mailing list pgsql-general

From Bryn Llewellyn
Subject The contents of the pg_timezone_names view bring some surprises
Date
Msg-id 34C58324-C1AC-4BF9-9DDF-C9A09ADA2901@yugabyte.com
Whole thread Raw
Responses Re: The contents of the pg_timezone_names view bring some surprises
Re: The contents of the pg_timezone_names view bring some surprises
Re: The contents of the pg_timezone_names view bring some surprises
List pgsql-general
Some time zones have abbreviations that are identical to their names. This query:

select name
from pg_timezone_names
where abbrev = name
order by name;

gets this result:

EST
GMT
HST
MST
UCT
UTC

This, in itself, doesn’t seem remarkable. I wondered if any time zones have names that occur as time zone abbreviations but where the name and its abbreviation differ.

select name
from
pg_timezone_names
where
  name in (select abbrev from pg_timezone_names) and
  name <> abbrev
order by name;

gets this result:

CET
EET

So I wondered what rows have CET or EET as either a name or an abbreviation and yet the name and the abbreviation differ.

select name, abbrev, utc_offset, is_dst
from pg_timezone_names
where
  (
    name in ('CET', 'EET') or
    abbrev in ('CET', 'EET')
  )
  and name <> abbrev
order by name;

gets this result:

Africa/Algiers     | CET    | 01:00:00   | f
Africa/Cairo       | EET    | 02:00:00   | f
Africa/Tripoli     | EET    | 02:00:00   | f
Africa/Tunis       | CET    | 01:00:00   | f
CET                | CEST   | 02:00:00   | t
EET                | EEST   | 03:00:00   | t
Egypt              | EET    | 02:00:00   | f
Europe/Kaliningrad | EET    | 02:00:00   | f
Libya              | EET    | 02:00:00   | f

This tells me that when CET is used as a timezone name, it denotes an offset of 02:00—at least at some times of the year. And when the same text is used as an abbrev, it denotes an offset of 01:00.

But you can use either a timezone name, or a timezone abbreviation in the `at time zone` clause (see below).

There’s a similar story for EET where it denotes respectively offsets of 03:00 and 02:00.

Here’s what seems to me to be a closely related dilemma. I’d thought that an abbrev uniquely specified the utc_offset. But this test shows that it doesn’t:

with
  v1 as (
    select distinct abbrev, utc_offset
    from pg_timezone_names),
  v2 as (
    select abbrev, count(*)
    from v1
    group by abbrev
    having count(*) > 1)
select name, abbrev, utc_offset, is_dst
from pg_timezone_names
where abbrev in (select abbrev from v2)
order by abbrev;

It gets 46 rows. Here’s an interesting subset:

America/Monterrey              | CDT    | -05:00:00  | t
America/Havana                 | CDT    | -04:00:00  | t

ROC                            | CST    | 08:00:00   | f
America/Costa_Rica             | CST    | -06:00:00  | f

Eire                           | IST    | 01:00:00   | f
Asia/Kolkata                   | IST    | 05:30:00   | f

So here, the same text, even when used as abbrev, can denote different utc_offset values. (But note that there seems to be no way, in the 'at time zone' clause, that I can say that I want a text value to be taken as a name and not as an abbreviation, or vice versa.) 

This seems to be at odds with what section “8.5.3. Time Zones” at


says:

«
A time zone abbreviation, for example PST. Such a specification merely defines a particular offset from UTC, in contrast to full time zone names which can imply a set of daylight savings transition rules as well. The recognized abbreviations are listed in the pg_timezone_abbrevs view (see Section 51.91). You cannot set the configuration parameters TimeZone or log_timezone to a time zone abbreviation, but you can use abbreviations in date/time input values and with the AT TIME ZONE operator.
»

This claims (as I read it) that a time zone abbreviation uniquely determines an offset from UTC.

It seems that the result of this is therefore undefined because CDT denotes two different utc_offset values.:

select ('2000-01-01 12:00:00 +00:00'::timestamptz) at time zone 'CDT';

The same goes for this:

select ('2000-01-01 12:00:00 +00:00'::timestamptz) at time zone 'EET';

In summary, each of these texts, for two kinds of reason, can denote two different utc_offset values.:

CET
EET
CDT
CST
IST

Am I missing an essential clue to resolving what seems to me to be a paradox? Or am I seeing two kinds of bug?



pgsql-general by date:

Previous
From: Bryn Llewellyn
Date:
Subject: Some abbrev values from pg_timezone_names are not found in pg_timezone_abbrevs
Next
From: "David G. Johnston"
Date:
Subject: Re: The contents of the pg_timezone_names view bring some surprises