Re: UCT (Re: pgsql: Update time zone data files to tzdata release 2019a.) - Mailing list pgsql-hackers

From Andrew Gierth
Subject Re: UCT (Re: pgsql: Update time zone data files to tzdata release 2019a.)
Msg-id 871rzv4yes.fsf@news-spur.riddles.org.uk
In response to Re: UCT (Re: pgsql: Update time zone data files to tzdata release 2019a.)  (Andrew Gierth <andrew@tao11.riddles.org.uk>)
Responses Re: UCT (Re: pgsql: Update time zone data files to tzdata release2019a.)
List pgsql-hackers
>>>>> "Andrew" == Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:

 >>> This isn't good enough, because it still picks "UCT" on a system
 >>> with no /etc/localtime and no TZ variable. Testing on HEAD as of
 >>> 3da73d683 (on FreeBSD, but it'll be the same anywhere else):

 Tom> [ shrug... ]  Too bad.  I doubt that that's a common situation anyway.

 Andrew> I'm also reminded that this applies also if the /etc/localtime
 Andrew> file is a _copy_ of the UTC zonefile rather than a symlink,
 Andrew> which is possibly even more common.

And testing shows that if you select "UTC" when installing FreeBSD,
/etc/localtime is indeed created as a copy rather than a symlink, and
I've confirmed that initdb picks "UCT" in that case.

So here is my current proposed fix.

-- 
Andrew (irc:RhodiumToad)

diff --git a/src/bin/initdb/findtimezone.c b/src/bin/initdb/findtimezone.c
index 3477a08efd..f7c199a006 100644
--- a/src/bin/initdb/findtimezone.c
+++ b/src/bin/initdb/findtimezone.c
@@ -128,8 +128,11 @@ pg_load_tz(const char *name)
  * the C library's localtime() function.  The database zone that matches
  * furthest into the past is the one to use.  Often there will be several
  * zones with identical rankings (since the IANA database assigns multiple
- * names to many zones).  We break ties arbitrarily by preferring shorter,
- * then alphabetically earlier zone names.
+ * names to many zones).  We break ties by first checking for "preferred"
+ * names (such as "UTC"), and then arbitrarily by preferring shorter, then
+ * alphabetically earlier zone names.  (If we did not explicitly prefer
+ * "UTC", we would get the alias name "UCT" instead due to alphabetic
+ * ordering.)
  *
  * Many modern systems use the IANA database, so if we can determine the
  * system's idea of which zone it is using and its behavior matches our zone
@@ -602,6 +605,28 @@ check_system_link_file(const char *linkname, struct tztry *tt,
 #endif
 }
 
+/*
+ * Given a timezone name, determine whether it should be preferred over other
+ * names which are equally good matches. The output is arbitrary but we will
+ * use 0 for "neutral" default preference.
+ *
+ * Ideally we'd prefer the zone.tab/zone1970.tab names, since in general those
+ * are the ones offered to the user to select from. But for the moment, to
+ * minimize changes in behaviour, simply prefer UTC over alternative spellings
+ * such as UCT that otherwise cause confusion. The existing "shortest first"
+ * rule would prefer "UTC" over "Etc/UTC" so keep that the same way (while
+ * still preferring Etc/UTC over Etc/UCT).
+ */
+static int
+zone_name_pref(const char *zonename)
+{
+    if (strcmp(zonename, "UTC") == 0)
+        return 50;
+    if (strcmp(zonename, "Etc/UTC") == 0)
+        return 40;
+    return 0;
+}
+
 /*
  * Recursively scan the timezone database looking for the best match to
  * the system timezone behavior.
@@ -674,7 +699,8 @@ scan_available_timezones(char *tzdir, char *tzdirsub, struct tztry *tt,
             else if (score == *bestscore)
             {
                 /* Consider how to break a tie */
-                if (strlen(tzdirsub) < strlen(bestzonename) ||
+                if (zone_name_pref(tzdirsub) > zone_name_pref(bestzonename) ||
+                    strlen(tzdirsub) < strlen(bestzonename) ||
                     (strlen(tzdirsub) == strlen(bestzonename) &&
                      strcmp(tzdirsub, bestzonename) < 0))
                     strlcpy(bestzonename, tzdirsub, TZ_STRLEN_MAX + 1);
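
One point worth checking in the tie-break hunk: the three conditions are joined with ||, so the length and alphabetical tests still apply when the candidate has a lower preference than the current best. If "UTC" is already the best name and "UCT" is visited next, strcmp("UCT", "UTC") < 0 holds and "UCT" would replace it, so the result depends on directory scan order. A minimal sketch of an ordering that falls through to the older rules only when the preferences are equal; prefer_candidate is a hypothetical helper name and is not part of the patch:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Same preference values as zone_name_pref() in the patch above. */
static int
zone_name_pref(const char *zonename)
{
    if (strcmp(zonename, "UTC") == 0)
        return 50;
    if (strcmp(zonename, "Etc/UTC") == 0)
        return 40;
    return 0;
}

/*
 * Hypothetical helper: should "candidate" replace "best" when both zones
 * have the same match score?  Preference decides first; length and
 * alphabetical order are consulted only when the preferences are equal.
 */
static bool
prefer_candidate(const char *candidate, const char *best)
{
    int         namepref;

    assert(candidate != NULL && best != NULL);

    namepref = zone_name_pref(candidate) - zone_name_pref(best);
    if (namepref != 0)
        return namepref > 0;
    if (strlen(candidate) != strlen(best))
        return strlen(candidate) < strlen(best);
    return strcmp(candidate, best) < 0;
}
```

With this ordering, "UTC" wins over "UCT" whichever of the two is scanned first, "UTC" still wins over "Etc/UTC", and names with neutral preference keep the existing shortest-then-alphabetical behaviour.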
