Thread: Re: [pgsql-hackers-win32] Weird new time zone
> >> I thought the issue under question was to find out what the time zone
> >> was.
> >
> > Nope, we already had that. The issue is that the names are not the
> > same as the one used in zic/unix, so there is nothing to match on.
>
> Right. The problem we are actually faced with is to identify
> which of the zic timezones is the best match for the system's
> timezone setting.
> One of the issues is that it's not clear what "best" means...
>
> At the moment I like Oliver Jowett's idea of defining "best"
> as "the one that matches furthest back".

Sounds reasonable to me. As long as a clear warning is put in the log
whenever something is picked that is not a perfect match, so the admin
is directed at the potential problem and can fix it (by setting the GUC
timezone variable).

"Most apps" are probably going to deal with datetimes starting a couple
of years back and going into the future. In these cases it doesn't even
matter. Certainly not all, though.

//Magnus
"Magnus Hagander" <mha@sollentuna.net> writes: >> At the moment I like Oliver Jowett's idea of defining "best" >> as "the one that matches furthest back". > Sounds reasonable to me. As long as a clear warning is put in the log > whenever something is picked that is not a perfect match, Define "perfect match". I do not think we can really tell if we have an exact match or not; the libc timezone API is just too limited to be sure. And on many platforms we can be sure we will never have an exact match, especially if we look at years before 1970. If you want something in the log I'd be inclined to just always make a log entry when we infer a timezone setting. regards, tom lane
"Magnus Hagander" <mha@sollentuna.net> writes: >> Right. The problem we are actually faced with is to identify >> which of the zic timezones is the best match for the system's >> timezone setting. >> One of the issues is that it's not clear what "best" means... >> >> At the moment I like Oliver Jowett's idea of defining "best" >> as "the one that matches furthest back". > Sounds reasonable to me. As long as a clear warning is put in the log > whenever something is picked that is not a perfect match, so the admin > is directed at the potential problem and can fix it (by setting the GUC > timezone variable). I'm not sure that a log entry is needed --- SHOW TIMEZONE will make it perfectly clear what zone was selected. But in any case, I've committed code that implements Oliver's idea. Could folks take another swipe at it and see if it works well in their local zones? Also, it'd still be interesting to see if we could #ifdef out the matching on zone names for Windows and still get reasonable results. regards, tom lane

Tom Lane wrote:
> But in any case, I've committed code that implements Oliver's idea.
> Could folks take another swipe at it and see if it works well in their
> local zones? Also, it'd still be interesting to see if we could #ifdef
> out the matching on zone names for Windows and still get reasonable
> results.

Sadly, it now thinks I live a bit further south than I actually do :)

template1=# show timezone;
      TimeZone
--------------------
 Antarctica/McMurdo
(1 row)

The only timezones that get positive scores during startup are:

DEBUG: TZ "Antarctica/McMurdo" gets max score 2080
DEBUG: TZ "Antarctica/South_Pole" gets max score 2080
DEBUG: TZ "Pacific/Auckland" gets max score 2080
DEBUG: TZ "NZ" gets max score 2080

Either of "NZ" or "Pacific/Auckland" would be correct. From memory it
picked Pacific/Auckland with the slightly older code.

-O

Oliver Jowett wrote:
> The only timezones that get positive scores during startup are:
>
> DEBUG: TZ "Antarctica/McMurdo" gets max score 2080
> DEBUG: TZ "Antarctica/South_Pole" gets max score 2080
> DEBUG: TZ "Pacific/Auckland" gets max score 2080
> DEBUG: TZ "NZ" gets max score 2080
>
> Either of "NZ" or "Pacific/Auckland" would be correct.

Looking in the timezone data files, it appears that those two Antarctica
timezones are identical to the NZ ones back to 1956. The CVS code only
scans back to 1964. Increasing MAX_TEST_TIMES to scan back 50 years
produces this:

> DEBUG: TZ "Antarctica/McMurdo" scores 2534: at -442152000 1955-12-28 12:00:00 std versus 1955-12-29 00:00:00 std
> DEBUG: TZ "Antarctica/South_Pole" scores 2534: at -442152000 1955-12-28 12:00:00 std versus 1955-12-29 00:00:00 std
> DEBUG: TZ "Pacific/Auckland" gets max score 2600
> DEBUG: TZ "NZ" gets max score 2600

and it picks Pacific/Auckland.

Also I'm a bit nervous about that hardcoded 2004 start date for the scan
in pgtz.c -- that will presumably break if the timezone data files are
updated for post-2004 changes without a corresponding change to the scan
code. Would it make sense to scan backwards from the current system time
to a predetermined year?

-O
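
The "matches furthest back" scoring can be illustrated with a minimal Python sketch. It is not the pgtz.c code: zones are modelled as plain functions, the weekly probe spacing is an assumption (it is consistent with the scores above, since 2080 = 52 x 40 and 2600 = 52 x 50), and START, system_zone, and lookalike_zone are hypothetical values chosen so that the toy numbers line up with the DEBUG output.

```python
from typing import Callable, Dict, Tuple

WEEK = 7 * 24 * 3600   # assumed probe spacing: one probe per week

# Toy model: a zone maps a UNIX timestamp to (UTC offset in seconds, is_dst).
Zone = Callable[[int], Tuple[int, bool]]

CUT = -442152000       # instant quoted in the DEBUG output in the thread
START = 1090411200     # hypothetical scan start in mid-2004 (an assumption)


def system_zone(t: int) -> Tuple[int, bool]:
    """Stand-in for the system's zone: always UTC+12, no DST (a simplification)."""
    return (12 * 3600, False)


def lookalike_zone(t: int) -> Tuple[int, bool]:
    """Agrees with system_zone after CUT, but reports UTC+0 at or before CUT."""
    return (12 * 3600, False) if t > CUT else (0, False)


def score_zone(system: Zone, candidate: Zone, start: int, max_probes: int) -> int:
    """Count consecutive weekly probes, walking backwards from start,
    on which the candidate agrees with the system zone."""
    score = 0
    for i in range(max_probes):
        t = start - i * WEEK
        if candidate(t) != system(t):
            break              # first mismatch ends the run
        score += 1
    return score


def rank(system: Zone, candidates: Dict[str, Zone], start: int,
         max_probes: int) -> Dict[str, int]:
    """Score every candidate against the system zone."""
    return {name: score_zone(system, zone, start, max_probes)
            for name, zone in candidates.items()}
```

With max_probes = 52 * 40 the two toy zones tie at the maximum score; with 52 * 50 the scan reaches CUT and the lookalike falls behind, which is the effect described above.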

Oliver Jowett <oliver@opencloud.com> writes:
> Also I'm a bit nervous about that hardcoded 2004 start date for the scan
> in pgtz.c -- that will presumably break if the timezone data files are
> updated for post-2004 changes without a corresponding change to the scan
> code.

Actually, that was intentional. My thought was that if it picks the
right zone now, while we are testing it, it will continue to pick the
right zone in future. Let's suppose we do that, and then Congress
decides to fool with the US' DST laws again in 2009. The zic people
will update their database, we will propagate that upstream change, and
identify_system_timezone will immediately fail on any machine with an
un-updated local timezone database. Now I suppose the owners of such
machines would have good reason to update their libc databases soon
... but the point is that probing future time exposes us to risks from
unforeseeable future changes, and I don't see that it gives any
advantages.

The Antarctica/McMurdo business is more interesting. I am not sure
why it picks McMurdo today when it picked Auckland before, since the
previous code certainly didn't scan backwards far enough to distinguish
those zones either. I would have thought that you'd get the first
exact match with the old code, and if the scan order is consistent
that would be McMurdo. Possibly there's some phase-of-the-moon behavior
involved in the scan order. Have you reinstalled PG without wiping the
installation directories first? If the TZ files are installed by
overwriting an existing tree, I can believe that the live directory
entries would end up in a different physical order each time you do it.

In general though, the zic database has a lot of duplicate and
near-duplicate zone entries, and I'm not sure we can hope to pick one
that the user will think is his local zone when there are several
perfect matches. (This morning I was trying to think of a way of at
least not running the calculations over again for each of several
perfect duplicates, but AFAICS we'd have to rely on noticing multiple
hard links, which would be awfully platform-dependent not to say
fragile...)

regards, tom lane
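
One way to avoid rescoring perfect duplicates without looking at hard links is to group zone files by content. The Python sketch below swaps in that different technique (content hashing) and assumes that duplicate zones are stored as byte-identical files, which the thread does not confirm. The function names and the demo tree are hypothetical.

```python
import hashlib
import os
import tempfile
from collections import defaultdict
from typing import Dict, List


def group_identical_files(root: str) -> List[List[str]]:
    """Group files under root by SHA-256 of their contents.

    Returns sorted groups of zone-style names (relative paths with '/').
    Only one member of each group would need to be scored.
    """
    groups: Dict[str, List[str]] = defaultdict(list)
    for dirpath, _dirs, files in os.walk(root):
        for fname in files:
            path = os.path.join(dirpath, fname)
            with open(path, "rb") as f:
                digest = hashlib.sha256(f.read()).hexdigest()
            rel = os.path.relpath(path, root).replace(os.sep, "/")
            groups[digest].append(rel)
    return sorted(sorted(names) for names in groups.values())


def demo() -> List[List[str]]:
    """Build a throwaway tree with made-up file contents and group it."""
    contents = {
        "NZ": b"toy-nz-rules",
        "Pacific/Auckland": b"toy-nz-rules",        # byte-identical duplicate
        "Antarctica/McMurdo": b"toy-mcmurdo-rules",
    }
    with tempfile.TemporaryDirectory() as root:
        for name, data in contents.items():
            path = os.path.join(root, *name.split("/"))
            os.makedirs(os.path.dirname(path), exist_ok=True)
            with open(path, "wb") as f:
                f.write(data)
        return group_identical_files(root)
```

This costs one read of every file, but it does not depend on how the filesystem reports link counts.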

Tom Lane wrote:
> Oliver Jowett <oliver@opencloud.com> writes:
>
>>Also I'm a bit nervous about that hardcoded 2004 start date for the scan
>>in pgtz.c -- that will presumably break if the timezone data files are
>>updated for post-2004 changes without a corresponding change to the scan
>>code.
>
> Actually, that was intentional. My thought was that if it picks the
> right zone now, while we are testing it, it will continue to pick the
> right zone in future. Let's suppose we do that, and then Congress
> decides to fool with the US' DST laws again in 2009. The zic people
> will update their database, we will propagate that upstream change, and
> identify_system_timezone will immediately fail on any machine with an
> un-updated local timezone database. Now I suppose the owners of such
> machines would have good reason to update their libc databases soon
> ... but the point is that probing future time exposes us to risks from
> unforeseeable future changes, and I don't see that it gives any
> advantages.

It occurs to me that we should be able to automatically find a range of
dates to scan that will distinguish between all non-identical timezones
in our timezone database, and then plug the results into the scan code.
That'd simplify the upgrading of timezone data anyway; otherwise some
guesswork or hand inspection of the new data would be needed to work out
if we need to move the scan start point to be able to distinguish new
timezones.

> The Antarctica/McMurdo business is more interesting. I am not sure
> why it picks McMurdo today when it picked Auckland before, since the
> previous code certainly didn't scan backwards far enough to distinguish
> those zones either. I would have thought that you'd get the first
> exact match with the old code, and if the scan order is consistent
> that would be McMurdo. Possibly there's some phase-of-the-moon behavior
> involved in the scan order. Have you reinstalled PG without wiping the
> installation directories first? If the TZ files are installed by
> overwriting an existing tree, I can believe that the live directory
> entries would end up in a different physical order each time you do it.

The install I tested on was a freshly checked out CVS tree installing
into a clean directory. The older install had been updated and
reinstalled a few times, so it's possible it just happened to find
Auckland first because of a different ordering in the directory.

It'd be nice to have a predictable timezone choice made when there's a
tie. Perhaps we should order on timezone name in this case?

-O
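
The idea of deriving the scan range from the data can be sketched as follows, in Python and with toy zones rather than the real zic files. For every pair of zones that differ anywhere within a search horizon, find the first backward probe at which they differ; the scan then needs to reach the deepest such probe. The weekly spacing, the START value, and the zone behaviours are all assumptions made for illustration.

```python
from itertools import combinations
from typing import Callable, Dict, Optional, Tuple

WEEK = 7 * 24 * 3600   # assumed probe spacing
START = 1090411200     # hypothetical scan start in mid-2004
CUT = -442152000       # instant quoted in the DEBUG output in the thread

Zone = Callable[[int], Tuple[int, bool]]


def first_difference(za: Zone, zb: Zone, start: int, horizon: int) -> Optional[int]:
    """Index of the first backward probe at which two zones disagree,
    or None if they agree on every probe within the horizon."""
    for i in range(horizon):
        t = start - i * WEEK
        if za(t) != zb(t):
            return i
    return None


def probes_needed(zones: Dict[str, Zone], start: int, horizon: int) -> int:
    """Smallest probe count that separates every distinguishable pair."""
    needed = 0
    for (_na, za), (_nb, zb) in combinations(zones.items(), 2):
        i = first_difference(za, zb, start, horizon)
        if i is not None:
            needed = max(needed, i + 1)   # must include the differing probe
    return needed


def nz_like(t: int) -> Tuple[int, bool]:
    return (12 * 3600, False)             # toy: UTC+12, DST ignored


def mcmurdo_like(t: int) -> Tuple[int, bool]:
    return (12 * 3600, False) if t > CUT else (0, False)


TOY_ZONES: Dict[str, Zone] = {
    "NZ": nz_like,
    "Pacific/Auckland": nz_like,          # identical pair, never separable
    "Antarctica/McMurdo": mcmurdo_like,
}
```

Zones that never differ within the horizon contribute nothing, so a name-based tie-break is still needed for true duplicates. Weekly probes can also miss a difference that lasts less than a week.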

Oliver Jowett <oliver@opencloud.com> writes:
> It'd be nice to have a predictable timezone choice made when there's a
> tie. Perhaps we should order on timezone name in this case?

The Antarctica zones would tend to win with such a rule, which is
probably not what we want :-(. But certainly we could do something
with ties beyond "take the first one in scan order".

Would "shortest name wins" help any?

regards, tom lane

Tom Lane wrote:
> Oliver Jowett <oliver@opencloud.com> writes:
>
>>It'd be nice to have a predictable timezone choice made when there's a
>>tie. Perhaps we should order on timezone name in this case?
>
> The Antarctica zones would tend to win with such a rule, which is
> probably not what we want :-(. But certainly we could do something
> with ties beyond "take the first one in scan order".
>
> Would "shortest name wins" help any?

Well, I was thinking about the NZ vs. Pacific/Auckland tie actually. We
already have a way to avoid picking Antarctica -- scan further back.

-O
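
The two tie-break rules floated above can be compared with a short Python sketch. The scores are the ones from the DEBUG output earlier in the thread; the function names are hypothetical, and neither rule is claimed to be what was eventually committed.

```python
from typing import Dict


def pick_alphabetical(scores: Dict[str, int]) -> str:
    """Highest score wins; ties are broken by zone name order."""
    return min(scores, key=lambda name: (-scores[name], name))


def pick_shortest_name(scores: Dict[str, int]) -> str:
    """Highest score wins; ties go to the shortest name, then name order."""
    return min(scores, key=lambda name: (-scores[name], len(name), name))


# Scores reported with the 40-year scan: a four-way tie.
SCAN_40_YEARS = {
    "Antarctica/McMurdo": 2080,
    "Antarctica/South_Pole": 2080,
    "Pacific/Auckland": 2080,
    "NZ": 2080,
}

# Scores reported with the 50-year scan: only the NZ pair still ties.
SCAN_50_YEARS = {
    "Antarctica/McMurdo": 2534,
    "Antarctica/South_Pole": 2534,
    "Pacific/Auckland": 2600,
    "NZ": 2600,
}
```

On the four-way tie, name order selects Antarctica/McMurdo while shortest-name selects NZ. Once the scan goes back far enough to drop the Antarctica zones, both rules select NZ over Pacific/Auckland, and either rule makes the result independent of directory scan order.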