Thread: Re: [pgsql-hackers-win32] Weird new time zone

Re: [pgsql-hackers-win32] Weird new time zone

From
"Magnus Hagander"
Date:
> >> I thought the issue under question was to find out what the time zone
> >> was.
>
> > Nope, we already had that. The issue is that the names are not the
> > same as the ones used in zic/unix, so there is nothing to match on.
>
> Right.  The problem we are actually faced with is to identify
> which of the zic timezones is the best match for the system's
> timezone setting.
> One of the issues is that it's not clear what "best" means...
>
> At the moment I like Oliver Jowett's idea of defining "best"
> as "the one that matches furthest back".

Sounds reasonable to me. As long as a clear warning is put in the log
whenever something is picked that is not a perfect match, so the admin
is directed at the potential problem and can fix it (by setting the GUC
timezone variable).

"Most apps" are probably going to deal with datetimes starting a couple
of years back and going into the future. In these cases an imperfect
match further back doesn't even matter. That certainly won't be true of
every app, though.

//Magnus



Re: [pgsql-hackers-win32] Weird new time zone

From
Tom Lane
Date:
"Magnus Hagander" <mha@sollentuna.net> writes:
>> At the moment I like Oliver Jowett's idea of defining "best" 
>> as "the one that matches furthest back".

> Sounds reasonable to me. As long as a clear warning is put in the log
> whenever something is picked that is not a perfect match,

Define "perfect match".  I do not think we can really tell if we have an
exact match or not; the libc timezone API is just too limited to be
sure.  And on many platforms we can be sure we will never have an exact
match, especially if we look at years before 1970.

If you want something in the log I'd be inclined to just always make a
log entry when we infer a timezone setting.
        regards, tom lane


Re: [pgsql-hackers-win32] Weird new time zone

From
Tom Lane
Date:
"Magnus Hagander" <mha@sollentuna.net> writes:
>> Right.  The problem we are actually faced with is to identify 
>> which of the zic timezones is the best match for the system's 
>> timezone setting.
>> One of the issues is that it's not clear what "best" means...
>> 
>> At the moment I like Oliver Jowett's idea of defining "best" 
>> as "the one that matches furthest back".

> Sounds reasonable to me. As long as a clear warning is put in the log
> whenever something is picked that is not a perfect match, so the admin
> is directed at the potential problem and can fix it (by setting the GUC
> timezone variable).

I'm not sure that a log entry is needed --- SHOW TIMEZONE will make it
perfectly clear what zone was selected.

But in any case, I've committed code that implements Oliver's idea.
Could folks take another swipe at it and see if it works well in their
local zones?  Also, it'd still be interesting to see if we could #ifdef
out the matching on zone names for Windows and still get reasonable
results.
        regards, tom lane


Re: [pgsql-hackers-win32] Weird new time zone

From
Oliver Jowett
Date:
Tom Lane wrote:

> But in any case, I've committed code that implements Oliver's idea.
> Could folks take another swipe at it and see if it works well in their
> local zones?  Also, it'd still be interesting to see if we could #ifdef
> out the matching on zone names for Windows and still get reasonable
> results.

Sadly, it now thinks I live a bit further south than I actually do :)

template1=# show timezone;
      TimeZone
--------------------
 Antarctica/McMurdo
(1 row)

The only timezones that get positive scores during startup are:

DEBUG:  TZ "Antarctica/McMurdo" gets max score 2080
DEBUG:  TZ "Antarctica/South_Pole" gets max score 2080
DEBUG:  TZ "Pacific/Auckland" gets max score 2080
DEBUG:  TZ "NZ" gets max score 2080

Either of "NZ" or "Pacific/Auckland" would be correct.
From memory it picked Pacific/Auckland with the slightly older code.

-O


Re: [pgsql-hackers-win32] Weird new time zone

From
Oliver Jowett
Date:
Oliver Jowett wrote:

> The only timezones that get positive scores during startup are:
> 
> DEBUG:  TZ "Antarctica/McMurdo" gets max score 2080
> DEBUG:  TZ "Antarctica/South_Pole" gets max score 2080
> DEBUG:  TZ "Pacific/Auckland" gets max score 2080
> DEBUG:  TZ "NZ" gets max score 2080
> 
> Either of "NZ" or "Pacific/Auckland" would be correct.

Looking in the timezone data files, it appears that those two Antarctica 
timezones are identical to the NZ ones back to 1956. The CVS code only 
scans back to 1964.

Increasing MAX_TEST_TIMES to scan back 50 years produces this:

> DEBUG:  TZ "Antarctica/McMurdo" scores 2534: at -442152000 1955-12-28 12:00:00 std versus 1955-12-29 00:00:00 std
> DEBUG:  TZ "Antarctica/South_Pole" scores 2534: at -442152000 1955-12-28 12:00:00 std versus 1955-12-29 00:00:00 std
> DEBUG:  TZ "Pacific/Auckland" gets max score 2600
> DEBUG:  TZ "NZ" gets max score 2600

and it picks Pacific/Auckland.
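
As a rough illustration of the "matches furthest back" scoring at work here, a minimal Python sketch follows. It is not the code in pgtz.c: the `Zone` model, `score_zone`, and the weekly probe spacing are assumptions made for the example (the reported scores, 2080 over 40 years and 2600 over 50, are at least consistent with one probe per week). A candidate's score is the number of consecutive probes, walking backwards from a fixed start, at which it agrees with the system zone.

```python
from typing import Callable, Tuple

WEEK = 7 * 24 * 3600  # assumed probe spacing, in seconds

# Hypothetical model: a zone maps a UTC timestamp to (utc_offset_seconds, is_dst).
Zone = Callable[[int], Tuple[int, bool]]


def score_zone(system: Zone, candidate: Zone, start: int, max_tests: int) -> int:
    """Count consecutive probes, walking back from start, where both zones agree."""
    score = 0
    for i in range(max_tests):
        t = start - i * WEEK
        if candidate(t) != system(t):
            break  # the first mismatch ends the run
        score += 1
    return score


# Toy data: two zones that agree for the 11 most recent probes and differ before that.
START = 1000 * WEEK
CUTOFF = START - 10 * WEEK


def always_plus_12(t: int) -> Tuple[int, bool]:
    return (12 * 3600, False)


def plus_12_since_cutoff(t: int) -> Tuple[int, bool]:
    return (12 * 3600, False) if t >= CUTOFF else (0, False)


for depth in (5, 20):
    print(depth,
          score_zone(always_plus_12, always_plus_12, START, depth),
          score_zone(always_plus_12, plus_12_since_cutoff, START, depth))
```

With a depth of 5 both toy candidates tie at the maximum score. With a depth of 20 the second one drops to 11, which mirrors how the deeper scan separated the Antarctica zones from the NZ ones.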

Also I'm a bit nervous about that hardcoded 2004 start date for the scan 
in pgtz.c -- that will presumably break if the timezone data files are 
updated for post-2004 changes without a corresponding change to the scan 
code. Would it make sense to scan backwards from the current system time 
to a predetermined year?

-O


Re: [pgsql-hackers-win32] Weird new time zone

From
Tom Lane
Date:
Oliver Jowett <oliver@opencloud.com> writes:
> Also I'm a bit nervous about that hardcoded 2004 start date for the scan 
> in pgtz.c -- that will presumably break if the timezone data files are 
> updated for post-2004 changes without a corresponding change to the scan 
> code.

Actually, that was intentional.  My thought was that if it picks the
right zone now, while we are testing it, it will continue to pick the
right zone in future.  Suppose instead that we scan backwards from the
current system time, and then Congress decides to fool with the US' DST
laws again in 2009.  The zic people
will update their database, we will propagate that upstream change, and
identify_system_timezone will immediately fail on any machine with an
un-updated local timezone database.  Now I suppose the owners of such
machines would have good reason to update their libc databases soon
... but the point is that probing future time exposes us to risks from
unforeseeable future changes, and I don't see that it gives any
advantages.

The Antarctica/McMurdo business is more interesting.  I am not sure
why it picks McMurdo today when it picked Auckland before, since the
previous code certainly didn't scan backwards far enough to distinguish
those zones either.  I would have thought that you'd get the first
exact match with the old code, and if the scan order is consistent
that would be McMurdo.  Possibly there's some phase-of-the-moon behavior
involved in the scan order.  Have you reinstalled PG without wiping the
installation directories first?  If the TZ files are installed by
overwriting an existing tree, I can believe that the live directory
entries would end up in a different physical order each time you do it.

In general though, the zic database has a lot of duplicate and
near-duplicate zone entries, and I'm not sure we can hope to pick one
that the user will think is his local zone when there are several
perfect matches.  (This morning I was trying to think of a way of
at least not running the calculations over again for each of several
perfect duplicates, but AFAICS we'd have to rely on noticing multiple
hard links, which would be awfully platform-dependent not to say
fragile...)
        regards, tom lane
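
The hard-link idea mentioned in passing above could look roughly like the following Python sketch, which groups paths by `(st_dev, st_ino)`. `group_hard_links` and its injectable `stat` parameter are hypothetical names for this example, not anything in PostgreSQL. It only catches duplicates that share an inode; byte-identical copies are missed, and inode semantics vary by platform and filesystem, which is the fragility the message describes.

```python
import os
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple


def group_hard_links(paths: Iterable[str],
                     stat: Callable[[str], os.stat_result] = os.stat) -> List[List[str]]:
    """Group paths that refer to the same inode on the same device."""
    groups: Dict[Tuple[int, int], List[str]] = defaultdict(list)
    for path in paths:
        st = stat(path)  # os.stat follows symlinks, so symlinked aliases group too
        groups[(st.st_dev, st.st_ino)].append(path)
    return list(groups.values())
```

Each resulting group would need its score computed only once. Comparing file contents instead would be more portable, at the cost of reading every file.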


Re: [pgsql-hackers-win32] Weird new time zone

From
Oliver Jowett
Date:
Tom Lane wrote:
> Oliver Jowett <oliver@opencloud.com> writes:
> 
>>Also I'm a bit nervous about that hardcoded 2004 start date for the scan 
>>in pgtz.c -- that will presumably break if the timezone data files are 
>>updated for post-2004 changes without a corresponding change to the scan 
>>code.
> 
> 
> Actually, that was intentional.  My thought was that if it picks the
> right zone now, while we are testing it, it will continue to pick the
> right zone in future.  Suppose instead that we scan backwards from the
> current system time, and then Congress decides to fool with the US' DST
> laws again in 2009.  The zic people
> will update their database, we will propagate that upstream change, and
> identify_system_timezone will immediately fail on any machine with an
> un-updated local timezone database.  Now I suppose the owners of such
> machines would have good reason to update their libc databases soon
> ... but the point is that probing future time exposes us to risks from
> unforeseeable future changes, and I don't see that it gives any
> advantages.

It occurs to me that we should be able to automatically find a range of 
dates to scan that will distinguish between all non-identical timezones 
in our timezone database, and then plug the results into the scan code.

That'd simplify the upgrading of timezone data anyway; otherwise some 
guesswork or hand inspection of the new data would be needed to work out 
whether we need to move the scan start point to be able to distinguish new 
timezones.
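
One way to read this suggestion, sketched in Python under stated assumptions: compare every pair of zones at each probe time and record how far back the scan must go before each non-identical pair has been told apart. `required_depth`, the `Zone` model, and the weekly probe spacing are all hypothetical, chosen for the example.

```python
from itertools import combinations
from typing import Callable, Dict, Tuple

WEEK = 7 * 24 * 3600  # assumed probe spacing, in seconds

# Hypothetical model: a zone maps a UTC timestamp to (utc_offset_seconds, is_dst).
Zone = Callable[[int], Tuple[int, bool]]


def required_depth(zones: Dict[str, Zone], start: int, horizon: int) -> int:
    """Smallest number of probes, walking back from start, that separates every
    pair of zones that differs anywhere within the horizon."""
    depth = 0
    for a, b in combinations(zones.values(), 2):
        for i in range(horizon):
            t = start - i * WEEK
            if a(t) != b(t):
                depth = max(depth, i + 1)  # probe i is the first to tell them apart
                break
        # pairs that never differ within the horizon are treated as duplicates
    return depth


START = 1000 * WEEK


def fixed(offset: int) -> Zone:
    return lambda t: (offset, False)


def changed(offset: int, old_offset: int, weeks_ago: int) -> Zone:
    cutoff = START - weeks_ago * WEEK
    return lambda t: (offset, False) if t >= cutoff else (old_offset, False)


zones = {
    "A": fixed(12 * 3600),
    "A_alias": fixed(12 * 3600),        # exact duplicate of A
    "B": changed(12 * 3600, 0, 10),     # matches A for the 11 most recent probes
    "C": fixed(3600),
}
print(required_depth(zones, START, 50))
```

The pairwise comparison is quadratic in the number of zones, which seems tolerable for a step run once per timezone data update rather than at server startup.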

> The Antarctica/McMurdo business is more interesting.  I am not sure
> why it picks McMurdo today when it picked Auckland before, since the
> previous code certainly didn't scan backwards far enough to distinguish
> those zones either.  I would have thought that you'd get the first
> exact match with the old code, and if the scan order is consistent
> that would be McMurdo.  Possibly there's some phase-of-the-moon behavior
> involved in the scan order.  Have you reinstalled PG without wiping the
> installation directories first?  If the TZ files are installed by
> overwriting an existing tree, I can believe that the live directory
> entries would end up in a different physical order each time you do it.

The install I tested on was a freshly checked out CVS tree installing 
into a clean directory.

The older install had been updated and reinstalled a few times, so it's 
possible it just happened to find Auckland first because of a different 
ordering in the directory.

It'd be nice to have a predictable timezone choice made when there's a 
tie. Perhaps we should order on timezone name in this case?

-O


Re: [pgsql-hackers-win32] Weird new time zone

From
Tom Lane
Date:
Oliver Jowett <oliver@opencloud.com> writes:
> It'd be nice to have a predictable timezone choice made when there's a 
> tie. Perhaps we should order on timezone name in this case?

The Antarctica zones would tend to win with such a rule, which is
probably not what we want :-(.  But certainly we could do something
with ties beyond "take the first one in scan order".

Would "shortest name wins" help any?
        regards, tom lane
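
For illustration, the two tie-break rules floated so far can be compared on the four zones from Oliver's report. This is only a sketch: `pick_zone` and the "shortest name, then alphabetical" ordering are assumptions of the example, not a description of committed code.

```python
from typing import Dict


def pick_zone(scores: Dict[str, int]) -> str:
    """Highest score wins; ties go to the shortest name, then alphabetical order."""
    return min(scores, key=lambda name: (-scores[name], len(name), name))


# Scores taken from the DEBUG lines earlier in the thread.
scores = {
    "Antarctica/McMurdo": 2080,
    "Antarctica/South_Pole": 2080,
    "Pacific/Auckland": 2080,
    "NZ": 2080,
}

print(pick_zone(scores))  # shortest-name tie-break
print(min(scores))        # plain alphabetical tie-break (valid here since all scores tie)
```

On this data the shortest-name rule selects NZ, while plain alphabetical ordering selects Antarctica/McMurdo.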


Re: [pgsql-hackers-win32] Weird new time zone

From
Oliver Jowett
Date:
Tom Lane wrote:
> Oliver Jowett <oliver@opencloud.com> writes:
> 
>>It'd be nice to have a predictable timezone choice made when there's a 
>>tie. Perhaps we should order on timezone name in this case?
> 
> 
> The Antarctica zones would tend to win with such a rule, which is
> probably not what we want :-(.  But certainly we could do something
> with ties beyond "take the first one in scan order".
> 
> Would "shortest name wins" help any?

Well, I was thinking about the NZ vs. Pacific/Auckland tie actually. We 
already have a way to avoid picking Antarctica -- scan further back.

-O