Thread: Anybody care about having the verbose form of the tzdata files?

From
Tom Lane
Date:
Traditionally, src/timezone/data/ contains the source files of the
IANA timezone database.  That currently amounts to just under 700KB,
and it's growing all the time, mostly because they keep what amounts
to their entire commit log in the comments :-(

IANA have recently started distributing an abbreviated version of
that data, in a single comment-free file "tzdata.zi" that looks like
this:

# This zic input file is in the public domain.
R A 1916 o - Jun 14 23s 1 S
R A 1916 1919 - O Sun>=1 23s 0 -
R A 1917 o - Mar 24 23s 1 S
R A 1918 o - Mar 9 23s 1 S
R A 1919 o - Mar 1 23s 1 S
R A 1920 o - F 14 23s 1 S
R A 1920 o - O 23 23s 0 -
R A 1921 o - Mar 14 23s 1 S
...
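
For readers unfamiliar with the compact format, here is a minimal Python
sketch of how one of the "R" (Rule) lines above could be split into named
fields. The field order follows zic's Rule line layout (NAME, FROM, TO, a
reserved field, IN, ON, AT, SAVE, LETTER). The helper names parse_rule and
expand_month are hypothetical, and only the abbreviations that appear in
the sample are handled.

```python
# Illustrative sketch only; parse_rule and expand_month are hypothetical
# helpers, not part of PostgreSQL or the IANA tz code.
MONTHS = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]


def expand_month(abbrev):
    """Expand a month abbreviation such as "O" or "Mar" by unique prefix."""
    matches = [m for m in MONTHS if m.lower().startswith(abbrev.lower())]
    if len(matches) != 1:
        raise ValueError(f"unknown or ambiguous month: {abbrev!r}")
    return matches[0]


def parse_rule(line):
    """Split one compact Rule line into a dict of named fields."""
    fields = line.split()
    if len(fields) != 10 or fields[0] != "R":
        raise ValueError(f"not a compact Rule line: {line!r}")
    _, name, year_from, year_to, _reserved, month, on, at, save, letter = fields
    first_year = int(year_from)
    # "o" abbreviates "only": the rule applies to the FROM year alone.
    # Other keywords (for example "max") are out of scope for this sketch.
    if "only".startswith(year_to.lower()):
        last_year = first_year
    else:
        last_year = int(year_to)
    return {
        "name": name,
        "from": first_year,
        "to": last_year,
        "in": expand_month(month),
        "on": on,
        "at": at,
        "save": save,
        "letter": letter,
    }


if __name__ == "__main__":
    print(parse_rule("R A 1920 o - F 14 23s 1 S"))
```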

The current version is just a shade over 100KB, and its growth
rate can be projected to be noticeably slower than that of the
master source files.  It's alleged to compress much better than
the master files too, though I've not tried to measure that.
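
One rough way to check the compression claim would be to gzip both forms
and compare the results. The sketch below is only an assumption about how
one might measure it: the function names are hypothetical, and the paths in
the usage note are placeholders for a source tree and a downloaded
tzdata.zi.

```python
import gzip
import io
import tarfile
from pathlib import Path


def gzipped_size_of_file(path):
    """Return the gzip-compressed size, in bytes, of a single file."""
    return len(gzip.compress(Path(path).read_bytes()))


def gzipped_size_of_tree(root):
    """Return the size, in bytes, of a gzip-compressed tar of a directory."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        tar.add(str(root), arcname=".")
    # The archive is complete once the with-block has closed it.
    return len(buf.getvalue())
```

For example, comparing gzipped_size_of_tree("src/timezone/data") with
gzipped_size_of_file("tzdata.zi") would give the two numbers to set side by
side. The tar wrapper adds a little per-file overhead, so this is an
approximation of what ends up in a tarball, not an exact figure.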

So I'm wondering if we should replace src/timezone/data/* with
tzdata.zi, in the name of reducing the size of our tarballs.
It's substantially harder to see what changes they made from
one version to the next by comparing .zi files, but I'm not
sure if anyone cares about that.

Anybody here actually care about reading the zone data files?
        regards, tom lane


Re: Anybody care about having the verbose form of the tzdata files?

From
Daniel Gustafsson
Date:
> On 20 Nov 2017, at 21:38, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> So I'm wondering if we should replace src/timezone/data/* with
> tzdata.zi, in the name of reducing the size of our tarballs.

+1

> Anybody here actually care about reading the zone data files?

I doubt there is anyone who cares about that who isn’t already consuming the
upstream data.

cheers ./daniel

Re: Anybody care about having the verbose form of the tzdata files?

From
Michael Paquier
Date:
On Tue, Nov 21, 2017 at 6:28 PM, Daniel Gustafsson <daniel@yesql.se> wrote:
>> On 20 Nov 2017, at 21:38, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Anybody here actually care about reading the zone data files?
>
> I doubt there is anyone who cares about that who isn’t already consuming the
> upstream data.

Perhaps I do. If this set of files gets removed and replaced by the zi
file, will it still be easy to know which files are being removed
during a minor upgrade? When doing minor upgrades with an MSI
installer (Windows, yeah!), I need to keep track of files that get
deleted, or the minor upgrade simply fails. The tweak that I have is
to list them and recreate them as empty. The thing is ugly as hell,
but I need to be able to easily track which files are being removed.
As far as I can tell, taking the rather recent example of Riyadh87
in commit e04641f4, src/timezone/data makes it easy to keep track of
removed files. If it goes away, I am pretty convinced that this
tracking gets more complicated.
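
The "list them and recreate them as empty" workaround could look roughly
like the Python sketch below. It is only an illustration: the function name
and its arguments are hypothetical, and it says nothing about how the
actual installer tooling is written.

```python
from pathlib import Path


def create_empty_placeholders(install_root, removed_relpaths):
    """Create an empty file for each removed path that no longer exists.

    install_root is the directory the installer lays files into, and
    removed_relpaths is a list of paths relative to it. Both are
    assumptions made for this sketch.
    """
    root = Path(install_root)
    created = []
    for rel in removed_relpaths:
        target = root / rel
        if target.exists():
            continue  # still shipped, nothing to do
        target.parent.mkdir(parents=True, exist_ok=True)
        target.touch()
        created.append(target)
    return created
```
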
--
Michael


Re: Anybody care about having the verbose form of the tzdata files?

From
Tom Lane
Date:
Michael Paquier <michael.paquier@gmail.com> writes:
> On Tue, Nov 21, 2017 at 6:28 PM, Daniel Gustafsson <daniel@yesql.se> wrote:
>> On 20 Nov 2017, at 21:38, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Anybody here actually care about reading the zone data files?

>> I doubt there is anyone who cares about that who isn’t already consuming the
>> upstream data.

> Perhaps I do. If this set of files gets removed and replaced by the zi
> file, will it still be easy to know which files are being removed
> during a minor upgrade? When doing minor upgrades with an MSI
> installer (Windows, yeah!), I need to keep track of files that get
> deleted, or the minor upgrade simply fails. The tweak that I have is
> to list them and recreate them as empty. The thing is ugly as hell,
> but I need to be able to easily track which files are being removed.
> As far as I can tell, taking the rather recent example of Riyadh87
> in commit e04641f4, src/timezone/data makes it easy to keep track of
> removed files. If it goes away, I am pretty convinced that this
> tracking gets more complicated.

I'm a bit confused.  The files under src/timezone/data/ don't correspond
to individual installed zone data files; most of them describe a lot of
zones.  (Riyadh87 and friends were outliers.)  Seems to me that if you
care about the installed file list, much the easiest way is to run
"make install" and then look to see what's under share/timezone/.
That wouldn't change if we use the abbreviated form of the zic input
data.
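
As a sketch of that approach, one could install the old and new releases
into two separate prefixes and compare the file lists. The Python below is
a minimal illustration under that assumption; the function names and the
example directory names are hypothetical.

```python
from pathlib import Path


def installed_files(root):
    """Return the set of file paths under root, relative to root."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix()
        for p in root.rglob("*")
        if p.is_file()
    }


def removed_files(old_root, new_root):
    """Return the sorted files present under old_root but not new_root."""
    return sorted(installed_files(old_root) - installed_files(new_root))
```

Calling removed_files("old-install/share/timezone",
"new-install/share/timezone") would then list the zone files that
disappeared between the two installs, regardless of which form of the zic
input data was used to build them.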

Now, personally, I've long diff'd the old and new timezone/data/ files
in the process of writing the commit message for a tzdata update.
I'd have to change that process --- but it was always a pretty tedious and
obsessive-compulsive way to do it anyway, because most of the diffs are
comments.  I'd probably just start relying more fully on the IANA
announcement emails, like this one:

http://mm.icann.org/pipermail/tz-announce/2017-October/000047.html

As far as I've seen, they are reliably good about summarizing everything
you need to know about an update.  They definitely always mention
additions and removals of zones.
        regards, tom lane


Re: Anybody care about having the verbose form of the tzdata files?

From
Michael Paquier
Date:
On Wed, Nov 22, 2017 at 9:34 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Michael Paquier <michael.paquier@gmail.com> writes:
>> On Tue, Nov 21, 2017 at 6:28 PM, Daniel Gustafsson <daniel@yesql.se> wrote:
>>> On 20 Nov 2017, at 21:38, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>>> Anybody here actually care about reading the zone data files?
>> Perhaps I do. If this set of files gets removed and replaced by the zi
>> file, will it still be easy to know which files are being removed
>> during a minor upgrade? When doing minor upgrades with an MSI
>> installer (Windows, yeah!), I need to keep track of files that get
>> deleted, or the minor upgrade simply fails. The tweak that I have is
>> to list them and recreate them as empty. The thing is ugly as hell,
>> but I need to be able to easily track which files are being removed.
>> As far as I can tell, taking the rather recent example of Riyadh87
>> in commit e04641f4, src/timezone/data makes it easy to keep track of
>> removed files. If it goes away, I am pretty convinced that this
>> tracking gets more complicated.
>
> I'm a bit confused.  The files under src/timezone/data/ don't correspond
> to individual installed zone data files; most of them describe a lot of
> zones.  (Riyadh87 and friends were outliers.)  Seems to me that if you
> care about the installed file list, much the easiest way is to run
> "make install" and then look to see what's under share/timezone/.
> That wouldn't change if we use the abbreviated form of the zic input
> data.

Yeah. That's basically what I do when I have a doubt, when I see an
automated minor upgrade failing, or when I get a complaint. But the
process is a hassle, and I can get by just fine if I have an easy
reference for what was removed.

> Now, personally, I've long diff'd the old and new timezone/data/ files
> in the process of writing the commit message for a tzdata update.
> I'd have to change that process --- but it was always a pretty tedious and
> obsessive-compulsive way to do it anyway, because most of the diffs are
> comments.  I'd probably just start relying more fully on the IANA
> announcement emails, like this one:
>
> http://mm.icann.org/pipermail/tz-announce/2017-October/000047.html
>
> As far as I've seen, they are reliably good about summarizing everything
> you need to know about an update.  They definitely always mention
> additions and removals of zones.

If you add a reference to those upstream announcements in your commit
message, that would be fine for me as well. I don't tend to follow
those folks closely (I really should, I guess).
-- 
Michael