Thread: Proposal for updating src/timezone
There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies and the other way is to make it so complicated that there are no obvious deficiencies. — C.A.R. Hoare
John Cochran <j69cochran@gmail.com> writes: > My proposal is the have the following directory structure > .../src/timezone - Would have only PostgreSQL specific code > .../src/timezone/tznames - would be unaltered from current (optionally this > could be removed) > .../src/timezone/iana - would contain unaltered code and data from IANA > The .../src/timezone/data directory would be removed. I am not in favor of doing anything to the data/ directory. If we can have unaltered tzcode source files in their own subdirectory, that'd be great, but I see no advantage to smashing the IANA code and data files together. Furthermore, doing so would break the existing ability to just cherry-pick tzdata update commits into back branches. Part of my thought here is that we do generally just dump tzdata updates into our tree without much thought, but I don't think that will ever be the case for tzcode updates; we'll want to look at those with more care, and probably we'll not back-patch them. > 1. I would have liked to recommend 2 sub-directories underneath > .../src/timezone/iana named "code" and "data", but unfortunately have to > suggest untarring both the code and data files directly into the iana > directory. The reason for this is that the IANA code does not compile using > the IANA Makefile unless the script yearistype.sh is present and that > script is currently present in the data tar file, not the code tar file. I have exactly zero expectation of using their Makefile, so this is not a concern. > 2. Depositing the IANA code unaltered would also deposit some "junk" files > into the directory. The files would mainly consist of man pages and html > files containing documentation for the timezone code. The extra files would > consume approximately 500 kilobytes above what's actually needed, but > otherwise wouldn't have any adverse effects. We're not doing that either, IMO. Why should we invest 500K of tarball space on that? regards, tom lane
John Cochran <j69cochran@gmail.com> writes:
> My proposal is the have the following directory structure
...> 1. I would have liked to recommend 2 sub-directories underneath
I have exactly zero expectation of using their Makefile, so this is not a concern.
> 2. Depositing the IANA code unaltered would also deposit some "junk" files
We're not doing that either, IMO. Why should we invest 500K of tarball
space on that?
There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies and the other way is to make it so complicated that there are no obvious deficiencies. — C.A.R. Hoare
John Cochran <j69cochran@gmail.com> writes: > Did a diff between the 2010c version of localtime.c and the postgres > version and saw a lot more differences than what could have been expected > from simple reformatting and adaptation. So I installed gitk and took a > look at the change history of localtime.c.... > Well, looking at the results, it pretty much put paid to the concept of > ever importing the IANA code unaltered into the postgres tree. In fact, it > looks like the original version of localtime.c was based upon one of the > 2004a thru 2004c versions of the IANA code and since then has been > independently maintained. That's certainly true, but I'm not sure that we should give up all that easily on getting closer to the IANA code again. For one thing, some of the thrashing had to do with the fact that we went over to 64-bit timestamps sooner than the IANA crew did. I'm assuming that they insist on 64-bit arithmetic now; we do too, so for instance some of the typedef hacking might no longer be necessary. AFAICT from a quick scan of the git logs, the main thing we've done to the API that's not in IANA is invent pg_next_dst_boundary(). Perhaps that could be pushed into a separate source file. Even if we only got to a point where the sources were the same except for a few narrow patches, it would make tracking their updates much easier. regards, tom lane
John Cochran <j69cochran@gmail.com> writes:That's certainly true, but I'm not sure that we should give up all that
> Did a diff between the 2010c version of localtime.c and the postgres
> version and saw a lot more differences than what could have been expected
> from simple reformatting and adaptation. So I installed gitk and took a
> look at the change history of localtime.c....
> Well, looking at the results, it pretty much put paid to the concept of
> ever importing the IANA code unaltered into the postgres tree. In fact, it
> looks like the original version of localtime.c was based upon one of the
> 2004a thru 2004c versions of the IANA code and since then has been
> independently maintained.
easily on getting closer to the IANA code again. For one thing, some of
the thrashing had to do with the fact that we went over to 64-bit
timestamps sooner than the IANA crew did. I'm assuming that they insist
on 64-bit arithmetic now; we do too, so for instance some of the typedef
hacking might no longer be necessary.
AFAICT from a quick scan of the git logs, the main thing we've done to the
API that's not in IANA is invent pg_next_dst_boundary(). Perhaps that
could be pushed into a separate source file.
Even if we only got to a point where the sources were the same except for
a few narrow patches, it would make tracking their updates much easier.
regards, tom lane
There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies and the other way is to make it so complicated that there are no obvious deficiencies. — C.A.R. Hoare
Hi, On Sat, Jul 19, 2014 at 09:28:25AM -0400, John Cochran wrote: > Agreed. Right now, I'm seeing about updating zic.c to match the IANA code > combined with the modifications that postgres did to it. So far, it doesn't > look like many functional changes have been done, but due to the use of > pgindent, there's a LOT of cosmetic changes that add one heck of a lot of > noise in determining the actual differences. Maybe if you pgindent the IANA code as well, you can more easily diff the actual changes between the two, did you try that? Michael
Maybe if you pgindent the IANA code as well, you can more easily diff
the actual changes between the two, did you try that?
Michael
Unfortunately, pgindent doesn't work well with the IANA code as evident by some previous checkins.
There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies and the other way is to make it so complicated that there are no obvious deficiencies. — C.A.R. Hoare
<div class="moz-cite-prefix">On 20/07/14 06:30, John Cochran wrote:<br /></div><blockquote cite="mid:CAGQU3n49T4W2hKXGnzS-vJNb1xuHx3xNzBcjk=kpmwQdyMDTzA@mail.gmail.com"type="cite"><div dir="ltr"><br /><div class="gmail_extra"><br/><br /><div class="gmail_quote">On Sat, Jul 19, 2014 at 11:58 AM, Michael Banck <span dir="ltr"><<ahref="mailto:mbanck@gmx.net" moz-do-not-send="true" target="_blank">mbanck@gmx.net</a>></span> wrote:</div><divclass="gmail_quote">SNIP <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px#ccc solid;padding-left:1ex"> Maybe if you pgindent the IANA code as well, you can more easily diff<br/> the actual changes between the two, did you try that?<br /><span class="HOEnZb"><font color="#888888"><br /><br/> Michael<br /></font></span></blockquote></div><br /> Unfortunately, pgindent doesn't work well with the IANA codeas evident by some previous checkins.</div><div class="gmail_extra"><br /></div><div class="gmail_extra">To quote a checkindone 2004-05-21 16:59:10 by Tom Lane</div><div class="gmail_extra">pgindent did a pretty awful job on the timezonecode, particularly with respect to doubly-starred comment blocks. Do some manual cleanup.<br /></div><br /></div></blockquote>[...]<br /><br /> For simply noting functional differences: could not make copies of the sources filesfor both the pg & IANA code, and remove ALL indentation prior to to a a diff?<br /><br /><br /> Cheers,<br /> Gavin<br/>
John Cochran <j69cochran@gmail.com> writes: > As it is, I've finished checking the differences between the postgres and > IANA code for zic.c after editing both to eliminate non-functional style > differences such as indentation, function prototypes, comparing strchr > results against NULL or 0, etc. It looks like the only differences from the > stock code is support for the -P option implemented by Tom Lane and the > substitution of the postgres code for getopt instead of the unix default as > well as a few minor changes in some defaults and minor modifications to > deal with Windows. Overall, rather small and trivial. > Additionally, I discovered a little piece of low hanging fruit. The > Makefile for the timezone code links together zic.o ialloc.o scheck.o and > localtime.o in order to make zic. Additionally, zic.c has a stub > implementation of pg_open_tzfile() in order to resolve a linkage issue with > the localtime.o module for zic. > Well, as it turns out, localtime.o doesn't supply ANY symbols that zic > needs and therefore can be omitted entirely from the list of object files > comprising zic. Which in turn means that the stub implementation of > pg_open_tzfile can be removed from the postgres version of zic.c. I'm not > bothering to submit a patch involving this since that patch will be quite > short lived given my objective to bring the code up to date with the 2014e > version of the IANA code. But I am submitting a bug report to IANA on the > Makefile since the unneeded linkage with localtime.o is still in the 2014e > code on their site. John, have you made any further progress on this since July? The urgency of updating our timezone code has risen quite a bit for me, because while testing an update of the data files to tzdata2014h I became aware that the "-P" option is failing to print a noticeable number of zone abbreviations that clearly exist in the data files. Since the -P hack itself is so simple, it's hard to come to any other conclusion than that the rest of the timezone code is buggy --- presumably because it's four years behind what IANA is targeting with the data files. I spent a little bit of time looking at just applying the diffs between tzcode2010c and tzcode2014h to our code, but the diffs are large enough that that seems both tedious and quite error-prone. I think we'd be better advised to push forward with the idea of trying to absorb more-or-less-unmodified versions of the timezone code files. (I'm particularly disillusioned with the idea that we'll keep on pgindent'ing that code --- the effort-to-reward ratio even to keep the comments readable just ain't very good.) regards, tom lane
I wrote: > The urgency of updating our timezone code has risen quite a bit for me, > because while testing an update of the data files to tzdata2014h I became > aware that the "-P" option is failing to print a noticeable number of > zone abbreviations that clearly exist in the data files. Since the -P > hack itself is so simple, it's hard to come to any other conclusion than > that the rest of the timezone code is buggy --- presumably because it's > four years behind what IANA is targeting with the data files. Ah, scratch that. On further investigation, it turns out that my code for "-P" is in fact wrong; it misses out printing zone abbreviations in cases where an area reverted to an older meaning of a zone abbrev after having used another meaning for awhile. Which, in fact, is what the Russians have just done, and it seems it's happened rather often elsewhere as well. We do still need to work on syncing our code with IANA's updates one way or the other, because eventually they are going to start shipping zone definitions that break older copies of the library; this has been mentioned more than once on the IANA announcement list. But it doesn't seem like that has happened quite yet. Meanwhile, it looks like there's a fair amount of missed updates in our tznames/ files as a result of this error. Off I go to look at that. (The underlying motivation here is that I'd like to get the tzdata2014h updates installed before we wrap 9.4beta3. Since it looks like those changes are going to be rather invasive, it seems like getting some beta testing on them would be a good thing before we unleash them on back-branch releases.) regards, tom lane