Thread: Re: [COMMITTERS] pgsql: Add configure option --with-system-tzdata to use operating system

petere@postgresql.org (Peter Eisentraut) writes:
> Log Message:
> -----------
> Add configure option --with-system-tzdata to use operating system time zone
> database.

While this looked like a reasonable idea in the abstract, it turns out
that it was probably a waste of time.  Red Hat distributions, at least,
will never be able to use it in the given form.  Allow me to do a brain
dump for the archives' benefit, before I forget the details.

In the first place, it is considered bad form for a package to install
an absolute symlink to /usr/share/zoneinfo:

"symlinks _should_ be relative.  Even if all they have in common is /."
- Jeremy Katz
https://www.redhat.com/archives/fedora-maintainers/2007-August/msg00096.html

Followup arguments in that thread mentioned chroots and NFS mounts as
environments where absolute symlinks are likely to lead to the wrong
place.  While I'm not 100% convinced by those arguments, it's difficult
to go against the advice of people who have far more experience with RPM
packaging than I do.  Fortunately, the standard RPM installation
location for PG's datadir is /usr/share/pgsql, so in practice the
symlink will be "../zoneinfo", which is not so scary as it coulda been.

Unfortunately, "configure --with-system-tzdata=../zoneinfo" ain't
gonna work; the build will not get through "make check", because that
symlink won't work in a temporary installation.  And trying to
relativize the symlink on-the-fly during "make install" is no answer
either, since in an RPM build that step is normally done with a nonempty
DESTDIR prefix.

The other truly nasty thing, which I just wasted most of this afternoon
learning the hard way, is that RPM does not like at all to replace a
directory with a symlink, and don't hold your breath for a fix:
http://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=61413
I experimented with various ways to get around that, and mostly
succeeded in wiping out my /usr/share/zoneinfo :-( It seems that the
fundamental problem is that RPM doesn't remove files that have vanished
from a package until quite late in an update cycle, and is therefore
capable of removing things indirectly through the already-installed
symlink, while thinking that it is cleaning out the contents of the
former subdirectory.  My ultimate conclusion was that the only
reasonably safe way to do things is to make the symlink be named
something different from the original subdirectory.
https://www.redhat.com/archives/fedora-devel-list/2007-August/msg01622.html

This might be a problem specific to RPM, but I wouldn't really care to
bet on other package managers being entirely free of it.  My advice
to anyone else thinking of using a symlink to a system directory is to
name it something different than the existing PG subdirectory.

In short, then, the patch actually being used (as of today) in Fedora
and ultimately RHEL is as below, and I don't see any prospect of
substituting the mechanism Peter has created.
        regards, tom lane

--------------

PostgreSQL includes its own copy of the zic timezone database, which is great
for ensuring portable results but not so great from a package maintenance
perspective.  Since the data is in the same format as is provided by the
Linux-standard /usr/share/zoneinfo files, we can avoid having to update
postgresql for timezone updates by just symlinking to those files.

It is allegedly desirable for the link in question to be a relative symlink.
I have strong doubts about this, not least because it requires the horrid
install-time kluge seen below --- we can't use a simple relative symlink when
making the temporary installation used for "make check", since that will be at
an indeterminate location compared to /usr/share.  The actual relative link
also depends fundamentally upon knowing where the PG datadir will get
installed, namely /usr/share/pgsql.

And if you thought that was bad, it turns out that RPM has some fundamental
bugs that make it difficult or impossible to replace a directory with a
symlink during RPM upgrade.  Rather than risk getting caught in that quagmire,
we choose to name the symlink something different than its predecessor
subdirectory.  (This part of the hack could perhaps get undone someday, when
there is no longer any danger of someone trying to rpm-upgrade from an
installation that isn't patched this way.)


diff -Naur postgresql-8.2.4.orig/src/timezone/Makefile postgresql-8.2.4/src/timezone/Makefile
--- postgresql-8.2.4.orig/src/timezone/Makefile    2007-03-14 13:38:15.000000000 -0400
+++ postgresql-8.2.4/src/timezone/Makefile    2007-08-22 16:57:41.000000000 -0400
@@ -38,14 +38,18 @@    $(CC) $(CFLAGS) $(ZICOBJS) $(LDFLAGS) $(LIBS) -o $@$(X)install: all installdirs
-    ./zic -d '$(DESTDIR)$(datadir)/timezone' -p '$(POSIXRULES)' $(TZDATAFILES)
+    if [ x'$(DESTDIR)' = x`echo '$(DESTDIR)' | sed 's,tmp_check/install,,'` ] ; then \
+        ln -s '../zoneinfo' '$(DESTDIR)$(datadir)/zoneinfo' ; \
+    else \
+        ln -s '/usr/share/zoneinfo' '$(DESTDIR)$(datadir)/zoneinfo' ; \
+    fi    $(MAKE) -C tznames $@installdirs:    $(mkinstalldirs) '$(DESTDIR)$(datadir)'uninstall:
-    rm -rf '$(DESTDIR)$(datadir)/timezone'
+    rm '$(DESTDIR)$(datadir)/zoneinfo'    $(MAKE) -C tznames $@clean distclean maintainer-clean:
diff -Naur postgresql-8.2.4.orig/src/timezone/pgtz.c postgresql-8.2.4/src/timezone/pgtz.c
--- postgresql-8.2.4.orig/src/timezone/pgtz.c    2006-11-21 18:11:55.000000000 -0500
+++ postgresql-8.2.4/src/timezone/pgtz.c    2007-08-22 16:57:04.000000000 -0400
@@ -52,7 +52,7 @@        return tzdir;    get_share_path(my_exec_path, tzdir);
-    strlcpy(tzdir + strlen(tzdir), "/timezone", MAXPGPATH - strlen(tzdir));
+    strlcpy(tzdir + strlen(tzdir), "/zoneinfo", MAXPGPATH - strlen(tzdir));    done_tzdir = true;    return tzdir;


"Tom Lane" <tgl@sss.pgh.pa.us> writes:

> In the first place, it is considered bad form for a package to install
> an absolute symlink to /usr/share/zoneinfo:
>
> "symlinks _should_ be relative.  Even if all they have in common is /."
> - Jeremy Katz
> https://www.redhat.com/archives/fedora-maintainers/2007-August/msg00096.html
>
> Followup arguments in that thread mentioned chroots and NFS mounts as
> environments where absolute symlinks are likely to lead to the wrong
> place.  

Fwiw Debian also faced this issue and came to a different conclusion. IIRC the
policy is that it's explicitly supported for a sysadmin to replace any
directory in the top level directory with a symlink. So for example /home ->
/export/home.

Therefore any package which includes a symlink which traverses between top
level directories *must* be absolute. And any symlink which does not span two
top level directories *must* be relative. So /usr/foo/bar which links to
/var/foo/bar must be absolute. But /usr/foo/bar which links to /usr/qux/bar
must be a relative link ../qux/bar.

> In short, then, the patch actually being used (as of today) in Fedora
> and ultimately RHEL is as below, and I don't see any prospect of
> substituting the mechanism Peter has created.

Why would --with-zoneinfo want to use a symlink though? Shouldn't it just
compile the binary to use the path specified directly? Symlinks are fine for a
sysadmin or a packager but if it's going to be supported by Postgres code
directly why not do it directly?

--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com


Tom Lane wrote:
> Red Hat distributions, at least, will never be able to use it in the
> given form.

This just encoded what a packager would presumably have done anyway.  If 
it's not going to work for Red Hat, then it's never going to work for 
Red Hat in any way without major reeingineering (such as compiling in 
the path rather than using a symlink).  If someone feels inclined to do 
that work, then we can evaluate that, but we can keep using the same 
configure option, so everything would keep working consistently.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/


Gregory Stark wrote:
> Why would --with-zoneinfo want to use a symlink though? Shouldn't it
> just compile the binary to use the path specified directly?

The way this question is posed appears to imply that doing the latter is 
easier, but it's not.  If someone wants to do the extra work, maybe 
that would help support more cases.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/


Peter Eisentraut <peter_e@gmx.net> writes:
> Gregory Stark wrote:
>> Why would --with-zoneinfo want to use a symlink though? Shouldn't it
>> just compile the binary to use the path specified directly?

> The way this question is posed appears to imply that doing the latter is 
> easier, but it's not.  If someone wants to do the extra work, maybe 
> that would help support more cases.

AFAICS it would support fewer cases.  Hardwiring /usr/share/zoneinfo
into the binary is isomorphic to having it use a symlink defined that
way, as far as the objections to absolute symlinks go.  And once it's
hardwired you can't change it without recompiling, which means no go
for hacks like the one I did involving supplying a different symlink
for "make check".

The larger point of my post was really that there are enough
packaging-specific considerations involved in doing this that
--with-system-tzdata is just not going to be a widely useful solution.
My advice is to take it out, not make it more baroque.  You won't
be doing packagers any favors by turning the code they have to patch
into a moving target.
        regards, tom lane


[ catching up on today's email ]

Gregory Stark <stark@enterprisedb.com> writes:
> "Tom Lane" <tgl@sss.pgh.pa.us> writes:
>> In the first place, it is considered bad form for a package to install
>> an absolute symlink to /usr/share/zoneinfo:

> Fwiw Debian also faced this issue and came to a different conclusion.

That's fair enough --- as I mentioned, I wasn't 100% convinced either.
But those are the rules for RHEL/Fedora packages and I'm gonna play
by them.

The bottom line here is that accessing a system copy of the zic database
is going to be so system-dependent that I'm not sure Peter's patch will
be useful to anybody.

> Why would --with-zoneinfo want to use a symlink though? Shouldn't it just
> compile the binary to use the path specified directly?

AFAICS that just moves the problem to a different place, one where an
admin *can't* fix it without recompiling ...
        regards, tom lane


Tom Lane wrote:
> Hardwiring /usr/share/zoneinfo
> into the binary is isomorphic to having it use a symlink defined that
> way, as far as the objections to absolute symlinks go.

That just shows how silly the relative symlink requirement is in the 
first place.  If setting a symlink and hardcoding a path are 
isomorphic, then all compiled in path references would have to be 
relative as well.  PostgreSQL went through some effort to support that 
(within the package!), but surely most packages do not support it (at 
all).  Note, for example, that libc refers to /usr/share/zoneinfo by 
absolute path.  In fact, any use of any absolute path anywhere would 
have to be examined.  Just grep /etc for a thousand likely candidates.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/


Gregory Stark wrote:
> "Tom Lane" <tgl@sss.pgh.pa.us> writes:
> 

> 
> Why would --with-zoneinfo want to use a symlink though? Shouldn't it just
> compile the binary to use the path specified directly? Symlinks are fine for a
> sysadmin or a packager but if it's going to be supported by Postgres code
> directly why not do it directly?

I sent patch which hardwire path into postgres couple of months ago. There is 
discussion:

http://archives.postgresql.org/pgsql-patches/2007-03/msg00293.php

with conclusion do it as a symlink... However I still convinced that hardwired 
solution invoke less problems then symlink solution.
    Zdenek





Tom Lane wrote:
> [ catching up on today's email ]
> 
> Gregory Stark <stark@enterprisedb.com> writes:

>> Why would --with-zoneinfo want to use a symlink though? Shouldn't it just
>> compile the binary to use the path specified directly?
> 
> AFAICS that just moves the problem to a different place, one where an
> admin *can't* fix it without recompiling ...

Is there real reason why admin should do it? Timezone files are stored in 
defined location where "all" software expect it.  If it is important to change 
timezone file location we could add extra command line parameter or GUC 
variable, but I don't see any reason for do it configurable on run-time.

    Zdenek



Zdenek Kotala <Zdenek.Kotala@Sun.COM> writes:
> Tom Lane wrote:
>> AFAICS that just moves the problem to a different place, one where an
>> admin *can't* fix it without recompiling ...

> Is there real reason why admin should do it? Timezone files are stored in 
> defined location where "all" software expect it.  If it is important to change 
> timezone file location we could add extra command line parameter or GUC 
> variable, but I don't see any reason for do it configurable on run-time.

[ shrug... ]  This just reinforces my point: if every packager is going
to have his own opinion about how to do it, there's little point in
trying to provide a pre-fab solution.
        regards, tom lane


Tom Lane wrote:
> Zdenek Kotala <Zdenek.Kotala@Sun.COM> writes:
> > Tom Lane wrote:
> >> AFAICS that just moves the problem to a different place, one where an
> >> admin *can't* fix it without recompiling ...
> 
> > Is there real reason why admin should do it? Timezone files are stored in 
> > defined location where "all" software expect it.  If it is important to change 
> > timezone file location we could add extra command line parameter or GUC 
> > variable, but I don't see any reason for do it configurable on run-time.
> 
> [ shrug... ]  This just reinforces my point: if every packager is going
> to have his own opinion about how to do it, there's little point in
> trying to provide a pre-fab solution.

Well, there's a point if every packager have his own opinion but the
opinions all happen to be equal.  Is there any glibc system where it
isn't /usr/share/zoneinfo?  Per Zdenek's patch, even Solaris does it
that way.

The FHS defines it that way:
http://www.pathname.com/fhs/pub/fhs-2.3.html#USRSHAREARCHITECTUREINDEPENDENTDATA
(which is said to cover Linux and 4.4BSD-based systems).

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Alvaro Herrera <alvherre@commandprompt.com> writes:
> Tom Lane wrote:
>> [ shrug... ]  This just reinforces my point: if every packager is going
>> to have his own opinion about how to do it, there's little point in
>> trying to provide a pre-fab solution.

> Well, there's a point if every packager have his own opinion but the
> opinions all happen to be equal.

I think we've already found out that the opinions *aren't* equal.
So far the score seems to be:Red Hat: will use relative symlinkSolaris: will use hardwired path in program???: will use
absolutesymlink
 
Peter's patch supports only the third case, which no packager has yet
committed to use (though I suppose Debian might like to).  It just gets
in the way for the other two cases.
        regards, tom lane


Tom Lane wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
> > Tom Lane wrote:
> >> [ shrug... ]  This just reinforces my point: if every packager is going
> >> to have his own opinion about how to do it, there's little point in
> >> trying to provide a pre-fab solution.
> 
> > Well, there's a point if every packager have his own opinion but the
> > opinions all happen to be equal.
> 
> I think we've already found out that the opinions *aren't* equal.
> So far the score seems to be:
>     Red Hat: will use relative symlink
>     Solaris: will use hardwired path in program
>     ???: will use absolute symlink
> Peter's patch supports only the third case, which no packager has yet
> committed to use (though I suppose Debian might like to).  It just gets
> in the way for the other two cases.

Sorry, my point is that it is likely that we would be able to use a
hardwired path, because the path is the same in many systems.

Maybe we could turn the configure option into enabling a lookup into
/usr/share/zoneinfo (hardwired).  So if the files are there it uses
them.  No need to set a symlink of any kind.

It would be even better to use --with-system-tzdata=/usr/share/zoneinfo
which enables lookup in the specified dir, hardwired at compile time in
the executable (I'm not aware if the patch already accepts a path
argument -- seems like the only sane choice).  No symlink needed.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


Peter Eisentraut <peter_e@gmx.net> writes:
> Note, for example, that libc refers to /usr/share/zoneinfo by 
> absolute path.

After reflecting more about this, that's an awfully strong point.

Maybe we should go back to the original patch from Sun --- that is,
just have the configure option change pg_TZDIR() to return a hardwired
absolute path.  The symlink idea has clearly got more risks than we
understood during the prior discussion.  I think the main argument
in favor was that it allowed switching between PG-local and system
TZ databases without recompiling the server, but is anyone really
interested in doing that?
        regards, tom lane


Tom Lane wrote:
> I think we've already found out that the opinions *aren't* equal.
> So far the score seems to be:
>         Red Hat: will use relative symlink
>         Solaris: will use hardwired path in program
>         ???: will use absolute symlink
> Peter's patch supports only the third case, which no packager has yet
> committed to use (though I suppose Debian might like to).  It just
> gets in the way for the other two cases.

We can certainly try to develop a solution that addresses the 
requirements of various camps.

So far, the requirements I've heard are:

1. Symlinks must be relative.
2. Symlink must have a different name from a previous directory.
3. make check still has to work.

I have not heard the requirement

4. Symlinks must not be used; we must hardwire all paths.

although I'm willing to discuss it if there is some rationale given 
(other than, "my previous patch did this").

FWIW, hardwiring plus an environment variable would seem to address all 
currently known und unknown requirements, and is not so totally 
different from solutions to previous, related problems.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/


Tom Lane wrote:
> Peter Eisentraut <peter_e@gmx.net> writes:
>> Note, for example, that libc refers to /usr/share/zoneinfo by 
>> absolute path.
> 
> After reflecting more about this, that's an awfully strong point.
> 
> Maybe we should go back to the original patch from Sun --- that is,
> just have the configure option change pg_TZDIR() to return a hardwired
> absolute path.  The symlink idea has clearly got more risks than we
> understood during the prior discussion.  I think the main argument
> in favor was that it allowed switching between PG-local and system
> TZ databases without recompiling the server, but is anyone really
> interested in doing that?

If you're talking about a dev scenario you can already do this - just
don't use the configure parameter and manually symlink.
I can't really see why someone using the RPM (or whatever) version would
want to do it.

//Magnus


Peter Eisentraut <peter_e@gmx.net> writes:
> FWIW, hardwiring plus an environment variable would seem to address all 
> currently known und unknown requirements, and is not so totally 
> different from solutions to previous, related problems.

I think we are converging on the recognition that Zdenek was right the
first time.  I'd go for plain hardwired --- I think an environment
variable is more likely to be a foot-gun than anything useful.
(Recall the original argument that choosing to use an external TZ
database is not something to be done lightly.)
        regards, tom lane


Tom Lane wrote:

> 
> I think we've already found out that the opinions *aren't* equal.
> So far the score seems to be:
>     Red Hat: will use relative symlink
>     Solaris: will use hardwired path in program

Finally, Because my original patch has not been accepted, Solaris also 
use relative symlink for PostgreSQL 8.2. Absolute symlink invokes some 
problem during packaging and also breaks live upgrade.

    Zdenek


Alvaro Herrera wrote:

> 
> It would be even better to use --with-system-tzdata=/usr/share/zoneinfo
> which enables lookup in the specified dir, hardwired at compile time in
> the executable (I'm not aware if the patch already accepts a path
> argument -- seems like the only sane choice).  No symlink needed.

Yes, path is accepted (if I remember correctly :-). One thing should be 
added and it is default path based on OS.
    Zdenek


Tom Lane wrote:
> Peter Eisentraut <peter_e@gmx.net> writes:
>> FWIW, hardwiring plus an environment variable would seem to address all 
>> currently known und unknown requirements, and is not so totally 
>> different from solutions to previous, related problems.
> 
> I think we are converging on the recognition that Zdenek was right the
> first time.  I'd go for plain hardwired --- I think an environment
> variable is more likely to be a foot-gun than anything useful.
> (Recall the original argument that choosing to use an external TZ
> database is not something to be done lightly.)

Tom,

Let me know if there something what should be adjusted on my patch. I 
would like to do it tomorrow, because I will be offline for next two weeks.
    thanks Zdenek


Zdenek Kotala wrote:
> Let me know if there something what should be adjusted on my patch. I
> would like to do it tomorrow, because I will be offline for next two
> weeks.

I can try to fit your patch into what's there now, if you'd rather just 
start your holiday or whatever.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/


Peter Eisentraut wrote:
> Zdenek Kotala wrote:
>> Let me know if there something what should be adjusted on my patch. I
>> would like to do it tomorrow, because I will be offline for next two
>> weeks.
> 
> I can try to fit your patch into what's there now, if you'd rather just 
> start your holiday or whatever.

OK, thanks
    Zdenek







Zdenek Kotala <Zdenek.Kotala@Sun.COM> writes:
> Yes, path is accepted (if I remember correctly :-). One thing should be 
> added and it is default path based on OS.

If the user doesn't know enough to know what the correct path is, he has
no business using the switch at all.  So my opinion is "no way are we
going to expend effort on that".
        regards, tom lane