Re: location of the configuration files - Mailing list pgsql-hackers

From Kevin Brown
Subject Re: location of the configuration files
Msg-id 20030213073645.GI1833@filer
In response to Re: location of the configuration files  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Before I get started, I should note that it may be a good compromise
to have the data directory be the same as the config file directory,
when neither the config file nor the command line specifies something
different.  So the changes I think may make the most sense are:

1.  We add a new GUC variable which specifies where the data is.
    The data is assumed to reside in the same place the config files
    reside unless the GUC variable is defined (either in
    postgresql.conf or on the command line, as usual for a GUC
    variable).  Both -D and $PGDATA therefore retain their current
    semantics unless overridden by the GUC variable, in which case
    they fall back to the new semantics of specifying only where the
    config files can be found.

2.  We add a configure option that specifies what the hardcoded
    fallback directory should be when neither -D nor $PGDATA are
    specified: /etc/postgresql when the option isn't specified to
    configure.

3.  We supply a different default startup script and a different
    default configuration file (but can make the older versions
    available in the distribution as well if we wish).  The former
    uses neither $PGDATA nor -D (or uses /etc/postgresql for them),
    and the latter uses the new GUC variable to specify a data
    directory location (/var/lib/postgres by default?)
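
To make the lookup order concrete, here is a minimal sketch in Python of
how the two directories might be resolved under the three changes above.
It is illustration only, not a patch: the function and parameter names
are hypothetical, the GUC variable has no name yet (it appears here as
guc_data_dir), and I'm assuming -D keeps taking precedence over $PGDATA.

```python
import os

# Hypothetical compiled-in fallback, i.e. what the proposed configure
# option would default to when it isn't given.
DEFAULT_CONFIG_DIR = "/etc/postgresql"


def resolve_dirs(dash_d=None, environ=None, guc_data_dir=None):
    """Return (config_dir, data_dir) under the proposed rules.

    dash_d       -- value of -D on the command line, if any
    environ      -- environment mapping (defaults to os.environ)
    guc_data_dir -- value of the new GUC variable, if set in
                    postgresql.conf or on the command line
    """
    env = os.environ if environ is None else environ

    # Config directory: -D, then $PGDATA, then the hardcoded fallback.
    config_dir = dash_d or env.get("PGDATA") or DEFAULT_CONFIG_DIR

    # Data directory: the GUC variable if defined; otherwise the data
    # lives with the config files, which is today's behaviour.
    data_dir = guc_data_dir or config_dir

    return config_dir, data_dir
```

With nothing specified at all, both directories come out as
/etc/postgresql.  With only -D or $PGDATA given, both come out as that
directory, exactly as now.  Only when the GUC variable is set do the two
diverge.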
 

This combination should work nicely for transitioning and for package
builders.  It accomplishes all of the goals mentioned in this thread
and will cause minimal pain for developers, since they can use their
current methods.  Sounds like it'll make Tom happy, at least.  :-)


Tom Lane wrote:
> mlw <pgsql@mohawksoft.com> writes:
> > The idea that a, more or less, arbitrary data location determines the 
> > database configuration is wrong. It should be obvious to any 
> > administrator that a configuration file location which controls the 
> > server is the "right" way to do it.
> 
> I guess I'm just dense, but I entirely fail to see why this is the One
> True Way To Do It.  

But we're not saying it's the One True Way, just saying that it's a
way that has very obvious benefits over the way we're using now, if
your job is to manage a system that someone else set up.

> What you seem to be proposing (ignoring syntactic-sugar issues) is
> that we replace "postmaster -D /some/data/dir" by "postmaster
> -config /some/config/file".  I am not seeing the nature of the
> improvement.

The nature of the improvement is that the configuration of a
PostgreSQL install will become obvious to anyone who looks in the
obvious places.  Remember, the '-D ...' is optional!  The PGDATA
environment variable can be used instead, and *is* used in what few
installations I've seen.  That's not something that shows up on the
command line when looking at the process list, which forces the
administrator to hunt down the data directory through other means.
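
As a rough illustration of what the administrator is up against, here is
a small Python sketch (the function name is made up for this example)
that tries to recover the data directory from a postmaster command line
as it would appear in the process list.  It can only succeed when -D was
actually given.

```python
def data_dir_from_argv(argv):
    """Return the -D argument from a postmaster command line, or None.

    argv is the command line as a list of strings, e.g. as seen in
    the process list.  Both "-D dir" and "-Ddir" forms are handled.
    """
    for i, arg in enumerate(argv):
        if arg == "-D":
            # Separate argument form; the value is the next element.
            return argv[i + 1] if i + 1 < len(argv) else None
        if arg.startswith("-D"):
            # Attached form, e.g. "-D/some/data/dir".
            return arg[2:]
    return None
```

For ["postmaster", "-D", "/some/data/dir"] this returns the directory.
For a postmaster started with only PGDATA set in its environment it
returns None, and the process list has nothing more to offer.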

> It looks to me like the sysadmin must now grant the Postgres DBA
> write access on *two* directories, viz /some/config/ and
> /wherever/the/data/directory/is.  How is that better than granting
> write access on one directory?

The difference in where you grant write access isn't a benefit to be
gained here.  The fact that you no longer have to give root privileges
to the DBA so that he can change the data directory as needed is the
benefit (well, one of them, at least).  A standard packaged install
can easily set the /etc/postgresql directory up with write permissions
for the postgres user by default, so the sysadmin won't even have to
touch it if he doesn't want to.

A big production database box is usually managed by one or more system
administrators and one or more DBAs.  Their roles are largely
orthogonal.  The sysadmins have the responsibility of keeping the
boxes up and making sure they don't fall over or crawl to a
standstill.  The DBAs have the responsibility of maximizing the
performance and availability of the database and *that's all*.  Giving
the DBAs root privileges means giving them the power to screw up the
system in ways that they can't recover from and might not even know
about.  The ways you can take down a system by misconfiguring the
database are bad enough.  No sane sysadmin is going to give the DBA
the power to run an arbitrary script as root at the point in the boot
cycle when the system is most difficult to manage unless he thinks
the DBA is *really* good at system administration tasks, too.  And
that's assuming the sysadmin even *has* the authority to grant the DBA
that kind of access.  Many organizations keep a tight rein on who can
do what in an effort to minimize the damage from screwups.

The point is that the DBA isn't likely to have root access to the box.
When the DBA lacks that ability, the way we currently do things places
greater demand on the sysadmin than is necessary, because root access
is required to change the startup scripts, as it should be, and the
location of the data, as it should *not* be.

> Given that we can't manage to standardize the data directory
> location across multiple Unixen, how is it that we will be more
> successful at standardizing a config file location?

A couple of ways.

Firstly, as we mentioned before, just about every other daemon that
runs on a Unix system has its configuration file somewhere in the /etc
hierarchy.  By putting our config files in that same hierarchy we'll
be *adhering* to a standard.  We don't have to worry about
"standardizing" that config file location because it's *already* a
standard that we're currently ignoring.

Secondly, standards arise as a result of being declared standards and
by most people using them.  So simply by making /etc/postgresql the
default configuration directory, *that* will become the standard
place.  Most people won't mess with the default install if they don't
have to.

Right now they almost *have to* mess with the default install, because
there is no standard place on a Unix system for high speed, highly
reliable disk access.  And that means that, right now, there *is* no
standard place for our config files -- it's wherever the person who
configured the database decided the data should be, and he made that
decision based on performance and reliability considerations, not on
any standards.

> All I see here is an arbitrary break with our past practice.  I do not
> see any net improvement.

That's probably because you're looking at this from the point of view
of a developer.  From that standpoint there really isn't any net
improvement, because *you* still have to specify something on the
command line to get your test databases going.  As a developer you
*always* install and manage your own database installations, so *of
course* you'll always know where the config files are.  But that's not
how it works in the production world.

The break we'd be making is *not* arbitrary, and that's much of the
point: it's a break towards existing standards, and there are good
reasons for doing it, benefits to be had by adhering to those
standards.


The way we currently handle configuration files is fine for research
and development use -- the environment from which PostgreSQL sprang.
But now we're talking about getting it used in production
environments, and their requirements are very different.

Since it is *we* who are not currently adhering to the standard,
shouldn't the burden of proof (so to speak) be on those who wish to
keep things as they are?




-- 
Kevin Brown                          kevin@sysexperts.com

