Thread: Configure redux. "WITH LOCATION," what do you think?

Configure redux. "WITH LOCATION," what do you think?

From
pgsql@mohawksoft.com
Date:
I'm sorry to keep bringing this up, but I keep thinking that PostgreSQL is
a PITA to configure. It isn't that I don't know how, or can't read the
documents, it's just that there is too many things that can go wrong or
leave me with the "How does that work?" feeling where I need to test
before I implement.

OK, here is what I think we need to do, and I am willing to take the point
and do all of it:

If postmaster is started with -D or with $PGDATA set to the location of a
postgresql cluster directory, it will work as it has always worked.

If the -D or the $PGDATA values do not point to a database cluster
directory, it must point to either a postgresql.conf file, or a directory
which contains "postgresql.conf." If it points to a directory, then that
directory will be the assumed the default for pg_hba.conf and
pg_ident.conf.

The test for it being a cluster directory or a configuration directory
will be the presence of "PG_VERSION." (any better suggestion would be
welcome

Within the postgresql.conf file, there will be about 4 additional
configuration parameters:

include='/somepath/coreopts.conf'
hba_conf='/somepath/pg_hba.conf'
ident_conf='somepath/pg_ident.conf'
pgdata='/somepath/to/cluster'

(Unfortunately we lose the ability to use a standard configuration across
multiple databases, but I think this is a good compromise.)

Additionally, rather than depend on environment variables for security, we
can put acceptable database locations in the postgresql.conf file in the
form of:

altdb='/somealtpath'

and we can use createdb as:

CREATE DATABASE somedb with LOCATION 'altdb';

What do you all think?



Re: Configure redux.

From
pgsql@mohawksoft.com
Date:
> WITH LOCATION would be great,.... That way you could distribute your
> databases across multiple physical disks a lot easier than looking up
> oid's
> and moving the db after it's been created, mounting it over the top of the
> cluster directory....

Actually "WITH LOCATION" already exists, but it works in two eays, either
you specify an environment variable which points to the directory, or you
have to compile postgresql with a define which allows the location to be
an absolute path.

They say the absolute path is a security issue, I honestly don't see how
as it must be accessed by the postgresql process. If you can create a
directory with that ownership, then you would presumably have enough
access to modify Postgresql to use any location you like.

Anyway, the hacker group seems to think that being able to specify a
directory is a security risk, and I'm sure its not a decision that will go
away easy. By the same token, however, I think mandating alternate
locations be in the environment table of the server process is insane. If
you can't specify actual directories, then the information belongs in the
configuration file. Does anyone seriously disagree with this statement?

My problem with postgresql configuration are ways you must do things.

If you want "pg_xlog" on a different volume, you must use symlinks. If you
want to put a database in an alternate location, you must either use
symlinks or specify an environment variable. If you want to share pg_ident
or pg_hba you have to use symlinks.

As *any* DBA will tell you, symlinks are not the production solution of
choice. They are not self documenting. They sometimes break archive
utilities i.e. some follow symlinks, some only save the link. Symlinks are
just plain messy in production. That's why I've really been pushing for
the configuration file to be *THE* configuration of the database system.

One should be able to print out postgresql.conf and "know" where
everything is and how it is configured. This knowledge should not depend
on knowing that symlinks are used or what environment variables are in
play.

The changes I suggest can get us closer to a single point of reference for
postgresql deployment.

>
> Just my $0.002 worth...

Mine too.

>
> ... John
>
> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org
> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of
> pgsql@mohawksoft.com
> Sent: Wednesday, 12 May 2004 5:06 AM
> To: pgsql-hackers@postgresql.org
> Subject: [HACKERS] Configure redux. "WITH LOCATION," what do you think?
>
> I'm sorry to keep bringing this up, but I keep thinking that PostgreSQL is
> a
> PITA to configure. It isn't that I don't know how, or can't read the
> documents, it's just that there is too many things that can go wrong or
> leave me with the "How does that work?" feeling where I need to test
> before
> I implement.
>
> OK, here is what I think we need to do, and I am willing to take the point
> and do all of it:
>
> If postmaster is started with -D or with $PGDATA set to the location of a
> postgresql cluster directory, it will work as it has always worked.
>
> If the -D or the $PGDATA values do not point to a database cluster
> directory, it must point to either a postgresql.conf file, or a directory
> which contains "postgresql.conf." If it points to a directory, then that
> directory will be the assumed the default for pg_hba.conf and
> pg_ident.conf.
>
> The test for it being a cluster directory or a configuration directory
> will
> be the presence of "PG_VERSION." (any better suggestion would be welcome
>
> Within the postgresql.conf file, there will be about 4 additional
> configuration parameters:
>
> include='/somepath/coreopts.conf'
> hba_conf='/somepath/pg_hba.conf'
> ident_conf='somepath/pg_ident.conf'
> pgdata='/somepath/to/cluster'
>
> (Unfortunately we lose the ability to use a standard configuration across
> multiple databases, but I think this is a good compromise.)
>
> Additionally, rather than depend on environment variables for security, we
> can put acceptable database locations in the postgresql.conf file in the
> form of:
>
> altdb='/somealtpath'
>
> and we can use createdb as:
>
> CREATE DATABASE somedb with LOCATION 'altdb';
>
> What do you all think?
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
>                http://www.postgresql.org/docs/faqs/FAQ.html
>


Re: Configure redux.

From
Tom Lane
Date:
pgsql@mohawksoft.com writes:
> They say the absolute path is a security issue, I honestly don't see how

Because it allows someone with only CREATEDB privilege to tell the
backend to write anywhere (that it can write).  That's an ability
that should be reserved to superusers.

The environment-variable-based variant is actually not a lot better.
Consider CREATEDB WITH LOCATION 'HOME'.  Or 'PWD'.  Depending on your
platform, "env | grep =/" may reveal other interesting possibilities.
The problem is that a postmaster's environment will contain a lot of
strings that happen to look like absolute paths.  There is no way for
the code to know which ones were really intended by the DBA to be used
as locations, and which ones are just part of the standard environment
on a particular platform.

Either way you slice it, WITH LOCATION is fundamentally bogus because it
allows users to create databases in unintended locations.  I don't think
it can be used to overwrite existing files, so it's not a security hole
in the sense of allowing attacks, but it is a security hole in the sense
of not giving the DBA a reliable way to control database locations.

I have every intention of ripping that code out, root and branch, as
soon as we have a real tablespace feature (which I'm expecting will
provide means within the system for defining allowed physical
locations).  I don't recommend spending any time on improving the
feature, because its days are numbered.
        regards, tom lane


Re: Configure redux.

From
pgsql@mohawksoft.com
Date:
> pgsql@mohawksoft.com writes:
>> They say the absolute path is a security issue, I honestly don't see how
>
> Because it allows someone with only CREATEDB privilege to tell the
> backend to write anywhere (that it can write).  That's an ability
> that should be reserved to superusers.

I guess I can see that.

>
> The environment-variable-based variant is actually not a lot better.
> Consider CREATEDB WITH LOCATION 'HOME'.  Or 'PWD'.  Depending on your
> platform, "env | grep =/" may reveal other interesting possibilities.
> The problem is that a postmaster's environment will contain a lot of
> strings that happen to look like absolute paths.  There is no way for
> the code to know which ones were really intended by the DBA to be used
> as locations, and which ones are just part of the standard environment
> on a particular platform.
>
> Either way you slice it, WITH LOCATION is fundamentally bogus because it
> allows users to create databases in unintended locations.  I don't think
> it can be used to overwrite existing files, so it's not a security hole
> in the sense of allowing attacks, but it is a security hole in the sense
> of not giving the DBA a reliable way to control database locations.
>
> I have every intention of ripping that code out, root and branch, as
> soon as we have a real tablespace feature (which I'm expecting will
> provide means within the system for defining allowed physical
> locations).  I don't recommend spending any time on improving the
> feature, because its days are numbered.

Well, is there a doc on how tables spaces will be implemented?

Do you intend to have postmaster initialize the tablespace, or will the
admin be required to perform an initlocation?

I already have a keyword in my config patch called 'volume' but
'tablespace' would probably be a better name for it. A simple name change
would work.