Thread: Configure redux. "WITH LOCATION," what do you think?
I'm sorry to keep bringing this up, but I keep thinking that PostgreSQL is a PITA to configure. It isn't that I don't know how, or can't read the documents, it's just that there is too many things that can go wrong or leave me with the "How does that work?" feeling where I need to test before I implement. OK, here is what I think we need to do, and I am willing to take the point and do all of it: If postmaster is started with -D or with $PGDATA set to the location of a postgresql cluster directory, it will work as it has always worked. If the -D or the $PGDATA values do not point to a database cluster directory, it must point to either a postgresql.conf file, or a directory which contains "postgresql.conf." If it points to a directory, then that directory will be the assumed the default for pg_hba.conf and pg_ident.conf. The test for it being a cluster directory or a configuration directory will be the presence of "PG_VERSION." (any better suggestion would be welcome Within the postgresql.conf file, there will be about 4 additional configuration parameters: include='/somepath/coreopts.conf' hba_conf='/somepath/pg_hba.conf' ident_conf='somepath/pg_ident.conf' pgdata='/somepath/to/cluster' (Unfortunately we lose the ability to use a standard configuration across multiple databases, but I think this is a good compromise.) Additionally, rather than depend on environment variables for security, we can put acceptable database locations in the postgresql.conf file in the form of: altdb='/somealtpath' and we can use createdb as: CREATE DATABASE somedb with LOCATION 'altdb'; What do you all think?
> WITH LOCATION would be great,.... That way you could distribute your > databases across multiple physical disks a lot easier than looking up > oid's > and moving the db after it's been created, mounting it over the top of the > cluster directory.... Actually "WITH LOCATION" already exists, but it works in two eays, either you specify an environment variable which points to the directory, or you have to compile postgresql with a define which allows the location to be an absolute path. They say the absolute path is a security issue, I honestly don't see how as it must be accessed by the postgresql process. If you can create a directory with that ownership, then you would presumably have enough access to modify Postgresql to use any location you like. Anyway, the hacker group seems to think that being able to specify a directory is a security risk, and I'm sure its not a decision that will go away easy. By the same token, however, I think mandating alternate locations be in the environment table of the server process is insane. If you can't specify actual directories, then the information belongs in the configuration file. Does anyone seriously disagree with this statement? My problem with postgresql configuration are ways you must do things. If you want "pg_xlog" on a different volume, you must use symlinks. If you want to put a database in an alternate location, you must either use symlinks or specify an environment variable. If you want to share pg_ident or pg_hba you have to use symlinks. As *any* DBA will tell you, symlinks are not the production solution of choice. They are not self documenting. They sometimes break archive utilities i.e. some follow symlinks, some only save the link. Symlinks are just plain messy in production. That's why I've really been pushing for the configuration file to be *THE* configuration of the database system. One should be able to print out postgresql.conf and "know" where everything is and how it is configured. This knowledge should not depend on knowing that symlinks are used or what environment variables are in play. The changes I suggest can get us closer to a single point of reference for postgresql deployment. > > Just my $0.002 worth... Mine too. > > ... John > > -----Original Message----- > From: pgsql-hackers-owner@postgresql.org > [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of > pgsql@mohawksoft.com > Sent: Wednesday, 12 May 2004 5:06 AM > To: pgsql-hackers@postgresql.org > Subject: [HACKERS] Configure redux. "WITH LOCATION," what do you think? > > I'm sorry to keep bringing this up, but I keep thinking that PostgreSQL is > a > PITA to configure. It isn't that I don't know how, or can't read the > documents, it's just that there is too many things that can go wrong or > leave me with the "How does that work?" feeling where I need to test > before > I implement. > > OK, here is what I think we need to do, and I am willing to take the point > and do all of it: > > If postmaster is started with -D or with $PGDATA set to the location of a > postgresql cluster directory, it will work as it has always worked. > > If the -D or the $PGDATA values do not point to a database cluster > directory, it must point to either a postgresql.conf file, or a directory > which contains "postgresql.conf." If it points to a directory, then that > directory will be the assumed the default for pg_hba.conf and > pg_ident.conf. > > The test for it being a cluster directory or a configuration directory > will > be the presence of "PG_VERSION." (any better suggestion would be welcome > > Within the postgresql.conf file, there will be about 4 additional > configuration parameters: > > include='/somepath/coreopts.conf' > hba_conf='/somepath/pg_hba.conf' > ident_conf='somepath/pg_ident.conf' > pgdata='/somepath/to/cluster' > > (Unfortunately we lose the ability to use a standard configuration across > multiple databases, but I think this is a good compromise.) > > Additionally, rather than depend on environment variables for security, we > can put acceptable database locations in the postgresql.conf file in the > form of: > > altdb='/somealtpath' > > and we can use createdb as: > > CREATE DATABASE somedb with LOCATION 'altdb'; > > What do you all think? > > > ---------------------------(end of broadcast)--------------------------- > TIP 5: Have you checked our extensive FAQ? > > http://www.postgresql.org/docs/faqs/FAQ.html >
pgsql@mohawksoft.com writes: > They say the absolute path is a security issue, I honestly don't see how Because it allows someone with only CREATEDB privilege to tell the backend to write anywhere (that it can write). That's an ability that should be reserved to superusers. The environment-variable-based variant is actually not a lot better. Consider CREATEDB WITH LOCATION 'HOME'. Or 'PWD'. Depending on your platform, "env | grep =/" may reveal other interesting possibilities. The problem is that a postmaster's environment will contain a lot of strings that happen to look like absolute paths. There is no way for the code to know which ones were really intended by the DBA to be used as locations, and which ones are just part of the standard environment on a particular platform. Either way you slice it, WITH LOCATION is fundamentally bogus because it allows users to create databases in unintended locations. I don't think it can be used to overwrite existing files, so it's not a security hole in the sense of allowing attacks, but it is a security hole in the sense of not giving the DBA a reliable way to control database locations. I have every intention of ripping that code out, root and branch, as soon as we have a real tablespace feature (which I'm expecting will provide means within the system for defining allowed physical locations). I don't recommend spending any time on improving the feature, because its days are numbered. regards, tom lane
> pgsql@mohawksoft.com writes: >> They say the absolute path is a security issue, I honestly don't see how > > Because it allows someone with only CREATEDB privilege to tell the > backend to write anywhere (that it can write). That's an ability > that should be reserved to superusers. I guess I can see that. > > The environment-variable-based variant is actually not a lot better. > Consider CREATEDB WITH LOCATION 'HOME'. Or 'PWD'. Depending on your > platform, "env | grep =/" may reveal other interesting possibilities. > The problem is that a postmaster's environment will contain a lot of > strings that happen to look like absolute paths. There is no way for > the code to know which ones were really intended by the DBA to be used > as locations, and which ones are just part of the standard environment > on a particular platform. > > Either way you slice it, WITH LOCATION is fundamentally bogus because it > allows users to create databases in unintended locations. I don't think > it can be used to overwrite existing files, so it's not a security hole > in the sense of allowing attacks, but it is a security hole in the sense > of not giving the DBA a reliable way to control database locations. > > I have every intention of ripping that code out, root and branch, as > soon as we have a real tablespace feature (which I'm expecting will > provide means within the system for defining allowed physical > locations). I don't recommend spending any time on improving the > feature, because its days are numbered. Well, is there a doc on how tables spaces will be implemented? Do you intend to have postmaster initialize the tablespace, or will the admin be required to perform an initlocation? I already have a keyword in my config patch called 'volume' but 'tablespace' would probably be a better name for it. A simple name change would work.