Thread: Default connection parameters for postgres_fdw and dblink
It struck me while looking at the regression test arrangements for postgres_fdw that as things are set up, the default username for outgoing connections is going to be that of the operating system user running the postmaster. dblink is the same way. Now, this might not be the world's worst default, but it's got to be a pretty bad one. The OS username of the server isn't really supposed to be exposed at all on the SQL level. And to the extent that it matches the bootstrap superuser's SQL name, it's still a non-POLA-satisfying default for any user other than the bootstrap superuser. IMO it would make a lot more sense for the default to be the name of the current database user. Either that, or insist that the outgoing username not be defaultable at all; though I'm not sure we can do the latter without breaking the regression tests, since those are supposed to be agnostic as to the name of the superuser running them. A related issue is that libpq will happily acquire defaults from the server's environment, such as PGPORT. This seems like it's also exposing things that shouldn't be exposed. Unfortunately, I think we're depending on that for the dblink and postgres_fdw regression tests to work at all when the postmaster is listening to a nondefault port (ie, "make check"). Is there a better way to handle all this? It may be too late to rethink dblink's behavior anyhow, but perhaps it's not too late to change postgres_fdw. I think though that once we let 9.3 out the door, it *will* be too late to make any major changes, because postgres_fdw's usage is going to go through the roof now that it can do remote updates. regards, tom lane
On Fri, 2013-03-22 at 12:19 -0400, Tom Lane wrote: > Is there a better way to handle all this? It may be too late to rethink > dblink's behavior anyhow, but perhaps it's not too late to change > postgres_fdw. I think though that once we let 9.3 out the door, it > *will* be too late to make any major changes, because postgres_fdw's > usage is going to go through the roof now that it can do remote updates. The first thing that occurs to me is to have postgres_fdw install some GUCs with reasonable defaults. Perhaps the default could be a magic value that is replaced by the current user or something (similar to search_path). Regards,Jeff Davis
Jeff Davis <pgsql@j-davis.com> writes: > On Fri, 2013-03-22 at 12:19 -0400, Tom Lane wrote: >> Is there a better way to handle all this? It may be too late to rethink >> dblink's behavior anyhow, but perhaps it's not too late to change >> postgres_fdw. I think though that once we let 9.3 out the door, it >> *will* be too late to make any major changes, because postgres_fdw's >> usage is going to go through the roof now that it can do remote updates. > The first thing that occurs to me is to have postgres_fdw install some > GUCs with reasonable defaults. If there's anything I've learned in the last dozen years, it's that GUCs with application-visible semantic effects are dangerous. If the semantic effects are relevant to security, that's probably even worse. > Perhaps the default could be a magic value that is replaced by the > current user or something (similar to search_path). That seems like just an overcomplicated form of my suggestion that the default should be the current user's name. regards, tom lane
Tom Lane wrote: > It struck me while looking at the regression test arrangements for > postgres_fdw that as things are set up, the default username for > outgoing connections is going to be that of the operating system > user running the postmaster. dblink is the same way. > > Now, this might not be the world's worst default, but it's got to be a > pretty bad one. The OS username of the server isn't really supposed to > be exposed at all on the SQL level. And to the extent that it matches > the bootstrap superuser's SQL name, it's still a non-POLA-satisfying > default for any user other than the bootstrap superuser. > > IMO it would make a lot more sense for the default to be the name of the > current database user. Either that, or insist that the outgoing > username not be defaultable at all; though I'm not sure we can do the > latter without breaking the regression tests, since those are supposed > to be agnostic as to the name of the superuser running them. > > A related issue is that libpq will happily acquire defaults from the > server's environment, such as PGPORT. This seems like it's also > exposing things that shouldn't be exposed. Unfortunately, I think we're > depending on that for the dblink and postgres_fdw regression tests to > work at all when the postmaster is listening to a nondefault port (ie, > "make check"). > > Is there a better way to handle all this? It may be too late to rethink > dblink's behavior anyhow, but perhaps it's not too late to change > postgres_fdw. I think though that once we let 9.3 out the door, it > *will* be too late to make any major changes, because postgres_fdw's > usage is going to go through the roof now that it can do remote updates. I agree that this is an unhappy situation. If possible, I would suggest the following defaults (that's what I as a user would expect without thinking too hard): 1) Default the user to the current effective database user. 2) Default the port to 5432. 3) Default the database name to the current database name. That comes closest to DWIM in my opinion. Speaking about exposing the server environment, I have been bitten before by the fact that the default client encoding is taken from the server's PGCLIENTENCODING at postmaster start time, but that's slightly off topic. Yours, Laurenz Albe
On Mon, Mar 25, 2013 at 4:38 AM, Albe Laurenz <laurenz.albe@wien.gv.at> wrote: > I agree that this is an unhappy situation. > > If possible, I would suggest the following defaults (that's what I as > a user would expect without thinking too hard): > 1) Default the user to the current effective database user. > 2) Default the port to 5432. > 3) Default the database name to the current database name. +1. > Speaking about exposing the server environment, I have been bitten > before by the fact that the default client encoding is taken from > the server's PGCLIENTENCODING at postmaster start time, but that's > slightly off topic. Yeah. I really hate environment variables as a way of setting defaults for things, because they tend to start creeping into unfortunate places. It's not so bad to have one or two, but when you start to have PGDATA, PGPORT, PGUSER, PGSERVICE, PGSERVICEFILE, PGSSLMODE, PGCONNECT_TIMEOUT, PGHOST, PGHOSTADDR, PGREALM, PGOPTIONS, PGAPPNAME, PGREQUIRESSL, PGSSLCOMPRESSION, PGSYSCONFDIR, PGLOCALEDIR, PGSSLROOTCERT, PGSSLCERT, PGGEQO, PGTZ, PGDATESTYLE, PGCLIENTENCODING, PGKRBSRVNAME, PGGSSLIB, PGPASSFILE, OLDDATADIR, NEWDATADIR, OLDBINDIR, NEWBINDIR, and probably a few others I'm missing, it becomes very difficult to sanitize an environment (such as the postmaster) against undesirable intrusions. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Mon, Mar 25, 2013 at 4:38 AM, Albe Laurenz <laurenz.albe@wien.gv.at> wrote: >> If possible, I would suggest the following defaults (that's what I as >> a user would expect without thinking too hard): >> 1) Default the user to the current effective database user. >> 2) Default the port to 5432. >> 3) Default the database name to the current database name. > +1. I think (2) should be "default to whatever the configure-selected default port is" --- not hard-wired to 5432. I'm also a bit unclear on what the argument is for (3), as opposed to following the default that we use in every other context, namely set dbname equal to username. > Yeah. I really hate environment variables as a way of setting > defaults for things, because they tend to start creeping into > unfortunate places. It's not so bad to have one or two, but when you > start to have PGDATA, PGPORT, PGUSER, PGSERVICE, PGSERVICEFILE, > PGSSLMODE, PGCONNECT_TIMEOUT, PGHOST, PGHOSTADDR, PGREALM, PGOPTIONS, > PGAPPNAME, PGREQUIRESSL, PGSSLCOMPRESSION, PGSYSCONFDIR, PGLOCALEDIR, > PGSSLROOTCERT, PGSSLCERT, PGGEQO, PGTZ, PGDATESTYLE, PGCLIENTENCODING, > PGKRBSRVNAME, PGGSSLIB, PGPASSFILE, OLDDATADIR, NEWDATADIR, OLDBINDIR, > NEWBINDIR, and probably a few others I'm missing, it becomes very > difficult to sanitize an environment (such as the postmaster) against > undesirable intrusions. It's arguable that we should unsetenv all of these inside the postmaster (once it's absorbed the values from the ones it historically pays attention to), so that the postmaster environment does not impinge on the behavior of libpq inside a server process. This would cause a non-backwards-compatible change in the behavior of dblink, though. Are we okay with that? regards, tom lane
On Wed, Mar 27, 2013 at 10:20 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Mon, Mar 25, 2013 at 4:38 AM, Albe Laurenz <laurenz.albe@wien.gv.at> wrote: >>> If possible, I would suggest the following defaults (that's what I as >>> a user would expect without thinking too hard): >>> 1) Default the user to the current effective database user. >>> 2) Default the port to 5432. >>> 3) Default the database name to the current database name. > >> +1. > > I think (2) should be "default to whatever the configure-selected > default port is" --- not hard-wired to 5432. > > I'm also a bit unclear on what the argument is for (3), as opposed to > following the default that we use in every other context, namely set > dbname equal to username. Well, those both seem reasonable alternative proposals. No argument here. >> Yeah. I really hate environment variables as a way of setting >> defaults for things, because they tend to start creeping into >> unfortunate places. It's not so bad to have one or two, but when you >> start to have PGDATA, PGPORT, PGUSER, PGSERVICE, PGSERVICEFILE, >> PGSSLMODE, PGCONNECT_TIMEOUT, PGHOST, PGHOSTADDR, PGREALM, PGOPTIONS, >> PGAPPNAME, PGREQUIRESSL, PGSSLCOMPRESSION, PGSYSCONFDIR, PGLOCALEDIR, >> PGSSLROOTCERT, PGSSLCERT, PGGEQO, PGTZ, PGDATESTYLE, PGCLIENTENCODING, >> PGKRBSRVNAME, PGGSSLIB, PGPASSFILE, OLDDATADIR, NEWDATADIR, OLDBINDIR, >> NEWBINDIR, and probably a few others I'm missing, it becomes very >> difficult to sanitize an environment (such as the postmaster) against >> undesirable intrusions. > > It's arguable that we should unsetenv all of these inside the postmaster > (once it's absorbed the values from the ones it historically pays > attention to), so that the postmaster environment does not impinge on > the behavior of libpq inside a server process. This would cause a > non-backwards-compatible change in the behavior of dblink, though. > Are we okay with that? I feel like unsetting all of these (or whatever the canonical list is) in the postmaster is a bit like trying to bail out the ocean with a bucket, but since a bucket appears to be the only instrument at hand, sure, why not? As far as breaking backward incompatibility goes, it doesn't strike me as likely that anyone is relying on the current behavior, but on the off chance that they are, do we have some other way for them to set defaults? What I'm worried about with the current behavior is that people will accidentally absorb behavior changes they don't want (or that are insecure). But if there's no other way to set defaults explicitly then you could we'd be ripping out a feature without providing a replacement, something I am loathe to do. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Wed, Mar 27, 2013 at 10:20 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> It's arguable that we should unsetenv all of these inside the postmaster >> (once it's absorbed the values from the ones it historically pays >> attention to), so that the postmaster environment does not impinge on >> the behavior of libpq inside a server process. This would cause a >> non-backwards-compatible change in the behavior of dblink, though. >> Are we okay with that? > I feel like unsetting all of these (or whatever the canonical list is) > in the postmaster is a bit like trying to bail out the ocean with a > bucket, but since a bucket appears to be the only instrument at hand, > sure, why not? > As far as breaking backward incompatibility goes, it doesn't strike me > as likely that anyone is relying on the current behavior, but on the > off chance that they are, do we have some other way for them to set > defaults? What I'm worried about with the current behavior is that > people will accidentally absorb behavior changes they don't want (or > that are insecure). But if there's no other way to set defaults > explicitly then you could we'd be ripping out a feature without > providing a replacement, something I am loathe to do. Use a service file maybe? But you can't have it both ways: either we like the behavior of libpq absorbing defaults from the postmaster environment, or we don't. You were just arguing this was a bug, and now you're calling it a feature. (I guess we could have a switch to control whether the environment gets cleared, so that anybody who really needs the old behavior could still get it. But ugh.) regards, tom lane
On Wed, Mar 27, 2013 at 10:47 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Wed, Mar 27, 2013 at 10:20 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> It's arguable that we should unsetenv all of these inside the postmaster >>> (once it's absorbed the values from the ones it historically pays >>> attention to), so that the postmaster environment does not impinge on >>> the behavior of libpq inside a server process. This would cause a >>> non-backwards-compatible change in the behavior of dblink, though. >>> Are we okay with that? > >> I feel like unsetting all of these (or whatever the canonical list is) >> in the postmaster is a bit like trying to bail out the ocean with a >> bucket, but since a bucket appears to be the only instrument at hand, >> sure, why not? > >> As far as breaking backward incompatibility goes, it doesn't strike me >> as likely that anyone is relying on the current behavior, but on the >> off chance that they are, do we have some other way for them to set >> defaults? What I'm worried about with the current behavior is that >> people will accidentally absorb behavior changes they don't want (or >> that are insecure). But if there's no other way to set defaults >> explicitly then you could we'd be ripping out a feature without >> providing a replacement, something I am loathe to do. > > Use a service file maybe? But you can't have it both ways: either we > like the behavior of libpq absorbing defaults from the postmaster > environment, or we don't. You were just arguing this was a bug, and > now you're calling it a feature. Maybe we could compromise and call it a beature. Or a fug. In all seriousness, it's in that grey area where most people would probably consider this a surprising and unwanted behavior, but there might be a few out there who had a problem and discovered that they could use this trick to solve it. In terms of a different solution, what about a GUC that can contain a connection string which is used to set connection defaults, but which can still be overriden? So it would mimic the semantics of setting an environment variable, but it would be better, because you could do all of the usual GUC things with it instead of having to hack on the postmaster startup environment. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Wed, Mar 27, 2013 at 8:22 AM, Robert Haas <robertmhaas@gmail.com> wrote: > On Wed, Mar 27, 2013 at 10:47 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:>> >> Use a service file maybe? But you can't have it both ways: either we >> like the behavior of libpq absorbing defaults from the postmaster >> environment, or we don't. You were just arguing this was a bug, and >> now you're calling it a feature. > > Maybe we could compromise and call it a beature. Or a fug. In all > seriousness, it's in that grey area where most people would probably > consider this a surprising and unwanted behavior, but there might be a > few out there who had a problem and discovered that they could use > this trick to solve it. > > In terms of a different solution, what about a GUC that can contain a > connection string which is used to set connection defaults, but which > can still be overriden? So it would mimic the semantics of setting an > environment variable, but it would be better, because you could do all > of the usual GUC things with it instead of having to hack on the > postmaster startup environment. In my meta-experience, "nobody" really wants defaults in that configuration anyway, and if they do, most of them would be appreciative of being notified positively of such a problem of relying on connection defaults (maybe with a warning at first at most?). In this case, such an abrupt change in the contrib is probably fine. From the vantage point I have: full steam ahead, with the sans GUC solution...it's one less detailed GUC in the world, and its desired outcome can be achieved otherwise. I'd personally rather open myself up for dealing with the consequences of this narrow change in my own corner of the world. This anecdote carries a standard disclaimer: my meta-experience doesn't include all use cases for everyone everywhere. -- fdr