Thread: askpass program for libpq
I would like to have something like ssh-askpass for libpq. The main reason is that I don't want to have passwords in plain text on disk, even if .pgpass is read protected. By getting the password from an external program, I can integrate libpq tools with the host system's key chain or wallet thing, which stores passwords encrypted. I'm thinking about adding a new connection option "askpass" with environment variable PGASKPASS. One thing I haven't quite figured out is how to make this ask for passwords only if needed. Maybe it needs two connection options, one to say which program to use and one to say whether to use it. Ideas?
On Wed, Jan 9, 2013 at 2:17 PM, Peter Eisentraut <peter_e@gmx.net> wrote: > I would like to have something like ssh-askpass for libpq. The main > reason is that I don't want to have passwords in plain text on disk, > even if .pgpass is read protected. By getting the password from an > external program, I can integrate libpq tools with the host system's key > chain or wallet thing, which stores passwords encrypted. Sounds very useful. > I'm thinking about adding a new connection option "askpass" with > environment variable PGASKPASS. One thing I haven't quite figured out > is how to make this ask for passwords only if needed. Maybe it needs > two connection options, one to say which program to use and one to say > whether to use it. > > Ideas? You could call it basically where conn->password_needed is set today. So instead of dropping it directly back to the user, call the callback, try again, and drop back to the user only if it doesn't work. That means it gets called only after the connection to the server is established, but that seems reasonable given that that's the only case when you can get a password prompt as well... You don't know the server is going to ask for a password until it gets that far. In fact, might it be interesting to allow libpq to do a simple callback for the password *as well*? to implement a password prompt directly in the application, instead of having to make multiple connections? So not just as an external command, but also availbale as a direct calback. --Magnus HaganderMe: http://www.hagander.net/Work: http://www.redpill-linpro.com/
On Wed, Jan 9, 2013 at 5:17 AM, Peter Eisentraut <peter_e@gmx.net> wrote: > I would like to have something like ssh-askpass for libpq. The main > reason is that I don't want to have passwords in plain text on disk, > even if .pgpass is read protected. By getting the password from an > external program, I can integrate libpq tools with the host system's key > chain or wallet thing, which stores passwords encrypted. > > I'm thinking about adding a new connection option "askpass" with > environment variable PGASKPASS. One thing I haven't quite figured out > is how to make this ask for passwords only if needed. Maybe it needs > two connection options, one to say which program to use and one to say > whether to use it. > > Ideas? Okay, I have a patch that does something *like* (but not the same) as this, and whose implementation is totally unreasonable, but it's enough to get a sense of how the whole thing feels. Critically, it goes beyond askpass, instead allowing a shell-command based hook for arbitrary interpretation and rewriting of connection info...such as the 'host' libpq keyword. I have called it, without much thought, a 'resolver'. In this way, it's closer to the libpq 'service' facility, except with addition of complete control of the interpretation of user-provided notation. I think it would be useful to have an in-process equivalent (e.g. loading hooks via dynamic linking, like the backend), if the functionality seems worthwhile, mostly for performance and a richer protocol between libpq and resolvers (error messages come to mind). I think with a bit of care this also would allow other drivers to more easily implement some of the features grown into libpq (by re-using the shared-object's protocol, or a generic wrapper that can load the shared object to provide a command line interface), like the .pgpass file and the services file, which are often missing from new driver implementations. With these disclaimers out of the way, here's a 'demo'. It's long, but hopefully not as dense as it first appears since it's mostly a narration of walking through using this. Included alongside the patch is a short program that integrates with the freedesktop.org secret service standard, realized in the form of libsecret (I don't know exactly what the relationship is, but I used libsecret's documentation[0]). This resolver I wrote is not a solid piece of work either, much like the libpq patch I've attached...but, I probably will try to actually make personal use of this for a time after fixing a handful of things to feel it out over an extended period. On Debian, one needs gir1.2-secret-1 and Python 2.7. I run this on Ubuntu Raring, and have traced my steps using the GNOME Keyring used there, Seahorse. # This tool has some help: $ ./pq-secret-service -h $ ./pq-secret-service store -h $ ./pq-secret-service recall -h # Store a secret: $ printf 'dbname=regression host=localhost port=5432 user=user'\ ' password=password' | ./pq-secret-service store Now, you can check your keyring manager, and search for 'postgres' or 'postgresql'. You should see a stored PostgreSQL secret, probably labeled "PostgreSQL Database Information" After having compiled a libpq with my patch applied and making sure a 'psql' is using that libpq... # Set the 'resolve' command: $ export PGRESOLVE='/path/to/pq-secret-service recall' # Attempt to connect to regression database. The credentials are # probably not enabled on your database, and so you'll see the # appropriate error message instead. $ psql regression What this does is searches the keyrings in the most straightforward way given the Secret Service protocol to try to 'complete' the input passed, based on any of the (non-secret) parts of the passed conninfo to 'pq-secret-service store'. Here's an example where the resolver opts to not load something from the keyring: $ psql 'dbname=regression user=not-the-one-stored' Now the resolver doesn't find a secret, so it just passes-through the original connection info. Ditto if one passes a password. Here are more examples, this time explored by invoking the resolver command directly, without having to go through libpq. This makes testing and experimentation somewhat easier: $ echo $(printf 'regression' | ./pq-secret-service recall) $ echo $(printf 'dbname=regression' | ./pq-secret-service recall) $ echo $(printf 'host=localhost port=5432' | ./pq-secret-service recall) # Show failing to match the secret (the input provided will be # returned unchanged): $ echo $(printf 'host=elsewhere port=5432' | ./pq-secret-service recall) In some aspects the loose matching is kind of neat, because usually some subset of database names, role names, and host names are sufficient to find the right thing, but this is besides the point: the resolver program can be made very strict, or lax, or can even try opening a database connection to feel things out. I do want to make sure that at least some useful baseline functionality can be implemented, though, so wrangling the details of one or more defensible sample resolver program is something I want to do. In addition, there are two more advanced features in the 'pq-secret-service' resolver: labels, and shorthands. These are *not* concepts seen in the patch to libpq, whose main feature in this is the ability to hand off the user's input to some arbitrary other program. 'label's are part of the Secret Service, and provide a human readable entry in one's keyring program. I default it to "PostgreSQL Database Information". It is not used in matching credentials and connection information in any way: $ printf 'dbname=regression host=localhost port=5432 user=user'\ ' password=password' | \ ./pq-secret-service store \ --label='My Secure Postgres Regression Database' Then, check your keyring program. It'll probably display 'My Secure Postgres Regression Database' somewhere. If you ran the previous steps, you will get a message like this: Adding this secret would be ambiguous with another keyring entry with the attributes: {'user': 'user', 'host': 'localhost', 'dbname': 'regression', 'port': '5432'} In this way, resolvers can also be useful independent programs: libpq wouldn't nominally invoke the 'store' action at all, it only cares about 'recall'. You can delete the old credentials from your keyring to successfully run this command (I don't know how to do that working from the docs[0]). But, given the label is not semantically significant to libpq, you can opt to continue without doing that... Moving on, 'shorthands' are a jargon of my hand (and not a Secret Service feature, unlike 'label') that are very similar to service definitions and the feature that I want the most badly: $ printf 'dbname=regression host=localhost port=5432 user=user' ' password=password' | ./pq-secret-service store --shorthand=regress $ psql regress Note that this is ambiguous with database names: shorthands take priority over the implicit-dbname connection string form. Furthermore, they and don't need to follow any libpq quoting rules: if it's database-name-like, it is passed to the resolver verbatim, and if it's connection-string-like, the resolver gets to have a look before running any libpq parser (which might reject the input). I think this property of complete control to interpret user input is very important. Demo done. Next demo: What follows is a different use case entirely: client side proxying. Shown below is an interaction with a pgbouncer resolver (also included in this mail), as is useful to an application that uses a fork-based web server and wants pooling, but doesn't want to run centralized pgbouncer infrastructure. The multiplexing of, say, four or more forked backends to one pgbouncer, even pushed to the client, can be operationally really useful. WARNING: this program will litter some temporary files in /tmp (prefixed with pgbouncer, to assist in deletion), and is missing some useful features, and is suspect to port clashing if one is already running pgbouncer with default settings. $ export PGRESOLVE=/path/to/pq-pgbouncer-resolver $ psql 'dbname=regression host=elsewhere user=someone password=secret' This resolver assumes it can start up pgbouncer on its default port (6543) should it not already be started. When doing this, it sets up pgbouncer configuration files in a particular way to be able to service the user's passed connection string that triggered the pgbouncer start-up. After that, this implementation assumes that all connections are going to the same place, not taking care to update pgbouncer's configuration to 'learn' new database routes. Beyond the start-up step, the resolver is pretty simple: it overwrites the host and port libpq keywords in the conninfo passed by the user to point at the started pgbouncer, so what happens is the resolution yields: host=/tmp port=6543 dbname=regression user=someone password=secret' Demo done. Next Demo, a short one: $ export PGRESOLVE='pq-secret-service recall | pq-pgbouncer-resolver' $ psql theshorthand This will expand the shorthand using one's keyring, and then subsequently pipe the output of that to the pgbouncer resolver, which can do its deed to get the connection to use pgbouncer. Thanks for getting through all that text. Fin. And, thoughts? [0]: https://developer.gnome.org/libsecret/unstable/index.html
Attachment
On Fri, May 17, 2013 at 2:03 PM, Daniel Farina <daniel@heroku.com> wrote: > Thanks for getting through all that text. Fin. And, thoughts? I have uploaded the resolvers, the last mail, and the patch to github: https://github.com/fdr/pq-resolvers So, if one prefers to use git to get this and track potential changes, or add contributions, please do feel encouraged to do so.
On Fri, May 17, 2013 at 2:03 PM, Daniel Farina <daniel@heroku.com> wrote: > On Wed, Jan 9, 2013 at 5:17 AM, Peter Eisentraut <peter_e@gmx.net> wrote: >> I would like to have something like ssh-askpass for libpq. The main >> reason is that I don't want to have passwords in plain text on disk, >> even if .pgpass is read protected. By getting the password from an >> external program, I can integrate libpq tools with the host system's key >> chain or wallet thing, which stores passwords encrypted. >> >> I'm thinking about adding a new connection option "askpass" with >> environment variable PGASKPASS. One thing I haven't quite figured out >> is how to make this ask for passwords only if needed. Maybe it needs >> two connection options, one to say which program to use and one to say >> whether to use it. >> >> Ideas? > > Okay, I have a patch that does something *like* (but not the same) as > this, and whose implementation is totally unreasonable, but it's > enough to get a sense of how the whole thing feels. Critically, it > goes beyond askpass, instead allowing a shell-command based hook for > arbitrary interpretation and rewriting of connection info...such as > the 'host' libpq keyword. I have called it, without much thought, a > 'resolver'. In this way, it's closer to the libpq 'service' facility, > except with addition of complete control of the interpretation of > user-provided notation. Hello everyone, I'm sort of thinking of attacking this problem again, does anyone have an opinion or any words of (en/dis)couragement to continue? The implementation I posted is bogus but is reasonable to feel around with, but I'm curious besides its obvious defects as to what the temperature of opinion is. Most generally, I think the benefits are strongest in dealing with: * Security: out-of-band secrets will just prevent people from pasting important stuff all over the place, as I see despairinglyoften today. * Client-side Proxies: pgbouncer comes to mind, a variation being used on production applications right now that uses full-blownpreprocessing of the user environment (only possible in a environment with certain assumptions like Heroku) https://github.com/gregburek/heroku-buildpack-pgbouncerseems very promising and effective, but it'd be nice to confer thesame benefits to everyone else, too. * HA: one of the most annoying problems in HA is naming things. Yes, this could be solved with other forms of common dynamicbinding DNS or Virtual IP (sometimes), but these both are pretty complicated and carry baggage and pitfalls, but aslong as there is dynamic binding of the credentials, I'm thinking it may make sense to have dynamci binding of net locations,too. * Cross-server references This is basically the issues seen in HA and Security, but on (horrible) steroids: the spate of features making Postgreswork cross-server (older features like dblink, but now also new ones like FDWs and Writable FDWs) make complex interconnectionbetween servers more likely and problematic, especially if one has standbys where there is a delay in catalogpropagation from a primary to standby with new connection info. So, an out of band way where one can adjust the dynamic binding seems useful there. Knowing those, am I barking up the wrong tree? Can I do something else entirely? I've considered DNS and SSL certs, but these seem much, much harder and limited, too.
Daniel Farina <daniel@heroku.com> writes: >> Okay, I have a patch that does something *like* (but not the same) as >> this, and whose implementation is totally unreasonable, but it's >> enough to get a sense of how the whole thing feels. Critically, it >> goes beyond askpass, instead allowing a shell-command based hook for >> arbitrary interpretation and rewriting of connection info...such as >> the 'host' libpq keyword. I have called it, without much thought, a >> 'resolver'. In this way, it's closer to the libpq 'service' facility, >> except with addition of complete control of the interpretation of >> user-provided notation. > Hello everyone, > I'm sort of thinking of attacking this problem again, does anyone have > an opinion or any words of (en/dis)couragement to continue? The > implementation I posted is bogus but is reasonable to feel around > with, but I'm curious besides its obvious defects as to what the > temperature of opinion is. > Most generally, I think the benefits are strongest in dealing with: > * Security: out-of-band secrets will just prevent people from pasting > important stuff all over the place, as I see despairingly often > today. > * Client-side Proxies: pgbouncer comes to mind, a variation being used > on production applications right now that uses full-blown > preprocessing of the user environment (only possible in a > environment with certain assumptions like Heroku) > https://github.com/gregburek/heroku-buildpack-pgbouncer seems very > promising and effective, but it'd be nice to confer the same > benefits to everyone else, too. > * HA: one of the most annoying problems in HA is naming things. Yes, > this could be solved with other forms of common dynamic binding DNS > or Virtual IP (sometimes), but these both are pretty complicated and > carry baggage and pitfalls, but as long as there is dynamic binding > of the credentials, I'm thinking it may make sense to have dynamci > binding of net locations, too. > * Cross-server references > This is basically the issues seen in HA and Security, but on > (horrible) steroids: the spate of features making Postgres work > cross-server (older features like dblink, but now also new ones like > FDWs and Writable FDWs) make complex interconnection between servers > more likely and problematic, especially if one has standbys where > there is a delay in catalog propagation from a primary to standby > with new connection info. > So, an out of band way where one can adjust the dynamic binding > seems useful there. TBH, I see no clear reason to think that a connection-string rewriter solves any of those problems. At best it would move them somewhere else. Nor is it clear that any of this should be libpq's business, as opposed to something an application might do before invoking libpq. Also, I think a facility dependent on invoking a shell command is (a) wide open for security problems, and (b) not likely to be portable to Windows. regards, tom lane
On Sat, Jun 15, 2013 at 8:55 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Daniel Farina <daniel@heroku.com> writes: >>> Okay, I have a patch that does something *like* (but not the same) as >>> this, and whose implementation is totally unreasonable, but it's >>> enough to get a sense of how the whole thing feels. Critically, it >>> goes beyond askpass, instead allowing a shell-command based hook for >>> arbitrary interpretation and rewriting of connection info...such as >>> the 'host' libpq keyword. I have called it, without much thought, a >>> 'resolver'. In this way, it's closer to the libpq 'service' facility, >>> except with addition of complete control of the interpretation of >>> user-provided notation. > >> Hello everyone, > >> I'm sort of thinking of attacking this problem again, does anyone have >> an opinion or any words of (en/dis)couragement to continue? The >> implementation I posted is bogus but is reasonable to feel around >> with, but I'm curious besides its obvious defects as to what the >> temperature of opinion is. > >> Most generally, I think the benefits are strongest in dealing with: > >> * Security: out-of-band secrets will just prevent people from pasting >> important stuff all over the place, as I see despairingly often >> today. > >> * Client-side Proxies: pgbouncer comes to mind, a variation being used >> on production applications right now that uses full-blown >> preprocessing of the user environment (only possible in a >> environment with certain assumptions like Heroku) >> https://github.com/gregburek/heroku-buildpack-pgbouncer seems very >> promising and effective, but it'd be nice to confer the same >> benefits to everyone else, too. > >> * HA: one of the most annoying problems in HA is naming things. Yes, >> this could be solved with other forms of common dynamic binding DNS >> or Virtual IP (sometimes), but these both are pretty complicated and >> carry baggage and pitfalls, but as long as there is dynamic binding >> of the credentials, I'm thinking it may make sense to have dynamci >> binding of net locations, too. > >> * Cross-server references > >> This is basically the issues seen in HA and Security, but on >> (horrible) steroids: the spate of features making Postgres work >> cross-server (older features like dblink, but now also new ones like >> FDWs and Writable FDWs) make complex interconnection between servers >> more likely and problematic, especially if one has standbys where >> there is a delay in catalog propagation from a primary to standby >> with new connection info. > >> So, an out of band way where one can adjust the dynamic binding >> seems useful there. > > TBH, I see no clear reason to think that a connection-string rewriter > solves any of those problems. At best it would move them somewhere else. Yes, that's exactly what I want to achieve: moving them somewhere else that can be held in common by client applications. > Nor is it clear that any of this should be libpq's business, as opposed > to something an application might do before invoking libpq. Yes, it's unclear. I have only arrived at seriously exploring this after trying my very best meditate on other options. In addition, sometimes 'the application' is Postgres, and with diverse access paths like FDWs and dblink, which may not be so easy to adjust, and it would seem strange to adjust them in a way that can't be shared in common with regular non-backend-linked client applications. Also, it seems like a very high bar to set for an application as to make use of environment keychains or environment-specific high availability retargeting. This general approach you mention is used in Greg Burek's heroku-pgbouncer buildpack, but it took significant work to iron out (and probably still needs more ironing, although it seems to work great) and only serves the Heroku-verse. Basically, the needs seem very similar, so abstracting seems to me profitable. This is not that different in principle than pgpass and the services file in that regard, except taking the final step to delegate their function...and deliver full control over notation. Although I don't know much about it, I seem to recall that VMWare felt inclined to instigate some kind of vaguely related solution to solve a similar problem in a custom libpq. After initial recoil and over a year of contemplation, I think the reasons are more well justified than I originally thought and it'd be nice to de-weirdify such approaches. > Also, I think a facility dependent on invoking a shell command is > (a) wide open for security problems, and (b) not likely to be > portable to Windows. Yeah, those things occurred to me, I think a dlopen based mechanism is a more likely solution than the shell-command one. The latter just let me get started quickly to experiment. Would a rigorous proposal about how to do that help the matter? I mostly wanted to get the temperature before thinking about Real Mechanisms.