Thread: [GENERAL] LYDB: Feasible to use PG roles instead of application-level security?

[GENERAL] LYDB: Feasible to use PG roles instead of application-level security?

From

Guyren Howe

Date:

30 December 2016, 06:19:09

Further to my attempts to enlighten application developers about what they might better do in the database:

https://medium.com/@gisborne/love-your-database-lydb-23c69f480a1d#.4jngp2rcb

it occurs to me to wonder whether it is practical to use PG’s own roles and security model in lieu of using an application-level one.

It seems that the role system in PG is sufficient for most general purposes. One could presumably also have a table with role names and associated metainformation (email address etc) as needed.

If I have a system with many thousands of users, is it practical to manage these users’ authentication and authorization using *just* Postgres?

It occurs to me that some client frameworks might have issues with their connection pools if those connections keep switching users, assuming they even can, but let’s set that aside for now. Or perhaps every connection could immediately do a SET USER before executing its connection?

This seems an attractive proposition from a security standpoint: if I use row-level security pervasively, I can have a security system that’s nestled nice and close to the data and presumably tricky to work around from a hacker given direct access only to the client application.

Is this practical? Has anyone here done it? What might the caveats be?

TIA

Re: [GENERAL] LYDB: Feasible to use PG roles instead ofapplication-level security?

From

Alban Hertroys

Date:

30 December 2016, 14:23:27

> On 30 Dec 2016, at 4:19, Guyren Howe <guyren@gmail.com> wrote:
>
> Further to my attempts to enlighten application developers about what they might better do in the database:
>
> https://medium.com/@gisborne/love-your-database-lydb-23c69f480a1d#.4jngp2rcb
>
> it occurs to me to wonder whether it is practical to use PG’s own roles and security model in lieu of using an
application-levelone. 
>
> It seems that the role system in PG is sufficient for most general purposes. One could presumably also have a table
withrole names and associated metainformation (email address etc) as needed. 
>
> If I have a system with many thousands of users, is it practical to manage these users’ authentication and
authorizationusing *just* Postgres? 

Postgres roles are global to the cluster, so you would end up with multiple thousands of roles if you have multiple
databasesin your cluster with different users on each. Which roles each user is allowed to have becomes quite the
nightmarefor the administrators, I suspect. 

For a web-application facing the internet, I'd say no, don't do that. You're dealing with far too many users to be
maintainable.

For an intranet database in a not-too-large company with a fixed set of users, it could be a good solution, especially
ifthose roles can be linked to the company's LDAP server (assuming that's possible, I don't know). Multiple intranet
applicationson that same database can use the same users and roles. 

Someone needs to do the administration though; with volumes (of users) like that and the database knowledge level of
theaverage system administrator, a GUI seems preferable. IMHO, pgadmin provides too many features to be practical for
someonelike that, you would probably prefer something that only does user administration. I don't know of anything that
doesthat though (not a GUI user myself)... 

> It occurs to me that some client frameworks might have issues with their connection pools if those connections keep
switchingusers, assuming they even can, but let’s set that aside for now. Or perhaps every connection could immediately
doa SET USER before executing its connection? 
>
> This seems an attractive proposition from a security standpoint: if I use row-level security pervasively, I can have
asecurity system that’s nestled nice and close to the data and presumably tricky to work around from a hacker given
directaccess only to the client application. 

With a few changes, that could work very well.

First, create roles for the different types of users that you expect. In a company, that could be by division,
distinguishingdivision-heads, interns, etc. 

Secondly, have a table with the users and their attributes like you describe. Include an attribute for their database
rolethere. Only administrator users should have access to that table. 

Finally, create a stored procedure that looks up a user name in that table and sets the accompanying role. If a user is
notfound, set the role to some default 'unprivileged' user. 
Make that procedure a SECURITY DEFINER with according permissions. That role stays active the entire session, so unless
youclose the connection, create a new one or change the user's role, this procedure doesn't need calling again. 

> Is this practical? Has anyone here done it? What might the caveats be?

It's a fairly common practice, the ML archives should contain plenty of examples.

Alban Hertroys
--
If you can't see the forest for the trees,
cut the trees and you'll find there is no forest.

Re: [GENERAL] LYDB: Feasible to use PG roles instead ofapplication-level security?

From

"btober@computer.org"

Date:

30 December 2016, 17:51:26

----- Original Message -----
> From: "Alban Hertroys" <haramrae@gmail.com>
> To: "Guyren Howe" <guyren@gmail.com>
> Cc: "pgsql-general" <pgsql-general@postgresql.org>
> Sent: Friday, December 30, 2016 6:23:27 AM
> Subject: Re: [GENERAL] LYDB: Feasible to use PG roles instead of application-level security?
>
>
> > On 30 Dec 2016, at 4:19, Guyren Howe <guyren@gmail.com> wrote:
> >
> >
> > ... wonder whether it is practical to use PG’s own roles and
> > security model in lieu of using an application-level one.
> >
> > It seems that the role system in PG is sufficient for most general
> > purposes. One could presumably also have a table with role names and
> > associated metainformation (email address etc) as needed.
> >
> > If I have a system with many thousands of users, is it practical to manage
> > these users’ authentication and authorization using *just* Postgres?
>
> Postgres roles are global to the cluster, so you would end up with multiple
> thousands of roles if you have multiple databases in your cluster with
> different users on each. Which roles each user is allowed to have becomes
> quite the nightmare for the administrators, I suspect.
>
> For a web-application facing the internet, I'd say no, don't do that. You're
> dealing with far too many users to be maintainable.
>
> For an intranet database in a not-too-large company with a fixed set of
> users, it could be a good solution, especially if those roles can be linked
> to the company's LDAP server (assuming that's possible, I don't know).
> Multiple intranet applications on that same database can use the same users
> and roles.
>
> Someone needs to do the administration though; with volumes (of users) like
> that and the database knowledge level of the average system administrator, a
> GUI seems preferable. IMHO, pgadmin provides too many features to be
> practical for someone like that, you would probably prefer something that
> only does user administration. I don't know of anything that does that
> though (not a GUI user myself)...
>
> > It occurs to me that some client frameworks might have issues with their
> > connection pools if those connections keep switching users, assuming they
> > even can, but let’s set that aside for now. Or perhaps every connection
> > could immediately do a SET USER before executing its connection?
> >
> > This seems an attractive proposition from a security standpoint: if I use
> > row-level security pervasively, I can have a security system that’s
> > nestled nice and close to the data and presumably tricky to work around
> > from a hacker given direct access only to the client application.
>
> With a few changes, that could work very well.
>
> First, create roles for the different types of users that you expect. In a
> company, that could be by division, distinguishing division-heads, interns,
> etc.
>
> Secondly, have a table with the users and their attributes like you describe.
> Include an attribute for their database role there. Only administrator users
> should have access to that table.
>
> Finally, create a stored procedure that looks up a user name in that table
> and sets the accompanying role. If a user is not found, set the role to some
> default 'unprivileged' user.
> Make that procedure a SECURITY DEFINER with according permissions. That role
> stays active the entire session, so unless you close the connection, create
> a new one or change the user's role, this procedure doesn't need calling
> again.
>
> > Is this practical? Has anyone here done it? What might the caveats be?
>
> It's a fairly common practice, the ML archives should contain plenty of
> examples.

I cannot speak to the thousands of users scenario in practice, but in principle it seems workable.

As a practical matter, I have implemented something along the lines of what Hertroys describes, and taken it a step
further,engaging schema privileges as well. 

The PUBLIC schema contains a single relation that exposes a view with three columns: username, password, and
email_address.You could expose more, but for my purposes minimal is best. This view serves the express purpose of
creatingnew user accounts and is, by means of the rule system, writeable. Technically it is readable, too, but a select
statementprovides no information about underlying data ... so stalkers can't get a user list. 

pg_hba.conf is configured to allow anonymous login which has usage permission on, and only on, the PUBLIC schema. So
anyonecan log in and see that one view and do an insert on that view. 

The rule system pushes down the INSERT onto a table in a private schema that has a trigger which creates the data base
rolecorresponding to the new user row inserted into the view. Thus, prospective users "self administer" and create
theirown accounts as needed. The trigger also grants the new role membership in a particular group role. 

The group role has usage privileges on a second schema that exposes broader visibility into the data base as well as
additionalinsert and update privileges on some of these other, newly exposed views. 

You can read about and actually try it yourself at https://fairwinds.btober.net. Read the first few sections of the
tutorial.Just be aware that that is a test environment, so it may or may not be available at any particular,
unannouncedtimes that I'm playing around with it, and will occasionally be re-initialized.

Re: [GENERAL] LYDB: Feasible to use PG roles instead ofapplication-level security?

From

Stephen Frost

Date:

30 December 2016, 18:23:44

Greetings,

* Guyren Howe (guyren@gmail.com) wrote:
> it occurs to me to wonder whether it is practical to use PG’s own roles and security model in lieu of using an
application-levelone.

The short answer is yes.

> It seems that the role system in PG is sufficient for most general purposes. One could presumably also have a table
withrole names and associated metainformation (email address etc) as needed.

Yup. That can get a bit awkward if you have multiple databases inside
of a single cluster, as you would have to pick which database to put
that metainformation in, but that isn't a very big issue.

> If I have a system with many thousands of users, is it practical to manage these users’ authentication and
authorizationusing *just* Postgres?

For this, it really depends on if the PG authorization model matches the
requirements you have. The PG auth model, particularly with RLS, is
extremely flexible but you would really need to evaluate what the exact
requirements are and how you would handle that with the PG auth model.
Of course, if there are just a few exceptions or complicated cases that
can't be satisfied directly with PG today, you could use security
definer functions.

One area that isn't fully addressed with the PG auth model today is
partial access to a certain column. Consider a table where you want
users to have access to all of the rows and all of the columns *except*
for column X for rows where ID is > 1000. The PG auth model today can
be used to say "you can't access column X" or to say "you can't access
rows where ID > 1000" but you can't combine those, yet.

I'm hopeful that we'll get there as there are definitely use-cases for
that kind of access control, but it's unlikely to happen for PG10.

> It occurs to me that some client frameworks might have issues with their connection pools if those connections keep
switchingusers, assuming they even can, but let’s set that aside for now. Or perhaps every connection could immediately
doa SET USER before executing its connection?

Again, yes, connection poolers can be an issue, but it's possible to use
the role system and do a 'set role X' after having connected as some
user that has very little access. The issue here is controlling that
role change- there's no direct way in PG today to require a password to
be provided when doing the role change, which is unfortunate. One
approach to solving that with RLS is to use a security definer function
to change a record in a table that is then used in all RLS policies.
It's a bit complicated and doesn't involve doing 'set role' though, so
there are some trade-offs there.

If you really want connection pooling and independent users in PG's role
system then you'll end up having to have the app code do the
authentication (or maybe auth to PG as the user and, if successful,
reconnect as the regular user and set role... that's pretty awkward
though) and then connect and do the 'set role'.

One big question here, however, is if you're going to have thousands of
*concurrently connected* users. Thousands of users shouldn't be too
much of an issue, but if they're all connected using PG's main auth
system then you'll have thousands of backend processes running. That'll
end up causing some amount of overhead even if they're mostly idle. If
your application can handle connecting/disconnecting pretty easily and
you have a relativly short timeout (though, ideally, not too short) then
perhaps your number of concurrent connections won't be too bad.

> This seems an attractive proposition from a security standpoint: if I use row-level security pervasively, I can have
asecurity system that’s nestled nice and close to the data and presumably tricky to work around from a hacker given
directaccess only to the client application.

If that's the threat model you want to address then you'll have to work
out the concurrent connections question. One thing which can help is to
use a common user for 'read-only/public-access (or at least low-value)'
queries from the app, if there are such.

> Is this practical? Has anyone here done it? What might the caveats be?

Yes, yes, see above.

Thanks!

Stephen

Attachment

signature.asc

Re: [GENERAL] LYDB: Feasible to use PG roles instead ofapplication-level security?

From

Melvin Davidson

Date:

30 December 2016, 19:06:41

On Fri, Dec 30, 2016 at 10:23 AM, Stephen Frost <sfrost@snowman.net> wrote:

Greetings,

* Guyren Howe (guyren@gmail.com) wrote:
> it occurs to me to wonder whether it is practical to use PG’s own roles and security model in lieu of using an application-level one.

The short answer is yes.

> It seems that the role system in PG is sufficient for most general purposes. One could presumably also have a table with role names and associated metainformation (email address etc) as needed.

Yup. That can get a bit awkward if you have multiple databases inside
of a single cluster, as you would have to pick which database to put
that metainformation in, but that isn't a very big issue.

> If I have a system with many thousands of users, is it practical to manage these users’ authentication and authorization using *just* Postgres?

For this, it really depends on if the PG authorization model matches the
requirements you have. The PG auth model, particularly with RLS, is
extremely flexible but you would really need to evaluate what the exact
requirements are and how you would handle that with the PG auth model.
Of course, if there are just a few exceptions or complicated cases that
can't be satisfied directly with PG today, you could use security
definer functions.

One area that isn't fully addressed with the PG auth model today is
partial access to a certain column. Consider a table where you want
users to have access to all of the rows and all of the columns *except*
for column X for rows where ID is > 1000. The PG auth model today can
be used to say "you can't access column X" or to say "you can't access
rows where ID > 1000" but you can't combine those, yet.

I'm hopeful that we'll get there as there are definitely use-cases for
that kind of access control, but it's unlikely to happen for PG10.

> It occurs to me that some client frameworks might have issues with their connection pools if those connections keep switching users, assuming they even can, but let’s set that aside for now. Or perhaps every connection could immediately do a SET USER before executing its connection?

Again, yes, connection poolers can be an issue, but it's possible to use
the role system and do a 'set role X' after having connected as some
user that has very little access. The issue here is controlling that
role change- there's no direct way in PG today to require a password to
be provided when doing the role change, which is unfortunate. One
approach to solving that with RLS is to use a security definer function
to change a record in a table that is then used in all RLS policies.
It's a bit complicated and doesn't involve doing 'set role' though, so
there are some trade-offs there.

If you really want connection pooling and independent users in PG's role
system then you'll end up having to have the app code do the
authentication (or maybe auth to PG as the user and, if successful,
reconnect as the regular user and set role... that's pretty awkward
though) and then connect and do the 'set role'.

One big question here, however, is if you're going to have thousands of
*concurrently connected* users. Thousands of users shouldn't be too
much of an issue, but if they're all connected using PG's main auth
system then you'll have thousands of backend processes running. That'll
end up causing some amount of overhead even if they're mostly idle. If
your application can handle connecting/disconnecting pretty easily and
you have a relativly short timeout (though, ideally, not too short) then
perhaps your number of concurrent connections won't be too bad.

> This seems an attractive proposition from a security standpoint: if I use row-level security pervasively, I can have a security system that’s nestled nice and close to the data and presumably tricky to work around from a hacker given direct access only to the client application.

If that's the threat model you want to address then you'll have to work
out the concurrent connections question. One thing which can help is to
use a common user for 'read-only/public-access (or at least low-value)'
queries from the app, if there are such.

> Is this practical? Has anyone here done it? What might the caveats be?

Yes, yes, see above.

Thanks!

Stephen

>Postgres roles are global to the cluster,

Well, that is true by default, however, you can make roles database specific:
https://www.postgresql.org/docs/9.4/static/runtime-config-connection.html#RUNTIME-CONFIG-CONNECTION-SECURITY

db_user_namespace = on

That being said, there is a trade off of managing multiple users & passwords VS simple access roles.

Melvin Davidson
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.

Re: [GENERAL] LYDB: Feasible to use PG roles instead ofapplication-level security?

From

Karsten Hilbert

Date:

30 December 2016, 20:05:12

On Fri, Dec 30, 2016 at 10:23:44AM -0500, Stephen Frost wrote:

> One area that isn't fully addressed with the PG auth model today is
> partial access to a certain column.  Consider a table where you want
> users to have access to all of the rows and all of the columns *except*
> for column X for rows where ID is > 1000.  The PG auth model today can
> be used to say "you can't access column X" or to say "you can't access
> rows where ID > 1000" but you can't combine those, yet.

Do you mean that there is currently no way to say:

    if special_column is NOT in the SELECT list:
        show all rows
    if special_column IS in the SELECT list:
        show only those rows where special_column > 1000

?

Thanks,
Karsten
--
GPG key ID E4071346 @ eu.pool.sks-keyservers.net
E167 67FD A291 2BEA 73BD  4537 78B9 A9F9 E407 1346