Re: Additional role attributes && superuser review - Mailing list pgsql-hackers

From Magnus Hagander
Subject Re: Additional role attributes && superuser review
Date
Msg-id CABUevEw8qeWZMbEPcKPjWG6ocE9ba69k7pSVJU1+RQJZ7cZXWA@mail.gmail.com
Whole thread Raw
In response to Re: Additional role attributes && superuser review  (Stephen Frost <sfrost@snowman.net>)
Responses Re: Additional role attributes && superuser review  (Stephen Frost <sfrost@snowman.net>)
List pgsql-hackers
On Wed, Dec 31, 2014 at 3:08 PM, Stephen Frost <sfrost@snowman.net> wrote:
* Magnus Hagander (magnus@hagander.net) wrote:
> On Tue, Dec 30, 2014 at 4:16 PM, Stephen Frost <sfrost@snowman.net> wrote:
> > The approach I was thinking was to document and implement this as
> > impliciting granting exactly USAGE and SELECT rights, no more (not
> > BYPASSRLS) and no less (yes, the role could execute functions).  I agree
>
> If the role could execute functions, it's *not* "exactly USAGE and SELECT
> rights", it's now "USAGE and SELECT and EXECUTE" rights... Just to be
> nitpicking, but see below.

We seem to be coming at it from two different directions, so I'll try to
clarify.  What I'm suggesting is that this role attribute be strictly
*additive*.  Any other privileges granted to the role with this role
attribute would still be applicable, including grants made to PUBLIC.

This role attribute would *not* include EXECUTE rights, by that logic.
However, if PUBLIC was granted EXECUTE rights for the function, then
this role could execute that function.

Ah, ok, mistunderstood that part.


What it sounds like is you're imagining this role attribute to mean the
role has *only* USAGE and SELECT (or COPY or whatever) rights across the
board and that any grants done explicitly to this role or to public
wouldn't be respected.  In my view, that moves this away from a role
*attribute* to being a pre-defined *role*, as such a role would not be
usable for anything else.

No, i meant additive as well. I misread you.


> > that doing so would be strictly more than what pg_dump actually requires
> > but it's also what we actually have support for in our privilege system.
>
> Yeah, but would it also be what people would actually *want*?

I can certainly see reasons why you'd want such a role to be able to
execute functions- in particular, consider xlog_pause anx xlog_resume. 
If this role can't execute those functions then they probably can't
successfully run pg_dump against a replica.

uh, pg_dump doesn't run those commands :P I don't see why that's a requirement at all.  And you can still always grant those things on top of whatever the privilege gives you.


> I think having it do exactly what pg_dump needs, and not things like
> execute functions etc, would be the thing people want for a 'DUMP'
> privilege.

What if we want pg_dump in 9.6 to have an option to execute xlog_pause
and xlog_resume for you?  You wouldn't be able to run that against a 9.5
database (or at least, that option wouldn't work).

It would if you added an explicit grant for it, which would have to be documented.

I don't see how that's different if the definition is "allows select on all tables". That wouldn't automatically give it the ability to do xlog_pause in an old version either.


> We might *also* want a USAGEANDSELECTANDEXECUTEANDWHATNOTBUTNOBYPASSURL
> (crap, it has to fit within NAMEDATALEN?) privilege, but I think that's a
> different thing.

Our privilege system doesn't currently allow for negative grants (deny
user X the ability to run functions, even if EXECUTE is granted to
PUBLIC).  I'm not against that idea, but I don't see it as the same as
this.

Agreed. That's what I said - different thing :)


> > it would only give you COPY access. (And also
> > > COPY != SELECT in the way that the rule system applies, I think? And this
> > > one could be for COPY only)
> >
> > COPY certainly does equal SELECT rights..  We haven't got an independent
> > COPY privilege and I don't think it makes sense to invent one.  It
>
> We don't, but I'm saying it might make sense to implement this. Not as a
> regular privilige, but as a role attribute. I'm not sure it's the right
> thing, but it might actually be interesting to entertain.

We've discussed having a role attribute for COPY-from-filesystem, but
pg_dump doesn't use that ever, it only uses COPY TO STDOUT.  I'm not
a fan of making a COPY_TO_STDOUT-vs-SELECT distinction just for this..

Yeah, it's probably going overboard with it, since AFAICT the only thing that would actually be affected is RULEs on SELECT, which I bet most people don't use on their tables.


> > sounds like you're suggesting that we add a hack directly into COPY for
> > this privilege, but my thinking is that the right place to change for
> > this is in the privilege system and not hack it into individual
> > commands..  I'm also a bit nervous that we'll end up painting ourselves
> > into a corner if we hack this to mean exactly what pg_dump needs today.
>
> One point would be that if we define it as "exactly what pg_dump needs",
> that definition can change in a future major version.

Sure, but that then gets a bit ugly because we encourage running the
latest version of pg_dump against the prior-major-version.

But the latest version of pg_dump *knows* how the prior major version behaved with these things, and can either adapt, or give the user a message about what they need to do to adapt.


> > > I think BYPASSRLS would have to be implicitly granted by the DUMP
> > > privilege. Without that, the DUMP privilege is more or less meaningless
> > > (for anybody who uses RLS - but if they don't use RLS, then having it
> > > include BYPASSRLS makes no difference). Worse than that, you may end up
> > > with a dump that you cannot restore.
> >
> > The dump would fail, as I mentioned before.  I don't see why BYPASSRLS
> > has to be implicitly granted by the DUMP privilege and can absolutely
> > see use-cases for it to *not* be.  Those use-cases would require passing
> > pg_dump the --enable-row-security option, but that's why it's there.
>
> So you're basically saying that if RLS is in use anywhere, this priviliege
> alone would be useless, correct?

No, it would still be *extremely* useful.  We have an option to pg_dump
to say "please dump with RLS enabled".  What that means is that you'd be
able to dump the entire database for all data you're allowed to see
through RLS policies.  How is that useless?  I certainly think it'd be
very handy and a good way to get *segregated* dumps according to policy.

Hmm. Yeah, I guess - my mindset was int he "database backup" mode, where a "silently partial" backup is a really scary thing rather than a feature :) I guess as long as you still dump all the parts, RLS itself ensures that foreign keys etc will actually be valid upon a reaload?


> And you would get a hard failure at *dump
> time*, not at reload time?

If you attempt to pg_dump a relation which has RLS enabled, and you
don't enable RLS with pg_dump, then you'll get a failure at dump time,
yes.  That's far better than a reload-time failure.

Good.

> That would at least make it acceptable, as you
> would know it's wrong. and we could make the error messages shown
> specifically hint that you need to grant both privileges to make it useful.

Agreed, having such a hint makes sense.

> We could/should also throw a WARNING if DUMP Is granted to a role without
> BYPASSRLS in case row level security is enabled in the system, I think. But
> that's more of an implementation detail.

That's a bit ugly and RLS could be added to a relation after the DUMP
privilege is granted.

Yes, it's not going to be all-covering, but it can still be a useful hint/warning in the cases where it *does* that. We obviously still need pg_dump to give the error in both scenarios.


What I think this part of the discussion is getting at is that there's a
lot of people who view pg_dump as explicitly for "dump the ENTIRE
database".  While that's one use-case it is certainly not the only one.
I find "pg_dump schema X" to be very common and often find that pg_dump
against an entire database isn't practical because of the size of the
database.

Agreed. I was in that mindset since we were talking about other backups. And FWIW, one reason I like to call it "DUMP" and not anything to do with BACKUP is exactly that. pg_dump is often not very useful for backups anymore, due to the size of databases, but it's very useful for other things.


> > Similar concerns would exist for the existing REPLICATION role for example
> > > - that one clearly lets you bypass RLS as well, just not with a SQL
> > > statement.
> >
> > I'm not sure that I see the point you're making here.  Yes, REPLICATION
> > allows you to do a filesystem copy of the entire database and that
> > clearly bypasses RLS and *all* of our privilege system.  I'm suggesting
> > that this role attribute work as an implicit grant of privileges we
> > already have.  That strikes me as easy to document and very clear for
> > users.
>
> It does - though documenting that it implicitly gives you a different
> privilege as well is also easy :)
>
> But it does potentially limit us to what we can actually do with the
> priviliges. One reason to use them is exactly because we *cannot* express
> this with our regular permissions (such as BYPASSRLS or REPLICATION).

Ok, I see the point you're making that we could make this into a
capability which isn't something which can be expressed through our
existing GRANT system.  That strikes me as a solution trying to find a
problem though.  There's no need to invent such an oddity for this
particular use-case, I don't think.

Maybe not, but we should make sure we don't paint ourselves into a corner where we cannot do it in the future either.


> For regular permissions, we could just pre-populate the system with
> predefined roles and use regular GRANTs to those roles, instead of relying
> on role attributes, which might in that case make it even more clear?

The reason for this approach is to address exactly the nightmare that is
trying to maintain those regular permissions across all the objects in
the system.  Today, people simply give the role trying to do the pg_dump
superuser, which is the best option we have.  Saying 'grant SELECT on
all the tables and USAGE on all the schemas' isn't suggested because
it's a huge pain.  This role attribute provides a middle ground where
the pg_dump'ing role isn't a superuser, but you don't have to ensure
usage and select are granted to it for every relation.

No, what I'm saying is we could have *predefined role* that allows "select on all the tables and usage on all the schemas". And you are unable to actually remove that. It's not stored on the object, so you cannot REVOKE the permission on the *object*. Since it's not store on the object it will also automatically apply to all new objects, regardless of what you've done with DEFAULT PRIVILEGES etc.

But you can grant users and other roles access to this role, and it's dealt with like other roles from the "who has it" perspective, instead of being special-cased.

Instead of "ALTER USER foo WITH DUMP" (or whatever), you'd just do a "GRANT pgdump TO foo" (which would then also work through things like group membership as well)


> > should work 'sanely' with any combination of the two options.
> > > > Similairly, DUMP shouldn't imply BACKUP or visa-versa.  In particular,
> > > > I'd like to see roles which have only the BACKUP privilege be unable to
> > > > directly read any data (modulo things granted to PUBLIC).  This would
> > > > allow an unprivileged user to manage backups, kick off ad-hoc ones,
> > etc,
> > > > without being able to actually access any of the data (this would
> > > > require the actual backup system to have a similar control, but that's
> > > > entirely possible with more advanced SANs and enterprise backup
> > > > solutions).
> > >
> > > So you're saying a privilege that would allow you to do
> > > pg_start_backup()/pg_stop_backup() but *not* actually use pg_basebackup?
> >
> > Yes.
>
> What's really the usecase for that? I'd say that pg_start/stop backup is a
> superset and a higher privilige than using pg_basebackup, since you can for
> example destroy other peoples backups using it (by running pg_stop_backup
> while they are running).

One use-case is to allow unprivileged users to run adhoc backups. 
Another is simply that you don't want the cronjob that runs the backup
to be able to read any of the data directly (why would you?).  I agree
that the individuals who have this capability would still need to
coordinate or at lesat not step on each other or they could end up with
bad backups.

Another way to address that would be to require that the role calling
stop_backup be a member of the role which called start_backup (that also
address the superuser case, as superusers are considered members of all
roles).  That seems like a pretty reasonable sanity check requirement.

This diverges a bit from the actual role attribute discussion here, but I think a better solution to this is to actually have a separate interface rather than pg_start/pg_stop backup. One of the main problems with pg_start/pg_stop vs pg_basebackup is that you can easily leave the system in a broken state (for example by forgetting to pg_stop_backup, or by accidentally pg_stop_backup:ing in the middle of someone elses backup). pg_basebackup gets around this by automatically executing do_pg_stop_backup if the client disconnects. This also lets us allow parallel base backups. We could have an equivalent functionality exposed through a SQL function, or argument to pg_start_backup - it would require the backup software to keep the connection running as it works and then disconnect when it's done, but for anything beyond the most trivial shellscript that's not exactly hard, and it would make the backups safer.

If we did that, perhaps we don't even need a separate privilege for pg_start_backup() as it is today, btu can leave that as superuser?



> > Personalyl I think using the DUMP name makes that a lot more clear. Maybe
> > > we need to avoid using BACKUP alone as well, to make sure it doesn't go
> > the
> > > other way - using BASEBACKUP and EXCLUSIVEBACKUP for those two different
> > > ones perhaps?
> >
> > DUMP - implicitly grants existing rights, to facilitate pg_dump and
> >        perhaps some other use-cases
> > BASEBACKUP - allows pg_basebackup, which means bypassing all in-DB
> >              privilege systems
>
> That's equivalent of REPLICATION, yes?

Ah, yeah, seems like it would be.  Got ahead of myself. :)

> > EXCLUSIVEBACKUP - just allows pg_start/stop_backup and friends
>
> I'd say there is no "just" there really - that's a higher level privilege,
> but I see your point :)

Well, see above, but ok.

> Later I added:
> >
> > pg_xlog_replay_pause
> > pg_xlog_replay_resume
> >
> > Though I'd be find if the xlog_replay ones were their own privilege (eg:
> > REPLAY or something).
> >
>
> I think it makes more sense to have those replay functions to be a separate
> privilege, yes. They have nothing to actually do with taking the backups -
> they are for restoring them. And to restore the backups, you clearly
> already have superuser level privileges on the system outside the db (at
> least as long as we only allow full cluster restores).

Not quite- remember that reply_pause/resume can be done on a replica, so
they are useful to be able to grant independent from superuser.  Perhaps
we call such a role attribute XLOGREPLAY ?

Yes, that's what I meant - they are useful, but they're not interesting for the bacukp itself. So yes, something like XLOGREPLAY makes sense.


> > > You mean something that restricts the user to read even *if* write
> > > permissions has been granted on an individual table? Yeah, that would
> > > actually be quite useful, I think - sort of a "reverse privilege".
> >
> > Yes.  My thinking on how to approach this was to forcibly set all of
> > their transactions to read-only.
>
> Agreed that this would be very useful.

Glad we agree..  Any thoughts on implementation? :)

Where's your patch? :)

In theory we'd just have to trap any attempt to set the value for transaction_read_only, no? Same as hot_standby does?



--

pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: Failure on markhor with CLOBBER_CACHE_ALWAYS for test brin
Next
From: Fabien COELHO
Date:
Subject: Re: documentation update for doc/src/sgml/func.sgml