Re: Additional role attributes && superuser review - Mailing list pgsql-hackers

From Magnus Hagander
Subject Re: Additional role attributes && superuser review
Date
Msg-id CABUevEzSR=w4v1ereJM2qaZ+zvfzAe_v9POdYxxnP=r6V1U1YQ@mail.gmail.com
Whole thread Raw
In response to Re: Additional role attributes && superuser review  (Stephen Frost <sfrost@snowman.net>)
Responses Re: Additional role attributes && superuser review
List pgsql-hackers
On Tue, Dec 30, 2014 at 4:16 PM, Stephen Frost <sfrost@snowman.net> wrote:
* Magnus Hagander (magnus@hagander.net) wrote:
> On Mon, Dec 29, 2014 at 11:01 PM, Stephen Frost <sfrost@snowman.net> wrote:
> > That said, a 'DUMP' privilege which allows the user to dump the contents
> > of the entire database is entirely reasonable.  We need to be clear in
> > the documentation though- such a 'DUMP' privilege is essentially
> > granting USAGE on all schemas and table-level SELECT on all tables and
>
> sequences (anything else..?).  To be clear, a user with this privilege
> > can utilize that access without using pg_dump.
>
> Well, it would not give you full USAGE - granted USAGE on a schema, you can
> execute functions in there for example (provided permissions). This
> privilege would not do that,

The approach I was thinking was to document and implement this as
impliciting granting exactly USAGE and SELECT rights, no more (not
BYPASSRLS) and no less (yes, the role could execute functions).  I agree

If the role could execute functions, it's *not* "exactly USAGE and SELECT rights", it's now "USAGE and SELECT and EXECUTE" rights... Just to be nitpicking, but see below.

 
that doing so would be strictly more than what pg_dump actually requires
but it's also what we actually have support for in our privilege system.

Yeah, but would it also be what people would actually *want*?

I think having it do exactly what pg_dump needs, and not things like execute functions etc, would be the thing people want for a 'DUMP' privilege.

We might *also* want a USAGEANDSELECTANDEXECUTEANDWHATNOTBUTNOBYPASSURL (crap, it has to fit within NAMEDATALEN?) privilege, but I think that's a different thing.


> it would only give you COPY access. (And also
> COPY != SELECT in the way that the rule system applies, I think? And this
> one could be for COPY only)

COPY certainly does equal SELECT rights..  We haven't got an independent
COPY privilege and I don't think it makes sense to invent one.  It

We don't, but I'm saying it might make sense to implement this. Not as a regular privilige, but as a role attribute. I'm not sure it's the right thing, but it might actually be interesting to entertain.

 
sounds like you're suggesting that we add a hack directly into COPY for
this privilege, but my thinking is that the right place to change for
this is in the privilege system and not hack it into individual
commands..  I'm also a bit nervous that we'll end up painting ourselves
into a corner if we hack this to mean exactly what pg_dump needs today.

One point would be that if we define it as "exactly what pg_dump needs", that definition can change in a future major version.


> One other point is that this shouldn't imply any other privileges, imv.
> > I'm specifically thinking of BYPASSRLS- that's independently grantable
> > and therefore should be explicitly set, if it's intended.  Things
>
> I think BYPASSRLS would have to be implicitly granted by the DUMP
> privilege. Without that, the DUMP privilege is more or less meaningless
> (for anybody who uses RLS - but if they don't use RLS, then having it
> include BYPASSRLS makes no difference). Worse than that, you may end up
> with a dump that you cannot restore.

The dump would fail, as I mentioned before.  I don't see why BYPASSRLS
has to be implicitly granted by the DUMP privilege and can absolutely
see use-cases for it to *not* be.  Those use-cases would require passing
pg_dump the --enable-row-security option, but that's why it's there.

So you're basically saying that if RLS is in use anywhere, this priviliege alone would be useless, correct? And you would get a hard failure at *dump time*, not at reload time? That would at least make it acceptable, as you would know it's wrong. and we could make the error messages shown specifically hint that you need to grant both privileges to make it useful.

We could/should also throw a WARNING if DUMP Is granted to a role without BYPASSRLS in case row level security is enabled in the system, I think. But that's more of an implementation detail.

 

Implementations which don't use RLS are not impacted either way, so we
don't need to consider them.  Implementations which *do* use RLS, in my
experience, would certainly understand and expect BYPASSRLS to be
required if they want this role to be able to dump all data out
regardless of the RLS policies.  What does implicitly including
BYPASSRLS in this privilege gain us?  A slightly less irritated DBA who
forgot to include BYPASSRLS and ended up getting a pg_dump error because
of it.  On the other hand, it walls off any possibility of using this
privilege for roles who shouldn't be allowed to bypass RLS or for
pg_dumps to be done across the entire system for specific RLS policies.

Yeah, I think that's acceptable as long as we get the error at dump, and not at reload (when it's too late to fix it).
 

> Similar concerns would exist for the existing REPLICATION role for example
> - that one clearly lets you bypass RLS as well, just not with a SQL
> statement.

I'm not sure that I see the point you're making here.  Yes, REPLICATION
allows you to do a filesystem copy of the entire database and that
clearly bypasses RLS and *all* of our privilege system.  I'm suggesting
that this role attribute work as an implicit grant of privileges we
already have.  That strikes me as easy to document and very clear for
users.

It does - though documenting that it implicitly gives you a different privilege as well is also easy :)

But it does potentially limit us to what we can actually do with the priviliges. One reason to use them is exactly because we *cannot* express this with our regular permissions (such as BYPASSRLS or REPLICATION).

For regular permissions, we could just pre-populate the system with predefined roles and use regular GRANTs to those roles, instead of relying on role attributes, which might in that case make it even more clear?


> should work 'sanely' with any combination of the two options.
> > Similairly, DUMP shouldn't imply BACKUP or visa-versa.  In particular,
> > I'd like to see roles which have only the BACKUP privilege be unable to
> > directly read any data (modulo things granted to PUBLIC).  This would
> > allow an unprivileged user to manage backups, kick off ad-hoc ones, etc,
> > without being able to actually access any of the data (this would
> > require the actual backup system to have a similar control, but that's
> > entirely possible with more advanced SANs and enterprise backup
> > solutions).
>
> So you're saying a privilege that would allow you to do
> pg_start_backup()/pg_stop_backup() but *not* actually use pg_basebackup?

Yes.

What's really the usecase for that? I'd say that pg_start/stop backup is a superset and a higher privilige than using pg_basebackup, since you can for example destroy other peoples backups using it (by running pg_stop_backup while they are running).


> That would be "EXCLUSIVEBACKUP" or something like that, to be consistent
> with existing terminology though.

Ok.  I agree that working out the specific naming is difficult and would
like to focus on that, but we probably need to agree on the specific
capabilities first. :)

:)


> > > Fair enough, ultimately what I was trying to address is the following
> > > concern raised by Alvaro:
> > >
> > > "To me, what this repeated discussion on this particular BACKUP point
> > > says, is that the ability to run pg_start/stop_backend and the xlog
> > > related functions should be a different privilege, i.e. something other
> > > than BACKUP; because later we will want the ability to grant someone the
> > > ability to run pg_dump on the whole database without being superuser,
> > > and we will want to use the name BACKUP for that.  So I'm inclined to
> > > propose something more specific for this like WAL_CONTROL or
> > > XLOG_OPERATOR, say."
> >
> > Note that the BACKUP role attribute was never intended to cover the
> > pg_dump use-case.  Simply the name of it caused confusion though.  I'm
> > not sure if adding a DUMP role attribute is sufficient enough to address
> > that confusion, but I haven't got a better idea either.
>
> We need to separate the logical backups (pg_dump) from the physical ones
> (start/stop+filesystem and pg_basebackup). We might also need to separate
> the two different ways of doing physical backups.

Right, I view those as three distinct types of privileges (see below).
> Personalyl I think using the DUMP name makes that a lot more clear. Maybe
> we need to avoid using BACKUP alone as well, to make sure it doesn't go the
> other way - using BASEBACKUP and EXCLUSIVEBACKUP for those two different
> ones perhaps?

DUMP - implicitly grants existing rights, to facilitate pg_dump and
       perhaps some other use-cases
BASEBACKUP - allows pg_basebackup, which means bypassing all in-DB
             privilege systems

That's equivalent of REPLICATION, yes?

 
EXCLUSIVEBACKUP - just allows pg_start/stop_backup and friends

I'd say there is no "just" there really - that's a higher level privilege, but I see your point :)
 

I'm still not entirely happy with the naming, but I feel like we're
getting there.  One thought I had was using ARCHIVE somewhere, but I
kind of like BASEBACKUP meaning what's needed to run pg_basebackup, and,
well, we say 'backup' in the pg_start/stop_backup function names, so
it's kind of hard to avoid that.  EXCLUSIVEBACKUP seems really long tho.

I don't like ARCHIVE in that it sounds like something to do with log archiving. If anything, that would affect pg_receivexlog, but that's definitely REPLICATION. I'd rather keep that one for a future time where we might let users actually do something about log archiving directly, which would then use the name.

The reason I like EXCLUSIVEBACKUP is that it's a term we already use. It's only 4 chars longer than REPLICATION for example - does that really matter? As long as we don't store the name, it should have no effect at all, and even if we do, I'm not sure it matters. If you want it shorter, we can have EXCLBACKUP :)


> > When indeed, what it meant was to have the following separate (effectively
> > > merging #2 and #3):
> > >
> > > 1) ability to pg_dump
> > > 2) ability to start/stop backups *and* ability to execute xlog related
> > > functions.
> >
>
> We probably also need to define what those "xlog related functions"
> actually arse. pg_current_xlog_location() is definitely an xlog related
> function, but does it need the privilege? pg_switch_xlog()?
> pg_start_backup()? pg_xlog_replay_pause()?

I had defined them when I started the thread:

Apologies - I had missed that part.
 

pg_start_backup
pg_stop_backup
pg_switch_xlog
pg_create_restore_point

Ok. That makes sense.


Later I added:

pg_xlog_replay_pause
pg_xlog_replay_resume

Though I'd be find if the xlog_replay ones were their own privilege (eg:
REPLAY or something).

I think it makes more sense to have those replay functions to be a separate privilege, yes. They have nothing to actually do with taking the backups - they are for restoring them. And to restore the backups, you clearly already have superuser level privileges on the system outside the db (at least as long as we only allow full cluster restores).


> That sounds reasonable to me (and is what was initially proposed, though
> > I've come around to the thinking that this BACKUP role attribute should
> > also allow pg_xlog_replay_pause/resume(), as those can be useful on
> > replicas).
> >
>
> If it's for replicas, then why are we not using the REPLICATION privilege
> which is extremely similar to this?

I don't actually see REPLICATION as being the same.

The point is to have a role which can log into the replica, pause
the stream to be able to run whatever queries they're permitted to
(which might not include all of the data) and then resume it when done.
Perhaps that needs to be independent of the EXCLUSIVEBACKUP role, but
it's definitely different from the REPLICATION privilege.

Agreed per above, now that I realize what we mean. I do think it needs to be at least as independent from EXCLUSIVEBACKUP as it does froM REPLICATION though.


> > The read-all vs. ability-to-pg_dump distinction doesn't really exist for
> > role attributes, as I see it (see my comments above).  That said, having
> > DUMP or read-all is different from read-*only*, which would probably be
> > good to have independently.  I can imagine a use-case for a read-only
> > account which only has read ability for those tables, schemas, etc,
> > explicitly granted to it.
>
> You mean something that restricts the user to read even *if* write
> permissions has been granted on an individual table? Yeah, that would
> actually be quite useful, I think - sort of a "reverse privilege".

Yes.  My thinking on how to approach this was to forcibly set all of
their transactions to read-only.

Agreed that this would be very useful.


> There is one issue that occurs to me, however.  We're talking about
> > pg_dump, but what about pg_dumpall?  In particular, I don't think the
> > DUMP privilege should provide access to pg_authid, as that would allow
> > the user to bypass the privilege system in some environments by using
> > the hash to log in as a superuser.  Now, I don't encourage using
> > password based authentication, especially for superuser accounts, but
> > lots of people do.  The idea with these privileges is to allow certain
> > operations to be performed by a non-superuser while preventing trivial
> > access to superuser.  Perhaps it's pie-in-the-sky, but my hope is to
> > achieve that.
>
> Well, from an actual security perspective that would make it equivalent to
> superuser in the case of anybody using password auth. I'm not sure we cant
> to grant that out to DUMP by default - perhaps we need a separate one for
> DUMPAUTH or DUMPPASSWORDS.

That makes sense to me- DUMPAUTH or maybe just DUMPALL (to go with
pg_dumpall).

I'd prefer DUMPAUTH specifically because it makes it glaringly obvious that you're going to be dumping authentication data.


> (We could dump all the users *without* passwords with just the DUMP
> privilege)

Agreed.  That's actually something that I think would be *really* nice-
an option to dump the necessary globals for whatever database you're
running pg_dump against.  We have existing problems in this area
(database-level GUCs aren't considered database-specific and therefore
aren't picked up by pg_dump, for example..), but being able to also dump
the roles which are used in a given database with pg_dump would be
*really* nice..

Yes. Shouldn't be *that* hard, completely independent of this privilege work. Famous last words, I'm sure...

--

pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]
Next
From: David Rowley
Date:
Subject: Performance improvement for joins where outer side is unique