Thread: Directory/File Access Permissions for COPY and Generic File Access Functions

Directory/File Access Permissions for COPY and Generic File Access Functions

From
"Brightwell, Adam"
Date:
All,

The attached patch for review implements a directory permission system that allows for providing a directory read/write capability to directories for COPY TO/FROM and Generic File Access Functions to non-superusers.  This is not a complete solution as it does not currently contain documentation or regression tests.  However, I hope to get some feedback, as I am sure there are aspects of it that need to be reworked.  So I thought I'd put it out there for comments/review.

The approach taken is to create "directory aliases" that have a unique name and path, as well as an associated ACL.  A superuser can create a new alias to any directory on the system and then provide READ or WRITE permissions to any non-superuser.  When a non-superuser then attempts to execute a COPY TO/FROM or any one of the generic file access functions, a permission check is performed against the aliases for the user and target directory.  Superusers are allowed to bypass all of these checks.  Every alias path must be absolute in order to avoid potential risks.  However, in the generic file access functions superusers are still allowed to execute the functions with a relative path, whereas non-superusers are required to provide an absolute path.

- Implementation Details -

System Catalog:
pg_diralias
 - dirname - the name of the directory alias
 - dirpath - the directory path - must be absolute
 - diracl - the ACL for the directory

Syntax:
CREATE DIRALIAS <name> AS '<path>'
ALTER DIRALIAS <name> AS '<path>'
ALTER DIRALIAS <name> RENAME TO <new_name>
DROP DIRALIAS <name>

This is probably the area where I would most appreciate your thoughts and recommendations. To GRANT permissions to a directory alias, I had to create a special variant of GRANT, since READ and WRITE are not reserved keywords and cause grammar issues.  Therefore, I chose to start with the following syntax:

GRANT ON DIRALIAS <name> <permissions> TO <roles>

where <permissions> is either READ, WRITE or ALL.
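
As a rough illustration of the check described above, the following Python sketch models an alias as a name, an absolute path and a per-role privilege set. It is not taken from the patch: the class DirAlias, the helpers grant and may_access, and the rule that a file must sit directly inside the aliased directory are all assumptions made for the example.

```python
import posixpath
from dataclasses import dataclass, field


# Hypothetical stand-in for one pg_diralias row: dirname, dirpath, diracl.
@dataclass
class DirAlias:
    dirname: str
    dirpath: str
    diracl: dict = field(default_factory=dict)  # role name -> set of privileges

    def __post_init__(self):
        # The proposal requires alias paths to be absolute.
        if not posixpath.isabs(self.dirpath):
            raise ValueError("alias path must be absolute")
        self.dirpath = posixpath.normpath(self.dirpath)


def grant(alias, permissions, roles):
    """Mimic GRANT ON DIRALIAS <name> <permissions> TO <roles>."""
    privs = {"READ", "WRITE"} if permissions == "ALL" else {permissions}
    for role in roles:
        alias.diracl.setdefault(role, set()).update(privs)


def may_access(aliases, role, is_superuser, path, wanted):
    """Return True if `role` may perform `wanted` (READ or WRITE) on `path`."""
    if is_superuser:
        return True  # superusers bypass every check
    if not posixpath.isabs(path):
        return False  # non-superusers must supply an absolute path
    # Assumption: the file must live directly in an aliased directory.
    target_dir = posixpath.dirname(posixpath.normpath(path))
    return any(
        alias.dirpath == target_dir and wanted in alias.diracl.get(role, set())
        for alias in aliases
    )
```

For example, after grant(logs, "READ", ["auditor"]) on an alias for /var/log/postgresql, the role auditor could read a file in that directory but not write to it, and a path that climbs out of the directory with '..' would be refused.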

Any comments, suggestions or feedback would be greatly appreciated.

Thanks,
Attachment
"Brightwell, Adam" <adam.brightwell@crunchydatasolutions.com> writes:
> The attached patch for review implements a directory permission system that
> allows for providing a directory read/write capability to directories for
> COPY TO/FROM and Generic File Access Functions to non-superusers.

TBH, this sounds like it's adding a lot of mechanism and *significant*
risk of unforeseen security issues in order to solve a problem that we
do not need to solve.  The field demand for such a feature is just about
indistinguishable from zero.
        regards, tom lane



On Wed, Oct 15, 2014 at 11:34 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Brightwell, Adam" <adam.brightwell@crunchydatasolutions.com> writes:
>> The attached patch for review implements a directory permission system that
>> allows for providing a directory read/write capability to directories for
>> COPY TO/FROM and Generic File Access Functions to non-superusers.
>
> TBH, this sounds like it's adding a lot of mechanism and *significant*
> risk of unforeseen security issues in order to solve a problem that we
> do not need to solve.  The field demand for such a feature is just about
> indistinguishable from zero.

I am also not convinced that we need this.  If we need to allow
non-superusers COPY permission at all, can we just exclude certain
"unsafe" directories (like the data directory, and tablespaces) and
let them access anything else?  Or can we have a whitelist of
directories stored as a PGC_SUSET GUC?  This seems awfully heavyweight
for what it is.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Directory/File Access Permissions for COPY and Generic File Access Functions

From
Stephen Frost
Date:
* Robert Haas (robertmhaas@gmail.com) wrote:
> On Wed, Oct 15, 2014 at 11:34 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > "Brightwell, Adam" <adam.brightwell@crunchydatasolutions.com> writes:
> >> The attached patch for review implements a directory permission system that
> >> allows for providing a directory read/write capability to directories for
> >> COPY TO/FROM and Generic File Access Functions to non-superusers.
> >
> > TBH, this sounds like it's adding a lot of mechanism and *significant*
> > risk of unforeseen security issues in order to solve a problem that we
> > do not need to solve.  The field demand for such a feature is just about
> > indistinguishable from zero.
>
> I am also not convinced that we need this.  If we need to allow
> non-superusers COPY permission at all, can we just exclude certain
> "unsafe" directories (like the data directory, and tablespaces) and
> let them access anything else?

Wow..  I'd say 'no' to this, certainly.  Granularity is required here.
I want to give a non-superuser the ability to slurp data off a specific
NFS mount, not read /etc/passwd..

> Or can we have a whitelist of
> directories stored as a PGC_SUSET GUC?  This seems awfully heavyweight
> for what it is.

Hrm, perhaps this would work though..

Allow me to outline a few use-cases which I see for this though and
perhaps that'll help us make progress.

This started out as a request for a non-superuser to be able to review
the log files without needing access to the server.  Now, things can
certainly be set up on the server to import *all* logs and then grant
access to a non-superuser, but generally it's "I need to review the log
from X to Y" and not *all* logs need to be stored or kept in PG.

In years past, I've wanted to be able to grant this ability out for
users to do loads without having to transfer the data through the user's
laptop or get them to log onto the Linux box from their Windows desktop
and pull the data in via psql (it's a bigger deal than some might
think..), and then there's the general ETL case where, without this, you
end up running something like Pentaho and having to pass all the data
through Java to get it into the database.

Building on that is the concept of *background* loads, with
pg_background.  That's a killer capability, in my view.  "Hey, PG, go
load all the files in this directory into this table, but don't make me
have to stick around and make sure my laptop is still connected for the
next 3 hours."

Next, the file_fdw could leverage this catalog to do its own checks and
allow non-superusers to use it, which would be fantastic and gets back
to the 'log file' use-case above.

And then there is the next-level item: CREATE TABLESPACE, which we
already see folks like RDS and others having to hack the codebase to
add as a non-superuser capability.  It'd need to be an independently
grantable capability, of course.
Thanks!
    Stephen

Re: Directory/File Access Permissions for COPY and Generic File Access Functions

From
Bruce Momjian
Date:
On Thu, Oct 16, 2014 at 12:01:28PM -0400, Stephen Frost wrote:
> This started out as a request for a non-superuser to be able to review
> the log files without needing access to the server.  Now, things can
> certainly be set up on the server to import *all* logs and then grant
> access to a non-superuser, but generally it's "I need to review the log
> from X to Y" and not *all* logs need to be stored or kept in PG.

Why is this patch showing up before being discussed?  You are having to
back into the discussion because of this.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + Everyone has their own god. +



Re: Directory/File Access Permissions for COPY and Generic File Access Functions

From
Stephen Frost
Date:
* Bruce Momjian (bruce@momjian.us) wrote:
> On Thu, Oct 16, 2014 at 12:01:28PM -0400, Stephen Frost wrote:
> > This started out as a request for a non-superuser to be able to review
> > the log files without needing access to the server.  Now, things can
> > certainly be set up on the server to import *all* logs and then grant
> > access to a non-superuser, but generally it's "I need to review the log
> > from X to Y" and not *all* logs need to be stored or kept in PG.
>
> Why is this patch showing up before being discussed?  You are having to
> back into the discussion because of this.

For my part, I didn't actually see it as being a questionable use-case
from the start..  That was obviously incorrect, though I didn't know
that previously.  The general idea has been discussed a couple of times
before, at least as far back as 2005:

http://www.postgresql.org/message-id/430F78E0.9020206@cs.concordia.ca

It's also a feature available in other databases (at least MySQL and
Oracle, but I'm pretty sure others also).

I can also recall chatting with folks about it a couple of times over
the years at various conferences.  Still, perhaps it would have been
better to post about the idea before the patch, but hindsight is often
20/20.
Thanks!
    Stephen

Re: Directory/File Access Permissions for COPY and Generic File Access Functions

From
Peter Eisentraut
Date:
On 10/16/14 12:01 PM, Stephen Frost wrote:
> This started out as a request for a non-superuser to be able to review
> the log files without needing access to the server.

I think that can be done with a security-definer function.




Re: Directory/File Access Permissions for COPY and Generic File Access Functions

From
Peter Eisentraut
Date:
Your patch is missing the files src/include/catalog/pg_diralias.h,
src/include/commands/diralias.h, and src/backend/commands/diralias.c.

(Hint: git add -N)




Re: Directory/File Access Permissions for COPY and Generic File Access Functions

From
"Brightwell, Adam"
Date:
Peter,

> Your patch is missing the files src/include/catalog/pg_diralias.h,
> src/include/commands/diralias.h, and src/backend/commands/diralias.c.
>
> (Hint: git add -N)

Yikes, sorry about that, not sure how that happened.  Attached is an updated patch.

-Adam

--
Attachment

Re: Directory/File Access Permissions for COPY and Generic File Access Functions

From
Stephen Frost
Date:
* Peter Eisentraut (peter_e@gmx.net) wrote:
> On 10/16/14 12:01 PM, Stephen Frost wrote:
> > This started out as a request for a non-superuser to be able to review
> > the log files without needing access to the server.
>
> I think that can be done with a security-definer function.

Of course it can be.  We could replace the entire authorization system
with security definer functions too.  I don't view this as an argument
against this feature, particularly as we know other systems have it,
users have asked for it multiple times, and large PG deployments have had
to hack around our lack of it.
Thanks,
    Stephen

Re: Directory/File Access Permissions for COPY and Generic File Access Functions

From
Adam Brightwell
Date:
All,

Attached is a patch with minor updates/corrections.

-Adam

--
Attachment

Re: Directory/File Access Permissions for COPY and Generic File Access Functions

From
Peter Eisentraut
Date:
I think the way this should work is that if you create a DIRALIAS, then
the COPY command should refer to it by logical name, e.g.,

CREATE DIRALIAS dumpster AS '/tmp/trash';
COPY mytable TO dumpster;


If you squint a bit, this is the same as a tablespace.  Maybe those two
concepts could be combined.

On the other hand, we already have file_fdw, which does something very
similar.




Re: Directory/File Access Permissions for COPY and Generic File Access Functions

From
Peter Eisentraut
Date:
On 10/27/14 7:27 AM, Stephen Frost wrote:
> * Peter Eisentraut (peter_e@gmx.net) wrote:
>> On 10/16/14 12:01 PM, Stephen Frost wrote:
>>> This started out as a request for a non-superuser to be able to review
>>> the log files without needing access to the server.
>>
>> I think that can be done with a security-definer function.
> 
> Of course it can be.  We could replace the entire authorization system
> with security definer functions too.

I don't think that is correct.

It's easy to do something with security definer functions if it's single
purpose, with a single argument, like load this file into this table,
let these users do it.

It's not easy to do it with functions if you have many parameters, like
in a general SELECT statement.

So I would like to see at least three wildly different use cases for
this before believing that a security definer function isn't appropriate.

> I don't view this as an argument
> against this feature, particularly as we know other systems have it,
> users have asked for it multiple times, and large PG deployments have had
> to hack around our lack of it.

What other systems have it?  Do you have links to their documentation?





Re: Directory/File Access Permissions for COPY and Generic File Access Functions

From
Stephen Frost
Date:
* Peter Eisentraut (peter_e@gmx.net) wrote:
> On 10/27/14 7:27 AM, Stephen Frost wrote:
> > * Peter Eisentraut (peter_e@gmx.net) wrote:
> >> On 10/16/14 12:01 PM, Stephen Frost wrote:
> >>> This started out as a request for a non-superuser to be able to review
> >>> the log files without needing access to the server.
> >>
> >> I think that can be done with a security-definer function.
> >
> > Of course it can be.  We could replace the entire authorization system
> > with security definer functions too.
>
> I don't think that is correct.

Of course it is- you simply have to move all the logic into the
function.

> It's easy to do something with security definer functions if it's single
> purpose, with a single argument, like load this file into this table,
> let these users do it.

The files won't be consistently named and there may be cases to make
ad-hoc runs or test runs.  No, it isn't as simple as always being a
single, specific filename and when you consider that there needs to be
intelligence about the actual path being specified and making sure that
there can't be '..' and similar, it gets to be a pretty ugly situation
to make our users have to deal with.
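
To make the '..' concern concrete, here is a small Python sketch, not anything from the patch or from PostgreSQL, comparing a plain string-prefix test with a check that normalizes the path first. The directory /srv/loads and both function names are made up for illustration; symlinks are deliberately left out and would need separate handling.

```python
import posixpath

ALLOWED_DIR = "/srv/loads"  # hypothetical directory a wrapper function is meant to expose


def naive_is_allowed(path):
    # Plain string prefix test: looks reasonable, but is easy to fool.
    return path.startswith(ALLOWED_DIR)


def normalized_is_allowed(path):
    # Collapse '.' and '..' first, then compare whole path components.
    # Note: this does not resolve symlinks.
    if not posixpath.isabs(path):
        return False
    cleaned = posixpath.normpath(path)
    return posixpath.commonpath([ALLOWED_DIR, cleaned]) == ALLOWED_DIR


for candidate in (
    "/srv/loads/2014-10-16.csv",
    "/srv/loads/../../etc/passwd",
    "/srv/loads_private/secret.csv",
):
    print(candidate, naive_is_allowed(candidate), normalized_is_allowed(candidate))
```

The prefix test accepts all three candidates, while the normalized check accepts only the first. Every security-definer wrapper a user writes would have to get this right on its own.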

> It's not easy to do it with functions if you have many parameters, like
> in a general SELECT statement.

You could define SRFs for every table.

> So I would like to see at least three wildly different use cases for
> this before believing that a security definer function isn't appropriate.

I'm not following this- there's probably 100s of use-cases for this, but
they're all variations on 'read and/or write data server-side instead of
through a front-end connection', which is what the purpose of the
feature is..  I do see this as being useful for COPY, Large Object, and
the file_fdw...

> > I don't view this as an argument
> > against this feature, particularly as we know other systems have it,
> > users have asked for it multiple times, and large PG deployments have had
> > to hack around our lack of it.
>
> What other systems have it?  Do you have links to their documentation?

MySQL:
http://dev.mysql.com/doc/refman/5.1/en/privileges-provided.html#priv_file

(note they provide a way to limit access also, via secure_file_priv)

Oracle:
http://docs.oracle.com/cd/B19306_01/server.102/b14200/statements_5007.htm
http://docs.oracle.com/cd/B19306_01/server.102/b14200/statements_9013.htm#i2125999

SQL Server:
http://msdn.microsoft.com/en-us/library/ms175915.aspx
(Note: they can actually run as the user connected instead of the SQL DB
server, if Windows authentication is used, which is basically doing
Kerberos proxying unless I'm mistaken; it's unclear how the security is
maintained if it's a SQL server logon user..).

DB2:

http://www-01.ibm.com/support/knowledgecenter/SSEPGG_9.7.0/com.ibm.db2.luw.admin.dm.doc/doc/c0004589.html?cp=SSEPGG_9.7.0

etc...
Thanks,
    Stephen

Re: Directory/File Access Permissions for COPY and Generic File Access Functions

From
Stephen Frost
Date:
* Peter Eisentraut (peter_e@gmx.net) wrote:
> I think the way this should work is that if you create a DIRALIAS, then
> the COPY command should refer to it by logical name, e.g.,
>
> CREATE DIRALIAS dumpster AS '/tmp/trash';
> COPY mytable TO dumpster;

You'd have to be able to specify the filename also.  I'm not against the
idea of using the 'diralias' alias name this way, just saying it isn't
quite as simple as the above.

> If you squint a bit, this is the same as a tablespace.  Maybe those two
> concepts could be combined.

CREATE TABLESPACE is something else which could be supported with
diralias, though it'd have to be an independently grantable capability
and it'd be a bad idea to let a user create tablespaces in a directory
and then also be able to copy from/to files there (backend crashes,
etc).  This exact capability is more-or-less what RDS has had to hack on
to PG for their environment, as I understand it, in case you're looking
for a use-case.

> On the other hand, we already have file_fdw, which does something very
> similar.

It's really not at all the same..  Perhaps we'll get there some day, but
we're a very long way away from file_fdw having the ability to replace
normal tablespaces...
Thanks!
    Stephen

On Mon, Oct 27, 2014 at 5:59 PM, Adam Brightwell
<adam.brightwell@crunchydatasolutions.com> wrote:
> Attached is a patch with minor updates/corrections.

Given that no fewer than four people - all committers - have expressed
doubts about the design of this patch, I wonder why you're bothering
to post a new version.  It seems to me that you should be discussing
the fundamental design, not making minor updates to the code.  I
really hope this is not moving in the direction of another "surprise
commit" like we had with RLS.  There is absolutely NOT consensus on
this design or anything close to it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Directory/File Access Permissions for COPY and Generic File Access Functions

From
Stephen Frost
Date:
* Robert Haas (robertmhaas@gmail.com) wrote:
> There is absolutely NOT consensus on
> this design or anything close to it.

There is no doubt that consensus on the desirability and design needs
to be reached before we can even consider committing it.  I suspect
Adam posted it simply because he had identified issues himself and
wanted to make others aware that things had been fixed.

That said, it sounds like the primary concern has been if we want this
feature at all and there hasn't been much discussion of the design
itself.  Comments about the technical design would be great.  I
appreciate your thoughts about using a PGC_SUSET GUC, but I don't feel
like it really works as it's all-or-nothing and doesn't provide
read-vs-write, unless we extend it out to be multiple GUCs and then
there is still the question about per-role access..

I'm not sure that I see a way to allow the per-role granularity without
having a top-level catalog object on which the GRANT can be executed and
ACL information stored.  Perhaps it's unfortunate that we don't have a
more generic way to address that but I'm not sure I really see another
catalog table as a big problem..
Thanks!
    Stephen

Re: Directory/File Access Permissions for COPY and Generic File Access Functions

From
Andres Freund
Date:
On 2014-10-28 09:24:18 -0400, Stephen Frost wrote:
> * Robert Haas (robertmhaas@gmail.com) wrote:
> > There is absolutely NOT consensus on
> > this design or anything close to it.
> 
> There is no doubt that consensus on the desirability and design needs
> to be reached before we can even consider committing it.  I suspect
> Adam posted it simply because he had identified issues himself and
> wanted to make others aware that things had been fixed.
> 
> That said, it sounds like the primary concern has been if we want this
> feature at all and there hasn't been much discussion of the design
> itself.

Well, why waste time on the technical details when we haven't agreed
that the feature is worthwhile? Review bandwidth is a serious problem in
this community.

Greetings,

Andres Freund

--
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Directory/File Access Permissions for COPY and Generic File Access Functions

From
Stephen Frost
Date:
* Andres Freund (andres@2ndquadrant.com) wrote:
> On 2014-10-28 09:24:18 -0400, Stephen Frost wrote:
> > There is no doubt that consensus on the desirability and design needs
> > to be reached before we can even consider committing it.  I suspect
> > Adam posted it simply because he had identified issues himself and
> > wanted to make others aware that things had been fixed.
> >
> > That said, it sounds like the primary concern has been if we want this
> > feature at all and there hasn't been much discussion of the design
> > itself.
>
> Well, why waste time on the technical details when we haven't agreed
> that the feature is worthwhile? Review bandwidth is a serious problem in
> this community.

Fair enough, and I'm happy to discuss that (and have been..); I was
simply objecting to the implication that the desirability concerns
raised were design concerns- the only design concern raised was wrt
it being possibly too heavyweight and the PGC_SUSET GUC suggestion (at
least, based on my re-reading of the thread..).
Thanks!
    Stephen

On Tue, Oct 28, 2014 at 9:24 AM, Stephen Frost <sfrost@snowman.net> wrote:
> That said, it sounds like the primary concern has been if we want this
> feature at all and there hasn't been much discussion of the design
> itself.  Comments about the technical design would be great.  I
> appreciate your thoughts about using a PGC_SUSET GUC, but I don't feel
> like it really works as it's all-or-nothing and doesn't provide
> read-vs-write, unless we extend it out to be multiple GUCs and then
> there is still the question about per-role access..

It sounds to me like you've basically settled on the way that you want
to implement it - without prior discussion on the mailing list - and
you're not trying very hard to make any of the alternatives work.
It's not the community's job to come up with a design that satisfies
you; it's your job to come up with a design that satisfies the
community.  That doesn't *necessarily* mean that you have to change
the design that you've come up with; convincing other people that your
design is the best one is also an option.  But I don't see that you're
making any real attempt to do that.

Your previous comment on the idea of a PGC_SUSET GUC was "Hrm, perhaps
this would work though.." and then, with zero further on-list
discussion, you've arrived at "I don't feel like it really works as
it's all-or-nothing and doesn't provide read-vs-write".  Those are
precisely the kinds of issues that you should be discussing here in
detail, not cogitating on in isolation and then expecting this group
of people to accept that your original design is really for the best
after all.

I also find your technical arguments - to the extent that you've
bothered to articulate them at all - to be without merit.  The
"question about per-role access" is easily dealt with, so let's start
there: if you make it a GUC, ALTER USER .. SET can be used to set
different values for different users.  No problem.  Your other
criticism that it is "all-or-nothing" seems to me to be totally
incomprehensible, since as far as I can see a GUC with a list of
pathnames is exactly the same functionality that you're proposing to
implement via a much more baroque syntax.  It is no more or less
all-or-nothing than that.  Finally, you mention "read-vs-write"
access.  You haven't even attempted to argue that we need to make that
distinction - in fact, you don't seem to have convinced a
significant majority of the people that we need this feature at all
- but if we do, the fact that it might require two GUCs instead of one
is not a fatal objection to that design. (I'd be prepared to concede
that if there are half a dozen different privileges on directories
that we might want to grant, then wedging it into a GUC might be a
stretch.)
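
A minimal Python sketch of how the GUC idea could behave with one setting for reads and one for writes, overridden per role in the way ALTER USER .. SET would allow. The setting names copy_read_directories and copy_write_directories, the roles and the paths are hypothetical; nothing like them exists in PostgreSQL, and this only models the lookup, not the server.

```python
import posixpath

# Hypothetical setting names; the thread does not name the GUCs.
READ_GUC = "copy_read_directories"
WRITE_GUC = "copy_write_directories"

# Server-wide defaults, then per-role overrides as ALTER USER .. SET would give.
server_settings = {READ_GUC: "", WRITE_GUC: ""}
role_settings = {
    "etl_loader": {READ_GUC: "/mnt/nfs/incoming, /var/log/postgresql"},
    "exporter": {WRITE_GUC: "/mnt/nfs/outgoing"},
}


def effective_setting(role, name):
    """A per-role value wins over the server-wide default."""
    return role_settings.get(role, {}).get(name, server_settings[name])


def parse_dir_list(value):
    """Split a comma-separated list, keeping only absolute paths."""
    dirs = (item.strip() for item in value.split(","))
    return [posixpath.normpath(d) for d in dirs if posixpath.isabs(d)]


def guc_allows(role, path, writing):
    """Check `path` against the read or write list in effect for `role`."""
    name = WRITE_GUC if writing else READ_GUC
    if not posixpath.isabs(path):
        return False
    cleaned = posixpath.normpath(path)
    return any(
        posixpath.commonpath([d, cleaned]) == d
        for d in parse_dir_list(effective_setting(role, name))
    )
```

With these settings, etl_loader can read under /mnt/nfs/incoming but not write there, and a role with no override falls back to the empty server-wide lists and is refused. Each further privilege would need another setting of its own, which is the scaling concern conceded in the parenthetical above.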

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Directory/File Access Permissions for COPY and Generic File Access Functions

From
Stephen Frost
Date:
* Robert Haas (robertmhaas@gmail.com) wrote:
> On Tue, Oct 28, 2014 at 9:24 AM, Stephen Frost <sfrost@snowman.net> wrote:
> > That said, it sounds like the primary concern has been if we want this
> > feature at all and there hasn't been much discussion of the design
> > itself.  Comments about the technical design would be great.  I
> > appreciate your thoughts about using a PGC_SUSET GUC, but I don't feel
> > like it really works as it's all-or-nothing and doesn't provide
> > read-vs-write, unless we extend it out to be multiple GUCs and then
> > there is still the question about per-role access..
>
> It sounds to me like you've basically settled on the way that you want
> to implement it - without prior discussion on the mailing list - and
> you're not trying very hard to make any of the alternatives work.

I'm happy to put more effort into alternatives, was just trying to
outline what capabilities I felt it should have and make sure others
proposing designs understood the granularity requested.

> It's not the community's job to come up with a design that satisfies
> you; it's your job to come up with a design that satisfies the
> community.  That doesn't *necessarily* mean that you have to change
> the design that you've come up with; convincing other people that your
> design is the best one is also an option.  But I don't see that you're
> making any real attempt to do that.

There was only one other design to contrast against- the rest has been
concern about the desirability, which is what I've been trying to
address by responding to Peter's request about documentation and how
this capability exists in other systems.

> Your previous comment on the idea of a PGC_SUSET GUC was "Hrm, perhaps
> this would work though.." and then, with zero further on-list
> discussion, you've arrived at "I don't feel like it really works as
> it's all-or-nothing and doesn't provide read-vs-write".  Those are
> precisely the kinds of issues that you should be discussing here in
> detail, not cogitating on in isolation and then expecting this group
> of people to accept that your original design is really for the best
> after all.

Alright, I'll try and outline a more detailed proposal which uses GUCs
to achieve the level of granularity that is being sought and we can
discuss it.

> I also find your technical arguments - to the extent that you've
> bothered to articulate them at all - to be without merit.  The
> "question about per-role access" is easily dealt with, so let's start
> there: if you make it a GUC, ALTER USER .. SET can be used to set
> different values for different users.  No problem.

No, I simply hadn't thought about that approach and I'm glad that you're
clarifying it..  I'll think about it more but my initial concern is
being able to identify everything a user has access to would then become
more complex as you'd have to consider what special GUCs they have set
in pg_db_role_setting.  I see how what you're proposing would work there though.

> Your other
> criticism that it is "all-or-nothing" seems to me to be totally
> incomprehensible, since as far as I can see a GUC with a list of
> pathnames is exactly the same functionality that you're proposing to
> implement via a much more baroque syntax.  It is no more or less
> all-or-nothing than that.

Apologies about not being clear- that 'all-or-nothing' was without
considering using a per-user GUC to control it; I had thought the
proposal was a single GUC and then a role attribute which said if a
given role could access everything in the global list or not.  Using a
per-role GUC solves that.

> Finally, you mention "read-vs-write"
> access.  You haven't even attempted to argue that we need to make that
> distinction

The use-case that I had described up-thread, I had thought, made it
clear that there will be cases where a user should have only read-only
access to a directory (able to import log files) and cases where a user
should be able to write to a directory (exporting to an NFS mount or
similar).

> - in fact, you don't seem to have convinced a
> significant majority of the people that we need this feature at all

That's certainly what I've been primarily focused on addressing as it's
the first hurdle to jump.  As I mentioned to Bruce, I didn't realize
there was really a question about that, but evidently that was
incorrect and I'm working to rectify the situation.

> - but if we do, the fact that it might require two GUCs instead of one
> is not a fatal objection to that design. (I'd be prepared to concede
> that if there are half a dozen different privileges on directories
> that we might want to grant, then wedging it into a GUC might be a
> stretch.)

There are more capabilities that I've been considering longer-term but
wasn't sure if they should be independent or just lumped into the
simpler read/write category:

read (eg: importing log files, or importing from an NFS mount)
write (eg: exporting to NFS mount)
tablespace (eg: create a tablespace in a subdir of a directory)
create directory (eg: create subdirs)
modify permissions (eg: allow users other than pg to read/write/etc)
directory listing
large-object import/export (might be same as read/write)
COPY PIPE

I can see cases where you wouldn't want to allow a directory listing,
but would want to allow read (or write), for example.  There are also
cases where you would or wouldn't want a given user to be able to chmod
the files.  I'm not sure that it should matter if it's a large object or
not, or if it's being used by file_fdw vs. normal COPY, but they're
certainly things to consider.  The tablespace case is one which I think
really needs a lot of consideration since it could be particularly
dangerous, as well as COPY PIPE (which might not ever be able to be for
non-superuser, in the end..).

I'll discuss with Adam putting a wiki together which outlines the use
cases and rationale for them and hopefully that'll lead into a better
discussion about the possible permissions which would make sense to
exist for these and that may inform us as to if a GUC-based approach
would work.  I'm still unsure about using GUCs to define permissions in
this way.  That feels novel to me for PG to do, but I'll admit that I
may just be ignorant or forgetting existing cases where we do that.

Thanks for explaining the GUC-based proposal further.
Stephen

Re: Directory/File Access Permissions for COPY and Generic File Access Functions

From
Adam Brightwell
Date:
Robert,
 
> Given that no fewer than four people - all committers - have expressed
> doubts about the design of this patch, I wonder why you're bothering
> to post a new version.

I understand and my intent was in no way to disregard those concerns.  The only reason that I have posted a new version was simply to address some minor issues that I noticed when responding to Peter's earlier comment about missing files.

> It seems to me that you should be discussing
> the fundamental design, not making minor updates to the code.

Ok.  I'm certainly looking at the other options proposed and will work with Stephen to put together an appropriate design for discussion here.

> I really hope this is not moving in the direction of another "surprise
> commit" like we had with RLS.  There is absolutely NOT consensus on
> this design or anything close to it.

Certainly not and I am in no way confused that consensus has not been reached.

-Adam

--
On Tue, Oct 28, 2014 at 11:16 AM, Stephen Frost <sfrost@snowman.net> wrote:
> There are more capabilities that I've been considering longer-term but
> wasn't sure if they should be independent or just lumped into the
> simpler read/write category:
>
> read (eg: importing log files, or importing from an NFS mount)
> write (eg: exporting to NFS mount)
> tablespace (eg: create a tablespace in a subdir of a directory)
> create directory (eg: create subdirs)
> modify permissions (eg: allow users other than pg to read/write/etc)
> directory listing
> large-object import/export (might be same as read/write)
> COPY PIPE

I think it would be a good idea to figure out how this fits together
and propose a design that covers all the cases you think are
important, and then see how many of them the community agrees are
important.  I have no problem with incremental commits moving toward
an agreed-upon design, but it's important that we don't go off in one
direction and then have to reverse course, because it creates upgrade
problems for our users.

To articulate my own concerns perhaps a bit better, there are two major
things I don't like about the whole DIRALIAS proposal.  Number one,
you're creating this SQL object whose name is not actually used for
anything other than manipulating the alias you created.  The users are
still operating on pathnames.  That's awfully strange.  Number two,
every other SQL object we have has a name that is one or several
English words.  DIRALIAS does not appear in any dictionary.  The
second objection can be answered by renaming the facility, but the
first one is not so straightforward.

> I'll discuss with Adam putting a wiki together which outlines the use
> cases and rationale for them and hopefully that'll lead into a better
> discussion about the possible permissions which would make sense to
> exist for these and that may inform us as to if a GUC-based approach
> would work.  I'm still unsure about using GUCs to define permissions in
> this way.  That feels novel to me for PG to do, but I'll admit that I
> may just be ignorant or forgetting existing cases where we do that.

Well, there's temp_file_limit, for example.  That's not exactly the
same, but it bears a passing resemblance.

I'm definitely not saying that the GUC-based proposal is perfect.  It
isn't, and if we're going to need a whole bunch of different
permissions that are all per-directory, that could get ugly in a
hurry.  My points are (1) the community does not have to accept this
feature just because you propose it, and in fact there's a good
argument for rejecting it outright, which is that very few users are
going to get any benefit out of this, and it might end up being a
whole lot of code; and (2) the pros and cons of accepting this at all,
and of different designs, need to be debated here, on this list, in an
open way.

I think it would help, on all accounts, to explain why in the world
we're spending time on this in the first place.  I have a sneaking
suspicion this is 1 of N things we need to do to meet some US
government security standard, and if something like that is the case,
that could tip the balance toward doing it, or toward a particular
implementation of the concept.  From my point of view, if you made a
list of all of the annoyances of using PostgreSQL and listed them in
order of importance, you'd burn through a fair amount of paper before
reaching this one.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



On Tue, Oct 28, 2014 at 11:33 AM, Adam Brightwell
<adam.brightwell@crunchydatasolutions.com> wrote:
>> Given that no fewer than four people - all committers - have expressed
>> doubts about the design of this patch, I wonder why you're bothering
>> to post a new version.
>
> I understand and my intent was in no way to disregard those concerns.  The
> only reason that I have posted a new version was simply to address some
> minor issues that I noticed when responding to Peter's earlier comment about
> missing files.
>
>> It seems to me that you should be discussing
>> the fundamental design, not making minor updates to the code.
>
> Ok.  I'm certainly looking at the other options proposed and will work with
> Stephen to put together an appropriate design for discussion here.
>
>> I really hope this is not moving in the direction of another "surprise
>> commit" like we had with RLS.  There is absolutely NOT consensus on
>> this design or anything close to it.
>
> Certainly not and I am in no way confused that consensus has not been
> reached.

OK, thanks.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Directory/File Access Permissions for COPY and Generic File Access Functions

From
Kevin Grittner
Date:
Robert Haas <robertmhaas@gmail.com> wrote:

> I think it would help, on all accounts, to explain why in the world
> we're spending time on this in the first place.  I have a sneaking
> suspicion this is 1 of N things we need to do to meet some US
> government security standard, and if something like that is the case,
> that could tip the balance toward doing it, or toward a particular
> implementation of the concept.

Stephen may correct me on this, but I seem to remember him saying
that this was part of a general effort to avoid needing to use a
superuser login for routine tasks that don't fit into the area of
what a sysadmin would do.  That seems like a laudable goal to me.
Of course, most or all of what this particular feature would allow
can be done using superuser-owned SECURITY DEFINER functions, but
that is sure a lot clumsier and more error-prone than being able to say
that role x can read from directory data/input and role y can write
to directory data/output.

That said, Stephen does seem to have some additional specific use
cases in mind which he hasn't shared with the list; knowing what
problems we're talking about solving would sure help make
discussions about the possible solutions more productive.  :-)

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Directory/File Access Permissions for COPY and Generic File Access Functions

From
Stephen Frost
Date:
Kevin,

Thanks.

* Kevin Grittner (kgrittn@ymail.com) wrote:
> Stephen may correct me on this, but I seem to remember him saying
> that this was part of a general effort to avoid needing to use a
> superuser login for routine tasks that don't fit into the area of
> what a sysadmin would do.  That seems like a laudable goal to me.

Right, and this is one of those things that only a superuser can do now.
I had expected to find other more complicated cases which would require
a generalized "pg_permissions" type of approach but having gone through
the superuser() checks, this is the one case where we really needed a
complex ACL scheme that, in my view at least, warranted a new catalog
table.  Rather than come up with "pg_permissions" and some ugly hacks
to make that work for a variety of object types, I looked to address the
specific case of server-side directory access in a way similar to what
other databases already provide.

> Of course, most or all of what this particular feature would allow
> can be done using superuser-owned SECURITY DEFINER functions, but
> that is sure a lot clumsier and more error-prone than being able to say
> that role x can read from directory data/input and role y can write
> to directory data/output.

Exactly.

> That said, Stephen does seem to have some additional specific use
> cases in mind which he hasn't shared with the list; knowing what
> problems we're talking about solving would sure help make
> discussions about the possible solutions more productive.  :-)

It's actually more-or-less the opposite..  As I think I mentioned at
some point earlier, the original ask was to be able to view logs as a
DBA who isn't a superuser, and without having to have those views
delayed or complex cron jobs running to set up access to them.  That's a
*frequently* asked for capability and I don't think this directory type
approach will be the final solution to that specific problem, but it'll
at least get us a lot closer while also providing capabilities that
other databases have and that I've personally wanted for a long time.

In other words, I took the ask and attempted to generalize it out to
cover more use-cases that I've run into which are similar.  While I have
ideas and memories about times when I've wanted this capability for
various use-cases, there's not some pre-defined list that I'm hiding
offline in hopes that no one asks for it, nor is it for some government
check-list.

Since there is evidently interest in this, I'll try to provide some
insight into the times I've run into this previously:

The first time was when I came across COPY and was frustrated that I had to be a
superuser to use it, period.  Initially, I didn't realize it could do
STDIN/STDOUT, but even once I discovered that, I felt it was unfortunate
that only a superuser could do it server-side, unlike other databases.
This, in my view, is probably the experience of nearly every new user to
PG and COPY and, basically, it sucks.

Later on, I started writing scripts to do server-side copy to avoid
having to marshall data through whatever-client-API-I'm-using (perl,
php, python, etc) and where I couldn't do that due to not being able to
run as a superuser, I ended up doing ugly things in some cases (like
exec'ing out to psql..) because I couldn't just tell the server "pull
this file in".

In some cases, COPY wasn't even supported by the client library, as I
recall.  That's better now, but new languages continue to come out and
often initially support the bare minimum (wasn't ruby in this
boat of lacking COPY protocol support initially..?).

Then, when working with Pentaho I came across it again- having to
marshall data through Java and over into PG, and it had to go over a
local TCP connection instead of a unix socket (still the case with our
JDBC driver, no?), primarily to get data into the DB which was out on an
NFS mount in a format that PG could have digested just fine directly or
could have made available via the file_fdw.

Next was the Amazon use-case, which wasn't obvious to me initially but
makes perfect sense now.  They want to allow users to add new i/o
channels and use them but can't let users run as the normal PG
superuser, hence the idea about supporting CREATE TABLESPACE with this
same 'diralias' approach.

The thoughts around permissions related to 'diralias' (chmod, mkdir, ls,
etc) are all just based on what unix provides already.  Similarly,
extending to support large-object import/export along with COPY just
makes sense, as does supporting the file_fdw with this approach, imv.
The file_fdw case is interesting as it's an extension and we'll need to
be able to provide a clear and simple interface to check if the access
is allowed or not which the file_fdw would then leverage.
Thanks,
    Stephen

Re: Directory/File Access Permissions for COPY and Generic File Access Functions

From
Stephen Frost
Date:
Robert,

* Robert Haas (robertmhaas@gmail.com) wrote:
> On Tue, Oct 28, 2014 at 11:16 AM, Stephen Frost <sfrost@snowman.net> wrote:
> > There are more capabilities that I've been considering longer-term but
> > wasn't sure if they should be independent or just lumped into the
> > simpler read/write category:
> >
> > read (eg: importing log files, or importing from an NFS mount)
> > write (eg: exporting to NFS mount)
> > tablespace (eg: create a tablespace in a subdir of a directory)
> > create directory (eg: create subdirs)
> > modify permissions (eg: allow users other than pg to read/write/etc)
> > directory listing
> > large-object import/export (might be same as read/write)
> > COPY PIPE
>
> I think it would be a good idea to figure out how this fits together
> and propose a design that covers all the cases you think are
> important, and then see how many of them the community agrees are
> important.  I have no problem with incremental commits moving toward
> an agreed-upon design, but it's important that we don't go off in one
> direction and then have to reverse course, because it creates upgrade
> problems for our users.

Certainly.

> To articulate my own concerns perhaps a bit better, there are two major
> things I don't like about the whole DIRALIAS proposal.  Number one,
> you're creating this SQL object whose name is not actually used for
> anything other than manipulating the alias you created.

I agree that this makes it feel awkward.  Peter had an interesting
suggestion to make the dir aliases available as actual aliases for the
commands which they would be relevant to.  I hadn't considered that- I
proposed 'diralias' because I didn't like 'directory' since we weren't
actually creating *directories* but rather defining aliases to existing
OS directories in PG.  Perhaps it wasn't clear at the outset, but this
is all work-in-progress and not intended to be the one-true-solution
from on-high.  Apologies if it came across that way.

> The users are
> still operating on pathnames.  That's awfully strange.  Number two,
> every other SQL object we have has a name that is one or several
> English words.  DIRALIAS does not appear in any dictionary.  The
> second objection can be answered by renaming the facility, but the
> first one is not so straightforward.

I do think it's important to support subdirectories (the Amazon use-case
is one where this would be required) and allowing users to specify the
specific file names, so we'd have to come up with a way to combine the
alias and the rest of the fully-qualified path.  That might not be too
bad but, to me at least, it seemed more natural to just use the full
path.  That was from a sysadmin perspective though; from a DBA
perspective, knowing the rest of the path is probably not all that
interesting and using the alias would be simpler for them.

> > I'll discuss with Adam putting a wiki together which outlines the use
> > cases and rationale for them and hopefully that'll lead into a better
> > discussion about the possible permissions which would make sense to
> > exist for these and that may inform us as to if a GUC-based approach
> > would work.  I'm still unsure about using GUCs to define permissions in
> > this way.  That feels novel to me for PG to do, but I'll admit that I
> > may just be ignorant or forgetting existing cases where we do that.
>
> Well, there's temp_file_limit, for example.  That's not exactly the
> same, but it bears a passing resemblance.

Hrm, yes, that's PGC_SUSET and could be set per-user.

> I'm definitely not saying that the GUC-based proposal is perfect.  It
> isn't, and if we're going to need a whole bunch of different
> permissions that are all per-directory, that could get ugly in a
> hurry.  My points are (1) the community does not have to accept this
> feature just because you propose it, and in fact there's a good
> argument for rejecting it outright, which is that very few users are
> going to get any benefit out of this, and it might end up being a
> whole lot of code; and (2) the pros and cons of accepting this at all,
> and of different designs, need to be debated here, on this list, in an
> open way.

I'd like to think that we're doing (2) now.  As for (1), I certainly
feel it's a useful capability and will argue for it, but the community
certainly has the 'final say' on it, of course.  I'm optimistic that the
amount of code will be reasonable and that users will benefit from it or
I wouldn't be advocating it, but that's obviously a judgement call and
others will, and are certainly entitled to, have different opinions.

> I think it would help, on all accounts, to explain why in the world
> we're spending time on this in the first place.

Because I feel it's a valuable feature...?  So do Oracle, MySQL, and
the other databases which support it.  This isn't the first time it's
come up either, as I pointed out up-thread.

> I have a sneaking
> suspicion this is 1 of N things we need to do to meet some US
> government security standard, and if something like that is the case,
> that could tip the balance toward doing it, or toward a particular
> implementation of the concept.

No, it hasn't got anything to do with NIST or other government
standards.  Those standards are much more interested in the general
"reduce the need to be a superuser" concept but there's certainly
nothing in there about directory-level access, nor was it even part of
the original discussion that this idea came out of.  If there were
specific standards about this, I'd have pointed them out (as I've done
previously...), because solving those cases is valuable to our
community, in my view.

> From my point of view, if you made a
> list of all of the annoyances of using PostgreSQL and listed them in
> order of importance, you'd burn through a fair amount of paper before
> reaching this one.

I'm not quite sure what to do with this comment.  Perhaps it isn't at
the top of anyone's list (not even mine), but I didn't think we rejected
features because the community feels that some other feature is more
important.  If we're going to start doing that then we should probably
come up with a list of what features the community wants, prioritize
them, and require that all committers work towards those features to the
exclusion of their own interests, or those of their employers or the
companies they own/run.  I hope I've simply misunderstood the
implication here instead.
Thanks,
    Stephen

Re: Directory/File Access Permissions for COPY and Generic File Access Functions

From
Kevin Grittner
Date:
Stephen Frost <sfrost@snowman.net> wrote:

> the original ask was to be able to view logs as a DBA who isn't a
> superuser, and without having to have those views delayed or
> complex cron jobs running to set up access to them.

I had kinda forgotten it, but I had to set up a cron log rsync at
Wisconsin Courts.  I understand the need.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



On Tue, Oct 28, 2014 at 3:19 PM, Stephen Frost <sfrost@snowman.net> wrote:
>> To articulate my own concerns perhaps a bit better, there are two major
>> things I don't like about the whole DIRALIAS proposal.  Number one,
>> you're creating this SQL object whose name is not actually used for
>> anything other than manipulating the alias you created.
>
> I agree that this makes it feel awkward.  Peter had an interesting
> suggestion to make the dir aliases available as actual aliases for the
> commands which they would be relevant to.  I hadn't considered that- I
> proposed 'diralias' because I didn't like 'directory' since we weren't
> actually creating *directories* but rather defining aliases to existing
> OS directories in PG.

Right.  Another way to go at this would be to just ditch the names.
This exact syntax probably wouldn't work (or might not be a good idea)
because GRANT is so badly overloaded already, but conceptually:

GRANT READ ON DIRECTORY '/home/snowman' TO sfrost;

Or maybe some variant of:

ALTER USER sfrost GRANT READ ON DIRECTORY '/home/snowman';
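
A minimal sketch of what a name-less, path-based grant check might look
like, purely to illustrate the idea.  PATH_GRANTS and role_may() are
hypothetical names, not anything in the patch, and treating a grant on a
directory as covering its subdirectories is an assumption:

```python
import posixpath

# Hypothetical per-role grants keyed by directory path instead of an alias name.
PATH_GRANTS = {"sfrost": {"/home/snowman": {"READ"}}}


def role_may(role, privilege, path):
    """True if path lies at or below a directory on which role holds privilege."""
    if not posixpath.isabs(path):
        return False  # only absolute paths are considered in this sketch
    path = posixpath.normpath(path)  # collapse "." and ".." lexically
    for directory, privileges in PATH_GRANTS.get(role, {}).items():
        if privilege not in privileges:
            continue
        # commonpath compares whole components, so /home/snowman2 does not match
        if posixpath.commonpath([directory, path]) == directory:
            return True
    return False
```

The comparison here is purely lexical; it says nothing about what the
path resolves to on disk.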

> I'm not quite sure what to do with this comment.  Perhaps it isn't at
> the top of anyone's list (not even mine), but I didn't think we rejected
> features because the community feels that some other feature is more
> important.  If we're going to start doing that then we should probably
> come up with a list of what features the community wants, prioritize
> them, and require that all committers work towards those features to the
> exclusion of their own interests, or those of their employers or the
> companies they own/run.  I hope I've simply misunderstood the
> implication here instead.

No, that's not what I'm saying.  Come on.  From my point of view what
happened is that a patch implementing a rather specific design for a
problem I personally viewed as somewhat obscure just sort of dropped
out of nowhere; and it came from people working at a company that is
also working on a bunch of other security-related features.  I
wondered whether there was more to the story, but I guess not.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Directory/File Access Permissions for COPY and Generic File Access Functions

From
Peter Eisentraut
Date:
On 10/27/14 7:36 PM, Stephen Frost wrote:
> MySQL:
> http://dev.mysql.com/doc/refman/5.1/en/privileges-provided.html#priv_file
> 
> (note they provide a way to limit access also, via secure_file_priv)

They have a single privilege to allow the user to read or write any
file.  I think that feature could be useful.

> Oracle:
> http://docs.oracle.com/cd/B19306_01/server.102/b14200/statements_5007.htm
> http://docs.oracle.com/cd/B19306_01/server.102/b14200/statements_9013.htm#i2125999

From the description, that CREATE DIRECTORY command looks to me more
like a tablespace, or a general BLOB space, that you reference by object
name, not by file name.

> SQL Server:
> http://msdn.microsoft.com/en-us/library/ms175915.aspx
> (Note: they can actually run as the user connected instead of the SQL DB
> server, if Windows authentication is used, which is basically doing
> Kerberos proxying unless I'm mistaken; it's unclear how the security is
> maintained if it's a SQL server logon user..).

That could be useful. ;-)  But it's not actually the same as the feature
proposed here.

> DB2:
> http://www-01.ibm.com/support/knowledgecenter/SSEPGG_9.7.0/com.ibm.db2.luw.admin.dm.doc/doc/c0004589.html?cp=SSEPGG_9.7.0

That's also more like the single capability system that MySQL has.


So while this is interesting food for thought, I don't think this really
supports the claim that other systems have a facility very much like
the proposed one.




Re: Directory/File Access Permissions for COPY and Generic File Access Functions

From
Stephen Frost
Date:
* Peter Eisentraut (peter_e@gmx.net) wrote:
> On 10/27/14 7:36 PM, Stephen Frost wrote:
> > MySQL:
> > http://dev.mysql.com/doc/refman/5.1/en/privileges-provided.html#priv_file
> >
> > (note they provide a way to limit access also, via secure_file_priv)
>
> They have a single privilege to allow the user to read or write any
> file.  I think that feature could be useful.

... Optionally limited to a specific directory with the
secure_file_priv, as I pointed out previously.

> > Oracle:
> > http://docs.oracle.com/cd/B19306_01/server.102/b14200/statements_5007.htm
> > http://docs.oracle.com/cd/B19306_01/server.102/b14200/statements_9013.htm#i2125999
>
> From the description, that CREATE DIRECTORY command looks to me more
> like a tablespace, or a general BLOB space, that you reference by object
> name, not by file name.

It also allows you to use the 'external_table_clause' which "is a
read-only table whose metadata is stored in the database but whose data
is stored outside the database. Among other capabilities, external
tables let you query data without first loading it into the database."

In other words, file_fdw.  If you read further down, you'll also see
that the way the file is interpreted is based on the access drivers,
which can be ORACLE_LOADER or ORACLE_DATAPUMP, which means you can read
any file you can use with imp or SQL*Loader.  Basically, Oracle expects
you to use this to create a table in the DB which references the
external file rather than importing using a COPY, which I agree we want
and should make file_fdw support, but it amounts to the same thing.

> > SQL Server:
> > http://msdn.microsoft.com/en-us/library/ms175915.aspx
> > (Note: they can actually run as the user connected instead of the SQL DB
> > server, if Windows authentication is used, which is basically doing
> > Kerberos proxying unless I'm mistaken; it's unclear how the security is
> > maintained if it's a SQL server logon user..).
>
> That could be useful. ;-)  But it's not actually the same as the feature
> proposed here.

Huh?  It's exactly the same- but done with Kerberos integration and file
shares.  This proposal is essentially a poor-man's version of this where
the administrator has to go set up the permissions themselves rather than
letting Kerberos and regular user permissions handle it.

The point is, they're both about giving users access to external files
for importing, exporting and querying, within certain boundaries to
avoid the user being able to trivially bypass the in-database or OS
security.

> > DB2:
> > http://www-01.ibm.com/support/knowledgecenter/SSEPGG_9.7.0/com.ibm.db2.luw.admin.dm.doc/doc/c0004589.html?cp=SSEPGG_9.7.0
>
> That's also more like the single capability system that MySQL has.

I agree that it's not clear from the docs for DB2 how or if you can
limit what can be done with this capability, but I don't see much point
in it if you can use it to bypass the in-database security.

> So while this is interesting food for thought, I don't think this really
> supports the claim that other systems have a facility very much like
> the proposed one.

While the documentation for these other products isn't as good as ours,
if you look a bit closer, I think you'll see that their features are
actually very similar to the proposed one (Oracle's even has nearly the
same syntax..).
Thanks,
    Stephen

Re: Directory/File Access Permissions for COPY and Generic File Access Functions

From
Stephen Frost
Date:
Robert,

* Robert Haas (robertmhaas@gmail.com) wrote:
> On Tue, Oct 28, 2014 at 3:19 PM, Stephen Frost <sfrost@snowman.net> wrote:
> > I agree that this makes it feel awkward.  Peter had an interesting
> > suggestion to make the dir aliases available as actual aliases for the
> > commands which they would be relevant to.  I hadn't considered that- I
> > proposed 'diralias' because I didn't like 'directory' since we weren't
> > actually creating *directories* but rather defining aliases to existing
> > OS directories in PG.
>
> Right.  Another way to go at this would be to just ditch the names.

Alright.

> This exact syntax probably wouldn't work (or might not be a good idea)
> because GRANT is so badly overloaded already, but conceptually:
>
> GRANT READ ON DIRECTORY '/home/snowman' TO sfrost;

Yeah, GRANT is overloaded pretty badly and has the unfortunate quality
that it's spec-driven.

> Or maybe some variant of:
>
> ALTER USER sfrost GRANT READ ON DIRECTORY '/home/snowman';

This could work though.  We could add an array to pg_authid which is a
complex type that combines the permission allowed with the directory
somehow.  Feels like it might get a bit clumsy though.

One other thing occurred to me while I was considering Peter's idea about
using the 'DIRALIAS' name- replicas and/or database migrations.
pg_basebackup always really annoyed me that you had to have your
tablespace directories set up *exactly* the same way when doing the
restore.  That stinks.  If we actually used the DIRALIAS name then
sysadmins could abstract out the location and could handle migrations
and/or changes to the filesystem structure without having to bother the
DBAs to update their code to the new location.  That's not something the
other RDBMS's have that I could see, but it strikes me as a nice
capability anyway and, well, we're certainly not limited to just
implementing what others have.

Thanks for continuing to help walk this forward towards a hopefully
useful feature and apologies for the confusion.
Thanks again!
    Stephen

On Wed, Oct 29, 2014 at 6:50 AM, Stephen Frost <sfrost@snowman.net> wrote:
> This could work though.  We could add an array to pg_authid which is a
> complex type that combines the permission allowed with the directory
> somehow.  Feels like it might get a bit clumsy though.

Sure, I'm just throwing things out to see what sticks.  It would be
helpful to have more input from others on what they like and dislike,
too; I'm not pretending my input is Gospel.

> One other thing occurred to me while I was considering Peter's idea about
> using the 'DIRALIAS' name- replicas and/or database migrations.
> pg_basebackup always really annoyed me that you had to have your
> tablespace directories set up *exactly* the same way when doing the
> restore.  That stinks.  If we actually used the DIRALIAS name then
> sysadmins could abstract out the location and could handle migrations
> and/or changes to the filesystem structure without having to bother the
> DBAs to update their code to the new location.  That's not something the
> other RDBMS's have that I could see, but it strikes me as a nice
> capability anyway and, well, we're certainly not limited to just
> implementing what others have.

Of course, any design that stores paths in the system catalogs is
going to have the problem that the standby will perforce have the same
configuration as the master.

I'm fuzzy on how you see DIRALIAS helping with tablespace migrations,
etc.  There's no obvious way to make a tablespace definition reference
an alias rather than a pathname; it's just a filesystem-level symlink.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Directory/File Access Permissions for COPY and Generic File Access Functions

From
Stephen Frost
Date:
* Robert Haas (robertmhaas@gmail.com) wrote:
> On Wed, Oct 29, 2014 at 6:50 AM, Stephen Frost <sfrost@snowman.net> wrote:
> > This could work though.  We could add an array to pg_authid which is a
> > complex type that combines the permission allowed with the directory
> > somehow.  Feels like it might get a bit clumsy though.
>
> Sure, I'm just throwing things out to see what sticks.  It would be
> helpful to have more input from others on what they like and dislike,
> too; I'm not pretending my input is Gospel.

Agreed- additional input from others would be great.  Adam's started the
wiki to hopefully capture these thoughts and be a way to solicit input
from others also.  I wonder if we could try something different to get
input from users who don't typically follow -hackers, like maybe get
someone to blog about the concept, point to the wiki, and ask for
feedback?  Just a random thought.

I've only glanced at it so far myself and plan a deeper review to add my
own thoughts, but if folks want to look at what Adam's put together
so far, it's here:

https://wiki.postgresql.org/wiki/Directory_Permissions

> > One other thing occurred to me while I was considering Peter's idea about
> > using the 'DIRALIAS' name- replicas and/or database migrations.
> > pg_basebackup always really annoyed me that you had to have your
> > tablespace directories set up *exactly* the same way when doing the
> > restore.  That stinks.  If we actually used the DIRALIAS name then
> > sysadmins could abstract out the location and could handle migrations
> > and/or changes to the filesystem structure without having to bother the
> > DBAs to update their code to the new location.  That's not something the
> > other RDBMS's have that I could see, but it strikes me as a nice
> > capability anyway and, well, we're certainly not limited to just
> > implementing what others have.
>
> Of course, any design that stores paths in the system catalogs is
> going to have the problem that the standby will perforce have the same
> configuration as the master.

Yeah, that's a good point, this wouldn't address replicas (until/unless
we allow catalogs to be different some day..  though I guess you'd do
that with logical replication instead) but rather the pg_basebackup /
migration-to-new-system case.

> I'm fuzzy on how you see DIRALIAS helping with tablespace migrations,
> etc.  There's no obvious way to make a tablespace definition reference
> an alias rather than a pathname; it's just a filesystem-level symlink.

Sorry, to clarify, I wasn't thinking of tablespaces (which pg_basebackup
now deals with better by allowing you to provide a mapping from the old
to the new) but rather files referenced by a file_fdw table, though it
could be used with COPY also (possibly inside of pl/pgsql, or in client
apps).
Thanks!
    Stephen

Re: Directory/File Access Permissions for COPY and Generic File Access Functions

From
Alvaro Herrera
Date:
Robert Haas wrote:

> To articulate my own concerns perhaps a bit better, there are two major
> things I don't like about the whole DIRALIAS proposal.  Number one,
> you're creating this SQL object whose name is not actually used for
> anything other than manipulating the alias you created.  The users are
> still operating on pathnames.  That's awfully strange.

I think it would make more sense if the file-accessing command specified
the DIRALIAS (or DIRECTORY, whatever we end up calling this) and a
pathname relative to the base one.  Something like

postgres=# CREATE DIRECTORY logdir ALIAS FOR '/pgsql/data/pg_log';
postgres=# GRANT READ ON DIRECTORY logdir TO logscanner;

logscanner=> COPY logtable FROM 'postgresql-2014-10-28.csv' IN DIRECTORY logdir;

The ALTER ROLE GRANT READ idea proposed downthread is nice also, but one
advantage of this is not having absolute path names in the COPY command.
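
A minimal sketch of how a filename given relative to a directory alias might be resolved, written in Python purely for illustration. The alias table and the function name resolve_in_directory are hypothetical and not part of the patch; the check is purely lexical and does not by itself defend against symlinks inside the directory.

```python
import os

# Hypothetical alias table; in the proposal this would live in a system catalog.
DIRECTORY_ALIASES = {"logdir": "/pgsql/data/pg_log"}


def resolve_in_directory(alias: str, filename: str) -> str:
    """Return the absolute path of filename inside the aliased directory."""
    base = DIRECTORY_ALIASES[alias]  # KeyError if the alias does not exist
    if os.path.isabs(filename):
        raise ValueError("filename must be relative to the directory alias")
    candidate = os.path.normpath(os.path.join(base, filename))
    # Reject names such as "../pg_hba.conf" that would leave the base directory.
    if os.path.commonpath([base, candidate]) != base:
        raise ValueError("filename escapes the directory alias")
    return candidate
```

With this shape, the COPY command carries only the alias and a relative name, so changing the alias's path would not require touching the command itself.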

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Stephen Frost <sfrost@snowman.net> writes:
> Agreed- additional input from others would be great.

I think this entire concept is a bad idea that will be a never-ending
source of security holes.  There are too many things that a user with
filesystem access can do to get superuser-equivalent status.

Here is one trivial example: you want to let user joe import COPY
data quickly, so you give him read access in directory foo, which he
has write access on from his own account.  Surely that's right in the
middle of use cases you had in mind, or even if it wasn't, it sounds
like a good idea no?  The problem is he can create symlinks, not just
files, in that directory, and by pointing the symlink to the right
place he can read any file the server can read.  pg_hba.conf, pg_authid,
or even just tables he shouldn't have access to.  With a little luck he
can crack the superuser's password, but even without that you've given
him access to sensitive information.

If you were dumb enough to give joe *write* access in such a directory,
so that he could COPY in both directions, it's game over altogether: he
can become superuser in any number of ways, most easily by hacking
pg_hba.conf.

You could ameliorate this problem by checking to see that the read/write
target is a file not a symlink, but that's still subject to filesystem
race conditions that could be exploited by anyone with the ability to
retry it enough times.

The larger point though is that this is just one of innumerable attack
routes for anyone with the ability to make the server do filesystem reads
or writes of his choosing.  If you think that's something you can safely
give to people you don't trust enough to make them superusers, you are
wrong, and I don't particularly want to spend the next ten years trying
to wrap band-aids around your misjudgment.

Therefore, I'm going to be against committing any feature of this sort.

If the objective is to give filesystem capabilities to someone you *do*
trust, but they'd prefer to use it from an account without full superuser
privileges, that can be solved much more simply by making access to the
existing superuser-only I/O functions more granular.  That fits in just
fine with the other project you've got of breaking down superuserness into
smaller privileges.  But if we build a feature designed in the way being
discussed in this thread, people will think it can be used to grant
limited filesystem access to users they don't completely trust, and we're
going to have to deal with the fallout from that.
        regards, tom lane



Re: Directory/File Access Permissions for COPY and Generic File Access Functions

From
Andres Freund
Date:
On 2014-10-29 10:47:58 -0400, Tom Lane wrote:
> Here is one trivial example: you want to let user joe import COPY
> data quickly, so you give him read access in directory foo, which he
> has write access on from his own account.  Surely that's right in the
> middle of use cases you had in mind, or even if it wasn't, it sounds
> like a good idea no?  The problem is he can create symlinks, not just
> files, in that directory, and by pointing the symlink to the right
> place he can read any file the server can read.  pg_hba.conf, pg_authid,
> or even just tables he shouldn't have access to.  With a little luck he
> can crack the superuser's password, but even without that you've given
> him access to sensitive information.
> 
> If you were dumb enough to give joe *write* access in such a directory,
> so that he could COPY in both directions, it's game over altogether: he
> can become superuser in any number of ways, most easily by hacking
> pg_hba.conf.
> 
> You could ameliorate this problem by checking to see that the read/write
> target is a file not a symlink, but that's still subject to filesystem
> race conditions that could be exploited by anyone with the ability to
> retry it enough times.

I think it'd be fair to restrict this feature to platforms that support
O_NOFOLLOW and O_EXCL. Those can be used to circumvent such race
conditions.
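
As a rough illustration of what those flags buy, here is a minimal Python sketch (Python only for brevity; the server itself is C, and the helper names are hypothetical). It assumes a POSIX platform where os.O_NOFOLLOW is defined.

```python
import os


def open_for_read_nofollow(path: str) -> int:
    """Open path read-only, failing if the final component is a symlink."""
    # O_NOFOLLOW only guards the last path component; symlinked parent
    # directories are still followed.
    return os.open(path, os.O_RDONLY | os.O_NOFOLLOW)


def create_for_write_exclusive(path: str) -> int:
    """Create path for writing, failing if anything already exists there."""
    # With O_CREAT | O_EXCL the call fails if the name exists, including
    # when the existing name is a symlink.
    return os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
```

Because the kernel does the refusal inside the open() call itself, there is no window between a separate "is it a symlink?" check and the open.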

> The larger point though is that this is just one of innumerable attack
> routes for anyone with the ability to make the server do filesystem reads
> or writes of his choosing.  If you think that's something you can safely
> give to people you don't trust enough to make them superusers, you are
> wrong, and I don't particularly want to spend the next ten years trying
> to wrap band-aids around your misjudgment.

... but that doesn't necessarily address this point.

Greetings,

Andres Freund

-- 
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



On Wed, Oct 29, 2014 at 10:52 AM, Andres Freund <andres@2ndquadrant.com> wrote:
>> The larger point though is that this is just one of innumerable attack
>> routes for anyone with the ability to make the server do filesystem reads
>> or writes of his choosing.  If you think that's something you can safely
>> give to people you don't trust enough to make them superusers, you are
>> wrong, and I don't particularly want to spend the next ten years trying
>> to wrap band-aids around your misjudgment.
>
> ... but that doesn't necessarily address this point.

I think the question is "just how innumerable are those attack
routes"?  So, we can prevent a symlink from being used via O_NOFOLLOW.
But what about hard links?

In general, the hazard is that an untrusted user can induce the user
to read or write a file that the user in question could not have read
or written himself.  It's not clear to me whether it's reasonably
possible to build a system that is robust against such attacks, or
not.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Directory/File Access Permissions for COPY and Generic File Access Functions

From
Stephen Frost
Date:
Tom,

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> > Agreed- additional input from others would be great.
>
> I think this entire concept is a bad idea that will be a never-ending
> source of security holes.  There are too many things that a user with
> filesystem access can do to get superuser-equivalent status.

I'd be pretty disappointed if we are unable to implement a feature like
this which can't be used to trivially bypass the in-database security
and/or gain superuser access.

> Here is one trivial example: you want to let user joe import COPY
> data quickly, so you give him read access in directory foo, which he
> has write access on from his own account.  Surely that's right in the
> middle of use cases you had in mind, or even if it wasn't, it sounds
> like a good idea no?  The problem is he can create symlinks, not just
> files, in that directory, and by pointing the symlink to the right
> place he can read any file the server can read.  pg_hba.conf, pg_authid,
> or even just tables he shouldn't have access to.  With a little luck he
> can crack the superuser's password, but even without that you've given
> him access to sensitive information.

While that's a use-case which would be great to support, we don't have the
ability to do it as well as SQL Server can since the backend can't
impersonate another user.  I'm definitely hoping to support Kerberos
credential proxying eventually (primarily for FDWs, but it's possible we
could use it for NFS also) and therefore we'd certainly have to caution
users about these risks.

The specific use-cases which I've been describing are cases where the
user doesn't have access to modify (or possibly even to read) the
filesystem- log directories, read-only NFS mounts, etc.

> If you were dumb enough to give joe *write* access in such a directory,
> so that he could COPY in both directions, it's game over altogether: he
> can become superuser in any number of ways, most easily by hacking
> pg_hba.conf.

You make a good point, certainly, but pg_hba.conf isn't necessarily
available to be modified by the PostgreSQL user.  Of course,
postgresql.auto.conf and the PG heap files could be overwritten if a
user is able to create symlinks and convince PG to access through them.

> You could ameliorate this problem by checking to see that the read/write
> target is a file not a symlink, but that's still subject to filesystem
> race conditions that could be exploited by anyone with the ability to
> retry it enough times.

As Andres already pointed out, there are ways to specifically address
these risks, I dare suggest because what we're talking about supporting
here is not new ground and others have found it to be a valuable
capability and worked to make it secure and safe to support.

> The larger point though is that this is just one of innumerable attack
> routes for anyone with the ability to make the server do filesystem reads
> or writes of his choosing.  If you think that's something you can safely
> give to people you don't trust enough to make them superusers, you are
> wrong, and I don't particularly want to spend the next ten years trying
> to wrap band-aids around your misjudgment.

I certainly don't have the experience you do in this area and am quite
interested in the other attack routes you're thinking of, and how other
databases which support this capability address them.  Perhaps they're
simply documented as known issues, or they aren't addressed at all and
bugs exist, but I'm not seeing these apparently obvious issues.

> If the objective is to give filesystem capabilities to someone you *do*
> trust, but they'd prefer to use it from an account without full superuser
> privileges, that can be solved much more simply by making access to the
> existing superuser-only I/O functions more granular.  That fits in just
> fine with the other project you've got of breaking down superuserness into
> smaller privileges.  But if we build a feature designed in the way being
> discussed in this thread, people will think it can be used to grant
> limited filesystem access to users they don't completely trust, and we're
> going to have to deal with the fallout from that.

The features which I've been proposing are, generally, intended to allow
a non-superuser to do things that are limited to the superuser today
while also preventing them from trivially being able to become a
superuser.  Any which can trivially be used to become a superuser would
at least need to be heavily caveat'd accordingly in the documentation,
which I'd be happy to do, but I'd like to have a better understanding of
the attack vectors under which there is such a risk- to provide
appropriate documentation, if nothing else.
Thanks!
    Stephen

Re: Directory/File Access Permissions for COPY and Generic File Access Functions

From
Stephen Frost
Date:
* Robert Haas (robertmhaas@gmail.com) wrote:
> On Wed, Oct 29, 2014 at 10:52 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> >> The larger point though is that this is just one of innumerable attack
> >> routes for anyone with the ability to make the server do filesystem reads
> >> or writes of his choosing.  If you think that's something you can safely
> >> give to people you don't trust enough to make them superusers, you are
> >> wrong, and I don't particularly want to spend the next ten years trying
> >> to wrap band-aids around your misjudgment.
> >
> > ... but that doesn't necessarily address this point.
>
> I think the question is "just how innumerable are those attack
> routes"?  So, we can prevent a symlink from being used via O_NOFOLLOW.
> But what about hard links?

You can't hard link to files you don't own.

sfrost@tamriel:/home/sfrost> ln /home/archive/xx.tar.gz
ln: failed to create hard link ‘./xx.tar.gz’ => ‘/home/archive/xx.tar.gz’: Operation not permitted

> In general, the hazard is that an untrusted user can induce the user
> to read or write a file that the user in question could not have read
> or written himself.  It's not clear to me whether it's reasonably
> possible to build a system that is robust against such attacks, or
> not.

There are certainly use-cases where the user executing the COPY doesn't
have any direct access to the filesystem at all but only through PG.
Taken to a bit of an extreme, you could say we already provide that
today. ;)
Thanks!
    Stephen

Re: Directory/File Access Permissions for COPY and Generic File Access Functions

From
Alvaro Herrera
Date:
Robert Haas wrote:
> On Wed, Oct 29, 2014 at 10:52 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> >> The larger point though is that this is just one of innumerable attack
> >> routes for anyone with the ability to make the server do filesystem reads
> >> or writes of his choosing.  If you think that's something you can safely
> >> give to people you don't trust enough to make them superusers, you are
> >> wrong, and I don't particularly want to spend the next ten years trying
> >> to wrap band-aids around your misjudgment.
> >
> > ... but that doesn't necessarily address this point.
> 
> I think the question is "just how innumerable are those attack
> routes"?  So, we can prevent a symlink from being used via O_NOFOLLOW.
> But what about hard links?

Users cannot create a hard link to a file they can't already access.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: Directory/File Access Permissions for COPY and Generic File Access Functions

From
Stephen Frost
Date:
* Alvaro Herrera (alvherre@2ndquadrant.com) wrote:
> Robert Haas wrote:
> > On Wed, Oct 29, 2014 at 10:52 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> > >> The larger point though is that this is just one of innumerable attack
> > >> routes for anyone with the ability to make the server do filesystem reads
> > >> or writes of his choosing.  If you think that's something you can safely
> > >> give to people you don't trust enough to make them superusers, you are
> > >> wrong, and I don't particularly want to spend the next ten years trying
> > >> to wrap band-aids around your misjudgment.
> > >
> > > ... but that doesn't necessarily address this point.
> >
> > I think the question is "just how innumerable are those attack
> > routes"?  So, we can prevent a symlink from being used via O_NOFOLLOW.
> > But what about hard links?
>
> Users cannot create a hard link to a file they can't already access.

The specifics actually depend on (on Linux, at least) the value of
/proc/sys/fs/protected_hardlinks, which has existed in upstream since 3.6
(not sure about the RHEL kernels, though I expect they've incorporated
it also at some point along the way).

There is a similar /proc/sys/fs/protected_symlinks control for dealing
with the same kind of time-of-check / time-of-use issues that exist with
symlinks.

At least on my Ubuntu 14.04 systems, these are both set to '1'.
Thanks,
    Stephen
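
For anyone wanting to check these settings on their own system, a minimal Python sketch follows. The helper name read_link_protection is hypothetical; the sysctl file names are the ones discussed above, and the function simply returns None where the file does not exist (other platforms, older kernels).

```python
from pathlib import Path
from typing import Optional


def read_link_protection(name: str, root: str = "/proc/sys/fs") -> Optional[int]:
    """Return the integer value of a link-protection sysctl, or None if unavailable."""
    try:
        return int((Path(root) / name).read_text().strip())
    except (OSError, ValueError):
        # Missing file, unreadable file, or unexpected contents.
        return None


if __name__ == "__main__":
    for sysctl in ("protected_hardlinks", "protected_symlinks"):
        print(sysctl, read_link_protection(sysctl))
```

A result of None means the kernel offers no such control, in which case any protection would have to come from the application itself.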

Stephen Frost <sfrost@snowman.net> writes:
> * Tom Lane (tgl@sss.pgh.pa.us) wrote:
>> The larger point though is that this is just one of innumerable attack
>> routes for anyone with the ability to make the server do filesystem reads
>> or writes of his choosing.  If you think that's something you can safely
>> give to people you don't trust enough to make them superusers, you are
>> wrong, and I don't particularly want to spend the next ten years trying
>> to wrap band-aids around your misjudgment.

> I certainly don't have the experience you do in this area and am quite
> interested in the other attack routes you're thinking of, and how other
> databases which support this capability address them.  Perhaps they're
> simply documented as known issues, or they aren't addressed at all and
> bugs exist, but I'm not seeing these apparently obvious issues.

Well, the point here is that I'm *not* an expert.  I'm aware that there
are lots of nonobvious ways in which Unix filesystem security can be
subverted if you can control the actions of a process running with
privileges you don't/shouldn't have.  I don't claim to have all the
details at my fingertips, and I doubt that anyone else in the PG community
does either.  Therefore, I think it's inevitable that if we build a
feature like this, it's going to have multiple security holes that
we will find out about the hard way.

As for other databases, since when did we think that Oracle, Microsoft, or
mysql are reliable sources of well-designed security-hole-free software?
The fact that they advertise features of this sort doesn't impress me in
the slightest.

I'm happy to have us rearrange things so that use of the existing
filesystem access functionality can be given out to users who aren't full
superusers.  What I don't believe is that it's a useful exercise to try
to give out restricted filesystem access: that will require too many
restrictions/compromises and still create too much of an attack surface.
I want to just define away the attack surface by making it clear that we
are *not* making any promises about what someone can do with filesystem
access functionality.  If you give joe access to that functionality and
he does something you don't like, it's your fault not ours.
        regards, tom lane



On Wed, Oct 29, 2014 at 11:34 AM, Stephen Frost <sfrost@snowman.net> wrote:
> The specifics actually depend on (on Linux, at least) the value of
> /proc/sys/fs/protected_hardlinks, which has existed in upstream since 3.6
> (not sure about the RHEL kernels, though I expect they've incorporated
> it also at some point along the way).
>
> There is a similar /proc/sys/fs/protected_symlinks control for dealing
> with the same kind of time-of-check / time-of-use issues that exist with
> symlinks.
>
> At least on my Ubuntu 14.04 systems, these are both set to '1'.

Playing devil's advocate here for a minute, you're saying that
new-enough versions of Linux have an optional feature that prevents
this attack.  I think an argument could be made that this is basically
unsecurable on any other platform, or even old Linux versions.  And it
still doesn't protect against the case where you hardlink to a file
and then the permissions on that file are later changed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Directory/File Access Permissions for COPY and Generic File Access Functions

From
Andres Freund
Date:
On 2014-10-29 11:52:43 -0400, Robert Haas wrote:
> On Wed, Oct 29, 2014 at 11:34 AM, Stephen Frost <sfrost@snowman.net> wrote:
> > The specifics actually depend on (on Linux, at least) the value of
> > /proc/sys/fs/protected_hardlinks, which has existed in upstream since 3.6
> > (not sure about the RHEL kernels, though I expect they've incorporated
> > it also at some point along the way).
> >
> > There is a similar /proc/sys/fs/protected_symlinks control for dealing
> > with the same kind of time-of-check / time-of-use issues that exist with
> > symlinks.
> >
> > At least on my Ubuntu 14.04 systems, these are both set to '1'.
> 
> Playing devil's advocate here for a minute, you're saying that
> new-enough versions of Linux have an optional feature that prevents
> this attack.  I think an argument could be made that this is basically
> unsecurable on any other platform, or even old Linux versions.

It's possible to do this securely by doing a fstat() and checking the
link count.
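
A minimal sketch of that check, in Python for brevity and with a hypothetical helper name. The point it tries to show is that the test is made on the already-open descriptor via fstat(), not on the path, so the file cannot be swapped between the check and the use.

```python
import os
import stat


def open_single_link_regular_file(path: str) -> int:
    """Open path read-only and refuse it unless it is a regular file with one link."""
    fd = os.open(path, os.O_RDONLY)
    try:
        st = os.fstat(fd)  # inspects the opened inode, not the path
        if not stat.S_ISREG(st.st_mode):
            raise PermissionError("not a regular file")
        if st.st_nlink != 1:
            raise PermissionError("file has more than one hard link")
    except BaseException:
        os.close(fd)
        raise
    return fd
```

This sketch deliberately omits O_NOFOLLOW, so symlinks would need to be handled separately.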

> And it
> still doesn't protect against the case where you hardlink to a file
> and then the permissions on that file are later changed.

Imo that's simply not a problem that we need to solve - it's much more
general and independent.

Greetings,

Andres Freund

-- 
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



On Wed, Oct 29, 2014 at 12:00 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> It's possible to do this securely by doing a fstat() and checking the
> link count.

Good point.

>> And it
>> still doesn't protect against the case where you hardlink to a file
>> and then the permissions on that file are later changed.
>
> Imo that's simply not a problem that we need to solve - it's much more
> general and independent.

I don't see how you can draw an arbitrary line there.  We either
guarantee that the logged-in user can't usurp the server's
permissions, or we don't.  Making it happen only sometimes in cases
we're prepared to dismiss is not real security.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Directory/File Access Permissions for COPY and Generic File Access Functions

From
Andres Freund
Date:
On 2014-10-29 12:03:54 -0400, Robert Haas wrote:
> >> And it
> >> still doesn't protect against the case where you hardlink to a file
> >> and then the permissions on that file are later changed.
> >
> > Imo that's simply not a problem that we need to solve - it's much more
> > general and independent.
> 
> I don't see how you can draw an arbitrary line there.  We either
> guarantee that the logged-in user can't usurp the server's
> permissions, or we don't.  Making it happen only sometimes in cases
> we're prepared to dismiss is not real security.

I can draw the line because lowering the permissions of some file isn't
postgres' problem. If you do that, you better make sure that there are no
existing hardlinks pointing to the precious file. And that has nothing
to do with postgres.

But anyway, just refusing to work on hardlinked files would also get rid
of that problem.

Greetings,

Andres Freund

-- 
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Stephen Frost <sfrost@snowman.net> writes:
> * Robert Haas (robertmhaas@gmail.com) wrote:
>> I think the question is "just how innumerable are those attack
>> routes"?  So, we can prevent a symlink from being used via O_NOFOLLOW.
>> But what about hard links?

> You can't hard link to files you don't own.

That restriction exists on only some platforms.  Current OS X for instance
seems perfectly willing to allow it (suggesting that most BSDen probably
do likewise), and I see no language supporting your claim in the POSIX
spec for link(2).

This points up the fact that platform-specific security holes are likely
to be a huge part of the problem.  I won't even speculate about our odds
of building something that's secure on Windows.
        regards, tom lane



Re: Directory/File Access Permissions for COPY and Generic File Access Functions

From
Stephen Frost
Date:
* Andres Freund (andres@2ndquadrant.com) wrote:
> On 2014-10-29 12:03:54 -0400, Robert Haas wrote:
> > I don't see how you can draw an arbitrary line there.  We either
> > guarantee that the logged-in user can't usurp the server's
> > permissions, or we don't.  Making it happen only sometimes in cases
> > we're prepared to dismiss is not real security.
>
> I can draw the line because lowering the permissions of some file isn't
> postgres' problem. If you do that, you better make sure that there are no
> existing hardlinks pointing to the precious file. And that has nothing
> to do with postgres.
>
> But anyway, just refusing to work on hardlinked files would also get rid
> of that problem.

Right, I was just about to point out the same- the fstat/link-count
approach addresses the issue also.

As for the 'new-enough' versions of Linux, my point there was simply
that these are issues which people who are concerned about security have
been looking at and working to address.  History shows a pretty thorny
past, certainly, but SMTP has a similar past.
Thanks,
    Stephen

Re: Directory/File Access Permissions for COPY and Generic File Access Functions

From
Andres Freund
Date:
On 2014-10-29 12:09:00 -0400, Tom Lane wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> > * Robert Haas (robertmhaas@gmail.com) wrote:
> >> I think the question is "just how innumerable are those attack
> >> routes"?  So, we can prevent a symlink from being used via O_NOFOLLOW.
> >> But what about hard links?
> 
> > You can't hard link to files you don't own.
> 
> That restriction exists on only some platforms.

Yea, it's nothing we can rely on. I do think checking the link count to
be 1 is safe though.

> Current OS X for instance
> seems perfectly willing to allow it (suggesting that most BSDen probably
> do likewise), and I see no language supporting your claim in the POSIX
> spec for link(2).

I'd argue that there's no point in treating OSX as a securable platform
:P

Greetings,

Andres Freund

-- 
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Stephen Frost <sfrost@snowman.net> writes:
> * Alvaro Herrera (alvherre@2ndquadrant.com) wrote:
>> Users cannot create a hard link to a file they can't already access.

> The specifics actually depend on (on Linux, at least) the value of
> /proc/sys/fs/protected_hardlinks, which has existed in upstream since 3.6
> (not sure about the RHEL kernels, though I expect they've incorporated
> it also at some point along the way).

No such file in RHEL 6.6 :-(.

What the POSIX spec for link(2) says is

[EACCES]
  A component of either path prefix denies search permission, or the
  requested link requires writing in a directory that denies write
  permission, or the calling process does not have permission to access
  the existing file and this is required by the implementation.

It's not very clear what "access" means, and in any case this wording
gives implementors permission to not enforce anything at all in that
line.  Whether particular flavors of Linux do or not doesn't help us
much, because other popular platforms clearly don't enforce it.
        regards, tom lane



Re: Directory/File Access Permissions for COPY and Generic File Access Functions

From
Stephen Frost
Date:
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> > * Alvaro Herrera (alvherre@2ndquadrant.com) wrote:
> >> Users cannot create a hard link to a file they can't already access.
>
> > The specifics actually depend on (on Linux, at least) the value of
> > /proc/sys/fs/protected_hardlinks, which has existed in upstream since 3.6
> > (not sure about the RHEL kernels, though I expect they've incorporated
> > it also at some point along the way).
>
> No such file in RHEL 6.6 :-(.

Ouch.  Although- have you tested what happens there?  I wonder if
they've decided it's not worth allowing ever or if they feel that it's
not worth preventing and that security-conscious software should check
the link count as Andres suggests.

> What the POSIX spec for link(2) says is
>
> [EACCES]
>   A component of either path prefix denies search permission, or the
>   requested link requires writing in a directory that denies write
>   permission, or the calling process does not have permission to access
>   the existing file and this is required by the implementation.

Yeah, I didn't mean to imply that this was provided by POSIX and you're
right to point out that we couldn't depend on this as it wouldn't be
cross-platform anyway.
Thanks,
    Stephen

Stephen Frost <sfrost@snowman.net> writes:
> * Tom Lane (tgl@sss.pgh.pa.us) wrote:
>> No such file in RHEL 6.6 :-(.

> Ouch.  Although- have you tested what happens there?

Pretty much exactly the same thing I just saw on OSX, ie, nothing.

[tgl@sss1 zzz]$ touch foo
[tgl@sss1 zzz]$ ls -l
total 0
-rw-rw-r--. 1 tgl tgl 0 Oct 29 12:23 foo
[tgl@sss1 zzz]$ ln foo bar
[tgl@sss1 zzz]$ ls -l
total 0
-rw-rw-r--. 2 tgl tgl 0 Oct 29 12:23 bar
-rw-rw-r--. 2 tgl tgl 0 Oct 29 12:23 foo
[tgl@sss1 zzz]$ chmod 000 foo
[tgl@sss1 zzz]$ sudo chown root foo
[tgl@sss1 zzz]$ ln foo baz
[tgl@sss1 zzz]$ ls -l
total 0
----------. 3 root tgl 0 Oct 29 12:23 bar
----------. 3 root tgl 0 Oct 29 12:23 baz
----------. 3 root tgl 0 Oct 29 12:23 foo
[tgl@sss1 zzz]$ uname -a
Linux sss1.sss.pgh.pa.us 2.6.32-504.el6.x86_64 #1 SMP Tue Sep 16 01:56:35 EDT 2014 x86_64 x86_64 x86_64 GNU/Linux

> I wonder if
> they've decided it's not worth allowing ever or if they feel that it's
> not worth preventing and that security-conscious software should check
> the link count as Andres suggests.

Probably it's just that it's a new feature that they've not chosen to
back-port into 2.6.x kernels.  I'm sure they're following the upstream
kernels in newer release series.  But even if they had chosen to back-port
it, you can be entirely darn sure it wouldn't be turned on by default in
the RHEL6 series; they'd be too worried about breaking existing
applications.
        regards, tom lane



Re: Directory/File Access Permissions for COPY and Generic File Access Functions

From
Stephen Frost
Date:
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> This points up the fact that platform-specific security holes are likely
> to be a huge part of the problem.  I won't even speculate about our odds
> of building something that's secure on Windows.

Andres' suggestion to only provide it on platforms which support
O_NOFOLLOW and O_EXCL certainly seems appropriate, along with fstat'ing
after we've opened it and checking that there's only one hard-link to
it.  As for Windows, it looks like you can get a file's attributes after
opening it by using GetFileInformationByHandle and you can then check if
it's a junction point or not (which would indicate if it's either a
symbolic link or a hard link, from what I can see).  Obviously, we'd
need to get input from someone more familiar with Windows than I am
before we can be confident of this approach though.
Thanks!
    Stephen
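
The "only provide it on platforms which support O_NOFOLLOW and O_EXCL" idea could be gated by something like the following minimal Python sketch. The function name is hypothetical; it relies on the fact that Python's os module only exposes an O_* constant where the underlying platform defines it.

```python
import os

# Flags the thread suggests the feature should depend on.
REQUIRED_OPEN_FLAGS = ("O_NOFOLLOW", "O_EXCL")


def platform_supports_restricted_file_access() -> bool:
    """Return True only if every required open() flag exists on this platform."""
    return all(hasattr(os, flag) for flag in REQUIRED_OPEN_FLAGS)
```

On a platform lacking either flag, the feature would simply be reported as unsupported and no weaker fallback would be attempted.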

Re: Directory/File Access Permissions for COPY and Generic File Access Functions

From
Adam Brightwell
Date:
Robert,
 
> To articulate my own concerns perhaps a bit better, there are two major
> things I don't like about the whole DIRALIAS proposal.  Number one,
> you're creating this SQL object whose name is not actually used for
> anything other than manipulating the alias you created.  The users are
> still operating on pathnames.  That's awfully strange.

That's an interesting point and I don't disagree that it seems a little strange.  However, isn't this approach similar if not the same (other than operating on path names) as with some other objects, specifically rules and policies?

-Adam



Re: Directory/File Access Permissions for COPY and Generic File Access Functions

From
Adam Brightwell
Date:
Alvaro,

> I think it would make more sense if the file-accessing command specified
> the DIRALIAS (or DIRECTORY, whatever we end up calling this) and a
> pathname relative to the base one.  Something like
>
> postgres=# CREATE DIRECTORY logdir ALIAS FOR '/pgsql/data/pg_log';

Following this, what do you think about simply expanding DIRALIAS out into DIRECTORY ALIAS?  So instead:

CREATE DIRECTORY ALIAS <name> AS '<path>'

or...

CREATE DIRECTORY ALIAS <name> FOR '<path>'

My thought on this is towards the natural word order of the command.  Also, I think having it as CREATE DIRECTORY ALIAS minimizes confusion, as I think Stephen mentioned, that we are creating an alias, not an actual directory.  Thoughts?

> postgres=# GRANT READ ON DIRECTORY logdir TO logscanner;

I personally like this form the most, however, I think the greatest hurdle with it is that it would require making READ (and WRITE) reserved keywords.  Obviously, I think that is a non-starter.
 
> logscanner=> COPY logtable FROM 'postgresql-2014-10-28.csv' IN DIRECTORY logdir;

That's an interesting thought.  Would 'IN DIRECTORY' be restricted to just the alias name?  I'm not sure it would make sense to allow a directory path there, as what would be the point?  At any rate, just food for thought.

> The ALTER ROLE GRANT READ idea proposed downthread is nice also,

Agreed and probably the most logical option at this point?

> but one
> advantage of this is not having absolute path names in the COPY command.

Pardon my ignorance, but can you help me understand the advantage of not having absolute path names in the COPY command?

-Adam 


--

Re: Directory/File Access Permissions for COPY and Generic File Access Functions

From
Stephen Frost
Date:
Adam,

* Adam Brightwell (adam.brightwell@crunchydatasolutions.com) wrote:
> Pardon my ignorance, but can you help me understand the advantage of not
> having absolute path names in the COPY command?

If you're writing ETL processes and/or PL/PgSQL code which embeds the
COPY command and you migrate from one server to another, or the sysadmin
decides he wants to mount /data and /otherdata, it's nice to be able to
just update the directory alias rather than having to modify all the ETL
processes and PL/PgSQL code.

Basically, it provides an abstraction layer which would allow you to
avoid having to either create your own variable for the directory and
then track that somewhere in your own table (to get around having to
hard-code things into your ETL or PL/PgSQL code) or deal with updating
all that code when the filesystem structure changes for whatever reason.
Thanks,
    Stephen
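
The indirection described above can be pictured with a minimal Python sketch.  It is only an illustration, not the patch's implementation: the `DIR_ALIASES` dictionary and the `resolve` helper are hypothetical stand-ins for the proposed pg_diralias catalog, and the paths are the examples used in this thread.  No validation of the file name is attempted here.

```python
from pathlib import PurePosixPath

# Hypothetical alias table: alias name -> absolute base directory.
# In the proposal this mapping would live in the pg_diralias catalog.
DIR_ALIASES = {"logdir": "/pgsql/data/pg_log"}


def resolve(alias: str, filename: str) -> str:
    """Return the absolute path of filename inside the aliased directory."""
    base = PurePosixPath(DIR_ALIASES[alias])
    return str(base / filename)


# ETL code refers only to the alias name ...
print(resolve("logdir", "postgresql-2014-10-28.csv"))

# ... so when the filesystem layout changes, only the alias is updated.
DIR_ALIASES["logdir"] = "/data/pg_log"
print(resolve("logdir", "postgresql-2014-10-28.csv"))
```

The calling code is identical before and after the move; only the one alias entry changes.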

On Wed, Oct 29, 2014 at 12:36 PM, Adam Brightwell
<adam.brightwell@crunchydatasolutions.com> wrote:
> Robert,
>
>> To articulate my own concerns perhaps a bit better, there are two major
>> things I don't like about the whole DIRALIAS proposal.  Number one,
>> you're creating this SQL object whose name is not actually used for
>> anything other than manipulating the alias you created.  The users are
>> still operating on pathnames.  That's awfully strange.
>
> That's an interesting point and I don't disagree that it seems a little
> strange.  However, isn't this approach similar if not the same (other than
> operating on path names) as with some other objects, specifically rules and
> policies?

Hmm.  Maybe.  Somehow it feels different to me.  A rule or policy is
something internal to the system, and you have to identify it somehow.
A directory, though, already has a name, so giving it an additional
dummy name seems strange.  But, you do have a point.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Directory/File Access Permissions for COPY and Generic File Access Functions

From
Jeremy Harris
Date:
On 29/10/14 16:11, Andres Freund wrote:
>  I do think checking the link count to
> be 1 is safe though.

You will fail against certain styles of online-backup.
-- 
Cheers, Jeremy




Re: Directory/File Access Permissions for COPY and Generic File Access Functions

From
Andres Freund
Date:
On 2014-10-29 16:38:44 +0000, Jeremy Harris wrote:
> On 29/10/14 16:11, Andres Freund wrote:
> >  I do think checking the link count to
> > be 1 is safe though.
> 
> You will fail against certain styles of online-backup.

Meh. I don't think that's really a problem for the usecases for COPY
FROM.

Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Directory/File Access Permissions for COPY and Generic File Access Functions

From
Stephen Frost
Date:
* Jeremy Harris (jgh@wizmail.org) wrote:
> On 29/10/14 16:11, Andres Freund wrote:
> >  I do think checking the link count to
> > be 1 is safe though.
>
> You will fail against certain styles of online-backup.

Fail-safe though, no?  For my part, I'm not particularly bothered by
that; we'd have to document it appropriately, of course.
Thanks!
    Stephen

Stephen Frost <sfrost@snowman.net> writes:
> * Tom Lane (tgl@sss.pgh.pa.us) wrote:
>> This points up the fact that platform-specific security holes are likely
>> to be a huge part of the problem.  I won't even speculate about our odds
>> of building something that's secure on Windows.

> Andres' suggestion to only provide it on platforms which support
> O_NOFOLLOW and O_EXCL certainly seems appropriate, along with fstat'ing
> after we've opened it and checking that there's only one hard-link to
> it.  As for Windows, it looks like you can get a file's attributes after
> opening it by using GetFileInformationByHandle and you can then check if
> it's a junction point or not (which would indicate if it's either a
> symbolic link or a hard link, from what I can see).  Obviously, we'd
> need to get input from someone more familiar with Windows than I am
> before we can be confident of this approach though.

So at this point we've decided that we must forbid access to symlinked or
hardlinked files, which is a significant usability penalty; we've also
chosen to blow off most older platforms entirely; and we've only spent
about five minutes actually looking for security issues, with no good
reason to assume there are no more.

(I can think of one more already, actually: the proposed post-open
fstat for link count has a race condition.  User just has to link target
file into writable directory, attempt to open it, and concurrently unlink
from the writable directory.  Repeat until success.)

So I remain of the opinion that this is a bad idea we should not pursue.
We're going to put a huge amount of work into it, it *will* cause more
than one security bug in the future (want to lay a side bet?), and we're
still going to end up with people needing to use the old-style access
facilities because the restrictions we'll have to put on this one are
unacceptable for their purposes.
        regards, tom lane
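
For readers who want to see the check under discussion, here is a rough Python sketch of "open with O_NOFOLLOW, then fstat and require a link count of one".  The function name `open_checked` is made up for illustration, and this is not code from the patch; it assumes a POSIX platform where `os.O_NOFOLLOW` is available.

```python
import os
import stat


def open_checked(path: str) -> int:
    """Open path read-only, refusing symlinks and multiply-linked files.

    Returns a file descriptor, or raises OSError / PermissionError.
    """
    # O_NOFOLLOW makes open() fail if the final path component is a symlink.
    fd = os.open(path, os.O_RDONLY | os.O_NOFOLLOW)
    try:
        st = os.fstat(fd)  # inspect the file that was actually opened
        if not stat.S_ISREG(st.st_mode):
            raise PermissionError(f"{path}: not a regular file")
        if st.st_nlink != 1:
            raise PermissionError(f"{path}: has {st.st_nlink} hard links")
    except Exception:
        os.close(fd)
        raise
    return fd
```

Note that the link count is sampled at a single instant after the open.  That is exactly the window the race described above exploits: if the extra link is removed between the open and the fstat, the check sees a count of one.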



Andres Freund <andres@2ndquadrant.com> writes:
> On 2014-10-29 16:38:44 +0000, Jeremy Harris wrote:
>> On 29/10/14 16:11, Andres Freund wrote:
>>> I do think checking the link count to
>>> be 1 is safe though.

>> You will fail against certain styles of online-backup.

> Meh. I don't think that's really a problem for the usecases for COPY
> FROM.

I think Jeremy's point is that if such a backup technology is in use, it
would result in random failures whenever the backup daemon happened to
have an extra hardlink at the moment you tried to access the file.

In other words, just another scenario where the proposed feature fails.
        regards, tom lane



Re: Directory/File Access Permissions for COPY and Generic File Access Functions

From
Kevin Grittner
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> So at this point we've decided that we must forbid access to symlinked or
> hardlinked files, which is a significant usability penalty; we've also
> chosen to blow off most older platforms entirely; and we've only spent
> about five minutes actually looking for security issues, with no good
> reason to assume there are no more.

What's interesting and disappointing here is that not one of these
suggested vulnerabilities seems like a possibility on a database
server managed in what I would consider a sane and secure manner[1].
This feature is valuable because it is an alternative to allowing a
user you don't trust *either* an OS login to the database server
*or* a superuser database login.  Can anyone suggest an exploit
which would be available if we allowed someone who has permission
to view all data in the database read permission to the pg_log
directory and the files contained therein, assuming they do *not*
have an OS login to the database server?

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



I wrote:
> ... and we've only spent
> about five minutes actually looking for security issues, with no good
> reason to assume there are no more.

Oh, here's another one: what I read in RHEL6's open(2) man page is
      O_NOFOLLOW
             If pathname is a symbolic link, then the open fails.  This is a
             FreeBSD extension, which was added to Linux in version 2.1.126.
             Symbolic links in earlier components of the pathname will still
             be followed.

So heaven help you if you grant user joe access in directory
/home/joe/copydata, or any other directory whose parent is writable by
him.  He can just remove the directory and replace it with a symlink to
whatever directory contains files he'd like the server to read/write for
him.

Again, we could no doubt install defenses against that sort of case,
once we realize it's a threat.  Maybe they'd even be bulletproof defenses
(not too sure how you'd prevent race conditions though).  But whether they
are or not, we just took the usability of the feature down another notch,
because certainly that sort of directory arrangement would have been
convenient for joe ... as long as he was trustworthy.

In any case, my larger point is that I foresee a very very long line
of gotchas of this sort, and I do not think that the proposed feature
is worth it.
        regards, tom lane
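
The man page behaviour quoted above is easy to reproduce.  The following Python sketch is a throwaway demonstration under a temporary directory, assuming Linux or a similar POSIX system; the names `secret`, `copydata` and `data.csv` are invented for the example.  It shows that O_NOFOLLOW only objects to a symlink in the final path component.

```python
import os
import tempfile

root = tempfile.mkdtemp()
secret_dir = os.path.join(root, "secret")
os.mkdir(secret_dir)
with open(os.path.join(secret_dir, "data.csv"), "w") as f:
    f.write("sensitive\n")

# "copydata" stands in for a granted directory that its owner has
# replaced with a symlink pointing at some other directory.
copydata = os.path.join(root, "copydata")
os.symlink(secret_dir, copydata)

# O_NOFOLLOW does not object: the symlink is not the final component.
fd = os.open(os.path.join(copydata, "data.csv"), os.O_RDONLY | os.O_NOFOLLOW)
leaked = os.read(fd, 100)
os.close(fd)
print(leaked)

# Only a symlink in the final position is refused.
link = os.path.join(root, "link.csv")
os.symlink(os.path.join(secret_dir, "data.csv"), link)
try:
    os.close(os.open(link, os.O_RDONLY | os.O_NOFOLLOW))
    refused = False
except OSError:
    refused = True
print(refused)
```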



Re: Directory/File Access Permissions for COPY and Generic File Access Functions

From
Stephen Frost
Date:
Kevin,

* Kevin Grittner (kgrittn@ymail.com) wrote:
> Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > So at this point we've decided that we must forbid access to symlinked or
> > hardlinked files, which is a significant usability penalty; we've also
> > chosen to blow off most older platforms entirely; and we've only spent
> > about five minutes actually looking for security issues, with no good
> > reason to assume there are no more.
>
> What's interesting and disappointing here is that not one of these
> suggested vulnerabilities seems like a possibility on a database
> server managed in what I would consider a sane and secure manner[1].

For my part- I agree completely with this sentiment, and I'm not sure
that Tom disagrees with it.  I believe the discussion is heading towards
a blanket "use this at your own risk- if the user can modify files in
these directories outside of PG, they can probably break your system"
being added in very bold lettering in the documentation around this.

It bothers me that we'd have to have a statement like that, but if we
have to then we have to.

> This feature is valuable because it is an alternative to allowing a
> user you don't trust *either* an OS login to the database server
> *or* a superuser database login.  Can anyone suggest an exploit
> which would be available if we allowed someone who has permission
> to view all data in the database read permission to the pg_log
> directory and the files contained therein, assuming they do *not*
> have an OS login to the database server?

These are the use-cases which I've been wanting this for.  Also, things
like ETL processes which run as a dedicated user and really have no
business nor need to be running as a superuser.
Thanks,
    Stephen

Re: Directory/File Access Permissions for COPY and Generic File Access Functions

From
Stephen Frost
Date:
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> So heaven help you if you grant user joe access in directory
> /home/joe/copydata, or any other directory whose parent is writable by
> him.  He can just remove the directory and replace it with a symlink to
> whatever directory contains files he'd like the server to read/write for
> him.

Yeah, there's no workaround for this that I'm aware of- we'd have to
prevent subdirectories being provided by the user, I believe.

Reviewing procmail, maildrop, and other processes which do this sort of
operation, it looks like the common thread today is to setuid() instead,
but that only works if you're running as root, which we obviously don't
want to be doing either.

Kevin had a good question, I thought- are there issues if the DB user
doesn't have any access to the OS filesystem?  Would it be sufficient to
say "Note that processes which can create objects in the directory
specified outside of PostgreSQL could create symlinks, hard links, or
other objects which might cause the PostgreSQL server to read files or
write files which it has access to that are outside of the directory
specified." ?

I still don't particularly like it and, frankly, the limitations we've
come up with thus far are not issues for my use-cases and I'd rather
have them and be able to say "yes, you can use this with some confidence
that it won't trivially bypass the DB security or provide a way to crash
the DB".

> Again, we could no doubt install defenses against that sort of case,
> once we realize it's a threat.  Maybe they'd even be bulletproof defenses
> (not too sure how you'd prevent race conditions though).  But whether they
> are or not, we just took the usability of the feature down another notch,
> because certainly that sort of directory arrangement would have been
> convenient for joe ... as long as he was trustworthy.

This "ad-hoc data load for Joe" use-case isn't where I had been going
with this feature, and I do trust the ETL processes that are behind the
use-case that I've proposed for the most part, but there's also no
reason for those files to be symlinks or have hard-links or have
subdirectories beyond those that I've specifically set up, and having
those protections seems, to me at least, like they'd be a good idea to
have, just in case.

If we punt on it entirely and refuse to check that the path provided by
the user hasn't got ".." in it, or that the file is an actual file and
not a symlink, etc, then we might as well make a "FILEACCESS" role
attribute and declare that you can trivially get superuser with it, if
you care to.  The problem with that, as I see it, is that it'd close off
a different set of use-cases as I'm really on the fence about if I'd
want to give an ETL process that kind of access, just out of sheer
paranoia.
Thanks!
    Stephen
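
As an illustration of the kind of path check mentioned above (no ".." and no user-supplied subdirectories), here is a small Python sketch.  The helper name `is_plain_filename` is hypothetical and the check assumes POSIX-style "/" separators; it says nothing about symlinks or hard links, which need the separate open-time checks discussed earlier in the thread.

```python
def is_plain_filename(name: str) -> bool:
    """True if name is a single path component with no directory parts."""
    # Reject empty names, embedded NULs, and anything containing a separator,
    # which also rules out absolute paths and "../" traversal.
    if not name or "\0" in name or "/" in name:
        return False
    # "." and ".." are directory references, not file names.
    return name not in (".", "..")


print(is_plain_filename("postgresql-2014-10-28.csv"))
print(is_plain_filename("../postgresql.conf"))
```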

Stephen Frost <sfrost@snowman.net> writes:
> * Kevin Grittner (kgrittn@ymail.com) wrote:
>> What's interesting and disappointing here is that not one of these
>> suggested vulnerabilities seems like a possibility on a database
>> server managed in what I would consider a sane and secure manner[1].

> For my part- I agree completely with this sentiment, and I'm not sure
> that Tom disagrees with it.

Well, the issues we've thought of so far require that the attacker have
his own shell-level access to the filesystem, but I would not like to
posit that there are none that don't require it.  Race conditions, for
example, could be exploited without a shell account as long as you can
fire up two backends doing your bidding.  *Maybe* it's safe if we don't
expose any "create symlink" or "create hardlink" or "rename" functions;
but you can bet people will be asking for such things down the line,
and we might forget and give it to them :-(

More to the point, if you're excluding cases like "let the user use
server-side COPY for speed" as allowed use-cases for this feature,
it seems like that's a pretty severe restriction.  How much is left
other than "let DBAs read the postmaster log remotely"?  And do we really
need to provide allegedly-not-superuser-equivalent filesystem access in
order to satisfy that use-case?  If you are letting untrustworthy people
read the postmaster log you've already got security problems, as per other
recent threads.

> I believe the discussion is heading towards
> a blanket "use this at your own risk- if the user can modify files in
> these directories outside of PG, they can probably break your system"
> being added in very bold lettering in the documentation around this.

> It bothers me that we'd have to have a statement like that, but if we
> have to then we have to.

If you're going to need a "use at your own risk" disclaimer, how is
that significantly different from letting people use the existing
superuser filesystem access functions?

>> This feature is valuable because it is an alternative to allowing a
>> user you don't trust *either* an OS login to the database server
>> *or* a superuser database login.  Can anyone suggest an exploit
>> which would be available if we allowed someone who has permission
>> to view all data in the database read permission to the pg_log
>> directory and the files contained therein, assuming they do *not*
>> have an OS login to the database server?

Capture the postmaster log.  Keep on capturing it till somebody
fat-fingers their login to the extent of swapping the username and
password (yeah, I've done that, haven't you?).  Scrape password from
the connection-failure log entry, figure out who it belongs to from
the next successful login, away you go.  Mean time to break-in might
or might not be less than time to brute-force the MD5 you could've
read from pg_authid.  But in any case, I don't find the assumption
that the user can already read everything in the database to
correspond to an untrusted user, so I'm not sure what this exercise
proves.

Or in short: you really shouldn't give server-filesystem access to
a user you have no trust in, and I'm unclear on what the use case
would be for that even if we could restrict it reliably.  The use
cases I can see for this are for DBAs to be able to do maintenance
things remotely without using a full no-training-wheels superuser
account.  ISTM that that type of use-case would be satisfied well
enough --- not ideally, perhaps, but well enough --- by being able
to grant full filesystem read and/or write to non-superuser accounts.

I compare this to the CREATEROLE privilege: that's pretty dangerous,
all in all, but we have not felt the need to invent facilities
whereby somebody could say "joe can create new roles, but only
on alternate Tuesdays and only if their names begin with 'u'".
        regards, tom lane



On 10/29/14, 2:33 PM, Tom Lane wrote:
> Capture the postmaster log.  Keep on capturing it till somebody
> fat-fingers their login to the extent of swapping the username and
> password (yeah, I've done that, haven't you?).

Which begs the question: why on earth do we log passwords at all? This is a problem for ALTER ROLE too.

Perhaps it would make sense if we had a dedicated security log this stuff went into, but if you're running something like pgBadger/pgFouine you're going to be copying logfiles off somewhere else and now you've got a security problem.

Let alone if you're using syslog...
-- 
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com



Re: Directory/File Access Permissions for COPY and Generic File Access Functions

From
Stephen Frost
Date:
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Well, the issues we've thought of so far require that the attacker have
> his own shell-level access to the filesystem, but I would not like to
> posit that there are none that don't require it.  Race conditions, for
> example, could be exploited without a shell account as long as you can
> fire up two backends doing your bidding.  *Maybe* it's safe if we don't
> expose any "create symlink" or "create hardlink" or "rename" functions;
> but you can bet people will be asking for such things down the line,
> and we might forget and give it to them :-(

I was planning to document these concerns in our documentation around
the feature.  If we manage to still provide unfettered link() access to
users despite having docs that say we had best not do that, well, that'd
be on us then, yes.

> More to the point, if you're excluding cases like "let the user use
> server-side COPY for speed" as allowed use-cases for this feature,
> it seems like that's a pretty severe restriction.  How much is left
> other than "let DBAs read the postmaster log remotely"?  And do we really
> need to provide allegedly-not-superuser-equivalent filesystem access in
> order to satisfy that use-case?  If you are letting untrustworthy people
> read the postmaster log you've already got security problems, as per other
> recent threads.

Auditors don't need write access to the server, period.  They do need to
be able to read the logs though.  Additionally, being able to grant this
capability to relatively trusted processes, such as ETL, rather than
ad-hoc users, is a valuable use case for this feature.

> > It bothers me that we'd have to have a statement like that, but if we
> > have to then we have to.
>
> If you're going to need a "use at your own risk" disclaimer, how is
> that significantly different from letting people use the existing
> superuser filesystem access functions?

I agree that it really isn't (except for the 'you are not actually
running as superuser' bit, as discussed below) and said as much on
another sub-thread a few moments ago.

> >> This feature is valuable because it is an alternative to allowing a
> >> user you don't trust *either* an OS login to the database server
> >> *or* a superuser database login.  Can anyone suggest an exploit
> >> which would be available if we allowed someone who has permission
> >> to view all data in the database read permission to the pg_log
> >> directory and the files contained therein, assuming they do *not*
> >> have an OS login to the database server?
>
> Capture the postmaster log.  Keep on capturing it till somebody
> fat-fingers their login to the extent of swapping the username and
> password (yeah, I've done that, haven't you?).

Back to the 'setting up systems sanely'- don't use password based
authentication.

> Or in short: you really shouldn't give server-filesystem access to
> a user you have no trust in, and I'm unclear on what the use case
> would be for that even if we could restrict it reliably.  The use
> cases I can see for this are for DBAs to be able to do maintenance
> things remotely without using a full no-training-wheels superuser
> account.  ISTM that that type of use-case would be satisfied well
> enough --- not ideally, perhaps, but well enough --- by being able
> to grant full filesystem read and/or write to non-superuser accounts.

I agree that those use-cases are useful but, even as an admin, I'd be
worried about fat-fingering a filename or similar and overwriting
something unintentionally.  Still, it'd be a bit better for trusted
admins than having to run around as full superuser, but I'm still not
sure I'd want to give it to my ETL process.

My hope was specifically to *not* give full and unfettered server
filesystem access through this mechanism, to trusted users or untrusted
ones, as you could trivially become superuser or corrupt files on the
system to cause PG to crash.

> I compare this to the CREATEROLE privilege: that's pretty dangerous,
> all in all, but we have not felt the need to invent facilities
> whereby somebody could say "joe can create new roles, but only
> on alternate Tuesdays and only if their names begin with 'u'".

Having a FILEACCESS role attribute would certainly be trivial to
implement and document.
Thanks,
    Stephen

Stephen Frost <sfrost@snowman.net> writes:
> This "ad-hoc data load for Joe" use-case isn't where I had been going
> with this feature, and I do trust the ETL processes that are behind the
> use-case that I've proposed for the most part, but there's also no
> reason for those files to be symlinks or have hard-links or have
> subdirectories beyond those that I've specifically set up, and having
> those protections seems, to me at least, like they'd be a good idea to
> have, just in case.

If your ETL process can be restricted that much, can't it use file_fdw or
some such to access a fixed filename set by somebody with more privilege?
Why exactly does it need freedom to specify a filename but not a directory
path?

As for the DBA-access set of use cases, ISTM that most real-world needs
for this sort of functionality are inherently a bit ad-hoc, and therefore
once you've locked it down tightly enough that it's credibly not
exploitable, it's not really going to be as useful as all that.  The
nature of an admin job is dealing with unforeseen cases.
        regards, tom lane



Re: Directory/File Access Permissions for COPY and Generic File Access Functions

From
Stephen Frost
Date:
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> > This "ad-hoc data load for Joe" use-case isn't where I had been going
> > with this feature, and I do trust the ETL processes that are behind the
> > use-case that I've proposed for the most part, but there's also no
> > reason for those files to be symlinks or have hard-links or have
> > subdirectories beyond those that I've specifically set up, and having
> > those protections seems, to me at least, like they'd be a good idea to
> > have, just in case.
>
> If your ETL process can be restricted that much, can't it use file_fdw or
> some such to access a fixed filename set by somebody with more privilege?

We currently have the ETL figure out what the filename is on a daily
basis and by contrasting where it "should" be against what has been
loaded thus far (which is tracked in tables in the DB) we can figure out
what needs to be loaded.  To do what you're suggesting we'd have to write
a pl/pgsql function to do the same which runs as a superuser- not ideal,
but it would be possible.

> Why exactly does it need freedom to specify a filename but not a directory
> path?

Because the file names change every day for daily processes, and there
can be cases (such as the system being backlogged or down for a day or
two) where it'd need to go back a few days in time.  This isn't
abnormal- I've run into exactly these cases a few times.  The Hadoop
system dumps the files out on the NFS server and the PG side sucks them
in.  The directories are part of the API which is defined between the
Hadoop team and the PG team, along with the file names, file formats,
etc.  These can go in either direction too, of course, Hadoop -> PG or
PG -> Hadoop, though each direction is always in a different directory
in my experience (as it's just sane to set things up that way), though I
suppose they wouldn't absolutely have to be.

> As for the DBA-access set of use cases, ISTM that most real-world needs
> for this sort of functionality are inherently a bit ad-hoc, and therefore
> once you've locked it down tightly enough that it's credibly not
> exploitable, it's not really going to be as useful as all that.  The
> nature of an admin job is dealing with unforeseen cases.

I agree that for the DBA-access set of use-cases (ad-hoc data loads,
etc), having a role attribute would be sufficient.  Note that this
doesn't cover the auditor role and log file access use-case that we've
been discussing though as auditors shouldn't have write access to the
system.
Thanks,
    Stephen
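
The daily bookkeeping described above (compare the file names that "should" exist against what has already been loaded) might look roughly like the Python sketch below.  The `export-YYYY-MM-DD.csv` naming convention, the function names, and the in-memory `loaded` set are all assumptions made for illustration; in the setup described, the loaded names would be tracked in a database table.

```python
from datetime import date, timedelta


def expected_name(day: date) -> str:
    # Hypothetical naming convention agreed between the two teams.
    return f"export-{day.isoformat()}.csv"


def files_to_load(loaded: set[str], today: date, lookback_days: int) -> list[str]:
    """Names expected over the lookback window that are not yet loaded."""
    # Walk from the oldest day in the window up to and including today.
    days = (today - timedelta(days=n) for n in range(lookback_days, -1, -1))
    return [name for name in map(expected_name, days) if name not in loaded]


loaded = {"export-2014-10-27.csv"}
print(files_to_load(loaded, date(2014, 10, 29), lookback_days=2))
```

Because the set of names is computed at run time, the process needs to choose the file name freely while the directory stays fixed, which is the distinction being asked about.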

Stephen Frost <sfrost@snowman.net> writes:
> * Tom Lane (tgl@sss.pgh.pa.us) wrote:
>> If your ETL process can be restricted that much, can't it use file_fdw or
>> some such to access a fixed filename set by somebody with more privilege?

> We currently have the ETL figure out what the filename is on a daily
> basis and by contrasting where it "should" be against what has been
> loaded thus far (which is tracked in tables in the DB) we can figure out
> what needs to be loaded.  To do what you're suggesting we'd have to write
> a pl/pgsql function to do the same which runs as a superuser- not ideal,
> but it would be possible.

Well, surely there's a finite set of possible filenames.  But if creating
a bunch of file_fdw servers doesn't float your boat, could we imagine a
variant of file_fdw that allows unprivileged specification of filename
within a directory set by a more-privileged user?  (Directory as a foreign
server property and filename as a table property, perhaps.)  Although the
superuser security definer function solution might work just as well.

>> As for the DBA-access set of use cases, ISTM that most real-world needs
>> for this sort of functionality are inherently a bit ad-hoc, and therefore
>> once you've locked it down tightly enough that it's credibly not
>> exploitable, it's not really going to be as useful as all that.  The
>> nature of an admin job is dealing with unforeseen cases.

> I agree that for the DBA-access set of use-cases (ad-hoc data loads,
> etc), having a role attribute would be sufficient.  Note that this
> doesn't cover the auditor role and log file access use-case that we've
> been discussing though as auditors shouldn't have write access to the
> system.

Log access seems like a sufficiently specialized, yet important, case that
maybe we should provide bespoke features for exactly that.  Aside from
having a clearer idea of the security implications of what we're doing,
specialized code could provide convenience features like automatically
reassembling a series of log files into a single stream.
        regards, tom lane
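
To make the "single stream" convenience concrete, here is a minimal Python sketch of reassembling a directory of log files into one sequence of lines.  It is an illustration only, not a proposed server feature: the `read_logs` name is invented, and it assumes the file names sort chronologically (as timestamped names of the form postgresql-2014-10-28... would) and end in ".log".

```python
import os
from typing import Iterator


def read_logs(log_dir: str) -> Iterator[str]:
    """Yield the lines of every *.log file in log_dir, in file-name order."""
    names = sorted(n for n in os.listdir(log_dir) if n.endswith(".log"))
    for name in names:
        path = os.path.join(log_dir, name)
        # errors="replace" keeps the stream going past undecodable bytes.
        with open(path, encoding="utf-8", errors="replace") as f:
            yield from f
```

A caller would iterate over `read_logs(...)` once and never needs to know where one file ends and the next begins.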



On Wed, Oct 29, 2014 at 3:31 PM, Stephen Frost <sfrost@snowman.net> wrote:
> I still don't particularly like it and, frankly, the limitations we've
> come up with thus far are not issues for my use-cases and I'd rather
> have them and be able to say "yes, you can use this with some confidence
> that it won't trivially bypass the DB security or provide a way to crash
> the DB".

I think it *will* trivially bypass the DB security.  If trivial means
"it can be done by anyone with no work at all", then, OK, it's not
trivial.  If it means "it can be done by a reasonably skilled engineer
without too much trouble", then it's trivial.  To call it a security
feature, I think the bar needs to be higher than that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Directory/File Access Permissions for COPY and Generic File Access Functions

From
Stephen Frost
Date:
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> > * Tom Lane (tgl@sss.pgh.pa.us) wrote:
> >> If your ETL process can be restricted that much, can't it use file_fdw or
> >> some such to access a fixed filename set by somebody with more privilege?
>
> > We currently have the ETL figure out what the filename is on a daily
> > basis and by contrasting where it "should" be against what has been
> > loaded thus far (which is tracked in tables in the DB) we can figure out
> > what needs to be loaded.  To do what you're suggesting we'd have to write
> > a pl/pgsql function to do the same which runs as a superuser- not ideal,
> > but it would be possible.
>
> Well, surely there's a finite set of possible filenames.  But if creating
> a bunch of file_fdw servers doesn't float your boat, could we imagine a
> variant of file_fdw that allows unprivileged specification of filename
> within a directory set by a more-privileged user?  (Directory as a foreign
> server property and filename as a table property, perhaps.)  Although the
> superuser security definer function solution might work just as well.

Ugh, no, I wouldn't want hundreds of file_fdw tables created (and when
would you stop..?).   I'm trying to figure out how what you're
suggesting with file_fdw is different from what I was trying to propose
with directory aliases?  Wouldn't that have the same issues of hard
links, etc, if the user also has access to the filesystem and that
directory?  And if we trust the admin to use protected directories when
setting up file_fdw, why couldn't they do the same with directory
aliases?  Perhaps I've misunderstood this suggestion?

> Log access seems like a sufficiently specialized, yet important, case that
> maybe we should provide bespoke features for exactly that.  Aside from
> having a clearer idea of the security implications of what we're doing,
> specialized code could provide convenience features like automatically
> reassembling a series of log files into a single stream.

I agree with this, absolutely.  Log access as a use-case for the directory
aliases concept was more of an "it happens to support this nicely too"
than a final solution to this use-case, which I agree we definitely
could and should do better with, though I don't have any specific
solutions for it.

Clearly, I'd like to provide a solution to this use-case also though, so
if the whole 'directory alias' idea is defunct then I'd love to hear
suggestions on how to provide ad-hoc log file access for auditors via
file_fdws and/or COPY, if anyone has any ideas..
Thanks,
    Stephen

Re: Directory/File Access Permissions for COPY and Generic File Access Functions

From
Stephen Frost
Date:
* Robert Haas (robertmhaas@gmail.com) wrote:
> On Wed, Oct 29, 2014 at 3:31 PM, Stephen Frost <sfrost@snowman.net> wrote:
> > I still don't particularly like it and, frankly, the limitations we've
> > come up with thus far are not issues for my use-cases and I'd rather
> > have them and be able to say "yes, you can use this with some confidence
> > that it won't trivially bypass the DB security or provide a way to crash
> > the DB".
>
> I think it *will* trivially bypass the DB security.  If trivial means
> "it can be done by anyone with no work at all", then, OK, it's not
> trivial.  If it means "it can be done by a reasonably skilled engineer
> without too much trouble", then it's trivial.  To call it a security
> feature, I think the bar needs to be higher than that.

ENOPARSE

I agree- to be a security feature, we need to have a bar higher than
"can be bypassed by a reasonably skilled engineer without too much
trouble" and certainly higher than "it can be done by anyone with no
work at all".  I admit that I didn't realize the situation was quite
so dire today when it comes to these operations and that most utilities
which have to do this for their operations (procmail, maildrop, cron,
etc) have punted completely and gone to using setuid() instead.

Although I will note that cron, at least, does use O_NOFOLLOW and then
does do the hard-link check with fstat() after the crontab file is
opened.  If we're able to identify an issue with this approach, we
should probably let them know.
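
For anyone who wants to poke at that approach, here is a minimal sketch of
the open-then-check pattern as I understand it, written in Python rather
than C for brevity.  The function name open_checked is made up for
illustration, and this is not cron's actual code.  Note that O_NOFOLLOW
only guards the final path component, so symlinks higher up in the path
are not covered by it.

```python
import os
import stat


def open_checked(path):
    """Open path read-only, refusing symlinks and multiply-linked files.

    Hypothetical helper for illustration; not taken from cron or PostgreSQL.
    """
    # O_NOFOLLOW makes open() fail if the final path component is a symlink.
    fd = os.open(path, os.O_RDONLY | os.O_NOFOLLOW)
    try:
        # fstat() inspects the file that was actually opened, so the answer
        # cannot be changed by someone swapping the path afterwards.
        st = os.fstat(fd)
        if not stat.S_ISREG(st.st_mode):
            raise PermissionError(f"{path}: not a regular file")
        if st.st_nlink != 1:
            raise PermissionError(f"{path}: has {st.st_nlink} hard links")
    except Exception:
        os.close(fd)
        raise
    return fd
```

The caller owns the returned descriptor and has to close it.  Checking
after open() rather than before is the point: a stat() on the path before
opening leaves a window in which the file can be replaced.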

Another interesting idea might be to have "owner" specified along with
the directory alias and then test that the file to be opened is owned by
that owner, as cron checks..
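
A rough sketch of what such an owner test could look like, again in
Python for brevity.  The helper name open_owned_by and the idea of
passing the expected uid as a plain integer are assumptions made for
illustration; nothing like this exists in the patch.

```python
import os


def open_owned_by(path, expected_uid):
    """Open path read-only, but only if the opened file belongs to expected_uid.

    Hypothetical helper; expected_uid stands in for an "owner" attribute
    that a directory alias might carry.
    """
    fd = os.open(path, os.O_RDONLY | os.O_NOFOLLOW)
    try:
        # Compare against the owner of the opened file, not of the path.
        owner = os.fstat(fd).st_uid
        if owner != expected_uid:
            raise PermissionError(
                f"{path}: owned by uid {owner}, expected {expected_uid}"
            )
    except Exception:
        os.close(fd)
        raise
    return fd
```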

Perhaps I should go work on Kerberos credential proxying and then we
could at least support this kind of capability on Windows when using
Windows Authentication (eg: Kerberos) and Windows file shares, as SQL
Server does.  I'll have to investigate how that works (if it does at
all) with Kerberos-based NFSv4; I haven't run into a system which uses
credential proxying to access NFS yet though I don't see any particular
reason offhand why it wouldn't work.
Thanks,
    Stephen

Re: Directory/File Access Permissions for COPY and Generic File Access Functions

From
Kevin Grittner
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> If you are letting untrustworthy people read the postmaster log
> you've already got security problems, as per other recent
> threads.

That seems like a rather naive perspective.  In a large operation
there are different roles.  To draw some analogies, you might trust
your attorney with sensitive information, yet not trust her to
perform open heart surgery.  You might trust your accountant with
sensitive financial data, yet not want to get into an airplane
piloted by him.  If your electrician tried to fix your brakes you
might wind up careening down a hill out of control, while an
attempt by your mechanic to install a new circuit might see your
house in flames.  A one-dimensional concept of "trustworthy" is
worse than useless; it's downright dangerous.  Any security system
based on that is going to be weak.

>>> This feature is valuable because it is an alternative to allowing a
>>> user you don't trust *either* an OS login to the database server
>>> *or* a superuser database login.  Can anyone suggest an exploit
>>> which would be available if we allowed someone who has permission
>>> to view all data in the database read permission to the pg_log
>>> directory and the files contained therein, assuming they do *not*
>>> have an OS login to the database server?
>
> Capture the postmaster log.  Keep on capturing it till somebody
> fat-fingers their login to the extent of swapping the username and
> password (yeah, I've done that, haven't you?).  Scrape password from
> the connection-failure log entry, figure out who it belongs to from
> the next successful login, away you go.  Mean time to break-in might
> or might not be less than time to brute-force the MD5 you could've
> read from pg_authid.

At Wisconsin Courts we had people authorized to see all data in the
database and who had to support applications using the database.
They very frequently needed to look at the logs to diagnose
problems, and time was often of the essence.  We wound up creating
crontab jobs to copy the log files off the database servers to
directories on the file servers where they could be examined, so
that operations could be kept running properly.  The exposure to the
problem you describe was, I would argue, *greater* with this
approach than if they could access the logs at need through a
database connection.

The fact that we write such things to the log is a serious problem
that we should fix, rather than pretending that anyone who has
access to the logs should be trustworthy enough not to impersonate
another user.  That position is sure to leave us vulnerable to
security breaches, and keep PostgreSQL out of many high-security
environments.

> But in any case, I don't find the assumption that the user can
> already read everything in the database to correspond to an
> untrusted user, so I'm not sure what this exercise proves.

A one-dimensional trust model is not in any way appropriate for a
large-scale environment which cares about security.  Heck, even a
"Principles of Accounting 101" class is sure to involve a significant
discussion of the importance of the separation of duties.  IMV it
becomes *more* significant in a computerized system, not less.

> Or in short: you really shouldn't give server-filesystem access to
> a user you have no trust in,

Again, a one-dimensional measure of "trustworthy" is naive and
without merit.

> and I'm unclear on what the use case would be for that even if we
> could restrict it reliably.

Auditors or application support staff are a couple.

> The use cases I can see for this are for DBAs to be able to do
> maintenance things remotely without using a full no-training-wheels
> superuser account.

That is not the sort of use case that I feel is the primary target
of this.

> ISTM that that type of use-case would be satisfied well enough
> --- not ideally, perhaps, but well enough --- by being able to
> grant full filesystem read and/or write to non-superuser accounts.

IMV, if we can't have a read-only version there's no real point.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



On Wed, Oct 29, 2014 at 3:52 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> And it still doesn't protect against the case where you hardlink to a file
> and then the permissions on that file are later changed.

Fwiw that's not how hard links work, at least with UFS-style permission
semantics as on ext2 etc. Hard links are links to the same inode
and permissions are associated with the file. There are other
filesystems out there though. AFS for example associates permissions
with directories.


-- 
greg



On Wed, Oct 29, 2014 at 5:22 PM, Greg Stark <stark@mit.edu> wrote:
> On Wed, Oct 29, 2014 at 3:52 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> And it still doesn't protect against the case where you hardlink to a file
>> and then the permissions on that file are later changed.
>
> Fwiw that's not how hard links work, at least UFS semantics
> permissions such as ext2 etc. Hard links are links to the same inode
> and permissions are associated with the file. There are other
> filesystems out there though. AFS for example associates permissions
> with directories.

That's exactly the point.  The postgres user owns file F and user
A has permissions on it.  The DBA realizes this is bad and revokes
user A's permissions, but user A has already noticed and made a
hardlink to the file.  When the DBA subsequently gives user A
permissions to have the server write to files in /home/a, A can induce
the server to write to her hardlink even though she can no longer access
the file herself.
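
A small sketch of the mechanics, in Python.  It runs everything as a
single OS user inside a temporary directory, so it only shows that a
write through the link lands in the original file; it does not model the
permission change itself.  The directory and file names are made up.

```python
import os
import tempfile


def demo_hardlink_write():
    """Show that writing through a hard link modifies the original file."""
    with tempfile.TemporaryDirectory() as root:
        protected = os.path.join(root, "protected")  # stands in for F's directory
        home_a = os.path.join(root, "home_a")        # stands in for /home/a
        os.mkdir(protected)
        os.mkdir(home_a)

        target = os.path.join(protected, "F")
        with open(target, "w") as f:
            f.write("original\n")

        # User A makes the link while she still has access to F.
        link = os.path.join(home_a, "innocent.csv")
        os.link(target, link)
        same_inode = os.stat(target).st_ino == os.stat(link).st_ino

        # Later the server writes "into /home/a" on A's behalf.
        with open(link, "w") as f:
            f.write("overwritten\n")

        with open(target) as f:
            return same_inode, f.read()
```

Both names resolve to the same inode, so checking that the path lies
under /home/a says nothing about which file is being written.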

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Directory/File Access Permissions for COPY and Generic File Access Functions

From
Peter Eisentraut
Date:
On 10/29/14 3:41 PM, Jim Nasby wrote:
> On 10/29/14, 2:33 PM, Tom Lane wrote:
>> Capture the postmaster log.  Keep on capturing it till somebody
>> fat-fingers their login to the extent of swapping the username and
>> password (yeah, I've done that, haven't you?).
> 
> Which begs the question: why on earth do we log passwords at all?

We don't.

> This is a problem for ALTER ROLE too.

Only if you use the non-encrypted forms.




Re: Directory/File Access Permissions for COPY and Generic File Access Functions

From
Heikki Linnakangas
Date:
I'm marking this as "Rejected" in the commitfest. It's quite clear that 
this isn't going to fly in its current form.

For the COPY FROM use case, I'd suggest just doing COPY FROM STDIN. Yes, 
it's slower, but not much. And you probably could optimize it further - 
there's some gratuitous memcpy()ing happening from buffer to buffer in 
that codepath.
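
As a rough sketch of what that looks like from a client, assuming a
Python client whose cursor exposes psycopg2's copy_expert(sql, file)
method.  The helper name copy_from_client_file is made up for
illustration.

```python
def copy_from_client_file(cursor, copy_sql, path):
    """Stream a client-side file to the server through COPY ... FROM STDIN.

    `cursor` is assumed to provide psycopg2's copy_expert(sql, file).
    The file is read by the client process, so the server never opens a
    path on its own filesystem and no superuser rights are involved.
    """
    if "FROM STDIN" not in copy_sql.upper():
        raise ValueError("expected a COPY ... FROM STDIN statement")
    with open(path, "r", encoding="utf-8") as src:
        cursor.copy_expert(copy_sql, src)
```

With a hypothetical table and path this would be called as
copy_from_client_file(cur, "COPY staging FROM STDIN WITH (FORMAT csv)",
"/data/today.csv").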

- Heikki