Re: Directory/File Access Permissions for COPY and Generic File Access Functions - Mailing list pgsql-hackers

From: Stephen Frost
Subject: Re: Directory/File Access Permissions for COPY and Generic File Access Functions
Msg-id: 20141029195849.GZ28859@tamriel.snowman.net
In response to: Re: Directory/File Access Permissions for COPY and Generic File Access Functions (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> > This "ad-hoc data load for Joe" use-case isn't where I had been going
> > with this feature, and I do trust the ETL processes that are behind the
> > use-case that I've proposed for the most part, but there's also no
> > reason for those files to be symlinks or have hard-links or have
> > subdirectories beyond those that I've specifically set up, and having
> > those protections seems, to me at least, like they'd be a good idea to
> > have, just in case.
>
> If your ETL process can be restricted that much, can't it use file_fdw or
> some such to access a fixed filename set by somebody with more privilege?

We currently have the ETL process figure out what the filename is on a
daily basis: by contrasting what "should" be there against what has
been loaded thus far (which is tracked in tables in the DB), we can
figure out what needs to be loaded.  To do what you're suggesting, we'd
have to write a pl/pgsql function which does the same thing and runs as
a superuser; not ideal, but it would be possible.
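
To be concrete, that superuser function would end up looking something
like this (untested sketch; the table and path names are made up for
illustration):

    -- Runs with the owner's (superuser) privileges via SECURITY DEFINER;
    -- loads the next expected daily extract and records it as loaded.
    CREATE FUNCTION etl.load_next_extract() RETURNS void AS $$
    DECLARE
        next_day date;
        path     text;
    BEGIN
        -- The next day to load is the day after the latest one recorded.
        SELECT coalesce(max(file_date), current_date - 1) + 1
          INTO next_day
          FROM etl.loaded_files;

        path := '/srv/etl/hadoop_to_pg/extract_'
                || to_char(next_day, 'YYYYMMDD') || '.csv';

        EXECUTE format('COPY etl.staging FROM %L WITH (FORMAT csv)', path);

        INSERT INTO etl.loaded_files (file_date) VALUES (next_day);
    END;
    $$ LANGUAGE plpgsql SECURITY DEFINER;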

> Why exactly does it need freedom to specify a filename but not a directory
> path?

Because the file names change every day for daily processes, and there
can be cases (such as the system being backlogged or down for a day or
two) where it'd need to go back a few days in time.  This isn't
abnormal; I've run into exactly these cases a few times.  The Hadoop
system dumps the files out on the NFS server and the PG side sucks them
in.  The directories are part of the API defined between the Hadoop
team and the PG team, along with the file names, file formats, etc.
These can go in either direction too, of course, Hadoop -> PG or
PG -> Hadoop, though in my experience each direction is always in a
different directory (as it's just sane to set things up that way), even
though I suppose it wouldn't absolutely have to be.
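
The tracking table also makes the catch-up case straightforward: when
the system has been down or backlogged for a day or two, finding the
expected files which haven't been loaded yet is a simple query
(untested, same made-up names as above):

    -- Expected daily files from the past week which haven't been loaded.
    SELECT d::date AS missing_day
      FROM generate_series(current_date - 7, current_date, '1 day') AS d
     WHERE NOT EXISTS (SELECT 1 FROM etl.loaded_files f
                        WHERE f.file_date = d::date);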

> As for the DBA-access set of use cases, ISTM that most real-world needs
> for this sort of functionality are inherently a bit ad-hoc, and therefore
> once you've locked it down tightly enough that it's credibly not
> exploitable, it's not really going to be as useful as all that.  The
> nature of an admin job is dealing with unforeseen cases.

I agree that for the DBA-access set of use-cases (ad-hoc data loads,
etc), having a role attribute would be sufficient.  Note that this
doesn't cover the auditor-role / log file access use-case that we've
been discussing, though, as auditors shouldn't have write access to the
system.
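
Just as a strawman for that read/write split (purely illustrative
syntax, not necessarily what I'm proposing):

    -- Strawman syntax: the auditor can read the log directory but,
    -- having no WRITE grant, can't use COPY ... TO or anything else
    -- that writes to the filesystem.
    CREATE DIRECTORY log_dir AS '/var/log/postgresql';
    GRANT READ ON DIRECTORY log_dir TO auditor;
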
Thanks,
    Stephen
