Thread: Proposal: functions get_text() or get_url()

Proposal: functions get_text() or get_url()

From
Stefan Keller
Date:
How to insert a text file into a field in PostgreSQL?
I'd like to insert a row with fields from a local or remote text file.
I'd expect functions like get_text() or get_url() in order to do the following:
INSERT INTO collection(id, path, content) VALUES(1, '/tmp/mytext, get_text('/tmp/mytext));
AFAIK there was a get_url in libcurl but I neither find it any more. But anyway: This should be part of the core... :->
-S.
 

Re: Proposal: functions get_text() or get_url()

From
Robert Haas
Date:
On Mon, May 18, 2009 at 4:03 PM, Stefan Keller <sfkeller@gmail.com> wrote:
> How to insert a text file into a field in PostgreSQL?
> I'd like to insert a row with fields from a local or remote text file.
> I'd expect functions like get_text() or get_url() in order to do the
> following:
> INSERT INTO collection(id, path, content) VALUES(1, '/tmp/mytext,
> get_text('/tmp/mytext));
> AFAIK there was a get_url in libcurl but I neither find it any more. But
> anyway: This should be part of the core... :->

Putting this into core would have security implications.  The file or
URL would be downloaded by the PostgreSQL server process, not the
client process - therefore I think it would have to be super-user
only, which would make it much less useful.  You'd also need a pretty
rich API to mimic all the crazy thinks a URL-fetching library like
libcurl knows how to do.

Except for very large objects where double-copying must be avoided, it
seems like having the client retrieve and pass the data is a simpler
solution.  That having been said, nobody can stop you from writing
(and submitting) a patch.

...Robert


Re: Proposal: functions get_text() or get_url()

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Mon, May 18, 2009 at 4:03 PM, Stefan Keller <sfkeller@gmail.com> wrote:
>> I'd expect functions like get_text() or get_url() in order to do the
>> following:
>> INSERT INTO collection(id, path, content) VALUES(1, '/tmp/mytext,
>> get_text('/tmp/mytext));

Apparently you've not found pg_read_file() ?

>> AFAIK there was a get_url in libcurl but I neither find it any more. But
>> anyway: This should be part of the core... :->

> Putting this into core would have security implications.  The file or
> URL would be downloaded by the PostgreSQL server process, not the
> client process - therefore I think it would have to be super-user
> only, which would make it much less useful.

Yes.  I very strongly doubt that we'd accept a url-fetching function at
all.  Aside from the security issues, it would necessarily pull in a
boatload of dependencies that we'd prefer not to have.

Of course, you can write such a thing trivially in plperlu or several
other untrusted PLs, and include any security restrictions you see fit
while you're at it.  I'm not seeing how a built-in function that would
have to impose one-size-fits-all security requirements would be an
improvement.
        regards, tom lane


Re: Proposal: functions get_text() or get_url()

From
Stefan Keller
Date:
Tom,
 
> Apparently you've not found pg_read_file() ?
Thanks a lot. Did'nt find this. This helped!
 
Still, get_url() would be handy too... :->
 
Questions: Don't see, why this would be a security issue: How could such a function do any harm? large files?
 
Finally: Got some tricky followup questions regarding index usage in tsearch2 and regex. Should I place these here (or else where?)?
 
Regards, S.
 
 

 
2009/5/19 Tom Lane <tgl@sss.pgh.pa.us>
Robert Haas <robertmhaas@gmail.com> writes:
> On Mon, May 18, 2009 at 4:03 PM, Stefan Keller <sfkeller@gmail.com> wrote:
>> I'd expect functions like get_text() or get_url() in order to do the
>> following:
>> INSERT INTO collection(id, path, content) VALUES(1, '/tmp/mytext,
>> get_text('/tmp/mytext));

Apparently you've not found pg_read_file() ?

>> AFAIK there was a get_url in libcurl but I neither find it any more. But
>> anyway: This should be part of the core... :->

> Putting this into core would have security implications.  The file or
> URL would be downloaded by the PostgreSQL server process, not the
> client process - therefore I think it would have to be super-user
> only, which would make it much less useful.

Yes.  I very strongly doubt that we'd accept a url-fetching function at
all.  Aside from the security issues, it would necessarily pull in a
boatload of dependencies that we'd prefer not to have.

Of course, you can write such a thing trivially in plperlu or several
other untrusted PLs, and include any security restrictions you see fit
while you're at it.  I'm not seeing how a built-in function that would
have to impose one-size-fits-all security requirements would be an
improvement.

                       regards, tom lane

Re: Proposal: functions get_text() or get_url()

From
Robert Haas
Date:
On Wed, May 20, 2009 at 6:34 AM, Stefan Keller <sfkeller@gmail.com> wrote:
> Questions: Don't see, why this would be a security issue: How could such a
> function do any harm? large files?

No, large files aren't the problem.  The problem is that the
PostgreSQL server process may have rights to access things that the
user doesn't.  For a simple case, imagine that PostgreSQL is behind a
firewall and the user is in front of the firewall, but there's a port
open to permit access to PostgreSQL.  Now imagine that there is a web
server behind the firewall.  The firewall blocks the user from
accessing the web server directly, but the user can ask PostgreSQL to
download the URLs for him.  In that way, the user can bypass the
firewall.  (Consider for example Andrew Chernow's company, which has
clients connecting to their database server from all over the
Internet...)

...Robert


Re: Proposal: functions get_text() or get_url()

From
Stefan Keller
Date:
Ok.
 
But again: There is a library mentioned and documented in the famous PostgreSQL book from Douglas & Douglas called pgcurl (http://gborg.postgresql.org/project/pgcurl/ ). Where's this gone?
Yours, S.
2009/5/20 Robert Haas <robertmhaas@gmail.com>
On Wed, May 20, 2009 at 6:34 AM, Stefan Keller <sfkeller@gmail.com> wrote:
> Questions: Don't see, why this would be a security issue: How could such a
> function do any harm? large files?

No, large files aren't the problem.  The problem is that the
PostgreSQL server process may have rights to access things that the
user doesn't.  For a simple case, imagine that PostgreSQL is behind a
firewall and the user is in front of the firewall, but there's a port
open to permit access to PostgreSQL.  Now imagine that there is a web
server behind the firewall.  The firewall blocks the user from
accessing the web server directly, but the user can ask PostgreSQL to
download the URLs for him.  In that way, the user can bypass the
firewall.  (Consider for example Andrew Chernow's company, which has
clients connecting to their database server from all over the
Internet...)

...Robert