Proposal: functions get_text() or get_url()

Started by Stefan Kellerover 16 years ago6 messages
#1Stefan Keller
sfkeller@gmail.com

How to insert a text file into a field in PostgreSQL?
I'd like to insert a row with fields from a local or remote text file.
I'd expect functions like get_text() or get_url() in order to do the
following:
INSERT INTO collection(id, path, content) VALUES(1, '/tmp/mytext,
get_text('/tmp/mytext));
AFAIK there was a get_url in libcurl but I neither find it any more. But
anyway: This should be part of the core... :->
-S.

#2Robert Haas
robertmhaas@gmail.com
In reply to: Stefan Keller (#1)
Re: Proposal: functions get_text() or get_url()

On Mon, May 18, 2009 at 4:03 PM, Stefan Keller <sfkeller@gmail.com> wrote:

How to insert a text file into a field in PostgreSQL?
I'd like to insert a row with fields from a local or remote text file.
I'd expect functions like get_text() or get_url() in order to do the
following:
INSERT INTO collection(id, path, content) VALUES(1, '/tmp/mytext,
get_text('/tmp/mytext));
AFAIK there was a get_url in libcurl but I neither find it any more. But
anyway: This should be part of the core... :->

Putting this into core would have security implications. The file or
URL would be downloaded by the PostgreSQL server process, not the
client process - therefore I think it would have to be super-user
only, which would make it much less useful. You'd also need a pretty
rich API to mimic all the crazy thinks a URL-fetching library like
libcurl knows how to do.

Except for very large objects where double-copying must be avoided, it
seems like having the client retrieve and pass the data is a simpler
solution. That having been said, nobody can stop you from writing
(and submitting) a patch.

...Robert

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#2)
Re: Proposal: functions get_text() or get_url()

Robert Haas <robertmhaas@gmail.com> writes:

On Mon, May 18, 2009 at 4:03 PM, Stefan Keller <sfkeller@gmail.com> wrote:

I'd expect functions like get_text() or get_url() in order to do the
following:
INSERT INTO collection(id, path, content) VALUES(1, '/tmp/mytext,
get_text('/tmp/mytext));

Apparently you've not found pg_read_file() ?

AFAIK there was a get_url in libcurl but I neither find it any more. But
anyway: This should be part of the core... :->

Putting this into core would have security implications. The file or
URL would be downloaded by the PostgreSQL server process, not the
client process - therefore I think it would have to be super-user
only, which would make it much less useful.

Yes. I very strongly doubt that we'd accept a url-fetching function at
all. Aside from the security issues, it would necessarily pull in a
boatload of dependencies that we'd prefer not to have.

Of course, you can write such a thing trivially in plperlu or several
other untrusted PLs, and include any security restrictions you see fit
while you're at it. I'm not seeing how a built-in function that would
have to impose one-size-fits-all security requirements would be an
improvement.

regards, tom lane

#4Stefan Keller
sfkeller@gmail.com
In reply to: Tom Lane (#3)
Re: Proposal: functions get_text() or get_url()

Tom,

Apparently you've not found pg_read_file() ?

Thanks a lot. Did'nt find this. This helped!

Still, get_url() would be handy too... :->

Questions: Don't see, why this would be a security issue: How could such a
function do any harm? large files?

Finally: Got some tricky followup questions regarding index usage in
tsearch2 and regex. Should I place these here (or else where?)?

Regards, S.

2009/5/19 Tom Lane <tgl@sss.pgh.pa.us>

Show quoted text

Robert Haas <robertmhaas@gmail.com> writes:

On Mon, May 18, 2009 at 4:03 PM, Stefan Keller <sfkeller@gmail.com>

wrote:

I'd expect functions like get_text() or get_url() in order to do the
following:
INSERT INTO collection(id, path, content) VALUES(1, '/tmp/mytext,
get_text('/tmp/mytext));

Apparently you've not found pg_read_file() ?

AFAIK there was a get_url in libcurl but I neither find it any more. But
anyway: This should be part of the core... :->

Putting this into core would have security implications. The file or
URL would be downloaded by the PostgreSQL server process, not the
client process - therefore I think it would have to be super-user
only, which would make it much less useful.

Yes. I very strongly doubt that we'd accept a url-fetching function at
all. Aside from the security issues, it would necessarily pull in a
boatload of dependencies that we'd prefer not to have.

Of course, you can write such a thing trivially in plperlu or several
other untrusted PLs, and include any security restrictions you see fit
while you're at it. I'm not seeing how a built-in function that would
have to impose one-size-fits-all security requirements would be an
improvement.

regards, tom lane

#5Robert Haas
robertmhaas@gmail.com
In reply to: Stefan Keller (#4)
Re: Proposal: functions get_text() or get_url()

On Wed, May 20, 2009 at 6:34 AM, Stefan Keller <sfkeller@gmail.com> wrote:

Questions: Don't see, why this would be a security issue: How could such a
function do any harm? large files?

No, large files aren't the problem. The problem is that the
PostgreSQL server process may have rights to access things that the
user doesn't. For a simple case, imagine that PostgreSQL is behind a
firewall and the user is in front of the firewall, but there's a port
open to permit access to PostgreSQL. Now imagine that there is a web
server behind the firewall. The firewall blocks the user from
accessing the web server directly, but the user can ask PostgreSQL to
download the URLs for him. In that way, the user can bypass the
firewall. (Consider for example Andrew Chernow's company, which has
clients connecting to their database server from all over the
Internet...)

...Robert

#6Stefan Keller
sfkeller@gmail.com
In reply to: Robert Haas (#5)
Re: Proposal: functions get_text() or get_url()

Ok.

But again: There is a library mentioned and documented in the famous
PostgreSQL book from Douglas & Douglas called pgcurl (
http://gborg.postgresql.org/project/pgcurl/ ). Where's this gone?
Yours, S.
2009/5/20 Robert Haas <robertmhaas@gmail.com>

Show quoted text

On Wed, May 20, 2009 at 6:34 AM, Stefan Keller <sfkeller@gmail.com> wrote:

Questions: Don't see, why this would be a security issue: How could such

a

function do any harm? large files?

No, large files aren't the problem. The problem is that the
PostgreSQL server process may have rights to access things that the
user doesn't. For a simple case, imagine that PostgreSQL is behind a
firewall and the user is in front of the firewall, but there's a port
open to permit access to PostgreSQL. Now imagine that there is a web
server behind the firewall. The firewall blocks the user from
accessing the web server directly, but the user can ask PostgreSQL to
download the URLs for him. In that way, the user can bypass the
firewall. (Consider for example Andrew Chernow's company, which has
clients connecting to their database server from all over the
Internet...)

...Robert