Specification for Trusted PLs?

Started by David Fetter over 15 years ago, 53 messages
#1David Fetter
david@fetter.org

Folks,

I feel dumb.

I have been looking for a document which specifies what trusted and
untrusted PLs must do and forbid, so far without result.

Where do we document this, and if we don't where *should* we document
this?

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

#2Stephen Frost
sfrost@snowman.net
In reply to: David Fetter (#1)
Re: Specification for Trusted PLs?

* David Fetter (david@fetter.org) wrote:

I have been looking for a document which specifies what trusted and
untrusted PLs must do and forbid, so far without result.

I think you might have been missing the tree for the forest in this
case.. :) I'm sure you've seen this, but perhaps you weren't thinking
about how broad it really is:

http://www.postgresql.org/docs/9.0/static/sql-createlanguage.html

TRUSTED

TRUSTED specifies that the language is safe, that is, it does not
offer an unprivileged user any functionality to bypass access
restrictions. If this key word is omitted when registering the
language, only users with the PostgreSQL superuser privilege can use
this language to create new functions.

That's about it- a language is TRUSTED if there's no way for a user to
be able to write a function which will give them access to things
they're not supposed to have. Practically, this includes things like
any kind of direct I/O (files, network, etc).
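
A minimal sketch of the rule quoted above, written in Python purely as an illustration and not as anything PostgreSQL itself runs. The function name and its boolean parameters are hypothetical; the USAGE check for trusted languages is an assumption that goes slightly beyond the quoted paragraph.

```python
def may_create_function(language_is_trusted: bool,
                        user_is_superuser: bool,
                        user_has_usage: bool) -> bool:
    """Model of who may create a function in a given procedural language."""
    if user_is_superuser:
        # Superusers may use any registered language.
        return True
    if not language_is_trusted:
        # Languages registered without TRUSTED are reserved for superusers.
        return False
    # Trusted languages are open to ordinary users holding USAGE on them.
    return user_has_usage
```

The point of the sketch is only that the TRUSTED flag moves the gate from "superuser only" to "anyone granted the language", which is why the language itself has to be safe.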

Where do we document this, and if we don't where *should* we document
this?

I'd be hesitant about trying to document exactly what a PL must do to be
trusted at a more granular level than what's above- mostly because, if
we change some functionality, we would end up having to document that
change in the place which is appropriate for it and then also in the
list of "things trusted PLs shouldn't do/allow".

Thanks,

Stephen

#3Peter Geoghegan
peter.geoghegan86@gmail.com
In reply to: Stephen Frost (#2)
Re: Specification for Trusted PLs?

That's about it- a language is TRUSTED if there's no way for a user to
be able to write a function which will give them access to things
they're not supposed to have.  Practically, this includes things like
any kind of direct I/O (files, network, etc).

The fact that plpythonu used to be plpython back in 7.3 serves to
illustrate that the distinction is not all that well defined. I guess
that someone made an executive decision that the python restricted
execution environment wasn't restricted enough.

Regards,
Peter Geoghegan

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Geoghegan (#3)
Re: Specification for Trusted PLs?

Peter Geoghegan <peter.geoghegan86@gmail.com> writes:

That's about it- a language is TRUSTED if there's no way for a user to
be able to write a function which will give them access to things
they're not supposed to have. Practically, this includes things like
any kind of direct I/O (files, network, etc).

The fact that plpythonu used to be plpython back in 7.3 serves to
illustrate that the distinction is not all that well defined. I guess
that someone made an executive decision that the python restricted
execution environment wasn't restricted enough.

Well, it was the upstream authors of python's restricted execution
environment who decided it was unfixably insecure, not us. So the
"trusted" version had to go away.

(For a while there last month, it was looking like plperl was going to
suffer the same fate :-(. Fortunately Tim Bunce thought of a way to
not have to rely on Safe.pm anymore.)

regards, tom lane

#5Josh Berkus
josh@agliodbs.com
In reply to: Tom Lane (#4)
Re: Specification for Trusted PLs?

So, here's a working definition:

1) cannot directly read or write files on the server.
2) cannot bind network ports
3) uses only the SPI interface to interact with postgresql tables etc.
4) does any logging only using elog to the postgres log
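
The four items above can be read as a checklist. A rough sketch in Python, assuming made-up capability names (none of these identifiers exist in PostgreSQL), of how a PL's feature set might be screened against it:

```python
# Hypothetical capability names, one per item in the working definition.
FORBIDDEN = {
    "file_io": "reads or writes files on the server directly",
    "bind_port": "binds network ports",
    "non_spi_table_access": "touches tables other than through SPI",
    "non_elog_logging": "logs anywhere other than elog",
}

def violations(pl_capabilities):
    """Return the reasons a PL with these capabilities fails the definition."""
    return sorted(FORBIDDEN[c] for c in pl_capabilities if c in FORBIDDEN)

def is_trusted_candidate(pl_capabilities):
    """True when none of the PL's capabilities appears on the forbidden list."""
    return not violations(pl_capabilities)
```

For example, `is_trusted_candidate({"string_ops", "spi_query"})` holds, while adding `"file_io"` to the set makes `violations` report the first item.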

Questions:

a) it seems like there should be some kind of restriction on access to
memory, but I'm not clear on how that would be defined.

b) where are we with the whole trusted module thing? Like for CPAN
modules etc.

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com

#6Magnus Hagander
magnus@hagander.net
In reply to: Josh Berkus (#5)
Re: Specification for Trusted PLs?

On Fri, May 21, 2010 at 11:55 AM, Josh Berkus <josh@agliodbs.com> wrote:

So, here's a working definition:

1) cannot directly read or write files on the server.
2) cannot bind network ports

To make that more covering, don't you really need something like
"cannot communicate with outside processes"?

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

#7Josh Berkus
josh@agliodbs.com
In reply to: Magnus Hagander (#6)
Re: Specification for Trusted PLs?

On 05/21/2010 11:57 AM, Magnus Hagander wrote:

On Fri, May 21, 2010 at 11:55 AM, Josh Berkus<josh@agliodbs.com> wrote:

So, here's a working definition:

1) cannot directly read or write files on the server.
2) cannot bind network ports

To make that more covering, don't you really need something like
"cannot communicate with outside processes"?

So, no interprocess communication except through the SPI interface? How
do module GUCs and things like %_SHARED fit into this?

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com

#8David Fetter
david@fetter.org
In reply to: Magnus Hagander (#6)
Re: Specification for Trusted PLs?

On Fri, May 21, 2010 at 11:57:33AM -0400, Magnus Hagander wrote:

On Fri, May 21, 2010 at 11:55 AM, Josh Berkus <josh@agliodbs.com> wrote:

So, here's a working definition:

1) cannot directly read or write files on the server.
2) cannot bind network ports

To make that more covering, don't you really need something like
"cannot communicate with outside processes"?

These need to be testable conditions, and new tests need to get added
any time we find that we've missed something. Making this concept
fuzzier is exactly the wrong direction to go.

Cheers,
David.

#9Magnus Hagander
magnus@hagander.net
In reply to: David Fetter (#8)
Re: Specification for Trusted PLs?

On Fri, May 21, 2010 at 12:22 PM, David Fetter <david@fetter.org> wrote:

On Fri, May 21, 2010 at 11:57:33AM -0400, Magnus Hagander wrote:

On Fri, May 21, 2010 at 11:55 AM, Josh Berkus <josh@agliodbs.com> wrote:

So, here's a working definition:

1) cannot directly read or write files on the server.
2) cannot bind network ports

To make that more covering, don't you really need something like
"cannot communicate with outside processes"?

These need to be testable conditions, and new tests need to get added
any time we find that we've missed something.  Making this concept
fuzzier is exactly the wrong direction to go.

Well, the best way to define what a trusted language can do is to
define a *whitelist* of what it can do, not a blacklist of what it
can't do. That's the only way to get a complete definition. It's then
up to the implementation step to figure out how to represent that in
the form of tests.
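
A small Python sketch of the difference, with invented operation names that are assumptions for illustration only. It shows how the two policies treat an operation nobody listed in advance:

```python
BLACKLIST = {"open_file", "bind_port"}             # known-dangerous operations
WHITELIST = {"concat", "arithmetic", "spi_query"}  # known-safe operations

def allowed_by_blacklist(op):
    # Anything not explicitly forbidden is permitted.
    return op not in BLACKLIST

def allowed_by_whitelist(op):
    # Anything not explicitly permitted is refused.
    return op in WHITELIST

# An operation nobody thought about when the lists were written.
unforeseen = "spawn_process"
```

Under the blacklist, `unforeseen` slips through; under the whitelist it is refused until someone adds it. That default is what makes a whitelist complete, at the price of refusing harmless operations that were simply never listed.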

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

#10Stephen Frost
sfrost@snowman.net
In reply to: David Fetter (#8)
Re: Specification for Trusted PLs?

* David Fetter (david@fetter.org) wrote:

These need to be testable conditions, and new tests need to get added
any time we find that we've missed something. Making this concept
fuzzier is exactly the wrong direction to go.

I'm really not sure that we want to be in the business of writing a ton
of regression tests to see if languages which claim to be trusted really
are..

Stephen

#11David Fetter
david@fetter.org
In reply to: Stephen Frost (#10)
Re: Specification for Trusted PLs?

On Fri, May 21, 2010 at 12:26:24PM -0400, Stephen Frost wrote:

* David Fetter (david@fetter.org) wrote:

These need to be testable conditions, and new tests need to get
added any time we find that we've missed something. Making this
concept fuzzier is exactly the wrong direction to go.

I'm really not sure that we want to be in the business of writing a
ton of regression tests to see if languages which claim to be
trusted really are..

That is *precisely* the business we need to be in, at least for the
languages we ship, and it would behoove us to test languages we don't
ship so we can warn people when they don't pass.

Cheers,
David.

#12Stephen Frost
sfrost@snowman.net
In reply to: David Fetter (#11)
Re: Specification for Trusted PLs?

* David Fetter (david@fetter.org) wrote:

That is *precisely* the business we need to be in, at least for the
languages we ship, and it would behoove us to test languages we don't
ship so we can warn people when they don't pass.

k, let's start with something simpler first tho- I'm sure we can pull in
the glibc regression tests and run them too. You know, just in case
there's a bug there, somewhere.

Thanks,

Stephen

#13David Fetter
david@fetter.org
In reply to: Stephen Frost (#12)
Re: Specification for Trusted PLs?

On Fri, May 21, 2010 at 01:45:45PM -0400, Stephen Frost wrote:

* David Fetter (david@fetter.org) wrote:

That is *precisely* the business we need to be in, at least for the
languages we ship, and it would behoove us to test languages we don't
ship so we can warn people when they don't pass.

k, let's start with something simpler first tho- I'm sure we can pull in
the glibc regression tests and run them too. You know, just in case
there's a bug there, somewhere.

That's pretty pure straw man argument. I expect much higher quality
trolling. D-.

Cheers,
David.

#14Florian Pflug
fgp@phlo.org
In reply to: Stephen Frost (#10)
Re: Specification for Trusted PLs?

On May 21, 2010, at 18:26 , Stephen Frost wrote:

* David Fetter (david@fetter.org) wrote:

These need to be testable conditions, and new tests need to get added
any time we find that we've missed something. Making this concept
fuzzier is exactly the wrong direction to go.

I'm really not sure that we want to be in the business of writing a ton
of regression tests to see if languages which claim to be trusted really
are..

Well, testing software security via regression tests certainly sounds intriguing. But unfortunately, it's also impossible AFAICS - it'd amount to testing for the *absence* of features, which seems hard...

I suggest the following definition of "trusted PL".
"While potentially preventing excruciating pain, saving tons of sweat and allowing code reuse, actually adds nothing in terms of features over pl/pgsql".

best regards,
Florian Pflug

#15Stephen Frost
sfrost@snowman.net
In reply to: David Fetter (#13)
Re: Specification for Trusted PLs?

* David Fetter (david@fetter.org) wrote:

On Fri, May 21, 2010 at 01:45:45PM -0400, Stephen Frost wrote:

k, let's start with something simpler first tho- I'm sure we can pull in
the glibc regression tests and run them too. You know, just in case
there's a bug there, somewhere.

That's pretty pure straw man argument. I expect much higher quality
trolling. D-.

Sorry, but seriously, at some point we have to expect that the tools we
use will behave according to their claims and their documentation, at
least until proven otherwise. I don't like that it means we may end up
having to issue CVE's when there are issues in things we use, but I
don't think that means we shouldn't use other libraries or we should
spend a lot of time working on validating those tools. Presumably, they
have communities who do that.

As an example, consider the zlib issue that happened not too long ago
and the subsequent many CVE's that came of it. We could have reviewed
zlib better and possibly found that bug, but I don't know that it would
be the best use of our rather limited resources. Additionally, trying
to go into other code bases like that to do that kind of detailed review
would necessarily be much more difficult for those who are not familiar
with it. etc, etc...

Stephen

#16Tom Lane
tgl@sss.pgh.pa.us
In reply to: David Fetter (#11)
Re: Specification for Trusted PLs?

David Fetter <david@fetter.org> writes:

On Fri, May 21, 2010 at 12:26:24PM -0400, Stephen Frost wrote:

I'm really not sure that we want to be in the business of writing a
ton of regression tests to see if languages which claim to be
trusted really are..

That is *precisely* the business we need to be in, at least for the
languages we ship, and it would behoove us to test languages we don't
ship so we can warn people when they don't pass.

I can't see us writing an AI-complete set of tests for each language
we ship, let alone ones we don't. Testing can prove the presence of
bugs, not their absence --- and that applies in spades to security
holes.

regards, tom lane

#17Robert Haas
robertmhaas@gmail.com
In reply to: David Fetter (#13)
Re: Specification for Trusted PLs?

On Fri, May 21, 2010 at 1:58 PM, David Fetter <david@fetter.org> wrote:

On Fri, May 21, 2010 at 01:45:45PM -0400, Stephen Frost wrote:

* David Fetter (david@fetter.org) wrote:

That is *precisely* the business we need to be in, at least for the
languages we ship, and it would behoove us to test languages we don't
ship so we can warn people when they don't pass.

k, let's start with something simpler first tho- I'm sure we can pull in
the glibc regression tests and run them too.  You know, just in case
there's a bug there, somewhere.

That's pretty pure straw man argument.  I expect much higher quality
trolling.  D-.

I'm sorely tempted to try to provide some higher-quality trolling, but
in all seriousness I think that (1) we could certainly use much better
regression tests in many areas of which this is one and (2) it will
never be possible to catch all security bugs - in particular - via
regression testing because they typically stem from cases people
didn't consider. So... can we get back to coming up with a reasonable
definition, and if somebody wants to write some regression tests, all
the better?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

#18Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#17)
Re: Specification for Trusted PLs?

Robert Haas <robertmhaas@gmail.com> writes:

So... can we get back to coming up with a reasonable
definition,

(1) no access to system calls (including file and network I/O)

(2) no access to process memory, other than variables defined within the
PL.

What else?

regards, tom lane

#19Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#18)
Re: Specification for Trusted PLs?

On Fri, May 21, 2010 at 2:21 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

So... can we get back to coming up with a reasonable
definition,

(1) no access to system calls (including file and network I/O)

(2) no access to process memory, other than variables defined within the
PL.

What else?

Doesn't subvert the general PostgreSQL security mechanisms? Not sure
how to formulate that.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

#20Greg Sabino Mullane
greg@turnstep.com
In reply to: Magnus Hagander (#9)
Re: Specification for Trusted PLs?

-----BEGIN PGP SIGNED MESSAGE-----
Hash: RIPEMD160

Well, the best way to define what a trusted language can do is to
define a *whitelist* of what it can do, not a blacklist of what it
can't do. That's the only way to get a complete definition. It's then
up to the implementation step to figure out how to represent that in
the form of tests.

No, that's exactly backwards. We can't define all the things a language
can do, but we can certainly lay out the things that it is not supposed to.

- --
Greg Sabino Mullane greg@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 201005211452
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-----BEGIN PGP SIGNATURE-----

iEYEAREDAAYFAkv21oIACgkQvJuQZxSWSsg8lQCdFKNXO5XWD5bJ0lQAx3prFYGW
5CYAnjHiuwKVAxvwjl/clyiwCtXCVvr0
=5tSD
-----END PGP SIGNATURE-----

#21Stephen Frost
sfrost@snowman.net
In reply to: Robert Haas (#17)
Re: Specification for Trusted PLs?

* Robert Haas (robertmhaas@gmail.com) wrote:

So... can we get back to coming up with a reasonable
definition, and

Guess I'm wondering if we could steal such a definition from one of the
languages we allow as trusted already.. Just a thought. I certainly
think we should make sure that we document how untrusted languages are
handled from the PG point of view (eg: can't change ownership).

if somebody wants to write some regression tests, all
the better?

I certainly am fine with that to the extent that they want to work on
that instead of hacking PG.. Guess I just don't think it should be a
priority for us to come up with a significant regression suite for
pieces that are supposedly being externally managed.

Thanks,

Stephen

#22Tom Lane
tgl@sss.pgh.pa.us
In reply to: Greg Sabino Mullane (#20)
Re: Specification for Trusted PLs?

"Greg Sabino Mullane" <greg@turnstep.com> writes:

Well, the best way to define what a trusted language can do is to
define a *whitelist* of what it can do, not a blacklist of what it
can't do.

No, that's exactly backwards. We can't define all the things a language
can do, but we can certainly lay out the things that it is not supposed to.

Yeah. The whole point of allowing multiple PLs is that some of them
make it possible/easy to do things you can't (easily) do in others.
So I'm not sure that a whitelist is going to be especially useful.

regards, tom lane

#23Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#19)
Re: Specification for Trusted PLs?

Robert Haas <robertmhaas@gmail.com> writes:

On Fri, May 21, 2010 at 2:21 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

(1) no access to system calls (including file and network I/O)
(2) no access to process memory, other than variables defined within the
PL.
What else?

Doesn't subvert the general PostgreSQL security mechanisms? Not sure
how to formulate that.

As long as you can't do database access except via SPI, that should be
covered. So I guess the next item on the list is no, or at least
restricted, access to functions outside the PL's own language.

regards, tom lane

#24David Fetter
david@fetter.org
In reply to: Tom Lane (#23)
Re: Specification for Trusted PLs?

On Fri, May 21, 2010 at 03:15:27PM -0400, Tom Lane wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Fri, May 21, 2010 at 2:21 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

(1) no access to system calls (including file and network I/O)
(2) no access to process memory, other than variables defined within the
PL.
What else?

Doesn't subvert the general PostgreSQL security mechanisms? Not
sure how to formulate that.

As long as you can't do database access except via SPI, that should
be covered. So I guess the next item on the list is no, or at least
restricted, access to functions outside the PL's own language.

"No access" seems pretty draconian.

How about limiting such access to functions of equal or lower
trustedness? Surely an untrusted function shouldn't be restricted
from calling other untrusted functions based on the language they're
written in.

Cheers,
David (who is not, at this point, going to suggest that a "trusted"
boolean may inadequately reflect users' needs)

#25David Fetter
david@fetter.org
In reply to: David Fetter (#24)
Re: Specification for Trusted PLs?

On Fri, May 21, 2010 at 12:36:50PM -0700, David Fetter wrote:

On Fri, May 21, 2010 at 03:15:27PM -0400, Tom Lane wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Fri, May 21, 2010 at 2:21 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

(1) no access to system calls (including file and network I/O)
(2) no access to process memory, other than variables defined within the
PL.
What else?

Doesn't subvert the general PostgreSQL security mechanisms? Not
sure how to formulate that.

As long as you can't do database access except via SPI, that should
be covered. So I guess the next item on the list is no, or at least
restricted, access to functions outside the PL's own language.

"No access" seems pretty draconian.

How about limiting such access to functions of equal or lower
trustedness?

I see that's confusing. What I meant was that functions in trusted
languages should be able to call other functions in trusted languages,
while functions in untrusted languages shouldn't be restricted as to
what other functions they can call.

Cheers,
David.

#26Joshua Tolley
eggyknap@gmail.com
In reply to: David Fetter (#24)
Re: Specification for Trusted PLs?

On Fri, May 21, 2010 at 1:36 PM, David Fetter <david@fetter.org> wrote:

On Fri, May 21, 2010 at 03:15:27PM -0400, Tom Lane wrote:

As long as you can't do database access except via SPI, that should
be covered.  So I guess the next item on the list is no, or at least
restricted, access to functions outside the PL's own language.

"No access" seems pretty draconian.

How about limiting such access to functions of equal or lower
trustedness?  Surely an untrusted function shouldn't be restricted
from calling other untrusted functions based on the language they're
written in.

Agreed. As long as a trusted language can do things outside the
database only by going through a database and calling some function to
which the user has rights, in an untrusted language, that seems decent
to me. A user with permissions to launch_missiles() would have a
function in an untrusted language to do it, but there's no reason an
untrusted language shouldn't be able to say "SELECT
launch_missiles()".

--
Joshua Tolley / eggyknap
End Point Corporation

#27Tom Lane
tgl@sss.pgh.pa.us
In reply to: Joshua Tolley (#26)
Re: Specification for Trusted PLs?

Joshua Tolley <eggyknap@gmail.com> writes:

Agreed. As long as a trusted language can do things outside the
database only by going through a database and calling some function to
which the user has rights, in an untrusted language, that seems decent
to me. A user with permissions to launch_missiles() would have a
function in an untrusted language to do it, but there's no reason an
untrusted language shouldn't be able to say "SELECT

s/untrusted/trusted/ here, right?

launch_missiles()".

To me, as long as they go back into the database via SPI, anything they
can get to from there is OK. What I meant to highlight upthread is that
we don't want trusted functions being able to access other functions
"directly" without going through SQL. As an example, a PL that has FFI
capability sufficient to allow direct access to heap_insert() would
have to be considered untrusted.
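
To make the distinction concrete, here is a toy Python model. Everything in it is hypothetical (the grants table, the role names, both call helpers); it only illustrates why a path through SQL is acceptable and a direct jump is not.

```python
class PermissionDenied(Exception):
    pass

# Hypothetical catalog: function name -> roles granted EXECUTE on it.
EXECUTE_GRANTS = {"launch_missiles": {"general"}}

def call_via_sql(role, func_name):
    """SQL/SPI path: the privilege check runs before the function does."""
    if role not in EXECUTE_GRANTS.get(func_name, set()):
        raise PermissionDenied(f"{role} may not execute {func_name}")
    return f"{func_name} executed for {role}"

def call_directly(role, func_name):
    """FFI-style path: jumps straight to the code, so nothing is checked."""
    return f"{func_name} executed for {role}"
```

In this model `call_via_sql("private", "launch_missiles")` raises `PermissionDenied`, while `call_directly` with the same arguments succeeds. A PL offering the second path bypasses the database's own checks, so it could not be marked trusted.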

regards, tom lane

#28Jonathan Leto
jonathan@leto.net
In reply to: Tom Lane (#18)
Re: Specification for Trusted PLs?

Howdy,

On Fri, May 21, 2010 at 11:21 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

So... can we get back to coming up with a reasonable
definition,

(1) no access to system calls (including file and network I/O)

(2) no access to process memory, other than variables defined within the
PL.

What else?

I ran across this comment in PL/Perl while implementing PL/Parrot, and
I think it should be taken into consideration for the definition of
trusted/untrusted:

/*
 * plperl.on_plperl_init is currently PGC_SUSET to avoid issues whereby a
 * user who doesn't have USAGE privileges on the plperl language could
 * possibly use SET plperl.on_plperl_init='...' to influence the behaviour
 * of any existing plperl function that they can EXECUTE (which may be
 * security definer). See
 * http://archives.postgresql.org/pgsql-hackers/2010-02/msg00281.php and
 * the overall thread.
 */
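
As a rough illustration of what that comment guards against, here is a Python sketch of the two GUC contexts involved. It is a deliberate simplification: only PGC_USERSET and PGC_SUSET are modelled, and `may_set` is a hypothetical helper, not a PostgreSQL function.

```python
PGC_USERSET = "userset"   # any user may SET the parameter
PGC_SUSET = "suset"       # only a superuser may SET it

def may_set(guc_context, is_superuser):
    """Decide whether a SET on a parameter with this context is accepted."""
    if guc_context == PGC_SUSET:
        return is_superuser
    return True
```

With the parameter marked PGC_SUSET, `may_set(PGC_SUSET, False)` is false, so an ordinary user cannot inject initialization code into plperl functions they are merely allowed to EXECUTE.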

Duke

--
Jonathan "Duke" Leto
jonathan@leto.net
http://leto.net

#29Joshua Tolley
eggyknap@gmail.com
In reply to: Tom Lane (#27)
Re: Specification for Trusted PLs?

On Fri, May 21, 2010 at 2:04 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Joshua Tolley <eggyknap@gmail.com> writes:

Agreed. As long as a trusted language can do things outside the
database only by going through a database and calling some function to
which the user has rights, in an untrusted language, that seems decent
to me. A user with permissions to launch_missiles() would have a
function in an untrusted language to do it, but there's no reason an
untrusted language shouldn't be able to say "SELECT

s/untrusted/trusted/ here, right?

Er, right. Sorry.

launch_missiles()".

To me, as long as they go back into the database via SPI, anything they
can get to from there is OK.  What I meant to highlight upthread is that
we don't want trusted functions being able to access other functions
"directly" without going through SQL.  As an example, a PL that has FFI
capability sufficient to allow direct access to heap_insert() would
have to be considered untrusted.

That I can definitely agree with.

--
Joshua Tolley / eggyknap
End Point Corporation

#30Jan Wieck
JanWieck@Yahoo.com
In reply to: Tom Lane (#27)
Re: Specification for Trusted PLs?

The original idea was that a trusted language does not allow an
unprivileged user to gain access to any object or data that he does not
have access to without that language.

This does not include data transformation functionality, like string
processing or the like. As long as the user had legitimate access to the
input datum, then every derived form thereof is OK.
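
One way to phrase that idea as a check, sketched in Python under the assumption that "what a user can reach" can be written down as a set of object names (the helper names are invented for illustration):

```python
def newly_reachable(without_pl, with_pl):
    """Objects a user can reach only because the PL exists."""
    return set(with_pl) - set(without_pl)

def satisfies_original_idea(without_pl, with_pl):
    # Trusted means the language adds no new reachable objects or data.
    return not newly_reachable(without_pl, with_pl)
```

A PL that only transforms data the user could already read leaves both sets equal, so the check passes. A PL that exposes, say, a server file adds an element to the second set and fails it.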

Jan

--
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin

#31Alexey Klyukin
alexk@commandprompt.com
In reply to: Magnus Hagander (#9)
Re: Specification for Trusted PLs?

On Fri, May 21, 2010 at 7:25 PM, Magnus Hagander <magnus@hagander.net> wrote:

On Fri, May 21, 2010 at 12:22 PM, David Fetter <david@fetter.org> wrote:

On Fri, May 21, 2010 at 11:57:33AM -0400, Magnus Hagander wrote:

On Fri, May 21, 2010 at 11:55 AM, Josh Berkus <josh@agliodbs.com> wrote:

So, here's a working definition:

1) cannot directly read or write files on the server.
2) cannot bind network ports

To make that more covering, don't you really need something like
"cannot communicate with outside processes"?

These need to be testable conditions, and new tests need to get added
any time we find that we've missed something.  Making this concept
fuzzier is exactly the wrong direction to go.

Well, the best way to define what a trusted language can do is to
define a *whitelist* of what it can do, not a blacklist of what it
can't do. That's the only way to get a complete definition. It's then
up to the implementation step to figure out how to represent that in
the form of tests.

Yes, PL/Perl is following this approach. For a whitelist see
plperl_opmask.h (generated by plperl_opmask.pl at build phase).

--
Alexey Klyukin www.CommandPrompt.com
The PostgreSQL Company - Command Prompt, Inc

#32Cédric Villemain
cedric.villemain.debian@gmail.com
In reply to: Jan Wieck (#30)
Re: Specification for Trusted PLs?

2010/5/21 Jan Wieck <JanWieck@yahoo.com>:

The original idea was that a trusted language does not allow an unprivileged
user to gain access to any object or data that he does not have access to
without that language.

This does not include data transformation functionality, like string
processing or the like. As long as the user had legitimate access to the
input datum, then every derived form thereof is OK.

I find the current doc sufficient; adding this prose from Jan as a
comment might help people, perhaps.


--
Cédric Villemain

#33Robert Haas
robertmhaas@gmail.com
In reply to: Cédric Villemain (#32)
Re: Specification for Trusted PLs?

On Sat, May 22, 2010 at 4:53 PM, Cédric Villemain
<cedric.villemain.debian@gmail.com> wrote:

2010/5/21 Jan Wieck <JanWieck@yahoo.com>:

The original idea was that a trusted language does not allow an unprivileged
user to gain access to any object or data that he does not have access to
without that language.

This does not include data transformation functionality, like string
processing or the like. As long as the user had legitimate access to the
input datum, then every derived form thereof is OK.

I find the current doc sufficient; adding this prose from Jan as a
comment might help people, perhaps.

Yeah, Jan's description is very clear and to the point.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

#34Ron Mayer
rm_pg@cheapcomplexdevices.com
In reply to: Tom Lane (#18)
Re: Specification for Trusted PLs?

Tom Lane wrote:

Robert Haas <robertmhaas@gmail.com> writes:

So... can we get back to coming up with a reasonable
definition,

(1) no access to system calls (including file and network I/O)

If a PL has file access to its own sandbox (similar to what
flash seems to do in web browsers), could that be considered
trusted?

#35Jan Wieck
JanWieck@Yahoo.com
In reply to: Ron Mayer (#34)
Re: Specification for Trusted PLs?

On 5/23/2010 6:14 PM, Ron Mayer wrote:

Tom Lane wrote:

Robert Haas <robertmhaas@gmail.com> writes:

So... can we get back to coming up with a reasonable
definition,

(1) no access to system calls (including file and network I/O)

If a PL has file access to its own sandbox (similar to what
flash seems to do in web browsers), could that be considered
trusted?

That is a good question.

Currently, the first of all TRUSTED languages, PL/Tcl, would allow the
function of a lesser privileged user to access the "global" objects of
every other database user created within the same session.

These are per-backend in-memory objects, but nonetheless, an evil
function could just scan the per-backend Tcl namespace and look for
compromising data, and that's not exactly what TRUSTED is all about.

In the case of Tcl it is possible to create a separate "safe"
interpreter per DB role to fix this. I actually think this would be the
right thing to do.
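
A toy Python model of the idea, not of PL/Tcl's actual implementation: the `Backend` class and its `namespace` method are hypothetical stand-ins for "the interpreter state a function sees", keyed either per session or per role.

```python
class Backend:
    """One database session; holds PL state between function calls."""

    def __init__(self, per_role: bool):
        self.per_role = per_role
        self._shared = {}    # single namespace for the whole session
        self._by_role = {}   # one namespace per role, created on demand

    def namespace(self, role):
        """Return the global namespace visible to functions run as role."""
        if not self.per_role:
            return self._shared
        return self._by_role.setdefault(role, {})
```

With `Backend(per_role=False)`, a value stored while running as one role is visible to every other role in the session. With `per_role=True`, each role gets its own namespace and the scan described above finds nothing.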

Jan

--
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin

#36Andrew Dunstan
andrew@dunslane.net
In reply to: Jan Wieck (#35)
Re: Specification for Trusted PLs?

Jan Wieck wrote:

On 5/23/2010 6:14 PM, Ron Mayer wrote:

Tom Lane wrote:

Robert Haas <robertmhaas@gmail.com> writes:

So... can we get back to coming up with a reasonable
definition,

(1) no access to system calls (including file and network I/O)

If a PL has file access to its own sandbox (similar to what
flash seems to do in web browsers), could that be considered
trusted?

That is a good question.

Currently, the first of all TRUSTED languages, PL/Tcl, would allow the
function of a less privileged user to access the "global" objects of
every other database user created within the same session.

These are per-backend, in-memory objects, but nonetheless an evil
function could just scan the per-backend Tcl namespace and look for
compromising data, and that's not exactly what TRUSTED is all about.

In the case of Tcl it is possible to create a separate "safe"
interpreter per DB role to fix this. I actually think this would be
the right thing to do.

I think that would probably be serious overkill. Maybe a data stash per
role rather than an interpreter per role would be doable. It would
certainly be more lightweight.
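
To make the difference between a single shared stash and a per-role stash concrete, here is a minimal Python sketch. It is a toy model only, not PL/Tcl or PL/Perl code: the class names, the role names and the `api_key` entry are all made up for illustration.

```python
from collections import defaultdict


class SharedStash:
    """Toy model of one session-wide stash: every function sees the same dict."""

    def __init__(self):
        self._data = {}

    def for_role(self, role):
        # The role is ignored: all callers get the same dictionary.
        return self._data


class PerRoleStash:
    """Toy model of a per-role stash: each role gets its own private dict."""

    def __init__(self):
        self._data = defaultdict(dict)

    def for_role(self, role):
        # Each role name maps to a separate dictionary.
        return self._data[role]


def demo(stash):
    # A function owned by the (hypothetical) role "admin" stashes a value.
    stash.for_role("admin")["api_key"] = "s3cret"
    # A function owned by the (hypothetical) role "guest" looks for it later.
    return stash.for_role("guest").get("api_key")


shared_result = demo(SharedStash())
per_role_result = demo(PerRoleStash())
print(shared_result)    # the guest function sees the admin's value
print(per_role_result)  # the guest function sees nothing
```

The per-role variant costs one extra dictionary lookup, which is roughly the sense in which a stash per role would be lighter than a whole interpreter per role.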

ISTM we are in danger of confusing several different things. A user that
doesn't want data to be shared should not stash it in global objects.
But to me, trusting a language is not about making data private, but
about not allowing the user to do things that are dangerous, such as
referencing memory, or the file system, or the operating system, or
network connections, or loading code which might do any of those things.

cheers

andrew

#37Craig Ringer
craig@postnewspapers.com.au
In reply to: Josh Berkus (#5)
Re: Specification for Trusted PLs?

On 21/05/10 23:55, Josh Berkus wrote:

So, here's a working definition:

1) cannot directly read or write files on the server.

It must also prevent PL-user-level access to file descriptors already
open by the backend. That's implicitly covered in the above, but should
probably be explicit.

2) cannot bind network ports
3) uses only the SPI interface to interact with postgresql tables etc.
4) does any logging only using elog to the postgres log

5) Cannot dynamically load shared libraries from user-supplied locations

(eg in Python, 'import' of a module that had a .so component would be
blocked unless it was in the core module path)

a) it seems like there should be some kind of restriction on access to
memory, but I'm not clear on how that would be defined.

Like:

6) Has no way to directly access backend memory, i.e. doesn't have
PL-user-accessible pointers or user access to any C-level calls that
take/return them. Data structures containing pointers must be opaque to
the PL user.

The idea being that if you have no access to C APIs that work with
pointers to memory, and you can't use files (/dev/mem, /proc/self/mem,
etc), you can't work with backend memory directly.
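
The shared-library restriction above (no loading of compiled modules from user-supplied locations) could be sketched as a path check. This is a rough illustration under stated assumptions, not how any existing PL does it: the directory in `CORE_MODULE_DIRS` and the function name `may_load_shared_object` are hypothetical, and the check is purely lexical.

```python
from pathlib import PurePosixPath

# Hypothetical core module directory; a real PL would take this from its
# own build or installation configuration.
CORE_MODULE_DIRS = [PurePosixPath("/usr/lib/python3/lib-dynload")]


def may_load_shared_object(path):
    """Permit a shared object only if it sits under a core module directory."""
    candidate = PurePosixPath(path)
    if ".." in candidate.parts:
        # Refuse paths that could climb out of an allowed directory.
        return False
    # Lexical check only: symlinks are not resolved here, which a real
    # implementation would also have to deal with.
    return any(core in candidate.parents for core in CORE_MODULE_DIRS)


print(may_load_shared_object("/usr/lib/python3/lib-dynload/_ssl.so"))
print(may_load_shared_object("/tmp/evil.so"))
```

The point of the sketch is the default: anything not under a listed directory is refused, rather than trying to enumerate bad locations.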

--
Craig Ringer

#38Jan Wieck
JanWieck@Yahoo.com
In reply to: Andrew Dunstan (#36)
Re: Specification for Trusted PLs?

On 5/23/2010 10:04 PM, Andrew Dunstan wrote:

Jan Wieck wrote:

On 5/23/2010 6:14 PM, Ron Mayer wrote:

Tom Lane wrote:

Robert Haas <robertmhaas@gmail.com> writes:

So... can we get back to coming up with a reasonable
definition,

(1) no access to system calls (including file and network I/O)

If a PL has file access to its own sandbox (similar to what
flash seems to do in web browsers), could that be considered
trusted?

That is a good question.

Currently, the first of all TRUSTED languages, PL/Tcl, would allow the
function of a less privileged user to access the "global" objects of
every other database user created within the same session.

These are per-backend, in-memory objects, but nonetheless an evil
function could just scan the per-backend Tcl namespace and look for
compromising data, and that's not exactly what TRUSTED is all about.

In the case of Tcl it is possible to create a separate "safe"
interpreter per DB role to fix this. I actually think this would be
the right thing to do.

I think that would probably be serious overkill. Maybe a data stash per
role rather than an interpreter per role would be doable. It would
certainly be more lightweight.

ISTM we are in danger of confusing several different things. A user that
doesn't want data to be shared should not stash it in global objects.
But to me, trusting a language is not about making data private, but
about not allowing the user to do things that are dangerous, such as
referencing memory, or the file system, or the operating system, or
network connections, or loading code which might do any of those things.

How is "loading code which might do any of those things" different from
writing a stored procedure that accesses data a careless "superuser"
left in a global variable? Remember, the code of a PL function is "open"
source - like in "everyone can select from pg_proc". You really don't
expect anyone to scan for your global variables just because they can
write functions in the same language?

Jan

#39Craig Ringer
craig@postnewspapers.com.au
In reply to: Robert Haas (#17)
Re: Specification for Trusted PLs?

On 22/05/10 02:12, Robert Haas wrote:

On Fri, May 21, 2010 at 1:58 PM, David Fetter<david@fetter.org> wrote:

On Fri, May 21, 2010 at 01:45:45PM -0400, Stephen Frost wrote:

* David Fetter (david@fetter.org) wrote:

That is *precisely* the business we need to be in, at least for the
languages we ship, and it would behoove us to test languages we don't
ship so we can warn people when they don't pass.

k, let's start with something simpler first tho- I'm sure we can pull in
the glibc regression tests and run them too. You know, just in case
there's a bug there, somewhere.

That's a pretty pure straw man argument. I expect much higher quality
trolling. D-.

I'm sorely tempted to try to provide some higher-quality trolling, but
in all seriousness I think that (1) we could certainly use much better
regression tests in many areas of which this is one and (2) it will
never be possible to catch all security bugs - in particular - via
regression testing because they typically stem from cases people
didn't consider. So... can we get back to coming up with a reasonable
definition, and if somebody wants to write some regression tests, all
the better?

Personally, I don't think a PL should be trusted unless it _does_ define
a whitelist of operations. Experience in the wider world has shown that
this is the only approach that works. Regression testing to make sure
all possible approaches to access unsafe features are blocked is doomed
to have holes where there's another approach that hasn't been thought of
yet.

Perl's new approach is whitelist based. Python restricted mode failed
not least because it was a blacklist and people kept on finding ways
around it. Lua and JavaScript are great examples of whitelist
approaches, where the language just doesn't expose features that're
dangerous - in fact, the core language doesn't even *have* those
features. PL/PgSQL is the same, and works well as a trusted language for
that reason.

Java's SecurityManager is whitelist based (allowed classes, allowed
operations), and has proved very secure.
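
The whitelist idea can be sketched in a few lines of Python. The example below is only an illustration of "default deny", not how PL/Perl, Lua or Java actually implement it: a tiny arithmetic evaluator built on the standard `ast` module that evaluates only the node types it explicitly lists and refuses everything else, including constructs nobody thought to forbid. The names `evaluate` and `NotAllowed` are made up for the sketch.

```python
import ast
import operator

# Whitelist: the only binary and unary operators the evaluator knows about.
BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
}
UNARY_OPS = {ast.USub: operator.neg}


class NotAllowed(Exception):
    """Raised for any construct that is not on the whitelist."""


def _eval(node):
    if isinstance(node, ast.Constant) and type(node.value) in (int, float):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in BINARY_OPS:
        return BINARY_OPS[type(node.op)](_eval(node.left), _eval(node.right))
    if isinstance(node, ast.UnaryOp) and type(node.op) in UNARY_OPS:
        return UNARY_OPS[type(node.op)](_eval(node.operand))
    # Default deny: names, calls, attribute access and anything else not
    # listed above end up here.
    raise NotAllowed(type(node).__name__)


def evaluate(source):
    """Evaluate an arithmetic expression, refusing anything not whitelisted."""
    tree = ast.parse(source, mode="eval")
    return _eval(tree.body)


print(evaluate("1 + 2 * 3"))
try:
    evaluate("__import__('os').system('id')")
except NotAllowed as exc:
    print("refused:", exc)
```

A blacklist version of the same evaluator would have to anticipate `__import__`, attribute lookups, comprehensions and so on one by one; here they are refused without ever being named.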

--
Craig Ringer

#40Andrew Dunstan
andrew@dunslane.net
In reply to: Jan Wieck (#38)
Re: Specification for Trusted PLs?

Jan Wieck wrote:

ISTM we are in danger of confusing several different things. A user
that doesn't want data to be shared should not stash it in global
objects. But to me, trusting a language is not about making data
private, but about not allowing the user to do things that are
dangerous, such as referencing memory, or the file system, or the
operating system, or network connections, or loading code which might
do any of those things.

How is "loading code which might do any of those things" different
from writing a stored procedure that accesses data a careless
"superuser" left in a global variable? Remember, the code of a PL
function is "open" source - like in "everyone can select from
pg_proc". You really don't expect anyone to scan for your global
variables just because they can write functions in the same language?

Well, that threat arises from the unsafe actions of the careless
superuser. And we could at least ameliorate it by providing a per role
data stash, at very little cost, as I mentioned. It's not like we don't
know about such threats, and I'm certainly not pretending they don't
exist. The 9.0 PL/Perl docs say:

The %_SHARED variable and other global state within the language is
public data, available to all PL/Perl functions within a session.
Use with care, especially in situations that involve use of multiple
roles or SECURITY DEFINER functions.

But the threats I was referring to arise if the language allows them to,
without any requirement for unsafe actions by another user. Protecting
against those is the essence of trustedness in my mind at least.

cheers

andrew

#41Jan Wieck
JanWieck@Yahoo.com
In reply to: Andrew Dunstan (#40)
Re: Specification for Trusted PLs?

On 5/23/2010 11:19 PM, Andrew Dunstan wrote:

Jan Wieck wrote:

ISTM we are in danger of confusing several different things. A user
that doesn't want data to be shared should not stash it in global
objects. But to me, trusting a language is not about making data
private, but about not allowing the user to do things that are
dangerous, such as referencing memory, or the file system, or the
operating system, or network connections, or loading code which might
do any of those things.

How is "loading code which might do any of those things" different
from writing a stored procedure that accesses data a careless
"superuser" left in a global variable? Remember, the code of a PL
function is "open" source - like in "everyone can select from
pg_proc". You really don't expect anyone to scan for your global
variables just because they can write functions in the same language?

Well, that threat arises from the unsafe actions of the careless
superuser. And we could at least ameliorate it by providing a per role
data stash, at very little cost, as I mentioned. It's not like we don't
know about such threats, and I'm certainly not pretending they don't
exist. The 9.0 PL/Perl docs say:

The %_SHARED variable and other global state within the language is
public data, available to all PL/Perl functions within a session.
Use with care, especially in situations that involve use of multiple
roles or SECURITY DEFINER functions.

But the threats I was referring to arise if the language allows them to,
without any requirement for unsafe actions by another user. Protecting
against those is the essence of trustedness in my mind at least.

I can agree with that.

Jan

#42Greg Sabino Mullane
greg@turnstep.com
In reply to: Alexey Klyukin (#31)
Re: Specification for Trusted PLs?

Well, the best way to define what a trusted language can do is to
define a *whitelist* of what it can do, not a blacklist of what it
can't do. That's the only way to get a complete definition. It's then
up to the implementation step to figure out how to represent that in
the form of tests.

Yes, PL/Perl is following this approach. For a whitelist see
plperl_opmask.h (generated by plperl_opmask.pl at build phase).

Ah, okay, I can mostly agree with that. My objection was with trying
to build a cross-language generic whitelist. But it looks like the
ship has already sailed upthread and we've more or less got a working
definition. David, I think you started this thread, I assume you have
some concrete reason for asking about this (new trusted language?).
May have been stated, but I missed it.

--
Greg Sabino Mullane greg@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 201005241025
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8

#43Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#27)
Re: Specification for Trusted PLs?

Tom Lane wrote:

Joshua Tolley <eggyknap@gmail.com> writes:

Agreed. As long as a trusted language can do things outside the
database only by going through a database and calling some function to
which the user has rights, in an untrusted language, that seems decent
to me. A user with permissions to launch_missiles() would have a
function in an untrusted language to do it, but there's no reason an
untrusted language shouldn't be able to say "SELECT

s/untrusted/trusted/ here, right?

One thing that has always bugged me is that the use of
"trusted/untrusted" for languages is confusing, because it is "trusted"
users who can run untrusted languages. I think "trust" is more
associated with users than with software features. I have no idea how
this confusion could be clarified.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

#44David Fetter
david@fetter.org
In reply to: Bruce Momjian (#43)
Re: Specification for Trusted PLs?

On Thu, May 27, 2010 at 11:23:44AM -0400, Bruce Momjian wrote:

Tom Lane wrote:

Joshua Tolley <eggyknap@gmail.com> writes:

Agreed. As long as a trusted language can do things outside the
database only by going through a database and calling some
function to which the user has rights, in an untrusted language,
that seems decent to me. A user with permissions to
launch_missiles() would have a function in an untrusted language
to do it, but there's no reason an untrusted language shouldn't
be able to say "SELECT

s/untrusted/trusted/ here, right?

One thing that has always bugged me is that the use of
"trusted/untrusted" for languages is confusing, because it is
"trusted" users who can run untrusted languages. I think "trust" is
more associated with users than with software features. I have no
idea how this confusion could be clarified.

Sadly, I don't think it could short of a time machine. We're stuck
with a backward convention. :(

Cheers,
David.

#45Peter Eisentraut
peter_e@gmx.net
In reply to: Robert Haas (#19)
Re: Specification for Trusted PLs?

On Fri, 2010-05-21 at 14:22 -0400, Robert Haas wrote:

On Fri, May 21, 2010 at 2:21 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

So... can we get back to coming up with a reasonable
definition,

(1) no access to system calls (including file and network I/O)

(2) no access to process memory, other than variables defined within the
PL.

What else?

Doesn't subvert the general PostgreSQL security mechanisms? Not sure
how to formulate that.

Succinctly: A trusted language does not grant access to data that the
user would otherwise not have.

I wouldn't go any further than that. File and network I/O, for example,
are implementation details. A trusted language might do some kind of
RPC, for example. The PL/J project once wanted to do something like
that.

#46David Fetter
david@fetter.org
In reply to: Peter Eisentraut (#45)
Re: Specification for Trusted PLs?

On Fri, May 28, 2010 at 01:03:15AM +0300, Peter Eisentraut wrote:

On Fri, 2010-05-21 at 14:22 -0400, Robert Haas wrote:

On Fri, May 21, 2010 at 2:21 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

So... can we get back to coming up with a reasonable
definition,

(1) no access to system calls (including file and network I/O)

(2) no access to process memory, other than variables defined
within the PL.

What else?

Doesn't subvert the general PostgreSQL security mechanisms? Not
sure how to formulate that.

Succinctly: A trusted language does not grant access to data that
the user would otherwise not have.

That's a great definition from a point of view of understanding by
human beings. A whitelist system will work better from the point of
automating tests which, while they couldn't conclusively prove that
something was actually this way, could go a long way toward making
sure that PLs didn't regress into untrusted territory.

Cheers,
David.

#47Robert Haas
robertmhaas@gmail.com
In reply to: David Fetter (#46)
Re: Specification for Trusted PLs?

On Thu, May 27, 2010 at 7:10 PM, David Fetter <david@fetter.org> wrote:

On Fri, May 28, 2010 at 01:03:15AM +0300, Peter Eisentraut wrote:

On Fri, 2010-05-21 at 14:22 -0400, Robert Haas wrote:

On Fri, May 21, 2010 at 2:21 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

So... can we get back to coming up with a reasonable
definition,

(1) no access to system calls (including file and network I/O)

(2) no access to process memory, other than variables defined
within the PL.

What else?

Doesn't subvert the general PostgreSQL security mechanisms?  Not
sure how to formulate that.

Succinctly: A trusted language does not grant access to data that
the user would otherwise not have.

That's a great definition from a point of view of understanding by
human beings.  A whitelist system will work better from the point of
automating tests which, while they couldn't conclusively prove that
something was actually this way, could go a long way toward making
sure that PLs didn't regress into untrusted territory.

You haven't presented any sort of plan for how such automated testing
would actually work. Perhaps if you presented the plan first we could
think about how to provide for its needs. I'm generally of the
opinion that it's not possible to do automated testing for security
vulnerabilities (beyond crash testing, perhaps) but if you have a good
idea let's talk about it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

#48David Fetter
david@fetter.org
In reply to: Robert Haas (#47)
Re: Specification for Trusted PLs?

On Thu, May 27, 2010 at 09:51:30PM -0400, Robert Haas wrote:

On Thu, May 27, 2010 at 7:10 PM, David Fetter <david@fetter.org> wrote:

On Fri, May 28, 2010 at 01:03:15AM +0300, Peter Eisentraut wrote:

On Fri, 2010-05-21 at 14:22 -0400, Robert Haas wrote:

On Fri, May 21, 2010 at 2:21 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

So... can we get back to coming up with a reasonable
definition,

(1) no access to system calls (including file and network
I/O)

(2) no access to process memory, other than variables defined
within the PL.

What else?

Doesn't subvert the general PostgreSQL security mechanisms?
Not sure how to formulate that.

Succinctly: A trusted language does not grant access to data that
the user would otherwise not have.

That's a great definition from a point of view of understanding by
human beings. A whitelist system will work better from the point
of automating tests which, while they couldn't conclusively prove
that something was actually this way, could go a long way toward
making sure that PLs didn't regress into untrusted territory.

You haven't presented any sort of plan for how such automated
testing would actually work. Perhaps if you presented the plan
first we could think about how to provide for its needs. I'm
generally of the opinion that it's not possible to do automated
testing for security vulnerabilities (beyond crash testing, perhaps)
but if you have a good idea let's talk about it.

I don't know about a *good* idea, but here's the one I've got.

1. Make a whitelist. This is what needs to work in order for a
language to be a fully functional trusted PL.

2. Write tests that check that each thing on the whitelist works as
advertised. These are language specific.

3. (the un-fun part) Write tests which attempt to do things not in
the whitelist. We can start from the vulnerabilities so far
discovered.

4. Each time a vulnerability is discovered in one language, write
something that tests for it in the other languages.

I get that this isn't going to ensure that the access control is
perfect. It's more a backstop against regressions of previously
functioning access controls.
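
The four steps above can be sketched as a table-driven check. This is a toy under loud assumptions: `toy_pl_a` and `toy_pl_b` are hypothetical stand-ins that just answer whether an operation is permitted, where real tests would have to run actual PL code, and the operation names are invented labels loosely based on the working definition earlier in the thread.

```python
# Hypothetical stand-ins for two trusted PLs: each reports whether an
# operation name is permitted. Real tests would execute PL functions.
def toy_pl_a(operation):
    return operation in {"string_concat", "spi_query", "elog"}


def toy_pl_b(operation):
    # This toy language has a hole: it also permits opening files.
    return operation in {"string_concat", "spi_query", "elog", "open_file"}


# Steps 1 and 2: what must work in a fully functional trusted PL.
MUST_WORK = ["string_concat", "spi_query", "elog"]
# Steps 3 and 4: probes that must be refused, shared across languages so
# that a hole found in one language is checked in all of them.
MUST_BE_REFUSED = ["open_file", "bind_port", "load_shared_library"]


def check(language):
    """Return a list of human-readable failures for one language."""
    failures = []
    for op in MUST_WORK:
        if not language(op):
            failures.append(f"{op}: should work but was refused")
    for op in MUST_BE_REFUSED:
        if language(op):
            failures.append(f"{op}: should be refused but worked")
    return failures


results = {name: check(pl) for name, pl in [("pl_a", toy_pl_a), ("pl_b", toy_pl_b)]}
for name, failures in results.items():
    print(name, failures or "ok")
```

Such a list can only catch holes someone has already thought of, which is why it is better described as a backstop against regressions than as proof of trustedness.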

Cheers,
David.

#49Tom Lane
tgl@sss.pgh.pa.us
In reply to: David Fetter (#48)
Re: Specification for Trusted PLs?

David Fetter <david@fetter.org> writes:

I don't know about a *good* idea, but here's the one I've got.

1. Make a whitelist. This is what needs to work in order for a
language to be a fully functional trusted PL.

Well, I pretty much lose interest right here, because this is already
assuming that every potentially trusted PL is isomorphic in its
capabilities. If that were so, there'd not be very much point in
supporting multiple PLs. A good example here is R. I have no idea
whether PL/R is trusted or trustworthy, but in any case the main point
of supporting that PL is to allow access to the R statistical library.
How does that fit into a whitelist designed for some other language?
It doesn't.

3. (the un-fun part) Write tests which attempt to do things not in
the whitelist. We can start from the vulnerabilities so far
discovered.

And here is the *other* fatal problem: a whitelist does not in fact give
any leverage at all for testing whether there is access to functionality
outside the whitelist. (It might be useful if you could enforce the
whitelist at some sufficiently low level of the language implementation,
but as a matter of testing, it does nothing for you.) What you're
suggesting isn't so much un-fun as un-possible. Given a maze of twisty
little subroutines all different, how will you find out if any of them
contain calls of unwanted functionality?

If you think you can do something with this, go for it, but don't
expect me to spend any of my time on it.

regards, tom lane

#50Sam Mason
sam@samason.me.uk
In reply to: Tom Lane (#49)
Re: Specification for Trusted PLs?

On Thu, May 27, 2010 at 11:09:26PM -0400, Tom Lane wrote:

David Fetter <david@fetter.org> writes:

I don't know about a *good* idea, but here's the one I've got.

1. Make a whitelist. This is what needs to work in order for a
language to be a fully functional trusted PL.

Well, I pretty much lose interest right here, because this is already
assuming that every potentially trusted PL is isomorphic in its
capabilities.

That's not normally a problem. The conventional way would be to place
the interpreter in its own sandbox, similar to how Chrome has each tab
running in its own process. These processes are protected in a way
so that the code running inside them can't do any harm--e.g. a ptrace
jail[1]. This is quite a change from existing pl implementations, and
presents a different set of performance/compatibility issues.

If that were so, there'd not be very much point in
supporting multiple PLs. A good example here is R. I have no idea
whether PL/R is trusted or trustworthy, but in any case the main point
of supporting that PL is to allow access to the R statistical library.
How does that fit into a whitelist designed for some other language?
It doesn't.

AFAIU, a trusted language should only be able to perform computation,
e.g. not touch the local filesystem, beyond readonly access to library
code, and not see the network. Policies such as these are easy to
enforce in a ptrace jail, and would still allow a trusted pl/r to do
whatever it wants to get any pure calculation done. As soon as it needs
to touch the file system the language becomes non-trusted.

3. (the un-fun part) Write tests which attempt to do things not in
the whitelist. We can start from the vulnerabilities so far
discovered.

And here is the *other* fatal problem: a whitelist does not in fact give
any leverage at all for testing whether there is access to functionality
outside the whitelist. (It might be useful if you could enforce the
whitelist at some sufficiently low level of the language implementation,
but as a matter of testing, it does nothing for you.) What you're
suggesting isn't so much un-fun as un-possible. Given a maze of twisty
little subroutines all different, how will you find out if any of them
contain calls of unwanted functionality?

A jail helps with a lot of this; the remainder comes down to the usual
fact that testing can only demonstrate the presence of bugs, and you need
a formal proof of the code to establish their absence.

--
Sam http://samason.me.uk/

[1]: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.122.5494

#51Peter Eisentraut
peter_e@gmx.net
In reply to: Sam Mason (#50)
Re: Specification for Trusted PLs?

On Fri, 2010-05-28 at 13:03 +0100, Sam Mason wrote:

That's not normally a problem. The conventional way would be to place
the interpreter in its own sandbox, similar to how Chrome has each tab
running in its own process. These processes are protected in a way
so that the code running inside them can't do any harm--e.g. a ptrace
jail[1]. This is quite a change from existing pl implementations, and
presents a different set of performance/compatibility issues.

Surely a definition of a trusted language that invalidates the existing
trusted languages is not going to help resolve the issue.

#52Andrew Dunstan
andrew@dunslane.net
In reply to: Sam Mason (#50)
Re: Specification for Trusted PLs?

Sam Mason wrote:

On Thu, May 27, 2010 at 11:09:26PM -0400, Tom Lane wrote:

David Fetter <david@fetter.org> writes:

I don't know about a *good* idea, but here's the one I've got.

1. Make a whitelist. This is what needs to work in order for a
language to be a fully functional trusted PL.

Well, I pretty much lose interest right here, because this is already
assuming that every potentially trusted PL is isomorphic in its
capabilities.

That's not normally a problem. The conventional way would be to place
the interpreter in its own sandbox, similar to how Chrome has each tab
running in its own process. These processes are protected in a way
so that the code running inside them can't do any harm--e.g. a ptrace
jail[1]. This is quite a change from existing pl implementations, and
presents a different set of performance/compatibility issues.

I have my own translation of this last sentence.

cheers

andrew

#53Bruce Momjian
bruce@momjian.us
In reply to: Robert Haas (#33)
1 attachment(s)
Re: Specification for Trusted PLs?

Robert Haas wrote:

On Sat, May 22, 2010 at 4:53 PM, Cédric Villemain
<cedric.villemain.debian@gmail.com> wrote:

2010/5/21 Jan Wieck <JanWieck@yahoo.com>:

The original idea was that a trusted language does not allow an unprivileged
user to gain access to any object or data he does not have access to
without that language.

This does not include data transformation functionality, like string
processing or the like. As long as the user had legitimate access to the
input datum, then every derived form thereof is OK.

I find the current doc sufficient; adding this prose from Jan as a comment
might help people, perhaps.

Yeah, Jan's description is very clear and to the point.

The attached, applied patch clarifies the meaning of "trusted language"
in the documentation using Jan's description.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

Attachments:

/rtmp/diff (text/x-diff)
Index: doc/src/sgml/xplang.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/xplang.sgml,v
retrieving revision 1.37
diff -c -c -r1.37 xplang.sgml
*** doc/src/sgml/xplang.sgml	3 Apr 2010 07:22:56 -0000	1.37
--- doc/src/sgml/xplang.sgml	30 May 2010 02:21:53 -0000
***************
*** 151,158 ****
      <optional>VALIDATOR <replaceable>validator_function_name</replaceable></optional> ;
  </synopsis>
        The optional key word <literal>TRUSTED</literal> specifies that
!       ordinary database users that have no superuser privileges should
!       be allowed to use this language to create functions and trigger
        procedures. Since PL functions are executed inside the database
        server, the <literal>TRUSTED</literal> flag should only be given
        for languages that do not allow access to database server
--- 151,160 ----
      <optional>VALIDATOR <replaceable>validator_function_name</replaceable></optional> ;
  </synopsis>
        The optional key word <literal>TRUSTED</literal> specifies that
!       the language does not grant access to data that the user would
!       not otherwise have.  Trusted languages are designed for ordinary
!       database users (those without superuser privilege) and allows them
!       to safely create of functions and trigger
        procedures. Since PL functions are executed inside the database
        server, the <literal>TRUSTED</literal> flag should only be given
        for languages that do not allow access to database server
Index: doc/src/sgml/ref/create_language.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/ref/create_language.sgml,v
retrieving revision 1.50
diff -c -c -r1.50 create_language.sgml
*** doc/src/sgml/ref/create_language.sgml	3 Apr 2010 07:22:58 -0000	1.50
--- doc/src/sgml/ref/create_language.sgml	30 May 2010 02:21:53 -0000
***************
*** 104,114 ****
  
       <listitem>
        <para>
!        <literal>TRUSTED</literal> specifies that
!        the language is safe, that is, it does not offer an
!        unprivileged user any functionality to bypass access
!        restrictions. If this key word is omitted when registering the
!        language, only users with the
         <productname>PostgreSQL</productname> superuser privilege can
         use this language to create new functions.
        </para>
--- 104,113 ----
  
       <listitem>
        <para>
!        <literal>TRUSTED</literal> specifies that the language does
!        not grant access to data that the user would not otherwise
!        have.  If this key word is omitted
!        when registering the language, only users with the
         <productname>PostgreSQL</productname> superuser privilege can
         use this language to create new functions.
        </para>