Proposal: access control jails (and introduction as aspiring GSoC student)
Hello, I'm Joey Adams, and I'm interested in applying for Google
Summer of Code to work on PostgreSQL. I'm a former GSoC student (I
worked on CCAN last year), and a strong C programmer, though I am
still new to working with large, established communities. I apologize
if this is the wrong place to send GSoC student introductions.
My proposal is bold, though I believe it can be trimmed down and
refined into something very useful, yet simple enough to implement in
3 months by a newcomer to the PostgreSQL code base.
I propose adding application-level access control to PostgreSQL via a
jails concept. In a nutshell, a jail is created as part of the
database definition (typically exposing a free variable for the
current user). When a jail is activated for a session, the only
accesses allowed are those indicated in the jail itself. A jail
cannot be exited without closing the session. If used properly, jails
make it possible to safely execute untrusted SQL code (though one may
not want to, citing the principle of least privilege).
For example, suppose we have the following database definition for a
trivial discussion board system:
CREATE TABLE users (
id SERIAL PRIMARY KEY,
name TEXT,
password TEXT,
email TEXT,
email_public BOOLEAN
);
CREATE TABLE posts (
id SERIAL PRIMARY KEY,
title TEXT,
"user" INT REFERENCES users(id),
content TEXT,
published BOOLEAN
);
CREATE TABLE tags (
post INT REFERENCES posts(id),
tag TEXT
);
Suppose our discussion board is written in PHP. It authenticates the
user via cookies, after which it has an ID for the current user. One
way to arbitrate access is to code it into the PHP using queries like
this:
pg_query_params('SELECT users.email WHERE id=$1 AND (email_public=TRUE
OR id=$2)', $_GET['user'], $current_user);
Simple enough. However, this access control check has to be done for
every query, and it can get rather complex as more tables are
involved.
Views are a way to alleviate this complexity, but:
* Views only apply to SELECTs
* Queries still have to specify the current user (unless one uses a
global variable (which requires a bit of a hack, if I'm not mistaken))
My proposal would make it possible to have a statement in the database
definition much like this:
CREATE JAIL jail (
SELECT id, name FROM users;
SELECT email FROM users
WHERE id=current_user
OR email_public=TRUE;
SELECT * FROM posts
WHERE "user"=current_user
OR published=TRUE;
UPDATE posts
WHERE "user"=current_user
SET title, content, published;
INSERT INTO posts
WHERE "user"=current_user;
DELETE FROM posts
WHERE "user"=current_user;
SELECT * FROM tags, posts
WHERE tags.post=posts.id;
INSERT INTO tags
WHERE post IN (
SELECT id FROM posts
WHERE "user"=current_user);
DELETE FROM posts
WHERE post IN (
SELECT id FROM posts
WHERE "user"=current_user);
);
Inside of the jail definition is a series of pseudo-statements that
indicate the space of queries the user can perform. Simply creating a
jail does not make it go into effect. A jail is activated using
another query, and it remains in effect for the remainder of the
session. It cannot be deactivated through the protocol, as doing so
would constitute a privilege escalation.
Example of a PHP script invoking a jail:
<?php
pg_connect('user=foo password=spiderman') or die('Database error occurred');
/* authentication-fu */
// $current_user now contains the id of the logged-in user.
pg_query_params('USE JAIL jail WITH current_user=$1', array($current_user));
/* Remaining queries in this session will only be able to SELECT,
UPDATE, INSERT, and DELETE on users, posts, and tags, following the
rules set forth by the jail. */
?>
I came up with this idea while working on a small
software-as-a-service application using PostgreSQL. I needed to
implement access control in the PHP frontend. I ended up creating a
database abstraction class to protect the rest of the frontend code
from SQL messiness. Every database query I needed, I wrote/extended a
PHP function for it. My database access class for my "small"
application exploded to over 1000 lines (granted, a lot of it was
comments).
Although having accessor functions can make the code easier to read
and maintain, it can result in a lot of redundancy. Why not just use
the database abstraction library PostgreSQL gives you: SQL?
Jails seem to me like a revolutionary addition to SQL, and
implementing them "fully" could be quite involved. However, my guess
is that by restricting what may be done in a jail simply to SELECT,
UPDATE, INSERT and DELETE, an initial jail implementation capable of
what I posted above would be relatively easy. Work would mainly
consist of adding code to PostgreSQL's parser and the rewrite engine.
Comments?
Joseph Adams wrote:
Hello, I'm Joey Adams, and I'm interested in applying for Google
Summer of Code to work on PostgreSQL. I'm a former GSoC student (I
worked on CCAN last year), and a strong C programmer, though I am
still new to working with large, established communities. I apologize
if this is the wrong place to send GSoC student introductions.
Hi Joey, you're in the right place!
* Views only apply to SELECTs
Views can be made updateable by defining suitable rules for them, that
is usually the way people solve this problem. See chapters
http://www.postgresql.org/docs/8.4/interactive/rules-views.html and
http://www.postgresql.org/docs/8.4/interactive/rules-update.html in the
user manual.
You should also take a look at Veil
(http://veil.projects.postgresql.org/), which uses views to implement
something that sounds much like what you're proposing.
You should also be aware of an issue with this approach:
http://archives.postgresql.org/message-id/4AE02DF0.40101@enterprisedb.com.
Some ideas on plugging that hole were discussed, but unfortunately there
was no clear consensus and no progress since.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
* Joseph Adams (joeyadams3.14159@gmail.com) wrote:
I propose adding application-level access control to PostgreSQL via a
jails concept. In a nutshell, a jail is created as part of the
database definition (typically exposing a free variable for the
current user). When a jail is activated for a session, the only
accesses allowed are those indicated in the jail itself. A jail
cannot be exited without closing the session. If used properly, jails
make it possible to safely execute untrusted SQL code (though one may
not want to, citing the principle of least privilege).
I guess my initial reaction to this is that you can use roles, views,
and pl/pgsql (security definer) functions to achieve this. This does
have an interesting intersection with row-level security concepts and
that's definitely a project that I'd like to see happen at some point in
PG. Not sure if you've considered this, but you can do a 'set role' at
the start of a session and then use CURRENT_ROLE in view definitions and
in other places. You can also make it so that the user who is logging
in (eg 'www-data') doesn't have any rights to anything, except the
ability to 'set role' to other roles.
Note that, with any of this, you need to consider pooled database
connections. Unfortunately, it's still pretty expensive to establish a
new database connection to PG.
Thanks,
Stephen
On Mon, Mar 22, 2010 at 9:39 AM, Stephen Frost <sfrost@snowman.net> wrote:
* Joseph Adams (joeyadams3.14159@gmail.com) wrote:
I propose adding application-level access control to PostgreSQL via a
jails concept. In a nutshell, a jail is created as part of the
database definition (typically exposing a free variable for the
current user). When a jail is activated for a session, the only
accesses allowed are those indicated in the jail itself. A jail
cannot be exited without closing the session. If used properly, jails
make it possible to safely execute untrusted SQL code (though one may
not want to, citing the principle of least privilege).I guess my initial reaction to this is that you can use roles, views,
and pl/pgsql (security definer) functions to achieve this. This does
have an interesting intersection with row-level security concepts and
that's definitely a project that I'd like to see happen at some point in
PG. Not sure if you've considered this, but you can do a 'set role' at
the start of a session and then use CURRENT_ROLE in view definitions and
in other places. You can also make it so that the user who is logging
in (eg 'www-data') doesn't have any rights to anything, except the
ability to 'set role' to other roles.
Sometimes it would be nice to conditionalize queries on a value other
than the authenticated role. I really wish we had some kind of SQL
variable support. Talking out of my rear end:
CREATE VARIABLE name AS type [GLOBAL | LOCAL]; -- local variables are
per-backend
SET VARIABLE name = value;
SELECT * FROM foo WHERE col = ${name};
We have had one previous request for the ability to limit the list of
queries that the user can issue to a fixed set, but it's not clear to
me that that's really all that useful. If that's what you want, you
might be better off putting an intermediate layer between the client
and the database - e.g. this kind of restriction could be implemented
in the connection pooler...
...Robert
* Robert Haas (robertmhaas@gmail.com) wrote:
Sometimes it would be nice to conditionalize queries on a value other
than the authenticated role. I really wish we had some kind of SQL
variable support. Talking out of my rear end:
I certainly agree- having variable support in the backend would
definitely be nice. I'd want it to be explicit and distinct from GUCs
though, unlike the situation we have w/ psql right now. All that said,
I'm not really a huge fan of write-your-own-authorization-system in
general. If the existing authorization system isn't sufficient for what
you want, then let's improve it. There may be specific cases where
what's needed is particularly complex, but that's what security definer
functions are for..
Thanks,
Stephen
On Mon, Mar 22, 2010 at 10:03 AM, Stephen Frost <sfrost@snowman.net> wrote:
* Robert Haas (robertmhaas@gmail.com) wrote:
Sometimes it would be nice to conditionalize queries on a value other
than the authenticated role. I really wish we had some kind of SQL
variable support. Talking out of my rear end:I certainly agree- having variable support in the backend would
definitely be nice. I'd want it to be explicit and distinct from GUCs
though, unlike the situation we have w/ psql right now.
Agreed.
All that said,
I'm not really a huge fan of write-your-own-authorization-system in
general. If the existing authorization system isn't sufficient for what
you want, then let's improve it. There may be specific cases where
what's needed is particularly complex, but that's what security definer
functions are for..
Fortunately this functionality also has other uses, so I don't know
that we really need to decide which of those uses we approve of more
or less.
Does the SQL standard specify anything in this area?
...Robert
2010/3/22 Robert Haas <robertmhaas@gmail.com>:
On Mon, Mar 22, 2010 at 9:39 AM, Stephen Frost <sfrost@snowman.net> wrote:
* Joseph Adams (joeyadams3.14159@gmail.com) wrote:
I propose adding application-level access control to PostgreSQL via a
jails concept. In a nutshell, a jail is created as part of the
database definition (typically exposing a free variable for the
current user). When a jail is activated for a session, the only
accesses allowed are those indicated in the jail itself. A jail
cannot be exited without closing the session. If used properly, jails
make it possible to safely execute untrusted SQL code (though one may
not want to, citing the principle of least privilege).I guess my initial reaction to this is that you can use roles, views,
and pl/pgsql (security definer) functions to achieve this. This does
have an interesting intersection with row-level security concepts and
that's definitely a project that I'd like to see happen at some point in
PG. Not sure if you've considered this, but you can do a 'set role' at
the start of a session and then use CURRENT_ROLE in view definitions and
in other places. You can also make it so that the user who is logging
in (eg 'www-data') doesn't have any rights to anything, except the
ability to 'set role' to other roles.Sometimes it would be nice to conditionalize queries on a value other
than the authenticated role. I really wish we had some kind of SQL
variable support. Talking out of my rear end:CREATE VARIABLE name AS type [GLOBAL | LOCAL]; -- local variables are
per-backend
SET VARIABLE name = value;
SELECT * FROM foo WHERE col = ${name};
I though about it two, three years ago as lighter solution without
packages. Now we have more almost all what we need. Variables are
supported by parser (with last changes related to integration of main
parser to plpgsql).
just you can write
select * from tab where col = var;
there are not necessary some special syntax. And it is available from
all environments. Implementation for scalar type cannot be dificult
(maybe).
Regards
Pavel Stehule
Show quoted text
We have had one previous request for the ability to limit the list of
queries that the user can issue to a fixed set, but it's not clear to
me that that's really all that useful. If that's what you want, you
might be better off putting an intermediate layer between the client
and the database - e.g. this kind of restriction could be implemented
in the connection pooler......Robert
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Robert Haas <robertmhaas@gmail.com> wrote:
Does the SQL standard specify anything in this area?
The only thing that comes to mind for me is the SQL/PSM
<SQL variable declaration>.
-Kevin
On mån, 2010-03-22 at 09:54 -0400, Robert Haas wrote:
We have had one previous request for the ability to limit the list of
queries that the user can issue to a fixed set, but it's not clear to
me that that's really all that useful.
Well, sudo is pretty useful, and this would be quite similar.
Peter Eisentraut <peter_e@gmx.net> writes:
Well, sudo is pretty useful, and this would be quite similar.
+1.
I guess one of the big difficulties would be to be able to match a given
random query with the list of queries we have in any Jail, given that we
put in there "generic" queries and we want to allow "specific" queries.
But once we have that, it could turn out pretty useful for other
thoughts. I can't find it again in the archives, but the idea was to
collect statistics on views rather than plain table so that you can have
correlated stats on JOINs and some columns etc. The hard part here too
looks like being able to tell at runtime that a given query is a
specific form of an existing view.
Regards,
--
dim
On 3/21/10 9:36 PM, Joseph Adams wrote:
Inside of the jail definition is a series of pseudo-statements that
indicate the space of queries the user can perform. Simply creating a
jail does not make it go into effect. A jail is activated using
another query, and it remains in effect for the remainder of the
session. It cannot be deactivated through the protocol, as doing so
would constitute a privilege escalation.
This is an interesting approach and I don't think that most of the
people commenting on this list have quite grasped it.
I see two major difficulties to solve with this approach: (1)
developing a way of phrasing the query stubs which would allow common
things like dynamic where clauses, order by, and limit, and (2) whether
it's practical for the author of any real application to define all of
those queries beforehand.
For (1), you might want to look at Meredith's libDejector, which takes a
similar approach for SQL-injection protection:
http://www.thesmartpolitenerd.com/code/dejector.html
I don't think that the idea of turning on the jail mode via a
session-level switch works, given the realities of connection pooling.
Also, I do not believe that we currently have any USERSET variable which
can be turned on but not off, so that would require adding a whole new mode.
BTW, if you wanted something less ambitious, we have a longstanding
request to implement "local superuser", that is, the ability to give one
role the ability to edit other roles in one database only.
--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com
On Tue, Mar 23, 2010 at 1:28 PM, Josh Berkus <josh@agliodbs.com> wrote:
I don't think that the idea of turning on the jail mode via a
session-level switch works, given the realities of connection pooling.
Also, I do not believe that we currently have any USERSET variable which
can be turned on but not off, so that would require adding a whole new mode.
I think this could be done with an assign hook.
BTW, if you wanted something less ambitious, we have a longstanding
request to implement "local superuser", that is, the ability to give one
role the ability to edit other roles in one database only.
But roles aren't database-specific... they're globals.
...Robert
Robert Haas escribi�:
On Tue, Mar 23, 2010 at 1:28 PM, Josh Berkus <josh@agliodbs.com> wrote:
BTW, if you wanted something less ambitious, we have a longstanding
request to implement "local superuser", that is, the ability to give one
role the ability to edit other roles in one database only.But roles aren't database-specific... they're globals.
Well, that's another longstanding request ;-) (See the
db_user_namespace hack)
--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Alvaro Herrera <alvherre@commandprompt.com> writes:
Robert Haas escribi�:
On Tue, Mar 23, 2010 at 1:28 PM, Josh Berkus <josh@agliodbs.com> wrote:
BTW, if you wanted something less ambitious, we have a longstanding
request to implement "local superuser", that is, the ability to give one
role the ability to edit other roles in one database only.But roles aren't database-specific... they're globals.
Well, that's another longstanding request ;-) (See the
db_user_namespace hack)
Yeah, you'd have to fix that first. The "ambitious" part of that is
coming up with a spec that everybody will accept. Once you had that,
coding it might not be very hard ...
BTW, "local superuser" is an oxymoron. If you're superuser you'd have
no trouble whatsoever breaking into other databases. "Local CREATEROLE"
privilege could be a sane concept, though, if we had local roles.
regards, tom lane
Tom Lane escribió:
Alvaro Herrera <alvherre@commandprompt.com> writes:
Robert Haas escribi�:
But roles aren't database-specific... they're globals.
Well, that's another longstanding request ;-) (See the
db_user_namespace hack)Yeah, you'd have to fix that first. The "ambitious" part of that is
coming up with a spec that everybody will accept. Once you had that,
coding it might not be very hard ...
I wonder if this is simpler now that we got rid of the flat files stuff.
We could validate the user once we've connected to a database and thus
able to poke at the local user catalog, not just the global one. I
think that was a serious roadblock.
--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Alvaro Herrera <alvherre@commandprompt.com> writes:
I wonder if this is simpler now that we got rid of the flat files stuff.
We could validate the user once we've connected to a database and thus
able to poke at the local user catalog, not just the global one. I
think that was a serious roadblock.
I think it'd be a mistake to invent a separate catalog for local users;
what had been nice clean foreign key relationships (eg, relowner ->
pg_auth.oid) would suddenly become a swamp.
My first thought about a catalog representation would be to add a column
to pg_auth which is a DB OID for local users or zero for global users.
However, you'd probably want to prevent local users and global users
from having the same names, and it's not very clear how to do that
with this representation (though that'd be even worse with separate
catalogs). I guess we could fall back on a creation-time check (ick).
regards, tom lane
On Tue, Mar 23, 2010 at 8:16 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Alvaro Herrera <alvherre@commandprompt.com> writes:
I wonder if this is simpler now that we got rid of the flat files stuff.
We could validate the user once we've connected to a database and thus
able to poke at the local user catalog, not just the global one. I
think that was a serious roadblock.I think it'd be a mistake to invent a separate catalog for local users;
what had been nice clean foreign key relationships (eg, relowner ->
pg_auth.oid) would suddenly become a swamp.My first thought about a catalog representation would be to add a column
to pg_auth which is a DB OID for local users or zero for global users.
However, you'd probably want to prevent local users and global users
from having the same names, and it's not very clear how to do that
with this representation (though that'd be even worse with separate
catalogs). I guess we could fall back on a creation-time check (ick).
Could we use a suitably defined exclusion constraint?
...Robert
Robert Haas <robertmhaas@gmail.com> writes:
On Tue, Mar 23, 2010 at 8:16 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
My first thought about a catalog representation would be to add a column
to pg_auth which is a DB OID for local users or zero for global users.
However, you'd probably want to prevent local users and global users
from having the same names, and it's not very clear how to do that
with this representation (though that'd be even worse with separate
catalogs). �I guess we could fall back on a creation-time check (ick).
Could we use a suitably defined exclusion constraint?
Not unless you'd like to solve the issues with triggers on system
catalogs first ...
regards, tom lane
On Tue, Mar 23, 2010 at 8:30 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
On Tue, Mar 23, 2010 at 8:16 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
My first thought about a catalog representation would be to add a column
to pg_auth which is a DB OID for local users or zero for global users.
However, you'd probably want to prevent local users and global users
from having the same names, and it's not very clear how to do that
with this representation (though that'd be even worse with separate
catalogs). I guess we could fall back on a creation-time check (ick).Could we use a suitably defined exclusion constraint?
Not unless you'd like to solve the issues with triggers on system
catalogs first ...
Urp. Not really, though I don't know what they are exactly. I didn't
think exclusion constraints depended on triggers. UNIQUE constraints
work on system catalogs, right?
...Robert
Robert Haas <robertmhaas@gmail.com> writes:
On Tue, Mar 23, 2010 at 8:30 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Not unless you'd like to solve the issues with triggers on system
catalogs first ...
Urp. Not really, though I don't know what they are exactly. I didn't
think exclusion constraints depended on triggers. UNIQUE constraints
work on system catalogs, right?
UNIQUE constraints depend on internal support in the index access method
(see today's thread with Gokulakannan Somasundaram for some details of
how btree does it). Exclusion constraints have a totally different
implementation --- they don't require index AM support, but they do use
triggers.
Now having said that, my recollection is that the worst issues
surrounding triggers on catalogs had to do with BEFORE triggers.
Exclusion constraint triggers would be AFTER triggers, so maybe it could
be made to work. It'd still be significant work though, for not a lot
of value as far as this particular issue goes.
regards, tom lane