LYDB: Feasible to use PG roles instead of application-level security?

Started by Guyren Howe · over 9 years ago · 6 messages · general
#1Guyren Howe
guyren@gmail.com

Further to my attempts to enlighten application developers about what they might better do in the database:

https://medium.com/@gisborne/love-your-database-lydb-23c69f480a1d#.4jngp2rcb

it occurs to me to wonder whether it is practical to use PG’s own roles and security model in lieu of using an application-level one.

It seems that the role system in PG is sufficient for most general purposes. One could presumably also have a table with role names and associated metainformation (email address etc) as needed.

If I have a system with many thousands of users, is it practical to manage these users’ authentication and authorization using *just* Postgres?

It occurs to me that some client frameworks might have issues with their connection pools if those connections keep switching users, assuming they even can, but let’s set that aside for now. Or perhaps every connection could immediately do a SET ROLE before executing its queries?

This seems an attractive proposition from a security standpoint: if I use row-level security pervasively, I can have a security system that’s nestled nice and close to the data, and presumably tricky to work around for a hacker who has access only to the client application.

Is this practical? Has anyone here done it? What might the caveats be?

TIA

#2Alban Hertroys
haramrae@gmail.com
In reply to: Guyren Howe (#1)
Re: LYDB: Feasible to use PG roles instead of application-level security?

On 30 Dec 2016, at 4:19, Guyren Howe <guyren@gmail.com> wrote:

> Further to my attempts to enlighten application developers about what they might better do in the database:
>
> https://medium.com/@gisborne/love-your-database-lydb-23c69f480a1d#.4jngp2rcb
>
> it occurs to me to wonder whether it is practical to use PG’s own roles and security model in lieu of using an application-level one.
>
> It seems that the role system in PG is sufficient for most general purposes. One could presumably also have a table with role names and associated metainformation (email address etc) as needed.
>
> If I have a system with many thousands of users, is it practical to manage these users’ authentication and authorization using *just* Postgres?

Postgres roles are global to the cluster, so if you have multiple databases with different users on each, you would end up with many thousands of roles. Keeping track of which roles each user is allowed to have becomes quite a nightmare for the administrators, I suspect.

For a web-application facing the internet, I'd say no, don't do that. You're dealing with far too many users to be maintainable.

For an intranet database in a not-too-large company with a fixed set of users, it could be a good solution, especially if those roles can be linked to the company's LDAP server (assuming that's possible, I don't know). Multiple intranet applications on that same database can use the same users and roles.
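Linking authentication to LDAP is indeed possible via pg_hba.conf (simple-bind mode shown below; the server name and DN layout are hypothetical). Note that LDAP covers authentication only: role membership and object privileges still have to be managed inside Postgres. A sketch:

```
# pg_hba.conf: authenticate intranet users against the company LDAP server
host  companydb  all  192.168.0.0/16  ldap ldapserver=ldap.example.com ldapprefix="uid=" ldapsuffix=",ou=people,dc=example,dc=com"
```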

Someone needs to do the administration though; with user volumes like that and the database knowledge level of the average system administrator, a GUI seems preferable. IMHO, pgAdmin offers too many features to be practical for someone like that; you would probably prefer something that does only user administration. I don't know of anything that does that, though (not a GUI user myself)...

> It occurs to me that some client frameworks might have issues with their connection pools if those connections keep switching users, assuming they even can, but let’s set that aside for now. Or perhaps every connection could immediately do a SET ROLE before executing its queries?

> This seems an attractive proposition from a security standpoint: if I use row-level security pervasively, I can have a security system that’s nestled nice and close to the data, and presumably tricky to work around for a hacker who has access only to the client application.

With a few changes, that could work very well.

First, create roles for the different types of users that you expect. In a company, that could be by division, distinguishing division-heads, interns, etc.

Secondly, have a table with the users and their attributes like you describe. Include an attribute for their database role there. Only administrator users should have access to that table.

Finally, create a stored procedure that looks up a user name in that table and sets the accompanying role. If a user is not found, set the role to some default 'unprivileged' user.
Make that procedure SECURITY DEFINER, with permissions to match. The role stays active for the entire session, so unless you close the connection, open a new one, or change the user's role, the procedure doesn't need calling again.
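A minimal sketch of that recipe (all names hypothetical). One caveat: PostgreSQL does not allow SET ROLE to be executed from inside a SECURITY DEFINER function, and SET ROLE checks membership against the session user in any case, so the function can only look up the role name; the client then issues SET ROLE itself, which works because the login role has been granted the group roles:

```sql
-- Group roles for the different types of users (no login of their own).
CREATE ROLE division_head NOLOGIN;
CREATE ROLE intern NOLOGIN;

-- The role applications connect as: a member of the group roles, but
-- NOINHERIT, so it has none of their privileges until SET ROLE is issued.
CREATE ROLE app_login LOGIN NOINHERIT;
GRANT division_head TO app_login;
GRANT intern TO app_login;

-- Users and their attributes; only administrators may touch it directly.
CREATE TABLE app_user (
    username text PRIMARY KEY,
    email    text,
    db_role  text NOT NULL DEFAULT 'intern'
);
REVOKE ALL ON app_user FROM PUBLIC;

-- SECURITY DEFINER lets app_login read app_user through this function only.
-- Unknown users fall back to the unprivileged 'intern' role.
CREATE FUNCTION lookup_role(p_username text) RETURNS text AS $$
    SELECT coalesce(
        (SELECT db_role FROM app_user WHERE username = p_username),
        'intern');
$$ LANGUAGE sql SECURITY DEFINER;

GRANT EXECUTE ON FUNCTION lookup_role(text) TO app_login;

-- Once per session, the client then runs:
--   SELECT lookup_role('alice');   -- e.g. 'division_head'
--   SET ROLE division_head;
```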

> Is this practical? Has anyone here done it? What might the caveats be?

It's a fairly common practice, the ML archives should contain plenty of examples.

Alban Hertroys
--
If you can't see the forest for the trees,
cut the trees and you'll find there is no forest.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#3Berend Tober
btober@broadstripe.net
In reply to: Alban Hertroys (#2)
Re: LYDB: Feasible to use PG roles instead of application-level security?

----- Original Message -----

From: "Alban Hertroys" <haramrae@gmail.com>
To: "Guyren Howe" <guyren@gmail.com>
Cc: "pgsql-general" <pgsql-general@postgresql.org>
Sent: Friday, December 30, 2016 6:23:27 AM
Subject: Re: [GENERAL] LYDB: Feasible to use PG roles instead of application-level security?

[full quote of message #2 snipped]
I cannot speak to the thousands of users scenario in practice, but in principle it seems workable.

As a practical matter, I have implemented something along the lines of what Hertroys describes, and taken it a step further, engaging schema privileges as well.

The public schema exposes a single relation: a view with three columns, username, password, and email_address. You could expose more, but for my purposes minimal is best. This view exists expressly for creating new user accounts and is made writeable by means of the rule system. Technically it is readable too, but a select statement reveals nothing about the underlying data ... so stalkers can't harvest a user list.

pg_hba.conf is configured to allow an anonymous login role which has usage permission on, and only on, the public schema. So anyone can log in, see that one view, and insert into it.

The rule system pushes the INSERT down onto a table in a private schema, where a trigger creates the database role corresponding to the newly inserted user row. Thus prospective users "self-administer", creating their own accounts as needed. The trigger also grants the new role membership in a particular group role.

The group role has usage privileges on a second schema that exposes broader visibility into the database, as well as additional insert and update privileges on some of these other, newly exposed views.
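Roughly, the moving parts of that scheme might look like this (schema, view, and role names are all hypothetical, and the `anonymous` and `member` roles are assumed to exist already; the trigger function must be SECURITY DEFINER and owned by a role with CREATEROLE for the CREATE ROLE to succeed):

```sql
-- Underlying account table lives in a private schema.
CREATE SCHEMA private;
CREATE TABLE private.account (
    username      text PRIMARY KEY,
    password      text NOT NULL,
    email_address text
);

-- Public signup view: readable but always empty, writeable via the rule system.
CREATE VIEW public.signup AS
    SELECT username, password, email_address
      FROM private.account
     WHERE false;

CREATE RULE signup_insert AS ON INSERT TO public.signup DO INSTEAD
    INSERT INTO private.account
    VALUES (NEW.username, NEW.password, NEW.email_address);

GRANT SELECT, INSERT ON public.signup TO anonymous;

-- On insert, create the matching database role and add it to the
-- group role that carries the real privileges.
CREATE FUNCTION private.create_account_role() RETURNS trigger AS $$
BEGIN
    EXECUTE format('CREATE ROLE %I LOGIN PASSWORD %L',
                   NEW.username, NEW.password);
    EXECUTE format('GRANT member TO %I', NEW.username);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql SECURITY DEFINER;

CREATE TRIGGER account_create_role
    AFTER INSERT ON private.account
    FOR EACH ROW EXECUTE PROCEDURE private.create_account_role();
```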

You can read about it and actually try it yourself at https://fairwinds.btober.net. Read the first few sections of the tutorial. Just be aware that it is a test environment, so it may be unavailable at unannounced times while I'm playing around with it, and it will occasionally be re-initialized.


#4Stephen Frost
sfrost@snowman.net
In reply to: Guyren Howe (#1)
Re: LYDB: Feasible to use PG roles instead of application-level security?

Greetings,

* Guyren Howe (guyren@gmail.com) wrote:

> it occurs to me to wonder whether it is practical to use PG’s own roles and security model in lieu of using an application-level one.

The short answer is yes.

> It seems that the role system in PG is sufficient for most general purposes. One could presumably also have a table with role names and associated metainformation (email address etc) as needed.

Yup. That can get a bit awkward if you have multiple databases inside
of a single cluster, as you would have to pick which database to put
that metainformation in, but that isn't a very big issue.

> If I have a system with many thousands of users, is it practical to manage these users’ authentication and authorization using *just* Postgres?

For this, it really depends on if the PG authorization model matches the
requirements you have. The PG auth model, particularly with RLS, is
extremely flexible but you would really need to evaluate what the exact
requirements are and how you would handle that with the PG auth model.
Of course, if there are just a few exceptions or complicated cases that
can't be satisfied directly with PG today, you could use security
definer functions.

One area that isn't fully addressed with the PG auth model today is
partial access to a certain column. Consider a table where you want
users to have access to all of the rows and all of the columns *except*
for column X for rows where ID is > 1000. The PG auth model today can
be used to say "you can't access column X" or to say "you can't access
rows where ID > 1000" but you can't combine those, yet.

I'm hopeful that we'll get there as there are definitely use-cases for
that kind of access control, but it's unlikely to happen for PG10.
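Each half on its own is easy to express; what cannot (yet) be said is "hide only column X, and only for rows with id > 1000". A sketch with hypothetical table and role names:

```sql
-- Column restriction: every column except x.
REVOKE ALL ON t FROM app_user;
GRANT SELECT (id, a, b) ON t TO app_user;   -- column x deliberately omitted

-- Row restriction: only rows with id <= 1000.
ALTER TABLE t ENABLE ROW LEVEL SECURITY;
CREATE POLICY low_ids ON t FOR SELECT TO app_user
    USING (id <= 1000);

-- Applying both hides the *whole row* for id > 1000, which is more
-- restrictive than "all columns except x for those rows".
```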

> It occurs to me that some client frameworks might have issues with their connection pools if those connections keep switching users, assuming they even can, but let’s set that aside for now. Or perhaps every connection could immediately do a SET ROLE before executing its queries?

Again, yes, connection poolers can be an issue, but it's possible to use
the role system and do a 'set role X' after having connected as some
user that has very little access. The issue here is controlling that
role change: there's no direct way in PG today to require a password to
be provided when doing the role change, which is unfortunate. One
approach to solving that with RLS is to use a security definer function
to change a record in a table that is then used in all RLS policies.
It's a bit complicated and doesn't involve doing 'set role' though, so
there are some trade-offs there.
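One way such a scheme might look (the table, function, and policy names are all hypothetical, and the credential check is elided; a real version would verify a password hash inside the function):

```sql
-- One row per backend; readable by everyone, writeable only via the function.
CREATE TABLE session_user_map (
    backend_pid int PRIMARY KEY,
    username    text NOT NULL
);
REVOKE ALL ON session_user_map FROM PUBLIC;
GRANT SELECT ON session_user_map TO PUBLIC;

CREATE FUNCTION set_app_user(p_username text, p_password text)
RETURNS void AS $$
BEGIN
    -- Credential check elided: compare p_password against a stored hash here.
    INSERT INTO session_user_map VALUES (pg_backend_pid(), p_username)
    ON CONFLICT (backend_pid) DO UPDATE SET username = EXCLUDED.username;
END;
$$ LANGUAGE plpgsql SECURITY DEFINER;

-- Policies then consult the recorded name rather than current_user.
CREATE POLICY by_app_user ON document
    USING (owner = (SELECT username FROM session_user_map
                    WHERE backend_pid = pg_backend_pid()));
```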

If you really want connection pooling and independent users in PG's role
system then you'll end up having to have the app code do the
authentication (or maybe auth to PG as the user and, if successful,
reconnect as the regular user and set role... that's pretty awkward
though) and then connect and do the 'set role'.

One big question here, however, is if you're going to have thousands of
*concurrently connected* users. Thousands of users shouldn't be too
much of an issue, but if they're all connected using PG's main auth
system then you'll have thousands of backend processes running. That'll
end up causing some amount of overhead even if they're mostly idle. If
your application can handle connecting/disconnecting pretty easily and
you have a relatively short timeout (though, ideally, not too short) then
perhaps your number of concurrent connections won't be too bad.

> This seems an attractive proposition from a security standpoint: if I use row-level security pervasively, I can have a security system that’s nestled nice and close to the data, and presumably tricky to work around for a hacker who has access only to the client application.

If that's the threat model you want to address then you'll have to work
out the concurrent connections question. One thing which can help is to
use a common user for 'read-only/public-access (or at least low-value)'
queries from the app, if there are such.

> Is this practical? Has anyone here done it? What might the caveats be?

Yes, yes, see above.

Thanks!

Stephen

#5Melvin Davidson
melvin6925@gmail.com
In reply to: Stephen Frost (#4)
Re: LYDB: Feasible to use PG roles instead of application-level security?

On Fri, Dec 30, 2016 at 10:23 AM, Stephen Frost <sfrost@snowman.net> wrote:

[full quote of message #4 snipped]

> Postgres roles are global to the cluster,

Well, that is true by default; however, you can make roles database-specific:
https://www.postgresql.org/docs/9.4/static/runtime-config-connection.html#RUNTIME-CONFIG-CONNECTION-SECURITY

db_user_namespace = on

That being said, there is a trade-off of managing multiple users & passwords vs. simple access roles.
--
Melvin Davidson
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.

#6Karsten Hilbert
Karsten.Hilbert@gmx.net
In reply to: Stephen Frost (#4)
Re: LYDB: Feasible to use PG roles instead of application-level security?

On Fri, Dec 30, 2016 at 10:23:44AM -0500, Stephen Frost wrote:

> One area that isn't fully addressed with the PG auth model today is
> partial access to a certain column. Consider a table where you want
> users to have access to all of the rows and all of the columns *except*
> for column X for rows where ID is > 1000. The PG auth model today can
> be used to say "you can't access column X" or to say "you can't access
> rows where ID > 1000" but you can't combine those, yet.

Do you mean that there is currently no way to say:

    if special_column is NOT in the SELECT list:
        show all rows
    if special_column IS in the SELECT list:
        show only those rows where special_column > 1000

?

Thanks,
Karsten
--
GPG key ID E4071346 @ eu.pool.sks-keyservers.net
E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346
