Clarifying docs on nuance of select and update policies

Started by Chandler Gonzalesover 3 years ago2 messageshackersdocs
Jump to latest
#1Chandler Gonzales
jcgsville@gmail.com
hackersdocs

Hi y'all, I've got a proposed clarification to the documentation on the
nuances of RLS behavior for update policies, and maybe a (humble) request
for a change in behavior to make it more intuitive. I am starting with
pgsql-docs since I think the documentation change is a good starting point.

We use RLS policies to hide "soft deleted" objects from certain DB roles.
We recently tried to add the ability to let a user "soft delete" a row.
Ideally, we can write an RLS policy that allows a user to "soft delete" a
row, but then hides the row from that same user once it is soft deleted.

Here's the setup on a fresh Postgres db for my example. I'm executing these
queries as the database owner:
```
create role some_user_type;
create table foo(id int primary key, soft_deleted_at timestamptz,
some_other_field text);
alter table foo enable row level security;
grant select on table foo to some_user_type;
grant update(soft_deleted_at) on table foo to some_user_type;
insert into foo(id) values (1);
insert into foo(id, soft_deleted_at) values (2, now());
```

The behavior I'm trying to encode in RLS is that users with the role
some_user_type can see all rows where soft_deleted_at is null, can update
rows where BOTH the original soft_deleted_at is null AND the updated row
has a non-null soft_deleted_at. Basically, the only thing this user can do
to this row is to soft delete it.

We'll use a restrictive policy to get better error messages when we do an
update later on:
```
create policy pol_1 on foo for select to some_user_type using (true);
create policy pol_1_res on foo as restrictive for select to some_user_type
using (soft_deleted_at is null);
```

And just to verify it's working:
```
chandler@localhost:chandler> begin; set local role some_user_type; select *
from foo; rollback;
BEGIN
SET
╒════╤═════════════════╤══════════════════╕
│ id │ soft_deleted_at │ some_other_field │
╞════╪═════════════════╪══════════════════╡
│ 1 │ ¤ │ ¤ │
╘════╧═════════════════╧══════════════════╛
SELECT 1
ROLLBACK
```

Now the important bit, the update policy:
```
create policy pol_2 on foo for update to some_user_type using
(soft_deleted_at is null) with check (soft_deleted_at is not null);
```

If we update a row without a where clause that touches the row, the update
succeeds:
```
chandler@localhost:chandler> begin; set local role some_user_type; update
foo set soft_deleted_at = now() where true; rollback;
BEGIN
SET
UPDATE 1
ROLLBACK
```

But if we update a row with a where clause that uses the existing row, the
update fails:
```
chandler@localhost:chandler> begin; set local role some_user_type; update
foo set soft_deleted_at = now() where id = 1; rollback;
BEGIN
SET
new row violates row-level security policy "pol_1_res" for table "foo"
```

This was very unintuitive to me. My understanding is that when USING and
WITH CHECK are both used for an update policy, the USING is tested against
the original row, and the WITH CHECK clause against the new row. This is
stated in the [documentation](
https://www.postgresql.org/docs/current/sql-createpolicy.html):

The USING expression determines which records the UPDATE command will see

to operate against, while the WITH CHECK expression defines which modified
rows are allowed to be stored back into the relation

So why is it that the addition of where id = 1 violates the select policy?
Later in the same doc:

Typically an UPDATE command also needs to read data from columns in the

relation being updated (e.g., in a WHERE clause or a RETURNING clause, or
in an expression on the right hand side of the SET clause). In this case,
SELECT rights are also required on the relation being updated, and the
appropriate SELECT or ALL policies will be applied in addition to the
UPDATE policies. Thus the user must have access to the row(s) being updated
through a SELECT or ALL policy in addition to being granted permission to
update the row(s) via an UPDATE or ALL policy.

If you read this documentation perfectly literally, it does describe the
behavior shown my examples above, but it is a bit ambiguous. I think a more
clear explanation of this behavior would be the following:

Typically an UPDATE command also needs to read data from columns in the

relation being updated (e.g., in a WHERE clause or a RETURNING clause, or
in an expression on the right hand side of the SET clause). In this case,
SELECT rights are also required on the relation being updated, and the
appropriate SELECT or ALL policies will be applied in addition to the
UPDATE policies. Thus the user must have access to the row(s) operated
against and the rows being stored back into the relation via a SELECT or
ALL policy in addition to being granted permission to update the row(s) via
an UPDATE or ALL policy.

However, it seems like there is an opportunity to change the behavior to be
more intuitive, and to support the use case I am attempting. It seems like
use of a WHERE clause that uses the existing row should result in the
select policies only being applied to the row being operated against, not
the row being stored back into the relation. If the caller adds a RETURNING
clause, the existing behavior makes sense, because you are asking to access
the row being stored back into the relation.

Am I thinking about this correctly? I know this would be a breaking change,
albeit a small one. I'm not familiar with Postgres's approach to breaking
changes to know if it would even be considered. At a minimum, does my
proposed documentation change seem reasonable? It would have saved us lots
of time in discovering the behavior.

If this behavior is intentional, I'm also just curious about the rationale
behind it if someone happens to know? Maybe there's a use case that this is
built for that I'm not thinking of.

#2Bruce Momjian
bruce@momjian.us
In reply to: Chandler Gonzales (#1)
hackersdocs
Re: Clarifying docs on nuance of select and update policies

Thread moved to hackers since it involves complex queries. Can someone
like Stephen comment on the existing docs and feature behavior? Thanks.

---------------------------------------------------------------------------

On Fri, Sep 16, 2022 at 11:57:25AM -0700, Chandler Gonzales wrote:

Hi y'all, I've got a proposed clarification to the documentation on the nuances
of RLS behavior for update policies, and maybe a (humble) request for a change
in behavior to make it more intuitive. I am starting with pgsql-docs since I
think the documentation change is a good starting point.

We use RLS policies to hide "soft deleted" objects from certain DB roles. We
recently tried to add the ability to let a user "soft delete" a row. Ideally,
we can write an RLS policy that allows a user to "soft delete" a row, but then
hides the row from that same user once it is soft deleted.

Here's the setup on a fresh Postgres db for my example. I'm executing these
queries as the database owner:
```
create role some_user_type;
create table foo(id int primary key, soft_deleted_at timestamptz,
some_other_field text);
alter table foo enable row level security;
grant select on table foo to some_user_type;
grant update(soft_deleted_at) on table foo to some_user_type;
insert into foo(id) values (1);
insert into foo(id, soft_deleted_at) values (2, now());
```

The behavior I'm trying to encode in RLS is that users with the role
some_user_type can see all rows where soft_deleted_at is null, can update rows
where BOTH the original soft_deleted_at is null AND the updated row has a
non-null soft_deleted_at. Basically, the only thing this user can do to this
row is to soft delete it.

We'll use a restrictive policy to get better error messages when we do an
update later on:
```
create policy pol_1 on foo for select to some_user_type using (true);
create policy pol_1_res on foo as restrictive for select to some_user_type
using (soft_deleted_at is null);
```

And just to verify it's working:
```
chandler@localhost:chandler> begin; set local role some_user_type; select *
from foo; rollback;
BEGIN
SET
╒════╤═════════════════╤══════════════════╕
│ id │ soft_deleted_at │ some_other_field │
╞════╪═════════════════╪══════════════════╡
│ 1  │ ¤               │ ¤                │
╘════╧═════════════════╧══════════════════╛
SELECT 1
ROLLBACK
```

Now the important bit, the update policy:
```
create policy pol_2 on foo for update to some_user_type using (soft_deleted_at
is null) with check (soft_deleted_at is not null);
```

If we update a row without a where clause that touches the row, the update
succeeds:
```
chandler@localhost:chandler> begin; set local role some_user_type; update foo
set soft_deleted_at = now() where true; rollback;
BEGIN
SET
UPDATE 1
ROLLBACK
```

But if we update a row with a where clause that uses the existing row, the
update fails:
```
chandler@localhost:chandler> begin; set local role some_user_type; update foo
set soft_deleted_at = now() where id = 1; rollback;
BEGIN
SET
new row violates row-level security policy "pol_1_res" for table "foo"
```

This was very unintuitive to me. My understanding is that when USING and WITH
CHECK are both used for an update policy, the USING is tested against the
original row, and the WITH CHECK clause against the new row.  This is stated in
the [documentation](https://www.postgresql.org/docs/current/
sql-createpolicy.html):

The USING expression determines which records the UPDATE command will see to

operate against, while the WITH CHECK expression defines which modified rows
are allowed to be stored back into the relation

So why is it that the addition of where id = 1 violates the select policy?
Later in the same doc:

Typically an UPDATE command also needs to read data from columns in the

relation being updated (e.g., in a WHERE clause or a RETURNING clause, or in an
expression on the right hand side of the SET clause). In this case, SELECT
rights are also required on the relation being updated, and the appropriate
SELECT or ALL policies will be applied in addition to the UPDATE policies. Thus
the user must have access to the row(s) being updated through a SELECT or ALL
policy in addition to being granted permission to update the row(s) via an
UPDATE or ALL policy.

If you read this documentation perfectly literally, it does describe the
behavior shown my examples above, but it is a bit ambiguous. I think a more
clear explanation of this behavior would be the following:

Typically an UPDATE command also needs to read data from columns in the

relation being updated (e.g., in a WHERE clause or a RETURNING clause, or in an
expression on the right hand side of the SET clause). In this case, SELECT
rights are also required on the relation being updated, and the appropriate
SELECT or ALL policies will be applied in addition to the UPDATE policies. Thus
the user must have access to the row(s) operated against and the rows being
stored back into the relation via a SELECT or ALL policy in addition to being
granted permission to update the row(s) via an UPDATE or ALL policy.

However, it seems like there is an opportunity to change the behavior to be
more intuitive, and to support the use case I am attempting. It seems like use
of a WHERE clause that uses the existing row should result in the select
policies only being applied to the row being operated against, not the row
being stored back into the relation. If the caller adds a RETURNING clause, the
existing behavior makes sense, because you are asking to access the row being
stored back into the relation.

Am I thinking about this correctly? I know this would be a breaking change,
albeit a small one. I'm not familiar with Postgres's approach to breaking
changes to know if it would even be considered. At a minimum, does my proposed
documentation change seem reasonable? It would have saved us lots of time in
discovering the behavior.

If this behavior is intentional, I'm also just curious about the rationale
behind it if someone happens to know? Maybe there's a use case that this is
built for that I'm not thinking of.

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

Indecision is a decision. Inaction is an action. Mark Batterson