Improving RLS planning
Currently, we don't produce very good plans when row-level security
is enabled. An example is that, given
create table t1 (pk1 int primary key, label text);
create table t2 (pk2 int primary key, fk int references t1);
then for
select * from t1, t2 where pk1 = fk and pk2 = 42;
you would ordinarily get a cheap plan like
Nested Loop
-> Index Scan using t2_pkey on t2
Index Cond: (pk2 = 42)
-> Index Scan using t1_pkey on t1
Index Cond: (pk1 = t2.fk)
But stick an RLS policy on t1, and that degrades to a seqscan, eg
Nested Loop
Join Filter: (t1.pk1 = t2.fk)
-> Index Scan using t2_pkey on t2
Index Cond: (pk2 = 42)
-> Seq Scan on t1
Filter: (label = 'public'::text)
The reason for this is that we implement RLS by turning the reference
to t1 into a sub-SELECT, and the planner's recursive invocation of
subquery_planner produces only a seqscan path for t1, there not being
any reason visible in the subquery for it to do differently.
I have been thinking about improving this by allowing subquery_planner
to generate parameterized paths; but the more I think about that the
less satisfied I am with it. It will be quite expensive and probably
will still fail to find desirable plans in many cases. (I've not given
up on parameterized subquery paths altogether --- I just feel it'd be a
brute-force and not very effective way of dealing with RLS.)
The alternative I'm now thinking about pursuing is to get rid of the
conversion of RLS quals to subqueries. Instead, we can label individual
qual clauses with security precedence markings. Concretely, suppose we
add an "int security_level" field to struct RestrictInfo. The semantics
of this would be that a qual with a lower security_level value must be
evaluated before a qual with a higher security_level value, unless the
latter qual is leakproof. (It would likely also behoove us to add a
"leakproof" bool field to struct RestrictInfo, to avoid duplicate
leakproof-ness checks on quals. But that's just an optimization.)
In the initial implementation, quals coming from a RangeTblEntry's
securityQuals field would have security_level 0, quals coming from
anywhere else would have security_level 1; except that if we know
there are no security quals anywhere (ie not Query->hasRowSecurity),
we could give all quals security_level 0. (I think this exception
may be worth making because there's no need to test leakproofness
for a qual with security level 0; it could never be a candidate
for security delay anyway.)
Having done that much, I think all we need in order to get rid of
RLS subqueries, and just stick RLS quals into their relation's
baserestrictinfo list, are two rules:
1. When selecting potential indexquals, a RestrictInfo can be considered
for indexqual use only if it is leakproof or has security_level <= the
minimum among the table's baserestrictinfo clauses.
2. In order_qual_clauses, sort first by security_level and second by cost.
This would already be enough of a win to be worth doing. I think though
that this mechanism can be extended to also allow getting rid of the
restriction that security-barrier views can't be flattened. The idea
would be to make sure that quals coming from above the SB view are given
higher security_level values than quals within the SB view. We'd need
some extra mechanism to make that possible --- perhaps an additional kind
of node within jointree nests to show where there had been a
security-barrier boundary, and then some smarts in distribute_qual_to_rels
to prevent pushing upper quals down past a lower qual of strictly lesser
security level. But that can come later. (We do not need such smarts
to fix the RLS problem, because in the initial version, quals with lower
security level than another qual could only exist at the baserel level.)
In short, I'm proposing to throw away the entire existing implementation
for planning of RLS and SB views, and start over.
There are some corner cases I've not entirely worked out, in particular
what security_level to assign to quals generated from EquivalenceClasses.
A safe but not optimal answer would be to assign them the maximum
security_level of any source clause of the EC. Maybe it's not worth
working harder than that, because most equality operators are leakproof
anyway, so that it wouldn't matter what level we assigned them.
Before I start implementing this, can anyone see a fatal flaw in the
design?
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 26 October 2016 at 10:58, Tom Lane <tgl@sss.pgh.pa.us> wrote:
The alternative I'm now thinking about pursuing is to get rid of the
conversion of RLS quals to subqueries. Instead, we can label individual
qual clauses with security precedence markings. Concretely, suppose we
add an "int security_level" field to struct RestrictInfo. The semantics
of this would be that a qual with a lower security_level value must be
evaluated before a qual with a higher security_level value, unless the
latter qual is leakproof. (It would likely also behoove us to add a
"leakproof" bool field to struct RestrictInfo, to avoid duplicate
leakproof-ness checks on quals. But that's just an optimization.)
I wonder if there will be a need for more security_levels in the
future, otherwise perhaps a more generic field can be added, like "int
evaulation_flags".
In [1]/messages/by-id/CAKJS1f9fPdLKM6=SUZAGwucH3otbsPk6k0YT8-A1HgjFapL-zQ@mail.gmail.com -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services, there's still things to work out with it, but I mentioned that
I'd like to improve equivalence classes to handle more than just
simple equality, which seems to require "optional" RestrictInfos. It
would be nice if we could store all this in one field as a set of
bits.
[1]: /messages/by-id/CAKJS1f9fPdLKM6=SUZAGwucH3otbsPk6k0YT8-A1HgjFapL-zQ@mail.gmail.com -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Tue, Oct 25, 2016 at 5:58 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
The alternative I'm now thinking about pursuing is to get rid of the
conversion of RLS quals to subqueries. Instead, we can label individual
qual clauses with security precedence markings. Concretely, suppose we
add an "int security_level" field to struct RestrictInfo. The semantics
of this would be that a qual with a lower security_level value must be
evaluated before a qual with a higher security_level value, unless the
latter qual is leakproof. (It would likely also behoove us to add a
"leakproof" bool field to struct RestrictInfo, to avoid duplicate
leakproof-ness checks on quals. But that's just an optimization.)In the initial implementation, quals coming from a RangeTblEntry's
securityQuals field would have security_level 0, quals coming from
anywhere else would have security_level 1; except that if we know
there are no security quals anywhere (ie not Query->hasRowSecurity),
we could give all quals security_level 0. (I think this exception
may be worth making because there's no need to test leakproofness
for a qual with security level 0; it could never be a candidate
for security delay anyway.)Having done that much, I think all we need in order to get rid of
RLS subqueries, and just stick RLS quals into their relation's
baserestrictinfo list, are two rules:1. When selecting potential indexquals, a RestrictInfo can be considered
for indexqual use only if it is leakproof or has security_level <= the
minimum among the table's baserestrictinfo clauses.2. In order_qual_clauses, sort first by security_level and second by cost.
This might work for RLS policies, if they can only reference a single
table, but I can't see how it's going to work for security barrier
views. For example, consider CREATE VIEW v WITH (security_barrier) AS
SELECT * FROM x, y WHERE x.a = y.a followed by SELECT * FROM v WHERE
leak(somefield). somefield is necessarily coming from either x or y,
and you can't let it be passed to leak() except for rows where the
join qual has been satisfied.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Robert Haas <robertmhaas@gmail.com> writes:
This might work for RLS policies, if they can only reference a single
table, but I can't see how it's going to work for security barrier
views. For example, consider CREATE VIEW v WITH (security_barrier) AS
SELECT * FROM x, y WHERE x.a = y.a followed by SELECT * FROM v WHERE
leak(somefield). somefield is necessarily coming from either x or y,
and you can't let it be passed to leak() except for rows where the
join qual has been satisfied.
Right, so quals from above the SB view would have to not be allowed to
drop below the join level (but they could fall *to* the join level,
where they'd be applied after the join's own quals). I mentioned that
in the part of the message you cut. I don't have a detailed design yet
but it seems possible, and I expect it to be a lot simpler than the Rube
Goldberg design we've got for SB views now.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Wed, Oct 26, 2016 at 1:20 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
This might work for RLS policies, if they can only reference a single
table, but I can't see how it's going to work for security barrier
views. For example, consider CREATE VIEW v WITH (security_barrier) AS
SELECT * FROM x, y WHERE x.a = y.a followed by SELECT * FROM v WHERE
leak(somefield). somefield is necessarily coming from either x or y,
and you can't let it be passed to leak() except for rows where the
join qual has been satisfied.Right, so quals from above the SB view would have to not be allowed to
drop below the join level (but they could fall *to* the join level,
where they'd be applied after the join's own quals). I mentioned that
in the part of the message you cut. I don't have a detailed design yet
but it seems possible, and I expect it to be a lot simpler than the Rube
Goldberg design we've got for SB views now.
OK; it wasn't clear to me that you had considered that case. I'm not
convinced that what you end up with is going to be simpler than what
we have now, but if it is, great. One of the reasons I did the
initial security_barrier stuff this way was to avoid inventing a lot
of new stuff. Subqueries already acted as optimization fences and
that was what we needed for this, so it made sense to me to build on
top of that. Now, if we do build stuff specifically for this purpose,
it can probably be smarter than what we have today, and that is fine.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Robert Haas <robertmhaas@gmail.com> writes:
On Wed, Oct 26, 2016 at 1:20 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Right, so quals from above the SB view would have to not be allowed to
drop below the join level (but they could fall *to* the join level,
where they'd be applied after the join's own quals). I mentioned that
in the part of the message you cut. I don't have a detailed design yet
but it seems possible, and I expect it to be a lot simpler than the Rube
Goldberg design we've got for SB views now.
OK; it wasn't clear to me that you had considered that case. I'm not
convinced that what you end up with is going to be simpler than what
we have now, but if it is, great.
Well, we already have mechanisms for controlling how far down the join
tree upper quals can fall; outer joins in particular require that. So
I'm thinking that it shouldn't take a lot of additional code for
distribute_qual_to_rels to handle this too. Admittedly, the amount of
boilerplate elsewhere, if it turns out we need a new jointree nodetype
to control this, is not negligible. But I'm thinking it'll be a lot more
straightforward. There's weird warts for security quals all over the
planner right now, and there are still some things about them that I think
work only by accident.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 25 October 2016 at 22:58, Tom Lane <tgl@sss.pgh.pa.us> wrote:
The alternative I'm now thinking about pursuing is to get rid of the
conversion of RLS quals to subqueries. Instead, we can label individual
qual clauses with security precedence markings. Concretely, suppose we
add an "int security_level" field to struct RestrictInfo. The semantics
of this would be that a qual with a lower security_level value must be
evaluated before a qual with a higher security_level value, unless the
latter qual is leakproof. (It would likely also behoove us to add a
"leakproof" bool field to struct RestrictInfo, to avoid duplicate
leakproof-ness checks on quals. But that's just an optimization.)
+1 for this approach. It looks like it could potentially be much
simpler. There's some ugly code in the inheritance planner (and
probably one or two other places) that it might be possible to chop
out, which would probably also speed up planning times.
In the initial implementation, quals coming from a RangeTblEntry's
securityQuals field would have security_level 0, quals coming from
anywhere else would have security_level 1; except that if we know
there are no security quals anywhere (ie not Query->hasRowSecurity),
we could give all quals security_level 0. (I think this exception
may be worth making because there's no need to test leakproofness
for a qual with security level 0; it could never be a candidate
for security delay anyway.)
Note that the securityQuals list represents nested levels of security
barrier (e.g., nested SB views), so you'd have to actually assign
security_level 0 to the first security qual, security_level 1 to the
second security qual, and so on. Then the quals coming from anywhere
else would have to have a security_level one greater than the maximum
of all the other security levels.
Having done that much, I think all we need in order to get rid of
RLS subqueries, and just stick RLS quals into their relation's
baserestrictinfo list, are two rules:1. When selecting potential indexquals, a RestrictInfo can be considered
for indexqual use only if it is leakproof or has security_level <= the
minimum among the table's baserestrictinfo clauses.2. In order_qual_clauses, sort first by security_level and second by cost.
I think that ordering might be sub-optimal if you had a mix of
leakproof quals and security quals and the cost of some security quals
were significantly higher than the cost of some other quals. Perhaps
all leakproof quals should be assigned security_level 0, to allow them
to be checked earlier if they have a lower cost (whether or not they
are security quals), and only leaky quals would have a security_level
greater than zero. Rule 1 would then not need to check whether the
qual was leakproof, and you probably wouldn't need the separate
"leakproof" bool field on RestrictInfo.
Regards,
Dean
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Dean Rasheed <dean.a.rasheed@gmail.com> writes:
On 25 October 2016 at 22:58, Tom Lane <tgl@sss.pgh.pa.us> wrote:
The alternative I'm now thinking about pursuing is to get rid of the
conversion of RLS quals to subqueries. Instead, we can label individual
qual clauses with security precedence markings.
+1 for this approach. It looks like it could potentially be much
simpler. There's some ugly code in the inheritance planner (and
probably one or two other places) that it might be possible to chop
out, which would probably also speed up planning times.
Here's a draft patch for this. I've only addressed the RLS use-case
so far, so this doesn't get into managing the order of application of
join quals, only restriction quals.
2. In order_qual_clauses, sort first by security_level and second by cost.
I think that ordering might be sub-optimal if you had a mix of
leakproof quals and security quals and the cost of some security quals
were significantly higher than the cost of some other quals. Perhaps
all leakproof quals should be assigned security_level 0, to allow them
to be checked earlier if they have a lower cost (whether or not they
are security quals), and only leaky quals would have a security_level
greater than zero. Rule 1 would then not need to check whether the
qual was leakproof, and you probably wouldn't need the separate
"leakproof" bool field on RestrictInfo.
Hm, but it would also force leakproof quals to be evaluated in front
of potentially-cheaper leaky quals, whether or not that's semantically
necessary.
I experimented with ignoring security_level altogether for leakproof
quals, but I couldn't make it work properly, because that didn't lead to
a comparison rule that satisfies transitivity. For instance, consider
three quals:
A: cost 1, security_level 1, leaky
B: cost 2, security_level 1, leakproof
C: cost 3, security_level 0, leakproof
A should sort before B, since same security_level and lower cost;
B should sort before C, since lower cost and leakproof;
but A must sort after C, since higher security_level and leaky.
So what I ended up doing was using your idea of forcing the security
level to 0 for leakproof quals, but only if they have cost below a
threshold (which I set at 10X cpu_operator_cost, which should take in
most built-in functions). That at least limits the possible damage
from forcing early evaluation of a leakproof qual. There may be
some better way to do it, though, so I didn't go so far as to remove
the separate leakproof flag.
Some other notes:
* This creates a requirement on scan-planning code (and someday on
join-planning code) to be sure it doesn't create violations of the qual
ordering rule. Currently only indxpath.c and tidpath.c have to worry
about that AFAICS. FDWs would need to worry about it too, except that
we don't currently allow RLS to be enabled on foreign tables. I'm a
little concerned about whether FDWs could create security holes by
not accounting for this, but it's moot for now. Custom scan providers
will need to pay attention as well.
* prepsecurity.c is now dead code and should be removed, but I did not
include that in this patch, since it would just bloat the patch.
* Accounting for the removal of prepsecurity.c, this is actually a net
savings of about 300 lines of code. So I feel pretty good about that.
It also gets rid of some really messy kluges, particularly the behavior
of generating new subquery RTEs as late as halfway through
grouping_planner. I find it astonishing that that worked at all.
* Since the planner is now depending on Query.hasRowSecurity to be set
whenever there are any securityQuals, I put in an Assert about that,
and promptly found three places in prepjointree.c and the rewriter where
we'd been failing to set it. I have not looked to see if these represent
live bugs in existing releases, but they might. Or am I misunderstanding
what the flag is supposed to mean?
* Aside from plan changes, there's one actual behavioral change in the
regression test results, but I think it's okay because it's a question
of whether to put a non-RLS qual before or after a leaky qual. That's
not something we are promising anything about.
* There's one test query in updatable_views.sql where the plan collapses
to a dummy (Result with constant false qual) because the planner is now
able to see that the qual conditions are mutually contradictory. Maybe
that query needs adjustment; I'm not sure what it's intending to test
exactly.
regards, tom lane
PS: I've been slacking on the commitfest because I wanted to push this
to a reasonably finished state before I set it aside to do CF work.
I'm not expecting it to be reviewed in this fest.
Attachments:
new-rls-planning-implementation-1.patch.gzapplication/x-gzip; name=new-rls-planning-implementation-1.patch.gzDownload
Tom,
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
Dean Rasheed <dean.a.rasheed@gmail.com> writes:
On 25 October 2016 at 22:58, Tom Lane <tgl@sss.pgh.pa.us> wrote:
The alternative I'm now thinking about pursuing is to get rid of the
conversion of RLS quals to subqueries. Instead, we can label individual
qual clauses with security precedence markings.+1 for this approach. It looks like it could potentially be much
simpler. There's some ugly code in the inheritance planner (and
probably one or two other places) that it might be possible to chop
out, which would probably also speed up planning times.Here's a draft patch for this. I've only addressed the RLS use-case
so far, so this doesn't get into managing the order of application of
join quals, only restriction quals.
Thanks much for working on this.
Just a few relatively quick comments:
2. In order_qual_clauses, sort first by security_level and second by cost.
I think that ordering might be sub-optimal if you had a mix of
leakproof quals and security quals and the cost of some security quals
were significantly higher than the cost of some other quals. Perhaps
all leakproof quals should be assigned security_level 0, to allow them
to be checked earlier if they have a lower cost (whether or not they
are security quals), and only leaky quals would have a security_level
greater than zero. Rule 1 would then not need to check whether the
qual was leakproof, and you probably wouldn't need the separate
"leakproof" bool field on RestrictInfo.Hm, but it would also force leakproof quals to be evaluated in front
of potentially-cheaper leaky quals, whether or not that's semantically
necessary.
I agree that this is a concern.
I experimented with ignoring security_level altogether for leakproof
quals, but I couldn't make it work properly, because that didn't lead to
a comparison rule that satisfies transitivity. For instance, consider
three quals:
A: cost 1, security_level 1, leaky
B: cost 2, security_level 1, leakproof
C: cost 3, security_level 0, leakproof
A should sort before B, since same security_level and lower cost;
B should sort before C, since lower cost and leakproof;
but A must sort after C, since higher security_level and leaky.So what I ended up doing was using your idea of forcing the security
level to 0 for leakproof quals, but only if they have cost below a
threshold (which I set at 10X cpu_operator_cost, which should take in
most built-in functions). That at least limits the possible damage
from forcing early evaluation of a leakproof qual. There may be
some better way to do it, though, so I didn't go so far as to remove
the separate leakproof flag.
I'm not a huge fan of these kinds of magic values, particularly when
they can't be independently managed, but, from a practical perspective,
I have a hard time seeing a real problem with this approach.
Some other notes:
* This creates a requirement on scan-planning code (and someday on
join-planning code) to be sure it doesn't create violations of the qual
ordering rule. Currently only indxpath.c and tidpath.c have to worry
about that AFAICS. FDWs would need to worry about it too, except that
we don't currently allow RLS to be enabled on foreign tables. I'm a
little concerned about whether FDWs could create security holes by
not accounting for this, but it's moot for now. Custom scan providers
will need to pay attention as well.
I'd certainly like to support RLS on FDWs (and views...), but we need to
hash out what the exact semantics of that will be and this looks like
something we should be able to address when we look at adding that
support.
* prepsecurity.c is now dead code and should be removed, but I did not
include that in this patch, since it would just bloat the patch.
Sure.
* Accounting for the removal of prepsecurity.c, this is actually a net
savings of about 300 lines of code. So I feel pretty good about that.
It also gets rid of some really messy kluges, particularly the behavior
of generating new subquery RTEs as late as halfway through
grouping_planner. I find it astonishing that that worked at all.
I'm a bit surprised to hear that as we've been doing that for quite a
while, though looking back on it, I can see why you bring it up.
* Since the planner is now depending on Query.hasRowSecurity to be set
whenever there are any securityQuals, I put in an Assert about that,
and promptly found three places in prepjointree.c and the rewriter where
we'd been failing to set it. I have not looked to see if these represent
live bugs in existing releases, but they might. Or am I misunderstanding
what the flag is supposed to mean?
They're independent, actually. securityQuals can be set via either
security barrier view or from RLS, while hasRowSecurity is specifically
for the RLS case. The reason for the distinction is that changing your
role isn't going to impact security barrier views at all, while it could
impact what RLS policies are used. See extract_query_dependencies().
* Aside from plan changes, there's one actual behavioral change in the
regression test results, but I think it's okay because it's a question
of whether to put a non-RLS qual before or after a leaky qual. That's
not something we are promising anything about.
Agreed.
* There's one test query in updatable_views.sql where the plan collapses
to a dummy (Result with constant false qual) because the planner is now
able to see that the qual conditions are mutually contradictory. Maybe
that query needs adjustment; I'm not sure what it's intending to test
exactly.
Would have to look at the specific test in question, will try to do so
this week, though I'm also planning to be more involved in the CF this
month and want to make progress there.
Thanks again!
Stephen
Stephen Frost <sfrost@snowman.net> writes:
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
* Since the planner is now depending on Query.hasRowSecurity to be set
whenever there are any securityQuals, I put in an Assert about that,
and promptly found three places in prepjointree.c and the rewriter where
we'd been failing to set it. I have not looked to see if these represent
live bugs in existing releases, but they might. Or am I misunderstanding
what the flag is supposed to mean?
They're independent, actually. securityQuals can be set via either
security barrier view or from RLS, while hasRowSecurity is specifically
for the RLS case. The reason for the distinction is that changing your
role isn't going to impact security barrier views at all, while it could
impact what RLS policies are used. See extract_query_dependencies().
OK. In that case I'll need to adjust the patch so that the planner keeps
its own flag about whether the query contains any securityQuals; that's
easy enough. But I'm still suspicious that the three places I found may
represent bugs in the management of Query.hasRowSecurity.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Thu, Nov 3, 2016 at 6:23 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I think that ordering might be sub-optimal if you had a mix of
leakproof quals and security quals and the cost of some security quals
were significantly higher than the cost of some other quals. Perhaps
all leakproof quals should be assigned security_level 0, to allow them
to be checked earlier if they have a lower cost (whether or not they
are security quals), and only leaky quals would have a security_level
greater than zero. Rule 1 would then not need to check whether the
qual was leakproof, and you probably wouldn't need the separate
"leakproof" bool field on RestrictInfo.Hm, but it would also force leakproof quals to be evaluated in front
of potentially-cheaper leaky quals, whether or not that's semantically
necessary.I experimented with ignoring security_level altogether for leakproof
quals, but I couldn't make it work properly, because that didn't lead to
a comparison rule that satisfies transitivity. For instance, consider
three quals:
A: cost 1, security_level 1, leaky
B: cost 2, security_level 1, leakproof
C: cost 3, security_level 0, leakproof
A should sort before B, since same security_level and lower cost;
B should sort before C, since lower cost and leakproof;
but A must sort after C, since higher security_level and leaky.
Yeah, this is pretty thorny. IIUC, all leaky quals of a given
security level must be evaluated before any quals of the next higher
security level, or we have a security problem. Beyond that, we'd
*prefer* to evaluate cheaper quals first (though perhaps we ought to
be also thinking about how selective they are) but that's "just" a
matter of how good the query plan is. So in this example, security
dictates that C must precede A, but that's it. We can pick between
C-A-B, C-B-A, and B-C-A based on cost. C-B-A is clearly inferior to
either of the other two, but it's less obvious whether C-A-B or B-C-A
is better. If you expect each predicate to have a selectivity of 50%,
then C-A-B costs 3+(0.5*1)+(0.25*2) = 4 while B-C-A costs
2+(0.5*3)+(0.25*1) = 3.75, so B-C-A is better. But now make the cost
of B and C 18 and 20 while keeping the cost of A at 1. Now C-A-B
costs 20+(0.5*1)+(0.25*18) = 25 while B-C-A costs 18+(0.5*20)+(0.25*1)
= 28.25, so now C-A-B is better.
So I think any attempt to come up with a transitive comparison rule is
doomed. We could do something like: sort by cost then security level;
afterwards, allow leakproof qual to migrate forward as many position
as is possible without passing a qual that is either higher-cost or
(non-leakproof and lower security level). So in the above example we
would start by sorting the like C-A-B and then check whether B can
move forward; it can't, so we're done. If all operators were
leakproof, this would essentially turn into an insertion-sort that
orders them strictly by cost, whereas if they're all leaky, it orders
strictly by security level and then by cost. With a mix of leaky and
non-leaky operators you get something in the middle.
I'm not sure that this is practically better than the hack you
proposed, but I wanted to take the time to comment on the theory here,
as I see it anyway.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 8 November 2016 at 14:45, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Stephen Frost <sfrost@snowman.net> writes:
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
* Since the planner is now depending on Query.hasRowSecurity to be set
whenever there are any securityQuals, I put in an Assert about that,
and promptly found three places in prepjointree.c and the rewriter where
we'd been failing to set it. I have not looked to see if these represent
live bugs in existing releases, but they might. Or am I misunderstanding
what the flag is supposed to mean?They're independent, actually. securityQuals can be set via either
security barrier view or from RLS, while hasRowSecurity is specifically
for the RLS case. The reason for the distinction is that changing your
role isn't going to impact security barrier views at all, while it could
impact what RLS policies are used. See extract_query_dependencies().
Right. securityQuals was added for updatable SB views, which pre-dates RLS.
OK. In that case I'll need to adjust the patch so that the planner keeps
its own flag about whether the query contains any securityQuals; that's
easy enough. But I'm still suspicious that the three places I found may
represent bugs in the management of Query.hasRowSecurity.
I don't believe that there are any existing bugs here, but I think 1
or possibly 2 of the 3 places you changed should be kept on robustness
grounds (see below).
Query.hasRowSecurity is only used for invalidation of cached plans,
when the current role or environment changes, causing a change to the
set of policies that need to be applied, and thus requiring that the
query be re-planned. This happens in extract_query_dependencies(),
which walks the query tree and will find any Query.hasRowSecurity
flags and set PlannerGlobal.dependsOnRole, which is sufficient for its
intended purpose.
Regarding the 3 places you mention...
1). rewriteRuleAction() doesn't need to set Query.hasRowSecurity
because it's called during "Step 1" of the rewriter, where non-SELECT
rules are expanded, but RLS expansion doesn't happen until later, at
the end of "Step 2", after SELECT rules are expanded. That said, you
could argue that copying the flag in rewriteRuleAction() makes the
code more bulletproof, even though it is expected to always be false
at that point.
2). pull_up_simple_subquery() technically doesn't need to set
Query.hasRowSecurity because nothing in the planner refers to it, and
plancache.c will have already recorded the fact that the original
query had RLS. However, this seems like a bug waiting to happen, and
this really ought to be copying the flag in case we later add code
that does look at the flag later in the planning process.
3). rewriteTargetView() should not set the flag because the flag is
only for RLS, not for SB views, and we don't want cached plans for SB
views to be invalidated.
On a related note, I think it's worth establishing a terminology
convention for code and comments in this whole area. In the existing
code, or in my head at least, there's a convention that a term that
contains the words "row" or "policy" together with "security" refers
specifically to RLS, not to SB views. For example:
* Row-level security or just row security for the name of the feature itself
* row_security -- the configuration setting
* get_row_security_policies()
* Query.hasRowSecurity
* rowsecurity.c
On the other hand, terms that contain just the word "security" without
the words "row" or "policy" have a broader scope and may refer to
either RLS or SB views. For example:
* RangeTblEntry.security_barrier
* RangeTblEntry.securityQuals
* expand_security_quals()
* prepsecurity.c
* The new security_level field
It's a pretty fine distinction, and I don't know how others have been
thinking about this, but I think that it's helpful to make the
distinction, and there are at least a couple of places in the patch
that use RLS-specific terminology for what could also be a SB view.
Regards,
Dean
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 8 November 2016 at 16:46, Robert Haas <robertmhaas@gmail.com> wrote:
On Thu, Nov 3, 2016 at 6:23 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I think that ordering might be sub-optimal if you had a mix of
leakproof quals and security quals and the cost of some security quals
were significantly higher than the cost of some other quals. Perhaps
all leakproof quals should be assigned security_level 0, to allow them
to be checked earlier if they have a lower cost (whether or not they
are security quals), and only leaky quals would have a security_level
greater than zero. Rule 1 would then not need to check whether the
qual was leakproof, and you probably wouldn't need the separate
"leakproof" bool field on RestrictInfo.Hm, but it would also force leakproof quals to be evaluated in front
of potentially-cheaper leaky quals, whether or not that's semantically
necessary.
True. That's also what currently happens with RLS and SB views because
leakproof quals are pushed down into subqueries without considering
their cost. It would be nice to do better than that.
I experimented with ignoring security_level altogether for leakproof
quals, but I couldn't make it work properly, because that didn't lead to
a comparison rule that satisfies transitivity. For instance, consider
three quals:
A: cost 1, security_level 1, leaky
B: cost 2, security_level 1, leakproof
C: cost 3, security_level 0, leakproof
A should sort before B, since same security_level and lower cost;
B should sort before C, since lower cost and leakproof;
but A must sort after C, since higher security_level and leaky.Yeah, this is pretty thorny. IIUC, all leaky quals of a given
security level must be evaluated before any quals of the next higher
security level, or we have a security problem. Beyond that, we'd
*prefer* to evaluate cheaper quals first (though perhaps we ought to
be also thinking about how selective they are) but that's "just" a
matter of how good the query plan is. So in this example, security
dictates that C must precede A, but that's it. We can pick between
C-A-B, C-B-A, and B-C-A based on cost. C-B-A is clearly inferior to
either of the other two, but it's less obvious whether C-A-B or B-C-A
is better. If you expect each predicate to have a selectivity of 50%,
then C-A-B costs 3+(0.5*1)+(0.25*2) = 4 while B-C-A costs
2+(0.5*3)+(0.25*1) = 3.75, so B-C-A is better. But now make the cost
of B and C 18 and 20 while keeping the cost of A at 1. Now C-A-B
costs 20+(0.5*1)+(0.25*18) = 25 while B-C-A costs 18+(0.5*20)+(0.25*1)
= 28.25, so now C-A-B is better.So I think any attempt to come up with a transitive comparison rule is
doomed. We could do something like: sort by cost then security level;
afterwards, allow leakproof qual to migrate forward as many position
as is possible without passing a qual that is either higher-cost or
(non-leakproof and lower security level). So in the above example we
would start by sorting the like C-A-B and then check whether B can
move forward; it can't, so we're done. If all operators were
leakproof, this would essentially turn into an insertion-sort that
orders them strictly by cost, whereas if they're all leaky, it orders
strictly by security level and then by cost. With a mix of leaky and
non-leaky operators you get something in the middle.I'm not sure that this is practically better than the hack you
proposed, but I wanted to take the time to comment on the theory here,
as I see it anyway.
Yes, I think you're right. It doesn't look possible to invent a
transitive comparison rule.
I thought perhaps the rule could be to only "push down" a leakproof
qual (change it to a lower security_level) if there are more expensive
quals at the lower level, but as you point out, this doesn't guarantee
cheaper execution.
Regards,
Dean
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Dean,
* Dean Rasheed (dean.a.rasheed@gmail.com) wrote:
On 8 November 2016 at 14:45, Tom Lane <tgl@sss.pgh.pa.us> wrote:
OK. In that case I'll need to adjust the patch so that the planner keeps
its own flag about whether the query contains any securityQuals; that's
easy enough. But I'm still suspicious that the three places I found may
represent bugs in the management of Query.hasRowSecurity.I don't believe that there are any existing bugs here, but I think 1
or possibly 2 of the 3 places you changed should be kept on robustness
grounds (see below).
Agreed.
On a related note, I think it's worth establishing a terminology
convention for code and comments in this whole area. In the existing
code, or in my head at least, there's a convention that a term that
contains the words "row" or "policy" together with "security" refers
specifically to RLS, not to SB views. For example:
Agreed, at least for 'row security'. I tend to view 'policy' as just
about sufficient to stand on its own, in an 'object' type of context
(vs. something like a 'policy decision'). There aren't many other
mentions of policy in src/backend either, the notable one I found
quickly being 'LockWaitPolicy'. That strikes me as pretty distinctive
from RLS-related policies though.
* Row-level security or just row security for the name of the feature itself
* row_security -- the configuration setting
* get_row_security_policies()
* Query.hasRowSecurity
* rowsecurity.cOn the other hand, terms that contain just the word "security" without
the words "row" or "policy" have a broader scope and may refer to
either RLS or SB views. For example:
For my 2c, 'security' is a pretty overloaded term, unfortunately. We
also have things like fmgr_security_definer(), fmgr_info_cxt_security(),
the security label system, etc, so I don't know that 'security' can
really stand on its own, except perhaps within a specific context, like
"within the rewriter and planner/optimizer, 'security' generally is
going to be talking about security barriers, be they for RLS or security
barrier views." Even that is likely a bit of a stretch though. I tend
think we should move in more of a 'Security Barrier'/'SecBarrier' or
similar direction. Anyone working with the code associated with this
should understand that RLS is built on top of the security barrier
system.
I'm not sure we need to get particularly wrapped up in this, however, or
go making changes just for the sake of making them.
It's a pretty fine distinction, and I don't know how others have been
thinking about this, but I think that it's helpful to make the
distinction, and there are at least a couple of places in the patch
that use RLS-specific terminology for what could also be a SB view.
I agree that we shouldn't be using RLS-specific terminology for
components which are actually used by RLS and SB views.
Thanks!
Stephen
Stephen Frost <sfrost@snowman.net> writes:
* Dean Rasheed (dean.a.rasheed@gmail.com) wrote:
On 8 November 2016 at 14:45, Tom Lane <tgl@sss.pgh.pa.us> wrote:
... I'm still suspicious that the three places I found may
represent bugs in the management of Query.hasRowSecurity.
I don't believe that there are any existing bugs here, but I think 1
or possibly 2 of the 3 places you changed should be kept on robustness
grounds (see below).
Agreed.
OK. I'll push a small patch that adds two of those and a comment as to
why it's not appropriate in the third case. HEAD-only should be
sufficient since we don't think this is a live bug.
On a related note, I think it's worth establishing a terminology
convention for code and comments in this whole area.
For my 2c, 'security' is a pretty overloaded term, unfortunately.
Yeah, I think we'd be best off to avoid the bare term "security".
It's probably too late to change the RTE field name "securityQuals",
but maybe we could uniformly call those "security barrier quals" in
the comments. Then the basic terminology is that we have security
barrier views and row-level security both implemented on top of
security barrier quals, and we should be careful to use the right
one of those three terms in comments/documentation.
Or, if you are willing to put up with renaming the field, we could
call the RTE field "barrierQuals" and then they are just "barrier
quals" for documentation purposes. But this would be a PITA for
back-patching, so I'm not sure it's worth it.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 10 November 2016 at 17:12, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Yeah, I think we'd be best off to avoid the bare term "security".
It's probably too late to change the RTE field name "securityQuals",
but maybe we could uniformly call those "security barrier quals" in
the comments. Then the basic terminology is that we have security
barrier views and row-level security both implemented on top of
security barrier quals, and we should be careful to use the right
one of those three terms in comments/documentation.
+1 for that terminology and no renaming of fields.
Regards,
Dean
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
* Dean Rasheed (dean.a.rasheed@gmail.com) wrote:
On 10 November 2016 at 17:12, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Yeah, I think we'd be best off to avoid the bare term "security".
It's probably too late to change the RTE field name "securityQuals",
but maybe we could uniformly call those "security barrier quals" in
the comments. Then the basic terminology is that we have security
barrier views and row-level security both implemented on top of
security barrier quals, and we should be careful to use the right
one of those three terms in comments/documentation.+1 for that terminology and no renaming of fields.
Agreed.
Thanks!
Stephen
Stephen Frost <sfrost@snowman.net> writes:
* Dean Rasheed (dean.a.rasheed@gmail.com) wrote:
+1 for that terminology and no renaming of fields.
Agreed.
Here's an updated version of the RLS planning patch that gets rid of
the incorrect interaction with Query.hasRowSecurity and adjusts
terminology as agreed.
regards, tom lane
Attachments:
new-rls-planning-implementation-2.patch.gzapplication/x-gzip; name=new-rls-planning-implementation-2.patch.gzDownload+0-1
Tom,
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
Stephen Frost <sfrost@snowman.net> writes:
* Dean Rasheed (dean.a.rasheed@gmail.com) wrote:
+1 for that terminology and no renaming of fields.
Agreed.
Here's an updated version of the RLS planning patch that gets rid of
the incorrect interaction with Query.hasRowSecurity and adjusts
terminology as agreed.
I've spent a fair bit of time going over this change to understand it,
how it works, and how it changes the way RLS and securiy barrier views
work.
Overall, I'm happy with how it works and don't see any serious issues
with the qual ordering or the general concept. I did have a few
comments from my review:
diff --git a/src/backend/optimizer/README b/src/backend/optimizer/README [...] + Additional rules will be needed to support safe handling of join quals + when there is a mix of security levels among join quals; for example, it + will be necessary to prevent leaky higher-security-level quals from being + evaluated at a lower join level than other quals of lower security level. + Currently there is no need to consider that since security-prioritized + quals can only be single-table restriction quals coming from RLS policies + or security-barrier views, and thus enforcement only needs to happen at + the table scan level. With such extra rules, it should be possible to let + security-barrier views be flattened into the parent query, allowing more + flexibility of planning while still preserving required ordering of qual + evaluation. But that will come later.
Are you thinking that we will be able to remove the cut-out in
is_simple_subquery() which currently punts whenever an RTE is marked as
security_barrier? That would certainly be excellent as, currently, even
if everything involved is leakproof, we may end up with a poor choice of
plan because the join in the security barrier view must be performed
first. Consider a case where we have two relativly large tables being
joined together in a security barrier view, but a join from outside
which is against a small table. A better plan would generally be to
join with the smaller table first and then join the resulting small set
against the remaining large table.
Speaking of which, it seems like we should probably update the README to
include some mention, at least, of what we're doing today when it comes
to joins which involve security barrier entanglements.
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
[...]
! /*
! * In addition to the quals inherited from the parent, we might have
! * securityQuals associated with this particular child node. (This
! * won't happen in inheritance cases, only with appendrels originating
! * from UNION ALL.) Pull them up into the baserestrictinfo for the
! * child. This is similar to process_security_barrier_quals() for the
! * parent rel, except that we can't make any general deductions from
! * such quals, since they don't hold for the whole appendrel.
! */
Right, this won't happen in inheritance cases because we explicitly
don't consider the quals of the children when querying through the
parent, similar to how we don't consider the GRANT-based permissions on
the child tables. This is mentioned elsewhere but might make sense to
also mention it here, or at least say 'see expand_inherited_rtentry()'.
*************** subquery_push_qual(Query *subquery, Rang
*** 2708,2714 ****
make_and_qual(subquery->jointree->quals, qual);/* ! * We need not change the subquery's hasAggs or hasSublinks flags, * since we can't be pushing down any aggregates that weren't there * before, and we don't push down subselects at all. */ --- 2748,2754 ---- make_and_qual(subquery->jointree->quals, qual);/*
! * We need not change the subquery's hasAggs or hasSubLinks flags,
* since we can't be pushing down any aggregates that weren't there
* before, and we don't push down subselects at all.
*/
Seems like this change is unrelated to what this patch is about. Not a
big deal, but did take me a second to realize that you were just
changing the case of the 'L' in hasSubLinks.
+ * We also reject proposed equivalence clauses if they contain leaky functions + * and have security_level above zero. The EC evaluation rules require us to + * apply certain tests at certain joining levels, and we can't tolerate + * delaying any test on security_level grounds. By rejecting candidate clauses + * that might require security delays, we ensure it's safe to apply an EC + * clause as soon as it's supposed to be applied.
[...]
+ /* Reject if it is potentially postponable by security considerations */ + if (restrictinfo->security_level > 0 && !restrictinfo->leakproof) + return false;
The first comment makes a lot of sense, but the one-liner doesn't seem
as clear, to me anyway.
The result of the above, as I understand it, is that security_level will
either be zero, or the restrictinfo will be leakproof, no? Meaning that
ec_max_security will either be zero, or the functions involved will be
leakproof, right?
*************** select_equality_operator(EquivalenceClas
[...]
--- 1352,1364 ----opno = get_opfamily_member(opfamily, lefttype, righttype,
BTEqualStrategyNumber);
! if (!OidIsValid(opno))
! continue;
! /* If no barrier quals in query, don't worry about leaky operators */
! if (ec->ec_max_security == 0)
! return opno;
! /* Otherwise, insist that selected operators be leakproof */
! if (get_func_leakproof(get_opcode(opno)))
return opno;
}
return InvalidOid;
Leading me to wonder if the above ever actually falls through to the
InvalidOid case due to ec_max_security > 0 and the operator not being
leakproof. Reviewing the coverage-html output, it looks like the only
cases where InvalidOid is returned is when no operator can be found (and
that only happens 20 times throughout the regression tests, and only
through two of the many code paths that call this function).
Perhaps it's more difficult than it's worth to come up with cases that
cover the other code paths involved, but it seems like it might be good
to at least try to as it's likely to happen in more cases now that we're
returning (or should be, at least) InvalidOid due to the only operators
found being leaky ones.
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
[...]
+ /* + * If a clause is leakproof, it doesn't have to be constrained by + * its nominal security level. If it's also reasonably cheap + * (here defined as 10X cpu_operator_cost), pretend it has + * security_level 0, which will allow it to go in front of + * more-expensive quals of lower security levels. Of course, that + * will also force it to go in front of cheaper quals of its own + * security level, which is not so great, but we can alleviate + * that risk by applying the cost limit cutoff. + */ + if (rinfo->leakproof && items[i].cost < 10 * cpu_operator_cost) + items[i].security_level = 0; + else + items[i].security_level = rinfo->security_level; + } + else + items[i].security_level = 0; i++; }
As discussed previously, this looks like a good, practical, hack, but I
feel a little bad that we don't mention it anywhere except in this
comment. Is it too low-level to get a mention in the README?
diff --git a/src/test/regress/expected/updatable_views.out b/src/test/regress/expected/updatable_views.out
[...]
--- 2104,2114 ----EXPLAIN (VERBOSE, COSTS OFF)
UPDATE v1 SET a=100 WHERE snoop(a) AND leakproof(a) AND a = 3;
! QUERY PLAN
! --------------------------
! Result
! One-Time Filter: false
! (2 rows)
Perhaps Dean recalls something more specific, but reviewing this test
and the others around it, I believe it was simply to see what happens in
a case which doesn't pass the security barrier view constraint and to
cover the same cases with the UPDATE as were in the SELECTs above it. I
don't see it being an issue that it now results in a one-time filter:
false result.
Reviewing the other regression test changes, they all look good to me.
Thanks!
Stephen
On Mon, Nov 28, 2016 at 6:55 PM, Stephen Frost <sfrost@snowman.net> wrote:
diff --git a/src/backend/optimizer/README b/src/backend/optimizer/README [...] + Additional rules will be needed to support safe handling of join quals + when there is a mix of security levels among join quals; for example, it + will be necessary to prevent leaky higher-security-level quals from being + evaluated at a lower join level than other quals of lower security level. + Currently there is no need to consider that since security-prioritized + quals can only be single-table restriction quals coming from RLS policies + or security-barrier views, and thus enforcement only needs to happen at + the table scan level. With such extra rules, it should be possible to let + security-barrier views be flattened into the parent query, allowing more + flexibility of planning while still preserving required ordering of qual + evaluation. But that will come later.Are you thinking that we will be able to remove the cut-out in
is_simple_subquery() which currently punts whenever an RTE is marked as
security_barrier? That would certainly be excellent as, currently, even
if everything involved is leakproof, we may end up with a poor choice of
plan because the join in the security barrier view must be performed
first. Consider a case where we have two relativly large tables being
joined together in a security barrier view, but a join from outside
which is against a small table. A better plan would generally be to
join with the smaller table first and then join the resulting small set
against the remaining large table.
We have to be careful that we don't introduce new security holes while
we're improving the plans. I guess this would be OK if the table, its
target list, and its quals all happened to be leakproof, but otherwise
not. Or am I confused?
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers