Re: [pgsql-hackers] Daily digest v1.10705 (13 messages)

Started by Marc Munroalmost 16 years ago40 messageshackers

marc@bloodnok.com

almost 16 years ago

On Thu, 2010-06-03 at 05:53 -0300, pgsql-hackers-owner@postgresql.org
wrote:

[ . . . ]

In my current idea, when a qual-node that contains FuncExpr tries to
reference a part of relations within a security view, its referencing
relations will be expanded to whole of the security view at
distribute_qual_to_rels().
[ . . . ]

I may be missing something here but this seems a bit too simplistic and,
I think, fails to deal with an important use case.

Security views, that do anything useful at all, tend to introduce
performance issues. I am concerned that placing a conceptual barrier
between the secured and unsecured parts of queries is going to
unnecessarily restrict what the optimiser can do.

For example consider that we have three secured views, each of the form:

create view s_x as select * from x where i_can_see(x.key);

and consider the query:

select stuff from s_x
inner join s_y on s_y.key = s_x.key
inner join s_z on s_z.key = s_x.key
where fn(s_x.a) = 3;

The optimiser ought to be able to spot the fact that i_can_see() need
only be called once for each joined result. By placing a barrier (if I
understand your proposal correctly) between the outermost joins and the
inner views, doesn't this optimisation become impossible?

I think a simpler solution may be possible here. If you can tag the
function i_can_see() as a security function, at least in the context of
its use in the security views, and then create the rule that security
functions are always considered to be lower cost than user-defined
non-security functions, don't we achieve the result of preventing the
insecure function from seeing rows that it shouldn't?

I guess my concern is that a query may be constructed a=out of secured
and unsecured parts and the optimiser should be free to group all of the
secured parts together before considering the unsecured parts.

Sorry for the imprecise language and terminolgy, and also if I have
completely misunderstood the implications.

Best Wishes

__
Marc (the veil guy)

Import Notes

Reply to msg id not found: 20100603085405.0B7CB1200B@bloodnok.comReference msg id not found: 20100603085405.0B7CB1200B@bloodnok.com

KaiGai Kohei

kaigai@ak.jp.nec.com

almost 16 years ago

In reply to: Marc Munro (#1)

Re: [PATCH] Fix leaky VIEWs for RLS

I fixed up the subject.

(2010/06/04 2:23), Marc Munro wrote:

On Thu, 2010-06-03 at 05:53 -0300, pgsql-hackers-owner@postgresql.org
wrote:

[ . . . ]

In my current idea, when a qual-node that contains FuncExpr tries to
reference a part of relations within a security view, its referencing
relations will be expanded to whole of the security view at
distribute_qual_to_rels().
[ . . . ]

I may be missing something here but this seems a bit too simplistic and,
I think, fails to deal with an important use case.

Security views, that do anything useful at all, tend to introduce
performance issues. I am concerned that placing a conceptual barrier
between the secured and unsecured parts of queries is going to
unnecessarily restrict what the optimiser can do.

For example consider that we have three secured views, each of the form:

create view s_x as select * from x where i_can_see(x.key);

and consider the query:

select stuff from s_x
inner join s_y on s_y.key = s_x.key
inner join s_z on s_z.key = s_x.key
where fn(s_x.a) = 3;

The optimiser ought to be able to spot the fact that i_can_see() need
only be called once for each joined result. By placing a barrier (if I
understand your proposal correctly) between the outermost joins and the
inner views, doesn't this optimisation become impossible?

It seems to me incorrect.

If i_can_see() can filter a part of tuples within table: x, the optimizer
prefers to call i_can_see() on scanning each tables rather than result of
join, because it effectively reduce the size of scan.

In fact, the existing optimizer make the following plan:

postgres=# CREATE FUNCTION i_can_see(int) RETURNS bool IMMUTABLE
AS 'begin return $1 % 1 = 1; end;' LANGUAGE 'plpgsql';
CREATE FUNCTION
postgres=# CREATE VIEW v1 as select * from t1 where i_can_see(a);
CREATE VIEW
postgres=# CREATE VIEW v2 as select * from t2 where i_can_see(x);
CREATE VIEW
postgres=# CREATE VIEW v3 as select * from t3 where i_can_see(s);
CREATE VIEW

postgres=# EXPLAIN SELECT * FROM (v1 JOIN v2 ON v1.a=v2.x) JOIN v3 on v1.a=v3.s;
QUERY PLAN
-------------------------------------------------------------------------------
Nested Loop (cost=0.00..860.47 rows=410 width=108)
-> Nested Loop (cost=0.00..595.13 rows=410 width=72)
-> Seq Scan on t1 (cost=0.00..329.80 rows=410 width=36)
Filter: i_can_see(a)
-> Index Scan using t2_pkey on t2 (cost=0.00..0.63 rows=1 width=36)
Index Cond: (t2.x = t1.a)
Filter: i_can_see(t2.x)
-> Index Scan using t3_pkey on t3 (cost=0.00..0.63 rows=1 width=36)
Index Cond: (t3.s = t1.a)
Filter: i_can_see(t3.s)
(10 rows)

Sorry, I may miss something your point.

I think a simpler solution may be possible here. If you can tag the
function i_can_see() as a security function, at least in the context of
its use in the security views, and then create the rule that security
functions are always considered to be lower cost than user-defined
non-security functions, don't we achieve the result of preventing the
insecure function from seeing rows that it shouldn't?

Sorry, it seems to me you mixed up different issues.

My patch tries to tackle the problem that optimizer distributes outermost
functions into inside of the view when the function takes arguments only
from a side of join, even if security views.
This issue is independent from cost of functions.

On the other hand, when a scan plan has multiple qualifiers to filter
fetched tuples, we have to pay attention on the order to be called.
If outermost functions are called prior to view's policy functions,
it may cause unexpected leaks.

In my opinion, we should consider the nestlevel of the function where
it is originally put. But it is not concluded yet, and the patch does
not tackle anything about this issue.

I guess my concern is that a query may be constructed a=out of secured
and unsecured parts and the optimiser should be free to group all of the
secured parts together before considering the unsecured parts.

Yes, it is an opinion, but it may cause unnecessary performance regression
when user given condition is obviously harmless.

E.g, If table: t has primary key t.a, and a security view is defined as:

CREATE VIEW v AS SELECT * FROM t WHERE f_policy(t.b);

When user gives the following query:

SELECT * FROM v WHERE a = 100;

If we strictly separate secured and unsecure part prior to optimizer,
it will break a chance to scan the table: t with index.

I also think security is not free, so it may take a certain level of
performance degradation. But it should not bring down more than necessity.

Thanks,
--
KaiGai Kohei <kaigai@ak.jp.nec.com>

Robert Haas

robertmhaas@gmail.com

almost 16 years ago

In reply to: Marc Munro (#1)

On Thu, Jun 3, 2010 at 1:23 PM, Marc Munro <marc@bloodnok.com> wrote:

On Thu, 2010-06-03 at 05:53 -0300, pgsql-hackers-owner@postgresql.org
wrote:

[ . . . ]

In my current idea, when a qual-node that contains FuncExpr tries to
reference a part of relations within a security view, its referencing
relations will be expanded to whole of the security view at
distribute_qual_to_rels().
[ . . . ]

I may be missing something here but this seems a bit too simplistic and,
I think, fails to deal with an important use case.

If anything, you're putting it mildly. This is quite a bit too
simplistic and fails to deal with several important issues, at least
some of which have already been mentioned on this thread.

The optimiser ought to be able to spot the fact that i_can_see() need
only be called once for each joined result. By placing a barrier (if I
understand your proposal correctly) between the outermost joins and the
inner views, doesn't this optimisation become impossible?

I think a simpler solution may be possible here. If you can tag the
function i_can_see() as a security function, at least in the context of
its use in the security views, and then create the rule that security
functions are always considered to be lower cost than user-defined
non-security functions, don't we achieve the result of preventing the
insecure function from seeing rows that it shouldn't?

So, yes and no. You DO need a security barrier between the view and
the rest of the query, but if a function can be trusted not to do evil
things, then it should be allowed to be pushed down. What we need to
prevent is the pushdown of untrusted functions (or operators). A
(very) important part of this problem is determining which quals are
safe to push down.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

KaiGai Kohei

kaigai@ak.jp.nec.com

almost 16 years ago

In reply to: Robert Haas (#3)

Re: [PATCH] Fix leaky VIEWs for RLS

(2010/06/04 11:55), Robert Haas wrote:

On Thu, Jun 3, 2010 at 1:23 PM, Marc Munro<marc@bloodnok.com> wrote:

On Thu, 2010-06-03 at 05:53 -0300, pgsql-hackers-owner@postgresql.org
wrote:

[ . . . ]

In my current idea, when a qual-node that contains FuncExpr tries to
reference a part of relations within a security view, its referencing
relations will be expanded to whole of the security view at
distribute_qual_to_rels().
[ . . . ]

I may be missing something here but this seems a bit too simplistic and,
I think, fails to deal with an important use case.

If anything, you're putting it mildly. This is quite a bit too
simplistic and fails to deal with several important issues, at least
some of which have already been mentioned on this thread.

Hmm. Was it too simplified even if just a proof of concept?

The optimiser ought to be able to spot the fact that i_can_see() need
only be called once for each joined result. By placing a barrier (if I
understand your proposal correctly) between the outermost joins and the
inner views, doesn't this optimisation become impossible?

I think a simpler solution may be possible here. If you can tag the
function i_can_see() as a security function, at least in the context of
its use in the security views, and then create the rule that security
functions are always considered to be lower cost than user-defined
non-security functions, don't we achieve the result of preventing the
insecure function from seeing rows that it shouldn't?

So, yes and no. You DO need a security barrier between the view and
the rest of the query, but if a function can be trusted not to do evil
things, then it should be allowed to be pushed down. What we need to
prevent is the pushdown of untrusted functions (or operators). A
(very) important part of this problem is determining which quals are
safe to push down.

At least, I don't have an idea to distinguish trusted functions from
others without any additional hints, because we support variable kind
of PL languages. :(

An idea is to add TRUSTED (or other) keyword for CREATE FUNCTION to mark
this function is harmless to execute, but only DBA can define functions
with the option.
(Perhaps, most of built-in functions should be marked as trusted?)

If we should identify a function as either trusted or untrusted, is
there any other ideas?

Thanks,
--
KaiGai Kohei <kaigai@ak.jp.nec.com>

Tom Lane

tgl@sss.pgh.pa.us

almost 16 years ago

In reply to: KaiGai Kohei (#4)

Re: [PATCH] Fix leaky VIEWs for RLS

KaiGai Kohei <kaigai@ak.jp.nec.com> writes:

(2010/06/04 11:55), Robert Haas wrote:

A (very) important part of this problem is determining which quals are
safe to push down.

At least, I don't have an idea to distinguish trusted functions from
others without any additional hints, because we support variable kind
of PL languages. :(

The proposal some time back in this thread was to trust all built-in
functions and no others. That's a bit simplistic, no doubt, but it
seems to me to largely solve the performance problem and to do so with
minimal effort. When and if you get to a solution that's committable
with respect to everything else, it might be time to think about
more flexible answers to that particular point.

regards, tom lane

KaiGai Kohei

kaigai@ak.jp.nec.com

almost 16 years ago

In reply to: Tom Lane (#5)

Re: [PATCH] Fix leaky VIEWs for RLS

(2010/06/04 13:57), Tom Lane wrote:

KaiGai Kohei<kaigai@ak.jp.nec.com> writes:

(2010/06/04 11:55), Robert Haas wrote:

A (very) important part of this problem is determining which quals are
safe to push down.

At least, I don't have an idea to distinguish trusted functions from
others without any additional hints, because we support variable kind
of PL languages. :(

The proposal some time back in this thread was to trust all built-in
functions and no others. That's a bit simplistic, no doubt, but it
seems to me to largely solve the performance problem and to do so with
minimal effort. When and if you get to a solution that's committable
with respect to everything else, it might be time to think about
more flexible answers to that particular point.

Although I've not check all the functions within pg_proc.h, it seems to
me reasonable assumptions, except for a small number of exception which
have side-effects, such as lo_write().

Maybe, we have to shut out a small number of exceptions.
However, at least, it seems to me reasonable assumption in this stage.

Thanks,
--
KaiGai Kohei <kaigai@ak.jp.nec.com>

Dimitri Fontaine

dimitri@2ndQuadrant.fr

almost 16 years ago

In reply to: Tom Lane (#5)

Re: [PATCH] Fix leaky VIEWs for RLS

Tom Lane <tgl@sss.pgh.pa.us> writes:

The proposal some time back in this thread was to trust all built-in
functions and no others. That's a bit simplistic, no doubt, but it
seems to me to largely solve the performance problem and to do so with
minimal effort. When and if you get to a solution that's committable
with respect to everything else, it might be time to think about
more flexible answers to that particular point.

What about trusting all "internal" and "C" language function instead? My
understanding is that "internal" covers built-in functions, and as you
need to be a superuser to CREATE a "C" language function, surely you're
able to accept that by doing so you get to trust it?

How useful would that be?
--
dim

KaiGai Kohei

kaigai@ak.jp.nec.com

almost 16 years ago

In reply to: Dimitri Fontaine (#7)

Re: [PATCH] Fix leaky VIEWs for RLS

(2010/06/04 18:26), Dimitri Fontaine wrote:

Tom Lane<tgl@sss.pgh.pa.us> writes:

The proposal some time back in this thread was to trust all built-in
functions and no others. That's a bit simplistic, no doubt, but it
seems to me to largely solve the performance problem and to do so with
minimal effort. When and if you get to a solution that's committable
with respect to everything else, it might be time to think about
more flexible answers to that particular point.

What about trusting all "internal" and "C" language function instead? My
understanding is that "internal" covers built-in functions, and as you
need to be a superuser to CREATE a "C" language function, surely you're
able to accept that by doing so you get to trust it?

How useful would that be?

If we trust all the "C" language functions, it also means DBA can never
install any binary functions having side-effect (e.g, pg_file_write() in
the contrib/adminpack ) without security risks.

If we need an intelligence to identify what functions are trusted and
what ones are untrusted, it will eventually need a hint to mark a certain
function as trusted, won't it?

Thanks,
--
KaiGai Kohei <kaigai@kaigai.gr.jp>

Heikki Linnakangas

heikki.linnakangas@enterprisedb.com

almost 16 years ago

In reply to: Tom Lane (#5)

Re: [PATCH] Fix leaky VIEWs for RLS

On 04/06/10 07:57, Tom Lane wrote:

KaiGai Kohei<kaigai@ak.jp.nec.com> writes:

(2010/06/04 11:55), Robert Haas wrote:

A (very) important part of this problem is determining which quals are
safe to push down.

At least, I don't have an idea to distinguish trusted functions from
others without any additional hints, because we support variable kind
of PL languages. :(

The proposal some time back in this thread was to trust all built-in
functions and no others.

I thought I debunked that idea already
(http://archives.postgresql.org/pgsql-hackers/2009-10/msg01428.php). Not
all built-in functions are safe. Consider casting integer to text, for
example. Seems innocent at first glance, but it's not; if the input is
not a valid integer, it throws an error which contains the input string,
revealing it.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#10

Tom Lane

tgl@sss.pgh.pa.us

almost 16 years ago

In reply to: Heikki Linnakangas (#9)

Re: [PATCH] Fix leaky VIEWs for RLS

Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:

On 04/06/10 07:57, Tom Lane wrote:

The proposal some time back in this thread was to trust all built-in
functions and no others.

I thought I debunked that idea already
(http://archives.postgresql.org/pgsql-hackers/2009-10/msg01428.php). Not
all built-in functions are safe. Consider casting integer to text, for
example. Seems innocent at first glance, but it's not; if the input is
not a valid integer, it throws an error which contains the input string,
revealing it.

Hmm ... that's a mighty interesting example, because it shows that any
well-meaning change in error handling might render seemingly-unrelated
functions "unsafe". And we're certainly not going to make error
messages stop showing relevant information just because of this.

Maybe the entire idea is unworkable. I certainly don't find any comfort
in your proposal in the above-referenced message to trust index
operators; where is it written that those don't throw errors?

regards, tom lane

#11

Marc Munro

marc@bloodnok.com

almost 16 years ago

In reply to: Tom Lane (#10)

Re: [PATCH] Fix leaky VIEWs for RLS

On Fri, 2010-06-04 at 10:33 -0400, Tom Lane wrote:

Hmm ... that's a mighty interesting example, because it shows that any
well-meaning change in error handling might render seemingly-unrelated
functions "unsafe". And we're certainly not going to make error
messages stop showing relevant information just because of this.

Although that looks like a show-stopper, I think it can be worked
around. Errors in operations on security views could simply be caught
and conditionally rewritten. The original error could still appear in
the logs but the full details need not be reported to unprivileged
users.

If that can be done, then we would still need to be able to identify
trusted functions and views used for security purposes, and ensure that
(please excuse my terminology if it is incorrect) untrusted quals do not
get pushed down inside secured views. If all of that can be done along
with the error trapping, then we may have a solution.

My big concern is still about performance, particularly when joining
between multiple security views and other objects. I don't expect to
get security for free but I don't want to see unnecessary barriers to
optimisation.

__
Marc

#12

Heikki Linnakangas

heikki.linnakangas@enterprisedb.com

almost 16 years ago

In reply to: Tom Lane (#10)

Re: [PATCH] Fix leaky VIEWs for RLS

On 04/06/10 17:33, Tom Lane wrote:

Heikki Linnakangas<heikki.linnakangas@enterprisedb.com> writes:

On 04/06/10 07:57, Tom Lane wrote:

The proposal some time back in this thread was to trust all built-in
functions and no others.

I thought I debunked that idea already
(http://archives.postgresql.org/pgsql-hackers/2009-10/msg01428.php). Not
all built-in functions are safe. Consider casting integer to text, for
example.

(I meant "text to integer", of course)

Maybe the entire idea is unworkable. I certainly don't find any comfort
in your proposal in the above-referenced message to trust index
operators; where is it written that those don't throw errors?

Let's consider b-tree operators for an index on the secure table, for
starters. Surely a b-tree index comparison operator can't throw an error
on any value that's in the table already, you would've gotten an error
trying to insert that.

Now, is it safe to expand that thinking to b-tree operators in general,
even if there's no such index on the table? I'm not sure. But indexable
operations are what we care about the most; the order of executing those
determines if you can use an index scan or not.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#13

Robert Haas

robertmhaas@gmail.com

almost 16 years ago

In reply to: Heikki Linnakangas (#12)

Re: [PATCH] Fix leaky VIEWs for RLS

On Fri, Jun 4, 2010 at 1:46 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:

On 04/06/10 17:33, Tom Lane wrote:

Heikki Linnakangas<heikki.linnakangas@enterprisedb.com> writes:

On 04/06/10 07:57, Tom Lane wrote:

The proposal some time back in this thread was to trust all built-in
functions and no others.

I thought I debunked that idea already
(http://archives.postgresql.org/pgsql-hackers/2009-10/msg01428.php). Not
all built-in functions are safe. Consider casting integer to text, for
example.

(I meant "text to integer", of course)

Maybe the entire idea is unworkable. I certainly don't find any comfort
in your proposal in the above-referenced message to trust index
operators; where is it written that those don't throw errors?

Let's consider b-tree operators for an index on the secure table, for
starters. Surely a b-tree index comparison operator can't throw an error on
any value that's in the table already, you would've gotten an error trying
to insert that.

Now, is it safe to expand that thinking to b-tree operators in general, even
if there's no such index on the table? I'm not sure. But indexable
operations are what we care about the most; the order of executing those
determines if you can use an index scan or not.

Another idea I had was... would it be safe to trust functions defined
by the same user who owns the view? If he's granted access to the
view and the function to some other user, presumably he doesn't mind
them being used together? Or is that too optimistic?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

#14

Tom Lane

tgl@sss.pgh.pa.us

almost 16 years ago

In reply to: Heikki Linnakangas (#12)

Re: [PATCH] Fix leaky VIEWs for RLS

Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:

On 04/06/10 17:33, Tom Lane wrote:

Maybe the entire idea is unworkable. I certainly don't find any comfort
in your proposal in the above-referenced message to trust index
operators; where is it written that those don't throw errors?

Let's consider b-tree operators for an index on the secure table, for
starters. Surely a b-tree index comparison operator can't throw an error
on any value that's in the table already, you would've gotten an error
trying to insert that.

Man, are *you* trusting.

A counterexample: suppose we had a form of type "text" that carried a
collation specifier internally, and the comparison routine threw an
error if asked to compare values with incompatible specifiers. An index
built on a column of all the same collation would work fine. A query
that tried to compare against a constant of a different collation would
throw an error.

I'm not sure. But indexable
operations are what we care about the most; the order of executing those
determines if you can use an index scan or not.

Personally, I care just as much about hash and merge join operators...

regards, tom lane

#15

Heikki Linnakangas

heikki.linnakangas@enterprisedb.com

almost 16 years ago

In reply to: Tom Lane (#14)

Re: [PATCH] Fix leaky VIEWs for RLS

On 04/06/10 22:33, Tom Lane wrote:

Heikki Linnakangas<heikki.linnakangas@enterprisedb.com> writes:

On 04/06/10 17:33, Tom Lane wrote:

Maybe the entire idea is unworkable. I certainly don't find any comfort
in your proposal in the above-referenced message to trust index
operators; where is it written that those don't throw errors?

Let's consider b-tree operators for an index on the secure table, for
starters. Surely a b-tree index comparison operator can't throw an error
on any value that's in the table already, you would've gotten an error
trying to insert that.

Man, are *you* trusting.

A counterexample: suppose we had a form of type "text" that carried a
collation specifier internally, and the comparison routine threw an
error if asked to compare values with incompatible specifiers. An index
built on a column of all the same collation would work fine. A query
that tried to compare against a constant of a different collation would
throw an error.

I can't take that example seriously. First of all, tacking a collation
specifier to text values would be an awful hack. Secondly, it would be a
bad idea to define the b-tree comparison operators to throw an error; it
would be a lot more useful to impose an arbitrary order on the
collations, so that all values with collation A are considered smaller
than values with collation B. We do that for types like box; smaller or
greater than don't make much sense for boxes, but we implement them in a
pretty arbitrary way anyway to make it possible to build a b-tree index
on them, and for the planner to use merge joins on them, and implement
DISTINCT using sort etc.

I'm not sure. But indexable
operations are what we care about the most; the order of executing those
determines if you can use an index scan or not.

Personally, I care just as much about hash and merge join operators...

Hash seems safe too. Don't merge joins just use the default b-tree operator?

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#16

Tom Lane

tgl@sss.pgh.pa.us

almost 16 years ago

In reply to: Heikki Linnakangas (#15)

Re: [PATCH] Fix leaky VIEWs for RLS

Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:

On 04/06/10 22:33, Tom Lane wrote:

A counterexample: suppose we had a form of type "text" that carried a
collation specifier internally, and the comparison routine threw an
error if asked to compare values with incompatible specifiers. An index
built on a column of all the same collation would work fine. A query
that tried to compare against a constant of a different collation would
throw an error.

I can't take that example seriously. First of all, tacking a collation
specifier to text values would be an awful hack.

Really? I thought that was under serious discussion. But whether it
applies to text or not is insignificant; I believe there are cases just
like this in existence today for some datatypes (think postgis).

The real point is that the comparison constant is under the control of
the attacker, and it's not part of the index. Therefore "it didn't
throw an error during index construction" proves nothing whatever.

... Secondly, it would be a
bad idea to define the b-tree comparison operators to throw an error;

You're still being far too trusting, by imagining that only *designed*
error conditions matter here. Think about overflows, out-of-memory,
(perhaps intentionally) corrupted data, etc etc.

I think the only real fix would be something like what Marc suggested:
if there's a security view involved in the query, we simply don't give
the client the real error message. Of course, now our "security
feature" is utterly disastrous on usability as well as performance
counts ...

regards, tom lane

#17

Robert Haas

robertmhaas@gmail.com

almost 16 years ago

In reply to: Tom Lane (#16)

Re: [PATCH] Fix leaky VIEWs for RLS

On Fri, Jun 4, 2010 at 4:12 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:

On 04/06/10 22:33, Tom Lane wrote:

A counterexample: suppose we had a form of type "text" that carried a
collation specifier internally, and the comparison routine threw an
error if asked to compare values with incompatible specifiers. An index
built on a column of all the same collation would work fine. A query
that tried to compare against a constant of a different collation would
throw an error.

I can't take that example seriously. First of all, tacking a collation
specifier to text values would be an awful hack.

Really? I thought that was under serious discussion. But whether it
applies to text or not is insignificant; I believe there are cases just
like this in existence today for some datatypes (think postgis).

The real point is that the comparison constant is under the control of
the attacker, and it's not part of the index. Therefore "it didn't
throw an error during index construction" proves nothing whatever.

... Secondly, it would be a
bad idea to define the b-tree comparison operators to throw an error;

You're still being far too trusting, by imagining that only *designed*
error conditions matter here. Think about overflows, out-of-memory,
(perhaps intentionally) corrupted data, etc etc.

btree comparison operators should handle overflow and corrupted data
without blowing up. Maybe out-of-memory is worth worrying about, but
I think that is a mighty thin excuse for abandoning this feature
altogether. You'd have to contrive a situation where the system was
just about out of memory and something about the value being compared
against resulted in the comparison blowing up or not. I think that's
likely to be rather hard in practice, and in any case it's a covert
channel attack, which I think everyone already agrees is beyond the
scope of what we can protect against. You can probably learn more
information more quickly about the unseen data by fidding with
EXPLAIN, analyzing query execution times, etc. As long as we are
preventing the actual contents of the unseen tuples from being
revealed, I feel reasonably good about it.

I think the only real fix would be something like what Marc suggested:
if there's a security view involved in the query, we simply don't give
the client the real error message. Of course, now our "security
feature" is utterly disastrous on usability as well as performance
counts ...

Not pushing anything into the view is an equally real fix, although
I'll be pretty depressed if that's where we end up.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

#18

Stephen Frost

sfrost@snowman.net

almost 16 years ago

In reply to: Robert Haas (#13)

Re: [PATCH] Fix leaky VIEWs for RLS

* Robert Haas (robertmhaas@gmail.com) wrote:

Another idea I had was... would it be safe to trust functions defined
by the same user who owns the view? If he's granted access to the
view and the function to some other user, presumably he doesn't mind
them being used together? Or is that too optimistic?

This was more-or-less what I'd been kind of kicking around in my head.
Forget about functions that are defined in the view itself. Any other
functions, etc, which are attached to the view by the calling user would
be suspect, etc. Perhaps with the exception of some built-ins that
we've marked as "safe" in some way.

My first thought was to track the "run this as X" information on every
RTE (more-or-less, relations, function calls, etc) and then at least be
able to, hopefully, *detect* situations that might be a problem- eg:
running a function which has "run as Q" against a relation that was
accessed as "run as R" when a filter "run as R" happens later. This is
all far too hand-wavey, I'm sure, but at least if we could detect it
then we might be able to find a way to deal with it.

Also, perhaps I'm not being paranoid enough, but all this concern over
error cases really doesn't really worry me that much. The amount of
data one could acquire that way is pretty limited. It'd be great if we
could deal with that case too, but maybe we could worry about the bigger
issue (at least, as I see it) first.

Just my 2c.

Thanks,

Stephen

#19

KaiGai Kohei

kaigai@ak.jp.nec.com

almost 16 years ago

In reply to: Robert Haas (#17)

Re: [PATCH] Fix leaky VIEWs for RLS

(2010/06/07 10:38), Robert Haas wrote:

On Fri, Jun 4, 2010 at 4:12 PM, Tom Lane<tgl@sss.pgh.pa.us> wrote:

Heikki Linnakangas<heikki.linnakangas@enterprisedb.com> writes:

On 04/06/10 22:33, Tom Lane wrote:

A counterexample: suppose we had a form of type "text" that carried a
collation specifier internally, and the comparison routine threw an
error if asked to compare values with incompatible specifiers. An index
built on a column of all the same collation would work fine. A query
that tried to compare against a constant of a different collation would
throw an error.

I can't take that example seriously. First of all, tacking a collation
specifier to text values would be an awful hack.

Really? I thought that was under serious discussion. But whether it
applies to text or not is insignificant; I believe there are cases just
like this in existence today for some datatypes (think postgis).

The real point is that the comparison constant is under the control of
the attacker, and it's not part of the index. Therefore "it didn't
throw an error during index construction" proves nothing whatever.

... Secondly, it would be a
bad idea to define the b-tree comparison operators to throw an error;

You're still being far too trusting, by imagining that only *designed*
error conditions matter here. Think about overflows, out-of-memory,
(perhaps intentionally) corrupted data, etc etc.

btree comparison operators should handle overflow and corrupted data
without blowing up. Maybe out-of-memory is worth worrying about, but
I think that is a mighty thin excuse for abandoning this feature
altogether. You'd have to contrive a situation where the system was
just about out of memory and something about the value being compared
against resulted in the comparison blowing up or not. I think that's
likely to be rather hard in practice, and in any case it's a covert
channel attack, which I think everyone already agrees is beyond the
scope of what we can protect against. You can probably learn more
information more quickly about the unseen data by fidding with
EXPLAIN, analyzing query execution times, etc. As long as we are
preventing the actual contents of the unseen tuples from being
revealed, I feel reasonably good about it.

It seems to me a good point. In general, any software have
possibility to leak information via cover-channels, and it is
nearly impossible to fix all the channels.
However, from a security perspective, covert channels with low
bandwidths represent a lower threat than those with high bandwidths.
So, evaluation criteria stands on the viewpoint that it does not
necessary to eliminate all the covert channels more than necessity.

Even if we can estimate the contents of invisible tuples from error
messages or EXPLAIN output, its bandwidths are quite limited.

But, lowrite() might write out the given argument somewhere, if it
is invoked prior to security policy functions.
In this case, the contents of invisible tuples will be moved to
another large object unexpectedly with unignorable speed.
Such a direct data flow channel should be fixed at first, because of
its large bandwidth.

I think the only real fix would be something like what Marc suggested:
if there's a security view involved in the query, we simply don't give
the client the real error message. Of course, now our "security
feature" is utterly disastrous on usability as well as performance
counts ...

Not pushing anything into the view is an equally real fix, although
I'll be pretty depressed if that's where we end up.

I could not find out any certified commercial RDBMS with functionality
to tackle covert channel. It includes Oracle's row-level security stuff.
So, I don't think it is our goal to prevent anything into views.

An idea is that we focus on the direct data flow which allows to move
information to be invisible into another tables, files and so on.
In this stage, I don't think the priority of error messages are high.

Thanks,
--
KaiGai Kohei <kaigai@ak.jp.nec.com>

#20

KaiGai Kohei

kaigai@ak.jp.nec.com

almost 16 years ago

In reply to: Stephen Frost (#18)

Re: [PATCH] Fix leaky VIEWs for RLS

(2010/06/07 12:06), Stephen Frost wrote:

* Robert Haas (robertmhaas@gmail.com) wrote:

Another idea I had was... would it be safe to trust functions defined
by the same user who owns the view? If he's granted access to the
view and the function to some other user, presumably he doesn't mind
them being used together? Or is that too optimistic?

This was more-or-less what I'd been kind of kicking around in my head.
Forget about functions that are defined in the view itself. Any other
functions, etc, which are attached to the view by the calling user would
be suspect, etc. Perhaps with the exception of some built-ins that
we've marked as "safe" in some way.

My first thought was to track the "run this as X" information on every
RTE (more-or-less, relations, function calls, etc) and then at least be
able to, hopefully, *detect* situations that might be a problem- eg:
running a function which has "run as Q" against a relation that was
accessed as "run as R" when a filter "run as R" happens later. This is
all far too hand-wavey, I'm sure, but at least if we could detect it
then we might be able to find a way to deal with it.

It is a similar idea to what I tried to implement at the conceptual
patch. It checks privileges of calling user on the underlaying tables
to be referenced using views. If violated, it prevents to push down
the user given FuncExpr into join loops of views. Otherwise, it does
not prevent anything, because the user originally has privileges to
reference them.
Are you suggesting the idea (with adjustments such as "safe" functions)?

Also, perhaps I'm not being paranoid enough, but all this concern over
error cases really doesn't really worry me that much. The amount of
data one could acquire that way is pretty limited. It'd be great if we
could deal with that case too, but maybe we could worry about the bigger
issue (at least, as I see it) first.

As I also mentioned at previous message. It seems to me a good
point to focus on bandwidth of the covert channel.

The error message indeed can deliver information to be invisible,
but I don't think its priority is higher than functions that can
be abused to direct data flow channel, such as lowrite().

Thanks,
--
KaiGai Kohei <kaigai@ak.jp.nec.com>

#21

Heikki Linnakangas

heikki.linnakangas@enterprisedb.com

almost 16 years ago

In reply to: Stephen Frost (#18)

#22

KaiGai Kohei

kaigai@ak.jp.nec.com

almost 16 years ago

In reply to: Heikki Linnakangas (#21)

#23