PATCH: Add query for operators unusable with RLS to documentation

Started by Josh Snyderover 1 year ago3 messages
#1Josh Snyder
josh@code406.com
1 attachment(s)

When deploying RLS, I was surprised to find that certain queries which used
only builtin indexes and operators had dramatically different query plans when
a policy is applied. In my case, the query `tsvector @@ tsquery` over a GIN
index was no longer able to use that index. I was able to find one other
instance [1]/messages/by-id/CAGrP7a2t+JbeuxpQY+RSvNe4fr3+==UmyimwV0GCD+wcrSSb=w@mail.gmail.com of someone being surprised by this behavior on the mailing lists.

The docs already discuss the LEAKPROOF semantics in the abstract, but I think
they place not enough focus on the idea that builtin operators can be (and
frequently are) not leakproof. Based on the query given in the attached patch,
I found that 387 operators are not leakproof versus 588 that are.

The attached patch updates the documentation to provide an easy query over
system catalogs as a way of determining which operators will no longer perform
well under RLS or a security-barrier view.

Thanks,
Josh

[1]: /messages/by-id/CAGrP7a2t+JbeuxpQY+RSvNe4fr3+==UmyimwV0GCD+wcrSSb=w@mail.gmail.com

Attachments:

0001-Add-query-for-operators-unusable-with-RLS.patchtext/x-patch; charset=UTF-8; name=0001-Add-query-for-operators-unusable-with-RLS.patchDownload
From db382697f14bd12dd9e29606c314d7a35bd290ea Mon Sep 17 00:00:00 2001
From: Josh Snyder <josh@code406.com>
Date: Sat, 18 May 2024 15:50:47 -0700
Subject: [PATCH] Add query for operators unusable with RLS

---
 doc/src/sgml/rules.sgml | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/doc/src/sgml/rules.sgml b/doc/src/sgml/rules.sgml
index 7a928bd7b9..3b1283c002 100644
--- a/doc/src/sgml/rules.sgml
+++ b/doc/src/sgml/rules.sgml
@@ -2167,6 +2167,32 @@ CREATE VIEW phone_number WITH (security_barrier) AS
     view's row filters.
 </para>
 
+<para>
+    When working with <literal>security_barrier</literal> views or row-level
+    security policies, a query author can check in advance which index-access
+    operators cannot cross the security barrier by querying system catalogs:
+</para>
+
+<programlisting>
+SELECT
+  amname,
+  format('%s(%s,%s) [%s]', oprcode, lefttype.typname, righttype.typname, oprname)
+FROM pg_operator
+  JOIN pg_amop ON pg_operator.oid = pg_amop.amopopr
+  JOIN pg_am ON pg_amop.amopmethod = pg_am.oid
+  JOIN pg_proc ON pg_operator.oprcode = pg_proc.oid
+  JOIN pg_type lefttype ON pg_amop.amoplefttype = lefttype.oid
+  JOIN pg_type righttype ON pg_amop.amoprighttype = righttype.oid
+WHERE proleakproof = false;
+</programlisting>
+
+<para>
+    Any operator returned by the above query will not be able to perform an
+    indexed lookup through the security barrier. In that case, the view's own
+    conditions will be applied first, followed by conditions added by the query
+    author.
+</para>
+
 <para>
     It is important to understand that even a view created with the
     <literal>security_barrier</literal> option is intended to be secure only
-- 
2.40.1

#2Yugo NAGATA
nagata@sraoss.co.jp
In reply to: Josh Snyder (#1)
Re: PATCH: Add query for operators unusable with RLS to documentation

On Sat, 18 May 2024 16:54:52 -0700
Josh Snyder <josh@code406.com> wrote:

When deploying RLS, I was surprised to find that certain queries which used
only builtin indexes and operators had dramatically different query plans when
a policy is applied. In my case, the query `tsvector @@ tsquery` over a GIN
index was no longer able to use that index. I was able to find one other
instance [1] of someone being surprised by this behavior on the mailing lists.

The docs already discuss the LEAKPROOF semantics in the abstract, but I think
they place not enough focus on the idea that builtin operators can be (and
frequently are) not leakproof. Based on the query given in the attached patch,
I found that 387 operators are not leakproof versus 588 that are.

The attached patch updates the documentation to provide an easy query over
system catalogs as a way of determining which operators will no longer perform
well under RLS or a security-barrier view.

I think it would be worth mentioning an index involving non-LEAKPROOF operator
could not work with RLS or a security-barrier view in the documentation.
(e.g. like /messages/by-id/2273225.DEBA8KRT0r@peanuts2)
It may help to avoid other users from facing the surprise you got.

However, I am not sure if it is appropriate to write the query consulting
pg_amop in this part of the documentation.It is enough to add a reference to
the other part describing operation familiar, for example, "11.10. Operator Classes
and Operator Families"? Additionally, is it useful to add LEAKPROOF information
to the result of psql \dAo(+) meta-comand, or a function that can check given index
or operator is leakproof or not?

Regards,
Yugo Nagata

Thanks,
Josh

[1] /messages/by-id/CAGrP7a2t+JbeuxpQY+RSvNe4fr3+==UmyimwV0GCD+wcrSSb=w@mail.gmail.com

--
Yugo NAGATA <nagata@sraoss.co.jp>

#3Yugo NAGATA
nagata@sraoss.co.jp
In reply to: Josh Snyder (#1)
Re: PATCH: Add query for operators unusable with RLS to documentation

On Sun, 23 Jun 2024 19:14:09 +0900
Yugo NAGATA <nagata@sraoss.co.jp> wrote:

and Operator Families"? Additionally, is it useful to add LEAKPROOF information
to the result of psql \dAo(+) meta-comand, or a function that can check given index
or operator is leakproof or not?

I worte a pach to implement the proposal above and submitted in the new thread[1]/messages/by-id/20240701220817.483f9b645b95611f8b1f65da@sranhm.sraoss.co.jp.

[1]: /messages/by-id/20240701220817.483f9b645b95611f8b1f65da@sranhm.sraoss.co.jp

Regards,
Yugo Nagata

Regards,
Yugo Nagata

Thanks,
Josh

[1] /messages/by-id/CAGrP7a2t+JbeuxpQY+RSvNe4fr3+==UmyimwV0GCD+wcrSSb=w@mail.gmail.com

--
Yugo NAGATA <nagata@sraoss.co.jp>

--
Yugo NAGATA <nagata@sraoss.co.jp>