INSERT ... ON CONFLICT UPDATE and RLS

Started by Peter Geogheganover 11 years ago46 messageshackers

pg@bowt.ie

over 11 years ago

The patch that implements INSERT ... ON CONFLICT UPDATE has support
and tests for per-column privileges (which are not relevant to the
IGNORE variant, AFAICT). However, RLS support is another thing
entirely. It has not been properly thought out, and unlike per-column
privileges requires careful consideration, as the correct behavior
isn't obvious.

I've documented the current problems with RLS here:

https://wiki.postgresql.org/wiki/UPSERT#RLS

It's not clear whether or not the auxiliary UPDATE within an INSERT...
ON CONFLICT UPDATE statement should have security quals appended.
Stephen seemed to think that that might not be the best solution [1]/messages/by-id/20141121205926.GK28859@tamriel.snowman.net -- Peter Geoghegan.
I am not sure. I'd like to learn what other people think.

What is the best way of integrating RLS with ON CONFLICT UPDATE? What
behavior is most consistent with the guarantees of RLS? In particular,
should the implementation append security quals to the auxiliary
UPDATE, or fail sooner?

[1]: /messages/by-id/20141121205926.GK28859@tamriel.snowman.net -- Peter Geoghegan
--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Robert Haas

robertmhaas@gmail.com

over 11 years ago

In reply to: Peter Geoghegan (#1)

Re: INSERT ... ON CONFLICT UPDATE and RLS

On Mon, Jan 5, 2015 at 12:54 PM, Peter Geoghegan <pg@heroku.com> wrote:

The patch that implements INSERT ... ON CONFLICT UPDATE has support
and tests for per-column privileges (which are not relevant to the
IGNORE variant, AFAICT). However, RLS support is another thing
entirely. It has not been properly thought out, and unlike per-column
privileges requires careful consideration, as the correct behavior
isn't obvious.

I've documented the current problems with RLS here:

https://wiki.postgresql.org/wiki/UPSERT#RLS

It's not clear whether or not the auxiliary UPDATE within an INSERT...
ON CONFLICT UPDATE statement should have security quals appended.
Stephen seemed to think that that might not be the best solution [1].
I am not sure. I'd like to learn what other people think.

What is the best way of integrating RLS with ON CONFLICT UPDATE? What
behavior is most consistent with the guarantees of RLS? In particular,
should the implementation append security quals to the auxiliary
UPDATE, or fail sooner?

I think the INSERT .. ON CONFLICT UPDATE shouldn't be able to attempt
an update unless the UPDATE policies of the table are such that a
regular UPDATE would find the affected row. The post-image of the row
needs to satisfy any UPDATE CHECK OPTION. If the INSERT fails due to
a conflict with an unseen row, and the UPDATE can't find that row
either due to RLS, then it should probably error out; the alternative
is to silently do nothing, but that feels wrong.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Peter Geoghegan

pg@bowt.ie

over 11 years ago

In reply to: Robert Haas (#2)

Re: INSERT ... ON CONFLICT UPDATE and RLS

On Tue, Jan 6, 2015 at 9:39 AM, Robert Haas <robertmhaas@gmail.com> wrote:

I think the INSERT .. ON CONFLICT UPDATE shouldn't be able to attempt
an update unless the UPDATE policies of the table are such that a
regular UPDATE would find the affected row. The post-image of the row
needs to satisfy any UPDATE CHECK OPTION. If the INSERT fails due to
a conflict with an unseen row, and the UPDATE can't find that row
either due to RLS, then it should probably error out; the alternative
is to silently do nothing, but that feels wrong.

I can certainly see the argument for that behavior. I'm inclined to
agree that this is better.

With th existing implementation, UPDATE check options are effective at
preventing updates. However, the implementation is not effective at
preventing row locking from finding a row that the user would not
otherwise be able to find (with a conventional UPDATE). So I guess
what I have to do now is figure out a way of also having the "... FOR
UPDATE .... USING ( )" qual be enforced after row locking in respect
of the row locked (locked, but not yet used to generate a post-image)
in the target table. If it isn't satisfied, throw an error that
doesn't leak anything about the target row, rather than silently not
affecting the row. So, as you say, a divergence from what regular RLS
UPDATEs do that happens to make more sense here.

--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Peter Geoghegan

pg@bowt.ie

over 11 years ago

In reply to: Peter Geoghegan (#3)

Re: INSERT ... ON CONFLICT UPDATE and RLS

Another issue that I see is that there is only one resultRelation
between the INSERT and UPDATE. That means that without some additional
special measure, both UPDATE and INSERT "WITH CHECK ( ... ) " options
are applied regardless of whether the INSERT path was taken, or the
UPDATE path. Maybe that's actually sensible (note that even this
doesn't happen right now, since the auxiliary UPDATE is currently
minimally processed by the rewriter). What do people think about that?
It seems like it might be less surprising to have that fail earlier
when there are separate policies for INSERT and UPDATE -- for UPSERT,
all policies are applied regardless of which path is taken.

The task of making the security qual enforced like a check option
before we go to update a locked row (at the start of the UPDATE path,
for any UPDATE policy with a USING security qual) looks complicated.
I'd appreciate a little help on the implementation details.

Apparently Oracle does not offer "fine-grained access control" with
MERGE, which I believe means they just don't support this kind of
thing at all. Obviously I'd rather avoid that here, but the correct
semantics are not obvious. ON CONFLICT UPDATE could almost justify
making CREATE POLICY FOR INSERT accept a USING expression, since
that's really where the row to UPDATE comes from.
--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Dean Rasheed

dean.a.rasheed@gmail.com

over 11 years ago

In reply to: Peter Geoghegan (#4)

Re: INSERT ... ON CONFLICT UPDATE and RLS

On 6 January 2015 at 20:44, Peter Geoghegan <pg@heroku.com> wrote:

Another issue that I see is that there is only one resultRelation
between the INSERT and UPDATE. That means that without some additional
special measure, both UPDATE and INSERT "WITH CHECK ( ... ) " options
are applied regardless of whether the INSERT path was taken, or the
UPDATE path. Maybe that's actually sensible (note that even this
doesn't happen right now, since the auxiliary UPDATE is currently
minimally processed by the rewriter). What do people think about that?
It seems like it might be less surprising to have that fail earlier
when there are separate policies for INSERT and UPDATE -- for UPSERT,
all policies are applied regardless of which path is taken.

I think the policies applied should depend on the path taken, so if it
does an INSERT, then only the INSERT CHECK policy should be applied
(after the insert), but if it ends up doing an UPDATE, I would expect
the UPDATE USING policy to be applied (before the update) and the
UPDATE CHECK policy to be applied (after the update). I would not
expect the INSERT CHECK policy to be applied on the UPDATE path.

As to whether the UPDATE USING policy should cause an error to be
thrown if it is not satisfied, my inclination would be to not error,
and use the command tag to report that no rows were updated, since
that is what would happen with a regular UPDATE.

So overall INSERT .. ON CONFLICT UPDATE would be consistent with
either an INSERT or an UPDATE, depending on whether the row existed
beforehand, which is easier to explain than having some special UPSERT
semantics.

Regards,
Dean

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Robert Haas

robertmhaas@gmail.com

over 11 years ago

In reply to: Dean Rasheed (#5)

Re: INSERT ... ON CONFLICT UPDATE and RLS

On Wed, Jan 7, 2015 at 4:04 AM, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:

I think the policies applied should depend on the path taken, so if it
does an INSERT, then only the INSERT CHECK policy should be applied
(after the insert), but if it ends up doing an UPDATE, I would expect
the UPDATE USING policy to be applied (before the update) and the
UPDATE CHECK policy to be applied (after the update). I would not
expect the INSERT CHECK policy to be applied on the UPDATE path.

I agree.

As to whether the UPDATE USING policy should cause an error to be
thrown if it is not satisfied, my inclination would be to not error,
and use the command tag to report that no rows were updated, since
that is what would happen with a regular UPDATE.

My inclination would be to error, but I'd be OK with your proposal.

So overall INSERT .. ON CONFLICT UPDATE would be consistent with
either an INSERT or an UPDATE, depending on whether the row existed
beforehand, which is easier to explain than having some special UPSERT
semantics.

Yeah. We won't escape the question so easily where triggers are
concerned, but at least for RLS it seems like it should be possible to
avoid confusing, one-off semantics.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Stephen Frost

sfrost@snowman.net

over 11 years ago

In reply to: Dean Rasheed (#5)

Re: INSERT ... ON CONFLICT UPDATE and RLS

* Dean Rasheed (dean.a.rasheed@gmail.com) wrote:

On 6 January 2015 at 20:44, Peter Geoghegan <pg@heroku.com> wrote:

Another issue that I see is that there is only one resultRelation
between the INSERT and UPDATE. That means that without some additional
special measure, both UPDATE and INSERT "WITH CHECK ( ... ) " options
are applied regardless of whether the INSERT path was taken, or the
UPDATE path. Maybe that's actually sensible (note that even this
doesn't happen right now, since the auxiliary UPDATE is currently
minimally processed by the rewriter). What do people think about that?
It seems like it might be less surprising to have that fail earlier
when there are separate policies for INSERT and UPDATE -- for UPSERT,
all policies are applied regardless of which path is taken.

I think the policies applied should depend on the path taken, so if it
does an INSERT, then only the INSERT CHECK policy should be applied
(after the insert), but if it ends up doing an UPDATE, I would expect
the UPDATE USING policy to be applied (before the update) and the
UPDATE CHECK policy to be applied (after the update). I would not
expect the INSERT CHECK policy to be applied on the UPDATE path.

You're making a distinction where I'm not convinced there should be one.
While I understand that, internally, there are two paths (INSERT vs.
UPDATE), it's still only one command for the user and their goal is
simply "update if this exists, insert if it doesn't." I also don't like
the idea that the policy to be applied depends on if it ends up being an
INSERT or an UPDATE. I liked Peter's suggestion that the row must pass
both WITH CHECK conditions- at least that's consistent and
understandable.

As to whether the UPDATE USING policy should cause an error to be
thrown if it is not satisfied, my inclination would be to not error,
and use the command tag to report that no rows were updated, since
that is what would happen with a regular UPDATE.

Yes, for a regular UPDATE that's what would happen- but this isn't a
regular UPDATE, it's an UPSERT. Consider how users handle this now-
they have routines which continually try to do one or the other until
either the INSERT or the UPDATE is successful. I've never seen such a
function, properly written, that says "try to INSERT, if that fails, try
to UPDATE and if that doesn't update anything, then simply exit." What
is the use-case for that?

So overall INSERT .. ON CONFLICT UPDATE would be consistent with
either an INSERT or an UPDATE, depending on whether the row existed
beforehand, which is easier to explain than having some special UPSERT
semantics.

Any semantics which result in a no-op would be surprising for users of
UPSERT. I like the idea of failing earlier- if you try to INSERT .. ON
CONFLICT UPDATE a row which you're not allowed to INSERT, erroring
strikes me as the right result. If you try to INSERT .. ON CONFLICT
UPDATE a row which you're not allowed to UPDATE, I'd also think an error
would be the right answer, even if you don't end up doing an actual
UPDATE- you ran a command asking for an UPDATE to happen under a certain
condition (the row already exists) and the policy states that you're not
allowed to do such an UPDATE.

As for the UPDATE's USING clause, if you're not allowed to view the
records to be updated (which evidently exists because the INSERT
failed), I'd handle that the same way we handle the case that a SELECT
policy might not allow a user to see rows that they can attempt to
INSERT- they'll get an error back in that case too.

In the end, these are edge cases as policies are generally self
consistent and so I think we have some leeway, but I don't think we have
the right to turn an INSERT .. ON CONFLICT UPDATE into a no-op. That
would be like allowing an INSERT to be a no-op.

Thanks,

Stephen

Stephen Frost

sfrost@snowman.net

over 11 years ago

In reply to: Robert Haas (#6)

Re: INSERT ... ON CONFLICT UPDATE and RLS

* Robert Haas (robertmhaas@gmail.com) wrote:

On Wed, Jan 7, 2015 at 4:04 AM, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:

I think the policies applied should depend on the path taken, so if it
does an INSERT, then only the INSERT CHECK policy should be applied
(after the insert), but if it ends up doing an UPDATE, I would expect
the UPDATE USING policy to be applied (before the update) and the
UPDATE CHECK policy to be applied (after the update). I would not
expect the INSERT CHECK policy to be applied on the UPDATE path.

I agree.

I can certainly understand the appeal of this approach, but I don't
think it makes sense. Consider what happens later on down the road,
after the code has been written and deployed using INSERT .. ON CONFLICT
UPDATE where 99% of the time only one path or the other is taken. Then
the other path is taken and suddenly the exact same command and row ends
up returning errors. Additional testing should have been done to check
if that happens, of course, but I really don't like the idea that the
exact same command, with the exact same policies, would succeed or fail,
due to policies, based on the data in the database.

That's not to say that, generally, speaking, the data in the database
shouldn't impact policy failures, but in this case, the policies might
not even reference any other tables in the database.

As to whether the UPDATE USING policy should cause an error to be
thrown if it is not satisfied, my inclination would be to not error,
and use the command tag to report that no rows were updated, since
that is what would happen with a regular UPDATE.

My inclination would be to error, but I'd be OK with your proposal.

I'm pretty strongly in favor of erroring. :)

So overall INSERT .. ON CONFLICT UPDATE would be consistent with
either an INSERT or an UPDATE, depending on whether the row existed
beforehand, which is easier to explain than having some special UPSERT
semantics.

Yeah. We won't escape the question so easily where triggers are
concerned, but at least for RLS it seems like it should be possible to
avoid confusing, one-off semantics.

Triggers are another question..

One approach that I don't recall seeing (and I'm not convinced that it's
a good idea either, but thought it deserved mention) is the idea of
having an UPSERT-specific set of policies (and perhaps an
UPSERT-specific trigger...?). That gets pretty grotty when you consider
existing systems which have INSERT and UPDATE triggers already, but
UPSERT is a pretty big deal and a big change and for users who are just
starting to use it, requiring that they write triggers and policies
specifically to address that case might be acceptable. That doesn't
address the cases where users have direct SQL access and might be able
to then bypass existing triggers, but we might be able to work out an
answer to that case..

Other databases have this capability and have triggers and at least one
ends up firing both INSERT and UPDATE triggers, with many complaints
from users about how that ends up making the performance suck. Perhaps
we could use that as a fallback but support the explicit single trigger
option too.. Just some thoughts, apologies if it's already been
convered in depth previously.

Thanks!

Stephen

Robert Haas

robertmhaas@gmail.com

over 11 years ago

In reply to: Stephen Frost (#8)

Re: INSERT ... ON CONFLICT UPDATE and RLS

On Wed, Jan 7, 2015 at 3:01 PM, Stephen Frost <sfrost@snowman.net> wrote:

* Robert Haas (robertmhaas@gmail.com) wrote:

On Wed, Jan 7, 2015 at 4:04 AM, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:

I think the policies applied should depend on the path taken, so if it
does an INSERT, then only the INSERT CHECK policy should be applied
(after the insert), but if it ends up doing an UPDATE, I would expect
the UPDATE USING policy to be applied (before the update) and the
UPDATE CHECK policy to be applied (after the update). I would not
expect the INSERT CHECK policy to be applied on the UPDATE path.

I agree.

I can certainly understand the appeal of this approach, but I don't
think it makes sense. Consider what happens later on down the road,
after the code has been written and deployed using INSERT .. ON CONFLICT
UPDATE where 99% of the time only one path or the other is taken. Then
the other path is taken and suddenly the exact same command and row ends
up returning errors.

I'd say: that's life. If you don't test your policies, they might not work.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#10

Stephen Frost

sfrost@snowman.net

over 11 years ago

In reply to: Robert Haas (#9)

Re: INSERT ... ON CONFLICT UPDATE and RLS

* Robert Haas (robertmhaas@gmail.com) wrote:

On Wed, Jan 7, 2015 at 3:01 PM, Stephen Frost <sfrost@snowman.net> wrote:

* Robert Haas (robertmhaas@gmail.com) wrote:

On Wed, Jan 7, 2015 at 4:04 AM, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:

I think the policies applied should depend on the path taken, so if it
does an INSERT, then only the INSERT CHECK policy should be applied
(after the insert), but if it ends up doing an UPDATE, I would expect
the UPDATE USING policy to be applied (before the update) and the
UPDATE CHECK policy to be applied (after the update). I would not
expect the INSERT CHECK policy to be applied on the UPDATE path.

I agree.

I can certainly understand the appeal of this approach, but I don't
think it makes sense. Consider what happens later on down the road,
after the code has been written and deployed using INSERT .. ON CONFLICT
UPDATE where 99% of the time only one path or the other is taken. Then
the other path is taken and suddenly the exact same command and row ends
up returning errors.

I'd say: that's life. If you don't test your policies, they might not work.

I agree with that up to a point- this case feels more like a POLA
violation though rather than a run-of-the-mill "you need to test all
that." I'm a bit worried this might lead to cases where users can't use
UPSERT because they don't know which set of policies will end up getting
applied, although the above approach is strictly a subset of the
approach I'm advocating, but at least with the 'throw it all together'
case, you know exactly what you're getting with UPSERT.

Thanks,

Stephen

#11

Peter Geoghegan

pg@bowt.ie

over 11 years ago

In reply to: Stephen Frost (#8)

Re: INSERT ... ON CONFLICT UPDATE and RLS

On Wed, Jan 7, 2015 at 12:01 PM, Stephen Frost <sfrost@snowman.net> wrote:

Other databases have this capability and have triggers and at least one
ends up firing both INSERT and UPDATE triggers, with many complaints
from users about how that ends up making the performance suck. Perhaps
we could use that as a fallback but support the explicit single trigger
option too.. Just some thoughts, apologies if it's already been
convered in depth previously.

I would like to expose whether or not statement-level triggers are
being called in the context of an INSERT ... ON CONFLICT UPDATE, FWIW.

--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#12

Peter Geoghegan

pg@bowt.ie

over 11 years ago

In reply to: Stephen Frost (#10)

Re: INSERT ... ON CONFLICT UPDATE and RLS

On Wed, Jan 7, 2015 at 12:26 PM, Stephen Frost <sfrost@snowman.net> wrote:

I agree with that up to a point- this case feels more like a POLA
violation though rather than a run-of-the-mill "you need to test all
that." I'm a bit worried this might lead to cases where users can't use
UPSERT because they don't know which set of policies will end up getting
applied, although the above approach is strictly a subset of the
approach I'm advocating, but at least with the 'throw it all together'
case, you know exactly what you're getting with UPSERT.

I think that it makes sense to bunch the policies together, because
INSERT ... ON CONFLICT UPDATE is mostly an INSERT - informally
speaking, the "UPDATE part" is just enough to resolve a conflict -
this is ultimately just a new clause of INSERT. Besides, I think that
making the permissions as restrictive as possible is the least
surprising behavior. It's clearly the most secure behavior.

I think that we collectively lean towards actively throwing an error
as the preferred behavior - I think that's good, because the intent of
the user to UPDATE some particular tuple that they're not allowed to
UPDATE is clear and unambiguous. As long as we're doing that, clearly
writing a query that will fail based on the "wrong" path being taken
is wrong-headed. Now, you can construct an UPSERT case that a "bunch
together policies" approach doesn't serve well. I could be mistaken,
but I think that case is likely to be more than a little contrived.

We're now going to be testing the TARGET.* tuple before going to
update, having failed to insert (and after row locking the TARGET.*
tuple) - the target tuple is really how the update finds the existing
row to update, so it should be tested, without necessarily being
blamed directly (certainly, we cannot leak the TARGET.* tuple values,
so maybe we should try blaming the EXCLUDED.* tuple first for error
messaging reasons). If it fails because of the insert check after row
locking but before update, well, it was liable to anyway, when the
insert path was taken. Conversely, if it failed after actually
inserting where that would not happen with a vanilla INSERT (due to a
bunched-together update "USING()" security barrier qual or explicit
user-supplied CHECK OPTION), well, you'd have gotten the same thing if
you took the UPDATE path anyway.

That just leaves the regular ExecUpdate() call of
ExecWithCheckOptions(). When that is called with "bunched together"
policies, you're now testing the final tuple. So it might be a bit
confusing that the policy of final tuples originating from INSERTs is
also enforced here, for the final tuple of an "UPDATE". But in order
for even that to be the case, you'd have to have a final tuple that
was substantially different from the one originally proposed for
insertion that made the difference between a qual passing and not
passing. How likely is that in realistic use cases? And besides, isn't
the policy for INSERTs still quite relevant even here? We're talking
about a final tuple that originated from an INSERT statement, after
all.

--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#13

David Fetter

david@fetter.org

over 11 years ago

In reply to: Stephen Frost (#8)

Re: INSERT ... ON CONFLICT UPDATE and RLS

On Wed, Jan 07, 2015 at 03:01:20PM -0500, Stephen Frost wrote:

* Robert Haas (robertmhaas@gmail.com) wrote:

On Wed, Jan 7, 2015 at 4:04 AM, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:

I think the policies applied should depend on the path taken, so if it
does an INSERT, then only the INSERT CHECK policy should be applied
(after the insert), but if it ends up doing an UPDATE, I would expect
the UPDATE USING policy to be applied (before the update) and the
UPDATE CHECK policy to be applied (after the update). I would not
expect the INSERT CHECK policy to be applied on the UPDATE path.

I agree.

I can certainly understand the appeal of this approach, but I don't
think it makes sense. Consider what happens later on down the road,
after the code has been written and deployed using INSERT .. ON CONFLICT
UPDATE where 99% of the time only one path or the other is taken. Then
the other path is taken and suddenly the exact same command and row ends
up returning errors. Additional testing should have been done to check
if that happens, of course, but I really don't like the idea that the
exact same command, with the exact same policies, would succeed or fail,
due to policies, based on the data in the database.

There's precedent. Unique constraints, for example.

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#14

Peter Geoghegan

pg@bowt.ie

over 11 years ago

In reply to: David Fetter (#13)

Re: INSERT ... ON CONFLICT UPDATE and RLS

On Wed, Jan 7, 2015 at 5:16 PM, David Fetter <david@fetter.org> wrote:

There's precedent. Unique constraints, for example.

I don't see that as any kind of precedent.

--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#15

David Fetter

david@fetter.org

over 11 years ago

In reply to: Peter Geoghegan (#14)

Re: INSERT ... ON CONFLICT UPDATE and RLS

On Wed, Jan 07, 2015 at 05:18:58PM -0800, Peter Geoghegan wrote:

On Wed, Jan 7, 2015 at 5:16 PM, David Fetter <david@fetter.org> wrote:

There's precedent. Unique constraints, for example.

I don't see that as any kind of precedent.

In the part you sliced off, Stephen described a situation where the
contents of a database either do or don't cause a query to violate a
constraint. Uniqueness constraints are one example of this.

CREATE TABLE foo(id INTEGER PRIMARY KEY);

INSERT INTO foo(id) VALUES(1); /* Works the first time */
INSERT INTO foo(id) VALUES(1); /* Fails the next time */

Same database, same constraints, different outcome.

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#16

Peter Geoghegan

pg@bowt.ie

over 11 years ago

In reply to: David Fetter (#15)

Re: INSERT ... ON CONFLICT UPDATE and RLS

On Wed, Jan 7, 2015 at 5:33 PM, David Fetter <david@fetter.org> wrote:

Same database, same constraints, different outcome.

You're missing the point, I think. UPSERT means that the user doesn't
care about whether or not a given tuple proposed for insertion was
handled with an insert or an update. If you have distinct INSERT and
UPDATE policies (not that I think that many people will), and if they
differ in such a way as to make the difference between some actual
INSERT ... ON CONFLICT UPDATE in the app throwing an error or not
throwing an error depending only on whether the UPDATE (or INSERT)
path was taken, then that combination (the combination of that UPSERT
command, that UPDATE policy and that INSERT policy) is wrong-headed.
It's not based on some particular set of data in the database, but any
and all possible sets. Stephen's objection isn't that the update
operation throws an error; it's that the insert doesn't. I'm with
Stephen; ISTM that selectively enforcing RLS like that is a foot gun.
For column level privileges, you wouldn't expect to only get an error
about not having the relevant update permissions at runtime, when the
update path happens to be taken. And so it is for RLS.

--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#17

Stephen Frost

sfrost@snowman.net

over 11 years ago

In reply to: Peter Geoghegan (#16)

Re: INSERT ... ON CONFLICT UPDATE and RLS

Peter,

* Peter Geoghegan (pg@heroku.com) wrote:

For column level privileges, you wouldn't expect to only get an error
about not having the relevant update permissions at runtime, when the
update path happens to be taken. And so it is for RLS.

Right, that's the precedent we should be considering. Column-level
privileges is a great example- you need both insert and update
privileges for the columns involved for the command to succeed. It
shouldn't depend on which path actually ends up being taken.

Thanks!

Stephen

#18

Dean Rasheed

dean.a.rasheed@gmail.com

over 11 years ago

In reply to: Stephen Frost (#17)

Re: INSERT ... ON CONFLICT UPDATE and RLS

On 9 January 2015 at 00:49, Stephen Frost <sfrost@snowman.net> wrote:

Peter,

* Peter Geoghegan (pg@heroku.com) wrote:

For column level privileges, you wouldn't expect to only get an error
about not having the relevant update permissions at runtime, when the
update path happens to be taken. And so it is for RLS.

Right, that's the precedent we should be considering. Column-level
privileges is a great example- you need both insert and update
privileges for the columns involved for the command to succeed. It
shouldn't depend on which path actually ends up being taken.

It's not the same as column-level privileges though, because RLS
policies depend on the new/existing data, not just the table metadata.
So for example an "UPDATE ... WHERE false" can fail column-level
privilege checks even though it isn't updating anything, but it cannot
fail a RLS policy check.

I was trying to think up an example where you might actually have
different INSERT and UPDATE policies, and the best I can think of is
some sort of mod_count column where you have an INSERT CHECK
(mod_count = 0) and an UPDATE CHECK (mod_count > 0). In that case,
checking both policies would make an UPSERT impossible, whereas if you
think of it as doing either an INSERT or an UPDATE, as the syntax
suggests, it becomes possible.

(I'm assuming here from your description that you intend for the
policy expressions to be AND'ed together, so it might end up being an
AND of 2 OR expressions.)

Regards,
Dean

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#19

Peter Geoghegan

pg@bowt.ie

over 11 years ago

In reply to: Dean Rasheed (#18)

Re: INSERT ... ON CONFLICT UPDATE and RLS

On Fri, Jan 9, 2015 at 12:19 AM, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:

I was trying to think up an example where you might actually have
different INSERT and UPDATE policies, and the best I can think of is
some sort of mod_count column where you have an INSERT CHECK
(mod_count = 0) and an UPDATE CHECK (mod_count > 0). In that case,
checking both policies would make an UPSERT impossible, whereas if you
think of it as doing either an INSERT or an UPDATE, as the syntax
suggests, it becomes possible.

Why does this user want to do this upsert? If they're upserting, then
the inserted row could only reasonably have a value of (mod_count =
0). If updating, then they must have a constant value for the update
path (a value that's greater than 0, naturally - say 2), which doesn't
make any sense in the context of an upsert's auxiliary update - what
happened to the 0 value? Sorry, but I don't think your example makes
sense - I can't see what would motivate anyone to write a query like
that with those RLS policies in place. It sounds like you're talking
about an insert and a separate update that may or may not affect the
same row, and not an upsert. Then those policies make sense, but in
practice they render the upsert you describe contradictory.

FWIW, I'm not suggesting that there couldn't possibly be a use case
that doesn't do well with this convention where we enforce RLS
deepening on the path taken. The cases are just very marginal, as I
think your difficulty in coming up with a convincing counter-argument
shows. I happen to think what Stephen and I favor ("bunching together"
USING() barrier quals and check options from INSERT and UPDATE
policies) is likely to be the best alternative available on balance.

More generally, you could point out that I'm actually testing
different tuples at different points in query processing under that
regime (e.g. the post-insert tuple, or the before-update conflicting,
existing tuple from the target, or the post update tuple) and so
things could fail when the update path is taken despite the fact that
they didn't fail when the insert path was taken. That's technically
true, of course, but with idiomatic usage it isn't true, and that's
what I care about.

Does anyone have another counter-example of a practical upsert
statement that cannot be used with certain RLS policies due to the
fact that we chose to "bunch together" INSERT and UPDATE RLS policies?
--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#20

Dean Rasheed

dean.a.rasheed@gmail.com

over 11 years ago

In reply to: Peter Geoghegan (#19)

Re: INSERT ... ON CONFLICT UPDATE and RLS

On 9 January 2015 at 08:49, Peter Geoghegan <pg@heroku.com> wrote:

On Fri, Jan 9, 2015 at 12:19 AM, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:

I was trying to think up an example where you might actually have
different INSERT and UPDATE policies, and the best I can think of is
some sort of mod_count column where you have an INSERT CHECK
(mod_count = 0) and an UPDATE CHECK (mod_count > 0). In that case,
checking both policies would make an UPSERT impossible, whereas if you
think of it as doing either an INSERT or an UPDATE, as the syntax
suggests, it becomes possible.

Why does this user want to do this upsert? If they're upserting, then
the inserted row could only reasonably have a value of (mod_count =
0). If updating, then they must have a constant value for the update
path (a value that's greater than 0, naturally - say 2), which doesn't
make any sense in the context of an upsert's auxiliary update - what
happened to the 0 value? Sorry, but I don't think your example makes
sense - I can't see what would motivate anyone to write a query like
that with those RLS policies in place. It sounds like you're talking
about an insert and a separate update that may or may not affect the
same row, and not an upsert. Then those policies make sense, but in
practice they render the upsert you describe contradictory.

Whoa, hang on. I think you're being a bit quick to dismiss that
example. Why shouldn't I want an upsert where the majority of the
table columns follow the usual "make it so" pattern of an upsert, but
there is also this kind of audit column to be maintained? Then I would
write something like

INSERT INTO tbl (<some values>, 0)
ON CONFLICT UPDATE SET <same values>, mod_count=mod_count+1;

The root of the problem is the way that you're proposing to combine
the RLS policies (using AND), which runs contrary to the way RLS
policies are usually combined (using OR), which is why this kind of
example fails -- RLS policies in general aren't intended to all be
true simultaneously.

Regards,
Dean

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#21

Stephen Frost

sfrost@snowman.net

over 11 years ago

In reply to: Dean Rasheed (#20)

#22

Peter Geoghegan

pg@bowt.ie

over 11 years ago

In reply to: Dean Rasheed (#20)

#23

Stephen Frost