pg_dump needs SELECT privileges on irrelevant extension table
Hi all,
We have a situation where we need to revoke SELECT on a table that
belongs to our extension, and we also need to let less privileged users
dump the extension's external config tables. (The restricted table's
contents are exposed through a security_barrier view, and it's a cloud
environment where "admin" users don't necessarily have true superuser
access.)
Since the restricted table is internal, its contents aren't included in
dumps anyway, so we expected to be able to meet both use cases at once.
Unfortunately:
$ pg_dump -U unprivileged_user -d postgres
pg_dump: error: query failed: ERROR: permission denied for relation
ext_table
pg_dump: error: query was: LOCK TABLE public.ext_table IN ACCESS
SHARE MODE
...and there appears to be no way to work around this with
--exclude-table, since the table is part of the extension.
It looks like the only reason pg_dump locks this particular table is
because it's been marked with DUMP_COMPONENT_POLICY, which needs a lock
to ensure the consistency of later pg_get_expr() calls. That stings for
two reasons: 1) it doesn't seem like you need SELECT access on a table
to see its policies, and 2) we have no policies on the table anyway;
there are no pg_get_expr() calls to protect.
So I've attached the simplest backportable workaround I could think of:
unmark DUMP_COMPONENT_POLICY for a table that has no policies at the
time of the getTables() query. This is similar to the ACL optimization
that back branches do; it should ensure that there will be no
pg_get_expr() calls on pg_policy for that table later, due to
repeatable-read, and it omits the lock when there's no reason to grab
it. It won't fix the problem for tables that have do policies, but I
don't have any ideas for how to do that safely, unless there's some lock
mode that uses fewer privileges.
I also attached a speculative backport to 11 to illustrate what that
might look like, but first I have to convince you it's a bug. :)
WDYT?
Thanks,
--Jacob
Attachments:
0001-pg_dump-skip-lock-for-extension-tables-without-polic.patchtext/x-patch; charset=UTF-8; name=0001-pg_dump-skip-lock-for-extension-tables-without-polic.patchDownload+20-1
0001-pg_dump-skip-lock-for-extension-tables-without-polic.REL_11_STABLE.patchtext/x-patch; charset=UTF-8; name=0001-pg_dump-skip-lock-for-extension-tables-without-polic.REL_11_STABLE.patchDownload+32-10
Jacob Champion <jchampion@timescale.com> writes:
We have a situation where we need to revoke SELECT on a table that
belongs to our extension, and we also need to let less privileged users
dump the extension's external config tables.
In general, we don't expect that random minimum-privilege users can do
a database-wide pg_dump, so I'm not entirely sure that I buy that this
is a case we should cater to. Why shouldn't your dump user have enough
privilege to take this lock?
I'd be more willing to consider the proposed patch if it weren't such
a hack --- as you say, it doesn't fix the problem when the table has
policies, so it's hardly a general-purpose solution. I fear that it's
also fairly expensive: adding sub-selects to the query we must do
before we can lock any tables is not appetizing, because making that
window wider adds to the risk of deadlocks, dump failures, etc.
regards, tom lane
On Mon, Mar 20, 2023 at 10:43 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
In general, we don't expect that random minimum-privilege users can do
a database-wide pg_dump, so I'm not entirely sure that I buy that this
is a case we should cater to.
They're neither random nor minimum-privilege -- it's the role with the
most privileges available to our end users. They just can't see the
contents of this table.
Why shouldn't your dump user have enough
privilege to take this lock?
The table contains information that's confidential to the superuser.
Other users access it through a view.
I'd be more willing to consider the proposed patch if it weren't such
a hack --- as you say, it doesn't fix the problem when the table has
policies, so it's hardly a general-purpose solution.
Right. Does a more general fix exist?
I fear that it's
also fairly expensive: adding sub-selects to the query we must do
before we can lock any tables is not appetizing, because making that
window wider adds to the risk of deadlocks, dump failures, etc.
I was hoping an EXISTS subselect would be cheap enough, but maybe I
don't have enough entries in pg_policy to see a slowdown. Any
suggestions on an order of magnitude so I can characterize it? Or
would you just like to know at what point I start seeing slower
behavior? (Alternatively: are there cheaper ways to write this query?)
Thanks!
--Jacob
On Mon, Mar 20, 2023 at 11:23 AM Jacob Champion <jchampion@timescale.com> wrote:
On Mon, Mar 20, 2023 at 10:43 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
I fear that it's
also fairly expensive: adding sub-selects to the query we must do
before we can lock any tables is not appetizing, because making that
window wider adds to the risk of deadlocks, dump failures, etc.I was hoping an EXISTS subselect would be cheap enough, but maybe I
don't have enough entries in pg_policy to see a slowdown. Any
suggestions on an order of magnitude so I can characterize it? Or
would you just like to know at what point I start seeing slower
behavior? (Alternatively: are there cheaper ways to write this query?)
As a smoke test, I have 10M policies spread across 100k tables on my
laptop (that is, 100 policies each). I also have 100k more empty
tables with no policies on them, to try to stress both sides of the
EXISTS. On PG11, the baseline query duration is roughly 20s; with the
patch, it increases to roughly 22s (~10% slowdown). Setup SQL
attached.
This appears to be tied to the number of policies more than the number
of tables; if I reduce it to "only" 1M policies, the slowdown drops to
~400ms (2%), and at 10k policies any difference is lost in noise. That
doesn't seem unreasonable to me, but I don't know what a worst-case
pg_policy catalog looks like.
--Jacob
Attachments:
On 3/20/23 10:43, Tom Lane wrote:
I'd be more willing to consider the proposed patch if it weren't such
a hack --- as you say, it doesn't fix the problem when the table has
policies, so it's hardly a general-purpose solution. I fear that it's
also fairly expensive: adding sub-selects to the query we must do
before we can lock any tables is not appetizing, because making that
window wider adds to the risk of deadlocks, dump failures, etc.
(moving to -hackers and July CF)
= Recap for Potential Reviewers =
The timescaledb extension has an internal table that's owned by the
superuser. It's not dumpable, and other users can only access its
contents through a filtered view. For our cloud deployments, we
additionally have that common trope where the most privileged users
aren't actually superusers, but we still want them to be able to perform
a subset of maintenance tasks, including pg_dumping their data.
When cloud users try to dump that data, pg_dump sees that this internal
table is an extension member and plans to dump ACLs, security labels,
and RLS policies for it. (This behavior cannot be overridden with
--exclude-table. pg_dump ignores that flag for extension members.)
Dumping policies requires the use of pg_get_expr() on the backend; this,
in turn, requires a lock on the table with ACCESS SHARE.
So pg_dump tries to lock a table, with no policies, that it's not going
to dump the schema or data for anyway, and it fails because our users
don't have (and shouldn't need) SELECT access to it. For an example of
this in action, I've attached a test case in v2-0001.
= Proposal =
Since this is affecting users on released Postgres versions, my end goal
is to find a fix that's backportable.
This situation looks very similar to [1]/messages/by-id/CAGPqQf3Uzo-yU1suYyoZR83h6QTxXxkGTtEyeMV7EAVBqn=PcQ@mail.gmail.com, where non-superusers couldn't
perform a dump because we were eagerly grabbing table locks to read
(non-existent) ACLs. But that was solved with the realization that ACLs
don't need locks anyway, which is unfortunately not applicable to policies.
My initial patch to -bugs was a riff on a related performance fix [2]https://git.postgresql.org/cgit/postgresql.git/commit/?id=5d589993,
which figured out which tables had interesting ACLs and skipped that
part if nothing was found. I added the same kind of subselect for RLS
policies as well, but that had nasty corner cases where it would perform
terribly, as Tom alluded to above. (In a cluster of 200k tables, where
one single table had 10M policies, the query ground to a halt.)
So v2-0002 is instead inspired by Tom's rewrite of that ACL dump logic
[3]: https://git.postgresql.org/cgit/postgresql.git/commit/?id=0c9d8442
catalog map on the client side, and then again skips the policy dump
(and therefore the lock) if no policies exist. The performance hit now
scales with the size of pg_policy alone.
This is a bit more invasive than the subselect, but hopefully still
straightforward enough to be applicable to the back branches' old
catalog map strategy. It's still not a general-purpose fix, as Tom
pointed out above, but that was true of the discussion in [1]/messages/by-id/CAGPqQf3Uzo-yU1suYyoZR83h6QTxXxkGTtEyeMV7EAVBqn=PcQ@mail.gmail.com as well,
so I'm optimistic.
WDYT?
--Jacob
[1]: /messages/by-id/CAGPqQf3Uzo-yU1suYyoZR83h6QTxXxkGTtEyeMV7EAVBqn=PcQ@mail.gmail.com
/messages/by-id/CAGPqQf3Uzo-yU1suYyoZR83h6QTxXxkGTtEyeMV7EAVBqn=PcQ@mail.gmail.com
[2]: https://git.postgresql.org/cgit/postgresql.git/commit/?id=5d589993
[3]: https://git.postgresql.org/cgit/postgresql.git/commit/?id=0c9d8442
Attachments:
v2-0002-pg_dump-skip-lock-for-extension-tables-without-po.patchtext/x-patch; charset=UTF-8; name=v2-0002-pg_dump-skip-lock-for-extension-tables-without-po.patchDownload+116-1
v2-0001-Add-failing-test-for-undumped-extension-table.patchtext/x-patch; charset=UTF-8; name=v2-0001-Add-failing-test-for-undumped-extension-table.patchDownload+29-1
The following review has been posted through the commitfest application:
make installcheck-world: tested, passed
Implements feature: tested, passed
Spec compliant: tested, passed
Documentation: not tested
Passes the default cases; Does not make any trivial changes to the codebase
On 2023-06-29 Th 12:24, Jacob Champion wrote:
On 3/20/23 10:43, Tom Lane wrote:
I'd be more willing to consider the proposed patch if it weren't such
a hack --- as you say, it doesn't fix the problem when the table has
policies, so it's hardly a general-purpose solution. I fear that it's
also fairly expensive: adding sub-selects to the query we must do
before we can lock any tables is not appetizing, because making that
window wider adds to the risk of deadlocks, dump failures, etc.(moving to -hackers and July CF)
= Recap for Potential Reviewers =
The timescaledb extension has an internal table that's owned by the
superuser. It's not dumpable, and other users can only access its
contents through a filtered view. For our cloud deployments, we
additionally have that common trope where the most privileged users
aren't actually superusers, but we still want them to be able to perform
a subset of maintenance tasks, including pg_dumping their data.When cloud users try to dump that data, pg_dump sees that this internal
table is an extension member and plans to dump ACLs, security labels,
and RLS policies for it. (This behavior cannot be overridden with
--exclude-table. pg_dump ignores that flag for extension members.)
Dumping policies requires the use of pg_get_expr() on the backend; this,
in turn, requires a lock on the table with ACCESS SHARE.So pg_dump tries to lock a table, with no policies, that it's not going
to dump the schema or data for anyway, and it fails because our users
don't have (and shouldn't need) SELECT access to it. For an example of
this in action, I've attached a test case in v2-0001.= Proposal =
Since this is affecting users on released Postgres versions, my end goal
is to find a fix that's backportable.This situation looks very similar to [1], where non-superusers couldn't
perform a dump because we were eagerly grabbing table locks to read
(non-existent) ACLs. But that was solved with the realization that ACLs
don't need locks anyway, which is unfortunately not applicable to policies.My initial patch to -bugs was a riff on a related performance fix [2],
which figured out which tables had interesting ACLs and skipped that
part if nothing was found. I added the same kind of subselect for RLS
policies as well, but that had nasty corner cases where it would perform
terribly, as Tom alluded to above. (In a cluster of 200k tables, where
one single table had 10M policies, the query ground to a halt.)So v2-0002 is instead inspired by Tom's rewrite of that ACL dump logic
[3]. It scans pg_policy separately, stores the tables it finds into the
catalog map on the client side, and then again skips the policy dump
(and therefore the lock) if no policies exist. The performance hit now
scales with the size of pg_policy alone.This is a bit more invasive than the subselect, but hopefully still
straightforward enough to be applicable to the back branches' old
catalog map strategy. It's still not a general-purpose fix, as Tom
pointed out above, but that was true of the discussion in [1] as well,
so I'm optimistic.WDYT?
--Jacob
[1]
/messages/by-id/CAGPqQf3Uzo-yU1suYyoZR83h6QTxXxkGTtEyeMV7EAVBqn=PcQ@mail.gmail.com
[2]https://git.postgresql.org/cgit/postgresql.git/commit/?id=5d589993
[3]https://git.postgresql.org/cgit/postgresql.git/commit/?id=0c9d8442
Seems reasonable at first glance. Isn't it going to save some work
anyway later on, so the performance hit could end up negative?
cheers
andrew
--
Andrew Dunstan
EDB:https://www.enterprisedb.com
Thanks for the reviews, both of you!
On Fri, Jul 14, 2023 at 5:04 AM Andrew Dunstan <andrew@dunslane.net> wrote:
Seems reasonable at first glance. Isn't it going to save some work anyway later on, so the performance hit could end up negative?
Theoretically it could, if the OID list sent during getPolicies()
shrinks enough. I tried a quick test against the regression database,
but it's too noisy on my machine to know whether the difference is
really meaningful.
--Jacob
Hi all,
v3 fixes a doc comment I forgot to fill in; there are no other code
changes. To try to further reduce the activation energy, I've also
attached an attempt at a backport to 11. The main difference is the
absence of catalogIdHash, which showed up in 15, so we don't get the
benefit of that deduplication.
Thanks,
--Jacob
Attachments:
since-v2.diff.txttext/plain; charset=US-ASCII; name=since-v2.diff.txtDownload
v3-0001-Add-failing-test-for-undumped-extension-table.patchtext/x-patch; charset=US-ASCII; name=v3-0001-Add-failing-test-for-undumped-extension-table.patchDownload+29-1
v3-0002-pg_dump-skip-lock-for-extension-tables-without-po.patchtext/x-patch; charset=US-ASCII; name=v3-0002-pg_dump-skip-lock-for-extension-tables-without-po.patchDownload+117-1
backport-11-0001.patch.txttext/plain; charset=US-ASCII; name=backport-11-0001.patch.txtDownload+47-19
backport-11-0002.patch.txttext/plain; charset=US-ASCII; name=backport-11-0002.patch.txtDownload+161-2
Jacob Champion <jchampion@timescale.com> writes:
v3 fixes a doc comment I forgot to fill in; there are no other code
changes. To try to further reduce the activation energy, I've also
attached an attempt at a backport to 11. The main difference is the
absence of catalogIdHash, which showed up in 15, so we don't get the
benefit of that deduplication.
So ... I still do not like anything about this patch. Putting
has_policies into CatalogIdMapEntry isn't a wart, it's more
nearly a tumor. Running getTablesWithPolicies before we can
acquire locks is horrid from the standpoint of minimizing the
window between our transaction snapshot and successful acquisition
of all needed locks. (It might be all right in databases with
few pg_policy entries, but I don't think we can assume that that
holds everywhere.) And the whole thing is just ugly and solves
the problem only partially.
What I am wondering about is whether we shouldn't just undo what
checkExtensionMembership does, specifically:
/*
* In 9.6 and above, mark the member object to have any non-initial ACL,
* policies, and security labels dumped.
*
* Note that any initial ACLs (see pg_init_privs) will be removed when we
* extract the information about the object. We don't provide support for
* initial policies and security labels and it seems unlikely for those to
* ever exist, but we may have to revisit this later.
*
* ...
*/
dobj->dump = ext->dobj.dump_contains & (DUMP_COMPONENT_ACL |
DUMP_COMPONENT_SECLABEL |
DUMP_COMPONENT_POLICY);
Why are we marking extension member objects as being subject to SECLABEL
or POLICY dumping? As the comment notes, that isn't really sensible
unless what we are dumping is a delta from the extension's initial
assignments. But we have no infrastructure for that, and none seems
likely to appear in the near future.
Could we not fix this by just reducing the above to
dobj->dump = ext->dobj.dump_contains & (DUMP_COMPONENT_ACL);
When and if someone comes along and implements storage of extensions'
initial policies, they can figure out how to avoid fetching policies for
not-to-be-dumped tables. (My thoughts would run towards adding a column
to pg_class to help detect whether such work is needed without doing
expensive additional queries.) But I don't see why we require a solution
to that problem as things stand.
regards, tom lane
I wrote:
Why are we marking extension member objects as being subject to SECLABEL
or POLICY dumping? As the comment notes, that isn't really sensible
unless what we are dumping is a delta from the extension's initial
assignments. But we have no infrastructure for that, and none seems
likely to appear in the near future.
Here's a quick patch that does it that way. The test changes
are identical to Jacob's v3-0001.
regards, tom lane
Attachments:
v4-0001-dont-try-to-dump-extension-RLS-policies.patchtext/x-diff; charset=us-ascii; name=v4-0001-dont-try-to-dump-extension-RLS-policies.patchDownload+41-10
Greetings,
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
I wrote:
Why are we marking extension member objects as being subject to SECLABEL
or POLICY dumping? As the comment notes, that isn't really sensible
unless what we are dumping is a delta from the extension's initial
assignments. But we have no infrastructure for that, and none seems
likely to appear in the near future.Here's a quick patch that does it that way. The test changes
are identical to Jacob's v3-0001.
What the comment is talking about is that we don't support initial
policies, not that we don't support policies on extension tables at all.
That said ... even the claim that we don't support such policies isn't
supported by code and there are people out there doing it, which creates
its own set of problems (ones we should really try to find solutions to
though..).
This change would mean that policies added by a user after the extension
is created would just be lost by a pg_dump/reload, doesn't it?
Thanks,
Stephen
Stephen Frost <sfrost@snowman.net> writes:
Greetings,
* Tom Lane (tgl@sss.pgh.pa.us) wrote:I wrote:
Why are we marking extension member objects as being subject to SECLABEL
or POLICY dumping?
This change would mean that policies added by a user after the extension
is created would just be lost by a pg_dump/reload, doesn't it?
Yes. But I'd say that's unsupported, just like making other ad-hoc
changes to extension objects is unsupported (and the effects will be
lost on dump/reload). We specifically have support for user-added
ACLs, and that's good, but don't claim that we have support for
doing the same with policies.
As far as I can see, the current behavior is that we'll dump and
try to reload policies (and seclabels) on extension objects even
if those properties were set by the extension creation script.
That has many more problems than just the one Jacob is moaning
about: you'll see failures at reload if you're not superuser,
and if the destination installation has a newer version of the
extension than what was dumped, the old properties might be
completely inappropriate. So IMO there's basically nothing
that works properly about this. To make it work, we'd need
infrastructure comparable to the pg_init_privs infrastructure.
regards, tom lane
On Wed, Oct 18, 2023 at 1:25 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Stephen Frost <sfrost@snowman.net> writes:
This change would mean that policies added by a user after the extension
is created would just be lost by a pg_dump/reload, doesn't it?Yes. But I'd say that's unsupported, just like making other ad-hoc
changes to extension objects is unsupported (and the effects will be
lost on dump/reload). We specifically have support for user-added
ACLs, and that's good, but don't claim that we have support for
doing the same with policies.
Is this approach backportable?
(Adding Aleks to CC -- Timescale may want to double-check that the new
proposal still works for them.)
Thanks,
--Jacob
Jacob Champion <champion.p@gmail.com> writes:
Is this approach backportable?
The code fix would surely work in the back branches. Whether the
behavioral change is too big to be acceptable in minor releases
is something I don't have a strong opinion on.
regards, tom lane
I wrote:
Jacob Champion <champion.p@gmail.com> writes:
Is this approach backportable?
The code fix would surely work in the back branches. Whether the
behavioral change is too big to be acceptable in minor releases
is something I don't have a strong opinion on.
I'm hearing nothing but crickets :-(
If nobody objects by say Monday, I'm going to go ahead and
commit (and backpatch) the patch I posted at [1]/messages/by-id/2984517.1697577076@sss.pgh.pa.us.
regards, tom lane
On Thu, Nov 9, 2023 at 11:02 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
I'm hearing nothing but crickets :-(
Yeah :/
Based on your arguments above, it sounds like your patch may improve
several other corner cases when backported, so that sounds good
overall to me. My best guess is that Timescale will be happy with this
patch's approach. But I can't speak with any authority.
Aleks -- anything to add?
--Jacob