DROP OWNED BY fails to clean out pg_init_privs grants
I wondered why buildfarm member copperhead has started to fail
xversion-upgrade-HEAD-HEAD tests. I soon reproduced the problem here:
pg_restore: creating ACL "regress_pg_dump_schema.TYPE "test_type""
pg_restore: while PROCESSING TOC:
pg_restore: from TOC entry 4355; 0 0 ACL TYPE "test_type" buildfarm
pg_restore: error: could not execute query: ERROR: role "74603" does not exist
Command was: SELECT pg_catalog.binary_upgrade_set_record_init_privs(true);
GRANT ALL ON TYPE "regress_pg_dump_schema"."test_type" TO "74603";
SELECT pg_catalog.binary_upgrade_set_record_init_privs(false);
REVOKE ALL ON TYPE "regress_pg_dump_schema"."test_type" FROM "74603";
(So now I'm wondering why *only* copperhead has shown this so far.
Are our other cross-version-upgrade testing animals AWOL?)
I believe this is a longstanding problem that was exposed by accident
by commit 936e3fa37. If you run "make installcheck" in HEAD's
src/test/modules/test_pg_dump, and then poke around in the leftover
contrib_regression database, you can find dangling grants in
pg_init_privs:
contrib_regression=# table pg_init_privs;
objoid | classoid | objsubid | privtype | initprivs
--------+----------+----------+----------+--------------------------------------
---------------------------
...
es}
43134 | 1259 | 0 | e | {postgres=rwU/postgres,43125=U/postgr
es}
43128 | 1259 | 0 | e | {postgres=arwdDxtm/postgres,43125=r/p
ostgres}
...
The fact that the DROP ROLE added by 936e3fa37 succeeded indicates
that these role references weren't captured in pg_shdepend.
I imagine that we also lack code that would allow DROP OWNED BY to
follow up on such entries if they existed, but I've not checked that
for sure. In any case, there's probably a nontrivial amount of code
to be written to make this work.
Given the lack of field complaints, I suspect that extension scripts
simply don't grant privileges to random roles that aren't the
extension's owner. So I wonder a little bit if this is even worth
fixing, as opposed to blocking off somehow. But probably we should
first try to fix it.
I doubt this is something we'll have fixed by Monday, so I will
go add an open item for it.
regards, tom lane
On Fri, Apr 05, 2024 at 07:10:59PM -0400, Tom Lane wrote:
I wondered why buildfarm member copperhead has started to fail
xversion-upgrade-HEAD-HEAD tests. I soon reproduced the problem here:pg_restore: creating ACL "regress_pg_dump_schema.TYPE "test_type""
pg_restore: while PROCESSING TOC:
pg_restore: from TOC entry 4355; 0 0 ACL TYPE "test_type" buildfarm
pg_restore: error: could not execute query: ERROR: role "74603" does not exist
Command was: SELECT pg_catalog.binary_upgrade_set_record_init_privs(true);
GRANT ALL ON TYPE "regress_pg_dump_schema"."test_type" TO "74603";
SELECT pg_catalog.binary_upgrade_set_record_init_privs(false);
REVOKE ALL ON TYPE "regress_pg_dump_schema"."test_type" FROM "74603";(So now I'm wondering why *only* copperhead has shown this so far.
Are our other cross-version-upgrade testing animals AWOL?)I believe this is a longstanding problem that was exposed by accident
by commit 936e3fa37. If you run "make installcheck" in HEAD's
src/test/modules/test_pg_dump, and then poke around in the leftover
contrib_regression database, you can find dangling grants in
pg_init_privs:contrib_regression=# table pg_init_privs;
objoid | classoid | objsubid | privtype | initprivs--------+----------+----------+----------+--------------------------------------
---------------------------
...
es}
43134 | 1259 | 0 | e | {postgres=rwU/postgres,43125=U/postgr
es}
43128 | 1259 | 0 | e | {postgres=arwdDxtm/postgres,43125=r/p
ostgres}
...The fact that the DROP ROLE added by 936e3fa37 succeeded indicates
that these role references weren't captured in pg_shdepend.
I imagine that we also lack code that would allow DROP OWNED BY to
follow up on such entries if they existed, but I've not checked that
for sure. In any case, there's probably a nontrivial amount of code
to be written to make this work.Given the lack of field complaints, I suspect that extension scripts
simply don't grant privileges to random roles that aren't the
extension's owner. So I wonder a little bit if this is even worth
fixing, as opposed to blocking off somehow. But probably we should
first try to fix it.
This sounds closely-related to the following thread:
/messages/by-id/1573808483712.96817@Optiver.com
Noah Misch <noah@leadboat.com> writes:
On Fri, Apr 05, 2024 at 07:10:59PM -0400, Tom Lane wrote:
The fact that the DROP ROLE added by 936e3fa37 succeeded indicates
that these role references weren't captured in pg_shdepend.
I imagine that we also lack code that would allow DROP OWNED BY to
follow up on such entries if they existed, but I've not checked that
for sure. In any case, there's probably a nontrivial amount of code
to be written to make this work.Given the lack of field complaints, I suspect that extension scripts
simply don't grant privileges to random roles that aren't the
extension's owner. So I wonder a little bit if this is even worth
fixing, as opposed to blocking off somehow. But probably we should
first try to fix it.
This sounds closely-related to the following thread:
/messages/by-id/1573808483712.96817@Optiver.com
Oh, interesting, I'd forgotten that thread completely.
So Stephen was pushing back against dealing with the case because
he thought that the SQL commands issued in that example should not
have produced pg_init_privs entries in the first place. Which nobody
else wanted to opine on, so the thread stalled. However, in the case
of the test_pg_dump extension, the test_pg_dump--1.0.sql script
absolutely did grant those privileges so it's very hard for me to
think that they shouldn't be listed in pg_init_privs. Hence, I think
we've accidentally stumbled across a case where we do need all that
mechanism --- unless somebody wants to argue that what
test_pg_dump--1.0.sql is doing should be disallowed.
regards, tom lane
On 6 Apr 2024, at 01:10, Tom Lane <tgl@sss.pgh.pa.us> wrote:
(So now I'm wondering why *only* copperhead has shown this so far.
Are our other cross-version-upgrade testing animals AWOL?)
Clicking around searching for Xversion animals I didn't spot any, but without
access to the database it's nontrivial to know which animal does what.
I doubt this is something we'll have fixed by Monday, so I will
go add an open item for it.
+1. Having opened the can of worms I'll have a look at it next week.
--
Daniel Gustafsson
Daniel Gustafsson <daniel@yesql.se> writes:
On 6 Apr 2024, at 01:10, Tom Lane <tgl@sss.pgh.pa.us> wrote:
(So now I'm wondering why *only* copperhead has shown this so far.
Are our other cross-version-upgrade testing animals AWOL?)
Clicking around searching for Xversion animals I didn't spot any, but without
access to the database it's nontrivial to know which animal does what.
I believe I see why this is (or isn't) happening. The animals
currently running xversion tests are copperhead, crake, drongo,
and fairywren. copperhead is using the makefiles while the others
are using meson. And I find this in
src/test/modules/test_pg_dump/meson.build (from 3f0e786cc):
# doesn't delete its user
'runningcheck': false,
So the meson animals are not running the test that sets up the
problematic data.
I think we should remove the above, since (a) the reason to have
it is gone, and (b) it seems really bad that the set of tests
run by meson is different from that run by the makefiles.
However, once we do that, those other three animals will presumably go
red, greatly complicating detection of any Windows-specific problems.
So I'm inclined to not do it till just before we intend to commit
a fix for the underlying problem. (Enough before that we can confirm
that they do go red.)
Speaking of which ...
I doubt this is something we'll have fixed by Monday, so I will
go add an open item for it.
+1. Having opened the can of worms I'll have a look at it next week.
... were you going to look at it? I can take a whack if it's
too far down your priority list.
regards, tom lane
On 21 Apr 2024, at 23:08, Tom Lane <tgl@sss.pgh.pa.us> wrote:
... were you going to look at it? I can take a whack if it's too far down your priority list.
Yeah, I’m working on a patchset right now.
./daniel
On 21 Apr 2024, at 23:08, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Daniel Gustafsson <daniel@yesql.se> writes:
On 6 Apr 2024, at 01:10, Tom Lane <tgl@sss.pgh.pa.us> wrote:
(So now I'm wondering why *only* copperhead has shown this so far.
Are our other cross-version-upgrade testing animals AWOL?)Clicking around searching for Xversion animals I didn't spot any, but without
access to the database it's nontrivial to know which animal does what.I believe I see why this is (or isn't) happening. The animals
currently running xversion tests are copperhead, crake, drongo,
and fairywren. copperhead is using the makefiles while the others
are using meson. And I find this in
src/test/modules/test_pg_dump/meson.build (from 3f0e786cc):# doesn't delete its user
'runningcheck': false,So the meson animals are not running the test that sets up the
problematic data.
ugh =/
I think we should remove the above, since (a) the reason to have
it is gone, and (b) it seems really bad that the set of tests
run by meson is different from that run by the makefiles.
Agreed.
However, once we do that, those other three animals will presumably go
red, greatly complicating detection of any Windows-specific problems.
So I'm inclined to not do it till just before we intend to commit
a fix for the underlying problem. (Enough before that we can confirm
that they do go red.)
Agreed, we definitely want that but compromising the ability to find Windows
issues at this point in the cycle seems bad.
... were you going to look at it? I can take a whack if it's
too far down your priority list.
I took a look at this, reading code and the linked thread. My gut feeling is
that Stephen is right in that the underlying bug is these privileges ending up
in pg_init_privs to begin with. That being said, I wasn't able to fix that in
a way that doesn't seem like a terrible hack. The attached POC hack fixes it
for me but I'm not sure how to fix it properly. Your wisdom would be much appreciated.
Clusters which already has such entries aren't helped by a fix for this though,
fixing that would either require pg_dump to skip them, or pg_upgrade to have a
check along with instructions for fixing the issue. Not sure what's the best
strategy here, the lack of complaints could indicate this isn't terribly common
so spending cycles on it for every pg_dump might be excessive compared to a
pg_upgrade check?
--
Daniel Gustafsson
Attachments:
init_privs.diffapplication/octet-stream; name=init_privs.diff; x-unix-mode=0644Download+72-8
Daniel Gustafsson <daniel@yesql.se> writes:
On 21 Apr 2024, at 23:08, Tom Lane <tgl@sss.pgh.pa.us> wrote:
So the meson animals are not running the test that sets up the
problematic data.
I took a look at this, reading code and the linked thread. My gut feeling is
that Stephen is right in that the underlying bug is these privileges ending up
in pg_init_privs to begin with. That being said, I wasn't able to fix that in
a way that doesn't seem like a terrible hack.
Hmm, can't we put the duplicate logic inside recordExtensionInitPriv?
Even if these calls need a different result from others, adding a flag
parameter seems superior to having N copies of the logic.
A bigger problem though is that I think you are addressing the
original complaint from the older thread, which while it's a fine
thing to fix seems orthogonal to the failure we're seeing in the
buildfarm. The buildfarm's problem is not that we're recording
incorrect pg_init_privs entries, it's that when we do create such
entries we're failing to show their dependency on the grantee role
in pg_shdepend. We've missed spotting that so far because it's
so seldom that pg_init_privs entries reference any but built-in
roles (or at least roles that'd likely outlive the extension).
regards, tom lane
I wrote:
A bigger problem though is that I think you are addressing the
original complaint from the older thread, which while it's a fine
thing to fix seems orthogonal to the failure we're seeing in the
buildfarm. The buildfarm's problem is not that we're recording
incorrect pg_init_privs entries, it's that when we do create such
entries we're failing to show their dependency on the grantee role
in pg_shdepend. We've missed spotting that so far because it's
so seldom that pg_init_privs entries reference any but built-in
roles (or at least roles that'd likely outlive the extension).
Here's a draft patch that attacks that. It seems to fix the
problem with test_pg_dump: no dangling pg_init_privs grants
are left behind.
A lot of the changes here are just involved with needing to pass the
object's owner OID to recordExtensionInitPriv so that it can be passed
to updateAclDependencies. One thing I'm a bit worried about is that
some of the new code assumes that all object types that are of
interest here will have catcaches on OID, so that it's possible to
fetch the owner OID for a generic object-with-privileges using the
catcache and objectaddress.c's tables of object properties. That
assumption seems to exist already, eg ExecGrant_common also assumes
it, but it's not obvious that it must be so.
regards, tom lane
Attachments:
initprivs-dependency-fix.patchtext/x-diff; charset=us-ascii; name=initprivs-dependency-fix.patchDownload+315-21
I wrote:
Here's a draft patch that attacks that. It seems to fix the
problem with test_pg_dump: no dangling pg_init_privs grants
are left behind.
Here's a v2 that attempts to add some queries to test_pg_dump.sql
to provide visual verification that pg_shdepend and pg_init_privs
are updated correctly during DROP OWNED BY. It's a little bit
nasty to look at the ACL column of pg_init_privs, because that text
involves the bootstrap superuser's name which is site-dependent.
What I did to try to make the test stable is
replace(initprivs::text, current_user, 'postgres') AS initprivs
This is of course not bulletproof: with a sufficiently weird
bootstrap superuser name, we could get false matches to parts
of "regress_dump_test_role" or to privilege strings. That
seems unlikely enough to live with, but I wonder if anybody has
a better idea.
regards, tom lane
Attachments:
v2-initprivs-dependency-fix.patchtext/x-diff; charset=us-ascii; name=v2-initprivs-dependency-fix.patchDownload+418-21
On 28 Apr 2024, at 20:52, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I wrote:
Here's a draft patch that attacks that. It seems to fix the
problem with test_pg_dump: no dangling pg_init_privs grants
are left behind.
Reading this I can't find any sharp edges, and I prefer your changes to
recordExtensionInitPriv over the version I had half-baked by the time you had
this finished. Trying to break it with the testcases I had devised also
failed, so +1.
Here's a v2 that attempts to add some queries to test_pg_dump.sql
to provide visual verification that pg_shdepend and pg_init_privs
are updated correctly during DROP OWNED BY. It's a little bit
nasty to look at the ACL column of pg_init_privs, because that text
involves the bootstrap superuser's name which is site-dependent.
What I did to try to make the test stable isreplace(initprivs::text, current_user, 'postgres') AS initprivs
Maybe that part warrants a small comment in the testfile to keep it from
sending future readers into rabbitholes?
This is of course not bulletproof: with a sufficiently weird
bootstrap superuser name, we could get false matches to parts
of "regress_dump_test_role" or to privilege strings. That
seems unlikely enough to live with, but I wonder if anybody has
a better idea.
I think that will be bulletproof enough to keep it working in the buildfarm and
among 99% of hackers.
--
Daniel Gustafsson
Daniel Gustafsson <daniel@yesql.se> writes:
On 28 Apr 2024, at 20:52, Tom Lane <tgl@sss.pgh.pa.us> wrote:
... It's a little bit
nasty to look at the ACL column of pg_init_privs, because that text
involves the bootstrap superuser's name which is site-dependent.
What I did to try to make the test stable is
replace(initprivs::text, current_user, 'postgres') AS initprivs
Maybe that part warrants a small comment in the testfile to keep it from
sending future readers into rabbitholes?
Agreed.
This is of course not bulletproof: with a sufficiently weird
bootstrap superuser name, we could get false matches to parts
of "regress_dump_test_role" or to privilege strings. That
seems unlikely enough to live with, but I wonder if anybody has
a better idea.
I think that will be bulletproof enough to keep it working in the buildfarm and
among 99% of hackers.
It occurred to me to use "aclexplode" to expand the initprivs, and
then we can substitute names with simple equality tests. The test
query is a bit more complicated, but I feel better about it.
v3 attached also has a bit more work on code comments.
regards, tom lane
Attachments:
v3-initprivs-dependency-fix.patchtext/x-diff; charset=us-ascii; name=v3-initprivs-dependency-fix.patchDownload+548-22
On 29 Apr 2024, at 21:15, Tom Lane <tgl@sss.pgh.pa.us> wrote:
It occurred to me to use "aclexplode" to expand the initprivs, and
then we can substitute names with simple equality tests. The test
query is a bit more complicated, but I feel better about it.
Nice, I didn't even remember that function existed. I agree that it's an
improvement even at the increased query complexity.
v3 attached also has a bit more work on code comments.
LGTM.
--
Daniel Gustafsson
Daniel Gustafsson <daniel@yesql.se> writes:
On 29 Apr 2024, at 21:15, Tom Lane <tgl@sss.pgh.pa.us> wrote:
v3 attached also has a bit more work on code comments.
LGTM.
Pushed, thanks for reviewing!
regards, tom lane
I wrote:
Pushed, thanks for reviewing!
Argh, I forgot I'd meant to push b0c5b215d first not second.
Oh well, it was only neatnik-ism that made me want to see
those other animals fail --- and a lot of the buildfarm is
red right now for $other_reasons anyway.
regards, tom lane
On Monday, April 29, 2024, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Daniel Gustafsson <daniel@yesql.se> writes:
On 28 Apr 2024, at 20:52, Tom Lane <tgl@sss.pgh.pa.us> wrote:
This is of course not bulletproof: with a sufficiently weird
bootstrap superuser name, we could get false matches to parts
of "regress_dump_test_role" or to privilege strings. That
seems unlikely enough to live with, but I wonder if anybody has
a better idea.I think that will be bulletproof enough to keep it working in the
buildfarm and
among 99% of hackers.
It occurred to me to use "aclexplode" to expand the initprivs, and
then we can substitute names with simple equality tests. The test
query is a bit more complicated, but I feel better about it.
My solution to this was to rely on the fact that the bootstrap superuser is
assigned OID 10 regardless of its name.
David J.
"David G. Johnston" <david.g.johnston@gmail.com> writes:
My solution to this was to rely on the fact that the bootstrap superuser is
assigned OID 10 regardless of its name.
Yeah, I wrote it that way to start with too, but reconsidered
because
(1) I don't like hard-coding numeric OIDs. We can avoid that in C
code but it's harder to do in SQL.
(2) It's not clear to me that this test couldn't be run by a
non-bootstrap superuser. I think "current_user" is actually
the correct thing for the role executing the test.
regards, tom lane
On Monday, April 29, 2024, Tom Lane <tgl@sss.pgh.pa.us> wrote:
"David G. Johnston" <david.g.johnston@gmail.com> writes:
My solution to this was to rely on the fact that the bootstrap superuser
is
assigned OID 10 regardless of its name.
Yeah, I wrote it that way to start with too, but reconsidered
because(1) I don't like hard-coding numeric OIDs. We can avoid that in C
code but it's harder to do in SQL.
If the tests don’t involve, e.g., the predefined role pg_monitor and its
grantor of the memberships in the other predefined roles, this indeed can
be avoided. So I think my test still needs to check for 10 even if some
other superuser is allowed to produce the test output since a key output in
my case was the bootstrap superuser and the initdb roles.
(2) It's not clear to me that this test couldn't be run by a
non-bootstrap superuser. I think "current_user" is actually
the correct thing for the role executing the test.
Agreed, testing against current_role is correct if the things being queried
were created while executing the test. I would need to do this as well to
remove the current requirement that my tests be run by the bootstrap
superuser.
David J.
While the 'DROP OWNED BY fails to clean out pg_init_privs grants'
issue is now fixed,we have a similar issue with REASSIGN OWNED BY that
is still there:
Tested on fresh git checkout om May 20th
test=# create user privtestuser superuser;
CREATE ROLE
test=# set role privtestuser;
SET
test=# create extension pg_stat_statements ;
CREATE EXTENSION
test=# select * from pg_init_privs where privtype ='e';
objoid | classoid | objsubid | privtype |
initprivs
--------+----------+----------+----------+------------------------------------------------------
16405 | 1259 | 0 | e |
{privtestuser=arwdDxtm/privtestuser,=r/privtestuser}
16422 | 1259 | 0 | e |
{privtestuser=arwdDxtm/privtestuser,=r/privtestuser}
16427 | 1255 | 0 | e | {privtestuser=X/privtestuser}
(3 rows)
test=# reset role;
RESET
test=# reassign owned by privtestuser to hannuk;
REASSIGN OWNED
test=# select * from pg_init_privs where privtype ='e';
objoid | classoid | objsubid | privtype |
initprivs
--------+----------+----------+----------+------------------------------------------------------
16405 | 1259 | 0 | e |
{privtestuser=arwdDxtm/privtestuser,=r/privtestuser}
16422 | 1259 | 0 | e |
{privtestuser=arwdDxtm/privtestuser,=r/privtestuser}
16427 | 1255 | 0 | e | {privtestuser=X/privtestuser}
(3 rows)
test=# drop user privtestuser ;
DROP ROLE
test=# select * from pg_init_privs where privtype ='e';
objoid | classoid | objsubid | privtype | initprivs
--------+----------+----------+----------+---------------------------------
16405 | 1259 | 0 | e | {16390=arwdDxtm/16390,=r/16390}
16422 | 1259 | 0 | e | {16390=arwdDxtm/16390,=r/16390}
16427 | 1255 | 0 | e | {16390=X/16390}
(3 rows)
This will cause pg_dump to produce something that cant be loaded back
into the database:
CREATE EXTENSION IF NOT EXISTS pg_stat_statements WITH SCHEMA public;
...
REVOKE ALL ON TABLE public.pg_stat_statements FROM "16390";
...
And this will, among other things, break pg_upgrade.
-----
Hannu
On Tue, Apr 30, 2024 at 6:40 AM David G. Johnston
<david.g.johnston@gmail.com> wrote:
Show quoted text
On Monday, April 29, 2024, Tom Lane <tgl@sss.pgh.pa.us> wrote:
"David G. Johnston" <david.g.johnston@gmail.com> writes:
My solution to this was to rely on the fact that the bootstrap superuser is
assigned OID 10 regardless of its name.Yeah, I wrote it that way to start with too, but reconsidered
because(1) I don't like hard-coding numeric OIDs. We can avoid that in C
code but it's harder to do in SQL.If the tests don’t involve, e.g., the predefined role pg_monitor and its grantor of the memberships in the other predefined roles, this indeed can be avoided. So I think my test still needs to check for 10 even if some other superuser is allowed to produce the test output since a key output in my case was the bootstrap superuser and the initdb roles.
(2) It's not clear to me that this test couldn't be run by a
non-bootstrap superuser. I think "current_user" is actually
the correct thing for the role executing the test.Agreed, testing against current_role is correct if the things being queried were created while executing the test. I would need to do this as well to remove the current requirement that my tests be run by the bootstrap superuser.
David J.
Hannu Krosing <hannuk@google.com> writes:
While the 'DROP OWNED BY fails to clean out pg_init_privs grants'
issue is now fixed,we have a similar issue with REASSIGN OWNED BY that
is still there:
Ugh, how embarrassing. I'll take a look tomorrow, if no one
beats me to it.
regards, tom lane