Possible regression setting GUCs on \connect

[1]: https://github.com/pgaudit/pgaudit/compare/dev-pg16-ci#diff-db4bb73982787fa1d07d4c9d80bc54028b8d2a52b80806f1352a42c42f00eaaaR604
https://github.com/pgaudit/pgaudit/compare/dev-pg16-ci#diff-db4bb73982787fa1d07d4c9d80bc54028b8d2a52b80806f1352a42c42f00eaaaR604

david@pgmasters.net

almost 3 years ago

Hackers,

I have been updating pgAudit for PG16 and ran into the following issue
in the regression tests:

\connect - user1
WARNING: permission denied to set parameter "pgaudit.log_level"

This happens after switching back and forth a few times between the
current user when the regression script was executed and user1 which is
created in the script. Specifically, it happens at [1]https://github.com/pgaudit/pgaudit/compare/dev-pg16-ci#diff-db4bb73982787fa1d07d4c9d80bc54028b8d2a52b80806f1352a42c42f00eaaaR604.

I have tracked the issue down to context == PGC_USERSET for case
PGC_SUSET in set_config_option_ext(). This GUC is PGC_SUSET so it seems
like once it is set reloading it should not be an issue.

If the GUC is set again immediately before the \connect then there is no
error, so it looks like the correct context is being lost somewhere
along the way.

Before I get into serious debugging on this issue, I thought it would be
good to bring it up in case the answer is obvious to someone else.

Thanks,
-David

tgl@sss.pgh.pa.us

almost 3 years ago

In reply to: David Steele (#1)

Re: Possible regression setting GUCs on \connect

David Steele <david@pgmasters.net> writes:

I have been updating pgAudit for PG16 and ran into the following issue
in the regression tests:
\connect - user1
WARNING: permission denied to set parameter "pgaudit.log_level"
This happens after switching back and forth a few times between the
current user when the regression script was executed and user1 which is
created in the script. Specifically, it happens at [1].

If this is new in v16, I'm inclined to blame 096dd80f3, but that's
just a guess without a test case.

regards, tom lane

[1]: https://github.com/pgaudit/pgaudit/tree/dev-pg16-ci

david@pgmasters.net

almost 3 years ago

In reply to: Tom Lane (#2)

Re: Possible regression setting GUCs on \connect

On 4/27/23 19:13, Tom Lane wrote:

David Steele <david@pgmasters.net> writes:

I have been updating pgAudit for PG16 and ran into the following issue
in the regression tests:
\connect - user1
WARNING: permission denied to set parameter "pgaudit.log_level"
This happens after switching back and forth a few times between the
current user when the regression script was executed and user1 which is
created in the script. Specifically, it happens at [1].

If this is new in v16, I'm inclined to blame 096dd80f3, but that's
just a guess without a test case.

Seems plausible. This can be reproduced by cloning [1]https://github.com/pgaudit/pgaudit/tree/dev-pg16-ci into contrib and
running `make check`. I can work out another test case but it may not
end up being simpler.

Thoughts on this Alexander?

tgl@sss.pgh.pa.us

almost 3 years ago

In reply to: David Steele (#3)

Re: Possible regression setting GUCs on \connect

David Steele <david@pgmasters.net> writes:

Seems plausible. This can be reproduced by cloning [1] into contrib and
running `make check`. I can work out another test case but it may not
end up being simpler.
[1] https://github.com/pgaudit/pgaudit/tree/dev-pg16-ci

I tried to replicate this per that recipe, but it works for me:

$ git clone https://github.com/pgaudit/pgaudit.git pgaudit
$ cd pgaudit
$ git checkout dev-pg16-ci
$ make -s check
# +++ regress check in contrib/pgaudit +++
# using temp instance on port 61696 with PID 1191703
ok 1 - pgaudit 310 ms
1..1
# All 1 tests passed.

This is at commit 6f879bddbdcfbf9995ecee1db9a587e06027bd13 on
your dev-pg16-ci branch and df38157d94662a64e2f83aa8a0110fd1ee7c4776
on PG master. Note that I had to add

$ diff -pud Makefile~ Makefile 
--- Makefile~   2023-04-27 14:02:19.041714415 -0400
+++ Makefile    2023-04-27 14:07:10.056909016 -0400
@@ -20,3 +20,5 @@ top_builddir = ../..
 include $(top_builddir)/src/Makefile.global
 include $(top_srcdir)/contrib/contrib-global.mk
 endif
+
+EXTRA_INSTALL += contrib/pg_stat_statements

else I got failures about pg_stat_statements not being installed.
But I don't see the failure you're complaining of.

regards, tom lane

david@pgmasters.net

almost 3 years ago

In reply to: Tom Lane (#4)

Re: Possible regression setting GUCs on \connect

On 4/27/23 21:16, Tom Lane wrote:

David Steele <david@pgmasters.net> writes:

Seems plausible. This can be reproduced by cloning [1] into contrib and
running `make check`. I can work out another test case but it may not
end up being simpler.
[1] https://github.com/pgaudit/pgaudit/tree/dev-pg16-ci

I tried to replicate this per that recipe, but it works for me:

$ git clone https://github.com/pgaudit/pgaudit.git pgaudit
$ cd pgaudit
$ git checkout dev-pg16-ci
$ make -s check
# +++ regress check in contrib/pgaudit +++
# using temp instance on port 61696 with PID 1191703
ok 1 - pgaudit 310 ms
1..1
# All 1 tests passed.

I included the errors in the expect log so I could link to them. So test
success means the error is happening.

Note that I had to add

$ diff -pud Makefile~ Makefile
--- Makefile~   2023-04-27 14:02:19.041714415 -0400
+++ Makefile    2023-04-27 14:07:10.056909016 -0400
@@ -20,3 +20,5 @@ top_builddir = ../..
include $(top_builddir)/src/Makefile.global
include $(top_srcdir)/contrib/contrib-global.mk
endif
+
+EXTRA_INSTALL += contrib/pg_stat_statements

Yeah, I rarely run tests in-tree, but I'll add this if it does not break
our regular CI.

-David

nathandbossart@gmail.com

almost 3 years ago

In reply to: David Steele (#5)

Re: Possible regression setting GUCs on \connect

I suspect the problem is that GUCArrayDelete() is using the wrong Datum:

diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 9dd624b3ae..ee9f87e7f2 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -6496,7 +6496,7 @@ GUCArrayDelete(ArrayType *array, ArrayType **usersetArray, const char *name)
         {
             newarray = construct_array_builtin(&d, 1, TEXTOID);
             if (usersetArray)
-                newUsersetArray = construct_array_builtin(&d, 1, BOOLOID);
+                newUsersetArray = construct_array_builtin(&userSetDatum, 1, BOOLOID);
         }

index++;

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

david@pgmasters.net

almost 3 years ago

In reply to: Nathan Bossart (#6)

Re: Possible regression setting GUCs on \connect

On 4/27/23 21:43, Nathan Bossart wrote:

I suspect the problem is that GUCArrayDelete() is using the wrong Datum:

diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 9dd624b3ae..ee9f87e7f2 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -6496,7 +6496,7 @@ GUCArrayDelete(ArrayType *array, ArrayType **usersetArray, const char *name)
{
newarray = construct_array_builtin(&d, 1, TEXTOID);
if (usersetArray)
-                newUsersetArray = construct_array_builtin(&d, 1, BOOLOID);
+                newUsersetArray = construct_array_builtin(&userSetDatum, 1, BOOLOID);
}

index++;

That seems to work. The errors are now gone.

Thanks!
-David

nathandbossart@gmail.com

almost 3 years ago

In reply to: David Steele (#7)

Re: Possible regression setting GUCs on \connect

On Thu, Apr 27, 2023 at 09:47:33PM +0300, David Steele wrote:

That seems to work. The errors are now gone.

Great. Barring objections, I'll plan on committing this shortly.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

aekorotkov@gmail.com

almost 3 years ago

In reply to: Nathan Bossart (#8)

Re: Possible regression setting GUCs on \connect

Hi!

On Thu, Apr 27, 2023 at 9:51 PM Nathan Bossart <nathandbossart@gmail.com> wrote:

On Thu, Apr 27, 2023 at 09:47:33PM +0300, David Steele wrote:

That seems to work. The errors are now gone.

Great. Barring objections, I'll plan on committing this shortly.

Thanks to everybody for catching and investigating this.
Nathan, I'd like to push it myself. I'm also going to check the code
for similar errors.

------
Regards,
Alexander Korotkov

#10

nathandbossart@gmail.com

almost 3 years ago

In reply to: Alexander Korotkov (#9)

Re: Possible regression setting GUCs on \connect

On Thu, Apr 27, 2023 at 09:53:23PM +0300, Alexander Korotkov wrote:

Thanks to everybody for catching and investigating this.
Nathan, I'd like to push it myself. I'm also going to check the code
for similar errors.

Sounds good!

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

#11

aekorotkov@gmail.com

almost 3 years ago

In reply to: Nathan Bossart (#10)

Re: Possible regression setting GUCs on \connect

.On Thu, Apr 27, 2023 at 9:55 PM Nathan Bossart
<nathandbossart@gmail.com> wrote:

On Thu, Apr 27, 2023 at 09:53:23PM +0300, Alexander Korotkov wrote:

Thanks to everybody for catching and investigating this.
Nathan, I'd like to push it myself. I'm also going to check the code
for similar errors.

Sounds good!

I didn't find similar bugs in 096dd80f3c. Pushed!

------
Regards,
Alexander Korotkov

#12

tgl@sss.pgh.pa.us

almost 3 years ago

In reply to: David Steele (#5)

Re: Possible regression setting GUCs on \connect

David Steele <david@pgmasters.net> writes:

On 4/27/23 21:16, Tom Lane wrote:

I tried to replicate this per that recipe, but it works for me:

I included the errors in the expect log so I could link to them. So test
success means the error is happening.

Ah, got it.

I added some debug output to the test, and what I see is that
at the "\connect - user1" commands that work ok, what we have
in pg_db_role_setting is along the lines of

+select setdatabase, setrole::regrole, setconfig, setuser from pg_db_role_setting;
+ setdatabase | setrole  |                                                                   setconfig                                                                   |    setuser    
+-------------+----------+-----------------------------------------------------------------------------------------------------------------------------------------------+---------------
+           0 | user1    | {"pgaudit.log=read, WRITE",pgaudit.log_level=notice,pgaudit.log_client=on,pgaudit.role=auditor,pgaudit.log_relation=on}                       | {f,f,f,f,f}
...

while where it's not working:

+select setdatabase, setrole::regrole, setconfig, setuser from pg_db_role_setting;
+ setdatabase | setrole  |                                                                   setconfig                                                                   |    setuser    
+-------------+----------+-----------------------------------------------------------------------------------------------------------------------------------------------+---------------
+           0 | user1    | {pgaudit.log_level=notice,pgaudit.log_client=on,pgaudit.role=auditor,pgaudit.log_statement=off}                                               | {t,f,f,f}
...

So it is failing because setuser = 't' for that setting; which makes the
failure itself unsurprising, but it seems like the flag should not
have been set that way. Digging further, it looks like the flag array
is not correctly updated during ALTER USER RESET:

select setdatabase, setrole::regrole, setconfig, setuser from pg_db_role_setting;
NOTICE: AUDIT: SESSION,1,1,READ,SELECT,TABLE,pg_catalog.pg_db_role_setting,"select setdatabase, setrole::regrole, setconfig, setuser from pg_db_role_setting;",<not logged>
setdatabase | setrole | setconfig | setuser
-------------+----------+-----------------------------------------------------------------------------------------------------------------------------------------------+---------------
0 | user1 | {"pgaudit.log=read, WRITE",pgaudit.log_level=notice,pgaudit.log_client=on,pgaudit.role=auditor,pgaudit.log_relation=on} | {f,f,f,f,f}

... that's fine ...

ALTER ROLE user1 RESET pgaudit.log_relation;
select setdatabase, setrole::regrole, setconfig, setuser from pg_db_role_setting;
setdatabase | setrole | setconfig | setuser
-------------+----------+-----------------------------------------------------------------------------------------------------------------------------------------------+---------------
0 | user1 | {"pgaudit.log=read, WRITE",pgaudit.log_level=notice,pgaudit.log_client=on,pgaudit.role=auditor} | {t,f,f,f}

... wrong ...

ALTER ROLE user1 RESET pgaudit.log;
select setdatabase, setrole::regrole, setconfig, setuser from pg_db_role_setting;
setdatabase | setrole | setconfig | setuser
-------------+----------+-----------------------------------------------------------------------------------------------------------------------------------------------+---------------
0 | user1 | {pgaudit.log_level=notice,pgaudit.log_client=on,pgaudit.role=auditor} | {t,f,f}

... and wronger.

I had not paid any attention to 096dd80f3 when it went in, but now
that I have looked at it I'm quite distressed, independently of this
probably-minor bug. It seems to me that this feature is not well
designed and completely ignores the precedents set by commits
a0ffa885e and 13d838815. The right way to do this was not to add some
poorly-explained option to ALTER ROLE, but to record the role OID that
issued the ALTER ROLE, and then to check when loading the ALTER ROLE
setting whether that role (still) has the right to change the
specified setting. As implemented, this can't possibly track changes
in GRANT/REVOKE SET privileges correctly, and I wonder if it's not
introducing outright security holes like the one fixed by 13d838815.

I think we ought to revert 096dd80f3 and try again in v17.

regards, tom lane

#13

tgl@sss.pgh.pa.us

almost 3 years ago

In reply to: Nathan Bossart (#6)

Re: Possible regression setting GUCs on \connect

Nathan Bossart <nathandbossart@gmail.com> writes:

I suspect the problem is that GUCArrayDelete() is using the wrong Datum:

-                newUsersetArray = construct_array_builtin(&d, 1, BOOLOID);
+                newUsersetArray = construct_array_builtin(&userSetDatum, 1, BOOLOID);

Ah, should have checked my mail earlier.

However, my concern about whether we even want this feature
still stands.

regards, tom lane

#14

nathandbossart@gmail.com

almost 3 years ago

In reply to: Tom Lane (#12)

Re: Possible regression setting GUCs on \connect

On Thu, Apr 27, 2023 at 03:22:04PM -0400, Tom Lane wrote:

The right way to do this was not to add some
poorly-explained option to ALTER ROLE, but to record the role OID that
issued the ALTER ROLE, and then to check when loading the ALTER ROLE
setting whether that role (still) has the right to change the
specified setting. As implemented, this can't possibly track changes
in GRANT/REVOKE SET privileges correctly, and I wonder if it's not
introducing outright security holes like the one fixed by 13d838815.

I generally agree. At least, I think it would be nice to avoid adding a
new option if possible. It's not clear to me why we'd need to also check
privileges at login time as opposed to only checking them at ALTER ROLE SET
time. ISTM that the former approach would introduce some interesting
problems around dropping roles or changing roles' privileges.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

#15

tgl@sss.pgh.pa.us

almost 3 years ago

In reply to: Nathan Bossart (#14)

Re: Possible regression setting GUCs on \connect

Nathan Bossart <nathandbossart@gmail.com> writes:

On Thu, Apr 27, 2023 at 03:22:04PM -0400, Tom Lane wrote:

The right way to do this was not to add some
poorly-explained option to ALTER ROLE, but to record the role OID that
issued the ALTER ROLE, and then to check when loading the ALTER ROLE
setting whether that role (still) has the right to change the
specified setting. As implemented, this can't possibly track changes
in GRANT/REVOKE SET privileges correctly, and I wonder if it's not
introducing outright security holes like the one fixed by 13d838815.

I generally agree. At least, I think it would be nice to avoid adding a
new option if possible. It's not clear to me why we'd need to also check
privileges at login time as opposed to only checking them at ALTER ROLE SET
time.

Perhaps there's room to argue about that. But ISTM that if someone
does ALTER ROLE SET on the strength of some privilege you granted
them, and then you regret that and revoke the privilege, then the
ALTER ROLE setting should not continue to work. So I would regard
the session-start-time check as the primary one. Checking when
ALTER ROLE is done is just a user-friendliness detail.

Also, in the case of an extension-defined GUC, we don't necessarily
know its privilege level at either ALTER ROLE time or session start,
since the extension might not yet be loaded at either point.
I've forgotten exactly what restrictive hack we use to get around
that, but the *only* way to do that fully correctly is to record
the role that did the ALTER and then check its privileges at extension
load time. I think that 096dd80f3 has made it harder not easier
to get to the point of doing that correctly, because it's added
a feature that we'll have to figure out how to make interoperate
with a correct implementation.

regards, tom lane

#16

aekorotkov@gmail.com

almost 3 years ago

In reply to: Tom Lane (#15)

Re: Possible regression setting GUCs on \connect

On Fri, Apr 28, 2023 at 1:38 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Nathan Bossart <nathandbossart@gmail.com> writes:

On Thu, Apr 27, 2023 at 03:22:04PM -0400, Tom Lane wrote:

The right way to do this was not to add some
poorly-explained option to ALTER ROLE, but to record the role OID that
issued the ALTER ROLE, and then to check when loading the ALTER ROLE
setting whether that role (still) has the right to change the
specified setting. As implemented, this can't possibly track changes
in GRANT/REVOKE SET privileges correctly, and I wonder if it's not
introducing outright security holes like the one fixed by 13d838815.

I generally agree. At least, I think it would be nice to avoid adding a
new option if possible. It's not clear to me why we'd need to also check
privileges at login time as opposed to only checking them at ALTER ROLE SET
time.

Perhaps there's room to argue about that. But ISTM that if someone
does ALTER ROLE SET on the strength of some privilege you granted
them, and then you regret that and revoke the privilege, then the
ALTER ROLE setting should not continue to work. So I would regard
the session-start-time check as the primary one. Checking when
ALTER ROLE is done is just a user-friendliness detail.

From my point of view that is much different from what we're doing
with other database objects. If some role gets revoked from
privilege, that doesn't affect the actions done with that privilege
before. The law is not retroactive. If one has created some tables,
those tables still work if the creator gets revoked privilege or even
gets deleted. Why should the setting behave differently?

Additionally, I think if we start recording role OID, then we need a
full set of management clauses for each individual option ownership.
Otherwise, we would leave this new role OID without necessarily
management facilities. But with them, the whole stuff will look like
awful overengineering.

Also, in the case of an extension-defined GUC, we don't necessarily
know its privilege level at either ALTER ROLE time or session start,
since the extension might not yet be loaded at either point.

Yes, that's it.

I've forgotten exactly what restrictive hack we use to get around
that,

As I understand the restrictive hack is to assume that the role is at
least SUSET.

but the *only* way to do that fully correctly is to record
the role that did the ALTER and then check its privileges at extension
load time.

This depends on the understanding of correctness. Recording role OID
would mean altering that role privileges or deleting the role would
affect the settings. Even if that is correct, this is very very far
from the behavior we had for decades.

I think that 096dd80f3 has made it harder not easier
to get to the point of doing that correctly, because it's added
a feature that we'll have to figure out how to make interoperate
with a correct implementation.

With 096dd80f3, we still may revoke the setting of USERSET options
from the public. Even if the option is not currently loaded at ALTER
time, we still may find an explicit revoke recorder in the system
catalog. That behavior will be current if we understand the default
options as separate material things (as it is today), but not part of
the setter role.

------
Regards,
Alexander Korotkov

#17