Allow non-superuser to cancel superuser tasks.

Started by Kirill Reshkeabout 2 years ago68 messages
Jump to latest
#1Kirill Reshke
reshkekirill@gmail.com

Hi hackers!

In our Cloud we have a patch, which allows non-superuser role ('mdb_admin')
to do some superuser things.
In particular, we have a patch that allows mdb admin to cancel the
autovacuum process and some other processes (processes with
application_name = 'MDB'), see the attachment.
This is needed to allow non-superuser roles to run pg_repack and to cancel
pg_repack.
We need to cancel running autovac to run pg_repack (because of locks), and
we need to cancel pg_repack sometimes also.

I want to reduce our internal patch size and transfer this logic to
extension or to core.
I have found similar threads [1]/messages/by-id/F9408A5A-B20B-42D2-9E7F-49CD3D1547BC@enterprisedb.com and [2]/messages/by-id/20220722203735.GB3996698@nathanxps13, but, as far as I understand, they
do not solve this particular case.
I see 2 possible ways to implement this. The first one is to have hool in
pg_signal_backend, and define a hook in extension which can do the thing.
The second one is to have a predefined role. Something like a
`pg_signal_autovacuum` role which can signal running autovac to cancel. But
I don't see how we can handle specific `application_name` with this
solution.

[1]: /messages/by-id/F9408A5A-B20B-42D2-9E7F-49CD3D1547BC@enterprisedb.com
/messages/by-id/F9408A5A-B20B-42D2-9E7F-49CD3D1547BC@enterprisedb.com
[2]: /messages/by-id/20220722203735.GB3996698@nathanxps13
/messages/by-id/20220722203735.GB3996698@nathanxps13

Attachments:

v1-0001-cloud-mdb_admin-patch-part-to-illustrate.patchapplication/octet-stream; name=v1-0001-cloud-mdb_admin-patch-part-to-illustrate.patchDownload+28-3
#2Nathan Bossart
nathandbossart@gmail.com
In reply to: Kirill Reshke (#1)
Re: Allow non-superuser to cancel superuser tasks.

On Mon, Feb 26, 2024 at 12:38:40PM +0500, Kirill Reshke wrote:

I see 2 possible ways to implement this. The first one is to have hool in
pg_signal_backend, and define a hook in extension which can do the thing.
The second one is to have a predefined role. Something like a
`pg_signal_autovacuum` role which can signal running autovac to cancel. But
I don't see how we can handle specific `application_name` with this
solution.

pg_signal_autovacuum seems useful given commit 3a9b18b.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

#3Kirill Reshke
reshkekirill@gmail.com
In reply to: Nathan Bossart (#2)
Re: Allow non-superuser to cancel superuser tasks.

On Mon, 26 Feb 2024 at 20:10, Nathan Bossart <nathandbossart@gmail.com>
wrote:

On Mon, Feb 26, 2024 at 12:38:40PM +0500, Kirill Reshke wrote:

I see 2 possible ways to implement this. The first one is to have hool in
pg_signal_backend, and define a hook in extension which can do the thing.
The second one is to have a predefined role. Something like a
`pg_signal_autovacuum` role which can signal running autovac to cancel.

But

I don't see how we can handle specific `application_name` with this
solution.

pg_signal_autovacuum seems useful given commit 3a9b18b.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Thank you for your response.
Please find a patch attached.

In patch, pg_signal_autovacuum role with oid 6312 added. I grabbed oid from
unused_oids script output.
Also, tap tests for functionality added. I'm not sure where to place them,
so I placed them in a separate directory in `src/test/`
Seems that regression tests for this feature are not possible, am i right?
Also, I was thinking of pg_signal_autovacuum vs pg_signal_backend.
Should pg_signal_autovacuum have power of pg_signal_backend (implicity)? Or
should this role have such little scope...

Attachments:

v1-0001-Allow-non-superuser-to-cancel-superuser-tasks.patchapplication/octet-stream; name=v1-0001-Allow-non-superuser-to-cancel-superuser-tasks.patchDownload+113-9
#4Kirill Reshke
reshkekirill@gmail.com
In reply to: Kirill Reshke (#3)
Re: Allow non-superuser to cancel superuser tasks.

On Tue, 27 Feb 2024 at 01:22, Kirill Reshke <reshkekirill@gmail.com> wrote:

On Mon, 26 Feb 2024 at 20:10, Nathan Bossart <nathandbossart@gmail.com>
wrote:

On Mon, Feb 26, 2024 at 12:38:40PM +0500, Kirill Reshke wrote:

I see 2 possible ways to implement this. The first one is to have hool

in

pg_signal_backend, and define a hook in extension which can do the

thing.

The second one is to have a predefined role. Something like a
`pg_signal_autovacuum` role which can signal running autovac to cancel.

But

I don't see how we can handle specific `application_name` with this
solution.

pg_signal_autovacuum seems useful given commit 3a9b18b.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Thank you for your response.
Please find a patch attached.

In patch, pg_signal_autovacuum role with oid 6312 added. I grabbed oid
from unused_oids script output.
Also, tap tests for functionality added. I'm not sure where to place them,
so I placed them in a separate directory in `src/test/`
Seems that regression tests for this feature are not possible, am i right?
Also, I was thinking of pg_signal_autovacuum vs pg_signal_backend.
Should pg_signal_autovacuum have power of pg_signal_backend (implicity)?
Or should this role have such little scope...

Have a little thought on this, will share.

Do we need to test the pg_cancel_backend vs autovacuum case at all?
I think we do. Would it be better to split work into 2 patches: first one
with tests against current logic, and second
one with some changes/enhancements which allows to cancel running autovac
to non-superuser (via `pg_signal_autovacuum` role or some other mechanism)?

#5Nathan Bossart
nathandbossart@gmail.com
In reply to: Kirill Reshke (#3)
Re: Allow non-superuser to cancel superuser tasks.

On Tue, Feb 27, 2024 at 01:22:31AM +0500, Kirill Reshke wrote:

Also, tap tests for functionality added. I'm not sure where to place them,
so I placed them in a separate directory in `src/test/`
Seems that regression tests for this feature are not possible, am i right?

It might be difficult to create reliable tests for pg_signal_autovacuum.
If we can, it would probably be easiest to do with a TAP test.

Also, I was thinking of pg_signal_autovacuum vs pg_signal_backend.
Should pg_signal_autovacuum have power of pg_signal_backend (implicity)? Or
should this role have such little scope...

-1. I don't see why giving a role privileges of pg_signal_autovacuum
should also give them the ability to signal all other non-superuser
backends.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

#6Nathan Bossart
nathandbossart@gmail.com
In reply to: Kirill Reshke (#4)
Re: Allow non-superuser to cancel superuser tasks.

On Tue, Feb 27, 2024 at 11:59:00PM +0500, Kirill Reshke wrote:

Do we need to test the pg_cancel_backend vs autovacuum case at all?
I think we do. Would it be better to split work into 2 patches: first one
with tests against current logic, and second
one with some changes/enhancements which allows to cancel running autovac
to non-superuser (via `pg_signal_autovacuum` role or some other mechanism)?

If we need to add tests for pg_signal_backend, I think it's reasonable to
keep those in a separate patch from pg_signal_autovacuum.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

#7Leung, Anthony
antholeu@amazon.com
In reply to: Nathan Bossart (#6)
Re: Allow non-superuser to cancel superuser tasks.

Hi,

I'm new to reviewing postgres patches, but I took an interest in reviewing this patch as recommended by Nathan.

I have the following comments:

if (!superuser()) {
if (!OidIsValid(proc->roleId)) {
LocalPgBackendStatus *local_beentry;
local_beentry = pgstat_get_local_beentry_by_backend_id(proc->backendId);

if (!(local_beentry && local_beentry->backendStatus.st_backendType == B_AUTOVAC_WORKER &&
has_privs_of_role(GetUserId(), ROLE_PG_SIGNAL_AUTOVACUUM)))
return SIGNAL_BACKEND_NOSUPERUSER;
} else {
if (superuser_arg(proc->roleId))
return SIGNAL_BACKEND_NOSUPERUSER;

/* Users can signal backends they have role membership in. */
if (!has_privs_of_role(GetUserId(), proc->roleId) &&
!has_privs_of_role(GetUserId(), ROLE_PG_SIGNAL_BACKEND))
return SIGNAL_BACKEND_NOPERMISSION;
}
}

1. I would suggest not to do nested if blocks since it's becoming harder to read. Also, does it make sense to have a utilities function in backend_status.c to check if a backend of a given backend id is of a certain backend_type. And should we check if proc->backendId is invalid?

ALTER SYSTEM SET autovacuum_vacuum_cost_limit TO 1;
ALTER SYSTEM SET autovacuum_vacuum_cost_delay TO 100;
ALTER SYSTEM SET autovacuum_naptime TO 1;

2. Could we set these parameters at the beginning of the test before $node->start with $node->append_conf ? That way we can avoid restarting the node and doing the sleep later on.

my $res_pid = $node_primary->safe_psql(
. 'regress',
"SELECT pid FROM pg_stat_activity WHERE backend_type = 'autovacuum worker' and datname = 'regress';"
);

my ($res_reg_psa_1, $stdout_reg_psa_1, $stderr_reg_psa_1) = $node_primary->psql('regress', qq[

SET ROLE psa_reg_role_1;

SELECT pg_terminate_backend($res_pid);
]);

ok($res_reg_psa_1 != 0, "should fail for non pg_signal_autovacuum");
like($stderr_reg_psa_1, qr/Only roles with the SUPERUSER attribute may terminate processes of roles with the SUPERUSER attribute./, "matches");

my ($res_reg_psa_2, $stdout_reg_psa_2, $stderr_reg_psa_2) = $node_primary->psql('regress', qq[
SET ROLE psa_reg_role_2;
SELECT pg_terminate_backend($res_pid);
]");

3. Some nits on styling

4. According to Postgres styles, I believe open brackets should be in a new line

#8Leung, Anthony
antholeu@amazon.com
In reply to: Leung, Anthony (#7)
Re: Allow non-superuser to cancel superuser tasks.

Another comment that I forgot to mention is that we should also make the documentation change in doc/src/sgml/user-manag.sgml for this new predefined role

Thanks.

--
Anthony Leung
Amazon Web Services: https://aws.amazon.com

#9Leung, Anthony
antholeu@amazon.com
In reply to: Leung, Anthony (#8)
Re: Allow non-superuser to cancel superuser tasks.

I took the liberty of continuing to work on this after chatting with Nathan.

I've attached the updated patch with some improvements.

Thanks.

--
Anthony Leung
Amazon Web Services: https://aws.amazon.com

Attachments:

v1-0001-Introduce-pg_signal_autovacuum-role-to-allow-non-sup.patchapplication/octet-stream; name=v1-0001-Introduce-pg_signal_autovacuum-role-to-allow-non-sup.patchDownload+137-10
#10Nathan Bossart
nathandbossart@gmail.com
In reply to: Leung, Anthony (#9)
Re: Allow non-superuser to cancel superuser tasks.

On Mon, Apr 01, 2024 at 02:29:29PM +0000, Leung, Anthony wrote:

I've attached the updated patch with some improvements.

Thanks!

+      <row>
+       <entry>pg_signal_autovacuum</entry>
+       <entry>Allow terminating backend running autovacuum</entry>
+      </row>

I think we should be more precise here by calling out the exact types of
workers:

"Allow signaling autovacuum worker processes to..."

-    if ((!OidIsValid(proc->roleId) || superuser_arg(proc->roleId)) &&
-        !superuser())
-        return SIGNAL_BACKEND_NOSUPERUSER;
-
-    /* Users can signal backends they have role membership in. */
-    if (!has_privs_of_role(GetUserId(), proc->roleId) &&
-        !has_privs_of_role(GetUserId(), ROLE_PG_SIGNAL_BACKEND))
-        return SIGNAL_BACKEND_NOPERMISSION;
+    if (!superuser())
+    {
+        if (!OidIsValid(proc->roleId))
+        {
+            /*
+             * We only allow user with pg_signal_autovacuum role to terminate
+             * autovacuum worker as an exception. 
+             */
+            if (!(pg_stat_is_backend_autovac_worker(proc->backendId) &&
+                has_privs_of_role(GetUserId(), ROLE_PG_SIGNAL_AUTOVACUUM)))
+                return SIGNAL_BACKEND_NOSUPERUSER;
+        }
+        else
+        {
+            if (superuser_arg(proc->roleId))
+                return SIGNAL_BACKEND_NOSUPERUSER;
+
+            /* Users can signal backends they have role membership in. */
+            if (!has_privs_of_role(GetUserId(), proc->roleId) &&
+                !has_privs_of_role(GetUserId(), ROLE_PG_SIGNAL_BACKEND))
+                return SIGNAL_BACKEND_NOPERMISSION;
+        }
+    }

I don't think we should rely on !OidIsValid(proc->roleId) for signaling
autovacuum workers. That might not always be true, and I don't see any
need to rely on that otherwise. IMHO we should just add a few lines before
the existing code, which doesn't need to be changed at all:

if (pg_stat_is_backend_autovac_worker(proc->backendId) &&
!has_privs_of_role(GetUserId(), ROLE_PG_SIGNAL_AUTOVACUUM))
return SIGNAL_BACKEND_NOAUTOVACUUM;

I also think we need to return something besides SIGNAL_BACKEND_NOSUPERUSER
in this case. Specifically, we probably need to introduce a new value and
provide the relevant error messages in pg_cancel_backend() and
pg_terminate_backend().

+/* ----------
+ * pg_stat_is_backend_autovac_worker() -
+ *
+ * Return whether the backend of the given backend id is of type autovacuum worker.
+ */
+bool
+pg_stat_is_backend_autovac_worker(BackendId beid)
+{
+    PgBackendStatus *ret;
+
+    Assert(beid != InvalidBackendId);
+
+    ret = pgstat_get_beentry_by_backend_id(beid);
+
+    if (!ret)
+        return false;
+
+    return ret->st_backendType == B_AUTOVAC_WORKER;
+}

Can we have this function return the backend type so that we don't have to
create a new function for every possible type? That might be handy in the
future.

I haven't looked too closely, but I'm pretty skeptical that the test suite
in your patch would be stable. Unfortunately, I don't have any better
ideas at the moment besides not adding a test for this new role.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

#11Andrey Borodin
amborodin@acm.org
In reply to: Nathan Bossart (#10)
Re: Allow non-superuser to cancel superuser tasks.

On 2 Apr 2024, at 01:21, Nathan Bossart <nathandbossart@gmail.com> wrote:

I haven't looked too closely, but I'm pretty skeptical that the test suite
in your patch would be stable. Unfortunately, I don't have any better
ideas at the moment besides not adding a test for this new role.

We can add tests just like [0]https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=eeefd4280f6 with injection points.
I mean replace that "sleep 1" with something like "$node->wait_for_event('autovacuum worker', 'autocauum-runing');".
Currently we have no infrastructure to wait for autovacuum of particular table, but I think it's doable.
Also I do not like that test is changing system-wide autovac settings, AFAIR these settings can be set for particular table.

Thanks!

Best regards, Andrey Borodin.

[0]: https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=eeefd4280f6

#12Leung, Anthony
antholeu@amazon.com
In reply to: Andrey Borodin (#11)
Re: Allow non-superuser to cancel superuser tasks.

I don't think we should rely on !OidIsValid(proc->roleId) for signaling
autovacuum workers. That might not always be true, and I don't see any
need to rely on that otherwise. IMHO we should just add a few lines before
the existing code, which doesn't need to be changed at all:

if (pg_stat_is_backend_autovac_worker(proc->backendId) &&
!has_privs_of_role(GetUserId(), ROLE_PG_SIGNAL_AUTOVACUUM))
return SIGNAL_BACKEND_NOAUTOVACUUM;

I tried to add them above the existing code. When I test it locally, a user without pg_signal_autovacuum will actually fail at this block because the user is not superuser and !OidIsValid(proc->roleId) is also true in the following:

/*
* Only allow superusers to signal superuser-owned backends. Any process
* not advertising a role might have the importance of a superuser-owned
* backend, so treat it that way.
*/
if ((!OidIsValid(proc->roleId) || superuser_arg(proc->roleId)) &&
!superuser())
return SIGNAL_BACKEND_NOSUPERUSER;

This is what Im planning to do - If the backend is autovacuum worker and the user is not superuser or has pg_signal_autovacuum role, we return the new value and provide the relevant error message

/*
* If the backend is autovacuum worker, allow user with privileges of the
* pg_signal_autovacuum role to signal the backend.
*/
if (pgstat_get_backend_type(proc->backendId) == B_AUTOVAC_WORKER)
{
if (!has_privs_of_role(GetUserId(), ROLE_PG_SIGNAL_AUTOVACUUM) || !superuser())
return SIGNAL_BACKEND_NOAUTOVACUUM;
}
/*
* Only allow superusers to signal superuser-owned backends. Any process
* not advertising a role might have the importance of a superuser-owned
* backend, so treat it that way.
*/
else if ((!OidIsValid(proc->roleId) || superuser_arg(proc->roleId)) &&
!superuser())
{
return SIGNAL_BACKEND_NOSUPERUSER;
}
/* Users can signal backends they have role membership in. */
else if (!has_privs_of_role(GetUserId(), proc->roleId) &&
!has_privs_of_role(GetUserId(), ROLE_PG_SIGNAL_BACKEND))
{
return SIGNAL_BACKEND_NOPERMISSION;
}

We can add tests just like [0] with injection points.
I mean replace that "sleep 1" with something like "$node->wait_for_event('autovacuum worker', 'autocauum-runing');".
Currently we have no infrastructure to wait for autovacuum of particular table, but I think it's doable.
Also I do not like that test is changing system-wide autovac settings, AFAIR these settings can be set for particular table.

Thanks for the suggestion. I will take a look at this. Let me also separate the test into a different patch file.

--
Anthony Leung
Amazon Web Services: https://aws.amazon.com

#13Leung, Anthony
antholeu@amazon.com
In reply to: Leung, Anthony (#12)
Re: Allow non-superuser to cancel superuser tasks.

Update - the condition should be &&

if (pgstat_get_backend_type(proc->backendId) == B_AUTOVAC_WORKER)
{
if (!has_privs_of_role(GetUserId(), ROLE_PG_SIGNAL_AUTOVACUUM) && !superuser())
return SIGNAL_BACKEND_NOAUTOVACUUM;
}

Thanks
--
Anthony Leung
Amazon Web Services: https://aws.amazon.com

#14Michael Paquier
michael@paquier.xyz
In reply to: Andrey Borodin (#11)
Re: Allow non-superuser to cancel superuser tasks.

On Tue, Apr 02, 2024 at 04:35:28PM +0500, Andrey M. Borodin wrote:

We can add tests just like [0] with injection points.
I mean replace that "sleep 1" with something like
"$node->wait_for_event('autovacuum worker', 'autocauum-runing');".
Currently we have no infrastructure to wait for autovacuum of
particular table, but I think it's doable.
Also I do not like that test is changing system-wide autovac
settings, AFAIR these settings can be set for particular table.

Yeah, hardcoded sleeps are not really acceptable. On fast machines
they eat in global runtime making the whole slower, impacting the CI.
On slow machines, that's not going to be stable and we have a lot of
buildfarm animals starved on CPU, like the ones running valgrind or
just because their environment is slow (one of my animals runs on a
RPI, for example). Note that slow machines have a lot of value
because they're usually better at catching race conditions. Injection
points would indeed make the tests more deterministic by controlling
the waits and wakeups you'd like to have in the patch's tests.

eeefd4280f6e would be a good example of how to implement a test.
--
Michael

#15Nathan Bossart
nathandbossart@gmail.com
In reply to: Leung, Anthony (#12)
Re: Allow non-superuser to cancel superuser tasks.

On Thu, Apr 04, 2024 at 12:30:51AM +0000, Leung, Anthony wrote:

if (pg_stat_is_backend_autovac_worker(proc->backendId) &&
!has_privs_of_role(GetUserId(), ROLE_PG_SIGNAL_AUTOVACUUM))
return SIGNAL_BACKEND_NOAUTOVACUUM;

I tried to add them above the existing code. When I test it locally, a
user without pg_signal_autovacuum will actually fail at this block
because the user is not superuser and !OidIsValid(proc->roleId) is also
true in the following:

Good catch.

This is what Im planning to do - If the backend is autovacuum worker and
the user is not superuser or has pg_signal_autovacuum role, we return the
new value and provide the relevant error message

/*
* If the backend is autovacuum worker, allow user with privileges of the
* pg_signal_autovacuum role to signal the backend.
*/
if (pgstat_get_backend_type(proc->backendId) == B_AUTOVAC_WORKER)
{
if (!has_privs_of_role(GetUserId(), ROLE_PG_SIGNAL_AUTOVACUUM) || !superuser())
return SIGNAL_BACKEND_NOAUTOVACUUM;
}
/*
* Only allow superusers to signal superuser-owned backends. Any process
* not advertising a role might have the importance of a superuser-owned
* backend, so treat it that way.
*/
else if ((!OidIsValid(proc->roleId) || superuser_arg(proc->roleId)) &&
!superuser())
{
return SIGNAL_BACKEND_NOSUPERUSER;
}
/* Users can signal backends they have role membership in. */
else if (!has_privs_of_role(GetUserId(), proc->roleId) &&
!has_privs_of_role(GetUserId(), ROLE_PG_SIGNAL_BACKEND))
{
return SIGNAL_BACKEND_NOPERMISSION;
}

There's no need for the explicit superuser() check in the
pg_signal_autovacuum section. That's built into has_privs_of_role()
already.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

#16Leung, Anthony
antholeu@amazon.com
In reply to: Nathan Bossart (#15)
Re: Allow non-superuser to cancel superuser tasks.

I made some updates based on the feedbacks in v2. This patch only contains the code change for allowing the signaling to av worker with pg_signal_autovacuum. I will send a separate patch for the tap test shortly.

Thanks

--
Anthony Leung
Amazon Web Services: https://aws.amazon.com

Attachments:

v2-0001-Introduce-pg_signal_autovacuum-role-to-cancel-or-termiante-autovacuum-worker.patchapplication/octet-stream; name=v2-0001-Introduce-pg_signal_autovacuum-role-to-cancel-or-termiante-autovacuum-worker.patchDownload+62-8
#17Leung, Anthony
antholeu@amazon.com
In reply to: Leung, Anthony (#16)
Re: Allow non-superuser to cancel superuser tasks.

Adding tap test for pg_signal_autovacuum using injection points as a separate patch. I also made a minor change on the original patch.

Thanks.

--
Anthony
Amazon Web Services: https://aws.amazon.com

Attachments:

v3-0001-Introduce-pg_signal_autovacuum-role-to-signal-autovacuum-worker.patchapplication/octet-stream; name=v3-0001-Introduce-pg_signal_autovacuum-role-to-signal-autovacuum-worker.patchDownload+62-8
v3-0002-Add-tap-test-for-pg-signal-autovacuum-role.patchapplication/octet-stream; name=v3-0002-Add-tap-test-for-pg-signal-autovacuum-role.patchDownload+134-1
#18Andrey Borodin
amborodin@acm.org
In reply to: Leung, Anthony (#17)
Re: Allow non-superuser to cancel superuser tasks.

On 5 Apr 2024, at 05:03, Leung, Anthony <antholeu@amazon.com> wrote:

Adding tap test for pg_signal_autovacuum using injection points as a separate patch. I also made a minor change on the original patch.

The test looks good, but:
1. remove references to passcheck :)
2. detach injection point when it's not needed anymore

Thanks!

Best regards, Andrey Borodin.

#19Michael Paquier
michael@paquier.xyz
In reply to: Leung, Anthony (#17)
Re: Allow non-superuser to cancel superuser tasks.

On Fri, Apr 05, 2024 at 12:03:05AM +0000, Leung, Anthony wrote:

Adding tap test for pg_signal_autovacuum using injection points as a
separate patch. I also made a minor change on the original patch.

+	ret = pgstat_get_beentry_by_proc_number(procNumber);
+
+	if (!ret)
+		return false;

An invalid BackendType is not false, but B_INVALID.

+{ oid => '6312', oid_symbol => 'ROLE_PG_SIGNAL_AUTOVACUUM',

OIDs in patches under development should use a value in the range
8000-9999. Newly-assigned OIDs are renumbered after the feature
freeze.

+	/*
+	 * If the backend is autovacuum worker, allow user with the privileges of
+	 * pg_signal_autovacuum role to signal the backend.
+	 */
+	if (pgstat_get_backend_type(GetNumberFromPGProc(proc)) == B_AUTOVAC_WORKER)
+	{
+		if (!has_privs_of_role(GetUserId(), ROLE_PG_SIGNAL_AUTOVACUUM))
+			return SIGNAL_BACKEND_NOAUTOVACUUM;
+	}

I was wondering why this is not done after we've checked that we have
a superuser-owned backend, and this is giving me a pause. @Nathan,
why do you think we should not rely on the roleId for an autovacuum
worker? In core, do_autovacuum() is only called in a process without
a role specified, and I've noticed your remark here:
/messages/by-id/20240401202146.GA2354284@nathanxps13
It's feeling more natural here to check that we have a superuser-owned
backend first, and then do a lookup of the process type.

One thing that we should definitely not do is letting any user calling
pg_signal_backend() know that a given PID maps to an autovacuum
worker. This information is hidden in pg_stat_activity. And
actually, doesn't the patch leak this information to all users when
calling pg_signal_backend with random PID numbers because of the fact
that SIGNAL_BACKEND_NOAUTOVACUUM exists? Any role could guess which
PIDs are used by an autovacuum worker because of the granularity
required for the error related to pg_signal_autovacuum.

+ INJECTION_POINT("autovacuum-start");
Perhaps autovacuum-worker-start is more suited here. I am not sure
that the beginning of do_autovacuum() is the optimal location, as what
matters is that we've done InitPostgres() to be able to grab the PID
from pg_stat_activity. This location does the job.

+if ($ENV{enable_injection_points} ne 'yes')
+{
+	plan skip_all => 'Injection points not supported by this build';
+}
[...]
+$node->safe_psql('postgres',
+	"SELECT injection_points_attach('autovacuum-start', 'wait');");
[...]
+# Wait until the autovacuum worker starts
+$node->wait_for_event('autovacuum worker', 'autovacuum-start');

This integration with injection points looks correct to me.

+# Copyright (c) 2022-2024, PostgreSQL Global Development Group
[...]
+# Copyright (c) 2024-2024, PostgreSQL Global Development Group

These need to be cleaned up.

+# Makefile for src/test/recovery
+#
+# src/test/recovery/Makefile

This is incorrect, twice. No problems for me with using a new path in
src/test/ for that kind of tests. There are no similar locations.

+ INSERT INTO tab_int SELECT * FROM generate_series(1, 1000000);
A good chunk of the test would be spent on that, but you don't need
that many tuples to trigger an autovacuum worker as the short naptime
is able to do it. I would recommend to reduce that to a minimum.

+# User with signal_backend_role cannot terminate autovacuum worker

Not sure that there is a huge point in checking after a role that
holds pg_signal_backend. An autovacuum worker is not a backend. Only
the case of a role not member of pg_signal_autovacuum should be
enough.

+# Test signaling for pg_signal_autovacuum role.

This needs a better documentation: the purpose of the test is to
signal an autovacuum worker, aka it uses an injection point to ensure
that the worker for the whole duration of the test.

It seems to me that it would be a better practice to wakeup the
injection point and detach it before being done with the worker.
That's not mandatory but it would encourage the correct flow if this
code is copy-pasted around to other tests.

+like($psql_err, qr/ERROR: permission denied to terminate ...

Checking only the ERRROR, and not the DETAIL should be sufficient
here.

+# User with pg_signal_backend can terminate autovacuum worker 
+my $terminate_with_pg_signal_av = $node->psql('postgres', qq( 
+    SET ROLE signal_autovacuum_role;
+    SELECT pg_terminate_backend($av_pid);
+), stdout => \$psql_out, stderr => \$psql_err);
+
+ok($terminate_with_pg_signal_av == 0, "Terminating autovacuum worker should succeed with pg_signal_autovacuum role");

Is that enough for the validation? How about checking some pattern in
the server logs from an offset before running this last query?
--
Michael

#20Nathan Bossart
nathandbossart@gmail.com
In reply to: Michael Paquier (#19)
Re: Allow non-superuser to cancel superuser tasks.

On Fri, Apr 05, 2024 at 02:39:05PM +0900, Michael Paquier wrote:

+	/*
+	 * If the backend is autovacuum worker, allow user with the privileges of
+	 * pg_signal_autovacuum role to signal the backend.
+	 */
+	if (pgstat_get_backend_type(GetNumberFromPGProc(proc)) == B_AUTOVAC_WORKER)
+	{
+		if (!has_privs_of_role(GetUserId(), ROLE_PG_SIGNAL_AUTOVACUUM))
+			return SIGNAL_BACKEND_NOAUTOVACUUM;
+	}

I was wondering why this is not done after we've checked that we have
a superuser-owned backend, and this is giving me a pause. @Nathan,
why do you think we should not rely on the roleId for an autovacuum
worker? In core, do_autovacuum() is only called in a process without
a role specified, and I've noticed your remark here:
/messages/by-id/20240401202146.GA2354284@nathanxps13
It's feeling more natural here to check that we have a superuser-owned
backend first, and then do a lookup of the process type.

I figured since there's no reason to rely on that behavior, we might as
well do a bit of future-proofing in case autovacuum workers are ever not
run as InvalidOid. It'd be easy enough to fix this code if that ever
happened, so I'm not too worried about this.

One thing that we should definitely not do is letting any user calling
pg_signal_backend() know that a given PID maps to an autovacuum
worker. This information is hidden in pg_stat_activity. And
actually, doesn't the patch leak this information to all users when
calling pg_signal_backend with random PID numbers because of the fact
that SIGNAL_BACKEND_NOAUTOVACUUM exists? Any role could guess which
PIDs are used by an autovacuum worker because of the granularity
required for the error related to pg_signal_autovacuum.

Hm. I hadn't considered that angle. IIUC right now they'll just get the
generic superuser error for autovacuum workers. I don't know how concerned
to be about users distinguishing autovacuum workers from other superuser
backends, _but_ if roles with pg_signal_autovacuum can't even figure out
the PIDs for the autovacuum workers, then this feature seems kind-of
useless. Perhaps we should allow roles with privileges of
pg_signal_autovacuum to see the autovacuum workers in pg_stat_activity.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

#21Michael Paquier
michael@paquier.xyz
In reply to: Nathan Bossart (#20)
#22Nathan Bossart
nathandbossart@gmail.com
In reply to: Michael Paquier (#21)
#23Michael Paquier
michael@paquier.xyz
In reply to: Nathan Bossart (#22)
#24Leung, Anthony
antholeu@amazon.com
In reply to: Michael Paquier (#23)
#25Michael Paquier
michael@paquier.xyz
In reply to: Leung, Anthony (#24)
#26Kirill Reshke
reshkekirill@gmail.com
In reply to: Michael Paquier (#25)
#27Michael Paquier
michael@paquier.xyz
In reply to: Kirill Reshke (#26)
#28Nathan Bossart
nathandbossart@gmail.com
In reply to: Michael Paquier (#27)
#29Michael Paquier
michael@paquier.xyz
In reply to: Nathan Bossart (#28)
#30Kirill Reshke
reshkekirill@gmail.com
In reply to: Michael Paquier (#29)
#31Nathan Bossart
nathandbossart@gmail.com
In reply to: Kirill Reshke (#30)
#32Nathan Bossart
nathandbossart@gmail.com
In reply to: Kirill Reshke (#30)
#33Kirill Reshke
reshkekirill@gmail.com
In reply to: Nathan Bossart (#31)
#34Michael Paquier
michael@paquier.xyz
In reply to: Kirill Reshke (#30)
#35Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Michael Paquier (#34)
#36Kirill Reshke
reshkekirill@gmail.com
In reply to: Kirill Reshke (#30)
#37Michael Paquier
michael@paquier.xyz
In reply to: Kirill Reshke (#36)
#38Nathan Bossart
nathandbossart@gmail.com
In reply to: Michael Paquier (#37)
#39Andrey Borodin
amborodin@acm.org
In reply to: Nathan Bossart (#38)
#40Nathan Bossart
nathandbossart@gmail.com
In reply to: Andrey Borodin (#39)
#41Michael Paquier
michael@paquier.xyz
In reply to: Andrey Borodin (#39)
#42Andrey Borodin
amborodin@acm.org
In reply to: Michael Paquier (#41)
#43Michael Paquier
michael@paquier.xyz
In reply to: Nathan Bossart (#40)
#44Michael Paquier
michael@paquier.xyz
In reply to: Andrey Borodin (#42)
#45Andrey Borodin
amborodin@acm.org
In reply to: Michael Paquier (#43)
#46Nathan Bossart
nathandbossart@gmail.com
In reply to: Andrey Borodin (#45)
#47Kirill Reshke
reshkekirill@gmail.com
In reply to: Nathan Bossart (#46)
#48Michael Paquier
michael@paquier.xyz
In reply to: Nathan Bossart (#46)
#49Michael Paquier
michael@paquier.xyz
In reply to: Kirill Reshke (#47)
#50Kirill Reshke
reshkekirill@gmail.com
In reply to: Michael Paquier (#49)
#51Andrey Borodin
amborodin@acm.org
In reply to: Kirill Reshke (#50)
#52Kirill Reshke
reshkekirill@gmail.com
In reply to: Michael Paquier (#48)
#53Michael Paquier
michael@paquier.xyz
In reply to: Kirill Reshke (#52)
#54Nathan Bossart
nathandbossart@gmail.com
In reply to: Michael Paquier (#48)
#55Michael Paquier
michael@paquier.xyz
In reply to: Nathan Bossart (#54)
#56Nathan Bossart
nathandbossart@gmail.com
In reply to: Michael Paquier (#55)
#57Michael Paquier
michael@paquier.xyz
In reply to: Nathan Bossart (#56)
#58Michael Paquier
michael@paquier.xyz
In reply to: Michael Paquier (#57)
#59Andres Freund
andres@anarazel.de
In reply to: Nathan Bossart (#46)
#60Kirill Reshke
reshkekirill@gmail.com
In reply to: Andres Freund (#59)
#61Nathan Bossart
nathandbossart@gmail.com
In reply to: Andres Freund (#59)
#62Andres Freund
andres@anarazel.de
In reply to: Nathan Bossart (#61)
#63Nathan Bossart
nathandbossart@gmail.com
In reply to: Andres Freund (#62)
#64Kirill Reshke
reshkekirill@gmail.com
In reply to: Nathan Bossart (#63)
#65Andres Freund
andres@anarazel.de
In reply to: Nathan Bossart (#63)
#66Nathan Bossart
nathandbossart@gmail.com
In reply to: Andres Freund (#65)
#67Andres Freund
andres@anarazel.de
In reply to: Nathan Bossart (#66)
#68Nathan Bossart
nathandbossart@gmail.com
In reply to: Andres Freund (#67)