Commit 5a2fed911a broke parallel query
This is a script to reproduce the problem:
CREATE ROLE fluff;
CREATE ROLE duff LOGIN IN ROLE fluff;
CREATE DATABASE scratch OWNER duff;
REVOKE CONNECT, TEMP ON DATABASE scratch FROM PUBLIC;
\c scratch duff
CREATE TABLE large AS SELECT id FROM generate_series(1, 500000) AS id;
GRANT SELECT ON large TO fluff;
SET ROLE fluff;
SELECT count(*) FROM large;
Since commit 5a2fed911a, this results in:
ERROR: permission denied for database "scratch"
DETAIL: User does not have CONNECT privilege.
CONTEXT: parallel worker
The following patch makes the problem disappear, but I am far
from certain that using the session user is always correct there:
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index a024b1151d0..150ec3f52f8 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -360,7 +360,7 @@ CheckMyDatabase(const char *name, bool am_superuser, bool override_allow_connect
* and save a few cycles.)
*/
if (!am_superuser &&
- object_aclcheck(DatabaseRelationId, MyDatabaseId, GetUserId(),
+ object_aclcheck(DatabaseRelationId, MyDatabaseId, GetSessionUserId(),
ACL_CONNECT) != ACLCHECK_OK)
ereport(FATAL,
(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
Yours,
Laurenz Albe
Laurenz Albe <laurenz.albe@cybertec.at> writes:
This is a script to reproduce the problem:
As commented in 73c9f91a1,
This all seems very accidental and probably not the behavior we really
want, but a security patch is no time to be redesigning things.
Perhaps this thread is a good place to start the discussion.
In the security-team discussion that led up to 73c9f91a1, I'd been
wondering why parallel worker start enforces any connection
restrictions at all. If we'd like the use of parallelism to be
more-or-less transparent, we shouldn't apply these checks,
and not the !am_superuser ones in InitPostgres either. If we do
want to apply permissions checks in parallel worker start, why
should we think that the failure in this example is wrong?
regards, tom lane
I wrote:
In the security-team discussion that led up to 73c9f91a1, I'd been
wondering why parallel worker start enforces any connection
restrictions at all. If we'd like the use of parallelism to be
more-or-less transparent, we shouldn't apply these checks,
and not the !am_superuser ones in InitPostgres either.
That is, the way I'd prefer to attack this is something along the
lines of the attached, which just disables all those checks in
parallel workers (and reverts 73c9f91a1's hackery on am_superuser).
One big question this raises is whether any other sorts of background
processes should be treated similarly.
Another loose end is that I don't think the enforcement of
datconnlimit is very sane. CountDBBackends skips processes that say
AmBackgroundWorkerProcess, so logically those should be precisely the
ones that escape enforcement of datconnlimit, but it doesn't quite
work like that today. This patch at least puts parallel workers on
the right side of the line; but I think we have some cleanup to do
for other sorts of auxiliary processes.
I'm also wondering if we could revisit the need for
CheckMyDatabase's override_allow_connections parameter
and thereby remove INIT_PG_OVERRIDE_ALLOW_CONNS.
So all in all, this is just at the POC stage.
regards, tom lane
Attachments:
fix-parallel-worker-connection-checks-poc.patchtext/x-diff; charset=us-ascii; name=fix-parallel-worker-connection-checks-poc.patchDownload+17-24
On Mon, 2024-12-23 at 12:12 -0500, Tom Lane wrote:
I wrote:
In the security-team discussion that led up to 73c9f91a1, I'd been
wondering why parallel worker start enforces any connection
restrictions at all. If we'd like the use of parallelism to be
more-or-less transparent, we shouldn't apply these checks,
and not the !am_superuser ones in InitPostgres either.That is, the way I'd prefer to attack this is something along the
lines of the attached, which just disables all those checks in
parallel workers (and reverts 73c9f91a1's hackery on am_superuser).One big question this raises is whether any other sorts of background
processes should be treated similarly.Another loose end is that I don't think the enforcement of
datconnlimit is very sane. CountDBBackends skips processes that say
AmBackgroundWorkerProcess, so logically those should be precisely the
ones that escape enforcement of datconnlimit, but it doesn't quite
work like that today. This patch at least puts parallel workers on
the right side of the line; but I think we have some cleanup to do
for other sorts of auxiliary processes.I'm also wondering if we could revisit the need for
CheckMyDatabase's override_allow_connections parameter
and thereby remove INIT_PG_OVERRIDE_ALLOW_CONNS.
So all in all, this is just at the POC stage.
Thanks for looking at this problem.
I think that disabling the connect privilege check for parallel
workers is the right thing to do. Getting permission problems just
because PostgreSQL decided to use parallel query doesn't make any sense.
I'm not sure about other background workers, but my guts say that
they shouldn't check for the connect privilege either.
I agree that background workers shouldn't count against a connection
limit, but against "max_worker_processes".
Yours,
Laurenz Albe
Hello,
just a notice that this sounds like the symptom already described in
/messages/by-id/c7896c19adb646e889d5b2e40fdd17c1@ldbv.bayern.de
Regards,
Christian
-----Ursprüngliche Nachricht-----
Von: Laurenz Albe <laurenz.albe@cybertec.at>
Gesendet: Montag, 23. Dezember 2024 11:55
An: pgsql-bugs@lists.postgresql.org
Betreff: Commit 5a2fed911a broke parallel query
This is a script to reproduce the problem:
CREATE ROLE fluff;
CREATE ROLE duff LOGIN IN ROLE fluff;
CREATE DATABASE scratch OWNER duff;
REVOKE CONNECT, TEMP ON DATABASE scratch FROM PUBLIC;
\c scratch duff
CREATE TABLE large AS SELECT id FROM generate_series(1, 500000) AS id;
GRANT SELECT ON large TO fluff;
SET ROLE fluff;
SELECT count(*) FROM large;
Since commit 5a2fed911a, this results in:
ERROR: permission denied for database "scratch"
DETAIL: User does not have CONNECT privilege.
CONTEXT: parallel worker
The following patch makes the problem disappear, but I am far
from certain that using the session user is always correct there:
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index a024b1151d0..150ec3f52f8 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -360,7 +360,7 @@ CheckMyDatabase(const char *name, bool am_superuser, bool override_allow_connect
* and save a few cycles.)
*/
if (!am_superuser &&
- object_aclcheck(DatabaseRelationId, MyDatabaseId, GetUserId(),
+ object_aclcheck(DatabaseRelationId, MyDatabaseId, GetSessionUserId(),
ACL_CONNECT) != ACLCHECK_OK)
ereport(FATAL,
(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
Yours,
Laurenz Albe
"Rank, Christian (LfL)" <Christian.Rank@lfl.bayern.de> writes:
just a notice that this sounds like the symptom already described in
/messages/by-id/c7896c19adb646e889d5b2e40fdd17c1@ldbv.bayern.de
Indeed. My apologies for having not gotten back to you in that
thread. There is a committed fix now:
(caution: the fix is nontrivially different in each branch; that
one is for v16)
regards, tom lane