[PATCH] Expose port->authn_id to extensions and triggers
Hi all,
Stephen pointed out [1] that the authenticated identity that's stored
in MyProcPort can't be retrieved by extensions or triggers. Attached is
a patch that provides both a C API and a SQL function for retrieving
it.
GetAuthenticatedIdentityString() is a mouthful but I wanted to
differentiate it from the existing GetAuthenticatedUserId(); better
names welcome. It only exists as an accessor because I wasn't sure if
extensions outside of contrib were encouraged to rely on the internal
layout of struct Port. (If they can, then that call can go away
entirely.)
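As a rough illustration of the accessor option described above, here is a minimal standalone sketch. It assumes nothing about the server headers: `DemoPort`, `demo_port`, and `demo_get_authn_id()` are hypothetical stand-ins for struct Port, MyProcPort, and the proposed getter, and plain `malloc` stands in for the `pstrdup` call used in the patch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for per-connection state; not the real struct Port. */
typedef struct DemoPort
{
    const char *user_name;   /* role requested by the client */
    const char *authn_id;    /* authenticated identity, or NULL if none */
} DemoPort;

/* Hypothetical stand-in for MyProcPort: NULL when there is no client connection. */
DemoPort *demo_port = NULL;

/*
 * Return a freshly allocated copy of the authenticated identity, or NULL if
 * there is no connection or no identity. The caller frees the result.
 */
char *
demo_get_authn_id(void)
{
    const char *src;
    char       *copy;

    if (demo_port == NULL || demo_port->authn_id == NULL)
        return NULL;

    src = demo_port->authn_id;
    copy = malloc(strlen(src) + 1);
    assert(copy != NULL);    /* a real backend would report out-of-memory instead */
    strcpy(copy, src);
    return copy;
}
```

Callers of such an accessor never touch the struct fields, which is the only thing it buys over reading the field directly; if the struct layout is considered stable, the direct read is equivalent.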
Thanks,
--Jacob
[1]: /messages/by-id/CAOuzzgpFpuroNRabEvB9kST_TSyS2jFicBNoXvW7G2pZFixyBw@mail.gmail.com
Attachments:
0001-Add-APIs-to-retrieve-authn_id-from-C-and-SQL.patch (text/x-patch)
From 155a490d28cecf161d994e6d8824cbe967f4d469 Mon Sep 17 00:00:00 2001
From: Jacob Champion <pchampion@vmware.com>
Date: Mon, 14 Feb 2022 08:10:53 -0800
Subject: [PATCH] Add APIs to retrieve authn_id from C and SQL
The authn_id field in MyProcPort is currently only accessible to the
backend itself. Add a getter in C, GetAuthenticatedIdentityString(),
and a corresponding SQL function, session_authn_id(), to expose the
field to extensions and triggers that may want to make use of it.
---
src/backend/utils/adt/name.c | 14 +++++++++++++-
src/backend/utils/init/miscinit.c | 14 ++++++++++++++
src/include/catalog/catversion.h | 2 +-
src/include/catalog/pg_proc.dat | 3 +++
src/include/miscadmin.h | 1 +
src/test/authentication/t/001_password.pl | 11 +++++++++++
6 files changed, 43 insertions(+), 2 deletions(-)
diff --git a/src/backend/utils/adt/name.c b/src/backend/utils/adt/name.c
index e8bba3670c..d605b1e116 100644
--- a/src/backend/utils/adt/name.c
+++ b/src/backend/utils/adt/name.c
@@ -257,7 +257,7 @@ namestrcmp(Name name, const char *str)
/*
- * SQL-functions CURRENT_USER, SESSION_USER
+ * SQL-functions CURRENT_USER, SESSION_USER, SESSION_AUTHN_ID
*/
Datum
current_user(PG_FUNCTION_ARGS)
@@ -271,6 +271,18 @@ session_user(PG_FUNCTION_ARGS)
PG_RETURN_DATUM(DirectFunctionCall1(namein, CStringGetDatum(GetUserNameFromId(GetSessionUserId(), false))));
}
+Datum
+session_authn_id(PG_FUNCTION_ARGS)
+{
+ char *authn_id;
+
+ authn_id = GetAuthenticatedIdentityString();
+ if (!authn_id)
+ PG_RETURN_NULL();
+
+ PG_RETURN_CSTRING(authn_id);
+}
+
/*
* SQL-functions CURRENT_SCHEMA, CURRENT_SCHEMAS
diff --git a/src/backend/utils/init/miscinit.c b/src/backend/utils/init/miscinit.c
index bdc77af719..9fea7f0be3 100644
--- a/src/backend/utils/init/miscinit.c
+++ b/src/backend/utils/init/miscinit.c
@@ -929,6 +929,20 @@ GetUserNameFromId(Oid roleid, bool noerr)
return result;
}
+/*
+ * Get a palloc'd string representing the authenticated identity of the client
+ * (see documentation for Port's authn_id field). Returns NULL if the client has
+ * not been authenticated.
+ */
+char *
+GetAuthenticatedIdentityString(void)
+{
+ if (!MyProcPort || !MyProcPort->authn_id)
+ return NULL;
+
+ return pstrdup(MyProcPort->authn_id);
+}
+
/*-------------------------------------------------------------------------
* Interlock-file support
diff --git a/src/include/catalog/catversion.h b/src/include/catalog/catversion.h
index 1addb568ef..ca52e6889c 100644
--- a/src/include/catalog/catversion.h
+++ b/src/include/catalog/catversion.h
@@ -53,6 +53,6 @@
*/
/* yyyymmddN */
-#define CATALOG_VERSION_NO 202202221
+#define CATALOG_VERSION_NO 202202231
#endif
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 7f1ee97f55..b76d357bee 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -1508,6 +1508,9 @@
{ oid => '746', descr => 'session user name',
proname => 'session_user', provolatile => 's', prorettype => 'name',
proargtypes => '', prosrc => 'session_user' },
+{ oid => '9774', descr => 'session authenticated identity',
+ proname => 'session_authn_id', provolatile => 's', prorettype => 'cstring',
+ proargtypes => '', prosrc => 'session_authn_id' },
{ oid => '744',
proname => 'array_eq', prorettype => 'bool',
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 0abc3ad540..25731ce977 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -347,6 +347,7 @@ extern void SetDataDir(const char *dir);
extern void ChangeToDataDir(void);
extern char *GetUserNameFromId(Oid roleid, bool noerr);
+extern char *GetAuthenticatedIdentityString(void);
extern Oid GetUserId(void);
extern Oid GetOuterUserId(void);
extern Oid GetSessionUserId(void);
diff --git a/src/test/authentication/t/001_password.pl b/src/test/authentication/t/001_password.pl
index 3e3079c824..2aa28ed547 100644
--- a/src/test/authentication/t/001_password.pl
+++ b/src/test/authentication/t/001_password.pl
@@ -82,6 +82,11 @@ test_role($node, 'scram_role', 'trust', 0,
test_role($node, 'md5_role', 'trust', 0,
log_unlike => [qr/connection authenticated:/]);
+my $res = $node->safe_psql('postgres',
+ "SELECT session_authn_id() IS NULL;"
+);
+is($res, 't', "users with trust authentication have NULL authn_id");
+
# For plain "password" method, all users should also be able to connect.
reset_pg_hba($node, 'password');
test_role($node, 'scram_role', 'password', 0,
@@ -91,6 +96,12 @@ test_role($node, 'md5_role', 'password', 0,
log_like =>
[qr/connection authenticated: identity="md5_role" method=password/]);
+$res = $node->safe_psql('postgres',
+ "SELECT session_authn_id();",
+ connstr => "user=md5_role"
+);
+is($res, 'md5_role', "users with md5 authentication have authn_id matching role name");
+
# For "scram-sha-256" method, user "scram_role" should be able to connect.
reset_pg_hba($node, 'scram-sha-256');
test_role(
--
2.25.1
On Thu, Feb 24, 2022 at 12:15:40AM +0000, Jacob Champion wrote:
Stephen pointed out [1] that the authenticated identity that's stored
in MyProcPort can't be retrieved by extensions or triggers. Attached is
a patch that provides both a C API and a SQL function for retrieving
it.
GetAuthenticatedIdentityString() is a mouthful but I wanted to
differentiate it from the existing GetAuthenticatedUserId(); better
names welcome. It only exists as an accessor because I wasn't sure if
extensions outside of contrib were encouraged to rely on the internal
layout of struct Port. (If they can, then that call can go away
entirely.)
+char *
+GetAuthenticatedIdentityString(void)
+{
+ if (!MyProcPort || !MyProcPort->authn_id)
+ return NULL;
+
+ return pstrdup(MyProcPort->authn_id);
I don't quite see the additional value that this API brings as
MyProcPort is directly accessible, and contrib modules like
postgres_fdw and sslinfo just use that to find the data of the current
backend. Cannot a module like pgaudit, through the elog hook, do its
work with what we have already in place?
What's the use case for a given session to be able to report back
only its own authn through SQL?
I could still see a use case for that at a more global level with
beentrys, but it looked like there was not much interest the last time
I dropped this idea.
--
Michael
On Thu, 2022-02-24 at 20:39 +0900, Michael Paquier wrote:
I don't quite see the additional value that this API brings as
MyProcPort is directly accessible, and contrib modules like
postgres_fdw and sslinfo just use that to find the data of the current
backend.
Right -- I just didn't know if third-party modules were actually able
to rely on the internal layout of struct Port. Is that guaranteed to
remain constant for a major release line? If so, this new API is
superfluous.
Cannot a module like pgaudit, through the elog hook, do its
work with what we have already in place?
Given the above, I would hope so. Stephen mentioned that pgaudit only
had access to the logged-in role, and when I proposed a miscadmin.h API
he said that would help. CC'ing him to see what he meant; I don't know
if pgaudit has additional constraints.
What's the use case for a given session to be able to report back
only its own authn through SQL?
That's for triggers to be able to grab the current ID for
logging/auditing. Is there a better way to expose this for that use
case?
I could still see a use case for that at a more global level with
beentrys, but it looked like there was not much interest the last time
I dropped this idea.
I agree that this would be useful to have in the stats. From my outside
perspective, it seems like it's difficult to get strings of arbitrary
length in there; is that accurate?
Thanks,
--Jacob
Hi,
On Thu, Feb 24, 2022 at 04:50:59PM +0000, Jacob Champion wrote:
On Thu, 2022-02-24 at 20:39 +0900, Michael Paquier wrote:
I don't quite see the additional value that this API brings as
MyProcPort is directly accessible, and contrib modules like
postgres_fdw and sslinfo just use that to find the data of the current
backend.
Right -- I just didn't know if third-party modules were actually able
to rely on the internal layout of struct Port. Is that guaranteed to
remain constant for a major release line? If so, this new API is
superfluous.
Yes, third-party code can rely on the Port layout. We don't break ABI between
minor releases. On some occasions we can add additional fields at the end of a
struct, but nothing more.
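To make the "fields appended at the end" point concrete, here is a small standalone sketch. `PortV1` and `PortV2` are hypothetical layouts invented for the illustration, not the real struct Port; the sketch only shows that appending a member leaves the offsets of the existing members unchanged, so a binary built against the older layout still finds `authn_id` in the same place.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical layout as an extension saw it at compile time. */
typedef struct PortV1
{
    int         sock;
    const char *user_name;
    const char *authn_id;
} PortV1;

/* Hypothetical layout after a later minor release appended one field. */
typedef struct PortV2
{
    int         sock;
    const char *user_name;
    const char *authn_id;
    const char *new_field;   /* appended at the end; earlier offsets unchanged */
} PortV2;

/* Returns 1 if every field of the old layout sits at the same offset in the new one. */
int
old_offsets_preserved(void)
{
    return offsetof(PortV1, sock) == offsetof(PortV2, sock) &&
           offsetof(PortV1, user_name) == offsetof(PortV2, user_name) &&
           offsetof(PortV1, authn_id) == offsetof(PortV2, authn_id);
}

/*
 * Read authn_id the way code compiled against PortV1 would: by the offset it
 * knew at build time. memcpy keeps the read well defined in standard C.
 */
const char *
read_authn_id_with_old_layout(const void *port)
{
    const char *id;

    assert(port != NULL);
    memcpy(&id, (const char *) port + offsetof(PortV1, authn_id), sizeof(id));
    return id;
}
```

Inserting a field in the middle, or changing a member's type, would shift those offsets and break already-compiled extensions, which is why only appends are tolerated within a release line.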
I could still see a use case for that at a more global level with
beentrys, but it looked like there was not much interest the last time
I dropped this idea.
I agree that this would be useful to have in the stats. From my outside
perspective, it seems like it's difficult to get strings of arbitrary
length in there; is that accurate?
Yes, as it's all in shared memory. The only workaround is using something like
track_activity_query_size, but it's not great.
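The limitation can be sketched as follows, assuming a fixed-size slot reserved up front in shared memory. `DemoSlot`, `DEMO_SLOT_SIZE`, and `demo_slot_set()` are hypothetical names for this illustration only; the size limit plays the role that track_activity_query_size plays for query text, and anything longer is cut off.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed slot size, chosen once when shared memory is sized. */
#define DEMO_SLOT_SIZE 16

typedef struct DemoSlot
{
    char authn_id[DEMO_SLOT_SIZE];   /* always NUL-terminated */
} DemoSlot;

/*
 * Copy src into the fixed-size slot, truncating if it does not fit.
 * Returns 1 if the value was truncated, 0 otherwise.
 */
int
demo_slot_set(DemoSlot *slot, const char *src)
{
    size_t len = strlen(src);
    int    truncated = (len >= DEMO_SLOT_SIZE);
    size_t n = truncated ? DEMO_SLOT_SIZE - 1 : len;

    assert(slot != NULL);
    memcpy(slot->authn_id, src, n);
    slot->authn_id[n] = '\0';
    return truncated;
}
```

A short role name fits, but a certificate DN such as the one used in the SSL tests would be silently shortened unless the slot is made large enough for the worst case in every backend entry.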
On Fri, 2022-02-25 at 01:15 +0800, Julien Rouhaud wrote:
On Thu, Feb 24, 2022 at 04:50:59PM +0000, Jacob Champion wrote:
On Thu, 2022-02-24 at 20:39 +0900, Michael Paquier wrote:
I don't quite see the additional value that this API brings as
MyProcPort is directly accessible, and contrib modules like
postgres_fdw and sslinfo just use that to find the data of the current
backend.
Right -- I just didn't know if third-party modules were actually able
to rely on the internal layout of struct Port. Is that guaranteed to
remain constant for a major release line? If so, this new API is
superfluous.
Yes, third-party can rely on Port layout. We don't break ABI between minor
release. In some occasions we can add additional fields at the end of a
struct, but nothing more.
That simplifies things. PFA a smaller v2; if pgaudit can't make use of
libpq-be.h for some reason, then I guess we can tailor a fix to that
use case.
I could still see a use case for that at a more global level with
beentrys, but it looked like there was not much interest the last time
I dropped this idea.
I agree that this would be useful to have in the stats. From my outside
perspective, it seems like it's difficult to get strings of arbitrary
length in there; is that accurate?
Yes, as it's all in shared memory. The only workaround is using something like
track_activity_query_size, but it's not great.
Yeah... I was following a similar track with the initial work last
year, but I dropped it when the cost of implementation started to grow
considerably. At the time, though, it looked like some overhauls to the
stats framework were being pursued? I should read up on that thread.
--Jacob
Attachments:
v2-0001-Add-API-to-retrieve-authn_id-from-SQL.patch (text/x-patch)
From 92679a2109be5ba4d81bf58e8fb091c2d0020828 Mon Sep 17 00:00:00 2001
From: Jacob Champion <pchampion@vmware.com>
Date: Mon, 14 Feb 2022 08:10:53 -0800
Subject: [PATCH v2] Add API to retrieve authn_id from SQL
The authn_id field in MyProcPort is currently only accessible to the
backend itself. Add a SQL function, session_authn_id(), to expose the
field to triggers that may want to make use of it.
---
src/backend/utils/adt/name.c | 12 +++++++++++-
src/include/catalog/catversion.h | 2 +-
src/include/catalog/pg_proc.dat | 3 +++
src/test/authentication/t/001_password.pl | 11 +++++++++++
4 files changed, 26 insertions(+), 2 deletions(-)
diff --git a/src/backend/utils/adt/name.c b/src/backend/utils/adt/name.c
index e8bba3670c..ff0e2829bb 100644
--- a/src/backend/utils/adt/name.c
+++ b/src/backend/utils/adt/name.c
@@ -23,6 +23,7 @@
#include "catalog/namespace.h"
#include "catalog/pg_collation.h"
#include "catalog/pg_type.h"
+#include "libpq/libpq-be.h"
#include "libpq/pqformat.h"
#include "mb/pg_wchar.h"
#include "miscadmin.h"
@@ -257,7 +258,7 @@ namestrcmp(Name name, const char *str)
/*
- * SQL-functions CURRENT_USER, SESSION_USER
+ * SQL-functions CURRENT_USER, SESSION_USER, SESSION_AUTHN_ID
*/
Datum
current_user(PG_FUNCTION_ARGS)
@@ -271,6 +272,15 @@ session_user(PG_FUNCTION_ARGS)
PG_RETURN_DATUM(DirectFunctionCall1(namein, CStringGetDatum(GetUserNameFromId(GetSessionUserId(), false))));
}
+Datum
+session_authn_id(PG_FUNCTION_ARGS)
+{
+ if (!MyProcPort || !MyProcPort->authn_id)
+ PG_RETURN_NULL();
+
+ PG_RETURN_CSTRING(MyProcPort->authn_id);
+}
+
/*
* SQL-functions CURRENT_SCHEMA, CURRENT_SCHEMAS
diff --git a/src/include/catalog/catversion.h b/src/include/catalog/catversion.h
index 1addb568ef..ca52e6889c 100644
--- a/src/include/catalog/catversion.h
+++ b/src/include/catalog/catversion.h
@@ -53,6 +53,6 @@
*/
/* yyyymmddN */
-#define CATALOG_VERSION_NO 202202221
+#define CATALOG_VERSION_NO 202202231
#endif
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 7f1ee97f55..b76d357bee 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -1508,6 +1508,9 @@
{ oid => '746', descr => 'session user name',
proname => 'session_user', provolatile => 's', prorettype => 'name',
proargtypes => '', prosrc => 'session_user' },
+{ oid => '9774', descr => 'session authenticated identity',
+ proname => 'session_authn_id', provolatile => 's', prorettype => 'cstring',
+ proargtypes => '', prosrc => 'session_authn_id' },
{ oid => '744',
proname => 'array_eq', prorettype => 'bool',
diff --git a/src/test/authentication/t/001_password.pl b/src/test/authentication/t/001_password.pl
index 3e3079c824..2aa28ed547 100644
--- a/src/test/authentication/t/001_password.pl
+++ b/src/test/authentication/t/001_password.pl
@@ -82,6 +82,11 @@ test_role($node, 'scram_role', 'trust', 0,
test_role($node, 'md5_role', 'trust', 0,
log_unlike => [qr/connection authenticated:/]);
+my $res = $node->safe_psql('postgres',
+ "SELECT session_authn_id() IS NULL;"
+);
+is($res, 't', "users with trust authentication have NULL authn_id");
+
# For plain "password" method, all users should also be able to connect.
reset_pg_hba($node, 'password');
test_role($node, 'scram_role', 'password', 0,
@@ -91,6 +96,12 @@ test_role($node, 'md5_role', 'password', 0,
log_like =>
[qr/connection authenticated: identity="md5_role" method=password/]);
+$res = $node->safe_psql('postgres',
+ "SELECT session_authn_id();",
+ connstr => "user=md5_role"
+);
+is($res, 'md5_role', "users with md5 authentication have authn_id matching role name");
+
# For "scram-sha-256" method, user "scram_role" should be able to connect.
reset_pg_hba($node, 'scram-sha-256');
test_role(
--
2.25.1
On Thu, Feb 24, 2022 at 08:44:08PM +0000, Jacob Champion wrote:
Yeah... I was following a similar track with the initial work last
year, but I dropped it when the cost of implementation started to grow
considerably. At the time, though, it looked like some overhauls to the
stats framework were being pursued? I should read up on that thread.
Do you mean the shared memory stats patchset? IIUC this is unrelated, as the
beentry stuff Michael was talking about is a different infrastructure
(backend_status.[ch]), and I don't think there are any plans to move that to
dynamic shared memory.
Hi,
On 2022-02-25 13:01:26 +0800, Julien Rouhaud wrote:
On Thu, Feb 24, 2022 at 08:44:08PM +0000, Jacob Champion wrote:
Yeah... I was following a similar track with the initial work last
year, but I dropped it when the cost of implementation started to grow
considerably. At the time, though, it looked like some overhauls to the
stats framework were being pursued? I should read up on that thread.
Do you mean the shared memory stats patchset? IIUC this is unrelated, as the
beentry stuff Michael was talking about is a different infrastructure
(backend_status.[ch]), and I don't think there are any plans to move that to
dynamic shared memory.
Until a year ago the backend_status.c stuff was in pgstat.c - I just split
them out because of the shared memory status work - so it'd not be surprising
for them to mentally be thrown in one bucket.
Basically the type of stats we're trying to move to dynamic shared memory is
about counters that should persist for a while and are accumulated across
connections etc. Whereas backend_status.c is more about tracking the current
state (what query is a backend running, what user, etc).
They're not unrelated though: backend_status.c feeds pgstat.c with some
information occasionally.
Greetings,
Andres Freund
Hi,
On Thu, Feb 24, 2022 at 09:18:26PM -0800, Andres Freund wrote:
On 2022-02-25 13:01:26 +0800, Julien Rouhaud wrote:
On Thu, Feb 24, 2022 at 08:44:08PM +0000, Jacob Champion wrote:
Yeah... I was following a similar track with the initial work last
year, but I dropped it when the cost of implementation started to grow
considerably. At the time, though, it looked like some overhauls to the
stats framework were being pursued? I should read up on that thread.
Do you mean the shared memory stats patchset? IIUC this is unrelated, as the
beentry stuff Michael was talking about is a different infrastructure
(backend_status.[ch]), and I don't think there are any plans to move that to
dynamic shared memory.
Until a year ago the backend_status.c stuff was in pgstat.c - I just split
them out because of the shared memory status work - so it'd not be surprising
for them to mentally be thrown in one bucket.
Right.
Basically the type of stats we're trying to move to dynamic shared memory is
about counters that should persist for a while and are accumulated across
connections etc. Whereas backend_status.c is more about tracking the current
state (what query is a backend running, what user, etc).
But would it be acceptable to use dynamic shared memory in backend_status and
e.g. have a dsa_pointer rather than a fixed length array? That seems like a
bad idea for query text in general, but for authn_id for instance it seems less
likely to hold gigantic strings, and also have more or less consistent size
when provided.
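For comparison with a fixed-length array, the idea of storing a relative pointer in the backend entry can be sketched like this. Everything here is a hypothetical toy: `DemoArena`, `demo_ptr`, and the helper functions are invented for the illustration and are not the dsa.h API. The sketch only shows that an offset-based handle lets the entry stay small while the string itself has variable length, and that the handle stays valid wherever the arena happens to be mapped.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_ARENA_SIZE 256

/* Offset into the arena; 0 is reserved to mean "no value". */
typedef size_t demo_ptr;

typedef struct DemoArena
{
    size_t used;                     /* next free offset; starts at 1 */
    char   bytes[DEMO_ARENA_SIZE];
} DemoArena;

/* Hypothetical backend entry: holds a handle instead of a char array. */
typedef struct DemoBackendEntry
{
    demo_ptr authn_id;
} DemoBackendEntry;

void
demo_arena_init(DemoArena *a)
{
    assert(a != NULL);
    a->used = 1;                     /* keep offset 0 invalid */
}

/* Store a NUL-terminated copy of s; returns its offset, or 0 if it does not fit. */
demo_ptr
demo_arena_store(DemoArena *a, const char *s)
{
    size_t   need = strlen(s) + 1;
    demo_ptr at;

    if (need > DEMO_ARENA_SIZE - a->used)
        return 0;
    at = a->used;
    memcpy(a->bytes + at, s, need);
    a->used += need;
    return at;
}

/* Resolve a handle against whichever copy or mapping of the arena is at hand. */
const char *
demo_arena_get(const DemoArena *a, demo_ptr p)
{
    return p == 0 ? NULL : a->bytes + p;
}
```

The toy leaves out everything that makes the real thing costly, such as freeing, locking, and setting up the area before any entry can use it.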
On Thu, 2022-02-24 at 20:44 +0000, Jacob Champion wrote:
That simplifies things. PFA a smaller v2; if pgaudit can't make use of
libpq-be.h for some reason, then I guess we can tailor a fix to that
use case.
Ha, opr_sanity caught my use of cstring. I'll work on a fix later
today.
--Jacob
On Fri, 2022-02-25 at 16:28 +0000, Jacob Champion wrote:
Ha, opr_sanity caught my use of cstring. I'll work on a fix later
today.
Fixed in v3.
--Jacob
Attachments:
v3-0001-Add-API-to-retrieve-authn_id-from-SQL.patch (text/x-patch)
From 2fde60a6bc3739f1894c8c264120e4fa0f04df64 Mon Sep 17 00:00:00 2001
From: Jacob Champion <pchampion@vmware.com>
Date: Mon, 14 Feb 2022 08:10:53 -0800
Subject: [PATCH v3] Add API to retrieve authn_id from SQL
The authn_id field in MyProcPort is currently only accessible to the
backend itself. Add a SQL function, session_authn_id(), to expose the
field to triggers that may want to make use of it.
---
src/backend/utils/adt/name.c | 12 +++++++++++-
src/include/catalog/catversion.h | 2 +-
src/include/catalog/pg_proc.dat | 3 +++
src/test/authentication/t/001_password.pl | 11 +++++++++++
4 files changed, 26 insertions(+), 2 deletions(-)
diff --git a/src/backend/utils/adt/name.c b/src/backend/utils/adt/name.c
index e8bba3670c..b892d25c29 100644
--- a/src/backend/utils/adt/name.c
+++ b/src/backend/utils/adt/name.c
@@ -23,6 +23,7 @@
#include "catalog/namespace.h"
#include "catalog/pg_collation.h"
#include "catalog/pg_type.h"
+#include "libpq/libpq-be.h"
#include "libpq/pqformat.h"
#include "mb/pg_wchar.h"
#include "miscadmin.h"
@@ -257,7 +258,7 @@ namestrcmp(Name name, const char *str)
/*
- * SQL-functions CURRENT_USER, SESSION_USER
+ * SQL-functions CURRENT_USER, SESSION_USER, SESSION_AUTHN_ID
*/
Datum
current_user(PG_FUNCTION_ARGS)
@@ -271,6 +272,15 @@ session_user(PG_FUNCTION_ARGS)
PG_RETURN_DATUM(DirectFunctionCall1(namein, CStringGetDatum(GetUserNameFromId(GetSessionUserId(), false))));
}
+Datum
+session_authn_id(PG_FUNCTION_ARGS)
+{
+ if (!MyProcPort || !MyProcPort->authn_id)
+ PG_RETURN_NULL();
+
+ PG_RETURN_TEXT_P(cstring_to_text(MyProcPort->authn_id));
+}
+
/*
* SQL-functions CURRENT_SCHEMA, CURRENT_SCHEMAS
diff --git a/src/include/catalog/catversion.h b/src/include/catalog/catversion.h
index 1addb568ef..14194afe1c 100644
--- a/src/include/catalog/catversion.h
+++ b/src/include/catalog/catversion.h
@@ -53,6 +53,6 @@
*/
/* yyyymmddN */
-#define CATALOG_VERSION_NO 202202221
+#define CATALOG_VERSION_NO 202202251
#endif
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 7f1ee97f55..3ddcbae55e 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -1508,6 +1508,9 @@
{ oid => '746', descr => 'session user name',
proname => 'session_user', provolatile => 's', prorettype => 'name',
proargtypes => '', prosrc => 'session_user' },
+{ oid => '9774', descr => 'session authenticated identity',
+ proname => 'session_authn_id', provolatile => 's', prorettype => 'text',
+ proargtypes => '', prosrc => 'session_authn_id' },
{ oid => '744',
proname => 'array_eq', prorettype => 'bool',
diff --git a/src/test/authentication/t/001_password.pl b/src/test/authentication/t/001_password.pl
index 3e3079c824..2aa28ed547 100644
--- a/src/test/authentication/t/001_password.pl
+++ b/src/test/authentication/t/001_password.pl
@@ -82,6 +82,11 @@ test_role($node, 'scram_role', 'trust', 0,
test_role($node, 'md5_role', 'trust', 0,
log_unlike => [qr/connection authenticated:/]);
+my $res = $node->safe_psql('postgres',
+ "SELECT session_authn_id() IS NULL;"
+);
+is($res, 't', "users with trust authentication have NULL authn_id");
+
# For plain "password" method, all users should also be able to connect.
reset_pg_hba($node, 'password');
test_role($node, 'scram_role', 'password', 0,
@@ -91,6 +96,12 @@ test_role($node, 'md5_role', 'password', 0,
log_like =>
[qr/connection authenticated: identity="md5_role" method=password/]);
+$res = $node->safe_psql('postgres',
+ "SELECT session_authn_id();",
+ connstr => "user=md5_role"
+);
+is($res, 'md5_role', "users with md5 authentication have authn_id matching role name");
+
# For "scram-sha-256" method, user "scram_role" should be able to connect.
reset_pg_hba($node, 'scram-sha-256');
test_role(
--
2.25.1
On 2022-02-25 20:19:24 +0000, Jacob Champion wrote:
From 2fde60a6bc3739f1894c8c264120e4fa0f04df64 Mon Sep 17 00:00:00 2001
From: Jacob Champion <pchampion@vmware.com>
Date: Mon, 14 Feb 2022 08:10:53 -0800
Subject: [PATCH v3] Add API to retrieve authn_id from SQL
The authn_id field in MyProcPort is currently only accessible to the
backend itself. Add a SQL function, session_authn_id(), to expose the
field to triggers that may want to make use of it.
Looks to me like authn_id isn't synchronized to parallel workers right now. So
the function will return the wrong thing when executed as part of a parallel
query.
I don't think we should add further functions not prefixed with pg_.
Perhaps a few tests for less trivial authn_ids could be worthwhile?
E.g. certificate DNs.
Greetings,
Andres Freund
Hi,
On 2022-02-25 14:25:52 +0800, Julien Rouhaud wrote:
Basically the type of stats we're trying to move to dynamic shared memory is
about counters that should persist for a while and are accumulated across
connections etc. Whereas backend_status.c is more about tracking the current
state (what query is a backend running, what user, etc).
But would it be acceptable to use dynamic shared memory in backend_status and
e.g. have a dsa_pointer rather than a fixed length array?
Might be OK, but it does add a fair bit of complexity. Suddenly there's a
bunch more initialization order dependencies that you don't have right
now. I'd not go there for just this.
Greetings,
Andres Freund
On Fri, Feb 25, 2022 at 01:23:49PM -0800, Andres Freund wrote:
Looks to me like authn_id isn't synchronized to parallel workers right now. So
the function will return the wrong thing when executed as part of a parallel
query.
FWIW, I am not completely sure what's the use case for being able to
see the authn of the current session through a trigger. We expose
that when log_connections is enabled, for audit purposes. I can also
get behind something more central so that one can get a full picture of
the authn used by a bunch of sessions, particularly with complex HBA
policies, but this looks rather limited to me in usability. Perhaps
that's not enough to stand as an objection, though, and the patch is
dead simple.
I don't think we should add further functions not prefixed with pg_.
Yep.
Perhaps a few tests for less trivial authn_ids could be worthwhile?
E.g. certificate DNs.
Yes, src/test/ssl would handle that just fine. Now, this stuff
already looks after authn results with log_connections=on, so that
feels like a duplicate.
--
Michael
On Fri, Feb 25, 2022 at 01:23:49PM -0800, Andres Freund wrote:
Looks to me like authn_id isn't synchronized to parallel workers right now. So
the function will return the wrong thing when executed as part of a parallel
query.
Thanks for the catch. It looks like MyProcPort is left empty in parallel
workers, and other functions that rely on it, like inet_server_addr(), are
marked parallel-restricted, so I've done the same in v4.
On Sat, 2022-02-26 at 14:39 +0900, Michael Paquier wrote:
FWIW, I am not completely sure what's the use case for being able to
see the authn of the current session through a trigger. We expose
that when log_connections is enabled, for audit purposes. I can also
get behind something more central so as one can get a full picture of
the authn used by a bunch of session, particularly with complex HBA
policies, but this looks rather limited to me in usability. Perhaps
that's not enough to stand as an objection, though, and the patch is
dead simple.
I'm primarily motivated by the linked thread -- if the gap between
builtin roles and authn_id is going to be used as ammo against other
security features, then let's close that gap. But I think it's fair to
say that if someone is already using triggers to exhaustively audit a
table, it'd be nice to have this info in the same place too.
I don't think we should add further functions not prefixed with pg_.
Yep.
Fixed.
Perhaps a few tests for less trivial authn_ids could be worthwhile?
E.g. certificate DNs.
Yes, src/test/ssl would handle that just fine. Now, this stuff
already looks after authn results with log_connections=on, so that
feels like a duplicate.
It was easy enough to add, so I added it. I suppose it does protect
against any reimplementations of pg_session_authn_id() that can't
handle longer ID strings, though I admit that's a stretch.
Thanks,
--Jacob
Attachments:
since-v3.diff.txt (text/plain)
commit efec9f040843d1de2fc52f5ce0d020478a5bc75d
Author: Jacob Champion <pchampion@vmware.com>
Date: Mon Feb 28 10:28:51 2022 -0800
squash! Add API to retrieve authn_id from SQL
diff --git a/src/backend/utils/adt/name.c b/src/backend/utils/adt/name.c
index b892d25c29..662a7943ed 100644
--- a/src/backend/utils/adt/name.c
+++ b/src/backend/utils/adt/name.c
@@ -258,7 +258,7 @@ namestrcmp(Name name, const char *str)
/*
- * SQL-functions CURRENT_USER, SESSION_USER, SESSION_AUTHN_ID
+ * SQL-functions CURRENT_USER, SESSION_USER, PG_SESSION_AUTHN_ID
*/
Datum
current_user(PG_FUNCTION_ARGS)
@@ -273,7 +273,7 @@ session_user(PG_FUNCTION_ARGS)
}
Datum
-session_authn_id(PG_FUNCTION_ARGS)
+pg_session_authn_id(PG_FUNCTION_ARGS)
{
if (!MyProcPort || !MyProcPort->authn_id)
PG_RETURN_NULL();
diff --git a/src/include/catalog/catversion.h b/src/include/catalog/catversion.h
index 14194afe1c..3787b8edaf 100644
--- a/src/include/catalog/catversion.h
+++ b/src/include/catalog/catversion.h
@@ -53,6 +53,6 @@
*/
/* yyyymmddN */
-#define CATALOG_VERSION_NO 202202251
+#define CATALOG_VERSION_NO 202202281
#endif
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 51e0d24f01..45326a2fe5 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -1509,8 +1509,8 @@
proname => 'session_user', provolatile => 's', prorettype => 'name',
proargtypes => '', prosrc => 'session_user' },
{ oid => '9774', descr => 'session authenticated identity',
- proname => 'session_authn_id', provolatile => 's', prorettype => 'text',
- proargtypes => '', prosrc => 'session_authn_id' },
+ proname => 'pg_session_authn_id', provolatile => 's', proparallel => 'r',
+ prorettype => 'text', proargtypes => '', prosrc => 'pg_session_authn_id' },
{ oid => '744',
proname => 'array_eq', prorettype => 'bool',
diff --git a/src/test/authentication/t/001_password.pl b/src/test/authentication/t/001_password.pl
index 2aa28ed547..1edac8d588 100644
--- a/src/test/authentication/t/001_password.pl
+++ b/src/test/authentication/t/001_password.pl
@@ -83,7 +83,7 @@ test_role($node, 'md5_role', 'trust', 0,
log_unlike => [qr/connection authenticated:/]);
my $res = $node->safe_psql('postgres',
- "SELECT session_authn_id() IS NULL;"
+ "SELECT pg_session_authn_id() IS NULL;"
);
is($res, 't', "users with trust authentication have NULL authn_id");
@@ -97,7 +97,7 @@ test_role($node, 'md5_role', 'password', 0,
[qr/connection authenticated: identity="md5_role" method=password/]);
$res = $node->safe_psql('postgres',
- "SELECT session_authn_id();",
+ "SELECT pg_session_authn_id();",
connstr => "user=md5_role"
);
is($res, 'md5_role', "users with md5 authentication have authn_id matching role name");
diff --git a/src/test/ssl/t/001_ssltests.pl b/src/test/ssl/t/001_ssltests.pl
index 5c5b16fbe7..79ef7b46f1 100644
--- a/src/test/ssl/t/001_ssltests.pl
+++ b/src/test/ssl/t/001_ssltests.pl
@@ -443,6 +443,13 @@ $node->connect_ok(
qr/connection authenticated: identity="CN=ssltestuser-dn,OU=Testing,OU=Engineering,O=PGDG" method=cert/
],);
+# Sanity-check pg_session_authn_id() for long ID strings
+my $res = $node->safe_psql('postgres',
+ "SELECT pg_session_authn_id();",
+ connstr => "$dn_connstr user=ssltestuser sslcert=ssl/client-dn.crt sslkey=$key{'client-dn.key'}",
+);
+is($res, "CN=ssltestuser-dn,OU=Testing,OU=Engineering,O=PGDG", "users with cert authentication have entire DN as authn_id");
+
# same thing but with a regex
$dn_connstr = "$common_connstr dbname=certdb_dn_re";
v4-0001-Add-API-to-retrieve-authn_id-from-SQL.patch (text/x-patch)
From 8a313bb19ad9d2748edc38ce91464c123642046b Mon Sep 17 00:00:00 2001
From: Jacob Champion <pchampion@vmware.com>
Date: Mon, 14 Feb 2022 08:10:53 -0800
Subject: [PATCH v4] Add API to retrieve authn_id from SQL
The authn_id field in MyProcPort is currently only accessible to the
backend itself. Add a SQL function, pg_session_authn_id(), to expose
the field to triggers that may want to make use of it.
---
src/backend/utils/adt/name.c | 12 +++++++++++-
src/include/catalog/catversion.h | 2 +-
src/include/catalog/pg_proc.dat | 3 +++
src/test/authentication/t/001_password.pl | 11 +++++++++++
src/test/ssl/t/001_ssltests.pl | 7 +++++++
5 files changed, 33 insertions(+), 2 deletions(-)
diff --git a/src/backend/utils/adt/name.c b/src/backend/utils/adt/name.c
index e8bba3670c..662a7943ed 100644
--- a/src/backend/utils/adt/name.c
+++ b/src/backend/utils/adt/name.c
@@ -23,6 +23,7 @@
#include "catalog/namespace.h"
#include "catalog/pg_collation.h"
#include "catalog/pg_type.h"
+#include "libpq/libpq-be.h"
#include "libpq/pqformat.h"
#include "mb/pg_wchar.h"
#include "miscadmin.h"
@@ -257,7 +258,7 @@ namestrcmp(Name name, const char *str)
/*
- * SQL-functions CURRENT_USER, SESSION_USER
+ * SQL-functions CURRENT_USER, SESSION_USER, PG_SESSION_AUTHN_ID
*/
Datum
current_user(PG_FUNCTION_ARGS)
@@ -271,6 +272,15 @@ session_user(PG_FUNCTION_ARGS)
PG_RETURN_DATUM(DirectFunctionCall1(namein, CStringGetDatum(GetUserNameFromId(GetSessionUserId(), false))));
}
+Datum
+pg_session_authn_id(PG_FUNCTION_ARGS)
+{
+ if (!MyProcPort || !MyProcPort->authn_id)
+ PG_RETURN_NULL();
+
+ PG_RETURN_TEXT_P(cstring_to_text(MyProcPort->authn_id));
+}
+
/*
* SQL-functions CURRENT_SCHEMA, CURRENT_SCHEMAS
diff --git a/src/include/catalog/catversion.h b/src/include/catalog/catversion.h
index 14194afe1c..3787b8edaf 100644
--- a/src/include/catalog/catversion.h
+++ b/src/include/catalog/catversion.h
@@ -53,6 +53,6 @@
*/
/* yyyymmddN */
-#define CATALOG_VERSION_NO 202202251
+#define CATALOG_VERSION_NO 202202281
#endif
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 7de8cfc7e9..45326a2fe5 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -1508,6 +1508,9 @@
{ oid => '746', descr => 'session user name',
proname => 'session_user', provolatile => 's', prorettype => 'name',
proargtypes => '', prosrc => 'session_user' },
+{ oid => '9774', descr => 'session authenticated identity',
+ proname => 'pg_session_authn_id', provolatile => 's', proparallel => 'r',
+ prorettype => 'text', proargtypes => '', prosrc => 'pg_session_authn_id' },
{ oid => '744',
proname => 'array_eq', prorettype => 'bool',
diff --git a/src/test/authentication/t/001_password.pl b/src/test/authentication/t/001_password.pl
index 3e3079c824..1edac8d588 100644
--- a/src/test/authentication/t/001_password.pl
+++ b/src/test/authentication/t/001_password.pl
@@ -82,6 +82,11 @@ test_role($node, 'scram_role', 'trust', 0,
test_role($node, 'md5_role', 'trust', 0,
log_unlike => [qr/connection authenticated:/]);
+my $res = $node->safe_psql('postgres',
+ "SELECT pg_session_authn_id() IS NULL;"
+);
+is($res, 't', "users with trust authentication have NULL authn_id");
+
# For plain "password" method, all users should also be able to connect.
reset_pg_hba($node, 'password');
test_role($node, 'scram_role', 'password', 0,
@@ -91,6 +96,12 @@ test_role($node, 'md5_role', 'password', 0,
log_like =>
[qr/connection authenticated: identity="md5_role" method=password/]);
+$res = $node->safe_psql('postgres',
+ "SELECT pg_session_authn_id();",
+ connstr => "user=md5_role"
+);
+is($res, 'md5_role', "users with md5 authentication have authn_id matching role name");
+
# For "scram-sha-256" method, user "scram_role" should be able to connect.
reset_pg_hba($node, 'scram-sha-256');
test_role(
diff --git a/src/test/ssl/t/001_ssltests.pl b/src/test/ssl/t/001_ssltests.pl
index 5c5b16fbe7..79ef7b46f1 100644
--- a/src/test/ssl/t/001_ssltests.pl
+++ b/src/test/ssl/t/001_ssltests.pl
@@ -443,6 +443,13 @@ $node->connect_ok(
qr/connection authenticated: identity="CN=ssltestuser-dn,OU=Testing,OU=Engineering,O=PGDG" method=cert/
],);
+# Sanity-check pg_session_authn_id() for long ID strings
+my $res = $node->safe_psql('postgres',
+ "SELECT pg_session_authn_id();",
+ connstr => "$dn_connstr user=ssltestuser sslcert=ssl/client-dn.crt sslkey=$key{'client-dn.key'}",
+);
+is($res, "CN=ssltestuser-dn,OU=Testing,OU=Engineering,O=PGDG", "users with cert authentication have entire DN as authn_id");
+
# same thing but with a regex
$dn_connstr = "$common_connstr dbname=certdb_dn_re";
--
2.25.1
Greetings,
* Jacob Champion (pchampion@vmware.com) wrote:
On Fri, Feb 25, 2022 at 01:23:49PM -0800, Andres Freund wrote:
Looks to me like authn_id isn't synchronized to parallel workers right now. So
the function will return the wrong thing when executed as part of a parallel
query.
Thanks for the catch. It looks like MyProcPort is left empty, and other
functions that rely on it, like inet_server_addr(), are marked
parallel-restricted, so I've done the same in v4.
That's probably alright.
On Sat, 2022-02-26 at 14:39 +0900, Michael Paquier wrote:
FWIW, I am not completely sure what's the use case for being able to
see the authn of the current session through a trigger. We expose
that when log_connections is enabled, for audit purposes. I can also
get behind something more central so as one can get a full picture of
the authn used by a bunch of sessions, particularly with complex HBA
policies, but this looks rather limited to me in usability. Perhaps
that's not enough to stand as an objection, though, and the patch is
dead simple.
I'm primarily motivated by the linked thread -- if the gap between
builtin roles and authn_id is going to be used as ammo against other
security features, then let's close that gap. But I think it's fair to
say that if someone is already using triggers to exhaustively audit a
table, it'd be nice to have this info in the same place too.
Yeah, we really should make this available to trigger-based auditing
systems too and not just through log_connections which involves a great
deal more log parsing and work external to the database to figure out
who did what.
I don't think we should add further functions not prefixed with pg_.
Yep.
Fixed.
That's fine.
Perhaps a few tests for less trivial authn_ids could be worthwhile?
E.g. certificate DNs.
Yes, src/test/ssl would handle that just fine. Now, this stuff
already looks after authn results with log_connections=on, so that
feels like a duplicate.
It was easy enough to add, so I added it. I suppose it does protect
against any reimplementations of pg_session_authn_id() that can't
handle longer ID strings, though I admit that's a stretch.
Thanks,
--Jacob
commit efec9f040843d1de2fc52f5ce0d020478a5bc75d
Author: Jacob Champion <pchampion@vmware.com>
Date: Mon Feb 28 10:28:51 2022 -0800
squash! Add API to retrieve authn_id from SQL
Bleh. :) Squash indeed.
Subject: [PATCH v4] Add API to retrieve authn_id from SQL
The authn_id field in MyProcPort is currently only accessible to the
backend itself. Add a SQL function, pg_session_authn_id(), to expose
the field to triggers that may want to make use of it.
Only did a quick look but generally looks reasonable to me.
Thanks,
Stephen
On Mon, 2022-02-28 at 16:00 -0500, Stephen Frost wrote:
commit efec9f040843d1de2fc52f5ce0d020478a5bc75d
Author: Jacob Champion <pchampion@vmware.com>
Date: Mon Feb 28 10:28:51 2022 -0800
squash! Add API to retrieve authn_id from SQL
Bleh. :) Squash indeed.
Ha, I wasn't sure if anyone read the since-diffs :) I'll start
wordsmithing them more in the future.
Subject: [PATCH v4] Add API to retrieve authn_id from SQL
The authn_id field in MyProcPort is currently only accessible to the
backend itself. Add a SQL function, pg_session_authn_id(), to expose
the field to triggers that may want to make use of it.
Only did a quick look but generally looks reasonable to me.
Thanks!
--Jacob
On Mon, Feb 28, 2022 at 04:00:36PM -0500, Stephen Frost wrote:
* Jacob Champion (pchampion@vmware.com) wrote:
On Fri, Feb 25, 2022 at 01:23:49PM -0800, Andres Freund wrote:
Looks to me like authn_id isn't synchronized to parallel workers right now. So
the function will return the wrong thing when executed as part of a parallel
query.
Thanks for the catch. It looks like MyProcPort is left empty, and other
functions that rely on it, like inet_server_addr(), are marked
parallel-restricted, so I've done the same in v4.
That's probably alright.
I'd say as well that this is right as-is. If it happens that there is
a use-case for making this parallel aware in the future, it could be
done. Now, it may be a bit weird to make parallel workers inherit the
authn ID of the parent as these did not go through authentication, no?
Letting this function be run only by the leader feels intuitive.
Yeah, we really should make this available to trigger-based auditing
systems too and not just through log_connections which involves a great
deal more log parsing and work external to the database to figure out
who did what.
Okay, I won't fight hard on that if all of you think that this is
useful for a given session.
Subject: [PATCH v4] Add API to retrieve authn_id from SQL
The authn_id field in MyProcPort is currently only accessible to the
backend itself. Add a SQL function, pg_session_authn_id(), to expose
the field to triggers that may want to make use of it.
Only did a quick look but generally looks reasonable to me.
The function and the test are fine, pgperltidy complains a bit about
the format of the tests.
Anyway, this function needs to be documented. I think that you should
just add that in "Session Information Functions" in func.sgml, same
area as current_user(). The last time we talked about the authn ID,
one thing we discussed was how to describe that in a good way to
the user, which is why the section of log_connections was reworked a
bit. And we don't yet have any references to what an authenticated
identity is in the docs.
There is no need to update catversion.h in the patch, committers
usually take care of that and that's an area of the code that
conflicts a lot.
--
Michael
Greetings,
* Michael Paquier (michael@paquier.xyz) wrote:
On Mon, Feb 28, 2022 at 04:00:36PM -0500, Stephen Frost wrote:
* Jacob Champion (pchampion@vmware.com) wrote:
On Fri, Feb 25, 2022 at 01:23:49PM -0800, Andres Freund wrote:
Looks to me like authn_id isn't synchronized to parallel workers right now. So
the function will return the wrong thing when executed as part of a parallel
query.
Thanks for the catch. It looks like MyProcPort is left empty, and other
functions that rely on it, like inet_server_addr(), are marked
parallel-restricted, so I've done the same in v4.
That's probably alright.
I'd say as well that this is right as-is. If it happens that there is
a use-case for making this parallel aware in the future, it could be
done. Now, it may be a bit weird to make parallel workers inherit the
authn ID of the parent as these did not go through authentication, no?
Letting this function be run only by the leader feels intuitive.
I'm not really sure why we're arguing about this, but clearly the authn
ID of the leader process is what should be used because that's the
authentication under which the parallel worker is running, just as much
as the effective role is the authorization. Having this be available in
worker processes would certainly be good as it would allow more query
plans to be considered when these functions are used. At this time, I
don't think that outweighs the complications around having it and I'm
not suggesting that Jacob needs to go do that, but surely it would be
better.
Subject: [PATCH v4] Add API to retrieve authn_id from SQL
The authn_id field in MyProcPort is currently only accessible to the
backend itself. Add a SQL function, pg_session_authn_id(), to expose
the field to triggers that may want to make use of it.
Only did a quick look but generally looks reasonable to me.
The function and the test are fine, pgperltidy complains a bit about
the format of the tests.
Anyway, this function needs to be documented. I think that you should
just add that in "Session Information Functions" in func.sgml, same
area as current_user(). The last time we talked about the authn ID,
one thing we discussed was how to describe that in a good way to
the user, which is why the section of log_connections was reworked a
bit. And we don't yet have any references to what an authenticated
identity is in the docs.
Agreed that it should be documented and that location seems reasonable
to me.
There is no need to update catversion.h in the patch, committers
usually take care of that and that's an area of the code that
conflicts a lot.
Yeah, best to let committers handle catversion bumps.
Thanks,
Stephen
On 25.02.22 21:19, Jacob Champion wrote:
On Fri, 2022-02-25 at 16:28 +0000, Jacob Champion wrote:
Ha, opr_sanity caught my use of cstring. I'll work on a fix later
today.
Fixed in v3.
This patch contains no documentation. I'm having a hard time
understanding what the name "session_authn_id" is supposed to convey.
The comment for the Port.authn_id field says this is the "system
username", which sounds like a clearer terminology.
On Tue, 2022-03-01 at 08:35 -0500, Stephen Frost wrote:
* Michael Paquier (michael@paquier.xyz) wrote:
Anyway, this function needs to be documented. I think that you should
just add that in "Session Information Functions" in func.sgml, same
area as current_user(). The last time we talked about the authn ID,
one thing we discussed was how to describe that in a good way to
the user, which is why the section of log_connections was reworked a
bit. And we don't yet have any references to what an authenticated
identity is in the docs.
Agreed that it should be documented and that location seems reasonable
to me.
Added a first draft in v5, alongside the perltidy fixups mentioned by
Michael.
There is no need to update catversion.h in the patch, committers
usually take care of that and that's an area of the code that
conflicts a lot.
Yeah, best to let committers handle catversion bumps.
Heh, that was added for my benefit -- I was tired of forgetting to
initdb after switching dev branches -- but I've dropped it from the
patch and will just carry that diff locally.
Thanks,
--Jacob
Attachments:
v5-0001-Add-API-to-retrieve-authn_id-from-SQL.patch (text/x-patch)
From fd6e8a5b09b7facbecc8e38ef1f8a3d2cef866d4 Mon Sep 17 00:00:00 2001
From: Jacob Champion <pchampion@vmware.com>
Date: Mon, 14 Feb 2022 08:10:53 -0800
Subject: [PATCH v5] Add API to retrieve authn_id from SQL
The authn_id field in MyProcPort is currently only accessible to the
backend itself. Add a SQL function, pg_session_authn_id(), to expose
the field to triggers that may want to make use of it.
---
doc/src/sgml/func.sgml | 26 +++++++++++++++++++++++
src/backend/utils/adt/name.c | 12 ++++++++++-
src/include/catalog/pg_proc.dat | 3 +++
src/test/authentication/t/001_password.pl | 11 ++++++++++
src/test/ssl/t/001_ssltests.pl | 7 ++++++
5 files changed, 58 insertions(+), 1 deletion(-)
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index df3cd5987b..654b96e677 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -22280,6 +22280,32 @@ SELECT * FROM pg_ls_dir('.') WITH ORDINALITY AS t(ls,n);
</para></entry>
</row>
+ <row>
+ <entry role="func_table_entry"><para role="func_signature">
+ <indexterm>
+ <primary>pg_session_authn_id</primary>
+ </indexterm>
+ <function>pg_session_authn_id</function> ()
+ <returnvalue>text</returnvalue>
+ </para>
+ <para>
+ Returns the authenticated identity for the current connection, or
+ <literal>NULL</literal> if the user has not been authenticated.
+ </para>
+ <para>
+ The authenticated identity is an immutable identifier for the user
+ presented during the connection handshake; the exact format depends on
+ the authentication method in use. (For example, when using the
+ <literal>scram-sha-256</literal> auth method, the authenticated identity
+ is simply the username. When using the <literal>cert</literal> auth
+ method, the authenticated identity is the Distinguished Name of the
+ client certificate.) Even for auth methods which use the username as
+ the authenticated identity, this function differs from
+ <literal>session_user</literal> in that its return value cannot be
+ changed after login.
+ </para></entry>
+ </row>
+
<row>
<entry role="func_table_entry"><para role="func_signature">
<indexterm>
diff --git a/src/backend/utils/adt/name.c b/src/backend/utils/adt/name.c
index e8bba3670c..662a7943ed 100644
--- a/src/backend/utils/adt/name.c
+++ b/src/backend/utils/adt/name.c
@@ -23,6 +23,7 @@
#include "catalog/namespace.h"
#include "catalog/pg_collation.h"
#include "catalog/pg_type.h"
+#include "libpq/libpq-be.h"
#include "libpq/pqformat.h"
#include "mb/pg_wchar.h"
#include "miscadmin.h"
@@ -257,7 +258,7 @@ namestrcmp(Name name, const char *str)
/*
- * SQL-functions CURRENT_USER, SESSION_USER
+ * SQL-functions CURRENT_USER, SESSION_USER, PG_SESSION_AUTHN_ID
*/
Datum
current_user(PG_FUNCTION_ARGS)
@@ -271,6 +272,15 @@ session_user(PG_FUNCTION_ARGS)
PG_RETURN_DATUM(DirectFunctionCall1(namein, CStringGetDatum(GetUserNameFromId(GetSessionUserId(), false))));
}
+Datum
+pg_session_authn_id(PG_FUNCTION_ARGS)
+{
+ if (!MyProcPort || !MyProcPort->authn_id)
+ PG_RETURN_NULL();
+
+ PG_RETURN_TEXT_P(cstring_to_text(MyProcPort->authn_id));
+}
+
/*
* SQL-functions CURRENT_SCHEMA, CURRENT_SCHEMAS
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index bf88858171..3afd171224 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -1508,6 +1508,9 @@
{ oid => '746', descr => 'session user name',
proname => 'session_user', provolatile => 's', prorettype => 'name',
proargtypes => '', prosrc => 'session_user' },
+{ oid => '9774', descr => 'session authenticated identity',
+ proname => 'pg_session_authn_id', provolatile => 's', proparallel => 'r',
+ prorettype => 'text', proargtypes => '', prosrc => 'pg_session_authn_id' },
{ oid => '744',
proname => 'array_eq', prorettype => 'bool',
diff --git a/src/test/authentication/t/001_password.pl b/src/test/authentication/t/001_password.pl
index 3e3079c824..f0bdeda52d 100644
--- a/src/test/authentication/t/001_password.pl
+++ b/src/test/authentication/t/001_password.pl
@@ -82,6 +82,10 @@ test_role($node, 'scram_role', 'trust', 0,
test_role($node, 'md5_role', 'trust', 0,
log_unlike => [qr/connection authenticated:/]);
+my $res =
+ $node->safe_psql('postgres', "SELECT pg_session_authn_id() IS NULL;");
+is($res, 't', "users with trust authentication have NULL authn_id");
+
# For plain "password" method, all users should also be able to connect.
reset_pg_hba($node, 'password');
test_role($node, 'scram_role', 'password', 0,
@@ -91,6 +95,13 @@ test_role($node, 'md5_role', 'password', 0,
log_like =>
[qr/connection authenticated: identity="md5_role" method=password/]);
+$res = $node->safe_psql(
+ 'postgres',
+ "SELECT pg_session_authn_id();",
+ connstr => "user=md5_role");
+is($res, 'md5_role',
+ "users with md5 authentication have authn_id matching role name");
+
# For "scram-sha-256" method, user "scram_role" should be able to connect.
reset_pg_hba($node, 'scram-sha-256');
test_role(
diff --git a/src/test/ssl/t/001_ssltests.pl b/src/test/ssl/t/001_ssltests.pl
index 5c5b16fbe7..79ef7b46f1 100644
--- a/src/test/ssl/t/001_ssltests.pl
+++ b/src/test/ssl/t/001_ssltests.pl
@@ -443,6 +443,13 @@ $node->connect_ok(
qr/connection authenticated: identity="CN=ssltestuser-dn,OU=Testing,OU=Engineering,O=PGDG" method=cert/
],);
+# Sanity-check pg_session_authn_id() for long ID strings
+my $res = $node->safe_psql('postgres',
+ "SELECT pg_session_authn_id();",
+ connstr => "$dn_connstr user=ssltestuser sslcert=ssl/client-dn.crt sslkey=$key{'client-dn.key'}",
+);
+is($res, "CN=ssltestuser-dn,OU=Testing,OU=Engineering,O=PGDG", "users with cert authentication have entire DN as authn_id");
+
# same thing but with a regex
$dn_connstr = "$common_connstr dbname=certdb_dn_re";
--
2.25.1
On Tue, 2022-03-01 at 19:56 +0100, Peter Eisentraut wrote:
This patch contains no documentation. I'm having a hard time
understanding what the name "session_authn_id" is supposed to convey.
The comment for the Port.authn_id field says this is the "system
username", which sounds like a clearer terminology.
"System username" may help from an internal development perspective,
especially as it relates to pg_ident.conf, but I don't think that's
likely to be a useful descriptor to an end user. (I don't think of a
client certificate's Subject Distinguished Name as a "system
username".) Does my attempt in v5 help?
--Jacob
On Tue, Mar 01, 2022 at 10:03:20PM +0000, Jacob Champion wrote:
Added a first draft in v5, alongside the perltidy fixups mentioned by
Michael.
+ The authenticated identity is an immutable identifier for the user
+ presented during the connection handshake; the exact format depends on
+ the authentication method in use. (For example, when using the
+ <literal>scram-sha-256</literal> auth method, the authenticated identity
+ is simply the username. When using the <literal>cert</literal> auth
+ method, the authenticated identity is the Distinguished Name of the
+ client certificate.) Even for auth methods which use the username as
+ the authenticated identity, this function differs from
+ <literal>session_user</literal> in that its return value cannot be
+ changed after login.
That looks good enough from here. Thanks!
Nit: "auth method" would be a first in the documentation, so this had
better be "authentication method". (No need to send an updated patch
just for that).
So, any comments and/or opinions from others?
--
Michael
On 01.03.22 23:05, Jacob Champion wrote:
On Tue, 2022-03-01 at 19:56 +0100, Peter Eisentraut wrote:
This patch contains no documentation. I'm having a hard time
understanding what the name "session_authn_id" is supposed to convey.
The comment for the Port.authn_id field says this is the "system
username", which sounds like a clearer terminology.
"System username" may help from an internal development perspective,
especially as it relates to pg_ident.conf, but I don't think that's
likely to be a useful descriptor to an end user. (I don't think of a
client certificate's Subject Distinguished Name as a "system
username".) Does my attempt in v5 help?
Yeah, maybe there are better names. But I have no idea what the letter
combination "authn_id" is supposed to stand for. Is it an
"authentication identifier"? What does it identify? Maybe I'm missing
something here, but I don't find it clear.
Hi,
On 2022-03-01 08:35:27 -0500, Stephen Frost wrote:
I'm not really sure why we're arguing about this, but clearly the authn
ID of the leader process is what should be used because that's the
authentication under which the parallel worker is running, just as much
as the effective role is the authorization. Having this be available in
worker processes would certainly be good as it would allow more query
plans to be considered when these functions are used. At this time, I
don't think that outweighs the complications around having it and I'm
not suggesting that Jacob needs to go do that, but surely it would be
better.
I don't think we should commit this without synchronizing the authn between
worker / leader (in a separate commit). Too likely that some function that's
marked parallel ok queries the authn_id, opening up a security/monitoring hole
or such because of a bogus return value.
Greetings,
Andres Freund
On Wed, Mar 02, 2022 at 01:27:40PM -0800, Andres Freund wrote:
I don't think we should commit this without synchronizing the authn between
worker / leader (in a separate commit). Too likely that some function that's
marked parallel ok queries the authn_id, opening up a security/monitoring hole
or such because of a bogus return value.
Hmm, OK. Using the same authn ID for the leader and the workers still
looks a bit strange to me as the worker is not the one that does the
authentication, only the leader does that. Anyway, FixedParallelState
includes some authentication data passed down by the leader when
spawning a worker. So, if we were to pass down the authn, we are
going to need a new PARALLEL_KEY_* to serialize and restore the data
passed down via a DSM like any other states as per the business in
parallel.c. Jacob, what do you think?
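For readers unfamiliar with the pattern referred to here, the estimate/serialize/restore steps can be sketched in plain C outside the backend. This is only a rough illustration under stated assumptions: `DemoAuthnState` and the `demo_*` functions are hypothetical names, a plain caller-supplied buffer stands in for the DSM chunk, and `malloc` stands in for backend memory contexts. It is not the code from parallel.c or from any version of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the piece of Port state a worker should see. */
typedef struct DemoAuthnState
{
	char	   *authn_id;		/* NULL when no authentication took place */
} DemoAuthnState;

/* Space needed: one presence byte, plus the string and its terminator. */
size_t
demo_estimate_space(const DemoAuthnState *state)
{
	size_t		size = 1;

	if (state->authn_id != NULL)
		size += strlen(state->authn_id) + 1;
	return size;
}

/* Leader side: flatten the state into a buffer of at least maxsize bytes. */
void
demo_serialize(const DemoAuthnState *state, size_t maxsize, char *dest)
{
	assert(maxsize >= demo_estimate_space(state));

	dest[0] = (state->authn_id != NULL) ? 1 : 0;
	if (state->authn_id != NULL)
		memcpy(dest + 1, state->authn_id, strlen(state->authn_id) + 1);
}

/* Worker side: rebuild a private copy of the state from the buffer. */
void
demo_restore(DemoAuthnState *state, const char *src)
{
	state->authn_id = NULL;
	if (src[0] != 0)
	{
		size_t		len = strlen(src + 1) + 1;

		state->authn_id = malloc(len);
		assert(state->authn_id != NULL);
		memcpy(state->authn_id, src + 1, len);
	}
}
```

The presence byte keeps "no authentication was done" distinguishable from an empty identity string, which matters if the SQL function is expected to return NULL in the first case.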
--
Michael
On Wed, 2022-03-02 at 09:18 +0100, Peter Eisentraut wrote:
On 01.03.22 23:05, Jacob Champion wrote:
On Tue, 2022-03-01 at 19:56 +0100, Peter Eisentraut wrote:
This patch contains no documentation. I'm having a hard time
understanding what the name "session_authn_id" is supposed to convey.
The comment for the Port.authn_id field says this is the "system
username", which sounds like a clearer terminology.
"System username" may help from an internal development perspective,
especially as it relates to pg_ident.conf, but I don't think that's
likely to be a useful descriptor to an end user. (I don't think of a
client certificate's Subject Distinguished Name as a "system
username".) Does my attempt in v5 help?
Yeah, maybe there are better names. But I have no idea what the letter
combination "authn_id" is supposed to stand for. Is it an
"authentication identifier"? What does it identify?
Authenticated identity, but yeah, that's the gist. ("AuthN" being a
standard-ish way to differentiate authentication from "AuthZ"
authorization.)
It's meant to uniquely identify the end user in the case of usermaps,
where multiple separate entities might log in using the same role. It
is distinct from the authorized role name, though they might be exactly
the same in many common setups. And it's not set at all if no
authentication was done.
Maybe I'm missing something here, but I don't find it clear.
I just used the internal name, but if we want to make it more clear
then now would be a good time. Do you have any suggestions? Does
expanding the name (pg_session_authenticated_id, or even
pg_session_authenticated_identity) help?
--Jacob
On Thu, 2022-03-03 at 16:45 +0900, Michael Paquier wrote:
Anyway, FixedParallelState
includes some authentication data passed down by the leader when
spawning a worker. So, if we were to pass down the authn, we are
going to need a new PARALLEL_KEY_* to serialize and restore the data
passed down via a DSM like any other states as per the business in
parallel.c. Jacob, what do you think?
I guess it depends on what we want MyProcPort to look like in a
parallel worker. Are we comfortable having most of it be blank/useless?
Does it need to be filled in?
--Jacob
On Thu, Mar 03, 2022 at 07:16:17PM +0000, Jacob Champion wrote:
I guess it depends on what we want MyProcPort to look like in a
parallel worker. Are we comfortable having most of it be blank/useless?
Does it need to be filled in?
Good question. It depends on the definition of how much
authentication information makes sense for the parallel workers to
inherit from the parent. And as I mentioned upthread, this definition
is not completely clear to me because the parallel workers do *not* go
through the authentication paths of the parent, they are just spawned
in their own dedicated paths that the leader invokes. Inheriting all
this information from the leader has also an impact on the
PgBackendStatus entries of the workers as these are reported in
pg_stat_activity as far as I recall, and it could be confusing to see,
for example, some SSL or some GSS information for automatically
spawned processes because these don't use SSL or GSS when they pass
back data to the leader.
I have been looking at the commit history, and found 6b7d11f, the commit
that switched all the functions of sslinfo to be parallel-restricted
especially because of this. So if we inherit all this information the
restriction on the sslinfo functions could be lifted, though the
interest is honestly limited in this case.
postgres_fdw has introduced recently the concept of cached
connections, as of v14 with 411ae64 and 708d165, with a set of
parallel-restricted functions. Some of the code paths related to
appname look at MyProcPort, so there could be a risk of having some
inconsistent information if this is accessed in a parallel worker.
Looking at the code, I don't think that it would happen now but
copying some of the data of MyProcPort to the worker could avoid any
future issues if this code gets extended.
At the end of the day, Port is an interface used for the communication
between the postmaster and the frontends, so I'd like to say that it
is correct to not apply this concept to parallel workers because they
are not designed to contact any frontend-side things.
--
Michael
On Fri, 2022-03-04 at 10:45 +0900, Michael Paquier wrote:
At the end of the day, Port is an interface used for the communication
between the postmaster and the frontends, so I'd like to say that it
is correct to not apply this concept to parallel workers because they
are not designed to contact any frontend-side things.
Coming back to this late, sorry. I'm not quite sure where to move with
this. I'm considering copying pieces of Port over just so we can see
what it looks like in practice?
Personally I think it makes sense for the parallel workers to have the
authn information for the client -- in fact there's a lot of
information that it seems shouldn't be hidden from them -- but there
are other pieces, like the socket handle, that are clearly not useful.
Thanks,
--Jacob
Jacob Champion <pchampion@vmware.com> writes:
On Fri, 2022-03-04 at 10:45 +0900, Michael Paquier wrote:
At the end of the day, Port is an interface used for the communication
between the postmaster and the frontends, so I'd like to say that it
is correct to not apply this concept to parallel workers because they
are not designed to contact any frontend-side things.
Coming back to this late, sorry. I'm not quite sure where to move with
this. I'm considering copying pieces of Port over just so we can see
what it looks like in practice?
Personally I think it makes sense for the parallel workers to have the
authn information for the client -- in fact there's a lot of
information that it seems shouldn't be hidden from them -- but there
are other pieces, like the socket handle, that are clearly not useful.
Yeah. It seems to me that putting the auth info into struct Port was
a fairly random thing to do in the first place, and we are now dealing
with the fallout of that.
I think what we ought to do here is separate out the data that we think
parallel workers need access to. It does not seem wise to say "workers
can access fields A,B,C of MyPort but not fields X,Y,Z". I do not have
a concrete proposal for fixing it though.
regards, tom lane
On Thu, 2022-03-17 at 18:33 -0400, Tom Lane wrote:
Yeah. It seems to me that putting the auth info into struct Port was
a fairly random thing to do in the first place, and we are now dealing
with the fallout of that.
I think what we ought to do here is separate out the data that we think
parallel workers need access to. It does not seem wise to say "workers
can access fields A,B,C of MyPort but not fields X,Y,Z". I do not have
a concrete proposal for fixing it though.
v6-0002 has my first attempt at this. I moved authn_id into its own
substruct inside Port, which gets serialized with the parallel key
machinery. (My name selection of "SharedPort" is pretty bland.)
Over time, we could move more fields into the shared struct and fill
out the serialization logic as needed, and then maybe eventually
SharedPort can be broken off into its own thing with its own
allocation. But I don't know if we should do it all at once, yet.
WDYT?
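As a rough illustration of the layout described above (not the code in v6-0002), the idea of a shared substruct inside a larger connection struct might look like the sketch below. Every name here (`DemoPort`, `DemoSharedPort`, `demo_init_worker_port`, `demo_session_authn_id`) is hypothetical, and the leader-only fields are just examples of things a worker has no use for.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical: state that is meaningful in both the leader and workers. */
typedef struct DemoSharedPort
{
	const char *authn_id;		/* NULL if no authentication was done */
} DemoSharedPort;

/* Hypothetical: connection state, of which only "shared" reaches workers. */
typedef struct DemoPort
{
	int			sock;			/* leader-only: client socket handle */
	const char *remote_host;	/* leader-only in this sketch */
	DemoSharedPort shared;		/* copied to workers */
} DemoPort;

/* Worker side: blank out leader-only fields, take only the shared part. */
void
demo_init_worker_port(DemoPort *worker, const DemoSharedPort *from_leader)
{
	worker->sock = -1;
	worker->remote_host = NULL;
	worker->shared = *from_leader;
}

/* Same lookup in leader and worker; NULL means "not authenticated". */
const char *
demo_session_authn_id(const DemoPort *port)
{
	if (port == NULL || port->shared.authn_id == NULL)
		return NULL;
	return port->shared.authn_id;
}
```

Grouping the shareable fields in one substruct means the accessor never needs to know whether it runs in the leader or in a worker, and further fields can be moved into the substruct later without changing callers.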
--Jacob
Attachments:
v6-0002-Allow-parallel-workers-to-use-pg_session_authn_id.patch (text/x-patch)
From c96254c0656ffd7fc9ddf5cea524be42cce71c77 Mon Sep 17 00:00:00 2001
From: Jacob Champion <pchampion@vmware.com>
Date: Wed, 23 Mar 2022 15:07:05 -0700
Subject: [PATCH v6 2/3] Allow parallel workers to use pg_session_authn_id()
Move authn_id into a substruct, SharedPort, which is intended to hold
all the information that can be shared between the backend and any
parallel workers. SharedPort is serialized and restored using a new
parallel key.
With this change, the parallel restriction can be removed from
pg_session_authn_id().
---
src/backend/access/transam/parallel.c | 91 ++++++++++++++++++++++-
src/backend/libpq/auth.c | 12 +--
src/backend/utils/adt/name.c | 4 +-
src/include/catalog/pg_proc.dat | 2 +-
src/include/libpq/libpq-be.h | 38 +++++++---
src/test/authentication/t/001_password.pl | 33 ++++++++
6 files changed, 161 insertions(+), 19 deletions(-)
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index df0cd77558..dda2aab7b1 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -76,6 +76,7 @@
#define PARALLEL_KEY_REINDEX_STATE UINT64CONST(0xFFFFFFFFFFFF000C)
#define PARALLEL_KEY_RELMAPPER_STATE UINT64CONST(0xFFFFFFFFFFFF000D)
#define PARALLEL_KEY_UNCOMMITTEDENUMS UINT64CONST(0xFFFFFFFFFFFF000E)
+#define PARALLEL_KEY_SHAREDPORT UINT64CONST(0xFFFFFFFFFFFF000F)
/* Fixed-size parallel state. */
typedef struct FixedParallelState
@@ -212,6 +213,7 @@ InitializeParallelDSM(ParallelContext *pcxt)
Size reindexlen = 0;
Size relmapperlen = 0;
Size uncommittedenumslen = 0;
+ Size sharedportlen = 0;
Size segsize = 0;
int i;
FixedParallelState *fps;
@@ -272,8 +274,10 @@ InitializeParallelDSM(ParallelContext *pcxt)
shm_toc_estimate_chunk(&pcxt->estimator, relmapperlen);
uncommittedenumslen = EstimateUncommittedEnumsSpace();
shm_toc_estimate_chunk(&pcxt->estimator, uncommittedenumslen);
+ sharedportlen = EstimateSharedPortSpace();
+ shm_toc_estimate_chunk(&pcxt->estimator, sharedportlen);
/* If you add more chunks here, you probably need to add keys. */
- shm_toc_estimate_keys(&pcxt->estimator, 11);
+ shm_toc_estimate_keys(&pcxt->estimator, 12);
/* Estimate space need for error queues. */
StaticAssertStmt(BUFFERALIGN(PARALLEL_ERROR_QUEUE_SIZE) ==
@@ -352,6 +356,7 @@ InitializeParallelDSM(ParallelContext *pcxt)
char *session_dsm_handle_space;
char *entrypointstate;
char *uncommittedenumsspace;
+ char *sharedportspace;
Size lnamelen;
/* Serialize shared libraries we have loaded. */
@@ -422,6 +427,12 @@ InitializeParallelDSM(ParallelContext *pcxt)
shm_toc_insert(pcxt->toc, PARALLEL_KEY_UNCOMMITTEDENUMS,
uncommittedenumsspace);
+ /* Serialize our SharedPort. */
+ sharedportspace = shm_toc_allocate(pcxt->toc, sharedportlen);
+ SerializeSharedPort(sharedportlen, sharedportspace);
+ shm_toc_insert(pcxt->toc, PARALLEL_KEY_SHAREDPORT,
+ sharedportspace);
+
/* Allocate space for worker information. */
pcxt->worker = palloc0(sizeof(ParallelWorkerInfo) * pcxt->nworkers);
@@ -467,6 +478,79 @@ InitializeParallelDSM(ParallelContext *pcxt)
MemoryContextSwitchTo(oldcontext);
}
+/*
+ * Calculate the space needed to serialize MyProcPort->shared.
+ */
+Size
+EstimateSharedPortSpace(void)
+{
+ SharedPort *shared = &MyProcPort->shared;
+ Size size = 1;
+
+ if (shared->authn_id)
+ size = add_size(size, strlen(shared->authn_id) + 1);
+
+ return size;
+}
+
+/*
+ * Serialize MyProcPort->shared for use by parallel workers.
+ */
+void
+SerializeSharedPort(Size maxsize, char *start_address)
+{
+ SharedPort *shared = &MyProcPort->shared;
+
+ /*
+ * First byte is an indication of whether or not authn_id has been set to
+ * non-NULL, to differentiate that case from the empty string.
+ */
+ Assert(maxsize > 0);
+ start_address[0] = shared->authn_id ? 1 : 0;
+ start_address++;
+ maxsize--;
+
+ if (shared->authn_id)
+ {
+ Size len;
+
+ len = strlcpy(start_address, shared->authn_id, maxsize) + 1;
+ Assert(len <= maxsize);
+ maxsize -= len;
+ start_address += len;
+ }
+}
+
+/*
+ * Restore MyProcPort->shared from its serialized representation, allocating
+ * MyProcPort if necessary.
+ */
+void
+RestoreSharedPort(char *sharedport)
+{
+ /* First make sure we have a place to put the information. */
+ if (!MyProcPort)
+ {
+ if (!(MyProcPort = calloc(1, sizeof(Port))))
+ ereport(FATAL,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("out of memory")));
+ }
+
+ if (sharedport[0] == 0)
+ {
+ MyProcPort->shared.authn_id = NULL;
+ sharedport++;
+ }
+ else
+ {
+ sharedport++;
+ MyProcPort->shared.authn_id = MemoryContextStrdup(TopMemoryContext,
+ sharedport);
+ sharedport += strlen(sharedport) + 1;
+ }
+}
+
/*
* Reinitialize the dynamic shared memory segment for a parallel context such
* that we could launch workers for it again.
@@ -1270,6 +1354,7 @@ ParallelWorkerMain(Datum main_arg)
char *reindexspace;
char *relmapperspace;
char *uncommittedenumsspace;
+ char *sharedportspace;
StringInfoData msgbuf;
char *session_dsm_handle_space;
Snapshot tsnapshot;
@@ -1479,6 +1564,10 @@ ParallelWorkerMain(Datum main_arg)
false);
RestoreUncommittedEnums(uncommittedenumsspace);
+ /* Restore the SharedPort. */
+ sharedportspace = shm_toc_lookup(toc, PARALLEL_KEY_SHAREDPORT, false);
+ RestoreSharedPort(sharedportspace);
+
/* Attach to the leader's serializable transaction, if SERIALIZABLE. */
AttachSerializableXact(fps->serializable_xact_handle);
diff --git a/src/backend/libpq/auth.c b/src/backend/libpq/auth.c
index efc53f3135..40384a31b0 100644
--- a/src/backend/libpq/auth.c
+++ b/src/backend/libpq/auth.c
@@ -350,7 +350,7 @@ set_authn_id(Port *port, const char *id)
{
Assert(id);
- if (port->authn_id)
+ if (port->shared.authn_id)
{
/*
* An existing authn_id should never be overwritten; that means two
@@ -361,17 +361,18 @@ set_authn_id(Port *port, const char *id)
ereport(FATAL,
(errmsg("authentication identifier set more than once"),
errdetail_log("previous identifier: \"%s\"; new identifier: \"%s\"",
- port->authn_id, id)));
+ port->shared.authn_id, id)));
}
- port->authn_id = MemoryContextStrdup(TopMemoryContext, id);
+ port->shared.authn_id = MemoryContextStrdup(TopMemoryContext, id);
if (Log_connections)
{
ereport(LOG,
errmsg("connection authenticated: identity=\"%s\" method=%s "
"(%s:%d)",
- port->authn_id, hba_authname(port->hba->auth_method), HbaFileName,
+ port->shared.authn_id,
+ hba_authname(port->hba->auth_method), HbaFileName,
port->hba->linenumber));
}
}
@@ -1908,7 +1909,8 @@ auth_peer(hbaPort *port)
*/
set_authn_id(port, pw->pw_name);
- ret = check_usermap(port->hba->usermap, port->user_name, port->authn_id, false);
+ ret = check_usermap(port->hba->usermap, port->user_name,
+ port->shared.authn_id, false);
return ret;
#else
diff --git a/src/backend/utils/adt/name.c b/src/backend/utils/adt/name.c
index 662a7943ed..9000ad05f8 100644
--- a/src/backend/utils/adt/name.c
+++ b/src/backend/utils/adt/name.c
@@ -275,10 +275,10 @@ session_user(PG_FUNCTION_ARGS)
Datum
pg_session_authn_id(PG_FUNCTION_ARGS)
{
- if (!MyProcPort || !MyProcPort->authn_id)
+ if (!MyProcPort || !MyProcPort->shared.authn_id)
PG_RETURN_NULL();
- PG_RETURN_TEXT_P(cstring_to_text(MyProcPort->authn_id));
+ PG_RETURN_TEXT_P(cstring_to_text(MyProcPort->shared.authn_id));
}
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index a1bf898476..b044a71c93 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -1509,7 +1509,7 @@
proname => 'session_user', provolatile => 's', prorettype => 'name',
proargtypes => '', prosrc => 'session_user' },
{ oid => '9774', descr => 'session authenticated identity',
- proname => 'pg_session_authn_id', provolatile => 's', proparallel => 'r',
+ proname => 'pg_session_authn_id', provolatile => 's',
prorettype => 'text', proargtypes => '', prosrc => 'pg_session_authn_id' },
{ oid => '744',
diff --git a/src/include/libpq/libpq-be.h b/src/include/libpq/libpq-be.h
index dd3e5efba3..0a9dc61d04 100644
--- a/src/include/libpq/libpq-be.h
+++ b/src/include/libpq/libpq-be.h
@@ -99,6 +99,27 @@ typedef struct
} pg_gssinfo;
#endif
+/*
+ * Fields from Port that need to be copied over to parallel workers go into the
+ * SharedPort. The same rules apply for allocations here as for Port (must be
+ * malloc'd or palloc'd in TopMemoryContext).
+ */
+typedef struct SharedPort
+{
+ /*
+ * Authenticated identity. The meaning of this identifier is dependent on
+ * hba->auth_method; it is the identity (if any) that the user presented
+ * during the authentication cycle, before they were assigned a database
+ * role. (It is effectively the "SYSTEM-USERNAME" of a pg_ident usermap
+ * -- though the exact string in use may be different, depending on pg_hba
+ * options.)
+ *
+ * authn_id is NULL if the user has not actually been authenticated, for
+ * example if the "trust" auth method is in use.
+ */
+ const char *authn_id;
+} SharedPort;
+
/*
* This is used by the postmaster in its communication with frontends. It
* contains all state information needed during this communication before the
@@ -160,17 +181,10 @@ typedef struct Port
HbaLine *hba;
/*
- * Authenticated identity. The meaning of this identifier is dependent on
- * hba->auth_method; it is the identity (if any) that the user presented
- * during the authentication cycle, before they were assigned a database
- * role. (It is effectively the "SYSTEM-USERNAME" of a pg_ident usermap
- * -- though the exact string in use may be different, depending on pg_hba
- * options.)
- *
- * authn_id is NULL if the user has not actually been authenticated, for
- * example if the "trust" auth method is in use.
+ * Information that's copied between the backend and any parallel workers.
+ * This is the only part of the Port that a parallel worker may access!
*/
- const char *authn_id;
+ SharedPort shared;
/*
* TCP keepalive and user timeout settings.
@@ -341,4 +355,8 @@ extern int pq_setkeepalivesinterval(int interval, Port *port);
extern int pq_setkeepalivescount(int count, Port *port);
extern int pq_settcpusertimeout(int timeout, Port *port);
+extern Size EstimateSharedPortSpace(void);
+extern void SerializeSharedPort(Size maxsize, char *start_address);
+extern void RestoreSharedPort(char *sharedport);
+
#endif /* LIBPQ_BE_H */
diff --git a/src/test/authentication/t/001_password.pl b/src/test/authentication/t/001_password.pl
index f0bdeda52d..3f8629b3a6 100644
--- a/src/test/authentication/t/001_password.pl
+++ b/src/test/authentication/t/001_password.pl
@@ -74,6 +74,14 @@ $node->safe_psql('postgres',
);
$ENV{"PGPASSWORD"} = 'pass';
+# Set up a table for parallel worker testing.
+$node->safe_psql('postgres',
+ 'CREATE TABLE nulls (n) AS SELECT NULL FROM generate_series(1, 200000);'
+);
+$node->safe_psql('postgres',
+ 'GRANT SELECT ON nulls TO md5_role;'
+);
+
# For "trust" method, all users should be able to connect. These users are not
# considered to be authenticated.
reset_pg_hba($node, 'trust');
@@ -86,6 +94,19 @@ my $res =
$node->safe_psql('postgres', "SELECT pg_session_authn_id() IS NULL;");
is($res, 't', "users with trust authentication have NULL authn_id");
+# Test pg_session_authn_id() with parallel workers.
+$res = $node->safe_psql(
+ 'postgres', '
+ SET min_parallel_table_scan_size TO 0;
+ SET parallel_setup_cost TO 0;
+ SET parallel_tuple_cost TO 0;
+ SET max_parallel_workers_per_gather TO 2;
+
+ SELECT bool_and(pg_session_authn_id() IS NOT DISTINCT FROM n) FROM nulls;
+ ',
+ connstr => "user=md5_role");
+is($res, 't', "parallel workers return a null authn_id when not authenticated");
+
# For plain "password" method, all users should also be able to connect.
reset_pg_hba($node, 'password');
test_role($node, 'scram_role', 'password', 0,
@@ -102,6 +123,18 @@ $res = $node->safe_psql(
is($res, 'md5_role',
"users with md5 authentication have authn_id matching role name");
+$res = $node->safe_psql(
+ 'postgres', '
+ SET min_parallel_table_scan_size TO 0;
+ SET parallel_setup_cost TO 0;
+ SET parallel_tuple_cost TO 0;
+ SET max_parallel_workers_per_gather TO 2;
+
+ SELECT bool_and(pg_session_authn_id() IS DISTINCT FROM n) FROM nulls;
+ ',
+ connstr => "user=md5_role");
+is($res, 't', "parallel workers return a non-null authn_id when authenticated");
+
# For "scram-sha-256" method, user "scram_role" should be able to connect.
reset_pg_hba($node, 'scram-sha-256');
test_role(
--
2.25.1
v6-0001-Add-API-to-retrieve-authn_id-from-SQL.patch (text/x-patch)
From c1323d7694deedf1fa829772083977c18c7f77f6 Mon Sep 17 00:00:00 2001
From: Jacob Champion <pchampion@vmware.com>
Date: Mon, 14 Feb 2022 08:10:53 -0800
Subject: [PATCH v6 1/3] Add API to retrieve authn_id from SQL
The authn_id field in MyProcPort is currently only accessible to the
backend itself. Add a SQL function, pg_session_authn_id(), to expose
the field to triggers that may want to make use of it.
---
doc/src/sgml/func.sgml | 26 +++++++++++++++++++++++
src/backend/utils/adt/name.c | 12 ++++++++++-
src/include/catalog/pg_proc.dat | 3 +++
src/test/authentication/t/001_password.pl | 11 ++++++++++
src/test/ssl/t/001_ssltests.pl | 7 ++++++
5 files changed, 58 insertions(+), 1 deletion(-)
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 89a5e17884..45df4ff158 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -22280,6 +22280,32 @@ SELECT * FROM pg_ls_dir('.') WITH ORDINALITY AS t(ls,n);
</para></entry>
</row>
+ <row>
+ <entry role="func_table_entry"><para role="func_signature">
+ <indexterm>
+ <primary>pg_session_authn_id</primary>
+ </indexterm>
+ <function>pg_session_authn_id</function> ()
+ <returnvalue>text</returnvalue>
+ </para>
+ <para>
+ Returns the authenticated identity for the current connection, or
+ <literal>NULL</literal> if the user has not been authenticated.
+ </para>
+ <para>
+ The authenticated identity is an immutable identifier for the user
+ presented during the connection handshake; the exact format depends on
+ the authentication method in use. (For example, when using the
+ <literal>scram-sha-256</literal> auth method, the authenticated identity
+ is simply the username. When using the <literal>cert</literal> auth
+ method, the authenticated identity is the Distinguished Name of the
+ client certificate.) Even for auth methods which use the username as
+ the authenticated identity, this function differs from
+ <literal>session_user</literal> in that its return value cannot be
+ changed after login.
+ </para></entry>
+ </row>
+
<row>
<entry role="func_table_entry"><para role="func_signature">
<indexterm>
diff --git a/src/backend/utils/adt/name.c b/src/backend/utils/adt/name.c
index e8bba3670c..662a7943ed 100644
--- a/src/backend/utils/adt/name.c
+++ b/src/backend/utils/adt/name.c
@@ -23,6 +23,7 @@
#include "catalog/namespace.h"
#include "catalog/pg_collation.h"
#include "catalog/pg_type.h"
+#include "libpq/libpq-be.h"
#include "libpq/pqformat.h"
#include "mb/pg_wchar.h"
#include "miscadmin.h"
@@ -257,7 +258,7 @@ namestrcmp(Name name, const char *str)
/*
- * SQL-functions CURRENT_USER, SESSION_USER
+ * SQL-functions CURRENT_USER, SESSION_USER, PG_SESSION_AUTHN_ID
*/
Datum
current_user(PG_FUNCTION_ARGS)
@@ -271,6 +272,15 @@ session_user(PG_FUNCTION_ARGS)
PG_RETURN_DATUM(DirectFunctionCall1(namein, CStringGetDatum(GetUserNameFromId(GetSessionUserId(), false))));
}
+Datum
+pg_session_authn_id(PG_FUNCTION_ARGS)
+{
+ if (!MyProcPort || !MyProcPort->authn_id)
+ PG_RETURN_NULL();
+
+ PG_RETURN_TEXT_P(cstring_to_text(MyProcPort->authn_id));
+}
+
/*
* SQL-functions CURRENT_SCHEMA, CURRENT_SCHEMAS
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index d8e8715ed1..a1bf898476 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -1508,6 +1508,9 @@
{ oid => '746', descr => 'session user name',
proname => 'session_user', provolatile => 's', prorettype => 'name',
proargtypes => '', prosrc => 'session_user' },
+{ oid => '9774', descr => 'session authenticated identity',
+ proname => 'pg_session_authn_id', provolatile => 's', proparallel => 'r',
+ prorettype => 'text', proargtypes => '', prosrc => 'pg_session_authn_id' },
{ oid => '744',
proname => 'array_eq', prorettype => 'bool',
diff --git a/src/test/authentication/t/001_password.pl b/src/test/authentication/t/001_password.pl
index 3e3079c824..f0bdeda52d 100644
--- a/src/test/authentication/t/001_password.pl
+++ b/src/test/authentication/t/001_password.pl
@@ -82,6 +82,10 @@ test_role($node, 'scram_role', 'trust', 0,
test_role($node, 'md5_role', 'trust', 0,
log_unlike => [qr/connection authenticated:/]);
+my $res =
+ $node->safe_psql('postgres', "SELECT pg_session_authn_id() IS NULL;");
+is($res, 't', "users with trust authentication have NULL authn_id");
+
# For plain "password" method, all users should also be able to connect.
reset_pg_hba($node, 'password');
test_role($node, 'scram_role', 'password', 0,
@@ -91,6 +95,13 @@ test_role($node, 'md5_role', 'password', 0,
log_like =>
[qr/connection authenticated: identity="md5_role" method=password/]);
+$res = $node->safe_psql(
+ 'postgres',
+ "SELECT pg_session_authn_id();",
+ connstr => "user=md5_role");
+is($res, 'md5_role',
+ "users with md5 authentication have authn_id matching role name");
+
# For "scram-sha-256" method, user "scram_role" should be able to connect.
reset_pg_hba($node, 'scram-sha-256');
test_role(
diff --git a/src/test/ssl/t/001_ssltests.pl b/src/test/ssl/t/001_ssltests.pl
index 5c5b16fbe7..79ef7b46f1 100644
--- a/src/test/ssl/t/001_ssltests.pl
+++ b/src/test/ssl/t/001_ssltests.pl
@@ -443,6 +443,13 @@ $node->connect_ok(
qr/connection authenticated: identity="CN=ssltestuser-dn,OU=Testing,OU=Engineering,O=PGDG" method=cert/
],);
+# Sanity-check pg_session_authn_id() for long ID strings
+my $res = $node->safe_psql('postgres',
+ "SELECT pg_session_authn_id();",
+ connstr => "$dn_connstr user=ssltestuser sslcert=ssl/client-dn.crt sslkey=$key{'client-dn.key'}",
+);
+is($res, "CN=ssltestuser-dn,OU=Testing,OU=Engineering,O=PGDG", "users with cert authentication have entire DN as authn_id");
+
# same thing but with a regex
$dn_connstr = "$common_connstr dbname=certdb_dn_re";
--
2.25.1
Jacob Champion <pchampion@vmware.com> writes:
On Thu, 2022-03-17 at 18:33 -0400, Tom Lane wrote:
I think what we ought to do here is separate out the data that we think
parallel workers need access to. It does not seem wise to say "workers
can access fields A,B,C of MyPort but not fields X,Y,Z". I do not have
a concrete proposal for fixing it though.
v6-0002 has my first attempt at this. I moved authn_id into its own
substruct inside Port, which gets serialized with the parallel key
machinery. (My name selection of "SharedPort" is pretty bland.)
Hm. I was more envisioning getting the "sharable" info out of Port
entirely, although I'm not quite sure where it should go instead.
regards, tom lane
On Wed, 2022-03-23 at 19:00 -0400, Tom Lane wrote:
Hm. I was more envisioning getting the "sharable" info out of Port
entirely, although I'm not quite sure where it should go instead.
If it helps, I can move the substruct out and up to a new global struct
(MyProcShared?). At this point I think it's mostly search-and-replace.
--Jacob
Hi,
On 2022-03-23 23:06:14 +0000, Jacob Champion wrote:
On Wed, 2022-03-23 at 19:00 -0400, Tom Lane wrote:
Hm. I was more envisioning getting the "sharable" info out of Port
entirely, although I'm not quite sure where it should go instead.
If it helps, I can move the substruct out and up to a new global struct
(MyProcShared?). At this point I think it's mostly search-and-replace.
Perhaps alongside CurrentUserId etc in miscinit.c? It would be nicer if all
those were together in a struct, but oh well.
Another option would be to make it a GUC. With a bit of care it could be
automatically synced by the existing parallelism infrastructure...
Greetings,
Andres Freund
On Wed, 2022-03-23 at 16:54 -0700, Andres Freund wrote:
On 2022-03-23 23:06:14 +0000, Jacob Champion wrote:
On Wed, 2022-03-23 at 19:00 -0400, Tom Lane wrote:
Hm. I was more envisioning getting the "sharable" info out of Port
entirely, although I'm not quite sure where it should go instead.
If it helps, I can move the substruct out and up to a new global struct
(MyProcShared?). At this point I think it's mostly search-and-replace.
Perhaps alongside CurrentUserId etc in miscinit.c? It would be nicer if all
those were together in a struct, but oh well.
Next draft in v7. My naming choices probably make even less sense now. Any ideas for names for "a bag of stuff that we want parallel workers to have too"?
Another option would be to make it a GUC. With a bit of care it could be
automatically synced by the existing parallelism infrastructure...
Like a write-once, PGC_INTERNAL setting? I guess I don't have any
intuition on how that would compare to the separate-global-and-accessor
approach. Is the primary advantage that you don't have to maintain the
serialization logic, or is there more to it?
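To make the "separate global and accessor" side of that comparison concrete, here is a rough standalone sketch. It is not PostgreSQL code: DemoProcShared, demo_set_authn_id() and demo_get_authn_id() are hypothetical names, malloc() stands in for MemoryContextStrdup(), and the setter returns false on a second attempt where the patch's set_authn_id() would instead raise FATAL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical process-global bag of shareable state (not PostgreSQL code). */
typedef struct DemoProcShared
{
	const char *authn_id;		/* NULL until authentication sets it */
} DemoProcShared;

static DemoProcShared demo_proc_shared = {NULL};

/*
 * Set the identity exactly once.  Returns false and changes nothing if the
 * id is NULL or an identity has already been recorded.
 */
static bool
demo_set_authn_id(const char *id)
{
	size_t		len;
	char	   *copy;

	if (id == NULL || demo_proc_shared.authn_id != NULL)
		return false;

	len = strlen(id) + 1;
	copy = malloc(len);
	if (copy == NULL)
		abort();
	memcpy(copy, id, len);
	demo_proc_shared.authn_id = copy;
	return true;
}

/* Read-only accessor; callers never touch the global directly. */
static const char *
demo_get_authn_id(void)
{
	return demo_proc_shared.authn_id;
}
```

With this shape the write-once rule lives in the setter, and the serialization for parallel workers still has to be written by hand, which is the maintenance cost the GUC route might avoid.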
Thanks,
--Jacob
Attachments:
since-v6.diff.txt (text/plain)
commit e6eca817f3cc359fff762600ad286d92046ba07d
Author: Jacob Champion <pchampion@vmware.com>
Date: Thu Mar 24 10:00:30 2022 -0700
squash! Allow parallel workers to use pg_session_authn_id()
Move SharedPort out of Port and over to miscinit.c.
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index dda2aab7b1..c88eab0933 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -478,79 +478,6 @@ InitializeParallelDSM(ParallelContext *pcxt)
MemoryContextSwitchTo(oldcontext);
}
-/*
- * Calculate the space needed to serialize MyProcPort->shared.
- */
-Size
-EstimateSharedPortSpace(void)
-{
- SharedPort *shared = &MyProcPort->shared;
- Size size = 1;
-
- if (shared->authn_id)
- size = add_size(size, strlen(shared->authn_id) + 1);
-
- return size;
-}
-
-/*
- * Serialize MyProcPort->shared for use by parallel workers.
- */
-void
-SerializeSharedPort(Size maxsize, char *start_address)
-{
- SharedPort *shared = &MyProcPort->shared;
-
- /*
- * First byte is an indication of whether or not authn_id has been set to
- * non-NULL, to differentiate that case from the empty string.
- */
- Assert(maxsize > 0);
- start_address[0] = shared->authn_id ? 1 : 0;
- start_address++;
- maxsize--;
-
- if (shared->authn_id)
- {
- Size len;
-
- len = strlcpy(start_address, shared->authn_id, maxsize) + 1;
- Assert(len <= maxsize);
- maxsize -= len;
- start_address += len;
- }
-}
-
-/*
- * Restore MyProcPort->shared from its serialized representation, allocating
- * MyProcPort if necessary.
- */
-void
-RestoreSharedPort(char *sharedport)
-{
- /* First make sure we have a place to put the information. */
- if (!MyProcPort)
- {
- if (!(MyProcPort = calloc(1, sizeof(Port))))
- ereport(FATAL,
- (errcode(ERRCODE_OUT_OF_MEMORY),
- errmsg("out of memory")));
- }
-
- if (sharedport[0] == 0)
- {
- MyProcPort->shared.authn_id = NULL;
- sharedport++;
- }
- else
- {
- sharedport++;
- MyProcPort->shared.authn_id = MemoryContextStrdup(TopMemoryContext,
- sharedport);
- sharedport += strlen(sharedport) + 1;
- }
-}
-
/*
* Reinitialize the dynamic shared memory segment for a parallel context such
* that we could launch workers for it again.
diff --git a/src/backend/libpq/auth.c b/src/backend/libpq/auth.c
index 40384a31b0..bceda9755a 100644
--- a/src/backend/libpq/auth.c
+++ b/src/backend/libpq/auth.c
@@ -342,15 +342,15 @@ auth_failed(Port *port, int status, const char *logdetail)
* authorization will fail later.
*
* The provided string will be copied into TopMemoryContext, to match the
- * lifetime of the Port, so it is safe to pass a string that is managed by an
- * external library.
+ * lifetime of MyProcShared, so it is safe to pass a string that is managed by
+ * an external library.
*/
static void
set_authn_id(Port *port, const char *id)
{
Assert(id);
- if (port->shared.authn_id)
+ if (MyProcShared.authn_id)
{
/*
* An existing authn_id should never be overwritten; that means two
@@ -361,17 +361,17 @@ set_authn_id(Port *port, const char *id)
ereport(FATAL,
(errmsg("authentication identifier set more than once"),
errdetail_log("previous identifier: \"%s\"; new identifier: \"%s\"",
- port->shared.authn_id, id)));
+ MyProcShared.authn_id, id)));
}
- port->shared.authn_id = MemoryContextStrdup(TopMemoryContext, id);
+ MyProcShared.authn_id = MemoryContextStrdup(TopMemoryContext, id);
if (Log_connections)
{
ereport(LOG,
errmsg("connection authenticated: identity=\"%s\" method=%s "
"(%s:%d)",
- port->shared.authn_id,
+ MyProcShared.authn_id,
hba_authname(port->hba->auth_method), HbaFileName,
port->hba->linenumber));
}
@@ -1910,7 +1910,7 @@ auth_peer(hbaPort *port)
set_authn_id(port, pw->pw_name);
ret = check_usermap(port->hba->usermap, port->user_name,
- port->shared.authn_id, false);
+ MyProcShared.authn_id, false);
return ret;
#else
diff --git a/src/backend/utils/adt/name.c b/src/backend/utils/adt/name.c
index 9000ad05f8..6d497e63d9 100644
--- a/src/backend/utils/adt/name.c
+++ b/src/backend/utils/adt/name.c
@@ -275,10 +275,10 @@ session_user(PG_FUNCTION_ARGS)
Datum
pg_session_authn_id(PG_FUNCTION_ARGS)
{
- if (!MyProcPort || !MyProcPort->shared.authn_id)
+ if (!MyProcShared.authn_id)
PG_RETURN_NULL();
- PG_RETURN_TEXT_P(cstring_to_text(MyProcPort->shared.authn_id));
+ PG_RETURN_TEXT_P(cstring_to_text(MyProcShared.authn_id));
}
diff --git a/src/backend/utils/init/miscinit.c b/src/backend/utils/init/miscinit.c
index bdc77af719..0afab3e142 100644
--- a/src/backend/utils/init/miscinit.c
+++ b/src/backend/utils/init/miscinit.c
@@ -929,6 +929,77 @@ GetUserNameFromId(Oid roleid, bool noerr)
return result;
}
+/* ------------------------------------------------------------------------
+ * "Shared" connection state
+ *
+ * MyProcShared contains pieces of information about the client that need to be
+ * synced to parallel workers when they initialize. Over time, this list will
+ * probably grow, and may subsume some of the "user state" variables above.
+ *-------------------------------------------------------------------------
+ */
+
+SharedPort MyProcShared;
+
+/*
+ * Calculate the space needed to serialize MyProcShared.
+ */
+Size
+EstimateSharedPortSpace(void)
+{
+ Size size = 1;
+
+ if (MyProcShared.authn_id)
+ size = add_size(size, strlen(MyProcShared.authn_id) + 1);
+
+ return size;
+}
+
+/*
+ * Serialize MyProcShared for use by parallel workers.
+ */
+void
+SerializeSharedPort(Size maxsize, char *start_address)
+{
+ /*
+ * First byte is an indication of whether or not authn_id has been set to
+ * non-NULL, to differentiate that case from the empty string.
+ */
+ Assert(maxsize > 0);
+ start_address[0] = MyProcShared.authn_id ? 1 : 0;
+ start_address++;
+ maxsize--;
+
+ if (MyProcShared.authn_id)
+ {
+ Size len;
+
+ len = strlcpy(start_address, MyProcShared.authn_id, maxsize) + 1;
+ Assert(len <= maxsize);
+ maxsize -= len;
+ start_address += len;
+ }
+}
+
+/*
+ * Restore MyProcShared from its serialized representation.
+ */
+void
+RestoreSharedPort(char *sharedport)
+{
+ if (sharedport[0] == 0)
+ {
+ MyProcShared.authn_id = NULL;
+ sharedport++;
+ }
+ else
+ {
+ sharedport++;
+ MyProcShared.authn_id = MemoryContextStrdup(TopMemoryContext,
+ sharedport);
+ sharedport += strlen(sharedport) + 1;
+ }
+}
+
/*-------------------------------------------------------------------------
* Interlock-file support
diff --git a/src/include/libpq/libpq-be.h b/src/include/libpq/libpq-be.h
index 0a9dc61d04..911b8246ce 100644
--- a/src/include/libpq/libpq-be.h
+++ b/src/include/libpq/libpq-be.h
@@ -180,12 +180,6 @@ typedef struct Port
*/
HbaLine *hba;
- /*
- * Information that's copied between the backend and any parallel workers.
- * This is the only part of the Port that a parallel worker may access!
- */
- SharedPort shared;
-
/*
* TCP keepalive and user timeout settings.
*
@@ -342,6 +336,7 @@ extern ssize_t be_gssapi_write(Port *port, void *ptr, size_t len);
#endif /* ENABLE_GSS */
extern ProtocolVersion FrontendProtocol;
+extern SharedPort MyProcShared;
/* TCP keepalives configuration. These are no-ops on an AF_UNIX socket. */
@@ -355,8 +350,4 @@ extern int pq_setkeepalivesinterval(int interval, Port *port);
extern int pq_setkeepalivescount(int count, Port *port);
extern int pq_settcpusertimeout(int timeout, Port *port);
-extern Size EstimateSharedPortSpace(void);
-extern void SerializeSharedPort(Size maxsize, char *start_address);
-extern void RestoreSharedPort(char *sharedport);
-
#endif /* LIBPQ_BE_H */
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 0abc3ad540..68cc1517a0 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -481,6 +481,10 @@ extern void process_session_preload_libraries(void);
extern void pg_bindtextdomain(const char *domain);
extern bool has_rolreplication(Oid roleid);
+extern Size EstimateSharedPortSpace(void);
+extern void SerializeSharedPort(Size maxsize, char *start_address);
+extern void RestoreSharedPort(char *sharedport);
+
/* in access/transam/xlog.c */
extern bool BackupInProgress(void);
extern void CancelBackup(void);
v7-0001-Add-API-to-retrieve-authn_id-from-SQL.patch (text/x-patch)
From e30055b1a56f82214dc730077114b4167bff53be Mon Sep 17 00:00:00 2001
From: Jacob Champion <pchampion@vmware.com>
Date: Mon, 14 Feb 2022 08:10:53 -0800
Subject: [PATCH v7 1/3] Add API to retrieve authn_id from SQL
The authn_id field in MyProcPort is currently only accessible to the
backend itself. Add a SQL function, pg_session_authn_id(), to expose
the field to triggers that may want to make use of it.
---
doc/src/sgml/func.sgml | 26 +++++++++++++++++++++++
src/backend/utils/adt/name.c | 12 ++++++++++-
src/include/catalog/pg_proc.dat | 3 +++
src/test/authentication/t/001_password.pl | 11 ++++++++++
src/test/ssl/t/001_ssltests.pl | 7 ++++++
5 files changed, 58 insertions(+), 1 deletion(-)
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 8a802fb225..441a0fd63d 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -22280,6 +22280,32 @@ SELECT * FROM pg_ls_dir('.') WITH ORDINALITY AS t(ls,n);
</para></entry>
</row>
+ <row>
+ <entry role="func_table_entry"><para role="func_signature">
+ <indexterm>
+ <primary>pg_session_authn_id</primary>
+ </indexterm>
+ <function>pg_session_authn_id</function> ()
+ <returnvalue>text</returnvalue>
+ </para>
+ <para>
+ Returns the authenticated identity for the current connection, or
+ <literal>NULL</literal> if the user has not been authenticated.
+ </para>
+ <para>
+ The authenticated identity is an immutable identifier for the user
+ presented during the connection handshake; the exact format depends on
+ the authentication method in use. (For example, when using the
+ <literal>scram-sha-256</literal> auth method, the authenticated identity
+ is simply the username. When using the <literal>cert</literal> auth
+ method, the authenticated identity is the Distinguished Name of the
+ client certificate.) Even for auth methods which use the username as
+ the authenticated identity, this function differs from
+ <literal>session_user</literal> in that its return value cannot be
+ changed after login.
+ </para></entry>
+ </row>
+
<row>
<entry role="func_table_entry"><para role="func_signature">
<indexterm>
diff --git a/src/backend/utils/adt/name.c b/src/backend/utils/adt/name.c
index e8bba3670c..662a7943ed 100644
--- a/src/backend/utils/adt/name.c
+++ b/src/backend/utils/adt/name.c
@@ -23,6 +23,7 @@
#include "catalog/namespace.h"
#include "catalog/pg_collation.h"
#include "catalog/pg_type.h"
+#include "libpq/libpq-be.h"
#include "libpq/pqformat.h"
#include "mb/pg_wchar.h"
#include "miscadmin.h"
@@ -257,7 +258,7 @@ namestrcmp(Name name, const char *str)
/*
- * SQL-functions CURRENT_USER, SESSION_USER
+ * SQL-functions CURRENT_USER, SESSION_USER, PG_SESSION_AUTHN_ID
*/
Datum
current_user(PG_FUNCTION_ARGS)
@@ -271,6 +272,15 @@ session_user(PG_FUNCTION_ARGS)
PG_RETURN_DATUM(DirectFunctionCall1(namein, CStringGetDatum(GetUserNameFromId(GetSessionUserId(), false))));
}
+Datum
+pg_session_authn_id(PG_FUNCTION_ARGS)
+{
+ if (!MyProcPort || !MyProcPort->authn_id)
+ PG_RETURN_NULL();
+
+ PG_RETURN_TEXT_P(cstring_to_text(MyProcPort->authn_id));
+}
+
/*
* SQL-functions CURRENT_SCHEMA, CURRENT_SCHEMAS
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index d8e8715ed1..a1bf898476 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -1508,6 +1508,9 @@
{ oid => '746', descr => 'session user name',
proname => 'session_user', provolatile => 's', prorettype => 'name',
proargtypes => '', prosrc => 'session_user' },
+{ oid => '9774', descr => 'session authenticated identity',
+ proname => 'pg_session_authn_id', provolatile => 's', proparallel => 'r',
+ prorettype => 'text', proargtypes => '', prosrc => 'pg_session_authn_id' },
{ oid => '744',
proname => 'array_eq', prorettype => 'bool',
diff --git a/src/test/authentication/t/001_password.pl b/src/test/authentication/t/001_password.pl
index 3e3079c824..f0bdeda52d 100644
--- a/src/test/authentication/t/001_password.pl
+++ b/src/test/authentication/t/001_password.pl
@@ -82,6 +82,10 @@ test_role($node, 'scram_role', 'trust', 0,
test_role($node, 'md5_role', 'trust', 0,
log_unlike => [qr/connection authenticated:/]);
+my $res =
+ $node->safe_psql('postgres', "SELECT pg_session_authn_id() IS NULL;");
+is($res, 't', "users with trust authentication have NULL authn_id");
+
# For plain "password" method, all users should also be able to connect.
reset_pg_hba($node, 'password');
test_role($node, 'scram_role', 'password', 0,
@@ -91,6 +95,13 @@ test_role($node, 'md5_role', 'password', 0,
log_like =>
[qr/connection authenticated: identity="md5_role" method=password/]);
+$res = $node->safe_psql(
+ 'postgres',
+ "SELECT pg_session_authn_id();",
+ connstr => "user=md5_role");
+is($res, 'md5_role',
+ "users with md5 authentication have authn_id matching role name");
+
# For "scram-sha-256" method, user "scram_role" should be able to connect.
reset_pg_hba($node, 'scram-sha-256');
test_role(
diff --git a/src/test/ssl/t/001_ssltests.pl b/src/test/ssl/t/001_ssltests.pl
index 605e405de3..22b3edc51e 100644
--- a/src/test/ssl/t/001_ssltests.pl
+++ b/src/test/ssl/t/001_ssltests.pl
@@ -451,6 +451,13 @@ $node->connect_ok(
qr/connection authenticated: identity="CN=ssltestuser-dn,OU=Testing,OU=Engineering,O=PGDG" method=cert/
],);
+# Sanity-check pg_session_authn_id() for long ID strings
+my $res = $node->safe_psql('postgres',
+ "SELECT pg_session_authn_id();",
+ connstr => "$dn_connstr user=ssltestuser sslcert=ssl/client-dn.crt sslkey=$key{'client-dn.key'}",
+);
+is($res, "CN=ssltestuser-dn,OU=Testing,OU=Engineering,O=PGDG", "users with cert authentication have entire DN as authn_id");
+
# same thing but with a regex
$dn_connstr = "$common_connstr dbname=certdb_dn_re";
--
2.25.1
v7-0002-Allow-parallel-workers-to-use-pg_session_authn_id.patch (text/x-patch)
From 2127fc4f60caa33c368c23d4e8cd713d9f150249 Mon Sep 17 00:00:00 2001
From: Jacob Champion <pchampion@vmware.com>
Date: Wed, 23 Mar 2022 15:07:05 -0700
Subject: [PATCH v7 2/3] Allow parallel workers to use pg_session_authn_id()
Move authn_id into a new global, MyProcShared, which is intended to hold
all the information that can be shared between the backend and any
parallel workers. MyProcShared is serialized and restored using a new
parallel key.
With this change, the parallel restriction can be removed from
pg_session_authn_id().
---
src/backend/access/transam/parallel.c | 18 +++++-
src/backend/libpq/auth.c | 16 ++---
src/backend/utils/adt/name.c | 4 +-
src/backend/utils/init/miscinit.c | 71 +++++++++++++++++++++++
src/include/catalog/pg_proc.dat | 2 +-
src/include/libpq/libpq-be.h | 35 ++++++-----
src/include/miscadmin.h | 4 ++
src/test/authentication/t/001_password.pl | 33 +++++++++++
8 files changed, 159 insertions(+), 24 deletions(-)
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index df0cd77558..c88eab0933 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -76,6 +76,7 @@
#define PARALLEL_KEY_REINDEX_STATE UINT64CONST(0xFFFFFFFFFFFF000C)
#define PARALLEL_KEY_RELMAPPER_STATE UINT64CONST(0xFFFFFFFFFFFF000D)
#define PARALLEL_KEY_UNCOMMITTEDENUMS UINT64CONST(0xFFFFFFFFFFFF000E)
+#define PARALLEL_KEY_SHAREDPORT UINT64CONST(0xFFFFFFFFFFFF000F)
/* Fixed-size parallel state. */
typedef struct FixedParallelState
@@ -212,6 +213,7 @@ InitializeParallelDSM(ParallelContext *pcxt)
Size reindexlen = 0;
Size relmapperlen = 0;
Size uncommittedenumslen = 0;
+ Size sharedportlen = 0;
Size segsize = 0;
int i;
FixedParallelState *fps;
@@ -272,8 +274,10 @@ InitializeParallelDSM(ParallelContext *pcxt)
shm_toc_estimate_chunk(&pcxt->estimator, relmapperlen);
uncommittedenumslen = EstimateUncommittedEnumsSpace();
shm_toc_estimate_chunk(&pcxt->estimator, uncommittedenumslen);
+ sharedportlen = EstimateSharedPortSpace();
+ shm_toc_estimate_chunk(&pcxt->estimator, sharedportlen);
/* If you add more chunks here, you probably need to add keys. */
- shm_toc_estimate_keys(&pcxt->estimator, 11);
+ shm_toc_estimate_keys(&pcxt->estimator, 12);
/* Estimate space need for error queues. */
StaticAssertStmt(BUFFERALIGN(PARALLEL_ERROR_QUEUE_SIZE) ==
@@ -352,6 +356,7 @@ InitializeParallelDSM(ParallelContext *pcxt)
char *session_dsm_handle_space;
char *entrypointstate;
char *uncommittedenumsspace;
+ char *sharedportspace;
Size lnamelen;
/* Serialize shared libraries we have loaded. */
@@ -422,6 +427,12 @@ InitializeParallelDSM(ParallelContext *pcxt)
shm_toc_insert(pcxt->toc, PARALLEL_KEY_UNCOMMITTEDENUMS,
uncommittedenumsspace);
+ /* Serialize our SharedPort. */
+ sharedportspace = shm_toc_allocate(pcxt->toc, sharedportlen);
+ SerializeSharedPort(sharedportlen, sharedportspace);
+ shm_toc_insert(pcxt->toc, PARALLEL_KEY_SHAREDPORT,
+ sharedportspace);
+
/* Allocate space for worker information. */
pcxt->worker = palloc0(sizeof(ParallelWorkerInfo) * pcxt->nworkers);
@@ -1270,6 +1281,7 @@ ParallelWorkerMain(Datum main_arg)
char *reindexspace;
char *relmapperspace;
char *uncommittedenumsspace;
+ char *sharedportspace;
StringInfoData msgbuf;
char *session_dsm_handle_space;
Snapshot tsnapshot;
@@ -1479,6 +1491,10 @@ ParallelWorkerMain(Datum main_arg)
false);
RestoreUncommittedEnums(uncommittedenumsspace);
+ /* Restore the SharedPort. */
+ sharedportspace = shm_toc_lookup(toc, PARALLEL_KEY_SHAREDPORT, false);
+ RestoreSharedPort(sharedportspace);
+
/* Attach to the leader's serializable transaction, if SERIALIZABLE. */
AttachSerializableXact(fps->serializable_xact_handle);
diff --git a/src/backend/libpq/auth.c b/src/backend/libpq/auth.c
index efc53f3135..bceda9755a 100644
--- a/src/backend/libpq/auth.c
+++ b/src/backend/libpq/auth.c
@@ -342,15 +342,15 @@ auth_failed(Port *port, int status, const char *logdetail)
* authorization will fail later.
*
* The provided string will be copied into TopMemoryContext, to match the
- * lifetime of the Port, so it is safe to pass a string that is managed by an
- * external library.
+ * lifetime of MyProcShared, so it is safe to pass a string that is managed by
+ * an external library.
*/
static void
set_authn_id(Port *port, const char *id)
{
Assert(id);
- if (port->authn_id)
+ if (MyProcShared.authn_id)
{
/*
* An existing authn_id should never be overwritten; that means two
@@ -361,17 +361,18 @@ set_authn_id(Port *port, const char *id)
ereport(FATAL,
(errmsg("authentication identifier set more than once"),
errdetail_log("previous identifier: \"%s\"; new identifier: \"%s\"",
- port->authn_id, id)));
+ MyProcShared.authn_id, id)));
}
- port->authn_id = MemoryContextStrdup(TopMemoryContext, id);
+ MyProcShared.authn_id = MemoryContextStrdup(TopMemoryContext, id);
if (Log_connections)
{
ereport(LOG,
errmsg("connection authenticated: identity=\"%s\" method=%s "
"(%s:%d)",
- port->authn_id, hba_authname(port->hba->auth_method), HbaFileName,
+ MyProcShared.authn_id,
+ hba_authname(port->hba->auth_method), HbaFileName,
port->hba->linenumber));
}
}
@@ -1908,7 +1909,8 @@ auth_peer(hbaPort *port)
*/
set_authn_id(port, pw->pw_name);
- ret = check_usermap(port->hba->usermap, port->user_name, port->authn_id, false);
+ ret = check_usermap(port->hba->usermap, port->user_name,
+ MyProcShared.authn_id, false);
return ret;
#else
diff --git a/src/backend/utils/adt/name.c b/src/backend/utils/adt/name.c
index 662a7943ed..6d497e63d9 100644
--- a/src/backend/utils/adt/name.c
+++ b/src/backend/utils/adt/name.c
@@ -275,10 +275,10 @@ session_user(PG_FUNCTION_ARGS)
Datum
pg_session_authn_id(PG_FUNCTION_ARGS)
{
- if (!MyProcPort || !MyProcPort->authn_id)
+ if (!MyProcShared.authn_id)
PG_RETURN_NULL();
- PG_RETURN_TEXT_P(cstring_to_text(MyProcPort->authn_id));
+ PG_RETURN_TEXT_P(cstring_to_text(MyProcShared.authn_id));
}
diff --git a/src/backend/utils/init/miscinit.c b/src/backend/utils/init/miscinit.c
index bdc77af719..0afab3e142 100644
--- a/src/backend/utils/init/miscinit.c
+++ b/src/backend/utils/init/miscinit.c
@@ -929,6 +929,77 @@ GetUserNameFromId(Oid roleid, bool noerr)
return result;
}
+/* ------------------------------------------------------------------------
+ * "Shared" connection state
+ *
+ * MyProcShared contains pieces of information about the client that need to be
+ * synced to parallel workers when they initialize. Over time, this list will
+ * probably grow, and may subsume some of the "user state" variables above.
+ *-------------------------------------------------------------------------
+ */
+
+SharedPort MyProcShared;
+
+/*
+ * Calculate the space needed to serialize MyProcShared.
+ */
+Size
+EstimateSharedPortSpace(void)
+{
+ Size size = 1;
+
+ if (MyProcShared.authn_id)
+ size = add_size(size, strlen(MyProcShared.authn_id) + 1);
+
+ return size;
+}
+
+/*
+ * Serialize MyProcShared for use by parallel workers.
+ */
+void
+SerializeSharedPort(Size maxsize, char *start_address)
+{
+ /*
+ * First byte is an indication of whether or not authn_id has been set to
+ * non-NULL, to differentiate that case from the empty string.
+ */
+ Assert(maxsize > 0);
+ start_address[0] = MyProcShared.authn_id ? 1 : 0;
+ start_address++;
+ maxsize--;
+
+ if (MyProcShared.authn_id)
+ {
+ Size len;
+
+ len = strlcpy(start_address, MyProcShared.authn_id, maxsize) + 1;
+ Assert(len <= maxsize);
+ maxsize -= len;
+ start_address += len;
+ }
+}
+
+/*
+ * Restore MyProcShared from its serialized representation.
+ */
+void
+RestoreSharedPort(char *sharedport)
+{
+ if (sharedport[0] == 0)
+ {
+ MyProcShared.authn_id = NULL;
+ sharedport++;
+ }
+ else
+ {
+ sharedport++;
+ MyProcShared.authn_id = MemoryContextStrdup(TopMemoryContext,
+ sharedport);
+ sharedport += strlen(sharedport) + 1;
+ }
+}
+
/*-------------------------------------------------------------------------
* Interlock-file support
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index a1bf898476..b044a71c93 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -1509,7 +1509,7 @@
proname => 'session_user', provolatile => 's', prorettype => 'name',
proargtypes => '', prosrc => 'session_user' },
{ oid => '9774', descr => 'session authenticated identity',
- proname => 'pg_session_authn_id', provolatile => 's', proparallel => 'r',
+ proname => 'pg_session_authn_id', provolatile => 's',
prorettype => 'text', proargtypes => '', prosrc => 'pg_session_authn_id' },
{ oid => '744',
diff --git a/src/include/libpq/libpq-be.h b/src/include/libpq/libpq-be.h
index dd3e5efba3..911b8246ce 100644
--- a/src/include/libpq/libpq-be.h
+++ b/src/include/libpq/libpq-be.h
@@ -99,6 +99,27 @@ typedef struct
} pg_gssinfo;
#endif
+/*
+ * Fields from Port that need to be copied over to parallel workers go into the
+ * SharedPort. The same rules apply for allocations here as for Port (must be
+ * malloc'd or palloc'd in TopMemoryContext).
+ */
+typedef struct SharedPort
+{
+ /*
+ * Authenticated identity. The meaning of this identifier is dependent on
+ * hba->auth_method; it is the identity (if any) that the user presented
+ * during the authentication cycle, before they were assigned a database
+ * role. (It is effectively the "SYSTEM-USERNAME" of a pg_ident usermap
+ * -- though the exact string in use may be different, depending on pg_hba
+ * options.)
+ *
+ * authn_id is NULL if the user has not actually been authenticated, for
+ * example if the "trust" auth method is in use.
+ */
+ const char *authn_id;
+} SharedPort;
+
/*
* This is used by the postmaster in its communication with frontends. It
* contains all state information needed during this communication before the
@@ -159,19 +180,6 @@ typedef struct Port
*/
HbaLine *hba;
- /*
- * Authenticated identity. The meaning of this identifier is dependent on
- * hba->auth_method; it is the identity (if any) that the user presented
- * during the authentication cycle, before they were assigned a database
- * role. (It is effectively the "SYSTEM-USERNAME" of a pg_ident usermap
- * -- though the exact string in use may be different, depending on pg_hba
- * options.)
- *
- * authn_id is NULL if the user has not actually been authenticated, for
- * example if the "trust" auth method is in use.
- */
- const char *authn_id;
-
/*
* TCP keepalive and user timeout settings.
*
@@ -328,6 +336,7 @@ extern ssize_t be_gssapi_write(Port *port, void *ptr, size_t len);
#endif /* ENABLE_GSS */
extern ProtocolVersion FrontendProtocol;
+extern SharedPort MyProcShared;
/* TCP keepalives configuration. These are no-ops on an AF_UNIX socket. */
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 0abc3ad540..68cc1517a0 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -481,6 +481,10 @@ extern void process_session_preload_libraries(void);
extern void pg_bindtextdomain(const char *domain);
extern bool has_rolreplication(Oid roleid);
+extern Size EstimateSharedPortSpace(void);
+extern void SerializeSharedPort(Size maxsize, char *start_address);
+extern void RestoreSharedPort(char *sharedport);
+
/* in access/transam/xlog.c */
extern bool BackupInProgress(void);
extern void CancelBackup(void);
diff --git a/src/test/authentication/t/001_password.pl b/src/test/authentication/t/001_password.pl
index f0bdeda52d..3f8629b3a6 100644
--- a/src/test/authentication/t/001_password.pl
+++ b/src/test/authentication/t/001_password.pl
@@ -74,6 +74,14 @@ $node->safe_psql('postgres',
);
$ENV{"PGPASSWORD"} = 'pass';
+# Set up a table for parallel worker testing.
+$node->safe_psql('postgres',
+ 'CREATE TABLE nulls (n) AS SELECT NULL FROM generate_series(1, 200000);'
+);
+$node->safe_psql('postgres',
+ 'GRANT SELECT ON nulls TO md5_role;'
+);
+
# For "trust" method, all users should be able to connect. These users are not
# considered to be authenticated.
reset_pg_hba($node, 'trust');
@@ -86,6 +94,19 @@ my $res =
$node->safe_psql('postgres', "SELECT pg_session_authn_id() IS NULL;");
is($res, 't', "users with trust authentication have NULL authn_id");
+# Test pg_session_authn_id() with parallel workers.
+$res = $node->safe_psql(
+ 'postgres', '
+ SET min_parallel_table_scan_size TO 0;
+ SET parallel_setup_cost TO 0;
+ SET parallel_tuple_cost TO 0;
+ SET max_parallel_workers_per_gather TO 2;
+
+ SELECT bool_and(pg_session_authn_id() IS NOT DISTINCT FROM n) FROM nulls;
+ ',
+ connstr => "user=md5_role");
+is($res, 't', "parallel workers return a null authn_id when not authenticated");
+
# For plain "password" method, all users should also be able to connect.
reset_pg_hba($node, 'password');
test_role($node, 'scram_role', 'password', 0,
@@ -102,6 +123,18 @@ $res = $node->safe_psql(
is($res, 'md5_role',
"users with md5 authentication have authn_id matching role name");
+$res = $node->safe_psql(
+ 'postgres', '
+ SET min_parallel_table_scan_size TO 0;
+ SET parallel_setup_cost TO 0;
+ SET parallel_tuple_cost TO 0;
+ SET max_parallel_workers_per_gather TO 2;
+
+ SELECT bool_and(pg_session_authn_id() IS DISTINCT FROM n) FROM nulls;
+ ',
+ connstr => "user=md5_role");
+is($res, 't', "parallel workers return a non-null authn_id when authenticated");
+
# For "scram-sha-256" method, user "scram_role" should be able to connect.
reset_pg_hba($node, 'scram-sha-256');
test_role(
--
2.25.1
On Wed, Mar 2, 2022 at 4:27 PM Andres Freund <andres@anarazel.de> wrote:
I don't think we should commit this without synchronizing the authn between
worker / leader (in a separate commit). Too likely that some function that's
marked parallel ok queries the authn_id, opening up a security/monitoring hole
or such because of a bogus return value.
It is not free to copy data from the leader to the worker. I don't
think we should just adopt a policy of copying everything anyone
thinks of, because then most of the time we'll be copying a bunch of
stuff that really isn't needed.
My gut reaction is to think that this is way too marginal to be worth
making parallel-safe, but it is also possible that I just don't know
enough to understand its true value.
--
Robert Haas
EDB: http://www.enterprisedb.com
Hi,
On 2022-03-24 13:55:29 -0400, Robert Haas wrote:
On Wed, Mar 2, 2022 at 4:27 PM Andres Freund <andres@anarazel.de> wrote:
I don't think we should commit this without synchronizing the authn between
worker / leader (in a separate commit). Too likely that some function that's
marked parallel ok queries the authn_id, opening up a security/monitoring hole
or such because of a bogus return value.
It is not free to copy data from the leader to the worker. I don't
think we should just adopt a policy of copying everything anyone
thinks of, because then most of the time we'll be copying a bunch of
stuff that really isn't needed.
I agree.
My gut reaction is to think that this is way too marginal to be worth
making parallel-safe, but it is also possible that I just don't know
enough to understand its true value.
My problem with that is that as far as I can tell the only real use of the
field / function is for stuff like audit logging, RLS etc. Which seems
problematic for two reasons:
1) It's likely that the call to the function is nested into other functions,
"hiding" the parallel safety. Then it'd return bogus data silently. At
the very least we need to make it error out if called in a parallel worker.
2) If used for the purposes above, there's basically no parallelism possible
anymore.
Greetings,
Andres Freund
On Thu, Mar 24, 2022 at 05:44:06PM +0000, Jacob Champion wrote:
On Wed, 2022-03-23 at 16:54 -0700, Andres Freund wrote:
Another option would be to make it a GUC. With a bit of care it could be
automatically synced by the existing parallelism infrastructure...
Like a write-once, PGC_INTERNAL setting? I guess I don't have any
intuition on how that would compare to the separate-global-and-accessor
approach. Is the primary advantage that you don't have to maintain the
serialization logic, or is there more to it?
Hmm. That would be a first for a GUC, no? It does not seem natural
compared to the other information pieces passed down from the leader
to the workers.
+extern SharedPort MyProcShared;
This naming is interesting, and seems to be in line with a couple of
executor structures that share information across workers. Still
that's a bit inconsistent as Shared is used once at the beginning and
once at the end? I don't have a better idea off the top of my head.
Anyway, wouldn't it be better to reverse the patch order, introducing
the shared Proc information first and then build the parallel-safe
function on top of it?
--
Michael
Hi,
On 2022-03-26 15:18:59 +0900, Michael Paquier wrote:
On Thu, Mar 24, 2022 at 05:44:06PM +0000, Jacob Champion wrote:
On Wed, 2022-03-23 at 16:54 -0700, Andres Freund wrote:
Another option would be to make it a GUC. With a bit of care it could be
automatically synced by the existing parallelism infrastructure...
Like a write-once, PGC_INTERNAL setting?
Perhaps PGC_INTERNAL, perhaps PGC_SU_BACKEND, set with PGC_S_OVERRIDE?
I guess I don't have any
intuition on how that would compare to the separate-global-and-accessor
approach. Is the primary advantage that you don't have to maintain the
serialization logic, or is there more to it?
Hmm. That would be a first for a GUC, no? It does not seem natural
compared to the other information pieces passed down from the leader
to the workers.
What would be the first for a GUC? We have plenty of GUCs that are set on a
per-connection basis to reflect some fact. And there are several
authentication-related bits of state known to guc.c; think role,
session_authorization, is_superuser.
Sharing per-connection state via GUCs for parallelism? I don't think that is
true either. E.g. application_name, client_encoding.
+extern SharedPort MyProcShared;
I strongly dislike MyProcShared. It's way too easily confused with MyProc,
which points to shared memory.
Greetings,
Andres Freund
Andres Freund <andres@anarazel.de> writes:
On 2022-03-26 15:18:59 +0900, Michael Paquier wrote:
On Thu, Mar 24, 2022 at 05:44:06PM +0000, Jacob Champion wrote:
On Wed, 2022-03-23 at 16:54 -0700, Andres Freund wrote:
Another option would be to make it a GUC. With a bit of care it could be
automatically synced by the existing parallelism infrastructure...
Like a write-once, PGC_INTERNAL setting?
Perhaps PGC_INTERNAL, perhaps PGC_SU_BACKEND, set with PGC_S_OVERRIDE?
Seems like making it a GUC does nothing to fix the complaint you had about
passing data to workers whether it's needed or not. Sure, we don't then
need to write any new code for it, but it's still new cycles. And it's
new cycles all throughout guc.c, too, not just in parallel worker start.
I also note that exposing it as a GUC means we have zero control over
who/what can read it. Maybe that's not a problem, but it needs to be
thought about before we go down that path.
regards, tom lane
Hi,
On 2022-03-26 13:57:39 -0400, Tom Lane wrote:
Seems like making it a GUC does nothing to fix the complaint you had about
passing data to workers whether it's needed or not.
I don't think that was my complaint. Maybe Robert's?
Sure, we don't then need to write any new code for it, but it's still new
cycles.
I think it'd quite possibly be fewer cycles than separately syncing it.
Because I wanted to know what the overhead would be in relation to other things, I
made serialize_variable() log whenever it decides to serialize, and it's a bit
depressing :/.
serialized DateStyle = ISO, MDY
serialized default_text_search_config = pg_catalog.english
serialized force_parallel_mode = on
serialized lc_messages = en_US.UTF-8
serialized lc_monetary = en_US.UTF-8
serialized lc_numeric = en_US.UTF-8
serialized lc_time = en_US.UTF-8
serialized log_checkpoints = true
serialized log_line_prefix = %m [%p][%b][%v:%x][%a]
serialized log_timezone = America/Los_Angeles
serialized max_stack_depth = 2048
serialized max_wal_size = 153600
serialized min_wal_size = 48
serialized restart_after_crash = true
serialized session_authorization = andres
serialized ssl_cert_file = /home/andres/tmp/pgdev/ssl-cert-snakeoil.pem
serialized ssl_key_file = /home/andres/tmp/pgdev/ssl-cert-snakeoil.key
serialized TimeZone = America/Los_Angeles
serialized timezone_abbreviations = Default
serialized track_io_timing = true
serialized transaction_deferrable = false
serialized transaction_isolation = read committed
serialized transaction_read_only = false
total serialized guc state is 1324
Of course, compared to the total size of 94784 bytes that's not too
much... FWIW, 65536 of that is for the tuple queues...
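To put those figures in proportion, here is a small standalone sketch of the arithmetic. It is not backend code; the helper name share_percent is made up for illustration, and the only inputs are the three numbers quoted above (1324, 94784, 65536).

```c
#include <stddef.h>

/*
 * Hypothetical helper, not part of PostgreSQL: percentage of `total`
 * that is taken up by `part`. Returns 0 when total is 0 to avoid a
 * division by zero.
 */
double
share_percent(size_t part, size_t total)
{
    if (total == 0)
        return 0.0;
    return 100.0 * (double) part / (double) total;
}
```

With the numbers above, share_percent(1324, 94784) comes out at roughly 1.4 percent of the whole segment. Leaving out the 65536 bytes of tuple queues leaves 29248 bytes, and share_percent(1324, 29248) is roughly 4.5 percent of that remainder.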
I also note that exposing it as a GUC means we have zero control over
who/what can read it. Maybe that's not a problem, but it needs to be
thought about before we go down that path.
Yes, I think that's a fair concern.
Greetings,
Andres Freund
On Sat, 2022-03-26 at 11:36 -0700, Andres Freund wrote:
I also note that exposing it as a GUC means we have zero control over
who/what can read it. Maybe that's not a problem, but it needs to be
thought about before we go down that path.
Yes, I think that's a fair concern.
I like that there's no builtin way, today, for a superuser to modify
the internal value; it strengthens the use as an auditing tool. Moving
this to a PGC_SU_BACKEND GUC seems to weaken that. And it looks like
PGC_INTERNAL is skipped during the serialization, so if we chose that
option, we'd need to write new code anyway?
We'd also need to guess whether the GUC system's serialization of NULL
as an empty string is likely to cause problems for any future auth
methods. My guess is "no", to be honest, but I do like maintaining the
distinction -- it feels safer.
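For illustration, the NULL-versus-empty-string distinction that the v7 serialization keeps can be modelled outside the backend roughly as below. This is a standalone approximation, not the patch itself: malloc stands in for MemoryContextStrdup(TopMemoryContext, ...), and the function names estimate_authn_space, serialize_authn and restore_authn are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical standalone model of the flag-byte format: one byte saying
 * whether authn_id was set, then the NUL-terminated string if it was.
 */

/* Bytes needed: the flag byte, plus the string and its terminator if set. */
size_t
estimate_authn_space(const char *authn_id)
{
    size_t size = 1;

    if (authn_id != NULL)
        size += strlen(authn_id) + 1;
    return size;
}

/* Write the flag byte, followed by the string (with terminator) if set. */
void
serialize_authn(const char *authn_id, char *buf, size_t maxsize)
{
    assert(maxsize >= estimate_authn_space(authn_id));

    buf[0] = (authn_id != NULL) ? 1 : 0;
    if (authn_id != NULL)
        memcpy(buf + 1, authn_id, strlen(authn_id) + 1);
}

/*
 * Rebuild the value. Returns NULL when the flag byte says "never set";
 * otherwise returns a malloc'd copy (which may be the empty string) that
 * the caller frees. Also returns NULL if malloc fails.
 */
char *
restore_authn(const char *buf)
{
    size_t len;
    char  *copy;

    if (buf[0] == 0)
        return NULL;

    len = strlen(buf + 1) + 1;
    copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, buf + 1, len);
    return copy;
}
```

In this model an unset identity costs one byte and round-trips as NULL, while an empty identity costs two bytes and round-trips as "". A format that stores only the string cannot tell those two cases apart.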
v8 rebases over the recent SSL changes to get the cfbot green again.
Thanks,
--Jacob
Attachments:
since-v7.diff.txt (text/plain)
commit bd02c608e3053217056464a31dff49344ca3a5f3
Author: Jacob Champion <pchampion@vmware.com>
Date: Tue Mar 29 16:26:52 2022 -0700
fixup! Add API to retrieve authn_id from SQL
diff --git a/src/test/ssl/t/001_ssltests.pl b/src/test/ssl/t/001_ssltests.pl
index ac2848b931..2b6947fee5 100644
--- a/src/test/ssl/t/001_ssltests.pl
+++ b/src/test/ssl/t/001_ssltests.pl
@@ -428,7 +428,7 @@ $node->connect_ok(
# Sanity-check pg_session_authn_id() for long ID strings
my $res = $node->safe_psql('postgres',
"SELECT pg_session_authn_id();",
- connstr => "$dn_connstr user=ssltestuser sslcert=ssl/client-dn.crt sslkey=$key{'client-dn.key'}",
+ connstr => "$dn_connstr user=ssltestuser sslcert=ssl/client-dn.crt " . sslkey('client-dn.key'),
);
is($res, "CN=ssltestuser-dn,OU=Testing,OU=Engineering,O=PGDG", "users with cert authentication have entire DN as authn_id");
v8-0001-Add-API-to-retrieve-authn_id-from-SQL.patch (text/x-patch)
From 34e590e13240bde6a5daa0e3f866b56ad7a4c2f9 Mon Sep 17 00:00:00 2001
From: Jacob Champion <pchampion@vmware.com>
Date: Mon, 14 Feb 2022 08:10:53 -0800
Subject: [PATCH v8 1/3] Add API to retrieve authn_id from SQL
The authn_id field in MyProcPort is currently only accessible to the
backend itself. Add a SQL function, pg_session_authn_id(), to expose
the field to triggers that may want to make use of it.
---
doc/src/sgml/func.sgml | 26 +++++++++++++++++++++++
src/backend/utils/adt/name.c | 12 ++++++++++-
src/include/catalog/pg_proc.dat | 3 +++
src/test/authentication/t/001_password.pl | 11 ++++++++++
src/test/ssl/t/001_ssltests.pl | 7 ++++++
5 files changed, 58 insertions(+), 1 deletion(-)
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 3a9d62b408..454c15fde4 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -22280,6 +22280,32 @@ SELECT * FROM pg_ls_dir('.') WITH ORDINALITY AS t(ls,n);
</para></entry>
</row>
+ <row>
+ <entry role="func_table_entry"><para role="func_signature">
+ <indexterm>
+ <primary>pg_session_authn_id</primary>
+ </indexterm>
+ <function>pg_session_authn_id</function> ()
+ <returnvalue>text</returnvalue>
+ </para>
+ <para>
+ Returns the authenticated identity for the current connection, or
+ <literal>NULL</literal> if the user has not been authenticated.
+ </para>
+ <para>
+ The authenticated identity is an immutable identifier for the user
+ presented during the connection handshake; the exact format depends on
+ the authentication method in use. (For example, when using the
+ <literal>scram-sha-256</literal> auth method, the authenticated identity
+ is simply the username. When using the <literal>cert</literal> auth
+ method, the authenticated identity is the Distinguished Name of the
+ client certificate.) Even for auth methods which use the username as
+ the authenticated identity, this function differs from
+ <literal>session_user</literal> in that its return value cannot be
+ changed after login.
+ </para></entry>
+ </row>
+
<row>
<entry role="func_table_entry"><para role="func_signature">
<indexterm>
diff --git a/src/backend/utils/adt/name.c b/src/backend/utils/adt/name.c
index e8bba3670c..662a7943ed 100644
--- a/src/backend/utils/adt/name.c
+++ b/src/backend/utils/adt/name.c
@@ -23,6 +23,7 @@
#include "catalog/namespace.h"
#include "catalog/pg_collation.h"
#include "catalog/pg_type.h"
+#include "libpq/libpq-be.h"
#include "libpq/pqformat.h"
#include "mb/pg_wchar.h"
#include "miscadmin.h"
@@ -257,7 +258,7 @@ namestrcmp(Name name, const char *str)
/*
- * SQL-functions CURRENT_USER, SESSION_USER
+ * SQL-functions CURRENT_USER, SESSION_USER, PG_SESSION_AUTHN_ID
*/
Datum
current_user(PG_FUNCTION_ARGS)
@@ -271,6 +272,15 @@ session_user(PG_FUNCTION_ARGS)
PG_RETURN_DATUM(DirectFunctionCall1(namein, CStringGetDatum(GetUserNameFromId(GetSessionUserId(), false))));
}
+Datum
+pg_session_authn_id(PG_FUNCTION_ARGS)
+{
+ if (!MyProcPort || !MyProcPort->authn_id)
+ PG_RETURN_NULL();
+
+ PG_RETURN_TEXT_P(cstring_to_text(MyProcPort->authn_id));
+}
+
/*
* SQL-functions CURRENT_SCHEMA, CURRENT_SCHEMAS
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index deb00307f6..27ab913402 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -1508,6 +1508,9 @@
{ oid => '746', descr => 'session user name',
proname => 'session_user', provolatile => 's', prorettype => 'name',
proargtypes => '', prosrc => 'session_user' },
+{ oid => '9774', descr => 'session authenticated identity',
+ proname => 'pg_session_authn_id', provolatile => 's', proparallel => 'r',
+ prorettype => 'text', proargtypes => '', prosrc => 'pg_session_authn_id' },
{ oid => '744',
proname => 'array_eq', prorettype => 'bool',
diff --git a/src/test/authentication/t/001_password.pl b/src/test/authentication/t/001_password.pl
index 3e3079c824..f0bdeda52d 100644
--- a/src/test/authentication/t/001_password.pl
+++ b/src/test/authentication/t/001_password.pl
@@ -82,6 +82,10 @@ test_role($node, 'scram_role', 'trust', 0,
test_role($node, 'md5_role', 'trust', 0,
log_unlike => [qr/connection authenticated:/]);
+my $res =
+ $node->safe_psql('postgres', "SELECT pg_session_authn_id() IS NULL;");
+is($res, 't', "users with trust authentication have NULL authn_id");
+
# For plain "password" method, all users should also be able to connect.
reset_pg_hba($node, 'password');
test_role($node, 'scram_role', 'password', 0,
@@ -91,6 +95,13 @@ test_role($node, 'md5_role', 'password', 0,
log_like =>
[qr/connection authenticated: identity="md5_role" method=password/]);
+$res = $node->safe_psql(
+ 'postgres',
+ "SELECT pg_session_authn_id();",
+ connstr => "user=md5_role");
+is($res, 'md5_role',
+ "users with md5 authentication have authn_id matching role name");
+
# For "scram-sha-256" method, user "scram_role" should be able to connect.
reset_pg_hba($node, 'scram-sha-256');
test_role(
diff --git a/src/test/ssl/t/001_ssltests.pl b/src/test/ssl/t/001_ssltests.pl
index d8eeb085da..2b6947fee5 100644
--- a/src/test/ssl/t/001_ssltests.pl
+++ b/src/test/ssl/t/001_ssltests.pl
@@ -425,6 +425,13 @@ $node->connect_ok(
qr/connection authenticated: identity="CN=ssltestuser-dn,OU=Testing,OU=Engineering,O=PGDG" method=cert/
],);
+# Sanity-check pg_session_authn_id() for long ID strings
+my $res = $node->safe_psql('postgres',
+ "SELECT pg_session_authn_id();",
+ connstr => "$dn_connstr user=ssltestuser sslcert=ssl/client-dn.crt " . sslkey('client-dn.key'),
+);
+is($res, "CN=ssltestuser-dn,OU=Testing,OU=Engineering,O=PGDG", "users with cert authentication have entire DN as authn_id");
+
# same thing but with a regex
$dn_connstr = "$common_connstr dbname=certdb_dn_re";
--
2.25.1
v8-0002-Allow-parallel-workers-to-use-pg_session_authn_id.patch (text/x-patch)
From e2394ee7349474b51adc203bddca0ee911bab81a Mon Sep 17 00:00:00 2001
From: Jacob Champion <pchampion@vmware.com>
Date: Wed, 23 Mar 2022 15:07:05 -0700
Subject: [PATCH v8 2/3] Allow parallel workers to use pg_session_authn_id()
Move authn_id into a new global, MyProcShared, which is intended to hold
all the information that can be shared between the backend and any
parallel workers. MyProcShared is serialized and restored using a new
parallel key.
With this change, the parallel restriction can be removed from
pg_session_authn_id().
---
src/backend/access/transam/parallel.c | 18 +++++-
src/backend/libpq/auth.c | 16 ++---
src/backend/utils/adt/name.c | 4 +-
src/backend/utils/init/miscinit.c | 71 +++++++++++++++++++++++
src/include/catalog/pg_proc.dat | 2 +-
src/include/libpq/libpq-be.h | 35 ++++++-----
src/include/miscadmin.h | 4 ++
src/test/authentication/t/001_password.pl | 33 +++++++++++
8 files changed, 159 insertions(+), 24 deletions(-)
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index df0cd77558..c88eab0933 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -76,6 +76,7 @@
#define PARALLEL_KEY_REINDEX_STATE UINT64CONST(0xFFFFFFFFFFFF000C)
#define PARALLEL_KEY_RELMAPPER_STATE UINT64CONST(0xFFFFFFFFFFFF000D)
#define PARALLEL_KEY_UNCOMMITTEDENUMS UINT64CONST(0xFFFFFFFFFFFF000E)
+#define PARALLEL_KEY_SHAREDPORT UINT64CONST(0xFFFFFFFFFFFF000F)
/* Fixed-size parallel state. */
typedef struct FixedParallelState
@@ -212,6 +213,7 @@ InitializeParallelDSM(ParallelContext *pcxt)
Size reindexlen = 0;
Size relmapperlen = 0;
Size uncommittedenumslen = 0;
+ Size sharedportlen = 0;
Size segsize = 0;
int i;
FixedParallelState *fps;
@@ -272,8 +274,10 @@ InitializeParallelDSM(ParallelContext *pcxt)
shm_toc_estimate_chunk(&pcxt->estimator, relmapperlen);
uncommittedenumslen = EstimateUncommittedEnumsSpace();
shm_toc_estimate_chunk(&pcxt->estimator, uncommittedenumslen);
+ sharedportlen = EstimateSharedPortSpace();
+ shm_toc_estimate_chunk(&pcxt->estimator, sharedportlen);
/* If you add more chunks here, you probably need to add keys. */
- shm_toc_estimate_keys(&pcxt->estimator, 11);
+ shm_toc_estimate_keys(&pcxt->estimator, 12);
/* Estimate space need for error queues. */
StaticAssertStmt(BUFFERALIGN(PARALLEL_ERROR_QUEUE_SIZE) ==
@@ -352,6 +356,7 @@ InitializeParallelDSM(ParallelContext *pcxt)
char *session_dsm_handle_space;
char *entrypointstate;
char *uncommittedenumsspace;
+ char *sharedportspace;
Size lnamelen;
/* Serialize shared libraries we have loaded. */
@@ -422,6 +427,12 @@ InitializeParallelDSM(ParallelContext *pcxt)
shm_toc_insert(pcxt->toc, PARALLEL_KEY_UNCOMMITTEDENUMS,
uncommittedenumsspace);
+ /* Serialize our SharedPort. */
+ sharedportspace = shm_toc_allocate(pcxt->toc, sharedportlen);
+ SerializeSharedPort(sharedportlen, sharedportspace);
+ shm_toc_insert(pcxt->toc, PARALLEL_KEY_SHAREDPORT,
+ sharedportspace);
+
/* Allocate space for worker information. */
pcxt->worker = palloc0(sizeof(ParallelWorkerInfo) * pcxt->nworkers);
@@ -1270,6 +1281,7 @@ ParallelWorkerMain(Datum main_arg)
char *reindexspace;
char *relmapperspace;
char *uncommittedenumsspace;
+ char *sharedportspace;
StringInfoData msgbuf;
char *session_dsm_handle_space;
Snapshot tsnapshot;
@@ -1479,6 +1491,10 @@ ParallelWorkerMain(Datum main_arg)
false);
RestoreUncommittedEnums(uncommittedenumsspace);
+ /* Restore the SharedPort. */
+ sharedportspace = shm_toc_lookup(toc, PARALLEL_KEY_SHAREDPORT, false);
+ RestoreSharedPort(sharedportspace);
+
/* Attach to the leader's serializable transaction, if SERIALIZABLE. */
AttachSerializableXact(fps->serializable_xact_handle);
diff --git a/src/backend/libpq/auth.c b/src/backend/libpq/auth.c
index efc53f3135..bceda9755a 100644
--- a/src/backend/libpq/auth.c
+++ b/src/backend/libpq/auth.c
@@ -342,15 +342,15 @@ auth_failed(Port *port, int status, const char *logdetail)
* authorization will fail later.
*
* The provided string will be copied into TopMemoryContext, to match the
- * lifetime of the Port, so it is safe to pass a string that is managed by an
- * external library.
+ * lifetime of MyProcShared, so it is safe to pass a string that is managed by
+ * an external library.
*/
static void
set_authn_id(Port *port, const char *id)
{
Assert(id);
- if (port->authn_id)
+ if (MyProcShared.authn_id)
{
/*
* An existing authn_id should never be overwritten; that means two
@@ -361,17 +361,18 @@ set_authn_id(Port *port, const char *id)
ereport(FATAL,
(errmsg("authentication identifier set more than once"),
errdetail_log("previous identifier: \"%s\"; new identifier: \"%s\"",
- port->authn_id, id)));
+ MyProcShared.authn_id, id)));
}
- port->authn_id = MemoryContextStrdup(TopMemoryContext, id);
+ MyProcShared.authn_id = MemoryContextStrdup(TopMemoryContext, id);
if (Log_connections)
{
ereport(LOG,
errmsg("connection authenticated: identity=\"%s\" method=%s "
"(%s:%d)",
- port->authn_id, hba_authname(port->hba->auth_method), HbaFileName,
+ MyProcShared.authn_id,
+ hba_authname(port->hba->auth_method), HbaFileName,
port->hba->linenumber));
}
}
@@ -1908,7 +1909,8 @@ auth_peer(hbaPort *port)
*/
set_authn_id(port, pw->pw_name);
- ret = check_usermap(port->hba->usermap, port->user_name, port->authn_id, false);
+ ret = check_usermap(port->hba->usermap, port->user_name,
+ MyProcShared.authn_id, false);
return ret;
#else
diff --git a/src/backend/utils/adt/name.c b/src/backend/utils/adt/name.c
index 662a7943ed..6d497e63d9 100644
--- a/src/backend/utils/adt/name.c
+++ b/src/backend/utils/adt/name.c
@@ -275,10 +275,10 @@ session_user(PG_FUNCTION_ARGS)
Datum
pg_session_authn_id(PG_FUNCTION_ARGS)
{
- if (!MyProcPort || !MyProcPort->authn_id)
+ if (!MyProcShared.authn_id)
PG_RETURN_NULL();
- PG_RETURN_TEXT_P(cstring_to_text(MyProcPort->authn_id));
+ PG_RETURN_TEXT_P(cstring_to_text(MyProcShared.authn_id));
}
diff --git a/src/backend/utils/init/miscinit.c b/src/backend/utils/init/miscinit.c
index bdc77af719..0afab3e142 100644
--- a/src/backend/utils/init/miscinit.c
+++ b/src/backend/utils/init/miscinit.c
@@ -929,6 +929,77 @@ GetUserNameFromId(Oid roleid, bool noerr)
return result;
}
+/* ------------------------------------------------------------------------
+ * "Shared" connection state
+ *
+ * MyProcShared contains pieces of information about the client that need to be
+ * synced to parallel workers when they initialize. Over time, this list will
+ * probably grow, and may subsume some of the "user state" variables above.
+ *-------------------------------------------------------------------------
+ */
+
+SharedPort MyProcShared;
+
+/*
+ * Calculate the space needed to serialize MyProcShared.
+ */
+Size
+EstimateSharedPortSpace(void)
+{
+ Size size = 1;
+
+ if (MyProcShared.authn_id)
+ size = add_size(size, strlen(MyProcShared.authn_id) + 1);
+
+ return size;
+}
+
+/*
+ * Serialize MyProcShared for use by parallel workers.
+ */
+void
+SerializeSharedPort(Size maxsize, char *start_address)
+{
+ /*
+ * First byte is an indication of whether or not authn_id has been set to
+ * non-NULL, to differentiate that case from the empty string.
+ */
+ Assert(maxsize > 0);
+ start_address[0] = MyProcShared.authn_id ? 1 : 0;
+ start_address++;
+ maxsize--;
+
+ if (MyProcShared.authn_id)
+ {
+ Size len;
+
+ len = strlcpy(start_address, MyProcShared.authn_id, maxsize) + 1;
+ Assert(len <= maxsize);
+ maxsize -= len;
+ start_address += len;
+ }
+}
+
+/*
+ * Restore MyProcShared from its serialized representation.
+ */
+void
+RestoreSharedPort(char *sharedport)
+{
+ if (sharedport[0] == 0)
+ {
+ MyProcShared.authn_id = NULL;
+ sharedport++;
+ }
+ else
+ {
+ sharedport++;
+ MyProcShared.authn_id = MemoryContextStrdup(TopMemoryContext,
+ sharedport);
+ sharedport += strlen(sharedport) + 1;
+ }
+}
+
/*-------------------------------------------------------------------------
* Interlock-file support
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 27ab913402..a98c8abb9e 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -1509,7 +1509,7 @@
proname => 'session_user', provolatile => 's', prorettype => 'name',
proargtypes => '', prosrc => 'session_user' },
{ oid => '9774', descr => 'session authenticated identity',
- proname => 'pg_session_authn_id', provolatile => 's', proparallel => 'r',
+ proname => 'pg_session_authn_id', provolatile => 's',
prorettype => 'text', proargtypes => '', prosrc => 'pg_session_authn_id' },
{ oid => '744',
diff --git a/src/include/libpq/libpq-be.h b/src/include/libpq/libpq-be.h
index dd3e5efba3..911b8246ce 100644
--- a/src/include/libpq/libpq-be.h
+++ b/src/include/libpq/libpq-be.h
@@ -99,6 +99,27 @@ typedef struct
} pg_gssinfo;
#endif
+/*
+ * Fields from Port that need to be copied over to parallel workers go into the
+ * SharedPort. The same rules apply for allocations here as for Port (must be
+ * malloc'd or palloc'd in TopMemoryContext).
+ */
+typedef struct SharedPort
+{
+ /*
+ * Authenticated identity. The meaning of this identifier is dependent on
+ * hba->auth_method; it is the identity (if any) that the user presented
+ * during the authentication cycle, before they were assigned a database
+ * role. (It is effectively the "SYSTEM-USERNAME" of a pg_ident usermap
+ * -- though the exact string in use may be different, depending on pg_hba
+ * options.)
+ *
+ * authn_id is NULL if the user has not actually been authenticated, for
+ * example if the "trust" auth method is in use.
+ */
+ const char *authn_id;
+} SharedPort;
+
/*
* This is used by the postmaster in its communication with frontends. It
* contains all state information needed during this communication before the
@@ -159,19 +180,6 @@ typedef struct Port
*/
HbaLine *hba;
- /*
- * Authenticated identity. The meaning of this identifier is dependent on
- * hba->auth_method; it is the identity (if any) that the user presented
- * during the authentication cycle, before they were assigned a database
- * role. (It is effectively the "SYSTEM-USERNAME" of a pg_ident usermap
- * -- though the exact string in use may be different, depending on pg_hba
- * options.)
- *
- * authn_id is NULL if the user has not actually been authenticated, for
- * example if the "trust" auth method is in use.
- */
- const char *authn_id;
-
/*
* TCP keepalive and user timeout settings.
*
@@ -328,6 +336,7 @@ extern ssize_t be_gssapi_write(Port *port, void *ptr, size_t len);
#endif /* ENABLE_GSS */
extern ProtocolVersion FrontendProtocol;
+extern SharedPort MyProcShared;
/* TCP keepalives configuration. These are no-ops on an AF_UNIX socket. */
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 0abc3ad540..68cc1517a0 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -481,6 +481,10 @@ extern void process_session_preload_libraries(void);
extern void pg_bindtextdomain(const char *domain);
extern bool has_rolreplication(Oid roleid);
+extern Size EstimateSharedPortSpace(void);
+extern void SerializeSharedPort(Size maxsize, char *start_address);
+extern void RestoreSharedPort(char *sharedport);
+
/* in access/transam/xlog.c */
extern bool BackupInProgress(void);
extern void CancelBackup(void);
diff --git a/src/test/authentication/t/001_password.pl b/src/test/authentication/t/001_password.pl
index f0bdeda52d..3f8629b3a6 100644
--- a/src/test/authentication/t/001_password.pl
+++ b/src/test/authentication/t/001_password.pl
@@ -74,6 +74,14 @@ $node->safe_psql('postgres',
);
$ENV{"PGPASSWORD"} = 'pass';
+# Set up a table for parallel worker testing.
+$node->safe_psql('postgres',
+ 'CREATE TABLE nulls (n) AS SELECT NULL FROM generate_series(1, 200000);'
+);
+$node->safe_psql('postgres',
+ 'GRANT SELECT ON nulls TO md5_role;'
+);
+
# For "trust" method, all users should be able to connect. These users are not
# considered to be authenticated.
reset_pg_hba($node, 'trust');
@@ -86,6 +94,19 @@ my $res =
$node->safe_psql('postgres', "SELECT pg_session_authn_id() IS NULL;");
is($res, 't', "users with trust authentication have NULL authn_id");
+# Test pg_session_authn_id() with parallel workers.
+$res = $node->safe_psql(
+ 'postgres', '
+ SET min_parallel_table_scan_size TO 0;
+ SET parallel_setup_cost TO 0;
+ SET parallel_tuple_cost TO 0;
+ SET max_parallel_workers_per_gather TO 2;
+
+ SELECT bool_and(pg_session_authn_id() IS NOT DISTINCT FROM n) FROM nulls;
+ ',
+ connstr => "user=md5_role");
+is($res, 't', "parallel workers return a null authn_id when not authenticated");
+
# For plain "password" method, all users should also be able to connect.
reset_pg_hba($node, 'password');
test_role($node, 'scram_role', 'password', 0,
@@ -102,6 +123,18 @@ $res = $node->safe_psql(
is($res, 'md5_role',
"users with md5 authentication have authn_id matching role name");
+$res = $node->safe_psql(
+ 'postgres', '
+ SET min_parallel_table_scan_size TO 0;
+ SET parallel_setup_cost TO 0;
+ SET parallel_tuple_cost TO 0;
+ SET max_parallel_workers_per_gather TO 2;
+
+ SELECT bool_and(pg_session_authn_id() IS DISTINCT FROM n) FROM nulls;
+ ',
+ connstr => "user=md5_role");
+is($res, 't', "parallel workers return a non-null authn_id when authenticated");
+
# For "scram-sha-256" method, user "scram_role" should be able to connect.
reset_pg_hba($node, 'scram-sha-256');
test_role(
--
2.25.1
Hi,
On 2022-03-29 23:38:29 +0000, Jacob Champion wrote:
On Sat, 2022-03-26 at 11:36 -0700, Andres Freund wrote:
I also note that exposing it as a GUC means we have zero control over
who/what can read it. Maybe that's not a problem, but it needs to be
thought about before we go down that path.
Yes, I think that's a fair concern.
I like that there's no builtin way, today, for a superuser to modify
the internal value; it strengthens the use as an auditing tool. Moving
this to a PGC_SU_BACKEND GUC seems to weaken that. And it looks like
PGC_INTERNAL is skipped during the serialization, so if we chose that
option, we'd need to write new code anyway?
It'd be pretty simple to change can_skip_gucvar()'s selection of what to
sync. E.g. an additional flag like GUC_PARALLEL_SYNCHRONIZE.
I'm not convinced that a GUC is the answer, to be clear.
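As a rough illustration of what such a flag could mean, here is a toy model in plain C. It is not the real guc.c code: the `toy_guc` struct, the `TOY_*` names, and the rule that the flag overrides every other skip condition are all assumptions made for this sketch, loosely modeled on the idea of can_skip_gucvar() skipping PGC_INTERNAL and PGC_POSTMASTER settings.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for GUC contexts; not the real GucContext enum. */
typedef enum
{
	TOY_PGC_INTERNAL,
	TOY_PGC_POSTMASTER,
	TOY_PGC_SU_BACKEND,
	TOY_PGC_USERSET
} toy_context;

/* Hypothetical flag, named after the suggestion in this thread. */
#define TOY_GUC_PARALLEL_SYNCHRONIZE 0x0001

typedef struct
{
	const char *name;
	toy_context context;
	int			flags;
	bool		is_default;		/* value never changed from its default */
} toy_guc;

/*
 * Decide whether a setting can be left out of the state sent to parallel
 * workers.  Assumption for this sketch: the flag forces synchronization
 * even for contexts that would otherwise be skipped.
 */
bool
toy_can_skip(const toy_guc *g)
{
	if (g->flags & TOY_GUC_PARALLEL_SYNCHRONIZE)
		return false;

	return g->context == TOY_PGC_POSTMASTER ||
		g->context == TOY_PGC_INTERNAL ||
		g->is_default;
}
```

In this model an internal setting is skipped by default and only serialized once it carries the flag; whether the real code would want the flag to win over the other conditions is an open design question.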
We'd also need to guess whether the GUC system's serialization of NULL
as an empty string is likely to cause problems for any future auth
methods.
You can't represent a NULL in a postgres 'text' datum, independent of
parallelism. So the current definition of pg_session_authn_id() already
precludes that (and set_authn_id() and ...). Honestly, I can't see a reason
why we should ever allow authn_id to contain a NULL byte.
Greetings,
Andres Freund
On Tue, 2022-03-29 at 16:53 -0700, Andres Freund wrote:
We'd also need to guess whether the GUC system's serialization of NULL
as an empty string is likely to cause problems for any future auth
methods.
You can't represent a NULL in a postgres 'text' datum, independent of
parallelism. So the current definition of pg_session_authn_id() already
precludes that (and set_authn_id() and ...). Honestly, I can't see a reason
why we should ever allow authn_id to contain a NULL byte.
I don't mean a NULL byte, just a NULL pointer. This part of the
implementation doesn't distinguish between it and an empty string:
/* NULL becomes empty string, see estimate_variable_size() */
do_serialize(destptr, maxbytes, "%s",
*conf->variable ? *conf->variable : "");
Whether that's a problem in the future entirely depends on whether
there's some authentication method that considers the empty string a
sane and meaningful identity. We might reasonably decide that the
answer is "no", but I like being able to make that decision as opposed
to delegating it to an existing generic framework.
(That last point may be my core concern about making it a GUC: I'd like
us to have full control of how and where this particular piece of
information gets modified.)
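To make the NULL-versus-empty distinction concrete, here is a minimal standalone sketch in plain C (not PostgreSQL code). The helper names `serialize_guc_style`, `serialize_with_flag`, and `restore_with_flag` are made up for illustration. The first mimics the "%s"-with-empty-fallback behavior quoted above; the other two mimic the leading flag byte that the patch's serializer uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * GUC-style encoding (illustrative): a NULL pointer is written as "",
 * so a reader cannot tell NULL and the empty string apart.
 * Returns the number of bytes used, including the terminator.
 */
size_t
serialize_guc_style(char *dst, size_t maxsize, const char *id)
{
	int			n = snprintf(dst, maxsize, "%s", id ? id : "");

	assert(n >= 0 && (size_t) n < maxsize);
	return (size_t) n + 1;
}

/*
 * Flag-byte encoding: the first byte records whether the pointer was
 * non-NULL; the string (with terminator) follows only if it was.
 * Returns the number of bytes used.
 */
size_t
serialize_with_flag(char *dst, size_t maxsize, const char *id)
{
	size_t		len;

	assert(maxsize > 0);
	dst[0] = id ? 1 : 0;
	if (id == NULL)
		return 1;

	len = strlen(id) + 1;
	assert(len <= maxsize - 1);
	memcpy(dst + 1, id, len);
	return 1 + len;
}

/* Returns NULL for a serialized NULL, else a malloc'd copy of the string. */
char *
restore_with_flag(const char *src)
{
	size_t		len;
	char	   *copy;

	if (src[0] == 0)
		return NULL;

	len = strlen(src + 1) + 1;
	copy = malloc(len);
	assert(copy != NULL);
	memcpy(copy, src + 1, len);
	return copy;
}
```

With the first encoding, a NULL identity and an empty identity both produce the single byte '\0'. With the second, NULL takes one byte and the empty string takes two, so the reader can restore either one faithfully.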
Thanks,
--Jacob
On Tue, 2022-03-29 at 23:38 +0000, Jacob Champion wrote:
v8 rebases over the recent SSL changes to get the cfbot green again.
I think the Windows failure [1] is unrelated to this patch, but for
posterity:
[03:01:58.925] c:\cirrus>call "C:/Program Files/Git/usr/bin/timeout.exe" -v -k60s 15m perl src/tools/msvc/vcregress.pl recoverycheck
[03:03:16.106] [03:03:16] t/001_stream_rep.pl .................. ok 76551 ms ( 0.00 usr 0.00 sys + 0.00 cusr 0.00 csys = 0.00 CPU)
[03:03:16.120] [03:03:16] t/002_archiving.pl ................... ok 0 ms ( 0.02 usr 0.00 sys + 0.00 cusr 0.00 csys = 0.02 CPU)
[03:03:16.128] [03:03:16] t/003_recovery_targets.pl ............ ok 0 ms ( 0.00 usr 0.00 sys + 0.00 cusr 0.00 csys = 0.00 CPU)
[03:03:16.138] [03:03:16] t/004_timeline_switch.pl ............. ok 0 ms ( 0.00 usr 0.00 sys + 0.00 cusr 0.00 csys = 0.00 CPU)
[03:03:16.141] [03:03:16] t/005_replay_delay.pl ................ ok 0 ms ( 0.00 usr 0.00 sys + 0.00 cusr 0.00 csys = 0.00 CPU)
[03:03:24.561] [03:03:24] t/006_logical_decoding.pl ............ ok 8416 ms ( 0.00 usr 0.00 sys + 0.00 cusr 0.00 csys = 0.00 CPU)
[03:03:32.496] [03:03:32] t/007_sync_rep.pl .................... ok 7895 ms ( 0.00 usr 0.00 sys + 0.00 cusr 0.00 csys = 0.00 CPU)
[03:03:32.496] [03:03:32] t/008_fsm_truncation.pl .............. ok 0 ms ( 0.00 usr 0.00 sys + 0.00 cusr 0.00 csys = 0.00 CPU)
[03:16:58.985] /usr/bin/timeout: sending signal TERM to command ‘perl’
The server and client logs don't quite match up; it looks like we get
partway through t/018_wal_optimize.pl, maybe to the end of the
`wal_level = minimal, SET TABLESPACE commit subtransaction` test,
before hanging.
I see that there's an active thread about a hang later in the recovery
suite [2]. That's suspicious since 019 is just the next test, but I
don't see any evidence in the logs that we actually started test 019 in
this run.
--Jacob
[1]: https://cirrus-ci.com/task/5434752374145024
[2]: /messages/by-id/83b46e5f-2a52-86aa-fa6c-8174908174b8@iki.fi
On Wed, Mar 30, 2022 at 04:02:09PM +0000, Jacob Champion wrote:
Whether that's a problem in the future entirely depends on whether
there's some authentication method that considers the empty string a
sane and meaningful identity. We might reasonably decide that the
answer is "no", but I like being able to make that decision as opposed
to delegating it to an existing generic framework.
My guess on the matter is that an empty authn holds the same meaning
as NULL because it has no data, but I can see your point as well to
make this distinction. In order to do that, couldn't you just use
shm_toc_lookup(noError=true)? PARALLEL_KEY_SHAREDPORT could be an
optional entry in the TOC data.
The name choice is still an issue, as per Andres' point that
MyProcShared is confusing as it can refer to shared memory. What we
want is a structure name for something that's related to MyProc and
shared across all parallel workers including the leader. I would
give up on the "Shared" part, using "Parallel" and "Info" instead.
Here are some ideas:
- ProcParallelInfo
- ProcInfoParallel
- ParallelProcInfo
--
Michael
On Tue, 2022-04-05 at 15:13 +0900, Michael Paquier wrote:
On Wed, Mar 30, 2022 at 04:02:09PM +0000, Jacob Champion wrote:
Whether that's a problem in the future entirely depends on whether
there's some authentication method that considers the empty string a
sane and meaningful identity. We might reasonably decide that the
answer is "no", but I like being able to make that decision as opposed
to delegating it to an existing generic framework.
My guess on the matter is that an empty authn holds the same meaning
as NULL because it has no data,
Whether it holds meaning or not depends entirely on the auth method, I
think. Hypothetical example -- a system could accept client
certificates with an empty Subject. What identity that Subject
represents would depend on the organization, but it's distinct from
NULL/unauthenticated because the certificate is still signed by a CA.
(Postgres rejects empty Subjects when using clientname=DN and I'm not
proposing that we change that; I haven't actually checked that
they're RFC-legal. But it's possible that a future auth method could
have a reasonable standard definition for an empty identifier.)
but I can see your point as well to
make this distinction. In order to do that, couldn't you just use
shm_toc_lookup(noError=true)? PARALLEL_KEY_SHAREDPORT could be an
optional entry in the TOC data.
The current patch already handles NULL with a byte of overhead; is
there any advantage to using noError? (It might make things messier
once a second member gets added to the struct.) My concern was directed
at the GUC proposal.
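For comparison, the optional-entry alternative could look roughly like the sketch below. It uses a toy table of contents rather than the real shm_toc API: `toy_toc`, `toy_insert`, `toy_lookup`, `leader_publish`, and `worker_restore` are hypothetical stand-ins, and the `no_error` argument only imitates the noError parameter of shm_toc_lookup().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define TOY_KEY_PROCINFO UINT64_C(0xFFFFFFFFFFFF000F)
#define TOY_MAX_ENTRIES 4

typedef struct
{
	uint64_t	key;
	void	   *address;
} toy_entry;

typedef struct
{
	int			nentries;
	toy_entry	entries[TOY_MAX_ENTRIES];
} toy_toc;

/* Register an address under a key. */
void
toy_insert(toy_toc *toc, uint64_t key, void *address)
{
	assert(toc->nentries < TOY_MAX_ENTRIES);
	toc->entries[toc->nentries].key = key;
	toc->entries[toc->nentries].address = address;
	toc->nentries++;
}

/* Return the address for key; if missing, NULL when no_error, else abort. */
void *
toy_lookup(const toy_toc *toc, uint64_t key, bool no_error)
{
	for (int i = 0; i < toc->nentries; i++)
	{
		if (toc->entries[i].key == key)
			return toc->entries[i].address;
	}

	if (!no_error)
	{
		fprintf(stderr, "could not find key in toy TOC\n");
		abort();
	}
	return NULL;
}

/* Leader side: publish the identity only when there is one. */
void
leader_publish(toy_toc *toc, char *authn_id)
{
	if (authn_id != NULL)
		toy_insert(toc, TOY_KEY_PROCINFO, authn_id);
}

/* Worker side: a missing entry is read as "not authenticated". */
const char *
worker_restore(const toy_toc *toc)
{
	return toy_lookup(toc, TOY_KEY_PROCINFO, true);
}
```

This keeps NULL and the empty string distinct without a flag byte, but the presence of the key now carries the meaning of a single field. A second nullable member would presumably need its own key or some in-chunk encoding anyway.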
The name choice is still an issue, as per Andres' point that
MyProcShared is confusing as it can refer to shared memory. What we
want is a structure name for something that's related to MyProc and
shared across all parallel workers including the leader. I would
give up on the "Shared" part, using "Parallel" and "Info" instead.
Here are some ideas:
- ProcParallelInfo
- ProcInfoParallel
- ParallelProcInfo
I like ParallelProcInfo; it reads nicely. I've used that in v9.
Thanks!
--Jacob
Attachments:
since-v8.diff.txt (text/plain)
commit b3fb176a4e4f09f7436f2df8c3411b4b51c71906
Author: Jacob Champion <pchampion@vmware.com>
Date: Tue Apr 5 10:18:16 2022 -0700
squash! Allow parallel workers to use pg_session_authn_id()
Update name to MyParallelProcInfo.
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index c88eab0933..27eda766b1 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -76,7 +76,7 @@
#define PARALLEL_KEY_REINDEX_STATE UINT64CONST(0xFFFFFFFFFFFF000C)
#define PARALLEL_KEY_RELMAPPER_STATE UINT64CONST(0xFFFFFFFFFFFF000D)
#define PARALLEL_KEY_UNCOMMITTEDENUMS UINT64CONST(0xFFFFFFFFFFFF000E)
-#define PARALLEL_KEY_SHAREDPORT UINT64CONST(0xFFFFFFFFFFFF000F)
+#define PARALLEL_KEY_PROCINFO UINT64CONST(0xFFFFFFFFFFFF000F)
/* Fixed-size parallel state. */
typedef struct FixedParallelState
@@ -213,7 +213,7 @@ InitializeParallelDSM(ParallelContext *pcxt)
Size reindexlen = 0;
Size relmapperlen = 0;
Size uncommittedenumslen = 0;
- Size sharedportlen = 0;
+ Size procinfolen = 0;
Size segsize = 0;
int i;
FixedParallelState *fps;
@@ -274,8 +274,8 @@ InitializeParallelDSM(ParallelContext *pcxt)
shm_toc_estimate_chunk(&pcxt->estimator, relmapperlen);
uncommittedenumslen = EstimateUncommittedEnumsSpace();
shm_toc_estimate_chunk(&pcxt->estimator, uncommittedenumslen);
- sharedportlen = EstimateSharedPortSpace();
- shm_toc_estimate_chunk(&pcxt->estimator, sharedportlen);
+ procinfolen = EstimateParallelProcInfoSpace();
+ shm_toc_estimate_chunk(&pcxt->estimator, procinfolen);
/* If you add more chunks here, you probably need to add keys. */
shm_toc_estimate_keys(&pcxt->estimator, 12);
@@ -356,7 +356,7 @@ InitializeParallelDSM(ParallelContext *pcxt)
char *session_dsm_handle_space;
char *entrypointstate;
char *uncommittedenumsspace;
- char *sharedportspace;
+ char *procinfospace;
Size lnamelen;
/* Serialize shared libraries we have loaded. */
@@ -427,11 +427,11 @@ InitializeParallelDSM(ParallelContext *pcxt)
shm_toc_insert(pcxt->toc, PARALLEL_KEY_UNCOMMITTEDENUMS,
uncommittedenumsspace);
- /* Serialize our SharedPort. */
- sharedportspace = shm_toc_allocate(pcxt->toc, sharedportlen);
- SerializeSharedPort(sharedportlen, sharedportspace);
- shm_toc_insert(pcxt->toc, PARALLEL_KEY_SHAREDPORT,
- sharedportspace);
+ /* Serialize our ParallelProcInfo. */
+ procinfospace = shm_toc_allocate(pcxt->toc, procinfolen);
+ SerializeParallelProcInfo(procinfolen, procinfospace);
+ shm_toc_insert(pcxt->toc, PARALLEL_KEY_PROCINFO,
+ procinfospace);
/* Allocate space for worker information. */
pcxt->worker = palloc0(sizeof(ParallelWorkerInfo) * pcxt->nworkers);
@@ -1281,7 +1281,7 @@ ParallelWorkerMain(Datum main_arg)
char *reindexspace;
char *relmapperspace;
char *uncommittedenumsspace;
- char *sharedportspace;
+ char *procinfospace;
StringInfoData msgbuf;
char *session_dsm_handle_space;
Snapshot tsnapshot;
@@ -1491,9 +1491,9 @@ ParallelWorkerMain(Datum main_arg)
false);
RestoreUncommittedEnums(uncommittedenumsspace);
- /* Restore the SharedPort. */
- sharedportspace = shm_toc_lookup(toc, PARALLEL_KEY_SHAREDPORT, false);
- RestoreSharedPort(sharedportspace);
+ /* Restore the ParallelProcInfo. */
+ procinfospace = shm_toc_lookup(toc, PARALLEL_KEY_PROCINFO, false);
+ RestoreParallelProcInfo(procinfospace);
/* Attach to the leader's serializable transaction, if SERIALIZABLE. */
AttachSerializableXact(fps->serializable_xact_handle);
diff --git a/src/backend/libpq/auth.c b/src/backend/libpq/auth.c
index bceda9755a..2e5fe2cc19 100644
--- a/src/backend/libpq/auth.c
+++ b/src/backend/libpq/auth.c
@@ -342,15 +342,15 @@ auth_failed(Port *port, int status, const char *logdetail)
* authorization will fail later.
*
* The provided string will be copied into TopMemoryContext, to match the
- * lifetime of MyProcShared, so it is safe to pass a string that is managed by
- * an external library.
+ * lifetime of MyParallelProcInfo, so it is safe to pass a string that is
+ * managed by an external library.
*/
static void
set_authn_id(Port *port, const char *id)
{
Assert(id);
- if (MyProcShared.authn_id)
+ if (MyParallelProcInfo.authn_id)
{
/*
* An existing authn_id should never be overwritten; that means two
@@ -361,17 +361,17 @@ set_authn_id(Port *port, const char *id)
ereport(FATAL,
(errmsg("authentication identifier set more than once"),
errdetail_log("previous identifier: \"%s\"; new identifier: \"%s\"",
- MyProcShared.authn_id, id)));
+ MyParallelProcInfo.authn_id, id)));
}
- MyProcShared.authn_id = MemoryContextStrdup(TopMemoryContext, id);
+ MyParallelProcInfo.authn_id = MemoryContextStrdup(TopMemoryContext, id);
if (Log_connections)
{
ereport(LOG,
errmsg("connection authenticated: identity=\"%s\" method=%s "
"(%s:%d)",
- MyProcShared.authn_id,
+ MyParallelProcInfo.authn_id,
hba_authname(port->hba->auth_method), HbaFileName,
port->hba->linenumber));
}
@@ -1910,7 +1910,7 @@ auth_peer(hbaPort *port)
set_authn_id(port, pw->pw_name);
ret = check_usermap(port->hba->usermap, port->user_name,
- MyProcShared.authn_id, false);
+ MyParallelProcInfo.authn_id, false);
return ret;
#else
diff --git a/src/backend/utils/adt/name.c b/src/backend/utils/adt/name.c
index 6d497e63d9..24a06bf933 100644
--- a/src/backend/utils/adt/name.c
+++ b/src/backend/utils/adt/name.c
@@ -275,10 +275,10 @@ session_user(PG_FUNCTION_ARGS)
Datum
pg_session_authn_id(PG_FUNCTION_ARGS)
{
- if (!MyProcShared.authn_id)
+ if (!MyParallelProcInfo.authn_id)
PG_RETURN_NULL();
- PG_RETURN_TEXT_P(cstring_to_text(MyProcShared.authn_id));
+ PG_RETURN_TEXT_P(cstring_to_text(MyParallelProcInfo.authn_id));
}
diff --git a/src/backend/utils/init/miscinit.c b/src/backend/utils/init/miscinit.c
index 0afab3e142..91b3347398 100644
--- a/src/backend/utils/init/miscinit.c
+++ b/src/backend/utils/init/miscinit.c
@@ -930,50 +930,50 @@ GetUserNameFromId(Oid roleid, bool noerr)
}
/* ------------------------------------------------------------------------
- * "Shared" connection state
+ * Parallel connection state
*
- * MyProcShared contains pieces of information about the client that need to be
- * synced to parallel workers when they initialize. Over time, this list will
- * probably grow, and may subsume some of the "user state" variables above.
+ * MyParallelProcInfo contains pieces of information about the client that need
+ * to be synced to parallel workers when they initialize. Over time, this list
+ * will probably grow, and may subsume some of the "user state" variables above.
*-------------------------------------------------------------------------
*/
-SharedPort MyProcShared;
+ParallelProcInfo MyParallelProcInfo;
/*
- * Calculate the space needed to serialize MyProcShared.
+ * Calculate the space needed to serialize MyParallelProcInfo.
*/
Size
-EstimateSharedPortSpace(void)
+EstimateParallelProcInfoSpace(void)
{
Size size = 1;
- if (MyProcShared.authn_id)
- size = add_size(size, strlen(MyProcShared.authn_id) + 1);
+ if (MyParallelProcInfo.authn_id)
+ size = add_size(size, strlen(MyParallelProcInfo.authn_id) + 1);
return size;
}
/*
- * Serialize MyProcShared for use by parallel workers.
+ * Serialize MyParallelProcInfo for use by parallel workers.
*/
void
-SerializeSharedPort(Size maxsize, char *start_address)
+SerializeParallelProcInfo(Size maxsize, char *start_address)
{
/*
* First byte is an indication of whether or not authn_id has been set to
* non-NULL, to differentiate that case from the empty string.
*/
Assert(maxsize > 0);
- start_address[0] = MyProcShared.authn_id ? 1 : 0;
+ start_address[0] = MyParallelProcInfo.authn_id ? 1 : 0;
start_address++;
maxsize--;
- if (MyProcShared.authn_id)
+ if (MyParallelProcInfo.authn_id)
{
Size len;
- len = strlcpy(start_address, MyProcShared.authn_id, maxsize) + 1;
+ len = strlcpy(start_address, MyParallelProcInfo.authn_id, maxsize) + 1;
Assert(len <= maxsize);
maxsize -= len;
start_address += len;
@@ -981,22 +981,22 @@ SerializeSharedPort(Size maxsize, char *start_address)
}
/*
- * Restore MyProcShared from its serialized representation.
+ * Restore MyParallelProcInfo from its serialized representation.
*/
void
-RestoreSharedPort(char *sharedport)
+RestoreParallelProcInfo(char *procinfo)
{
- if (sharedport[0] == 0)
+ if (procinfo[0] == 0)
{
- MyProcShared.authn_id = NULL;
- sharedport++;
+ MyParallelProcInfo.authn_id = NULL;
+ procinfo++;
}
else
{
- sharedport++;
- MyProcShared.authn_id = MemoryContextStrdup(TopMemoryContext,
- sharedport);
- sharedport += strlen(sharedport) + 1;
+ procinfo++;
+ MyParallelProcInfo.authn_id = MemoryContextStrdup(TopMemoryContext,
+ procinfo);
+ procinfo += strlen(procinfo) + 1;
}
}
diff --git a/src/include/libpq/libpq-be.h b/src/include/libpq/libpq-be.h
index 911b8246ce..5cc4091216 100644
--- a/src/include/libpq/libpq-be.h
+++ b/src/include/libpq/libpq-be.h
@@ -101,10 +101,10 @@ typedef struct
/*
* Fields from Port that need to be copied over to parallel workers go into the
- * SharedPort. The same rules apply for allocations here as for Port (must be
- * malloc'd or palloc'd in TopMemoryContext).
+ * ParallelProcInfo. The same rules apply for allocations here as for Port (must
+ * be malloc'd or palloc'd in TopMemoryContext).
*/
-typedef struct SharedPort
+typedef struct
{
/*
* Authenticated identity. The meaning of this identifier is dependent on
@@ -118,7 +118,7 @@ typedef struct SharedPort
* example if the "trust" auth method is in use.
*/
const char *authn_id;
-} SharedPort;
+} ParallelProcInfo;
/*
* This is used by the postmaster in its communication with frontends. It
@@ -336,7 +336,7 @@ extern ssize_t be_gssapi_write(Port *port, void *ptr, size_t len);
#endif /* ENABLE_GSS */
extern ProtocolVersion FrontendProtocol;
-extern SharedPort MyProcShared;
+extern ParallelProcInfo MyParallelProcInfo;
/* TCP keepalives configuration. These are no-ops on an AF_UNIX socket. */
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 68cc1517a0..7a08a73b85 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -481,9 +481,9 @@ extern void process_session_preload_libraries(void);
extern void pg_bindtextdomain(const char *domain);
extern bool has_rolreplication(Oid roleid);
-extern Size EstimateSharedPortSpace(void);
-extern void SerializeSharedPort(Size maxsize, char *start_address);
-extern void RestoreSharedPort(char *sharedport);
+extern Size EstimateParallelProcInfoSpace(void);
+extern void SerializeParallelProcInfo(Size maxsize, char *start_address);
+extern void RestoreParallelProcInfo(char *procinfo);
/* in access/transam/xlog.c */
extern bool BackupInProgress(void);
v9-0001-Add-API-to-retrieve-authn_id-from-SQL.patch (text/x-patch)
From 7916cb42344e13baf973bcc9b9af822742b64c59 Mon Sep 17 00:00:00 2001
From: Jacob Champion <pchampion@vmware.com>
Date: Mon, 14 Feb 2022 08:10:53 -0800
Subject: [PATCH v9 1/3] Add API to retrieve authn_id from SQL
The authn_id field in MyProcPort is currently only accessible to the
backend itself. Add a SQL function, pg_session_authn_id(), to expose
the field to triggers that may want to make use of it.
---
doc/src/sgml/func.sgml | 26 +++++++++++++++++++++++
src/backend/utils/adt/name.c | 12 ++++++++++-
src/include/catalog/pg_proc.dat | 3 +++
src/test/authentication/t/001_password.pl | 11 ++++++++++
src/test/ssl/t/001_ssltests.pl | 7 ++++++
5 files changed, 58 insertions(+), 1 deletion(-)
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 4001cb2bda..f1a42bbbf5 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -22290,6 +22290,32 @@ SELECT * FROM pg_ls_dir('.') WITH ORDINALITY AS t(ls,n);
</para></entry>
</row>
+ <row>
+ <entry role="func_table_entry"><para role="func_signature">
+ <indexterm>
+ <primary>pg_session_authn_id</primary>
+ </indexterm>
+ <function>pg_session_authn_id</function> ()
+ <returnvalue>text</returnvalue>
+ </para>
+ <para>
+ Returns the authenticated identity for the current connection, or
+ <literal>NULL</literal> if the user has not been authenticated.
+ </para>
+ <para>
+ The authenticated identity is an immutable identifier for the user
+ presented during the connection handshake; the exact format depends on
+ the authentication method in use. (For example, when using the
+ <literal>scram-sha-256</literal> auth method, the authenticated identity
+ is simply the username. When using the <literal>cert</literal> auth
+ method, the authenticated identity is the Distinguished Name of the
+ client certificate.) Even for auth methods which use the username as
+ the authenticated identity, this function differs from
+ <literal>session_user</literal> in that its return value cannot be
+ changed after login.
+ </para></entry>
+ </row>
+
<row>
<entry role="func_table_entry"><para role="func_signature">
<indexterm>
diff --git a/src/backend/utils/adt/name.c b/src/backend/utils/adt/name.c
index e8bba3670c..662a7943ed 100644
--- a/src/backend/utils/adt/name.c
+++ b/src/backend/utils/adt/name.c
@@ -23,6 +23,7 @@
#include "catalog/namespace.h"
#include "catalog/pg_collation.h"
#include "catalog/pg_type.h"
+#include "libpq/libpq-be.h"
#include "libpq/pqformat.h"
#include "mb/pg_wchar.h"
#include "miscadmin.h"
@@ -257,7 +258,7 @@ namestrcmp(Name name, const char *str)
/*
- * SQL-functions CURRENT_USER, SESSION_USER
+ * SQL-functions CURRENT_USER, SESSION_USER, PG_SESSION_AUTHN_ID
*/
Datum
current_user(PG_FUNCTION_ARGS)
@@ -271,6 +272,15 @@ session_user(PG_FUNCTION_ARGS)
PG_RETURN_DATUM(DirectFunctionCall1(namein, CStringGetDatum(GetUserNameFromId(GetSessionUserId(), false))));
}
+Datum
+pg_session_authn_id(PG_FUNCTION_ARGS)
+{
+ if (!MyProcPort || !MyProcPort->authn_id)
+ PG_RETURN_NULL();
+
+ PG_RETURN_TEXT_P(cstring_to_text(MyProcPort->authn_id));
+}
+
/*
* SQL-functions CURRENT_SCHEMA, CURRENT_SCHEMAS
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 25304430f4..fb1126b090 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -1508,6 +1508,9 @@
{ oid => '746', descr => 'session user name',
proname => 'session_user', provolatile => 's', prorettype => 'name',
proargtypes => '', prosrc => 'session_user' },
+{ oid => '9774', descr => 'session authenticated identity',
+ proname => 'pg_session_authn_id', provolatile => 's', proparallel => 'r',
+ prorettype => 'text', proargtypes => '', prosrc => 'pg_session_authn_id' },
{ oid => '744',
proname => 'array_eq', prorettype => 'bool',
diff --git a/src/test/authentication/t/001_password.pl b/src/test/authentication/t/001_password.pl
index 3e3079c824..f0bdeda52d 100644
--- a/src/test/authentication/t/001_password.pl
+++ b/src/test/authentication/t/001_password.pl
@@ -82,6 +82,10 @@ test_role($node, 'scram_role', 'trust', 0,
test_role($node, 'md5_role', 'trust', 0,
log_unlike => [qr/connection authenticated:/]);
+my $res =
+ $node->safe_psql('postgres', "SELECT pg_session_authn_id() IS NULL;");
+is($res, 't', "users with trust authentication have NULL authn_id");
+
# For plain "password" method, all users should also be able to connect.
reset_pg_hba($node, 'password');
test_role($node, 'scram_role', 'password', 0,
@@ -91,6 +95,13 @@ test_role($node, 'md5_role', 'password', 0,
log_like =>
[qr/connection authenticated: identity="md5_role" method=password/]);
+$res = $node->safe_psql(
+ 'postgres',
+ "SELECT pg_session_authn_id();",
+ connstr => "user=md5_role");
+is($res, 'md5_role',
+ "users with md5 authentication have authn_id matching role name");
+
# For "scram-sha-256" method, user "scram_role" should be able to connect.
reset_pg_hba($node, 'scram-sha-256');
test_role(
diff --git a/src/test/ssl/t/001_ssltests.pl b/src/test/ssl/t/001_ssltests.pl
index 58d2bc336f..a4ae6b680f 100644
--- a/src/test/ssl/t/001_ssltests.pl
+++ b/src/test/ssl/t/001_ssltests.pl
@@ -547,6 +547,13 @@ $node->connect_ok(
qr/connection authenticated: identity="CN=ssltestuser-dn,OU=Testing,OU=Engineering,O=PGDG" method=cert/
],);
+# Sanity-check pg_session_authn_id() for long ID strings
+my $res = $node->safe_psql('postgres',
+ "SELECT pg_session_authn_id();",
+ connstr => "$dn_connstr user=ssltestuser sslcert=ssl/client-dn.crt " . sslkey('client-dn.key'),
+);
+is($res, "CN=ssltestuser-dn,OU=Testing,OU=Engineering,O=PGDG", "users with cert authentication have entire DN as authn_id");
+
# same thing but with a regex
$dn_connstr = "$common_connstr dbname=certdb_dn_re";
--
2.25.1
v9-0002-Allow-parallel-workers-to-use-pg_session_authn_id.patch (text/x-patch)
From ac555ab562a9491c3479a0cee495414b63073ed9 Mon Sep 17 00:00:00 2001
From: Jacob Champion <pchampion@vmware.com>
Date: Wed, 23 Mar 2022 15:07:05 -0700
Subject: [PATCH v9 2/3] Allow parallel workers to use pg_session_authn_id()
Move authn_id into a new global, MyParallelProcInfo, which is intended
to hold all the information that can be shared between the backend and
any parallel workers. MyParallelProcInfo is serialized and restored
using a new parallel key.
With this change, the parallel restriction can be removed from
pg_session_authn_id().
---
src/backend/access/transam/parallel.c | 18 +++++-
src/backend/libpq/auth.c | 16 ++---
src/backend/utils/adt/name.c | 4 +-
src/backend/utils/init/miscinit.c | 71 +++++++++++++++++++++++
src/include/catalog/pg_proc.dat | 2 +-
src/include/libpq/libpq-be.h | 35 ++++++-----
src/include/miscadmin.h | 4 ++
src/test/authentication/t/001_password.pl | 33 +++++++++++
8 files changed, 159 insertions(+), 24 deletions(-)
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index df0cd77558..27eda766b1 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -76,6 +76,7 @@
#define PARALLEL_KEY_REINDEX_STATE UINT64CONST(0xFFFFFFFFFFFF000C)
#define PARALLEL_KEY_RELMAPPER_STATE UINT64CONST(0xFFFFFFFFFFFF000D)
#define PARALLEL_KEY_UNCOMMITTEDENUMS UINT64CONST(0xFFFFFFFFFFFF000E)
+#define PARALLEL_KEY_PROCINFO UINT64CONST(0xFFFFFFFFFFFF000F)
/* Fixed-size parallel state. */
typedef struct FixedParallelState
@@ -212,6 +213,7 @@ InitializeParallelDSM(ParallelContext *pcxt)
Size reindexlen = 0;
Size relmapperlen = 0;
Size uncommittedenumslen = 0;
+ Size procinfolen = 0;
Size segsize = 0;
int i;
FixedParallelState *fps;
@@ -272,8 +274,10 @@ InitializeParallelDSM(ParallelContext *pcxt)
shm_toc_estimate_chunk(&pcxt->estimator, relmapperlen);
uncommittedenumslen = EstimateUncommittedEnumsSpace();
shm_toc_estimate_chunk(&pcxt->estimator, uncommittedenumslen);
+ procinfolen = EstimateParallelProcInfoSpace();
+ shm_toc_estimate_chunk(&pcxt->estimator, procinfolen);
/* If you add more chunks here, you probably need to add keys. */
- shm_toc_estimate_keys(&pcxt->estimator, 11);
+ shm_toc_estimate_keys(&pcxt->estimator, 12);
/* Estimate space need for error queues. */
StaticAssertStmt(BUFFERALIGN(PARALLEL_ERROR_QUEUE_SIZE) ==
@@ -352,6 +356,7 @@ InitializeParallelDSM(ParallelContext *pcxt)
char *session_dsm_handle_space;
char *entrypointstate;
char *uncommittedenumsspace;
+ char *procinfospace;
Size lnamelen;
/* Serialize shared libraries we have loaded. */
@@ -422,6 +427,12 @@ InitializeParallelDSM(ParallelContext *pcxt)
shm_toc_insert(pcxt->toc, PARALLEL_KEY_UNCOMMITTEDENUMS,
uncommittedenumsspace);
+ /* Serialize our ParallelProcInfo. */
+ procinfospace = shm_toc_allocate(pcxt->toc, procinfolen);
+ SerializeParallelProcInfo(procinfolen, procinfospace);
+ shm_toc_insert(pcxt->toc, PARALLEL_KEY_PROCINFO,
+ procinfospace);
+
/* Allocate space for worker information. */
pcxt->worker = palloc0(sizeof(ParallelWorkerInfo) * pcxt->nworkers);
@@ -1270,6 +1281,7 @@ ParallelWorkerMain(Datum main_arg)
char *reindexspace;
char *relmapperspace;
char *uncommittedenumsspace;
+ char *procinfospace;
StringInfoData msgbuf;
char *session_dsm_handle_space;
Snapshot tsnapshot;
@@ -1479,6 +1491,10 @@ ParallelWorkerMain(Datum main_arg)
false);
RestoreUncommittedEnums(uncommittedenumsspace);
+ /* Restore the ParallelProcInfo. */
+ procinfospace = shm_toc_lookup(toc, PARALLEL_KEY_PROCINFO, false);
+ RestoreParallelProcInfo(procinfospace);
+
/* Attach to the leader's serializable transaction, if SERIALIZABLE. */
AttachSerializableXact(fps->serializable_xact_handle);
diff --git a/src/backend/libpq/auth.c b/src/backend/libpq/auth.c
index efc53f3135..2e5fe2cc19 100644
--- a/src/backend/libpq/auth.c
+++ b/src/backend/libpq/auth.c
@@ -342,15 +342,15 @@ auth_failed(Port *port, int status, const char *logdetail)
* authorization will fail later.
*
* The provided string will be copied into TopMemoryContext, to match the
- * lifetime of the Port, so it is safe to pass a string that is managed by an
- * external library.
+ * lifetime of MyParallelProcInfo, so it is safe to pass a string that is
+ * managed by an external library.
*/
static void
set_authn_id(Port *port, const char *id)
{
Assert(id);
- if (port->authn_id)
+ if (MyParallelProcInfo.authn_id)
{
/*
* An existing authn_id should never be overwritten; that means two
@@ -361,17 +361,18 @@ set_authn_id(Port *port, const char *id)
ereport(FATAL,
(errmsg("authentication identifier set more than once"),
errdetail_log("previous identifier: \"%s\"; new identifier: \"%s\"",
- port->authn_id, id)));
+ MyParallelProcInfo.authn_id, id)));
}
- port->authn_id = MemoryContextStrdup(TopMemoryContext, id);
+ MyParallelProcInfo.authn_id = MemoryContextStrdup(TopMemoryContext, id);
if (Log_connections)
{
ereport(LOG,
errmsg("connection authenticated: identity=\"%s\" method=%s "
"(%s:%d)",
- port->authn_id, hba_authname(port->hba->auth_method), HbaFileName,
+ MyParallelProcInfo.authn_id,
+ hba_authname(port->hba->auth_method), HbaFileName,
port->hba->linenumber));
}
}
@@ -1908,7 +1909,8 @@ auth_peer(hbaPort *port)
*/
set_authn_id(port, pw->pw_name);
- ret = check_usermap(port->hba->usermap, port->user_name, port->authn_id, false);
+ ret = check_usermap(port->hba->usermap, port->user_name,
+ MyParallelProcInfo.authn_id, false);
return ret;
#else
diff --git a/src/backend/utils/adt/name.c b/src/backend/utils/adt/name.c
index 662a7943ed..24a06bf933 100644
--- a/src/backend/utils/adt/name.c
+++ b/src/backend/utils/adt/name.c
@@ -275,10 +275,10 @@ session_user(PG_FUNCTION_ARGS)
Datum
pg_session_authn_id(PG_FUNCTION_ARGS)
{
- if (!MyProcPort || !MyProcPort->authn_id)
+ if (!MyParallelProcInfo.authn_id)
PG_RETURN_NULL();
- PG_RETURN_TEXT_P(cstring_to_text(MyProcPort->authn_id));
+ PG_RETURN_TEXT_P(cstring_to_text(MyParallelProcInfo.authn_id));
}
diff --git a/src/backend/utils/init/miscinit.c b/src/backend/utils/init/miscinit.c
index bdc77af719..91b3347398 100644
--- a/src/backend/utils/init/miscinit.c
+++ b/src/backend/utils/init/miscinit.c
@@ -929,6 +929,77 @@ GetUserNameFromId(Oid roleid, bool noerr)
return result;
}
+/* ------------------------------------------------------------------------
+ * Parallel connection state
+ *
+ * MyParallelProcInfo contains pieces of information about the client that need
+ * to be synced to parallel workers when they initialize. Over time, this list
+ * will probably grow, and may subsume some of the "user state" variables above.
+ *-------------------------------------------------------------------------
+ */
+
+ParallelProcInfo MyParallelProcInfo;
+
+/*
+ * Calculate the space needed to serialize MyParallelProcInfo.
+ */
+Size
+EstimateParallelProcInfoSpace(void)
+{
+ Size size = 1;
+
+ if (MyParallelProcInfo.authn_id)
+ size = add_size(size, strlen(MyParallelProcInfo.authn_id) + 1);
+
+ return size;
+}
+
+/*
+ * Serialize MyParallelProcInfo for use by parallel workers.
+ */
+void
+SerializeParallelProcInfo(Size maxsize, char *start_address)
+{
+ /*
+ * First byte is an indication of whether or not authn_id has been set to
+ * non-NULL, to differentiate that case from the empty string.
+ */
+ Assert(maxsize > 0);
+ start_address[0] = MyParallelProcInfo.authn_id ? 1 : 0;
+ start_address++;
+ maxsize--;
+
+ if (MyParallelProcInfo.authn_id)
+ {
+ Size len;
+
+ len = strlcpy(start_address, MyParallelProcInfo.authn_id, maxsize) + 1;
+ Assert(len <= maxsize);
+ maxsize -= len;
+ start_address += len;
+ }
+}
+
+/*
+ * Restore MyParallelProcInfo from its serialized representation.
+ */
+void
+RestoreParallelProcInfo(char *procinfo)
+{
+ if (procinfo[0] == 0)
+ {
+ MyParallelProcInfo.authn_id = NULL;
+ procinfo++;
+ }
+ else
+ {
+ procinfo++;
+ MyParallelProcInfo.authn_id = MemoryContextStrdup(TopMemoryContext,
+ procinfo);
+ procinfo += strlen(procinfo) + 1;
+ }
+}
+
/*-------------------------------------------------------------------------
* Interlock-file support
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index fb1126b090..9bd1af42c0 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -1509,7 +1509,7 @@
proname => 'session_user', provolatile => 's', prorettype => 'name',
proargtypes => '', prosrc => 'session_user' },
{ oid => '9774', descr => 'session authenticated identity',
- proname => 'pg_session_authn_id', provolatile => 's', proparallel => 'r',
+ proname => 'pg_session_authn_id', provolatile => 's',
prorettype => 'text', proargtypes => '', prosrc => 'pg_session_authn_id' },
{ oid => '744',
diff --git a/src/include/libpq/libpq-be.h b/src/include/libpq/libpq-be.h
index dd3e5efba3..5cc4091216 100644
--- a/src/include/libpq/libpq-be.h
+++ b/src/include/libpq/libpq-be.h
@@ -99,6 +99,27 @@ typedef struct
} pg_gssinfo;
#endif
+/*
+ * Fields from Port that need to be copied over to parallel workers go into the
+ * ParallelProcInfo. The same rules apply for allocations here as for Port (must
+ * be malloc'd or palloc'd in TopMemoryContext).
+ */
+typedef struct
+{
+ /*
+ * Authenticated identity. The meaning of this identifier is dependent on
+ * hba->auth_method; it is the identity (if any) that the user presented
+ * during the authentication cycle, before they were assigned a database
+ * role. (It is effectively the "SYSTEM-USERNAME" of a pg_ident usermap
+ * -- though the exact string in use may be different, depending on pg_hba
+ * options.)
+ *
+ * authn_id is NULL if the user has not actually been authenticated, for
+ * example if the "trust" auth method is in use.
+ */
+ const char *authn_id;
+} ParallelProcInfo;
+
/*
* This is used by the postmaster in its communication with frontends. It
* contains all state information needed during this communication before the
@@ -159,19 +180,6 @@ typedef struct Port
*/
HbaLine *hba;
- /*
- * Authenticated identity. The meaning of this identifier is dependent on
- * hba->auth_method; it is the identity (if any) that the user presented
- * during the authentication cycle, before they were assigned a database
- * role. (It is effectively the "SYSTEM-USERNAME" of a pg_ident usermap
- * -- though the exact string in use may be different, depending on pg_hba
- * options.)
- *
- * authn_id is NULL if the user has not actually been authenticated, for
- * example if the "trust" auth method is in use.
- */
- const char *authn_id;
-
/*
* TCP keepalive and user timeout settings.
*
@@ -328,6 +336,7 @@ extern ssize_t be_gssapi_write(Port *port, void *ptr, size_t len);
#endif /* ENABLE_GSS */
extern ProtocolVersion FrontendProtocol;
+extern ParallelProcInfo MyParallelProcInfo;
/* TCP keepalives configuration. These are no-ops on an AF_UNIX socket. */
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 0abc3ad540..7a08a73b85 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -481,6 +481,10 @@ extern void process_session_preload_libraries(void);
extern void pg_bindtextdomain(const char *domain);
extern bool has_rolreplication(Oid roleid);
+extern Size EstimateParallelProcInfoSpace(void);
+extern void SerializeParallelProcInfo(Size maxsize, char *start_address);
+extern void RestoreParallelProcInfo(char *procinfo);
+
/* in access/transam/xlog.c */
extern bool BackupInProgress(void);
extern void CancelBackup(void);
diff --git a/src/test/authentication/t/001_password.pl b/src/test/authentication/t/001_password.pl
index f0bdeda52d..3f8629b3a6 100644
--- a/src/test/authentication/t/001_password.pl
+++ b/src/test/authentication/t/001_password.pl
@@ -74,6 +74,14 @@ $node->safe_psql('postgres',
);
$ENV{"PGPASSWORD"} = 'pass';
+# Set up a table for parallel worker testing.
+$node->safe_psql('postgres',
+ 'CREATE TABLE nulls (n) AS SELECT NULL FROM generate_series(1, 200000);'
+);
+$node->safe_psql('postgres',
+ 'GRANT SELECT ON nulls TO md5_role;'
+);
+
# For "trust" method, all users should be able to connect. These users are not
# considered to be authenticated.
reset_pg_hba($node, 'trust');
@@ -86,6 +94,19 @@ my $res =
$node->safe_psql('postgres', "SELECT pg_session_authn_id() IS NULL;");
is($res, 't', "users with trust authentication have NULL authn_id");
+# Test pg_session_authn_id() with parallel workers.
+$res = $node->safe_psql(
+ 'postgres', '
+ SET min_parallel_table_scan_size TO 0;
+ SET parallel_setup_cost TO 0;
+ SET parallel_tuple_cost TO 0;
+ SET max_parallel_workers_per_gather TO 2;
+
+ SELECT bool_and(pg_session_authn_id() IS NOT DISTINCT FROM n) FROM nulls;
+ ',
+ connstr => "user=md5_role");
+is($res, 't', "parallel workers return a null authn_id when not authenticated");
+
# For plain "password" method, all users should also be able to connect.
reset_pg_hba($node, 'password');
test_role($node, 'scram_role', 'password', 0,
@@ -102,6 +123,18 @@ $res = $node->safe_psql(
is($res, 'md5_role',
"users with md5 authentication have authn_id matching role name");
+$res = $node->safe_psql(
+ 'postgres', '
+ SET min_parallel_table_scan_size TO 0;
+ SET parallel_setup_cost TO 0;
+ SET parallel_tuple_cost TO 0;
+ SET max_parallel_workers_per_gather TO 2;
+
+ SELECT bool_and(pg_session_authn_id() IS DISTINCT FROM n) FROM nulls;
+ ',
+ connstr => "user=md5_role");
+is($res, 't', "parallel workers return a non-null authn_id when authenticated");
+
# For "scram-sha-256" method, user "scram_role" should be able to connect.
reset_pg_hba($node, 'scram-sha-256');
test_role(
--
2.25.1
On Tue, Apr 05, 2022 at 06:23:06PM +0000, Jacob Champion wrote:
> Whether it holds meaning or not depends entirely on the auth method, I
> think. Hypothetical example -- a system could accept client
> certificates with an empty Subject. What identity that Subject
> represents would depend on the organization, but it's distinct from
> NULL/unauthenticated because the certificate is still signed by a CA.
Interesting point.
> The current patch already handles NULL with a byte of overhead; is
> there any advantage to using noError? (It might make things messier
> once a second member gets added to the struct.) My concern was directed
> at the GUC proposal.
FWIW, I am a bit concerned by this approach because it feels
inconsistent with any other conditional fields passed down from the
parallel leader to its workers. And if we need to add more fields to
ParallelProcInfo in the future, it will be cleaner to use different
TOC keys to pass down different fields anyway, no?
--
Michael
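To make the "different TOC keys" idea concrete, here is a minimal standalone sketch. It assumes a toy key/value table in place of the real shm_toc, so every name in it (ToyToc, toy_toc_insert, toy_toc_lookup, TOY_KEY_AUTHN_ID) is hypothetical and none of it comes from the patch. The point it illustrates is that an optional field is published under its own key only when it is set, and a missing key on the worker side stands for NULL, much as a lookup with noError = true would.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for shm_toc: a tiny fixed-size key/value table. */
#define TOY_TOC_MAX 8

typedef struct
{
	uint64_t	key;
	const char *value;
} ToyTocEntry;

typedef struct
{
	int			nentries;
	ToyTocEntry entries[TOY_TOC_MAX];
} ToyToc;

/* Hypothetical key; each optional field would get its own. */
#define TOY_KEY_AUTHN_ID UINT64_C(0x0F)

/* Append an entry; returns 0 if the table is full, 1 otherwise. */
static int
toy_toc_insert(ToyToc *toc, uint64_t key, const char *value)
{
	assert(toc != NULL);
	if (toc->nentries >= TOY_TOC_MAX)
		return 0;
	toc->entries[toc->nentries].key = key;
	toc->entries[toc->nentries].value = value;
	toc->nentries++;
	return 1;
}

/* Returns NULL when the key is absent instead of raising an error. */
static const char *
toy_toc_lookup(const ToyToc *toc, uint64_t key)
{
	for (int i = 0; i < toc->nentries; i++)
	{
		if (toc->entries[i].key == key)
			return toc->entries[i].value;
	}
	return NULL;
}

/* Leader side: publish authn_id only if it has been set. */
static void
toy_leader_publish(ToyToc *toc, const char *authn_id)
{
	if (authn_id != NULL)
		toy_toc_insert(toc, TOY_KEY_AUTHN_ID, authn_id);
}

/* Worker side: an absent key means "not authenticated". */
static const char *
toy_worker_restore(const ToyToc *toc)
{
	return toy_toc_lookup(toc, TOY_KEY_AUTHN_ID);
}
```

With this layout the empty string and NULL stay distinct without a flag byte, at the cost of one key per optional field.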
On Wed, 2022-04-06 at 20:09 +0900, Michael Paquier wrote:
>> The current patch already handles NULL with a byte of overhead; is
>> there any advantage to using noError? (It might make things messier
>> once a second member gets added to the struct.) My concern was directed
>> at the GUC proposal.
>
> FWIW, I am a bit concerned by this approach because it feels
> inconsistent with any other conditional fields passed down from the
> parallel leader to its workers. And if we need to add more fields to
> ParallelProcInfo in the future, it will be cleaner to use different
> TOC keys to pass down different fields anyway, no?
I assumed that we would follow the existing model of "(de)serialize a
whole struct", rather than pulling it apart into many separate keys. If
it got too complicated then we could consider introducing a
SerializedParallelProcInfo struct like some of the other functions do.
Maybe that wouldn't work so well if many of the fields are strings?
Is there a significant cost to changing this later, if one approach
turns out to be wrong?
--Jacob
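For reference, the "(de)serialize a whole struct" model with one byte of NULL overhead can be sketched outside the backend as below. This is a simplified standalone version that follows the shape of the patch's Estimate/Serialize/Restore functions, assuming plain malloc in place of MemoryContextStrdup; ToyProcInfo and the toy_* functions are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for ParallelProcInfo with a single member. */
typedef struct
{
	const char *authn_id;		/* NULL when not authenticated */
} ToyProcInfo;

/* One flag byte, plus the string and its terminator when present. */
static size_t
toy_estimate(const ToyProcInfo *info)
{
	size_t		size = 1;

	if (info->authn_id)
		size += strlen(info->authn_id) + 1;
	return size;
}

/* Write the flag byte, then the NUL-terminated string if there is one. */
static void
toy_serialize(const ToyProcInfo *info, size_t maxsize, char *buf)
{
	assert(maxsize >= toy_estimate(info));
	buf[0] = info->authn_id ? 1 : 0;
	if (info->authn_id)
		memcpy(buf + 1, info->authn_id, strlen(info->authn_id) + 1);
}

/* Rebuild the struct; the caller owns (and frees) the copied string. */
static void
toy_restore(ToyProcInfo *info, const char *buf)
{
	if (buf[0] == 0)
		info->authn_id = NULL;
	else
	{
		size_t		len = strlen(buf + 1) + 1;
		char	   *copy = malloc(len);

		assert(copy != NULL);
		memcpy(copy, buf + 1, len);
		info->authn_id = copy;
	}
}
```

The flag byte is what keeps an empty identity distinct from an unauthenticated one. A second string member would need its own flag and offset bookkeeping, which is where a fixed-size header struct might start to look attractive.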
On Wed, Apr 06, 2022 at 07:16:43PM +0000, Jacob Champion wrote:
> I assumed that we would follow the existing model of "(de)serialize a
> whole struct", rather than pulling it apart into many separate keys. If
> it got too complicated then we could consider introducing a
> SerializedParallelProcInfo struct like some of the other functions do.
> Maybe that wouldn't work so well if many of the fields are strings?
>
> Is there a significant cost to changing this later, if one approach
> turns out to be wrong?
I don't think this is going to be an issue as long as we don't change
the definitions of MyParallelProcInfo, Port or PARALLEL_KEY_* in the
stable branch. My guess is that we are fine to switch to one approach
or the other as this is just some internal communication logic between
the parallel leader and its workers.
What is the feeling of others about this patch and the introduction of
ParallelProcInfo (or ParallelPortInfo?) to store the authn coming from
Port? The feature freeze is very close.
--
Michael
v10 is rebased over latest; I've also added a PGDLLIMPORT to the new global.
--Jacob
Attachments:
v10-0001-Add-API-to-retrieve-authn_id-from-SQL.patch (text/x-patch; charset=US-ASCII)
From c8b3d2df4ce461fc65a27699419a54a5b7bb2001 Mon Sep 17 00:00:00 2001
From: Jacob Champion <pchampion@vmware.com>
Date: Mon, 14 Feb 2022 08:10:53 -0800
Subject: [PATCH v10 1/2] Add API to retrieve authn_id from SQL
The authn_id field in MyProcPort is currently only accessible to the
backend itself. Add a SQL function, pg_session_authn_id(), to expose
the field to triggers that may want to make use of it.
---
doc/src/sgml/func.sgml | 26 +++++++++++++++++++++++
src/backend/utils/adt/name.c | 12 ++++++++++-
src/include/catalog/pg_proc.dat | 3 +++
src/test/authentication/t/001_password.pl | 11 ++++++++++
src/test/ssl/t/001_ssltests.pl | 7 ++++++
5 files changed, 58 insertions(+), 1 deletion(-)
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index db3147d1c4..0532e2d605 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -23344,6 +23344,32 @@ SELECT * FROM pg_ls_dir('.') WITH ORDINALITY AS t(ls,n);
</para></entry>
</row>
+ <row>
+ <entry role="func_table_entry"><para role="func_signature">
+ <indexterm>
+ <primary>pg_session_authn_id</primary>
+ </indexterm>
+ <function>pg_session_authn_id</function> ()
+ <returnvalue>text</returnvalue>
+ </para>
+ <para>
+ Returns the authenticated identity for the current connection, or
+ <literal>NULL</literal> if the user has not been authenticated.
+ </para>
+ <para>
+ The authenticated identity is an immutable identifier for the user
+ presented during the connection handshake; the exact format depends on
+ the authentication method in use. (For example, when using the
+ <literal>scram-sha-256</literal> auth method, the authenticated identity
+ is simply the username. When using the <literal>cert</literal> auth
+ method, the authenticated identity is the Distinguished Name of the
+ client certificate.) Even for auth methods which use the username as
+ the authenticated identity, this function differs from
+ <literal>session_user</literal> in that its return value cannot be
+ changed after login.
+ </para></entry>
+ </row>
+
<row>
<entry role="func_table_entry"><para role="func_signature">
<indexterm>
diff --git a/src/backend/utils/adt/name.c b/src/backend/utils/adt/name.c
index e8bba3670c..662a7943ed 100644
--- a/src/backend/utils/adt/name.c
+++ b/src/backend/utils/adt/name.c
@@ -23,6 +23,7 @@
#include "catalog/namespace.h"
#include "catalog/pg_collation.h"
#include "catalog/pg_type.h"
+#include "libpq/libpq-be.h"
#include "libpq/pqformat.h"
#include "mb/pg_wchar.h"
#include "miscadmin.h"
@@ -257,7 +258,7 @@ namestrcmp(Name name, const char *str)
/*
- * SQL-functions CURRENT_USER, SESSION_USER
+ * SQL-functions CURRENT_USER, SESSION_USER, PG_SESSION_AUTHN_ID
*/
Datum
current_user(PG_FUNCTION_ARGS)
@@ -271,6 +272,15 @@ session_user(PG_FUNCTION_ARGS)
PG_RETURN_DATUM(DirectFunctionCall1(namein, CStringGetDatum(GetUserNameFromId(GetSessionUserId(), false))));
}
+Datum
+pg_session_authn_id(PG_FUNCTION_ARGS)
+{
+ if (!MyProcPort || !MyProcPort->authn_id)
+ PG_RETURN_NULL();
+
+ PG_RETURN_TEXT_P(cstring_to_text(MyProcPort->authn_id));
+}
+
/*
* SQL-functions CURRENT_SCHEMA, CURRENT_SCHEMAS
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 87aa571a33..8e181b4771 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -1508,6 +1508,9 @@
{ oid => '746', descr => 'session user name',
proname => 'session_user', provolatile => 's', prorettype => 'name',
proargtypes => '', prosrc => 'session_user' },
+{ oid => '9774', descr => 'session authenticated identity',
+ proname => 'pg_session_authn_id', provolatile => 's', proparallel => 'r',
+ prorettype => 'text', proargtypes => '', prosrc => 'pg_session_authn_id' },
{ oid => '744',
proname => 'array_eq', prorettype => 'bool',
diff --git a/src/test/authentication/t/001_password.pl b/src/test/authentication/t/001_password.pl
index 3e3079c824..f0bdeda52d 100644
--- a/src/test/authentication/t/001_password.pl
+++ b/src/test/authentication/t/001_password.pl
@@ -82,6 +82,10 @@ test_role($node, 'scram_role', 'trust', 0,
test_role($node, 'md5_role', 'trust', 0,
log_unlike => [qr/connection authenticated:/]);
+my $res =
+ $node->safe_psql('postgres', "SELECT pg_session_authn_id() IS NULL;");
+is($res, 't', "users with trust authentication have NULL authn_id");
+
# For plain "password" method, all users should also be able to connect.
reset_pg_hba($node, 'password');
test_role($node, 'scram_role', 'password', 0,
@@ -91,6 +95,13 @@ test_role($node, 'md5_role', 'password', 0,
log_like =>
[qr/connection authenticated: identity="md5_role" method=password/]);
+$res = $node->safe_psql(
+ 'postgres',
+ "SELECT pg_session_authn_id();",
+ connstr => "user=md5_role");
+is($res, 'md5_role',
+ "users with md5 authentication have authn_id matching role name");
+
# For "scram-sha-256" method, user "scram_role" should be able to connect.
reset_pg_hba($node, 'scram-sha-256');
test_role(
diff --git a/src/test/ssl/t/001_ssltests.pl b/src/test/ssl/t/001_ssltests.pl
index c0b4a5739c..2941eb0bde 100644
--- a/src/test/ssl/t/001_ssltests.pl
+++ b/src/test/ssl/t/001_ssltests.pl
@@ -562,6 +562,13 @@ $node->connect_ok(
qr/connection authenticated: identity="CN=ssltestuser-dn,OU=Testing,OU=Engineering,O=PGDG" method=cert/
],);
+# Sanity-check pg_session_authn_id() for long ID strings
+my $res = $node->safe_psql('postgres',
+ "SELECT pg_session_authn_id();",
+ connstr => "$dn_connstr user=ssltestuser sslcert=ssl/client-dn.crt " . sslkey('client-dn.key'),
+);
+is($res, "CN=ssltestuser-dn,OU=Testing,OU=Engineering,O=PGDG", "users with cert authentication have entire DN as authn_id");
+
# same thing but with a regex
$dn_connstr = "$common_connstr dbname=certdb_dn_re";
--
2.17.1
v10-0002-Allow-parallel-workers-to-use-pg_session_authn_i.patch (text/x-patch; charset=US-ASCII)
From 2b3e8a883c04d161e9c554ecadcf9e1ffd1bd35c Mon Sep 17 00:00:00 2001
From: Jacob Champion <pchampion@vmware.com>
Date: Wed, 23 Mar 2022 15:07:05 -0700
Subject: [PATCH v10 2/2] Allow parallel workers to use pg_session_authn_id()
Move authn_id into a new global, MyParallelProcInfo, which is intended
to hold all the information that can be shared between the backend and
any parallel workers. MyParallelProcInfo is serialized and restored
using a new parallel key.
With this change, the parallel restriction can be removed from
pg_session_authn_id().
---
src/backend/access/transam/parallel.c | 18 +++++-
src/backend/libpq/auth.c | 16 ++---
src/backend/utils/adt/name.c | 4 +-
src/backend/utils/init/miscinit.c | 71 +++++++++++++++++++++++
src/include/catalog/pg_proc.dat | 2 +-
src/include/libpq/libpq-be.h | 35 ++++++-----
src/include/miscadmin.h | 4 ++
src/test/authentication/t/001_password.pl | 33 +++++++++++
8 files changed, 159 insertions(+), 24 deletions(-)
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index df0cd77558..27eda766b1 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -76,6 +76,7 @@
#define PARALLEL_KEY_REINDEX_STATE UINT64CONST(0xFFFFFFFFFFFF000C)
#define PARALLEL_KEY_RELMAPPER_STATE UINT64CONST(0xFFFFFFFFFFFF000D)
#define PARALLEL_KEY_UNCOMMITTEDENUMS UINT64CONST(0xFFFFFFFFFFFF000E)
+#define PARALLEL_KEY_PROCINFO UINT64CONST(0xFFFFFFFFFFFF000F)
/* Fixed-size parallel state. */
typedef struct FixedParallelState
@@ -212,6 +213,7 @@ InitializeParallelDSM(ParallelContext *pcxt)
Size reindexlen = 0;
Size relmapperlen = 0;
Size uncommittedenumslen = 0;
+ Size procinfolen = 0;
Size segsize = 0;
int i;
FixedParallelState *fps;
@@ -272,8 +274,10 @@ InitializeParallelDSM(ParallelContext *pcxt)
shm_toc_estimate_chunk(&pcxt->estimator, relmapperlen);
uncommittedenumslen = EstimateUncommittedEnumsSpace();
shm_toc_estimate_chunk(&pcxt->estimator, uncommittedenumslen);
+ procinfolen = EstimateParallelProcInfoSpace();
+ shm_toc_estimate_chunk(&pcxt->estimator, procinfolen);
/* If you add more chunks here, you probably need to add keys. */
- shm_toc_estimate_keys(&pcxt->estimator, 11);
+ shm_toc_estimate_keys(&pcxt->estimator, 12);
/* Estimate space need for error queues. */
StaticAssertStmt(BUFFERALIGN(PARALLEL_ERROR_QUEUE_SIZE) ==
@@ -352,6 +356,7 @@ InitializeParallelDSM(ParallelContext *pcxt)
char *session_dsm_handle_space;
char *entrypointstate;
char *uncommittedenumsspace;
+ char *procinfospace;
Size lnamelen;
/* Serialize shared libraries we have loaded. */
@@ -422,6 +427,12 @@ InitializeParallelDSM(ParallelContext *pcxt)
shm_toc_insert(pcxt->toc, PARALLEL_KEY_UNCOMMITTEDENUMS,
uncommittedenumsspace);
+ /* Serialize our ParallelProcInfo. */
+ procinfospace = shm_toc_allocate(pcxt->toc, procinfolen);
+ SerializeParallelProcInfo(procinfolen, procinfospace);
+ shm_toc_insert(pcxt->toc, PARALLEL_KEY_PROCINFO,
+ procinfospace);
+
/* Allocate space for worker information. */
pcxt->worker = palloc0(sizeof(ParallelWorkerInfo) * pcxt->nworkers);
@@ -1270,6 +1281,7 @@ ParallelWorkerMain(Datum main_arg)
char *reindexspace;
char *relmapperspace;
char *uncommittedenumsspace;
+ char *procinfospace;
StringInfoData msgbuf;
char *session_dsm_handle_space;
Snapshot tsnapshot;
@@ -1479,6 +1491,10 @@ ParallelWorkerMain(Datum main_arg)
false);
RestoreUncommittedEnums(uncommittedenumsspace);
+ /* Restore the ParallelProcInfo. */
+ procinfospace = shm_toc_lookup(toc, PARALLEL_KEY_PROCINFO, false);
+ RestoreParallelProcInfo(procinfospace);
+
/* Attach to the leader's serializable transaction, if SERIALIZABLE. */
AttachSerializableXact(fps->serializable_xact_handle);
diff --git a/src/backend/libpq/auth.c b/src/backend/libpq/auth.c
index efc53f3135..2e5fe2cc19 100644
--- a/src/backend/libpq/auth.c
+++ b/src/backend/libpq/auth.c
@@ -342,15 +342,15 @@ auth_failed(Port *port, int status, const char *logdetail)
* authorization will fail later.
*
* The provided string will be copied into TopMemoryContext, to match the
- * lifetime of the Port, so it is safe to pass a string that is managed by an
- * external library.
+ * lifetime of MyParallelProcInfo, so it is safe to pass a string that is
+ * managed by an external library.
*/
static void
set_authn_id(Port *port, const char *id)
{
Assert(id);
- if (port->authn_id)
+ if (MyParallelProcInfo.authn_id)
{
/*
* An existing authn_id should never be overwritten; that means two
@@ -361,17 +361,18 @@ set_authn_id(Port *port, const char *id)
ereport(FATAL,
(errmsg("authentication identifier set more than once"),
errdetail_log("previous identifier: \"%s\"; new identifier: \"%s\"",
- port->authn_id, id)));
+ MyParallelProcInfo.authn_id, id)));
}
- port->authn_id = MemoryContextStrdup(TopMemoryContext, id);
+ MyParallelProcInfo.authn_id = MemoryContextStrdup(TopMemoryContext, id);
if (Log_connections)
{
ereport(LOG,
errmsg("connection authenticated: identity=\"%s\" method=%s "
"(%s:%d)",
- port->authn_id, hba_authname(port->hba->auth_method), HbaFileName,
+ MyParallelProcInfo.authn_id,
+ hba_authname(port->hba->auth_method), HbaFileName,
port->hba->linenumber));
}
}
@@ -1908,7 +1909,8 @@ auth_peer(hbaPort *port)
*/
set_authn_id(port, pw->pw_name);
- ret = check_usermap(port->hba->usermap, port->user_name, port->authn_id, false);
+ ret = check_usermap(port->hba->usermap, port->user_name,
+ MyParallelProcInfo.authn_id, false);
return ret;
#else
diff --git a/src/backend/utils/adt/name.c b/src/backend/utils/adt/name.c
index 662a7943ed..24a06bf933 100644
--- a/src/backend/utils/adt/name.c
+++ b/src/backend/utils/adt/name.c
@@ -275,10 +275,10 @@ session_user(PG_FUNCTION_ARGS)
Datum
pg_session_authn_id(PG_FUNCTION_ARGS)
{
- if (!MyProcPort || !MyProcPort->authn_id)
+ if (!MyParallelProcInfo.authn_id)
PG_RETURN_NULL();
- PG_RETURN_TEXT_P(cstring_to_text(MyProcPort->authn_id));
+ PG_RETURN_TEXT_P(cstring_to_text(MyParallelProcInfo.authn_id));
}
diff --git a/src/backend/utils/init/miscinit.c b/src/backend/utils/init/miscinit.c
index ec6a61594a..ab4b6f2911 100644
--- a/src/backend/utils/init/miscinit.c
+++ b/src/backend/utils/init/miscinit.c
@@ -932,6 +932,77 @@ GetUserNameFromId(Oid roleid, bool noerr)
return result;
}
+/* ------------------------------------------------------------------------
+ * Parallel connection state
+ *
+ * MyParallelProcInfo contains pieces of information about the client that need
+ * to be synced to parallel workers when they initialize. Over time, this list
+ * will probably grow, and may subsume some of the "user state" variables above.
+ *-------------------------------------------------------------------------
+ */
+
+ParallelProcInfo MyParallelProcInfo;
+
+/*
+ * Calculate the space needed to serialize MyParallelProcInfo.
+ */
+Size
+EstimateParallelProcInfoSpace(void)
+{
+ Size size = 1;
+
+ if (MyParallelProcInfo.authn_id)
+ size = add_size(size, strlen(MyParallelProcInfo.authn_id) + 1);
+
+ return size;
+}
+
+/*
+ * Serialize MyParallelProcInfo for use by parallel workers.
+ */
+void
+SerializeParallelProcInfo(Size maxsize, char *start_address)
+{
+ /*
+ * First byte is an indication of whether or not authn_id has been set to
+ * non-NULL, to differentiate that case from the empty string.
+ */
+ Assert(maxsize > 0);
+ start_address[0] = MyParallelProcInfo.authn_id ? 1 : 0;
+ start_address++;
+ maxsize--;
+
+ if (MyParallelProcInfo.authn_id)
+ {
+ Size len;
+
+ len = strlcpy(start_address, MyParallelProcInfo.authn_id, maxsize) + 1;
+ Assert(len <= maxsize);
+ maxsize -= len;
+ start_address += len;
+ }
+}
+
+/*
+ * Restore MyParallelProcInfo from its serialized representation.
+ */
+void
+RestoreParallelProcInfo(char *procinfo)
+{
+ if (procinfo[0] == 0)
+ {
+ MyParallelProcInfo.authn_id = NULL;
+ procinfo++;
+ }
+ else
+ {
+ procinfo++;
+ MyParallelProcInfo.authn_id = MemoryContextStrdup(TopMemoryContext,
+ procinfo);
+ procinfo += strlen(procinfo) + 1;
+ }
+}
+
/*-------------------------------------------------------------------------
* Interlock-file support
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 8e181b4771..d4fa9d32dd 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -1509,7 +1509,7 @@
proname => 'session_user', provolatile => 's', prorettype => 'name',
proargtypes => '', prosrc => 'session_user' },
{ oid => '9774', descr => 'session authenticated identity',
- proname => 'pg_session_authn_id', provolatile => 's', proparallel => 'r',
+ proname => 'pg_session_authn_id', provolatile => 's',
prorettype => 'text', proargtypes => '', prosrc => 'pg_session_authn_id' },
{ oid => '744',
diff --git a/src/include/libpq/libpq-be.h b/src/include/libpq/libpq-be.h
index 90c20da22b..f381e958ee 100644
--- a/src/include/libpq/libpq-be.h
+++ b/src/include/libpq/libpq-be.h
@@ -98,6 +98,27 @@ typedef struct
} pg_gssinfo;
#endif
+/*
+ * Fields from Port that need to be copied over to parallel workers go into the
+ * ParallelProcInfo. The same rules apply for allocations here as for Port (must
+ * be malloc'd or palloc'd in TopMemoryContext).
+ */
+typedef struct
+{
+ /*
+ * Authenticated identity. The meaning of this identifier is dependent on
+ * hba->auth_method; it is the identity (if any) that the user presented
+ * during the authentication cycle, before they were assigned a database
+ * role. (It is effectively the "SYSTEM-USERNAME" of a pg_ident usermap
+ * -- though the exact string in use may be different, depending on pg_hba
+ * options.)
+ *
+ * authn_id is NULL if the user has not actually been authenticated, for
+ * example if the "trust" auth method is in use.
+ */
+ const char *authn_id;
+} ParallelProcInfo;
+
/*
* This is used by the postmaster in its communication with frontends. It
* contains all state information needed during this communication before the
@@ -158,19 +179,6 @@ typedef struct Port
*/
HbaLine *hba;
- /*
- * Authenticated identity. The meaning of this identifier is dependent on
- * hba->auth_method; it is the identity (if any) that the user presented
- * during the authentication cycle, before they were assigned a database
- * role. (It is effectively the "SYSTEM-USERNAME" of a pg_ident usermap
- * -- though the exact string in use may be different, depending on pg_hba
- * options.)
- *
- * authn_id is NULL if the user has not actually been authenticated, for
- * example if the "trust" auth method is in use.
- */
- const char *authn_id;
-
/*
* TCP keepalive and user timeout settings.
*
@@ -327,6 +335,7 @@ extern ssize_t be_gssapi_write(Port *port, void *ptr, size_t len);
#endif /* ENABLE_GSS */
extern PGDLLIMPORT ProtocolVersion FrontendProtocol;
+extern PGDLLIMPORT ParallelProcInfo MyParallelProcInfo;
/* TCP keepalives configuration. These are no-ops on an AF_UNIX socket. */
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 0af130fbc5..55ad268700 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -486,6 +486,10 @@ extern bool has_rolreplication(Oid roleid);
typedef void (*shmem_request_hook_type) (void);
extern PGDLLIMPORT shmem_request_hook_type shmem_request_hook;
+extern Size EstimateParallelProcInfoSpace(void);
+extern void SerializeParallelProcInfo(Size maxsize, char *start_address);
+extern void RestoreParallelProcInfo(char *procinfo);
+
/* in executor/nodeHash.c */
extern size_t get_hash_memory_limit(void);
diff --git a/src/test/authentication/t/001_password.pl b/src/test/authentication/t/001_password.pl
index f0bdeda52d..3f8629b3a6 100644
--- a/src/test/authentication/t/001_password.pl
+++ b/src/test/authentication/t/001_password.pl
@@ -74,6 +74,14 @@ $node->safe_psql('postgres',
);
$ENV{"PGPASSWORD"} = 'pass';
+# Set up a table for parallel worker testing.
+$node->safe_psql('postgres',
+ 'CREATE TABLE nulls (n) AS SELECT NULL FROM generate_series(1, 200000);'
+);
+$node->safe_psql('postgres',
+ 'GRANT SELECT ON nulls TO md5_role;'
+);
+
# For "trust" method, all users should be able to connect. These users are not
# considered to be authenticated.
reset_pg_hba($node, 'trust');
@@ -86,6 +94,19 @@ my $res =
$node->safe_psql('postgres', "SELECT pg_session_authn_id() IS NULL;");
is($res, 't', "users with trust authentication have NULL authn_id");
+# Test pg_session_authn_id() with parallel workers.
+$res = $node->safe_psql(
+ 'postgres', '
+ SET min_parallel_table_scan_size TO 0;
+ SET parallel_setup_cost TO 0;
+ SET parallel_tuple_cost TO 0;
+ SET max_parallel_workers_per_gather TO 2;
+
+ SELECT bool_and(pg_session_authn_id() IS NOT DISTINCT FROM n) FROM nulls;
+ ',
+ connstr => "user=md5_role");
+is($res, 't', "parallel workers return a null authn_id when not authenticated");
+
# For plain "password" method, all users should also be able to connect.
reset_pg_hba($node, 'password');
test_role($node, 'scram_role', 'password', 0,
@@ -102,6 +123,18 @@ $res = $node->safe_psql(
is($res, 'md5_role',
"users with md5 authentication have authn_id matching role name");
+$res = $node->safe_psql(
+ 'postgres', '
+ SET min_parallel_table_scan_size TO 0;
+ SET parallel_setup_cost TO 0;
+ SET parallel_tuple_cost TO 0;
+ SET max_parallel_workers_per_gather TO 2;
+
+ SELECT bool_and(pg_session_authn_id() IS DISTINCT FROM n) FROM nulls;
+ ',
+ connstr => "user=md5_role");
+is($res, 't', "parallel workers return a non-null authn_id when authenticated");
+
# For "scram-sha-256" method, user "scram_role" should be able to connect.
reset_pg_hba($node, 'scram-sha-256');
test_role(
--
2.17.1
On Tue, May 31, 2022 at 6:21 PM Jacob Champion <jchampion@timescale.com> wrote:
v10 is rebased over latest; I've also added a PGDLLIMPORT to the new global.
I took a quick look at this and it doesn't seem crazy to me, except
that I think ParallelProcInfo is a bad name for it. It's kind of
generic, because neither "proc" nor "info" means a whole lot. It's
also kind of wrong, because I think "parallel" should be things that
have to do with parallelism, not just things that happen to be
synchronized across processes when parallelism is in use. It doesn't
make sense to me to have something called a ParallelProcInfo that is
used for every single connection in the universe even if parallelism
is completely disabled on the system.
I'm not sure what it SHOULD be called, exactly: that's one of the hard
problems in computer science. [1]
[1]: https://martinfowler.com/bliki/TwoHardThings.html
--
Robert Haas
EDB: http://www.enterprisedb.com
On Thu, Jun 2, 2022 at 6:52 AM Robert Haas <robertmhaas@gmail.com> wrote:
I'm not sure what it SHOULD be called, exactly: that's one of the hard
problems in computer science.[1]
Yeah...
All right, here's the full list of previous suggestions, I think:
- SharedPort
- MyProcShared
- ProcParallelInfo
- ProcInfoParallel
- ParallelProcInfo
- ParallelPortInfo
I have a few new proposals:
- GlobalPortInfo
- GlobalConnInfo
- Synced[Port/Conn]Info
- Worker[Port/Conn]Info (but I think this suffers from the exact same
problem as Parallel)
- ThingsThatHappenToBeSynchronizedAcrossProcessesWhenParallelismIsInUse
- OrderImporterConsumerTemplateBeanFactory
I am struggling to come up with a single adjective that captures this
concept of sometimes-synchronized, that isn't conflicting with
existing uses (like Shared). Other suggestions are very welcome.
Thanks,
--Jacob
On Thu, Jun 02, 2022 at 03:56:28PM -0700, Jacob Champion wrote:
All right, here's the full list of previous suggestions, I think:
- SharedPort
- MyProcShared
- ProcParallelInfo
- ProcInfoParallel
- ParallelProcInfo
- ParallelPortInfo
I have a few new proposals:
- GlobalPortInfo
- GlobalConnInfo
- Synced[Port/Conn]Info
- Worker[Port/Conn]Info (but I think this suffers from the exact same
problem as Parallel)
- ThingsThatHappenToBeSynchronizedAcrossProcessesWhenParallelismIsInUse
- OrderImporterConsumerTemplateBeanFactory
ParallelPortInfo sounds kind of right for the job to me in this set of
proposals, as the data is from the Port, and that's some information
shared between all the parallel workers and the leader.
--
Michael
Michael Paquier <michael@paquier.xyz> writes:
ParallelPortInfo sounds kind of right for the job to me in this set of
proposals, as the data is from the Port, and that's some information
shared between all the parallel workers and the leader.
I agree with Robert's complaint that Parallel is far too generic
a term here. Also, the fact that this data is currently in struct
Port seems like an artifact.
Don't we have a term for the set of processes comprising a leader
plus parallel workers? If we called that set FooGroup, then
something like FooGroupSharedInfo would be on-point.
regards, tom lane
On Fri, Jun 03, 2022 at 10:04:12AM -0400, Tom Lane wrote:
I agree with Robert's complaint that Parallel is far too generic
a term here. Also, the fact that this data is currently in struct
Port seems like an artifact.
Don't we have a term for the set of processes comprising a leader
plus parallel workers? If we called that set FooGroup, then
something like FooGroupSharedInfo would be on-point.
As far as I know, proc.h includes the term "group members", which
includes the leader and its workers (see CLOG and lock part)?
--
Michael
On Fri, Jun 3, 2022 at 7:36 PM Michael Paquier <michael@paquier.xyz> wrote:
On Fri, Jun 03, 2022 at 10:04:12AM -0400, Tom Lane wrote:
I agree with Robert's complaint that Parallel is far too generic
a term here. Also, the fact that this data is currently in struct
Port seems like an artifact.
Don't we have a term for the set of processes comprising a leader
plus parallel workers? If we called that set FooGroup, then
something like FooGroupSharedInfo would be on-point.
As far as I know, proc.h includes the term "group members", which
includes the leader and its workers (see CLOG and lock part)?
lmgr/README also refers to "gangs of related processes" and "parallel
groups". So
- GroupSharedInfo
- ParallelGroupSharedInfo
- GangSharedInfo
- SharedLeaderInfo
?
--Jacob
On Fri, Jun 3, 2022 at 10:04 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
I agree with Robert's complaint that Parallel is far too generic
a term here. Also, the fact that this data is currently in struct
Port seems like an artifact.
Why do we call this thing a Port, anyway?
I think I'd feel more comfortable here if we were defining what went
into which struct on some semantic basis rather than being like, OK,
so all the stuff we want to serialize goes into struct #1, and the
stuff we don't want to serialize goes into struct #2. I suppose if
it's just based on whether or not we want to serialize it, then the
placement of future additions will just be based on how people happen
to feel about the thing they're adding right at that moment, and there
won't be any consistency.
One could imagine dividing the Port struct into a couple of different
structs, e.g.
AuthenticationState: stuff that is needed only during authentication
and can be discarded thereafter (e.g. the HBA line, at least if the
comment is to be believed)
ClientCommunicationState: stuff that is used to communicate with the
client but doesn't need to be or can't be shared (e.g. the SSL object
itself)
ClientConnectionInfo: stuff that someone might want to look at for
information purposes at any time (e.g. authn_id, apparently)
Then we could serialize the third of these, keep the second around but
not serialize it, and free the first once connection setup is
complete.
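For illustration only, the three-way split described above might be laid out as in the sketch below. This is not taken from any patch in this thread: every type name, field, and the finish_connection_setup() helper are hypothetical stand-ins, and the real Port carries far more state than is modeled here.

```c
#include <assert.h>
#include <stdlib.h>

/* All names below are hypothetical; none of them exist in PostgreSQL. */

/* Needed only during authentication; can be discarded afterwards. */
typedef struct
{
	int			hba_linenumber;	/* stand-in for the matched HBA line */
} AuthenticationStateSketch;

/* Used to talk to the client; never copied to parallel workers. */
typedef struct
{
	int			sock;			/* stand-in for socket/SSL state */
} ClientCommunicationStateSketch;

/* Informational; the only bucket that would be serialized for workers. */
typedef struct
{
	const char *authn_id;
} ClientConnectionInfoSketch;

typedef struct
{
	AuthenticationStateSketch *auth;	/* NULL once setup is complete */
	ClientCommunicationStateSketch comm;
	ClientConnectionInfoSketch info;
} BackendClientSketch;

/* Called once connection setup is complete: free the auth-only bucket. */
void
finish_connection_setup(BackendClientSketch *client)
{
	assert(client != NULL);
	free(client->auth);
	client->auth = NULL;
}
```

The point of the layout is that each bucket has its own lifetime and its own sharing rule, so where a new field belongs follows from what the field is.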
--
Robert Haas
EDB: http://www.enterprisedb.com
On Mon, Jun 6, 2022 at 11:44 AM Robert Haas <robertmhaas@gmail.com> wrote:
I think I'd feel more comfortable here if we were defining what went
into which struct on some semantic basis rather than being like, OK,
so all the stuff we want to serialize goes into struct #1, and the
stuff we don't want to serialize goes into struct #2. I suppose if
it's just based on whether or not we want to serialize it, then the
placement of future additions will just be based on how people happen
to feel about the thing they're adding right at that moment, and there
won't be any consistency.
"This struct contains connection fields that are explicitly safe for
workers to access" _is_ a useful semantic, in my opinion. And it seems
like it'd make it easier to determine what needs to be included in the
struct; I'm not sure I follow why it would result in less consistency.
But to your suggestion, if we just called the new struct
"ClientConnectionInfo", would it be a useful step towards your
proposed three-bucket state? I guess I'm having trouble understanding
why a struct that is defined as "this stuff *doesn't* get serialized"
is materially different from having one that's the opposite.
Thanks,
--Jacob
On Tue, Jun 7, 2022 at 6:54 PM Jacob Champion <jchampion@timescale.com> wrote:
"This struct contains connection fields that are explicitly safe for
workers to access" _is_ a useful semantic, in my opinion. And it seems
like it'd make it easier to determine what needs to be included in the
struct; I'm not sure I follow why it would result in less consistency.
But to your suggestion, if we just called the new struct
"ClientConnectionInfo", would it be a useful step towards your
proposed three-bucket state? I guess I'm having trouble understanding
why a struct that is defined as "this stuff *doesn't* get serialized"
is materially different from having one that's the opposite.
Well, it isn't, and if my proposal boils down to that, which perhaps
it does, then my proposal isn't that great, honestly. Let me try again
to explain, though, and maybe it will seem less arbitrary with a
second explanation -- or maybe it won't.
If we say "this struct contains authentication-related information
that we got from the client and which functions may want to look at
later," then I feel like the chances are good that when someone adds a
new thing to the system in the future, they will know whether or not
that new thing falls into that category or not. If the definition of a
struct is "everything that should be serialized," then I feel like the
chances are less good that everyone will know whether a new thing they
are adding falls into that category or not. Perhaps that is
ill-founded, but I don't think "should be serialized" is necessarily
something that everybody is going to have the same view on, or even
know what it means.
Also, I don't think we want to end up with a situation where we have a
struct that contains wildly unrelated things that all need to be
serialized. If the definition of the struct is "stuff we should
serialize and send to the worker," well then maybe the transaction
snapshot ought to go in there! Well, no. I mean, we already have a
separate place for that, but suppose somehow we didn't. It doesn't
belong here, because yes the things in this struct get serialized, but
it's not only any old thing that needs serializing, it's more specific
than that.
I guess what this boils down to is that I really want this thing to
have a meaningful name by means of which a future developer can make a
guess as to whether some field they're adding ought to go in there. I
theorize that SharedPort is not too great because (a) Port is already
a bad name and (b) how am I supposed to know whether my stuff ought to
be shared or not? I like something like ClientConnectionInfo better
because it seems to describe what the stuff in the struct *is* rather
than what we *do* with it.
--
Robert Haas
EDB: http://www.enterprisedb.com
On Tue, Jun 7, 2022 at 5:45 PM Robert Haas <robertmhaas@gmail.com> wrote:
Perhaps that is
ill-founded, but I don't think "should be serialized" is necessarily
something that everybody is going to have the same view on, or even
know what it means.
Using this thread as an example, once it was decided that the parallel
workers needed the additional info, the need for serialization
followed directly from that. I don't get the feeling that developers
are going to jump through the hoops of writing serialization logic for
a new field in the struct just by accident; they should know why
they're writing that code, and hopefully it would be easy for
reviewers to catch a patch that showed up with pointless
serialization.
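For illustration, the serialization logic in question has roughly the shape sketched below. This is a standalone approximation, not the patch itself: it uses malloc() and memcpy() where the patch uses MemoryContextStrdup() and strlcpy(), and the names ConnInfoSketch, estimate_conninfo_space(), serialize_conninfo(), and restore_conninfo() are made up for this example. The leading flag byte is what lets a NULL authn_id be told apart from an empty string.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the struct in the patch; only authn_id is modeled. */
typedef struct
{
	const char *authn_id;		/* NULL means "not authenticated" */
} ConnInfoSketch;

/* Bytes needed: one presence flag, plus the string and its terminator. */
size_t
estimate_conninfo_space(const ConnInfoSketch *info)
{
	size_t		size = 1;

	if (info->authn_id)
		size += strlen(info->authn_id) + 1;
	return size;
}

/* Write the flag byte, then the NUL-terminated string if there is one. */
void
serialize_conninfo(const ConnInfoSketch *info, size_t maxsize, char *buf)
{
	assert(maxsize >= estimate_conninfo_space(info));
	buf[0] = info->authn_id ? 1 : 0;
	if (info->authn_id)
		memcpy(buf + 1, info->authn_id, strlen(info->authn_id) + 1);
}

/* Rebuild the struct; the caller owns (and frees) the copied string. */
void
restore_conninfo(ConnInfoSketch *info, const char *buf)
{
	info->authn_id = NULL;
	if (buf[0] != 0)
	{
		size_t		len = strlen(buf + 1) + 1;
		char	   *copy = malloc(len);

		if (copy == NULL)
			abort();
		memcpy(copy, buf + 1, len);
		info->authn_id = copy;
	}
}
```

Each additional struct member would need matching changes in all three functions, which is why a field is unlikely to be added by accident.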
Also, I don't think we want to end up with a situation where we have a
struct that contains wildly unrelated things that all need to be
serialized. If the definition of the struct is "stuff we should
serialize and send to the worker," well then maybe the transaction
snapshot ought to go in there! Well, no. I mean, we already have a
separate place for that, but suppose somehow we didn't. It doesn't
belong here, because yes the things in this struct get serialized, but
it's not only any old thing that needs serializing, it's more specific
than that.
I completely agree with you here -- the name should not be so generic
that it's just a catch-all for any serialized fields that exist.
I guess what this boils down to is that I really want this thing to
have a meaningful name by means of which a future developer can make a
guess as to whether some field they're adding ought to go in there. I
theorize that SharedPort is not too great because (a) Port is already
a bad name and (b) how am I supposed to know whether my stuff ought to
be shared or not? I like something like ClientConnectionInfo better
because it seems to describe what the stuff in the struct *is* rather
than what we *do* with it.
I think having both would be useful in this case -- what the stuff is,
so that it's clear what doesn't belong in it, and what we do with it,
so it's clear that you have to write serialization code if you add new
things. The nature of the struct is such that I think you _have_ to
figure out whether or not your stuff ought to be shared before you
have any business adding it.
But I don't have any better ideas for how to achieve both. I'm fine
with your suggestion of ClientConnectionInfo, if that sounds good to
others; the doc comment can clarify why it differs from Port? Or add
one of the Shared-/Gang-/Group- prefixes to it, maybe?
Thanks,
--Jacob
On Wed, Jun 8, 2022 at 7:53 PM Jacob Champion <jchampion@timescale.com> wrote:
But I don't have any better ideas for how to achieve both. I'm fine
with your suggestion of ClientConnectionInfo, if that sounds good to
others; the doc comment can clarify why it differs from Port? Or add
one of the Shared-/Gang-/Group- prefixes to it, maybe?
I don't like the prefixes, so I'd prefer explaining it in the struct comment.
--
Robert Haas
EDB: http://www.enterprisedb.com
On Thu, Jun 9, 2022 at 6:23 AM Robert Haas <robertmhaas@gmail.com> wrote:
On Wed, Jun 8, 2022 at 7:53 PM Jacob Champion <jchampion@timescale.com> wrote:
But I don't have any better ideas for how to achieve both. I'm fine
with your suggestion of ClientConnectionInfo, if that sounds good to
others; the doc comment can clarify why it differs from Port? Or add
one of the Shared-/Gang-/Group- prefixes to it, maybe?
I don't like the prefixes, so I'd prefer explaining it in the struct comment.
Done that way in v11.
Thanks!
--Jacob
Attachments:
since-v10.diff.txt (text/plain; charset=US-ASCII)
commit afb70959a6d46054eb65e9c4b0a8f61d1c87b91b
Author: Jacob Champion <jchampion@timescale.com>
Date: Fri Jun 10 10:48:07 2022 -0700
squash! Allow parallel workers to use pg_session_authn_id()
Per review, switch the global name to ClientConnectionInfo.
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index 27eda766b1..bc93101ff7 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -76,7 +76,7 @@
#define PARALLEL_KEY_REINDEX_STATE UINT64CONST(0xFFFFFFFFFFFF000C)
#define PARALLEL_KEY_RELMAPPER_STATE UINT64CONST(0xFFFFFFFFFFFF000D)
#define PARALLEL_KEY_UNCOMMITTEDENUMS UINT64CONST(0xFFFFFFFFFFFF000E)
-#define PARALLEL_KEY_PROCINFO UINT64CONST(0xFFFFFFFFFFFF000F)
+#define PARALLEL_KEY_CLIENTCONNINFO UINT64CONST(0xFFFFFFFFFFFF000F)
/* Fixed-size parallel state. */
typedef struct FixedParallelState
@@ -213,7 +213,7 @@ InitializeParallelDSM(ParallelContext *pcxt)
Size reindexlen = 0;
Size relmapperlen = 0;
Size uncommittedenumslen = 0;
- Size procinfolen = 0;
+ Size clientconninfolen = 0;
Size segsize = 0;
int i;
FixedParallelState *fps;
@@ -274,8 +274,8 @@ InitializeParallelDSM(ParallelContext *pcxt)
shm_toc_estimate_chunk(&pcxt->estimator, relmapperlen);
uncommittedenumslen = EstimateUncommittedEnumsSpace();
shm_toc_estimate_chunk(&pcxt->estimator, uncommittedenumslen);
- procinfolen = EstimateParallelProcInfoSpace();
- shm_toc_estimate_chunk(&pcxt->estimator, procinfolen);
+ clientconninfolen = EstimateClientConnectionInfoSpace();
+ shm_toc_estimate_chunk(&pcxt->estimator, clientconninfolen);
/* If you add more chunks here, you probably need to add keys. */
shm_toc_estimate_keys(&pcxt->estimator, 12);
@@ -356,7 +356,7 @@ InitializeParallelDSM(ParallelContext *pcxt)
char *session_dsm_handle_space;
char *entrypointstate;
char *uncommittedenumsspace;
- char *procinfospace;
+ char *clientconninfospace;
Size lnamelen;
/* Serialize shared libraries we have loaded. */
@@ -427,11 +427,11 @@ InitializeParallelDSM(ParallelContext *pcxt)
shm_toc_insert(pcxt->toc, PARALLEL_KEY_UNCOMMITTEDENUMS,
uncommittedenumsspace);
- /* Serialize our ParallelProcInfo. */
- procinfospace = shm_toc_allocate(pcxt->toc, procinfolen);
- SerializeParallelProcInfo(procinfolen, procinfospace);
- shm_toc_insert(pcxt->toc, PARALLEL_KEY_PROCINFO,
- procinfospace);
+ /* Serialize our ClientConnectionInfo. */
+ clientconninfospace = shm_toc_allocate(pcxt->toc, clientconninfolen);
+ SerializeClientConnectionInfo(clientconninfolen, clientconninfospace);
+ shm_toc_insert(pcxt->toc, PARALLEL_KEY_CLIENTCONNINFO,
+ clientconninfospace);
/* Allocate space for worker information. */
pcxt->worker = palloc0(sizeof(ParallelWorkerInfo) * pcxt->nworkers);
@@ -1281,7 +1281,7 @@ ParallelWorkerMain(Datum main_arg)
char *reindexspace;
char *relmapperspace;
char *uncommittedenumsspace;
- char *procinfospace;
+ char *clientconninfospace;
StringInfoData msgbuf;
char *session_dsm_handle_space;
Snapshot tsnapshot;
@@ -1491,9 +1491,10 @@ ParallelWorkerMain(Datum main_arg)
false);
RestoreUncommittedEnums(uncommittedenumsspace);
- /* Restore the ParallelProcInfo. */
- procinfospace = shm_toc_lookup(toc, PARALLEL_KEY_PROCINFO, false);
- RestoreParallelProcInfo(procinfospace);
+ /* Restore the ClientConnectionInfo. */
+ clientconninfospace = shm_toc_lookup(toc, PARALLEL_KEY_CLIENTCONNINFO,
+ false);
+ RestoreClientConnectionInfo(clientconninfospace);
/* Attach to the leader's serializable transaction, if SERIALIZABLE. */
AttachSerializableXact(fps->serializable_xact_handle);
diff --git a/src/backend/libpq/auth.c b/src/backend/libpq/auth.c
index 2e5fe2cc19..6a499efecd 100644
--- a/src/backend/libpq/auth.c
+++ b/src/backend/libpq/auth.c
@@ -342,7 +342,7 @@ auth_failed(Port *port, int status, const char *logdetail)
* authorization will fail later.
*
* The provided string will be copied into TopMemoryContext, to match the
- * lifetime of MyParallelProcInfo, so it is safe to pass a string that is
+ * lifetime of MyClientConnectionInfo, so it is safe to pass a string that is
* managed by an external library.
*/
static void
@@ -350,7 +350,7 @@ set_authn_id(Port *port, const char *id)
{
Assert(id);
- if (MyParallelProcInfo.authn_id)
+ if (MyClientConnectionInfo.authn_id)
{
/*
* An existing authn_id should never be overwritten; that means two
@@ -361,17 +361,17 @@ set_authn_id(Port *port, const char *id)
ereport(FATAL,
(errmsg("authentication identifier set more than once"),
errdetail_log("previous identifier: \"%s\"; new identifier: \"%s\"",
- MyParallelProcInfo.authn_id, id)));
+ MyClientConnectionInfo.authn_id, id)));
}
- MyParallelProcInfo.authn_id = MemoryContextStrdup(TopMemoryContext, id);
+ MyClientConnectionInfo.authn_id = MemoryContextStrdup(TopMemoryContext, id);
if (Log_connections)
{
ereport(LOG,
errmsg("connection authenticated: identity=\"%s\" method=%s "
"(%s:%d)",
- MyParallelProcInfo.authn_id,
+ MyClientConnectionInfo.authn_id,
hba_authname(port->hba->auth_method), HbaFileName,
port->hba->linenumber));
}
@@ -1910,7 +1910,7 @@ auth_peer(hbaPort *port)
set_authn_id(port, pw->pw_name);
ret = check_usermap(port->hba->usermap, port->user_name,
- MyParallelProcInfo.authn_id, false);
+ MyClientConnectionInfo.authn_id, false);
return ret;
#else
diff --git a/src/backend/utils/adt/name.c b/src/backend/utils/adt/name.c
index 24a06bf933..97c827fb9a 100644
--- a/src/backend/utils/adt/name.c
+++ b/src/backend/utils/adt/name.c
@@ -275,10 +275,10 @@ session_user(PG_FUNCTION_ARGS)
Datum
pg_session_authn_id(PG_FUNCTION_ARGS)
{
- if (!MyParallelProcInfo.authn_id)
+ if (!MyClientConnectionInfo.authn_id)
PG_RETURN_NULL();
- PG_RETURN_TEXT_P(cstring_to_text(MyParallelProcInfo.authn_id));
+ PG_RETURN_TEXT_P(cstring_to_text(MyClientConnectionInfo.authn_id));
}
diff --git a/src/backend/utils/init/miscinit.c b/src/backend/utils/init/miscinit.c
index 408fa8953d..1bbe1eaa17 100644
--- a/src/backend/utils/init/miscinit.c
+++ b/src/backend/utils/init/miscinit.c
@@ -935,48 +935,49 @@ GetUserNameFromId(Oid roleid, bool noerr)
/* ------------------------------------------------------------------------
* Parallel connection state
*
- * MyParallelProcInfo contains pieces of information about the client that need
- * to be synced to parallel workers when they initialize. Over time, this list
- * will probably grow, and may subsume some of the "user state" variables above.
+ * ClientConnectionInfo contains pieces of information about the client that
+ * need to be synced to parallel workers when they initialize. Over time, this
+ * list will probably grow, and may subsume some of the "user state" variables
+ * above.
*-------------------------------------------------------------------------
*/
-ParallelProcInfo MyParallelProcInfo;
+ClientConnectionInfo MyClientConnectionInfo;
/*
- * Calculate the space needed to serialize MyParallelProcInfo.
+ * Calculate the space needed to serialize MyClientConnectionInfo.
*/
Size
-EstimateParallelProcInfoSpace(void)
+EstimateClientConnectionInfoSpace(void)
{
Size size = 1;
- if (MyParallelProcInfo.authn_id)
- size = add_size(size, strlen(MyParallelProcInfo.authn_id) + 1);
+ if (MyClientConnectionInfo.authn_id)
+ size = add_size(size, strlen(MyClientConnectionInfo.authn_id) + 1);
return size;
}
/*
- * Serialize MyParallelProcInfo for use by parallel workers.
+ * Serialize MyClientConnectionInfo for use by parallel workers.
*/
void
-SerializeParallelProcInfo(Size maxsize, char *start_address)
+SerializeClientConnectionInfo(Size maxsize, char *start_address)
{
/*
* First byte is an indication of whether or not authn_id has been set to
* non-NULL, to differentiate that case from the empty string.
*/
Assert(maxsize > 0);
- start_address[0] = MyParallelProcInfo.authn_id ? 1 : 0;
+ start_address[0] = MyClientConnectionInfo.authn_id ? 1 : 0;
start_address++;
maxsize--;
- if (MyParallelProcInfo.authn_id)
+ if (MyClientConnectionInfo.authn_id)
{
Size len;
- len = strlcpy(start_address, MyParallelProcInfo.authn_id, maxsize) + 1;
+ len = strlcpy(start_address, MyClientConnectionInfo.authn_id, maxsize) + 1;
Assert(len <= maxsize);
maxsize -= len;
start_address += len;
@@ -984,22 +985,22 @@ SerializeParallelProcInfo(Size maxsize, char *start_address)
}
/*
- * Restore MyParallelProcInfo from its serialized representation.
+ * Restore MyClientConnectionInfo from its serialized representation.
*/
void
-RestoreParallelProcInfo(char *procinfo)
+RestoreClientConnectionInfo(char *conninfo)
{
- if (procinfo[0] == 0)
+ if (conninfo[0] == 0)
{
- MyParallelProcInfo.authn_id = NULL;
- procinfo++;
+ MyClientConnectionInfo.authn_id = NULL;
+ conninfo++;
}
else
{
- procinfo++;
- MyParallelProcInfo.authn_id = MemoryContextStrdup(TopMemoryContext,
- procinfo);
- procinfo += strlen(procinfo) + 1;
+ conninfo++;
+ MyClientConnectionInfo.authn_id = MemoryContextStrdup(TopMemoryContext,
+ conninfo);
+ conninfo += strlen(conninfo) + 1;
}
}
diff --git a/src/include/libpq/libpq-be.h b/src/include/libpq/libpq-be.h
index f381e958ee..c900411fdd 100644
--- a/src/include/libpq/libpq-be.h
+++ b/src/include/libpq/libpq-be.h
@@ -99,9 +99,13 @@ typedef struct
#endif
/*
- * Fields from Port that need to be copied over to parallel workers go into the
- * ParallelProcInfo. The same rules apply for allocations here as for Port (must
- * be malloc'd or palloc'd in TopMemoryContext).
+ * Fields describing the client connection, that also need to be copied over to
+ * parallel workers, go into the ClientConnectionInfo rather than Port. The same
+ * rules apply for allocations here as for Port (must be malloc'd or palloc'd in
+ * TopMemoryContext).
+ *
+ * If you add a struct member here, remember to also handle serialization in
+ * SerializeClientConnectionInfo() et al.
*/
typedef struct
{
@@ -117,7 +121,7 @@ typedef struct
* example if the "trust" auth method is in use.
*/
const char *authn_id;
-} ParallelProcInfo;
+} ClientConnectionInfo;
/*
* This is used by the postmaster in its communication with frontends. It
@@ -335,7 +339,7 @@ extern ssize_t be_gssapi_write(Port *port, void *ptr, size_t len);
#endif /* ENABLE_GSS */
extern PGDLLIMPORT ProtocolVersion FrontendProtocol;
-extern PGDLLIMPORT ParallelProcInfo MyParallelProcInfo;
+extern PGDLLIMPORT ClientConnectionInfo MyClientConnectionInfo;
/* TCP keepalives configuration. These are no-ops on an AF_UNIX socket. */
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 55ad268700..c06796fe4a 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -486,9 +486,9 @@ extern bool has_rolreplication(Oid roleid);
typedef void (*shmem_request_hook_type) (void);
extern PGDLLIMPORT shmem_request_hook_type shmem_request_hook;
-extern Size EstimateParallelProcInfoSpace(void);
-extern void SerializeParallelProcInfo(Size maxsize, char *start_address);
-extern void RestoreParallelProcInfo(char *procinfo);
+extern Size EstimateClientConnectionInfoSpace(void);
+extern void SerializeClientConnectionInfo(Size maxsize, char *start_address);
+extern void RestoreClientConnectionInfo(char *procinfo);
/* in executor/nodeHash.c */
extern size_t get_hash_memory_limit(void);
v11-0002-Allow-parallel-workers-to-use-pg_session_authn_i.patch (text/x-patch; charset=US-ASCII)
From 77801627c46f9e29918d35ffd7430e861ac03b82 Mon Sep 17 00:00:00 2001
From: Jacob Champion <pchampion@vmware.com>
Date: Wed, 23 Mar 2022 15:07:05 -0700
Subject: [PATCH v11 2/2] Allow parallel workers to use pg_session_authn_id()
Move authn_id into a new global, MyClientConnectionInfo, which is
intended to hold all the client information that needs to be shared
between the backend and any parallel workers. MyClientConnectionInfo is
serialized and restored using a new parallel key.
With this change, the parallel restriction can be removed from
pg_session_authn_id().
---
src/backend/access/transam/parallel.c | 19 +++++-
src/backend/libpq/auth.c | 16 ++---
src/backend/utils/adt/name.c | 4 +-
src/backend/utils/init/miscinit.c | 72 +++++++++++++++++++++++
src/include/catalog/pg_proc.dat | 2 +-
src/include/libpq/libpq-be.h | 39 ++++++++----
src/include/miscadmin.h | 4 ++
src/test/authentication/t/001_password.pl | 33 +++++++++++
8 files changed, 165 insertions(+), 24 deletions(-)
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index df0cd77558..bc93101ff7 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -76,6 +76,7 @@
#define PARALLEL_KEY_REINDEX_STATE UINT64CONST(0xFFFFFFFFFFFF000C)
#define PARALLEL_KEY_RELMAPPER_STATE UINT64CONST(0xFFFFFFFFFFFF000D)
#define PARALLEL_KEY_UNCOMMITTEDENUMS UINT64CONST(0xFFFFFFFFFFFF000E)
+#define PARALLEL_KEY_CLIENTCONNINFO UINT64CONST(0xFFFFFFFFFFFF000F)
/* Fixed-size parallel state. */
typedef struct FixedParallelState
@@ -212,6 +213,7 @@ InitializeParallelDSM(ParallelContext *pcxt)
Size reindexlen = 0;
Size relmapperlen = 0;
Size uncommittedenumslen = 0;
+ Size clientconninfolen = 0;
Size segsize = 0;
int i;
FixedParallelState *fps;
@@ -272,8 +274,10 @@ InitializeParallelDSM(ParallelContext *pcxt)
shm_toc_estimate_chunk(&pcxt->estimator, relmapperlen);
uncommittedenumslen = EstimateUncommittedEnumsSpace();
shm_toc_estimate_chunk(&pcxt->estimator, uncommittedenumslen);
+ clientconninfolen = EstimateClientConnectionInfoSpace();
+ shm_toc_estimate_chunk(&pcxt->estimator, clientconninfolen);
/* If you add more chunks here, you probably need to add keys. */
- shm_toc_estimate_keys(&pcxt->estimator, 11);
+ shm_toc_estimate_keys(&pcxt->estimator, 12);
/* Estimate space need for error queues. */
StaticAssertStmt(BUFFERALIGN(PARALLEL_ERROR_QUEUE_SIZE) ==
@@ -352,6 +356,7 @@ InitializeParallelDSM(ParallelContext *pcxt)
char *session_dsm_handle_space;
char *entrypointstate;
char *uncommittedenumsspace;
+ char *clientconninfospace;
Size lnamelen;
/* Serialize shared libraries we have loaded. */
@@ -422,6 +427,12 @@ InitializeParallelDSM(ParallelContext *pcxt)
shm_toc_insert(pcxt->toc, PARALLEL_KEY_UNCOMMITTEDENUMS,
uncommittedenumsspace);
+ /* Serialize our ClientConnectionInfo. */
+ clientconninfospace = shm_toc_allocate(pcxt->toc, clientconninfolen);
+ SerializeClientConnectionInfo(clientconninfolen, clientconninfospace);
+ shm_toc_insert(pcxt->toc, PARALLEL_KEY_CLIENTCONNINFO,
+ clientconninfospace);
+
/* Allocate space for worker information. */
pcxt->worker = palloc0(sizeof(ParallelWorkerInfo) * pcxt->nworkers);
@@ -1270,6 +1281,7 @@ ParallelWorkerMain(Datum main_arg)
char *reindexspace;
char *relmapperspace;
char *uncommittedenumsspace;
+ char *clientconninfospace;
StringInfoData msgbuf;
char *session_dsm_handle_space;
Snapshot tsnapshot;
@@ -1479,6 +1491,11 @@ ParallelWorkerMain(Datum main_arg)
false);
RestoreUncommittedEnums(uncommittedenumsspace);
+ /* Restore the ClientConnectionInfo. */
+ clientconninfospace = shm_toc_lookup(toc, PARALLEL_KEY_CLIENTCONNINFO,
+ false);
+ RestoreClientConnectionInfo(clientconninfospace);
+
/* Attach to the leader's serializable transaction, if SERIALIZABLE. */
AttachSerializableXact(fps->serializable_xact_handle);
diff --git a/src/backend/libpq/auth.c b/src/backend/libpq/auth.c
index efc53f3135..6a499efecd 100644
--- a/src/backend/libpq/auth.c
+++ b/src/backend/libpq/auth.c
@@ -342,15 +342,15 @@ auth_failed(Port *port, int status, const char *logdetail)
* authorization will fail later.
*
* The provided string will be copied into TopMemoryContext, to match the
- * lifetime of the Port, so it is safe to pass a string that is managed by an
- * external library.
+ * lifetime of MyClientConnectionInfo, so it is safe to pass a string that is
+ * managed by an external library.
*/
static void
set_authn_id(Port *port, const char *id)
{
Assert(id);
- if (port->authn_id)
+ if (MyClientConnectionInfo.authn_id)
{
/*
* An existing authn_id should never be overwritten; that means two
@@ -361,17 +361,18 @@ set_authn_id(Port *port, const char *id)
ereport(FATAL,
(errmsg("authentication identifier set more than once"),
errdetail_log("previous identifier: \"%s\"; new identifier: \"%s\"",
- port->authn_id, id)));
+ MyClientConnectionInfo.authn_id, id)));
}
- port->authn_id = MemoryContextStrdup(TopMemoryContext, id);
+ MyClientConnectionInfo.authn_id = MemoryContextStrdup(TopMemoryContext, id);
if (Log_connections)
{
ereport(LOG,
errmsg("connection authenticated: identity=\"%s\" method=%s "
"(%s:%d)",
- port->authn_id, hba_authname(port->hba->auth_method), HbaFileName,
+ MyClientConnectionInfo.authn_id,
+ hba_authname(port->hba->auth_method), HbaFileName,
port->hba->linenumber));
}
}
@@ -1908,7 +1909,8 @@ auth_peer(hbaPort *port)
*/
set_authn_id(port, pw->pw_name);
- ret = check_usermap(port->hba->usermap, port->user_name, port->authn_id, false);
+ ret = check_usermap(port->hba->usermap, port->user_name,
+ MyClientConnectionInfo.authn_id, false);
return ret;
#else
diff --git a/src/backend/utils/adt/name.c b/src/backend/utils/adt/name.c
index 662a7943ed..97c827fb9a 100644
--- a/src/backend/utils/adt/name.c
+++ b/src/backend/utils/adt/name.c
@@ -275,10 +275,10 @@ session_user(PG_FUNCTION_ARGS)
Datum
pg_session_authn_id(PG_FUNCTION_ARGS)
{
- if (!MyProcPort || !MyProcPort->authn_id)
+ if (!MyClientConnectionInfo.authn_id)
PG_RETURN_NULL();
- PG_RETURN_TEXT_P(cstring_to_text(MyProcPort->authn_id));
+ PG_RETURN_TEXT_P(cstring_to_text(MyClientConnectionInfo.authn_id));
}
diff --git a/src/backend/utils/init/miscinit.c b/src/backend/utils/init/miscinit.c
index b25bd0e583..1bbe1eaa17 100644
--- a/src/backend/utils/init/miscinit.c
+++ b/src/backend/utils/init/miscinit.c
@@ -932,6 +932,78 @@ GetUserNameFromId(Oid roleid, bool noerr)
return result;
}
+/* ------------------------------------------------------------------------
+ * Parallel connection state
+ *
+ * ClientConnectionInfo contains pieces of information about the client that
+ * need to be synced to parallel workers when they initialize. Over time, this
+ * list will probably grow, and may subsume some of the "user state" variables
+ * above.
+ *-------------------------------------------------------------------------
+ */
+
+ClientConnectionInfo MyClientConnectionInfo;
+
+/*
+ * Calculate the space needed to serialize MyClientConnectionInfo.
+ */
+Size
+EstimateClientConnectionInfoSpace(void)
+{
+ Size size = 1;
+
+ if (MyClientConnectionInfo.authn_id)
+ size = add_size(size, strlen(MyClientConnectionInfo.authn_id) + 1);
+
+ return size;
+}
+
+/*
+ * Serialize MyClientConnectionInfo for use by parallel workers.
+ */
+void
+SerializeClientConnectionInfo(Size maxsize, char *start_address)
+{
+ /*
+ * First byte is an indication of whether or not authn_id has been set to
+ * non-NULL, to differentiate that case from the empty string.
+ */
+ Assert(maxsize > 0);
+ start_address[0] = MyClientConnectionInfo.authn_id ? 1 : 0;
+ start_address++;
+ maxsize--;
+
+ if (MyClientConnectionInfo.authn_id)
+ {
+ Size len;
+
+ len = strlcpy(start_address, MyClientConnectionInfo.authn_id, maxsize) + 1;
+ Assert(len <= maxsize);
+ maxsize -= len;
+ start_address += len;
+ }
+}
+
+/*
+ * Restore MyClientConnectionInfo from its serialized representation.
+ */
+void
+RestoreClientConnectionInfo(char *conninfo)
+{
+ if (conninfo[0] == 0)
+ {
+ MyClientConnectionInfo.authn_id = NULL;
+ conninfo++;
+ }
+ else
+ {
+ conninfo++;
+ MyClientConnectionInfo.authn_id = MemoryContextStrdup(TopMemoryContext,
+ conninfo);
+ conninfo += strlen(conninfo) + 1;
+ }
+}
+
/*-------------------------------------------------------------------------
* Interlock-file support
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 8e181b4771..d4fa9d32dd 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -1509,7 +1509,7 @@
proname => 'session_user', provolatile => 's', prorettype => 'name',
proargtypes => '', prosrc => 'session_user' },
{ oid => '9774', descr => 'session authenticated identity',
- proname => 'pg_session_authn_id', provolatile => 's', proparallel => 'r',
+ proname => 'pg_session_authn_id', provolatile => 's',
prorettype => 'text', proargtypes => '', prosrc => 'pg_session_authn_id' },
{ oid => '744',
diff --git a/src/include/libpq/libpq-be.h b/src/include/libpq/libpq-be.h
index 90c20da22b..c900411fdd 100644
--- a/src/include/libpq/libpq-be.h
+++ b/src/include/libpq/libpq-be.h
@@ -98,6 +98,31 @@ typedef struct
} pg_gssinfo;
#endif
+/*
+ * Fields describing the client connection, that also need to be copied over to
+ * parallel workers, go into the ClientConnectionInfo rather than Port. The same
+ * rules apply for allocations here as for Port (must be malloc'd or palloc'd in
+ * TopMemoryContext).
+ *
+ * If you add a struct member here, remember to also handle serialization in
+ * SerializeClientConnectionInfo() et al.
+ */
+typedef struct
+{
+ /*
+ * Authenticated identity. The meaning of this identifier is dependent on
+ * hba->auth_method; it is the identity (if any) that the user presented
+ * during the authentication cycle, before they were assigned a database
+ * role. (It is effectively the "SYSTEM-USERNAME" of a pg_ident usermap
+ * -- though the exact string in use may be different, depending on pg_hba
+ * options.)
+ *
+ * authn_id is NULL if the user has not actually been authenticated, for
+ * example if the "trust" auth method is in use.
+ */
+ const char *authn_id;
+} ClientConnectionInfo;
+
/*
* This is used by the postmaster in its communication with frontends. It
* contains all state information needed during this communication before the
@@ -158,19 +183,6 @@ typedef struct Port
*/
HbaLine *hba;
- /*
- * Authenticated identity. The meaning of this identifier is dependent on
- * hba->auth_method; it is the identity (if any) that the user presented
- * during the authentication cycle, before they were assigned a database
- * role. (It is effectively the "SYSTEM-USERNAME" of a pg_ident usermap
- * -- though the exact string in use may be different, depending on pg_hba
- * options.)
- *
- * authn_id is NULL if the user has not actually been authenticated, for
- * example if the "trust" auth method is in use.
- */
- const char *authn_id;
-
/*
* TCP keepalive and user timeout settings.
*
@@ -327,6 +339,7 @@ extern ssize_t be_gssapi_write(Port *port, void *ptr, size_t len);
#endif /* ENABLE_GSS */
extern PGDLLIMPORT ProtocolVersion FrontendProtocol;
+extern PGDLLIMPORT ClientConnectionInfo MyClientConnectionInfo;
/* TCP keepalives configuration. These are no-ops on an AF_UNIX socket. */
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 0af130fbc5..c06796fe4a 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -486,6 +486,10 @@ extern bool has_rolreplication(Oid roleid);
typedef void (*shmem_request_hook_type) (void);
extern PGDLLIMPORT shmem_request_hook_type shmem_request_hook;
+extern Size EstimateClientConnectionInfoSpace(void);
+extern void SerializeClientConnectionInfo(Size maxsize, char *start_address);
+extern void RestoreClientConnectionInfo(char *procinfo);
+
/* in executor/nodeHash.c */
extern size_t get_hash_memory_limit(void);
diff --git a/src/test/authentication/t/001_password.pl b/src/test/authentication/t/001_password.pl
index f0bdeda52d..3f8629b3a6 100644
--- a/src/test/authentication/t/001_password.pl
+++ b/src/test/authentication/t/001_password.pl
@@ -74,6 +74,14 @@ $node->safe_psql('postgres',
);
$ENV{"PGPASSWORD"} = 'pass';
+# Set up a table for parallel worker testing.
+$node->safe_psql('postgres',
+ 'CREATE TABLE nulls (n) AS SELECT NULL FROM generate_series(1, 200000);'
+);
+$node->safe_psql('postgres',
+ 'GRANT SELECT ON nulls TO md5_role;'
+);
+
# For "trust" method, all users should be able to connect. These users are not
# considered to be authenticated.
reset_pg_hba($node, 'trust');
@@ -86,6 +94,19 @@ my $res =
$node->safe_psql('postgres', "SELECT pg_session_authn_id() IS NULL;");
is($res, 't', "users with trust authentication have NULL authn_id");
+# Test pg_session_authn_id() with parallel workers.
+$res = $node->safe_psql(
+ 'postgres', '
+ SET min_parallel_table_scan_size TO 0;
+ SET parallel_setup_cost TO 0;
+ SET parallel_tuple_cost TO 0;
+ SET max_parallel_workers_per_gather TO 2;
+
+ SELECT bool_and(pg_session_authn_id() IS NOT DISTINCT FROM n) FROM nulls;
+ ',
+ connstr => "user=md5_role");
+is($res, 't', "parallel workers return a null authn_id when not authenticated");
+
# For plain "password" method, all users should also be able to connect.
reset_pg_hba($node, 'password');
test_role($node, 'scram_role', 'password', 0,
@@ -102,6 +123,18 @@ $res = $node->safe_psql(
is($res, 'md5_role',
"users with md5 authentication have authn_id matching role name");
+$res = $node->safe_psql(
+ 'postgres', '
+ SET min_parallel_table_scan_size TO 0;
+ SET parallel_setup_cost TO 0;
+ SET parallel_tuple_cost TO 0;
+ SET max_parallel_workers_per_gather TO 2;
+
+ SELECT bool_and(pg_session_authn_id() IS DISTINCT FROM n) FROM nulls;
+ ',
+ connstr => "user=md5_role");
+is($res, 't', "parallel workers return a non-null authn_id when authenticated");
+
# For "scram-sha-256" method, user "scram_role" should be able to connect.
reset_pg_hba($node, 'scram-sha-256');
test_role(
--
2.25.1
v11-0001-Add-API-to-retrieve-authn_id-from-SQL.patch (text/x-patch; charset=US-ASCII)
From 31a9d3ab7928d41c8e5d4778893455a31defc6a6 Mon Sep 17 00:00:00 2001
From: Jacob Champion <pchampion@vmware.com>
Date: Mon, 14 Feb 2022 08:10:53 -0800
Subject: [PATCH v11 1/2] Add API to retrieve authn_id from SQL
The authn_id field in MyProcPort is currently only accessible to the
backend itself. Add a SQL function, pg_session_authn_id(), to expose
the field to triggers that may want to make use of it.
---
doc/src/sgml/func.sgml | 26 +++++++++++++++++++++++
src/backend/utils/adt/name.c | 12 ++++++++++-
src/include/catalog/pg_proc.dat | 3 +++
src/test/authentication/t/001_password.pl | 11 ++++++++++
src/test/ssl/t/001_ssltests.pl | 7 ++++++
5 files changed, 58 insertions(+), 1 deletion(-)
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 478a216dbb..b45659b609 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -23344,6 +23344,32 @@ SELECT * FROM pg_ls_dir('.') WITH ORDINALITY AS t(ls,n);
</para></entry>
</row>
+ <row>
+ <entry role="func_table_entry"><para role="func_signature">
+ <indexterm>
+ <primary>pg_session_authn_id</primary>
+ </indexterm>
+ <function>pg_session_authn_id</function> ()
+ <returnvalue>text</returnvalue>
+ </para>
+ <para>
+ Returns the authenticated identity for the current connection, or
+ <literal>NULL</literal> if the user has not been authenticated.
+ </para>
+ <para>
+ The authenticated identity is an immutable identifier for the user
+ presented during the connection handshake; the exact format depends on
+ the authentication method in use. (For example, when using the
+ <literal>scram-sha-256</literal> auth method, the authenticated identity
+ is simply the username. When using the <literal>cert</literal> auth
+ method, the authenticated identity is the Distinguished Name of the
+ client certificate.) Even for auth methods which use the username as
+ the authenticated identity, this function differs from
+ <literal>session_user</literal> in that its return value cannot be
+ changed after login.
+ </para></entry>
+ </row>
+
<row>
<entry role="func_table_entry"><para role="func_signature">
<indexterm>
diff --git a/src/backend/utils/adt/name.c b/src/backend/utils/adt/name.c
index e8bba3670c..662a7943ed 100644
--- a/src/backend/utils/adt/name.c
+++ b/src/backend/utils/adt/name.c
@@ -23,6 +23,7 @@
#include "catalog/namespace.h"
#include "catalog/pg_collation.h"
#include "catalog/pg_type.h"
+#include "libpq/libpq-be.h"
#include "libpq/pqformat.h"
#include "mb/pg_wchar.h"
#include "miscadmin.h"
@@ -257,7 +258,7 @@ namestrcmp(Name name, const char *str)
/*
- * SQL-functions CURRENT_USER, SESSION_USER
+ * SQL-functions CURRENT_USER, SESSION_USER, PG_SESSION_AUTHN_ID
*/
Datum
current_user(PG_FUNCTION_ARGS)
@@ -271,6 +272,15 @@ session_user(PG_FUNCTION_ARGS)
PG_RETURN_DATUM(DirectFunctionCall1(namein, CStringGetDatum(GetUserNameFromId(GetSessionUserId(), false))));
}
+Datum
+pg_session_authn_id(PG_FUNCTION_ARGS)
+{
+ if (!MyProcPort || !MyProcPort->authn_id)
+ PG_RETURN_NULL();
+
+ PG_RETURN_TEXT_P(cstring_to_text(MyProcPort->authn_id));
+}
+
/*
* SQL-functions CURRENT_SCHEMA, CURRENT_SCHEMAS
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 87aa571a33..8e181b4771 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -1508,6 +1508,9 @@
{ oid => '746', descr => 'session user name',
proname => 'session_user', provolatile => 's', prorettype => 'name',
proargtypes => '', prosrc => 'session_user' },
+{ oid => '9774', descr => 'session authenticated identity',
+ proname => 'pg_session_authn_id', provolatile => 's', proparallel => 'r',
+ prorettype => 'text', proargtypes => '', prosrc => 'pg_session_authn_id' },
{ oid => '744',
proname => 'array_eq', prorettype => 'bool',
diff --git a/src/test/authentication/t/001_password.pl b/src/test/authentication/t/001_password.pl
index 3e3079c824..f0bdeda52d 100644
--- a/src/test/authentication/t/001_password.pl
+++ b/src/test/authentication/t/001_password.pl
@@ -82,6 +82,10 @@ test_role($node, 'scram_role', 'trust', 0,
test_role($node, 'md5_role', 'trust', 0,
log_unlike => [qr/connection authenticated:/]);
+my $res =
+ $node->safe_psql('postgres', "SELECT pg_session_authn_id() IS NULL;");
+is($res, 't', "users with trust authentication have NULL authn_id");
+
# For plain "password" method, all users should also be able to connect.
reset_pg_hba($node, 'password');
test_role($node, 'scram_role', 'password', 0,
@@ -91,6 +95,13 @@ test_role($node, 'md5_role', 'password', 0,
log_like =>
[qr/connection authenticated: identity="md5_role" method=password/]);
+$res = $node->safe_psql(
+ 'postgres',
+ "SELECT pg_session_authn_id();",
+ connstr => "user=md5_role");
+is($res, 'md5_role',
+ "users with md5 authentication have authn_id matching role name");
+
# For "scram-sha-256" method, user "scram_role" should be able to connect.
reset_pg_hba($node, 'scram-sha-256');
test_role(
diff --git a/src/test/ssl/t/001_ssltests.pl b/src/test/ssl/t/001_ssltests.pl
index c0b4a5739c..2941eb0bde 100644
--- a/src/test/ssl/t/001_ssltests.pl
+++ b/src/test/ssl/t/001_ssltests.pl
@@ -562,6 +562,13 @@ $node->connect_ok(
qr/connection authenticated: identity="CN=ssltestuser-dn,OU=Testing,OU=Engineering,O=PGDG" method=cert/
],);
+# Sanity-check pg_session_authn_id() for long ID strings
+my $res = $node->safe_psql('postgres',
+ "SELECT pg_session_authn_id();",
+ connstr => "$dn_connstr user=ssltestuser sslcert=ssl/client-dn.crt " . sslkey('client-dn.key'),
+);
+is($res, "CN=ssltestuser-dn,OU=Testing,OU=Engineering,O=PGDG", "users with cert authentication have entire DN as authn_id");
+
# same thing but with a regex
$dn_connstr = "$common_connstr dbname=certdb_dn_re";
--
2.25.1
Hi,
On 6/10/22 7:58 PM, Jacob Champion wrote:
On Thu, Jun 9, 2022 at 6:23 AM Robert Haas <robertmhaas@gmail.com> wrote:
On Wed, Jun 8, 2022 at 7:53 PM Jacob Champion <jchampion@timescale.com> wrote:
But I don't have any better ideas for how to achieve both. I'm fine
with your suggestion of ClientConnectionInfo, if that sounds good to
others; the doc comment can clarify why it differs from Port? Or add
one of the Shared-/Gang-/Group- prefixes to it, maybe?
I don't like the prefixes, so I'd prefer explaining it in the struct comment.
Done that way in v11.
Thanks!
--Jacob
FWIW, I just created a new thread to expose the port->authn_id through
the SYSTEM_USER sql reserved word.
Regards,
Bertrand
On 6/22/22 06:31, Drouvot, Bertrand wrote:
FWIW, I just created a new thread to expose the port->authn_id through
the SYSTEM_USER sql reserved word.
Review for both seems to have dried up a bit. I'm not particularly
invested in my code, but I do want to see *a* solution go in. So if it
helps the review momentum for me to withdraw this patch and put my
effort into SYSTEM_USER, I can do that no problem.
Thoughts from prior reviewers? Is SYSTEM_USER the way to go?
--Jacob
Hi,
On 8/2/22 11:57 PM, Jacob Champion wrote:
On 6/22/22 06:31, Drouvot, Bertrand wrote:
FWIW, I just created a new thread to expose the port->authn_id through
the SYSTEM_USER sql reserved word.
Review for both seems to have dried up a bit. I'm not particularly
invested in my code, but I do want to see *a* solution go in. So if it
helps the review momentum for me to withdraw this patch and put my
effort into SYSTEM_USER, I can do that no problem.
Thoughts from prior reviewers? Is SYSTEM_USER the way to go?
I did not look in detail to this thread, but if the goal is "only" to
expose authn_id (as the subject describes) then it seems to me that
SYSTEM_USER [1] is the way to go.
[1]: https://commitfest.postgresql.org/39/3703/
--
Bertrand Drouvot
Amazon Web Services: https://aws.amazon.com
On Fri, Aug 05, 2022 at 12:48:33PM +0200, Drouvot, Bertrand wrote:
On 8/2/22 11:57 PM, Jacob Champion wrote:
Thoughts from prior reviewers? Is SYSTEM_USER the way to go?
Reading through the other thread, there is a clear parallel between
both in concept to provide this information at SQL level, indeed.
I did not look in detail to this thread, but if the goal is "only" to expose
authn_id (as the subject describes) then it seems to me that SYSTEM_USER [1]
is the way to go.
However, I am not sure if the suggestion of auth_method:authn as
output generated by SYSTEM_USER would be correct according to the SQL
specification, either. The spec being not really talkative about the
details of what an external module should be opens up for a lot of
interpretation, something that both threads are dealing with.
Anyway, we are talking about two different problems on this thread:
1) Provide the authn/SYSTEM_USER through some SQL interface, which is
what 0001 is an attempt of, SYSTEM_USER is a different attempt.
2) Move authn out of Port into its own structure, named
ClientConnectionInfo, and pass it down to the parallel workers.
SYSTEM_USER overlaps with 0001, but I see no reason to not do 0002 in
all cases. Even if we are not sure yet of how to expose that at SQL
level, we still want to pass down this information to the parallel
workers and we still want to not have that in Port. An extension
could also do easily the job once 0002 is done, so I see a good
argument about doing it anyway. The name ClientConnectionInfo from
Robert looks like the compromise we have for the new structure holding
the information about the client information passed down to the
workers.
--
Michael
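To make the "pass it down to the parallel workers" part of 0002 concrete, here is a minimal standalone sketch of the wire format the patch uses: one flag byte, followed by the NUL-terminated string only when authn_id is set. DemoConnInfo and the demo_* functions are hypothetical stand-ins written for this illustration. They use malloc where the patch uses MemoryContextStrdup(TopMemoryContext, ...), and they are not the PostgreSQL functions themselves.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for ClientConnectionInfo (illustration only). */
typedef struct
{
	const char *authn_id;		/* NULL means "not authenticated" */
} DemoConnInfo;

/* One flag byte, plus the string and its terminator when authn_id is set. */
static size_t
demo_estimate(const DemoConnInfo *info)
{
	size_t		size = 1;

	if (info->authn_id)
		size += strlen(info->authn_id) + 1;
	return size;
}

/* Write the flag byte, then the string if there is one. */
static void
demo_serialize(const DemoConnInfo *info, size_t maxsize, char *buf)
{
	assert(maxsize > 0);
	buf[0] = info->authn_id ? 1 : 0;
	if (info->authn_id)
	{
		size_t		len = strlen(info->authn_id) + 1;

		assert(len <= maxsize - 1);
		memcpy(buf + 1, info->authn_id, len);
	}
}

/* Rebuild the struct; the flag byte separates NULL from the empty string. */
static void
demo_restore(DemoConnInfo *info, const char *buf)
{
	info->authn_id = NULL;
	if (buf[0] != 0)
	{
		size_t		len = strlen(buf + 1) + 1;
		char	   *copy = malloc(len);

		if (copy != NULL)
		{
			memcpy(copy, buf + 1, len);
			info->authn_id = copy;
		}
	}
}

/* Returns 1 if authn_id survives serialize + restore unchanged. */
static int
demo_round_trip(const char *authn_id)
{
	DemoConnInfo src = {authn_id};
	DemoConnInfo dst = {NULL};
	size_t		size = demo_estimate(&src);
	char	   *buf = malloc(size);
	int			same;

	if (buf == NULL)
		return 0;
	demo_serialize(&src, size, buf);
	demo_restore(&dst, buf);
	free(buf);

	if (src.authn_id == NULL || dst.authn_id == NULL)
		same = (src.authn_id == dst.authn_id);
	else
		same = (strcmp(src.authn_id, dst.authn_id) == 0);
	free((void *) dst.authn_id);
	return same;
}
```

Calling demo_round_trip(NULL) and demo_round_trip("") exercises the two cases the flag byte exists to tell apart: an unauthenticated ("trust") session and an authenticated session whose identity happens to be empty.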
On 8/6/22 02:26, Michael Paquier wrote:
On Fri, Aug 05, 2022 at 12:48:33PM +0200, Drouvot, Bertrand wrote:
On 8/2/22 11:57 PM, Jacob Champion wrote:
Thoughts from prior reviewers? Is SYSTEM_USER the way to go?
Reading through the other thread, there is a clear parallel between
both in concept to provide this information at SQL level, indeed.
I did not look in detail to this thread, but if the goal is "only" to expose
authn_id (as the subject describes) then it seems to me that SYSTEM_USER [1]
is the way to go.
However, I am not sure if the suggestion of auth_method:authn as
output generated by SYSTEM_USER would be correct according to the SQL
specification, either. The spec being not really talkative about the
details of what an external module should be opens up for a lot of
interpretation, something that both threads are dealing with.
As I pointed out here
[/messages/by-id/28b4a9ef-5103-f117-99e1-99ae5a86a6e8@joeconway.com],
both the SQL Server and Oracle interpretations are similar to the one
provided by Bertrand's patch:
SQL Server:
"If the current user is logged in to SQL Server by
using Windows Authentication, SYSTEM_USER returns the
Windows login identification name in the form:
DOMAIN\user_login_name. However, if the current user
is logged in to SQL Server by using SQL Server
Authentication, SYSTEM_USER returns the SQL Server
login identification name"
Oracle:
"SYSTEM_USER
Returns the name of the current data store user as
identified by the operating system."
I am not sure how else we should interpret SYSTEM_USER -- if it isn't
port->authn_id what else would you propose it should be?
--
Joe Conway
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Sat, Aug 06, 2022 at 10:59:26AM -0400, Joe Conway wrote:
I am not sure how else we should interpret SYSTEM_USER -- if it isn't
port->authn_id what else would you propose it should be?
What you say sounds rather right, but I was wondering mainly what
Oracle and SQL server report when it comes to other authentication
methods like SSPI or a cert, where we don't use a user name but some
data dependent on the auth method. And I have no experience with
these.
Anyway, I was looking at Bertrand's patch, and I can see that it is
doing nothing to move the connection information that we have in
Port away to a different structure passed down to the parallel
workers, which is what I understand is a cleanup worth on its own
based on the discussion of this thread. Hence, I still see a good
argument for the introduction of ClientConnectionInfo that gets passed
down to the workers. Based on that, I think that we'd better finish
v11-0002 (only ClientConnectionInfo, no SQL interface) as a first step
to build for the next ones, with authn being the first piece of
information given to the workers. With a separate structure, the
auth_method can also be a second member in ClientConnectionInfo,
completing what would be needed to build SYSTEM_USER as the workers
would have access to it.
Am I getting that right?
--
Michael
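One way to picture the "auth_method as a second member" idea is a fixed-size header followed by the variable-length string. The sketch below is only an assumption about how such a layout could look. DemoConnInfo2, DemoHeader, DemoAuthMethod and the demo2_* functions are hypothetical names invented for this illustration, and none of them come from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical auth method codes; the backend has its own enum. */
typedef enum
{
	DEMO_AUTH_TRUST,
	DEMO_AUTH_PASSWORD,
	DEMO_AUTH_CERT
} DemoAuthMethod;

/* Hypothetical two-member version of the connection info struct. */
typedef struct
{
	const char *authn_id;		/* NULL if not authenticated */
	DemoAuthMethod auth_method;
} DemoConnInfo2;

/* Fixed-size part of the serialized form (assumed layout). */
typedef struct
{
	int32_t		authn_id_len;	/* strlen(authn_id), or -1 for NULL */
	int32_t		auth_method;
} DemoHeader;

static size_t
demo2_estimate(const DemoConnInfo2 *info)
{
	size_t		size = sizeof(DemoHeader);

	if (info->authn_id)
		size += strlen(info->authn_id) + 1;
	return size;
}

static void
demo2_serialize(const DemoConnInfo2 *info, size_t maxsize, char *buf)
{
	DemoHeader	hdr;

	hdr.authn_id_len = info->authn_id ? (int32_t) strlen(info->authn_id) : -1;
	hdr.auth_method = (int32_t) info->auth_method;

	assert(maxsize >= sizeof(hdr));
	memcpy(buf, &hdr, sizeof(hdr));
	if (info->authn_id)
	{
		size_t		len = (size_t) hdr.authn_id_len + 1;

		assert(maxsize - sizeof(hdr) >= len);
		memcpy(buf + sizeof(hdr), info->authn_id, len);
	}
}

static void
demo2_restore(DemoConnInfo2 *info, const char *buf)
{
	DemoHeader	hdr;

	/* memcpy avoids any alignment assumption about buf */
	memcpy(&hdr, buf, sizeof(hdr));
	info->auth_method = (DemoAuthMethod) hdr.auth_method;
	info->authn_id = NULL;
	if (hdr.authn_id_len >= 0)
	{
		size_t		len = (size_t) hdr.authn_id_len + 1;
		char	   *copy = malloc(len);

		if (copy != NULL)
		{
			memcpy(copy, buf + sizeof(hdr), len);
			info->authn_id = copy;
		}
	}
}
```

With a header like this, adding a further fixed-size member only grows DemoHeader, while the NULL-versus-empty distinction moves from a flag byte into the length field.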
Hi,
On 8/7/22 9:41 AM, Michael Paquier wrote:
Anyway, I was looking at Bertrand's patch, and I can see that it is
doing nothing to move the connection information that we have in
Port away to a different structure passed down to the parallel
workers,
Thanks for looking at it!
That's right. The main reason is that in the v2-0003 SYSTEM_USER patch
what is passed down to the parallel workers is not Port->authn_id but a
new "SystemUser" (defined in miscinit.c with CurrentUserId and friends).
which is what I understand is a cleanup worth on its own
based on the discussion of this thread. Hence, I still see a good
argument for the introduction of ClientConnectionInfo that gets passed
down to the workers.
I agree that it could be useful too.
Based on that, I think that we'd better finish
v11-0002 (only ClientConnectionInfo, no SQL interface)
I agree.
as a first step
to build for the next ones, with authn being the first piece of
information given to the workers. With a separate structure, the
auth_method can also be a second member in ClientConnectionInfo,
completing what would be needed to build SYSTEM_USER as the workers
would have access to it.
but I'm not sure we should do it as a first step (given the fact that
this is not Port->authn_id that is passed down to the parallel workers
in the SYSTEM_USER patch).
What do you think about working on both (aka a) v11-002 only
ClientConnectionInfo and b) SYSTEM_USER) in parallel?
Thanks
--
Bertrand Drouvot
Amazon Web Services: https://aws.amazon.com
On Mon, Aug 08, 2022 at 12:43:14PM +0200, Drouvot, Bertrand wrote:
but I'm not sure we should do it as a first step (given the fact that this
is not Port->authn_id that is passed down to the parallel workers in the
SYSTEM_USER patch).
What do you think about working on both (aka a) v11-002 only
ClientConnectionInfo and b) SYSTEM_USER) in parallel?
It seems to me that completing ClientConnectionInfo first has the
advantage of not having to tweak twice the interface we are going to
use when passing down the full structure to the workers, so I would
choose for doing it first (with one field for the authn, and a second
field for the auth method so that the workers can build SYSTEM_USER
by themselves when required).
--
Michael
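As a rough illustration of workers building SYSTEM_USER "by themselves" from those two fields, the sketch below composes the auth_method:authn form mentioned earlier in the thread. demo_system_user is a hypothetical helper written for this example. The exact output format was still under discussion, so treat the colon-separated string as an assumption rather than settled behavior.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: returns a malloc'd "method:identity" string, or NULL
 * when there is no authenticated identity (for example under "trust").
 * The caller frees the result.
 */
static char *
demo_system_user(const char *auth_method_name, const char *authn_id)
{
	size_t		len;
	char	   *result;

	if (authn_id == NULL)
		return NULL;

	/* method + ':' + identity + terminating NUL */
	len = strlen(auth_method_name) + 1 + strlen(authn_id) + 1;
	result = malloc(len);
	if (result == NULL)
		return NULL;

	snprintf(result, len, "%s:%s", auth_method_name, authn_id);
	return result;
}
```

Because both inputs would already be present in the worker after the restore step, nothing beyond the serialized struct has to be shipped to compute the value there.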
Hi,
On 8/9/22 11:17 AM, Michael Paquier wrote:
On Mon, Aug 08, 2022 at 12:43:14PM +0200, Drouvot, Bertrand wrote:
but I'm not sure we should do it as a first step (given the fact that this
is not Port->authn_id that is passed down to the parallel workers in the
SYSTEM_USER patch).
What do you think about working on both (aka a) v11-002 only
ClientConnectionInfo and b) SYSTEM_USER) in parallel?
It seems to me that completing ClientConnectionInfo first has the
advantage of not having to tweak twice the interface we are going to
use when passing down the full structure to the workers, so I would
choose for doing it first (with one field for the authn, and a second
field for the auth method so that the workers can build SYSTEM_USER
by themselves when required).
Yeah fair point.
Agree that it makes sense to work on those patches in this particular
order then.
Thanks,
--
Bertrand Drouvot
Amazon Web Services: https://aws.amazon.com
On Tue, Aug 9, 2022 at 3:39 AM Drouvot, Bertrand <bdrouvot@amazon.com> wrote:
Agree that it makes sense to work on those patches in this particular
order then.
Sounds good. The ClientConnectionInfo patch (previously 0002) is
attached, with the SQL function removed.
Thanks,
--Jacob
Attachments:
Allow-parallel-workers-to-read-authn_id.patch (text/x-patch; charset=US-ASCII)
From a22ff3ba36f5eb93c582a957c7c2caca07ed21c5 Mon Sep 17 00:00:00 2001
From: Jacob Champion <pchampion@vmware.com>
Date: Wed, 23 Mar 2022 15:07:05 -0700
Subject: [PATCH] Allow parallel workers to read authn_id
Move authn_id into a new global, MyClientConnectionInfo, which is
intended to hold all the client information that needs to be shared
between the backend and any parallel workers. MyClientConnectionInfo is
serialized and restored using a new parallel key.
---
src/backend/access/transam/parallel.c | 19 ++++++-
src/backend/libpq/auth.c | 16 +++---
src/backend/utils/init/miscinit.c | 72 +++++++++++++++++++++++++++
src/include/libpq/libpq-be.h | 39 ++++++++++-----
src/include/miscadmin.h | 4 ++
5 files changed, 129 insertions(+), 21 deletions(-)
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index df0cd77558..bc93101ff7 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -76,6 +76,7 @@
#define PARALLEL_KEY_REINDEX_STATE UINT64CONST(0xFFFFFFFFFFFF000C)
#define PARALLEL_KEY_RELMAPPER_STATE UINT64CONST(0xFFFFFFFFFFFF000D)
#define PARALLEL_KEY_UNCOMMITTEDENUMS UINT64CONST(0xFFFFFFFFFFFF000E)
+#define PARALLEL_KEY_CLIENTCONNINFO UINT64CONST(0xFFFFFFFFFFFF000F)
/* Fixed-size parallel state. */
typedef struct FixedParallelState
@@ -212,6 +213,7 @@ InitializeParallelDSM(ParallelContext *pcxt)
Size reindexlen = 0;
Size relmapperlen = 0;
Size uncommittedenumslen = 0;
+ Size clientconninfolen = 0;
Size segsize = 0;
int i;
FixedParallelState *fps;
@@ -272,8 +274,10 @@ InitializeParallelDSM(ParallelContext *pcxt)
shm_toc_estimate_chunk(&pcxt->estimator, relmapperlen);
uncommittedenumslen = EstimateUncommittedEnumsSpace();
shm_toc_estimate_chunk(&pcxt->estimator, uncommittedenumslen);
+ clientconninfolen = EstimateClientConnectionInfoSpace();
+ shm_toc_estimate_chunk(&pcxt->estimator, clientconninfolen);
/* If you add more chunks here, you probably need to add keys. */
- shm_toc_estimate_keys(&pcxt->estimator, 11);
+ shm_toc_estimate_keys(&pcxt->estimator, 12);
/* Estimate space need for error queues. */
StaticAssertStmt(BUFFERALIGN(PARALLEL_ERROR_QUEUE_SIZE) ==
@@ -352,6 +356,7 @@ InitializeParallelDSM(ParallelContext *pcxt)
char *session_dsm_handle_space;
char *entrypointstate;
char *uncommittedenumsspace;
+ char *clientconninfospace;
Size lnamelen;
/* Serialize shared libraries we have loaded. */
@@ -422,6 +427,12 @@ InitializeParallelDSM(ParallelContext *pcxt)
shm_toc_insert(pcxt->toc, PARALLEL_KEY_UNCOMMITTEDENUMS,
uncommittedenumsspace);
+ /* Serialize our ClientConnectionInfo. */
+ clientconninfospace = shm_toc_allocate(pcxt->toc, clientconninfolen);
+ SerializeClientConnectionInfo(clientconninfolen, clientconninfospace);
+ shm_toc_insert(pcxt->toc, PARALLEL_KEY_CLIENTCONNINFO,
+ clientconninfospace);
+
/* Allocate space for worker information. */
pcxt->worker = palloc0(sizeof(ParallelWorkerInfo) * pcxt->nworkers);
@@ -1270,6 +1281,7 @@ ParallelWorkerMain(Datum main_arg)
char *reindexspace;
char *relmapperspace;
char *uncommittedenumsspace;
+ char *clientconninfospace;
StringInfoData msgbuf;
char *session_dsm_handle_space;
Snapshot tsnapshot;
@@ -1479,6 +1491,11 @@ ParallelWorkerMain(Datum main_arg)
false);
RestoreUncommittedEnums(uncommittedenumsspace);
+ /* Restore the ClientConnectionInfo. */
+ clientconninfospace = shm_toc_lookup(toc, PARALLEL_KEY_CLIENTCONNINFO,
+ false);
+ RestoreClientConnectionInfo(clientconninfospace);
+
/* Attach to the leader's serializable transaction, if SERIALIZABLE. */
AttachSerializableXact(fps->serializable_xact_handle);
diff --git a/src/backend/libpq/auth.c b/src/backend/libpq/auth.c
index 2d9ab7edce..313a6ea701 100644
--- a/src/backend/libpq/auth.c
+++ b/src/backend/libpq/auth.c
@@ -342,15 +342,15 @@ auth_failed(Port *port, int status, const char *logdetail)
* authorization will fail later.
*
* The provided string will be copied into TopMemoryContext, to match the
- * lifetime of the Port, so it is safe to pass a string that is managed by an
- * external library.
+ * lifetime of MyClientConnectionInfo, so it is safe to pass a string that is
+ * managed by an external library.
*/
static void
set_authn_id(Port *port, const char *id)
{
Assert(id);
- if (port->authn_id)
+ if (MyClientConnectionInfo.authn_id)
{
/*
* An existing authn_id should never be overwritten; that means two
@@ -361,17 +361,18 @@ set_authn_id(Port *port, const char *id)
ereport(FATAL,
(errmsg("authentication identifier set more than once"),
errdetail_log("previous identifier: \"%s\"; new identifier: \"%s\"",
- port->authn_id, id)));
+ MyClientConnectionInfo.authn_id, id)));
}
- port->authn_id = MemoryContextStrdup(TopMemoryContext, id);
+ MyClientConnectionInfo.authn_id = MemoryContextStrdup(TopMemoryContext, id);
if (Log_connections)
{
ereport(LOG,
errmsg("connection authenticated: identity=\"%s\" method=%s "
"(%s:%d)",
- port->authn_id, hba_authname(port->hba->auth_method), HbaFileName,
+ MyClientConnectionInfo.authn_id,
+ hba_authname(port->hba->auth_method), HbaFileName,
port->hba->linenumber));
}
}
@@ -1908,7 +1909,8 @@ auth_peer(hbaPort *port)
*/
set_authn_id(port, pw->pw_name);
- ret = check_usermap(port->hba->usermap, port->user_name, port->authn_id, false);
+ ret = check_usermap(port->hba->usermap, port->user_name,
+ MyClientConnectionInfo.authn_id, false);
return ret;
#else
diff --git a/src/backend/utils/init/miscinit.c b/src/backend/utils/init/miscinit.c
index eb43b2c5e5..973103374b 100644
--- a/src/backend/utils/init/miscinit.c
+++ b/src/backend/utils/init/miscinit.c
@@ -931,6 +931,78 @@ GetUserNameFromId(Oid roleid, bool noerr)
return result;
}
+/* ------------------------------------------------------------------------
+ * Parallel connection state
+ *
+ * ClientConnectionInfo contains pieces of information about the client that
+ * need to be synced to parallel workers when they initialize. Over time, this
+ * list will probably grow, and may subsume some of the "user state" variables
+ * above.
+ *-------------------------------------------------------------------------
+ */
+
+ClientConnectionInfo MyClientConnectionInfo;
+
+/*
+ * Calculate the space needed to serialize MyClientConnectionInfo.
+ */
+Size
+EstimateClientConnectionInfoSpace(void)
+{
+ Size size = 1;
+
+ if (MyClientConnectionInfo.authn_id)
+ size = add_size(size, strlen(MyClientConnectionInfo.authn_id) + 1);
+
+ return size;
+}
+
+/*
+ * Serialize MyClientConnectionInfo for use by parallel workers.
+ */
+void
+SerializeClientConnectionInfo(Size maxsize, char *start_address)
+{
+ /*
+ * First byte is an indication of whether or not authn_id has been set to
+ * non-NULL, to differentiate that case from the empty string.
+ */
+ Assert(maxsize > 0);
+ start_address[0] = MyClientConnectionInfo.authn_id ? 1 : 0;
+ start_address++;
+ maxsize--;
+
+ if (MyClientConnectionInfo.authn_id)
+ {
+ Size len;
+
+ len = strlcpy(start_address, MyClientConnectionInfo.authn_id, maxsize) + 1;
+ Assert(len <= maxsize);
+ maxsize -= len;
+ start_address += len;
+ }
+}
+
+/*
+ * Restore MyClientConnectionInfo from its serialized representation.
+ */
+void
+RestoreClientConnectionInfo(char *conninfo)
+{
+ if (conninfo[0] == 0)
+ {
+ MyClientConnectionInfo.authn_id = NULL;
+ conninfo++;
+ }
+ else
+ {
+ conninfo++;
+ MyClientConnectionInfo.authn_id = MemoryContextStrdup(TopMemoryContext,
+ conninfo);
+ conninfo += strlen(conninfo) + 1;
+ }
+}
+
/*-------------------------------------------------------------------------
* Interlock-file support
diff --git a/src/include/libpq/libpq-be.h b/src/include/libpq/libpq-be.h
index 90c20da22b..c900411fdd 100644
--- a/src/include/libpq/libpq-be.h
+++ b/src/include/libpq/libpq-be.h
@@ -98,6 +98,31 @@ typedef struct
} pg_gssinfo;
#endif
+/*
+ * Fields describing the client connection, that also need to be copied over to
+ * parallel workers, go into the ClientConnectionInfo rather than Port. The same
+ * rules apply for allocations here as for Port (must be malloc'd or palloc'd in
+ * TopMemoryContext).
+ *
+ * If you add a struct member here, remember to also handle serialization in
+ * SerializeClientConnectionInfo() et al.
+ */
+typedef struct
+{
+ /*
+ * Authenticated identity. The meaning of this identifier is dependent on
+ * hba->auth_method; it is the identity (if any) that the user presented
+ * during the authentication cycle, before they were assigned a database
+ * role. (It is effectively the "SYSTEM-USERNAME" of a pg_ident usermap
+ * -- though the exact string in use may be different, depending on pg_hba
+ * options.)
+ *
+ * authn_id is NULL if the user has not actually been authenticated, for
+ * example if the "trust" auth method is in use.
+ */
+ const char *authn_id;
+} ClientConnectionInfo;
+
/*
* This is used by the postmaster in its communication with frontends. It
* contains all state information needed during this communication before the
@@ -158,19 +183,6 @@ typedef struct Port
*/
HbaLine *hba;
- /*
- * Authenticated identity. The meaning of this identifier is dependent on
- * hba->auth_method; it is the identity (if any) that the user presented
- * during the authentication cycle, before they were assigned a database
- * role. (It is effectively the "SYSTEM-USERNAME" of a pg_ident usermap
- * -- though the exact string in use may be different, depending on pg_hba
- * options.)
- *
- * authn_id is NULL if the user has not actually been authenticated, for
- * example if the "trust" auth method is in use.
- */
- const char *authn_id;
-
/*
* TCP keepalive and user timeout settings.
*
@@ -327,6 +339,7 @@ extern ssize_t be_gssapi_write(Port *port, void *ptr, size_t len);
#endif /* ENABLE_GSS */
extern PGDLLIMPORT ProtocolVersion FrontendProtocol;
+extern PGDLLIMPORT ClientConnectionInfo MyClientConnectionInfo;
/* TCP keepalives configuration. These are no-ops on an AF_UNIX socket. */
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 067b729d5a..3e9297e399 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -481,6 +481,10 @@ extern bool has_rolreplication(Oid roleid);
typedef void (*shmem_request_hook_type) (void);
extern PGDLLIMPORT shmem_request_hook_type shmem_request_hook;
+extern Size EstimateClientConnectionInfoSpace(void);
+extern void SerializeClientConnectionInfo(Size maxsize, char *start_address);
+extern void RestoreClientConnectionInfo(char *procinfo);
+
/* in executor/nodeHash.c */
extern size_t get_hash_memory_limit(void);
--
2.25.1
Hi,
On 8/10/22 5:09 PM, Jacob Champion wrote:
On Tue, Aug 9, 2022 at 3:39 AM Drouvot, Bertrand <bdrouvot@amazon.com> wrote:
Agree that it makes sense to work on those patches in this particular
order then.
Sounds good. The ClientConnectionInfo patch (previously 0002) is
attached, with the SQL function removed.
Thanks for the patch!
Looking at:
+typedef struct
+{
+ /*
+ * Authenticated identity. The meaning of this identifier is dependent on
+ * hba->auth_method; it is the identity (if any) that the user presented
+ * during the authentication cycle, before they were assigned a database
+ * role. (It is effectively the "SYSTEM-USERNAME" of a pg_ident usermap
+ * -- though the exact string in use may be different, depending on pg_hba
+ * options.)
+ *
+ * authn_id is NULL if the user has not actually been authenticated, for
+ * example if the "trust" auth method is in use.
+ */
+ const char *authn_id;
+} ClientConnectionInfo;
What do you think about adding a second field in ClientConnectionInfo
for the auth method (as suggested by Michael upthread)?
That will be needed by the SYSTEM_USER patch (whose current version
implements SYSTEM_USER as "auth_method:identity").
Thanks,
--
Bertrand Drouvot
Amazon Web Services: https://aws.amazon.com
On Wed, Aug 10, 2022 at 10:48 PM Drouvot, Bertrand <bdrouvot@amazon.com> wrote:
What do you think about adding a second field in ClientConnectionInfo
for the auth method (as suggested by Michael upthread)?
Sure -- without a followup patch, it's not really tested, though.
v2 adjusts set_authn_id() to copy the auth_method over as well. It
"passes tests" but is otherwise unexercised.
Thanks,
--Jacob
Attachments:
since-v1.diff.txt (text/plain; charset=US-ASCII)
commit 69cacd5e0869b18d64ff4233ef6a73123c513496
Author: Jacob Champion <jchampion@timescale.com>
Date: Thu Aug 11 15:16:15 2022 -0700
squash! Allow parallel workers to read authn_id
Add a copy of hba->auth_method to ClientConnectionInfo when
set_authn_id() is called.
diff --git a/src/backend/libpq/auth.c b/src/backend/libpq/auth.c
index 313a6ea701..9113f04189 100644
--- a/src/backend/libpq/auth.c
+++ b/src/backend/libpq/auth.c
@@ -333,9 +333,9 @@ auth_failed(Port *port, int status, const char *logdetail)
/*
- * Sets the authenticated identity for the current user. The provided string
- * will be copied into the TopMemoryContext. The ID will be logged if
- * log_connections is enabled.
+ * Sets the authenticated identity for the current user. The provided string
+ * will be stored into MyClientConnectionInfo, alongside the current HBA method
+ * in use. The ID will be logged if log_connections is enabled.
*
* Auth methods should call this routine exactly once, as soon as the user is
* successfully authenticated, even if they have reasons to know that
@@ -365,6 +365,7 @@ set_authn_id(Port *port, const char *id)
}
MyClientConnectionInfo.authn_id = MemoryContextStrdup(TopMemoryContext, id);
+ MyClientConnectionInfo.auth_method = port->hba->auth_method;
if (Log_connections)
{
@@ -372,8 +373,8 @@ set_authn_id(Port *port, const char *id)
errmsg("connection authenticated: identity=\"%s\" method=%s "
"(%s:%d)",
MyClientConnectionInfo.authn_id,
- hba_authname(port->hba->auth_method), HbaFileName,
- port->hba->linenumber));
+ hba_authname(MyClientConnectionInfo.auth_method),
+ HbaFileName, port->hba->linenumber));
}
}
diff --git a/src/backend/utils/init/miscinit.c b/src/backend/utils/init/miscinit.c
index 973103374b..155ba92c67 100644
--- a/src/backend/utils/init/miscinit.c
+++ b/src/backend/utils/init/miscinit.c
@@ -954,6 +954,8 @@ EstimateClientConnectionInfoSpace(void)
if (MyClientConnectionInfo.authn_id)
size = add_size(size, strlen(MyClientConnectionInfo.authn_id) + 1);
+ size = add_size(size, sizeof(UserAuth));
+
return size;
}
@@ -981,6 +983,15 @@ SerializeClientConnectionInfo(Size maxsize, char *start_address)
maxsize -= len;
start_address += len;
}
+
+ {
+ UserAuth *auth_method = (UserAuth*) start_address;
+
+ Assert(sizeof(*auth_method) <= maxsize);
+ *auth_method = MyClientConnectionInfo.auth_method;
+ maxsize -= sizeof(*auth_method);
+ start_address += sizeof(*auth_method);
+ }
}
/*
@@ -1001,6 +1012,13 @@ RestoreClientConnectionInfo(char *conninfo)
conninfo);
conninfo += strlen(conninfo) + 1;
}
+
+ {
+ UserAuth *auth_method = (UserAuth*) conninfo;
+
+ MyClientConnectionInfo.auth_method = *auth_method;
+ conninfo += sizeof(*auth_method);
+ }
}
diff --git a/src/include/libpq/libpq-be.h b/src/include/libpq/libpq-be.h
index c900411fdd..0643733765 100644
--- a/src/include/libpq/libpq-be.h
+++ b/src/include/libpq/libpq-be.h
@@ -111,7 +111,7 @@ typedef struct
{
/*
* Authenticated identity. The meaning of this identifier is dependent on
- * hba->auth_method; it is the identity (if any) that the user presented
+ * auth_method; it is the identity (if any) that the user presented
* during the authentication cycle, before they were assigned a database
* role. (It is effectively the "SYSTEM-USERNAME" of a pg_ident usermap
* -- though the exact string in use may be different, depending on pg_hba
@@ -121,6 +121,12 @@ typedef struct
* example if the "trust" auth method is in use.
*/
const char *authn_id;
+
+ /*
+ * The HBA method that determined the above authn_id. This only has meaning
+ * if authn_id is not NULL; otherwise it's undefined.
+ */
+ UserAuth auth_method;
} ClientConnectionInfo;
/*
v2-0001-Allow-parallel-workers-to-read-authn_id.patch (text/x-patch; charset=US-ASCII)
From 32d465527678ad6ef2f177287c797cd87feba585 Mon Sep 17 00:00:00 2001
From: Jacob Champion <pchampion@vmware.com>
Date: Wed, 23 Mar 2022 15:07:05 -0700
Subject: [PATCH v2] Allow parallel workers to read authn_id
Move authn_id into a new global, MyClientConnectionInfo, which is
intended to hold all the client information that needs to be shared
between the backend and any parallel workers. MyClientConnectionInfo is
serialized and restored using a new parallel key.
Additionally, make a copy of hba->auth_method in ClientConnectionInfo
when set_authn_id() is called, for use by SYSTEM_USER.
---
src/backend/access/transam/parallel.c | 19 +++++-
src/backend/libpq/auth.c | 25 ++++----
src/backend/utils/init/miscinit.c | 90 +++++++++++++++++++++++++++
src/include/libpq/libpq-be.h | 45 ++++++++++----
src/include/miscadmin.h | 4 ++
5 files changed, 158 insertions(+), 25 deletions(-)
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index df0cd77558..bc93101ff7 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -76,6 +76,7 @@
#define PARALLEL_KEY_REINDEX_STATE UINT64CONST(0xFFFFFFFFFFFF000C)
#define PARALLEL_KEY_RELMAPPER_STATE UINT64CONST(0xFFFFFFFFFFFF000D)
#define PARALLEL_KEY_UNCOMMITTEDENUMS UINT64CONST(0xFFFFFFFFFFFF000E)
+#define PARALLEL_KEY_CLIENTCONNINFO UINT64CONST(0xFFFFFFFFFFFF000F)
/* Fixed-size parallel state. */
typedef struct FixedParallelState
@@ -212,6 +213,7 @@ InitializeParallelDSM(ParallelContext *pcxt)
Size reindexlen = 0;
Size relmapperlen = 0;
Size uncommittedenumslen = 0;
+ Size clientconninfolen = 0;
Size segsize = 0;
int i;
FixedParallelState *fps;
@@ -272,8 +274,10 @@ InitializeParallelDSM(ParallelContext *pcxt)
shm_toc_estimate_chunk(&pcxt->estimator, relmapperlen);
uncommittedenumslen = EstimateUncommittedEnumsSpace();
shm_toc_estimate_chunk(&pcxt->estimator, uncommittedenumslen);
+ clientconninfolen = EstimateClientConnectionInfoSpace();
+ shm_toc_estimate_chunk(&pcxt->estimator, clientconninfolen);
/* If you add more chunks here, you probably need to add keys. */
- shm_toc_estimate_keys(&pcxt->estimator, 11);
+ shm_toc_estimate_keys(&pcxt->estimator, 12);
/* Estimate space need for error queues. */
StaticAssertStmt(BUFFERALIGN(PARALLEL_ERROR_QUEUE_SIZE) ==
@@ -352,6 +356,7 @@ InitializeParallelDSM(ParallelContext *pcxt)
char *session_dsm_handle_space;
char *entrypointstate;
char *uncommittedenumsspace;
+ char *clientconninfospace;
Size lnamelen;
/* Serialize shared libraries we have loaded. */
@@ -422,6 +427,12 @@ InitializeParallelDSM(ParallelContext *pcxt)
shm_toc_insert(pcxt->toc, PARALLEL_KEY_UNCOMMITTEDENUMS,
uncommittedenumsspace);
+ /* Serialize our ClientConnectionInfo. */
+ clientconninfospace = shm_toc_allocate(pcxt->toc, clientconninfolen);
+ SerializeClientConnectionInfo(clientconninfolen, clientconninfospace);
+ shm_toc_insert(pcxt->toc, PARALLEL_KEY_CLIENTCONNINFO,
+ clientconninfospace);
+
/* Allocate space for worker information. */
pcxt->worker = palloc0(sizeof(ParallelWorkerInfo) * pcxt->nworkers);
@@ -1270,6 +1281,7 @@ ParallelWorkerMain(Datum main_arg)
char *reindexspace;
char *relmapperspace;
char *uncommittedenumsspace;
+ char *clientconninfospace;
StringInfoData msgbuf;
char *session_dsm_handle_space;
Snapshot tsnapshot;
@@ -1479,6 +1491,11 @@ ParallelWorkerMain(Datum main_arg)
false);
RestoreUncommittedEnums(uncommittedenumsspace);
+ /* Restore the ClientConnectionInfo. */
+ clientconninfospace = shm_toc_lookup(toc, PARALLEL_KEY_CLIENTCONNINFO,
+ false);
+ RestoreClientConnectionInfo(clientconninfospace);
+
/* Attach to the leader's serializable transaction, if SERIALIZABLE. */
AttachSerializableXact(fps->serializable_xact_handle);
diff --git a/src/backend/libpq/auth.c b/src/backend/libpq/auth.c
index 2d9ab7edce..9113f04189 100644
--- a/src/backend/libpq/auth.c
+++ b/src/backend/libpq/auth.c
@@ -333,24 +333,24 @@ auth_failed(Port *port, int status, const char *logdetail)
/*
- * Sets the authenticated identity for the current user. The provided string
- * will be copied into the TopMemoryContext. The ID will be logged if
- * log_connections is enabled.
+ * Sets the authenticated identity for the current user. The provided string
+ * will be stored into MyClientConnectionInfo, alongside the current HBA method
+ * in use. The ID will be logged if log_connections is enabled.
*
* Auth methods should call this routine exactly once, as soon as the user is
* successfully authenticated, even if they have reasons to know that
* authorization will fail later.
*
* The provided string will be copied into TopMemoryContext, to match the
- * lifetime of the Port, so it is safe to pass a string that is managed by an
- * external library.
+ * lifetime of MyClientConnectionInfo, so it is safe to pass a string that is
+ * managed by an external library.
*/
static void
set_authn_id(Port *port, const char *id)
{
Assert(id);
- if (port->authn_id)
+ if (MyClientConnectionInfo.authn_id)
{
/*
* An existing authn_id should never be overwritten; that means two
@@ -361,18 +361,20 @@ set_authn_id(Port *port, const char *id)
ereport(FATAL,
(errmsg("authentication identifier set more than once"),
errdetail_log("previous identifier: \"%s\"; new identifier: \"%s\"",
- port->authn_id, id)));
+ MyClientConnectionInfo.authn_id, id)));
}
- port->authn_id = MemoryContextStrdup(TopMemoryContext, id);
+ MyClientConnectionInfo.authn_id = MemoryContextStrdup(TopMemoryContext, id);
+ MyClientConnectionInfo.auth_method = port->hba->auth_method;
if (Log_connections)
{
ereport(LOG,
errmsg("connection authenticated: identity=\"%s\" method=%s "
"(%s:%d)",
- port->authn_id, hba_authname(port->hba->auth_method), HbaFileName,
- port->hba->linenumber));
+ MyClientConnectionInfo.authn_id,
+ hba_authname(MyClientConnectionInfo.auth_method),
+ HbaFileName, port->hba->linenumber));
}
}
@@ -1908,7 +1910,8 @@ auth_peer(hbaPort *port)
*/
set_authn_id(port, pw->pw_name);
- ret = check_usermap(port->hba->usermap, port->user_name, port->authn_id, false);
+ ret = check_usermap(port->hba->usermap, port->user_name,
+ MyClientConnectionInfo.authn_id, false);
return ret;
#else
diff --git a/src/backend/utils/init/miscinit.c b/src/backend/utils/init/miscinit.c
index eb43b2c5e5..155ba92c67 100644
--- a/src/backend/utils/init/miscinit.c
+++ b/src/backend/utils/init/miscinit.c
@@ -931,6 +931,96 @@ GetUserNameFromId(Oid roleid, bool noerr)
return result;
}
+/* ------------------------------------------------------------------------
+ * Parallel connection state
+ *
+ * ClientConnectionInfo contains pieces of information about the client that
+ * need to be synced to parallel workers when they initialize. Over time, this
+ * list will probably grow, and may subsume some of the "user state" variables
+ * above.
+ *-------------------------------------------------------------------------
+ */
+
+ClientConnectionInfo MyClientConnectionInfo;
+
+/*
+ * Calculate the space needed to serialize MyClientConnectionInfo.
+ */
+Size
+EstimateClientConnectionInfoSpace(void)
+{
+ Size size = 1;
+
+ if (MyClientConnectionInfo.authn_id)
+ size = add_size(size, strlen(MyClientConnectionInfo.authn_id) + 1);
+
+ size = add_size(size, sizeof(UserAuth));
+
+ return size;
+}
+
+/*
+ * Serialize MyClientConnectionInfo for use by parallel workers.
+ */
+void
+SerializeClientConnectionInfo(Size maxsize, char *start_address)
+{
+ /*
+ * First byte is an indication of whether or not authn_id has been set to
+ * non-NULL, to differentiate that case from the empty string.
+ */
+ Assert(maxsize > 0);
+ start_address[0] = MyClientConnectionInfo.authn_id ? 1 : 0;
+ start_address++;
+ maxsize--;
+
+ if (MyClientConnectionInfo.authn_id)
+ {
+ Size len;
+
+ len = strlcpy(start_address, MyClientConnectionInfo.authn_id, maxsize) + 1;
+ Assert(len <= maxsize);
+ maxsize -= len;
+ start_address += len;
+ }
+
+ {
+ UserAuth *auth_method = (UserAuth*) start_address;
+
+ Assert(sizeof(*auth_method) <= maxsize);
+ *auth_method = MyClientConnectionInfo.auth_method;
+ maxsize -= sizeof(*auth_method);
+ start_address += sizeof(*auth_method);
+ }
+}
+
+/*
+ * Restore MyClientConnectionInfo from its serialized representation.
+ */
+void
+RestoreClientConnectionInfo(char *conninfo)
+{
+ if (conninfo[0] == 0)
+ {
+ MyClientConnectionInfo.authn_id = NULL;
+ conninfo++;
+ }
+ else
+ {
+ conninfo++;
+ MyClientConnectionInfo.authn_id = MemoryContextStrdup(TopMemoryContext,
+ conninfo);
+ conninfo += strlen(conninfo) + 1;
+ }
+
+ {
+ UserAuth *auth_method = (UserAuth*) conninfo;
+
+ MyClientConnectionInfo.auth_method = *auth_method;
+ conninfo += sizeof(*auth_method);
+ }
+}
+
/*-------------------------------------------------------------------------
* Interlock-file support
diff --git a/src/include/libpq/libpq-be.h b/src/include/libpq/libpq-be.h
index 90c20da22b..0643733765 100644
--- a/src/include/libpq/libpq-be.h
+++ b/src/include/libpq/libpq-be.h
@@ -98,6 +98,37 @@ typedef struct
} pg_gssinfo;
#endif
+/*
+ * Fields describing the client connection, that also need to be copied over to
+ * parallel workers, go into the ClientConnectionInfo rather than Port. The same
+ * rules apply for allocations here as for Port (must be malloc'd or palloc'd in
+ * TopMemoryContext).
+ *
+ * If you add a struct member here, remember to also handle serialization in
+ * SerializeClientConnectionInfo() et al.
+ */
+typedef struct
+{
+ /*
+ * Authenticated identity. The meaning of this identifier is dependent on
+ * auth_method; it is the identity (if any) that the user presented
+ * during the authentication cycle, before they were assigned a database
+ * role. (It is effectively the "SYSTEM-USERNAME" of a pg_ident usermap
+ * -- though the exact string in use may be different, depending on pg_hba
+ * options.)
+ *
+ * authn_id is NULL if the user has not actually been authenticated, for
+ * example if the "trust" auth method is in use.
+ */
+ const char *authn_id;
+
+ /*
+ * The HBA method that determined the above authn_id. This only has meaning
+ * if authn_id is not NULL; otherwise it's undefined.
+ */
+ UserAuth auth_method;
+} ClientConnectionInfo;
+
/*
* This is used by the postmaster in its communication with frontends. It
* contains all state information needed during this communication before the
@@ -158,19 +189,6 @@ typedef struct Port
*/
HbaLine *hba;
- /*
- * Authenticated identity. The meaning of this identifier is dependent on
- * hba->auth_method; it is the identity (if any) that the user presented
- * during the authentication cycle, before they were assigned a database
- * role. (It is effectively the "SYSTEM-USERNAME" of a pg_ident usermap
- * -- though the exact string in use may be different, depending on pg_hba
- * options.)
- *
- * authn_id is NULL if the user has not actually been authenticated, for
- * example if the "trust" auth method is in use.
- */
- const char *authn_id;
-
/*
* TCP keepalive and user timeout settings.
*
@@ -327,6 +345,7 @@ extern ssize_t be_gssapi_write(Port *port, void *ptr, size_t len);
#endif /* ENABLE_GSS */
extern PGDLLIMPORT ProtocolVersion FrontendProtocol;
+extern PGDLLIMPORT ClientConnectionInfo MyClientConnectionInfo;
/* TCP keepalives configuration. These are no-ops on an AF_UNIX socket. */
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 067b729d5a..3e9297e399 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -481,6 +481,10 @@ extern bool has_rolreplication(Oid roleid);
typedef void (*shmem_request_hook_type) (void);
extern PGDLLIMPORT shmem_request_hook_type shmem_request_hook;
+extern Size EstimateClientConnectionInfoSpace(void);
+extern void SerializeClientConnectionInfo(Size maxsize, char *start_address);
+extern void RestoreClientConnectionInfo(char *procinfo);
+
/* in executor/nodeHash.c */
extern size_t get_hash_memory_limit(void);
--
2.25.1
Hi,
On 8/12/22 12:28 AM, Jacob Champion wrote:
On Wed, Aug 10, 2022 at 10:48 PM Drouvot, Bertrand <bdrouvot@amazon.com> wrote:
What do you think about adding a second field in ClientConnectionInfo
for the auth method (as suggested by Michael upthread)?
Sure -- without a followup patch, it's not really tested, though.
v2 adjusts set_authn_id() to copy the auth_method over as well. It
"passes tests" but is otherwise unexercised.
Thank you!
To help with the testing I've just provided a new version (aka
v2-0004-system_user-implementation.patch) of the SYSTEM_USER patch in
[1], to be applied on top of
"v2-0001-Allow-parallel-workers-to-read-authn_id.patch".
But for this to work, the first comment below on your patch needs to be
addressed.
Once the first comment is addressed and the new SYSTEM_USER patch
applied (that adds new tap tests) then we can test the propagation to
the parallel workers with:
make -C src/test/kerberos check PROVE_TESTS=t/001_auth.pl PROVE_FLAGS=-v
and
make -C src/test/authentication check PROVE_TESTS=t/001_password.pl PROVE_FLAGS=-v
Both are currently successful.
Regarding the comments on
v2-0001-Allow-parallel-workers-to-read-authn_id.patch:
1)
This is the one to be applied before adding the new SYSTEM_USER one on top:
+typedef struct
+{
+ /*
+ * Authenticated identity. The meaning of this identifier is dependent on
has to be replaced by:
+typedef struct ClientConnectionInfo
+{
+ /*
+ * Authenticated identity. The meaning of this identifier is dependent on
2)
+ * Authenticated identity. The meaning of this identifier is dependent on
There is one extra space before "The".
3)
+SerializeClientConnectionInfo(Size maxsize, char *start_address)
+{
+ /*
+ * First byte is an indication of whether or not authn_id has been set to
+ * non-NULL, to differentiate that case from the empty string.
+ */
is authn_id being an empty string possible?
4)
+ */
+
+ClientConnectionInfo MyClientConnectionInfo;
+
+/*
+ * Calculate the space needed to serialize MyClientConnectionInfo.
+ */
+Size
+EstimateClientConnectionInfoSpace(void)
From a coding style point of view, shouldn't "ClientConnectionInfo
MyClientConnectionInfo;" be moved to the top of the file?
[1]: https://commitfest.postgresql.org/39/3703/
Regards,
--
Bertrand Drouvot
Amazon Web Services: https://aws.amazon.com
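Comment 1) in the review above asks for a struct tag on ClientConnectionInfo. A minimal sketch of why a tag helps, assuming the goal is to let other headers refer to the type without including its full definition: a tagged struct can be forward-declared, while an anonymous typedef'd struct cannot. The names DemoClientInfo and demo_has_identity are hypothetical and are not part of the patch.

```c
#include <assert.h>
#include <stddef.h>

/*
 * A consumer header can forward-declare the tag and use pointers to it
 * without seeing the struct body. (Hypothetical names throughout.)
 */
struct DemoClientInfo;
static int demo_has_identity(const struct DemoClientInfo *info);

/*
 * The full definition, as it might appear in a different header. Because
 * the struct carries a tag, this completes the type declared above. With
 * "typedef struct { ... } DemoClientInfo;" there would be no tag to
 * forward-declare.
 */
typedef struct DemoClientInfo
{
    const char *authn_id;   /* NULL when no identity was established */
} DemoClientInfo;

static int
demo_has_identity(const struct DemoClientInfo *info)
{
    return info->authn_id != NULL;
}
```

Both spellings, `struct DemoClientInfo` and `DemoClientInfo`, name the same type once the definition is visible.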
On Fri, Aug 12, 2022 at 03:34:04PM +0200, Drouvot, Bertrand wrote:
3)
+SerializeClientConnectionInfo(Size maxsize, char *start_address)
+{
+ /*
+ * First byte is an indication of whether or not authn_id has been set to
+ * non-NULL, to differentiate that case from the empty string.
+ */
is authn_id being an empty string possible?
I don't recall that this can be the case yet, but we cannot discard
it. One thing was itching me about the serialization and
deserialization logic though: could it be more readable if we used an
intermediate structure to store the length of the serialized strings?
We use this approach in other areas, like for the snapshot data in
snapmgr.c. This would handle the case of an empty and NULL string, by
storing -1 as length for NULL and >= 0 for the string length if there
is something set, while making the addition of more fields a
no-brainer.
--
Michael
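The length-prefix idea suggested above can be sketched in plain C as follows. This is only an illustration under stated assumptions, not the code from any version of the patch: the type and function names (DemoConnInfo, DemoSerializedConnInfo, demo_estimate_space, demo_serialize, demo_restore) are hypothetical, an int stands in for UserAuth, and malloc stands in for allocation in TopMemoryContext. A fixed-size header stores -1 as the length for a NULL authn_id and the string length (>= 0) otherwise, so the empty string and NULL stay distinguishable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the patch's ClientConnectionInfo. */
typedef struct DemoConnInfo
{
    const char *authn_id;       /* NULL if no identity was established */
    int         auth_method;    /* stand-in for UserAuth */
} DemoConnInfo;

/* Fixed-size header written in front of the variable-length string. */
typedef struct DemoSerializedConnInfo
{
    int32_t     authn_id_len;   /* -1 means authn_id is NULL */
    int         auth_method;
} DemoSerializedConnInfo;

/* Space needed: the header, plus the string and its terminator if set. */
static size_t
demo_estimate_space(const DemoConnInfo *info)
{
    size_t      size = sizeof(DemoSerializedConnInfo);

    if (info->authn_id)
        size += strlen(info->authn_id) + 1;
    return size;
}

static void
demo_serialize(const DemoConnInfo *info, size_t maxsize, char *dest)
{
    DemoSerializedConnInfo hdr;

    hdr.authn_id_len = info->authn_id ? (int32_t) strlen(info->authn_id) : -1;
    hdr.auth_method = info->auth_method;

    /* memcpy avoids any alignment assumption about dest. */
    assert(maxsize >= sizeof(hdr));
    memcpy(dest, &hdr, sizeof(hdr));
    dest += sizeof(hdr);
    maxsize -= sizeof(hdr);

    if (info->authn_id)
    {
        assert(maxsize >= (size_t) hdr.authn_id_len + 1);
        memcpy(dest, info->authn_id, (size_t) hdr.authn_id_len + 1);
    }
}

static void
demo_restore(DemoConnInfo *info, const char *src)
{
    DemoSerializedConnInfo hdr;

    memcpy(&hdr, src, sizeof(hdr));
    src += sizeof(hdr);

    info->auth_method = hdr.auth_method;
    if (hdr.authn_id_len < 0)
        info->authn_id = NULL;
    else
    {
        /* Copy the string, including its terminating NUL byte. */
        char       *copy = malloc((size_t) hdr.authn_id_len + 1);

        assert(copy != NULL);
        memcpy(copy, src, (size_t) hdr.authn_id_len + 1);
        info->authn_id = copy;
    }
}
```

Adding another fixed-size field then only means adding a member to the header struct; another string would need its own length member plus one more copy step.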
Hi,
On 8/14/22 11:57 AM, Michael Paquier wrote:
On Fri, Aug 12, 2022 at 03:34:04PM +0200, Drouvot, Bertrand wrote:
3)
+SerializeClientConnectionInfo(Size maxsize, char *start_address)
+{
+ /*
+ * First byte is an indication of whether or not authn_id has been set to
+ * non-NULL, to differentiate that case from the empty string.
+ */
is authn_id being an empty string possible?
I don't recall that this can be the case yet, but we cannot discard
it.
Fair point.
One thing was itching me about the serialization and
deserialization logic though: could it be more readable if we used an
intermediate structure to store the length of the serialized strings?
We use this approach in other areas, like for the snapshot data in
snapmgr.c. This would handle the case of an empty and NULL string, by
storing -1 as length for NULL and >= 0 for the string length if there
is something set, while making the addition of more fields a
no-brainer.
I think that's a good idea and I think that would be more readable (as
compared to storing a "hint" in the first byte).
Regards,
--
Bertrand Drouvot
Amazon Web Services: https://aws.amazon.com
Hello,
On Fri, Aug 12, 2022 at 6:34 AM Drouvot, Bertrand <bdrouvot@amazon.com> wrote:
+typedef struct
+{
+ /*
+ * Authenticated identity. The meaning of this identifier is dependent on
has to be replaced by:
+typedef struct ClientConnectionInfo
+{
+ /*
+ * Authenticated identity. The meaning of this identifier is dependent on
Okay, will do in the next patch (coming soon).
+ * Authenticated identity. The meaning of this identifier is dependent on
There is one extra space before "The"
This comment block was just moved verbatim; the double-spaced
sentences were there before.
From a coding style point of view, shouldn't "ClientConnectionInfo
MyClientConnectionInfo;" be moved to the top of the file?
The style in this file seems to be to declare the variables at the top
of the section in which they're used. See the sections for "User ID
state" and "Library preload support".
Thanks,
--Jacob
On Tue, Aug 16, 2022 at 2:02 AM Drouvot, Bertrand <bdrouvot@amazon.com> wrote:
On 8/14/22 11:57 AM, Michael Paquier wrote:
One thing was itching me about the serialization and
deserialization logic though: could it be more readable if we used an
intermediate structure to store the length of the serialized strings?
We use this approach in other areas, like for the snapshot data in
snapmgr.c. This would handle the case of an empty and NULL string, by
storing -1 as length for NULL and >= 0 for the string length if there
is something set, while making the addition of more fields a
no-brainer.
I think that's a good idea and I think that would be more readable (as
compared to storing a "hint" in the first byte).
Sounds good. v3, attached, should make the requested changes:
- declare `struct ClientConnectionInfo`
- use an intermediate serialization struct
- switch to length-"prefixing" for the string
I do like the way this reads compared to before.
Thanks,
--Jacob
Attachments:
since-v2.diff.txt (text/plain; charset=US-ASCII)
commit 753c46352adc967a903a60ea65a3068252d685e6
Author: Jacob Champion <jchampion@timescale.com>
Date: Tue Aug 16 09:14:58 2022 -0700
squash! Allow parallel workers to read authn_id
Per review,
- add an intermediate struct for serialization,
- switch to length-prefixing for the authn_id string, and
- make sure `struct ClientConnectionInfo` is declared for use elsewhere.
diff --git a/src/backend/utils/init/miscinit.c b/src/backend/utils/init/miscinit.c
index 155ba92c67..58772d0a4a 100644
--- a/src/backend/utils/init/miscinit.c
+++ b/src/backend/utils/init/miscinit.c
@@ -943,19 +943,29 @@ GetUserNameFromId(Oid roleid, bool noerr)
ClientConnectionInfo MyClientConnectionInfo;
+/*
+ * Intermediate representation of ClientConnectionInfo for easier serialization.
+ * Variable-length fields are allocated right after this header.
+ */
+typedef struct SerializedClientConnectionInfo
+{
+ int32 authn_id_len; /* strlen(authn_id), or -1 if NULL */
+ UserAuth auth_method;
+} SerializedClientConnectionInfo;
+
/*
* Calculate the space needed to serialize MyClientConnectionInfo.
*/
Size
EstimateClientConnectionInfoSpace(void)
{
- Size size = 1;
+ Size size = 0;
+
+ size = add_size(size, sizeof(SerializedClientConnectionInfo));
if (MyClientConnectionInfo.authn_id)
size = add_size(size, strlen(MyClientConnectionInfo.authn_id) + 1);
- size = add_size(size, sizeof(UserAuth));
-
return size;
}
@@ -965,32 +975,29 @@ EstimateClientConnectionInfoSpace(void)
void
SerializeClientConnectionInfo(Size maxsize, char *start_address)
{
- /*
- * First byte is an indication of whether or not authn_id has been set to
- * non-NULL, to differentiate that case from the empty string.
- */
- Assert(maxsize > 0);
- start_address[0] = MyClientConnectionInfo.authn_id ? 1 : 0;
- start_address++;
- maxsize--;
+ SerializedClientConnectionInfo serialized = {0};
+
+ serialized.authn_id_len = -1;
+ serialized.auth_method = MyClientConnectionInfo.auth_method;
if (MyClientConnectionInfo.authn_id)
- {
- Size len;
+ serialized.authn_id_len = strlen(MyClientConnectionInfo.authn_id);
- len = strlcpy(start_address, MyClientConnectionInfo.authn_id, maxsize) + 1;
- Assert(len <= maxsize);
- maxsize -= len;
- start_address += len;
- }
+ /* Copy serialized representation to buffer */
+ Assert(maxsize >= sizeof(serialized));
+ memcpy(start_address, &serialized, sizeof(serialized));
- {
- UserAuth *auth_method = (UserAuth*) start_address;
+ maxsize -= sizeof(serialized);
+ start_address += sizeof(serialized);
- Assert(sizeof(*auth_method) <= maxsize);
- *auth_method = MyClientConnectionInfo.auth_method;
- maxsize -= sizeof(*auth_method);
- start_address += sizeof(*auth_method);
+ /* Copy authn_id into the space after the struct. */
+ if (serialized.authn_id_len >= 0)
+ {
+ Assert(maxsize >= (serialized.authn_id_len + 1));
+ memcpy(start_address,
+ MyClientConnectionInfo.authn_id,
+ /* include the NULL terminator to ease deserialization */
+ serialized.authn_id_len + 1);
}
}
@@ -1000,25 +1007,19 @@ SerializeClientConnectionInfo(Size maxsize, char *start_address)
void
RestoreClientConnectionInfo(char *conninfo)
{
- if (conninfo[0] == 0)
- {
- MyClientConnectionInfo.authn_id = NULL;
- conninfo++;
- }
- else
- {
- conninfo++;
- MyClientConnectionInfo.authn_id = MemoryContextStrdup(TopMemoryContext,
- conninfo);
- conninfo += strlen(conninfo) + 1;
- }
+ SerializedClientConnectionInfo serialized;
+ char *authn_id;
- {
- UserAuth *auth_method = (UserAuth*) conninfo;
+ memcpy(&serialized, conninfo, sizeof(serialized));
+ authn_id = conninfo + sizeof(serialized);
- MyClientConnectionInfo.auth_method = *auth_method;
- conninfo += sizeof(*auth_method);
- }
+ /* Copy the fields back into place. */
+ MyClientConnectionInfo.authn_id = NULL;
+ MyClientConnectionInfo.auth_method = serialized.auth_method;
+
+ if (serialized.authn_id_len >= 0)
+ MyClientConnectionInfo.authn_id = MemoryContextStrdup(TopMemoryContext,
+ authn_id);
}
diff --git a/src/include/libpq/libpq-be.h b/src/include/libpq/libpq-be.h
index 0643733765..84a6bdea6f 100644
--- a/src/include/libpq/libpq-be.h
+++ b/src/include/libpq/libpq-be.h
@@ -107,7 +107,7 @@ typedef struct
* If you add a struct member here, remember to also handle serialization in
* SerializeClientConnectionInfo() et al.
*/
-typedef struct
+typedef struct ClientConnectionInfo
{
/*
* Authenticated identity. The meaning of this identifier is dependent on
v3-0001-Allow-parallel-workers-to-read-authn_id.patch (text/x-patch; charset=US-ASCII)
From 2eea3ef097bbeee5323f78c827e56b42480b5c81 Mon Sep 17 00:00:00 2001
From: Jacob Champion <pchampion@vmware.com>
Date: Wed, 23 Mar 2022 15:07:05 -0700
Subject: [PATCH v3 1/3] Allow parallel workers to read authn_id
Move authn_id into a new global, MyClientConnectionInfo, which is
intended to hold all the client information that needs to be shared
between the backend and any parallel workers. MyClientConnectionInfo is
serialized and restored using a new parallel key.
Additionally, make a copy of hba->auth_method in ClientConnectionInfo
when set_authn_id() is called, for use by SYSTEM_USER.
---
src/backend/access/transam/parallel.c | 19 +++++-
src/backend/libpq/auth.c | 25 ++++----
src/backend/utils/init/miscinit.c | 91 +++++++++++++++++++++++++++
src/include/libpq/libpq-be.h | 45 +++++++++----
src/include/miscadmin.h | 4 ++
5 files changed, 159 insertions(+), 25 deletions(-)
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index df0cd77558..bc93101ff7 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -76,6 +76,7 @@
#define PARALLEL_KEY_REINDEX_STATE UINT64CONST(0xFFFFFFFFFFFF000C)
#define PARALLEL_KEY_RELMAPPER_STATE UINT64CONST(0xFFFFFFFFFFFF000D)
#define PARALLEL_KEY_UNCOMMITTEDENUMS UINT64CONST(0xFFFFFFFFFFFF000E)
+#define PARALLEL_KEY_CLIENTCONNINFO UINT64CONST(0xFFFFFFFFFFFF000F)
/* Fixed-size parallel state. */
typedef struct FixedParallelState
@@ -212,6 +213,7 @@ InitializeParallelDSM(ParallelContext *pcxt)
Size reindexlen = 0;
Size relmapperlen = 0;
Size uncommittedenumslen = 0;
+ Size clientconninfolen = 0;
Size segsize = 0;
int i;
FixedParallelState *fps;
@@ -272,8 +274,10 @@ InitializeParallelDSM(ParallelContext *pcxt)
shm_toc_estimate_chunk(&pcxt->estimator, relmapperlen);
uncommittedenumslen = EstimateUncommittedEnumsSpace();
shm_toc_estimate_chunk(&pcxt->estimator, uncommittedenumslen);
+ clientconninfolen = EstimateClientConnectionInfoSpace();
+ shm_toc_estimate_chunk(&pcxt->estimator, clientconninfolen);
/* If you add more chunks here, you probably need to add keys. */
- shm_toc_estimate_keys(&pcxt->estimator, 11);
+ shm_toc_estimate_keys(&pcxt->estimator, 12);
/* Estimate space need for error queues. */
StaticAssertStmt(BUFFERALIGN(PARALLEL_ERROR_QUEUE_SIZE) ==
@@ -352,6 +356,7 @@ InitializeParallelDSM(ParallelContext *pcxt)
char *session_dsm_handle_space;
char *entrypointstate;
char *uncommittedenumsspace;
+ char *clientconninfospace;
Size lnamelen;
/* Serialize shared libraries we have loaded. */
@@ -422,6 +427,12 @@ InitializeParallelDSM(ParallelContext *pcxt)
shm_toc_insert(pcxt->toc, PARALLEL_KEY_UNCOMMITTEDENUMS,
uncommittedenumsspace);
+ /* Serialize our ClientConnectionInfo. */
+ clientconninfospace = shm_toc_allocate(pcxt->toc, clientconninfolen);
+ SerializeClientConnectionInfo(clientconninfolen, clientconninfospace);
+ shm_toc_insert(pcxt->toc, PARALLEL_KEY_CLIENTCONNINFO,
+ clientconninfospace);
+
/* Allocate space for worker information. */
pcxt->worker = palloc0(sizeof(ParallelWorkerInfo) * pcxt->nworkers);
@@ -1270,6 +1281,7 @@ ParallelWorkerMain(Datum main_arg)
char *reindexspace;
char *relmapperspace;
char *uncommittedenumsspace;
+ char *clientconninfospace;
StringInfoData msgbuf;
char *session_dsm_handle_space;
Snapshot tsnapshot;
@@ -1479,6 +1491,11 @@ ParallelWorkerMain(Datum main_arg)
false);
RestoreUncommittedEnums(uncommittedenumsspace);
+ /* Restore the ClientConnectionInfo. */
+ clientconninfospace = shm_toc_lookup(toc, PARALLEL_KEY_CLIENTCONNINFO,
+ false);
+ RestoreClientConnectionInfo(clientconninfospace);
+
/* Attach to the leader's serializable transaction, if SERIALIZABLE. */
AttachSerializableXact(fps->serializable_xact_handle);
diff --git a/src/backend/libpq/auth.c b/src/backend/libpq/auth.c
index 2d9ab7edce..9113f04189 100644
--- a/src/backend/libpq/auth.c
+++ b/src/backend/libpq/auth.c
@@ -333,24 +333,24 @@ auth_failed(Port *port, int status, const char *logdetail)
/*
- * Sets the authenticated identity for the current user. The provided string
- * will be copied into the TopMemoryContext. The ID will be logged if
- * log_connections is enabled.
+ * Sets the authenticated identity for the current user. The provided string
+ * will be stored into MyClientConnectionInfo, alongside the current HBA method
+ * in use. The ID will be logged if log_connections is enabled.
*
* Auth methods should call this routine exactly once, as soon as the user is
* successfully authenticated, even if they have reasons to know that
* authorization will fail later.
*
* The provided string will be copied into TopMemoryContext, to match the
- * lifetime of the Port, so it is safe to pass a string that is managed by an
- * external library.
+ * lifetime of MyClientConnectionInfo, so it is safe to pass a string that is
+ * managed by an external library.
*/
static void
set_authn_id(Port *port, const char *id)
{
Assert(id);
- if (port->authn_id)
+ if (MyClientConnectionInfo.authn_id)
{
/*
* An existing authn_id should never be overwritten; that means two
@@ -361,18 +361,20 @@ set_authn_id(Port *port, const char *id)
ereport(FATAL,
(errmsg("authentication identifier set more than once"),
errdetail_log("previous identifier: \"%s\"; new identifier: \"%s\"",
- port->authn_id, id)));
+ MyClientConnectionInfo.authn_id, id)));
}
- port->authn_id = MemoryContextStrdup(TopMemoryContext, id);
+ MyClientConnectionInfo.authn_id = MemoryContextStrdup(TopMemoryContext, id);
+ MyClientConnectionInfo.auth_method = port->hba->auth_method;
if (Log_connections)
{
ereport(LOG,
errmsg("connection authenticated: identity=\"%s\" method=%s "
"(%s:%d)",
- port->authn_id, hba_authname(port->hba->auth_method), HbaFileName,
- port->hba->linenumber));
+ MyClientConnectionInfo.authn_id,
+ hba_authname(MyClientConnectionInfo.auth_method),
+ HbaFileName, port->hba->linenumber));
}
}
@@ -1908,7 +1910,8 @@ auth_peer(hbaPort *port)
*/
set_authn_id(port, pw->pw_name);
- ret = check_usermap(port->hba->usermap, port->user_name, port->authn_id, false);
+ ret = check_usermap(port->hba->usermap, port->user_name,
+ MyClientConnectionInfo.authn_id, false);
return ret;
#else
diff --git a/src/backend/utils/init/miscinit.c b/src/backend/utils/init/miscinit.c
index eb43b2c5e5..58772d0a4a 100644
--- a/src/backend/utils/init/miscinit.c
+++ b/src/backend/utils/init/miscinit.c
@@ -931,6 +931,97 @@ GetUserNameFromId(Oid roleid, bool noerr)
return result;
}
+/* ------------------------------------------------------------------------
+ * Parallel connection state
+ *
+ * ClientConnectionInfo contains pieces of information about the client that
+ * need to be synced to parallel workers when they initialize. Over time, this
+ * list will probably grow, and may subsume some of the "user state" variables
+ * above.
+ *-------------------------------------------------------------------------
+ */
+
+ClientConnectionInfo MyClientConnectionInfo;
+
+/*
+ * Intermediate representation of ClientConnectionInfo for easier serialization.
+ * Variable-length fields are allocated right after this header.
+ */
+typedef struct SerializedClientConnectionInfo
+{
+ int32 authn_id_len; /* strlen(authn_id), or -1 if NULL */
+ UserAuth auth_method;
+} SerializedClientConnectionInfo;
+
+/*
+ * Calculate the space needed to serialize MyClientConnectionInfo.
+ */
+Size
+EstimateClientConnectionInfoSpace(void)
+{
+ Size size = 0;
+
+ size = add_size(size, sizeof(SerializedClientConnectionInfo));
+
+ if (MyClientConnectionInfo.authn_id)
+ size = add_size(size, strlen(MyClientConnectionInfo.authn_id) + 1);
+
+ return size;
+}
+
+/*
+ * Serialize MyClientConnectionInfo for use by parallel workers.
+ */
+void
+SerializeClientConnectionInfo(Size maxsize, char *start_address)
+{
+ SerializedClientConnectionInfo serialized = {0};
+
+ serialized.authn_id_len = -1;
+ serialized.auth_method = MyClientConnectionInfo.auth_method;
+
+ if (MyClientConnectionInfo.authn_id)
+ serialized.authn_id_len = strlen(MyClientConnectionInfo.authn_id);
+
+ /* Copy serialized representation to buffer */
+ Assert(maxsize >= sizeof(serialized));
+ memcpy(start_address, &serialized, sizeof(serialized));
+
+ maxsize -= sizeof(serialized);
+ start_address += sizeof(serialized);
+
+ /* Copy authn_id into the space after the struct. */
+ if (serialized.authn_id_len >= 0)
+ {
+ Assert(maxsize >= (serialized.authn_id_len + 1));
+ memcpy(start_address,
+ MyClientConnectionInfo.authn_id,
+ /* include the NULL terminator to ease deserialization */
+ serialized.authn_id_len + 1);
+ }
+}
+
+/*
+ * Restore MyClientConnectionInfo from its serialized representation.
+ */
+void
+RestoreClientConnectionInfo(char *conninfo)
+{
+ SerializedClientConnectionInfo serialized;
+ char *authn_id;
+
+ memcpy(&serialized, conninfo, sizeof(serialized));
+ authn_id = conninfo + sizeof(serialized);
+
+ /* Copy the fields back into place. */
+ MyClientConnectionInfo.authn_id = NULL;
+ MyClientConnectionInfo.auth_method = serialized.auth_method;
+
+ if (serialized.authn_id_len >= 0)
+ MyClientConnectionInfo.authn_id = MemoryContextStrdup(TopMemoryContext,
+ authn_id);
+}
+
/*-------------------------------------------------------------------------
* Interlock-file support
diff --git a/src/include/libpq/libpq-be.h b/src/include/libpq/libpq-be.h
index 90c20da22b..84a6bdea6f 100644
--- a/src/include/libpq/libpq-be.h
+++ b/src/include/libpq/libpq-be.h
@@ -98,6 +98,37 @@ typedef struct
} pg_gssinfo;
#endif
+/*
+ * Fields describing the client connection, that also need to be copied over to
+ * parallel workers, go into the ClientConnectionInfo rather than Port. The same
+ * rules apply for allocations here as for Port (must be malloc'd or palloc'd in
+ * TopMemoryContext).
+ *
+ * If you add a struct member here, remember to also handle serialization in
+ * SerializeClientConnectionInfo() et al.
+ */
+typedef struct ClientConnectionInfo
+{
+ /*
+ * Authenticated identity. The meaning of this identifier is dependent on
+ * auth_method; it is the identity (if any) that the user presented
+ * during the authentication cycle, before they were assigned a database
+ * role. (It is effectively the "SYSTEM-USERNAME" of a pg_ident usermap
+ * -- though the exact string in use may be different, depending on pg_hba
+ * options.)
+ *
+ * authn_id is NULL if the user has not actually been authenticated, for
+ * example if the "trust" auth method is in use.
+ */
+ const char *authn_id;
+
+ /*
+ * The HBA method that determined the above authn_id. This only has meaning
+ * if authn_id is not NULL; otherwise it's undefined.
+ */
+ UserAuth auth_method;
+} ClientConnectionInfo;
+
/*
* This is used by the postmaster in its communication with frontends. It
* contains all state information needed during this communication before the
@@ -158,19 +189,6 @@ typedef struct Port
*/
HbaLine *hba;
- /*
- * Authenticated identity. The meaning of this identifier is dependent on
- * hba->auth_method; it is the identity (if any) that the user presented
- * during the authentication cycle, before they were assigned a database
- * role. (It is effectively the "SYSTEM-USERNAME" of a pg_ident usermap
- * -- though the exact string in use may be different, depending on pg_hba
- * options.)
- *
- * authn_id is NULL if the user has not actually been authenticated, for
- * example if the "trust" auth method is in use.
- */
- const char *authn_id;
-
/*
* TCP keepalive and user timeout settings.
*
@@ -327,6 +345,7 @@ extern ssize_t be_gssapi_write(Port *port, void *ptr, size_t len);
#endif /* ENABLE_GSS */
extern PGDLLIMPORT ProtocolVersion FrontendProtocol;
+extern PGDLLIMPORT ClientConnectionInfo MyClientConnectionInfo;
/* TCP keepalives configuration. These are no-ops on an AF_UNIX socket. */
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 067b729d5a..3e9297e399 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -481,6 +481,10 @@ extern bool has_rolreplication(Oid roleid);
typedef void (*shmem_request_hook_type) (void);
extern PGDLLIMPORT shmem_request_hook_type shmem_request_hook;
+extern Size EstimateClientConnectionInfoSpace(void);
+extern void SerializeClientConnectionInfo(Size maxsize, char *start_address);
+extern void RestoreClientConnectionInfo(char *procinfo);
+
/* in executor/nodeHash.c */
extern size_t get_hash_memory_limit(void);
--
2.25.1
Hi,
On 8/16/22 6:58 PM, Jacob Champion wrote:
Sounds good. v3, attached, should make the requested changes:
- declare `struct ClientConnectionInfo`
- use an intermediate serialization struct
- switch to length-"prefixing" for the string
I do like the way this reads compared to before.
Thanks for the new version!
+ /* Copy authn_id into the space after the struct. */
+ if (serialized.authn_id_len >= 0)
Maybe remove the "." at the end of the comment? (to be consistent with
the other comment just above)
+/*
+ * Restore MyClientConnectionInfo from its serialized representation.
+ */
+void
+RestoreClientConnectionInfo(char *conninfo)
+{
+ SerializedClientConnectionInfo serialized;
+ char *authn_id;
Move "char *authn_id;" in the "if (serialized.authn_id_len >= 0)"
below?
+
+ memcpy(&serialized, conninfo, sizeof(serialized));
+ authn_id = conninfo + sizeof(serialized);
Move "authn_id = conninfo + sizeof(serialized)" in the "if
(serialized.authn_id_len >= 0)" below?
+
+ /* Copy the fields back into place. */
Remove the "." at the end of the comment?
+ MyClientConnectionInfo.authn_id = NULL;
+ MyClientConnectionInfo.auth_method = serialized.auth_method;
+
+ if (serialized.authn_id_len >= 0)
+ MyClientConnectionInfo.authn_id =
MemoryContextStrdup(TopMemoryContext,
+ authn_id);
This instead?
if (serialized.authn_id_len >= 0)
{
char *authn_id;
authn_id = conninfo + sizeof(serialized);
MyClientConnectionInfo.authn_id =
MemoryContextStrdup(TopMemoryContext,
authn_id);
}
+ src/backend/utils/init/miscinit.c:RestoreClientConnectionInfo(char
*conninfo)
+ src/include/miscadmin.h:extern void RestoreClientConnectionInfo(char
*procinfo);
conninfo in both to be consistent?
Apart from the comments above, that looks good to me.
Regards,
--
Bertrand Drouvot
Amazon Web Services: https://aws.amazon.com
On Wed, Aug 17, 2022 at 09:53:45AM +0200, Drouvot, Bertrand wrote:
Thanks for the new version!
+ /* Copy authn_id into the space after the struct. */
+ if (serialized.authn_id_len >= 0)
Maybe remove the "." at the end of the comment? (to be consistent with the
other comment just above)
When it comes to such things, I usually apply the rule of consistency
with the surroundings, which sounds right here.
+ memcpy(&serialized, conninfo, sizeof(serialized));
+ authn_id = conninfo + sizeof(serialized);
Move "authn_id = conninfo + sizeof(serialized)" in the "if
(serialized.authn_id_len >= 0)" below?
Makes sense, so as to never have something pointing to an area we
should not look at. This should just be used when we know that there
is going to be a string.
+ src/backend/utils/init/miscinit.c:RestoreClientConnectionInfo(char *conninfo)
+ src/include/miscadmin.h:extern void RestoreClientConnectionInfo(char *procinfo);
conninfo in both to be consistent?
Yep. Looks like a copy-pasto, seen from here.
By the way, I have looked at the patch and tweaked a couple of things
with comments and the style, but overall that's fine. I had
intended to apply this stuff today but I have lacked the time to do
so. I should be able to get this wrapped tomorrow, though.
--
Michael
On Mon, Aug 22, 2022 at 4:32 AM Michael Paquier <michael@paquier.xyz> wrote:
By the way, I have looked at the patch and tweaked a couple of things
with comments and the style, but overall that's fine. I had
intended to apply this stuff today but I have lacked the time to do
so. I should be able to get this wrapped tomorrow, though.
Thank you both for the reviews! Let me know if it would help for me to
issue a new patchset, otherwise I'll sit tight.
--Jacob
On Mon, Aug 22, 2022 at 08:10:10AM -0700, Jacob Champion wrote:
otherwise I'll sit tight.
So am I. I have done an extra round of checks around the
serialization/deserialization logic, where I put in some elog()s to look
at the output passed down with some workers and a couple of auth
methods, and after an indentation run and some comment polishing I
ended up with the attached.
There was one thing that annoyed me with the patch, though: the
lack of initialization of MyClientConnectionInfo at backend startup,
as we may end up never calling set_authn_id() to fill in some of its
data. So I have placed an extra memset(0) in InitProcessGlobals()
(note that Port does a calloc() much earlier than that, but I think
that we don't really want to do more in such code paths, especially
for the parallelized client information).
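
As a rough illustration of that concern, the sketch below resets a global connection-info struct at process start so that readers see a NULL identity whenever no auth method ever sets one. All names here (DemoConnInfo, MyDemoConnInfo, the demo_* functions) are hypothetical stand-ins rather than PostgreSQL symbols, and the sketch assumes, as holds on the platforms in question, that an all-zero-bytes pointer compares equal to NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for ClientConnectionInfo. */
typedef struct DemoConnInfo
{
	const char *authn_id;		/* NULL when the user was not authenticated */
	int			auth_method;	/* only meaningful when authn_id is set */
} DemoConnInfo;

DemoConnInfo MyDemoConnInfo;

/* Called once early in each process: start from a known-empty state. */
void
demo_init_process_globals(void)
{
	memset(&MyDemoConnInfo, 0, sizeof(MyDemoConnInfo));
}

/* Called by an auth method on success; may never run (e.g. "trust"). */
void
demo_set_authn_id(const char *id, int method)
{
	assert(id != NULL);
	MyDemoConnInfo.authn_id = id;
	MyDemoConnInfo.auth_method = method;
}

/* Readers only need to test for NULL once the reset is guaranteed. */
const char *
demo_identity_or_default(void)
{
	return MyDemoConnInfo.authn_id ? MyDemoConnInfo.authn_id
		: "(unauthenticated)";
}
```

With the reset in place, code that consumes the struct never has to reason about stale contents; it only needs the NULL check shown in demo_identity_or_default().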
I have written a commit message, while on it. Does that look fine to
you?
--
Michael
Attachments:
v4-0001-Allow-parallel-workers-to-retrieve-some-data-from.patch (text/x-diff; charset=us-ascii)
From 73c28e17f1c77af134dc117500226c201099499b Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@paquier.xyz>
Date: Tue, 23 Aug 2022 11:23:55 +0900
Subject: [PATCH v4] Allow parallel workers to retrieve some data from Port
This commit moves authn_id into a new global structure called
ClientConnectionInfo (mapping to a MyClientConnectionInfo for each
backend) which is intended to hold all the client information that
should be shared between the backend and any of its parallel workers,
access for extensions and triggers being the primary use case. There is
no need to push all the data of Port to the workers, and authn_id is
quite a generic concept so using a separate structure provides the best
balance (the name of the structure has been suggested by Robert Haas).
While on it, and per discussion as this would be useful for a potential
SYSTEM_USER that can be accessed through parallel workers, a second
field is added for the authentication method, copied directly from
Port.
ClientConnectionInfo is serialized and restored using a new parallel
key and a structure tracks the length of the authn_id, making the
addition of more fields straight-forward.
Author: Jacob Champion
Reviewed-by: Bertrand Drouvot, Stephen Frost, Robert Haas, Tom Lane,
Michael Paquier, Julien Rouhaud
Discussion: https://postgr.es/m/793d990837ae5c06a558d58d62de9378ab525d83.camel@vmware.com
---
src/include/libpq/libpq-be.h | 45 +++++++++----
src/include/miscadmin.h | 4 ++
src/backend/access/transam/parallel.c | 19 +++++-
src/backend/libpq/auth.c | 23 ++++---
src/backend/postmaster/postmaster.c | 6 +-
src/backend/utils/init/miscinit.c | 93 +++++++++++++++++++++++++++
src/tools/pgindent/typedefs.list | 2 +
7 files changed, 166 insertions(+), 26 deletions(-)
diff --git a/src/include/libpq/libpq-be.h b/src/include/libpq/libpq-be.h
index 32d3a4b085..d3c8c83e51 100644
--- a/src/include/libpq/libpq-be.h
+++ b/src/include/libpq/libpq-be.h
@@ -88,6 +88,37 @@ typedef struct
} pg_gssinfo;
#endif
+/*
+ * ClientConnectionInfo includes the fields describing the client connection
+ * that are copied over to parallel workers as nothing from Port does that.
+ * The same rules apply for allocations here as for Port (everything must be
+ * malloc'd or palloc'd in TopMemoryContext).
+ *
+ * If you add a struct member here, remember to also handle serialization in
+ * SerializeClientConnectionInfo() and co.
+ */
+typedef struct ClientConnectionInfo
+{
+ /*
+ * Authenticated identity. The meaning of this identifier is dependent on
+ * auth_method; it is the identity (if any) that the user presented during
+ * the authentication cycle, before they were assigned a database role.
+ * (It is effectively the "SYSTEM-USERNAME" of a pg_ident usermap --
+ * though the exact string in use may be different, depending on pg_hba
+ * options.)
+ *
+ * authn_id is NULL if the user has not actually been authenticated, for
+ * example if the "trust" auth method is in use.
+ */
+ const char *authn_id;
+
+ /*
+ * The HBA method that determined the above authn_id. This only has
+ * meaning if authn_id is not NULL; otherwise it's undefined.
+ */
+ UserAuth auth_method;
+} ClientConnectionInfo;
+
/*
* This is used by the postmaster in its communication with frontends. It
* contains all state information needed during this communication before the
@@ -148,19 +179,6 @@ typedef struct Port
*/
HbaLine *hba;
- /*
- * Authenticated identity. The meaning of this identifier is dependent on
- * hba->auth_method; it is the identity (if any) that the user presented
- * during the authentication cycle, before they were assigned a database
- * role. (It is effectively the "SYSTEM-USERNAME" of a pg_ident usermap
- * -- though the exact string in use may be different, depending on pg_hba
- * options.)
- *
- * authn_id is NULL if the user has not actually been authenticated, for
- * example if the "trust" auth method is in use.
- */
- const char *authn_id;
-
/*
* TCP keepalive and user timeout settings.
*
@@ -317,6 +335,7 @@ extern ssize_t be_gssapi_write(Port *port, void *ptr, size_t len);
#endif /* ENABLE_GSS */
extern PGDLLIMPORT ProtocolVersion FrontendProtocol;
+extern PGDLLIMPORT ClientConnectionInfo MyClientConnectionInfo;
/* TCP keepalives configuration. These are no-ops on an AF_UNIX socket. */
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 067b729d5a..0a88e70efe 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -481,6 +481,10 @@ extern bool has_rolreplication(Oid roleid);
typedef void (*shmem_request_hook_type) (void);
extern PGDLLIMPORT shmem_request_hook_type shmem_request_hook;
+extern Size EstimateClientConnectionInfoSpace(void);
+extern void SerializeClientConnectionInfo(Size maxsize, char *start_address);
+extern void RestoreClientConnectionInfo(char *conninfo);
+
/* in executor/nodeHash.c */
extern size_t get_hash_memory_limit(void);
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index df0cd77558..bc93101ff7 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -76,6 +76,7 @@
#define PARALLEL_KEY_REINDEX_STATE UINT64CONST(0xFFFFFFFFFFFF000C)
#define PARALLEL_KEY_RELMAPPER_STATE UINT64CONST(0xFFFFFFFFFFFF000D)
#define PARALLEL_KEY_UNCOMMITTEDENUMS UINT64CONST(0xFFFFFFFFFFFF000E)
+#define PARALLEL_KEY_CLIENTCONNINFO UINT64CONST(0xFFFFFFFFFFFF000F)
/* Fixed-size parallel state. */
typedef struct FixedParallelState
@@ -212,6 +213,7 @@ InitializeParallelDSM(ParallelContext *pcxt)
Size reindexlen = 0;
Size relmapperlen = 0;
Size uncommittedenumslen = 0;
+ Size clientconninfolen = 0;
Size segsize = 0;
int i;
FixedParallelState *fps;
@@ -272,8 +274,10 @@ InitializeParallelDSM(ParallelContext *pcxt)
shm_toc_estimate_chunk(&pcxt->estimator, relmapperlen);
uncommittedenumslen = EstimateUncommittedEnumsSpace();
shm_toc_estimate_chunk(&pcxt->estimator, uncommittedenumslen);
+ clientconninfolen = EstimateClientConnectionInfoSpace();
+ shm_toc_estimate_chunk(&pcxt->estimator, clientconninfolen);
/* If you add more chunks here, you probably need to add keys. */
- shm_toc_estimate_keys(&pcxt->estimator, 11);
+ shm_toc_estimate_keys(&pcxt->estimator, 12);
/* Estimate space need for error queues. */
StaticAssertStmt(BUFFERALIGN(PARALLEL_ERROR_QUEUE_SIZE) ==
@@ -352,6 +356,7 @@ InitializeParallelDSM(ParallelContext *pcxt)
char *session_dsm_handle_space;
char *entrypointstate;
char *uncommittedenumsspace;
+ char *clientconninfospace;
Size lnamelen;
/* Serialize shared libraries we have loaded. */
@@ -422,6 +427,12 @@ InitializeParallelDSM(ParallelContext *pcxt)
shm_toc_insert(pcxt->toc, PARALLEL_KEY_UNCOMMITTEDENUMS,
uncommittedenumsspace);
+ /* Serialize our ClientConnectionInfo. */
+ clientconninfospace = shm_toc_allocate(pcxt->toc, clientconninfolen);
+ SerializeClientConnectionInfo(clientconninfolen, clientconninfospace);
+ shm_toc_insert(pcxt->toc, PARALLEL_KEY_CLIENTCONNINFO,
+ clientconninfospace);
+
/* Allocate space for worker information. */
pcxt->worker = palloc0(sizeof(ParallelWorkerInfo) * pcxt->nworkers);
@@ -1270,6 +1281,7 @@ ParallelWorkerMain(Datum main_arg)
char *reindexspace;
char *relmapperspace;
char *uncommittedenumsspace;
+ char *clientconninfospace;
StringInfoData msgbuf;
char *session_dsm_handle_space;
Snapshot tsnapshot;
@@ -1479,6 +1491,11 @@ ParallelWorkerMain(Datum main_arg)
false);
RestoreUncommittedEnums(uncommittedenumsspace);
+ /* Restore the ClientConnectionInfo. */
+ clientconninfospace = shm_toc_lookup(toc, PARALLEL_KEY_CLIENTCONNINFO,
+ false);
+ RestoreClientConnectionInfo(clientconninfospace);
+
/* Attach to the leader's serializable transaction, if SERIALIZABLE. */
AttachSerializableXact(fps->serializable_xact_handle);
diff --git a/src/backend/libpq/auth.c b/src/backend/libpq/auth.c
index 1545ff9f16..2e7330f7bc 100644
--- a/src/backend/libpq/auth.c
+++ b/src/backend/libpq/auth.c
@@ -333,23 +333,23 @@ auth_failed(Port *port, int status, const char *logdetail)
/*
* Sets the authenticated identity for the current user. The provided string
- * will be copied into the TopMemoryContext. The ID will be logged if
- * log_connections is enabled.
+ * will be stored into MyClientConnectionInfo, alongside the current HBA
+ * method in use. The ID will be logged if log_connections is enabled.
*
* Auth methods should call this routine exactly once, as soon as the user is
* successfully authenticated, even if they have reasons to know that
* authorization will fail later.
*
* The provided string will be copied into TopMemoryContext, to match the
- * lifetime of the Port, so it is safe to pass a string that is managed by an
- * external library.
+ * lifetime of MyClientConnectionInfo, so it is safe to pass a string that is
+ * managed by an external library.
*/
static void
set_authn_id(Port *port, const char *id)
{
Assert(id);
- if (port->authn_id)
+ if (MyClientConnectionInfo.authn_id)
{
/*
* An existing authn_id should never be overwritten; that means two
@@ -360,18 +360,20 @@ set_authn_id(Port *port, const char *id)
ereport(FATAL,
(errmsg("authentication identifier set more than once"),
errdetail_log("previous identifier: \"%s\"; new identifier: \"%s\"",
- port->authn_id, id)));
+ MyClientConnectionInfo.authn_id, id)));
}
- port->authn_id = MemoryContextStrdup(TopMemoryContext, id);
+ MyClientConnectionInfo.authn_id = MemoryContextStrdup(TopMemoryContext, id);
+ MyClientConnectionInfo.auth_method = port->hba->auth_method;
if (Log_connections)
{
ereport(LOG,
errmsg("connection authenticated: identity=\"%s\" method=%s "
"(%s:%d)",
- port->authn_id, hba_authname(port->hba->auth_method), HbaFileName,
- port->hba->linenumber));
+ MyClientConnectionInfo.authn_id,
+ hba_authname(MyClientConnectionInfo.auth_method),
+ HbaFileName, port->hba->linenumber));
}
}
@@ -1907,7 +1909,8 @@ auth_peer(hbaPort *port)
*/
set_authn_id(port, pw->pw_name);
- ret = check_usermap(port->hba->usermap, port->user_name, port->authn_id, false);
+ ret = check_usermap(port->hba->usermap, port->user_name,
+ MyClientConnectionInfo.authn_id, false);
return ret;
#else
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 8a038d1b2a..1fd70fba83 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -2678,9 +2678,10 @@ ClosePostmasterPorts(bool am_syslogger)
/*
- * InitProcessGlobals -- set MyProcPid, MyStartTime[stamp], random seeds
+ * InitProcessGlobals -- set some global variables for this process
*
- * Called early in the postmaster and every backend.
+ * This sets MyProcPid, MyStartTime[stamp], random seeds, and initializes
+ * MyClientConnectionInfo. Called early in the postmaster and every backend.
*/
void
InitProcessGlobals(void)
@@ -2688,6 +2689,7 @@ InitProcessGlobals(void)
MyProcPid = getpid();
MyStartTimestamp = GetCurrentTimestamp();
MyStartTime = timestamptz_to_time_t(MyStartTimestamp);
+ memset(&MyClientConnectionInfo, 0, sizeof(MyClientConnectionInfo));
/*
* Set a different global seed in every process. We want something
diff --git a/src/backend/utils/init/miscinit.c b/src/backend/utils/init/miscinit.c
index bd973ba613..2e02d681e6 100644
--- a/src/backend/utils/init/miscinit.c
+++ b/src/backend/utils/init/miscinit.c
@@ -931,6 +931,99 @@ GetUserNameFromId(Oid roleid, bool noerr)
return result;
}
+/* ------------------------------------------------------------------------
+ * Client connection state shared with parallel workers
+ *
+ * ClientConnectionInfo contains pieces of information about the client that
+ * need to be synced to parallel workers when they initialize.
+ *-------------------------------------------------------------------------
+ */
+
+ClientConnectionInfo MyClientConnectionInfo;
+
+/*
+ * Intermediate representation of ClientConnectionInfo for easier
+ * serialization. Variable-length fields are allocated right after this
+ * header.
+ */
+typedef struct SerializedClientConnectionInfo
+{
+ int32 authn_id_len; /* strlen(authn_id), or -1 if NULL */
+ UserAuth auth_method;
+} SerializedClientConnectionInfo;
+
+/*
+ * Calculate the space needed to serialize MyClientConnectionInfo.
+ */
+Size
+EstimateClientConnectionInfoSpace(void)
+{
+ Size size = 0;
+
+ size = add_size(size, sizeof(SerializedClientConnectionInfo));
+
+ if (MyClientConnectionInfo.authn_id)
+ size = add_size(size, strlen(MyClientConnectionInfo.authn_id) + 1);
+
+ return size;
+}
+
+/*
+ * Serialize MyClientConnectionInfo for use by parallel workers.
+ */
+void
+SerializeClientConnectionInfo(Size maxsize, char *start_address)
+{
+ SerializedClientConnectionInfo serialized = {0};
+
+ serialized.authn_id_len = -1;
+ serialized.auth_method = MyClientConnectionInfo.auth_method;
+
+ if (MyClientConnectionInfo.authn_id)
+ serialized.authn_id_len = strlen(MyClientConnectionInfo.authn_id);
+
+ /* Copy serialized representation to buffer */
+ Assert(maxsize >= sizeof(serialized));
+ memcpy(start_address, &serialized, sizeof(serialized));
+
+ maxsize -= sizeof(serialized);
+ start_address += sizeof(serialized);
+
+ /* Copy authn_id into the space after the struct */
+ if (serialized.authn_id_len >= 0)
+ {
+ Assert(maxsize >= (serialized.authn_id_len + 1));
+ memcpy(start_address,
+ MyClientConnectionInfo.authn_id,
+ /* include the NULL terminator to ease deserialization */
+ serialized.authn_id_len + 1);
+ }
+}
+
+/*
+ * Restore MyClientConnectionInfo from its serialized representation.
+ */
+void
+RestoreClientConnectionInfo(char *conninfo)
+{
+ SerializedClientConnectionInfo serialized;
+
+ memcpy(&serialized, conninfo, sizeof(serialized));
+
+ /* Copy the fields back into place */
+ MyClientConnectionInfo.authn_id = NULL;
+ MyClientConnectionInfo.auth_method = serialized.auth_method;
+
+ if (serialized.authn_id_len >= 0)
+ {
+ char *authn_id;
+
+ authn_id = conninfo + sizeof(serialized);
+ MyClientConnectionInfo.authn_id = MemoryContextStrdup(TopMemoryContext,
+ authn_id);
+ }
+}
+
/*-------------------------------------------------------------------------
* Interlock-file support
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 35c9f1efce..a4a4e356e5 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -373,6 +373,7 @@ CkptTsStatus
ClientAuthentication_hook_type
ClientCertMode
ClientCertName
+ClientConnectionInfo
ClientData
ClonePtrType
ClosePortalStmt
@@ -2455,6 +2456,7 @@ SerCommitSeqNo
SerialControl
SerializableXactHandle
SerializedActiveRelMaps
+SerializedClientConnectionInfo
SerializedRanges
SerializedReindexState
SerializedSnapshotData
--
2.37.2
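For readers skimming the patch, the serialization scheme it uses (a fixed-size header carrying the string length, or -1 for NULL, followed directly by the string bytes) can be sketched outside PostgreSQL roughly as follows. This is only an illustration under stated assumptions: every demo_/Demo name is hypothetical, plain malloc() stands in for MemoryContextStrdup(), and the standard assert() stands in for Assert().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for ClientConnectionInfo; not the PostgreSQL struct. */
typedef struct DemoConnInfo
{
    const char *authn_id;       /* may be NULL */
    int         auth_method;
} DemoConnInfo;

/* Fixed-size header; the string bytes follow it directly in the buffer. */
typedef struct DemoSerialized
{
    int32_t     authn_id_len;   /* strlen(authn_id), or -1 if NULL */
    int         auth_method;
} DemoSerialized;

/* Space needed for the header plus the optional NUL-terminated string. */
size_t
demo_estimate(const DemoConnInfo *info)
{
    size_t      size = sizeof(DemoSerialized);

    if (info->authn_id)
        size += strlen(info->authn_id) + 1;
    return size;
}

/* Write the header, then the string (with its terminator) right after it. */
void
demo_serialize(const DemoConnInfo *info, size_t maxsize, char *dst)
{
    DemoSerialized hdr;

    memset(&hdr, 0, sizeof(hdr));
    hdr.authn_id_len = info->authn_id ? (int32_t) strlen(info->authn_id) : -1;
    hdr.auth_method = info->auth_method;

    assert(maxsize >= sizeof(hdr));
    memcpy(dst, &hdr, sizeof(hdr));

    if (hdr.authn_id_len >= 0)
    {
        assert(maxsize - sizeof(hdr) >= (size_t) hdr.authn_id_len + 1);
        memcpy(dst + sizeof(hdr), info->authn_id,
               (size_t) hdr.authn_id_len + 1);
    }
    (void) maxsize;             /* only read by assert() above */
}

/* Rebuild the struct; the string is copied so it outlives the buffer. */
void
demo_restore(DemoConnInfo *out, const char *src)
{
    DemoSerialized hdr;

    memcpy(&hdr, src, sizeof(hdr));
    out->authn_id = NULL;
    out->auth_method = hdr.auth_method;

    if (hdr.authn_id_len >= 0)
    {
        size_t      n = (size_t) hdr.authn_id_len + 1;
        char       *copy = malloc(n);

        if (copy != NULL)
            memcpy(copy, src + sizeof(hdr), n);
        out->authn_id = copy;
    }
}
```

A caller would size a buffer with demo_estimate(), fill it with demo_serialize(), and hand it to demo_restore() on the receiving side. Using -1 as the length keeps a NULL identity distinguishable from an empty string.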
Hi,
On 8/23/22 4:25 AM, Michael Paquier wrote:
On Mon, Aug 22, 2022 at 08:10:10AM -0700, Jacob Champion wrote:
otherwise I'll sit tight.
So am I. I have done an extra round of checks around the
serialization/deserialization logic where I put some elog()'s to look
at the output passed down with some workers and a couple of auth
methods, and after an indentation and some comment polishing I finish
with the attached.
There was one thing that annoyed me with the patch, though, as of the
lack of initialization of MyClientConnectionInfo at backend startup,
as we may finish by not calling set_authn_id() to fill in some of its
data, so I have placed an extra memset(0) in InitProcessGlobals()
Fair point.
(note that Port does a calloc() much earlier than that, but I think
that we don't really want to do more in such code paths, especially
for the parallelized client information).
I have written a commit message, while on it. Does that look fine to
you?
Thanks!
That sounds all good to me, except a typo for the author in the commit
message: s/Jocob/Jacob/
Regards,
--
Bertrand Drouvot
Amazon Web Services: https://aws.amazon.com
On 8/23/22 01:53, Drouvot, Bertrand wrote:
That sounds all good to me, except a typo for the author in the commit
message: s/Jocob/Jacob/
Thanks, I missed that on my readthrough! :D
Patch looks good to me, too, with one question:
@@ -2688,6 +2689,7 @@ InitProcessGlobals(void)
MyProcPid = getpid();
MyStartTimestamp = GetCurrentTimestamp();
MyStartTime = timestamptz_to_time_t(MyStartTimestamp);
+ memset(&MyClientConnectionInfo, 0, sizeof(MyClientConnectionInfo));
/*
* Set a different global seed in every process. We want something
When can we rely on static initialization, and when can't we? Is there a
concern that the memory could have been polluted from before the
postmaster's fork?
Thanks,
--Jacob
On Tue, Aug 23, 2022 at 10:04:30AM -0700, Jacob Champion wrote:
On 8/23/22 01:53, Drouvot, Bertrand wrote:
@@ -2688,6 +2689,7 @@ InitProcessGlobals(void)
MyProcPid = getpid();
MyStartTimestamp = GetCurrentTimestamp();
MyStartTime = timestamptz_to_time_t(MyStartTimestamp);
+ memset(&MyClientConnectionInfo, 0, sizeof(MyClientConnectionInfo));
/*
* Set a different global seed in every process. We want something
When can we rely on static initialization, and when can't we? Is there a
concern that the memory could have been polluted from before the
postmaster's fork?
My main worry here is EXEC_BACKEND, where we would just use our own
implementation of fork(), and it is a bad idea at the end to leave
that untouched while we could have code paths that attempt to access
it. At the end, I have moved the initialization at the same place as
where we set MyProcPort for a backend in BackendInitialize(), mainly
as a matter of consistency because ClientConnectionInfo is aimed at
being a subset of that. And applied.
--
Michael
Michael Paquier <michael@paquier.xyz> writes:
On Tue, Aug 23, 2022 at 10:04:30AM -0700, Jacob Champion wrote:
When can we rely on static initialization, and when can't we? Is there a
concern that the memory could have been polluted from before the
postmaster's fork?
My main worry here is EXEC_BACKEND, where we would just use our own
implementation of fork(), and it is a bad idea at the end to leave
that untouched while we could have code paths that attempt to access
it.
Uh ... what? EXEC_BACKEND is even more certain to correctly initialize
static/global variables in a child process. I agree with Jacob that
this memset is probably useless, and therefore confusing.
regards, tom lane
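As background for the static-initialization question above: in C, an object with static storage duration and no explicit initializer starts out with null pointers and zero arithmetic members before the program runs, and a freshly exec'd process starts from that state again. A minimal standalone sketch, assuming nothing from PostgreSQL (the demo_ names are hypothetical):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for ClientConnectionInfo. */
typedef struct DemoClientInfo
{
    const char *authn_id;
    int         auth_method;
} DemoClientInfo;

/*
 * No initializer: the C standard still guarantees that authn_id is a null
 * pointer and auth_method is 0 when the program starts.
 */
static DemoClientInfo demo_client_info;

/* True while the global still holds its startup (all-zero) state. */
int
demo_is_pristine(void)
{
    return demo_client_info.authn_id == NULL &&
        demo_client_info.auth_method == 0;
}

/* Record an identity, loosely mirroring what set_authn_id() stores. */
void
demo_set(const char *id, int method)
{
    assert(id != NULL);
    demo_client_info.authn_id = id;
    demo_client_info.auth_method = method;
}

/*
 * Explicit reset in the style of the memset() under discussion.  It only
 * changes anything if the variable was modified earlier in this process
 * (or in a parent whose memory was inherited by a plain fork()).
 */
void
demo_reset(void)
{
    memset(&demo_client_info, 0, sizeof(demo_client_info));
}
```

Calling demo_is_pristine() before any demo_set() returns nonzero without any prior memset(), which is the behavior the question is about. One caveat: the standard does not strictly promise that all-bits-zero is a null pointer, although that holds on mainstream platforms.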
Hi,
Michael Paquier <michael@paquier.xyz> writes:
On Tue, Aug 23, 2022 at 10:04:30AM -0700, Jacob Champion wrote:
My main worry here is EXEC_BACKEND, where we would just use our own
implementation of fork(), and it is a bad idea at the end to leave
that untouched while we could have code paths that attempt to access
it. At the end, I have moved the initialization at the same place as
where we set MyProcPort for a backend in BackendInitialize(), mainly
as a matter of consistency because ClientConnectionInfo is aimed at
being a subset of that. And applied.
I found a compiler complaint caused by this patch. The attached patch fixes it.
--
Best Regards
Andy Fan
Attachments:
v1-0001-Remove-the-parameter-maxsize-set-but-not-used-err.patch (text/x-diff)
From 9e8a3fb7a044704fbfcd682a897f72260266bd54 Mon Sep 17 00:00:00 2001
From: "yizhi.fzh" <yizhi.fzh@alibaba-inc.com>
Date: Thu, 15 Feb 2024 16:46:57 +0800
Subject: [PATCH v1 1/1] Remove the "parameter 'maxsize' set but not used"
error.
maxsize is only read in assert-enabled builds. To keep the compiler quiet,
update it only in assert-enabled builds.
---
src/backend/utils/init/miscinit.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/src/backend/utils/init/miscinit.c b/src/backend/utils/init/miscinit.c
index 23f77a59e5..0e72fcefab 100644
--- a/src/backend/utils/init/miscinit.c
+++ b/src/backend/utils/init/miscinit.c
@@ -1050,7 +1050,9 @@ SerializeClientConnectionInfo(Size maxsize, char *start_address)
Assert(maxsize >= sizeof(serialized));
memcpy(start_address, &serialized, sizeof(serialized));
+#ifdef USE_ASSERT_CHECKING
maxsize -= sizeof(serialized);
+#endif
start_address += sizeof(serialized);
/* Copy authn_id into the space after the struct */
--
2.34.1
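To illustrate the kind of complaint this fix addresses: when a parameter is only read inside assertions, a build with assertions disabled still performs the later assignment to it, and compilers that warn about parameters that are set but never read (clang's -Wunused-but-set-parameter, for example) flag that. A standalone sketch of the same guard, using the standard assert()/NDEBUG pair in place of Assert/USE_ASSERT_CHECKING, with a hypothetical demo_write() helper:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: writes an unsigned int length header followed by the
 * NUL-terminated payload.  maxsize exists only to sanity-check the buffer.
 */
void
demo_write(size_t maxsize, char *dst, const char *payload)
{
    unsigned int len = (unsigned int) strlen(payload);

    assert(maxsize >= sizeof(len));
    memcpy(dst, &len, sizeof(len));

#ifndef NDEBUG
    /* Only the assert() below reads the updated value. */
    maxsize -= sizeof(len);
#endif
    dst += sizeof(len);

    assert(maxsize >= (size_t) len + 1);
    memcpy(dst, payload, (size_t) len + 1);
}
```

With the guard in place, a non-assert build never assigns to maxsize, so there is no set-but-unused assignment left to warn about. The parameter is then entirely unused in such builds, which stricter settings like -Wunused-parameter may report separately.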