[PoC] Federated Authn/z with OAUTHBEARER
Hi all,
We've been working on ways to expand the list of third-party auth
methods that Postgres provides. Some example use cases might be "I want
to let anyone with a Google account read this table" or "let anyone who
belongs to this GitHub organization connect as a superuser".
Attached is a proof of concept that implements pieces of OAuth 2.0
federated authorization, via the OAUTHBEARER SASL mechanism from RFC
7628 [1]. Currently, only Linux is supported due to some ugly hacks in
the backend.
The architecture can support the following use cases, as long as your
OAuth issuer of choice implements the necessary specs, and you know how
to write a validator for your issuer's bearer tokens:
- Authentication only, where an external validator uses the bearer
token to determine the end user's identity, and Postgres decides
whether that user ID is authorized to connect via the standard pg_ident
user mapping.
- Authorization only, where the validator uses the bearer token to
determine the allowed roles for the end user, and then checks to make
sure that the connection's role is one of those. This bypasses pg_ident
and allows pseudonymous connections, where Postgres doesn't care who
you are as long as the token proves you're allowed to assume the role
you want.
- A combination, where the validator provides both an authn_id (for
later audits of database access) and an authorization decision based on
the bearer token and role provided.
It looks kinda like this during use:
$ psql 'host=example.org oauth_client_id=f02c6361-0635-...'
Visit https://oauth.example.org/login and enter the code: FPQ2-M4BG
= Quickstart =
For anyone who likes building and seeing green tests ASAP.
Prerequisite software:
- iddawc v0.9.9 [2], library and dev headers, for client support
- Python 3, for the test suite only
(Some newer distributions have dev packages for iddawc, but mine did
not.)
Configure using --with-oauth. If you've installed iddawc into a
non-standard location, be sure to use --with-includes and
--with-libraries, and make sure either rpath or LD_LIBRARY_PATH will
get you what you need. Install as usual.
To run the test suite, make sure the contrib/authn_id extension is
installed, then init and start your dev cluster. No other configuration
is required; the test will do it for you. Switch to the src/test/python
directory, point your PG* envvars to a superuser connection on the
cluster (so that a "bare" psql will connect automatically), and run
`make installcheck`.
= Production Setup =
(but don't use this in production, please)
Actually setting up a "real" system requires knowing the specifics of
your third-party issuer of choice. Your issuer MUST implement OpenID
Discovery and the OAuth Device Authorization flow! Seriously, check
this before spending a lot of time writing a validator against an
issuer that can't actually talk to libpq.
The broad strokes are as follows:
1. Register a new public client with your issuer to get an OAuth client
ID for libpq. You'll use this as the oauth_client_id in the connection
string. (If your issuer doesn't support public clients and gives you a
client secret, you can use the oauth_client_secret connection parameter
to provide that too.)
The client you register must be able to use a device authorization
flow; some issuers require additional setup for that.
2. Set up your HBA with the 'oauth' auth method, and set the 'issuer'
and 'scope' options. 'issuer' is the base URL identifying your third-
party issuer (for example, https://accounts.google.com), and 'scope' is
the set of OAuth scopes that the client and server will need to
authenticate and/or authorize the user (e.g. "openid email").
So a sample HBA line might look like
host all all samehost oauth issuer="https://accounts.google.com" scope="openid email"
3. In postgresql.conf, set up an oauth_validator_command that's capable
of verifying bearer tokens and implements the validator protocol. This
is the hardest part. See below.
= Design =
On the client side, I've implemented the Device Authorization flow (RFC
8628 [3]). What this means in practice is that libpq reaches out to a
third-party issuer (e.g. Google, Azure, etc.), identifies itself with a
client ID, and requests permission to act on behalf of the end user.
The issuer responds with a login URL and a one-time code, which libpq
presents to the user using the notice hook. The end user then navigates
to that URL, presents their code, authenticates to the issuer, and
grants permission for libpq to retrieve a bearer token. libpq grabs a
token and sends it to the server for verification.
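To make the exchange above concrete, here's a rough Python sketch of the two request payloads involved. This is based on RFC 8628 itself, not on the patch's C code; the function names are mine.

```python
# Sketch of the two POST bodies used in the RFC 8628 Device Authorization
# flow. The endpoints themselves come from the issuer's discovery document.

def build_device_auth_request(client_id, scope=None):
    """Body for the initial POST to the device authorization endpoint."""
    body = {"client_id": client_id}
    if scope:
        body["scope"] = scope
    return body

def build_token_request(client_id, device_code):
    """Body for each poll of the token endpoint (RFC 8628, section 3.4)."""
    return {
        "grant_type": "urn:ietf:params:oauth:grant-type:device_code",
        "device_code": device_code,
        "client_id": client_id,
    }
```

The issuer's response to the first request carries the verification URI and user code that libpq shows via the notice hook; the second request is repeated until the user finishes logging in.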
(The bearer token, in this setup, is essentially a plaintext password,
and you must secure it like you would a plaintext password. The token
has an expiration date and can be explicitly revoked, which makes it
slightly better than a password, but this is still a step backwards
from something like SCRAM with channel binding. There are ways to bind
a bearer token to a client certificate [4], which would mitigate the
risk of token theft -- but your issuer has to support that, and I
haven't found much support in the wild.)
The server side is where things get more difficult for the DBA. The
OAUTHBEARER spec has this to say about the server side implementation:
The server validates the response according to the specification for
the OAuth Access Token Types used.
And here's what the Bearer Token specification [5] says:
This document does not specify the encoding or the contents of the
token; hence, detailed recommendations about the means of
guaranteeing token integrity protection are outside the scope of
this document.
It's the Wild West. Every issuer does their own thing in their own
special way. Some don't really give you a way to introspect information
about a bearer token at all, because they assume that the issuer of the
token and the consumer of the token are essentially the same service.
Some major players provide their own custom libraries, implemented in
your-language-of-choice, to deal with their particular brand of magic.
So I punted and added the oauth_validator_command GUC. A token
validator command reads the bearer token from a file descriptor that's
passed to it, then does whatever magic is necessary to validate that
token and find out who owns it. Optionally, it can look at the role
that's being connected and make sure that the token authorizes the user
to actually use that role. Then it says yea or nay to Postgres, and
optionally tells the server who the user is so that their ID can be
logged and mapped through pg_ident.
(See the commit message in 0005 for a full description of the protocol.
The test suite also has two toy implementations that illustrate the
protocol, but they provide zero security.)
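To give a feel for the shape of the work a validator does, here is a toy Python sketch. The actual wire protocol lives in 0005's commit message; the token table and function below are entirely made up, and like the test suite's toys, this provides zero security.

```python
# Toy validator logic (zero security): map known bearer tokens to an
# identity and a set of allowed roles. A real validator would instead
# call the issuer's introspection endpoint or verify a JWT signature.

# Hypothetical token database, for illustration only.
KNOWN_TOKENS = {
    "abcd1234": {"authn_id": "alice@example.org", "roles": {"reader"}},
}

def validate(token, role):
    """Return (authorized, authn_id) for a bearer token and requested role."""
    entry = KNOWN_TOKENS.get(token)
    if entry is None:
        return (False, None)                # unknown or revoked token
    if role not in entry["roles"]:
        return (False, entry["authn_id"])   # token doesn't permit this role
    return (True, entry["authn_id"])
```

The returned authn_id is what the server would log and map through pg_ident; the boolean is the yea-or-nay answer.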
This is easily the worst part of the patch, not only because my
implementation is a bad hack on OpenPipeStream(), but because it
balances the security of the entire system on the shoulders of a DBA
who does not have time to read umpteen OAuth specifications cover to
cover. More thought and coding effort is needed here, but I didn't want
to gold-plate a bad design. I'm not sure what alternatives there are
within the rules laid out by OAUTHBEARER. And the system is _extremely_
flexible, in the way that only code that's maintained by somebody else
can be.
= Patchset Roadmap =
The seven patches can be grouped into three:
1. Prep
0001 decouples the SASL code from the SCRAM implementation.
0002 makes it possible to use common/jsonapi from the frontend.
0003 lets the json_errdetail() result be freed, to avoid leaks.
2. OAUTHBEARER Implementation
0004 implements the client with libiddawc.
0005 implements server HBA support and oauth_validator_command.
3. Testing
0006 adds a simple test extension to retrieve the authn_id.
0007 adds the Python test suite I've been developing against.
The first three patches are, hopefully, generally useful outside of
this implementation, and I'll plan to register them in the next
commitfest. The middle two patches are the "interesting" pieces, and
I've split them into client and server for ease of understanding,
though neither is particularly useful without the other.
The last two patches grew out of a test suite that I originally built
to be able to exercise NSS corner cases at the protocol/byte level. It
was incredibly helpful during implementation of this new SASL
mechanism, since I could write the client and server independently of
each other and get high coverage of broken/malicious implementations.
It's based on pytest and Construct, and the Python 3 requirement might
turn some away, but I wanted to include it in case anyone else wanted
to hack on the code. src/test/python/README explains more.
= Thoughts/Reflections =
...in no particular order.
I picked OAuth 2.0 as my first experiment in federated auth mostly
because I was already familiar with pieces of it. I think SAML (via the
SAML20 mechanism, RFC 6595) would be a good companion to this proof of
concept, if there is general interest in federated deployments.
I don't really like the OAUTHBEARER spec, but I'm not sure there's a
better alternative. Everything is left as an exercise for the reader.
It's not particularly extensible. Standard OAuth is built for
authorization, not authentication, and from reading the RFC's history,
it feels like it was a hack to just get something working. New
standards like OpenID Connect have begun to fill in the gaps, but the
SASL mechanisms have not kept up. (The OPENID20 mechanism is, to my
understanding, unrelated/obsolete.) And support for helpful OIDC
features seems to be spotty in the real world.
The iddawc dependency for client-side OAuth was extremely helpful to
develop this proof of concept quickly, but I don't think it would be an
appropriate component to build a real feature on. It's extremely
heavyweight -- it incorporates a huge stack of dependencies, including
a logging framework and a web server, to implement features we would
probably never use -- and it's fairly difficult to debug in practice.
If a device authorization flow were the only thing that libpq needed to
support natively, I think we should just depend on a widely used HTTP
client, like libcurl or neon, and implement the minimum spec directly
against the existing test suite.
There are a huge number of other authorization flows besides Device
Authorization; most would involve libpq automatically opening a web
browser for you. I felt like that wasn't an appropriate thing for a
library to do by default, especially when one of the most important
clients is a command-line application. Perhaps there could be a hook
for applications to be able to override the builtin flow and substitute
their own.
Since bearer tokens are essentially plaintext passwords, the relevant
specs require the use of transport-level protection, and I think it'd
be wise for the client to require TLS to be in place before performing
the initial handshake or sending a token.
Not every OAuth issuer is also an OpenID Discovery provider, so it's
frustrating that OAUTHBEARER (which is purportedly an OAuth 2.0
feature) requires OIDD for real-world implementations. Perhaps we could
hack around this with a data: URI or something.
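As a purely speculative illustration of that hack: the two endpoints the client actually needs could be packed into a data: URI and handed to libpq in place of an issuer URL. The metadata field names below come from the OAuth/OIDC discovery specs; everything else is invented.

```python
# Speculative sketch: embed a minimal discovery document in a data: URI
# so that an issuer without an OIDC Discovery endpoint could still work.
import base64
import json

def make_discovery_data_uri(device_auth_endpoint, token_endpoint):
    doc = {
        "device_authorization_endpoint": device_auth_endpoint,
        "token_endpoint": token_endpoint,
    }
    payload = base64.b64encode(json.dumps(doc).encode()).decode()
    return "data:application/json;base64," + payload

def parse_discovery_data_uri(uri):
    prefix = "data:application/json;base64,"
    assert uri.startswith(prefix)
    return json.loads(base64.b64decode(uri[len(prefix):]))
```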
The client currently performs the OAuth login dance every single time a
connection is made, but a proper OAuth client would cache its tokens to
reuse later, and keep an eye on their expiration times. This would make
daily use a little more like that of Kerberos, but we would have to
design a way to create and secure a token cache on disk.
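A token cache lookup itself would be simple; the hard part is securing the cache file. A hedged sketch of the in-memory side (names and structure are mine, not a proposed API):

```python
# Sketch of token-cache reuse: hand back a cached access token only if
# it hasn't expired. Persisting and securing this cache on disk is the
# open design question.

def get_cached_token(cache, issuer, now):
    """Return a still-valid access token for this issuer, or None."""
    entry = cache.get(issuer)
    if entry and entry["expires_at"] > now:
        return entry["access_token"]
    return None          # expired or absent: run the login dance again
```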
If you've read this far, thank you for your interest, and I hope you
enjoy playing with it!
--Jacob
[1]: https://datatracker.ietf.org/doc/html/rfc7628
[2]: https://github.com/babelouest/iddawc
[3]: https://datatracker.ietf.org/doc/html/rfc8628
[4]: https://datatracker.ietf.org/doc/html/rfc8705
[5]: https://datatracker.ietf.org/doc/html/rfc6750#section-5.2
Attachments:
- 0001-auth-generalize-SASL-mechanisms.patch
- 0002-src-common-remove-logging-from-jsonapi-for-shlib.patch
- 0003-common-jsonapi-always-palloc-the-error-strings.patch
- 0004-libpq-add-OAUTHBEARER-SASL-mechanism.patch
- 0005-backend-add-OAUTHBEARER-SASL-mechanism.patch
- 0006-Add-a-very-simple-authn_id-extension.patch
- 0007-Add-pytest-suite-for-OAuth.patch
On Tue, Jun 08, 2021 at 04:37:46PM +0000, Jacob Champion wrote:
1. Prep
0001 decouples the SASL code from the SCRAM implementation.
0002 makes it possible to use common/jsonapi from the frontend.
0003 lets the json_errdetail() result be freed, to avoid leaks.
The first three patches are, hopefully, generally useful outside of
this implementation, and I'll plan to register them in the next
commitfest. The middle two patches are the "interesting" pieces, and
I've split them into client and server for ease of understanding,
though neither is particularly useful without the other.
Beginning with the beginning, could you spawn two threads for the
jsonapi rework and the SASL/SCRAM business? I agree that these look
independently useful. Glad to see someone improving the code with
SASL and SCRAM which are too inter-dependent now. I saw in the RFCs
dedicated to OAUTH the need for the JSON part as well.
+# define check_stack_depth()
+# ifdef JSONAPI_NO_LOG
+# define json_log_and_abort(...) \
+ do { fprintf(stderr, __VA_ARGS__); exit(1); } while(0)
+# else
In patch 0002, this is the wrong approach. libpq will not be able to
feed on such reports, and you cannot use any of the APIs from the
palloc() family either as these just fail on OOM. libpq should be
able to know about the error, and would fill in the error back to the
application. This abstraction is not necessary on HEAD as
pg_verifybackup is fine with this level of reporting. My rough guess
is that we will need to split the existing jsonapi.c into two files,
one that can be used in shared libraries and a second that handles the
errors.
+ /* TODO: SASL_EXCHANGE_FAILURE with output is forbidden in SASL */
if (result == SASL_EXCHANGE_SUCCESS)
sendAuthRequest(port,
AUTH_REQ_SASL_FIN,
output,
outputlen);
Perhaps that's an issue we need to worry on its own? I didn't recall
this part..
--
Michael
On 08/06/2021 19:37, Jacob Champion wrote:
We've been working on ways to expand the list of third-party auth
methods that Postgres provides. Some example use cases might be "I want
to let anyone with a Google account read this table" or "let anyone who
belongs to this GitHub organization connect as a superuser".
Cool!
The iddawc dependency for client-side OAuth was extremely helpful to
develop this proof of concept quickly, but I don't think it would be an
appropriate component to build a real feature on. It's extremely
heavyweight -- it incorporates a huge stack of dependencies, including
a logging framework and a web server, to implement features we would
probably never use -- and it's fairly difficult to debug in practice.
If a device authorization flow were the only thing that libpq needed to
support natively, I think we should just depend on a widely used HTTP
client, like libcurl or neon, and implement the minimum spec directly
against the existing test suite.
You could punt and let the application implement that stuff. I'm
imagining that the application code would look something like this:
conn = PQconnectStartParams(...);
for (;;)
{
status = PQconnectPoll(conn)
switch (status)
{
case CONNECTION_SASL_TOKEN_REQUIRED:
/* open a browser for the user, get token */
token = open_browser()
PQauthResponse(token);
break;
...
}
}
It would be nice to have a simple default implementation, though, for
psql and all the other client applications that come with PostgreSQL itself.
If you've read this far, thank you for your interest, and I hope you
enjoy playing with it!
A few small things caught my eye in the backend oauth_exchange function:
+ /* Handle the client's initial message. */
+ p = strdup(input);
this strdup() should be pstrdup().
In the same function, there are a bunch of reports like this:
ereport(ERROR,
+ (errcode(ERRCODE_PROTOCOL_VIOLATION),
+ errmsg("malformed OAUTHBEARER message"),
+ errdetail("Comma expected, but found character \"%s\".",
+ sanitize_char(*p))));
I don't think the double quotes are needed here, because sanitize_char
will return quotes if it's a single character. So it would end up
looking like this: ... found character "'x'".
- Heikki
On Fri, 2021-06-18 at 11:31 +0300, Heikki Linnakangas wrote:
On 08/06/2021 19:37, Jacob Champion wrote:
We've been working on ways to expand the list of third-party auth
methods that Postgres provides. Some example use cases might be "I want
to let anyone with a Google account read this table" or "let anyone who
belongs to this GitHub organization connect as a superuser".
Cool!
Glad you think so! :D
The iddawc dependency for client-side OAuth was extremely helpful to
develop this proof of concept quickly, but I don't think it would be an
appropriate component to build a real feature on. It's extremely
heavyweight -- it incorporates a huge stack of dependencies, including
a logging framework and a web server, to implement features we would
probably never use -- and it's fairly difficult to debug in practice.
If a device authorization flow were the only thing that libpq needed to
support natively, I think we should just depend on a widely used HTTP
client, like libcurl or neon, and implement the minimum spec directly
against the existing test suite.
You could punt and let the application implement that stuff. I'm
imagining that the application code would look something like this:
conn = PQconnectStartParams(...);
for (;;)
{
status = PQconnectPoll(conn)
switch (status)
{
case CONNECTION_SASL_TOKEN_REQUIRED:
/* open a browser for the user, get token */
token = open_browser()
PQauthResponse(token);
break;
...
}
}
I was toying with the idea of having a callback for libpq clients,
where they could take full control of the OAuth flow if they wanted to.
Doing it inline with PQconnectPoll seems like it would work too. It has
a couple of drawbacks that I can see:
- If a client isn't currently using a poll loop, they'd have to switch
to one to be able to use OAuth connections. Not a difficult change, but
considering all the other hurdles to making this work, I'm hoping to
minimize the hoop-jumping.
- A client would still have to receive a bunch of OAuth parameters from
some new libpq API in order to construct the correct URL to visit, so
the overall complexity for implementers might be higher than if we just
passed those params directly in a callback.
It would be nice to have a simple default implementation, though, for
psql and all the other client applications that come with PostgreSQL itself.
I agree. I think having a bare-bones implementation in libpq itself
would make initial adoption *much* easier, and then if specific
applications wanted to have richer control over an authorization flow,
then they could implement that themselves with the aforementioned
callback.
The Device Authorization flow was the most minimal working
implementation I could find, since it doesn't require a web browser on
the system, just the ability to print a prompt to the console. But if
anyone knows of a better flow for this use case, I'm all ears.
If you've read this far, thank you for your interest, and I hope you
enjoy playing with it!
A few small things caught my eye in the backend oauth_exchange function:
+ /* Handle the client's initial message. */
+ p = strdup(input);
this strdup() should be pstrdup().
Thanks, I'll fix that in the next re-roll.
In the same function, there are a bunch of reports like this:
ereport(ERROR,
+ (errcode(ERRCODE_PROTOCOL_VIOLATION),
+ errmsg("malformed OAUTHBEARER message"),
+ errdetail("Comma expected, but found character \"%s\".",
+ sanitize_char(*p))));
I don't think the double quotes are needed here, because sanitize_char
will return quotes if it's a single character. So it would end up
looking like this: ... found character "'x'".
I'll fix this too. Thanks!
--Jacob
On Fri, 2021-06-18 at 13:07 +0900, Michael Paquier wrote:
On Tue, Jun 08, 2021 at 04:37:46PM +0000, Jacob Champion wrote:
1. Prep
0001 decouples the SASL code from the SCRAM implementation.
0002 makes it possible to use common/jsonapi from the frontend.
0003 lets the json_errdetail() result be freed, to avoid leaks.
The first three patches are, hopefully, generally useful outside of
this implementation, and I'll plan to register them in the next
commitfest. The middle two patches are the "interesting" pieces, and
I've split them into client and server for ease of understanding,
though neither is particularly useful without the other.
Beginning with the beginning, could you spawn two threads for the
jsonapi rework and the SASL/SCRAM business?
Done [1, 2]. I've copied your comments into those threads with my
responses, and I'll have them registered in commitfest shortly.
Thanks!
--Jacob
[1]: /messages/by-id/3d2a6f5d50e741117d6baf83eb67ebf1a8a35a11.camel@vmware.com
[2]: /messages/by-id/a250d475ba1c0cc0efb7dfec8e538fcc77cdcb8e.camel@vmware.com
On Tue, Jun 22, 2021 at 11:26:03PM +0000, Jacob Champion wrote:
Done [1, 2]. I've copied your comments into those threads with my
responses, and I'll have them registered in commitfest shortly.
Thanks!
--
Michael
On Tue, 2021-06-22 at 23:22 +0000, Jacob Champion wrote:
On Fri, 2021-06-18 at 11:31 +0300, Heikki Linnakangas wrote:
A few small things caught my eye in the backend oauth_exchange function:
+ /* Handle the client's initial message. */
+ p = strdup(input);
this strdup() should be pstrdup().
Thanks, I'll fix that in the next re-roll.
In the same function, there are a bunch of reports like this:
ereport(ERROR,
+ (errcode(ERRCODE_PROTOCOL_VIOLATION),
+ errmsg("malformed OAUTHBEARER message"),
+ errdetail("Comma expected, but found character \"%s\".",
+ sanitize_char(*p))));
I don't think the double quotes are needed here, because sanitize_char
will return quotes if it's a single character. So it would end up
looking like this: ... found character "'x'".
I'll fix this too. Thanks!
v2, attached, incorporates Heikki's suggested fixes and also rebases on
top of latest HEAD, which had the SASL refactoring changes committed
last month.
The biggest change from the last patchset is 0001, an attempt at
enabling jsonapi in the frontend without the use of palloc(), based on
suggestions by Michael and Tom from last commitfest. I've also made
some improvements to the pytest suite. No major changes to the OAuth
implementation yet.
--Jacob
Attachments:
- v2-0001-common-jsonapi-support-FRONTEND-clients.patch
- v2-0002-libpq-add-OAUTHBEARER-SASL-mechanism.patch
- v2-0003-backend-add-OAUTHBEARER-SASL-mechanism.patch
- v2-0004-Add-a-very-simple-authn_id-extension.patch
- v2-0005-Add-pytest-suite-for-OAuth.patch
On Wed, Aug 25, 2021 at 11:42 AM Jacob Champion <pchampion@vmware.com>
wrote:
On Tue, 2021-06-22 at 23:22 +0000, Jacob Champion wrote:
On Fri, 2021-06-18 at 11:31 +0300, Heikki Linnakangas wrote:
A few small things caught my eye in the backend oauth_exchange
function:
+ /* Handle the client's initial message. */
+ p = strdup(input);
this strdup() should be pstrdup().
Thanks, I'll fix that in the next re-roll.
In the same function, there are a bunch of reports like this:
ereport(ERROR,
+ (errcode(ERRCODE_PROTOCOL_VIOLATION),
+ errmsg("malformed OAUTHBEARER message"),
+ errdetail("Comma expected, but found character \"%s\".",
+ sanitize_char(*p))));
I don't think the double quotes are needed here, because sanitize_char
will return quotes if it's a single character. So it would end up
looking like this: ... found character "'x'".
I'll fix this too. Thanks!
v2, attached, incorporates Heikki's suggested fixes and also rebases on
top of latest HEAD, which had the SASL refactoring changes committed
last month.
The biggest change from the last patchset is 0001, an attempt at
enabling jsonapi in the frontend without the use of palloc(), based on
suggestions by Michael and Tom from last commitfest. I've also made
some improvements to the pytest suite. No major changes to the OAuth
implementation yet.
--Jacob
Hi,
For v2-0001-common-jsonapi-support-FRONTEND-clients.patch :
+ /* Clean up. */
+ termJsonLexContext(&lex);
At the end of termJsonLexContext(), empty is copied to lex. For stack
based JsonLexContext, the copy seems unnecessary.
Maybe introduce a boolean parameter for termJsonLexContext() to signal that
the copy can be omitted ?
+#ifdef FRONTEND
+ /* make sure initialization succeeded */
+ if (lex->strval == NULL)
+ return JSON_OUT_OF_MEMORY;
Should PQExpBufferBroken(lex->strval) be used for the check ?
Thanks
On Wed, Aug 25, 2021 at 3:25 PM Zhihong Yu <zyu@yugabyte.com> wrote:
On Wed, Aug 25, 2021 at 11:42 AM Jacob Champion <pchampion@vmware.com>
wrote:
On Tue, 2021-06-22 at 23:22 +0000, Jacob Champion wrote:
On Fri, 2021-06-18 at 11:31 +0300, Heikki Linnakangas wrote:
A few small things caught my eye in the backend oauth_exchange
function:
+ /* Handle the client's initial message. */
+ p = strdup(input);
this strdup() should be pstrdup().
Thanks, I'll fix that in the next re-roll.
In the same function, there are a bunch of reports like this:
ereport(ERROR,
+ (errcode(ERRCODE_PROTOCOL_VIOLATION),
+ errmsg("malformed OAUTHBEARER message"),
+ errdetail("Comma expected, but found character \"%s\".",
+ sanitize_char(*p))));
I don't think the double quotes are needed here, because
sanitize_char
will return quotes if it's a single character. So it would end up
looking like this: ... found character "'x'".
I'll fix this too. Thanks!
v2, attached, incorporates Heikki's suggested fixes and also rebases on
top of latest HEAD, which had the SASL refactoring changes committed
last month.
The biggest change from the last patchset is 0001, an attempt at
enabling jsonapi in the frontend without the use of palloc(), based on
suggestions by Michael and Tom from last commitfest. I've also made
some improvements to the pytest suite. No major changes to the OAuth
implementation yet.
--Jacob
Hi,
For v2-0001-common-jsonapi-support-FRONTEND-clients.patch :
+ /* Clean up. */
+ termJsonLexContext(&lex);
At the end of termJsonLexContext(), empty is copied to lex. For stack
based JsonLexContext, the copy seems unnecessary.
Maybe introduce a boolean parameter for termJsonLexContext() to signal
that the copy can be omitted ?
+#ifdef FRONTEND
+ /* make sure initialization succeeded */
+ if (lex->strval == NULL)
+ return JSON_OUT_OF_MEMORY;
Should PQExpBufferBroken(lex->strval) be used for the check ?
Thanks
Hi,
For v2-0002-libpq-add-OAUTHBEARER-SASL-mechanism.patch :
+ i_init_session(&session);
+
+ if (!conn->oauth_client_id)
+ {
+ /* We can't talk to a server without a client identifier. */
+ appendPQExpBufferStr(&conn->errorMessage,
+ libpq_gettext("no oauth_client_id is set for the connection"));
+ goto cleanup;
Can conn->oauth_client_id check be performed ahead of i_init_session() ?
That way, ```goto cleanup``` can be replaced with return.
+ if (!error_code || (strcmp(error_code, "authorization_pending")
+ && strcmp(error_code, "slow_down")))
What if, in the future, there is error code different from the above two
which doesn't represent "OAuth token retrieval failed" scenario ?
For client_initial_response(),
+ token_buf = createPQExpBuffer();
+ if (!token_buf)
+ goto cleanup;
If token_buf is NULL, there doesn't seem to be anything to free. We can
return directly.
Cheers
On Wed, 2021-08-25 at 15:25 -0700, Zhihong Yu wrote:
Hi,
For v2-0001-common-jsonapi-support-FRONTEND-clients.patch :
+ /* Clean up. */
+ termJsonLexContext(&lex);
At the end of termJsonLexContext(), empty is copied to lex. For stack
based JsonLexContext, the copy seems unnecessary.
Maybe introduce a boolean parameter for termJsonLexContext() to
signal that the copy can be omitted ?
Do you mean heap-based? i.e. destroyJsonLexContext() does an
unnecessary copy before free? Yeah, in that case it's not super useful,
but I think I'd want some evidence that the performance hit matters
before optimizing it.
Are there any other internal APIs that take a boolean parameter like
that? If not, I think we'd probably just want to remove the copy
entirely if it's a problem.
+#ifdef FRONTEND
+ /* make sure initialization succeeded */
+ if (lex->strval == NULL)
+ return JSON_OUT_OF_MEMORY;
Should PQExpBufferBroken(lex->strval) be used for the check ?
It should be okay to continue if the strval is broken but non-NULL,
since it's about to be reset. That has the fringe benefit of allowing
the function to go as far as possible without failing, though that's
probably a pretty weak justification.
In practice, do you think that the probability of success is low enough
that we should just short-circuit and be done with it?
On Wed, 2021-08-25 at 16:24 -0700, Zhihong Yu wrote:
For v2-0002-libpq-add-OAUTHBEARER-SASL-mechanism.patch :
+ i_init_session(&session);
+
+ if (!conn->oauth_client_id)
+ {
+ /* We can't talk to a server without a client identifier. */
+ appendPQExpBufferStr(&conn->errorMessage,
+ libpq_gettext("no oauth_client_id is set for the connection"));
+ goto cleanup;
Can conn->oauth_client_id check be performed ahead
of i_init_session() ? That way, ```goto cleanup``` can be replaced
with return.
Yeah, I think that makes sense. FYI, this is probably one of the
functions that will be rewritten completely once iddawc is removed.
+ if (!error_code || (strcmp(error_code, "authorization_pending")
+ && strcmp(error_code, "slow_down")))
What if, in the future, there is error code different from the above
two which doesn't represent "OAuth token retrieval failed" scenario ?
We'd have to update our code; that would be a breaking change to the
Device Authorization spec. Here's what it says today [1]:
The "authorization_pending" and "slow_down" error codes define
particularly unique behavior, as they indicate that the OAuth client
should continue to poll the token endpoint by repeating the token
request (implementing the precise behavior defined above). If the
client receives an error response with any other error code, it MUST
stop polling and SHOULD react accordingly, for example, by displaying
an error to the user.
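In other words, the client-side decision the spec mandates is mechanical. A rough Python sketch of it (mine, not the patch's C code):

```python
# Polling decision per RFC 8628 section 3.5: keep polling on
# "authorization_pending", keep polling with a longer interval on
# "slow_down" (the spec says to add 5 seconds), stop on anything else.

def next_poll_action(error_code, interval):
    """Return (keep_polling, new_interval) for a token-endpoint error."""
    if error_code == "authorization_pending":
        return (True, interval)
    if error_code == "slow_down":
        return (True, interval + 5)
    return (False, interval)   # any other error: stop and report it
```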
For client_initial_response(),
+ token_buf = createPQExpBuffer();
+ if (!token_buf)
+ goto cleanup;
If token_buf is NULL, there doesn't seem to be anything to free. We
can return directly.
That's true today, but implementations have a habit of changing. I
personally prefer not to introduce too many exit points from a function
that's already using goto. In my experience, that makes future
maintenance harder.
Thanks for the reviews! Have you been able to give the patchset a try
with an OAuth deployment?
--Jacob
[1]: https://datatracker.ietf.org/doc/html/rfc8628#section-3.5
On Thu, Aug 26, 2021 at 9:13 AM Jacob Champion <pchampion@vmware.com> wrote:
On Wed, 2021-08-25 at 15:25 -0700, Zhihong Yu wrote:
Hi,
For v2-0001-common-jsonapi-support-FRONTEND-clients.patch:

+   /* Clean up. */
+   termJsonLexContext(&lex);

At the end of termJsonLexContext(), empty is copied to lex. For
stack-based JsonLexContext, the copy seems unnecessary.
Maybe introduce a boolean parameter for termJsonLexContext() to
signal that the copy can be omitted?

Do you mean heap-based? i.e. destroyJsonLexContext() does an
unnecessary copy before free? Yeah, in that case it's not super useful,
but I think I'd want some evidence that the performance hit matters
before optimizing it.

Are there any other internal APIs that take a boolean parameter like
that? If not, I think we'd probably just want to remove the copy
entirely if it's a problem.

+#ifdef FRONTEND
+   /* make sure initialization succeeded */
+   if (lex->strval == NULL)
+       return JSON_OUT_OF_MEMORY;

Should PQExpBufferBroken(lex->strval) be used for the check?

It should be okay to continue if the strval is broken but non-NULL,
since it's about to be reset. That has the fringe benefit of allowing
the function to go as far as possible without failing, though that's
probably a pretty weak justification.

In practice, do you think that the probability of success is low enough
that we should just short-circuit and be done with it?
Hi,
bq. destroyJsonLexContext() does an unnecessary copy before free? Yeah, in
that case it's not super useful,
but I think I'd want some evidence that the performance hit matters before
optimizing it.
Yes I agree.
bq. In practice, do you think that the probability of success is low enough
that we should just short-circuit and be done with it?
Haven't had a chance to try your patches out yet.
I will leave this to people who are more familiar with OAuth
implementation(s).
bq. I personally prefer not to introduce too many exit points from a
function that's already using goto.
Fair enough.
Cheers
On Thu, Aug 26, 2021 at 04:13:08PM +0000, Jacob Champion wrote:
Do you mean heap-based? i.e. destroyJsonLexContext() does an
unnecessary copy before free? Yeah, in that case it's not super useful,
but I think I'd want some evidence that the performance hit matters
before optimizing it.
As an authentication code path, the impact is minimal and my take on
that would be to keep the code simple. Now if you'd really wish to
stress that without relying on the backend, one simple way is to use
pgbench -C -n with a mostly-empty script (one meta-command) coupled
with some profiling.
--
Michael
On Fri, 2021-08-27 at 11:32 +0900, Michael Paquier wrote:
Now if you'd really wish to
stress that without relying on the backend, one simple way is to use
pgbench -C -n with a mostly-empty script (one meta-command) coupled
with some profiling.
Ah, thanks! I'll add that to the toolbox.
--Jacob
Hi all,
v3 rebases this patchset over the top of Samay's pluggable auth
provider API [1], included here as patches 0001-3. The final patch in
the set ports the server implementation from a core feature to a
contrib module; to switch between the two approaches, simply leave out
that final patch.
There are still some backend changes that must be made to get this
working, as pointed out in 0009, and obviously libpq support still
requires code changes.
--Jacob
[1]: /messages/by-id/CAJxrbyxTRn5P8J-p+wHLwFahK5y56PhK28VOb55jqMO05Y-DJw@mail.gmail.com
Attachments:
- v3-0001-Add-support-for-custom-authentication-methods.patch
- v3-0002-Add-sample-extension-to-test-custom-auth-provider.patch
- v3-0003-Add-tests-for-test_auth_provider-extension.patch
- v3-0004-common-jsonapi-support-FRONTEND-clients.patch
- v3-0005-libpq-add-OAUTHBEARER-SASL-mechanism.patch
- v3-0006-backend-add-OAUTHBEARER-SASL-mechanism.patch
- v3-0007-Add-a-very-simple-authn_id-extension.patch
- v3-0008-Add-pytest-suite-for-OAuth.patch
- v3-0009-contrib-oauth-switch-to-pluggable-auth-API.patch
Hi Jacob,
Thank you for porting this on top of the pluggable auth methods API. I've
addressed the feedback around other backend changes in my latest patch, but
the client side changes still remain. I had a few questions to understand
them better.
(a) What specifically do the client side changes in the patch implement?
(b) Are the changes you made on the client side specific to OAUTH or are
they about making SASL more generic? As an additional question, if someone
wanted to implement something similar on top of your patch, would they
still have to make client side changes?
Regards,
Samay
On Tue, 2022-03-22 at 14:48 -0700, samay sharma wrote:
Thank you for porting this on top of the pluggable auth methods API.
I've addressed the feedback around other backend changes in my latest
patch, but the client side changes still remain. I had a few
questions to understand them better.

(a) What specifically do the client side changes in the patch implement?
Hi Samay,
The client-side changes are an implementation of the OAuth 2.0 Device
Authorization Grant (https://datatracker.ietf.org/doc/html/rfc8628) in
libpq. The majority of the OAuth logic is
handled by the third-party iddawc library.
The server tells the client what OIDC provider to contact, and then
libpq prompts you to log into that provider on your
smartphone/browser/etc. using a one-time code. After you give libpq
permission to act on your behalf, the Bearer token gets sent to libpq
via a direct connection, and libpq forwards it to the server so that
the server can determine whether you're allowed in.
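A rough sketch of that handshake, with the HTTP transport injected as a
callable so none of this reflects the patch's actual iddawc-based code
(all names are illustrative):

```python
import time

def device_flow_login(post, device_endpoint, token_endpoint, client_id):
    """Minimal OAuth 2.0 Device Authorization Grant (RFC 8628) loop.

    `post(url, data)` is any callable that POSTs form data and returns
    the decoded JSON response as a dict; injecting it keeps this
    sketch free of real network code.
    """
    # Step 1: obtain a device code and a one-time user code.
    grant = post(device_endpoint, {"client_id": client_id})
    print("Visit %s and enter the code: %s"
          % (grant["verification_uri"], grant["user_code"]))

    # Step 2: poll the token endpoint until the user approves, or a
    # fatal error code arrives (RFC 8628, section 3.5).
    interval = grant.get("interval", 5)
    while True:
        resp = post(token_endpoint, {
            "grant_type": "urn:ietf:params:oauth:grant-type:device_code",
            "device_code": grant["device_code"],
            "client_id": client_id,
        })
        if "access_token" in resp:
            return resp["access_token"]      # the Bearer token
        if resp.get("error") == "slow_down":
            interval += 5
        elif resp.get("error") != "authorization_pending":
            raise RuntimeError(resp.get("error", "token request failed"))
        time.sleep(interval)
```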
(b) Are the changes you made on the client side specific to OAUTH or
are they about making SASL more generic?
The original patchset included changes to make SASL more generic. Many
of those changes have since been merged, and the remaining code is
mostly OAuth-specific, but there are still improvements to be made.
(And there's some JSON crud to sift through in the first couple of
patches. I'm still mad that the OAUTHBEARER spec requires clients to
parse JSON in the first place.)
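The JSON in question is the error document the server sends back in its
OAUTHBEARER challenge (RFC 7628, section 3.2.2). A minimal sketch of
what a client has to pull out of it (the code is illustrative, not from
the patch):

```python
import json

def parse_oauthbearer_error(challenge):
    """Parse an OAUTHBEARER server failure body (RFC 7628, 3.2.2).

    The server answers a bad token with a JSON document: "status" is
    required and carries the error; "scope" and
    "openid-configuration" are optional hints for retrying.
    """
    doc = json.loads(challenge)
    return {
        "status": doc["status"],                    # REQUIRED
        "scope": doc.get("scope"),                  # OPTIONAL
        "config": doc.get("openid-configuration"),  # OPTIONAL
    }
```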
As an additional question,
if someone wanted to implement something similar on top of your
patch, would they still have to make client side changes?
Any new SASL mechanisms require changes to libpq at this point. You
need to implement a new pg_sasl_mech, modify pg_SASL_init() to select
the mechanism correctly, and add whatever connection string options you
need, along with the associated state in pg_conn. Patch 0004 has all
the client-side magic for OAUTHBEARER.
--Jacob
On Fri, 2022-03-04 at 19:13 +0000, Jacob Champion wrote:
v3 rebases this patchset over the top of Samay's pluggable auth
provider API [1], included here as patches 0001-3.
v4 rebases over the latest version of the pluggable auth patchset
(included as 0001-4). Note that there's a recent conflict as
of d4781d887; use an older commit as the base (or wait for the other
thread to be updated).
--Jacob
Attachments:
- v4-0001-Add-support-for-custom-authentication-methods.patch
- v4-0002-Add-sample-extension-to-test-custom-auth-provider.patch
- v4-0003-Add-tests-for-test_auth_provider-extension.patch
- v4-0004-Add-support-for-map-and-custom-auth-options.patch
- v4-0005-common-jsonapi-support-FRONTEND-clients.patch
- v4-0006-libpq-add-OAUTHBEARER-SASL-mechanism.patch
- v4-0007-backend-add-OAUTHBEARER-SASL-mechanism.patch
- v4-0008-Add-a-very-simple-authn_id-extension.patch
- v4-0009-Add-pytest-suite-for-OAuth.patch
- v4-0010-contrib-oauth-switch-to-pluggable-auth-API.patch
Hi Hackers,
We are trying to implement AAD (Azure AD) support in PostgreSQL, which
can be achieved with OAuth support. To support AAD on top of OAuth in a
generic fashion (i.e. for all other OAuth providers as well), we are
proposing this patch. It basically exposes two new hooks (one for error
reporting and one for OAuth provider specific token validation) and
passes the OAuth bearer token to the backend. It also adds support for
the client credentials flow of OAuth in addition to the device code
flow which Jacob has proposed.
The changes for each component are summarized below.
1. Provider-specific extension:
Each OAuth provider implements their own token validator as an
extension. Extension registers an OAuth provider hook which is matched
to a line in the HBA file.
2. Add support to pass on the OAuth bearer token. In this
obtaining the bearer token is left to 3rd party application or user.
./psql -U <username> -d 'dbname=postgres
    oauth_client_id=<client_id> oauth_bearer_token=<token>'
3. HBA: An additional param ‘provider’ is added for the oauth method.
Defining "oauth" as method + passing provider, issuer endpoint
and expected audience
* * * * oauth provider=<token validation extension>
        issuer=.... scope=....
4. Engine Backend:
Support for generic OAUTHBEARER type, requesting client to
provide token and passing to token for provider-specific extension.
5. Engine Frontend: Two-tiered approach.
a) libpq transparently passes on the token received
from 3rd party client as is to the backend.
b) libpq optionally compiled for the clients which
explicitly need libpq to orchestrate OAuth communication with the
issuer (it depends heavily on 3rd party library iddawc as Jacob
already pointed out. The library seems to be supporting all the OAuth
flows.)
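To make item 3 concrete, a pg_hba.conf line under this proposal might
look something like the following (all values invented for
illustration):

```
# TYPE  DATABASE  USER  ADDRESS  METHOD  [OPTIONS]
host    all       all   all      oauth   provider=my_validator issuer="https://login.example.com/v2.0" scope="openid"
```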
Please let us know your thoughts as the proposed method supports
different OAuth flows with the use of provider specific hooks. We
think that the proposal would be useful for various OAuth providers.
Thanks,
Mahendrakar.
On Tue, 20 Sept 2022 at 10:18, Jacob Champion <pchampion@vmware.com> wrote:
On Tue, 2021-06-22 at 23:22 +0000, Jacob Champion wrote:
On Fri, 2021-06-18 at 11:31 +0300, Heikki Linnakangas wrote:
A few small things caught my eye in the backend oauth_exchange function:
+   /* Handle the client's initial message. */
+   p = strdup(input);

this strdup() should be pstrdup().
Thanks, I'll fix that in the next re-roll.
In the same function, there are a bunch of reports like this:

    ereport(ERROR,
+           (errcode(ERRCODE_PROTOCOL_VIOLATION),
+            errmsg("malformed OAUTHBEARER message"),
+            errdetail("Comma expected, but found character \"%s\".",
+                      sanitize_char(*p))));

I don't think the double quotes are needed here, because sanitize_char
will return quotes if it's a single character. So it would end up
looking like this: ... found character "'x'".

I'll fix this too. Thanks!
v2, attached, incorporates Heikki's suggested fixes and also rebases on
top of latest HEAD, which had the SASL refactoring changes committed
last month.

The biggest change from the last patchset is 0001, an attempt at
enabling jsonapi in the frontend without the use of palloc(), based on
suggestions by Michael and Tom from last commitfest. I've also made
some improvements to the pytest suite. No major changes to the OAuth
implementation yet.

--Jacob
Attachments:
- v1-0001-oauth-provider-support.patch
Hi Mahendrakar, thanks for your interest and for the patch!
On Mon, Sep 19, 2022 at 10:03 PM mahendrakar s
<mahendrakarforpg@gmail.com> wrote:
The changes for each component are summarized below.
1. Provider-specific extension:
Each OAuth provider implements their own token validator as an
extension. Extension registers an OAuth provider hook which is matched
to a line in the HBA file.
How easy is it to write a Bearer validator using C? My limited
understanding was that most providers were publishing libraries in
higher-level languages.
Along those lines, sample validators will need to be provided, both to
help in review and to get the pytest suite green again. (And coverage
for the new code is important, too.)
2. Add support to pass on the OAuth bearer token. In this
obtaining the bearer token is left to 3rd party application or user.

./psql -U <username> -d 'dbname=postgres
    oauth_client_id=<client_id> oauth_bearer_token=<token>'
This hurts, but I think people are definitely going to ask for it, given
the frightening practice of copy-pasting these (incredibly sensitive
secret) tokens all over the place... Ideally I'd like to implement
sender constraints for the Bearer token, to *prevent* copy-pasting (or,
you know, outright theft). But I'm not sure that sender constraints are
well-implemented yet for the major providers.
3. HBA: An additional param ‘provider’ is added for the oauth method.
Defining "oauth" as method + passing provider, issuer endpoint
and expected audience:

* * * * oauth provider=<token validation extension>
        issuer=.... scope=....
Naming aside (this conflicts with Samay's previous proposal, I think), I
have concerns about the implementation. There's this code:
+   if (oauth_provider && oauth_provider->name)
+   {
+       ereport(ERROR,
+               (errmsg("OAuth provider \"%s\" is already loaded.",
+                       oauth_provider->name)));
+   }
which appears to prevent loading more than one global provider. But
there's also code that deals with a provider list? (Again, it'd help to
have test code covering the new stuff.)
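For what it's worth, a registry keyed by provider name would let
multiple validators coexist, with the HBA "provider=" option selecting
one. Here's the idea as a Python sketch only (the real hook would be C,
and every name here is invented):

```python
# Sketch of a per-name provider registry, so loading a second
# validator extension doesn't clash with the first.

_providers = {}

def register_oauth_provider(name, validator):
    """Called by each validator extension at load time."""
    if name in _providers:
        raise ValueError('OAuth provider "%s" is already loaded' % name)
    _providers[name] = validator

def validate_token(provider_name, token, role):
    """Dispatch to the validator named in the matching HBA line."""
    try:
        validator = _providers[provider_name]
    except KeyError:
        raise LookupError('no OAuth provider named "%s"' % provider_name)
    return validator(token, role)
```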
b) libpq optionally compiled for the clients which
explicitly need libpq to orchestrate OAuth communication with the
issuer (it depends heavily on 3rd party library iddawc as Jacob
already pointed out. The library seems to be supporting all the OAuth
flows.)
Speaking of iddawc, I don't think it's a dependency we should choose to
rely on. For all the code that it has, it doesn't seem to provide
compatibility with several real-world providers.
Google, for one, chose not to follow the IETF spec it helped author, and
iddawc doesn't support its flavor of Device Authorization. At another
point, I think iddawc tried to decode Azure's Bearer tokens, which is
incorrect...
I haven't been able to check if those problems have been fixed in a
recent version, but if we're going to tie ourselves to a huge
dependency, I'd at least like to believe that said dependency is
battle-tested and solid, and personally I don't feel like iddawc is.
- auth_method = I_TOKEN_AUTH_METHOD_NONE;
- if (conn->oauth_client_secret && *conn->oauth_client_secret)
- auth_method = I_TOKEN_AUTH_METHOD_SECRET_BASIC;
This code got moved, but I'm not sure why? It doesn't appear to have
made a change to the logic.
+   if (conn->oauth_client_secret && *conn->oauth_client_secret)
+   {
+       session_response_type = I_RESPONSE_TYPE_CLIENT_CREDENTIALS;
+   }
Is this an Azure-specific requirement? Ideally a public client (which
psql is) shouldn't have to provide a secret to begin with, if I
understand that bit of the protocol correctly. I think Google also
required provider-specific changes in this part of the code, and
unfortunately I don't think they looked the same as yours.
We'll have to figure all that out... Standards are great; everyone has
one of their own. :)
Thanks,
--Jacob
On Tue, Sep 20, 2022 at 4:19 PM Jacob Champion <jchampion@timescale.com> wrote:
2. Add support to pass on the OAuth bearer token. In this
obtaining the bearer token is left to 3rd party application or user.

./psql -U <username> -d 'dbname=postgres
    oauth_client_id=<client_id> oauth_bearer_token=<token>'

This hurts, but I think people are definitely going to ask for it, given
the frightening practice of copy-pasting these (incredibly sensitive
secret) tokens all over the place...
After some further thought -- in this case, you already have an opaque
Bearer token (and therefore you already know, out of band, which
provider needs to be used), you're willing to copy-paste it from
whatever service you got it from, and you have an extension plugged
into Postgres on the backend that verifies this Bearer blob using some
procedure that Postgres knows nothing about.
Why do you need the OAUTHBEARER mechanism logic at that point? Isn't
that identical to a custom password scheme? It seems like that could
be handled completely by Samay's pluggable auth proposal.
--Jacob