[oauth] Stabilize the libpq-oauth ABI (and allow alternative implementations?)

Started by Jacob Champion4 months ago28 messageshackers
Jump to latest
#1Jacob Champion
jacob.champion@enterprisedb.com

Hi everybody,

We introduced the libpq-oauth module late in the cycle for PG18, and
to put it mildly, its interface isn't great. The original
implementation depended on libpq internals, and we had to make sure
that we didn't start crashing during major or minor version upgrades.
So there were a bunch of compromises made to keep things safe,
including the restriction that the module name has to change every
major release.

Separately, but closely related: PG18's OAuth support allows you to
customize the client flow via a libpq hook function. Third-party
applications can make use of that, but our own utilities can still
only use the builtin device flow. That's annoying.

I've been working to replace the internal ABI with a stable one,
hopefully to solve both problems at the same time. A dlopen() is a
pretty clear seam for other people to use to modify and extend.
Unfortunately my first attempt (not pictured) ended up in a rabbit
hole, because I tried to tackle the third-party use case first. My
second attempt, attached, focuses on the ABI stabilization instead,
which I think is more likely to succeed.

(This took enough thinking that I'm really glad we didn't try this for
PG18. Thanks for letting me take on some technical debt for a bit.)

= Design =

Here's the train of thought behind the core changes (which are in patch 0004):

The builtin-flow code in fe-auth-oauth.c is similar to the custom-flow
code, but it's just ever-so-slightly different. I'd like to unify the
two, so I want libpq-oauth to make use of the public
PGoauthBearerRequest API, and that means that almost all of the
injections made in the PG18 ABI need to be replaced.

Most of those injections are simply subsumed by the hook API
(hooray!). A couple of others can be replaced by PQconninfo(). Four
are left over:
- pgthreadlock_t
- libpq_gettext
- conn->errorMessage
- conn->oauth_issuer_id

I think we should keep injecting libpq_gettext; no third-party
implementations would be able to use that. And application hooks are
presumably capable of figuring out top-level locking already, since
the application has to have called PQregisterThreadLock() if it needed
to coordinate with libpq.

That leaves error messages and the issuer identifier. I think both
would be useful for hooks to have, too, so I'd like to add them to
PQAUTHDATA_OAUTH_BEARER_TOKEN.

= PQAUTHDATA_OAUTH_BEARER_TOKEN, version 2 =

My original plan for authdata extensions was to add new members to the
end of the structs that libpq passes into the hook. Applications would
gate on a feature macro during compilation to see whether they could
use the new members. That should work fine for an application hook;
you're not allowed to downgrade libpq past the version that your
applications are compiled against, lest you lose symbols (or other
feature-flag functionality) you're relying on.

Plugins, unfortunately, can't rely on the feature macro. As we found
out during the libpq-oauth split [1]/messages/by-id/aAkJnDQq3mOUvmQV@msg.df7cb.de, we have to handle a long-running
application with an old libpq that loads an upgraded libpq-oauth, even
if the OS package dependencies suggest otherwise. (A plugin
architecture flips the direction of the runtime dependency arrow.)

There are a couple ways we could handle this. For this draft, I've
implemented what I think is a middle ground between verbosity and
type-safety: introduce a new V2 struct that "inherits" the V1 struct
and can be down-cast in the callbacks, kinda similar to our Node
hierarchy. We could go even more verbose, and duplicate the entire
PGoauthBearerToken struct -- matching the callback parameter types for
maximum safety -- but I'm not convinced that this would be a good use
of maintenance effort. The ability to cast between the two means we
can share v1 and v2 implementations in our tests.

We could also just add the new members at the end, say that you're
only allowed to use them if the V2 hook type is in use, and scribble
on them in V1 hooks to try to get misbehaving implementations to crash
outright. This arguably has the same amount of type-safety as the
downcast, and the resulting code looks nicer. (The libcurl API we use
does something similar with curl_version_info().) But it is definitely
more "magic".

Also of note: this adds a PQExpBuffer to libpq-fe.h. Technically, that
type is "internal". But... is it really, though? It doesn't seem
possible for us to make incompatible changes there without crashing
earlier psqls, in which case I would like to make use of it too. Maybe
this deserves its own minithread.

Okay, on to the full patchset.

= Roadmap: Prep =

The first few patches are bugfixes I intend to backpatch to 18.

- 0001: I stomped on the SOCKTYPE name in libpq-fe.h, but that's not
in our namespace and it's conceivable that it might collide with
someone else. (It collided with my own test code during my work on
this.)
- 0002 fixes a copy-paste bug in meson.build, which luckily hadn't
caused problems yet.
- 0003 ports Tom's debug2 fix for Test::Cluster::connect_fails() over
to 002_client.pl. (Currently, log checks in this test aren't made
after connection failures, but I don't really want to chase that down
later after I've forgotten about it.)

= Roadmap: Implementation =

Next three patches are the core implementation, which stabilizes the
ABI for libpq-oauth. I feel fairly confident that this, or something
close to it, could land in PG19.

- 0004 introduces the new PQAUTHDATA_OAUTH_BEARER_TOKEN_V2 API.

As described above.

- 0005 makes use of the new API in libpq-oauth.

This removes more code than it introduces, which is exciting.

Now we can rename libpq-oauth-19.so to just libpq-oauth.so, since we
no longer depend on anything that could change between major versions.
(We still need lock and locale support from libpq, as mentioned above,
so they continue to be decoupled via injection at init time.)

Some of the code in this patch overlaps with the translation fixes
thread [2]/messages/by-id/TY4PR01MB1690746DB91991D1E9A47F57E94CDA@TY4PR01MB16907.jpnprd01.prod.outlook.com, which I need to get to first. I'm hoping some additional
simplifications can be made after those localization bugs are fixed.

I think I'd also like to get the threadlock into the module API (but
not the hook API). A third-party flow might need to perform its own
one-time initialization, and if it relies on an old version of Curl
(or worse, Kerberos [3]/messages/by-id/aSSp03wmNMngi/Oe@ubby), it'll need to use the same lock that the
top-level application has registered for libpq. So I imagine I'll need
to break out an initialization function here. Alternatively, I could
introduce an API into libpq to retrieve the threadlock function in
use?

- 0006 removes a potential ABI-compatibility pitfall for future devs.

libpq-oauth needs to use the shared-library variant of libpgcommon,
but it can no longer assume that the encoding support exported by
libpq is compatible. So it must not accidentally link against those
functions (see [4]https://postgr.es/c/b6c7cfac8). I don't imagine anyone will try adding code that
does this in practice; we're pretty UTF8-centric in OAuth. But just to
be safe, define USE_PRIVATE_ENCODING_FUNCS so that anyone who tries
will fail the build.

= Roadmap: Plugins? =

So now we have a stable ABI, which technically means that any
enterprising developer who wants to play games with LD_LIBRARY_PATH
could implement their own version of an OAuth flow, and have our
utilities make use of it into the future.

That brings us to patch 0007, which experimentally promotes the stable
API to a public header, and introduces a really janky environment
variable so that people don't have to play games. It will be obvious
from the code that this is not well-baked yet.

I hope the ability to dlopen() a custom flow can make it for 19; I
just don't know that this envvar approach is any good. The ideal
situation, IMO, is for a flow to be selected in the connection string.
But we have to lock that down, similarly to how we protect
local_preload_libraries, to prevent horrible exploits. At which point
we'll have essentially designed a generic libpq plugin system. Not
necessarily a terrible thing, but I don't think I have time to take it
on for PG19.

WDYT?
--Jacob

[1]: /messages/by-id/aAkJnDQq3mOUvmQV@msg.df7cb.de
[2]: /messages/by-id/TY4PR01MB1690746DB91991D1E9A47F57E94CDA@TY4PR01MB16907.jpnprd01.prod.outlook.com
[3]: /messages/by-id/aSSp03wmNMngi/Oe@ubby
[4]: https://postgr.es/c/b6c7cfac8

Attachments:

0001-libpq-fe.h-Don-t-claim-SOCKTYPE-in-the-global-namesp.patchapplication/octet-stream; name=0001-libpq-fe.h-Don-t-claim-SOCKTYPE-in-the-global-namesp.patchDownload+5-6
0002-libpq-oauth-use-correct-c_args-in-meson.build.patchapplication/octet-stream; name=0002-libpq-oauth-use-correct-c_args-in-meson.build.patchDownload+1-2
0003-oauth_validator-Avoid-races-in-log_check.patchapplication/octet-stream; name=0003-oauth_validator-Avoid-races-in-log_check.patchDownload+18-7
0004-libpq-Introduce-PQAUTHDATA_OAUTH_BEARER_TOKEN_V2.patchapplication/octet-stream; name=0004-libpq-Introduce-PQAUTHDATA_OAUTH_BEARER_TOKEN_V2.patchDownload+324-36
0005-libpq-oauth-Use-the-PGoauthBearerRequestV2-API.patchapplication/octet-stream; name=0005-libpq-oauth-Use-the-PGoauthBearerRequestV2-API.patchDownload+329-407
0006-libpq-oauth-Never-link-against-libpq-s-encoding-func.patchapplication/octet-stream; name=0006-libpq-oauth-Never-link-against-libpq-s-encoding-func.patchDownload+18-4
0007-WIP-Introduce-third-party-OAuth-flow-plugins.patchapplication/octet-stream; name=0007-WIP-Introduce-third-party-OAuth-flow-plugins.patchDownload+649-370
#2Chao Li
li.evan.chao@gmail.com
In reply to: Jacob Champion (#1)
Re: [oauth] Stabilize the libpq-oauth ABI (and allow alternative implementations?)

On Dec 10, 2025, at 05:00, Jacob Champion <jacob.champion@enterprisedb.com> wrote:

Hi everybody,

We introduced the libpq-oauth module late in the cycle for PG18, and
to put it mildly, its interface isn't great. The original
implementation depended on libpq internals, and we had to make sure
that we didn't start crashing during major or minor version upgrades.
So there were a bunch of compromises made to keep things safe,
including the restriction that the module name has to change every
major release.

Separately, but closely related: PG18's OAuth support allows you to
customize the client flow via a libpq hook function. Third-party
applications can make use of that, but our own utilities can still
only use the builtin device flow. That's annoying.

I've been working to replace the internal ABI with a stable one,
hopefully to solve both problems at the same time. A dlopen() is a
pretty clear seam for other people to use to modify and extend.
Unfortunately my first attempt (not pictured) ended up in a rabbit
hole, because I tried to tackle the third-party use case first. My
second attempt, attached, focuses on the ABI stabilization instead,
which I think is more likely to succeed.

(This took enough thinking that I'm really glad we didn't try this for
PG18. Thanks for letting me take on some technical debt for a bit.)

= Design =

Here's the train of thought behind the core changes (which are in patch 0004):

The builtin-flow code in fe-auth-oauth.c is similar to the custom-flow
code, but it's just ever-so-slightly different. I'd like to unify the
two, so I want libpq-oauth to make use of the public
PGoauthBearerRequest API, and that means that almost all of the
injections made in the PG18 ABI need to be replaced.

Most of those injections are simply subsumed by the hook API
(hooray!). A couple of others can be replaced by PQconninfo(). Four
are left over:
- pgthreadlock_t
- libpq_gettext
- conn->errorMessage
- conn->oauth_issuer_id

I think we should keep injecting libpq_gettext; no third-party
implementations would be able to use that. And application hooks are
presumably capable of figuring out top-level locking already, since
the application has to have called PQregisterThreadLock() if it needed
to coordinate with libpq.

That leaves error messages and the issuer identifier. I think both
would be useful for hooks to have, too, so I'd like to add them to
PQAUTHDATA_OAUTH_BEARER_TOKEN.

= PQAUTHDATA_OAUTH_BEARER_TOKEN, version 2 =

My original plan for authdata extensions was to add new members to the
end of the structs that libpq passes into the hook. Applications would
gate on a feature macro during compilation to see whether they could
use the new members. That should work fine for an application hook;
you're not allowed to downgrade libpq past the version that your
applications are compiled against, lest you lose symbols (or other
feature-flag functionality) you're relying on.

Plugins, unfortunately, can't rely on the feature macro. As we found
out during the libpq-oauth split [1], we have to handle a long-running
application with an old libpq that loads an upgraded libpq-oauth, even
if the OS package dependencies suggest otherwise. (A plugin
architecture flips the direction of the runtime dependency arrow.)

There are a couple ways we could handle this. For this draft, I've
implemented what I think is a middle ground between verbosity and
type-safety: introduce a new V2 struct that "inherits" the V1 struct
and can be down-cast in the callbacks, kinda similar to our Node
hierarchy. We could go even more verbose, and duplicate the entire
PGoauthBearerToken struct -- matching the callback parameter types for
maximum safety -- but I'm not convinced that this would be a good use
of maintenance effort. The ability to cast between the two means we
can share v1 and v2 implementations in our tests.

We could also just add the new members at the end, say that you're
only allowed to use them if the V2 hook type is in use, and scribble
on them in V1 hooks to try to get misbehaving implementations to crash
outright. This arguably has the same amount of type-safety as the
downcast, and the resulting code looks nicer. (The libcurl API we use
does something similar with curl_version_info().) But it is definitely
more "magic".

Also of note: this adds a PQExpBuffer to libpq-fe.h. Technically, that
type is "internal". But... is it really, though? It doesn't seem
possible for us to make incompatible changes there without crashing
earlier psqls, in which case I would like to make use of it too. Maybe
this deserves its own minithread.

Okay, on to the full patchset.

= Roadmap: Prep =

The first few patches are bugfixes I intend to backpatch to 18.

- 0001: I stomped on the SOCKTYPE name in libpq-fe.h, but that's not
in our namespace and it's conceivable that it might collide with
someone else. (It collided with my own test code during my work on
this.)
- 0002 fixes a copy-paste bug in meson.build, which luckily hadn't
caused problems yet.
- 0003 ports Tom's debug2 fix for Test::Cluster::connect_fails() over
to 002_client.pl. (Currently, log checks in this test aren't made
after connection failures, but I don't really want to chase that down
later after I've forgotten about it.)

= Roadmap: Implementation =

Next three patches are the core implementation, which stabilizes the
ABI for libpq-oauth. I feel fairly confident that this, or something
close to it, could land in PG19.

- 0004 introduces the new PQAUTHDATA_OAUTH_BEARER_TOKEN_V2 API.

As described above.

- 0005 makes use of the new API in libpq-oauth.

This removes more code than it introduces, which is exciting.

Now we can rename libpq-oauth-19.so to just libpq-oauth.so, since we
no longer depend on anything that could change between major versions.
(We still need lock and locale support from libpq, as mentioned above,
so they continue to be decoupled via injection at init time.)

Some of the code in this patch overlaps with the translation fixes
thread [2], which I need to get to first. I'm hoping some additional
simplifications can be made after those localization bugs are fixed.

I think I'd also like to get the threadlock into the module API (but
not the hook API). A third-party flow might need to perform its own
one-time initialization, and if it relies on an old version of Curl
(or worse, Kerberos [3]), it'll need to use the same lock that the
top-level application has registered for libpq. So I imagine I'll need
to break out an initialization function here. Alternatively, I could
introduce an API into libpq to retrieve the threadlock function in
use?

- 0006 removes a potential ABI-compatibility pitfall for future devs.

libpq-oauth needs to use the shared-library variant of libpgcommon,
but it can no longer assume that the encoding support exported by
libpq is compatible. So it must not accidentally link against those
functions (see [4]). I don't imagine anyone will try adding code that
does this in practice; we're pretty UTF8-centric in OAuth. But just to
be safe, define USE_PRIVATE_ENCODING_FUNCS so that anyone who tries
will fail the build.

= Roadmap: Plugins? =

So now we have a stable ABI, which technically means that any
enterprising developer who wants to play games with LD_LIBRARY_PATH
could implement their own version of an OAuth flow, and have our
utilities make use of it into the future.

That brings us to patch 0007, which experimentally promotes the stable
API to a public header, and introduces a really janky environment
variable so that people don't have to play games. It will be obvious
from the code that this is not well-baked yet.

I hope the ability to dlopen() a custom flow can make it for 19; I
just don't know that this envvar approach is any good. The ideal
situation, IMO, is for a flow to be selected in the connection string.
But we have to lock that down, similarly to how we protect
local_preload_libraries, to prevent horrible exploits. At which point
we'll have essentially designed a generic libpq plugin system. Not
necessarily a terrible thing, but I don't think I have time to take it
on for PG19.

WDYT?
--Jacob

[1] /messages/by-id/aAkJnDQq3mOUvmQV@msg.df7cb.de
[2] /messages/by-id/TY4PR01MB1690746DB91991D1E9A47F57E94CDA@TY4PR01MB16907.jpnprd01.prod.outlook.com
[3] /messages/by-id/aSSp03wmNMngi/Oe@ubby
[4] https://postgr.es/c/b6c7cfac8
<0001-libpq-fe.h-Don-t-claim-SOCKTYPE-in-the-global-namesp.patch><0002-libpq-oauth-use-correct-c_args-in-meson.build.patch><0003-oauth_validator-Avoid-races-in-log_check.patch><0004-libpq-Introduce-PQAUTHDATA_OAUTH_BEARER_TOKEN_V2.patch><0005-libpq-oauth-Use-the-PGoauthBearerRequestV2-API.patch><0006-libpq-oauth-Never-link-against-libpq-s-encoding-func.patch><0007-WIP-Introduce-third-party-OAuth-flow-plugins.patch>

Hi Jacob,

This is a solid patch set. Only a few small comments:

1 - 0001
```
 /* for PGoauthBearerRequest.async() */
 #ifdef _WIN32
-#define SOCKTYPE uintptr_t		/* avoids depending on winsock2.h for SOCKET */
+#define PQ_SOCKTYPE uintptr_t	/* avoids depending on winsock2.h for SOCKET */
 #else
-#define SOCKTYPE int
+#define PQ_SOCKTYPE int
 #endif
```

The commit message has explained why SOCKTYPE is temporary and the reason why adding prefix “PG_” is to avoid collisions. But I don’t think code readers will always read commit messages, given the macro is a local and temporary, why adding a prefix starting with a underscore, like “_PQ_SOCKTYPE”, which explicitly indicates the macro is kinda private.

===
0002 & 0003 Looks good.
===

2 - 0004
```
+ * Helper for handling user flow failures. If the implementation put anything
+ * into request->error, it's added to conn->errorMessage here.
```

Typo: put -> puts

3 - 0004
```
+# Make sure the v1 hook continues to work. */
+test(
```

“*/“ in the end of the comment line seems a typo.

4 - 0005
```
+	PQAUTHDATA_OAUTH_BEARER_TOKEN,	/* server requests an OAuth Bearer token
+									 * (prefer v2, below, instead) */
```
"(prefer v2, below, instead)" looks confusing to me, though I can understand what it means. Maybe make it clearer, like “(v2 is preferred; see below)"

===
0006 Looks good.
===

===
Not reviewing 0007 as it marks with WIP.
===

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/

#3Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Chao Li (#2)
Re: [oauth] Stabilize the libpq-oauth ABI (and allow alternative implementations?)

On Thu, Dec 11, 2025 at 11:43 PM Chao Li <li.evan.chao@gmail.com> wrote:

This is a solid patch set. Only a few small comments:

Thanks for the review!

The commit message has explained why SOCKTYPE is temporary and the reason why adding prefix “PG_” is to avoid collisions. But I don’t think code readers will always read commit messages, given the macro is a local and temporary, why adding a prefix starting with a underscore, like “_PQ_SOCKTYPE”, which explicitly indicates the macro is kinda private.

_PQ_SOCKTYPE is reserved (starts with _P), but I could add more
explanatory comments if you think that'd be useful. See v2-0001, which
now includes an explanation of the signature in the documentation.

The hard part is that I don't want to require all Windows clients of
libpq-fe.h to have to depend on Winsock; that's the only reason for
this oddity. Otherwise I'd declare PGsocket as the public version of
our internal pgsocket and call it a day.

+ * Helper for handling user flow failures. If the implementation put anything
+ * into request->error, it's added to conn->errorMessage here.
```

Typo: put -> puts

Past tense was my intent, but I've reworded to avoid any garden paths:
"If anything was put into request->error, it's added to
conn->errorMessage here."

“*/“ in the end of the comment line seems a typo.

Thanks, no idea why I did that.

"(prefer v2, below, instead)" looks confusing to me, though I can understand what it means. Maybe make it clearer, like “(v2 is preferred; see below)"

Done.

--

v2 makes these changes and rebases over latest HEAD. I'll plan to get
0001-3 in this week; probably tomorrow.

Open questions remain:
1) 0004: Any objections to putting PQExpBuffer into libpq-fe.h?
2) 0004: Thoughts on the v2 inheritance struct style as opposed to
relying on implementations to double-check the struct length?
3) 0005: Should I add the thread lock to an init() API, or expose a
new PQgetThreadLock() that other code can use?
4) 0007: [all of it]

My personal thoughts on these:
1) it's fine
2) it's a coin flip for me; inheritance is ugly, length magic is scary
3) I like the idea of PQgetThreadLock() so that we don't have to
inject it everywhere it could possibly be needed

Thanks,
--Jacob

Attachments:

since-v1.diff.txttext/plain; charset=US-ASCII; name=since-v1.diff.txtDownload
v2-0001-libpq-fe.h-Don-t-claim-SOCKTYPE-in-the-global-nam.patchapplication/octet-stream; name=v2-0001-libpq-fe.h-Don-t-claim-SOCKTYPE-in-the-global-nam.patchDownload+19-9
v2-0002-libpq-oauth-use-correct-c_args-in-meson.build.patchapplication/octet-stream; name=v2-0002-libpq-oauth-use-correct-c_args-in-meson.build.patchDownload+1-2
v2-0003-oauth_validator-Avoid-races-in-log_check.patchapplication/octet-stream; name=v2-0003-oauth_validator-Avoid-races-in-log_check.patchDownload+18-7
v2-0004-libpq-Introduce-PQAUTHDATA_OAUTH_BEARER_TOKEN_V2.patchapplication/octet-stream; name=v2-0004-libpq-Introduce-PQAUTHDATA_OAUTH_BEARER_TOKEN_V2.patchDownload+324-36
v2-0005-libpq-oauth-Use-the-PGoauthBearerRequestV2-API.patchapplication/octet-stream; name=v2-0005-libpq-oauth-Use-the-PGoauthBearerRequestV2-API.patchDownload+327-405
v2-0006-libpq-oauth-Never-link-against-libpq-s-encoding-f.patchapplication/octet-stream; name=v2-0006-libpq-oauth-Never-link-against-libpq-s-encoding-f.patchDownload+18-4
v2-0007-WIP-Introduce-third-party-OAuth-flow-plugins.patchapplication/octet-stream; name=v2-0007-WIP-Introduce-third-party-OAuth-flow-plugins.patchDownload+649-370
#4Zsolt Parragi
zsolt.parragi@percona.com
In reply to: Jacob Champion (#3)
Re: [oauth] Stabilize the libpq-oauth ABI (and allow alternative implementations?)

Hello!

The patches look good, I don't have comments for them directly.

I also tried out a simple client-server plugin pair with them, both
built as external plugins, and it works.

I have some practical findings based on that:

1. If I provide an incorrect library path to psql, it suggests that I
should install the libpq-oauth package. It would be better to tell the
user that PGOAUTHMODULE is invalid in this case. For example:

PGOAUTHMODULE=/not/existing/path/to/oauth42_flow.so bin/psql
"oauth_issuer=... oauth_client_id=..."
psql: error: connection to server at "127.0.0.1", port 5432 failed: no
OAuth flows are available (try installing the libpq-oauth package)

2. Even if I use a custom library (which doesn't default to the usual
get the discovery json - do a device flow - etc flow), it tries to
construct/validate the well known discovery url. Why does it do that,
if it doesn't try to use the URL directly? The only validation done
inside libpq is now to validate that the URL matches the server URL.
The server doesn't do this validation at all, it accepts any string I
provide and starts up. If we want this URL validation, shouldn't it
happen in the server, and in the client side oauth plugin? (Not doing
it on the server but enforcing it on the client seems like a strange
choice, as typos/misconfigurations will need server restarts)

3. Is it really still OAuth, and not a generic pluggable SASLBEARER
authentication? Yes, I still have to provide the "oauth_issuer" and
"oauth_client_id" parameters, but I don't have to do anything with it.
I can implement any client side authentication I want in a libpq
plugin, as long as I am able to verify it on the server side by
sending a single token using SASL. That token doesn't have to be an
OAuth token, because I can change both the creation and validation
part. So why don't you call this SASLBEARER, with the default provided
implementation being OAuthBearer, that seems to be a better fitting
name for it? (this was already the case for custom applications with
PQsetAuthDataHook, and because of that I already wanted to ask this
question, but now I can also misuse it in psql/other command line
tools/existing applications/...)

4. Have you thought about parameter passing, if my custom plugin needs
extra configuration? Especially related to the above question, but
even in the scope of OIDC. For example there's my next 5th question.
Or if used by a webpage and it has to pass a callback URL to the
provider.

5. I was about to submit a separate patch we got some requests about:
a way to supply an OAuth token directly to psql (PGOAUTHTOKEN=...
bin/psql ...), by implementing PQsetAuthDataHook in psql/other command
line tools. (Multiple users asked for this) I realized that I could
possibly implement that using this new plugin API, and keep it outside
of the core, but from a user experience perspective simply using an
environment variable without custom plugins could be better. What
direction would you prefer with that? (and this is related to the
previous question with again the additional parameter - the custom
user provided token)

6. Still about PQsetAuthDataHook, this seems to have a limited use
case: if I am in the control of the application, writing it,
PQsetAuthDataHook seems to be a better choice. These plugins are
useful if I have to modify an existing generic application, to use a
different authentication method, and that seems to be useful,
especially if I treat it as a generic SASLBEARER API. But this also
goes back to the security questions I raised in another thread: could
an application say that it doesn't want to use plugins? Could it be
configured in another way, other than environment variables? (if an
application has a config file for example, it seems to be more
practical to make this plugin part of that config file, instead of
relying on a completely different environment variable)

#5Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Zsolt Parragi (#4)
Re: [oauth] Stabilize the libpq-oauth ABI (and allow alternative implementations?)

On Sat, Jan 17, 2026 at 1:37 AM Zsolt Parragi <zsolt.parragi@percona.com> wrote:

The patches look good, I don't have comments for them directly.

I also tried out a simple client-server plugin pair with them, both
built as external plugins, and it works.

Thanks very much for the review!

I have some practical findings based on that:

1. If I provide an incorrect library path to psql, it suggests that I
should install the libpq-oauth package. It would be better to tell the
user that PGOAUTHMODULE is invalid in this case.

Agreed, will fix.

2. Even if I use a custom library (which doesn't default to the usual
get the discovery json - do a device flow - etc flow), it tries to
construct/validate the well known discovery url. Why does it do that,
if it doesn't try to use the URL directly? The only validation done
inside libpq is now to validate that the URL matches the server URL.
The server doesn't do this validation at all, it accepts any string I
provide and starts up. If we want this URL validation, shouldn't it
happen in the server, and in the client side oauth plugin?

I'm assuming you're talking about issuer_from_well_known_uri()? That's
a security gate, not just a syntax validation. We still need to avoid
mixup attacks, and I didn't want to punt that responsibility down to
the plugins, because they probably won't do it.

(Does that help clarify enough? One caller has a comment to this
effect, but not the other, and maybe I could add some to the doc
comment for the function itself?)

(Not doing
it on the server but enforcing it on the client seems like a strange
choice, as typos/misconfigurations will need server restarts)

Hopefully you mean reloads, or else there's a bad bug somewhere. As
for the usability request to validate syntax in the server, I agree
that would be good. I think that came up during v18 development...

3. Is it really still OAuth, and not a generic pluggable SASLBEARER
authentication?

It's still OAUTHBEARER, yes. The underlying mechanism is tied to OAuth
via both scopes and OAuth status codes. The latter would become more
apparent if we find use for any resource-server error responses other
than `invalid_token`. `insufficient_scope` might be nice eventually...

Yes, I still have to provide the "oauth_issuer" and
"oauth_client_id" parameters, but I don't have to do anything with it.
I can implement any client side authentication I want in a libpq
plugin, as long as I am able to verify it on the server side by
sending a single token using SASL.

For now, yes. I can't really stop anyone from tunneling magic junk
through Bearer tokens, or LDAP passwords, or anything else opaque.
(Until and unless the governing specs require us to. Things change,
after all; that's the risk you take when you write code on top of a
layering violation.)

But I'm also not very interested in supporting that use case with this
feature. You can already (ab)use PAM this way, as far as I know; you
don't need to assume the extra architectural cost and security checks
of OAUTHBEARER. And the architecture changes for OAuth introduced
enough of the bones of pluggable auth that a particular SASL mechanism
shouldn't *have* to be abused in this way, if what people really want
is pluggable auth.

That token doesn't have to be an
OAuth token, because I can change both the creation and validation
part. So why don't you call this SASLBEARER, with the default provided
implementation being OAuthBearer, that seems to be a better fitting
name for it?

Well, SASLBEARER isn't a SASL mechanism. We didn't invent OAUTHBEARER,
and if a future revision to its spec binds it even more tightly to new
OAuth-specific features, we should feel free to adopt them in our
mechanism implementation.

4. Have you thought about parameter passing, if my custom plugin needs
extra configuration?

Yeah, that's actually part of why I got stuck in my first revision. I
realized I was creating an entirely new ecosystem of stuff for libpq
to worry about, and I needed to walk back the feature scope. So
plugins will need out-of-band config for now, which precludes
per-connection settings.

My idea for the first revision was an oauth_flow connection parameter,
with the syntax

oauth_flow=<plugin_name>
oauth_flow=<plugin_name>:<plugin-specific-parameter-string>

So maybe the default would be oauth_flow=builtin, or
oauth_flow=libpq:device, or oauth_flow=libpq-device, or... And then
third parties could do their own thing.

5. I was about to submit a separate patch we got some requests about:
a way to supply an OAuth token directly to psql (PGOAUTHTOKEN=...
bin/psql ...), by implementing PQsetAuthDataHook in psql/other command
line tools. (Multiple users asked for this)

I'm still fairly opposed to manual token passing. It destroys any
semblance of token secrecy; it makes it trivial to exfiltrate
laterally, via `cat /proc/xxx/environ`; it removes a forcing function
for proper flow support (my weakest argument, to be fair). It'll also
become completely useless as soon as we have sender constraints,
because if you don't have the binding material, you can't use the
token.

Are they asking for this because it'd be an easy way around the v18
flow limitation? Because that's been the primary motivation in the
conversations I've had. I'd rather give them the ability to obtain the
token, in-process, the way they want, and then weird user-specific
tradeoffs are their decision and not ours. We can try to focus on
best-possible-implementation here in libpq.

6. Still about PQsetAuthDataHook, this seems to have a limited use
case: if I am in the control of the application, writing it,
PQsetAuthDataHook seems to be a better choice. These plugins are
useful if I have to modify an existing generic application, to use a
different authentication method, and that seems to be useful,
especially if I treat it as a generic SASLBEARER API. But this also
goes back to the security questions I raised in another thread: could
an application say that it doesn't want to use plugins? Could it be
configured in another way, other than environment variables? (if an
application has a config file for example, it seems to be more
practical to make this plugin part of that config file, instead of
relying on a completely different environment variable)

Yeah, the question of "how do I configure this thing that is nested
five layers deep in my application stack" is everpresent, and I'm not
sure that the world is converging on a solution. (My preference is
"trusted-file-on-disk plus client API" for this, personally. I'm
wrestling with that over at [1]/messages/by-id/CAOYmi+nDMumexG4X4N9_jMU=yEiHOB=3TdYBPr0aYgPVb_TYAw@mail.gmail.com and it's not baked enough to solve
this.)

--Jacob

[1]: /messages/by-id/CAOYmi+nDMumexG4X4N9_jMU=yEiHOB=3TdYBPr0aYgPVb_TYAw@mail.gmail.com

#6Zsolt Parragi
zsolt.parragi@percona.com
In reply to: Jacob Champion (#5)
Re: [oauth] Stabilize the libpq-oauth ABI (and allow alternative implementations?)

Hopefully you mean reloads, or else there's a bad bug somewhere

Yes, reloads - I usually do restarts with local testing, but it works
with reloads.

I'm assuming you're talking about issuer_from_well_known_uri()? That's a security gate, not just a syntax validation.

As for the usability request to validate syntax in the server, I agree that would be good. I think that came up during v18 development...

If it's validated on the server, and the issuer matches, that should
be enough? I'm not saying that we don't match the URL at all, that
part is still needed. When I first tried out the OIDC support I even
hoped that maybe this could be used to select the hba line if we have
multiple issuers.

Well, SASLBEARER isn't a SASL mechanism.

Oops, that's what I get for accepting the ai summary from google
without verifying properly, I should have done that before sending the
previous email.

I can't really stop anyone from tunneling magic junk through Bearer tokens, or LDAP passwords, or anything else opaque.

And the architecture changes for OAuth introduced enough of the bones of pluggable auth that a particular SASL mechanism shouldn't *have* to be abused in this way, if what people really want is pluggable auth.

I don't think LDAP, or anything else is similarly extensible both on
the server and client side? And most of the time, oauth
implementations in other software also aren't this extensible, so they
are more strictly tied to oauth.

And my question was exactly because of this: OAuth introduced mostly
everything needed for pluggable authentication (without PAM - my
previous experience with that is that it is system specific, slow, and
complex), and it is already possible to misuse it for something else.
It would be really nice to have a generic authentication plugin system
in postgres to implement other authentication methods, not just OAuth.

Are they asking for this because it'd be an easy way around the v18 flow limitation? Because that's been the primary motivation in the conversations I've had.

One specific use case I know of is CI, for example GitHub simply
provides you an oauth token as an environment variable.

#7Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Zsolt Parragi (#6)
Re: [oauth] Stabilize the libpq-oauth ABI (and allow alternative implementations?)

On Tue, Jan 20, 2026 at 1:14 PM Zsolt Parragi <zsolt.parragi@percona.com> wrote:

If it's validated on the server, and the issuer matches, that should
be enough?

It's client-side protection against a malicious server; server-side
validation doesn't help. This is why you have to specify an issuer in
your client's connection string (my original patchset just trusted
whatever the server sent, which would have caused serious problems).
See [1]/messages/by-id/CAOYmi+kLTJ1wZ6gxRbgtR52E=EyiCpmp6J3mmSvtc1a6i7sZ3Q@mail.gmail.com for a longer discussion.

If I've misunderstood what you mean, please tell me what function call
in particular you think can be removed.

I don't think LDAP, or anything else is similarly extensible both on
the server and client side?

Any plaintext password method (like LDAP) can tunnel arbitrary data,
just like a Bearer token can. So if you control both sides, you can do
whatever you want.

And my question was exactly because of this: OAuth introduced mostly
everything needed for pluggable authentication (without PAM - my
previous experience with that is that it is system specific, slow, and
complex), and it is already possible to misuse it for something else.
It would be really nice to have a generic authentication plugin system
in postgres to implement other authentication methods, not just OAuth.

I'm very much on board with pluggable auth [2]/messages/by-id/3d41067ed944e9ce889fc15a6593cb26e72b6c0f.camel@vmware.com, but OAUTHBEARER is not
the layer for arbitrary non-OAuth authentication systems, any more
than LDAP is. (SASL is the correct layer for that, IMHO.)

Are they asking for this because it'd be an easy way around the v18 flow limitation? Because that's been the primary motivation in the conversations I've had.

One specific use case I know of is CI, for example GitHub simply
provides you an oauth token as an environment variable.

Mm. I'll try to take a closer look at GitHub and Cirrus.

Thanks,
--Jacob

[1]: /messages/by-id/CAOYmi+kLTJ1wZ6gxRbgtR52E=EyiCpmp6J3mmSvtc1a6i7sZ3Q@mail.gmail.com
[2]: /messages/by-id/3d41067ed944e9ce889fc15a6593cb26e72b6c0f.camel@vmware.com

#8Zsolt Parragi
zsolt.parragi@percona.com
In reply to: Jacob Champion (#7)
Re: [oauth] Stabilize the libpq-oauth ABI (and allow alternative implementations?)

If I've misunderstood what you mean, please tell me what function call
in particular you think can be removed.

I meant that it calls issuer_from_well_known_uri, and does multiple
validation steps in that.

Wouldn't it be simpler to do all that validation at the server side,
and for libpq to simply compare that what the user specified for the
connection and what the server sent are the same?

The protection would still be there, as the client still compares the
URLs, but the validation moves to the server, where we can detect
incorrect URLs earlier.

So if you control both sides, you can do whatever you want.

Yes, but I can't start postgres with a custom LDAP backend or client
library, or at least not in an easy built in way.

I'm very much on board with pluggable auth [2], but OAUTHBEARER is not
the layer for arbitrary non-OAuth authentication systems, any more
than LDAP is. (SASL is the correct layer for that, IMHO.)

That part is understandable - I mistakenly thought that there's a
generalized OAUTHBEARER mechanism in SASL, but there isn't.

And from your original email,

I just don't know that this envvar approach is any good.
...
At which point we'll have essentially designed a generic libpq plugin system.

My question should have been: shouldn't the plugin api work at the
SASL level instead of a specific OAuth level? Or even if the first
iteration of the internal API would only allow it to modify the OAuth
flow, shouldn't the public facing configuration (env var / connection
string / anything) be something more generic?
The connection string approach would definitely be better, but not
naming the environment variable OAUTHSOMETHING would already make this
a more future-proof feature. If you think generic extensibility should
work with SASL, I see that as one more reason why the variable should
be named SASLsomething instead.

In my previous email I didn't reply to the plugin/configuration part
of your message because I wanted to think more about the question.
How, and when would I use this (oauth limited, differently configured)
plugin API instead of the authdata hook? And I can't think of any good
use cases outside the command line tools included with postgres. But
if it's the beginning of a generic API that gets extended in later
major versions, that's different.

#9Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Zsolt Parragi (#8)
Re: [oauth] Stabilize the libpq-oauth ABI (and allow alternative implementations?)

On Mon, Jan 26, 2026 at 4:06 AM Zsolt Parragi <zsolt.parragi@percona.com> wrote:

Wouldn't it be simpler to do all that validation at the server side,
and for libpq to simply compare that what the user specified for the
connection and what the server sent are the same?

No, I don't think so. That opens up the possibility of a third-party
server implementation breaking the OAUTHBEARER rules and libpq failing
to call them out on it.

My question should have been: shouldn't the plugin api work at the
SASL level instead of a specific OAuth level?

IMO, yes, but I don't imagine that most people want to pick a SASL
plugin to switch OAuth flows. They just want their flow to match the
use case and move on.

Or even if the first
iteration of the internal API would only allow it to modify the OAuth
flow, shouldn't the public facing configuration (env var / connection
string / anything) be something more generic?

I think these are two different use cases that _could_ be handled with
the same knob but probably _should_ not be (see above).

How, and when would I use this (oauth limited, differently configured)
plugin API instead of the authdata hook? And I can't think of any good
use cases outside the command line tools included with postgres.

That's exactly the use case. (Also, library clients of postgres that
should not install application hooks, other monolithic utilities that
link against libpq, etc.) Who wants to recompile their clients just to
switch how they get a token?

--Jacob

#10Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Jacob Champion (#3)
Re: [oauth] Stabilize the libpq-oauth ABI (and allow alternative implementations?)

On Tue, Dec 16, 2025 at 9:40 AM Jacob Champion
<jacob.champion@enterprisedb.com> wrote:

Open questions remain:
1) 0004: Any objections to putting PQExpBuffer into libpq-fe.h?

Tom sounded lukewarm to this on the Discord, so I looked into
replacing the new usage with a simple char*.

That actually exposed a bug: appending error data during cleanup
doesn't help us at all, because we've already stopped using the error
buffer. In fact my whole idea of adding things to conn->errorMessage
during a teardown operation has been nearly useless from the
beginning. Elsewhere in libpq, we handle cases like these (which
should ideally just not happen) by printing warnings to stderr, so
v3-0001 does that instead.

Besides exposing a bug, switching to char* means anyone who's not
programming in C and already has code that handles pointer ownership
across the boundary can continue to make use of that, instead of
adapting to a completely new API for this one particular use case. We
already have the cleanup-callback requirement for the token anyway;
it's fine.

2) 0004: Thoughts on the v2 inheritance struct style as opposed to
relying on implementations to double-check the struct length?

Still feels half-dozen or the other to me, so I'm planning to move
ahead with the inheritance model. I'll look into
VALGRIND_MAKE_MEM_NOACCESS (and/or other ways to catch people
scribbling where they shouldn't) for v4.

3) 0005: Should I add the thread lock to an init() API, or expose a
new PQgetThreadLock() that other code can use?

v3-0002 implements PQgetThreadLock(). The conversation with Nico
Williams at [1]/messages/by-id/aSSp03wmNMngi/Oe@ubby cemented this for me; it's entirely possible that
implementations will need the lock at other times besides
initialization, say if they're mixing OAuth with Kerberos. Adding a
way to retrieve it doesn't actually expose new functionality --
applications could always get at our internal implementation by
calling PQregisterThreadLock(NULL) -- but libraries can't use that API
safely.

Also, clients can probably make use of some of the newer ways of doing
this kind of initialization (pthread_once, etc.) that weren't in wide
use back when the init-function design showed up. We may not ever
actually need a separate init function, and if I'm wrong we can always
add one.

I think I'll put this in its own top-level thread for comment.

(In other words, a plugin architecture causes the compile-time
and run-time dependency arrows to point in opposite directions, so
plugins won't be able to rely on the LIBPQ_HAS_* macros to determine
what APIs are available to them.)

(TODO: Are there implications for our use of RTLD_NOW at dlopen() time?

To answer my own question, yes: any future libpq-oauth plugin that
needs PG20+ APIs will have to lazy-load them (weak symbol
declarations, etc.) or else accept that long-running applications may
fail OAuth connections until they restart.

A more general plugin system would need to solve this. For example, we
could load RTLD_LAZY, check for a minimum libpq version declared by
the plugin, and then either upgrade the bindings with RTLD_NOW, or
dlclose() and bail. But I don't think I need to be solving a
nonexistent problem now.

--Jacob

[1]: /messages/by-id/aSSp03wmNMngi/Oe@ubby

Attachments:

v3-0001-oauth-Report-cleanup-errors-as-warnings-on-stderr.patchapplication/x-patch; name=v3-0001-oauth-Report-cleanup-errors-as-warnings-on-stderr.patchDownload+8-11
v3-0002-libpq-Add-PQgetThreadLock-to-mirror-PQregisterThr.patchapplication/x-patch; name=v3-0002-libpq-Add-PQgetThreadLock-to-mirror-PQregisterThr.patchDownload+16-3
v3-0003-libpq-Introduce-PQAUTHDATA_OAUTH_BEARER_TOKEN_V2.patchapplication/x-patch; name=v3-0003-libpq-Introduce-PQAUTHDATA_OAUTH_BEARER_TOKEN_V2.patchDownload+301-37
v3-0004-libpq-oauth-Use-the-PGoauthBearerRequestV2-API.patchapplication/x-patch; name=v3-0004-libpq-oauth-Use-the-PGoauthBearerRequestV2-API.patchDownload+333-418
v3-0005-libpq-oauth-Never-link-against-libpq-s-encoding-f.patchapplication/x-patch; name=v3-0005-libpq-oauth-Never-link-against-libpq-s-encoding-f.patchDownload+18-4
v3-0006-WIP-Introduce-third-party-OAuth-flow-plugins.patchapplication/x-patch; name=v3-0006-WIP-Introduce-third-party-OAuth-flow-plugins.patchDownload+687-387
#11Zsolt Parragi
zsolt.parragi@percona.com
In reply to: Jacob Champion (#10)
Re: [oauth] Stabilize the libpq-oauth ABI (and allow alternative implementations?)
+ * TODO: have to think about _all_ the security ramifications of this. What
+ * existing protections in LD_LIBRARY_PATH (and/or SIP) are we potentially
+ * bypassing? Should we check the permissions of the file somehow...?
+ * TODO: maybe disallow anything not underneath LIBDIR? or PKGLIBDIR?
+ * Should it have a naming convention?
+ */
+ const char *env = getenv("PGOAUTHMODULE");

and

That's exactly the use case. (Also, library clients of postgres that
should not install application hooks, other monolithic utilities that
link against libpq, etc.) Who wants to recompile their clients just to
switch how they get a token?

How would this work with library clients exactly (and even multiple
library clients)?

* Language bindings: there's already ongoing work on adding support
for the hooks into these libraries, so that's not really a use case
(and: 1. there's no recompilation cost for many of them 2. even if
there is, you can't easily use the same language for these plugins,
which is again a downside). And some bindings don't use libpq at all.
* postgres CLI tools: That's one use case, but it could work without
generic user-facing plugin support, limited to these tools.
* postgres_fwd/dblink and similar: I'm not sure how that would work,
what if different extensions require different plugins? This already
seems like a tricky question with the hooks

And from a configuration/security risk point:

* Restricting to system paths could work, but it limits who and how
can add these plugins, and removes the possibility of modifying a
specific application without system permissions. I can't just download
a binary application, and a separate oauth plugin, and use them
together, I need admin permission - that seems strange to me. (Unless
I compile libpq myself in a different prefix?)
* On the other hand requiring applications to specify allowed paths or
plugins directly, without environment variables goes back to the
previous configuration question

What existing protections in LD_LIBRARY_PATH (and/or SIP) are we potentially bypassing?

From the ld.so.8 manpage:

In secure-execution mode, preload pathnames containing slashes are ignored. Furthermore, shared objects are preloaded only from the standard search directories and only if they have set-user-ID mode bit enabled (which is not typical).

An easy solution could be using secure_getenv instead of getenv, which
would at least improve the situation?

And a few specific comments about the patches:

+oom:
+ request->error = libpq_gettext("out of memory");
+ return -1;

Shouldn't this also free conninfo if it is allocated?

+/* Features added in PostgreSQL v19: */
+/* Indicates presence of PQgetThreadLock */
+#define LIBPQ_HAS_GET_THREAD_LOCK

Should this be defined to 1?

+ /*
+ * We need to inject necessary function pointers into the module. This
+ * only needs to be done once -- even if the pointers are constant,
+ * assigning them while another thread is executing the flows feels
+ * like tempting fate.
+ */
+ if ((lockerr = pthread_mutex_lock(&init_mutex)) != 0)
+ {
+ /* Should not happen... but don't continue if it does. */
+ Assert(false);
- libpq_append_conn_error(conn, "failed to lock mutex (%d)", lockerr);
- return 0;
- }
+ libpq_append_conn_error(conn, "failed to lock mutex (%d)", lockerr);
+ return 0;
+ }

Shouldn't this path return -1, and also dlclose the library?

+
+ dlclose(state->flow_module);
+

Shouldn't dlcloses also reset state->flow_module after?

-free_async_ctx(PGconn *conn, struct async_ctx *actx)
+free_async_ctx(PGoauthBearerRequestV2 *req, struct async_ctx *actx)

req (or conn) doesn't seem to be used in this function, does it need
that parameter?

+ env = strchr(env, '\x01');
+ *env++ = '\0';

Isn't mutating environment variables UB?

#12Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Zsolt Parragi (#11)
Re: [oauth] Stabilize the libpq-oauth ABI (and allow alternative implementations?)

On Wed, Feb 25, 2026 at 9:16 AM Zsolt Parragi <zsolt.parragi@percona.com> wrote:

How would this work with library clients exactly (and even multiple
library clients)?

The global setting? Not very well at all. The eventual solution needs
to be per-connection.

I don't think any version of v3-0006 is likely to make it for PG19,
for the record. I just need to ensure that the use case we're moving
towards continues to work with the code that's actually committed, and
there still might be some pieces that should be cherry-picked
backwards in the set. As I said in my original post:

That brings us to patch [v1-]0007, which experimentally promotes the stable
API to a public header, and introduces a really janky environment
variable so that people don't have to play games. It will be obvious
from the code that this is not well-baked yet.

My hope was to continue to take small steps in the direction of the
end goal with every release, but I think PGOAUTHMODULE is a step too
far. I don't want to expose an oauth_module connection option, and an
environment variable that can't be subsumed by a future connection
option will just be cruft eventually.

For 19, bleeding-edge developers who want a global override ASAP can
implement the public API and tell their application's linker to find a
different libpq-oauth. If a global override isn't good enough,
PGOAUTHMODULE wouldn't have helped anyway.

* Language bindings: there's already ongoing work on adding support
for the hooks into these libraries, so that's not really a use case

(The hooks are probably insufficient in the general case [1]https://github.com/ged/ruby-pg/pull/693#issuecomment-3867178201, but a
global envvar isn't intended to solve that.)

* postgres CLI tools: That's one use case, but it could work without
generic user-facing plugin support, limited to these tools.

Yes.

* postgres_fwd/dblink and similar: I'm not sure how that would work,
what if different extensions require different plugins? This already
seems like a tricky question with the hooks

We don't support proxied OAuth at all yet (i.e. both of those
extensions outright prohibit oauth_* settings).

I can't just download
a binary application, and a separate oauth plugin, and use them
together, I need admin permission - that seems strange to me.

Why does that seem strange? If you don't have the ability to install
libpq to begin with, you shouldn't be able to modify that libpq or get
it to run arbitrary code. If you control the application, on the other
hand, you control both the global hook and the link behavior (you
don't have to link against system libpq, after all), so it doesn't
seem like you've lost any functionality.

An easy solution could be using secure_getenv instead of getenv, which
would at least improve the situation?

I don't think we claim setuid-safety for libpq. (Our own code aside,
we'd have to vet all of our transitive dependencies in perpetuity.)

And a few specific comments about the patches:

Thank you for the review!

+oom:
+ request->error = libpq_gettext("out of memory");
+ return -1;

Shouldn't this also free conninfo if it is allocated?

Yep, good catch.

+/* Features added in PostgreSQL v19: */
+/* Indicates presence of PQgetThreadLock */
+#define LIBPQ_HAS_GET_THREAD_LOCK

Should this be defined to 1?

Yes. One of my weirder copy-paste errors...

+ if ((lockerr = pthread_mutex_lock(&init_mutex)) != 0)
+ {
+ /* Should not happen... but don't continue if it does. */
+ Assert(false);
- libpq_append_conn_error(conn, "failed to lock mutex (%d)", lockerr);
- return 0;
- }
+ libpq_append_conn_error(conn, "failed to lock mutex (%d)", lockerr);
+ return 0;
+ }

Shouldn't this path return -1,

We could. I chose zero to try to retain the PG18 behavior, but I could
expand this error message and set request->error instead. If that'd be
less confusing to you as a reader, it's probably worth the change.

and also dlclose the library?

I don't think we gain anything by that, do we? It's named correctly,
it has the right symbols, and the
shouldn't-happen-failure-that-happened had nothing to do with the
library implementation.

Shouldn't dlcloses also reset state->flow_module after?

That'd probably be kinder to anyone expanding this logic in the
future, yeah. Maybe not worth a backpatch though (there's no users
outside of this function; it's only retained to assist with debugging
at the moment).

-free_async_ctx(PGconn *conn, struct async_ctx *actx)
+free_async_ctx(PGoauthBearerRequestV2 *req, struct async_ctx *actx)

req (or conn) doesn't seem to be used in this function, does it need
that parameter?

Ah, right, it's dead now that req->error isn't set. Thanks!

+ env = strchr(env, '\x01');
+ *env++ = '\0';

Isn't mutating environment variables UB?

Kinda, or at least it invites UB later on according to POSIX. I should
just make a copy.

Thanks,
--Jacob

[1]: https://github.com/ged/ruby-pg/pull/693#issuecomment-3867178201

#13Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Jacob Champion (#12)
Re: [oauth] Stabilize the libpq-oauth ABI (and allow alternative implementations?)

On Thu, Feb 26, 2026 at 1:56 PM Jacob Champion
<jacob.champion@enterprisedb.com> wrote:

And a few specific comments about the patches:

Thank you for the review!

v4 should address that feedback. (The Valgrind work isn't part of it, yet.)

0001 should be ready to go. I meant to put 0002 up in its own thread
at the beginning of the week but got sidetracked, so I'll go do that
now. 0003 has to wait on Valgrind, and 0004-5 have to wait on 0002.

Thanks,
--Jacob

Attachments:

v4-0006-WIP-Introduce-third-party-OAuth-flow-plugins.patchapplication/octet-stream; name=v4-0006-WIP-Introduce-third-party-OAuth-flow-plugins.patchDownload+695-387
since-v3.diff.txttext/plain; charset=US-ASCII; name=since-v3.diff.txtDownload
v4-0001-oauth-Report-cleanup-errors-as-warnings-on-stderr.patchapplication/octet-stream; name=v4-0001-oauth-Report-cleanup-errors-as-warnings-on-stderr.patchDownload+10-13
v4-0002-libpq-Add-PQgetThreadLock-to-mirror-PQregisterThr.patchapplication/octet-stream; name=v4-0002-libpq-Add-PQgetThreadLock-to-mirror-PQregisterThr.patchDownload+16-3
v4-0003-libpq-Introduce-PQAUTHDATA_OAUTH_BEARER_TOKEN_V2.patchapplication/octet-stream; name=v4-0003-libpq-Introduce-PQAUTHDATA_OAUTH_BEARER_TOKEN_V2.patchDownload+301-37
v4-0004-libpq-oauth-Use-the-PGoauthBearerRequestV2-API.patchapplication/octet-stream; name=v4-0004-libpq-oauth-Use-the-PGoauthBearerRequestV2-API.patchDownload+334-417
v4-0005-libpq-oauth-Never-link-against-libpq-s-encoding-f.patchapplication/octet-stream; name=v4-0005-libpq-oauth-Never-link-against-libpq-s-encoding-f.patchDownload+18-4
#14Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Jacob Champion (#13)
Re: [oauth] Stabilize the libpq-oauth ABI (and allow alternative implementations?)

On Fri, Feb 27, 2026 at 11:42 AM Jacob Champion
<jacob.champion@enterprisedb.com> wrote:

0001 should be ready to go. I meant to put 0002 up in its own thread
at the beginning of the week but got sidetracked, so I'll go do that
now.

Done at [1]/messages/by-id/CAOYmi+=MHD+WKD4rsTn0v8220mYfyLGhEc5EfhmtqrAb7SmC5g@mail.gmail.com.

v5 incorporates that version of the patch in -0002 and makes some
minor test updates for better coverage in -0003.

0003 has to wait on Valgrind, and 0004-5 have to wait on 0002.

v5-0006 is my experiment in Valgrind (and also AddressSanitizer, and
also generic pointer poisoning). I think it's interesting, and it
seems clean enough for some subset to be committable for 19. The
weak-symbol support for ASan is the most suspect; I have no idea if it
links correctly across all platforms. It's probably better for me to
propose generic ASan support in utils/memdebug.h at some point in the
future, and drop it from this patch.

I don't plan to block the other patches on -0006, since -0004 is
blocking PGOAUTHCAFILE.

--Jacob

[1]: /messages/by-id/CAOYmi+=MHD+WKD4rsTn0v8220mYfyLGhEc5EfhmtqrAb7SmC5g@mail.gmail.com

Attachments:

since-v4.diff.txttext/plain; charset=US-ASCII; name=since-v4.diff.txtDownload+0-1
v5-0001-oauth-Report-cleanup-errors-as-warnings-on-stderr.patchapplication/octet-stream; name=v5-0001-oauth-Report-cleanup-errors-as-warnings-on-stderr.patchDownload+10-13
v5-0002-libpq-Add-PQgetThreadLock-to-mirror-PQregisterThr.patchapplication/octet-stream; name=v5-0002-libpq-Add-PQgetThreadLock-to-mirror-PQregisterThr.patchDownload+21-4
v5-0003-libpq-Introduce-PQAUTHDATA_OAUTH_BEARER_TOKEN_V2.patchapplication/octet-stream; name=v5-0003-libpq-Introduce-PQAUTHDATA_OAUTH_BEARER_TOKEN_V2.patchDownload+306-37
v5-0004-libpq-oauth-Use-the-PGoauthBearerRequestV2-API.patchapplication/octet-stream; name=v5-0004-libpq-oauth-Use-the-PGoauthBearerRequestV2-API.patchDownload+334-417
v5-0005-libpq-oauth-Never-link-against-libpq-s-encoding-f.patchapplication/octet-stream; name=v5-0005-libpq-oauth-Never-link-against-libpq-s-encoding-f.patchDownload+18-4
v5-0006-WIP-test-out-poisoning.patchapplication/octet-stream; name=v5-0006-WIP-test-out-poisoning.patchDownload+156-11
v5-0007-WIP-Introduce-third-party-OAuth-flow-plugins.patchapplication/octet-stream; name=v5-0007-WIP-Introduce-third-party-OAuth-flow-plugins.patchDownload+696-387
#15Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Jacob Champion (#14)
Re: [oauth] Stabilize the libpq-oauth ABI (and allow alternative implementations?)

On Tue, Mar 3, 2026 at 2:08 PM Jacob Champion
<jacob.champion@enterprisedb.com> wrote:

I don't plan to block the other patches on -0006, since -0004 is
blocking PGOAUTHCAFILE.

v5-0001 and -0002 are now committed. Attached is a v6 rebase with some
minor self-review fixups and the removal of ASan from the poisoning
experiment.

v6-0001 through -0003 are targeted for commit next. I'll leave -0004
for later; I plan to ask for some committer review, since it's pretty
unusual code for libpq, but it needs a real commit message and some
better docs first.

As for -0005: I think I'll cherry-pick only the ability for the
libpq_oauth_init entry to be optional, marking any flows that don't
provide it as user-defined, to make it possible for v19 developers to
help us iterate on v20 if they want.

Thanks,
--Jacob

Attachments:

since-v5.diff.txttext/plain; charset=US-ASCII; name=since-v5.diff.txtDownload
v6-0001-libpq-Introduce-PQAUTHDATA_OAUTH_BEARER_TOKEN_V2.patchapplication/octet-stream; name=v6-0001-libpq-Introduce-PQAUTHDATA_OAUTH_BEARER_TOKEN_V2.patchDownload+305-37
v6-0002-libpq-oauth-Use-the-PGoauthBearerRequestV2-API.patchapplication/octet-stream; name=v6-0002-libpq-oauth-Use-the-PGoauthBearerRequestV2-API.patchDownload+334-417
v6-0003-libpq-oauth-Never-link-against-libpq-s-encoding-f.patchapplication/octet-stream; name=v6-0003-libpq-oauth-Never-link-against-libpq-s-encoding-f.patchDownload+18-4
v6-0004-WIP-test-out-poisoning.patchapplication/octet-stream; name=v6-0004-WIP-test-out-poisoning.patchDownload+125-10
v6-0005-WIP-Introduce-third-party-OAuth-flow-plugins.patchapplication/octet-stream; name=v6-0005-WIP-Introduce-third-party-OAuth-flow-plugins.patchDownload+696-387
#16Zsolt Parragi
zsolt.parragi@percona.com
In reply to: Jacob Champion (#15)
Re: [oauth] Stabilize the libpq-oauth ABI (and allow alternative implementations?)

For the first commits I only have a few more questions/comments about
the error messages, otherwise looks good.

+ libpq_append_conn_error(conn, "failed to lock mutex (%d)", lockerr);
+ return 0;
+ }
Shouldn't this path return -1,

We could. I chose zero to try to retain the PG18 behavior, but I could
expand this error message and set request->error instead. If that'd be
less confusing to you as a reader, it's probably worth the change.

If this returns 0, we print out

failed to lock mutex
no OAuth flows are available (try installing the libpq-oauth package)

Which isn't ideal, as the library is there, so installing the package
wouldn't help.

+ if ((start_flow = dlsym(state->flow_module, "pg_start_oauthbearer")) == NULL)

And this path has the same issue, the library is there, so suggesting
to install libpq-oauth isn't helpful.
The more detailed message is only printed out with unsafe debugging,
without that it just returns 0.

+ appendPQExpBuffer(&conn->errorMessage,
+   "use_builtin_flow: failed to lock mutex (%d)\n",
+   lockerr);

This is after an assert, so maybe it is okay as is, but this bypasses
gettext. (or shouldn't it use "internal error:" similarly to the other
untranslated error message? and another 2 internal errors are
translated)

(try installing the libpq-oauth package)

This isn't changed in these patches, but Is it okay to assume a
package name here? This is not a package that universally exists
everywhere, we can't even be sure that pg was installed with a package
manager. On RHEL it is called postgresql18-libs-oauth, on suse it's
part of the main libpq package. In both cases if for some internal
error it can't find/load the library, we suggest installing a package
that doesn't exist on that system.

#17Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Jacob Champion (#15)
Re: [oauth] Stabilize the libpq-oauth ABI (and allow alternative implementations?)

On Thu, Mar 5, 2026 at 4:57 PM Jacob Champion
<jacob.champion@enterprisedb.com> wrote:

v6-0001 through -0003 are targeted for commit next.

v6-0001 is pushed, after making another docs/comments/commit message
pass, and adding one last test to pin the correct meaning of
PGoauthBearerRequestV2.issuer for future maintainers. I'm working on
some of Zsolt's feedback before posting a new set.

--Jacob

#18Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Zsolt Parragi (#16)
Re: [oauth] Stabilize the libpq-oauth ABI (and allow alternative implementations?)

On Fri, Mar 6, 2026 at 2:44 PM Zsolt Parragi <zsolt.parragi@percona.com> wrote:

+ if ((start_flow = dlsym(state->flow_module, "pg_start_oauthbearer")) == NULL)

And this path has the same issue, the library is there, so suggesting
to install libpq-oauth isn't helpful.

I'll cherry-pick some of the -1 handling backwards in the patchset to
handle this.

+ appendPQExpBuffer(&conn->errorMessage,
+   "use_builtin_flow: failed to lock mutex (%d)\n",
+   lockerr);

This is after an assert, so maybe it is okay as is, but this bypasses
gettext.

Correct. For PG18, I got the feedback that can't-happen errors in
OAuth should really remain untranslated, unless it's clear that the
user can act on them. Otherwise we're consuming translators' time for
no practical benefit.

(try installing the libpq-oauth package)

This isn't changed in these patches, but Is it okay to assume a
package name here?

No, not really, but see [1]/messages/by-id/aAOREVWMFTuWvJ1l@msg.df7cb.de. Any "vanilla" version of that error
message will contain the string "libpq-oauth" regardless; that's the
module's name. So package maintainers need to either patch the line if
it's not useful, or else let us know how they'd prefer to override
this -- Makefile? Configure? (Meson?) -- to improve the situation.
Christoph gave the most feedback here, so Debian has the most-greased
wheel at the moment. :D

Thanks,
--Jacob

[1]: /messages/by-id/aAOREVWMFTuWvJ1l@msg.df7cb.de

#19Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Jacob Champion (#18)
Re: [oauth] Stabilize the libpq-oauth ABI (and allow alternative implementations?)

On Fri, Mar 6, 2026 at 4:27 PM Jacob Champion
<jacob.champion@enterprisedb.com> wrote:

I'll cherry-pick some of the -1 handling backwards in the patchset to
handle this.

Done in v7-0001. Some of the improvements in the WIP patch were also
cherry-picked, and I fixed a stray comment bug. -0001 and -0002 are
next up for commit.

-0003 fills out the commit message and should be generally reviewable
now. -0004 adds the ability to LD_LIBRARY_PATH your way into flow
plugin development for the v20 cycle. -0005 becomes pretty much
useless other than for testing (and it should possible to adapt those
tests to -0004's implementation at some point).

Thanks,
--Jacob

Attachments:

since-v6.diff.txttext/plain; charset=US-ASCII; name=since-v6.diff.txtDownload+0-1
v7-0001-libpq-oauth-Use-the-PGoauthBearerRequestV2-API.patchapplication/octet-stream; name=v7-0001-libpq-oauth-Use-the-PGoauthBearerRequestV2-API.patchDownload+353-429
v7-0002-libpq-oauth-Never-link-against-libpq-s-encoding-f.patchapplication/octet-stream; name=v7-0002-libpq-oauth-Never-link-against-libpq-s-encoding-f.patchDownload+18-4
v7-0003-libpq-Poison-the-v2-part-of-a-v1-Bearer-request.patchapplication/octet-stream; name=v7-0003-libpq-Poison-the-v2-part-of-a-v1-Bearer-request.patchDownload+125-13
v7-0004-libpq-Allow-developers-to-reimplement-libpq-oauth.patchapplication/octet-stream; name=v7-0004-libpq-Allow-developers-to-reimplement-libpq-oauth.patchDownload+42-32
v7-0005-WIP-Introduce-third-party-OAuth-flow-plugins.patchapplication/octet-stream; name=v7-0005-WIP-Introduce-third-party-OAuth-flow-plugins.patchDownload+646-355
#20Zsolt Parragi
zsolt.parragi@percona.com
In reply to: Jacob Champion (#19)
Re: [oauth] Stabilize the libpq-oauth ABI (and allow alternative implementations?)

1-4 looks good to me, with one question:

+#define MASK_BITS ((uintptr_t) 0x55aa55aa55aa55aa)

Won't this cause a warning in 32 bit builds? (0x55aa55aa definitely
should work in both?)

#21Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Zsolt Parragi (#20)
#22Zsolt Parragi
zsolt.parragi@percona.com
In reply to: Jacob Champion (#21)
#23Jonathan Gonzalez V.
jonathan.abdiel@gmail.com
In reply to: Jacob Champion (#19)
#24Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Jonathan Gonzalez V. (#23)
#25Chao Li
li.evan.chao@gmail.com
In reply to: Jacob Champion (#24)
#26Zsolt Parragi
zsolt.parragi@percona.com
In reply to: Chao Li (#25)
#27Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Chao Li (#25)
#28Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Zsolt Parragi (#26)