oauth integer overflow
Hi,
I was once more looking at what it'd take to work with -ftrapv, when cassert
is enabled. Partially motivated with the worry that we support compilers that
don't understand -fwrapv so code relying on signed overflow isn't actually
safe. And because it sometimes leads to unexpected results that can cause
trouble and -ftrapv helps find those.
One thing that quickly triggers when doing so is:
#0 __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at ./nptl/pthread_kill.c:44
#1 0x00007fe1a05a9dbf in __pthread_kill_internal (threadid=<optimized out>, signo=6) at ./nptl/pthread_kill.c:89
#2 0x00007fe1a0552d02 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#3 0x00007fe1a053a4b2 in __GI_abort () at ./stdlib/abort.c:77
#4 0x00007fe19f218ac1 in __addvsi3 ()
from /srv/dev/build/postgres/m-dev-assert/tmp_install//srv/dev/install/postgres/m-dev-assert/lib/x86_64-linux-gnu/libpq-oauth.so
#5 0x00007fe19f20da05 in handle_token_response (actx=0x561825a06350, token=0x7ffc2f677e48)
at ../../../../../home/andres/src/postgresql/src/interfaces/libpq-oauth/oauth-curl.c:2625
#6 0x00007fe19f20dec5 in pg_fe_run_oauth_flow_impl (conn=0x561825998490, request=0x561825a06e30, altsock=0x561825998860)
at ../../../../../home/andres/src/postgresql/src/interfaces/libpq-oauth/oauth-curl.c:2924
#7 0x00007fe19f20e0a8 in pg_fe_run_oauth_flow (conn=0x561825998490, request=0x561825a06e30, altsock=0x561825998860)
at ../../../../../home/andres/src/postgresql/src/interfaces/libpq-oauth/oauth-curl.c:3027
#8 0x00007fe1a0a35627 in do_async (state=0x5618259a1880, request=0x561825a06e30)
at ../../../../../home/andres/src/postgresql/src/interfaces/libpq/fe-auth-oauth.c:1528
#9 0x00007fe1a0a346e3 in run_oauth_flow (conn=0x561825998490) at ../../../../../home/andres/src/postgresql/src/interfaces/libpq/fe-auth-oauth.c:751
#10 0x00007fe1a0a3ffc1 in PQconnectPoll (conn=0x561825998490) at ../../../../../home/andres/src/postgresql/src/interfaces/libpq/fe-connect.c:4302
#11 0x00007fe1a0a3db6c in pqConnectDBComplete (conn=0x561825998490) at ../../../../../home/andres/src/postgresql/src/interfaces/libpq/fe-connect.c:2893
#12 0x00007fe1a0a3a1f7 in PQconnectdbParams (keywords=0x5618259983f0, values=0x561825998440, expand_dbname=1)
/*
* A slow_down error requires us to permanently increase our retry
* interval by five seconds.
*/
if (strcmp(err->error, "slow_down") == 0)
{
int prev_interval = actx->authz.interval;
actx->authz.interval += 5;
if (actx->authz.interval < prev_interval)
{
actx_error(actx, "slow_down interval overflow");
goto token_cleanup;
}
}
I don't think it's safe to rely on that type of check working without -fwrapv
support, which in turn we can't rely on having.
I think this should use pg_add_s32_overflow().
Greetings,
Andres Freund
On 23 Apr 2026, at 18:12, Andres Freund <andres@anarazel.de> wrote:
I don't think it's safe to rely on that type of check working without -fwrapv
support, which in turn we can't rely on having.I think this should use pg_add_s32_overflow().
Agreed, I'll write up a patch to fix it.
--
Daniel Gustafsson
On Thu, Apr 23, 2026 at 10:29 AM Daniel Gustafsson <daniel@yesql.se> wrote:
On 23 Apr 2026, at 18:12, Andres Freund <andres@anarazel.de> wrote:
I think this should use pg_add_s32_overflow().
Agreed, thanks for the report!
Agreed, I'll write up a patch to fix it.
Cool. I have one written up and can share it for comparison, if you'd
like, but it's fairly verbose and I wonder if there's a better way to
do it.
--Jacob
On 23 Apr 2026, at 19:49, Jacob Champion <jacob.champion@enterprisedb.com> wrote:
Cool. I have one written up and can share it for comparison, if you'd
like, but it's fairly verbose and I wonder if there's a better way to
do it.
Well, if you're already done then please do share it, and we'll use that as a
starting point.
--
Daniel Gustafsson
On Thu, Apr 23, 2026 at 11:17 AM Daniel Gustafsson <daniel@yesql.se> wrote:
Cool. I have one written up and can share it for comparison, if you'd
like, but it's fairly verbose and I wonder if there's a better way to
do it.Well, if you're already done then please do share it, and we'll use that as a
starting point.
Attached. The static_assert for the millisecond calculation is the
only part I don't really like, but doing an overflow check on a
calculation that can't overflow int64 is even more verbose/wasteful.
--Jacob
Attachments:
0001-libpq-oauth-Avoid-overflow-for-very-large-intervals.patchapplication/octet-stream; name=0001-libpq-oauth-Avoid-overflow-for-very-large-intervals.patchDownload+28-11
Hi,
On 2026-04-23 11:31:34 -0700, Jacob Champion wrote:
On Thu, Apr 23, 2026 at 11:17 AM Daniel Gustafsson <daniel@yesql.se> wrote:
Cool. I have one written up and can share it for comparison, if you'd
like, but it's fairly verbose and I wonder if there's a better way to
do it.Well, if you're already done then please do share it, and we'll use that as a
starting point.Attached. The static_assert for the millisecond calculation is the
only part I don't really like, but doing an overflow check on a
calculation that can't overflow int64 is even more verbose/wasteful.
How about instead making sure that actx->authz.interval never gets big enough
to have any chance of overflowing during either the += 5 or the * 1000? It's
clearly ok to error out well before that...
Greetings,
Andres Freund
On Thu, Apr 23, 2026 at 11:37 AM Andres Freund <andres@anarazel.de> wrote:
How about instead making sure that actx->authz.interval never gets big enough
to have any chance of overflowing during either the += 5 or the * 1000? It's
clearly ok to error out well before that...
It probably is, but I guess the approach depends on whether you prefer
checking at the time of operation, or attempting to reason about it
ahead of time in far-away code. With the latter, if additional math is
added in the future, then either the new overflow hazard gets missed,
or the ceiling gets lowered again, or the new math gets an overflow
check when the others don't. I prefer the time-of-use pattern,
personally.
--Jacob
On 23 Apr 2026, at 21:05, Jacob Champion <jacob.champion@enterprisedb.com> wrote:
On Thu, Apr 23, 2026 at 11:37 AM Andres Freund <andres@anarazel.de> wrote:
How about instead making sure that actx->authz.interval never gets big enough
to have any chance of overflowing during either the += 5 or the * 1000? It's
clearly ok to error out well before that...It probably is, but I guess the approach depends on whether you prefer
checking at the time of operation, or attempting to reason about it
ahead of time in far-away code. With the latter, if additional math is
added in the future, then either the new overflow hazard gets missed,
or the ceiling gets lowered again, or the new math gets an overflow
check when the others don't. I prefer the time-of-use pattern,
personally.
I am fine with your approach in the attached patch. If you don't like the
static assert you could move it to be out of the way, and expand the comment
for it to what it means if it hits. Just one small nitpick on the patch:
+ * LONG_MAX milliseconds is 24 days on 32-bit platforms,
+ * which for most people is going to be equivalent to a
+ * disabled timer... but avoid overflow in case the
When teading "disabled timer" I interpret that as a timer which is 0 and has no
interval (which might be due to not being a native speaker), but what it
actually describes is an interval which (in practice) never ends. Perhaps it
could be phrased more like "for most people is going to be equivalent to a
never ending interval".
--
Daniel Gustafsson