check_stack_depth() vs. tail-recursion optimization

Started by Tom Lane4 days ago1 messageshackers
Jump to latest
#1Tom Lane
tgl@sss.pgh.pa.us

While thinking about the Perl circular-reference problem reported by
Aleksander Alekseev [1]/messages/by-id/CAJ7c6TPbjkzUk4qJ5dHvDNEz0hBuFue3A-XWz_=897z+BC+z8A@mail.gmail.com, I realized that plperl's plperl_sv_to_datum()
is exposed to the same issue. It looks like:

static Datum
plperl_sv_to_datum(SV *sv, Oid typid, int32 typmod,
...
{
/* we might recurse */
check_stack_depth();
...
else if (SvROK(sv))
{
...
/*
* If it's a reference to something else, such as a scalar, just
* recursively look through the reference.
*/
return plperl_sv_to_datum(SvRV(sv), typid, typmod,
fcinfo, finfo, typioparam,
isnull);

That recursive call is a tail recursion. If the compiler is smart
enough to optimize that into a loop, then the stack doesn't grow and
the check_stack_depth() call won't get us out of the infinite loop.

I tried to demonstrate this with:

create type mytype as (a int);

create or replace function hashit(int) returns mytype
language plperl as
$$
# my $h = {a => $_[0]};
my $h;
$h = \\$h;
return $h;
$$;

select hashit(42);

With gcc 14.3.1, I get a "stack depth limit exceeded" error,
indicating that that compiler doesn't spot the tail recursion
opportunity. But with clang 21.0.0 (macOS' current compiler),
this is indeed an uninterruptible infinite loop.

A narrow fix would be to insert CHECK_FOR_INTERRUPTS() into
plperl_sv_to_datum(), mimicking what we did in da82fbb8f and
c0f17b04d. However, now that I've seen this, it seems likely to me
that other recursive routines that rely on check_stack_depth()
might have a similar issue. There are a fair number that look
like their recursive calls are tail-recursion-optimizable.
It's not really a problem unless the input data structure can
contain a self-referential loop, but how much faith do we want
to place in there not being one?

So what I'm considering is putting CHECK_FOR_INTERRUPTS() into
check_stack_depth() itself, more or less as attached. That's kind
of ugly from a modularity/separation-of-concerns standpoint, but it
ensures that we don't miss any cases where this could be an issue.

A larger question is whether this is a good enough answer,
or whether callers that are at risk of this need to expend
more effort on detecting input circularity. In the PL/Perl case
that started this, I judged that we didn't need to do more, but
that conclusion might not hold for still-hypothetical other
callers. However, something like this could be a good backstop
anyway.

regards, tom lane

[1]: /messages/by-id/CAJ7c6TPbjkzUk4qJ5dHvDNEz0hBuFue3A-XWz_=897z+BC+z8A@mail.gmail.com

Attachments:

wip-break-infinite-recursion-loops.patchtext/x-diff; charset=us-ascii; name=wip-break-infinite-recursion-loops.patchDownload+10-0