Potential security risk associated with function call

Started by 章晨曦@易景科技 | about 1 month ago | 23 messages | hackers
#1 章晨曦@易景科技
zhangchenxi@halodbtech.com

Hi Hackers,

Recently I noticed a security risk when calling a function. It's strange but also interesting. E.g.

`array_to_text_null` is a built-in function that takes 3 args. Normally it works well. **BUT**
if we create another SQL function on top of `array_to_text_null`, say `harmful_array_to_string`, but with only 2 args:

```
CREATE OR REPLACE FUNCTION harmful_array_to_string(anyarray, text)
 RETURNS text
 LANGUAGE internal
 STABLE PARALLEL SAFE STRICT
AS $function$array_to_text_null$function$;
```

And then we call the new function:
```
postgres=# SELECT harmful_array_to_string(ARRAY[1,2], 'HARMFUL');
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
```

It will cause the server to crash.

The reason is that there is an if statement in `array_to_text_null`

```
Datum
array_to_text_null(PG_FUNCTION_ARGS)
{
...
/* NULL null string is passed through as a null pointer */
if (!PG_ARGISNULL(2))
    null_string = text_to_cstring(PG_GETARG_TEXT_PP(2));
...
}
```

to determine whether the 3rd arg is NULL or not. We only pass 2 args to the function, but the
condition in the if statement still evaluates to TRUE, so it tries to fetch the 3rd arg and causes a segmentation fault.
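
The failure mode can be sketched outside PostgreSQL with a small stand-in for the call frame. Everything named here (`MockCallInfo`, `mock_arg_is_null`, `unchecked_wants_third_arg`, `make_two_arg_call`) is hypothetical and only mimics the fact that `PG_ARGISNULL(n)` reads slot `n` without consulting `nargs`. The mock uses a fixed-capacity array so the example stays well defined; the real `FunctionCallInfoBaseData` sizes its argument array for the call, so the real read lands outside it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MOCK_MAX_ARGS 4          /* fixed capacity, an assumption of this demo */

typedef struct MockArg
{
    uintptr_t value;             /* stand-in for a Datum */
    bool      isnull;
} MockArg;

typedef struct MockCallInfo
{
    short   nargs;               /* number of arguments actually passed */
    MockArg args[MOCK_MAX_ARGS];
} MockCallInfo;

/* Mimics the shape of PG_ARGISNULL(n): no comparison against nargs. */
bool
mock_arg_is_null(const MockCallInfo *fcinfo, int n)
{
    return fcinfo->args[n].isnull;
}

/* Mirrors the `if (!PG_ARGISNULL(2))` test from array_to_text_null. */
bool
unchecked_wants_third_arg(const MockCallInfo *fcinfo)
{
    return !mock_arg_is_null(fcinfo, 2);
}

/* Builds a 2-argument call whose unused third slot holds leftover data. */
MockCallInfo
make_two_arg_call(void)
{
    MockCallInfo fcinfo = {0};

    fcinfo.nargs = 2;
    fcinfo.args[0].value = 1;
    fcinfo.args[1].value = 2;
    fcinfo.args[2].value = 0xDEAD;   /* stale value, never set by the caller */
    fcinfo.args[2].isnull = false;   /* stale flag, also not set by the caller */
    return fcinfo;
}
```

In this stand-in the unchecked test reports that a third argument is present even though `nargs` is 2. In the server the same read touches memory the caller never initialized.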

Here is the strange but interesting part. If we change the code to:

```
Datum
array_to_text_null(PG_FUNCTION_ARGS)
{
...
/* NULL null string is passed through as a null pointer */
if (PG_ARGISNULL(2))
    null_string = text_to_cstring(PG_GETARG_TEXT_PP(2));
...
}
```

Will this code work well?

NO! The condition still evaluates to TRUE, so it still causes the segmentation fault.

This is not limited to `array_to_text_null`; other functions have the same problem. For example, with `array_prepend` we can
create a function:

```
CREATE OR REPLACE FUNCTION harmful_array_prepend(anycompatible)
 RETURNS anycompatiblearray
 LANGUAGE internal
 IMMUTABLE PARALLEL SAFE
AS $function$array_prepend$function$;
```

to crash the server easily.

This issue can be reproduced when the server is compiled with "-O0". When compiled with "-O2" the server does not crash, but a potential security risk remains, because the function still accesses unknown memory.

A simple patch is attached to prevent access to unknown argument memory.

Jet

Halo Tech

Attachments:

0001-fix-potential-funccall-leakrisk.patch (application/octet-stream; charset=utf-8), +1 -1
#2 Anders Åstrand
anders@449.se
In reply to: 章晨曦@易景科技 (#1)
Re: Potential security risk associated with function call

On 3/10/26 11:24, Jet wrote:

Hi Hackers,

Recently I noticed a security risk when calling a function. It's
strange but also interesting. E.g.

`array_to_text_null` is a built-in function that takes 3 args. Normally
it works well. **BUT**
if we create another SQL function on top of `array_to_text_null`, say
`harmful_array_to_string`, but with only 2 args:

Yikes. This seems really dangerous.

A simple patch is attached to prevent access to unknown argument memory.

I don't think this patch will cover all cases as the function might do
something else with the data instead of checking for NULL, especially if
it expects to be called from a function that is defined with RETURNS
NULL ON NULL INPUT on the sql side.

My gut reaction would be to limit the creation of functions with
language=internal to superusers, but that wouldn't work as it would
break CREATE EXTENSION when there are server modules involved.

Maybe all C functions that are able to be used as language=internal
need to explicitly check nargs at the top of the function?
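
Such a guard might look roughly like the following. This is a self-contained sketch, not PostgreSQL code: `MockCallInfo` and `guarded_arg_present` are hypothetical stand-ins, and in a real function the count would come from `PG_NARGS()` in `fmgr.h`.

```c
#include <assert.h>
#include <stdbool.h>

#define MOCK_MAX_ARGS 4              /* fixed capacity, an assumption of this demo */

typedef struct MockCallInfo
{
    short nargs;                     /* arguments actually supplied */
    bool  isnull[MOCK_MAX_ARGS];     /* per-argument null flags */
} MockCallInfo;

/*
 * Returns true only when argument `argno` was really passed and is not NULL.
 * The nargs comparison comes first, so a slot beyond the supplied arguments
 * is never read.
 */
bool
guarded_arg_present(const MockCallInfo *fcinfo, int argno)
{
    if (argno < 0 || argno >= fcinfo->nargs)
        return false;
    return !fcinfo->isnull[argno];
}
```

The cost is one extra comparison per argument access, and it only helps in functions that actually perform it.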

--
Anders Åstrand
Percona

#3 章晨曦@易景科技
zhangchenxi@halodbtech.com
In reply to: Anders Åstrand (#2)
Re: Potential security risk associated with function call

My gut reaction would be to limit the creation of functions with
language=internal to superusers, but that wouldn't work as it would
break CREATE EXTENSION when there are server modules involved.

Maybe all C functions that are able to be used as language=internal
need to explicitly check nargs at the top of the function?

Yes, all C functions carry this potential risk, not only language=internal ones.
So limiting the creation of functions with language=internal is not enough.

Jet
Halo Tech

#4 Matthias van de Meent
boekewurm+postgres@gmail.com
In reply to: 章晨曦@易景科技 (#1)
Re: Potential security risk associated with function call

On Tue, 10 Mar 2026 at 11:25, Jet <zhangchenxi@halodbtech.com> wrote:

Hi Hackers,

Recently I noticed a security risk when calling a function. It's strange but also interesting. E.g.

`array_to_text_null` is a built-in function that takes 3 args. Normally it works well. **BUT**
if we create another SQL function on top of `array_to_text_null`, say `harmful_array_to_string`, but with only 2 args:

[...]

And then we call the new function:

[...]

It will cause the server to crash.

Correct. This is expected behaviour: the "internal" and "c" languages
are not 'trusted' languages, and therefore only superusers can create
functions using these languages. It is the explicit responsibility of
the superuser to make sure the functions they create using untrusted
languages are correct and execute safely when called by PostgreSQL.

Kind regards,

Matthias van de Meent

#5 Robert Haas
robertmhaas@gmail.com
In reply to: Matthias van de Meent (#4)
Re: Potential security risk associated with function call

On Tue, Mar 10, 2026 at 8:03 AM Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:

Correct. This is expected behaviour: the "internal" and "c" languages
are not 'trusted' languages, and therefore only superusers can create
functions using these languages. It is the explicit responsibility of
the superuser to make sure the functions they create using untrusted
languages are correct and execute safely when called by PostgreSQL.

Agreed!

In fact, it's pretty much theoretically impossible for this to work
any other way. If we wanted to add checks that the expectations of the
C code match the actual function definitions, how would we do that?
I'm tempted to say we'd have to solve the halting problem (which is
impossible, look it up), but the 2026 reality is that someone would
just say "deploy an AI agent to check whether the code is safe for the
definition," and that might actually work in practical cases, but
we're not going to add a call-out to Claude as part of the CREATE
FUNCTION statement. And it's equally impossible to insist that every C
function anyone writes must be prepared for an arbitrary number of
arguments of arbitrary data types. Even doing that for core functions
would be a massive waste of resources. Functions like +(int4,int4) can
be called in very tight loops, and even the fact that those functions
do overflow checking is a significant performance drain. Doing these
kinds of checks to counter hypothetical scenarios would be a poor
investment of resources that would make many users unhappy. Besides,
even if we did that, we couldn't possibly enforce that out-of-core C
code has all of the same checks, or that those checks are correctly
coded.

Basically, yeah, being able to call code written directly in C is
dangerous, but it's also necessary, because that's how we get
reasonable performance.

--
Robert Haas
EDB: http://www.enterprisedb.com

#6 章晨曦@易景科技
zhangchenxi@halodbtech.com
In reply to: Matthias van de Meent (#4)
Re: Potential security risk associated with function call

Correct. This is expected behaviour: the "internal" and "c" languages
are not 'trusted' languages, and therefore only superusers can create
functions using these languages.

Yes, you're right, only superusers can create functions in the "internal" and "c" languages.

It is the explicit responsibility of
the superuser to make sure the functions they create using untrusted
languages are correct and execute safely when called by PostgreSQL.

But the question is how can a superuser know the "internal" and "c" functions'
implementation details? He will not know whether the code has !PG_ARGISNULL(...),
and may create a harmful function accidentally...

Jet
Halo Tech

#7 David G. Johnston
david.g.johnston@gmail.com
In reply to: 章晨曦@易景科技 (#6)
Re: Potential security risk associated with function call

On Tuesday, March 10, 2026, Jet <zhangchenxi@halodbtech.com> wrote:

It is the explicit responsibility of
the superuser to make sure the functions they create using untrusted
languages are correct and execute safely when called by PostgreSQL.

But the question is how can a superuser know the "internal" and "c" functions'
implementation details? He will not know whether the code has !PG_ARGISNULL(...),
and may create a harmful function accidentally...

You describe the fundamental problem/risk of the entire software industry.
At least PostgreSQL has chosen a business model where the superuser has the
option to read the source code.

David J.

#8 Kirill Reshke
reshkekirill@gmail.com
In reply to: 章晨曦@易景科技 (#6)
Re: Potential security risk associated with function call

On Tue, 10 Mar 2026 at 17:27, Jet <zhangchenxi@halodbtech.com> wrote:

It is the explicit responsibility of
the superuser to make sure the functions they create using untrusted
languages are correct and execute safely when called by PostgreSQL.

But the question is how can a superuser know the "internal" and "c" functions'
implementation details? He will not know whether the code has !PG_ARGISNULL(...),
and may create a harmful function accidentally...

I think our global assumption is that superuser is super-wise and
knows everything

--
Best regards,
Kirill Reshke

#9 章晨曦@易景科技
zhangchenxi@halodbtech.com
In reply to: Kirill Reshke (#8)
Re: Potential security risk associated with function call

It is the explicit responsibility of
the superuser to make sure the functions they create using untrusted
languages are correct and execute safely when called by PostgreSQL.

But the question is how can a superuser know the "internal" and "c" functions'
implementation details? He will not know whether the code has !PG_ARGISNULL(...),
and may create a harmful function accidentally...

I think our global assumption is that superuser is super-wise and
knows everything

Totally agreed ...

Jet
Halo Tech

#10 章晨曦@易景科技
zhangchenxi@halodbtech.com
In reply to: Robert Haas (#5)
Re: Potential security risk associated with function call

but the 2026 reality is that someone would
just say "deploy an AI agent to check whether the code is safe for the
definition," and that might actually work in practical cases, but
we're not going to add a call-out to Claude as part of the CREATE
FUNCTION statement.

I noticed the potential problem precisely because I was using Claude to write a
simple extension. It worked well in the testing environment, but when I moved the
Claude-generated extension to the dev environment, the server crashed.
More and more people will use AI to generate code; that's the trend. But AI
will make mistakes and may leave many potential risks. So I suppose that, as the
base platform, we should do our best to make it more robust.

Regards,
Jet
Halo Tech

#11 Daniel Gustafsson
daniel@yesql.se
In reply to: 章晨曦@易景科技 (#10)
Re: Potential security risk associated with function call

On 10 Mar 2026, at 14:09, Jet <zhangchenxi@halodbtech.com> wrote:

but the 2026 reality is that someone would
just say "deploy an AI agent to check whether the code is safe for the
definition," and that might actually work in practical cases, but
we're not going to add a call-out to Claude as part of the CREATE
FUNCTION statement.

I noticed the potential problem precisely because I was using Claude to write a
simple extension. It worked well in the testing environment, but when I moved the
Claude-generated extension to the dev environment, the server crashed.
More and more people will use AI to generate code; that's the trend. But AI
will make mistakes and may leave many potential risks. So I suppose that, as the
base platform, we should do our best to make it more robust.

There is no protection strong enough against developers who run generated C
code in production that they didn't read, review and test properly.

--
Daniel Gustafsson

#12 Robert Haas
robertmhaas@gmail.com
In reply to: Kirill Reshke (#8)
Re: Potential security risk associated with function call

On Tue, Mar 10, 2026 at 8:39 AM Kirill Reshke <reshkekirill@gmail.com> wrote:

I think our global assumption is that superuser is super-wise and
knows everything

Right, but in case they don't, instead of writing their own CREATE
FUNCTION statements, they might want to use CREATE EXTENSION, thus
depending on the wisdom of the extension provider in lieu of their
own.

In ~30 years as a PostgreSQL user and developer, I've only written a
relatively small number of CREATE FUNCTION ... LANGUAGE c/internal
statements myself, and they've all been either for an extension or for
some kind of development exercise. There's no real reason to go around
writing random such statements that are completely broken just for
fun.

By the way, if you think this is a fun way to break your database, try
running "DELETE FROM pg_proc" sometime. Do not, under any
circumstances, do this in a PostgreSQL instance that you ever want to
use for anything ever again. I actually think we should have more
guardrails against this kind of direct system catalog modification
than we do -- like you have to set a GUC saying "yes, I know I'm
potentially about to break everything really badly" before you can
write to the system catalogs. The example that started this thread is
essentially unpreventable, because we need CREATE FUNCTION to be
possible and we need the superuser to tell us what the C code is
expecting, but the number of people who go tinkering with catalog
contents manually without fully understanding the consequences seems
to be much larger than I would have thought, even if the tinkering is
usually less dramatic than this example.

--
Robert Haas
EDB: http://www.enterprisedb.com

#13 章晨曦@易景科技
zhangchenxi@halodbtech.com
In reply to: Robert Haas (#12)
Re: Potential security risk associated with function call

Right, but in case they don't, instead of writing their own CREATE
FUNCTION statements, they might want to use CREATE EXTENSION, thus
depending on the wisdom of the extension provider in lieu of their
own.

In ~30 years as a PostgreSQL user and developer, I've only written a
relatively small number of CREATE FUNCTION ... LANGUAGE c/internal
statements myself, and they've all been either for an extension or for
some kind of development exercise. There's no real reason to go around
writing random such statements that are completely broken just for
fun.

I don't think it's just for fun. People may prefer to use an EXTENSION, but the
problem is that the EXTENSION may have been written by someone who lacks
extension development skills, or who has no coding experience at all and relies
only on AI. That is exactly the case in which I noticed the problem. AI does
everything, and in most cases it works well, but it leaves potential risks. Will
end users really study the whole EXTENSION code? I am sure most of them will not.
And AI will take over most of the coding work; that is what's happening...

Regards,
Jet
Halo Tech

#14 Robert Haas
robertmhaas@gmail.com
In reply to: 章晨曦@易景科技 (#13)
Re: Potential security risk associated with function call

On Tue, Mar 10, 2026 at 10:05 AM Jet <zhangchenxi@halodbtech.com> wrote:

I don't think it's just for fun. People may prefer to use an EXTENSION, but the
problem is that the EXTENSION may have been written by someone who lacks
extension development skills, or who has no coding experience at all and relies
only on AI. That is exactly the case in which I noticed the problem. AI does
everything, and in most cases it works well, but it leaves potential risks. Will
end users really study the whole EXTENSION code? I am sure most of them will not.
And AI will take over most of the coding work; that is what's happening...

Sure, but what do you propose to do about it? As I have already said,
there's no realistic way for PostgreSQL itself to know what the
correct function definition is.

--
Robert Haas
EDB: http://www.enterprisedb.com

#15 Matthias van de Meent
boekewurm+postgres@gmail.com
In reply to: Robert Haas (#5)
Re: Potential security risk associated with function call

On Tue, 10 Mar 2026 at 13:26, Robert Haas <robertmhaas@gmail.com> wrote:

On Tue, Mar 10, 2026 at 8:03 AM Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:

Correct. This is expected behaviour: the "internal" and "c" languages
are not 'trusted' languages, and therefore only superusers can create
functions using these languages. It is the explicit responsibility of
the superuser to make sure the functions they create using untrusted
languages are correct and execute safely when called by PostgreSQL.

Agreed!

In fact, it's pretty much theoretically impossible for this to work
any other way. If we wanted to add checks that the expectations of the
C code match the actual function definitions, how would we do that?
I'm tempted to say we'd have to solve the halting problem (which is
impossible, look it up), but the 2026 reality is that someone would
just say "deploy an AI agent to check whether the code is safe for the
definition," and that might actually work in practical cases, but
we're not going to add a call-out to Claude as part of the CREATE
FUNCTION statement.

Tangent: I think it could be possible to make extensions (and PG
itself) generate more extensive pg_finfo records that contain
sufficient information to describe the functions' expected SQL calling
signature(s), which PG could then check and verify when the function
is catalogued (e.g. through lanvalidator).
E.g. "this function has 2 PG calling signatures: a volatile function
with 2 non-null arguments, or an immutable function with 3 non-null
arguments". Registrations which conflict with the exposed definition
could then raise a warning to expose the difference. This would make
the gap between C code and SQL code that needs to be bridged by manual
superuser validation a bit smaller.
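
As a rough illustration of what such a check could compare (a self-contained sketch; `MockSignature`, `MockFinfo` and `registration_matches` are hypothetical names, and as far as I know the existing `Pg_finfo_record` carries nothing like this):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One SQL calling signature that the C function claims to support. */
typedef struct MockSignature
{
    int  nargs;            /* number of SQL arguments expected */
    bool requires_strict;  /* true if the C code assumes non-null inputs */
} MockSignature;

/* Hypothetical extended info record exported alongside a C function. */
typedef struct MockFinfo
{
    const MockSignature *signatures;
    size_t               nsignatures;
} MockFinfo;

/*
 * Returns true when the declared SQL definition is compatible with at least
 * one advertised signature: the argument count must match, and a function
 * that assumes non-null inputs must be declared STRICT.
 * A validator could raise a warning when this returns false.
 */
bool
registration_matches(const MockFinfo *finfo, int declared_nargs,
                     bool declared_strict)
{
    for (size_t i = 0; i < finfo->nsignatures; i++)
    {
        const MockSignature *sig = &finfo->signatures[i];

        if (sig->nargs == declared_nargs &&
            (declared_strict || !sig->requires_strict))
            return true;
    }
    return false;
}
```

Volatility, argument types and return type are left out here; a real record would need to describe those as well.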

I won't claim it's trivial, but I do think it might be a worthwhile
time investment, and extensions could benefit here, too, as such
metadata could be used to validate and/or generate parts of
extension's install/upgrade scripts.

(And, whilst this is not on my personal todo list, it's definitely on
my wishlist; so do with the idea what you would like).

Kind regards,

Matthias van de Meent
Databricks (https://www.databricks.com)

#16 Nico Williams
nico@cryptonector.com
In reply to: Robert Haas (#12)
Re: Potential security risk associated with function call

On Tue, Mar 10, 2026 at 09:23:50AM -0400, Robert Haas wrote:

[...]. The example that started this thread is
essentially unpreventable, because we need CREATE FUNCTION to be
possible and we need the superuser to tell us what the C code is
expecting, but the number of people who go tinkering with catalog
contents manually without fully understanding the consequences seems
to be much larger than I would have thought, even if the tinkering is
usually less dramatic than this example.

If DWARF is available you could always get the C function's
prototype from that, and sanity-check it. But DWARF really bloats
shared objects, and it's not universal, so it's not a good solution.

C is just a crappy language. You play with fire, you best know what
you're doing -- that's a reasonable policy. And since PG is written in
C, and users do have C-coded extensions here and there, playing with
fire has to be supported.

It'd be clever if there was at least a standard for a subset of DWARF
that provides just the types information (but not, e.g., stack
unwinding) so that we could have some sort of standard reflection
support in C. That would be for the C standards committee.

Nico
--

#17 Matthias van de Meent
boekewurm+postgres@gmail.com
In reply to: Nico Williams (#16)
Re: Potential security risk associated with function call

On Tue, 10 Mar 2026 at 17:19, Nico Williams <nico@cryptonector.com> wrote:

On Tue, Mar 10, 2026 at 09:23:50AM -0400, Robert Haas wrote:

[...]. The example that started this thread is
essentially unpreventable, because we need CREATE FUNCTION to be
possible and we need the superuser to tell us what the C code is
expecting, but the number of people who go tinkering with catalog
contents manually without fully understanding the consequences seems
to be much larger than I would have thought, even if the tinkering is
usually less dramatic than this example.

If DWARF is available you could always get the C function's
prototype from that, and sanity-check it. But DWARF really bloats
shared objects, and it's not universal, so it's not a good solution.

Even with DWARF analysis it wouldn't help for C-language SQL
functions, as their signature is fixed: their one and only argument is
always just a FunctionCallInfo aka FunctionCallInfoBaseData*. That
struct then contains the actual arguments/argument count/nullability
info.

Also note that the "c" language here effectively only means
"dynamically loaded symbol using standard C linking with the
platform's C calling convention": PostgreSQL doesn't compile the
functions from sources. Any language that compiles to a binary that
links with such symbols should work; e.g. C++ and Rust are both using
this mechanism despite the "c" name used for the language.

Kind regards,

Matthias van de Meent
Databricks (https://www.databricks.com)

#18 Mark Woodward
woodwardm@google.com
In reply to: Nico Williams (#16)
Re: Potential security risk associated with function call

On Tue, Mar 10, 2026 at 12:19 PM Nico Williams <nico@cryptonector.com>
wrote:

On Tue, Mar 10, 2026 at 09:23:50AM -0400, Robert Haas wrote:

[...]. The example that started this thread is
essentially unpreventable, because we need CREATE FUNCTION to be
possible and we need the superuser to tell us what the C code is
expecting, but the number of people who go tinkering with catalog
contents manually without fully understanding the consequences seems
to be much larger than I would have thought, even if the tinkering is
usually less dramatic than this example.

If DWARF is available you could always get the C function's
prototype from that, and sanity-check it. But DWARF really bloats
shared objects, and it's not universal, so it's not a good solution.

C is just a crappy language. You play with fire, you best know what
you're doing -- that's a reasonable policy. And since PG is written in
C, and users do have C-coded extensions here and there, playing with
fire has to be supported.

I'm really tired of "C" bashing. The C programming language is a tool.
Effective and powerful tools can be dangerous, but the productivity is
worth it. The problem isn't the "C" language, it is with people who try to
program in C but do not know C.

It'd be clever if there was at least a standard for a subset of DWARF
that provides just the types information (but not, e.g., stack
unwinding) so that we could have some sort of standard reflection
support in C. That would be for the C standards committee.

Why do we need that?

#19 Mark Woodward
woodwardm@google.com
In reply to: 章晨曦@易景科技 (#1)
Re: Potential security risk associated with function call

I have written a lot of postgresql extensions and I think the interface as
it is is pretty good. You do need to be careful and check your inputs. One
of the things we could do is create a set of macros and/or post processing
that could do something like C++ style name mangling for C functions. I
have had to do this manually in the past, but maybe we can create a process
that will scan a source file for meta in comments and create the SQL
function declarations based on the number of parameters and hook them up to
the correct C functions. Something like that. It's an incremental
mitigation. Malicious remapping of SQL functions to bad C functions is not
preventable if you have the permissions to do so.

#20 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Matthias van de Meent (#15)
Re: Potential security risk associated with function call

Matthias van de Meent <boekewurm+postgres@gmail.com> writes:

Tangent: I think it could be possible to make extensions (and PG
itself) generate more extensive pg_finfo records that contain
sufficient information to describe the functions' expected SQL calling
signature(s), which PG could then check and verify when the function
is catalogued (e.g. through lanvalidator).

I think that'd be a lot of work with little result other than to
change what sort of manual validation you have to do. Today, you
have to check "does the function's actual C code match the SQL
definition?". But with this, you'd have to check "does the function's
actual C code match the pg_finfo record?". I'm not seeing a huge win
there.

Many many years ago when we first designed V1 function call protocol,
I had the idea that we could write a tool that inspects C code like

    Datum
    int42pl(PG_FUNCTION_ARGS)
    {
        int32 arg1 = PG_GETARG_INT32(0);
        int16 arg2 = PG_GETARG_INT16(1);
        int32 result;

and automatically derives (or at least cross-checks against) the SQL
definition. And we probably still could write such a tool. But
there's a large fraction of the code base where no attention was paid
to following that layout, and/or one C function was made to handle
several signatures by writing conditional logic to fetch the
arguments. Maybe you could get an AI tool to disentangle such logic,
but how much you wanna trust the results?
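
A very crude version of such a cross-check could just scan the source text for `PG_GETARG_xxx(n)` with a literal index. The sketch below is only an illustration under that assumption: `infer_min_nargs` is a hypothetical helper, it does no real parsing, and it gives a lower bound at best once arguments are fetched conditionally or through other macros.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Scans C source text for PG_GETARG_xxx(n) calls and returns the highest
 * literal argument index seen plus one, i.e. a lower bound on the number of
 * SQL arguments the function expects. Returns 0 if no such call is found.
 */
int
infer_min_nargs(const char *source)
{
    const char *prefix = "PG_GETARG_";
    const char *cursor = source;
    int         min_nargs = 0;

    while ((cursor = strstr(cursor, prefix)) != NULL)
    {
        const char *p;
        char       *end;
        long        argno;

        /* Step past the prefix, then past the rest of the macro name. */
        cursor += strlen(prefix);
        p = cursor;
        while (isalnum((unsigned char) *p) || *p == '_')
            p++;
        if (*p != '(')
            continue;

        /* Only count literal indexes such as "(1)"; skip anything else. */
        argno = strtol(p + 1, &end, 10);
        if (end != p + 1 && *end == ')' && argno >= 0 && argno < 1000 &&
            argno + 1 > min_nargs)
            min_nargs = (int) (argno + 1);
    }
    return min_nargs;
}
```

For the `int42pl` fragment above this reports 2, which could then be compared with the argument count in the SQL definition.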

regards, tom lane

#21 Pavel Stehule
pavel.stehule@gmail.com
In reply to: Tom Lane (#20)
#22 Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#20)
#23 Pavel Stehule
pavel.stehule@gmail.com
In reply to: Andres Freund (#22)