generating function default settings from pg_proc.dat

Started by Andrew Dunstan18 days ago28 messages
Jump to latest
#1Andrew Dunstan
andrew@dunslane.net

Motivated by Bug 19409 [1]/messages/by-id/19409-e16cd2605e59a4af@postgresql.org I decided to do something about a wart that
has bugged me for a while, namely the requirement to write stuff in
system_views.sql if you need to specify default values for function
arguments. Here's my attempt. The first patch here sets up the required
infrastructure. genbki.pl creates a file called function_defaults.sql
which is run by initdb at the appropriate time. There are two new fields
in pg_proc.dat entries: proargdflts,and provariadicdflt. These are
parsed and the appropriate CREATE OR REPLACE statement is generated and
placed in function_defaults.sql. The second patch applies this treatment
to 37 function definitions and removes the corresponding statements from
system_views.sql. This gets us closer to having pg_proc.dat as a single
source of truth.

cheers

andrew

[1]: /messages/by-id/19409-e16cd2605e59a4af@postgresql.org
/messages/by-id/19409-e16cd2605e59a4af@postgresql.org

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Attachments:

0001-Add-infrastructure-for-proargdflts-in-pg_proc.dat.patchtext/x-patch; charset=UTF-8; name=0001-Add-infrastructure-for-proargdflts-in-pg_proc.dat.patchDownload+225-5
0002-Move-function-defaults-from-system_functions.sql-to-.patchtext/x-patch; charset=UTF-8; name=0002-Move-function-defaults-from-system_functions.sql-to-.patchDownload+75-291
#2Daniel Gustafsson
daniel@yesql.se
In reply to: Andrew Dunstan (#1)
Re: generating function default settings from pg_proc.dat

On 16 Feb 2026, at 18:31, Andrew Dunstan <andrew@dunslane.net> wrote:

Motivated by Bug 19409 [1] I decided to do something about a wart that has bugged me for a while, namely the requirement to write stuff in system_views.sql if you need to specify default values for function arguments.

I haven't studied the patch yet, but a big +many on the idea. This has bugged
me many times.

--
Daniel Gustafsson

#3Corey Huinker
corey.huinker@gmail.com
In reply to: Andrew Dunstan (#1)
Re: generating function default settings from pg_proc.dat

On Mon, Feb 16, 2026 at 12:31 PM Andrew Dunstan <andrew@dunslane.net> wrote:

Motivated by Bug 19409 [1] I decided to do something about a wart that
has bugged me for a while, namely the requirement to write stuff in
system_views.sql if you need to specify default values for function
arguments. Here's my attempt. The first patch here sets up the required
infrastructure. genbki.pl creates a file called function_defaults.sql
which is run by initdb at the appropriate time. There are two new fields
in pg_proc.dat entries: proargdflts,and provariadicdflt. These are
parsed and the appropriate CREATE OR REPLACE statement is generated and
placed in function_defaults.sql. The second patch applies this treatment
to 37 function definitions and removes the corresponding statements from
system_views.sql. This gets us closer to having pg_proc.dat as a single
source of truth.

+1 for the attempt. My preference would be for allowing CREATE OR REPLACE
to specify an oid, but that's a non-starter for bootstrapping purposes, so
this is the next best option.

The defaults read a little funny in that a human reader must line up the
defaults right-to-left to then determine a given parameter's default, if
any. For example:

   proname => 'json_strip_nulls', prorettype => 'json',
-  proargtypes => 'json bool', prosrc => 'json_strip_nulls' },
+  proargtypes => 'json bool',
+  proargnames => '{target,strip_in_arrays}', proargdflts => '{false}',
+  prosrc => 'json_strip_nulls' },

Perhaps we could require proargdflts to pre-pad with empty values

proargdflts => '{,false}'

or even be a hash

proargdflts => { strip_in_arrays => 'false' }

which I will grant you is wordy, but its definitely clearer.

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#1)
Re: generating function default settings from pg_proc.dat

Andrew Dunstan <andrew@dunslane.net> writes:

Motivated by Bug 19409 [1] I decided to do something about a wart that
has bugged me for a while, namely the requirement to write stuff in
system_views.sql if you need to specify default values for function
arguments. Here's my attempt. The first patch here sets up the required
infrastructure. genbki.pl creates a file called function_defaults.sql
which is run by initdb at the appropriate time. There are two new fields
in pg_proc.dat entries: proargdflts,and provariadicdflt. These are
parsed and the appropriate CREATE OR REPLACE statement is generated and
placed in function_defaults.sql. The second patch applies this treatment
to 37 function definitions and removes the corresponding statements from
system_views.sql. This gets us closer to having pg_proc.dat as a single
source of truth.

Nice! I'll review this in more detail later, but it seems like
a great idea.

One nitpick is that it's not clear how someone would discover the
existence of this feature. We could of course add info to bki.sgml,
but I wish it were more organic. Is there a way that we could have
these entries appear in "proargdefaults" so that there's an obvious
connection to the actual catalog field?

regards, tom lane

#5Andres Freund
andres@anarazel.de
In reply to: Andrew Dunstan (#1)
Re: generating function default settings from pg_proc.dat

Hi,

On 2026-02-16 12:31:37 -0500, Andrew Dunstan wrote:

Motivated by Bug 19409 [1] I decided to do something about a wart that has
bugged me for a while, namely the requirement to write stuff in
system_views.sql if you need to specify default values for function
arguments.

I'm one more person this has been bugging.

A related issue is that we have grants and revokes as part of
system_functions.sql. Perhaps it's worth tackling that at the same time?

Here's my attempt. The first patch here sets up the required
infrastructure. genbki.pl creates a file called function_defaults.sql which
is run by initdb at the appropriate time.

Hm. I don't love that we first insert something into pg_proc, then generate a
full blown CREATE OR REPLACE FUNCTION with all the details to edit it. It
wouldn't take too much to somehow end up, down the line, with a silent
mismatch between what pg_proc.h contains and what CREATE OR REPLACE FUNCTION
ends up generating. I'm not sure we'd be likely to notice such a mismatch
immediately.

In fact, I think the current translation may have such an issue, the generated
CREATE OR REPLACE doesn't seem include ROWS, which seems to lead to resetting
prorows to the default 1000, regardless of what it was before.

DROP FUNCTION IF EXISTS foo();
CREATE FUNCTION foo() RETURNS SETOF int LANGUAGE SQL VOLATILE ROWS 10 AS $$SELECT generate_series(1, 10)$$;
SELECT prorows FROM pg_proc WHERE proname = 'foo';
┌─────────┐
│ prorows │
├─────────┤
│ 10 │
└─────────┘

CREATE OR REPLACE FUNCTION foo() RETURNS SETOF int LANGUAGE SQL VOLATILE AS $$SELECT generate_series(1, 10)$$;
SELECT prorows FROM pg_proc WHERE proname = 'foo';
┌─────────┐
│ prorows │
├─────────┤
│ 1000 │
└─────────┘

I wish we could just generate the parse-analyzed representation for default
values during bootstrap, but that's probably not realistic.

Although, I guess we don't really need the full machinery. Afaict we just need
a list of simple CONST nodes. There are two current functions (random_normal()
and pg_terminate_backend()), where the default expressions contains casts
implemented via FUNCEXPR, but that's just sloppiness in the DEFAULT
specification we probably should fix independently of everything else.

Greetings,

Andres Freund

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#5)
Re: generating function default settings from pg_proc.dat

Andres Freund <andres@anarazel.de> writes:

I wish we could just generate the parse-analyzed representation for default
values during bootstrap, but that's probably not realistic.

Although, I guess we don't really need the full machinery. Afaict we just need
a list of simple CONST nodes.

Const is enough to be problematic. In particular, the bytes of the
stored Datum are shown in physical order so that the results are
endian-dependent. We can't have machine dependencies in postgres.bki.
I suppose it might be possible to rethink the printed representation
of Const nodes to dodge that problem, but that's starting to make the
project seem rather complex. Even without that, hand-maintained
byte-level representations of jsonb, float8, text[] seem like a pretty
bad idea.

I think what we'd really want here is some smarts in backend bootstrap
mode to be able to invoke the correct datatype input function to
convert the type's standard string representation into a Datum.
I wonder how complicated that'd be.

regards, tom lane

#7Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#6)
Re: generating function default settings from pg_proc.dat

Hi,

On 2026-02-16 14:13:45 -0500, Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

I wish we could just generate the parse-analyzed representation for default
values during bootstrap, but that's probably not realistic.

Although, I guess we don't really need the full machinery. Afaict we just need
a list of simple CONST nodes.

Const is enough to be problematic. In particular, the bytes of the
stored Datum are shown in physical order so that the results are
endian-dependent. We can't have machine dependencies in postgres.bki.

I was more thinking we would teach bootstrap.c/bootparse.y to generate the
List(Const+) from a simpler representation that would be included in
postgres.bki, rather than include the node tree soup in postgres.bki.

I think what we'd really want here is some smarts in backend bootstrap
mode to be able to invoke the correct datatype input function to
convert the type's standard string representation into a Datum.
I wonder how complicated that'd be.

Seems we were thinking something roughly similar...

Looks like the slightly difficult bit is that we haven't assembled the pg_proc
row by the time we'd do the OidInputFunctionCall() in InsertOneValue(), so
we'd not trivially know the type of the corresponding column.

But if we made the input something like {'some'::type1, 'value'::type2}, we
wouldn't need to know the corresponding column's type, and genbki could build
it.

I'm starting to wonder if we shouldn't do something more bespoke for all
functions during bootstrap in general, rather than just for the argument
defaults.

Particularly for SRFs, I find it rather painful to keep proargtypes,
proallargtypes, proargmodes, proargnames in sync. Not helped by proargtypes
and proallargtypes/proargmodes/... having a different input syntax. I've
spent too much time trying to keep the arguments of stats functions in sync.

{ oid => '3786', descr => 'set up a logical replication slot',
proname => 'pg_create_logical_replication_slot', provolatile => 'v',
proparallel => 'u', prorettype => 'record',
proargtypes => 'name name bool bool bool',
proallargtypes => '{name,name,bool,bool,bool,name,pg_lsn}',
proargmodes => '{i,i,i,i,i,o,o}',
proargnames => '{slot_name,plugin,temporary,twophase,failover,slot_name,lsn}',
prosrc => 'pg_create_logical_replication_slot' },

+

CREATE OR REPLACE FUNCTION pg_create_logical_replication_slot(
IN slot_name name, IN plugin name,
IN temporary boolean DEFAULT false,
IN twophase boolean DEFAULT false,
IN failover boolean DEFAULT false,
OUT slot_name name, OUT lsn pg_lsn)
RETURNS RECORD
LANGUAGE INTERNAL
STRICT VOLATILE
AS 'pg_create_logical_replication_slot';

could be

{ oid => '3786', descr => 'set up a logical replication slot',
proname => 'pg_create_logical_replication_slot', provolatile => 'v',
proparallel => 'u',
proargs => [
{type => 'name', name => 'slot_name'},
{type => 'name', name => 'plugin'},
{type => 'bool', name => 'temporary', default => 'false'},
{type => 'bool', name => 'twophase', default => 'false'},
{type => 'bool', name => 'failover', default => 'false'},
],
prorettype => [
{type => 'name', name => 'slot_name'},
{type => 'pg_lsn', name => 'lsn'},
]
}

and then either turn that into something like the current representation, or
perhaps better, into a new top-level 'proc' bootstrap command, which then
would know how to interpret the default literals into the node soup.

Greetings,

Andres Freund

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#7)
Re: generating function default settings from pg_proc.dat

Andres Freund <andres@anarazel.de> writes:

On 2026-02-16 14:13:45 -0500, Tom Lane wrote:

Const is enough to be problematic. In particular, the bytes of the
stored Datum are shown in physical order so that the results are
endian-dependent. We can't have machine dependencies in postgres.bki.

I was more thinking we would teach bootstrap.c/bootparse.y to generate the
List(Const+) from a simpler representation that would be included in
postgres.bki, rather than include the node tree soup in postgres.bki.

Right, maintaining pg_node_tree strings is exactly what we don't want
to do.

Looks like the slightly difficult bit is that we haven't assembled the pg_proc
row by the time we'd do the OidInputFunctionCall() in InsertOneValue(), so
we'd not trivially know the type of the corresponding column.

Ah, I'd not got far enough to notice that.

But if we made the input something like {'some'::type1, 'value'::type2}, we
wouldn't need to know the corresponding column's type, and genbki could build
it.

Hmm. Your idea of a bespoke 'proc' command would avoid the need for
duplication, I think, although I'm not sure how to write that without
it becoming its own source of maintenance pain.

Particularly for SRFs, I find it rather painful to keep proargtypes,
proallargtypes, proargmodes, proargnames in sync. Not helped by proargtypes
and proallargtypes/proargmodes/... having a different input syntax. I've
spent too much time trying to keep the arguments of stats functions in sync.

Agreed, we could stand to do that better.

proargs => [
{type => 'name', name => 'slot_name'},
{type => 'name', name => 'plugin'},
{type => 'bool', name => 'temporary', default => 'false'},
{type => 'bool', name => 'twophase', default => 'false'},
{type => 'bool', name => 'failover', default => 'false'},
],
prorettype => [
{type => 'name', name => 'slot_name'},
{type => 'pg_lsn', name => 'lsn'},
]
}

I'd be inclined to keep prorettype separate from the output
arguments, but otherwise something like this seems attractive.

Who's going to work on this? I'm happy to take a swing at it,
but don't want to duplicate someone else's effort.

regards, tom lane

#9Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#8)
Re: generating function default settings from pg_proc.dat

On 2026-02-16 Mo 3:02 PM, Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

On 2026-02-16 14:13:45 -0500, Tom Lane wrote:

Const is enough to be problematic. In particular, the bytes of the
stored Datum are shown in physical order so that the results are
endian-dependent. We can't have machine dependencies in postgres.bki.

I was more thinking we would teach bootstrap.c/bootparse.y to generate the
List(Const+) from a simpler representation that would be included in
postgres.bki, rather than include the node tree soup in postgres.bki.

Right, maintaining pg_node_tree strings is exactly what we don't want
to do.

Looks like the slightly difficult bit is that we haven't assembled the pg_proc
row by the time we'd do the OidInputFunctionCall() in InsertOneValue(), so
we'd not trivially know the type of the corresponding column.

Ah, I'd not got far enough to notice that.

But if we made the input something like {'some'::type1, 'value'::type2}, we
wouldn't need to know the corresponding column's type, and genbki could build
it.

Hmm. Your idea of a bespoke 'proc' command would avoid the need for
duplication, I think, although I'm not sure how to write that without
it becoming its own source of maintenance pain.

Particularly for SRFs, I find it rather painful to keep proargtypes,
proallargtypes, proargmodes, proargnames in sync. Not helped by proargtypes
and proallargtypes/proargmodes/... having a different input syntax. I've
spent too much time trying to keep the arguments of stats functions in sync.

Agreed, we could stand to do that better.

proargs => [
{type => 'name', name => 'slot_name'},
{type => 'name', name => 'plugin'},
{type => 'bool', name => 'temporary', default => 'false'},
{type => 'bool', name => 'twophase', default => 'false'},
{type => 'bool', name => 'failover', default => 'false'},
],
prorettype => [
{type => 'name', name => 'slot_name'},
{type => 'pg_lsn', name => 'lsn'},
]
}

I'd be inclined to keep prorettype separate from the output
arguments, but otherwise something like this seems attractive.

Who's going to work on this? I'm happy to take a swing at it,
but don't want to duplicate someone else's effort.

Go for it. I'm happy to review.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#9)
Re: generating function default settings from pg_proc.dat

Andrew Dunstan <andrew@dunslane.net> writes:

On 2026-02-16 Mo 3:02 PM, Tom Lane wrote:

Who's going to work on this? I'm happy to take a swing at it,
but don't want to duplicate someone else's effort.

Go for it. I'm happy to review.

After a little bit of thought, I propose the following sketch
for what to do in the bootstrap backend:

In InsertOneValue(), add a special case for typoid == PG_NODE_TREEOID.
pg_node_tree_in() would just fail anyway so this isn't removing
any useful functionality. The special-case code would check which
column we are entering a value for and dispatch to appropriate code:

if (typoid == PG_NODE_TREEOID)
{
if (RelationGetRelid(boot_reldesc) == ProcedureRelationId &&
i == Anum_pg_proc_proargdefaults - 1)
values[i] = ConvertOneProargdefaultsValue(value);
else
... maybe other cases later
else
elog(ERROR, "can't handle pg_node_tree input");
return;
}

This framework would permit special-casing other catalog columns
in future, if we feel a need for that.

Andres was concerned about not having access to the other columns of
pg_proc, but I think that's mistaken: bootstrap.c's values[] array
should hold the already-parsed values of all earlier columns for the
current entry. So ISTM that we should have enough data to interpret
an input that just looks like an array of input-value strings and
construct a List of Const nodes from that, then flatten it to a
nodetree string.

So with that we'd have enough infrastructure to allow writing
something like

proargdefaults => '{1,2,true}'

This seems orthogonal to Andres' suggestion about refactoring
the pg_proc.dat representation of arguments to not be parallel
arrays but a list of hashes. I think that's likely a good
idea, but I don't want to implement it because I'm not a great
Perl programmer. Do you want to pick up that part?

regards, tom lane

#11Michael Paquier
michael@paquier.xyz
In reply to: Andres Freund (#5)
Re: generating function default settings from pg_proc.dat

On Mon, Feb 16, 2026 at 01:54:49PM -0500, Andres Freund wrote:

On 2026-02-16 12:31:37 -0500, Andrew Dunstan wrote:

Motivated by Bug 19409 [1] I decided to do something about a wart that has
bugged me for a while, namely the requirement to write stuff in
system_views.sql if you need to specify default values for function
arguments.

I'm one more person this has been bugging.

Please add me on the stack of people who have been annoyed by that
largely more than once. Glad to see that this could be enforced in
the bki scripts.
--
Michael

#12Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Andres Freund (#7)
Re: generating function default settings from pg_proc.dat

On 2026-Feb-16, Andres Freund wrote:

could be

{ oid => '3786', descr => 'set up a logical replication slot',
proname => 'pg_create_logical_replication_slot', provolatile => 'v',
proparallel => 'u',
proargs => [
{type => 'name', name => 'slot_name'},
{type => 'name', name => 'plugin'},
{type => 'bool', name => 'temporary', default => 'false'},
{type => 'bool', name => 'twophase', default => 'false'},
{type => 'bool', name => 'failover', default => 'false'},
],
prorettype => [
{type => 'name', name => 'slot_name'},
{type => 'pg_lsn', name => 'lsn'},
]
}

This is pretty much the sort of thing I was imagining when I read
Corey's argumentation. +1 for something along these lines.

--
Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
"Pero la cosa no es muy grave ..." (le petit Nicolas -- René Goscinny)

#13Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#10)
Re: generating function default settings from pg_proc.dat

I wrote:

After a little bit of thought, I propose the following sketch
for what to do in the bootstrap backend:

Here is a draft patch along those lines. I've verified that
the initial contents of pg_proc are exactly as before,
except that json[b]_strip_nulls gain the correct provolatile
value, and a few proargdefaults entries no longer involve
cast functions.

The changes in system_functions.sql and pg_proc.dat are nearly
verbatim from your 0002 patch, except that I had to adjust the
quoting conventions in some places because array_in does it
differently.

regards, tom lane

Attachments:

0001-fill-proargdefaults-during-bootstrap.patchtext/x-diff; charset=us-ascii; name=0001-fill-proargdefaults-during-bootstrap.patchDownload+283-296
#14Corey Huinker
corey.huinker@gmail.com
In reply to: Alvaro Herrera (#12)
Re: generating function default settings from pg_proc.dat

On Tue, Feb 17, 2026 at 1:37 AM Álvaro Herrera <alvherre@kurilemu.de> wrote:

On 2026-Feb-16, Andres Freund wrote:

could be

{ oid => '3786', descr => 'set up a logical replication slot',
proname => 'pg_create_logical_replication_slot', provolatile => 'v',
proparallel => 'u',
proargs => [
{type => 'name', name => 'slot_name'},
{type => 'name', name => 'plugin'},
{type => 'bool', name => 'temporary', default => 'false'},
{type => 'bool', name => 'twophase', default => 'false'},
{type => 'bool', name => 'failover', default => 'false'},
],
prorettype => [
{type => 'name', name => 'slot_name'},
{type => 'pg_lsn', name => 'lsn'},
]
}

This is pretty much the sort of thing I was imagining when I read
Corey's argumentation. +1 for something along these lines.

I like this a lot too, but I'm noticing that with each iteration we're
getting closer to re-inventing SQL. Would it make sense in the long run to
have a mode on the CREATE FUNCTION command that cues initdb to create the
minimal function skeleton with prescribed oid on the first pass, but then
stores the defer-able parts (if any) for a later pass, perhaps in parallel?
Then we wouldn't have to worry about how to model all future additions to
CREATE FUNCTION, and instead focus on what parts of creating the function
need to be in the bootstrap pass.

#15Tom Lane
tgl@sss.pgh.pa.us
In reply to: Corey Huinker (#14)
Re: generating function default settings from pg_proc.dat

Corey Huinker <corey.huinker@gmail.com> writes:

I like this a lot too, but I'm noticing that with each iteration we're
getting closer to re-inventing SQL.

Really? Neither pg_proc.dat nor the constructed postgres.bki file
look anything like SQL to my eye.

Would it make sense in the long run to
have a mode on the CREATE FUNCTION command that cues initdb to create the
minimal function skeleton with prescribed oid on the first pass, but then
stores the defer-able parts (if any) for a later pass, perhaps in parallel?

I seriously, seriously doubt it. That would involve allowing large
amounts of the parser to run in bootstrap mode, and would probably
end in plastering warts all over backend/parser/ to say "do this
in one way normally but some other way in bootstrap". Also, it's
really just syntactic sugar and does nothing for the harder problems
that bootstrap mode has to solve, such as supporting references to
objects that've not been created yet.

regards, tom lane

#16Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#13)
Re: generating function default settings from pg_proc.dat

Hi,

On 2026-02-16 20:36:25 -0500, Tom Lane wrote:

Here is a draft patch along those lines. I've verified that
the initial contents of pg_proc are exactly as before,
except that json[b]_strip_nulls gain the correct provolatile
value, and a few proargdefaults entries no longer involve
cast functions.

Nice.

@@ -817,8 +832,10 @@ $ perl  rewrite_dat_with_prokind.pl  pg_proc.dat
The following column types are supported directly by
<filename>bootstrap.c</filename>: <type>bool</type>,
<type>bytea</type>, <type>char</type> (1 byte),
-      <type>name</type>, <type>int2</type>,
-      <type>int4</type>, <type>regproc</type>, <type>regclass</type>,
+      <type>int2</type>, <type>int4</type>, <type>int8</type>,
+      <type>float4</type>, <type>float8</type>,
+      <type>name</type>,
+      <type>regproc</type>, <type>regclass</type>,
<type>regtype</type>, <type>text</type>,
<type>oid</type>, <type>tid</type>, <type>xid</type>,
<type>cid</type>, <type>int2vector</type>, <type>oidvector</type>,

Don't you also add jsonb support below?

+	/*
+	 * pg_node_tree values can't be inserted normally (pg_node_tree_in would
+	 * just error out), so provide special cases for such columns that we
+	 * would like to fill during bootstrap.
+	 */
+	if (typoid == PG_NODE_TREEOID)
+	{
+		/* pg_proc.proargdefaults */
+		if (RelationGetRelid(boot_reldesc) == ProcedureRelationId &&
+			i == Anum_pg_proc_proargdefaults - 1)
+			values[i] = ConvertOneProargdefaultsValue(value);
+		else					/* maybe other cases later */
+			elog(ERROR, "can't handle pg_node_tree input");

Perhaps add the relid to the ERROR?

+/* ----------------
+ *		ConvertOneProargdefaultsValue
+ *
+ * In general, proargdefaults can be a list of any expressions, but
+ * for bootstrap we only support a list of Const nodes.  The input
+ * has the form of a text array, and we feed non-null elements to the
+ * typinput functions for the appropriate parameters.
+ * ----------------
+ */
+static Datum
+ConvertOneProargdefaultsValue(char *value)
+{
+	int			pronargs;
+	oidvector  *proargtypes;
+	Datum		arrayval;
+	Datum	   *array_datums;
+	bool	   *array_nulls;
+	int			array_count;
+	List	   *proargdefaults;
+
+	/* The pg_proc columns we need to use must have been filled already */
+	StaticAssertDecl(Anum_pg_proc_pronargs < Anum_pg_proc_proargdefaults,
+					 "pronargs must come before proargdefaults");
+	StaticAssertDecl(Anum_pg_proc_pronargdefaults < Anum_pg_proc_proargdefaults,
+					 "pronargdefaults must come before proargdefaults");
+	StaticAssertDecl(Anum_pg_proc_proargtypes < Anum_pg_proc_proargdefaults,
+					 "proargtypes must come before proargdefaults");
+	if (Nulls[Anum_pg_proc_pronargs - 1])
+		elog(ERROR, "pronargs must not be null");
+	if (Nulls[Anum_pg_proc_proargtypes - 1])
+		elog(ERROR, "proargtypes must not be null");
+	pronargs = DatumGetInt16(values[Anum_pg_proc_pronargs - 1]);
+	proargtypes = DatumGetPointer(values[Anum_pg_proc_proargtypes - 1]);
+	Assert(pronargs == proargtypes->dim1);
+
+	/* Parse the input string as a text[] value, then deconstruct to Datums */
+	arrayval = OidFunctionCall3(F_ARRAY_IN,
+								CStringGetDatum(value),
+								ObjectIdGetDatum(TEXTOID),
+								Int32GetDatum(-1));
+	deconstruct_array_builtin(DatumGetArrayTypeP(arrayval), TEXTOID,
+							  &array_datums, &array_nulls, &array_count);

If we convert to cstring below anyway, why not make it a cstring array?

+	/* The values should correspond to the last N argtypes */
+	if (array_count > pronargs)
+		elog(ERROR, "too many proargdefaults entries");
+
+	/* Build the List of Const nodes */
+	proargdefaults = NIL;
+	for (int i = 0; i < array_count; i++)
+	{
+		Oid			argtype = proargtypes->values[pronargs - array_count + i];
+		int16		typlen;
+		bool		typbyval;
+		char		typalign;
+		char		typdelim;
+		Oid			typioparam;
+		Oid			typinput;
+		Oid			typoutput;
+		Oid			typcollation;
+		Datum		defval;
+		bool		defnull;
+		Const	   *defConst;
+
+		boot_get_type_io_data(argtype,
+							  &typlen, &typbyval, &typalign,
+							  &typdelim, &typioparam,
+							  &typinput, &typoutput);
+		typcollation = boot_get_typcollation(argtype);
+		defnull = array_nulls[i];
+		if (defnull)
+			defval = (Datum) 0;
+		else
+		{
+			char	   *defstr = text_to_cstring(DatumGetTextPP(array_datums[i]));
+
+			defval = OidInputFunctionCall(typinput, defstr, typioparam, -1);
+		}
+
+		defConst = makeConst(argtype,
+							 -1,	/* never any typmod */
+							 typcollation,
+							 typlen,
+							 defval,
+							 defnull,
+							 typbyval);
+		proargdefaults = lappend(proargdefaults, defConst);
+	}
+
+	/*
+	 * Hack: fill in pronargdefaults with the right value.  This is surely
+	 * ugly, but it beats making the programmer do it.
+	 */
+	values[Anum_pg_proc_pronargdefaults - 1] = Int16GetDatum(array_count);
+	Nulls[Anum_pg_proc_pronargdefaults - 1] = false;

I don't mind the hack, but I wonder about it location. It's odd that the
caller puts the return value of ConvertOneProargdefaultsValue() into the
values array, but then ConvertOneProargdefaultsValue() also sets
pronargdefaults?

+/* ----------------
+ *		boot_get_typcollation
+ *
+ * Obtain type's collation at bootstrap time.  This intentionally has
+ * the same API as lsyscache.c's get_typcollation.
+ *
+ * XXX would it be better to add another output to boot_get_type_io_data?

Yes, it seems like it would? There aren't many callers for it...

Greetings,

Andres Freund

#17Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#16)
Re: generating function default settings from pg_proc.dat

Andres Freund <andres@anarazel.de> writes:

If we convert to cstring below anyway, why not make it a cstring array?

Ha, I'd forgotten that cstring[] is a thing. Yup, that'd save one
step.

I don't mind the hack, but I wonder about it location. It's odd that the
caller puts the return value of ConvertOneProargdefaultsValue() into the
values array, but then ConvertOneProargdefaultsValue() also sets
pronargdefaults?

Yeah, I'd gone back and forth about whether this function ought to
return the converted datum or just shove it into values[] directly.
Given that it's also filling the pronargdefaults entry, it probably
should take the latter approach.

I'll post a v2 in a bit. Thanks for reviewing!

regards, tom lane

#18Corey Huinker
corey.huinker@gmail.com
In reply to: Tom Lane (#15)
Re: generating function default settings from pg_proc.dat

On Tue, Feb 17, 2026 at 10:00 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Corey Huinker <corey.huinker@gmail.com> writes:

I like this a lot too, but I'm noticing that with each iteration we're
getting closer to re-inventing SQL.

Really? Neither pg_proc.dat nor the constructed postgres.bki file
look anything like SQL to my eye.

It doesn't _look_ like SQL, but we're trying to cover all things that a
CREATE FUNCTION can do, and if we don't model them all, then we're back to
needing CREATE OR REPLACE overlays.

Would it make sense in the long run to
have a mode on the CREATE FUNCTION command that cues initdb to create the
minimal function skeleton with prescribed oid on the first pass, but then
stores the defer-able parts (if any) for a later pass, perhaps in

parallel?

I seriously, seriously doubt it. That would involve allowing large
amounts of the parser to run in bootstrap mode, and would probably
end in plastering warts all over backend/parser/ to say "do this
in one way normally but some other way in bootstrap". Also, it's
really just syntactic sugar and does nothing for the harder problems
that bootstrap mode has to solve, such as supporting references to
objects that've not been created yet.

Fair enough. But in the interest of keeping a single source of truth, what
if we reversed the process, having a build-time perl script parse
system_functions.sql to build a minimal pg_proc.dat-like file just for
bootstrap? We would probably want to break up system_functions.sql into
several files, (admin functions go here, adt-related functions go there,
etc), but we'd have total clarity on all function definitions, and we
wouldn't have to modify the generation of pg_proc.dat unless a new function
syntax feature affected bootstrapping.

#19Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#17)
Re: generating function default settings from pg_proc.dat

Here's the promised v2, which addresses all your review comments.

With respect to the list of supported types in bki.sgml: I wonder if
we should just drop that, because it evidently hasn't been maintained
well. It wasn't at all in sync with the actual contents of TypInfo[].
I made it be so, but ...

Poking further at that, I found that there were a lot of TypInfo[]
entries that were not actually used and seem to have just been
cargo-culted in. So this patch removes all the ones that aren't
demonstrably necessary to get through initdb. Maybe that's too
aggressive, but in view of the potential for maintenance errors
(cf 7cdb633c8) I don't think we should be carrying unused entries
there.

regards, tom lane

Attachments:

v2-0001-Simplify-creation-of-built-in-functions-with-defa.patchtext/x-diff; charset=us-ascii; name*0=v2-0001-Simplify-creation-of-built-in-functions-with-defa.p; name*1=atchDownload+262-325
#20Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#19)
Re: generating function default settings from pg_proc.dat

On 2026-02-17 Tu 1:41 PM, Tom Lane wrote:

Here's the promised v2, which addresses all your review comments.

With respect to the list of supported types in bki.sgml: I wonder if
we should just drop that, because it evidently hasn't been maintained
well. It wasn't at all in sync with the actual contents of TypInfo[].
I made it be so, but ...

Poking further at that, I found that there were a lot of TypInfo[]
entries that were not actually used and seem to have just been
cargo-culted in. So this patch removes all the ones that aren't
demonstrably necessary to get through initdb. Maybe that's too
aggressive, but in view of the potential for maintenance errors
(cf 7cdb633c8) I don't think we should be carrying unused entries
there.

That makes sense. I haven't tested it, but the code looks pretty sane.
Nice job. I guess if we ever want to specify a default that's not a
Const for some reason we'd have to fall back on the system_views
mechanism. But that's not very likely.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

#21Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#20)
#22Tom Lane
tgl@sss.pgh.pa.us
In reply to: Corey Huinker (#18)
#23Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#22)
#24Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#23)
#25Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Tom Lane (#24)
#26Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#25)
#27Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#26)
#28Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#27)