Patch: Add parse_type Function
Hackers,
Attached is a patch to add a new function, `parse_type()`. It parses a type string and returns a record with the typid and typmod for the type, or raises an error if the type is invalid. It’s effectively a thin layer over the parser’s parseTypeString() function.
The purpose of this function is to allow uses to resolve any type name, including aliases, to its formal name as rendered by, for example, pg_dump. An example:
david=# WITH s(s) AS (
SELECT * FROM unnest(ARRAY[
'timestamp(4)',
'interval(0)',
'interval second(0)',
'timestamptz',
'timestamptz(6)',
'varchar',
'varchar(128)'
])
),
p(type, typid, typmod) AS (
SELECT s, ((parse_type(s)).*)
FROM s
)
SELECT type, format_type(typid, typmod) FROM p;
type | format_type
--------------------+--------------------------------
timestamp(4) | timestamp(4) without time zone
interval(0) | interval(0)
interval second(0) | interval second(0)
timestamptz | timestamp with time zone
timestamptz(6) | timestamp(6) with time zone
varchar | character varying
varchar(128) | character varying(128)
(7 rows)
The use case leading to this patch was pgTAP column data type assertions. pgTAP traditionally required full type names for testing data types, but recently added support for aliases. This broke for some types, because it was difficult to extract the typmod correctly, and even when we did, it failed for types such as `interval second(0)`, where `second (0)` is the typmod, and there was no sensible way to access the bit mask to generate the proper typmod to pass to `format_type()`.
Erik Wienhold wrote this function to solve the problem, and I volunteered to submit a patch.
Potential issues and questions for this patch:
* Is `parse_type()` the right name to use? I can see arguments for `parse_type_string()`, `pg_parse_type()`, and `pg_parse_type_string()`. Whatever the consensus is works for me.
* The function returns a record, which is a little fussy. I could see adding a `format_type_string()` function that just contains `SELECT format_type(p.typmod, p.typid) FROM parse_type($1) p` and returns the formatted type. But I think it might also be useful to have access to the typmod and typid directly via this method, since they’re already exposed as `format_type()`’s arguments.
* Is format_type.c the right home for the function? I put it there because I closely associate it with format_type(), but happy to move it to a more appropriate location.
* I tried to link to `format_type` from the docs entry for `parse_type`, and added an ID for the former, but I’m not sure it resolves.
Best,
David
Attachments:
v1-0001-Add-parse_type-SQL-function.patchapplication/octet-stream; name=v1-0001-Add-parse_type-SQL-function.patch; x-unix-mode=0644Download+176-2
Hi
ne 4. 2. 2024 v 18:51 odesílatel David E. Wheeler <david@justatheory.com>
napsal:
Hackers,
Attached is a patch to add a new function, `parse_type()`. It parses a
type string and returns a record with the typid and typmod for the type, or
raises an error if the type is invalid. It’s effectively a thin layer over
the parser’s parseTypeString() function.The purpose of this function is to allow uses to resolve any type name,
including aliases, to its formal name as rendered by, for example, pg_dump.
An example:david=# WITH s(s) AS (
SELECT * FROM unnest(ARRAY[
'timestamp(4)',
'interval(0)',
'interval second(0)',
'timestamptz',
'timestamptz(6)',
'varchar',
'varchar(128)'
])
),
p(type, typid, typmod) AS (
SELECT s, ((parse_type(s)).*)
FROM s
)
SELECT type, format_type(typid, typmod) FROM p;
type | format_type
--------------------+--------------------------------
timestamp(4) | timestamp(4) without time zone
interval(0) | interval(0)
interval second(0) | interval second(0)
timestamptz | timestamp with time zone
timestamptz(6) | timestamp(6) with time zone
varchar | character varying
varchar(128) | character varying(128)
(7 rows)The use case leading to this patch was pgTAP column data type assertions.
pgTAP traditionally required full type names for testing data types, but
recently added support for aliases. This broke for some types, because it
was difficult to extract the typmod correctly, and even when we did, it
failed for types such as `interval second(0)`, where `second (0)` is the
typmod, and there was no sensible way to access the bit mask to generate
the proper typmod to pass to `format_type()`.Erik Wienhold wrote this function to solve the problem, and I volunteered
to submit a patch.Potential issues and questions for this patch:
* Is `parse_type()` the right name to use? I can see arguments for
`parse_type_string()`, `pg_parse_type()`, and `pg_parse_type_string()`.
Whatever the consensus is works for me.
there can be an analogy with parse_ident, so parse_type looks ok
* The function returns a record, which is a little fussy. I could see
adding a `format_type_string()` function that just contains `SELECT
format_type(p.typmod, p.typid) FROM parse_type($1) p` and returns the
formatted type. But I think it might also be useful to have access to the
typmod and typid directly via this method, since they’re already exposed as
`format_type()`’s arguments.
I think so record has sense for this case
* Is format_type.c the right home for the function? I put it there because
I closely associate it with format_type(), but happy to move it to a more
appropriate location.
yes
* I tried to link to `format_type` from the docs entry for `parse_type`,
and added an ID for the former, but I’m not sure it resolves.
I thinks so proposed functionality can be useful
Regards
Pavel
Show quoted text
Best,
David
On Feb 4, 2024, at 13:02, Pavel Stehule <pavel.stehule@gmail.com> wrote:
I thinks so proposed functionality can be useful
Great, thank you!
Is that a review? :-)
D
ne 4. 2. 2024 v 19:30 odesílatel David E. Wheeler <david@justatheory.com>
napsal:
On Feb 4, 2024, at 13:02, Pavel Stehule <pavel.stehule@gmail.com> wrote:
I thinks so proposed functionality can be useful
Great, thank you!
Is that a review? :-)
not yet, but I'll do it
Pavel
Show quoted text
D
On Feb 4, 2024, at 13:52, Pavel Stehule <pavel.stehule@gmail.com> wrote:
not yet, but I'll do it
Nice, thank you. I put it into the Commitfest.
https://commitfest.postgresql.org/47/4807/
Best,
David
Hi David,
Thanks for the patch.
On 04.02.24 18:51, David E. Wheeler wrote:
Hackers,
Attached is a patch to add a new function, `parse_type()`. It parses a type string and returns a record with the typid and typmod for the type, or raises an error if the type is invalid. It’s effectively a thin layer over the parser’s parseTypeString() function.
The purpose of this function is to allow uses to resolve any type name, including aliases, to its formal name as rendered by, for example, pg_dump. An example:
david=# WITH s(s) AS (
SELECT * FROM unnest(ARRAY[
'timestamp(4)',
'interval(0)',
'interval second(0)',
'timestamptz',
'timestamptz(6)',
'varchar',
'varchar(128)'
])
),
p(type, typid, typmod) AS (
SELECT s, ((parse_type(s)).*)
FROM s
)
SELECT type, format_type(typid, typmod) FROM p;
type | format_type
--------------------+--------------------------------
timestamp(4) | timestamp(4) without time zone
interval(0) | interval(0)
interval second(0) | interval second(0)
timestamptz | timestamp with time zone
timestamptz(6) | timestamp(6) with time zone
varchar | character varying
varchar(128) | character varying(128)
(7 rows)The use case leading to this patch was pgTAP column data type assertions. pgTAP traditionally required full type names for testing data types, but recently added support for aliases. This broke for some types, because it was difficult to extract the typmod correctly, and even when we did, it failed for types such as `interval second(0)`, where `second (0)` is the typmod, and there was no sensible way to access the bit mask to generate the proper typmod to pass to `format_type()`.
Erik Wienhold wrote this function to solve the problem, and I volunteered to submit a patch.
Potential issues and questions for this patch:
* Is `parse_type()` the right name to use? I can see arguments for `parse_type_string()`, `pg_parse_type()`, and `pg_parse_type_string()`. Whatever the consensus is works for me.
* The function returns a record, which is a little fussy. I could see adding a `format_type_string()` function that just contains `SELECT format_type(p.typmod, p.typid) FROM parse_type($1) p` and returns the formatted type. But I think it might also be useful to have access to the typmod and typid directly via this method, since they’re already exposed as `format_type()`’s arguments.
* Is format_type.c the right home for the function? I put it there because I closely associate it with format_type(), but happy to move it to a more appropriate location.
* I tried to link to `format_type` from the docs entry for `parse_type`, and added an ID for the former, but I’m not sure it resolves.
Best,
David
The patch applies cleanly
Checking patch doc/src/sgml/func.sgml...
Checking patch src/backend/utils/adt/format_type.c...
Checking patch src/include/catalog/pg_proc.dat...
Checking patch src/include/utils/builtins.h...
Checking patch src/test/regress/expected/create_type.out...
Checking patch src/test/regress/sql/create_type.sql...
Applied patch doc/src/sgml/func.sgml cleanly.
Applied patch src/backend/utils/adt/format_type.c cleanly.
Applied patch src/include/catalog/pg_proc.dat cleanly.
Applied patch src/include/utils/builtins.h cleanly.
Applied patch src/test/regress/expected/create_type.out cleanly.
Applied patch src/test/regress/sql/create_type.sql cleanly.
The docs render just fine and all tests pass (check and check-world).
There are a few issues though:
1) The function causes a crash when called with a NULL parameter:
SELECT * FROM parse_type(NULL);
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
The connection to the server was lost. Attempting reset: Failed.
You have to check it in the beginning of function and either return an
error message or just NULL, e.g
if (PG_ARGISNULL(0))
PG_RETURN_NULL();
2) Add a function call with NULL in the regression tests
SELECT * FROM parse_type(NULL);
3) Add the expected behaviour of calling the function with NULL in the
docs (error message or null)
4) The name of the returned columns is repeated in the docs
Returns a record with two fields, <parameter>typid</parameter> and
<parameter>typid</parameter>
--
Jim
On Feb 5, 2024, at 08:02, Jim Jones <jim.jones@uni-muenster.de> wrote:
There are a few issues though:
Excellent, thank you very much! Updated patch attached.
Best,
David
Attachments:
v2-0001-Add-parse_type-SQL-function.patchapplication/octet-stream; name=v2-0001-Add-parse_type-SQL-function.patch; x-unix-mode=0644Download+188-2
Jim Jones <jim.jones@uni-muenster.de> writes:
1) The function causes a crash when called with a NULL parameter:
SELECT * FROM parse_type(NULL);
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
The connection to the server was lost. Attempting reset: Failed.You have to check it in the beginning of function and either return an
error message or just NULL, e.gif (PG_ARGISNULL(0))
PG_RETURN_NULL();
Instead of handling this in the function body, the function should be
declared as strict. This is in fact the default in pg_proc.h,
/* strict with respect to NULLs? */
bool proisstrict BKI_DEFAULT(t);
so removing the explicit proisstrict => 'f' from the pg_proc.dat entry
will fix it with no additional code.
2) Add a function call with NULL in the regression tests
SELECT * FROM parse_type(NULL);
3) Add the expected behaviour of calling the function with NULL in the
docs (error message or null)
Once the function is declared strict, I don't think either of these is
necessary: function strictness is tested elsewhere, and it's the default
behaviour. The only functions that explicitly say they return NULL on
NULL inputs are quote_literal (because you might expect it to return the
string 'NULL', but there's qoute_nullable for that) and xmlexists (which
I don't see any particular reason for).
- ilmari
On 05.02.24 15:10, David E. Wheeler wrote:
Excellent, thank you very much! Updated patch attached.
Best,
David
v2 no longer crashes with a null parameter.
docs and regress tests were updated accordingly.
patch no longer applies cleanly (tiny little indentation issue):
/home/jim/Downloads/v2-0001-Add-parse_type-SQL-function.patch:140:
indent with spaces.
PG_RETURN_NULL();
warning: 1 line adds whitespace errors.
I read the comments again, and something is not entirely clear to me.
Line 494 says "Raises an error on an invalid type." and 501 says
"Returns NULL for an invalid type."
Perhaps merging both comment blocks and rephrasing these sentences would
make things clearer?
--
Jim
On 05.02.24 15:32, Dagfinn Ilmari Mannsåker wrote:
Once the function is declared strict, I don't think either of these is
necessary: function strictness is tested elsewhere, and it's the default
behaviour. The only functions that explicitly say they return NULL on
NULL inputs are quote_literal (because you might expect it to return the
string 'NULL', but there's qoute_nullable for that) and xmlexists (which
I don't see any particular reason for).- ilmari
+1
Yes, if the function was strict (which in the current design is not the
case) there is no need to check for nulls.
--
Jim
On Feb 5, 2024, at 09:49, Jim Jones <jim.jones@uni-muenster.de> wrote:
Yes, if the function was strict (which in the current design is not the
case) there is no need to check for nulls.
Right, done, and cleaned up the redundant comments.
Best,
David
Attachments:
v3-0001-Add-parse_type-SQL-function.patchapplication/octet-stream; name=v3-0001-Add-parse_type-SQL-function.patch; x-unix-mode=0644Download+176-2
On 05.02.24 16:14, David E. Wheeler wrote:
Right, done, and cleaned up the redundant comments.
Best,
David
Nice.
The patch applies cleanly and it no longer crashes with NULL parameters.
Docs render fine and regression tests pass. I have nothing more to add.
Let's now wait for Pavel's review.
Thanks!
--
Jim
On Feb 5, 2024, at 10:47 AM, Jim Jones <jim.jones@uni-muenster.de> wrote:
The patch applies cleanly and it no longer crashes with NULL parameters.
Docs render fine and regression tests pass. I have nothing more to add.
Let's now wait for Pavel's review.
Much obliged!
D
+ /*
+ * Parse type-name argument to obtain type OID and encoded typmod. We don't
+ * need to check for parseTypeString failure, but just let the error be
+ * raised. The 0 arg works both as the `Node *escontext` arg in Postgres 16
+ * and the `bool missing_ok` arg in 9.4-15.
+ */
+ (void) parseTypeString(type, &typid, &typmod, 0);
if you are wondering around other code deal with NULL, ErrorSaveContext.
NULL would be more correct?
`(void) parseTypeString(type, &typid, &typmod, NULL);`
also parseTypeString already explained the third argument, our
comments can be simplified as:
/*
* Parse type-name argument to obtain type OID and encoded typmod. We don't
* need to handle parseTypeString failure, just let the error be
* raised.
*/
cosmetic issue. Looking around other error handling code, the format
should be something like:
`
if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("function returning record called in"
"context that cannot accept type record")));
`
`PG_FUNCTION_INFO_V1(parse_type);`
is unnecessary?
based on the error information: https://cirrus-ci.com/task/5899457502380032
not sure why it only fails on windows.
+#define PARSE_TYPE_STRING_COLS 2 /* Returns two columns. */
+#undef PARSE_TYPE_STRING_COLS
Are these necessary?
given that the comments already say return the OID and type modifier.
+ ( <parameter>typid</parameter> <type>oid</type>,
+ <parameter>typmod</parameter> <type>int4</type> )
here `int4` should be `integer`?
commit message:
`Also accounts for data typs that require the SQL grammar to be parsed:`
except the typo issue, this sentence is still hard for me to understand.
The `parse_type()` function uses the underlying `parseTypeString()` C
function to parse a string representing a data type into a type ID and
typmod suitabld for passing to `format_type()`.
suitabld should be suitable.
+ <para>
+ Parses a string representing an SQL type declaration as used in a
+ <command>CREATE TABLE</command> statement, optionally schema-qualified.
+ Returns a record with two fields, <parameter>typid</parameter> and
+ <parameter>typmod</parameter>, representing the OID and
modifier for the
+ type. These correspond to the parameters to pass to the
+ <link linkend="format_type"><function>format_type</function>
function.</link>
+ </para>
can be simplified:
+ <para>
+ Parses a string representing an SQL data type, optionally
schema-qualified.
+ Returns a record with two fields, <parameter>typid</parameter> and
+ <parameter>typmod</parameter>, representing the OID and
modifier for the
+ type. These correspond to the parameters to pass to the
+ <link linkend="format_type"><function>format_type</function>
function.</link>
+ </para>
(I don't have a strong opinion though)
Let me comment on some issues since I wrote the very first version of
parse_type() on which David's patch is based.
On 2024-02-01 01:00 +0100, jian he <jian.universality@gmail.com> wrote:
cosmetic issue. Looking around other error handling code, the format
should be something like:
`
if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("function returning record called in"
"context that cannot accept type record")));
`
+1
`PG_FUNCTION_INFO_V1(parse_type);`
is unnecessary?
based on the error information: https://cirrus-ci.com/task/5899457502380032
not sure why it only fails on windows.
Yeah, it's not necessary for internal functions per [1]https://www.postgresql.org/docs/current/xfunc-c.html#XFUNC-C-V1-CALL-CONV. It's a
leftover from when this function was part of the pgtap extension.
+#define PARSE_TYPE_STRING_COLS 2 /* Returns two columns. */ +#undef PARSE_TYPE_STRING_COLS Are these necessary? given that the comments already say return the OID and type modifier.
Not sure what the convention here is but I found this in a couple of
places, maybe even a tutorial on writing C functions. See
`git grep '_COLS\s\+[1-9]'` for those instances. It's in line with my
habit of avoiding magic constants.
+ ( <parameter>typid</parameter> <type>oid</type>, + <parameter>typmod</parameter> <type>int4</type> ) here `int4` should be `integer`?
+1
commit message:
`Also accounts for data typs that require the SQL grammar to be parsed:`
except the typo issue, this sentence is still hard for me to understand.
Yes, the sentence is rather handwavy. What is meant here is that this
function also resolves types whose typmod is determined by the SQL
parser or some step after that. There are types whose typmod is
whatever value is found inside the parenthesis, e.g. bit(13) has typmod
13. Our first idea before coming up with that function was to do some
string manipulation and extract the typmod from inside the parenthesis.
But we soon found out that other typmods don't translate one to one,
e.g. varchar(13) has typmod 17. The interval type is also special
because the typmod is some bitmask encoding of e.g. 'second(0)'. Hence
the mention of the SQL grammar.
+ <para> + Parses a string representing an SQL type declaration as used in a + <command>CREATE TABLE</command> statement, optionally schema-qualified. + Returns a record with two fields, <parameter>typid</parameter> and + <parameter>typmod</parameter>, representing the OID and modifier for the + type. These correspond to the parameters to pass to the + <link linkend="format_type"><function>format_type</function> function.</link> + </para>can be simplified: + <para> + Parses a string representing an SQL data type, optionally schema-qualified. + Returns a record with two fields, <parameter>typid</parameter> and + <parameter>typmod</parameter>, representing the OID and modifier for the + type. These correspond to the parameters to pass to the + <link linkend="format_type"><function>format_type</function> function.</link> + </para> (I don't have a strong opinion though)
Sounds better. The CREATE TABLE reference is superfluous.
[1]: https://www.postgresql.org/docs/current/xfunc-c.html#XFUNC-C-V1-CALL-CONV
--
Erik
On Feb 10, 2024, at 20:52, Erik Wienhold <ewie@ewie.name> wrote:
Let me comment on some issues since I wrote the very first version of
parse_type() on which David's patch is based.
Thanks Erik.
On 2024-02-01 01:00 +0100, jian he <jian.universality@gmail.com> wrote:
if you are wondering around other code deal with NULL, ErrorSaveContext.
NULL would be more correct?
`(void) parseTypeString(type, &typid, &typmod, NULL);`
Fixed.
also parseTypeString already explained the third argument, our
comments can be simplified as:
/*
* Parse type-name argument to obtain type OID and encoded typmod. We don't
* need to handle parseTypeString failure, just let the error be
* raised.
*/
Thanks, adopted that language.
cosmetic issue. Looking around other error handling code, the format
should be something like:
`
if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("function returning record called in"
"context that cannot accept type record")));
`+1
I poked around and found some other examples, yes. I went with a single long line for errmsg following the example in adminpack.c[1]https://github.com/postgres/postgres/blob/09eb633e1baa3b7cd7929f3cc77f9c46f63c20b1/contrib/adminpack/adminpack.c#L508-L511
`PG_FUNCTION_INFO_V1(parse_type);`
is unnecessary?
based on the error information: https://cirrus-ci.com/task/5899457502380032
not sure why it only fails on windows.Yeah, it's not necessary for internal functions per [1]. It's a
leftover from when this function was part of the pgtap extension.
Removed.
+#define PARSE_TYPE_STRING_COLS 2 /* Returns two columns. */ +#undef PARSE_TYPE_STRING_COLS Are these necessary? given that the comments already say return the OID and type modifier.Not sure what the convention here is but I found this in a couple of
places, maybe even a tutorial on writing C functions. See
`git grep '_COLS\s\+[1-9]'` for those instances. It's in line with my
habit of avoiding magic constants.
Left in place for now.
+ ( <parameter>typid</parameter> <type>oid</type>, + <parameter>typmod</parameter> <type>int4</type> ) here `int4` should be `integer`?+1
Fixed.
Yes, the sentence is rather handwavy. What is meant here is that this
function also resolves types whose typmod is determined by the SQL
parser or some step after that. There are types whose typmod is
whatever value is found inside the parenthesis, e.g. bit(13) has typmod
13. Our first idea before coming up with that function was to do some
string manipulation and extract the typmod from inside the parenthesis.
But we soon found out that other typmods don't translate one to one,
e.g. varchar(13) has typmod 17. The interval type is also special
because the typmod is some bitmask encoding of e.g. 'second(0)'. Hence
the mention of the SQL grammar.
I adopted some of your language here --- and fixed the spelling errors :-)
can be simplified: + <para> + Parses a string representing an SQL data type, optionally schema-qualified. + Returns a record with two fields, <parameter>typid</parameter> and + <parameter>typmod</parameter>, representing the OID and modifier for the + type. These correspond to the parameters to pass to the + <link linkend="format_type"><function>format_type</function> function.</link> + </para> (I don't have a strong opinion though)Sounds better. The CREATE TABLE reference is superfluous.
Done.
Best,
Attachments:
v4-0001-Add-parse_type-SQL-function.patchapplication/octet-stream; name=v4-0001-Add-parse_type-SQL-function.patch; x-unix-mode=0644Download+168-2
"David E. Wheeler" <david@justatheory.com> writes:
[ v4-0001-Add-parse_type-SQL-function.patch ]
It strikes me that this is basically to_regtype() with the additional
option to return the typmod. That leads to some questions:
* Should we choose a name related to to_regtype? I don't have any
immediate suggestions, but since you didn't seem entirely set on
parse_type, maybe it's worth thinking along those lines. OTOH,
to the extent that people would use this with format_type, maybe
parse_type is fine.
* Perhaps the type OID output argument should be declared regtype
not plain OID? It wouldn't make any difference for passing it to
format_type, but in other use-cases perhaps regtype would be more
readable. It's not a deal-breaker either way of course, since
you can cast oid to regtype or vice versa.
* Maybe the code should be in adt/regproc.c not format_type.c.
* Experience with to_regtype suggests strongly that people would
prefer "return NULL" to failure for an unrecognized type name.
Skimming the patch, I notice that the manual addition to
builtins.h should be unnecessary: the pg_proc.dat entry
should be enough to create an extern in fmgrprotos.h.
Also, I'm pretty sure that reformat_dat_file.pl will
think your pg_proc.dat entry is overly verbose. See
https://www.postgresql.org/docs/devel/system-catalog-initial-data.html
regards, tom lane
I wrote:
It strikes me that this is basically to_regtype() with the additional
option to return the typmod. That leads to some questions:
BTW, another way that this problem could be approached is to use
to_regtype() as-is, with a separate function to obtain the typmod:
select format_type(to_regtype('timestamp(4)'), to_regtypmod('timestamp(4)'));
This is intellectually ugly, since it implies parsing the same
typename string twice. But on the other hand it avoids the notational
pain and runtime overhead involved in using a record-returning
function. So I think it might be roughly a wash for performance.
Question to think about is which way is easier to use. I don't
have an opinion particularly; just throwing the idea out there.
regards, tom lane
po 12. 2. 2024 v 19:20 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:
I wrote:
It strikes me that this is basically to_regtype() with the additional
option to return the typmod. That leads to some questions:BTW, another way that this problem could be approached is to use
to_regtype() as-is, with a separate function to obtain the typmod:select format_type(to_regtype('timestamp(4)'),
to_regtypmod('timestamp(4)'));This is intellectually ugly, since it implies parsing the same
typename string twice. But on the other hand it avoids the notational
pain and runtime overhead involved in using a record-returning
function. So I think it might be roughly a wash for performance.
Question to think about is which way is easier to use. I don't
have an opinion particularly; just throwing the idea out there.
+1
Pavel
Show quoted text
regards, tom lane
On Feb 12, 2024, at 12:53 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
"David E. Wheeler" <david@justatheory.com> writes:
[ v4-0001-Add-parse_type-SQL-function.patch ]
It strikes me that this is basically to_regtype() with the additional
option to return the typmod. That leads to some questions:
Huh. I saw it more on a par with format_type(), at least in the output I expect.
* Should we choose a name related to to_regtype? I don't have any
immediate suggestions, but since you didn't seem entirely set on
parse_type, maybe it's worth thinking along those lines. OTOH,
to the extent that people would use this with format_type, maybe
parse_type is fine.
Yeah, and the `to_*` functions return a single OID, not two fields.
* Perhaps the type OID output argument should be declared regtype
not plain OID? It wouldn't make any difference for passing it to
format_type, but in other use-cases perhaps regtype would be more
readable. It's not a deal-breaker either way of course, since
you can cast oid to regtype or vice versa.
Sure. I used `oid` because that’s exactly what the argument to format_type() is called, so thought the parity there more appropriate. Maybe its argument should be renamed to regtype?
* Maybe the code should be in adt/regproc.c not format_type.c.
Happy to move it wherever makes the most sense.
* Experience with to_regtype suggests strongly that people would
prefer "return NULL" to failure for an unrecognized type name.
Could do, but as with to_regtype() there’s an issue with error handling when it descends into the SQL parser.
Skimming the patch, I notice that the manual addition to
builtins.h should be unnecessary: the pg_proc.dat entry
should be enough to create an extern in fmgrprotos.h.
Confirmed, will remove in next patch.
Also, I'm pretty sure that reformat_dat_file.pl will
think your pg_proc.dat entry is overly verbose. See
https://www.postgresql.org/docs/devel/system-catalog-initial-data.html
There are a bunch of things it reformats:
❯ git diff --name-only
src/include/catalog/pg_aggregate.dat
src/include/catalog/pg_database.dat
src/include/catalog/pg_proc.dat
src/include/catalog/pg_type.dat
src/include/utils/builtins.h
And even in pg_proc.dat there are 13 blocks it reformats. I can submit with just my block formatted if you’d prefer.
BTW, another way that this problem could be approached is to use
to_regtype() as-is, with a separate function to obtain the typmod:select format_type(to_regtype('timestamp(4)'), to_regtypmod('timestamp(4)'));
Oh interesting.
This is intellectually ugly, since it implies parsing the same
typename string twice. But on the other hand it avoids the notational
pain and runtime overhead involved in using a record-returning
function. So I think it might be roughly a wash for performance.
Question to think about is which way is easier to use. I don't
have an opinion particularly; just throwing the idea out there.
Honestly I haven’t cared for the fact that parse_type() returns a record; it makes it a bit clunky. But yeah, so does this to pass the same value to two function calls.
Perhaps it makes sense to add to_regtypmod() as you suggest, but then also something like:
CREATE FUNCTION format_type_string(text) RETURNS TEXT AS $$
SELECT format_type(to_regtype($1), to_regtypmod($1))
$$;
Best,
David