to_regtype() Raises Error
The docs for `to_regtype()` say, “this function will return NULL rather than throwing an error if the name is not found.” And it’s true most of the time:
david=# select to_regtype('foo'), to_regtype('clam');
to_regtype | to_regtype
------------+------------
[null] | [null]
But not others:
david=# select to_regtype('inteval second');
ERROR: syntax error at or near "second"
LINE 1: select to_regtype('inteval second');
^
CONTEXT: invalid type name "inteval second”
I presume this has something to do with not catching errors from the parser?
david=# select to_regtype('clam bake');
ERROR: syntax error at or near "bake"
LINE 1: select to_regtype('clam bake');
^
CONTEXT: invalid type name "clam bake"
Best,
David
On 18/09/2023 00:13 CEST David E. Wheeler <david@justatheory.com> wrote:
The docs for `to_regtype()` say, “this function will return NULL rather than
throwing an error if the name is not found.” And it’s true most of the time:david=# select to_regtype('foo'), to_regtype('clam');
to_regtype | to_regtype
------------+------------
[null] | [null]But not others:
david=# select to_regtype('inteval second');
ERROR: syntax error at or near "second"
LINE 1: select to_regtype('inteval second');
^
CONTEXT: invalid type name "inteval second”
Probably a typo and you meant 'interval second' which works.
I presume this has something to do with not catching errors from the parser?
david=# select to_regtype('clam bake');
ERROR: syntax error at or near "bake"
LINE 1: select to_regtype('clam bake');
^
CONTEXT: invalid type name "clam bake"
Double-quoting the type name to treat it as an identifier works:
test=# select to_regtype('"clam bake"');
to_regtype
------------
<NULL>
(1 row)
So it's basically a matter of keywords vs. identifiers.
--
Erik
On 9/18/23 00:41, Erik Wienhold wrote:
On 18/09/2023 00:13 CEST David E. Wheeler <david@justatheory.com> wrote:
The docs for `to_regtype()` say, “this function will return NULL rather than
throwing an error if the name is not found.” And it’s true most of the time:david=# select to_regtype('foo'), to_regtype('clam');
to_regtype | to_regtype
------------+------------
[null] | [null]But not others:
david=# select to_regtype('inteval second');
ERROR: syntax error at or near "second"
LINE 1: select to_regtype('inteval second');
^
CONTEXT: invalid type name "inteval second”Probably a typo and you meant 'interval second' which works.
No, that is precisely the point. The result should be null instead of
an error.
--
Vik Fearing
Vik Fearing <vik@postgresfriends.org> writes:
No, that is precisely the point. The result should be null instead of
an error.
Yeah, ideally so, but the cost/benefit of making it happen seems
pretty unattractive for now. See the soft-errors thread at [1]/messages/by-id/3bbbb0df-7382-bf87-9737-340ba096e034@postgrespro.ru,
particularly [2]/messages/by-id/3342239.1671988406@sss.pgh.pa.us (but searching in that thread for references to
regclassin, regtypein, and to_reg* will find additional detail).
regards, tom lane
[1]: /messages/by-id/3bbbb0df-7382-bf87-9737-340ba096e034@postgrespro.ru
[2]: /messages/by-id/3342239.1671988406@sss.pgh.pa.us
On 18/09/2023 00:57 CEST Vik Fearing <vik@postgresfriends.org> wrote:
On 9/18/23 00:41, Erik Wienhold wrote:
On 18/09/2023 00:13 CEST David E. Wheeler <david@justatheory.com> wrote:
david=# select to_regtype('inteval second');
ERROR: syntax error at or near "second"
LINE 1: select to_regtype('inteval second');
^
CONTEXT: invalid type name "inteval second”Probably a typo and you meant 'interval second' which works.
No, that is precisely the point. The result should be null instead of
an error.
Well, the docs say "return NULL rather than throwing an error if the name is
not found". To me "name is not found" implies that it has to be valid syntax
first to even have a name that can be looked up.
String 'inteval second' is a syntax error when interpreted as a type name.
The same when I want to create a table with that typo:
test=# create table t (a inteval second);
ERROR: syntax error at or near "second"
LINE 1: create table t (a inteval second);
And a custom function is always an option:
create function to_regtype_lax(name text)
returns regtype
language plpgsql
as $$
begin
return to_regtype(name);
exception
when others then
return null;
end
$$;
test=# select to_regtype_lax('inteval second');
to_regtype_lax
----------------
<NULL>
(1 row)
--
Erik
On Sep 17, 2023, at 19:28, Tom Lane <tgl@sss.pgh.pa.us> wrote:
No, that is precisely the point. The result should be null instead of
an error.Yeah, ideally so, but the cost/benefit of making it happen seems
pretty unattractive for now. See the soft-errors thread at [1],
particularly [2] (but searching in that thread for references to
regclassin, regtypein, and to_reg* will find additional detail).
For my purposes I’m wrapping it in an exception-catching PL/pgSQL function, but it might be worth noting the condition in which it *will* raise an error on the docs.
Best,
David
On Mon, 2023-09-18 at 00:57 +0200, Vik Fearing wrote:
On 9/18/23 00:41, Erik Wienhold wrote:
On 18/09/2023 00:13 CEST David E. Wheeler <david@justatheory.com> wrote:
The docs for `to_regtype()` say, “this function will return NULL rather than
throwing an error if the name is not found.” And it’s true most of the time:david=# select to_regtype('foo'), to_regtype('clam');
to_regtype | to_regtype
------------+------------
[null] | [null]But not others:
david=# select to_regtype('inteval second');
ERROR: syntax error at or near "second"
LINE 1: select to_regtype('inteval second');
^
CONTEXT: invalid type name "inteval second”Probably a typo and you meant 'interval second' which works.
No, that is precisely the point. The result should be null instead of
an error.
Right. I debugged into this, and found this comment to typeStringToTypeName():
* If the string cannot be parsed as a type, an error is raised,
* unless escontext is an ErrorSaveContext node, in which case we may
* fill that and return NULL. But note that the ErrorSaveContext option
* is mostly aspirational at present: errors detected by the main
* grammar, rather than here, will still be thrown.
"escontext" is an ErrorSaveContext node, and it is the parser failing.
Not sure if we can do anything about that or if it is worth the effort.
Perhaps the documentation could reflect the implementation.
Yours,
Laurenz Albe
On Sun, Sep 17, 2023 at 5:34 PM Erik Wienhold <ewie@ewie.name> wrote:
On 18/09/2023 00:57 CEST Vik Fearing <vik@postgresfriends.org> wrote:
On 9/18/23 00:41, Erik Wienhold wrote:
On 18/09/2023 00:13 CEST David E. Wheeler <david@justatheory.com>
wrote:
david=# select to_regtype('inteval second');
ERROR: syntax error at or near "second"
LINE 1: select to_regtype('inteval second');
^
CONTEXT: invalid type name "inteval second”Probably a typo and you meant 'interval second' which works.
No, that is precisely the point. The result should be null instead of
an error.Well, the docs say "return NULL rather than throwing an error if the name
is
not found".
To me "name is not found" implies that it has to be valid syntax
first to even have a name that can be looked up.
Except there is nothing in the typed literal value that is actually a
syntactical problem from the perspective of the user. IOW, the following
work just fine:
select to_regtype('character varying'), to_regtype('interval second');
No need for quotes and the space doesn't produce an issue (and in fact
adding double quotes to the above causes them to not match since the
quoting is taken literally and not syntactically)
The failure to return NULL exposes an implementation detail that we
shouldn't be exposing. As Tom said, maybe doing better is too hard to be
worthwhile, but that doesn't mean our current behavior is somehow correct.
Put differently, there is no syntax involved when the value being provided
is the text literal name of a type as it is stored in pg_type.typname, so
the presence of a syntax error is wrong.
David J.
On 2023-09-17 20:58, David G. Johnston wrote:
Put differently, there is no syntax involved when the value being
provided
is the text literal name of a type as it is stored in pg_type.typname,
so
the presence of a syntax error is wrong.
Well, the situation is a little weirder than that, because of the
existence
of SQL standard types with multiple-token names; when you provide the
value 'character varying', you are not providing a name found in
pg_type.typname (while, if you call the same type 'varchar', you are).
For 'character varying', the parser is necessarily involved.
The case with 'interval second' is similar, but even different still;
that isn't a multiple-token type name, but a type name with a
standard-specified bespoke way of writing a typmod. Another place
the parser is necessarily involved, doing another job. (AFAICT,
to_regtype is happy with a typmod attached to the input, and
happily ignores it, so to_regtype('interval second') gives you
interval, to_regtype('character varying(20)') gives you
character varying, and so on.)
Regards,
-Chao
On Sun, Sep 17, 2023 at 6:25 PM Chapman Flack <chap@anastigmatix.net> wrote:
On 2023-09-17 20:58, David G. Johnston wrote:
Put differently, there is no syntax involved when the value being
provided
is the text literal name of a type as it is stored in pg_type.typname,
so
the presence of a syntax error is wrong.Well, the situation is a little weirder than that, because of the
existence
of SQL standard types with multiple-token names; when you provide the
value 'character varying', you are not providing a name found in
pg_type.typname (while, if you call the same type 'varchar', you are).
For 'character varying', the parser is necessarily involved.
Why don't we just populate pg_type with these standard mandated names too?
The case with 'interval second' is similar, but even different still;
that isn't a multiple-token type name, but a type name with a
standard-specified bespoke way of writing a typmod. Another place
the parser is necessarily involved, doing another job. (AFAICT,
to_regtype is happy with a typmod attached to the input, and
happily ignores it, so to_regtype('interval second') gives you
interval, to_regtype('character varying(20)') gives you
character varying, and so on.)
Seems doable to teach the lookup code that suffixes of the form (n) should
be ignored when matching the base type, plus maybe some kind of special
case for standard mandated typmods on their specific types. There is some
ambiguity possible when doing that though:
create type "interval second" as (x int, y int);
select to_regtype('interval second'); --> interval
Or just write a function that deals with the known forms dictated by the
standard and delegate checking for valid names against that first before
consulting pg_type? It might be some code duplication but it isn't like
it's a quickly moving target and I have to imagine it would be faster and
allow us to readily implement the soft-error contract.
David J.
On 2023-09-17 21:58, David G. Johnston wrote:
ambiguity possible when doing that though:
create type "interval second" as (x int, y int);
select to_regtype('interval second'); --> interval
Not ambiguity really: that composite type you just made was named
with a single <delimited identifier>, which is one token. (Also,
being delimited makes it case-sensitive, and always distinct from
an SQL keyword; consider the different types char and "char". Ah,
that SQL committee!)
The argument to regtype there is a single, case-insensitive,
<regular identifier>, a <separator>, and another <regular identifier>,
where in this case the first identifier happens to name a type, the
second one happens to be a typmod, and the separator is rather
simple as <separator> goes.
In this one, both identifiers are part of the type name, and the
separator a little more flamboyant.
select to_regtype('character /* hi!
am I part of the type name? /* what, me too? */ ok! */ -- huh!
varying');
to_regtype
-------------------
character varying
As the backend already has one parser that knows all those
lexical and grammar productions, I don't imagine it would be
very appealing to have a second implementation of some of them.
Obviously, to_regtype could add some simplifying requirements
(like "only whitespace for the separator please"), but as you
see above, it currently doesn't.
Regards,
-Chap
On Sunday, September 17, 2023, Chapman Flack <chap@anastigmatix.net> wrote:
In this one, both identifiers are part of the type name, and the
separator a little more flamboyant.select to_regtype('character /* hi!
am I part of the type name? /* what, me too? */ ok! */ -- huh!
varying');
to_regtype
-------------------
character varying
So, maybe we should be saying:
Parses a string of text, extracts a potential type name from it, and
translates that name into an OID. Failure to extract a valid potential
type name results in an error while a failure to determine that the
extracted name is known to the system results in a null output.
I take specific exception to describing your example as a “textual type
name”.
David J.
Hey there, coming back to this. I poked at the logs in the master branch and saw no mention of to_regtype; did I miss it?
On Sep 17, 2023, at 10:58 PM, David G. Johnston <david.g.johnston@gmail.com> wrote:
Parses a string of text, extracts a potential type name from it, and translates that name into an OID. Failure to extract a valid potential type name results in an error while a failure to determine that the extracted name is known to the system results in a null output.
I take specific exception to describing your example as a “textual type name”.
More docs seem like a reasonable compromise. Perhaps it’d be useful to also describe when an error is likely and when it’s not.
Best,
David
On Mon, Jan 29, 2024 at 8:45 AM David E. Wheeler <david@justatheory.com>
wrote:
Hey there, coming back to this. I poked at the logs in the master branch
and saw no mention of to_regtype; did I miss it?
With no feedback regarding my final suggestion I lost interest in it and
never produced a patch.
On Sep 17, 2023, at 10:58 PM, David G. Johnston <
david.g.johnston@gmail.com> wrote:Parses a string of text, extracts a potential type name from it, and
translates that name into an OID. Failure to extract a valid potential
type name results in an error while a failure to determine that the
extracted name is known to the system results in a null output.I take specific exception to describing your example as a “textual type
name”.
More docs seem like a reasonable compromise. Perhaps it’d be useful to
also describe when an error is likely and when it’s not.
Seems like most just want to leave well enough alone and deal with the rare
question for oddball input on the mailing list. If you are interested
enough to come back after 4 months I'd suggest you write up and submit a
patch. I'm happy to review it and see if that is enough to get a committer
to respond.
David J.
On Feb 2, 2024, at 3:23 PM, David G. Johnston <david.g.johnston@gmail.com> wrote:
Seems like most just want to leave well enough alone and deal with the rare question for oddball input on the mailing list. If you are interested enough to come back after 4 months I'd suggest you write up and submit a patch. I'm happy to review it and see if that is enough to get a committer to respond.
LOL, “interested enough” is less the right term than “triaging email backlog and following up on a surprisingly controversial question.” I also just like to see decisions made and issues closed one way or another.
Anyway, I’m happy to submit a documentation patch along the lines you suggested.
D
On Feb 2, 2024, at 15:33, David E. Wheeler <david@justatheory.com> wrote:
Anyway, I’m happy to submit a documentation patch along the lines you suggested.
How’s this?
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -25460,11 +25460,12 @@ SELECT collation for ('foo' COLLATE "de_DE");
<returnvalue>regtype</returnvalue>
</para>
<para>
- Translates a textual type name to its OID. A similar result is
+ Parses a string of text, extracts a potential type name from it, and
+ translates that name into an OID. A similar result is
obtained by casting the string to type <type>regtype</type> (see
- <xref linkend="datatype-oid"/>); however, this function will return
- <literal>NULL</literal> rather than throwing an error if the name is
- not found.
+ <xref linkend="datatype-oid"/>). Failure to extract a valid potential
+ type name results in an error; however, if the extracted names is not
+ known to the system, this function will return <literal>NULL</literal>.
</para></entry>
</row>
</tbody>
Does similar wording need to apply to other `to_reg*` functions?
Best,
David
Attachments:
v1-0001-Update-to_regtype-docs-regarding-error-condition.patchapplication/octet-stream; name=v1-0001-Update-to_regtype-docs-regarding-error-condition.patch; x-unix-mode=0644Download
From 975bb2a57029a4ce06338104820451a85545257b Mon Sep 17 00:00:00 2001
From: "David E. Wheeler" <david@justatheory.com>
Date: Sun, 4 Feb 2024 14:13:53 -0500
Subject: [PATCH v1] Update to_regtype docs regarding error condition
The statement that `to_regtype` returns `NULL` rather than raising an
error turns out to be false when the parser cannot extract a type. This
can happen for inputs such as `inteval second`, where the SQL grammar
parser fails to parse the type name. In most cases it does return `NULL`
and ideally it always would for such unknown types, but sadly the
overhead of handling errors thrown by the grammar is too high at this
point. So document the variation, instead.
---
doc/src/sgml/func.sgml | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 6788ba8ef4..afd821100a 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -25460,11 +25460,12 @@ SELECT collation for ('foo' COLLATE "de_DE");
<returnvalue>regtype</returnvalue>
</para>
<para>
- Translates a textual type name to its OID. A similar result is
+ Parses a string of text, extracts a potential type name from it, and
+ translates that name into an OID. A similar result is
obtained by casting the string to type <type>regtype</type> (see
- <xref linkend="datatype-oid"/>); however, this function will return
- <literal>NULL</literal> rather than throwing an error if the name is
- not found.
+ <xref linkend="datatype-oid"/>). Failure to extract a valid potential
+ type name results in an error; however, if the extracted names is not
+ known to the system, this function will return <literal>NULL</literal>.
</para></entry>
</row>
</tbody>
--
2.43.0
On 2024-02-04 20:20 +0100, David E. Wheeler wrote:
On Feb 2, 2024, at 15:33, David E. Wheeler <david@justatheory.com> wrote:
Anyway, I’m happy to submit a documentation patch along the lines you suggested.
How’s this?
--- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -25460,11 +25460,12 @@ SELECT collation for ('foo' COLLATE "de_DE"); <returnvalue>regtype</returnvalue> </para> <para> - Translates a textual type name to its OID. A similar result is + Parses a string of text, extracts a potential type name from it, and + translates that name into an OID. A similar result is obtained by casting the string to type <type>regtype</type> (see - <xref linkend="datatype-oid"/>); however, this function will return - <literal>NULL</literal> rather than throwing an error if the name is - not found. + <xref linkend="datatype-oid"/>). Failure to extract a valid potential + type name results in an error; however, if the extracted names is not
Here "extracted names" should be "extracted name" (singular).
Otherwise, the text looks good.
+ known to the system, this function will return <literal>NULL</literal>.
</para></entry>
</row>
</tbody>Does similar wording need to apply to other `to_reg*` functions?
Just to_regtype() is fine IMO. The other to_reg* functions don't throw
errors on similar input, e.g.:
test=> select to_regproc('foo bar');
to_regproc
------------
<NULL>
(1 row)
--
Erik
On Feb 4, 2024, at 19:11, Erik Wienhold <ewie@ewie.name> wrote:
Here "extracted names" should be "extracted name" (singular).
Otherwise, the text looks good.
Ah, thank you. Updated patch attached.
Best,
David
Attachments:
v2-0001-Update-to_regtype-docs-regarding-error-condition.patchapplication/octet-stream; name=v2-0001-Update-to_regtype-docs-regarding-error-condition.patch; x-unix-mode=0644Download
From 1d15921965a90027329933e0e32fc3f008ebe527 Mon Sep 17 00:00:00 2001
From: "David E. Wheeler" <david@justatheory.com>
Date: Mon, 5 Feb 2024 09:00:29 -0500
Subject: [PATCH v2] Update to_regtype docs regarding error condition
The statement that `to_regtype` returns `NULL` rather than raising an
error turns out to be false when the parser cannot extract a type. This
can happen for inputs such as `inteval second`, where the SQL grammar
parser fails to parse the type name. In most cases it does return `NULL`
and ideally it always would for such unknown types, but sadly the
overhead of handling errors thrown by the grammar is too high at this
point. So document the variation, instead.
---
doc/src/sgml/func.sgml | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 6788ba8ef4..0ff392ab05 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -25460,11 +25460,12 @@ SELECT collation for ('foo' COLLATE "de_DE");
<returnvalue>regtype</returnvalue>
</para>
<para>
- Translates a textual type name to its OID. A similar result is
+ Parses a string of text, extracts a potential type name from it, and
+ translates that name into an OID. A similar result is
obtained by casting the string to type <type>regtype</type> (see
- <xref linkend="datatype-oid"/>); however, this function will return
- <literal>NULL</literal> rather than throwing an error if the name is
- not found.
+ <xref linkend="datatype-oid"/>). Failure to extract a valid potential
+ type name results in an error; however, if the extracted name is not
+ known to the system, this function will return <literal>NULL</literal>.
</para></entry>
</row>
</tbody>
--
2.43.0
On Feb 5, 2024, at 09:01, David E. Wheeler <david@justatheory.com> wrote:
Ah, thank you. Updated patch attached.
I’ve moved this patch into the to_regtype patch thread[1]/messages/by-id/60EF4E11-BC1C-4034-B37B-695662D28AD2@justatheory.com, since it exhibits the same behavior.
Best,
David
[1]: /messages/by-id/60EF4E11-BC1C-4034-B37B-695662D28AD2@justatheory.com
Merged this change into the [to_regtypemod patch](https://commitfest.postgresql.org/47/4807/), which has exactly the same issue.
The new status of this patch is: Needs review
On Feb 21, 2024, at 11:54 AM, David Wheeler <david@justatheory.com> wrote:
Merged this change into the [to_regtypemod patch](https://commitfest.postgresql.org/47/4807/), which has exactly the same issue.
The new status of this patch is: Needs review
Bah, withdrawn.
D