misleading error message in ProcessUtilitySlow T_CreateStatsStmt
hi.
while reviewing other work, some error messages in src/backend/tcop/utility.c
seem not accurate.
static void
ProcessUtilitySlow(ParseState *pstate,
PlannedStmt *pstmt,
const char *queryString,
ProcessUtilityContext context,
ParamListInfo params,
QueryEnvironment *queryEnv,
DestReceiver *dest,
QueryCompletion *qc)
case T_CreateStatsStmt:
{
Oid relid;
CreateStatsStmt *stmt = (CreateStatsStmt *) parsetree;
RangeVar *rel = (RangeVar *) linitial(stmt->relations);
if (!IsA(rel, RangeVar))
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("only a single relation is
allowed in CREATE STATISTICS")));
relid = RangeVarGetRelid(rel,
ShareUpdateExclusiveLock, false);
/* Run parse analysis ... */
stmt = transformStatsStmt(relid, stmt, queryString);
address = CreateStatistics(stmt);
}
for example:
create or replace function tftest(int) returns table(a int, b int) as $$
begin
return query select $1, $1+i from generate_series(1,5) g(i);
end;
$$ language plpgsql immutable strict;
CREATE STATISTICS alt_stat2 ON a, b FROM tftest(1);
ERROR: only a single relation is allowed in CREATE STATISTICS
this error message seem misleading?
also this error check seems duplicated in CreateStatistics?
On Thu, 21 Aug 2025 at 17:00, jian he <jian.universality@gmail.com> wrote:
hi.
Hi!
RangeVar *rel = (RangeVar *) linitial(stmt->relations);
if (!IsA(rel, RangeVar))
These two lines are weird. Looks like linitial(stmt->relations)
should be assigned to variable with type Node* first?
for example:
create or replace function tftest(int) returns table(a int, b int) as $$
begin
return query select $1, $1+i from generate_series(1,5) g(i);
end;
$$ language plpgsql immutable strict;CREATE STATISTICS alt_stat2 ON a, b FROM tftest(1);
ERROR: only a single relation is allowed in CREATE STATISTICSthis error message seem misleading?
I wouldn’t say this is misleading, but " a single relation" is indeed
not precise enough. IMO we need a more precise term to distinguish
regular relation and table func.
also this error check seems duplicated in CreateStatistics?
Indeed.
--
Best regards,
Kirill Reshke
Kirill Reshke <reshkekirill@gmail.com> writes:
On Thu, 21 Aug 2025 at 17:00, jian he <jian.universality@gmail.com> wrote:
RangeVar *rel = (RangeVar *) linitial(stmt->relations);
if (!IsA(rel, RangeVar))
These two lines are weird. Looks like linitial(stmt->relations)
should be assigned to variable with type Node* first?
We take that sort of shortcut in many places. If there's not any need
for the code to deal with more than one node type, an extra variable
that's used just for the IsA test seems like a lot of notational
overhead.
regards, tom lane
On 2025-Aug-21, Kirill Reshke wrote:
I wouldn’t say this is misleading, but " a single relation" is indeed
not precise enough. IMO we need a more precise term to distinguish
regular relation and table func.
I'm not sure. See the definition of relation in the glossary:
https://www.postgresql.org/docs/18/glossary.html#GLOSSARY-RELATION
The generic term for all objects in a database that have a name and a
list of attributes defined in a specific order. Tables, sequences,
views, foreign tables, materialized views, composite types, and
indexes are all relations.
More generically, a relation is a set of tuples; for example, the
result of a query is also a relation.
In PostgreSQL, Class is an archaic synonym for relation.
(I wonder why this says "generically" rather than "generally". Is that
word choice a mistake?) Maybe in the "For example" clause we can also
mention table functions.
--
Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
On Fri, 22 Aug 2025 at 14:46, Álvaro Herrera <alvherre@kurilemu.de> wrote:
On 2025-Aug-21, Kirill Reshke wrote:
I wouldn’t say this is misleading, but " a single relation" is indeed
not precise enough. IMO we need a more precise term to distinguish
regular relation and table func.I'm not sure. See the definition of relation in the glossary:
https://www.postgresql.org/docs/18/glossary.html#GLOSSARY-RELATIONThe generic term for all objects in a database that have a name and a
list of attributes defined in a specific order. Tables, sequences,
views, foreign tables, materialized views, composite types, and
indexes are all relations.More generically, a relation is a set of tuples; for example, the
result of a query is also a relation.In PostgreSQL, Class is an archaic synonym for relation.
(I wonder why this says "generically" rather than "generally". Is that
word choice a mistake?) Maybe in the "For example" clause we can also
mention table functions.--
Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
I am sorry: I am not following. CREATE STATISTIC will work only for
single HEAP (or other AM) relations. So, for "simple regular tables"
as one can say (not tablefunc).
You say: relation is a term for both HEAP relation, tablefunc relation
and much more.
I say: "a single relation" in the error message is not precise enough.
Where do we disagree?
Anyway, I would say correct error message here should be:
```
db=# CREATE STATISTICS alt_stat2 ON a, b FROM tftest(1);
ERROR: cannot define statistics for relation "alt_stat2"
DETAIL: This operation is not supported for query result.
```
--
Best regards,
Kirill Reshke
hi.
will
+ if (!IsA(rln, RangeVar))
+ ereport(ERROR,
+ errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("CREATE STATISTICS only
supports regular tables, materialized views, foreign tables, and
partitioned tables"));
solve the problem?
On Fri, Aug 22, 2025 at 6:46 PM Álvaro Herrera <alvherre@kurilemu.de> wrote:
I'm not sure. See the definition of relation in the glossary:
https://www.postgresql.org/docs/18/glossary.html#GLOSSARY-RELATIONThe generic term for all objects in a database that have a name and a
list of attributes defined in a specific order. Tables, sequences,
views, foreign tables, materialized views, composite types, and
indexes are all relations.More generically, a relation is a set of tuples; for example, the
result of a query is also a relation.In PostgreSQL, Class is an archaic synonym for relation.
(I wonder why this says "generically" rather than "generally". Is that
word choice a mistake?)
"More generally" feels more natural to me than "more generically" in
this sentence, but I'm not a native English speaker, so I could be
wrong.
Thanks
Richard
=?utf-8?Q?=C3=81lvaro?= Herrera <alvherre@kurilemu.de> writes:
I'm not sure. See the definition of relation in the glossary:
https://www.postgresql.org/docs/18/glossary.html#GLOSSARY-RELATION
The generic term for all objects in a database that have a name and a
list of attributes defined in a specific order. Tables, sequences,
views, foreign tables, materialized views, composite types, and
indexes are all relations.
More generically, a relation is a set of tuples; for example, the
result of a query is also a relation.
In PostgreSQL, Class is an archaic synonym for relation.
(I wonder why this says "generically" rather than "generally". Is that
word choice a mistake?) Maybe in the "For example" clause we can also
mention table functions.
Yeah, I think s/generically/generally/ would be an improvement.
I'm not certain, but I think that our use of "relation" to mean
"an object with a pg_class entry" is a Postgres-ism. I do know
that the meaning of "a set of tuples" is widely used, as that's
where the term "relational database" comes from. Maybe whoever
wrote this was trying to get at that point? But this text is
hardly clear about that.
regards, tom lane
On 2025-Aug-22, Kirill Reshke wrote:
I am sorry: I am not following. CREATE STATISTIC will work only for
single HEAP (or other AM) relations. So, for "simple regular tables"
as one can say (not tablefunc).
Ah yeah, you're right, I think we should say "a single table", or maybe
"a single table or materialized view" if the latter case is supported (I
don't remember offhand if it is).
The error message is mostly concerned with rejecting the case of
multiple relations being given in the FROM clause, because we said at
the time that we needed to make the grammar flexible enough to support
that case, but make it clear that it wasn't implemented yet. (It's a
pity that we haven't implemented that thus far ... I suppose it must be
a difficult problem.)
DETAIL: This operation is not supported for query result.
I would say "This operation is only supported for tables [and
matviews]." We don't need to list all the things that aren't supported,
I think. But your example here is wrong:
db=# CREATE STATISTICS alt_stat2 ON a, b FROM tftest(1);
ERROR: cannot define statistics for relation "alt_stat2"
DETAIL: This operation is not supported for query result.
It's not alt_stat2 that's the problem (that's the extstats object being
created), but tftest(1). I don't know if we need to specify that the
problem is in the FROM clause here -- seems obvious enough to me -- but
maybe someone opines differently?
ERROR: cannot define statistics for relation "tftest"
DETAIL: The FROM clause may only contain a single table.
--
Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
"Si quieres ser creativo, aprende el arte de perder el tiempo"
On 2025-Aug-22, Tom Lane wrote:
I'm not certain, but I think that our use of "relation" to mean
"an object with a pg_class entry" is a Postgres-ism. I do know
that the meaning of "a set of tuples" is widely used, as that's
where the term "relational database" comes from. Maybe whoever
wrote this was trying to get at that point? But this text is
hardly clear about that.
Yeah, AFAIR I wrote it (or at least heavily edited the original to get
to that), and I was trying to convey exactly those two ideas. If you
want to propose improvements, they're very welcome.
--
Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
"Puedes vivir sólo una vez, pero si lo haces bien, una vez es suficiente"
=?utf-8?Q?=C3=81lvaro?= Herrera <alvherre@kurilemu.de> writes:
Yeah, AFAIR I wrote it (or at least heavily edited the original to get
to that), and I was trying to convey exactly those two ideas. If you
want to propose improvements, they're very welcome.
Hmmm ... maybe something like
Mathematically, a "relation" is a set of tuples; this is the sense
meant in the term "relational database".
In Postgres, "relation" is commonly used to mean a database object
that has a name and a list of attributes defined in a specific
order. Tables, sequences, views, foreign tables, materialized
views, composite types, and indexes are all relations. A relation
in this sense is a container or descriptor for a set of tuples.
"Class" is an alternative but archaic term. The system catalog
pg_class holds an entry for each Postgres relation.
regards, tom lane
On 2025-Aug-22, Tom Lane wrote:
Hmmm ... maybe something like
Mathematically, a "relation" is a set of tuples; this is the sense
meant in the term "relational database".In Postgres, "relation" is commonly used to mean a database object
that has a name and a list of attributes defined in a specific
order. Tables, sequences, views, foreign tables, materialized
views, composite types, and indexes are all relations. A relation
in this sense is a container or descriptor for a set of tuples."Class" is an alternative but archaic term. The system catalog
pg_class holds an entry for each Postgres relation.
Thanks, pushed like that. I changed "a database object" to "an SQL
object", because that's a term we have a definition for.
(I also wrote "PostgreSQL" where you had "Postgres". I think it might
be okay now to change the product name in various places here, but it
seems better to do it consistently across the whole page.)
--
Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
"¿Qué importan los años? Lo que realmente importa es comprobar que
a fin de cuentas la mejor edad de la vida es estar vivo" (Mafalda)
=?utf-8?Q?=C3=81lvaro?= Herrera <alvherre@kurilemu.de> writes:
Thanks, pushed like that. I changed "a database object" to "an SQL
object", because that's a term we have a definition for.
(I also wrote "PostgreSQL" where you had "Postgres". I think it might
be okay now to change the product name in various places here, but it
seems better to do it consistently across the whole page.)
LGTM, thanks. (My wording was only meant as a draft ...)
regards, tom lane
On 2025-Aug-21, jian he wrote:
case T_CreateStatsStmt:
{
Oid relid;
CreateStatsStmt *stmt = (CreateStatsStmt *) parsetree;
RangeVar *rel = (RangeVar *) linitial(stmt->relations);if (!IsA(rel, RangeVar))
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("only a single relation is
allowed in CREATE STATISTICS")));
relid = RangeVarGetRelid(rel,
ShareUpdateExclusiveLock, false);
/* Run parse analysis ... */
stmt = transformStatsStmt(relid, stmt, queryString);
address = CreateStatistics(stmt);
}
Yeah, I think there's room to argue that this ereport() is just wrongly
copy-and-pasted from other nearby errors, and that it should have
completely different wording. BTW another way to cause this is
CREATE STATISTICS alt_stat2 ON a, b FROM xmltable('foo' passing 'bar' columns a text);
I propose to use this report:
ERROR: cannot create statistics on specified relation
DETAIL: CREATE STATISTICS only supports tables, materialized views, foreign tables, and partitioned tables.
(Not sold on saying just "tables" vs. "regular tables" or "plain tables";
thoughts?)
While looking at this whole thing, I noticed that this code is somewhat
bogus in a different way: we're resolving the relation name twice, both
here in ProcessUtilitySlow() (to pass to transformStatsStmt), and later
again inside CreateStatistics(). It's really strange that we decided
not to pass the relation Oids as a list argument to CreateStatistics. I
suggest to change that as the first attached patch.
Now, such a change would be appropriate only for branch master; it seems too
invasive for stable ones. For them I propose we only change the error message,
as in the other attachment.
We should add a couple of test cases for this particular error message
also. It's bad that these changes don't break any tests.
--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
"Once again, thank you and all of the developers for your hard work on
PostgreSQL. This is by far the most pleasant management experience of
any database I've worked on." (Dan Harris)
http://archives.postgresql.org/pgsql-performance/2006-04/msg00247.php
=?utf-8?Q?=C3=81lvaro?= Herrera <alvherre@kurilemu.de> writes:
I propose to use this report:
ERROR: cannot create statistics on specified relation
DETAIL: CREATE STATISTICS only supports tables, materialized views, foreign tables, and partitioned tables.
WFM, although I think you could shorten it to "tables, materialized
views, and foreign tables". We generally expect that partitioned
tables are included when saying "tables", no? I'm not dead set on
that either way, though.
While looking at this whole thing, I noticed that this code is somewhat
bogus in a different way: we're resolving the relation name twice, both
here in ProcessUtilitySlow() (to pass to transformStatsStmt), and later
again inside CreateStatistics(). It's really strange that we decided
not to pass the relation Oids as a list argument to CreateStatistics. I
suggest to change that as the first attached patch.
I think this is a good idea, but I'm not sure that the locking
considerations are straight in this version. Once we translate a
relation name to OID, we'd better hold lock on the rel continuously to
the end of usage of that OID. It looks like you do, but then couldn't
the
+ rel = relation_open(relid, ShareUpdateExclusiveLock);
in CreateStatistics use NoLock to signify that we expect to have
the lock already?
Alternatively, maybe the right fix is to move the parse-analysis
work into CreateStatistics altogether. I think there is not a
very good argument for ProcessUtilitySlow doing all that stuff.
It's supposed to mainly just be a dispatching switch(), IMO.
regards, tom lane
On Fri, Aug 29, 2025 at 5:46 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
=?utf-8?Q?=C3=81lvaro?= Herrera <alvherre@kurilemu.de> writes:
I propose to use this report:
ERROR: cannot create statistics on specified relation
DETAIL: CREATE STATISTICS only supports tables, materialized views, foreign tables, and partitioned tables.WFM, although I think you could shorten it to "tables, materialized
views, and foreign tables". We generally expect that partitioned
tables are included when saying "tables", no? I'm not dead set on
that either way, though.
https://www.postgresql.org/docs/current/sql-copy.html
use "COPY TO can be used only with plain tables, not views, and does
not copy rows from child tables or child partitions"
Alternatively, maybe the right fix is to move the parse-analysis
work into CreateStatistics altogether. I think there is not a
very good argument for ProcessUtilitySlow doing all that stuff.
It's supposed to mainly just be a dispatching switch(), IMO.
seems doable.
transformStatsStmt, CreateStatistics both used only twice, refactoring arguments
should be fine.
please check the attached POC, regress tests also added.
main idea is
first check CreateStatsStmt->relations,
then call transformStatsStmt, transformStatsStmt only to transform
CreateStatsStmt->exprs.
also the above complaint about the relation lock issue will be resolved.
Attachments:
v2-0001-refactor-CreateStatsStmt.patchtext/x-patch; charset=US-ASCII; name=v2-0001-refactor-CreateStatsStmt.patchDownload+47-29
On 2025-Aug-29, jian he wrote:
On Fri, Aug 29, 2025 at 5:46 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
WFM, although I think you could shorten it to "tables, materialized
views, and foreign tables". We generally expect that partitioned
tables are included when saying "tables", no? I'm not dead set on
that either way, though.https://www.postgresql.org/docs/current/sql-copy.html
use "COPY TO can be used only with plain tables, not views, and does
not copy rows from child tables or child partitions"
I'm inclined to think that we should only mention partitioned tables
specifically when they for some reason deviate from what we do for
regular tables, i.e., what Tom is saying. I don't think we've had an
explicit, consistent rule for that thus far, so there may be places
where we fail to follow it.
Anyway, I have pushed the error message change.
Alternatively, maybe the right fix is to move the parse-analysis
work into CreateStatistics altogether. I think there is not a
very good argument for ProcessUtilitySlow doing all that stuff.
It's supposed to mainly just be a dispatching switch(), IMO.seems doable.
transformStatsStmt, CreateStatistics both used only twice, refactoring
arguments should be fine.
please check the attached POC, regress tests also added.
Yeah, I like how this turned out. I found out this was introduced in
commit a4d75c86bf15.
--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
On Fri, Aug 29, 2025 at 8:48 PM Álvaro Herrera <alvherre@kurilemu.de> wrote:
Alternatively, maybe the right fix is to move the parse-analysis
work into CreateStatistics altogether. I think there is not a
very good argument for ProcessUtilitySlow doing all that stuff.
It's supposed to mainly just be a dispatching switch(), IMO.seems doable.
transformStatsStmt, CreateStatistics both used only twice, refactoring
arguments should be fine.
please check the attached POC, regress tests also added.Yeah, I like how this turned out. I found out this was introduced in
commit a4d75c86bf15.
Previously, CreateStatistics and ProcessUtilitySlow opened the same relation
twice with a ShareUpdateExclusiveLock. This refactor ensures the relation is
opened with ShareUpdateExclusiveLock only once and also reduces redundant error
checks.
The logic is now more intuitive: we first error checking
CreateStatsStmt->relations and then call transformStatsStmt to parse analysis
CreateStatsStmt->exprs.
please check the attached refactor CreateStatsStmt.
Attachments:
v3-0001-refactor-CreateStatsStmt.patchtext/x-patch; charset=US-ASCII; name=v3-0001-refactor-CreateStatsStmt.patchDownload+17-35
On 29.08.25 14:48, Álvaro Herrera wrote:
On 2025-Aug-29, jian he wrote:
On Fri, Aug 29, 2025 at 5:46 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
WFM, although I think you could shorten it to "tables, materialized
views, and foreign tables". We generally expect that partitioned
tables are included when saying "tables", no? I'm not dead set on
that either way, though.https://www.postgresql.org/docs/current/sql-copy.html
use "COPY TO can be used only with plain tables, not views, and does
not copy rows from child tables or child partitions"I'm inclined to think that we should only mention partitioned tables
specifically when they for some reason deviate from what we do for
regular tables, i.e., what Tom is saying. I don't think we've had an
explicit, consistent rule for that thus far, so there may be places
where we fail to follow it.Anyway, I have pushed the error message change.
I think this message is still wrong. The check doesn't even look at the
relation kind, which is what the message is implying. (If the message
were about relkinds, then it should use
errdetail_relkind_not_supported().) It checks that the from list entry
is a table name instead of some other thing like VALUES or JOIN. So it
should be something like
CREATE STATISTICS only supports plain table names in the FROM clause
The same could also be accomplished by changing the grammar from
FROM from_list
to
FROM qualified_name_list
On 05.09.25 14:30, Peter Eisentraut wrote:
On 29.08.25 14:48, Álvaro Herrera wrote:
On 2025-Aug-29, jian he wrote:
On Fri, Aug 29, 2025 at 5:46 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
WFM, although I think you could shorten it to "tables, materialized
views, and foreign tables". We generally expect that partitioned
tables are included when saying "tables", no? I'm not dead set on
that either way, though.https://www.postgresql.org/docs/current/sql-copy.html
use "COPY TO can be used only with plain tables, not views, and does
not copy rows from child tables or child partitions"I'm inclined to think that we should only mention partitioned tables
specifically when they for some reason deviate from what we do for
regular tables, i.e., what Tom is saying. I don't think we've had an
explicit, consistent rule for that thus far, so there may be places
where we fail to follow it.Anyway, I have pushed the error message change.
I think this message is still wrong. The check doesn't even look at the
relation kind, which is what the message is implying. (If the message
were about relkinds, then it should use
errdetail_relkind_not_supported().) It checks that the from list entry
is a table name instead of some other thing like VALUES or JOIN. So it
should be something likeCREATE STATISTICS only supports plain table names in the FROM clause
I propose the attached patch to fix this. I think this restores the
original meaning better.