code cleanup for CREATE STATISTICS
Hi,
There is at present a comment at the top of transformStatsStmt which
says "To avoid race conditions, it's important that this function
relies only on the passed-in relid (and not on stmt->relation) to
determine the target relation." However, doing what the comment says
we need to do doesn't actually avoid the problem we need to avoid. The
issue here, as I understand it, is that if we look up the same
less-than-fully-qualified name multiple times, we might get different
answers due to concurrent activity, and that might create a security
vulnerability, along the lines of CVE-2014-0062. So what the code
should be doing is looking up the user-provided name just once and
then using that value throughout all subsequent processing stages, but
it doesn't actually. The caller of transformStatsStmt() looks up the
RangeVar and gets an OID, but that value is then discarded and
CreateStatistics does another lookup on the same name, which means
that we're not really avoiding the hazard about which the comment
seems to be concerned.
So that leads to the question of whether there's a security
vulnerability here. I and a few other members of the pgsql-security
haven't been able to find one in brief review, but we may have missed
something. Fortunately, the permissions checks happen after the second
name lookup inside CreateStatistics(), so it doesn't seem that, for
example, you can leverage this to create extended statistics on a
table that you don't own. You can possibly get the first part of the
operation, where we transform the CREATE STATISTICS command's WHERE
clause, to operate on one table that you do own and then the second
part on another table that you don't own, but even if that's so, the
second part is just going to fail before doing anything interesting,
so it doesn't seem like there's a problem. If anyone reading this can
spot an exploit, please speak up!
So the attached patch is proposed as code cleanup, rather than
security patches. It changes the code to avoid the duplicate name
lookup altogether. There is no reason that I know of why this needs to
be back-patched for correctness, but I think it's worth putting into
master to make the code nicer and avoid doing things that in some
circumstances can be risky.
Thanks,
--
Robert Haas
EDB: http://www.enterprisedb.com
Attachments:
v2-0001-Refactor-transformStatsStmt-to-avoid-repeated-nam.patchapplication/octet-stream; name=v2-0001-Refactor-transformStatsStmt-to-avoid-repeated-nam.patchDownload
From a5cf563a42570882904049943fbaae294e419644 Mon Sep 17 00:00:00 2001
From: Robert Haas <rhaas@postgresql.org>
Date: Tue, 2 May 2023 13:19:46 -0400
Subject: [PATCH v2] Refactor transformStatsStmt to avoid repeated name
lookups.
The header comment for transformStatsStmt previously stated that
it was important for that function rely on on the passed-in relid
to identify the target relation, rather than figuring it out from
from the FROM clause. However, as coded, having the caller pass the
table OID to transformStatsStmt didn't really achieve anything
useful, because CreateStatistics looked up the name from the FROM
clause again anyway.
To avoid that, we need to look up the name mentioned in the CREATE
STATISTICS just once, and carry that same OID all the way through
parse analysis and execution. Refactor so that a CreateStatsStmt
carries both the untransformed FROM clause and the range table that
we get when it's transformed. That way, once the name lookups have
been done, we can propagate the results forward to future
processing steps and avoid ever repeating the lookup.
---
src/backend/commands/statscmds.c | 73 +++++++++------------
src/backend/commands/tablecmds.c | 15 ++++-
src/backend/parser/gram.y | 4 +-
src/backend/parser/parse_utilcmd.c | 100 ++++++++++++++++++++++++++---
src/backend/tcop/utility.c | 22 +------
src/include/nodes/parsenodes.h | 3 +-
src/include/parser/parse_utilcmd.h | 3 +-
7 files changed, 141 insertions(+), 79 deletions(-)
diff --git a/src/backend/commands/statscmds.c b/src/backend/commands/statscmds.c
index 26ebd0819d..43f0ad3b47 100644
--- a/src/backend/commands/statscmds.c
+++ b/src/backend/commands/statscmds.c
@@ -30,7 +30,9 @@
#include "commands/defrem.h"
#include "miscadmin.h"
#include "nodes/nodeFuncs.h"
+#include "nodes/print.h"
#include "optimizer/optimizer.h"
+#include "parser/parse_utilcmd.h"
#include "statistics/statistics.h"
#include "utils/builtins.h"
#include "utils/lsyscache.h"
@@ -98,56 +100,45 @@ CreateStatistics(CreateStatsStmt *stmt)
Assert(IsA(stmt, CreateStatsStmt));
/*
- * Examine the FROM clause. Currently, we only allow it to be a single
- * simple table, but later we'll probably allow multiple tables and JOIN
- * syntax. The grammar is already prepared for that, so we have to check
- * here that what we got is what we can support.
+ * If the range table hasn't been constructed yet, do that now. Normally
+ * it's best for this to happen earlier, perhaps by calling
+ * transformStatsStmt(), so that the name lookup happens at the earliest
+ * possible stage and isn't repeated (with the attendant danger of getting
+ * a different result). But in the case of CREATE TABLE LIKE there's no
+ * earlier point at which we can do this, so it happens here.
*/
- if (list_length(stmt->relations) != 1)
+ if (stmt->rtable == NIL)
+ stmt->rtable = transformStatsFromClause(stmt->from_clause);
+ Assert(list_length(stmt->rtable) == list_length(stmt->from_clause));
+
+ /*
+ * Double-check that the range table is of a form we can support.
+ *
+ * Currently, we only allow it to be a single simple table, but later
+ * we'll probably allow multiple tables and JOIN syntax. Even though the
+ * grammar is already prepared for that, we expect such cases to be
+ * rejected by transformStatsStmt; to be safe, double-check it here.
+ */
+ if (list_length(stmt->rtable) != 1)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("only a single relation is allowed in CREATE STATISTICS")));
- foreach(cell, stmt->relations)
+ /*
+ * We want to use the relation OID from stmt->rtable rather than again
+ * looking it up based on stmt->from_clause, to avoid race conditions.
+ */
+ foreach(cell, stmt->rtable)
{
- Node *rln = (Node *) lfirst(cell);
-
- if (!IsA(rln, RangeVar))
- ereport(ERROR,
- (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
- errmsg("only a single relation is allowed in CREATE STATISTICS")));
+ RangeTblEntry *rte = (RangeTblEntry *) lfirst(cell);
/*
- * CREATE STATISTICS will influence future execution plans but does
- * not interfere with currently executing plans. So it should be
- * enough to take only ShareUpdateExclusiveLock on relation,
- * conflicting with ANALYZE and other DDL that sets statistical
- * information, but not with normal queries.
+ * See comments in transformStatsStmt() for why this lock level has
+ * been chosen. We expect the relation to already be locked at this
+ * point, but take the lock here anyway just in case there's some code
+ * path where that is not so.
*/
- rel = relation_openrv((RangeVar *) rln, ShareUpdateExclusiveLock);
-
- /* Restrict to allowed relation types */
- if (rel->rd_rel->relkind != RELKIND_RELATION &&
- rel->rd_rel->relkind != RELKIND_MATVIEW &&
- rel->rd_rel->relkind != RELKIND_FOREIGN_TABLE &&
- rel->rd_rel->relkind != RELKIND_PARTITIONED_TABLE)
- ereport(ERROR,
- (errcode(ERRCODE_WRONG_OBJECT_TYPE),
- errmsg("cannot define statistics for relation \"%s\"",
- RelationGetRelationName(rel)),
- errdetail_relkind_not_supported(rel->rd_rel->relkind)));
-
- /* You must own the relation to create stats on it */
- if (!object_ownercheck(RelationRelationId, RelationGetRelid(rel), stxowner))
- aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(rel->rd_rel->relkind),
- RelationGetRelationName(rel));
-
- /* Creating statistics on system catalogs is not allowed */
- if (!allowSystemTableMods && IsSystemRelation(rel))
- ereport(ERROR,
- (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
- errmsg("permission denied: \"%s\" is a system catalog",
- RelationGetRelationName(rel))));
+ rel = relation_open(rte->relid, ShareUpdateExclusiveLock);
}
Assert(rel);
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index c7a8a689b7..4d0a7b204c 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -13410,10 +13410,19 @@ ATPostAlterTypeParse(Oid oldId, Oid oldRelId, Oid refRelId, char *cmd,
querytree_list = list_concat(querytree_list, afterStmts);
}
else if (IsA(stmt, CreateStatsStmt))
+ {
+ CreateStatsStmt *ss = (CreateStatsStmt *) stmt;
+ RangeTblEntry *rte;
+
+ rte = makeNode(RangeTblEntry);
+ rte->rtekind = RTE_RELATION;
+ rte->relid = oldRelId;
+ rte->rellockmode = ShareUpdateExclusiveLock;
+ ss->rtable = list_make1(rte);
+
querytree_list = lappend(querytree_list,
- transformStatsStmt(oldRelId,
- (CreateStatsStmt *) stmt,
- cmd));
+ transformStatsStmt(ss, cmd));
+ }
else
querytree_list = lappend(querytree_list, stmt);
}
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index a723d9db78..5776a0ce5c 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -4503,7 +4503,7 @@ CreateStatsStmt:
n->defnames = $3;
n->stat_types = $4;
n->exprs = $6;
- n->relations = $8;
+ n->from_clause = $8;
n->stxcomment = NULL;
n->if_not_exists = false;
$$ = (Node *) n;
@@ -4516,7 +4516,7 @@ CreateStatsStmt:
n->defnames = $6;
n->stat_types = $7;
n->exprs = $9;
- n->relations = $11;
+ n->from_clause = $11;
n->stxcomment = NULL;
n->if_not_exists = true;
$$ = (Node *) n;
diff --git a/src/backend/parser/parse_utilcmd.c b/src/backend/parser/parse_utilcmd.c
index b1255e3b70..01e2b77d5e 100644
--- a/src/backend/parser/parse_utilcmd.c
+++ b/src/backend/parser/parse_utilcmd.c
@@ -1965,7 +1965,8 @@ generateClonedExtStatsStmt(RangeVar *heapRel, Oid heapRelid,
stats->defnames = NULL;
stats->stat_types = stat_types;
stats->exprs = def_names;
- stats->relations = list_make1(heapRel);
+ stats->from_clause = list_make1(heapRel);
+ stats->rtable = NIL;
stats->stxcomment = NULL;
stats->transformed = true; /* don't need transformStatsStmt again */
stats->if_not_exists = false;
@@ -2890,33 +2891,43 @@ transformIndexStmt(Oid relid, IndexStmt *stmt, const char *queryString)
/*
* transformStatsStmt - parse analysis for CREATE STATISTICS
- *
- * To avoid race conditions, it's important that this function relies only on
- * the passed-in relid (and not on stmt->relation) to determine the target
- * relation.
*/
CreateStatsStmt *
-transformStatsStmt(Oid relid, CreateStatsStmt *stmt, const char *queryString)
+transformStatsStmt(CreateStatsStmt *stmt, const char *queryString)
{
ParseState *pstate;
ParseNamespaceItem *nsitem;
ListCell *l;
Relation rel;
+ RangeTblEntry *rte;
/* Nothing to do if statement already transformed. */
if (stmt->transformed)
return stmt;
+ /* Construct range table if not done yet. */
+ if (stmt->rtable == NIL)
+ stmt->rtable = transformStatsFromClause(stmt->from_clause);
+
+ /*
+ * The range table should have been derived from a FROM clause consisting
+ * of a single RangeVar, so we expect a single RangeTblEntry.
+ */
+ Assert(list_length(stmt->rtable) == 1);
+ rte = linitial(stmt->rtable);
+ Assert(IsA(rte, RangeTblEntry));
+
+ /* See transformStatsFromClause for notes on lock level. */
+ rel = relation_open(rte->relid, ShareUpdateExclusiveLock);
+
/* Set up pstate */
pstate = make_parsestate(NULL);
pstate->p_sourcetext = queryString;
/*
* Put the parent table into the rtable so that the expressions can refer
- * to its fields without qualification. Caller is responsible for locking
- * relation, but we still need to open it.
+ * to its fields without qualification.
*/
- rel = relation_open(relid, NoLock);
nsitem = addRangeTableEntryForRelation(pstate, rel,
AccessShareLock,
NULL, false, true);
@@ -2952,7 +2963,7 @@ transformStatsStmt(Oid relid, CreateStatsStmt *stmt, const char *queryString)
free_parsestate(pstate);
/* Close relation */
- table_close(rel, NoLock);
+ relation_close(rel, NoLock);
/* Mark statement as successfully transformed */
stmt->transformed = true;
@@ -2960,6 +2971,75 @@ transformStatsStmt(Oid relid, CreateStatsStmt *stmt, const char *queryString)
return stmt;
}
+/*
+ * Accepts a list of RangeVar objects and returns a corresponding list of
+ * RangeTblEntry objects. For use while transforming a CreateStatsStmt.
+ *
+ * At present, the only case we know how to handle is a FROM clause
+ * consisting of a single RangeVar.
+ */
+List *
+transformStatsFromClause(List *from_clause)
+{
+ RangeVar *rv = NULL;
+ RangeTblEntry *rte;
+ Relation rel;
+
+ /* Check for unsupported cases. */
+ if (list_length(from_clause) == 1)
+ rv = linitial(from_clause);
+ if (rv == NULL || !IsA(rv, RangeVar))
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("only a single relation is allowed in CREATE STATISTICS")));
+
+ /*
+ * CREATE STATISTICS will influence future execution plans but does not
+ * interfere with currently executing plans. So it should be enough to
+ * take ShareUpdateExclusiveLock on relation, conflicting with ANALYZE and
+ * other DDL that sets statistical information, but not with normal
+ * queries.
+ *
+ * XXX RangeVarCallbackOwnsRelation not needed here, to keep the same
+ * behavior as before.
+ */
+ rel = relation_openrv(rv, ShareUpdateExclusiveLock);
+
+ /* Restrict to allowed relation types */
+ if (rel->rd_rel->relkind != RELKIND_RELATION &&
+ rel->rd_rel->relkind != RELKIND_MATVIEW &&
+ rel->rd_rel->relkind != RELKIND_FOREIGN_TABLE &&
+ rel->rd_rel->relkind != RELKIND_PARTITIONED_TABLE)
+ ereport(ERROR,
+ (errcode(ERRCODE_WRONG_OBJECT_TYPE),
+ errmsg("cannot define statistics for relation \"%s\"",
+ RelationGetRelationName(rel)),
+ errdetail_relkind_not_supported(rel->rd_rel->relkind)));
+
+ /* You must own the relation to create stats on it */
+ if (!object_ownercheck(RelationRelationId, RelationGetRelid(rel), GetUserId()))
+ aclcheck_error(ACLCHECK_NOT_OWNER, get_relkind_objtype(rel->rd_rel->relkind),
+ RelationGetRelationName(rel));
+
+ /* Creating statistics on system catalogs is not allowed */
+ if (!allowSystemTableMods && IsSystemRelation(rel))
+ ereport(ERROR,
+ (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+ errmsg("permission denied: \"%s\" is a system catalog",
+ RelationGetRelationName(rel))));
+
+ /* Create a range table entry. */
+ rte = makeNode(RangeTblEntry);
+ rte->rtekind = RTE_RELATION;
+ rte->relid = RelationGetRelid(rel);
+ rte->rellockmode = ShareUpdateExclusiveLock;
+
+ /* Close relation */
+ relation_close(rel, NoLock);
+
+ return list_make1(rte);
+}
+
/*
* transformRuleStmt -
diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c
index 30b51bf4d3..4c2bdafe2a 100644
--- a/src/backend/tcop/utility.c
+++ b/src/backend/tcop/utility.c
@@ -1876,30 +1876,10 @@ ProcessUtilitySlow(ParseState *pstate,
case T_CreateStatsStmt:
{
- Oid relid;
CreateStatsStmt *stmt = (CreateStatsStmt *) parsetree;
- RangeVar *rel = (RangeVar *) linitial(stmt->relations);
-
- if (!IsA(rel, RangeVar))
- ereport(ERROR,
- (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
- errmsg("only a single relation is allowed in CREATE STATISTICS")));
-
- /*
- * CREATE STATISTICS will influence future execution plans
- * but does not interfere with currently executing plans.
- * So it should be enough to take ShareUpdateExclusiveLock
- * on relation, conflicting with ANALYZE and other DDL
- * that sets statistical information, but not with normal
- * queries.
- *
- * XXX RangeVarCallbackOwnsRelation not needed here, to
- * keep the same behavior as before.
- */
- relid = RangeVarGetRelid(rel, ShareUpdateExclusiveLock, false);
/* Run parse analysis ... */
- stmt = transformStatsStmt(relid, stmt, queryString);
+ stmt = transformStatsStmt(stmt, queryString);
address = CreateStatistics(stmt);
}
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index cc7b32b279..37d9a14959 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -3220,7 +3220,8 @@ typedef struct CreateStatsStmt
List *defnames; /* qualified name (list of String) */
List *stat_types; /* stat types (list of String) */
List *exprs; /* expressions to build statistics on */
- List *relations; /* rels to build stats on (list of RangeVar) */
+ List *from_clause; /* FROM clause */
+ List *rtable; /* list of range table entries */
char *stxcomment; /* comment to apply to stats, or NULL */
bool transformed; /* true when transformStatsStmt is finished */
bool if_not_exists; /* do nothing if stats name already exists */
diff --git a/src/include/parser/parse_utilcmd.h b/src/include/parser/parse_utilcmd.h
index 2b40a0b5b1..d68f41bd0c 100644
--- a/src/include/parser/parse_utilcmd.h
+++ b/src/include/parser/parse_utilcmd.h
@@ -26,8 +26,9 @@ extern AlterTableStmt *transformAlterTableStmt(Oid relid, AlterTableStmt *stmt,
List **afterStmts);
extern IndexStmt *transformIndexStmt(Oid relid, IndexStmt *stmt,
const char *queryString);
-extern CreateStatsStmt *transformStatsStmt(Oid relid, CreateStatsStmt *stmt,
+extern CreateStatsStmt *transformStatsStmt(CreateStatsStmt *stmt,
const char *queryString);
+extern List *transformStatsFromClause(List *from_clause);
extern void transformRuleStmt(RuleStmt *stmt, const char *queryString,
List **actions, Node **whereClause);
extern List *transformCreateSchemaStmtElements(List *schemaElts,
--
2.37.1 (Apple Git-137.1)
On Wed, May 3, 2023 at 1:50 AM Robert Haas <robertmhaas@gmail.com> wrote:
Hi,
So the attached patch is proposed as code cleanup, rather than
security patches. It changes the code to avoid the duplicate name
lookup altogether. There is no reason that I know of why this needs to
be back-patched for correctness, but I think it's worth putting into
master to make the code nicer and avoid doing things that in some
circumstances can be risky.
hi.
while working on
https://commitfest.postgresql.org/patch/6106
Somehow, I accidentally found this thread by searching on gmail (web).
I copied most of your patch (more than 50%) and posted it on that thread....