teaching Var about NOT NULL
Hi,
While hacking away at implementing join removal support for ANTI-JOINs I
realised that I couldn't just replace the join with a WHERE false
condition... Let me explain...
With a query such as:
SELECT * FROM a WHERE NOT EXISTS(SELECT 1 FROM b WHERE a.b_id = b.id);
Where a.b_id has a foreign key on b(id)
I'm working on a join removal patch which will turn this into:
SELECT * FROM a WHERE b_id IS NULL;
This seemed like a bit of a shame since with my test tables b_id is defined
NOT NULL, but there seemed to be no way to tell if I needed to add a WHERE
FALSE or a IS NULL check to the WHERE clause.
I quickly put together the attached patch which adds a "knownnotnull" bool
field to Var which we can set to true when we're completely sure that the
Var cannot contain any NULL values. I'm populating this from
pg_attribute.attnotnull where I can and setting it to false where I can't
see another way to tell for sure that nulls cannot exist.
The only use of knownnotnull that I've added to the patch is to turn a
query such as:
SELECT * FROM a WHERE b_id IS NULL;
To not scan the table, since id is defined as NOT NULL.
postgres=# alter table a alter column b_id drop not null;
ALTER TABLE
postgres=# explain select id from a where b_id is null;
QUERY PLAN
---------------------------------------------------
Seq Scan on a (cost=0.00..31.40 rows=11 width=4)
Filter: (b_id IS NULL)
Planning time: 0.340 ms
(3 rows)
postgres=# alter table a alter column b_id set not null;
ALTER TABLE
postgres=# explain select id from a where b_id is null;
QUERY PLAN
--------------------------------------------------------
Result (cost=0.00..31.40 rows=1 width=4)
One-Time Filter: false
-> Seq Scan on a (cost=0.00..31.40 rows=1 width=4)
Planning time: 0.402 ms
(4 rows)
Having this extra flag could likely help optimise NOT IN(SELECT notnullcol
FROM table) to allow this to become an ANTI-JOIN. It will also help join
optimise join removals a little more.
The patch is just a few minutes old and there's no regression tests yet.
I'd rather have some feedback before I proceed with it.
Regards
David Rowley
Attachments:
var_not_null_v0.1.patchapplication/octet-stream; name=var_not_null_v0.1.patchDownload
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 7dd43a9..65fb2d8 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -1124,7 +1124,7 @@ postgresAddForeignUpdateTargets(Query *parsetree,
TIDOID,
-1,
InvalidOid,
- 0);
+ 0, false);
/* Wrap it in a resjunk TLE with the right name ... */
attrname = "ctid";
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 341262b..7003d7a 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -7551,7 +7551,7 @@ ATPrepAlterColumnType(List **wqueue,
transform = (Node *) makeVar(1, attnum,
attTup->atttypid, attTup->atttypmod,
attTup->attcollation,
- 0);
+ 0, false);
}
transform = coerce_to_target_type(pstate,
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 43530aa..1d25590 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -1060,6 +1060,7 @@ _copyVar(const Var *from)
COPY_SCALAR_FIELD(varnoold);
COPY_SCALAR_FIELD(varoattno);
COPY_LOCATION_FIELD(location);
+ COPY_SCALAR_FIELD(knownnotnull);
return newnode;
}
diff --git a/src/backend/nodes/makefuncs.c b/src/backend/nodes/makefuncs.c
index da59c58..ed613fb 100644
--- a/src/backend/nodes/makefuncs.c
+++ b/src/backend/nodes/makefuncs.c
@@ -68,7 +68,8 @@ makeVar(Index varno,
Oid vartype,
int32 vartypmod,
Oid varcollid,
- Index varlevelsup)
+ Index varlevelsup,
+ bool knownnotnull)
{
Var *var = makeNode(Var);
@@ -78,7 +79,6 @@ makeVar(Index varno,
var->vartypmod = vartypmod;
var->varcollid = varcollid;
var->varlevelsup = varlevelsup;
-
/*
* Since few if any routines ever create Var nodes with varnoold/varoattno
* different from varno/varattno, we don't provide separate arguments for
@@ -90,6 +90,7 @@ makeVar(Index varno,
/* Likewise, we just set location to "unknown" here */
var->location = -1;
+ var->knownnotnull = knownnotnull;
return var;
}
@@ -108,7 +109,8 @@ makeVarFromTargetEntry(Index varno,
exprType((Node *) tle->expr),
exprTypmod((Node *) tle->expr),
exprCollation((Node *) tle->expr),
- 0);
+ 0,
+ false);
}
/*
@@ -150,7 +152,8 @@ makeWholeRowVar(RangeTblEntry *rte,
toid,
-1,
InvalidOid,
- varlevelsup);
+ varlevelsup,
+ false);
break;
case RTE_FUNCTION:
@@ -168,7 +171,8 @@ makeWholeRowVar(RangeTblEntry *rte,
RECORDOID,
-1,
InvalidOid,
- varlevelsup);
+ varlevelsup,
+ false);
break;
}
@@ -182,7 +186,8 @@ makeWholeRowVar(RangeTblEntry *rte,
toid,
-1,
InvalidOid,
- varlevelsup);
+ varlevelsup,
+ false);
}
else if (allowScalar)
{
@@ -192,7 +197,8 @@ makeWholeRowVar(RangeTblEntry *rte,
toid,
-1,
exprCollation(fexpr),
- varlevelsup);
+ varlevelsup,
+ false);
}
else
{
@@ -202,7 +208,8 @@ makeWholeRowVar(RangeTblEntry *rte,
RECORDOID,
-1,
InvalidOid,
- varlevelsup);
+ varlevelsup,
+ false);
}
break;
@@ -219,7 +226,8 @@ makeWholeRowVar(RangeTblEntry *rte,
RECORDOID,
-1,
InvalidOid,
- varlevelsup);
+ varlevelsup,
+ false);
break;
}
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index 4b641a2..572761e 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -2999,7 +2999,7 @@ fix_indexqual_operand(Node *node, IndexOptInfo *index, int indexcol)
result = makeVar(INDEX_VAR, indexcol + 1,
exprType(lfirst(indexpr_item)), -1,
exprCollation(lfirst(indexpr_item)),
- 0);
+ 0, false);
return (Node *) result;
}
else
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 768c5c7..cb7863b 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -1398,7 +1398,8 @@ set_dummy_tlist_references(Plan *plan, int rtoffset)
exprType((Node *) oldvar),
exprTypmod((Node *) oldvar),
exprCollation((Node *) oldvar),
- 0);
+ 0,
+ false);
if (IsA(oldvar, Var))
{
newvar->varnoold = oldvar->varno + rtoffset;
diff --git a/src/backend/optimizer/prep/preptlist.c b/src/backend/optimizer/prep/preptlist.c
index 4ab12e5..875b7b4 100644
--- a/src/backend/optimizer/prep/preptlist.c
+++ b/src/backend/optimizer/prep/preptlist.c
@@ -100,7 +100,8 @@ preprocess_targetlist(PlannerInfo *root, List *tlist)
TIDOID,
-1,
InvalidOid,
- 0);
+ 0,
+ false);
snprintf(resname, sizeof(resname), "ctid%u", rc->rowmarkId);
tle = makeTargetEntry((Expr *) var,
list_length(tlist) + 1,
@@ -116,7 +117,8 @@ preprocess_targetlist(PlannerInfo *root, List *tlist)
OIDOID,
-1,
InvalidOid,
- 0);
+ 0,
+ false);
snprintf(resname, sizeof(resname), "tableoid%u", rc->rowmarkId);
tle = makeTargetEntry((Expr *) var,
list_length(tlist) + 1,
@@ -303,7 +305,8 @@ expand_targetlist(List *tlist, int command_type,
atttype,
atttypmod,
attcollation,
- 0);
+ 0,
+ false);
}
else
{
diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c
index 0410fdd..3baf8c1 100644
--- a/src/backend/optimizer/prep/prepunion.c
+++ b/src/backend/optimizer/prep/prepunion.c
@@ -948,7 +948,8 @@ generate_setop_tlist(List *colTypes, List *colCollations,
exprType((Node *) inputtle->expr),
exprTypmod((Node *) inputtle->expr),
exprCollation((Node *) inputtle->expr),
- 0);
+ 0,
+ false);
if (exprType(expr) != colType)
{
@@ -1106,7 +1107,8 @@ generate_append_tlist(List *colTypes, List *colCollations,
colType,
colTypmod,
colColl,
- 0);
+ 0,
+ false);
tle = makeTargetEntry((Expr *) expr,
(AttrNumber) resno++,
pstrdup(reftle->resname),
@@ -1123,7 +1125,8 @@ generate_append_tlist(List *colTypes, List *colCollations,
INT4OID,
-1,
InvalidOid,
- 0);
+ 0,
+ false);
tle = makeTargetEntry((Expr *) expr,
(AttrNumber) resno++,
pstrdup("flag"),
@@ -1469,7 +1472,8 @@ make_inh_translation_list(Relation oldrelation, Relation newrelation,
atttypid,
atttypmod,
attcollation,
- 0));
+ 0,
+ false));
continue;
}
@@ -1515,7 +1519,8 @@ make_inh_translation_list(Relation oldrelation, Relation newrelation,
atttypid,
atttypmod,
attcollation,
- 0));
+ 0,
+ false));
}
*translated_vars = vars;
diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c
index 97dacaa..aafc415 100644
--- a/src/backend/optimizer/util/clauses.c
+++ b/src/backend/optimizer/util/clauses.c
@@ -3220,7 +3220,8 @@ eval_const_expressions_mutator(Node *node,
fselect->resulttype,
fselect->resulttypmod,
fselect->resultcollid,
- ((Var *) arg)->varlevelsup);
+ ((Var *) arg)->varlevelsup,
+ false);
}
if (arg && IsA(arg, RowExpr))
{
@@ -3306,6 +3307,16 @@ eval_const_expressions_mutator(Node *node,
/* Else we need an AND node */
return (Node *) make_andclause(newargs);
}
+ if (!ntest->argisrow && arg && IsA(arg, Var))
+ {
+ /*
+ * If we encounter a var IS NULL on a var that we know
+ * can never have NULL values then we can prove the condition
+ * to always be false
+ */
+ if (((Var *) arg)->knownnotnull && ntest->nulltesttype == IS_NULL)
+ return makeBoolConst(false, false);
+ }
if (!ntest->argisrow && arg && IsA(arg, Const))
{
Const *carg = (Const *) arg;
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index b2becfa..5d56fb6 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -717,7 +717,8 @@ get_relation_constraints(PlannerInfo *root,
att->atttypid,
att->atttypmod,
att->attcollation,
- 0);
+ 0,
+ false);
ntest->nulltesttype = IS_NOT_NULL;
ntest->argisrow = type_is_rowtype(att->atttypid);
result = lappend(result, ntest);
@@ -881,7 +882,8 @@ build_physical_tlist(PlannerInfo *root, RelOptInfo *rel)
att_tup->atttypid,
att_tup->atttypmod,
att_tup->attcollation,
- 0);
+ 0,
+ att_tup->attnotnull);
tlist = lappend(tlist,
makeTargetEntry((Expr *) var,
@@ -992,7 +994,8 @@ build_index_tlist(PlannerInfo *root, IndexOptInfo *index,
att_tup->atttypid,
att_tup->atttypmod,
att_tup->attcollation,
- 0);
+ 0,
+ false);
}
else
{
diff --git a/src/backend/parser/analyze.c b/src/backend/parser/analyze.c
index fb6c44c..8804061 100644
--- a/src/backend/parser/analyze.c
+++ b/src/backend/parser/analyze.c
@@ -1398,7 +1398,8 @@ transformSetOperationStmt(ParseState *pstate, SelectStmt *stmt)
colType,
colTypmod,
colCollation,
- 0);
+ 0,
+ false);
var->location = exprLocation((Node *) lefttle->expr);
tle = makeTargetEntry((Expr *) var,
(AttrNumber) pstate->p_next_resno++,
diff --git a/src/backend/parser/parse_node.c b/src/backend/parser/parse_node.c
index 1e3d1f6..c2c3aec 100644
--- a/src/backend/parser/parse_node.c
+++ b/src/backend/parser/parse_node.c
@@ -191,10 +191,12 @@ make_var(ParseState *pstate, RangeTblEntry *rte, int attrno, int location)
Oid vartypeid;
int32 type_mod;
Oid varcollid;
+ bool knownnotnull;
vnum = RTERangeTablePosn(pstate, rte, &sublevels_up);
- get_rte_attribute_type(rte, attrno, &vartypeid, &type_mod, &varcollid);
- result = makeVar(vnum, attrno, vartypeid, type_mod, varcollid, sublevels_up);
+ get_rte_attribute_type(rte, attrno, &vartypeid, &type_mod, &varcollid, &knownnotnull);
+ result = makeVar(vnum, attrno, vartypeid, type_mod, varcollid, sublevels_up,
+ knownnotnull);
result->location = location;
return result;
}
diff --git a/src/backend/parser/parse_relation.c b/src/backend/parser/parse_relation.c
index 478584d..2169a72 100644
--- a/src/backend/parser/parse_relation.c
+++ b/src/backend/parser/parse_relation.c
@@ -1844,7 +1844,7 @@ expandRTE(RangeTblEntry *rte, int rtindex, int sublevels_up,
exprType((Node *) te->expr),
exprTypmod((Node *) te->expr),
exprCollation((Node *) te->expr),
- sublevels_up);
+ sublevels_up, false);
varnode->location = location;
*colvars = lappend(*colvars, varnode);
@@ -1892,7 +1892,7 @@ expandRTE(RangeTblEntry *rte, int rtindex, int sublevels_up,
varnode = makeVar(rtindex, atts_done + 1,
funcrettype, -1,
exprCollation(rtfunc->funcexpr),
- sublevels_up);
+ sublevels_up, false);
varnode->location = location;
*colvars = lappend(*colvars, varnode);
@@ -1934,7 +1934,8 @@ expandRTE(RangeTblEntry *rte, int rtindex, int sublevels_up,
attrtype,
attrtypmod,
attrcollation,
- sublevels_up);
+ sublevels_up,
+ false);
varnode->location = location;
*colvars = lappend(*colvars, varnode);
}
@@ -1962,7 +1963,8 @@ expandRTE(RangeTblEntry *rte, int rtindex, int sublevels_up,
INT8OID,
-1,
InvalidOid,
- sublevels_up);
+ sublevels_up,
+ false);
*colvars = lappend(*colvars, varnode);
}
@@ -2002,7 +2004,8 @@ expandRTE(RangeTblEntry *rte, int rtindex, int sublevels_up,
exprType(col),
exprTypmod(col),
colcollation,
- sublevels_up);
+ sublevels_up,
+ false);
varnode->location = location;
*colvars = lappend(*colvars, varnode);
}
@@ -2070,7 +2073,8 @@ expandRTE(RangeTblEntry *rte, int rtindex, int sublevels_up,
exprType(avar),
exprTypmod(avar),
exprCollation(avar),
- sublevels_up);
+ sublevels_up,
+ false);
varnode->location = location;
*colvars = lappend(*colvars, varnode);
@@ -2111,7 +2115,8 @@ expandRTE(RangeTblEntry *rte, int rtindex, int sublevels_up,
varnode = makeVar(rtindex, varattno,
coltype, coltypmod, colcoll,
- sublevels_up);
+ sublevels_up,
+ false);
*colvars = lappend(*colvars, varnode);
}
}
@@ -2220,7 +2225,7 @@ expandTupleDesc(TupleDesc tupdesc, Alias *eref, int count, int offset,
varnode = makeVar(rtindex, varattno + offset + 1,
attr->atttypid, attr->atttypmod,
attr->attcollation,
- sublevels_up);
+ sublevels_up, attr->attnotnull);
varnode->location = location;
*colvars = lappend(*colvars, varnode);
@@ -2330,7 +2335,7 @@ get_rte_attribute_name(RangeTblEntry *rte, AttrNumber attnum)
*/
void
get_rte_attribute_type(RangeTblEntry *rte, AttrNumber attnum,
- Oid *vartype, int32 *vartypmod, Oid *varcollid)
+ Oid *vartype, int32 *vartypmod, Oid *varcollid, bool *knownnotnull)
{
switch (rte->rtekind)
{
@@ -2361,6 +2366,7 @@ get_rte_attribute_type(RangeTblEntry *rte, AttrNumber attnum,
*vartype = att_tup->atttypid;
*vartypmod = att_tup->atttypmod;
*varcollid = att_tup->attcollation;
+ *knownnotnull = att_tup->attnotnull;
ReleaseSysCache(tp);
}
break;
diff --git a/src/backend/rewrite/rewriteHandler.c b/src/backend/rewrite/rewriteHandler.c
index e6c5530..75128df 100644
--- a/src/backend/rewrite/rewriteHandler.c
+++ b/src/backend/rewrite/rewriteHandler.c
@@ -830,7 +830,8 @@ rewriteTargetListIU(Query *parsetree, Relation target_relation,
att_tup->atttypid,
att_tup->atttypmod,
att_tup->attcollation,
- 0);
+ 0,
+ false);
new_tle = makeTargetEntry((Expr *) new_expr,
attrno,
@@ -1214,7 +1215,8 @@ rewriteTargetListUD(Query *parsetree, RangeTblEntry *target_rte,
TIDOID,
-1,
InvalidOid,
- 0);
+ 0,
+ false);
attrname = "ctid";
}
diff --git a/src/include/nodes/makefuncs.h b/src/include/nodes/makefuncs.h
index e108b85..9dac6c6 100644
--- a/src/include/nodes/makefuncs.h
+++ b/src/include/nodes/makefuncs.h
@@ -28,7 +28,8 @@ extern Var *makeVar(Index varno,
Oid vartype,
int32 vartypmod,
Oid varcollid,
- Index varlevelsup);
+ Index varlevelsup,
+ bool knownnotnull);
extern Var *makeVarFromTargetEntry(Index varno,
TargetEntry *tle);
diff --git a/src/include/nodes/primnodes.h b/src/include/nodes/primnodes.h
index 4f03ef9..c8f1185 100644
--- a/src/include/nodes/primnodes.h
+++ b/src/include/nodes/primnodes.h
@@ -157,6 +157,7 @@ typedef struct Var
Index varnoold; /* original value of varno, for debugging */
AttrNumber varoattno; /* original value of varattno */
int location; /* token location, or -1 if unknown */
+ bool knownnotnull; /* is the var known to never be NULL */
} Var;
/*
diff --git a/src/include/parser/parsetree.h b/src/include/parser/parsetree.h
index ef67b41..49343a2 100644
--- a/src/include/parser/parsetree.h
+++ b/src/include/parser/parsetree.h
@@ -52,7 +52,8 @@ extern char *get_rte_attribute_name(RangeTblEntry *rte, AttrNumber attnum);
* type and typemod info for that attribute of that RTE.
*/
extern void get_rte_attribute_type(RangeTblEntry *rte, AttrNumber attnum,
- Oid *vartype, int32 *vartypmod, Oid *varcollid);
+ Oid *vartype, int32 *vartypmod, Oid *varcollid,
+ bool *knownnotnull);
/*
* Check whether an attribute of an RTE has been dropped (note that
David Rowley <dgrowleyml@gmail.com> writes:
I quickly put together the attached patch which adds a "knownnotnull" bool
field to Var which we can set to true when we're completely sure that the
Var cannot contain any NULL values.
This is utterly the wrong way to go about it. How will you update views
containing such Vars, when someone does an ALTER TABLE SET/DROP NOT NULL?
The right thing is for the planner to look *at plan time* to see if the
column is marked NOT NULL. (This is safe against the ALTER TABLE problem
because an ALTER will result in a relcache invalidation signal, forcing
any plans referencing the table to be regenerated before next use.)
One way to go about it would be to add a bool array to RelOptInfo and
teach plancat.c to populate the array. However, that would only be a win
if we could expect the information to get used (preferably multiple times)
in most planning attempts. That does not seem to me to be likely for this
use case, so I'd go with just doing a pg_attribute catcache lookup
on-the-fly when necessary. I'd suggest
(1) new lsyscache.c utility function taking a relation OID and an attnum;
(2) new function somewhere in the planner that decides whether an
expression is known not-null. For a Var, it'd fetch the matching RTE,
see if it's RTE_RELATION, and if so call the lsyscache.c function.
There are a lot of other potential cases that such a function could be
taught about later, if it proves useful.
BTW, you'd need to be pretty careful about semantics here. Even if
the Var is known not-null at the point of scanning the relation,
what if that relation is nullable by some upper outer join? Perhaps
the function (2) would need to take an argument describing the
join level at which we're making the test.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers