[PATCH] Equivalence Class Filters

Started by David Rowley · about 10 years ago · 25 messages

#1 David Rowley
david.rowley@2ndquadrant.com
1 attachment(s)

Hi,

I'd like to gather up what the community interest might be for this patch.
Let me explain:

As of today, Equivalence Classes are used in the query planner to give the
planner more flexibility in the join order, by collecting information which
describes which expressions are equal to each other. These Equivalence
Classes, when they contain a constant value, also allow predicate push down.
For example:

# explain select * from t1 inner join t2 on t1.id = t2.id where t1.id=1;
                               QUERY PLAN
-----------------------------------------------------------------------------
 Nested Loop  (cost=0.56..12.61 rows=1 width=12)
   ->  Index Scan using t1_pkey on t1  (cost=0.29..8.30 rows=1 width=8)
         Index Cond: (id = 1)
   ->  Index Only Scan using t2_pkey on t2  (cost=0.28..4.29 rows=1 width=4)
         Index Cond: (id = 1)
(5 rows)

We can see that a qual was added to filter t2.id=1.

As of today these Equivalence Classes only incorporate expressions which
use the equality operator, but what if the above query had been written as:

select * from t1 inner join t2 on t1.id = t2.id where t1.id <= 10;

Should we not be able to assume that t2.id is also <= 10? Currently we
don't, but the attached patch expands Equivalence Classes to add what I've
called Equivalence Filters.

This allows the above query to produce a plan like:

                               QUERY PLAN
------------------------------------------------------------------------------
 Merge Join  (cost=0.56..5.68 rows=1 width=12)
   Merge Cond: (t1.id = t2.id)
   ->  Index Scan using t1_pkey on t1  (cost=0.29..8.44 rows=9 width=8)
         Index Cond: (id <= 10)
   ->  Index Only Scan using t2_pkey on t2  (cost=0.28..4.45 rows=10 width=4)

Now, it may not be that common to perform range filters on an id column,
but it can be fairly common for date values to be join conditions and have
date range filters too. For example, in [1] Alexandre claimed a 1100 to 2200
times performance improvement after manually pushing the date filter into the
subquery. This patch allows this to become automatic.

This all works by simply collecting, while building the EquivalenceClasses,
any OpExprs which were rejected for the eqclass because they don't use an
equality operator. OpExprs in the form "Expr op Const" and "Const op Expr"
are collected, and once the EquivalenceClasses have been built, each "Expr"
is searched for among the members of the built classes. If a match is found,
the qual becomes an EquivalenceFilter and gets tagged onto the
EquivalenceClass.
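To make that flow concrete, here's a toy model of the matching step. This is purely illustrative Python, not planner code: members are strings standing in for the Expr trees the real code compares with equal(), and all names here are made up.

```python
from dataclasses import dataclass, field

@dataclass
class EClass:
    members: set                # expressions proven equal, e.g. {"t1.id", "t2.id"}
    has_const: bool = False     # class already contains a constant member
    filters: list = field(default_factory=list)

def distribute_filter_quals(eclasses, quals):
    """quals: (expr, op, const) tuples harvested while building the classes."""
    for expr, op, const in quals:
        for ec in eclasses:
            # A constant member (e.g. "x = 1") already implies the tightest
            # possible filter, so don't tag more filters onto such a class.
            if ec.has_const:
                continue
            if expr in ec.members:
                ec.filters.append((op, const))
                break           # qual matched; move on to the next one

ecs = [EClass(members={"t1.id", "t2.id"})]
distribute_filter_quals(ecs, [("t1.id", "<=", 10)])
print(ecs[0].filters)           # [('<=', 10)] -- now derivable for t2.id too
```

Once a filter hangs off the class, every other member of the class can have the same restriction derived for it, which is what produces the extra Index Cond in the plan above.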

Patch Status - I'm a bit uncertain as to how far we can go with this. If
we deem this a good idea, then we'd need to be careful not to cause any
performance regression. For example, if the filter was "somestaticfunc(col)
< 1234", then we might not want to push that down, as somestaticfunc() might
be so costly that it would be better to perform the join with all the rows
instead. For this reason I've limited the current patch to only using
operators which are members of a btree op class. Perhaps we could do more,
but likely there's less of a win to be had, due to less chance of such a
qual being useful for an index scan.
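The btree restriction also has a practical payoff: btree opfamilies assign each operator a strategy number (< <= = >= >), so a filter proven for one member can be re-derived for another member's type by asking the same family for the operator with the same strategy. Here's a hedged sketch of that lookup with a toy stand-in for the pg_amop catalog (the operator names are illustrative, not queried from a real catalog):

```python
# btree strategy numbers, as used by the access method
BTREE_STRATEGY = {"<": 1, "<=": 2, "=": 3, ">=": 4, ">": 5}

# toy "opfamily": (lefttype, righttype, strategy) -> operator name
integer_ops = {
    ("int4", "int4", 2): "int4le",   # int4 <= int4
    ("int8", "int4", 2): "int84le",  # int8 <= int4 (cross-type)
}

def derive_operator(opfamily, op, lefttype, righttype):
    """Find the opfamily member playing the same role for the new types,
    or None if the family has no suitable cross-type operator."""
    return opfamily.get((lefttype, righttype, BTREE_STRATEGY[op]))

# If t1.id is int4 but t2.id is int8, the derived "t2.id <= 10" needs
# the cross-type operator from the same family:
print(derive_operator(integer_ops, "<=", "int8", "int4"))  # int84le
```

This mirrors what the patch does with get_opfamilies() plus get_opfamily_member(): without a strategy-number framework like btree's, there's no principled way to transfer the operator across types.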

Writing this patch has been on my "interesting" list for a while now. I got
some time last weekend to finally write it, but since that happened at the
weekend rather than in work time, it's a low priority for me too. However,
there's no sense in it gathering dust, so I'm posting it today.

Is this something that we might want?

[1]: /messages/by-id/CAGewt-tbqRW5NLAzKDCvP_ztEN_LMMyGugQ1iVVEzB+p2XpefQ@mail.gmail.com

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

equivalenceclassfilters_2015-12-05_aa62f00b.patch (application/octet-stream)
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 012c14b..0964d82 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -1976,6 +1976,17 @@ _outEquivalenceMember(StringInfo str, const EquivalenceMember *node)
 }
 
 static void
+_outEquivalenceFilter(StringInfo str, const EquivalenceFilter *node)
+{
+	WRITE_NODE_TYPE("EQUIVALENCEFILTER");
+
+	WRITE_NODE_FIELD(ef_const);
+	WRITE_OID_FIELD(ef_opno);
+	WRITE_BOOL_FIELD(ef_const_is_left);
+	WRITE_UINT_FIELD(ef_source_rel);
+}
+
+static void
 _outPathKey(StringInfo str, const PathKey *node)
 {
 	WRITE_NODE_TYPE("PATHKEY");
@@ -3339,6 +3350,9 @@ _outNode(StringInfo str, const void *obj)
 			case T_EquivalenceMember:
 				_outEquivalenceMember(str, obj);
 				break;
+			case T_EquivalenceFilter:
+				_outEquivalenceFilter(str, obj);
+				break;
 			case T_PathKey:
 				_outPathKey(str, obj);
 				break;
diff --git a/src/backend/optimizer/path/equivclass.c b/src/backend/optimizer/path/equivclass.c
index 48be45d..16193d9 100644
--- a/src/backend/optimizer/path/equivclass.c
+++ b/src/backend/optimizer/path/equivclass.c
@@ -17,6 +17,7 @@
 #include "postgres.h"
 
 #include "access/stratnum.h"
+#include "catalog/pg_am.h"
 #include "catalog/pg_type.h"
 #include "nodes/makefuncs.h"
 #include "nodes/nodeFuncs.h"
@@ -840,6 +841,37 @@ generate_base_implied_equalities_const(PlannerInfo *root,
 }
 
 /*
+ * Finds the opfamily and strategy number for the specified 'opno' in the
+ * 'method' access method. Returns true if one is found and sets 'family' and
+ * 'amstrategy', or false if none is found.
+ */
+static bool
+find_am_family_and_stategy(Oid opno, Oid method, Oid *family, int *amstrategy)
+{
+	List *opfamilies;
+	ListCell *l;
+	int strategy;
+
+	opfamilies = get_opfamilies(opno, method);
+
+	foreach(l, opfamilies)
+	{
+		Oid opfamily = lfirst_oid(l);
+
+		strategy = get_op_opfamily_strategy(opno, opfamily);
+
+		if (strategy)
+		{
+			*amstrategy = strategy;
+			*family = opfamily;
+			return true;
+		}
+	}
+
+	return false;
+}
+
+/*
  * generate_base_implied_equalities when EC contains no pseudoconstants
  */
 static void
@@ -848,6 +880,7 @@ generate_base_implied_equalities_no_const(PlannerInfo *root,
 {
 	EquivalenceMember **prev_ems;
 	ListCell   *lc;
+	ListCell   *lc2;
 
 	/*
 	 * We scan the EC members once and track the last-seen member for each
@@ -893,6 +926,56 @@ generate_base_implied_equalities_no_const(PlannerInfo *root,
 									 ec->ec_below_outer_join,
 									 false);
 		}
+
+		/*
+		 * Also push any EquivalenceFilter clauses down into all relations
+		 * other than the one which the filter actually originated from.
+		 */
+		foreach(lc2, ec->ec_filters)
+		{
+			EquivalenceFilter *ef = (EquivalenceFilter *) lfirst(lc2);
+			Expr *leftexpr;
+			Expr *rightexpr;
+			int strategy;
+			Oid opno;
+			Oid family;
+
+			if (ef->ef_source_rel == relid)
+				continue;
+
+			if (!find_am_family_and_stategy(ef->ef_opno, BTREE_AM_OID,
+				&family, &strategy))
+				continue;
+
+			if (ef->ef_const_is_left)
+			{
+				leftexpr = (Expr *) ef->ef_const;
+				rightexpr = cur_em->em_expr;
+			}
+			else
+			{
+				leftexpr = cur_em->em_expr;
+				rightexpr = (Expr *) ef->ef_const;
+			}
+
+			opno = get_opfamily_member(family,
+										exprType((Node *) leftexpr),
+										exprType((Node *) rightexpr),
+										strategy);
+
+			if (opno == InvalidOid)
+				continue;
+
+			process_implied_equality(root, opno,
+										ec->ec_collation,
+										leftexpr,
+										rightexpr,
+										bms_copy(ec->ec_relids),
+										bms_copy(cur_em->em_nullable_relids),
+										ec->ec_below_outer_join,
+										false);
+		}
+
 		prev_ems[relid] = cur_em;
 	}
 
@@ -1384,6 +1467,104 @@ create_join_clause(PlannerInfo *root,
 	return rinfo;
 }
 
+/*
+ * distribute_filter_quals_to_eclass
+ *		For each OpExpr in quallist look for an eclass which has an Expr
+ *		matching the Expr in the OpExpr. If a match is found we add a new
+ *		EquivalenceFilter to the eclass containing the filter details.
+ */
+void
+distribute_filter_quals_to_eclass(PlannerInfo *root, List *quallist)
+{
+	ListCell *l;
+
+	/* fast path for when no eclasses have been generated */
+	if (root->eq_classes == NIL)
+		return;
+
+	/*
+	 * For each qual in quallist try and find an eclass which contains the
+	 * non-Const part of the OpExpr. We'll tag any matches that we find onto
+	 * the correct eclass.
+	 */
+	foreach(l, quallist)
+	{
+		OpExpr	   *opexpr = (OpExpr *) lfirst(l);
+		Expr	   *leftexpr = (Expr *) linitial(opexpr->args);
+		Expr	   *rightexpr = (Expr *) lsecond(opexpr->args);
+		Const	   *constexpr;
+		Expr	   *varexpr;
+		Relids		exprrels;
+		int			relid;
+		bool		const_isleft;
+		ListCell *l2;
+
+		/*
+		 * Determine if the OpExpr is in the form "expr op const" or
+		 * "const op expr".
+		 */
+		if (IsA(leftexpr, Const))
+		{
+			constexpr = (Const *) leftexpr;
+			varexpr = rightexpr;
+			const_isleft = true;
+		}
+		else
+		{
+			constexpr = (Const *) rightexpr;
+			varexpr = leftexpr;
+			const_isleft = false;
+		}
+
+		exprrels = pull_varnos((Node *) varexpr);
+
+		/* should be filtered out, but we need to determine relid anyway */
+		if (!bms_get_singleton_member(exprrels, &relid))
+			continue;
+
+		/* search for a matching eclass member in all eclasses */
+		foreach(l2, root->eq_classes)
+		{
+			EquivalenceClass *ec = (EquivalenceClass *) lfirst(l2);
+			ListCell *l3;
+
+			if (ec->ec_broken || ec->ec_has_volatile)
+				continue;
+
+			/*
+			 * if the eclass has a const then that const will serve as the
+			 * filter, we needn't add any others.
+			 */
+			if (ec->ec_has_const)
+				continue;
+
+			/* skip this eclass if no members belong to this relid */
+			if (!bms_is_member(relid, ec->ec_relids))
+				continue;
+
+			foreach(l3, ec->ec_members)
+			{
+				EquivalenceMember *em = (EquivalenceMember *) lfirst(l3);
+
+				if (!bms_is_member(relid, em->em_relids))
+					continue;
+
+				if (equal(em->em_expr, varexpr))
+				{
+					EquivalenceFilter *efilter;
+					efilter = makeNode(EquivalenceFilter);
+					efilter->ef_const = (Const *) copyObject(constexpr);
+					efilter->ef_const_is_left = const_isleft;
+					efilter->ef_opno = opexpr->opno;
+					efilter->ef_source_rel = relid;
+
+					ec->ec_filters = lappend(ec->ec_filters, efilter);
+					break;		/* Onto the next eclass */
+				}
+			}
+		}
+	}
+}
 
 /*
  * reconsider_outer_join_clauses
diff --git a/src/backend/optimizer/plan/initsplan.c b/src/backend/optimizer/plan/initsplan.c
index 2f4e818..93f340b 100644
--- a/src/backend/optimizer/plan/initsplan.c
+++ b/src/backend/optimizer/plan/initsplan.c
@@ -51,7 +51,7 @@ static void add_lateral_info(PlannerInfo *root, Relids lhs, Relids rhs);
 static List *deconstruct_recurse(PlannerInfo *root, Node *jtnode,
 					bool below_outer_join,
 					Relids *qualscope, Relids *inner_join_rels,
-					List **postponed_qual_list);
+					List **postponed_qual_list, List **filter_qual_list);
 static SpecialJoinInfo *make_outerjoininfo(PlannerInfo *root,
 				   Relids left_rels, Relids right_rels,
 				   Relids inner_join_rels,
@@ -65,7 +65,9 @@ static void distribute_qual_to_rels(PlannerInfo *root, Node *clause,
 						Relids ojscope,
 						Relids outerjoin_nonnullable,
 						Relids deduced_nullable_relids,
-						List **postponed_qual_list);
+						List **postponed_qual_list,
+						List **filter_qual_list);
+static bool is_simple_filter_qual(OpExpr *expr);
 static bool check_outerjoin_delay(PlannerInfo *root, Relids *relids_p,
 					  Relids *nullable_relids_p, bool is_pushed_down);
 static bool check_equivalence_delay(PlannerInfo *root,
@@ -648,6 +650,7 @@ deconstruct_jointree(PlannerInfo *root)
 	Relids		qualscope;
 	Relids		inner_join_rels;
 	List	   *postponed_qual_list = NIL;
+	List	   *filter_qual_list = NIL;
 
 	/* Start recursion at top of jointree */
 	Assert(root->parse->jointree != NULL &&
@@ -658,11 +661,14 @@ deconstruct_jointree(PlannerInfo *root)
 
 	result = deconstruct_recurse(root, (Node *) root->parse->jointree, false,
 								 &qualscope, &inner_join_rels,
-								 &postponed_qual_list);
+								 &postponed_qual_list, &filter_qual_list);
 
 	/* Shouldn't be any leftover quals */
 	Assert(postponed_qual_list == NIL);
 
+	/* try and match each filter_qual_list item up with an eclass. */
+	distribute_filter_quals_to_eclass(root, filter_qual_list);
+
 	return result;
 }
 
@@ -683,6 +689,8 @@ deconstruct_jointree(PlannerInfo *root)
  *		or free this, either)
  *	*postponed_qual_list is a list of PostponedQual structs, which we can
  *		add quals to if they turn out to belong to a higher join level
+ *	*filter_qual_list is appended to with quals which may be useful to
+ *		include as EquivalenceFilters.
  *	Return value is the appropriate joinlist for this jointree node
  *
  * In addition, entries will be added to root->join_info_list for outer joins.
@@ -690,7 +698,7 @@ deconstruct_jointree(PlannerInfo *root)
 static List *
 deconstruct_recurse(PlannerInfo *root, Node *jtnode, bool below_outer_join,
 					Relids *qualscope, Relids *inner_join_rels,
-					List **postponed_qual_list)
+					List **postponed_qual_list, List **filter_qual_list)
 {
 	List	   *joinlist;
 
@@ -737,7 +745,8 @@ deconstruct_recurse(PlannerInfo *root, Node *jtnode, bool below_outer_join,
 											   below_outer_join,
 											   &sub_qualscope,
 											   inner_join_rels,
-											   &child_postponed_quals);
+											   &child_postponed_quals,
+											   filter_qual_list);
 			*qualscope = bms_add_members(*qualscope, sub_qualscope);
 			sub_members = list_length(sub_joinlist);
 			remaining--;
@@ -770,7 +779,7 @@ deconstruct_recurse(PlannerInfo *root, Node *jtnode, bool below_outer_join,
 				distribute_qual_to_rels(root, pq->qual,
 										false, below_outer_join, JOIN_INNER,
 										*qualscope, NULL, NULL, NULL,
-										NULL);
+										NULL, filter_qual_list);
 			else
 				*postponed_qual_list = lappend(*postponed_qual_list, pq);
 		}
@@ -785,7 +794,7 @@ deconstruct_recurse(PlannerInfo *root, Node *jtnode, bool below_outer_join,
 			distribute_qual_to_rels(root, qual,
 									false, below_outer_join, JOIN_INNER,
 									*qualscope, NULL, NULL, NULL,
-									postponed_qual_list);
+									postponed_qual_list, filter_qual_list);
 		}
 	}
 	else if (IsA(jtnode, JoinExpr))
@@ -823,11 +832,13 @@ deconstruct_recurse(PlannerInfo *root, Node *jtnode, bool below_outer_join,
 				leftjoinlist = deconstruct_recurse(root, j->larg,
 												   below_outer_join,
 												   &leftids, &left_inners,
-												   &child_postponed_quals);
+												   &child_postponed_quals,
+												   filter_qual_list);
 				rightjoinlist = deconstruct_recurse(root, j->rarg,
 													below_outer_join,
 													&rightids, &right_inners,
-													&child_postponed_quals);
+													&child_postponed_quals,
+													filter_qual_list);
 				*qualscope = bms_union(leftids, rightids);
 				*inner_join_rels = *qualscope;
 				/* Inner join adds no restrictions for quals */
@@ -840,11 +851,13 @@ deconstruct_recurse(PlannerInfo *root, Node *jtnode, bool below_outer_join,
 				leftjoinlist = deconstruct_recurse(root, j->larg,
 												   below_outer_join,
 												   &leftids, &left_inners,
-												   &child_postponed_quals);
+												   &child_postponed_quals,
+												   filter_qual_list);
 				rightjoinlist = deconstruct_recurse(root, j->rarg,
 													true,
 													&rightids, &right_inners,
-													&child_postponed_quals);
+													&child_postponed_quals,
+													filter_qual_list);
 				*qualscope = bms_union(leftids, rightids);
 				*inner_join_rels = bms_union(left_inners, right_inners);
 				nonnullable_rels = leftids;
@@ -854,11 +867,13 @@ deconstruct_recurse(PlannerInfo *root, Node *jtnode, bool below_outer_join,
 				leftjoinlist = deconstruct_recurse(root, j->larg,
 												   below_outer_join,
 												   &leftids, &left_inners,
-												   &child_postponed_quals);
+												   &child_postponed_quals,
+												   filter_qual_list);
 				rightjoinlist = deconstruct_recurse(root, j->rarg,
 													below_outer_join,
 													&rightids, &right_inners,
-													&child_postponed_quals);
+													&child_postponed_quals,
+													filter_qual_list);
 				*qualscope = bms_union(leftids, rightids);
 				*inner_join_rels = bms_union(left_inners, right_inners);
 				/* Semi join adds no restrictions for quals */
@@ -875,11 +890,13 @@ deconstruct_recurse(PlannerInfo *root, Node *jtnode, bool below_outer_join,
 				leftjoinlist = deconstruct_recurse(root, j->larg,
 												   true,
 												   &leftids, &left_inners,
-												   &child_postponed_quals);
+												   &child_postponed_quals,
+												   filter_qual_list);
 				rightjoinlist = deconstruct_recurse(root, j->rarg,
 													true,
 													&rightids, &right_inners,
-													&child_postponed_quals);
+													&child_postponed_quals,
+													filter_qual_list);
 				*qualscope = bms_union(leftids, rightids);
 				*inner_join_rels = bms_union(left_inners, right_inners);
 				/* each side is both outer and inner */
@@ -963,7 +980,8 @@ deconstruct_recurse(PlannerInfo *root, Node *jtnode, bool below_outer_join,
 									false, below_outer_join, j->jointype,
 									*qualscope,
 									ojscope, nonnullable_rels, NULL,
-									postponed_qual_list);
+									postponed_qual_list,
+									filter_qual_list);
 		}
 
 		/* Now we can add the SpecialJoinInfo to join_info_list */
@@ -1485,7 +1503,8 @@ distribute_qual_to_rels(PlannerInfo *root, Node *clause,
 						Relids ojscope,
 						Relids outerjoin_nonnullable,
 						Relids deduced_nullable_relids,
-						List **postponed_qual_list)
+						List **postponed_qual_list,
+						List **filter_qual_list)
 {
 	Relids		relids;
 	bool		is_pushed_down;
@@ -1848,6 +1867,48 @@ distribute_qual_to_rels(PlannerInfo *root, Node *clause,
 
 	/* No EC special case applies, so push it into the clause lists */
 	distribute_restrictinfo_to_rels(root, restrictinfo);
+
+	/* Check if the qual looks useful to harvest as an EquivalenceFilter */
+	if (filter_qual_list != NULL && is_simple_filter_qual((OpExpr *) clause))
+		*filter_qual_list = lappend(*filter_qual_list, clause);
+}
+
+/*
+ * is_simple_filter_qual
+ *		Analyzes an OpExpr to determine if it may be useful as an
+ *		EquivalenceFilter. Returns true if the OpExpr may be of some use, or
+ *		false if it should not be used.
+ */
+static bool
+is_simple_filter_qual(OpExpr *expr)
+{
+	Expr *leftexpr;
+	Expr *rightexpr;
+
+	if (!IsA(expr, OpExpr))
+		return false;
+
+	if (list_length(expr->args) != 2)
+		return false;
+
+	leftexpr = (Expr *) linitial(expr->args);
+	rightexpr = (Expr *) lsecond(expr->args);
+
+	/* XXX should we restrict these to simple Var op Const expressions? */
+	if (IsA(leftexpr, Const))
+	{
+		if (bms_membership(pull_varnos((Node *) rightexpr)) == BMS_SINGLETON &&
+			!contain_volatile_functions((Node *) rightexpr))
+			return true;
+	}
+	else if (IsA(rightexpr, Const))
+	{
+		if (bms_membership(pull_varnos((Node *) leftexpr)) == BMS_SINGLETON &&
+			!contain_volatile_functions((Node *) leftexpr))
+			return true;
+	}
+
+	return false;
 }
 
 /*
@@ -2183,7 +2244,7 @@ process_implied_equality(PlannerInfo *root,
 	distribute_qual_to_rels(root, (Node *) clause,
 							true, below_outer_join, JOIN_INNER,
 							qualscope, NULL, NULL, nullable_relids,
-							NULL);
+							NULL, NULL);
 }
 
 /*
diff --git a/src/backend/utils/cache/lsyscache.c b/src/backend/utils/cache/lsyscache.c
index 093da76..7e05e72 100644
--- a/src/backend/utils/cache/lsyscache.c
+++ b/src/backend/utils/cache/lsyscache.c
@@ -340,6 +340,34 @@ get_ordering_op_for_equality_op(Oid opno, bool use_lhs_type)
 }
 
 /*
+ * get_opfamilies
+ *		Returns a list of the Oids of each opfamily containing 'opno' which
+ *		belongs to the 'method' access method.
+ */
+List *
+get_opfamilies(Oid opno, Oid method)
+{
+	List	   *result = NIL;
+	CatCList   *catlist;
+	int			i;
+
+	catlist = SearchSysCacheList1(AMOPOPID, ObjectIdGetDatum(opno));
+
+	for (i = 0; i < catlist->n_members; i++)
+	{
+		HeapTuple	tuple = &catlist->members[i]->tuple;
+		Form_pg_amop aform = (Form_pg_amop) GETSTRUCT(tuple);
+
+		if (aform->amopmethod == method)
+			result = lappend_oid(result, aform->amopfamily);
+	}
+
+	ReleaseSysCacheList(catlist);
+
+	return result;
+}
+
+/*
  * get_mergejoin_opfamilies
  *		Given a putatively mergejoinable operator, return a list of the OIDs
  *		of the btree opfamilies in which it represents equality.
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 94bdb7c..c463ebf 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -243,6 +243,7 @@ typedef enum NodeTag
 	T_GatherPath,
 	T_EquivalenceClass,
 	T_EquivalenceMember,
+	T_EquivalenceFilter,
 	T_PathKey,
 	T_RestrictInfo,
 	T_PlaceHolderVar,
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index 9a0dd28..07119a5 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -619,6 +619,8 @@ typedef struct EquivalenceClass
 	List	   *ec_members;		/* list of EquivalenceMembers */
 	List	   *ec_sources;		/* list of generating RestrictInfos */
 	List	   *ec_derives;		/* list of derived RestrictInfos */
+	List	   *ec_filters;		/* list of EquivalenceFilters which may be
+								 * pushed into other relations as filters */
 	Relids		ec_relids;		/* all relids appearing in ec_members, except
 								 * for child members (see below) */
 	bool		ec_has_const;	/* any pseudoconstants in ec_members? */
@@ -671,6 +673,42 @@ typedef struct EquivalenceMember
 } EquivalenceMember;
 
 /*
+ * EquivalenceFilter - a filter on a Const which applies to all members of an
+ * EquivalenceClass.
+ *
+ * When building the equivalence classes we also collect a list of quals in
+ * the form "Expr op Const" and "Const op Expr". These are collected in the
+ * hope that we'll later generate an equivalence class which contains the
+ * "Expr" part. For example, if we parse a query such as:
+ *
+ *		SELECT * FROM t1 INNER JOIN t2 ON t1.id = t2.id WHERE t1.id < 10;
+ *
+ * then since we'll end up with an equivalence class containing {t1.id,t2.id},
+ * we'll tag the "< 10" filter onto the eclass. We are able to do this because
+ * the eclass proves equality between each class member, therefore all members
+ * must be below 10.
+ *
+ * EquivalenceFilters store the details required to allow us to push these
+ * filter clauses down into other relations which share an equivalence class
+ * containing a member which matches the expression of this EquivalenceFilter.
+ *
+ * ef_const is the Const value which this filter should filter against.
+ * ef_opno is the operator to filter on.
+ * ef_const_is_left marks if the OpExpr was in the form "Const op Expr" or
+ * "Expr op Const".
+ * ef_source_rel is the relation id of where this qual originated from.
+ */
+typedef struct EquivalenceFilter
+{
+	NodeTag		type;
+
+	Const	   *ef_const;		/* the constant expression to filter on */
+	Oid			ef_opno;		/* Operator Oid of filter operator */
+	bool		ef_const_is_left; /* Is the Const on the left of the OpExpr? */
+	Index		ef_source_rel;	/* relid of originating relation. */
+} EquivalenceFilter;
+
+/*
  * PathKeys
  *
  * The sort ordering of a path is represented by a list of PathKey nodes.
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index 87123a5..6eaf587 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -113,6 +113,8 @@ extern bool process_equivalence(PlannerInfo *root, RestrictInfo *restrictinfo,
 					bool below_outer_join);
 extern Expr *canonicalize_ec_expression(Expr *expr,
 						   Oid req_type, Oid req_collation);
+extern void distribute_filter_quals_to_eclass(PlannerInfo *root,
+											  List *quallist);
 extern void reconsider_outer_join_clauses(PlannerInfo *root);
 extern EquivalenceClass *get_eclass_for_sort_expr(PlannerInfo *root,
 						 Expr *expr,
diff --git a/src/include/utils/lsyscache.h b/src/include/utils/lsyscache.h
index dcc421f..41bbbd7 100644
--- a/src/include/utils/lsyscache.h
+++ b/src/include/utils/lsyscache.h
@@ -52,6 +52,7 @@ extern bool get_ordering_op_properties(Oid opno,
 						   Oid *opfamily, Oid *opcintype, int16 *strategy);
 extern Oid	get_equality_op_for_ordering_op(Oid opno, bool *reverse);
 extern Oid	get_ordering_op_for_equality_op(Oid opno, bool use_lhs_type);
+extern List *get_opfamilies(Oid opno, Oid method);
 extern List *get_mergejoin_opfamilies(Oid opno);
 extern bool get_compatible_hash_operators(Oid opno,
 							  Oid *lhs_opno, Oid *rhs_opno);
diff --git a/src/test/regress/expected/equivclass.out b/src/test/regress/expected/equivclass.out
index 0391b8e..7608912 100644
--- a/src/test/regress/expected/equivclass.out
+++ b/src/test/regress/expected/equivclass.out
@@ -381,3 +381,48 @@ explain (costs off)
                      Index Cond: (ff = '42'::bigint)
 (14 rows)
 
+-- test equivalence filters
+explain (costs off)
+  select * from ec0
+  inner join ec1 on ec0.ff = ec1.ff
+  where ec0.ff between 1 and 10;
+                         QUERY PLAN                         
+------------------------------------------------------------
+ Merge Join
+   Merge Cond: (ec0.ff = ec1.ff)
+   ->  Sort
+         Sort Key: ec0.ff
+         ->  Bitmap Heap Scan on ec0
+               Recheck Cond: ((ff >= 1) AND (ff <= 10))
+               ->  Bitmap Index Scan on ec0_pkey
+                     Index Cond: ((ff >= 1) AND (ff <= 10))
+   ->  Sort
+         Sort Key: ec1.ff
+         ->  Bitmap Heap Scan on ec1
+               Recheck Cond: ((ff >= 1) AND (ff <= 10))
+               ->  Bitmap Index Scan on ec1_pkey
+                     Index Cond: ((ff >= 1) AND (ff <= 10))
+(14 rows)
+
+explain (costs off)
+  select * from ec0
+  inner join ec1 on ec0.ff = ec1.ff
+  where ec1.ff between 1 and 10;
+                         QUERY PLAN                         
+------------------------------------------------------------
+ Merge Join
+   Merge Cond: (ec0.ff = ec1.ff)
+   ->  Sort
+         Sort Key: ec0.ff
+         ->  Bitmap Heap Scan on ec0
+               Recheck Cond: ((ff >= 1) AND (ff <= 10))
+               ->  Bitmap Index Scan on ec0_pkey
+                     Index Cond: ((ff >= 1) AND (ff <= 10))
+   ->  Sort
+         Sort Key: ec1.ff
+         ->  Bitmap Heap Scan on ec1
+               Recheck Cond: ((ff >= 1) AND (ff <= 10))
+               ->  Bitmap Index Scan on ec1_pkey
+                     Index Cond: ((ff >= 1) AND (ff <= 10))
+(14 rows)
+
diff --git a/src/test/regress/sql/equivclass.sql b/src/test/regress/sql/equivclass.sql
index 17fad67..bd5c5b3 100644
--- a/src/test/regress/sql/equivclass.sql
+++ b/src/test/regress/sql/equivclass.sql
@@ -222,3 +222,14 @@ explain (costs off)
      union all
      select ff + 4 as x from ec1) as ss1
   where ss1.x = ec1.f1 and ec1.ff = 42::int8;
+
+-- test equivalence filters
+explain (costs off)
+  select * from ec0
+  inner join ec1 on ec0.ff = ec1.ff
+  where ec0.ff between 1 and 10;
+
+explain (costs off)
+  select * from ec0
+  inner join ec1 on ec0.ff = ec1.ff
+  where ec1.ff between 1 and 10;
#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: David Rowley (#1)
Re: [PATCH] Equivalence Class Filters

David Rowley <david.rowley@2ndquadrant.com> writes:

As of today these Equivalence Classes only incorporate expressions which
use the equality operator, but what if the above query had been written as:

select * from t1 inner join t2 on t1.id = t2.id where t1.id <= 10;

Should we not be able to assume that t2.id is also <= 10?

This sort of thing has been suggested multiple times before, and I've
rejected it every time on the grounds that it would likely add far more
planner cycles than it'd be worth, eg, time would be spent on searches for
matching subexpressions whether or not anything was learned (and often
nothing would be learned). While I've only read your description of the
patch, not the patch itself, the search methodology you propose seems
pretty brute-force and unlikely to solve that issue. It's particularly
important to avoid O(N^2) behaviors when there are N expressions ...

Another issue that would need consideration is how to avoid skewing
planner selectivity estimates with derived clauses that are fully
redundant with other clauses. The existing EC machinery is mostly
able to dodge that problem by generating just a minimal set of equality
clauses from an EC, but I don't see how we'd make that work here.
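To illustrate with made-up numbers (treating clause selectivities as independent multipliers, which is roughly what the planner does — these figures are illustrative, not PostgreSQL's actual formulas):

```python
n1 = n2 = 1000            # rows in t1 and t2
sel_filter = 10 / 1000    # selectivity of "id <= 10"
sel_join = 1 / 1000       # selectivity of "t1.id = t2.id"

# Filter applied to t1 only: 10 * 1000 * (1/1000) = 10 rows,
# which matches what the join actually produces.
est_without = (n1 * sel_filter) * n2 * sel_join

# Derived copy also applied to t2: 10 * 10 * (1/1000) = 0.1 rows,
# a 100x underestimate, because the second clause is fully redundant
# with the first one plus the join condition.
est_with = (n1 * sel_filter) * (n2 * sel_filter) * sel_join

print(est_without, est_with)   # roughly 10 vs 0.1
```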

I'm also wondering why you want to limit it to comparisons to constants;
that seems rather arbitrary.

Lastly, in most cases knowing that t2.id <= 10 is just not worth all
that much; it's certainly far less useful than an equality condition.
It's not difficult to imagine that this would often be a net drag on
runtime performance (never mind planner performance) by doing nothing
except creating additional filter conditions the executor has to check.
Certainly it would be valuable to know this if it let us exclude some
partition of a table, but that's only going to happen in a small minority
of queries.

I'm not necessarily opposed to doing anything in this area, but so far
I've not seen how to do it in a way that is attractive when planner
complexity, performance, and end results are all considered.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3 David Rowley
david.rowley@2ndquadrant.com
In reply to: Tom Lane (#2)
Re: [PATCH] Equivalence Class Filters

On 6 December 2015 at 06:07, Tom Lane <tgl@sss.pgh.pa.us> wrote:

David Rowley <david.rowley@2ndquadrant.com> writes:

As of today these Equivalence Classes only incorporate expressions which
use the equality operator, but what if the above query had been written as:

select * from t1 inner join t2 on t1.id = t2.id where t1.id <= 10;

Should we not be able to assume that t2.id is also <= 10?

This sort of thing has been suggested multiple times before, and I've
rejected it every time on the grounds that it would likely add far more
planner cycles than it'd be worth, eg, time would be spent on searches for
matching subexpressions whether or not anything was learned (and often
nothing would be learned).

I guess if it keeps coming up then people must be hitting this limitation.
Perhaps continually rejecting it based on your original opinion is not the
best way to move forward with improving the query planner.

While I've only read your description of the
patch not the patch itself, the search methodology you propose seems
pretty brute-force and unlikely to solve that issue. It's particularly
important to avoid O(N^2) behaviors when there are N expressions ...

Likely it would be possible to do something to improve on that.
Perhaps if the list of filter clauses grows beyond a certain number then a
hash table can be built on the ec_members list with the em_expr as the key
and the EquivalenceClass as the data. Although likely we don't currently
have anything available for generating hash values for Expr nodes. On
checking, it seems that 32 is the estimated tipping point for hashing in
find_join_rel(); perhaps something similar would suit this.
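That idea can be sketched as follows. This is a toy Python model, not planner code: expressions are strings standing in for hashable Expr nodes, and the 32 threshold is just the find_join_rel() figure mentioned above.

```python
HASH_TIPPING_POINT = 32   # illustrative, borrowed from find_join_rel()

def build_member_index(eclasses):
    """One-off map from each member expression to its owning class."""
    return {expr: ec for ec in eclasses for expr in ec["members"]}

def match_filter_quals(eclasses, quals):
    matched = []
    if len(quals) >= HASH_TIPPING_POINT:
        index = build_member_index(eclasses)
        lookup = index.get                      # O(1) per qual
    else:
        def lookup(expr):                       # linear scan; fine when small
            return next((ec for ec in eclasses
                         if expr in ec["members"]), None)
    for expr, op, const in quals:
        ec = lookup(expr)
        if ec is not None:
            matched.append((ec["name"], op, const))
    return matched

ecs = [{"name": "ec1", "members": {"t1.id", "t2.id"}}]
print(match_filter_quals(ecs, [("t1.id", "<=", 10)]))  # [('ec1', '<=', 10)]
```

The point being that the brute-force scan only runs when the qual list is small, so the pathological O(N^2) case pays one O(N) index build instead.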

Another issue that would need consideration is how to avoid skewing
planner selectivity estimates with derived clauses that are fully
redundant with other clauses. The existing EC machinery is mostly
able to dodge that problem by generating just a minimal set of equality
clauses from an EC, but I don't see how we'd make that work here.

Could you give me an example of where this is a problem? I've tried fooling
the planner into giving me bad estimates, but I've not managed to.

# explain analyze select * from t1 where t1.id < 10 and t1.id < 100 and
t1.id < 1000;
QUERY PLAN
------------------------------------------------------------------------------------------------------------
Index Scan using t1_pkey on t1 (cost=0.29..8.49 rows=9 width=8) (actual
time=0.155..0.160 rows=9 loops=1)
Index Cond: ((id < 10) AND (id < 100) AND (id < 1000))

I'd say the t1.id < 100 AND t1.id < 1000 are fully redundant here. The
estimate given by the planner is bang-on.

Perhaps you're talking about another column making the pushed-down qual
redundant? If so, is the current eclass code not prone to the exact same
problem?

I'm also wondering why you want to limit it to comparisons to constants;

that seems rather arbitrary.

Do you have other useful cases in mind?
The other one I considered was "expr op expr" clauses, where both those
exprs were in eclasses. For example:

select * from t1 inner join t2 on t1.col1=t2.col1 and t1.col2=t2.col2 where
t1.col1 < t1.col2;

I'd imagine that would have a narrower use case. I simply stopped with
"Expr op Const" because I thought that would be more common and less
planner overhead to detect. Given that below you also raise concerns that
"expr op const" is likely not that useful in enough cases, I'm not holding
out too much hope of adding more complexity to the detection mechanism for
less common wins.

Lastly, in most cases knowing that t2.id <= 10 is just not worth all
that much; it's certainly far less useful than an equality condition.
It's not difficult to imagine that this would often be a net drag on
runtime performance (never mind planner performance) by doing nothing
except creating additional filter conditions the executor has to check.
Certainly it would be valuable to know this if it let us exclude some
partition of a table, but that's only going to happen in a small minority
of queries.

I'd have thought that my link to a thread of a reported 1100 to 2200 times
performance improvement might have made you think twice about claiming the
uselessness of this idea.

I'm also quite surprised at you mentioning partitions here. Personally I'd
be thinking of indexes long before thinking of partitions. I don't think I
need to remind you that btree indexes handle >= and <= just fine. It's not
all that hard to imagine that if they're using that column for a join
condition and are serious about performance that they just might also have
an index defined on that column too. I'd imagine the only case we might
have to worry about is when the selectivity of the pushed qual is getting
close to 1. Perhaps the RestrictInfos could be marked as "optional"
somehow, and we could simply remove them when their selectivity estimates
are too high.
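The "optional" RestrictInfo idea above can be sketched in a few lines (an illustrative Python sketch with invented structure, not planner code): derived quals are tagged, and any tagged qual whose estimated selectivity is near 1 is dropped, since it would filter almost nothing while still costing a check per row.

```python
def prune_optional_quals(quals, max_selectivity=0.95):
    """Drop derived ('optional') quals whose estimated selectivity is
    close to 1.  Each qual is a dict with invented keys 'optional' and
    'selectivity'; purely illustrative."""
    return [q for q in quals
            if not (q["optional"] and q["selectivity"] > max_selectivity)]
```

Original (user-written) quals are never dropped, only the ones the planner derived itself, so correctness is unaffected either way.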

I'm not necessarily opposed to doing anything in this area, but so far

I've not seen how to do it in a way that is attractive when planner
complexity, performance, and end results are all considered.

Glad to hear it! Otherwise it would be a real shame to miss out on such
great wins during execution time.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: David Rowley (#3)
Re: [PATCH] Equivalence Class Filters

David Rowley <david.rowley@2ndquadrant.com> writes:

On 6 December 2015 at 06:07, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Another issue that would need consideration is how to avoid skewing
planner selectivity estimates with derived clauses that are fully
redundant with other clauses.

Could you give me an example of where this is a problem?

Using the regression database, try

explain analyze select * from tenk1 a, tenk1 b where a.thousand = b.thousand;

explain analyze select * from tenk1 a, tenk1 b where a.thousand = b.thousand and a.thousand < 100;

explain analyze select * from tenk1 a, tenk1 b where a.thousand = b.thousand and a.thousand < 100 and b.thousand < 100;

The first two give pretty accurate estimates for the join size, the third
not so much, because it thinks b.thousand < 100 is an independent
constraint.
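The skew in the third query follows directly from the planner's independence assumption. A back-of-envelope calculation, assuming tenk1's 10000 rows and 1000 distinct values of "thousand":

```python
rows = 10_000        # rows in tenk1
ndistinct = 1_000    # distinct values of "thousand"

# Equi-join on "thousand": each value pairs 10 rows with 10 rows.
join_rows = rows * rows / ndistinct        # 100,000

sel = 100 / 1_000                          # selectivity of thousand < 100

est_one_filter = join_rows * sel           # 10,000 -- close to reality

# Adding the redundant b.thousand < 100 multiplies in the same
# selectivity again, as if it were an independent constraint:
est_two_filters = join_rows * sel * sel    # 1,000 -- a 10x underestimate
```

The true join size is unchanged by the redundant clause, but the estimate shrinks by another factor of ten, which is exactly the hazard for machinery that generates derived inequality clauses.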

I've tried fooling
the planner into giving me bad estimates, but I've not managed to.
# explain analyze select * from t1 where t1.id < 10 and t1.id < 100 and
t1.id < 1000;

That particular case works because clausesel.c's addRangeClause figures
out that the similar inequalities are redundant and drops all but one,
on its way to (not) identifying a range constraint for t1.id. There's
nothing comparable for constraints on different variables, though,
especially not constraints on Vars of different relations, which will
never even appear in the same restriction list.
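The redundancy-dropping behavior described here can be sketched roughly as follows (an illustrative Python sketch with invented names; the real logic lives in clausesel.c): among several upper bounds on the same variable, only the tightest survives, and nothing can be done across different variables.

```python
def drop_redundant_upper_bounds(clauses):
    """Given (var, '<', const) clauses, keep only the tightest upper
    bound per variable -- a loose analogue of how redundant similar
    inequalities are dropped while looking for range constraints."""
    tightest = {}
    for var, op, const in clauses:
        assert op == '<'
        if var not in tightest or const < tightest[var]:
            tightest[var] = const
    return [(var, '<', c) for var, c in tightest.items()]
```

Because the grouping is per variable, clauses on Vars of different relations would land in different buckets, illustrating why this mechanism offers no help with the cross-relation redundancy discussed above.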

if so, is the current eclass code not prone to the exact same problem?

No, and I already explained why not.

Lastly, in most cases knowing that t2.id <= 10 is just not worth all
that much; it's certainly far less useful than an equality condition.

I'd have thought that my link to a thread of a reported 1100 to 2200 times
performance improvement might have made you think twice about claiming the
uselessness of this idea.

I said "in most cases". You can find example cases to support almost any
weird planner optimization no matter how expensive and single-purpose;
but that is the wrong way to think about it. What you have to think about
is average cases, and in particular, not putting a drag on planning time
in cases where no benefit ensues. We're not committing any patches that
give one uncommon case an 1100X speedup by penalizing every other query 10%,
or even 1%; especially not when there may be other ways to fix it.

The EC machinery wasn't designed in a vacuum. It is interested in
deriving equality conditions because those are useful for merge and hash
joins. It's also tightly tied in with the planner's understanding of sort
ordering and distinct-ness. None of those considerations provide any
reason to build a similar thing for inequalities.

It may be that computing derived inequalities is something that's worth
adding, but we're going to have to take a very hard look at the costs and
benefits; it's not a slam dunk that such a patch would get committed.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#5Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Tom Lane (#4)
Re: [PATCH] Equivalence Class Filters

On 12/6/15 10:38 AM, Tom Lane wrote:

I said "in most cases". You can find example cases to support almost any
weird planner optimization no matter how expensive and single-purpose;
but that is the wrong way to think about it. What you have to think about
is average cases, and in particular, not putting a drag on planning time
in cases where no benefit ensues. We're not committing any patches that
give one uncommon case an 1100X speedup by penalizing every other query 10%,
or even 1%; especially not when there may be other ways to fix it.

This is a problem that seriously hurts Postgres in data warehousing
applications. We can't keep ignoring optimizations that provide even as
little as 10% execution improvements for 10x worse planner performance,
because in a warehouse it's next to impossible for planning time to matter.

Obviously it'd be great if there was a fast, easy way to figure out
whether a query would be expensive enough to go the whole 9 yards on
planning it but at this point I suspect a simple GUC would be a big
improvement.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jim Nasby (#5)
Re: [PATCH] Equivalence Class Filters

Jim Nasby <Jim.Nasby@BlueTreble.com> writes:

On 12/6/15 10:38 AM, Tom Lane wrote:

I said "in most cases". You can find example cases to support almost any
weird planner optimization no matter how expensive and single-purpose;
but that is the wrong way to think about it. What you have to think about
is average cases, and in particular, not putting a drag on planning time
in cases where no benefit ensues. We're not committing any patches that
give one uncommon case an 1100X speedup by penalizing every other query 10%,
or even 1%; especially not when there may be other ways to fix it.

This is a problem that seriously hurts Postgres in data warehousing
applications.

Please provide some specific examples. I remain skeptical that this
would make a useful difference all that often in the real world ...
and handwaving like that does nothing to change my opinion. What do
the queries look like, and why would deducing an extra inequality
condition help them?

regards, tom lane


#7David G. Johnston
david.g.johnston@gmail.com
In reply to: Jim Nasby (#5)
Re: [PATCH] Equivalence Class Filters

On Mon, Dec 7, 2015 at 8:35 AM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:

On 12/6/15 10:38 AM, Tom Lane wrote:

I said "in most cases". You can find example cases to support almost any
weird planner optimization no matter how expensive and single-purpose;
but that is the wrong way to think about it. What you have to think about
is average cases, and in particular, not putting a drag on planning time
in cases where no benefit ensues. We're not committing any patches that
give one uncommon case an 1100X speedup by penalizing every other query
10%,
or even 1%; especially not when there may be other ways to fix it.

This is a problem that seriously hurts Postgres in data warehousing
applications. We can't keep ignoring optimizations that provide even as
little as 10% execution improvements for 10x worse planner performance,
because in a warehouse it's next to impossible for planning time to matter.

Obviously it'd be great if there was a fast, easy way to figure out
whether a query would be expensive enough to go the whole 9 yards on
planning it but at this point I suspect a simple GUC would be a big
improvement.

Something like "enable_equivalencefilters" but that defaults to false
unlike every existing "enable_*" GUC?

It would be a lot more user-friendly to have something along the lines of
"planner_mode (text)" with labels like "star, transactional, bulk_load,
etc..." because I suspect there are other things we'd want to add if we
start identifying queries by their type/usage and optimize accordingly.
Having the knobs available is necessary, but putting a façade on top would
make things more straightforward for the user in the common cases.

David J.

#8Simon Riggs
simon@2ndQuadrant.com
In reply to: Tom Lane (#4)
Re: [PATCH] Equivalence Class Filters

On 6 December 2015 at 16:38, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Lastly, in most cases knowing that t2.id <= 10 is just not worth all

that much; it's certainly far less useful than an equality condition.

I'd have thought that my link to a thread of a reported 1100 to 2200

times

performance improvement might have made you think twice about claiming

the

uselessness of this idea.

Personally, I think this optimization is a long way down the list of
important items because I don't see it as a typical use case. There are
already patches submitted for more important items, so this isn't the right
place to be arguing such things. Not every beneficial optimization has a
wide use case.

Since the discussion has become more general, I would add a few points.

I said "in most cases". You can find example cases to support almost any

weird planner optimization no matter how expensive and single-purpose;
but that is the wrong way to think about it. What you have to think about
is average cases, and in particular, not putting a drag on planning time
in cases where no benefit ensues. We're not committing any patches that
give one uncommon case an 1100X speedup by penalizing every other query
10%,
or even 1%; especially not when there may be other ways to fix it.

I agree with this point also, I just don't see it as a blocker for
expensive general case optimizations.

There are many optimizations we might adopt, yet planning time is a factor.
It seems simple enough to ignore more complex optimizations if we have
already achieved a threshold cost (say 10). Such a test would add nearly
zero time for the common case. We can apply the optimizations in some kind
of ordering depending upon the cost, so we are careful to balance the
cost/benefit of trying certain optimizations.
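Simon's threshold idea amounts to gating expensive optimizations on the cost of the best plan found so far. A minimal sketch, with invented structure and purely illustrative of the heuristic, not of real planner code:

```python
from dataclasses import dataclass

@dataclass
class Plan:
    cost: float

CHEAP_PLAN_THRESHOLD = 10.0  # hypothetical cutoff, per "say 10" above

def plan(baseline, cheap_opts, expensive_opts):
    """cheap_opts / expensive_opts are callables returning alternative
    Plans.  Expensive optimizations are attempted only when the best
    plan so far is still above the threshold."""
    best = baseline
    for opt in cheap_opts:
        best = min(best, opt(), key=lambda p: p.cost)
    if best.cost <= CHEAP_PLAN_THRESHOLD:
        return best  # already cheap: extra planning effort cannot pay off
    for opt in expensive_opts:
        best = min(best, opt(), key=lambda p: p.cost)
    return best
```

The common (cheap) case exits before any expensive optimization runs, so the added planning time there is essentially one comparison, which is the "nearly zero time" property being claimed.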

Optimizer directives might be useful also, but we can do just as well with
a heuristic.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#9Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Tom Lane (#6)
Re: [PATCH] Equivalence Class Filters

On 12/7/15 9:54 AM, Tom Lane wrote:

Jim Nasby<Jim.Nasby@BlueTreble.com> writes:

On 12/6/15 10:38 AM, Tom Lane wrote:

I said "in most cases". You can find example cases to support almost any
weird planner optimization no matter how expensive and single-purpose;
but that is the wrong way to think about it. What you have to think about
is average cases, and in particular, not putting a drag on planning time
in cases where no benefit ensues. We're not committing any patches that
give one uncommon case an 1100X speedup by penalizing every other query 10%,
or even 1%; especially not when there may be other ways to fix it.

This is a problem that seriously hurts Postgres in data warehousing
applications.

Please provide some specific examples. I remain skeptical that this
would make a useful difference all that often in the real world ...
and handwaving like that does nothing to change my opinion. What do
the queries look like, and why would deducing an extra inequality
condition help them?

I was speaking more broadly than this particular case. There's a lot of
planner improvements that get shot down because of the planning overhead
they would add. That's great for cases when milliseconds count, but it's
worth spending an extra 60 seconds (a planning eternity) to shave an hour
off a warehouse/reporting query.

There needs to be some way to give the planner an idea of how much
effort it should expend. GEQO and *_collapse_limit addresses this in the
opposite direction (putting a cap on planner effort), but I think we
need something that does the opposite "I know this query will take a
long time, so expend extra effort on planning it."
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


#10Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Simon Riggs (#8)
Re: [PATCH] Equivalence Class Filters

On 12/7/15 10:44 AM, Simon Riggs wrote:

There are many optimizations we might adopt, yet planning time is a
factor. It seems simple enough to ignore more complex optimizations if
we have already achieved a threshold cost (say 10). Such a test would
add nearly zero time for the common case. We can apply the optimizations
in some kind of ordering depending upon the cost, so we are careful to
balance the cost/benefit of trying certain optimizations.

Unfortunately I've seen a lot of millisecond queries that have 6 figure
estimates, due to data being in cache. So I'm not sure how practical
that would be.

Maybe a better starting point would be a planner timeout.

I definitely agree we need some method to limit planning time when
necessary (ie: OLTP). Without that we'll never be able to start testing
more complex optimizations.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


#11Simon Riggs
simon@2ndQuadrant.com
In reply to: Jim Nasby (#10)
Re: [PATCH] Equivalence Class Filters

On 7 December 2015 at 16:55, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:

Unfortunately I've seen a lot of millisecond queries that have 6 figure
estimates, due to data being in cache. So I'm not sure how practical that
would be.

We seem to be agreed that longer running queries may benefit from more
thorough optimization.

I like the idea of specifying a goal for the optimization, so the user can
explicitly ask to spend more time. But I would also rather that I didn't
have to do that at all, by using a heuristic that allows us to act sensibly
even when not explicitly instructed by the user. It Just Works.

We can decide what the heuristic limit is based upon the cost of the
technique. I don't think it matters how quick your CPU is, it matters how
long planning takes as a % of the whole expected execution time.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#12Gavin Flower
GavinFlower@archidevsys.co.nz
In reply to: David G. Johnston (#7)
Re: [PATCH] Equivalence Class Filters

On 08/12/15 05:27, David G. Johnston wrote:

On Mon, Dec 7, 2015 at 8:35 AM, Jim Nasby <Jim.Nasby@bluetreble.com
<mailto:Jim.Nasby@bluetreble.com>>wrote:

On 12/6/15 10:38 AM, Tom Lane wrote:

I said "in most cases". You can find example cases to support
almost any
weird planner optimization no matter how expensive and
single-purpose;
but that is the wrong way to think about it. What you have to
think about
is average cases, and in particular, not putting a drag on
planning time
in cases where no benefit ensues. We're not committing any
patches that
give one uncommon case an 1100X speedup by penalizing every
other query 10%,
or even 1%; especially not when there may be other ways to fix it.

This is a problem that seriously hurts Postgres in data
warehousing applications. We can't keep ignoring optimizations
that provide even as little as 10% execution improvements for 10x
worse planner performance, because in a warehouse it's next to
impossible for planning time to matter.

Obviously it'd be great if there was a fast, easy way to figure
out whether a query would be expensive enough to go the whole 9
yards on planning it but at this point I suspect a simple GUC
would be a big improvement.

Something like "enable_equivalencefilters" but that defaults to false
unlike every one existing "enable_*" GUC?

​It would be a lot more user-friendly to have something along the
lines of "planner_mode (text)" with labels like "star, transactional,
bulk_load, etc..." because I suspect there are other things we'd want
to add if we start identifying queries by their type/usage and
optimize accordingly. Having the knobs available is necessary but
putting on a façade would make the user more straight-forward for the
common cases.

David J.

How about:

planning_time_base 10 # Default effort, may be increased or decreased
as required - must be at least 1
planning_time_XXXX 0 # By default, planner makes no (or minimal)
effort to optimise for feature XXXX

So for some people, adjusting planning_time_base may be sufficient - but
for more specialised cases, people can tell the planner to consider
expending more effort.

Cheers,
Gavin


#13Evgeniy Shishkin
itparanoia@gmail.com
In reply to: Gavin Flower (#12)
Re: [PATCH] Equivalence Class Filters

On 07 Dec 2015, at 22:27, Gavin Flower <GavinFlower@archidevsys.co.nz> wrote:

On 08/12/15 05:27, David G. Johnston wrote:

On Mon, Dec 7, 2015 at 8:35 AM, Jim Nasby <Jim.Nasby@bluetreble.com <mailto:Jim.Nasby@bluetreble.com>>wrote:

On 12/6/15 10:38 AM, Tom Lane wrote:

I said "in most cases". You can find example cases to support
almost any
weird planner optimization no matter how expensive and
single-purpose;
but that is the wrong way to think about it. What you have to
think about
is average cases, and in particular, not putting a drag on
planning time
in cases where no benefit ensues. We're not committing any
patches that
give one uncommon case an 1100X speedup by penalizing every
other query 10%,
or even 1%; especially not when there may be other ways to fix it.

This is a problem that seriously hurts Postgres in data
warehousing applications. We can't keep ignoring optimizations
that provide even as little as 10% execution improvements for 10x
worse planner performance, because in a warehouse it's next to
impossible for planning time to matter.

Obviously it'd be great if there was a fast, easy way to figure
out whether a query would be expensive enough to go the whole 9
yards on planning it but at this point I suspect a simple GUC
would be a big improvement.

Something like "enable_equivalencefilters" but that defaults to false unlike every one existing "enable_*" GUC?

​It would be a lot more user-friendly to have something along the lines of "planner_mode (text)" with labels like "star, transactional, bulk_load, etc..." because I suspect there are other things we'd want to add if we start identifying queries by their type/usage and optimize accordingly. Having the knobs available is necessary but putting on a façade would make the user more straight-forward for the common cases.

David J.

How about:

planning_time_base 10 # Default effort, may be increased or decreased as required - must be at least 1
planning_time_XXXX 0 # By default, planner makes no (or minimal) effort to optimise for feature XXXX

So for some people, adjusting planning_time_base may be sufficient - but for more specialised cases, people can tell the planner to consider expending more effort.

MySQL now has 19 optimizer_switch parameters
https://dev.mysql.com/doc/refman/5.7/en/switchable-optimizations.html

Please don't do that.

I'd rather like some sort of pg_stat_statements, which would track execution and planning time.
On new query, we can lookup if query can benefit from more planning time.
But i don't know how costly this can be.

Cheers,
Gavin


#14Gavin Flower
GavinFlower@archidevsys.co.nz
In reply to: Evgeniy Shishkin (#13)
Re: [PATCH] Equivalence Class Filters

On 08/12/15 08:34, Evgeniy Shishkin wrote:

On 07 Dec 2015, at 22:27, Gavin Flower <GavinFlower@archidevsys.co.nz> wrote:

On 08/12/15 05:27, David G. Johnston wrote:

On Mon, Dec 7, 2015 at 8:35 AM, Jim Nasby <Jim.Nasby@bluetreble.com <mailto:Jim.Nasby@bluetreble.com>>wrote:

On 12/6/15 10:38 AM, Tom Lane wrote:

I said "in most cases". You can find example cases to support
almost any
weird planner optimization no matter how expensive and
single-purpose;
but that is the wrong way to think about it. What you have to
think about
is average cases, and in particular, not putting a drag on
planning time
in cases where no benefit ensues. We're not committing any
patches that
give one uncommon case an 1100X speedup by penalizing every
other query 10%,
or even 1%; especially not when there may be other ways to fix it.

This is a problem that seriously hurts Postgres in data
warehousing applications. We can't keep ignoring optimizations
that provide even as little as 10% execution improvements for 10x
worse planner performance, because in a warehouse it's next to
impossible for planning time to matter.

Obviously it'd be great if there was a fast, easy way to figure
out whether a query would be expensive enough to go the whole 9
yards on planning it but at this point I suspect a simple GUC
would be a big improvement.

Something like "enable_equivalencefilters" but that defaults to false unlike every one existing "enable_*" GUC?

​It would be a lot more user-friendly to have something along the lines of "planner_mode (text)" with labels like "star, transactional, bulk_load, etc..." because I suspect there are other things we'd want to add if we start identifying queries by their type/usage and optimize accordingly. Having the knobs available is necessary but putting on a façade would make the user more straight-forward for the common cases.

David J.

How about:

planning_time_base 10 # Default effort, may be increased or decreased as required - must be at least 1
planning_time_XXXX 0 # By default, planner makes no (or minimal) effort to optimise for feature XXXX

So for some people, adjusting planning_time_base may be sufficient - but for more specialised cases, people can tell the planner to consider expending more effort.

Mysql have now 19 optimizer_switch parameters
https://dev.mysql.com/doc/refman/5.7/en/switchable-optimizations.html

I notice that they are either on or off; I suspect it is better to have
some sort of measure of how much extra effort the planner should make.

Please don't do that.

I think having some might be useful - though in most situations, having
a general indicator to the planner might be sufficient.

From reading the thread, I have the impression that for some extreme
workloads some extra twiddling would be useful, even though for most
people it would simply be an unnecessary complication.

In over twenty years I've never needed such knobs, but I might get a
project next year where they might be useful. So I agree that for most
situations, such extra stuff is not needed - but I'd like additional
options available if I ever needed them.

I'd rather like some sort of pg_stat_statements, which would track execution and planning time.
On new query, we can lookup if query can benefit from more planning time.
But i don't know how costly this can be.

Cheers,
Gavin


#15David Rowley
david.rowley@2ndquadrant.com
In reply to: Jim Nasby (#5)
Re: [PATCH] Equivalence Class Filters

On 8 December 2015 at 04:35, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:

On 12/6/15 10:38 AM, Tom Lane wrote:

I said "in most cases". You can find example cases to support almost any
weird planner optimization no matter how expensive and single-purpose;
but that is the wrong way to think about it. What you have to think about
is average cases, and in particular, not putting a drag on planning time
in cases where no benefit ensues. We're not committing any patches that
give one uncommon case an 1100X speedup by penalizing every other query
10%,
or even 1%; especially not when there may be other ways to fix it.

This is a problem that seriously hurts Postgres in data warehousing
applications. We can't keep ignoring optimizations that provide even as
little as 10% execution improvements for 10x worse planner performance,
because in a warehouse it's next to impossible for planning time to matter.

Obviously it'd be great if there was a fast, easy way to figure out
whether a query would be expensive enough to go the whole 9 yards on
planning it but at this point I suspect a simple GUC would be a big
improvement.

I've certainly been here before [1], but the idea fell on deaf ears.

The biggest frustration for me is that the way Tom always seems to argue
his point, it's as if planning time is roughly the same as or more
expensive than execution time. Likely that's true in many cases, but I
would imagine that in more normal cases execution time is longer, although
I've never had the guts to stand up and argue this as I don't have any
stats to back me up.

I was talking to Thomas Munro yesterday about this, and he mentioned that
it would be quite nice to have some stats on how much time is spent in the
planner, vs executor. He came up with the idea of adding a column to
pg_stat_statements to record the planning time.

If we could get real statistics on real world cases and we discovered that,
for example on average that the total CPU time of planning was only 1% of
execution time, then it would certainly make adding 2% overhead onto the
planner a bit less of a worry, as this would just be %2 of 1% (0.02%). Such
information, if fed back into the community might be able to drive us in
the right direction when it comes to deciding what needs to be done to
resolve this constant issue with accepting planner improvement patches.

I believe that with parallel query on the horizon for 9.6 that we're now
aiming to support bigger OLAP type database than ever before. So if we
ignore patches like this one then it appears that we have some conflicting
goals in the community as it seems that we're willing to add the brawn, but
we're not willing to add the brain. If this is the case then it's a shame,
as I think we can have both. So I very much agree on the fact that we must
find a way to maintain support and high performance of small OLTP databases
too.

[1]: /messages/by-id/CAApHDvrJrz-0xinyiqTiWs0mFX17GWD2Y8VZ+i92nuZsha8ocw@mail.gmail.com

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#16Jeremy Harris
jgh@wizmail.org
In reply to: Simon Riggs (#8)
Re: [PATCH] Equivalence Class Filters

On 07/12/15 16:44, Simon Riggs wrote:

There are many optimizations we might adopt, yet planning time is a factor.
It seems simple enough to ignore more complex optimizations if we have
already achieved a threshold cost (say 10). Such a test would add nearly
zero time for the common case. We can apply the optimizations in some kind
of ordering depending upon the cost, so we are careful to balance the
cost/benefit of trying certain optimizations.

Given parallelism, why not continue planning after initiating a
cancellable execution, giving a better plan to be used if the
execution runs for long enough?
--
Cheers,
Jeremy


#17Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: David Rowley (#15)
Re: [PATCH] Equivalence Class Filters

On 12/7/15 7:26 PM, David Rowley wrote:

I was talking to Thomas Munro yesterday about this, and he mentioned
that it would be quite nice to have some stats on how much time is spent
in the planner, vs executor. He came up with the idea of adding a column
to pg_stat_statements to record the planning time.

I think that would be useful. Maybe something in pg_stat_database too.

If we could get real statistics on real world cases and we discovered
that, for example on average that the total CPU time of planning was
only 1% of execution time, then it would certainly make adding 2%
overhead onto the planner a bit less of a worry, as this would just be
%2 of 1% (0.02%). Such information, if fed back into the community might
be able to drive us in the right direction when it comes to deciding
what needs to be done to resolve this constant issue with accepting
planner improvement patches.

Might be nice, but I think it's also pretty unnecessary.

I've dealt with dozens of queries that took minutes to hours to run. Yet
I can't recall ever having an EXPLAIN on one of these take more than a
few seconds. I tend to do more OLTP stuff so maybe others have
experienced long-running EXPLAIN, in which case it'd be great to know that.

Actually, I have seen long EXPLAIN times, but only as part of trying to
aggressively increase *_collapse_limit. IIRC I was able to increase one
of those to 14 and one to 18 before plan time became unpredictably bad
(it wasn't always bad though, just sometimes).

I believe that with parallel query on the horizon for 9.6 that we're now
aiming to support bigger OLAP type database than ever before. So if we
ignore patches like this one then it appears that we have some
conflicting goals in the community as it seems that we're willing to add
the brawn, but we're not willing to add the brain. If this is the case
then it's a shame, as I think we can have both. So I very much agree on
the fact that we must find a way to maintain support and high
performance of small OLTP databases too.

+1
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


#18Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Jeremy Harris (#16)
Re: [PATCH] Equivalence Class Filters

On 12/8/15 3:52 AM, Jeremy Harris wrote:

On 07/12/15 16:44, Simon Riggs wrote:

There are many optimizations we might adopt, yet planning time is a factor.
It seems simple enough to ignore more complex optimizations if we have
already achieved a threshold cost (say 10). Such a test would add nearly
zero time for the common case. We can apply the optimizations in some kind
of ordering depending upon the cost, so we are careful to balance the
cost/benefit of trying certain optimizations.

Given parallelism, why not continue planning after initiating a
cancellable execution, giving a better plan to be used if the
execution runs for long enough?

Because that would take significantly more work than what Simon is
proposing.

That said, I think the ability to restart with a different plan is
something we might need, for cases when we discover the plan estimates
were way off. If that ever gets built it might be useful for what you
propose as well.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


#19Robert Haas
robertmhaas@gmail.com
In reply to: David Rowley (#15)
Re: [PATCH] Equivalence Class Filters

On Mon, Dec 7, 2015 at 8:26 PM, David Rowley
<david.rowley@2ndquadrant.com> wrote:

The biggest frustration for me is that the way Tom always seems to argue his
point it's as if planning time is roughly the same or more expensive than
execution time, and likely that's true in many cases, but I would imagine in
more normal cases that execution time is longer, although I've never had the
guts to stand up and argue this as I don't have any stats to back me up.

I think this is missing the point. It is possible to have a query
planner optimization that is so expensive that it loses even when it
improves the plan, but I don't think this optimization falls into that
category. This particular change would not have been requested as
many times as it has been if people didn't keep rediscovering that, on
a certain class of queries, it works *really* well. The problem that
I understand Tom to be concerned about is the queries where the
optimization consumes additional planning time without delivering a
better plan - and those where, without careful thought, it might even
deliver a worse plan.

Now, I'm not sure Tom is right about that, but he might be. The class
of queries we're talking about here are the ones where somebody
constrains a column that is part of an equivalence class with multiple
members. Perhaps the best example is a large join, let's say 10
tables, where the join column is the same for all tables; thus, we
have an equivalence class with 10 members. Suppose further we have an
inequality condition applied to one member of that equivalence class.

Currently, we'll apply that inequality to the table that the user
actually mentioned and ignore all the others; in theory, it could be
right to enforce that inequality condition on any nonempty subset of
the 10 tables we've got. It might be right to pick exactly one of
those tables and enforce the inequality there, or it might be right to
enforce it on some of them, or it might be right to enforce it on all
of them. The reason why the answer isn't automatically "all of them"
is because, first of all, it's possible that enforcing the condition
at a particular table costs more to filter out the rows than we save
in execution time at higher levels of the plan tree. For example,
consider A JOIN B ON A.X = B.X WHERE A.X > 1000000. It might be that
the range of A.X is [0,1000001] but the range of B.X is
[1000000,2000000]; so enforcing the inequality against A is very
selective but enforcing it against B filters out basically nothing.
Furthermore, there are some cases involving parameterized paths where
enforcing the inequality multiple times is definitely bad: for
example, if we've got a nested loop where the outer side is a seq scan
that enforces the condition and the inner side is an index probe, it
is just a waste to retest it on the inner side. We already know that
the outer row passes the inequality, so the inner row will necessarily
pass also. This doesn't apply to merge or hash joins, and it also
doesn't apply to all nested loops: scans that aren't parameterized by
the equivalence-class column can still benefit from separate
enforcement of the inequality.

Considering the factors mentioned in the previous paragraph, it seems
likely to me that a naive patch that propagates the inequalities to
every relation where they could hypothetically be enforced will be a
significant loss in some cases even just in execution cost, ignoring
completely the effects on planning time. And, on the flip side,
figuring out what non-empty subset of relations forms the optimal set
upon which to enforce the inequality seems like a complicated problem.
A first cut might be to enforce the inequality against the relation
where it's believed to be most selective, and to also enforce it in
paths for other relations that aren't parameterized by the
equivalence-class column mentioned in the inequality provided that the
selectivity is thought to be above some threshold ... but I'm not sure
this is very principled, and it's certainly not obvious what arbitrary
threshold would be appropriate. The threshold where multiple
enforcement of the qual starts to pay off also depends on the cost of
the qual: an expensive qual has to filter out more rows than a cheap
qual to be worthwhile. I'm not sure how to estimate all this, but I
don't have trouble believing that if not done carefully it could
either (a) cost a lot of planning time or (b) generate lousy plans.
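The break-even condition sketched above can be written down directly; this is a simplification with invented numbers, not an actual costing formula, but it shows why an expensive qual must filter out more rows than a cheap one to pay off:

```python
# Rough break-even test for enforcing a derived qual at a scan: it pays off
# when the CPU spent evaluating it is less than the downstream cost of the
# rows it would otherwise have let through.
def worth_enforcing(rows, selectivity, qual_cost, downstream_cost_per_row):
    cost_to_check = rows * qual_cost
    savings = rows * (1.0 - selectivity) * downstream_cost_per_row
    return savings > cost_to_check

# A cheap qual that filters half the rows is clearly worthwhile...
print(worth_enforcing(10000, 0.5, 0.01, 0.1))  # True
# ...but an expensive qual with the same selectivity is a net loss.
print(worth_enforcing(10000, 0.5, 0.2, 0.1))   # False
```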

Now, all that having been said, I think this is a pretty important
optimization. Lots of people have asked for it, and I think it would
be worth expending some brainpower to try to figure out a way to be
smarter than we are now, which is, in a nutshell, as dumb as possible.
You're completely right that, on an OLAP query that's going to run for
a minute, a few extra milliseconds of planning time could be totally
OK even if they don't yield any benefit on most queries. Maybe we can
get the cost of this feature down to the point where we can turn it on
for everybody and have that be OK. But if not, having something we
can turn on for queries that are going to run for a long time, or that
we estimate are going to run for a long time, would, I think, be
useful. As things stand, planning time can consume a very significant
percentage of total runtime on OLTP queries, and if you are processing
300k queries/second and 10% of that is planning time, and somebody
boosts planning time by 1%, that's a 0.1% hit overall and you just
lost several hundred queries per second. That's not nothing,
especially if we do 3 or 4 such "optimizations" per release cycle.
But if the optimization can be made nearly free in cases where it
doesn't apply, that's a different story, and of course OLAP is a whole
different ball game.
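The throughput arithmetic in that paragraph works out as follows (the workload figures are the hypothetical ones given above):

```python
# At 300k queries/second with planning taking 10% of runtime, a 1% planner
# slowdown costs 0.1% of total throughput, i.e. several hundred queries/sec.
queries_per_sec = 300_000
planning_share = 0.10
planner_slowdown = 0.01

overall_hit = planning_share * planner_slowdown  # 0.001, i.e. 0.1%
lost_queries = queries_per_sec * overall_hit
print(lost_queries)  # about 300 queries/second lost
```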

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


#20David Rowley
david.rowley@2ndquadrant.com
In reply to: Robert Haas (#19)
Re: [PATCH] Equivalence Class Filters

On 9 December 2015 at 15:14, Robert Haas <robertmhaas@gmail.com> wrote:

On Mon, Dec 7, 2015 at 8:26 PM, David Rowley
<david.rowley@2ndquadrant.com> wrote:

The biggest frustration for me is that the way Tom always seems to argue his
point it's as if planning time is roughly the same or more expensive than
execution time, and likely that's true in many cases, but I would imagine in
more normal cases that execution time is longer, although I've never had the
guts to stand up and argue this as I don't have any stats to back me up.

I think this is missing the point. It is possible to have a query
planner optimization that is so expensive that it loses even when it
improves the plan, but I don't think this optimization falls into that
category. This particular change would not have been requested as
many times as it has been if people didn't keep rediscovering that, on
a certain class of queries, it works *really* well. The problem that
I understand Tom to be concerned about is the queries where the
optimization consumes additional planning time without delivering a
better plan - and those where, without careful thought, it might even
deliver a worse plan.

My point was more of a response to the general condition that we cannot add
any planner overhead unless the extra CPU cycles spent are almost certainly
going to improve the plan.

Now, I'm not sure Tom is right about that, but he might be. The class
of queries we're talking about here are the ones where somebody
constrains a column that is part of an equivalence class with multiple
members. Perhaps the best example is a large join, let's say 10
tables, where the join column is the same for all tables; thus, we
have an equivalence class with 10 members. Suppose further we have an
inequality condition applied to one member of that equivalence class.

Currently, we'll apply that inequality to the table that the user
actually mentioned and ignore all the others; in theory, it could be
right to enforce that inequality condition on any nonempty subset of
the 10 tables we've got. It might be right to pick exactly one of
those tables and enforce the inequality there, or it might be right to
enforce it on some of them, or it might be right to enforce it on all
of them. The reason why the answer isn't automatically "all of them"
is because, first of all, it's possible that enforcing the condition
at a particular table costs more to filter out the rows than we save
in execution time at higher levels of the plan tree. For example,
consider A JOIN B ON A.X = B.X WHERE A.X > 1000000. It might be that
the range of A.X is [0,1000001] but the range of B.X is
[1000000,2000000]; so enforcing the inequality against A is very
selective but enforcing it against B filters out basically nothing.

hmm, well I think it's more than likely that my view on this has been
skewed by the fact that we push equality quals down when the eclass
contains a Const without any regard that it could generate a worse plan or
add needless CPU overhead for checking the quals. If you rewrite your above
paragraph with equality instead of inequality then it still applies. A JOIN
B ON A.X = B.X WHERE A.X = 1000000 will cause B.X = 1000000 to be pushed
into B, but what if B.X values are all set to 1000000? I've not seen anyone
complain about us doing that. This is the reason I limited the patch to
only deal with members of the BTREE op-class, I assumed that if there's
some BTREE index helping with the equality qual then that same index may
well just help with a qual using the <, <=, > or >= operator, and also I
thought that in most cases all these btree inequality operations will have
about the same CPU cost as equality.

Would you be able to explain the difference between the two? Or is it just
a question of the likelihood?

Furthermore, there are some cases involving parameterized paths where
enforcing the inequality multiple times is definitely bad: for
example, if we've got a nested loop where the outer side is a seq scan
that enforces the condition and the inner side is an index probe, it
is just a waste to retest it on the inner side. We already know that
the outer row passes the inequality, so the inner row will necessarily
pass also. This doesn't apply to merge or hash joins, and it also
doesn't apply to all nested loops: scans that aren't parameterized by
the equivalence-class column can still benefit from separate
enforcement of the inequality.

I guess that could be fixed by somehow marking these pushed quals as
optional and having parameterised scans ignore optional quals.
I have to admit that I didn't give it much thought as all of the cases that
I tested turned what used to be nested loop with a parameterised inner
index scan into a merge join, which was faster in my test case. Of course,
that does not mean that I claim that this merge join will be faster in all
cases or that the plan will switch to using a merge join in all cases.

Considering the factors mentioned in the previous paragraph, it seems
likely to me that a naive patch that propagates the inequalities to
every relation where they could hypothetically be enforced will be a
significant loss in some cases even just in execution cost, ignoring
completely the effects on planning time. And, on the flip side,
figuring out what non-empty subset of relations forms the optimal set
upon which to enforce the inequality seems like a complicated problem.
A first cut might be to enforce the inequality against the relation
where it's believed to be most selective, and to also enforce it in
paths for other relations that aren't parameterized by the
equivalence-class column mentioned in the inequality provided that the
selectivity is thought to be above some threshold ... but I'm not sure
this is very principled, and it's certainly not obvious what arbitrary
threshold would be appropriate. The threshold where multiple
enforcement of the qual starts to pay off also depends on the cost of
the qual: an expensive qual has to filter out more rows than a cheap
qual to be worthwhile. I'm not sure how to estimate all this, but I
don't have trouble believing that if not done carefully it could
either (a) cost a lot of planning time or (b) generate lousy plans.

Again this is one of the reason that I limited it to btree operators only.
I don't quite know how to estimate this either as the extra rows being
filtered may mean that a sort no longer spills to disk, or a hash join no
longer multi-batches. I'd imagine filtering would be extra worthwhile in
such cases, so I doubt setting the threshold to some constant value is the
correct way, and most likely considering the path with and without the
qual(s) would be far too costly.

Now, all that having been said, I think this is a pretty important
optimization. Lots of people have asked for it, and I think it would
be worth expending some brainpower to try to figure out a way to be
smarter than we are now, which is, in a nutshell, as dumb as possible.
You're completely right that, on an OLAP query that's going to run for
a minute, a few extra milliseconds of planning time could be totally
OK even if they don't yield any benefit on most queries. Maybe we can
get the cost of this feature down to the point where we can turn it on
for everybody and have that be OK. But if not, having something we
can turn on for queries that are going to run for a long time, or that
we estimate are going to run for a long time, would, I think, be
useful. As things stand, planning time can consume a very significant
percentage of total runtime on OLTP queries, and if you are processing
300k queries/second and 10% of that is planning time, and somebody
boosts planning time by 1%, that's a 0.1% hit overall and you just
lost several hundred queries per second. That's not nothing,
especially if we do 3 or 4 such "optimizations" per release cycle.
But if the optimization can be made nearly free in cases where it
doesn't apply, that's a different story, and of course OLAP is a whole
different ball game.

Well I agree with that. I've got no interest in slowing anything down. I've
been busy working with big databases for quite a while now, so perhaps my
point of view is coming from that direction still.

So anyway, it's quite clear that we don't want the patch in its current
form, and I'm not volunteering any more time at this stage to improve it,
so for this reason I won't add it to the January commit fest.

Thanks for the feedback.

David

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#21Robert Haas
robertmhaas@gmail.com
In reply to: David Rowley (#20)
Re: [PATCH] Equivalence Class Filters

On Tue, Dec 8, 2015 at 11:12 PM, David Rowley
<david.rowley@2ndquadrant.com> wrote:

My point was more of a response to the general condition that we cannot add
any planner overhead unless the extra CPU cycles spent are almost certainly
going to improve the plan.

I hope that's an overstatement of the requirement, because otherwise
it boils down to "don't improve the planner". Most things we do to
try to improve plans are going to generate paths that will only
sometimes be better than the paths we're already generating. "almost
certainly" is a high bar to clear.

hmm, well I think it's more than likely that my view on this has been skewed
by the fact that we push equality quals down when the eclass contains a
Const without any regard that it could generate a worse plan or add needless
CPU overhead for checking the quals. If you rewrite your above paragraph
with equality instead of inequality then it still applies. A JOIN B ON A.X =
B.X WHERE A.X = 1000000 will cause B.X = 1000000 to be pushed into B, but
what if B.X values are all set to 1000000? I've not seen anyone complain
about us doing that. This is the reason I limited the patch to only deal
with members of the BTREE op-class, I assumed that if there's some BTREE
index helping with the equality qual then that same index may well just help
with a qual using the <, <=, > or >= operator, and also I thought that in
most cases all these btree inequality operations will have about the same
CPU cost as equality.

Would you be able to explain the difference between the two? Or is it just a
question of the likelihood?

Likelihood, I guess. I mean, typically if you are doing an equijoin
on two or more relations, the join column is unique, so there's only
going to be one row in each of A and B that is equal to a given
constant. If A.X and/or B.X contain duplicate values, then the output
is going to have a number of rows equal to the product of the number
of duplicates in each, sort of like a Cartesian join. That scenario
could happen, but it's a pretty strange kind of query which I would be
disinclined to spend a lot of time optimizing. OTOH, something like
SELECT * FROM orders o, lines l WHERE l.order_id = o.order_id AND
o.order_id > 1000000 is a pretty normal query, at least IMHO.
Worrying about how that's going to perform with various data
distributions seems like a pretty reasonable thing to care about.
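The cardinality point in that paragraph can be sketched in a few lines: for an equijoin on a non-unique column, matches for a given value multiply like a Cartesian product of the duplicate counts on each side. The data here is invented purely for illustration:

```python
from collections import Counter

def equijoin_count(a_vals, b_vals):
    """Number of output rows of A JOIN B ON A.x = B.x."""
    b_counts = Counter(b_vals)
    return sum(b_counts[v] for v in a_vals)

# Unique join keys: at most one match per value.
print(equijoin_count([1, 2, 3], [1, 2, 3]))  # 3

# Every value duplicated: 3 copies of x=1 on each side -> 3*3 = 9 rows,
# the Cartesian-like blowup described above.
print(equijoin_count([1, 1, 1], [1, 1, 1]))  # 9
```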

Furthermore, there are some cases involving parameterized paths where
enforcing the inequality multiple times is definitely bad: for
example, if we've got a nested loop where the outer side is a seq scan
that enforces the condition and the inner side is an index probe, it
is just a waste to retest it on the inner side. We already know that
the outer row passes the inequality, so the inner row will necessarily
pass also. This doesn't apply to merge or hash joins, and it also
doesn't apply to all nested loops: scans that aren't parameterized by
the equivalence-class column can still benefit from separate
enforcement of the inequality.

I guess that could be fixed by somehow marking these pushed quals as
optional and having parameterised scans ignore optional quals.

Yeah.

I have to admit that I didn't give it much thought as all of the cases that
I tested turned what used to be nested loop with a parameterised inner index
scan into a merge join, which was faster in my test case. Of course, that
does not mean that I claim that this merge join will be faster in all cases
or that the plan will switch to using a merge join in all cases.

Interesting that it turned into a merge join and that that was faster.

Again this is one of the reason that I limited it to btree operators only.
I don't quite know how to estimate this either as the extra rows being
filtered may mean that a sort no longer spills to disk, or a hash join no
longer multi-batches. I'd imagine filtering would be extra worthwhile in
such cases, so I doubt setting the threshold to some constant value is the
correct way, and most likely considering the path with and without the
qual(s) would be far too costly.

It might be OK, particularly for OLTP queries, but it's certainly not
going to be the cheapest thing anybody ever did.

Well I agree with that. I've got no interest in slowing anything down. I've
been busy working with big databases for quite a while now, so perhaps my
point of view is coming from that direction still.

Yeah.

So anyway, it's quite clear that we don't want the patch in its current
form, and I'm not volunteering any more time at this stage to improve it, so
for this reason I won't add it to the January commit fest.

Too bad, but I understand. I hope you come back around to working on
this further at some point.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


#22Simon Riggs
simon@2ndQuadrant.com
In reply to: Simon Riggs (#8)
Re: [PATCH] Equivalence Class Filters

On 7 December 2015 at 16:44, Simon Riggs <simon@2ndquadrant.com> wrote:

On 6 December 2015 at 16:38, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Lastly, in most cases knowing that t2.id <= 10 is just not worth all
that much; it's certainly far less useful than an equality condition.

I'd have thought that my link to a thread of a reported 1100 to 2200 times
performance improvement might have made you think twice about claiming the
uselessness of this idea.

Personally, I think this optimization is a long way down the list of
important items because I don't see it as a typical use case. There are
already patches submitted for more important items, so this isn't the right
place to be arguing such things. Not every beneficial optimization has a
wide use case.

There is an interesting real world case where we might get some use of
these thoughts.

If we have Orders and OrderItems (FK->Orders)
and we also know (and can Assert) Order.order_date <= OrderItems.ship_date
then a restriction on Orders.order_date > X => OrderItem.ship_date > X when
the two tables are joined on OrderId
and also a restriction on OrderItems.ship_date >= X => Orders.order_date <
X when the two tables are joined on OrderId

Such an assertion could be checked during the FK check, so would not be
expensive to maintain.

One for the future, at least, since we don't have any way of expressing or
enforcing that just yet.
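The derivation Simon describes can be sketched as follows, assuming the invariant orders.order_date <= order_items.ship_date were actually enforced across the FK. Given a restriction order_date > X on the parent, ship_date > X must hold for every matching child row, so the derived qual never wrongly rejects anything. Table contents and names here are invented for illustration:

```python
from datetime import date

orders = {1: date(2015, 1, 10), 2: date(2015, 6, 1)}     # order_id -> order_date
items = [(1, date(2015, 1, 12)), (2, date(2015, 6, 3))]  # (order_id, ship_date)

X = date(2015, 5, 1)

# Rows surviving the original restriction on the parent table...
kept_orders = {oid for oid, d in orders.items() if d > X}

# ...and the derived restriction on the child: because order_date <=
# ship_date, the derived qual ship_date > X is implied for every child
# row whose parent order survived.
for oid, ship in items:
    if oid in kept_orders:
        assert ship > X

print(sorted(kept_orders))  # [2]
```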

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#23Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Simon Riggs (#22)
Re: [PATCH] Equivalence Class Filters

On 12/16/2015 01:26 AM, Simon Riggs wrote:

There is an interesting real world case where we might get some use of
these thoughts.

If we have Orders and OrderItems (FK->Orders)
and we also know (and can Assert) Order.order_date <= OrderItems.ship_date
then a restriction on Orders.order_date > X => OrderItem.ship_date > X
when the two tables are joined on OrderId
and also a restriction on OrderItems.ship_date >= X => Orders.order_date
< X when the two tables are joined on OrderId

Such an assertion could be checked during the FK check, so would not be
expensive to maintain.

One for the future, at least, since we don't have any way of expressing
or enforcing that just yet.

There's a concept of "correlation maps", described in a paper [1]
presented at VLDB 2009. It essentially talks about deriving conditions
between attributes of the same table, but I guess it might be modified
to track the correlations through foreign keys.

Interestingly enough, the data for the paper were collected on
PostgreSQL, but the correlation maps were implemented in an application
layer on top of the database.

[1]: http://www.vldb.org/pvldb/2/vldb09-199.pdf

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


#24David Rowley
david.rowley@2ndquadrant.com
In reply to: Simon Riggs (#22)
Re: [PATCH] Equivalence Class Filters

On 16 December 2015 at 13:26, Simon Riggs <simon@2ndquadrant.com> wrote:

There is an interesting real world case where we might get some use of
these thoughts.

If we have Orders and OrderItems (FK->Orders)
and we also know (and can Assert) Order.order_date <= OrderItems.ship_date
then a restriction on Orders.order_date > X => OrderItem.ship_date > X
when the two tables are joined on OrderId
and also a restriction on OrderItems.ship_date >= X => Orders.order_date <
X when the two tables are joined on OrderId

Such an assertion could be checked during the FK check, so would not be
expensive to maintain.

One for the future, at least, since we don't have any way of expressing or
enforcing that just yet.

That does sound interesting, but it's important to remember that referenced
tables are not updated in real time in the same way that indexes are. This
was the reason the INNER JOIN removals had problems, we simply can't
determine at planner time that the trigger queue for the foreign key will
be empty during execution, so can't be certain that the foreign key will be
"true".

I'm just mentioning this as I wouldn't want someone to run off thinking
this was a fantastic idea without being aware of the above, and waste time
making the same mistakes as I did last year.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#25David Fetter
david@fetter.org
In reply to: David Rowley (#24)
Re: [PATCH] Equivalence Class Filters

On Sun, Dec 20, 2015 at 10:27:35PM +1300, David Rowley wrote:

On 16 December 2015 at 13:26, Simon Riggs <simon@2ndquadrant.com> wrote:

There is an interesting real world case where we might get some
use of these thoughts.

If we have Orders and OrderItems (FK->Orders) and we also know
(and can Assert) Order.order_date <= OrderItems.ship_date then a
restriction on Orders.order_date > X => OrderItem.ship_date > X
when the two tables are joined on OrderId and also a restriction
on OrderItems.ship_date >= X => Orders.order_date < X when the two
tables are joined on OrderId

Such an assertion could be checked during the FK check, so would
not be expensive to maintain.

One for the future, at least, since we don't have any way of
expressing or enforcing that just yet.

That does sound interesting, but it's important to remember that
referenced tables are not updated in real time in the same way that
indexes are.

Is getting them so even remotely possible, given the system we have
now?

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
