[PATCH] Erase the distinctClause if the result is unique by definition
Hi:
I wrote a patch to erase the distinctClause if the result is unique by
definition. I noticed this because a user ported code from Oracle to PG
and found the performance was bad because of this, so I adapted PG for
it as well.
This patch doesn't help a well-written SQL statement, but such a drawback
in a query may not be very obvious, and since the cost of the check is
pretty low, I think it would be OK to add.
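To illustrate the idea with a hypothetical table (the regression tests in
the patch use similar ones): once the whole primary key appears in the
target list, every output row is already unique, so the Unique or
HashAggregate node on top adds nothing:

create table t (a int, b int, c int, primary key (a, b));
-- the whole PK is selected, so DISTINCT is redundant:
explain (costs off) select distinct a, b, c from t;
-- with this patch the expected plan is a plain Seq Scan on t,
-- with no Unique/HashAggregate node on top.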
Please see the patch for details.
Thank you.
Attachments:
0001-Erase-the-distinctClause-if-the-result-is-unique-by-.patch (application/octet-stream)
From 113d44d02b67ff487fe14a36c016cb1a11a03ebf Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=E4=B8=80=E6=8C=83?= <yizhi.fzh@alibaba-inc.com>
Date: Fri, 31 Jan 2020 19:38:05 +0800
Subject: [PATCH] Erase the distinctClause if the result is unique by
definition
For a single relation, we can tell this if any one of the following
is true:
1. The pk is in the target list.
2. The uk is in the target list and its columns are not null.
3. The columns in the group-by clause are also in the target list.
For a relation join, we can tell it by:
1. If every relation in the jointree yields a unique result set,
the final result will be unique as well, regardless of the join method.
---
src/backend/nodes/bitmapset.c | 40 ++++
src/backend/optimizer/path/costsize.c | 1 +
src/backend/optimizer/plan/planner.c | 219 ++++++++++++++++++
src/backend/utils/cache/relcache.c | 23 ++
src/backend/utils/misc/guc.c | 10 +
src/include/nodes/bitmapset.h | 2 +
src/include/optimizer/cost.h | 1 +
src/include/utils/rel.h | 3 +
src/test/regress/expected/join.out | 16 +-
.../regress/expected/select_distinct_2.out | 88 +++++++
src/test/regress/expected/sysviews.out | 3 +-
src/test/regress/sql/select_distinct_2.sql | 31 +++
12 files changed, 428 insertions(+), 9 deletions(-)
create mode 100644 src/test/regress/expected/select_distinct_2.out
create mode 100644 src/test/regress/sql/select_distinct_2.sql
diff --git a/src/backend/nodes/bitmapset.c b/src/backend/nodes/bitmapset.c
index 648cc1a7eb..76ce9b526e 100644
--- a/src/backend/nodes/bitmapset.c
+++ b/src/backend/nodes/bitmapset.c
@@ -1167,3 +1167,43 @@ bms_hash_value(const Bitmapset *a)
return DatumGetUInt32(hash_any((const unsigned char *) a->words,
(lastword + 1) * sizeof(bitmapword)));
}
+
+/*
+ * bms_array_copy --
+ *
+ * copy the bms data into newly palloc'd memory
+ */
+
+Bitmapset**
+bms_array_copy(Bitmapset **bms_array, int len)
+{
+ Bitmapset **res;
+ int i;
+ if (bms_array == NULL || len < 1)
+ return NULL;
+
+ res = palloc(sizeof(Bitmapset*) * len);
+ for(i = 0; i < len; i++)
+ {
+ res[i] = bms_copy(bms_array[i]);
+ }
+ return res;
+}
+
+/*
+ * bms_array_free
+ *
+ * free each element in the array one by one, then free the array itself
+ */
+void
+bms_array_free(Bitmapset **bms_array, int len)
+{
+ int idx;
+ if (bms_array == NULL)
+ return;
+ for(idx = 0 ; idx < len; idx++)
+ {
+ bms_free(bms_array[idx]);
+ }
+ pfree(bms_array);
+}
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index b5a0033721..dde16b5d44 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -138,6 +138,7 @@ bool enable_partitionwise_aggregate = false;
bool enable_parallel_append = true;
bool enable_parallel_hash = true;
bool enable_partition_pruning = true;
+bool enable_distinct_elimination = true;
typedef struct
{
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index d6f2153593..326ecb47b8 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -22,8 +22,10 @@
#include "access/htup_details.h"
#include "access/parallel.h"
#include "access/sysattr.h"
+#include "access/relation.h"
#include "access/table.h"
#include "access/xact.h"
+#include "catalog/index.h"
#include "catalog/pg_constraint.h"
#include "catalog/pg_inherits.h"
#include "catalog/pg_proc.h"
@@ -35,6 +37,7 @@
#include "lib/bipartite_match.h"
#include "lib/knapsack.h"
#include "miscadmin.h"
+#include "nodes/bitmapset.h"
#include "nodes/makefuncs.h"
#include "nodes/nodeFuncs.h"
#ifdef OPTIMIZER_DEBUG
@@ -248,6 +251,7 @@ static bool group_by_has_partkey(RelOptInfo *input_rel,
List *targetList,
List *groupClause);
static int common_prefix_cmp(const void *a, const void *b);
+static void preprocess_distinct_node(PlannerInfo *root);
/*****************************************************************************
@@ -989,6 +993,9 @@ subquery_planner(PlannerGlobal *glob, Query *parse,
/* Remove any redundant GROUP BY columns */
remove_useless_groupby_columns(root);
+ if (enable_distinct_elimination)
+ preprocess_distinct_node(root);
+
/*
* If we have any outer joins, try to reduce them to plain inner joins.
* This step is most easily done after we've done expression
@@ -7409,3 +7416,215 @@ group_by_has_partkey(RelOptInfo *input_rel,
return true;
}
+
+/*
+ * is_unique_result_already
+ *
+ * Given a relation, we know its primary key and unique key information.
+ * unique_target is the set of columns in the distinct/distinct-on target
+ * list. not_null_columns is the union of not-null columns known from the
+ * catalog and the quals. We can then tell that the result is unique before
+ * executing the query if the primary key, or a unique key whose columns
+ * are all not null, is contained in the target list.
+ */
+static bool
+is_unique_result_already(Relation relation,
+ Bitmapset *unique_target,
+ Bitmapset *not_null_columns)
+{
+ int i;
+ Bitmapset *pkattr = RelationGetIndexAttrBitmap(relation,
+ INDEX_ATTR_BITMAP_PRIMARY_KEY);
+
+ /*
+ * if the pk is in the target list,
+ * the result set is unique for this relation
+ */
+ if (pkattr != NULL &&
+ !bms_is_empty(pkattr) &&
+ bms_is_subset(pkattr, unique_target))
+ {
+ return true;
+ }
+
+ /*
+ * If the columns of a plain unique index are all in the target list
+ * and are all known not null, the result set is unique as well.
+ */
+ for (i = 0; i < relation->rd_plain_ukcount; i++)
+ {
+ Bitmapset *ukattr = relation->rd_plain_ukattrs[i];
+ if (!bms_is_empty(ukattr)
+ && bms_is_subset(ukattr, unique_target)
+ && bms_is_subset(ukattr, not_null_columns))
+ return true;
+ }
+
+
+}
+
+/*
+ * preprocess_distinct_node
+ *
+ * remove the distinctClause if it is not necessary by definition
+ */
+static void
+preprocess_distinct_node(PlannerInfo *root)
+{
+ Query *query = root->parse;
+ ListCell *lc;
+ int num_of_rtables;
+ Bitmapset **targetlist_by_table = NULL;
+ Bitmapset **notnullcolumns = NULL;
+ Index rel_idx;
+ bool should_distinct_elimination = false;
+
+ if (query->distinctClause == NIL)
+ return;
+
+ foreach(lc, query->rtable)
+ {
+ RangeTblEntry *rte = lfirst_node(RangeTblEntry, lc);
+ /* we only handle the basic Relation for now */
+ if (rte->rtekind != RTE_RELATION)
+ return;
+ }
+
+ num_of_rtables = list_length(query->rtable);
+
+ /* If the group-by columns are all in the target list, we don't need distinct */
+ if (query->groupClause != NIL)
+ {
+ Bitmapset *groupclause_bm = NULL;
+ Bitmapset *groupclause_in_targetlist_bm = NULL;
+ ListCell *lc;
+ foreach(lc, query->groupClause)
+ groupclause_bm = bms_add_member(groupclause_bm,
+ lfirst_node(SortGroupClause, lc)->tleSortGroupRef);
+
+ foreach(lc, query->targetList)
+ {
+ TargetEntry *te = lfirst_node(TargetEntry, lc);
+ if (te->resjunk)
+ continue;
+ groupclause_in_targetlist_bm = bms_add_member(groupclause_in_targetlist_bm,
+ te->ressortgroupref);
+ }
+
+ should_distinct_elimination = bms_is_subset(groupclause_bm,
+ groupclause_in_targetlist_bm);
+ bms_free(groupclause_bm);
+ bms_free(groupclause_in_targetlist_bm);
+ if (should_distinct_elimination)
+ goto ret;
+ }
+
+ targetlist_by_table = palloc0(sizeof(Bitmapset*) * num_of_rtables);
+ notnullcolumns = palloc0(sizeof(Bitmapset* ) * num_of_rtables);
+
+ /* build the targetlist_by_table */
+ foreach(lc, query->targetList)
+ {
+ TargetEntry *te = lfirst_node(TargetEntry, lc);
+ Expr *expr = te->expr;
+ Var *var;
+ Bitmapset **target_column_per_rel;
+ int target_attno;
+
+ if (!IsA(expr, Var))
+ continue;
+ var = (Var *)(expr);
+ if (var->varlevelsup != 0)
+ continue;
+
+ target_column_per_rel = &targetlist_by_table[var->varno - 1];
+ target_attno = var->varattno - FirstLowInvalidHeapAttributeNumber;
+
+ /* Handle DISTINCT ON differently */
+ if (query->hasDistinctOn)
+ {
+ Index ref = te->ressortgroupref;
+ ListCell *lc;
+
+ /* A fastpath to know if the targetEntry is in the distinctClause */
+ if (ref == 0)
+ continue;
+
+ foreach(lc, query->distinctClause)
+ {
+ if (ref == lfirst_node(SortGroupClause, lc)->tleSortGroupRef)
+ *target_column_per_rel = bms_add_member(*target_column_per_rel,
+ target_attno);
+ }
+ }
+ else
+ *target_column_per_rel = bms_add_member(*target_column_per_rel,
+ target_attno);
+ }
+
+ /* find out nonnull columns from qual via find_nonnullable_vars */
+ foreach(lc, find_nonnullable_vars(query->jointree->quals))
+ {
+ Var *not_null_var;
+ Bitmapset **notnullcolumns_per_rel;
+ int notnull_attno;
+ if (!IsA(lfirst(lc), Var))
+ continue;
+ not_null_var = lfirst_node(Var, lc);
+ if (not_null_var->varno == INNER_VAR ||
+ not_null_var->varno == OUTER_VAR ||
+ not_null_var->varno == INDEX_VAR)
+ continue;
+ notnullcolumns_per_rel = &notnullcolumns[not_null_var->varno - 1];
+ notnull_attno = not_null_var->varattno - FirstLowInvalidHeapAttributeNumber;
+ *notnullcolumns_per_rel = bms_add_member(*notnullcolumns_per_rel,
+ notnull_attno);
+ }
+
+ /* Check if each related rtable can yield a unique result set */
+ rel_idx = 0;
+ foreach(lc, query->rtable)
+ {
+ RangeTblEntry *te = lfirst_node(RangeTblEntry, lc);
+ Relation relation = relation_open(te->relid, RowExclusiveLock);
+ int attr_idx = 0;
+ TupleDesc desc = relation->rd_att;
+
+ /* Add the notnullcolumns based on catalog */
+ for(; attr_idx < desc->natts; attr_idx++)
+ {
+ int notnull_attno;
+ if (!desc->attrs[attr_idx].attnotnull)
+ continue;
+ notnull_attno = attr_idx + 1 - FirstLowInvalidHeapAttributeNumber;
+ notnullcolumns[rel_idx] = bms_add_member(notnullcolumns[rel_idx],
+ notnull_attno);
+ }
+
+ /* check non-nullability from quals; only plain "col is not null" is recognized now */
+ if (!is_unique_result_already(relation,
+ targetlist_by_table[rel_idx],
+ notnullcolumns[rel_idx]))
+ {
+ RelationClose(relation);
+ goto ret;
+ }
+ RelationClose(relation);
+ rel_idx++;
+ }
+
+ should_distinct_elimination = true;
+
+ ret:
+ bms_array_free(notnullcolumns, num_of_rtables);
+ bms_array_free(targetlist_by_table, num_of_rtables);
+
+ if (should_distinct_elimination)
+ {
+ query->distinctClause = NIL;
+ query->hasDistinctOn = false;
+ }
+}
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index df025a5a30..d8a76a2273 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -2346,6 +2346,8 @@ RelationDestroyRelation(Relation relation, bool remember_tupdesc)
bms_free(relation->rd_keyattr);
bms_free(relation->rd_pkattr);
bms_free(relation->rd_idattr);
+ if (relation->rd_plain_ukattrs)
+ bms_array_free(relation->rd_plain_ukattrs, relation->rd_plain_ukcount);
if (relation->rd_pubactions)
pfree(relation->rd_pubactions);
if (relation->rd_options)
@@ -4762,6 +4764,7 @@ RelationGetIndexAttrBitmap(Relation relation, IndexAttrBitmapKind attrKind)
Bitmapset *indexattrs; /* indexed columns */
Bitmapset *uindexattrs; /* columns in unique indexes */
Bitmapset *pkindexattrs; /* columns in the primary index */
+ Bitmapset **ukindexattrs = NULL; /* columns in the unique indexes */
Bitmapset *idindexattrs; /* columns in the replica identity */
List *indexoidlist;
List *newindexoidlist;
@@ -4769,6 +4772,7 @@ RelationGetIndexAttrBitmap(Relation relation, IndexAttrBitmapKind attrKind)
Oid relreplindex;
ListCell *l;
MemoryContext oldcxt;
+ int plain_uk_index_count = 0, index_count = 0, indexno = 0;
/* Quick exit if we already computed the result. */
if (relation->rd_indexattr != NULL)
@@ -4826,6 +4830,9 @@ restart:
uindexattrs = NULL;
pkindexattrs = NULL;
idindexattrs = NULL;
+ index_count = list_length(indexoidlist);
+ ukindexattrs = palloc0(sizeof(Bitmapset *) * index_count);
+
foreach(l, indexoidlist)
{
Oid indexOid = lfirst_oid(l);
@@ -4875,6 +4882,9 @@ restart:
/* Is this index the configured (or default) replica identity? */
isIDKey = (indexOid == relreplindex);
+ if (isKey)
+ plain_uk_index_count++;
+
/* Collect simple attribute references */
for (i = 0; i < indexDesc->rd_index->indnatts; i++)
{
@@ -4904,6 +4914,11 @@ restart:
if (isIDKey && i < indexDesc->rd_index->indnkeyatts)
idindexattrs = bms_add_member(idindexattrs,
attrnum - FirstLowInvalidHeapAttributeNumber);
+
+ if (isKey)
+ ukindexattrs[indexno] = bms_add_member(ukindexattrs[indexno],
+ attrnum - FirstLowInvalidHeapAttributeNumber);
+
}
}
@@ -4914,6 +4929,7 @@ restart:
pull_varattnos(indexPredicate, 1, &indexattrs);
index_close(indexDesc, AccessShareLock);
+ indexno++;
}
/*
@@ -4940,6 +4956,7 @@ restart:
bms_free(pkindexattrs);
bms_free(idindexattrs);
bms_free(indexattrs);
+ bms_array_free(ukindexattrs, index_count);
goto restart;
}
@@ -4953,6 +4970,8 @@ restart:
relation->rd_pkattr = NULL;
bms_free(relation->rd_idattr);
relation->rd_idattr = NULL;
+ bms_array_free(relation->rd_plain_ukattrs, relation->rd_plain_ukcount);
+ relation->rd_plain_ukattrs = NULL;
/*
* Now save copies of the bitmaps in the relcache entry. We intentionally
@@ -4966,6 +4985,8 @@ restart:
relation->rd_pkattr = bms_copy(pkindexattrs);
relation->rd_idattr = bms_copy(idindexattrs);
relation->rd_indexattr = bms_copy(indexattrs);
+ relation->rd_plain_ukattrs = bms_array_copy(ukindexattrs, index_count);
+ relation->rd_plain_ukcount = plain_uk_index_count;
MemoryContextSwitchTo(oldcxt);
/* We return our original working copy for caller to play with */
@@ -5618,6 +5639,8 @@ load_relcache_init_file(bool shared)
rel->rd_keyattr = NULL;
rel->rd_pkattr = NULL;
rel->rd_idattr = NULL;
+ rel->rd_plain_ukattrs = NULL;
+ rel->rd_plain_ukcount = 0;
rel->rd_pubactions = NULL;
rel->rd_statvalid = false;
rel->rd_statlist = NIL;
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index e44f71e991..fa798dd564 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -1064,6 +1064,16 @@ static struct config_bool ConfigureNamesBool[] =
true,
NULL, NULL, NULL
},
+ {
+ {"enable_distinct_elimination", PGC_USERSET, QUERY_TUNING_METHOD,
+ gettext_noop("Enables plan-time and run-time unique elimination."),
+ gettext_noop("Allows the query planner to remove the uncecessary distinct clause."),
+ GUC_EXPLAIN
+ },
+ &enable_distinct_elimination,
+ true,
+ NULL, NULL, NULL
+ },
{
{"geqo", PGC_USERSET, QUERY_TUNING_GEQO,
gettext_noop("Enables genetic query optimization."),
diff --git a/src/include/nodes/bitmapset.h b/src/include/nodes/bitmapset.h
index b7b18a0b68..ff30feb521 100644
--- a/src/include/nodes/bitmapset.h
+++ b/src/include/nodes/bitmapset.h
@@ -117,4 +117,6 @@ extern int bms_prev_member(const Bitmapset *a, int prevbit);
/* support for hashtables using Bitmapsets as keys: */
extern uint32 bms_hash_value(const Bitmapset *a);
+extern Bitmapset **bms_array_copy(Bitmapset **bms_array, int len);
+extern void bms_array_free(Bitmapset **bms_array, int len);
#endif /* BITMAPSET_H */
diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h
index cb012ba198..4fa5d32df6 100644
--- a/src/include/optimizer/cost.h
+++ b/src/include/optimizer/cost.h
@@ -64,6 +64,7 @@ extern PGDLLIMPORT bool enable_partitionwise_aggregate;
extern PGDLLIMPORT bool enable_parallel_append;
extern PGDLLIMPORT bool enable_parallel_hash;
extern PGDLLIMPORT bool enable_partition_pruning;
+extern PGDLLIMPORT bool enable_distinct_elimination;
extern PGDLLIMPORT int constraint_exclusion;
extern double index_pages_fetched(double tuples_fetched, BlockNumber pages,
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 44ed04dd3f..7c5a6d65b6 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -120,6 +120,9 @@ typedef struct RelationData
Bitmapset *rd_indexattr; /* identifies columns used in indexes */
Bitmapset *rd_keyattr; /* cols that can be ref'd by foreign keys */
Bitmapset *rd_pkattr; /* cols included in primary key */
+ Bitmapset **rd_plain_ukattrs; /* cols included in plain unique indexes;
+ only non-expression, non-partial indexes are counted */
+ int rd_plain_ukcount; /* number of plain unique indexes */
Bitmapset *rd_idattr; /* included in replica identity index */
PublicationActions *rd_pubactions; /* publication actions */
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index 761376b007..3f6595d53b 100644
--- a/src/test/regress/expected/join.out
+++ b/src/test/regress/expected/join.out
@@ -4433,17 +4433,17 @@ select d.* from d left join (select * from b group by b.id, b.c_id) s
explain (costs off)
select d.* from d left join (select distinct * from b) s
on d.a = s.id;
- QUERY PLAN
---------------------------------------
- Merge Right Join
- Merge Cond: (b.id = d.a)
- -> Unique
- -> Sort
- Sort Key: b.id, b.c_id
- -> Seq Scan on b
+ QUERY PLAN
+---------------------------------
+ Merge Left Join
+ Merge Cond: (d.a = s.id)
-> Sort
Sort Key: d.a
-> Seq Scan on d
+ -> Sort
+ Sort Key: s.id
+ -> Subquery Scan on s
+ -> Seq Scan on b
(9 rows)
-- check join removal works when uniqueness of the join condition is enforced
diff --git a/src/test/regress/expected/select_distinct_2.out b/src/test/regress/expected/select_distinct_2.out
new file mode 100644
index 0000000000..d5c8f818af
--- /dev/null
+++ b/src/test/regress/expected/select_distinct_2.out
@@ -0,0 +1,88 @@
+create table select_distinct_a(a int, b char(20), c char(20) not null, d int, e int, primary key(a, b));
+set enable_mergejoin to off;
+set enable_hashjoin to off;
+-- no node for distinct.
+explain (costs off) select distinct * from select_distinct_a;
+ QUERY PLAN
+-------------------------------
+ Seq Scan on select_distinct_a
+(1 row)
+
+explain (costs off) select distinct b,c,d,e from select_distinct_a;
+ QUERY PLAN
+-------------------------------------
+ HashAggregate
+ Group Key: b, c, d, e
+ -> Seq Scan on select_distinct_a
+(3 rows)
+
+create unique index select_distinct_a_uk on select_distinct_a(c, d);
+explain (costs off) select distinct b,c,d,e from select_distinct_a where c is not null;
+ QUERY PLAN
+-------------------------------------
+ HashAggregate
+ Group Key: b, c, d, e
+ -> Seq Scan on select_distinct_a
+ Filter: (c IS NOT NULL)
+(4 rows)
+
+explain (costs off) select distinct b,c,d,e from select_distinct_a where c is not null and d is not null;
+ QUERY PLAN
+-------------------------------------------------
+ Seq Scan on select_distinct_a
+ Filter: ((c IS NOT NULL) AND (d IS NOT NULL))
+(2 rows)
+
+explain select distinct d, e from select_distinct_a group by d, e;
+ QUERY PLAN
+--------------------------------------------------------------------------
+ HashAggregate (cost=15.85..17.85 rows=200 width=8)
+ Group Key: d, e
+ -> Seq Scan on select_distinct_a (cost=0.00..13.90 rows=390 width=8)
+(3 rows)
+
+create table select_distinct_b(a int, b char(20), c char(20) not null, d int, e int, primary key(a, b));
+explain (costs off) select distinct * from select_distinct_a a, select_distinct_b b;
+ QUERY PLAN
+---------------------------------------------
+ Nested Loop
+ -> Seq Scan on select_distinct_a a
+ -> Materialize
+ -> Seq Scan on select_distinct_b b
+(4 rows)
+
+explain (costs off) select distinct a.b, a.c, b.a, b.b from select_distinct_a a, select_distinct_b b;
+ QUERY PLAN
+---------------------------------------------------------
+ Unique
+ -> Sort
+ Sort Key: a.b, a.c, b.a, b.b
+ -> Nested Loop
+ -> Seq Scan on select_distinct_a a
+ -> Materialize
+ -> Seq Scan on select_distinct_b b
+(7 rows)
+
+explain (costs off) select distinct a.d, a.c, b.a, b.b from select_distinct_a a, select_distinct_b b where a.d is not null;
+ QUERY PLAN
+---------------------------------------------
+ Nested Loop
+ -> Seq Scan on select_distinct_b b
+ -> Materialize
+ -> Seq Scan on select_distinct_a a
+ Filter: (d IS NOT NULL)
+(5 rows)
+
+explain (costs off) select distinct a.d, b.a from select_distinct_a a, select_distinct_b b group by a.d, b.a;
+ QUERY PLAN
+---------------------------------------------------
+ HashAggregate
+ Group Key: a.d, b.a
+ -> Nested Loop
+ -> Seq Scan on select_distinct_a a
+ -> Materialize
+ -> Seq Scan on select_distinct_b b
+(6 rows)
+
+drop table select_distinct_a;
+drop table select_distinct_b;
diff --git a/src/test/regress/expected/sysviews.out b/src/test/regress/expected/sysviews.out
index a1c90eb905..e053214f9d 100644
--- a/src/test/regress/expected/sysviews.out
+++ b/src/test/regress/expected/sysviews.out
@@ -73,6 +73,7 @@ select name, setting from pg_settings where name like 'enable%';
name | setting
--------------------------------+---------
enable_bitmapscan | on
+ enable_distinct_elimination | on
enable_gathermerge | on
enable_hashagg | on
enable_hashjoin | on
@@ -89,7 +90,7 @@ select name, setting from pg_settings where name like 'enable%';
enable_seqscan | on
enable_sort | on
enable_tidscan | on
-(17 rows)
+(18 rows)
-- Test that the pg_timezone_names and pg_timezone_abbrevs views are
-- more-or-less working. We can't test their contents in any great detail
diff --git a/src/test/regress/sql/select_distinct_2.sql b/src/test/regress/sql/select_distinct_2.sql
new file mode 100644
index 0000000000..d236aa9168
--- /dev/null
+++ b/src/test/regress/sql/select_distinct_2.sql
@@ -0,0 +1,31 @@
+create table select_distinct_a(a int, b char(20), c char(20) not null, d int, e int, primary key(a, b));
+
+set enable_mergejoin to off;
+set enable_hashjoin to off;
+
+-- no node for distinct.
+explain (costs off) select distinct * from select_distinct_a;
+
+explain (costs off) select distinct b,c,d,e from select_distinct_a;
+
+create unique index select_distinct_a_uk on select_distinct_a(c, d);
+
+explain (costs off) select distinct b,c,d,e from select_distinct_a where c is not null;
+
+explain (costs off) select distinct b,c,d,e from select_distinct_a where c is not null and d is not null;
+
+explain select distinct d, e from select_distinct_a group by d, e;
+
+
+create table select_distinct_b(a int, b char(20), c char(20) not null, d int, e int, primary key(a, b));
+
+explain (costs off) select distinct * from select_distinct_a a, select_distinct_b b;
+
+explain (costs off) select distinct a.b, a.c, b.a, b.b from select_distinct_a a, select_distinct_b b;
+
+explain (costs off) select distinct a.d, a.c, b.a, b.b from select_distinct_a a, select_distinct_b b where a.d is not null;
+
+explain (costs off) select distinct a.d, b.a from select_distinct_a a, select_distinct_b b group by a.d, b.a;
+
+drop table select_distinct_a;
+drop table select_distinct_b;
--
2.20.1 (Apple Git-117)
Updated the patch to consider semi/anti joins.
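For example (a sketch with hypothetical tables): in a semi join the right
table cannot duplicate rows coming from the left side, so only the left
side has to be provably unique:

create table t1 (a int, b int, primary key (a, b));
create table t2 (a int primary key);
-- t2 is only semi-joined via IN, so it cannot duplicate t1's rows;
-- selecting t1's whole PK makes the DISTINCT removable:
explain (costs off)
select distinct a, b from t1 where a in (select a from t2);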
Can anyone help to review this patch?
Thanks
Attachments:
0001-Erase-the-distinctClause-if-the-result-is-unique-by-.patch (application/octet-stream)
From 32c220b1a12b3c622553ed0fc93f8556619e020f Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=E4=B8=80=E6=8C=83?= <yizhi.fzh@alibaba-inc.com>
Date: Fri, 31 Jan 2020 19:38:05 +0800
Subject: [PATCH] Erase the distinctClause if the result is unique by
definition
For a single relation, we can tell this if any one of the following
is true:
1. The pk is in the target list.
2. The uk is in the target list and its columns are not null.
3. The columns in the group-by clause are also in the target list.
For a relation join, we can tell it by:
if every relation in the jointree yields a unique result set, then
the final result is unique as well, regardless of the join method.
For semi/anti joins, we will ignore the right table.
---
src/backend/nodes/bitmapset.c | 40 +++
src/backend/optimizer/path/costsize.c | 1 +
src/backend/optimizer/plan/planner.c | 292 ++++++++++++++++++
src/backend/utils/cache/relcache.c | 23 ++
src/backend/utils/misc/guc.c | 10 +
src/include/nodes/bitmapset.h | 2 +
src/include/optimizer/cost.h | 1 +
src/include/utils/rel.h | 3 +
src/test/regress/expected/join.out | 16 +-
.../regress/expected/select_distinct_2.out | 141 +++++++++
src/test/regress/expected/sysviews.out | 3 +-
src/test/regress/sql/select_distinct_2.sql | 42 +++
12 files changed, 565 insertions(+), 9 deletions(-)
create mode 100644 src/test/regress/expected/select_distinct_2.out
create mode 100644 src/test/regress/sql/select_distinct_2.sql
diff --git a/src/backend/nodes/bitmapset.c b/src/backend/nodes/bitmapset.c
index 648cc1a7eb..76ce9b526e 100644
--- a/src/backend/nodes/bitmapset.c
+++ b/src/backend/nodes/bitmapset.c
@@ -1167,3 +1167,43 @@ bms_hash_value(const Bitmapset *a)
return DatumGetUInt32(hash_any((const unsigned char *) a->words,
(lastword + 1) * sizeof(bitmapword)));
}
+
+/*
+ * bms_array_copy --
+ *
+ * copy the bms data into newly palloc'd memory
+ */
+
+Bitmapset**
+bms_array_copy(Bitmapset **bms_array, int len)
+{
+ Bitmapset **res;
+ int i;
+ if (bms_array == NULL || len < 1)
+ return NULL;
+
+ res = palloc(sizeof(Bitmapset*) * len);
+ for(i = 0; i < len; i++)
+ {
+ res[i] = bms_copy(bms_array[i]);
+ }
+ return res;
+}
+
+/*
+ * bms_array_free
+ *
+ * free each element in the array one by one, then free the array itself
+ */
+void
+bms_array_free(Bitmapset **bms_array, int len)
+{
+ int idx;
+ if (bms_array == NULL)
+ return;
+ for(idx = 0 ; idx < len; idx++)
+ {
+ bms_free(bms_array[idx]);
+ }
+ pfree(bms_array);
+}
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index b5a0033721..dde16b5d44 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -138,6 +138,7 @@ bool enable_partitionwise_aggregate = false;
bool enable_parallel_append = true;
bool enable_parallel_hash = true;
bool enable_partition_pruning = true;
+bool enable_distinct_elimination = true;
typedef struct
{
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index d6f2153593..6f7d85f96e 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -22,8 +22,10 @@
#include "access/htup_details.h"
#include "access/parallel.h"
#include "access/sysattr.h"
+#include "access/relation.h"
#include "access/table.h"
#include "access/xact.h"
+#include "catalog/index.h"
#include "catalog/pg_constraint.h"
#include "catalog/pg_inherits.h"
#include "catalog/pg_proc.h"
@@ -35,6 +37,7 @@
#include "lib/bipartite_match.h"
#include "lib/knapsack.h"
#include "miscadmin.h"
+#include "nodes/bitmapset.h"
#include "nodes/makefuncs.h"
#include "nodes/nodeFuncs.h"
#ifdef OPTIMIZER_DEBUG
@@ -248,6 +251,7 @@ static bool group_by_has_partkey(RelOptInfo *input_rel,
List *targetList,
List *groupClause);
static int common_prefix_cmp(const void *a, const void *b);
+static void preprocess_distinct_node(PlannerInfo *root);
/*****************************************************************************
@@ -989,6 +993,9 @@ subquery_planner(PlannerGlobal *glob, Query *parse,
/* Remove any redundant GROUP BY columns */
remove_useless_groupby_columns(root);
+ if (enable_distinct_elimination)
+ preprocess_distinct_node(root);
+
/*
* If we have any outer joins, try to reduce them to plain inner joins.
* This step is most easily done after we've done expression
@@ -7409,3 +7416,288 @@ group_by_has_partkey(RelOptInfo *input_rel,
return true;
}
+
+/*
+ * is_unique_result_already
+ *
+ * Given a relation, we know its primary key and unique key information.
+ * unique_target is the set of columns in the distinct/distinct-on target
+ * list. not_null_columns is the union of not-null columns known from the
+ * catalog and the quals. We can then tell that the result is unique before
+ * executing the query if the primary key, or a unique key whose columns
+ * are all not null, is contained in the target list.
+ */
+static bool
+is_unique_result_already(Relation relation,
+ Bitmapset *unique_target,
+ Bitmapset *not_null_columns)
+{
+ int i;
+ Bitmapset *pkattr = RelationGetIndexAttrBitmap(relation,
+ INDEX_ATTR_BITMAP_PRIMARY_KEY);
+
+ /*
+ * if the pk is in the target list,
+ * the result set is unique for this relation
+ */
+ if (pkattr != NULL &&
+ !bms_is_empty(pkattr) &&
+ bms_is_subset(pkattr, unique_target))
+ {
+ return true;
+ }
+
+ /*
+ * If the columns of a plain unique index are all in the target list
+ * and are all known not null, the result set is unique as well.
+ */
+ for (i = 0; i < relation->rd_plain_ukcount; i++)
+ {
+ Bitmapset *ukattr = relation->rd_plain_ukattrs[i];
+ if (!bms_is_empty(ukattr)
+ && bms_is_subset(ukattr, unique_target)
+ && bms_is_subset(ukattr, not_null_columns))
+ return true;
+ }
+
+ return false;
+}
+
+
+/*
+ * scan_non_semi_anti_relids
+ *
+ * scan the jointree to collect the rtindexes of relations not on the right side of a semi/anti join.
+ */
+static void
+scan_non_semi_anti_relids(Node* jtnode, Relids* relids)
+{
+ if (jtnode == NULL)
+ return;
+
+ if (IsA(jtnode, RangeTblRef))
+ {
+ int varno = ((RangeTblRef *) jtnode)->rtindex;
+
+ *relids = bms_add_member(*relids, varno);
+ }
+ else if (IsA(jtnode, FromExpr))
+ {
+ FromExpr *f = (FromExpr *) jtnode;
+ ListCell *l;
+
+ foreach(l, f->fromlist)
+ scan_non_semi_anti_relids(lfirst(l), relids);
+ }
+ else if (IsA(jtnode, JoinExpr))
+ {
+ JoinExpr *j = (JoinExpr *) jtnode;
+
+ scan_non_semi_anti_relids(j->larg, relids);
+ if (j->jointype != JOIN_SEMI && j->jointype != JOIN_ANTI)
+ {
+ scan_non_semi_anti_relids(j->rarg, relids);
+ }
+ }
+ else
+ elog(ERROR, "unrecognized node type: %d",
+ (int) nodeTag(jtnode));
+
+}
+
+/*
+ * preprocess_distinct_node
+ *
+ * remove the distinctClause if it is not necessary by definition
+ */
+static void
+preprocess_distinct_node(PlannerInfo *root)
+{
+ Query *query = root->parse;
+ ListCell *lc;
+ int num_of_rtables;
+ Bitmapset **targetlist_by_table = NULL;
+ Bitmapset **notnullcolumns = NULL;
+ Index rel_idx = 0;
+ bool should_distinct_elimination = false;
+ Relids non_semi_anti_relids = NULL;
+
+ if (query->distinctClause == NIL)
+ return;
+
+ scan_non_semi_anti_relids((Node*)query->jointree, &non_semi_anti_relids);
+
+ foreach(lc, query->rtable)
+ {
+ RangeTblEntry *rte = lfirst_node(RangeTblEntry, lc);
+ rel_idx++;
+ if (!bms_is_member(rel_idx, non_semi_anti_relids))
+ continue;
+ /* we only handle the basic Relation for now */
+ if (rte->rtekind != RTE_RELATION)
+ return;
+ }
+
+ num_of_rtables = list_length(query->rtable);
+
+ /*
+ * If the columns in the group clause are all in the target list,
+ * we don't need distinct.
+ */
+ if (query->groupClause != NIL)
+ {
+ Bitmapset *groupclause_bm = NULL;
+ Bitmapset *groupclause_in_targetlist_bm = NULL;
+ ListCell *lc;
+ foreach(lc, query->groupClause)
+ groupclause_bm = bms_add_member(groupclause_bm,
+ lfirst_node(SortGroupClause, lc)->tleSortGroupRef);
+
+ foreach(lc, query->targetList)
+ {
+ TargetEntry *te = lfirst_node(TargetEntry, lc);
+ if (te->resjunk)
+ continue;
+ groupclause_in_targetlist_bm = bms_add_member(groupclause_in_targetlist_bm,
+ te->ressortgroupref);
+ }
+
+ should_distinct_elimination = bms_is_subset(groupclause_bm,
+ groupclause_in_targetlist_bm);
+ bms_free(groupclause_bm);
+ bms_free(groupclause_in_targetlist_bm);
+ if (should_distinct_elimination)
+ goto ret;
+ }
+
+ targetlist_by_table = palloc0(sizeof(Bitmapset*) * num_of_rtables);
+ notnullcolumns = palloc0(sizeof(Bitmapset* ) * num_of_rtables);
+
+ /* build the targetlist_by_table */
+ foreach(lc, query->targetList)
+ {
+ TargetEntry *te = lfirst_node(TargetEntry, lc);
+ Expr *expr = te->expr;
+ Var *var;
+ Bitmapset **target_column_per_rel;
+ int target_attno;
+
+ if (!IsA(expr, Var))
+ continue;
+
+ var = (Var *)(expr);
+ if (var->varlevelsup != 0)
+ continue;
+
+ target_column_per_rel = &targetlist_by_table[var->varno - 1];
+ target_attno = var->varattno - FirstLowInvalidHeapAttributeNumber;
+
+ /*
+ * For DISTINCT ON (..), we only count the fields inside the
+ * parentheses rather than all the entries in the target list.
+ */
+ if (query->hasDistinctOn)
+ {
+ Index ref = te->ressortgroupref;
+ ListCell *lc;
+
+ /*
+ * A fastpath to know if the targetEntry is in the distinctClause
+ */
+ if (ref == 0)
+ continue;
+
+ /*
+ * Even if the ref is not zero, it may come from the sort
+ * clause as well, so we need to double-check.
+ */
+ foreach(lc, query->distinctClause)
+ {
+ if (ref == lfirst_node(SortGroupClause, lc)->tleSortGroupRef)
+ *target_column_per_rel = bms_add_member(*target_column_per_rel,
+ target_attno);
+ }
+ }
+ else
+ *target_column_per_rel = bms_add_member(*target_column_per_rel,
+ target_attno);
+ }
+
+ /* find out nonnull columns from qual via find_nonnullable_vars */
+ foreach(lc, find_nonnullable_vars(query->jointree->quals))
+ {
+ Var *not_null_var;
+ Bitmapset **notnullcolumns_per_rel;
+ int notnull_attno;
+ if (!IsA(lfirst(lc), Var))
+ continue;
+ not_null_var = lfirst_node(Var, lc);
+ if (not_null_var->varno == INNER_VAR ||
+ not_null_var->varno == OUTER_VAR ||
+ not_null_var->varno == INDEX_VAR)
+ continue;
+ notnullcolumns_per_rel = &notnullcolumns[not_null_var->varno - 1];
+ notnull_attno = not_null_var->varattno - FirstLowInvalidHeapAttributeNumber;
+ *notnullcolumns_per_rel = bms_add_member(*notnullcolumns_per_rel,
+ notnull_attno);
+ }
+
+ /* Check if each related rtable can yield a unique result set */
+ rel_idx = 0;
+ foreach(lc, query->rtable)
+ {
+ Relation relation;
+ TupleDesc desc;
+ RangeTblEntry *rte;
+ int attr_idx;
+
+ if (!bms_is_member(rel_idx+1, non_semi_anti_relids))
+ {
+ /* keep rel_idx in step with the rtable position even when skipping */
+ rel_idx++;
+ continue;
+ }
+
+ rte = lfirst_node(RangeTblEntry, lc);
+ Assert(rte->rtekind == RTE_RELATION);
+ Assert(rte->relid != InvalidOid);
+
+ relation = relation_open(rte->relid, RowExclusiveLock);
+ desc = relation->rd_att;
+ attr_idx = 0;
+
+ /* Add the notnullcolumns based on catalog */
+ for(; attr_idx < desc->natts; attr_idx++)
+ {
+ int notnull_attno;
+ if (!desc->attrs[attr_idx].attnotnull)
+ continue;
+ notnull_attno = attr_idx + 1 - FirstLowInvalidHeapAttributeNumber;
+ notnullcolumns[rel_idx] = bms_add_member(notnullcolumns[rel_idx],
+ notnull_attno);
+ }
+
+ /* check non-nullability from quals; only plain "col is not null" is recognized now */
+ if (!is_unique_result_already(relation,
+ targetlist_by_table[rel_idx],
+ notnullcolumns[rel_idx]))
+ {
+ RelationClose(relation);
+ goto ret;
+ }
+ RelationClose(relation);
+ rel_idx++;
+ }
+
+ should_distinct_elimination = true;
+
+ ret:
+ bms_array_free(notnullcolumns, num_of_rtables);
+ bms_array_free(targetlist_by_table, num_of_rtables);
+ bms_free(non_semi_anti_relids);
+
+ if (should_distinct_elimination)
+ {
+ query->distinctClause = NIL;
+ query->hasDistinctOn = false;
+ }
+}
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index df025a5a30..d8a76a2273 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -2346,6 +2346,8 @@ RelationDestroyRelation(Relation relation, bool remember_tupdesc)
bms_free(relation->rd_keyattr);
bms_free(relation->rd_pkattr);
bms_free(relation->rd_idattr);
+ if (relation->rd_plain_ukattrs)
+ bms_array_free(relation->rd_plain_ukattrs, relation->rd_plain_ukcount);
if (relation->rd_pubactions)
pfree(relation->rd_pubactions);
if (relation->rd_options)
@@ -4762,6 +4764,7 @@ RelationGetIndexAttrBitmap(Relation relation, IndexAttrBitmapKind attrKind)
Bitmapset *indexattrs; /* indexed columns */
Bitmapset *uindexattrs; /* columns in unique indexes */
Bitmapset *pkindexattrs; /* columns in the primary index */
+ Bitmapset **ukindexattrs = NULL; /* columns in the unique indexes */
Bitmapset *idindexattrs; /* columns in the replica identity */
List *indexoidlist;
List *newindexoidlist;
@@ -4769,6 +4772,7 @@ RelationGetIndexAttrBitmap(Relation relation, IndexAttrBitmapKind attrKind)
Oid relreplindex;
ListCell *l;
MemoryContext oldcxt;
+ int plain_uk_index_count = 0, index_count = 0, indexno = 0;
/* Quick exit if we already computed the result. */
if (relation->rd_indexattr != NULL)
@@ -4826,6 +4830,9 @@ restart:
uindexattrs = NULL;
pkindexattrs = NULL;
idindexattrs = NULL;
+ index_count = list_length(indexoidlist);
+ ukindexattrs = palloc0(sizeof(Bitmapset *) * index_count);
+
foreach(l, indexoidlist)
{
Oid indexOid = lfirst_oid(l);
@@ -4875,6 +4882,9 @@ restart:
/* Is this index the configured (or default) replica identity? */
isIDKey = (indexOid == relreplindex);
+ if (isKey)
+ plain_uk_index_count++;
+
/* Collect simple attribute references */
for (i = 0; i < indexDesc->rd_index->indnatts; i++)
{
@@ -4904,6 +4914,11 @@ restart:
if (isIDKey && i < indexDesc->rd_index->indnkeyatts)
idindexattrs = bms_add_member(idindexattrs,
attrnum - FirstLowInvalidHeapAttributeNumber);
+
+ if (isKey)
+ ukindexattrs[indexno] = bms_add_member(ukindexattrs[indexno],
+ attrnum - FirstLowInvalidHeapAttributeNumber);
+
}
}
@@ -4914,6 +4929,7 @@ restart:
pull_varattnos(indexPredicate, 1, &indexattrs);
index_close(indexDesc, AccessShareLock);
+ indexno++;
}
/*
@@ -4940,6 +4956,7 @@ restart:
bms_free(pkindexattrs);
bms_free(idindexattrs);
bms_free(indexattrs);
+ bms_array_free(ukindexattrs, index_count);
goto restart;
}
@@ -4953,6 +4970,8 @@ restart:
relation->rd_pkattr = NULL;
bms_free(relation->rd_idattr);
relation->rd_idattr = NULL;
+ bms_array_free(relation->rd_plain_ukattrs, relation->rd_plain_ukcount);
+ relation->rd_plain_ukattrs = NULL;
/*
* Now save copies of the bitmaps in the relcache entry. We intentionally
@@ -4966,6 +4985,8 @@ restart:
relation->rd_pkattr = bms_copy(pkindexattrs);
relation->rd_idattr = bms_copy(idindexattrs);
relation->rd_indexattr = bms_copy(indexattrs);
+ relation->rd_plain_ukattrs = bms_array_copy(ukindexattrs, index_count);
+ relation->rd_plain_ukcount = plain_uk_index_count;
MemoryContextSwitchTo(oldcxt);
/* We return our original working copy for caller to play with */
@@ -5618,6 +5639,8 @@ load_relcache_init_file(bool shared)
rel->rd_keyattr = NULL;
rel->rd_pkattr = NULL;
rel->rd_idattr = NULL;
+ rel->rd_plain_ukattrs = NULL;
+ rel->rd_plain_ukcount = 0;
rel->rd_pubactions = NULL;
rel->rd_statvalid = false;
rel->rd_statlist = NIL;
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index e44f71e991..fa798dd564 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -1064,6 +1064,16 @@ static struct config_bool ConfigureNamesBool[] =
true,
NULL, NULL, NULL
},
+ {
+ {"enable_distinct_elimination", PGC_USERSET, QUERY_TUNING_METHOD,
+ gettext_noop("Enables plan-time and run-time unique elimination."),
+ gettext_noop("Allows the query planner to remove the uncecessary distinct clause."),
+ GUC_EXPLAIN
+ },
+ &enable_distinct_elimination,
+ true,
+ NULL, NULL, NULL
+ },
{
{"geqo", PGC_USERSET, QUERY_TUNING_GEQO,
gettext_noop("Enables genetic query optimization."),
diff --git a/src/include/nodes/bitmapset.h b/src/include/nodes/bitmapset.h
index b7b18a0b68..ff30feb521 100644
--- a/src/include/nodes/bitmapset.h
+++ b/src/include/nodes/bitmapset.h
@@ -117,4 +117,6 @@ extern int bms_prev_member(const Bitmapset *a, int prevbit);
/* support for hashtables using Bitmapsets as keys: */
extern uint32 bms_hash_value(const Bitmapset *a);
+extern Bitmapset **bms_array_copy(Bitmapset **bms_array, int len);
+extern void bms_array_free(Bitmapset **bms_array, int len);
#endif /* BITMAPSET_H */
diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h
index cb012ba198..4fa5d32df6 100644
--- a/src/include/optimizer/cost.h
+++ b/src/include/optimizer/cost.h
@@ -64,6 +64,7 @@ extern PGDLLIMPORT bool enable_partitionwise_aggregate;
extern PGDLLIMPORT bool enable_parallel_append;
extern PGDLLIMPORT bool enable_parallel_hash;
extern PGDLLIMPORT bool enable_partition_pruning;
+extern PGDLLIMPORT bool enable_distinct_elimination;
extern PGDLLIMPORT int constraint_exclusion;
extern double index_pages_fetched(double tuples_fetched, BlockNumber pages,
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 44ed04dd3f..7c5a6d65b6 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -120,6 +120,9 @@ typedef struct RelationData
Bitmapset *rd_indexattr; /* identifies columns used in indexes */
Bitmapset *rd_keyattr; /* cols that can be ref'd by foreign keys */
Bitmapset *rd_pkattr; /* cols included in primary key */
+ Bitmapset **rd_plain_ukattrs; /* cols included in plain unique indexes;
+ only non-expression, non-partial indexes are counted */
+ int rd_plain_ukcount; /* number of plain unique indexes */
Bitmapset *rd_idattr; /* included in replica identity index */
PublicationActions *rd_pubactions; /* publication actions */
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index 761376b007..3f6595d53b 100644
--- a/src/test/regress/expected/join.out
+++ b/src/test/regress/expected/join.out
@@ -4433,17 +4433,17 @@ select d.* from d left join (select * from b group by b.id, b.c_id) s
explain (costs off)
select d.* from d left join (select distinct * from b) s
on d.a = s.id;
- QUERY PLAN
---------------------------------------
- Merge Right Join
- Merge Cond: (b.id = d.a)
- -> Unique
- -> Sort
- Sort Key: b.id, b.c_id
- -> Seq Scan on b
+ QUERY PLAN
+---------------------------------
+ Merge Left Join
+ Merge Cond: (d.a = s.id)
-> Sort
Sort Key: d.a
-> Seq Scan on d
+ -> Sort
+ Sort Key: s.id
+ -> Subquery Scan on s
+ -> Seq Scan on b
(9 rows)
-- check join removal works when uniqueness of the join condition is enforced
diff --git a/src/test/regress/expected/select_distinct_2.out b/src/test/regress/expected/select_distinct_2.out
new file mode 100644
index 0000000000..9c0dd564c8
--- /dev/null
+++ b/src/test/regress/expected/select_distinct_2.out
@@ -0,0 +1,141 @@
+create table select_distinct_a(a int, b char(20), c char(20) not null, d int, e int, primary key(a, b));
+set enable_mergejoin to off;
+set enable_hashjoin to off;
+-- no node for distinct.
+explain (costs off) select distinct * from select_distinct_a;
+ QUERY PLAN
+-------------------------------
+ Seq Scan on select_distinct_a
+(1 row)
+
+explain (costs off) select distinct b,c,d,e from select_distinct_a;
+ QUERY PLAN
+-------------------------------------
+ HashAggregate
+ Group Key: b, c, d, e
+ -> Seq Scan on select_distinct_a
+(3 rows)
+
+create unique index select_distinct_a_uk on select_distinct_a(c, d);
+explain (costs off) select distinct b,c,d,e from select_distinct_a where c is not null;
+ QUERY PLAN
+-------------------------------------
+ HashAggregate
+ Group Key: b, c, d, e
+ -> Seq Scan on select_distinct_a
+ Filter: (c IS NOT NULL)
+(4 rows)
+
+explain (costs off) select distinct b,c,d,e from select_distinct_a where c is not null and d is not null;
+ QUERY PLAN
+-------------------------------------------------
+ Seq Scan on select_distinct_a
+ Filter: ((c IS NOT NULL) AND (d IS NOT NULL))
+(2 rows)
+
+explain select distinct d, e from select_distinct_a group by d, e;
+ QUERY PLAN
+--------------------------------------------------------------------------
+ HashAggregate (cost=15.85..17.85 rows=200 width=8)
+ Group Key: d, e
+ -> Seq Scan on select_distinct_a (cost=0.00..13.90 rows=390 width=8)
+(3 rows)
+
+create table select_distinct_b(a int, b char(20), c char(20) not null, d int, e int, primary key(a, b));
+explain (costs off) select distinct * from select_distinct_a a, select_distinct_b b;
+ QUERY PLAN
+---------------------------------------------
+ Nested Loop
+ -> Seq Scan on select_distinct_a a
+ -> Materialize
+ -> Seq Scan on select_distinct_b b
+(4 rows)
+
+explain (costs off) select distinct a.b, a.c, b.a, b.b from select_distinct_a a, select_distinct_b b;
+ QUERY PLAN
+---------------------------------------------------------
+ Unique
+ -> Sort
+ Sort Key: a.b, a.c, b.a, b.b
+ -> Nested Loop
+ -> Seq Scan on select_distinct_a a
+ -> Materialize
+ -> Seq Scan on select_distinct_b b
+(7 rows)
+
+explain (costs off) select distinct a.d, a.c, b.a, b.b from select_distinct_a a, select_distinct_b b where a.d is not null;
+ QUERY PLAN
+---------------------------------------------
+ Nested Loop
+ -> Seq Scan on select_distinct_b b
+ -> Materialize
+ -> Seq Scan on select_distinct_a a
+ Filter: (d IS NOT NULL)
+(5 rows)
+
+explain (costs off) select distinct a.d, b.a from select_distinct_a a, select_distinct_b b group by a.d, b.a;
+ QUERY PLAN
+---------------------------------------------------
+ HashAggregate
+ Group Key: a.d, b.a
+ -> Nested Loop
+ -> Seq Scan on select_distinct_a a
+ -> Materialize
+ -> Seq Scan on select_distinct_b b
+(6 rows)
+
+explain (costs off) select distinct a, b from select_distinct_a where a in (select a from select_distinct_b);
+ QUERY PLAN
+-------------------------------------------------------------------------
+ Nested Loop Semi Join
+ -> Seq Scan on select_distinct_a
+ -> Index Only Scan using select_distinct_b_pkey on select_distinct_b
+ Index Cond: (a = select_distinct_a.a)
+(4 rows)
+
+explain (costs off) select distinct a, b from select_distinct_a where a not in (select a from select_distinct_b);
+ QUERY PLAN
+---------------------------------------
+ Seq Scan on select_distinct_a
+ Filter: (NOT (hashed SubPlan 1))
+ SubPlan 1
+ -> Seq Scan on select_distinct_b
+(4 rows)
+
+explain (costs off) select distinct * from select_distinct_a a, (select a, max(b) as b from select_distinct_b group by a) b
+where a.a in (select a from select_distinct_b)
+and a.b = b.b;
+ QUERY PLAN
+-------------------------------------------------------------------------------------------------
+ Unique
+ -> Sort
+ Sort Key: a.a, a.b, a.c, a.d, a.e, select_distinct_b_1.a
+ -> Nested Loop
+ Join Filter: (a.b = (max(select_distinct_b_1.b)))
+ -> HashAggregate
+ Group Key: select_distinct_b_1.a
+ -> Seq Scan on select_distinct_b select_distinct_b_1
+ -> Materialize
+ -> Nested Loop Semi Join
+ -> Seq Scan on select_distinct_a a
+ -> Index Only Scan using select_distinct_b_pkey on select_distinct_b
+ Index Cond: (a = a.a)
+(13 rows)
+
+explain (costs off) select distinct on(a) a, b from select_distinct_a;
+ QUERY PLAN
+-------------------------------------------
+ Unique
+ -> Sort
+ Sort Key: a
+ -> Seq Scan on select_distinct_a
+(4 rows)
+
+explain (costs off) select distinct on(a, b) a, b from select_distinct_a;
+ QUERY PLAN
+-------------------------------
+ Seq Scan on select_distinct_a
+(1 row)
+
+drop table select_distinct_a;
+drop table select_distinct_b;
diff --git a/src/test/regress/expected/sysviews.out b/src/test/regress/expected/sysviews.out
index a1c90eb905..e053214f9d 100644
--- a/src/test/regress/expected/sysviews.out
+++ b/src/test/regress/expected/sysviews.out
@@ -73,6 +73,7 @@ select name, setting from pg_settings where name like 'enable%';
name | setting
--------------------------------+---------
enable_bitmapscan | on
+ enable_distinct_elimination | on
enable_gathermerge | on
enable_hashagg | on
enable_hashjoin | on
@@ -89,7 +90,7 @@ select name, setting from pg_settings where name like 'enable%';
enable_seqscan | on
enable_sort | on
enable_tidscan | on
-(17 rows)
+(18 rows)
-- Test that the pg_timezone_names and pg_timezone_abbrevs views are
-- more-or-less working. We can't test their contents in any great detail
diff --git a/src/test/regress/sql/select_distinct_2.sql b/src/test/regress/sql/select_distinct_2.sql
new file mode 100644
index 0000000000..cad8d40dc6
--- /dev/null
+++ b/src/test/regress/sql/select_distinct_2.sql
@@ -0,0 +1,42 @@
+create table select_distinct_a(a int, b char(20), c char(20) not null, d int, e int, primary key(a, b));
+
+set enable_mergejoin to off;
+set enable_hashjoin to off;
+
+-- no node for distinct.
+explain (costs off) select distinct * from select_distinct_a;
+
+explain (costs off) select distinct b,c,d,e from select_distinct_a;
+
+create unique index select_distinct_a_uk on select_distinct_a(c, d);
+
+explain (costs off) select distinct b,c,d,e from select_distinct_a where c is not null;
+
+explain (costs off) select distinct b,c,d,e from select_distinct_a where c is not null and d is not null;
+
+explain select distinct d, e from select_distinct_a group by d, e;
+
+
+create table select_distinct_b(a int, b char(20), c char(20) not null, d int, e int, primary key(a, b));
+
+explain (costs off) select distinct * from select_distinct_a a, select_distinct_b b;
+
+explain (costs off) select distinct a.b, a.c, b.a, b.b from select_distinct_a a, select_distinct_b b;
+
+explain (costs off) select distinct a.d, a.c, b.a, b.b from select_distinct_a a, select_distinct_b b where a.d is not null;
+
+explain (costs off) select distinct a.d, b.a from select_distinct_a a, select_distinct_b b group by a.d, b.a;
+
+explain (costs off) select distinct a, b from select_distinct_a where a in (select a from select_distinct_b);
+
+explain (costs off) select distinct a, b from select_distinct_a where a not in (select a from select_distinct_b);
+
+explain (costs off) select distinct * from select_distinct_a a, (select a, max(b) as b from select_distinct_b group by a) b
+where a.a in (select a from select_distinct_b)
+and a.b = b.b;
+
+explain (costs off) select distinct on(a) a, b from select_distinct_a;
+explain (costs off) select distinct on(a, b) a, b from select_distinct_a;
+
+drop table select_distinct_a;
+drop table select_distinct_b;
--
2.20.1 (Apple Git-117)
Hi Andy,
What might help is to add more description to your email message, like
giving examples to explain your idea.
Anyway, I looked at the test cases you added. For example:
+create table select_distinct_a(a int, b char(20), c char(20) not null, d int, e int, primary key(a, b));
+set enable_mergejoin to off;
+set enable_hashjoin to off;
+-- no node for distinct.
+explain (costs off) select distinct * from select_distinct_a;
+ QUERY PLAN
+-------------------------------
+ Seq Scan on select_distinct_a
+(1 row)
From this example, it seems that the distinct operation can be dropped
because (a, b) is a primary key. Is my understanding correct?
I like the idea since it eliminates one expensive operation.
However, the patch as presented has some problems:
1. What happens if the primary key constraint or NOT NULL constraint gets
dropped between a prepare and an execute? The plan will no longer be valid
and thus execution may produce non-distinct results. PostgreSQL has a similar
concept of allowing non-grouping expressions as part of the targetlist when
those expressions can be proved to be functionally dependent on the GROUP
BY clause. See check_functional_grouping() and its caller. I think,
DISTINCT elimination should work on similar lines.
2. For the same reason described in check_functional_grouping(), using
unique indexes for eliminating DISTINCT should be discouraged.
3. If you could eliminate DISTINCT you could similarly eliminate GROUP BY
as well
4. The patch works only at the query level, but that functionality can be
expanded generally to other places which add Unique/HashAggregate/Group
nodes if the underlying relation can be proved to produce distinct rows.
But that's probably more work since we will have to label paths with unique
keys similar to pathkeys.
5. Have you tested this with OUTER joins, which can render the inner side nullable?
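To make point 5 concrete (a hypothetical sketch): a left join makes the
inner side nullable, so even when the inner relation's PK is in the target
list, the join can emit the same all-NULL value for it many times:

create table p (a int primary key);
create table q (a int primary key);
-- q.a is q's PK, but the left join produces NULL for q.a on every
-- unmatched p row, so the DISTINCT here must not be removed:
explain (costs off)
select distinct q.a from p left join q on p.a = q.a;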
--
Best Wishes,
Ashutosh Bapat
Hi Ashutosh:
Thanks for your time.
On Fri, Feb 7, 2020 at 11:54 PM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
wrote:
Hi Andy,
What might help is to add more description to your email message, like
giving examples to explain your idea. Anyway, I looked at the test cases
you added, for example:
explain (costs off) select distinct * from select_distinct_a;
-> Seq Scan on select_distinct_a
From this example, it seems that the distinct operation can be dropped
because (a, b) is a primary key. Is my understanding correct?
Yes, you are correct. Actually I added them to the commit message,
but it's true that I should have copied them into this email body as
well, so I'll copy them now.
[PATCH] Erase the distinctClause if the result is unique by
definition
For a single relation, we can tell this if any one of the following
is true:
1. The pk is in the target list.
2. The uk is in the target list and its columns are not null.
3. The columns in the group-by clause are also in the target list.
For a relation join, we can tell it by:
if every relation in the jointree yields a unique result set, then
the final result is unique as well, regardless of the join method.
For semi/anti joins, we will ignore the right table.
I like the idea since it eliminates one expensive operation.
However the patch as presented has some problems
1. What happens if the primary key constraint or NOT NULL constraint gets
dropped between a prepare and an execute? The plan will no longer be valid and
thus execution may produce non-distinct results.
Will this still be an issue if the user doesn't use a "read uncommitted"
isolation level? I suppose it should be OK in that case. Even so, I should
add an isolation level check for this; I just added that in the patch to
continue the discussion of this issue.
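A sketch of the hazard being discussed (hypothetical names; t_pkey is the
default name PostgreSQL gives the primary key constraint):

create table t (a int primary key, b int);
prepare q as select distinct a, b from t;
-- the plan may have dropped the DISTINCT, relying on the PK
alter table t drop constraint t_pkey;
-- unless the cached plan is invalidated, this could now return
-- duplicate rows:
execute q;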
PostgreSQL has a similar concept of allowing non-grouping expressions as part
of the targetlist when those expressions can be proved to be functionally
dependent on the GROUP BY clause. See check_functional_grouping() and its
caller. I think, DISTINCT elimination should work on similar lines.
2. For the same reason described in check_functional_grouping(), using
unique indexes for eliminating DISTINCT should be discouraged.
I checked the comments of check_functional_grouping; the reason is:
* Currently we only check to see if the rel has a primary key that is a
* subset of the grouping_columns. We could also use plain unique constraints
* if all their columns are known not null, but there's a problem: we need
* to be able to represent the not-null-ness as part of the constraints added
* to *constraintDeps. FIXME whenever not-null constraints get represented
* in pg_constraint.
Actually I am doubtful about the pg_constraint reasoning, since we are
still able to get the not-null information from
relation->rd_attr->attrs[n].attnotnull, which is just what this patch did.
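For what it's worth, the same not-null information is also visible at the
SQL level through pg_attribute, which is a quick way to see what
relation->rd_attr->attrs[n].attnotnull would report:

select attname, attnotnull
from pg_attribute
where attrelid = 'select_distinct_a'::regclass and attnum > 0;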
3. If you could eliminate DISTINCT you could similarly eliminate GROUP BY
as well
This is a good point. The rules may be somewhat different for joins, so I
prefer to focus on the current one for now.
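A sketch of the GROUP BY analogue (hypothetical table): if the whole PK
appears in the GROUP BY clause, every group holds exactly one row, so the
Group/HashAggregate node could in principle be elided the same way:

create table t (a int, b int, c int, primary key (a, b));
-- each (a, b) group contains exactly one row:
explain (costs off) select a, b from t group by a, b;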
4. The patch works only at the query level, but that functionality can be
expanded generally to other places which add Unique/HashAggregate/Group
nodes if the underlying relation can be proved to produce distinct rows.
But that's probably more work since we will have to label paths with unique
keys similar to pathkeys.
Do you mean adding some information into PlannerInfo, and when we create
a node for Unique/HashAggregate/Group, we can just create a dummy node?
5. Have you tested this with OUTER joins, which can render the inner side nullable?
Yes, that part was missed in the test cases. I just added them.
Show quoted text
On Thu, Feb 6, 2020 at 11:31 AM Andy Fan <zhihui.fan1213@gmail.com> wrote:
update the patch with considering the semi/anti join.
Can anyone help to review this patch?
Thanks
On Fri, Jan 31, 2020 at 8:39 PM Andy Fan <zhihui.fan1213@gmail.com> wrote:
--
Best Wishes,
Ashutosh Bapat
Attachments:
0001-Erase-the-distinctClause-if-the-result-is-unique-by-.patchapplication/octet-stream; name=0001-Erase-the-distinctClause-if-the-result-is-unique-by-.patchDownload
From 94edaece73503572d4489d41e6644efe1e13f652 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=E4=B8=80=E6=8C=83?= <yizhi.fzh@alibaba-inc.com>
Date: Fri, 31 Jan 2020 19:38:05 +0800
Subject: [PATCH] Erase the distinctClause if the result is unique by
definition
For a single relation, we can tell the result is unique if any one of
the following is true:
1. The primary key is in the target list.
2. A unique key is in the target list and its columns are not null.
3. The columns in the GROUP BY clause are also in the target list.
For a relation join, we can tell it as follows:
if every relation in the jointree yields a unique result set, then
the final result is unique as well, regardless of the join method.
For semi/anti joins, we don't have such a restriction on the right
table.
---
src/backend/nodes/bitmapset.c | 40 +++
src/backend/optimizer/path/costsize.c | 1 +
src/backend/optimizer/plan/planner.c | 292 ++++++++++++++++++
src/backend/utils/cache/relcache.c | 23 ++
src/backend/utils/misc/guc.c | 10 +
src/include/nodes/bitmapset.h | 2 +
src/include/optimizer/cost.h | 1 +
src/include/utils/rel.h | 3 +
src/test/regress/expected/join.out | 16 +-
.../regress/expected/select_distinct_2.out | 141 +++++++++
src/test/regress/expected/sysviews.out | 3 +-
src/test/regress/sql/select_distinct_2.sql | 42 +++
12 files changed, 565 insertions(+), 9 deletions(-)
create mode 100644 src/test/regress/expected/select_distinct_2.out
create mode 100644 src/test/regress/sql/select_distinct_2.sql
diff --git a/src/backend/nodes/bitmapset.c b/src/backend/nodes/bitmapset.c
index 648cc1a7eb..76ce9b526e 100644
--- a/src/backend/nodes/bitmapset.c
+++ b/src/backend/nodes/bitmapset.c
@@ -1167,3 +1167,43 @@ bms_hash_value(const Bitmapset *a)
return DatumGetUInt32(hash_any((const unsigned char *) a->words,
(lastword + 1) * sizeof(bitmapword)));
}
+
+/*
+ * bms_array_copy --
+ *
+ * copy the bms data in the newly palloc memory
+ */
+
+Bitmapset**
+bms_array_copy(Bitmapset **bms_array, int len)
+{
+ Bitmapset **res;
+ int i;
+ if (bms_array == NULL || len < 1)
+ return NULL;
+
+ res = palloc(sizeof(Bitmapset*) * len);
+ for(i = 0; i < len; i++)
+ {
+ res[i] = bms_copy(bms_array[i]);
+ }
+ return res;
+}
+
+/*
+ * bms_array_free
+ *
+ * free the element in the array one by one, free the array as well at last
+ */
+void
+bms_array_free(Bitmapset **bms_array, int len)
+{
+ int idx;
+ if (bms_array == NULL)
+ return;
+ for(idx = 0 ; idx < len; idx++)
+ {
+ bms_free(bms_array[idx]);
+ }
+ pfree(bms_array);
+}
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index b5a0033721..dde16b5d44 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -138,6 +138,7 @@ bool enable_partitionwise_aggregate = false;
bool enable_parallel_append = true;
bool enable_parallel_hash = true;
bool enable_partition_pruning = true;
+bool enable_distinct_elimination = true;
typedef struct
{
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index d6f2153593..6f7d85f96e 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -22,8 +22,10 @@
#include "access/htup_details.h"
#include "access/parallel.h"
#include "access/sysattr.h"
+#include "access/relation.h"
#include "access/table.h"
#include "access/xact.h"
+#include "catalog/index.h"
#include "catalog/pg_constraint.h"
#include "catalog/pg_inherits.h"
#include "catalog/pg_proc.h"
@@ -35,6 +37,7 @@
#include "lib/bipartite_match.h"
#include "lib/knapsack.h"
#include "miscadmin.h"
+#include "nodes/bitmapset.h"
#include "nodes/makefuncs.h"
#include "nodes/nodeFuncs.h"
#ifdef OPTIMIZER_DEBUG
@@ -248,6 +251,7 @@ static bool group_by_has_partkey(RelOptInfo *input_rel,
List *targetList,
List *groupClause);
static int common_prefix_cmp(const void *a, const void *b);
+static void preprocess_distinct_node(PlannerInfo *root);
/*****************************************************************************
@@ -989,6 +993,9 @@ subquery_planner(PlannerGlobal *glob, Query *parse,
/* Remove any redundant GROUP BY columns */
remove_useless_groupby_columns(root);
+ if (enable_distinct_elimination)
+ preprocess_distinct_node(root);
+
/*
* If we have any outer joins, try to reduce them to plain inner joins.
* This step is most easily done after we've done expression
@@ -7409,3 +7416,288 @@ group_by_has_partkey(RelOptInfo *input_rel,
return true;
}
+
+/*
+ * is_unique_result_already
+ *
+ * Given a relation, we know its primary key and unique key information.
+ * unique_target is the set of columns in the distinct/distinct-on target
+ * list; not_null_columns is the union of not-null columns derived from
+ * the catalog and the quals. We can prove the result is unique before
+ * executing the query if the primary key, or a unique key whose columns
+ * are all not null, is contained in the target list.
+ */
+static bool
+is_unique_result_already(Relation relation,
+ Bitmapset *unique_target,
+ Bitmapset *not_null_columns)
+{
+ int i;
+ Bitmapset *pkattr = RelationGetIndexAttrBitmap(relation,
+ INDEX_ATTR_BITMAP_PRIMARY_KEY);
+
+ /*
+ * if the pk is in the target list,
+ * the result set is unique for this relation
+ */
+ if (pkattr != NULL &&
+ !bms_is_empty(pkattr) &&
+ bms_is_subset(pkattr, unique_target))
+ {
+ return true;
+ }
+
+	/*
+	 * check whether any plain unique key is in the target list with all
+	 * of its columns known not null
+	 */
+ for (i = 0; i < relation->rd_plain_ukcount; i++)
+ {
+ Bitmapset *ukattr = relation->rd_plain_ukattrs[i];
+ if (!bms_is_empty(ukattr)
+ && bms_is_subset(ukattr, unique_target)
+ && bms_is_subset(ukattr, not_null_columns))
+ return true;
+ }
+
+	/*
+	 * Neither the primary key nor any suitable unique key covers the
+	 * target list, so we cannot prove the result unique for this relation.
+	 */
+
+ return false;
+}
+
+
+/*
+ * scan_non_semi_anti_relids
+ *
+ * scan jointree to get non-semi/anti join rtindex.
+ */
+static void
+scan_non_semi_anti_relids(Node* jtnode, Relids* relids)
+{
+ if (jtnode == NULL)
+ return;
+
+ if (IsA(jtnode, RangeTblRef))
+ {
+ int varno = ((RangeTblRef *) jtnode)->rtindex;
+
+ *relids = bms_add_member(*relids, varno);
+ }
+ else if (IsA(jtnode, FromExpr))
+ {
+ FromExpr *f = (FromExpr *) jtnode;
+ ListCell *l;
+
+ foreach(l, f->fromlist)
+ scan_non_semi_anti_relids(lfirst(l), relids);
+ }
+ else if (IsA(jtnode, JoinExpr))
+ {
+ JoinExpr *j = (JoinExpr *) jtnode;
+
+ scan_non_semi_anti_relids(j->larg, relids);
+ if (j->jointype != JOIN_SEMI && j->jointype != JOIN_ANTI)
+ {
+ scan_non_semi_anti_relids(j->rarg, relids);
+ }
+ }
+ else
+ elog(ERROR, "unrecognized node type: %d",
+ (int) nodeTag(jtnode));
+
+}
+
+/*
+ * preprocess_distinct_node
+ *
+ * remove the distinctClause if it is not necessary by definition
+ */
+static void
+preprocess_distinct_node(PlannerInfo *root)
+{
+ Query *query = root->parse;
+ ListCell *lc;
+ int num_of_rtables;
+ Bitmapset **targetlist_by_table = NULL;
+ Bitmapset **notnullcolumns = NULL;
+ Index rel_idx = 0;
+ bool should_distinct_elimination = false;
+ Relids non_semi_anti_relids = NULL;
+
+ if (query->distinctClause == NIL)
+ return;
+
+ scan_non_semi_anti_relids((Node*)query->jointree, &non_semi_anti_relids);
+
+ foreach(lc, query->rtable)
+ {
+ RangeTblEntry *rte = lfirst_node(RangeTblEntry, lc);
+ rel_idx++;
+ if (!bms_is_member(rel_idx, non_semi_anti_relids))
+ continue;
+ /* we only handle the basic Relation for now */
+ if (rte->rtekind != RTE_RELATION)
+ return;
+ }
+
+ num_of_rtables = bms_num_members(non_semi_anti_relids);
+
+	/*
+	 * If all the columns in the group clause are also in the target list,
+	 * we don't need distinct
+	 */
+ if (query->groupClause != NIL)
+ {
+ Bitmapset *groupclause_bm = NULL;
+ Bitmapset *groupclause_in_targetlist_bm = NULL;
+ ListCell *lc;
+ foreach(lc, query->groupClause)
+ groupclause_bm = bms_add_member(groupclause_bm,
+ lfirst_node(SortGroupClause, lc)->tleSortGroupRef);
+
+ foreach(lc, query->targetList)
+ {
+ TargetEntry *te = lfirst_node(TargetEntry, lc);
+ if (te->resjunk)
+ continue;
+ groupclause_in_targetlist_bm = bms_add_member(groupclause_in_targetlist_bm,
+ te->ressortgroupref);
+ }
+
+ should_distinct_elimination = bms_is_subset(groupclause_bm,
+ groupclause_in_targetlist_bm);
+ bms_free(groupclause_bm);
+ bms_free(groupclause_in_targetlist_bm);
+ if (should_distinct_elimination)
+ goto ret;
+ }
+
+ targetlist_by_table = palloc0(sizeof(Bitmapset*) * num_of_rtables);
+ notnullcolumns = palloc0(sizeof(Bitmapset* ) * num_of_rtables);
+
+ /* build the targetlist_by_table */
+ foreach(lc, query->targetList)
+ {
+ TargetEntry *te = lfirst_node(TargetEntry, lc);
+ Expr *expr = te->expr;
+ Var *var;
+ Bitmapset **target_column_per_rel;
+ int target_attno;
+
+ if (!IsA(expr, Var))
+ continue;
+
+ var = (Var *)(expr);
+ if (var->varlevelsup != 0)
+ continue;
+
+ target_column_per_rel = &targetlist_by_table[var->varno - 1];
+ target_attno = var->varattno - FirstLowInvalidHeapAttributeNumber;
+
+		/*
+		 * for DISTINCT ON (...), we only count the fields listed there
+		 * rather than all the entries in the target list
+		 */
+ if (query->hasDistinctOn)
+ {
+ Index ref = te->ressortgroupref;
+ ListCell *lc;
+
+ /*
+ * A fastpath to know if the targetEntry is in the distinctClause
+ */
+ if (ref == 0)
+ continue;
+
+			/*
+			 * Even if the ref is not zero, it may belong to the sort
+			 * clause instead, so we need to double check.
+			 */
+ foreach(lc, query->distinctClause)
+ {
+ if (ref == lfirst_node(SortGroupClause, lc)->tleSortGroupRef)
+ *target_column_per_rel = bms_add_member(*target_column_per_rel,
+ target_attno);
+ }
+ }
+ else
+ *target_column_per_rel = bms_add_member(*target_column_per_rel,
+ target_attno);
+ }
+
+ /* find out nonnull columns from qual via find_nonnullable_vars */
+ foreach(lc, find_nonnullable_vars(query->jointree->quals))
+ {
+ Var *not_null_var;
+ Bitmapset **notnullcolumns_per_rel;
+ int notnull_attno;
+ if (!IsA(lfirst(lc), Var))
+ continue;
+ not_null_var = lfirst_node(Var, lc);
+ if (not_null_var->varno == INNER_VAR ||
+ not_null_var->varno == OUTER_VAR ||
+ not_null_var->varno == INDEX_VAR)
+ continue;
+ notnullcolumns_per_rel = ¬nullcolumns[not_null_var->varno - 1];
+ notnull_attno = not_null_var->varattno - FirstLowInvalidHeapAttributeNumber;
+ *notnullcolumns_per_rel = bms_add_member(*notnullcolumns_per_rel,
+ notnull_attno);
+ }
+
+ /* Check if each related rtable can yield a unique result set */
+ rel_idx = 0;
+ foreach(lc, query->rtable)
+ {
+ Relation relation;
+ TupleDesc desc;
+ RangeTblEntry *rte;
+ int attr_idx;
+
+ if (!bms_is_member(rel_idx+1, non_semi_anti_relids))
+ continue;
+
+ rte = lfirst_node(RangeTblEntry, lc);
+ Assert(rte->rtekind == RTE_RELATION);
+ Assert(rte->relid != InvalidOid);
+
+ relation = relation_open(rte->relid, RowExclusiveLock);
+ desc = relation->rd_att;
+ attr_idx = 0;
+
+ /* Add the notnullcolumns based on catalog */
+ for(; attr_idx < desc->natts; attr_idx++)
+ {
+ int notnull_attno;
+ if (!desc->attrs[attr_idx].attnotnull)
+ continue;
+ notnull_attno = attr_idx + 1 - FirstLowInvalidHeapAttributeNumber;
+ notnullcolumns[rel_idx] = bms_add_member(notnullcolumns[rel_idx],
+ notnull_attno);
+ }
+
+		/* check whether this relation already yields a unique result set */
+ if (!is_unique_result_already(relation,
+ targetlist_by_table[rel_idx],
+ notnullcolumns[rel_idx]))
+ {
+ RelationClose(relation);
+ goto ret;
+ }
+ RelationClose(relation);
+ rel_idx++;
+ }
+
+ should_distinct_elimination = true;
+
+ ret:
+ bms_array_free(notnullcolumns, num_of_rtables);
+ bms_array_free(targetlist_by_table, num_of_rtables);
+ bms_free(non_semi_anti_relids);
+
+ if (should_distinct_elimination)
+ {
+ query->distinctClause = NIL;
+ query->hasDistinctOn = false;
+ }
+}
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index df025a5a30..d8a76a2273 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -2346,6 +2346,8 @@ RelationDestroyRelation(Relation relation, bool remember_tupdesc)
bms_free(relation->rd_keyattr);
bms_free(relation->rd_pkattr);
bms_free(relation->rd_idattr);
+ if (relation->rd_plain_ukattrs)
+ bms_array_free(relation->rd_plain_ukattrs, relation->rd_plain_ukcount);
if (relation->rd_pubactions)
pfree(relation->rd_pubactions);
if (relation->rd_options)
@@ -4762,6 +4764,7 @@ RelationGetIndexAttrBitmap(Relation relation, IndexAttrBitmapKind attrKind)
Bitmapset *indexattrs; /* indexed columns */
Bitmapset *uindexattrs; /* columns in unique indexes */
Bitmapset *pkindexattrs; /* columns in the primary index */
+ Bitmapset **ukindexattrs = NULL; /* columns in the unique indexes */
Bitmapset *idindexattrs; /* columns in the replica identity */
List *indexoidlist;
List *newindexoidlist;
@@ -4769,6 +4772,7 @@ RelationGetIndexAttrBitmap(Relation relation, IndexAttrBitmapKind attrKind)
Oid relreplindex;
ListCell *l;
MemoryContext oldcxt;
+ int plain_uk_index_count = 0, index_count = 0, indexno = 0;
/* Quick exit if we already computed the result. */
if (relation->rd_indexattr != NULL)
@@ -4826,6 +4830,9 @@ restart:
uindexattrs = NULL;
pkindexattrs = NULL;
idindexattrs = NULL;
+ index_count = list_length(indexoidlist);
+ ukindexattrs = palloc0(sizeof(Bitmapset *) * index_count);
+
foreach(l, indexoidlist)
{
Oid indexOid = lfirst_oid(l);
@@ -4875,6 +4882,9 @@ restart:
/* Is this index the configured (or default) replica identity? */
isIDKey = (indexOid == relreplindex);
+ if (isKey)
+ plain_uk_index_count++;
+
/* Collect simple attribute references */
for (i = 0; i < indexDesc->rd_index->indnatts; i++)
{
@@ -4904,6 +4914,11 @@ restart:
if (isIDKey && i < indexDesc->rd_index->indnkeyatts)
idindexattrs = bms_add_member(idindexattrs,
attrnum - FirstLowInvalidHeapAttributeNumber);
+
+ if (isKey)
+ ukindexattrs[indexno] = bms_add_member(ukindexattrs[indexno],
+ attrnum - FirstLowInvalidHeapAttributeNumber);
+
}
}
@@ -4914,6 +4929,7 @@ restart:
pull_varattnos(indexPredicate, 1, &indexattrs);
index_close(indexDesc, AccessShareLock);
+ indexno++;
}
/*
@@ -4940,6 +4956,7 @@ restart:
bms_free(pkindexattrs);
bms_free(idindexattrs);
bms_free(indexattrs);
+ bms_array_free(ukindexattrs, index_count);
goto restart;
}
@@ -4953,6 +4970,8 @@ restart:
relation->rd_pkattr = NULL;
bms_free(relation->rd_idattr);
relation->rd_idattr = NULL;
+ bms_array_free(relation->rd_plain_ukattrs, relation->rd_plain_ukcount);
+ relation->rd_plain_ukattrs = NULL;
/*
* Now save copies of the bitmaps in the relcache entry. We intentionally
@@ -4966,6 +4985,8 @@ restart:
relation->rd_pkattr = bms_copy(pkindexattrs);
relation->rd_idattr = bms_copy(idindexattrs);
relation->rd_indexattr = bms_copy(indexattrs);
+ relation->rd_plain_ukattrs = bms_array_copy(ukindexattrs, index_count);
+ relation->rd_plain_ukcount = plain_uk_index_count;
MemoryContextSwitchTo(oldcxt);
/* We return our original working copy for caller to play with */
@@ -5618,6 +5639,8 @@ load_relcache_init_file(bool shared)
rel->rd_keyattr = NULL;
rel->rd_pkattr = NULL;
rel->rd_idattr = NULL;
+ rel->rd_plain_ukattrs = NULL;
+ rel->rd_plain_ukcount = 0;
rel->rd_pubactions = NULL;
rel->rd_statvalid = false;
rel->rd_statlist = NIL;
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index e44f71e991..fa798dd564 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -1064,6 +1064,16 @@ static struct config_bool ConfigureNamesBool[] =
true,
NULL, NULL, NULL
},
+ {
+ {"enable_distinct_elimination", PGC_USERSET, QUERY_TUNING_METHOD,
+ gettext_noop("Enables plan-time and run-time unique elimination."),
+ gettext_noop("Allows the query planner to remove the uncecessary distinct clause."),
+ GUC_EXPLAIN
+ },
+ &enable_distinct_elimination,
+ true,
+ NULL, NULL, NULL
+ },
{
{"geqo", PGC_USERSET, QUERY_TUNING_GEQO,
gettext_noop("Enables genetic query optimization."),
diff --git a/src/include/nodes/bitmapset.h b/src/include/nodes/bitmapset.h
index b7b18a0b68..ff30feb521 100644
--- a/src/include/nodes/bitmapset.h
+++ b/src/include/nodes/bitmapset.h
@@ -117,4 +117,6 @@ extern int bms_prev_member(const Bitmapset *a, int prevbit);
/* support for hashtables using Bitmapsets as keys: */
extern uint32 bms_hash_value(const Bitmapset *a);
+extern Bitmapset **bms_array_copy(Bitmapset **bms_array, int len);
+extern void bms_array_free(Bitmapset **bms_array, int len);
#endif /* BITMAPSET_H */
diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h
index cb012ba198..4fa5d32df6 100644
--- a/src/include/optimizer/cost.h
+++ b/src/include/optimizer/cost.h
@@ -64,6 +64,7 @@ extern PGDLLIMPORT bool enable_partitionwise_aggregate;
extern PGDLLIMPORT bool enable_parallel_append;
extern PGDLLIMPORT bool enable_parallel_hash;
extern PGDLLIMPORT bool enable_partition_pruning;
+extern PGDLLIMPORT bool enable_distinct_elimination;
extern PGDLLIMPORT int constraint_exclusion;
extern double index_pages_fetched(double tuples_fetched, BlockNumber pages,
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 44ed04dd3f..7c5a6d65b6 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -120,6 +120,9 @@ typedef struct RelationData
Bitmapset *rd_indexattr; /* identifies columns used in indexes */
Bitmapset *rd_keyattr; /* cols that can be ref'd by foreign keys */
Bitmapset *rd_pkattr; /* cols included in primary key */
+	Bitmapset **rd_plain_ukattrs;	/* cols included in the plain unique indexes;
+									   only non-expression, non-partial indexes are counted */
+	int			rd_plain_ukcount;	/* number of plain unique indexes */
Bitmapset *rd_idattr; /* included in replica identity index */
PublicationActions *rd_pubactions; /* publication actions */
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index 761376b007..3f6595d53b 100644
--- a/src/test/regress/expected/join.out
+++ b/src/test/regress/expected/join.out
@@ -4433,17 +4433,17 @@ select d.* from d left join (select * from b group by b.id, b.c_id) s
explain (costs off)
select d.* from d left join (select distinct * from b) s
on d.a = s.id;
- QUERY PLAN
---------------------------------------
- Merge Right Join
- Merge Cond: (b.id = d.a)
- -> Unique
- -> Sort
- Sort Key: b.id, b.c_id
- -> Seq Scan on b
+ QUERY PLAN
+---------------------------------
+ Merge Left Join
+ Merge Cond: (d.a = s.id)
-> Sort
Sort Key: d.a
-> Seq Scan on d
+ -> Sort
+ Sort Key: s.id
+ -> Subquery Scan on s
+ -> Seq Scan on b
(9 rows)
-- check join removal works when uniqueness of the join condition is enforced
diff --git a/src/test/regress/expected/select_distinct_2.out b/src/test/regress/expected/select_distinct_2.out
new file mode 100644
index 0000000000..9c0dd564c8
--- /dev/null
+++ b/src/test/regress/expected/select_distinct_2.out
@@ -0,0 +1,141 @@
+create table select_distinct_a(a int, b char(20), c char(20) not null, d int, e int, primary key(a, b));
+set enable_mergejoin to off;
+set enable_hashjoin to off;
+-- no node for distinct.
+explain (costs off) select distinct * from select_distinct_a;
+ QUERY PLAN
+-------------------------------
+ Seq Scan on select_distinct_a
+(1 row)
+
+explain (costs off) select distinct b,c,d,e from select_distinct_a;
+ QUERY PLAN
+-------------------------------------
+ HashAggregate
+ Group Key: b, c, d, e
+ -> Seq Scan on select_distinct_a
+(3 rows)
+
+create unique index select_distinct_a_uk on select_distinct_a(c, d);
+explain (costs off) select distinct b,c,d,e from select_distinct_a where c is not null;
+ QUERY PLAN
+-------------------------------------
+ HashAggregate
+ Group Key: b, c, d, e
+ -> Seq Scan on select_distinct_a
+ Filter: (c IS NOT NULL)
+(4 rows)
+
+explain (costs off) select distinct b,c,d,e from select_distinct_a where c is not null and d is not null;
+ QUERY PLAN
+-------------------------------------------------
+ Seq Scan on select_distinct_a
+ Filter: ((c IS NOT NULL) AND (d IS NOT NULL))
+(2 rows)
+
+explain select distinct d, e from select_distinct_a group by d, e;
+ QUERY PLAN
+--------------------------------------------------------------------------
+ HashAggregate (cost=15.85..17.85 rows=200 width=8)
+ Group Key: d, e
+ -> Seq Scan on select_distinct_a (cost=0.00..13.90 rows=390 width=8)
+(3 rows)
+
+create table select_distinct_b(a int, b char(20), c char(20) not null, d int, e int, primary key(a, b));
+explain (costs off) select distinct * from select_distinct_a a, select_distinct_b b;
+ QUERY PLAN
+---------------------------------------------
+ Nested Loop
+ -> Seq Scan on select_distinct_a a
+ -> Materialize
+ -> Seq Scan on select_distinct_b b
+(4 rows)
+
+explain (costs off) select distinct a.b, a.c, b.a, b.b from select_distinct_a a, select_distinct_b b;
+ QUERY PLAN
+---------------------------------------------------------
+ Unique
+ -> Sort
+ Sort Key: a.b, a.c, b.a, b.b
+ -> Nested Loop
+ -> Seq Scan on select_distinct_a a
+ -> Materialize
+ -> Seq Scan on select_distinct_b b
+(7 rows)
+
+explain (costs off) select distinct a.d, a.c, b.a, b.b from select_distinct_a a, select_distinct_b b where a.d is not null;
+ QUERY PLAN
+---------------------------------------------
+ Nested Loop
+ -> Seq Scan on select_distinct_b b
+ -> Materialize
+ -> Seq Scan on select_distinct_a a
+ Filter: (d IS NOT NULL)
+(5 rows)
+
+explain (costs off) select distinct a.d, b.a from select_distinct_a a, select_distinct_b b group by a.d, b.a;
+ QUERY PLAN
+---------------------------------------------------
+ HashAggregate
+ Group Key: a.d, b.a
+ -> Nested Loop
+ -> Seq Scan on select_distinct_a a
+ -> Materialize
+ -> Seq Scan on select_distinct_b b
+(6 rows)
+
+explain (costs off) select distinct a, b from select_distinct_a where a in (select a from select_distinct_b);
+ QUERY PLAN
+-------------------------------------------------------------------------
+ Nested Loop Semi Join
+ -> Seq Scan on select_distinct_a
+ -> Index Only Scan using select_distinct_b_pkey on select_distinct_b
+ Index Cond: (a = select_distinct_a.a)
+(4 rows)
+
+explain (costs off) select distinct a, b from select_distinct_a where a not in (select a from select_distinct_b);
+ QUERY PLAN
+---------------------------------------
+ Seq Scan on select_distinct_a
+ Filter: (NOT (hashed SubPlan 1))
+ SubPlan 1
+ -> Seq Scan on select_distinct_b
+(4 rows)
+
+explain (costs off) select distinct * from select_distinct_a a, (select a, max(b) as b from select_distinct_b group by a) b
+where a.a in (select a from select_distinct_b)
+and a.b = b.b;
+ QUERY PLAN
+-------------------------------------------------------------------------------------------------
+ Unique
+ -> Sort
+ Sort Key: a.a, a.b, a.c, a.d, a.e, select_distinct_b_1.a
+ -> Nested Loop
+ Join Filter: (a.b = (max(select_distinct_b_1.b)))
+ -> HashAggregate
+ Group Key: select_distinct_b_1.a
+ -> Seq Scan on select_distinct_b select_distinct_b_1
+ -> Materialize
+ -> Nested Loop Semi Join
+ -> Seq Scan on select_distinct_a a
+ -> Index Only Scan using select_distinct_b_pkey on select_distinct_b
+ Index Cond: (a = a.a)
+(13 rows)
+
+explain (costs off) select distinct on(a) a, b from select_distinct_a;
+ QUERY PLAN
+-------------------------------------------
+ Unique
+ -> Sort
+ Sort Key: a
+ -> Seq Scan on select_distinct_a
+(4 rows)
+
+explain (costs off) select distinct on(a, b) a, b from select_distinct_a;
+ QUERY PLAN
+-------------------------------
+ Seq Scan on select_distinct_a
+(1 row)
+
+drop table select_distinct_a;
+drop table select_distinct_b;
diff --git a/src/test/regress/expected/sysviews.out b/src/test/regress/expected/sysviews.out
index a1c90eb905..e053214f9d 100644
--- a/src/test/regress/expected/sysviews.out
+++ b/src/test/regress/expected/sysviews.out
@@ -73,6 +73,7 @@ select name, setting from pg_settings where name like 'enable%';
name | setting
--------------------------------+---------
enable_bitmapscan | on
+ enable_distinct_elimination | on
enable_gathermerge | on
enable_hashagg | on
enable_hashjoin | on
@@ -89,7 +90,7 @@ select name, setting from pg_settings where name like 'enable%';
enable_seqscan | on
enable_sort | on
enable_tidscan | on
-(17 rows)
+(18 rows)
-- Test that the pg_timezone_names and pg_timezone_abbrevs views are
-- more-or-less working. We can't test their contents in any great detail
diff --git a/src/test/regress/sql/select_distinct_2.sql b/src/test/regress/sql/select_distinct_2.sql
new file mode 100644
index 0000000000..cad8d40dc6
--- /dev/null
+++ b/src/test/regress/sql/select_distinct_2.sql
@@ -0,0 +1,42 @@
+create table select_distinct_a(a int, b char(20), c char(20) not null, d int, e int, primary key(a, b));
+
+set enable_mergejoin to off;
+set enable_hashjoin to off;
+
+-- no node for distinct.
+explain (costs off) select distinct * from select_distinct_a;
+
+explain (costs off) select distinct b,c,d,e from select_distinct_a;
+
+create unique index select_distinct_a_uk on select_distinct_a(c, d);
+
+explain (costs off) select distinct b,c,d,e from select_distinct_a where c is not null;
+
+explain (costs off) select distinct b,c,d,e from select_distinct_a where c is not null and d is not null;
+
+explain select distinct d, e from select_distinct_a group by d, e;
+
+
+create table select_distinct_b(a int, b char(20), c char(20) not null, d int, e int, primary key(a, b));
+
+explain (costs off) select distinct * from select_distinct_a a, select_distinct_b b;
+
+explain (costs off) select distinct a.b, a.c, b.a, b.b from select_distinct_a a, select_distinct_b b;
+
+explain (costs off) select distinct a.d, a.c, b.a, b.b from select_distinct_a a, select_distinct_b b where a.d is not null;
+
+explain (costs off) select distinct a.d, b.a from select_distinct_a a, select_distinct_b b group by a.d, b.a;
+
+explain (costs off) select distinct a, b from select_distinct_a where a in (select a from select_distinct_b);
+
+explain (costs off) select distinct a, b from select_distinct_a where a not in (select a from select_distinct_b);
+
+explain (costs off) select distinct * from select_distinct_a a, (select a, max(b) as b from select_distinct_b group by a) b
+where a.a in (select a from select_distinct_b)
+and a.b = b.b;
+
+explain (costs off) select distinct on(a) a, b from select_distinct_a;
+explain (costs off) select distinct on(a, b) a, b from select_distinct_a;
+
+drop table select_distinct_a;
+drop table select_distinct_b;
--
2.20.1 (Apple Git-117)
On Sat, Feb 8, 2020 at 12:53 PM Andy Fan <zhihui.fan1213@gmail.com> wrote:
Hi Ashutosh:
Thanks for your time.
On Fri, Feb 7, 2020 at 11:54 PM Ashutosh Bapat <
ashutosh.bapat.oss@gmail.com> wrote:
Hi Andy,
What might help is to add more description to your email message, like
giving examples to explain your idea. Anyway, I looked at the testcases
you added for examples.
+create table select_distinct_a(a int, b char(20), c char(20) not null, d int, e int, primary key(a, b));
+set enable_mergejoin to off;
+set enable_hashjoin to off;
+-- no node for distinct.
+explain (costs off) select distinct * from select_distinct_a;
+           QUERY PLAN
+-------------------------------
+ Seq Scan on select_distinct_a
+(1 row)
From this example, it seems that the distinct operation can be dropped
because (a, b) is a primary key. Is my understanding correct?
Yes, you are correct. Actually I added them to the commit message,
but it's true that I should have copied them into this email body as
well, so I copy them now.
[PATCH] Erase the distinctClause if the result is unique by
definition
I forgot to mention this in the last round of comments. Your patch was
actually removing distinctClause from the Query structure. Please avoid
doing that. If you remove it, you are also removing the evidence that this
Query had a DISTINCT clause in it.
However the patch as presented has some problems
1. What happens if the primary key constraint or NOT NULL constraint gets
dropped between a prepare and execute? The plan will no longer be valid and
thus execution may produce non-distinct results.
Will this still be an issue if the user doesn't use a "read uncommitted"
isolation level? I suppose it should be OK in that case. But even so,
I have added an isolation-level check for this in the patch, to continue
the discussion of this issue.
In PostgreSQL there's no "read uncommitted". But that doesn't matter since
a query can be prepared outside a transaction and executed within one or
more subsequent transactions.
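A minimal sketch of that window (same shape as the table t used in the
test session below): the plan's lifetime is not tied to any one
transaction.
PREPARE st AS SELECT DISTINCT b FROM t WHERE c = $1;  -- planned here
BEGIN;
EXECUTE st(1);  -- executed later; constraints may have changed in between
COMMIT;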
PostgreSQL has a similar concept of allowing non-grouping expressions as
part of the targetlist when those expressions can be proved to be functionally
dependent on the GROUP BY clause. See check_functional_grouping() and its
caller. I think DISTINCT elimination should work on similar lines.
2. For the same reason described in check_functional_grouping(), using
unique indexes for eliminating DISTINCT should be discouraged.
I checked the comments of check_functional_grouping; the reason is:
 * Currently we only check to see if the rel has a primary key that is a
 * subset of the grouping_columns.  We could also use plain unique constraints
 * if all their columns are known not null, but there's a problem: we need
 * to be able to represent the not-null-ness as part of the constraints added
 * to *constraintDeps.  FIXME whenever not-null constraints get represented
 * in pg_constraint.
Actually I am doubtful about the pg_constraint reasoning, since we are still
able to get the not-null information from
relation->rd_attr->attrs[n].attnotnull, which
is just what this patch does.
The problem isn't whether not-null-ness can be inferred or not; the problem
is whether that can be guaranteed across planning and execution of a query
(prepare and execute, for example). The constraintDep machinery registers
the constraints used while preparing the plan and invalidates the plan if any
of those constraints change after the plan is created.
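A short illustration of that machinery (hypothetical names): a stored view
that relies on functional dependence records a dependency on the primary
key constraint, so the constraint cannot be dropped out from under it:
CREATE TABLE dep_t (a int PRIMARY KEY, b int);
CREATE VIEW dep_v AS SELECT a, b FROM dep_t GROUP BY a;  -- relies on the pk
ALTER TABLE dep_t DROP CONSTRAINT dep_t_pkey;  -- fails: dep_v depends on it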
3. If you could eliminate DISTINCT you could similarly eliminate GROUP BY
as well
This is a good point. The rules may differ somewhat for joins, so I prefer
to focus on the current case for now.
I doubt that, since DISTINCT is ultimately carried out as a grouping
operation. But anyway, I won't hang upon that.
4. The patch works only at the query level, but that functionality can be
expanded generally to other places which add Unique/HashAggregate/Group
nodes if the underlying relation can be proved to produce distinct rows.
But that's probably more work since we will have to label paths with unique
keys similar to pathkeys.
Do you mean adding some information into PlannerInfo, and when we create
a node for Unique/HashAggregate/Group, we can just create a dummy node?
Not so much PlannerInfo as something on the lines of PathKey. See the
PathKey structure and related code. What I envision is that the PathKey
class is also annotated with information about whether that PathKey implies
uniqueness. E.g. a PathKey derived from a primary index would imply
uniqueness, and a PathKey derived from, say, a Group operation also implies
uniqueness. Then, just by looking at the underlying Path, we would be able
to say whether we need a Group/Unique node on top of it or not. I think
that would make it a much wider use case and a very useful optimization.
--
Best Wishes,
Ashutosh Bapat
Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> writes:
On Sat, Feb 8, 2020 at 12:53 PM Andy Fan <zhihui.fan1213@gmail.com> wrote:
Do you mean adding some information into PlannerInfo, and when we create
a node for Unique/HashAggregate/Group, we can just create a dummy node?
Not so much PlannerInfo as something on the lines of PathKey. See the
PathKey structure and related code. What I envision is that the PathKey
class is also annotated with information about whether that PathKey implies
uniqueness. E.g. a PathKey derived from a primary index would imply
uniqueness, and a PathKey derived from, say, a Group operation also implies
uniqueness. Then, just by looking at the underlying Path, we would be able
to say whether we need a Group/Unique node on top of it or not. I think
that would make it a much wider use case and a very useful optimization.
FWIW, that doesn't seem like a very prudent approach to me, because it
confuses sorted-ness with unique-ness. PathKeys are about sorting,
but it's possible to have uniqueness guarantees without having sorted
anything, for instance via hashed grouping.
I haven't looked at this patch, but I'd expect it to use infrastructure
related to query_is_distinct_for(), and that doesn't deal in PathKeys.
regards, tom lane
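For reference, query_is_distinct_for() is the existing routine that proves
a query's output is distinct over a given set of columns; a hypothetical
query of the kind it reasons about (foo and bar are illustrative names):
-- the subquery's output is provably distinct over (g) because of the
-- DISTINCT, so the planner may implement the IN as a plain join
EXPLAIN SELECT * FROM foo WHERE f IN (SELECT DISTINCT g FROM bar);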
On Tue, Feb 11, 2020 at 12:22 AM Ashutosh Bapat <
ashutosh.bapat.oss@gmail.com> wrote:
[PATCH] Erase the distinctClause if the result is unique by
definition
I forgot to mention this in the last round of comments. Your patch was
actually removing distinctClause from the Query structure. Please avoid
doing that. If you remove it, you are also removing the evidence that this
Query had a DISTINCT clause in it.
Yes, I removed it because it is the easiest way to do it. What is the
purpose of keeping the evidence?
However the patch as presented has some problems
1. What happens if the primary key constraint or NOT NULL constraint gets
dropped between a prepare and execute? The plan will no longer be valid and
thus execution may produce non-distinct results.
Will this still be an issue if the user doesn't use a "read uncommitted"
isolation level? I suppose it should be OK in that case. But even so,
I have added an isolation-level check for this in the patch, to continue
the discussion of this issue.
In PostgreSQL there's no "read uncommitted".
Thanks for the hint, I just noticed that read uncommitted is treated as
read committed in PostgreSQL.
But that doesn't matter since a query can be prepared outside a
transaction and executed within one or more subsequent transactions.
Suppose that after a DDL, the prepared statement needs to be re-parsed and
re-planned if it is not currently being executed, or else it would block
the DDL from happening.
The following is my test.
postgres=# create table t (a int primary key, b int not null, c int);
CREATE TABLE
postgres=# insert into t values(1, 1, 1), (2, 2, 2);
INSERT 0 2
postgres=# create unique index t_idx1 on t(b);
CREATE INDEX
postgres=# prepare st as select distinct b from t where c = $1;
PREPARE
postgres=# explain execute st(1);
QUERY PLAN
-------------------------------------------------
Seq Scan on t (cost=0.00..1.02 rows=1 width=4)
Filter: (c = 1)
(2 rows)
...
postgres=# explain execute st(1);
QUERY PLAN
-------------------------------------------------
Seq Scan on t (cost=0.00..1.02 rows=1 width=4)
Filter: (c = $1)
(2 rows)
-- session 2
postgres=# alter table t alter column b drop not null;
ALTER TABLE
-- session 1:
postgres=# explain execute st(1);
QUERY PLAN
-------------------------------------------------------------
Unique (cost=1.03..1.04 rows=1 width=4)
-> Sort (cost=1.03..1.04 rows=1 width=4)
Sort Key: b
-> Seq Scan on t (cost=0.00..1.02 rows=1 width=4)
Filter: (c = $1)
(5 rows)
-- session 2
postgres=# insert into t values (3, null, 3), (4, null, 3);
INSERT 0 2
-- session 1
postgres=# execute st(3);
b
---
(1 row)
And if we prepare the SQL outside a transaction and execute it inside one,
the other session can't drop the constraint until the transaction ends.
--
Best Wishes,
Ashutosh Bapat
On Tue, Feb 11, 2020 at 10:57:26AM +0800, Andy Fan wrote:
And if we prepare the SQL outside a transaction and execute it inside one,
the other session can't drop the constraint until the transaction ends.
And what if you create a view on top of a query containing a distinct clause
rather than using prepared statements? FWIW your patch doesn't handle such
a case at all, without even needing to drop constraints:
CREATE TABLE t (a int primary key, b int not null, c int);
INSERT INTO t VALUEs(1, 1, 1), (2, 2, 2);
CREATE UNIQUE INDEX t_idx1 on t(b);
CREATE VIEW v1 AS SELECT DISTINCT b FROM t;
EXPLAIN SELECT * FROM v1;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
I also think this is not the right way to handle this optimization.
On Tue, Feb 11, 2020 at 3:56 PM Julien Rouhaud <rjuju123@gmail.com> wrote:
CREATE VIEW v1 AS SELECT DISTINCT b FROM t;
EXPLAIN SELECT * FROM v1;
server closed the connection unexpectedly
Thanks for pointing it out. This is unexpected based on my current
knowledge; I will check that.
I also think this is not the right way to handle this optimization.
I started to check query_is_distinct_for when Tom pointed it out, but I
still don't understand the context fully. I will look into your finding
along with that.
On Tue, Feb 11, 2020 at 3:56 PM Julien Rouhaud <rjuju123@gmail.com> wrote:
FWIW your patch doesn't handle such a case at all, without even needing
to drop constraints:
CREATE TABLE t (a int primary key, b int not null, c int);
INSERT INTO t VALUES(1, 1, 1), (2, 2, 2);
CREATE UNIQUE INDEX t_idx1 on t(b);
CREATE VIEW v1 AS SELECT DISTINCT b FROM t;
EXPLAIN SELECT * FROM v1;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
This error can be fixed with
- num_of_rtables = bms_num_members(non_semi_anti_relids);
+ num_of_rtables = list_length(query->rtable);
since the per-relation arrays are indexed by rtindex - 1, they must be
sized by the full rtable length rather than by the number of
non-semi/anti-join relations. This test case has also been added to the
patch.
I also think this is not the right way to handle this optimization.
Do you have any other concerns?
Attachments:
0001-Erase-the-distinctClause-if-the-result-is-unique-by-.patchapplication/octet-stream; name=0001-Erase-the-distinctClause-if-the-result-is-unique-by-.patchDownload
From 01159c3011f9798a21ad42c419435f9b23ec7a88 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=E4=B8=80=E6=8C=83?= <yizhi.fzh@alibaba-inc.com>
Date: Fri, 31 Jan 2020 19:38:05 +0800
Subject: [PATCH] Erase the distinctClause if the result is unique by
definition
For a single relation, we can tell the result is unique if any one of
the following is true:
1. The primary key is in the target list.
2. A unique key is in the target list and its columns are not null.
3. The columns in the GROUP BY clause are also in the target list.
For a relation join, we can tell it as follows:
if every relation in the jointree yields a unique result set, then
the final result is unique as well, regardless of the join method.
For semi/anti joins, we don't have such a restriction on the right
table.
---
src/backend/nodes/bitmapset.c | 40 +++
src/backend/optimizer/path/costsize.c | 1 +
src/backend/optimizer/plan/planner.c | 301 ++++++++++++++++++
src/backend/utils/cache/relcache.c | 23 ++
src/backend/utils/misc/guc.c | 10 +
src/include/nodes/bitmapset.h | 2 +
src/include/optimizer/cost.h | 1 +
src/include/utils/rel.h | 3 +
src/test/regress/expected/join.out | 16 +-
.../regress/expected/select_distinct_2.out | 237 ++++++++++++++
src/test/regress/expected/sysviews.out | 3 +-
src/test/regress/sql/select_distinct_2.sql | 81 +++++
12 files changed, 709 insertions(+), 9 deletions(-)
create mode 100644 src/test/regress/expected/select_distinct_2.out
create mode 100644 src/test/regress/sql/select_distinct_2.sql
diff --git a/src/backend/nodes/bitmapset.c b/src/backend/nodes/bitmapset.c
index 648cc1a7eb..76ce9b526e 100644
--- a/src/backend/nodes/bitmapset.c
+++ b/src/backend/nodes/bitmapset.c
@@ -1167,3 +1167,43 @@ bms_hash_value(const Bitmapset *a)
return DatumGetUInt32(hash_any((const unsigned char *) a->words,
(lastword + 1) * sizeof(bitmapword)));
}
+
+/*
+ * bms_array_copy --
+ *
+ * copy the bms data in the newly palloc memory
+ */
+
+Bitmapset**
+bms_array_copy(Bitmapset **bms_array, int len)
+{
+ Bitmapset **res;
+ int i;
+ if (bms_array == NULL || len < 1)
+ return NULL;
+
+ res = palloc(sizeof(Bitmapset*) * len);
+ for(i = 0; i < len; i++)
+ {
+ res[i] = bms_copy(bms_array[i]);
+ }
+ return res;
+}
+
+/*
+ * bms_array_free
+ *
+ * free the element in the array one by one, free the array as well at last
+ */
+void
+bms_array_free(Bitmapset **bms_array, int len)
+{
+ int idx;
+ if (bms_array == NULL)
+ return;
+ for(idx = 0 ; idx < len; idx++)
+ {
+ bms_free(bms_array[idx]);
+ }
+ pfree(bms_array);
+}
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index b5a0033721..dde16b5d44 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -138,6 +138,7 @@ bool enable_partitionwise_aggregate = false;
bool enable_parallel_append = true;
bool enable_parallel_hash = true;
bool enable_partition_pruning = true;
+bool enable_distinct_elimination = true;
typedef struct
{
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index d6f2153593..2974dc3a8a 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -22,8 +22,10 @@
#include "access/htup_details.h"
#include "access/parallel.h"
#include "access/sysattr.h"
+#include "access/relation.h"
#include "access/table.h"
#include "access/xact.h"
+#include "catalog/index.h"
#include "catalog/pg_constraint.h"
#include "catalog/pg_inherits.h"
#include "catalog/pg_proc.h"
@@ -35,6 +37,7 @@
#include "lib/bipartite_match.h"
#include "lib/knapsack.h"
#include "miscadmin.h"
+#include "nodes/bitmapset.h"
#include "nodes/makefuncs.h"
#include "nodes/nodeFuncs.h"
#ifdef OPTIMIZER_DEBUG
@@ -248,6 +251,7 @@ static bool group_by_has_partkey(RelOptInfo *input_rel,
List *targetList,
List *groupClause);
static int common_prefix_cmp(const void *a, const void *b);
+static void preprocess_distinct_node(PlannerInfo *root);
/*****************************************************************************
@@ -989,6 +993,9 @@ subquery_planner(PlannerGlobal *glob, Query *parse,
/* Remove any redundant GROUP BY columns */
remove_useless_groupby_columns(root);
+ if (enable_distinct_elimination)
+ preprocess_distinct_node(root);
+
/*
* If we have any outer joins, try to reduce them to plain inner joins.
* This step is most easily done after we've done expression
@@ -7409,3 +7416,297 @@ group_by_has_partkey(RelOptInfo *input_rel,
return true;
}
+
+/*
+ * is_unique_result_already
+ *
+ * Given a relation, we know its primary key and unique key information.
+ * unique_target is the set of columns in the distinct/distinct-on target
+ * list; not_null_columns is the union of not-null columns derived from
+ * the catalog and the quals. We can prove the result is unique before
+ * executing the query if the primary key, or a unique key whose columns
+ * are all not null, is contained in the target list.
+ */
+static bool
+is_unique_result_already(Relation relation,
+ Bitmapset *unique_target,
+ Bitmapset *not_null_columns)
+{
+ int i;
+ Bitmapset *pkattr = RelationGetIndexAttrBitmap(relation,
+ INDEX_ATTR_BITMAP_PRIMARY_KEY);
+
+ /*
+ * if the pk is in the target list,
+ * the result set is unique for this relation
+ */
+ if (pkattr != NULL &&
+ !bms_is_empty(pkattr) &&
+ bms_is_subset(pkattr, unique_target))
+ {
+ return true;
+ }
+
+	/*
+	 * check whether any plain unique key is in the target list with all
+	 * of its columns known not null
+	 */
+ for (i = 0; i < relation->rd_plain_ukcount; i++)
+ {
+ Bitmapset *ukattr = relation->rd_plain_ukattrs[i];
+ if (!bms_is_empty(ukattr)
+ && bms_is_subset(ukattr, unique_target)
+ && bms_is_subset(ukattr, not_null_columns))
+ return true;
+ }
+
+	/*
+	 * Neither the primary key nor any suitable unique key covers the
+	 * target list, so we cannot prove the result unique for this relation.
+	 */
+
+ return false;
+}
+
+
+/*
+ * scan_non_semi_anti_relids
+ *
+ * scan jointree to get non-semi/anti join rtindex.
+ */
+static void
+scan_non_semi_anti_relids(Node* jtnode, Relids* relids)
+{
+ if (jtnode == NULL)
+ return;
+
+ if (IsA(jtnode, RangeTblRef))
+ {
+ int varno = ((RangeTblRef *) jtnode)->rtindex;
+
+ *relids = bms_add_member(*relids, varno);
+ }
+ else if (IsA(jtnode, FromExpr))
+ {
+ FromExpr *f = (FromExpr *) jtnode;
+ ListCell *l;
+
+ foreach(l, f->fromlist)
+ scan_non_semi_anti_relids(lfirst(l), relids);
+ }
+ else if (IsA(jtnode, JoinExpr))
+ {
+ JoinExpr *j = (JoinExpr *) jtnode;
+
+ scan_non_semi_anti_relids(j->larg, relids);
+ if (j->jointype != JOIN_SEMI && j->jointype != JOIN_ANTI)
+ {
+ scan_non_semi_anti_relids(j->rarg, relids);
+ }
+ }
+ else
+ elog(ERROR, "unrecognized node type: %d",
+ (int) nodeTag(jtnode));
+
+}
+
+/*
+ * preprocess_distinct_node
+ *
+ * remove the distinctClause if it is not necessary by definition
+ */
+static void
+preprocess_distinct_node(PlannerInfo *root)
+{
+ Query *query = root->parse;
+ ListCell *lc;
+ int num_of_rtables;
+ Bitmapset **targetlist_by_table = NULL;
+ Bitmapset **notnullcolumns = NULL;
+ Index rel_idx = 0;
+ bool should_distinct_elimination = false;
+ Relids non_semi_anti_relids = NULL;
+
+ if (query->distinctClause == NIL)
+ return;
+
+	/*
+	 * This rewrite depends on the primary key, unique and not-null
+	 * constraints, which may be dropped after the plan is built.
+	 * If the isolation level is XACT_READ_UNCOMMITTED, removing the
+	 * distinct node may produce duplicated rows.
+	 */
+ if (XactIsoLevel == XACT_READ_UNCOMMITTED)
+ return;
+
+ scan_non_semi_anti_relids((Node*)query->jointree, &non_semi_anti_relids);
+
+ foreach(lc, query->rtable)
+ {
+ RangeTblEntry *rte = lfirst_node(RangeTblEntry, lc);
+ rel_idx++;
+ if (!bms_is_member(rel_idx, non_semi_anti_relids))
+ continue;
+ /* we only handle the basic Relation for now */
+ if (rte->rtekind != RTE_RELATION)
+ return;
+ }
+
+ num_of_rtables = list_length(query->rtable);
+
+	/*
+	 * If all the columns in the group clause are also in the target list,
+	 * we don't need distinct
+	 */
+ if (query->groupClause != NIL)
+ {
+ Bitmapset *groupclause_bm = NULL;
+ Bitmapset *groupclause_in_targetlist_bm = NULL;
+ ListCell *lc;
+ foreach(lc, query->groupClause)
+ groupclause_bm = bms_add_member(groupclause_bm,
+ lfirst_node(SortGroupClause, lc)->tleSortGroupRef);
+
+ foreach(lc, query->targetList)
+ {
+ TargetEntry *te = lfirst_node(TargetEntry, lc);
+ if (te->resjunk)
+ continue;
+ groupclause_in_targetlist_bm = bms_add_member(groupclause_in_targetlist_bm,
+ te->ressortgroupref);
+ }
+
+ should_distinct_elimination = bms_is_subset(groupclause_bm,
+ groupclause_in_targetlist_bm);
+ bms_free(groupclause_bm);
+ bms_free(groupclause_in_targetlist_bm);
+ if (should_distinct_elimination)
+ goto ret;
+ }
+
+ targetlist_by_table = palloc0(sizeof(Bitmapset*) * num_of_rtables);
+ notnullcolumns = palloc0(sizeof(Bitmapset* ) * num_of_rtables);
+
+ /* build the targetlist_by_table */
+ foreach(lc, query->targetList)
+ {
+ TargetEntry *te = lfirst_node(TargetEntry, lc);
+ Expr *expr = te->expr;
+ Var *var;
+ Bitmapset **target_column_per_rel;
+ int target_attno;
+
+ if (!IsA(expr, Var))
+ continue;
+
+ var = (Var *)(expr);
+ if (var->varlevelsup != 0)
+ continue;
+
+ target_column_per_rel = &targetlist_by_table[var->varno - 1];
+ target_attno = var->varattno - FirstLowInvalidHeapAttributeNumber;
+
+		/*
+		 * for DISTINCT ON (...), we only count the fields listed there
+		 * rather than all the entries in the target list
+		 */
+ if (query->hasDistinctOn)
+ {
+ Index ref = te->ressortgroupref;
+ ListCell *lc;
+
+ /*
+ * A fastpath to know if the targetEntry is in the distinctClause
+ */
+ if (ref == 0)
+ continue;
+
+			/*
+			 * Even if the ref is not zero, it may belong to the sort
+			 * clause instead, so we need to double check.
+			 */
+ foreach(lc, query->distinctClause)
+ {
+ if (ref == lfirst_node(SortGroupClause, lc)->tleSortGroupRef)
+ *target_column_per_rel = bms_add_member(*target_column_per_rel,
+ target_attno);
+ }
+ }
+ else
+ *target_column_per_rel = bms_add_member(*target_column_per_rel,
+ target_attno);
+ }
+
+ /* find out nonnull columns from qual via find_nonnullable_vars */
+ foreach(lc, find_nonnullable_vars(query->jointree->quals))
+ {
+ Var *not_null_var;
+ Bitmapset **notnullcolumns_per_rel;
+ int notnull_attno;
+ if (!IsA(lfirst(lc), Var))
+ continue;
+ not_null_var = lfirst_node(Var, lc);
+ if (not_null_var->varno == INNER_VAR ||
+ not_null_var->varno == OUTER_VAR ||
+ not_null_var->varno == INDEX_VAR)
+ continue;
+ notnullcolumns_per_rel = ¬nullcolumns[not_null_var->varno - 1];
+ notnull_attno = not_null_var->varattno - FirstLowInvalidHeapAttributeNumber;
+ *notnullcolumns_per_rel = bms_add_member(*notnullcolumns_per_rel,
+ notnull_attno);
+ }
+
+ /* Check if each related rtable can yield a unique result set */
+ rel_idx = 0;
+ foreach(lc, query->rtable)
+ {
+ Relation relation;
+ TupleDesc desc;
+ RangeTblEntry *rte;
+ int attr_idx;
+
+ if (!bms_is_member(rel_idx+1, non_semi_anti_relids))
+ continue;
+
+ rte = lfirst_node(RangeTblEntry, lc);
+ Assert(rte->rtekind == RTE_RELATION);
+ Assert(rte->relid != InvalidOid);
+
+ relation = relation_open(rte->relid, RowExclusiveLock);
+ desc = relation->rd_att;
+ attr_idx = 0;
+
+ /* Add the notnullcolumns based on catalog */
+ for(; attr_idx < desc->natts; attr_idx++)
+ {
+ int notnull_attno;
+ if (!desc->attrs[attr_idx].attnotnull)
+ continue;
+ notnull_attno = attr_idx + 1 - FirstLowInvalidHeapAttributeNumber;
+ notnullcolumns[rel_idx] = bms_add_member(notnullcolumns[rel_idx],
+ notnull_attno);
+ }
+
+		/* check whether this relation already yields a unique result set */
+ if (!is_unique_result_already(relation,
+ targetlist_by_table[rel_idx],
+ notnullcolumns[rel_idx]))
+ {
+ RelationClose(relation);
+ goto ret;
+ }
+ RelationClose(relation);
+ rel_idx++;
+ }
+
+ should_distinct_elimination = true;
+
+ ret:
+ bms_array_free(notnullcolumns, num_of_rtables);
+ bms_array_free(targetlist_by_table, num_of_rtables);
+ bms_free(non_semi_anti_relids);
+
+ if (should_distinct_elimination)
+ {
+ query->distinctClause = NIL;
+ query->hasDistinctOn = false;
+ }
+}
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index df025a5a30..d8a76a2273 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -2346,6 +2346,8 @@ RelationDestroyRelation(Relation relation, bool remember_tupdesc)
bms_free(relation->rd_keyattr);
bms_free(relation->rd_pkattr);
bms_free(relation->rd_idattr);
+ if (relation->rd_plain_ukattrs)
+ bms_array_free(relation->rd_plain_ukattrs, relation->rd_plain_ukcount);
if (relation->rd_pubactions)
pfree(relation->rd_pubactions);
if (relation->rd_options)
@@ -4762,6 +4764,7 @@ RelationGetIndexAttrBitmap(Relation relation, IndexAttrBitmapKind attrKind)
Bitmapset *indexattrs; /* indexed columns */
Bitmapset *uindexattrs; /* columns in unique indexes */
Bitmapset *pkindexattrs; /* columns in the primary index */
+ Bitmapset **ukindexattrs = NULL; /* columns in the unique indexes */
Bitmapset *idindexattrs; /* columns in the replica identity */
List *indexoidlist;
List *newindexoidlist;
@@ -4769,6 +4772,7 @@ RelationGetIndexAttrBitmap(Relation relation, IndexAttrBitmapKind attrKind)
Oid relreplindex;
ListCell *l;
MemoryContext oldcxt;
+ int plain_uk_index_count = 0, index_count = 0, indexno = 0;
/* Quick exit if we already computed the result. */
if (relation->rd_indexattr != NULL)
@@ -4826,6 +4830,9 @@ restart:
uindexattrs = NULL;
pkindexattrs = NULL;
idindexattrs = NULL;
+ index_count = list_length(indexoidlist);
+ ukindexattrs = palloc0(sizeof(Bitmapset *) * index_count);
+
foreach(l, indexoidlist)
{
Oid indexOid = lfirst_oid(l);
@@ -4875,6 +4882,9 @@ restart:
/* Is this index the configured (or default) replica identity? */
isIDKey = (indexOid == relreplindex);
+ if (isKey)
+ plain_uk_index_count++;
+
/* Collect simple attribute references */
for (i = 0; i < indexDesc->rd_index->indnatts; i++)
{
@@ -4904,6 +4914,11 @@ restart:
if (isIDKey && i < indexDesc->rd_index->indnkeyatts)
idindexattrs = bms_add_member(idindexattrs,
attrnum - FirstLowInvalidHeapAttributeNumber);
+
+ if (isKey)
+ ukindexattrs[indexno] = bms_add_member(ukindexattrs[indexno],
+ attrnum - FirstLowInvalidHeapAttributeNumber);
+
}
}
@@ -4914,6 +4929,7 @@ restart:
pull_varattnos(indexPredicate, 1, &indexattrs);
index_close(indexDesc, AccessShareLock);
+ indexno++;
}
/*
@@ -4940,6 +4956,7 @@ restart:
bms_free(pkindexattrs);
bms_free(idindexattrs);
bms_free(indexattrs);
+ bms_array_free(ukindexattrs, index_count);
goto restart;
}
@@ -4953,6 +4970,8 @@ restart:
relation->rd_pkattr = NULL;
bms_free(relation->rd_idattr);
relation->rd_idattr = NULL;
+ bms_array_free(relation->rd_plain_ukattrs, relation->rd_plain_ukcount);
+ relation->rd_plain_ukattrs = NULL;
/*
* Now save copies of the bitmaps in the relcache entry. We intentionally
@@ -4966,6 +4985,8 @@ restart:
relation->rd_pkattr = bms_copy(pkindexattrs);
relation->rd_idattr = bms_copy(idindexattrs);
relation->rd_indexattr = bms_copy(indexattrs);
+ relation->rd_plain_ukattrs = bms_array_copy(ukindexattrs, index_count);
+ relation->rd_plain_ukcount = plain_uk_index_count;
MemoryContextSwitchTo(oldcxt);
/* We return our original working copy for caller to play with */
@@ -5618,6 +5639,8 @@ load_relcache_init_file(bool shared)
rel->rd_keyattr = NULL;
rel->rd_pkattr = NULL;
rel->rd_idattr = NULL;
+ rel->rd_plain_ukattrs = NULL;
+ rel->rd_plain_ukcount = 0;
rel->rd_pubactions = NULL;
rel->rd_statvalid = false;
rel->rd_statlist = NIL;
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index e44f71e991..fa798dd564 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -1064,6 +1064,16 @@ static struct config_bool ConfigureNamesBool[] =
true,
NULL, NULL, NULL
},
+ {
+ {"enable_distinct_elimination", PGC_USERSET, QUERY_TUNING_METHOD,
+ gettext_noop("Enables plan-time and run-time unique elimination."),
+ gettext_noop("Allows the query planner to remove the uncecessary distinct clause."),
+ GUC_EXPLAIN
+ },
+ &enable_distinct_elimination,
+ true,
+ NULL, NULL, NULL
+ },
{
{"geqo", PGC_USERSET, QUERY_TUNING_GEQO,
gettext_noop("Enables genetic query optimization."),
diff --git a/src/include/nodes/bitmapset.h b/src/include/nodes/bitmapset.h
index b7b18a0b68..ff30feb521 100644
--- a/src/include/nodes/bitmapset.h
+++ b/src/include/nodes/bitmapset.h
@@ -117,4 +117,6 @@ extern int bms_prev_member(const Bitmapset *a, int prevbit);
/* support for hashtables using Bitmapsets as keys: */
extern uint32 bms_hash_value(const Bitmapset *a);
+extern Bitmapset **bms_array_copy(Bitmapset **bms_array, int len);
+extern void bms_array_free(Bitmapset **bms_array, int len);
#endif /* BITMAPSET_H */
diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h
index cb012ba198..4fa5d32df6 100644
--- a/src/include/optimizer/cost.h
+++ b/src/include/optimizer/cost.h
@@ -64,6 +64,7 @@ extern PGDLLIMPORT bool enable_partitionwise_aggregate;
extern PGDLLIMPORT bool enable_parallel_append;
extern PGDLLIMPORT bool enable_parallel_hash;
extern PGDLLIMPORT bool enable_partition_pruning;
+extern PGDLLIMPORT bool enable_distinct_elimination;
extern PGDLLIMPORT int constraint_exclusion;
extern double index_pages_fetched(double tuples_fetched, BlockNumber pages,
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 44ed04dd3f..7c5a6d65b6 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -120,6 +120,9 @@ typedef struct RelationData
Bitmapset *rd_indexattr; /* identifies columns used in indexes */
Bitmapset *rd_keyattr; /* cols that can be ref'd by foreign keys */
Bitmapset *rd_pkattr; /* cols included in primary key */
+ Bitmapset **rd_plain_ukattrs; /* cols included in the plain unique indexes;
+ only non-expression, non-partial indexes are counted */
+ int rd_plain_ukcount; /* number of plain unique indexes */
Bitmapset *rd_idattr; /* included in replica identity index */
PublicationActions *rd_pubactions; /* publication actions */
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index 761376b007..3f6595d53b 100644
--- a/src/test/regress/expected/join.out
+++ b/src/test/regress/expected/join.out
@@ -4433,17 +4433,17 @@ select d.* from d left join (select * from b group by b.id, b.c_id) s
explain (costs off)
select d.* from d left join (select distinct * from b) s
on d.a = s.id;
- QUERY PLAN
---------------------------------------
- Merge Right Join
- Merge Cond: (b.id = d.a)
- -> Unique
- -> Sort
- Sort Key: b.id, b.c_id
- -> Seq Scan on b
+ QUERY PLAN
+---------------------------------
+ Merge Left Join
+ Merge Cond: (d.a = s.id)
-> Sort
Sort Key: d.a
-> Seq Scan on d
+ -> Sort
+ Sort Key: s.id
+ -> Subquery Scan on s
+ -> Seq Scan on b
(9 rows)
-- check join removal works when uniqueness of the join condition is enforced
diff --git a/src/test/regress/expected/select_distinct_2.out b/src/test/regress/expected/select_distinct_2.out
new file mode 100644
index 0000000000..28080ecc21
--- /dev/null
+++ b/src/test/regress/expected/select_distinct_2.out
@@ -0,0 +1,237 @@
+create table select_distinct_a(a int, b char(20), c char(20) not null, d int, e int, primary key(a, b));
+set enable_mergejoin to off;
+set enable_hashjoin to off;
+-- no node for distinct.
+explain (costs off) select distinct * from select_distinct_a;
+ QUERY PLAN
+-------------------------------
+ Seq Scan on select_distinct_a
+(1 row)
+
+explain (costs off) select distinct b,c,d,e from select_distinct_a;
+ QUERY PLAN
+-------------------------------------
+ HashAggregate
+ Group Key: b, c, d, e
+ -> Seq Scan on select_distinct_a
+(3 rows)
+
+create unique index select_distinct_a_uk on select_distinct_a(c, d);
+explain (costs off) select distinct b,c,d,e from select_distinct_a where c is not null;
+ QUERY PLAN
+-------------------------------------
+ HashAggregate
+ Group Key: b, c, d, e
+ -> Seq Scan on select_distinct_a
+ Filter: (c IS NOT NULL)
+(4 rows)
+
+explain (costs off) select distinct b,c,d,e from select_distinct_a where c is not null and d is not null;
+ QUERY PLAN
+-------------------------------------------------
+ Seq Scan on select_distinct_a
+ Filter: ((c IS NOT NULL) AND (d IS NOT NULL))
+(2 rows)
+
+explain select distinct d, e from select_distinct_a group by d, e;
+ QUERY PLAN
+--------------------------------------------------------------------------
+ HashAggregate (cost=15.85..17.85 rows=200 width=8)
+ Group Key: d, e
+ -> Seq Scan on select_distinct_a (cost=0.00..13.90 rows=390 width=8)
+(3 rows)
+
+create table select_distinct_b(a int, b char(20), c char(20) not null, d int, e int, primary key(a, b));
+explain (costs off) select distinct * from select_distinct_a a, select_distinct_b b;
+ QUERY PLAN
+---------------------------------------------
+ Nested Loop
+ -> Seq Scan on select_distinct_a a
+ -> Materialize
+ -> Seq Scan on select_distinct_b b
+(4 rows)
+
+insert into select_distinct_a values(1, 'a', 'a', 0, 1), (2, 'b', 'A', 0, 2), (3, 'c', 'c', 0, 3);
+insert into select_distinct_b values(1, 'a', 'a', 0, 1), (4, 'd', 'd', 0, 4), (5, 'e', 'e', 0, 5);
+explain (costs off) select distinct a.c, a.d, b.a, b.b from select_distinct_a a, select_distinct_b b where a.d is not null order by 1;
+ QUERY PLAN
+-------------------------------------------------------------------------
+ Nested Loop
+ -> Index Only Scan using select_distinct_a_uk on select_distinct_a a
+ Index Cond: (d IS NOT NULL)
+ -> Materialize
+ -> Seq Scan on select_distinct_b b
+(5 rows)
+
+-- left join
+explain
+select distinct a.a, a.b, b.a, b.b from select_distinct_a a left join select_distinct_b b on (a.a = b.a)
+order by 1;
+ QUERY PLAN
+--------------------------------------------------------------------------------------------------------------------
+ Sort (cost=177.24..179.14 rows=760 width=176)
+ Sort Key: a.a
+ -> Nested Loop Left Join (cost=0.15..140.88 rows=760 width=176)
+ -> Seq Scan on select_distinct_a a (cost=0.00..13.90 rows=390 width=88)
+ -> Index Only Scan using select_distinct_b_pkey on select_distinct_b b (cost=0.15..0.31 rows=2 width=88)
+ Index Cond: (a = a.a)
+(6 rows)
+
+select distinct a.a, a.b, b.a, b.b from select_distinct_a a left join select_distinct_b b on (a.a = b.a)
+order by 1;
+ a | b | a | b
+---+----------------------+---+----------------------
+ 1 | a | 1 | a
+ 2 | b | |
+ 3 | c | |
+(3 rows)
+
+-- full join
+explain
+select distinct a.a, a.b, b.a, b.b from select_distinct_a a full outer join select_distinct_b b on (a.a = b.a)
+order by 1;
+ QUERY PLAN
+-----------------------------------------------------------------------------------------
+ Sort (cost=10000000096.63..10000000098.53 rows=760 width=176)
+ Sort Key: a.a
+ -> Hash Full Join (cost=10000000018.77..10000000060.26 rows=760 width=176)
+ Hash Cond: (a.a = b.a)
+ -> Seq Scan on select_distinct_a a (cost=0.00..13.90 rows=390 width=88)
+ -> Hash (cost=13.90..13.90 rows=390 width=88)
+ -> Seq Scan on select_distinct_b b (cost=0.00..13.90 rows=390 width=88)
+(7 rows)
+
+select distinct a.a, a.b, b.a, b.b from select_distinct_a a full outer join select_distinct_b b on (a.a = b.a)
+order by 1;
+ a | b | a | b
+---+----------------------+---+----------------------
+ 1 | a | 1 | a
+ 2 | b | |
+ 3 | c | |
+ | | 5 | e
+ | | 4 | d
+(5 rows)
+
+-- Cartesian join
+explain (costs off) select distinct a.b, a.c, b.a, b.b from select_distinct_a a, select_distinct_b b order by 1;
+ QUERY PLAN
+---------------------------------------------------------
+ Unique
+ -> Sort
+ Sort Key: a.b, a.c, b.a, b.b
+ -> Nested Loop
+ -> Seq Scan on select_distinct_a a
+ -> Materialize
+ -> Seq Scan on select_distinct_b b
+(7 rows)
+
+select distinct a.b, a.c, b.a, b.b from select_distinct_a a, select_distinct_b b order by 1;
+ b | c | a | b
+----------------------+----------------------+---+----------------------
+ a | a | 1 | a
+ a | a | 4 | d
+ a | a | 5 | e
+ b | A | 1 | a
+ b | A | 4 | d
+ b | A | 5 | e
+ c | c | 1 | a
+ c | c | 4 | d
+ c | c | 5 | e
+(9 rows)
+
+explain (costs off) select distinct a.d, a.c, b.a, b.b from select_distinct_a a, select_distinct_b b where a.d is not null;
+ QUERY PLAN
+---------------------------------------------
+ Nested Loop
+ -> Seq Scan on select_distinct_b b
+ -> Materialize
+ -> Seq Scan on select_distinct_a a
+ Filter: (d IS NOT NULL)
+(5 rows)
+
+-- Semi/anti join
+explain (costs off) select distinct a, b from select_distinct_a where a in (select a from select_distinct_b);
+ QUERY PLAN
+-------------------------------------------------------------------------
+ Nested Loop Semi Join
+ -> Seq Scan on select_distinct_a
+ -> Index Only Scan using select_distinct_b_pkey on select_distinct_b
+ Index Cond: (a = select_distinct_a.a)
+(4 rows)
+
+explain (costs off) select distinct a, b from select_distinct_a where a not in (select a from select_distinct_b);
+ QUERY PLAN
+---------------------------------------
+ Seq Scan on select_distinct_a
+ Filter: (NOT (hashed SubPlan 1))
+ SubPlan 1
+ -> Seq Scan on select_distinct_b
+(4 rows)
+
+-- group
+explain (costs off) select distinct a.d, b.a from select_distinct_a a, select_distinct_b b group by a.d, b.a;
+ QUERY PLAN
+---------------------------------------------------
+ HashAggregate
+ Group Key: a.d, b.a
+ -> Nested Loop
+ -> Seq Scan on select_distinct_a a
+ -> Materialize
+ -> Seq Scan on select_distinct_b b
+(6 rows)
+
+-- If there is a subquery in the rangetable, we can't handle it yet.
+explain (costs off) select distinct * from select_distinct_a a, (select a, max(b) as b from select_distinct_b group by a) b
+where a.a in (select a from select_distinct_b)
+and a.b = b.b;
+ QUERY PLAN
+-------------------------------------------------------------------------------------------------
+ Unique
+ -> Sort
+ Sort Key: a.a, a.b, a.c, a.d, a.e, select_distinct_b_1.a
+ -> Nested Loop
+ Join Filter: (a.b = (max(select_distinct_b_1.b)))
+ -> HashAggregate
+ Group Key: select_distinct_b_1.a
+ -> Seq Scan on select_distinct_b select_distinct_b_1
+ -> Materialize
+ -> Nested Loop Semi Join
+ -> Seq Scan on select_distinct_a a
+ -> Index Only Scan using select_distinct_b_pkey on select_distinct_b
+ Index Cond: (a = a.a)
+(13 rows)
+
+-- Distinct On
+explain (costs off) select distinct on(a) a, b from select_distinct_a;
+ QUERY PLAN
+-------------------------------------------
+ Unique
+ -> Sort
+ Sort Key: a
+ -> Seq Scan on select_distinct_a
+(4 rows)
+
+explain (costs off) select distinct on(a, b) a, b from select_distinct_a;
+ QUERY PLAN
+-------------------------------
+ Seq Scan on select_distinct_a
+(1 row)
+
+create view v_distinct as select distinct a, b from select_distinct_a;
+explain select * from v_distinct;
+ QUERY PLAN
+---------------------------------------------------------------------
+ Seq Scan on select_distinct_a (cost=0.00..13.90 rows=390 width=88)
+(1 row)
+
+select * from v_distinct;
+ a | b
+---+----------------------
+ 1 | a
+ 2 | b
+ 3 | c
+(3 rows)
+
+drop view v_distinct;
+drop table select_distinct_a;
+drop table select_distinct_b;
diff --git a/src/test/regress/expected/sysviews.out b/src/test/regress/expected/sysviews.out
index a1c90eb905..e053214f9d 100644
--- a/src/test/regress/expected/sysviews.out
+++ b/src/test/regress/expected/sysviews.out
@@ -73,6 +73,7 @@ select name, setting from pg_settings where name like 'enable%';
name | setting
--------------------------------+---------
enable_bitmapscan | on
+ enable_distinct_elimination | on
enable_gathermerge | on
enable_hashagg | on
enable_hashjoin | on
@@ -89,7 +90,7 @@ select name, setting from pg_settings where name like 'enable%';
enable_seqscan | on
enable_sort | on
enable_tidscan | on
-(17 rows)
+(18 rows)
-- Test that the pg_timezone_names and pg_timezone_abbrevs views are
-- more-or-less working. We can't test their contents in any great detail
diff --git a/src/test/regress/sql/select_distinct_2.sql b/src/test/regress/sql/select_distinct_2.sql
new file mode 100644
index 0000000000..3d124462dc
--- /dev/null
+++ b/src/test/regress/sql/select_distinct_2.sql
@@ -0,0 +1,81 @@
+create table select_distinct_a(a int, b char(20), c char(20) not null, d int, e int, primary key(a, b));
+
+set enable_mergejoin to off;
+set enable_hashjoin to off;
+
+-- no node for distinct.
+explain (costs off) select distinct * from select_distinct_a;
+
+explain (costs off) select distinct b,c,d,e from select_distinct_a;
+
+create unique index select_distinct_a_uk on select_distinct_a(c, d);
+
+explain (costs off) select distinct b,c,d,e from select_distinct_a where c is not null;
+
+explain (costs off) select distinct b,c,d,e from select_distinct_a where c is not null and d is not null;
+
+explain select distinct d, e from select_distinct_a group by d, e;
+
+
+create table select_distinct_b(a int, b char(20), c char(20) not null, d int, e int, primary key(a, b));
+
+explain (costs off) select distinct * from select_distinct_a a, select_distinct_b b;
+
+
+
+insert into select_distinct_a values(1, 'a', 'a', 0, 1), (2, 'b', 'A', 0, 2), (3, 'c', 'c', 0, 3);
+insert into select_distinct_b values(1, 'a', 'a', 0, 1), (4, 'd', 'd', 0, 4), (5, 'e', 'e', 0, 5);
+
+explain (costs off) select distinct a.c, a.d, b.a, b.b from select_distinct_a a, select_distinct_b b where a.d is not null order by 1;
+
+
+-- left join
+explain
+select distinct a.a, a.b, b.a, b.b from select_distinct_a a left join select_distinct_b b on (a.a = b.a)
+order by 1;
+
+select distinct a.a, a.b, b.a, b.b from select_distinct_a a left join select_distinct_b b on (a.a = b.a)
+order by 1;
+
+-- full join
+explain
+select distinct a.a, a.b, b.a, b.b from select_distinct_a a full outer join select_distinct_b b on (a.a = b.a)
+order by 1;
+
+select distinct a.a, a.b, b.a, b.b from select_distinct_a a full outer join select_distinct_b b on (a.a = b.a)
+order by 1;
+
+-- Cartesian join
+explain (costs off) select distinct a.b, a.c, b.a, b.b from select_distinct_a a, select_distinct_b b order by 1;
+
+select distinct a.b, a.c, b.a, b.b from select_distinct_a a, select_distinct_b b order by 1;
+
+explain (costs off) select distinct a.d, a.c, b.a, b.b from select_distinct_a a, select_distinct_b b where a.d is not null;
+
+-- Semi/anti join
+explain (costs off) select distinct a, b from select_distinct_a where a in (select a from select_distinct_b);
+
+explain (costs off) select distinct a, b from select_distinct_a where a not in (select a from select_distinct_b);
+
+-- group
+explain (costs off) select distinct a.d, b.a from select_distinct_a a, select_distinct_b b group by a.d, b.a;
+
+-- If there is a subquery in the rangetable, we can't handle it yet.
+explain (costs off) select distinct * from select_distinct_a a, (select a, max(b) as b from select_distinct_b group by a) b
+where a.a in (select a from select_distinct_b)
+and a.b = b.b;
+
+-- Distinct On
+explain (costs off) select distinct on(a) a, b from select_distinct_a;
+explain (costs off) select distinct on(a, b) a, b from select_distinct_a;
+
+create view v_distinct as select distinct a, b from select_distinct_a;
+
+explain select * from v_distinct;
+select * from v_distinct;
+
+drop view v_distinct;
+drop table select_distinct_a;
+drop table select_distinct_b;
+
+
--
2.20.1 (Apple Git-117)
On Tue, Feb 11, 2020 at 08:14:14PM +0800, Andy Fan wrote:
On Tue, Feb 11, 2020 at 3:56 PM Julien Rouhaud <rjuju123@gmail.com> wrote:
and if we prepare sql outside a transaction, and execute it in the
transaction, the other session can't drop the constraint until the
transaction is ended.

And what if you create a view on top of a query containing a distinct clause
rather than using prepared statements? FWIW your patch doesn't handle such a
case at all, without even needing to drop constraints:

CREATE TABLE t (a int primary key, b int not null, c int);
INSERT INTO t VALUEs(1, 1, 1), (2, 2, 2);
CREATE UNIQUE INDEX t_idx1 on t(b);
CREATE VIEW v1 AS SELECT DISTINCT b FROM t;
EXPLAIN SELECT * FROM v1;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.

This error can be fixed with

-    num_of_rtables = bms_num_members(non_semi_anti_relids);
+    num_of_rtables = list_length(query->rtable);

This test case is also added into the patch.
I also think this is not the right way to handle this optimization.
Do you have any other concerns?
Yes, it seems to be broken as soon as you alter the view's underlying table:
=# CREATE TABLE t (a int primary key, b int not null, c int);
CREATE TABLE
=# INSERT INTO t VALUEs(1, 1, 1), (2, 2, 2);
INSERT 0 2
=# CREATE UNIQUE INDEX t_idx1 on t(b);
CREATE INDEX
=# CREATE VIEW v1 AS SELECT DISTINCT b FROM t;
CREATE VIEW
=# EXPLAIN SELECT * FROM v1;
QUERY PLAN
-------------------------------------------------
Seq Scan on t (cost=0.00..1.02 rows=2 width=4)
(1 row)
=# EXPLAIN SELECT DISTINCT b FROM t;
QUERY PLAN
-------------------------------------------------
Seq Scan on t (cost=0.00..1.02 rows=2 width=4)
(1 row)
=# ALTER TABLE t ALTER COLUMN b DROP NOT NULL;
ALTER TABLE
=# EXPLAIN SELECT * FROM v1;
QUERY PLAN
-------------------------------------------------
Seq Scan on t (cost=0.00..1.02 rows=2 width=4)
(1 row)
=# EXPLAIN SELECT DISTINCT b FROM t;
QUERY PLAN
-------------------------------------------------------------
Unique (cost=1.03..1.04 rows=2 width=4)
-> Sort (cost=1.03..1.03 rows=2 width=4)
Sort Key: b
-> Seq Scan on t (cost=0.00..1.02 rows=2 width=4)
(4 rows)
On Mon, Feb 10, 2020 at 10:57 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> writes:
On Sat, Feb 8, 2020 at 12:53 PM Andy Fan <zhihui.fan1213@gmail.com> wrote:

Do you mean adding some information into PlannerInfo, and when we create
a node for Unique/HashAggregate/Group, we can just create a dummy node?
Not so much as PlannerInfo but something on lines of PathKey. See PathKey
structure and related code. What I envision is PathKey class is also
annotated with the information whether that PathKey implies uniqueness.
E.g. a PathKey derived from a primary index would imply uniqueness also. A
PathKey derived from, say, a Group operation also implies uniqueness. Then just
by looking at the underlying Path we would be able to say whether we need a
Group/Unique node on top of it or not. I think that would make it a much
wider usecase and a very useful optimization.

FWIW, that doesn't seem like a very prudent approach to me, because it
confuses sorted-ness with unique-ness. PathKeys are about sorting,
but it's possible to have uniqueness guarantees without having sorted
anything, for instance via hashed grouping.
I haven't looked at this patch, but I'd expect it to use infrastructure
related to query_is_distinct_for(), and that doesn't deal in PathKeys.

Thanks for the pointer. I think there's another problem with my approach.
PathKeys are specific to paths since the order of the result depends upon
the Path. But uniqueness is a property of the result i.e. relation and thus
should be attached to RelOptInfo as query_is_distinct_for() does. I think
uniquness should bubble up the RelOptInfo tree, annotating each RelOptInfo
with the minimum set of TLEs which make the result from that relation
unique. Thus we could eliminate extra Group/Unique node if the underlying
RelOptInfo's unique column set is subset of required uniqueness.
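To make the intended effect concrete, here is an illustrative SQL sketch
(the table names are made up; the planner machinery itself is not shown).
If each input relation is already unique on its joined and output columns,
no Unique node should be needed above the join:

create table t1 (id int primary key, v int);
create table t2 (id int primary key, v int);
-- both inputs are unique on id, so a planner that tracks uniqueness per
-- RelOptInfo could skip the Unique/HashAggregate step entirely:
explain select distinct t1.id, t2.id from t1 join t2 on t1.id = t2.id;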
--
Best Wishes,
Ashutosh Bapat
On Tue, Feb 11, 2020 at 8:27 AM Andy Fan <zhihui.fan1213@gmail.com> wrote:
On Tue, Feb 11, 2020 at 12:22 AM Ashutosh Bapat <
ashutosh.bapat.oss@gmail.com> wrote:

[PATCH] Erase the distinctClause if the result is unique by definition

I forgot to mention this in the last round of comments. Your patch was
actually removing distinctClause from the Query structure. Please avoid
doing that. If you remove it, you are also removing the evidence that this
Query had a DISTINCT clause in it.

Yes, I removed it because it is the easiest way to do it. What is the
purpose of keeping the evidence?
Julien's example provides an explanation for this. The Query structure is
serialised into a view definition. Removing distinctClause from there means
that the view will never try to produce unique results.
Suppose after a DDL, the prepared statement needs to be re-parsed/re-planned
if it has not been executed yet, or it will prevent the DDL from happening.
The query will be replanned. I am not sure about reparsed though.
-- session 2
postgres=# alter table t alter column b drop not null;
ALTER TABLE

-- session 1:
postgres=# explain execute st(1);
QUERY PLAN
-------------------------------------------------------------
Unique (cost=1.03..1.04 rows=1 width=4)
-> Sort (cost=1.03..1.04 rows=1 width=4)
Sort Key: b
-> Seq Scan on t (cost=0.00..1.02 rows=1 width=4)
Filter: (c = $1)
(5 rows)
Since this prepared statement is parameterised, PostgreSQL is replanning it
every time it gets executed. It's not using a stored prepared plan. Try
without parameters. Also make sure that a prepared plan is used for
execution and not a new plan.
--
Best Wishes,
Ashutosh Bapat
On Tue, Feb 11, 2020 at 10:06:17PM +0530, Ashutosh Bapat wrote:
On Tue, Feb 11, 2020 at 8:27 AM Andy Fan <zhihui.fan1213@gmail.com> wrote:
On Tue, Feb 11, 2020 at 12:22 AM Ashutosh Bapat <
ashutosh.bapat.oss@gmail.com> wrote:

[PATCH] Erase the distinctClause if the result is unique by definition

I forgot to mention this in the last round of comments. Your patch was
actually removing distinctClause from the Query structure. Please avoid
doing that. If you remove it, you are also removing the evidence that this
Query had a DISTINCT clause in it.

Yes, I removed it because it is the easiest way to do it. What is the
purpose of keeping the evidence?

Julien's example provides an explanation for this. The Query structure is
serialised into a view definition. Removing distinctClause from there means
that the view will never try to produce unique results.
And also I think that this approach will have a lot of other unexpected side
effects. Isn't changing the Query going to affect pg_stat_statements queryid
computation, for instance?
On Thu, Feb 13, 2020 at 5:39 PM Julien Rouhaud <rjuju123@gmail.com> wrote:
On Tue, Feb 11, 2020 at 10:06:17PM +0530, Ashutosh Bapat wrote:
On Tue, Feb 11, 2020 at 8:27 AM Andy Fan <zhihui.fan1213@gmail.com> wrote:
On Tue, Feb 11, 2020 at 12:22 AM Ashutosh Bapat <
ashutosh.bapat.oss@gmail.com> wrote:

[PATCH] Erase the distinctClause if the result is unique by definition

I forgot to mention this in the last round of comments. Your patch was
actually removing distinctClause from the Query structure. Please avoid
doing that. If you remove it, you are also removing the evidence that this
Query had a DISTINCT clause in it.

Yes, I removed it because it is the easiest way to do it. What is the
purpose of keeping the evidence?

Julien's example provides an explanation for this. The Query structure is
serialised into a view definition. Removing distinctClause from there means
that the view will never try to produce unique results.

And also I think that this approach will have a lot of other unexpected side
effects. Isn't changing the Query going to affect pg_stat_statements queryid
computation, for instance?
Thanks, the two factors above are pretty valuable, so erasing the
distinctClause is not reasonable. I will try another way.
On Wed, Feb 12, 2020 at 12:36 AM Ashutosh Bapat <
ashutosh.bapat.oss@gmail.com> wrote:
On Tue, Feb 11, 2020 at 8:27 AM Andy Fan <zhihui.fan1213@gmail.com> wrote:
On Tue, Feb 11, 2020 at 12:22 AM Ashutosh Bapat <
ashutosh.bapat.oss@gmail.com> wrote:

[PATCH] Erase the distinctClause if the result is unique by definition

I forgot to mention this in the last round of comments. Your patch was
actually removing distinctClause from the Query structure. Please avoid
doing that. If you remove it, you are also removing the evidence that this
Query had a DISTINCT clause in it.

Yes, I removed it because it is the easiest way to do it. What is the
purpose of keeping the evidence?

Julien's example provides an explanation for this. The Query structure is
serialised into a view definition. Removing distinctClause from there means
that the view will never try to produce unique results.
Actually that is not true. If a view is used in a query, its definition is
*copied* into the query tree, so if we modify the query tree, the definition
of the view is never touched. The issue Julien reported was caused by a typo.
-- session 2
postgres=# alter table t alter column b drop not null;
ALTER TABLE

-- session 1:
postgres=# explain execute st(1);
QUERY PLAN
-------------------------------------------------------------
Unique (cost=1.03..1.04 rows=1 width=4)
-> Sort (cost=1.03..1.04 rows=1 width=4)
Sort Key: b
-> Seq Scan on t (cost=0.00..1.02 rows=1 width=4)
Filter: (c = $1)
(5 rows)

Since this prepared statement is parameterised, PostgreSQL is replanning
every time it gets executed. It's not using a stored prepared plan. Try
without parameters. Also make sure that a prepared plan is used for
execution and not a new plan.
Even for a parameterised prepared statement, it is still possible to generate
a generic plan, so it will not be replanned every time. But generic plan or
not, after a DDL like changing a NOT NULL constraint, PG will generate a plan
based on the stored query tree. However, the query tree is *copied* again to
generate the new plan, so even if I modified the query tree, everything would
still be OK.
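For what it's worth, the point can be checked with a session sketch like the
following (illustrative only; the table and statement names are made up,
reusing the earlier example):

create table t (a int primary key, b int not null, c int);
create unique index t_idx1 on t(b);
prepare st as select distinct b from t;  -- no parameters, so a generic plan can be stored
explain execute st;   -- no Unique node: b is unique and not null
alter table t alter column b drop not null;
explain execute st;   -- the DDL invalidates the cached plan; the fresh plan
                      -- must show the Unique/Sort nodes again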
In the end, I agree that modifying the query tree is not a good idea,
so my updated patch doesn't do that any more.
Hi All:
Here is the updated patch. It uses some functions from
query_is_distinct_for. I check the query's distinctness in
create_distinct_paths; if the result is already distinct, the distinct
paths are simply not generated, so the query tree itself is left untouched.
Please see if you have any comments. Thanks.
On Mon, Feb 24, 2020 at 8:38 PM Andy Fan <zhihui.fan1213@gmail.com> wrote:
Attachments:
0001-Erase-the-distinctClause-if-the-result-is-unique-by-.patchapplication/octet-stream; name=0001-Erase-the-distinctClause-if-the-result-is-unique-by-.patchDownload
From 1d4fb1214b3f6b216654825985fb3ef2c5e045e8 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=E4=B8=80=E6=8C=83?= <yizhi.fzh@alibaba-inc.com>
Date: Mon, 24 Feb 2020 20:16:42 +0800
Subject: [PATCH] Erase the distinctClause if the result is unique by
definition
For a single relation, we can tell this if any one of the following
is true:
1. The PK is in the target list.
2. A UK is in the target list and its columns are not null.
3. The columns in the GROUP BY clause are also in the target list.
For a relation join, we can tell this as follows:
if every relation in the jointree yields a unique result set, then
the final result is unique as well, regardless of the join method.
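For illustration (not part of the patch; the names are made up), each rule
maps to a query whose DISTINCT becomes a no-op:

create table t (pk1 int, pk2 int, uk int, g int, primary key (pk1, pk2));
create unique index t_uk on t (uk);
select distinct pk1, pk2, g from t;              -- rule 1: PK in target list
select distinct uk from t where uk is not null;  -- rule 2: UK + not null
select distinct g from t group by g;             -- rule 3: grouped column in target list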
---
src/backend/nodes/bitmapset.c | 18 ++
src/backend/optimizer/path/costsize.c | 1 +
src/backend/optimizer/plan/analyzejoins.c | 246 +++++++++++++++-
src/backend/optimizer/plan/planner.c | 28 ++
src/backend/utils/misc/guc.c | 10 +
src/include/nodes/bitmapset.h | 1 +
src/include/optimizer/cost.h | 1 +
src/include/optimizer/planmain.h | 2 +
src/test/regress/expected/aggregates.out | 13 +-
src/test/regress/expected/join.out | 16 +-
.../regress/expected/select_distinct_2.out | 276 ++++++++++++++++++
src/test/regress/expected/sysviews.out | 3 +-
src/test/regress/sql/select_distinct_2.sql | 84 ++++++
13 files changed, 679 insertions(+), 20 deletions(-)
create mode 100644 src/test/regress/expected/select_distinct_2.out
create mode 100644 src/test/regress/sql/select_distinct_2.sql
diff --git a/src/backend/nodes/bitmapset.c b/src/backend/nodes/bitmapset.c
index 648cc1a7eb..5cb6924a29 100644
--- a/src/backend/nodes/bitmapset.c
+++ b/src/backend/nodes/bitmapset.c
@@ -1167,3 +1167,21 @@ bms_hash_value(const Bitmapset *a)
return DatumGetUInt32(hash_any((const unsigned char *) a->words,
(lastword + 1) * sizeof(bitmapword)));
}
+
+/*
+ * bms_array_free
+ *
+ * free each element of the array, then free the array itself
+ */
+void
+bms_array_free(Bitmapset **bms_array, int len)
+{
+ int idx;
+ if (bms_array == NULL)
+ return;
+ for(idx = 0 ; idx < len; idx++)
+ {
+ bms_free(bms_array[idx]);
+ }
+ pfree(bms_array);
+}
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index b5a0033721..dde16b5d44 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -138,6 +138,7 @@ bool enable_partitionwise_aggregate = false;
bool enable_parallel_append = true;
bool enable_parallel_hash = true;
bool enable_partition_pruning = true;
+bool enable_distinct_elimination = true;
typedef struct
{
diff --git a/src/backend/optimizer/plan/analyzejoins.c b/src/backend/optimizer/plan/analyzejoins.c
index d0ff660284..a15ba80808 100644
--- a/src/backend/optimizer/plan/analyzejoins.c
+++ b/src/backend/optimizer/plan/analyzejoins.c
@@ -22,6 +22,7 @@
*/
#include "postgres.h"
+#include "access/relation.h"
#include "nodes/nodeFuncs.h"
#include "optimizer/clauses.h"
#include "optimizer/joininfo.h"
@@ -30,7 +31,10 @@
#include "optimizer/paths.h"
#include "optimizer/planmain.h"
#include "optimizer/tlist.h"
+#include "parser/parsetree.h"
#include "utils/lsyscache.h"
+#include "utils/rel.h"
+#include "utils/relcache.h"
/* local functions */
static bool join_is_removable(PlannerInfo *root, SpecialJoinInfo *sjinfo);
@@ -801,9 +805,248 @@ query_is_distinct_for(Query *query, List *colnos, List *opids)
if (l == NULL) /* had matches for all? */
return true;
}
+ return query_is_distinct_agg(query, colnos, opids);
+}
+
+/*
+ * scan_non_semi_anti_relids
+ *
+ * scan the jointree to collect rtindexes, skipping the inner side of semi/anti joins.
+ */
+static void
+scan_non_semi_anti_relids(Node* jtnode, Relids* relids)
+{
+ if (jtnode == NULL)
+ return;
+
+ if (IsA(jtnode, RangeTblRef))
+ {
+ int varno = ((RangeTblRef *) jtnode)->rtindex;
+
+ *relids = bms_add_member(*relids, varno);
+ }
+ else if (IsA(jtnode, FromExpr))
+ {
+ FromExpr *f = (FromExpr *) jtnode;
+ ListCell *l;
+
+ foreach(l, f->fromlist)
+ scan_non_semi_anti_relids(lfirst(l), relids);
+ }
+ else if (IsA(jtnode, JoinExpr))
+ {
+ JoinExpr *j = (JoinExpr *) jtnode;
+
+ scan_non_semi_anti_relids(j->larg, relids);
+ if (j->jointype != JOIN_SEMI && j->jointype != JOIN_ANTI)
+ {
+ scan_non_semi_anti_relids(j->rarg, relids);
+ }
+ }
+ else
+ elog(ERROR, "unrecognized node type: %d",
+ (int) nodeTag(jtnode));
+
+}
+
+/*
+ * query_distinct_through_join
+ *
+ * Check whether the result is unique after the join: if every relation
+ * yields a unique result, the joined result is unique as well.
+ */
+bool
+query_distinct_through_join(PlannerInfo *root, List *colnos, List *opids)
+{
+ RangeTblEntry *rte;
+ int rt_index;
+ Relids non_semi_anti_relids = NULL, tmp = NULL;
+ ListCell *lc1, *lc2;
+
+ /* filled from catalog metadata and find_nonnullable_vars() */
+ Bitmapset **non_null_var_per_table = NULL;
+ Query *query = root->parse;
+ int max_len;
+
+ /* used by relation_has_unique_index_for() */
+ List **non_null_expr_per_table = NULL;
+ List **non_null_opids_per_table = NULL;
+ bool ret = true;
+ int idx = 0;
+
+ if (query->hasTargetSRFs)
+ return false;
+ /* collect relids, skipping the inner side of semi/anti joins */
+ scan_non_semi_anti_relids((Node*)query->jointree, &non_semi_anti_relids);
+
+ rt_index = -1;
+ while ((rt_index = bms_next_member(non_semi_anti_relids, rt_index)) >= 0 )
+ {
+ List *indexlist = root->simple_rel_array[rt_index]->indexlist;
+ bool found = false;
+ rte = rt_fetch(rt_index, query->rtable);
+
+ /* for subqueries, we just handle some simple cases */
+ if (rte->rtekind == RTE_SUBQUERY)
+ {
+ Query *subquery = rte->subquery;
+ List *sub_opnos = NIL;
+ List *sub_opids = NIL;
+ if (!query_supports_distinctness(subquery))
+ {
+ ret = false;
+ goto done;
+ }
+
+ forboth(lc1, colnos, lc2, opids)
+ {
+ TargetEntry *tle = get_tle_by_resno(query->targetList, lfirst_int(lc1));
+ Var *var;
+ if (!IsA(tle->expr, Var))
+ continue;
+ var = (Var *) tle->expr;
+ if (var->varno == rt_index)
+ {
+ sub_opnos = lappend_int(sub_opnos, var->varattno);
+ sub_opids = lappend_oid(sub_opids, lfirst_oid(lc2));
+ }
+ }
+
+ if (query_is_distinct_for(subquery, sub_opnos, sub_opids))
+ {
+ tmp = bms_add_member(tmp, rt_index);
+ }
+ else
+ {
+ ret = false;
+ goto done;
+ }
+ }
+ else if (rte->rtekind != RTE_RELATION)
+ {
+ ret = false;
+ goto done;
+ }
+ else
+ {
+ foreach(lc1, indexlist)
+ {
+ IndexOptInfo *ind = lfirst_node(IndexOptInfo, lc1);
+ if (ind->unique && ind->immediate && (ind->indpred == NIL || ind->predOK))
+ {
+ found = true;
+ break;
+ }
+ }
+
+ /* if any relation has no pk/uk index, we can return early */
+ if (!found)
+ {
+ ret = false;
+ goto done;
+ }
+ }
+ }
+
+ non_semi_anti_relids = bms_del_members(non_semi_anti_relids, tmp);
/*
- * Otherwise, a set-returning function in the query's targetlist can
+ * Now we know every base rte is an RTE_RELATION and each one has a pk/uk index
+ */
+
+ max_len = list_length(query->rtable) + 1;
+ non_null_var_per_table = palloc0(max_len * sizeof(Bitmapset *));
+ non_null_expr_per_table = palloc0(max_len * sizeof(List *));
+ non_null_opids_per_table = palloc0(max_len * sizeof(List *));
+
+ /* fill non_null_var_per_table from the jointree quals */
+ foreach(lc1, find_nonnullable_vars(query->jointree->quals))
+ {
+ Var *var;
+
+ if (!IsA(lfirst(lc1), Var))
+ continue;
+ var = lfirst_node(Var, lc1);
+ if (var->varno == INNER_VAR ||
+ var->varno == OUTER_VAR ||
+ var->varno == INDEX_VAR)
+ continue;
+ non_null_var_per_table[var->varno] = bms_add_member(
+ non_null_var_per_table[var->varno], var->varattno);
+ }
+
+ rt_index = -1;
+ /* fill non_null_var_per_table from the catalog (attnotnull) */
+ while ((rt_index = bms_next_member(non_semi_anti_relids, rt_index)) >= 0 )
+ {
+ int attr_idx = 0;
+ Relation relation;
+ TupleDesc desc;
+
+ rte = rt_fetch(rt_index, query->rtable);
+ relation = relation_open(rte->relid, AccessShareLock);
+ desc = relation->rd_att;
+
+ for(; attr_idx < desc->natts; attr_idx++)
+ {
+ if (!desc->attrs[attr_idx].attnotnull)
+ continue;
+ non_null_var_per_table[rt_index] = bms_add_member(
+ non_null_var_per_table[rt_index], attr_idx+1);
+ }
+ relation_close(relation, AccessShareLock);
+ }
+
+ /* filter the colnos and opids down to those known to be not null */
+ forboth(lc1, colnos, lc2, opids)
+ {
+ int colno = lfirst_int(lc1);
+ TargetEntry *tle = get_tle_by_resno(query->targetList, colno);
+ Var *var = NULL;
+ /* We don't know the varno if the target tle->expr is not a Var */
+ if (!IsA(tle->expr, Var))
+ continue;
+ var = (Var *)tle->expr;
+ if (!bms_is_member(var->varattno, non_null_var_per_table[var->varno]))
+ continue;
+ non_null_expr_per_table[var->varno] = lappend(
+ non_null_expr_per_table[var->varno], tle->expr);
+ non_null_opids_per_table[var->varno] = lappend_oid(
+ non_null_opids_per_table[var->varno], lfirst_oid(lc2));
+ }
+
+ rt_index = -1;
+ while ((rt_index = bms_next_member(non_semi_anti_relids, rt_index)) >= 0 )
+ {
+
+ /* if any relation can't yield a unique result, we can return early */
+ if (!relation_has_unique_index_for(root, root->simple_rel_array[rt_index],
+ NIL,
+ non_null_expr_per_table[rt_index],
+ non_null_opids_per_table[rt_index]))
+ {
+ ret = false;
+ goto done;
+ }
+ }
+done:
+ bms_array_free(non_null_var_per_table, max_len);
+ for(idx = 0; idx < max_len; idx++)
+ {
+ list_free(non_null_expr_per_table[idx]);
+ list_free(non_null_opids_per_table[idx]);
+ }
+ return ret;
+}
+
+/*
+ * query_is_distinct_agg
+ *
+ * Check the distinctness guarantees that don't depend on unique indexes:
+ * set-returning functions in the targetlist, DISTINCT/GROUP BY clauses,
+ * aggregation and set operations. Split out of query_is_distinct_for()
+ * so it can be reused.
+ */
+bool
+query_is_distinct_agg(Query *query, List *colnos, List *opids)
+{
+ ListCell *l;
+ Oid opid;
+
+ /*
+ * a set-returning function in the query's targetlist can
* result in returning duplicate rows, despite any grouping that might
* occur before tlist evaluation. (If all tlist SRFs are within GROUP BY
* columns, it would be safe because they'd be expanded before grouping.
@@ -901,7 +1144,6 @@ query_is_distinct_for(Query *query, List *colnos, List *opids)
return true;
}
}
-
/*
* XXX Are there any other cases in which we can easily see the result
* must be distinct?
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index d6f2153593..690fab676c 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -22,8 +22,10 @@
#include "access/htup_details.h"
#include "access/parallel.h"
#include "access/sysattr.h"
+#include "access/relation.h"
#include "access/table.h"
#include "access/xact.h"
+#include "catalog/index.h"
#include "catalog/pg_constraint.h"
#include "catalog/pg_inherits.h"
#include "catalog/pg_proc.h"
@@ -35,6 +37,7 @@
#include "lib/bipartite_match.h"
#include "lib/knapsack.h"
#include "miscadmin.h"
+#include "nodes/bitmapset.h"
#include "nodes/makefuncs.h"
#include "nodes/nodeFuncs.h"
#ifdef OPTIMIZER_DEBUG
@@ -64,6 +67,8 @@
#include "utils/rel.h"
#include "utils/selfuncs.h"
#include "utils/syscache.h"
+#include "utils/typcache.h"
+
/* GUC parameters */
double cursor_tuple_fraction = DEFAULT_CURSOR_TUPLE_FRACTION;
@@ -4737,6 +4742,29 @@ create_distinct_paths(PlannerInfo *root,
Path *path;
ListCell *lc;
+ if (enable_distinct_elimination)
+ {
+ List *colnos = NIL;
+ List *opnos = NIL;
+ ListCell *lc;
+
+ Assert(parse->distinctClause != NIL);
+
+ foreach(lc, parse->distinctClause)
+ {
+ SortGroupClause *sgc = lfirst_node(SortGroupClause, lc);
+ int idx = sgc->tleSortGroupRef;
+ TargetEntry *tle = get_tle_by_resno(parse->targetList, idx);
+ if (tle->resjunk)
+ continue;
+ colnos = lappend_int(colnos, idx);
+ opnos = lappend_oid(opnos, sgc->eqop);
+ }
+
+ if (query_is_distinct_agg(parse, colnos, opnos) ||
+ query_distinct_through_join(root, colnos, opnos))
+ return input_rel;
+ }
/* For now, do all work in the (DISTINCT, NULL) upperrel */
distinct_rel = fetch_upper_rel(root, UPPERREL_DISTINCT, NULL);
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index e44f71e991..fa798dd564 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -1064,6 +1064,16 @@ static struct config_bool ConfigureNamesBool[] =
true,
NULL, NULL, NULL
},
+ {
+ {"enable_distinct_elimination", PGC_USERSET, QUERY_TUNING_METHOD,
+ gettext_noop("Enables plan-time and run-time unique elimination."),
+ gettext_noop("Allows the query planner to remove the uncecessary distinct clause."),
+ GUC_EXPLAIN
+ },
+ &enable_distinct_elimination,
+ true,
+ NULL, NULL, NULL
+ },
{
{"geqo", PGC_USERSET, QUERY_TUNING_GEQO,
gettext_noop("Enables genetic query optimization."),
diff --git a/src/include/nodes/bitmapset.h b/src/include/nodes/bitmapset.h
index b7b18a0b68..5d6f04ffa9 100644
--- a/src/include/nodes/bitmapset.h
+++ b/src/include/nodes/bitmapset.h
@@ -117,4 +117,5 @@ extern int bms_prev_member(const Bitmapset *a, int prevbit);
/* support for hashtables using Bitmapsets as keys: */
extern uint32 bms_hash_value(const Bitmapset *a);
+extern void bms_array_free(Bitmapset **bms_array, int len);
#endif /* BITMAPSET_H */
diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h
index cb012ba198..4fa5d32df6 100644
--- a/src/include/optimizer/cost.h
+++ b/src/include/optimizer/cost.h
@@ -64,6 +64,7 @@ extern PGDLLIMPORT bool enable_partitionwise_aggregate;
extern PGDLLIMPORT bool enable_parallel_append;
extern PGDLLIMPORT bool enable_parallel_hash;
extern PGDLLIMPORT bool enable_partition_pruning;
+extern PGDLLIMPORT bool enable_distinct_elimination;
extern PGDLLIMPORT int constraint_exclusion;
extern double index_pages_fetched(double tuples_fetched, BlockNumber pages,
diff --git a/src/include/optimizer/planmain.h b/src/include/optimizer/planmain.h
index eab486a621..ebd4f24577 100644
--- a/src/include/optimizer/planmain.h
+++ b/src/include/optimizer/planmain.h
@@ -100,6 +100,8 @@ extern List *remove_useless_joins(PlannerInfo *root, List *joinlist);
extern void reduce_unique_semijoins(PlannerInfo *root);
extern bool query_supports_distinctness(Query *query);
extern bool query_is_distinct_for(Query *query, List *colnos, List *opids);
+extern bool query_is_distinct_agg(Query *query, List *colnos, List *opids);
+extern bool query_distinct_through_join(PlannerInfo *root, List *colnos, List *opids);
extern bool innerrel_is_unique(PlannerInfo *root,
Relids joinrelids, Relids outerrelids, RelOptInfo *innerrel,
JoinType jointype, List *restrictlist, bool force_cache);
diff --git a/src/test/regress/expected/aggregates.out b/src/test/regress/expected/aggregates.out
index f457b5b150..6712571578 100644
--- a/src/test/regress/expected/aggregates.out
+++ b/src/test/regress/expected/aggregates.out
@@ -870,14 +870,12 @@ explain (costs off)
select distinct max(unique2) from tenk1;
QUERY PLAN
---------------------------------------------------------------------
- HashAggregate
- Group Key: $0
+ Result
InitPlan 1 (returns $0)
-> Limit
-> Index Only Scan Backward using tenk1_unique2 on tenk1
Index Cond: (unique2 IS NOT NULL)
- -> Result
-(7 rows)
+(5 rows)
select distinct max(unique2) from tenk1;
max
@@ -1036,7 +1034,7 @@ explain (costs off)
select distinct min(f1), max(f1) from minmaxtest;
QUERY PLAN
---------------------------------------------------------------------------------------------
- Unique
+ Result
InitPlan 1 (returns $0)
-> Limit
-> Merge Append
@@ -1059,10 +1057,7 @@ explain (costs off)
-> Index Only Scan using minmaxtest2i on minmaxtest2 minmaxtest_8
Index Cond: (f1 IS NOT NULL)
-> Index Only Scan Backward using minmaxtest3i on minmaxtest3 minmaxtest_9
- -> Sort
- Sort Key: ($0), ($1)
- -> Result
-(26 rows)
+(23 rows)
select distinct min(f1), max(f1) from minmaxtest;
min | max
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index 761376b007..3f6595d53b 100644
--- a/src/test/regress/expected/join.out
+++ b/src/test/regress/expected/join.out
@@ -4433,17 +4433,17 @@ select d.* from d left join (select * from b group by b.id, b.c_id) s
explain (costs off)
select d.* from d left join (select distinct * from b) s
on d.a = s.id;
- QUERY PLAN
---------------------------------------
- Merge Right Join
- Merge Cond: (b.id = d.a)
- -> Unique
- -> Sort
- Sort Key: b.id, b.c_id
- -> Seq Scan on b
+ QUERY PLAN
+---------------------------------
+ Merge Left Join
+ Merge Cond: (d.a = s.id)
-> Sort
Sort Key: d.a
-> Seq Scan on d
+ -> Sort
+ Sort Key: s.id
+ -> Subquery Scan on s
+ -> Seq Scan on b
(9 rows)
-- check join removal works when uniqueness of the join condition is enforced
diff --git a/src/test/regress/expected/select_distinct_2.out b/src/test/regress/expected/select_distinct_2.out
new file mode 100644
index 0000000000..2ece95a806
--- /dev/null
+++ b/src/test/regress/expected/select_distinct_2.out
@@ -0,0 +1,276 @@
+create table select_distinct_a(pk1 int, pk2 char(20), uk1 char(20) not null, uk2 int, e int, primary key(pk1, pk2));
+create unique index select_distinct_a_uk on select_distinct_a(uk1, uk2);
+create table select_distinct_b(a int, b char(20), pk1 char(20), pk2 int, e int, primary key(pk1, pk2));
+-- distinct erased since (pk1, pk2) is the primary key
+explain (costs off) select distinct * from select_distinct_a;
+ QUERY PLAN
+-------------------------------
+ Seq Scan on select_distinct_a
+(1 row)
+
+-- distinct can't be erased since we require all the uk columns to be not null
+explain (costs off) select distinct uk1, uk2 from select_distinct_a;
+ QUERY PLAN
+-------------------------------------
+ HashAggregate
+ Group Key: uk1, uk2
+ -> Seq Scan on select_distinct_a
+(3 rows)
+
+-- distinct erased since uk + not null
+explain (costs off) select distinct uk1, uk2 from select_distinct_a where uk2 is not null;
+ QUERY PLAN
+-------------------------------
+ Seq Scan on select_distinct_a
+ Filter: (uk2 IS NOT NULL)
+(2 rows)
+
+explain (costs off) select distinct uk1, uk2 from select_distinct_a where uk2 > 1;
+ QUERY PLAN
+-------------------------------
+ Seq Scan on select_distinct_a
+ Filter: (uk2 > 1)
+(2 rows)
+
+-- distinct erased due to group by
+explain select distinct e from select_distinct_a group by e;
+ QUERY PLAN
+--------------------------------------------------------------------------
+ HashAggregate (cost=14.88..16.88 rows=200 width=4)
+ Group Key: e
+ -> Seq Scan on select_distinct_a (cost=0.00..13.90 rows=390 width=4)
+(3 rows)
+
+-- distinct erased due to the restrictinfo
+explain select distinct uk1 from select_distinct_a where pk1 = 1 and pk2 = 'c';
+ QUERY PLAN
+-------------------------------------------------------------------------------------------------
+ Index Scan using select_distinct_a_pkey on select_distinct_a (cost=0.15..8.17 rows=1 width=84)
+ Index Cond: ((pk1 = 1) AND (pk2 = 'c'::bpchar))
+(2 rows)
+
+-- test join
+set enable_mergejoin to off;
+set enable_hashjoin to off;
+insert into select_distinct_a values(1, 'a', 'a', 0, 1), (1, 'b', 'A', 0, 2), (3, 'c', 'c', 0, 3);
+insert into select_distinct_b values(1, 'a', 'a', 0, 1), (4, 'd', 'd', 0, 4), (1, 'e', 'e', 0, 5);
+-- Cartesian join
+explain (costs off) select distinct a.uk1, a.uk2, b.pk1, b.pk2 from select_distinct_a a, select_distinct_b b where a.uk2 is not null;
+ QUERY PLAN
+---------------------------------------------
+ Nested Loop
+ -> Seq Scan on select_distinct_b b
+ -> Materialize
+ -> Seq Scan on select_distinct_a a
+ Filter: (uk2 IS NOT NULL)
+(5 rows)
+
+select distinct a.uk1, a.uk2, b.pk1, b.pk2 from select_distinct_a a, select_distinct_b b where a.uk2 is not null order by 1, 2, 3, 4;
+ uk1 | uk2 | pk1 | pk2
+----------------------+-----+----------------------+-----
+ a | 0 | a | 0
+ a | 0 | d | 0
+ a | 0 | e | 0
+ A | 0 | a | 0
+ A | 0 | d | 0
+ A | 0 | e | 0
+ c | 0 | a | 0
+ c | 0 | d | 0
+ c | 0 | e | 0
+(9 rows)
+
+-- left join
+explain select distinct a.pk1, a.pk2, b.pk1, b.pk2 from select_distinct_a a left join select_distinct_b b on (a.pk1 = b.a);
+ QUERY PLAN
+-----------------------------------------------------------------------------------
+ Nested Loop Left Join (cost=0.00..2310.28 rows=760 width=176)
+ Join Filter: (a.pk1 = b.a)
+ -> Seq Scan on select_distinct_a a (cost=0.00..13.90 rows=390 width=88)
+ -> Materialize (cost=0.00..15.85 rows=390 width=92)
+ -> Seq Scan on select_distinct_b b (cost=0.00..13.90 rows=390 width=92)
+(5 rows)
+
+select distinct a.pk1, a.pk2, b.pk1, b.pk2 from select_distinct_a a left join select_distinct_b b on (a.pk1 = b.a) order by 1, 2, 3, 4;
+ pk1 | pk2 | pk1 | pk2
+-----+----------------------+----------------------+-----
+ 1 | a | a | 0
+ 1 | a | e | 0
+ 1 | b | a | 0
+ 1 | b | e | 0
+ 3 | c | |
+(5 rows)
+
+-- right join
+explain select distinct a.pk1, a.pk2, b.pk1, b.pk2 from select_distinct_a a right join select_distinct_b b on (a.pk1 = b.a);
+ QUERY PLAN
+--------------------------------------------------------------------------------------------------------------
+ Nested Loop Left Join (cost=0.15..140.88 rows=760 width=176)
+ -> Seq Scan on select_distinct_b b (cost=0.00..13.90 rows=390 width=92)
+ -> Index Only Scan using select_distinct_a_pkey on select_distinct_a a (cost=0.15..0.31 rows=2 width=88)
+ Index Cond: (pk1 = b.a)
+(4 rows)
+
+select distinct a.pk1, a.pk2, b.pk1, b.pk2 from select_distinct_a a right join select_distinct_b b on (a.pk1 = b.a) order by 1, 2, 3, 4;
+ pk1 | pk2 | pk1 | pk2
+-----+----------------------+----------------------+-----
+ 1 | a | a | 0
+ 1 | a | e | 0
+ 1 | b | a | 0
+ 1 | b | e | 0
+ | | d | 0
+(5 rows)
+
+-- full join
+explain select distinct a.pk1, a.pk2, b.pk1, b.pk2 from select_distinct_a a full outer join select_distinct_b b on (a.pk1 = b.a);
+ QUERY PLAN
+-----------------------------------------------------------------------------------
+ Hash Full Join (cost=10000000018.77..10000000060.26 rows=760 width=176)
+ Hash Cond: (a.pk1 = b.a)
+ -> Seq Scan on select_distinct_a a (cost=0.00..13.90 rows=390 width=88)
+ -> Hash (cost=13.90..13.90 rows=390 width=92)
+ -> Seq Scan on select_distinct_b b (cost=0.00..13.90 rows=390 width=92)
+(5 rows)
+
+select distinct a.pk1, a.pk2, b.pk1, b.pk2 from select_distinct_a a full outer join select_distinct_b b on (a.pk1 = b.a) order by 1, 2, 3, 4;
+ pk1 | pk2 | pk1 | pk2
+-----+----------------------+----------------------+-----
+ 1 | a | a | 0
+ 1 | a | e | 0
+ 1 | b | a | 0
+ 1 | b | e | 0
+ 3 | c | |
+ | | d | 0
+(6 rows)
+
+-- distinct can't be erased since b.pk2 is missing
+explain select distinct a.pk1, a.pk2, b.pk1 from select_distinct_a a full outer join select_distinct_b b on (a.pk1 = b.a);
+ QUERY PLAN
+-----------------------------------------------------------------------------------------------
+ Unique (cost=10000000096.63..10000000104.23 rows=760 width=172)
+ -> Sort (cost=10000000096.63..10000000098.53 rows=760 width=172)
+ Sort Key: a.pk1, a.pk2, b.pk1
+ -> Hash Full Join (cost=10000000018.77..10000000060.26 rows=760 width=172)
+ Hash Cond: (a.pk1 = b.a)
+ -> Seq Scan on select_distinct_a a (cost=0.00..13.90 rows=390 width=88)
+ -> Hash (cost=13.90..13.90 rows=390 width=88)
+ -> Seq Scan on select_distinct_b b (cost=0.00..13.90 rows=390 width=88)
+(8 rows)
+
+-- Semi/anti join
+explain (costs off) select distinct pk1, pk2 from select_distinct_a where pk1 in (select a from select_distinct_b);
+ QUERY PLAN
+-------------------------------------------------------------------------
+ Nested Loop
+ -> HashAggregate
+ Group Key: select_distinct_b.a
+ -> Seq Scan on select_distinct_b
+ -> Index Only Scan using select_distinct_a_pkey on select_distinct_a
+ Index Cond: (pk1 = select_distinct_b.a)
+(6 rows)
+
+explain (costs off) select distinct pk1, pk2 from select_distinct_a where pk1 not in (select a from select_distinct_b);
+ QUERY PLAN
+---------------------------------------
+ Seq Scan on select_distinct_a
+ Filter: (NOT (hashed SubPlan 1))
+ SubPlan 1
+ -> Seq Scan on select_distinct_b
+(4 rows)
+
+-- we can also handle some limited subqueries
+explain select distinct * from select_distinct_a a, (select a from select_distinct_b group by a) b where a.pk1 = b.a;
+ QUERY PLAN
+----------------------------------------------------------------------------------------------------------
+ Nested Loop (cost=15.02..107.38 rows=390 width=184)
+ -> HashAggregate (cost=14.88..16.88 rows=200 width=4)
+ Group Key: select_distinct_b.a
+ -> Seq Scan on select_distinct_b (cost=0.00..13.90 rows=390 width=4)
+ -> Index Scan using select_distinct_a_pkey on select_distinct_a a (cost=0.15..0.42 rows=2 width=180)
+ Index Cond: (pk1 = select_distinct_b.a)
+(6 rows)
+
+select distinct * from select_distinct_a a, (select a from select_distinct_b group by a) b where a.pk1 = b.a order by 1, 2, 3;
+ pk1 | pk2 | uk1 | uk2 | e | a
+-----+----------------------+----------------------+-----+---+---
+ 1 | a | a | 0 | 1 | 1
+ 1 | b | A | 0 | 2 | 1
+(2 rows)
+
+explain select distinct * from select_distinct_a a, (select distinct a from select_distinct_b) b where a.pk1 = b.a;
+ QUERY PLAN
+----------------------------------------------------------------------------------------------------------
+ Nested Loop (cost=15.02..107.38 rows=390 width=184)
+ -> HashAggregate (cost=14.88..16.88 rows=200 width=4)
+ Group Key: select_distinct_b.a
+ -> Seq Scan on select_distinct_b (cost=0.00..13.90 rows=390 width=4)
+ -> Index Scan using select_distinct_a_pkey on select_distinct_a a (cost=0.15..0.42 rows=2 width=180)
+ Index Cond: (pk1 = select_distinct_b.a)
+(6 rows)
+
+select distinct * from select_distinct_a a, (select distinct a from select_distinct_b) b where a.pk1 = b.a order by 1 ,2, 3;
+ pk1 | pk2 | uk1 | uk2 | e | a
+-----+----------------------+----------------------+-----+---+---
+ 1 | a | a | 0 | 1 | 1
+ 1 | b | A | 0 | 2 | 1
+(2 rows)
+
+-- Distinct On
+-- can't be erased since pk2 is missing
+explain (costs off) select distinct on(pk1) pk1, pk2 from select_distinct_a;
+ QUERY PLAN
+-------------------------------------------
+ Unique
+ -> Sort
+ Sort Key: pk1
+ -> Seq Scan on select_distinct_a
+(4 rows)
+
+-- ok to erase
+explain (costs off) select distinct on(pk1, pk2) pk1, pk2 from select_distinct_a;
+ QUERY PLAN
+-------------------------------
+ Seq Scan on select_distinct_a
+(1 row)
+
+-- test a view.
+create view distinct_v1 as select distinct uk1, uk2 from select_distinct_a where uk2 is not null;
+explain select * from distinct_v1;
+ QUERY PLAN
+---------------------------------------------------------------------
+ Seq Scan on select_distinct_a (cost=0.00..13.90 rows=388 width=88)
+ Filter: (uk2 IS NOT NULL)
+(2 rows)
+
+alter table select_distinct_a alter column uk1 drop not null;
+explain select * from distinct_v1;
+ QUERY PLAN
+---------------------------------------------------------------------------
+ HashAggregate (cost=15.84..17.84 rows=200 width=88)
+ Group Key: select_distinct_a.uk1, select_distinct_a.uk2
+ -> Seq Scan on select_distinct_a (cost=0.00..13.90 rows=388 width=88)
+ Filter: (uk2 IS NOT NULL)
+(4 rows)
+
+alter table select_distinct_a alter column uk1 set not null;
+-- test generic plan
+prepare pt as select * from distinct_v1;
+explain execute pt;
+ QUERY PLAN
+---------------------------------------------------------------------
+ Seq Scan on select_distinct_a (cost=0.00..13.90 rows=388 width=88)
+ Filter: (uk2 IS NOT NULL)
+(2 rows)
+
+alter table select_distinct_a alter column uk1 drop not null;
+explain execute pt;
+ QUERY PLAN
+---------------------------------------------------------------------------
+ HashAggregate (cost=15.84..17.84 rows=200 width=88)
+ Group Key: select_distinct_a.uk1, select_distinct_a.uk2
+ -> Seq Scan on select_distinct_a (cost=0.00..13.90 rows=388 width=88)
+ Filter: (uk2 IS NOT NULL)
+(4 rows)
+
+drop view distinct_v1;
+drop table select_distinct_a;
+drop table select_distinct_b;
diff --git a/src/test/regress/expected/sysviews.out b/src/test/regress/expected/sysviews.out
index a1c90eb905..e053214f9d 100644
--- a/src/test/regress/expected/sysviews.out
+++ b/src/test/regress/expected/sysviews.out
@@ -73,6 +73,7 @@ select name, setting from pg_settings where name like 'enable%';
name | setting
--------------------------------+---------
enable_bitmapscan | on
+ enable_distinct_elimination | on
enable_gathermerge | on
enable_hashagg | on
enable_hashjoin | on
@@ -89,7 +90,7 @@ select name, setting from pg_settings where name like 'enable%';
enable_seqscan | on
enable_sort | on
enable_tidscan | on
-(17 rows)
+(18 rows)
-- Test that the pg_timezone_names and pg_timezone_abbrevs views are
-- more-or-less working. We can't test their contents in any great detail
diff --git a/src/test/regress/sql/select_distinct_2.sql b/src/test/regress/sql/select_distinct_2.sql
new file mode 100644
index 0000000000..2fc54e7e36
--- /dev/null
+++ b/src/test/regress/sql/select_distinct_2.sql
@@ -0,0 +1,84 @@
+create table select_distinct_a(pk1 int, pk2 char(20), uk1 char(20) not null, uk2 int, e int, primary key(pk1, pk2));
+create unique index select_distinct_a_uk on select_distinct_a(uk1, uk2);
+create table select_distinct_b(a int, b char(20), pk1 char(20), pk2 int, e int, primary key(pk1, pk2));
+
+-- distinct erased since (pk1, pk2)
+explain (costs off) select distinct * from select_distinct_a;
+
+-- distinct can't be erased since we require all the uk columns to be not null
+explain (costs off) select distinct uk1, uk2 from select_distinct_a;
+
+-- distinct erased since uk + not null
+explain (costs off) select distinct uk1, uk2 from select_distinct_a where uk2 is not null;
+explain (costs off) select distinct uk1, uk2 from select_distinct_a where uk2 > 1;
+
+-- distinct erased due to group by
+explain select distinct e from select_distinct_a group by e;
+
+-- distinct erased due to the restrictinfo
+explain select distinct uk1 from select_distinct_a where pk1 = 1 and pk2 = 'c';
+
+-- test join
+set enable_mergejoin to off;
+set enable_hashjoin to off;
+
+insert into select_distinct_a values(1, 'a', 'a', 0, 1), (1, 'b', 'A', 0, 2), (3, 'c', 'c', 0, 3);
+insert into select_distinct_b values(1, 'a', 'a', 0, 1), (4, 'd', 'd', 0, 4), (1, 'e', 'e', 0, 5);
+
+-- Cartesian join
+explain (costs off) select distinct a.uk1, a.uk2, b.pk1, b.pk2 from select_distinct_a a, select_distinct_b b where a.uk2 is not null;
+select distinct a.uk1, a.uk2, b.pk1, b.pk2 from select_distinct_a a, select_distinct_b b where a.uk2 is not null order by 1, 2, 3, 4;
+
+
+-- left join
+explain select distinct a.pk1, a.pk2, b.pk1, b.pk2 from select_distinct_a a left join select_distinct_b b on (a.pk1 = b.a);
+select distinct a.pk1, a.pk2, b.pk1, b.pk2 from select_distinct_a a left join select_distinct_b b on (a.pk1 = b.a) order by 1, 2, 3, 4;
+
+-- right join
+explain select distinct a.pk1, a.pk2, b.pk1, b.pk2 from select_distinct_a a right join select_distinct_b b on (a.pk1 = b.a);
+select distinct a.pk1, a.pk2, b.pk1, b.pk2 from select_distinct_a a right join select_distinct_b b on (a.pk1 = b.a) order by 1, 2, 3, 4;
+
+-- full join
+explain select distinct a.pk1, a.pk2, b.pk1, b.pk2 from select_distinct_a a full outer join select_distinct_b b on (a.pk1 = b.a);
+select distinct a.pk1, a.pk2, b.pk1, b.pk2 from select_distinct_a a full outer join select_distinct_b b on (a.pk1 = b.a) order by 1, 2, 3, 4;
+
+-- distinct can't be erased since b.pk2 is missing
+explain select distinct a.pk1, a.pk2, b.pk1 from select_distinct_a a full outer join select_distinct_b b on (a.pk1 = b.a);
+
+
+-- Semi/anti join
+explain (costs off) select distinct pk1, pk2 from select_distinct_a where pk1 in (select a from select_distinct_b);
+explain (costs off) select distinct pk1, pk2 from select_distinct_a where pk1 not in (select a from select_distinct_b);
+
+-- we can also handle some limited subqueries
+explain select distinct * from select_distinct_a a, (select a from select_distinct_b group by a) b where a.pk1 = b.a;
+select distinct * from select_distinct_a a, (select a from select_distinct_b group by a) b where a.pk1 = b.a order by 1, 2, 3;
+
+explain select distinct * from select_distinct_a a, (select distinct a from select_distinct_b) b where a.pk1 = b.a;
+select distinct * from select_distinct_a a, (select distinct a from select_distinct_b) b where a.pk1 = b.a order by 1, 2, 3;
+
+-- Distinct On
+-- can't erase since pk2 is missing
+explain (costs off) select distinct on(pk1) pk1, pk2 from select_distinct_a;
+-- ok to erase
+explain (costs off) select distinct on(pk1, pk2) pk1, pk2 from select_distinct_a;
+
+
+-- test a view.
+create view distinct_v1 as select distinct uk1, uk2 from select_distinct_a where uk2 is not null;
+explain select * from distinct_v1;
+
+alter table select_distinct_a alter column uk1 drop not null;
+explain select * from distinct_v1;
+
+alter table select_distinct_a alter column uk1 set not null;
+
+-- test generic plan
+prepare pt as select * from distinct_v1;
+explain execute pt;
+alter table select_distinct_a alter column uk1 drop not null;
+explain execute pt;
+
+drop view distinct_v1;
+drop table select_distinct_a;
+drop table select_distinct_b;
--
2.21.0
Andy Fan <zhihui.fan1213@gmail.com> writes:
Please see if you have any comments. Thanks
The cfbot isn't at all happy with this. Its linux build is complaining
about a possibly-uninitialized variable, and then giving up:
https://travis-ci.org/postgresql-cfbot/postgresql/builds/656722993
The Windows build isn't using -Werror, but it is crashing in at least
two different spots in the regression tests:
https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.81778
I've not attempted to identify the cause of that.
At a high level, I'm a bit disturbed that this focuses only on DISTINCT
and doesn't (appear to) have any equivalent intelligence for GROUP BY,
though surely that offers much the same opportunities for optimization.
It seems like it'd be worthwhile to take a couple steps back and see
if we couldn't recast the logic to work for both.
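For instance, a GROUP BY over a complete primary key with no aggregates
is redundant in exactly the same way as the equivalent DISTINCT; a
hypothetical illustration, using the test table this patch adds:
-- (pk1, pk2) is the primary key of select_distinct_a, so every group
-- holds exactly one row and the grouping step could be elided, just as
-- "select distinct pk1, pk2 from select_distinct_a" is by this patch.
select pk1, pk2 from select_distinct_a group by pk1, pk2;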
Some other random comments:
* Don't especially like the way you broke query_is_distinct_for()
into two functions, especially when you then introduced a whole
lot of other code in between. That's just making the reviewer's job
harder to see what changed. It makes the comments a bit disjointed
too, that is, where you even had any. (Zero introductory comment
for query_is_distinct_agg is *not* up to project coding standards.
There are a lot of other undercommented places in this patch, too.)
* Definitely don't like having query_distinct_through_join re-open
all the relations. The data needed for this should get absorbed
while plancat.c has the relations open at the beginning. (Non-nullness
of columns, in particular, seems like it'll be useful for other
purposes; I'm a bit surprised the planner isn't using that already.)
* In general, query_distinct_through_join seems hugely messy, expensive,
and not clearly correct. If it is correct, the existing comments sure
aren't enough to explain what it is doing or why.
* Not entirely convinced that a new GUC is needed for this, but if
it is, you have to document it.
* I wouldn't bother with bms_array_free(), nor with any of the other
cleanup you've got at the bottom of query_distinct_through_join.
The planner leaks *lots* of memory, and this function isn't going to
be called so many times that it'll move the needle.
* There seem to be some pointless #include additions, eg in planner.c
the added code doesn't look to justify any of them. Please also
avoid unnecessary whitespace changes, those also slow down reviewing.
* I see you decided to add a new regression test file select_distinct_2.
That's a poor choice of name because it conflicts with our rules for the
naming of alternative output files. Besides which, you forgot to plug
it into the test schedule files, so it isn't actually getting run.
Is there a reason not to just add the new test cases to select_distinct?
* There are some changes in existing regression cases that aren't
visibly related to the stated purpose of the patch, eg it now
notices that "select distinct max(unique2) from tenk1" doesn't
require an explicit DISTINCT step. That's not wrong, but I wonder
if maybe you should subdivide this patch into more than one patch,
because that must be coming from some separate change. I'm also
wondering what caused the plan change in expected/join.out.
regards, tom lane
Thank you Tom for the review!
On Mon, Mar 2, 2020 at 4:46 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Andy Fan <zhihui.fan1213@gmail.com> writes:
Please see if you have any comments. Thanks
The cfbot isn't at all happy with this. Its linux build is complaining
about a possibly-uninitialized variable, and then giving up:
https://travis-ci.org/postgresql-cfbot/postgresql/builds/656722993
The Windows build isn't using -Werror, but it is crashing in at least
two different spots in the regression tests:
https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.81778
I've not attempted to identify the cause of that.
Before I submitted the patch, I made sure "make check-world" was
successful, but since the compile options are not the same, I didn't
catch the possibly-uninitialized variable. As for the crash on Windows,
I don't have enough information yet; I will find a Windows server and
reproduce the cases.
I just found the link http://commitfest.cputube.org/ this morning; I
will make sure the next patch passes this test.
At a high level, I'm a bit disturbed that this focuses only on DISTINCT
and doesn't (appear to) have any equivalent intelligence for GROUP BY,
though surely that offers much the same opportunities for optimization.
It seems like it'd be worthwhile to take a couple steps back and see
if we couldn't recast the logic to work for both.
OK. GROUP BY looks a bit harder than DISTINCT because of the
aggregation functions. I will go through the code to see where to add
this logic.
Some other random comments:
* Don't especially like the way you broke query_is_distinct_for()
into two functions, especially when you then introduced a whole
lot of other code in between.
I didn't expect this until you pointed it out. In this case, I had to
break query_is_distinct_for into two functions, but it is true that we
should put the two functions together.
That's just making reviewer's job
harder to see what changed. It makes the comments a bit disjointed
too, that is where you even had any. (Zero introductory comment
for query_is_distinct_agg is *not* up to project coding standards.
There are a lot of other undercommented places in this patch, too.)
* Definitely don't like having query_distinct_through_join re-open
all the relations. The data needed for this should get absorbed
while plancat.c has the relations open at the beginning. (Non-nullness
of columns, in particular, seems like it'll be useful for other
purposes; I'm a bit surprised the planner isn't using that already.)
I can add a new attribute to RelOptInfo and fill in the value in the
get_relation_info call.
* In general, query_distinct_through_join seems hugely messy, expensive,
and not clearly correct. If it is correct, the existing comments sure
aren't enough to explain what it is doing or why.
Removing the relation_open call can make it a bit simpler; I will add
more comments to make it clearer in the following patch.
* There seem to be some pointless #include additions, eg in planner.c
the added code doesn't look to justify any of them. Please also
avoid unnecessary whitespace changes, those also slow down reviewing.
That may be because I added the header file at one point and then
refactored the code later but forgot to remove the header file
accordingly. Do we need to rely on experience to tell whether a header
file is needed or not, or do we have any tool to tell it automatically?
* I see you decided to add a new regression test file select_distinct_2.
That's a poor choice of name because it conflicts with our rules for the
naming of alternative output files. Besides which, you forgot to plug
it into the test schedule files, so it isn't actually getting run.
Is there a reason not to just add the new test cases to select_distinct?
Adding it to select_distinct.sql is OK for me as well. Actually, I have
no obvious reason to add the new file.
* There are some changes in existing regression cases that aren't
visibly related to the stated purpose of the patch, eg it now
notices that "select distinct max(unique2) from tenk1" doesn't
require an explicit DISTINCT step. That's not wrong, but I wonder
if maybe you should subdivide this patch into more than one patch,
because that must be coming from some separate change. I'm also
wondering what caused the plan change in expected/join.out.
For my purposes it should be in the same patch; the logic here is that
we have DISTINCT in the SQL and the query is already distinct because of
the max function (the rule is defined in query_is_distinct_agg, which is
split out from the original query_is_distinct_for function).
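As a minimal illustration of that rule: an aggregated query with no
GROUP BY produces at most one row, so the DISTINCT on top of it can
never remove anything:
-- with hasAggs = true and no GROUP BY, the input collapses to a single
-- row, so the DISTINCT step is provably a no-op here.
select distinct max(unique2) from tenk1;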
regards, tom lane
On Tue, Mar 3, 2020 at 1:24 AM Andy Fan <zhihui.fan1213@gmail.com> wrote:
Thank you Tom for the review!
On Mon, Mar 2, 2020 at 4:46 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Andy Fan <zhihui.fan1213@gmail.com> writes:
Please see if you have any comments. Thanks
The cfbot isn't at all happy with this. Its linux build is complaining
about a possibly-uninitialized variable, and then giving up:
https://travis-ci.org/postgresql-cfbot/postgresql/builds/656722993
The Windows build isn't using -Werror, but it is crashing in at least
two different spots in the regression tests:
https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.81778
I've not attempted to identify the cause of that.
Before I submitted the patch, I made sure "make check-world" was
successful, but since the compile options are not the same, I didn't
catch the possibly-uninitialized variable. As for the crash on Windows,
I don't have enough information yet; I will find a Windows server and
reproduce the cases.
I just found the link http://commitfest.cputube.org/ this morning; I
will make sure the next patch passes this test.
At a high level, I'm a bit disturbed that this focuses only on DISTINCT
and doesn't (appear to) have any equivalent intelligence for GROUP BY,
though surely that offers much the same opportunities for optimization.
It seems like it'd be worthwhile to take a couple steps back and see
if we couldn't recast the logic to work for both.
OK. GROUP BY looks a bit harder than DISTINCT because of the
aggregation functions. I will go through the code to see where to add
this logic.
Can we guarantee any_aggr_func(a) == a if only 1 row is returned? If
so, we can do some work on the pathtarget/reltarget by transforming the
Aggref to a raw expr. I checked the execution path of the aggregation
call; it looks like it depends on the Agg node, which is the very thing
we want to remove.
* There seem to be some pointless #include additions, eg in planner.c
the added code doesn't look to justify any of them. Please also
avoid unnecessary whitespace changes, those also slow down reviewing.
Fixed some typo errors.
That may be because I added the header file at one point and then
refactored the code but forgot to remove the header file when it was no
longer necessary. Do we need to rely on experience to tell whether a
header file is needed or not, or do we have any tool to tell it
automatically?
regards, Andy Fan
* There are some changes in existing regression cases that aren't
visibly related to the stated purpose of the patch, eg it now
notices that "select distinct max(unique2) from tenk1" doesn't
require an explicit DISTINCT step. That's not wrong, but I wonder
if maybe you should subdivide this patch into more than one patch,
because that must be coming from some separate change. I'm also
wondering what caused the plan change in expected/join.out.
For my purposes it should be in the same patch; the logic here is that
we have DISTINCT in the SQL and the query is already distinct because of
the max function (the rule is defined in query_is_distinct_agg, which is
split out from the original query_is_distinct_for function).
I thought I was right until I came to
contrib/postgres_fdw/sql/postgres_fdw.sql.
Per my understanding, the result of "select max(a) from t" is unique
because of the aggregation function and the absence of a group clause.
But in the postgres_fdw.sql case, Query->hasAggs is true for "select
distinct (select count(*) filter (where t2.c2 = 6 and t2.c1 < 10) from
ft1 t1 where t1.c1 = 6) from ft2 t2 where t2.c2 % 6 = 0 order by 1;".
This looks very strange to me. Is my understanding wrong, or is there a
bug here?
query->hasAggs was set to true in the following call stack.
pstate->p_hasAggs = true;
..
qry->hasAggs = pstate->p_hasAggs;
0 in check_agglevels_and_constraints of parse_agg.c:343
1 in transformAggregateCall of parse_agg.c:236
2 in ParseFuncOrColumn of parse_func.c:805
3 in transformFuncCall of parse_expr.c:1558
4 in transformExprRecurse of parse_expr.c:265
5 in transformExpr of parse_expr.c:155
6 in transformTargetEntry of parse_target.c:105
7 in transformTargetList of parse_target.c:193
8 in transformSelectStmt of analyze.c:1224
9 in transformStmt of analyze.c:301
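For reference, a stripped-down sketch of the query shape in question
(same structure, fewer columns): every variable the aggregate references
comes from the outer query, so the parser appears to assign the
aggregate to the outer query level, which would be what sets the outer
Query's hasAggs:
-- count(*) filter (where t2.c2 = 6) references only the outer-level
-- variable t2.c2, so the aggregate belongs to the outer query even
-- though it is written inside the sublink.
select distinct
       (select count(*) filter (where t2.c2 = 6) from ft1 t1)
from ft2 t2;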
You can see the newly updated patch, which should fix all the issues
you pointed out except the one about supporting GROUP BY. Another reason
this patch will not be the final one is that the changes to
postgres_fdw.out are too arbitrary; I am uploading it now just for
reference. (The newly introduced GUC variable can be removed at the end;
keeping it for now just makes testing easier.)
At a high level, I'm a bit disturbed that this focuses only on DISTINCT
and doesn't (appear to) have any equivalent intelligence for GROUP BY,
though surely that offers much the same opportunities for optimization.
It seems like it'd be worthwhile to take a couple steps back and see
if we couldn't recast the logic to work for both.
OK. GROUP BY looks a bit harder than DISTINCT because of the
aggregation functions. I will go through the code to see where to add
this logic.
Can we guarantee any_aggr_func(a) == a if only 1 row is returned? If
so, we can do some work on the pathtarget/reltarget by transforming the
Aggref to a raw expr. I checked the execution path of the aggregation
call; it looks like it depends on the Agg node, which is the very thing
we want to remove.
We can't guarantee any_aggr_func(a) == a when only 1 row is returned,
so the above method doesn't work. Do you have any suggestions for this?
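A quick counterexample, using nothing patch-specific: even with exactly
one input row, an aggregate's result need not equal the bare column
value:
-- one input row, yet neither aggregate returns the bare value 42:
select count(a),     -- 1, not 42
       array_agg(a), -- {42}, an array, not 42
       a
from (values (42)) as t(a)
group by a;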
Attachments:
v2-0001-PATCH-Erase-the-distinct-path-if-the-result-is-un.patchapplication/x-patch; name=v2-0001-PATCH-Erase-the-distinct-path-if-the-result-is-un.patchDownload
From 9449c09688d542c4dc201ee866f67d67304cff98 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=E4=B8=80=E6=8C=83?= <yizhi.fzh@alibaba-inc.com>
Date: Wed, 4 Mar 2020 14:33:56 +0800
Subject: [PATCH v2] [PATCH] Erase the distinct path if the result is unique by
catalog
For a single relation, we can tell it if any one of the following
is true:
1. The pk is in the target list.
2. The uk is in the target list and its columns are not null.
3. The columns in the group-by clause are also in the target list.
For a relation join, we can tell it by:
if every relation in the jointree yields a unique result set, then
the final result is unique as well, regardless of the join method.
---
.../postgres_fdw/expected/postgres_fdw.out | 28 +-
src/backend/optimizer/path/costsize.c | 1 +
src/backend/optimizer/plan/analyzejoins.c | 184 +++++++++++-
src/backend/optimizer/plan/planner.c | 27 ++
src/backend/optimizer/util/plancat.c | 9 +
src/backend/utils/misc/guc.c | 10 +
src/include/nodes/pathnodes.h | 1 +
src/include/optimizer/cost.h | 1 +
src/include/optimizer/planmain.h | 2 +
src/test/regress/expected/aggregates.out | 13 +-
src/test/regress/expected/join.out | 16 +-
src/test/regress/expected/select_distinct.out | 276 ++++++++++++++++++
src/test/regress/expected/sysviews.out | 3 +-
src/test/regress/sql/select_distinct.sql | 84 ++++++
14 files changed, 619 insertions(+), 36 deletions(-)
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 84fd3ad2e0..215f10bf7d 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -2902,22 +2902,20 @@ select sum(c1%3), sum(distinct c1%3 order by c1%3) filter (where c1%3 < 2), c2 f
-- Outer query is aggregation query
explain (verbose, costs off)
select distinct (select count(*) filter (where t2.c2 = 6 and t2.c1 < 10) from ft1 t1 where t1.c1 = 6) from ft2 t2 where t2.c2 % 6 = 0 order by 1;
- QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
- Unique
+ QUERY PLAN
+------------------------------------------------------------------------------------------------------------------------
+ Sort
Output: ((SubPlan 1))
- -> Sort
- Output: ((SubPlan 1))
- Sort Key: ((SubPlan 1))
- -> Foreign Scan
- Output: (SubPlan 1)
- Relations: Aggregate on (public.ft2 t2)
- Remote SQL: SELECT count(*) FILTER (WHERE ((c2 = 6) AND ("C 1" < 10))) FROM "S 1"."T 1" WHERE (((c2 % 6) = 0))
- SubPlan 1
- -> Foreign Scan on public.ft1 t1
- Output: (count(*) FILTER (WHERE ((t2.c2 = 6) AND (t2.c1 < 10))))
- Remote SQL: SELECT NULL FROM "S 1"."T 1" WHERE (("C 1" = 6))
-(13 rows)
+ Sort Key: ((SubPlan 1))
+ -> Foreign Scan
+ Output: (SubPlan 1)
+ Relations: Aggregate on (public.ft2 t2)
+ Remote SQL: SELECT count(*) FILTER (WHERE ((c2 = 6) AND ("C 1" < 10))) FROM "S 1"."T 1" WHERE (((c2 % 6) = 0))
+ SubPlan 1
+ -> Foreign Scan on public.ft1 t1
+ Output: (count(*) FILTER (WHERE ((t2.c2 = 6) AND (t2.c1 < 10))))
+ Remote SQL: SELECT NULL FROM "S 1"."T 1" WHERE (("C 1" = 6))
+(11 rows)
select distinct (select count(*) filter (where t2.c2 = 6 and t2.c1 < 10) from ft1 t1 where t1.c1 = 6) from ft2 t2 where t2.c2 % 6 = 0 order by 1;
count
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index b5a0033721..dde16b5d44 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -138,6 +138,7 @@ bool enable_partitionwise_aggregate = false;
bool enable_parallel_append = true;
bool enable_parallel_hash = true;
bool enable_partition_pruning = true;
+bool enable_distinct_elimination = true;
typedef struct
{
diff --git a/src/backend/optimizer/plan/analyzejoins.c b/src/backend/optimizer/plan/analyzejoins.c
index d0ff660284..dee152af29 100644
--- a/src/backend/optimizer/plan/analyzejoins.c
+++ b/src/backend/optimizer/plan/analyzejoins.c
@@ -30,6 +30,7 @@
#include "optimizer/paths.h"
#include "optimizer/planmain.h"
#include "optimizer/tlist.h"
+#include "parser/parsetree.h"
#include "utils/lsyscache.h"
/* local functions */
@@ -47,7 +48,8 @@ static bool is_innerrel_unique_for(PlannerInfo *root,
RelOptInfo *innerrel,
JoinType jointype,
List *restrictlist);
-
+static void transform_colno_for_subquery(Query *query, List *colnos, List *opids,
+ List **sub_colnos, List **sub_opids);
/*
* remove_useless_joins
@@ -801,9 +803,18 @@ query_is_distinct_for(Query *query, List *colnos, List *opids)
if (l == NULL) /* had matches for all? */
return true;
}
+ return query_is_distinct_agg(query, colnos, opids);
+}
+
+
+bool
+query_is_distinct_agg(Query *query, List *colnos, List *opids)
+{
+ ListCell *l;
+ Oid opid;
/*
- * Otherwise, a set-returning function in the query's targetlist can
+ * A set-returning function in the query's targetlist can
* result in returning duplicate rows, despite any grouping that might
* occur before tlist evaluation. (If all tlist SRFs are within GROUP BY
* columns, it would be safe because they'd be expanded before grouping.
@@ -901,7 +912,6 @@ query_is_distinct_for(Query *query, List *colnos, List *opids)
return true;
}
}
-
/*
* XXX Are there any other cases in which we can easily see the result
* must be distinct?
@@ -913,6 +923,174 @@ query_is_distinct_for(Query *query, List *colnos, List *opids)
return false;
}
+/*
+ * scan_non_semi_anti_relids
+ *
+ * Scan the jointree and collect rtindexes, skipping the right table of any semi/anti join.
+ */
+static void
+scan_non_semi_anti_relids(Node* jtnode, Relids* relids)
+{
+ if (jtnode == NULL)
+ return;
+
+ if (IsA(jtnode, RangeTblRef))
+ {
+ int varno = ((RangeTblRef *) jtnode)->rtindex;
+
+ *relids = bms_add_member(*relids, varno);
+ }
+ else if (IsA(jtnode, FromExpr))
+ {
+ FromExpr *f = (FromExpr *) jtnode;
+ ListCell *l;
+
+ foreach(l, f->fromlist)
+ scan_non_semi_anti_relids(lfirst(l), relids);
+ }
+ else if (IsA(jtnode, JoinExpr))
+ {
+ JoinExpr *j = (JoinExpr *) jtnode;
+
+ scan_non_semi_anti_relids(j->larg, relids);
+ if (j->jointype != JOIN_SEMI && j->jointype != JOIN_ANTI)
+ {
+ scan_non_semi_anti_relids(j->rarg, relids);
+ }
+ }
+ else
+ elog(ERROR, "unrecognized node type: %d",
+ (int) nodeTag(jtnode));
+
+}
+
+/*
+ * transform_colno_for_subquery
+ *
+ * Map colnos/opids, which reference the outer query's targetlist, onto
+ * the matching resnos in the subquery's own targetlist.
+ */
+static void
+transform_colno_for_subquery(Query *query, List *colnos, List *opids,
+ List **sub_colnos, List **sub_opids)
+{
+ ListCell *lc1, *lc2;
+ TargetEntry *tle;
+
+ forboth(lc1, colnos, lc2, opids)
+ {
+ tle = get_tle_by_resno(query->targetList, lfirst_int(lc1));
+ Assert(IsA(tle->expr, Var));
+ *sub_colnos = lappend_int(*sub_colnos, ((Var*)tle->expr)->varattno);
+ *sub_opids = lappend_oid(*sub_opids, lfirst_oid(lc2));
+ }
+}
+
+/*
+ * query_distinct_through_join
+ * If every relation in the join yields a unique result, the join result
+ * is unique as well. The right table of a semi/anti join is excluded,
+ * since its rows do not appear in the join output.
+ */
+bool
+query_distinct_through_join(PlannerInfo *root, List *colnos, List *opids)
+{
+ Query *query = root->parse;
+ Relids non_semi_anti_relids = NULL;
+
+ /* Used for relation_has_unique_index_for */
+ List **non_null_expr_per_table = NULL;
+ /* Used for query_is_distinct_for */
+ List **non_null_colno_per_table = NULL;
+ /* Used for both of the above */
+ List **non_null_opids_per_table = NULL;
+ /* Not null info from restrictinfo and catalog */
+ Bitmapset **non_null_varno_per_table = NULL;
+
+ int rt_index;
+ ListCell *lc1, *lc2;
+ RangeTblEntry *rte;
+ RelOptInfo *rel;
+ int max_rt_index = list_length(query->rtable) + 1;
+
+ /* Collect the relids, excluding the right table of any semi/anti join */
+ scan_non_semi_anti_relids((Node*)query->jointree, &non_semi_anti_relids);
+
+ non_null_varno_per_table = palloc0(max_rt_index * sizeof(Bitmapset *));
+
+ foreach(lc1, find_nonnullable_vars(query->jointree->quals))
+ {
+ Var *var;
+ if (!IsA(lfirst(lc1), Var))
+ continue;
+ var = lfirst_node(Var, lc1);
+ if (var->varno == INNER_VAR ||
+ var->varno == OUTER_VAR ||
+ var->varno == INDEX_VAR)
+ continue;
+ non_null_varno_per_table[var->varno] = bms_add_member(
+ non_null_varno_per_table[var->varno], var->varattno);
+ }
+
+ /* Add the non null info in catalog */
+ rt_index = -1;
+ while ((rt_index = bms_next_member(non_semi_anti_relids, rt_index)) >= 0 )
+ {
+ non_null_varno_per_table[rt_index] = bms_join(non_null_varno_per_table[rt_index],
+ root->simple_rel_array[rt_index]->not_null_cols_relids);
+ }
+
+ non_null_expr_per_table = palloc0(max_rt_index * sizeof(List *));
+ non_null_opids_per_table = palloc0(max_rt_index * sizeof(List *));
+ non_null_colno_per_table = palloc0(max_rt_index * sizeof(List *));
+
+ /* Filter out the nullable columns and split them per table */
+ forboth(lc1, colnos, lc2, opids)
+ {
+ int colno = lfirst_int(lc1);
+ TargetEntry *tle = get_tle_by_resno(query->targetList, colno);
+ Var *var = NULL;
+ if (!IsA(tle->expr, Var))
+ continue;
+ var = (Var *)tle->expr;
+ if (!bms_is_member(var->varattno, non_null_varno_per_table[var->varno]))
+ continue;
+ non_null_expr_per_table[var->varno] = lappend(
+ non_null_expr_per_table[var->varno], tle->expr);
+ non_null_opids_per_table[var->varno] = lappend_oid(
+ non_null_opids_per_table[var->varno], lfirst_oid(lc2));
+ non_null_colno_per_table[var->varno] = lappend_int(
+ non_null_colno_per_table[var->varno],
+ colno);
+ }
+
+ /* Check whether every relation yields a unique result; return false if any one doesn't */
+ rt_index = -1;
+ while ((rt_index = bms_next_member(non_semi_anti_relids, rt_index)) >= 0 )
+ {
+ rte = root->simple_rte_array[rt_index];
+ rel = root->simple_rel_array[rt_index];
+ if (rte->rtekind == RTE_RELATION &&
+ relation_has_unique_index_for(root, rel, NIL,
+ non_null_expr_per_table[rt_index],
+ non_null_opids_per_table[rt_index]))
+ continue;
+ if (rte->rtekind == RTE_SUBQUERY &&
+ query_supports_distinctness(rte->subquery))
+ {
+ List *subquery_colnos = NIL;
+ List *subquery_opids = NIL;
+ transform_colno_for_subquery(root->parse,
+ non_null_colno_per_table[rt_index],
+ non_null_opids_per_table[rt_index],
+ &subquery_colnos,
+ &subquery_opids);
+ if (query_is_distinct_for(rte->subquery, subquery_colnos, subquery_opids))
+ continue;
+ return false;
+ }
+ return false;
+ }
+ return true;
+}
+
/*
* distinct_col_search - subroutine for query_is_distinct_for
*
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index d6f2153593..9d56e6c88e 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -4737,6 +4737,33 @@ create_distinct_paths(PlannerInfo *root,
Path *path;
ListCell *lc;
+ if (enable_distinct_elimination)
+ {
+ List *colnos = NIL;
+ List *opnos = NIL;
+ ListCell *lc;
+
+ Assert(parse->distinctClause != NIL);
+
+ foreach(lc, parse->distinctClause)
+ {
+ SortGroupClause *sgc = lfirst_node(SortGroupClause, lc);
+ int idx = sgc->tleSortGroupRef;
+ TargetEntry *tle = get_tle_by_resno(parse->targetList, idx);
+ if (tle->resjunk)
+ continue;
+ /* even if column x is not null, f(x) may still be null, so ignore it */
+ if (!IsA(tle->expr, Var))
+ continue;
+ colnos = lappend_int(colnos, idx);
+ opnos = lappend_oid(opnos, sgc->eqop);
+ }
+
+ if ((query_supports_distinctness(parse)
+ && query_is_distinct_agg(parse, colnos, opnos)) ||
+ query_distinct_through_join(root, colnos, opnos))
+ return input_rel;
+ }
/* For now, do all work in the (DISTINCT, NULL) upperrel */
distinct_rel = fetch_upper_rel(root, UPPERREL_DISTINCT, NULL);
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index d82fc5ab8b..e57b456d9b 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -117,6 +117,7 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
Relation relation;
bool hasindex;
List *indexinfos = NIL;
+ int i;
/*
* We need not lock the relation since it was already locked, either by
@@ -460,6 +461,14 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
if (inhparent && relation->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
set_relation_partition_info(root, rel, relation);
+ Assert(rel->not_null_cols_relids == NULL);
+ for(i = 0; i < relation->rd_att->natts; i++)
+ {
+ if (!relation->rd_att->attrs[i].attnotnull)
+ continue;
+ rel->not_null_cols_relids = bms_add_member(rel->not_null_cols_relids, i+1);
+ }
+
table_close(relation, NoLock);
/*
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index e44f71e991..fa798dd564 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -1064,6 +1064,16 @@ static struct config_bool ConfigureNamesBool[] =
true,
NULL, NULL, NULL
},
+ {
+ {"enable_distinct_elimination", PGC_USERSET, QUERY_TUNING_METHOD,
+ gettext_noop("Enables plan-time and run-time unique elimination."),
+ gettext_noop("Allows the query planner to remove the uncecessary distinct clause."),
+ GUC_EXPLAIN
+ },
+ &enable_distinct_elimination,
+ true,
+ NULL, NULL, NULL
+ },
{
{"geqo", PGC_USERSET, QUERY_TUNING_GEQO,
gettext_noop("Enables genetic query optimization."),
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 3d3be197e0..51db013f5d 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -687,6 +687,7 @@ typedef struct RelOptInfo
PlannerInfo *subroot; /* if subquery */
List *subplan_params; /* if subquery */
int rel_parallel_workers; /* wanted number of parallel workers */
+ Relids not_null_cols_relids; /* not-null cols per the catalogs; attnums start at 1 */
/* Information about foreign tables and foreign joins */
Oid serverid; /* identifies server for the table or join */
diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h
index cb012ba198..4fa5d32df6 100644
--- a/src/include/optimizer/cost.h
+++ b/src/include/optimizer/cost.h
@@ -64,6 +64,7 @@ extern PGDLLIMPORT bool enable_partitionwise_aggregate;
extern PGDLLIMPORT bool enable_parallel_append;
extern PGDLLIMPORT bool enable_parallel_hash;
extern PGDLLIMPORT bool enable_partition_pruning;
+extern PGDLLIMPORT bool enable_distinct_elimination;
extern PGDLLIMPORT int constraint_exclusion;
extern double index_pages_fetched(double tuples_fetched, BlockNumber pages,
diff --git a/src/include/optimizer/planmain.h b/src/include/optimizer/planmain.h
index eab486a621..ebd4f24577 100644
--- a/src/include/optimizer/planmain.h
+++ b/src/include/optimizer/planmain.h
@@ -100,6 +100,8 @@ extern List *remove_useless_joins(PlannerInfo *root, List *joinlist);
extern void reduce_unique_semijoins(PlannerInfo *root);
extern bool query_supports_distinctness(Query *query);
extern bool query_is_distinct_for(Query *query, List *colnos, List *opids);
+extern bool query_is_distinct_agg(Query *query, List *colnos, List *opids);
+extern bool query_distinct_through_join(PlannerInfo *root, List *colnos, List *opids);
extern bool innerrel_is_unique(PlannerInfo *root,
Relids joinrelids, Relids outerrelids, RelOptInfo *innerrel,
JoinType jointype, List *restrictlist, bool force_cache);
diff --git a/src/test/regress/expected/aggregates.out b/src/test/regress/expected/aggregates.out
index f457b5b150..6712571578 100644
--- a/src/test/regress/expected/aggregates.out
+++ b/src/test/regress/expected/aggregates.out
@@ -870,14 +870,12 @@ explain (costs off)
select distinct max(unique2) from tenk1;
QUERY PLAN
---------------------------------------------------------------------
- HashAggregate
- Group Key: $0
+ Result
InitPlan 1 (returns $0)
-> Limit
-> Index Only Scan Backward using tenk1_unique2 on tenk1
Index Cond: (unique2 IS NOT NULL)
- -> Result
-(7 rows)
+(5 rows)
select distinct max(unique2) from tenk1;
max
@@ -1036,7 +1034,7 @@ explain (costs off)
select distinct min(f1), max(f1) from minmaxtest;
QUERY PLAN
---------------------------------------------------------------------------------------------
- Unique
+ Result
InitPlan 1 (returns $0)
-> Limit
-> Merge Append
@@ -1059,10 +1057,7 @@ explain (costs off)
-> Index Only Scan using minmaxtest2i on minmaxtest2 minmaxtest_8
Index Cond: (f1 IS NOT NULL)
-> Index Only Scan Backward using minmaxtest3i on minmaxtest3 minmaxtest_9
- -> Sort
- Sort Key: ($0), ($1)
- -> Result
-(26 rows)
+(23 rows)
select distinct min(f1), max(f1) from minmaxtest;
min | max
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index 761376b007..3f6595d53b 100644
--- a/src/test/regress/expected/join.out
+++ b/src/test/regress/expected/join.out
@@ -4433,17 +4433,17 @@ select d.* from d left join (select * from b group by b.id, b.c_id) s
explain (costs off)
select d.* from d left join (select distinct * from b) s
on d.a = s.id;
- QUERY PLAN
---------------------------------------
- Merge Right Join
- Merge Cond: (b.id = d.a)
- -> Unique
- -> Sort
- Sort Key: b.id, b.c_id
- -> Seq Scan on b
+ QUERY PLAN
+---------------------------------
+ Merge Left Join
+ Merge Cond: (d.a = s.id)
-> Sort
Sort Key: d.a
-> Seq Scan on d
+ -> Sort
+ Sort Key: s.id
+ -> Subquery Scan on s
+ -> Seq Scan on b
(9 rows)
-- check join removal works when uniqueness of the join condition is enforced
diff --git a/src/test/regress/expected/select_distinct.out b/src/test/regress/expected/select_distinct.out
index f3696c6d1d..73729c8606 100644
--- a/src/test/regress/expected/select_distinct.out
+++ b/src/test/regress/expected/select_distinct.out
@@ -244,3 +244,279 @@ SELECT null IS NOT DISTINCT FROM null as "yes";
t
(1 row)
+create table select_distinct_a(pk1 int, pk2 char(20), uk1 char(20) not null, uk2 int, e int, primary key(pk1, pk2));
+create unique index select_distinct_a_uk on select_distinct_a(uk1, uk2);
+create table select_distinct_b(a int, b char(20), pk1 char(20), pk2 int, e int, primary key(pk1, pk2));
+-- distinct erased since (pk1, pk2)
+explain (costs off) select distinct * from select_distinct_a;
+ QUERY PLAN
+-------------------------------
+ Seq Scan on select_distinct_a
+(1 row)
+
+-- distinct can't be erased since we require all the uk columns to be not null
+explain (costs off) select distinct uk1, uk2 from select_distinct_a;
+ QUERY PLAN
+-------------------------------------
+ HashAggregate
+ Group Key: uk1, uk2
+ -> Seq Scan on select_distinct_a
+(3 rows)
+
+-- distinct erased since uk + not null
+explain (costs off) select distinct uk1, uk2 from select_distinct_a where uk2 is not null;
+ QUERY PLAN
+-------------------------------
+ Seq Scan on select_distinct_a
+ Filter: (uk2 IS NOT NULL)
+(2 rows)
+
+explain (costs off) select distinct uk1, uk2 from select_distinct_a where uk2 > 1;
+ QUERY PLAN
+-------------------------------
+ Seq Scan on select_distinct_a
+ Filter: (uk2 > 1)
+(2 rows)
+
+-- distinct erased due to group by
+explain select distinct e from select_distinct_a group by e;
+ QUERY PLAN
+--------------------------------------------------------------------------
+ HashAggregate (cost=14.88..16.88 rows=200 width=4)
+ Group Key: e
+ -> Seq Scan on select_distinct_a (cost=0.00..13.90 rows=390 width=4)
+(3 rows)
+
+-- distinct erased due to the restrictinfo
+explain select distinct uk1 from select_distinct_a where pk1 = 1 and pk2 = 'c';
+ QUERY PLAN
+-------------------------------------------------------------------------------------------------
+ Index Scan using select_distinct_a_pkey on select_distinct_a (cost=0.15..8.17 rows=1 width=84)
+ Index Cond: ((pk1 = 1) AND (pk2 = 'c'::bpchar))
+(2 rows)
+
+-- test join
+set enable_mergejoin to off;
+set enable_hashjoin to off;
+insert into select_distinct_a values(1, 'a', 'a', 0, 1), (1, 'b', 'A', 0, 2), (3, 'c', 'c', 0, 3);
+insert into select_distinct_b values(1, 'a', 'a', 0, 1), (4, 'd', 'd', 0, 4), (1, 'e', 'e', 0, 5);
+-- Cartesian join
+explain (costs off) select distinct a.uk1, a.uk2, b.pk1, b.pk2 from select_distinct_a a, select_distinct_b b where a.uk2 is not null;
+ QUERY PLAN
+---------------------------------------------
+ Nested Loop
+ -> Seq Scan on select_distinct_b b
+ -> Materialize
+ -> Seq Scan on select_distinct_a a
+ Filter: (uk2 IS NOT NULL)
+(5 rows)
+
+select distinct a.uk1, a.uk2, b.pk1, b.pk2 from select_distinct_a a, select_distinct_b b where a.uk2 is not null order by 1, 2, 3, 4;
+ uk1 | uk2 | pk1 | pk2
+----------------------+-----+----------------------+-----
+ a | 0 | a | 0
+ a | 0 | d | 0
+ a | 0 | e | 0
+ A | 0 | a | 0
+ A | 0 | d | 0
+ A | 0 | e | 0
+ c | 0 | a | 0
+ c | 0 | d | 0
+ c | 0 | e | 0
+(9 rows)
+
+-- left join
+explain select distinct a.pk1, a.pk2, b.pk1, b.pk2 from select_distinct_a a left join select_distinct_b b on (a.pk1 = b.a);
+ QUERY PLAN
+-----------------------------------------------------------------------------------
+ Nested Loop Left Join (cost=0.00..2310.28 rows=760 width=176)
+ Join Filter: (a.pk1 = b.a)
+ -> Seq Scan on select_distinct_a a (cost=0.00..13.90 rows=390 width=88)
+ -> Materialize (cost=0.00..15.85 rows=390 width=92)
+ -> Seq Scan on select_distinct_b b (cost=0.00..13.90 rows=390 width=92)
+(5 rows)
+
+select distinct a.pk1, a.pk2, b.pk1, b.pk2 from select_distinct_a a left join select_distinct_b b on (a.pk1 = b.a) order by 1, 2, 3, 4;
+ pk1 | pk2 | pk1 | pk2
+-----+----------------------+----------------------+-----
+ 1 | a | a | 0
+ 1 | a | e | 0
+ 1 | b | a | 0
+ 1 | b | e | 0
+ 3 | c | |
+(5 rows)
+
+-- right join
+explain select distinct a.pk1, a.pk2, b.pk1, b.pk2 from select_distinct_a a right join select_distinct_b b on (a.pk1 = b.a);
+ QUERY PLAN
+--------------------------------------------------------------------------------------------------------------
+ Nested Loop Left Join (cost=0.15..140.88 rows=760 width=176)
+ -> Seq Scan on select_distinct_b b (cost=0.00..13.90 rows=390 width=92)
+ -> Index Only Scan using select_distinct_a_pkey on select_distinct_a a (cost=0.15..0.31 rows=2 width=88)
+ Index Cond: (pk1 = b.a)
+(4 rows)
+
+select distinct a.pk1, a.pk2, b.pk1, b.pk2 from select_distinct_a a right join select_distinct_b b on (a.pk1 = b.a) order by 1, 2, 3, 4;
+ pk1 | pk2 | pk1 | pk2
+-----+----------------------+----------------------+-----
+ 1 | a | a | 0
+ 1 | a | e | 0
+ 1 | b | a | 0
+ 1 | b | e | 0
+ | | d | 0
+(5 rows)
+
+-- full join
+explain select distinct a.pk1, a.pk2, b.pk1, b.pk2 from select_distinct_a a full outer join select_distinct_b b on (a.pk1 = b.a);
+ QUERY PLAN
+-----------------------------------------------------------------------------------
+ Hash Full Join (cost=10000000018.77..10000000060.26 rows=760 width=176)
+ Hash Cond: (a.pk1 = b.a)
+ -> Seq Scan on select_distinct_a a (cost=0.00..13.90 rows=390 width=88)
+ -> Hash (cost=13.90..13.90 rows=390 width=92)
+ -> Seq Scan on select_distinct_b b (cost=0.00..13.90 rows=390 width=92)
+(5 rows)
+
+select distinct a.pk1, a.pk2, b.pk1, b.pk2 from select_distinct_a a full outer join select_distinct_b b on (a.pk1 = b.a) order by 1, 2, 3, 4;
+ pk1 | pk2 | pk1 | pk2
+-----+----------------------+----------------------+-----
+ 1 | a | a | 0
+ 1 | a | e | 0
+ 1 | b | a | 0
+ 1 | b | e | 0
+ 3 | c | |
+ | | d | 0
+(6 rows)
+
+-- distinct can't be erased since b.pk2 is missing
+explain select distinct a.pk1, a.pk2, b.pk1 from select_distinct_a a full outer join select_distinct_b b on (a.pk1 = b.a);
+ QUERY PLAN
+-----------------------------------------------------------------------------------------------
+ Unique (cost=10000000096.63..10000000104.23 rows=760 width=172)
+ -> Sort (cost=10000000096.63..10000000098.53 rows=760 width=172)
+ Sort Key: a.pk1, a.pk2, b.pk1
+ -> Hash Full Join (cost=10000000018.77..10000000060.26 rows=760 width=172)
+ Hash Cond: (a.pk1 = b.a)
+ -> Seq Scan on select_distinct_a a (cost=0.00..13.90 rows=390 width=88)
+ -> Hash (cost=13.90..13.90 rows=390 width=88)
+ -> Seq Scan on select_distinct_b b (cost=0.00..13.90 rows=390 width=88)
+(8 rows)
+
+-- Semi/anti join
+explain (costs off) select distinct pk1, pk2 from select_distinct_a where pk1 in (select a from select_distinct_b);
+ QUERY PLAN
+-------------------------------------------------------------------------
+ Nested Loop
+ -> HashAggregate
+ Group Key: select_distinct_b.a
+ -> Seq Scan on select_distinct_b
+ -> Index Only Scan using select_distinct_a_pkey on select_distinct_a
+ Index Cond: (pk1 = select_distinct_b.a)
+(6 rows)
+
+explain (costs off) select distinct pk1, pk2 from select_distinct_a where pk1 not in (select a from select_distinct_b);
+ QUERY PLAN
+---------------------------------------
+ Seq Scan on select_distinct_a
+ Filter: (NOT (hashed SubPlan 1))
+ SubPlan 1
+ -> Seq Scan on select_distinct_b
+(4 rows)
+
+-- we can also handle some limited subqueries
+explain select distinct * from select_distinct_a a, (select a from select_distinct_b group by a) b where a.pk1 = b.a;
+ QUERY PLAN
+----------------------------------------------------------------------------------------------------------
+ Nested Loop (cost=15.02..107.38 rows=390 width=184)
+ -> HashAggregate (cost=14.88..16.88 rows=200 width=4)
+ Group Key: select_distinct_b.a
+ -> Seq Scan on select_distinct_b (cost=0.00..13.90 rows=390 width=4)
+ -> Index Scan using select_distinct_a_pkey on select_distinct_a a (cost=0.15..0.42 rows=2 width=180)
+ Index Cond: (pk1 = select_distinct_b.a)
+(6 rows)
+
+select distinct * from select_distinct_a a, (select a from select_distinct_b group by a) b where a.pk1 = b.a order by 1, 2, 3;
+ pk1 | pk2 | uk1 | uk2 | e | a
+-----+----------------------+----------------------+-----+---+---
+ 1 | a | a | 0 | 1 | 1
+ 1 | b | A | 0 | 2 | 1
+(2 rows)
+
+explain select distinct * from select_distinct_a a, (select distinct a from select_distinct_b) b where a.pk1 = b.a;
+ QUERY PLAN
+----------------------------------------------------------------------------------------------------------
+ Nested Loop (cost=15.02..107.38 rows=390 width=184)
+ -> HashAggregate (cost=14.88..16.88 rows=200 width=4)
+ Group Key: select_distinct_b.a
+ -> Seq Scan on select_distinct_b (cost=0.00..13.90 rows=390 width=4)
+ -> Index Scan using select_distinct_a_pkey on select_distinct_a a (cost=0.15..0.42 rows=2 width=180)
+ Index Cond: (pk1 = select_distinct_b.a)
+(6 rows)
+
+select distinct * from select_distinct_a a, (select distinct a from select_distinct_b) b where a.pk1 = b.a order by 1, 2, 3;
+ pk1 | pk2 | uk1 | uk2 | e | a
+-----+----------------------+----------------------+-----+---+---
+ 1 | a | a | 0 | 1 | 1
+ 1 | b | A | 0 | 2 | 1
+(2 rows)
+
+-- Distinct On
+-- can't erase since pk2 is missing
+explain (costs off) select distinct on(pk1) pk1, pk2 from select_distinct_a;
+ QUERY PLAN
+-------------------------------------------
+ Unique
+ -> Sort
+ Sort Key: pk1
+ -> Seq Scan on select_distinct_a
+(4 rows)
+
+-- ok to erase
+explain (costs off) select distinct on(pk1, pk2) pk1, pk2 from select_distinct_a;
+ QUERY PLAN
+-------------------------------
+ Seq Scan on select_distinct_a
+(1 row)
+
+-- test a view.
+create view distinct_v1 as select distinct uk1, uk2 from select_distinct_a where uk2 is not null;
+explain select * from distinct_v1;
+ QUERY PLAN
+---------------------------------------------------------------------
+ Seq Scan on select_distinct_a (cost=0.00..13.90 rows=388 width=88)
+ Filter: (uk2 IS NOT NULL)
+(2 rows)
+
+alter table select_distinct_a alter column uk1 drop not null;
+explain select * from distinct_v1;
+ QUERY PLAN
+---------------------------------------------------------------------------
+ HashAggregate (cost=15.84..17.84 rows=200 width=88)
+ Group Key: select_distinct_a.uk1, select_distinct_a.uk2
+ -> Seq Scan on select_distinct_a (cost=0.00..13.90 rows=388 width=88)
+ Filter: (uk2 IS NOT NULL)
+(4 rows)
+
+alter table select_distinct_a alter column uk1 set not null;
+-- test generic plan
+prepare pt as select * from distinct_v1;
+explain execute pt;
+ QUERY PLAN
+---------------------------------------------------------------------
+ Seq Scan on select_distinct_a (cost=0.00..13.90 rows=388 width=88)
+ Filter: (uk2 IS NOT NULL)
+(2 rows)
+
+alter table select_distinct_a alter column uk1 drop not null;
+explain execute pt;
+ QUERY PLAN
+---------------------------------------------------------------------------
+ HashAggregate (cost=15.84..17.84 rows=200 width=88)
+ Group Key: select_distinct_a.uk1, select_distinct_a.uk2
+ -> Seq Scan on select_distinct_a (cost=0.00..13.90 rows=388 width=88)
+ Filter: (uk2 IS NOT NULL)
+(4 rows)
+
+drop view distinct_v1;
+drop table select_distinct_a;
+drop table select_distinct_b;
diff --git a/src/test/regress/expected/sysviews.out b/src/test/regress/expected/sysviews.out
index a1c90eb905..e053214f9d 100644
--- a/src/test/regress/expected/sysviews.out
+++ b/src/test/regress/expected/sysviews.out
@@ -73,6 +73,7 @@ select name, setting from pg_settings where name like 'enable%';
name | setting
--------------------------------+---------
enable_bitmapscan | on
+ enable_distinct_elimination | on
enable_gathermerge | on
enable_hashagg | on
enable_hashjoin | on
@@ -89,7 +90,7 @@ select name, setting from pg_settings where name like 'enable%';
enable_seqscan | on
enable_sort | on
enable_tidscan | on
-(17 rows)
+(18 rows)
-- Test that the pg_timezone_names and pg_timezone_abbrevs views are
-- more-or-less working. We can't test their contents in any great detail
diff --git a/src/test/regress/sql/select_distinct.sql b/src/test/regress/sql/select_distinct.sql
index a605e86449..813361ad89 100644
--- a/src/test/regress/sql/select_distinct.sql
+++ b/src/test/regress/sql/select_distinct.sql
@@ -73,3 +73,87 @@ SELECT 1 IS NOT DISTINCT FROM 2 as "no";
SELECT 2 IS NOT DISTINCT FROM 2 as "yes";
SELECT 2 IS NOT DISTINCT FROM null as "no";
SELECT null IS NOT DISTINCT FROM null as "yes";
+create table select_distinct_a(pk1 int, pk2 char(20), uk1 char(20) not null, uk2 int, e int, primary key(pk1, pk2));
+create unique index select_distinct_a_uk on select_distinct_a(uk1, uk2);
+create table select_distinct_b(a int, b char(20), pk1 char(20), pk2 int, e int, primary key(pk1, pk2));
+
+-- distinct erased since (pk1, pk2)
+explain (costs off) select distinct * from select_distinct_a;
+
+-- distinct can't be erased since we require all the uk columns to be not null
+explain (costs off) select distinct uk1, uk2 from select_distinct_a;
+
+-- distinct erased since uk + not null
+explain (costs off) select distinct uk1, uk2 from select_distinct_a where uk2 is not null;
+explain (costs off) select distinct uk1, uk2 from select_distinct_a where uk2 > 1;
+
+-- distinct erased due to group by
+explain select distinct e from select_distinct_a group by e;
+
+-- distinct erased due to the restrictinfo
+explain select distinct uk1 from select_distinct_a where pk1 = 1 and pk2 = 'c';
+
+-- test join
+set enable_mergejoin to off;
+set enable_hashjoin to off;
+
+insert into select_distinct_a values(1, 'a', 'a', 0, 1), (1, 'b', 'A', 0, 2), (3, 'c', 'c', 0, 3);
+insert into select_distinct_b values(1, 'a', 'a', 0, 1), (4, 'd', 'd', 0, 4), (1, 'e', 'e', 0, 5);
+
+-- Cartesian join
+explain (costs off) select distinct a.uk1, a.uk2, b.pk1, b.pk2 from select_distinct_a a, select_distinct_b b where a.uk2 is not null;
+select distinct a.uk1, a.uk2, b.pk1, b.pk2 from select_distinct_a a, select_distinct_b b where a.uk2 is not null order by 1, 2, 3, 4;
+
+
+-- left join
+explain select distinct a.pk1, a.pk2, b.pk1, b.pk2 from select_distinct_a a left join select_distinct_b b on (a.pk1 = b.a);
+select distinct a.pk1, a.pk2, b.pk1, b.pk2 from select_distinct_a a left join select_distinct_b b on (a.pk1 = b.a) order by 1, 2, 3, 4;
+
+-- right join
+explain select distinct a.pk1, a.pk2, b.pk1, b.pk2 from select_distinct_a a right join select_distinct_b b on (a.pk1 = b.a);
+select distinct a.pk1, a.pk2, b.pk1, b.pk2 from select_distinct_a a right join select_distinct_b b on (a.pk1 = b.a) order by 1, 2, 3, 4;
+
+-- full join
+explain select distinct a.pk1, a.pk2, b.pk1, b.pk2 from select_distinct_a a full outer join select_distinct_b b on (a.pk1 = b.a);
+select distinct a.pk1, a.pk2, b.pk1, b.pk2 from select_distinct_a a full outer join select_distinct_b b on (a.pk1 = b.a) order by 1, 2, 3, 4;
+
+-- distinct can't be erased since b.pk2 is missing
+explain select distinct a.pk1, a.pk2, b.pk1 from select_distinct_a a full outer join select_distinct_b b on (a.pk1 = b.a);
+
+
+-- Semi/anti join
+explain (costs off) select distinct pk1, pk2 from select_distinct_a where pk1 in (select a from select_distinct_b);
+explain (costs off) select distinct pk1, pk2 from select_distinct_a where pk1 not in (select a from select_distinct_b);
+
+-- we can also handle some limited subqueries
+explain select distinct * from select_distinct_a a, (select a from select_distinct_b group by a) b where a.pk1 = b.a;
+select distinct * from select_distinct_a a, (select a from select_distinct_b group by a) b where a.pk1 = b.a order by 1, 2, 3;
+
+explain select distinct * from select_distinct_a a, (select distinct a from select_distinct_b) b where a.pk1 = b.a;
+select distinct * from select_distinct_a a, (select distinct a from select_distinct_b) b where a.pk1 = b.a order by 1, 2, 3;
+
+-- Distinct On
+-- can't erase since pk2 is missing
+explain (costs off) select distinct on(pk1) pk1, pk2 from select_distinct_a;
+-- ok to erase
+explain (costs off) select distinct on(pk1, pk2) pk1, pk2 from select_distinct_a;
+
+
+-- test a view.
+create view distinct_v1 as select distinct uk1, uk2 from select_distinct_a where uk2 is not null;
+explain select * from distinct_v1;
+
+alter table select_distinct_a alter column uk1 drop not null;
+explain select * from distinct_v1;
+
+alter table select_distinct_a alter column uk1 set not null;
+
+-- test generic plan
+prepare pt as select * from distinct_v1;
+explain execute pt;
+alter table select_distinct_a alter column uk1 drop not null;
+explain execute pt;
+
+drop view distinct_v1;
+drop table select_distinct_a;
+drop table select_distinct_b;
--
2.21.0
Uploading the newest patch so that the cfbot can pass. The last patch
failed because some explains were run without (costs off).
I'm still working out how to handle aggregation calls without an
aggregation path. Probably we can get there by hacking some
ExprEvalPushStep handling for the Aggref node. But since the current
patch is not closely tied to this, I would like to put it up for review
first.
Attachments:
v3-0001-PATCH-Erase-the-distinct-path-if-the-result-is-un.patchapplication/octet-stream; name=v3-0001-PATCH-Erase-the-distinct-path-if-the-result-is-un.patchDownload
From 8c5deb1b79d960502f5878ee8750f95ad50d1e74 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=E4=B8=80=E6=8C=83?= <yizhi.fzh@alibaba-inc.com>
Date: Wed, 4 Mar 2020 14:33:56 +0800
Subject: [PATCH v3] [PATCH] Erase the distinct path if the result is unique by
catalog
For a single relation, we can tell it if any one of the following
is true:
1. The pk is in the target list.
2. The uk is in the target list and its columns are not null.
3. The columns in the group-by clause are also in the target list.
For a relation join, we can tell it by:
if every relation in the jointree yields a unique result set, then
the final result is unique as well, regardless of the join method.
---
.../postgres_fdw/expected/postgres_fdw.out | 28 +-
src/backend/optimizer/path/costsize.c | 1 +
src/backend/optimizer/plan/analyzejoins.c | 184 +++++++++++-
src/backend/optimizer/plan/planner.c | 27 ++
src/backend/optimizer/util/plancat.c | 9 +
src/backend/utils/misc/guc.c | 10 +
src/include/nodes/pathnodes.h | 1 +
src/include/optimizer/cost.h | 1 +
src/include/optimizer/planmain.h | 2 +
src/test/regress/expected/aggregates.out | 13 +-
src/test/regress/expected/join.out | 16 +-
src/test/regress/expected/select_distinct.out | 276 ++++++++++++++++++
src/test/regress/expected/sysviews.out | 3 +-
src/test/regress/sql/select_distinct.sql | 84 ++++++
14 files changed, 619 insertions(+), 36 deletions(-)
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 84fd3ad2e0..215f10bf7d 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -2902,22 +2902,20 @@ select sum(c1%3), sum(distinct c1%3 order by c1%3) filter (where c1%3 < 2), c2 f
-- Outer query is aggregation query
explain (verbose, costs off)
select distinct (select count(*) filter (where t2.c2 = 6 and t2.c1 < 10) from ft1 t1 where t1.c1 = 6) from ft2 t2 where t2.c2 % 6 = 0 order by 1;
- QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
- Unique
+ QUERY PLAN
+------------------------------------------------------------------------------------------------------------------------
+ Sort
Output: ((SubPlan 1))
- -> Sort
- Output: ((SubPlan 1))
- Sort Key: ((SubPlan 1))
- -> Foreign Scan
- Output: (SubPlan 1)
- Relations: Aggregate on (public.ft2 t2)
- Remote SQL: SELECT count(*) FILTER (WHERE ((c2 = 6) AND ("C 1" < 10))) FROM "S 1"."T 1" WHERE (((c2 % 6) = 0))
- SubPlan 1
- -> Foreign Scan on public.ft1 t1
- Output: (count(*) FILTER (WHERE ((t2.c2 = 6) AND (t2.c1 < 10))))
- Remote SQL: SELECT NULL FROM "S 1"."T 1" WHERE (("C 1" = 6))
-(13 rows)
+ Sort Key: ((SubPlan 1))
+ -> Foreign Scan
+ Output: (SubPlan 1)
+ Relations: Aggregate on (public.ft2 t2)
+ Remote SQL: SELECT count(*) FILTER (WHERE ((c2 = 6) AND ("C 1" < 10))) FROM "S 1"."T 1" WHERE (((c2 % 6) = 0))
+ SubPlan 1
+ -> Foreign Scan on public.ft1 t1
+ Output: (count(*) FILTER (WHERE ((t2.c2 = 6) AND (t2.c1 < 10))))
+ Remote SQL: SELECT NULL FROM "S 1"."T 1" WHERE (("C 1" = 6))
+(11 rows)
select distinct (select count(*) filter (where t2.c2 = 6 and t2.c1 < 10) from ft1 t1 where t1.c1 = 6) from ft2 t2 where t2.c2 % 6 = 0 order by 1;
count
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index b5a0033721..dde16b5d44 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -138,6 +138,7 @@ bool enable_partitionwise_aggregate = false;
bool enable_parallel_append = true;
bool enable_parallel_hash = true;
bool enable_partition_pruning = true;
+bool enable_distinct_elimination = true;
typedef struct
{
diff --git a/src/backend/optimizer/plan/analyzejoins.c b/src/backend/optimizer/plan/analyzejoins.c
index d0ff660284..dee152af29 100644
--- a/src/backend/optimizer/plan/analyzejoins.c
+++ b/src/backend/optimizer/plan/analyzejoins.c
@@ -30,6 +30,7 @@
#include "optimizer/paths.h"
#include "optimizer/planmain.h"
#include "optimizer/tlist.h"
+#include "parser/parsetree.h"
#include "utils/lsyscache.h"
/* local functions */
@@ -47,7 +48,8 @@ static bool is_innerrel_unique_for(PlannerInfo *root,
RelOptInfo *innerrel,
JoinType jointype,
List *restrictlist);
-
+static void transform_colno_for_subquery(Query *query, List *colnos, List *opids,
+ List **sub_colnos, List **sub_opids);
/*
* remove_useless_joins
@@ -801,9 +803,18 @@ query_is_distinct_for(Query *query, List *colnos, List *opids)
if (l == NULL) /* had matches for all? */
return true;
}
+ return query_is_distinct_agg(query, colnos, opids);
+}
+
+
+bool
+query_is_distinct_agg(Query *query, List *colnos, List *opids)
+{
+ ListCell *l;
+ Oid opid;
/*
- * Otherwise, a set-returning function in the query's targetlist can
+ * a set-returning function in the query's targetlist can
* result in returning duplicate rows, despite any grouping that might
* occur before tlist evaluation. (If all tlist SRFs are within GROUP BY
* columns, it would be safe because they'd be expanded before grouping.
@@ -901,7 +912,6 @@ query_is_distinct_for(Query *query, List *colnos, List *opids)
return true;
}
}
-
/*
* XXX Are there any other cases in which we can easily see the result
* must be distinct?
@@ -913,6 +923,174 @@ query_is_distinct_for(Query *query, List *colnos, List *opids)
return false;
}
+/*
+ * scan_non_semi_anti_relids
+ *
+ * Scan the jointree and collect rtindexes, excluding the right table of semi/anti joins.
+ */
+static void
+scan_non_semi_anti_relids(Node* jtnode, Relids* relids)
+{
+ if (jtnode == NULL)
+ return;
+
+ if (IsA(jtnode, RangeTblRef))
+ {
+ int varno = ((RangeTblRef *) jtnode)->rtindex;
+
+ *relids = bms_add_member(*relids, varno);
+ }
+ else if (IsA(jtnode, FromExpr))
+ {
+ FromExpr *f = (FromExpr *) jtnode;
+ ListCell *l;
+
+ foreach(l, f->fromlist)
+ scan_non_semi_anti_relids(lfirst(l), relids);
+ }
+ else if (IsA(jtnode, JoinExpr))
+ {
+ JoinExpr *j = (JoinExpr *) jtnode;
+
+ scan_non_semi_anti_relids(j->larg, relids);
+ if (j->jointype != JOIN_SEMI && j->jointype != JOIN_ANTI)
+ {
+ scan_non_semi_anti_relids(j->rarg, relids);
+ }
+ }
+ else
+ elog(ERROR, "unrecognized node type: %d",
+ (int) nodeTag(jtnode));
+
+}
+
+/*
+ * transform_colno_for_subquery
+ */
+static void
+transform_colno_for_subquery(Query *query, List *colnos, List *opids,
+ List **sub_colnos, List **sub_opids)
+{
+ ListCell *lc1, *lc2;
+ TargetEntry *tle;
+
+ forboth(lc1, colnos, lc2, opids)
+ {
+ tle = get_tle_by_resno(query->targetList, lfirst_int(lc1));
+ Assert(IsA(tle->expr, Var));
+ *sub_colnos = lappend_int(*sub_colnos, ((Var*)tle->expr)->varattno);
+ *sub_opids = lappend_oid(*sub_opids, lfirst_oid(lc2));
+ }
+}
+
+/*
+ * query_distinct_through_join
+ * If every relation in the join yields a unique result, then the join result
+ * is unique as well. We need to distinguish the right table of a semi/anti
+ * join, which we don't care about.
+ */
+bool
+query_distinct_through_join(PlannerInfo *root, List *colnos, List *opids)
+{
+ Query *query = root->parse;
+ Relids non_semi_anti_relids = NULL;
+
+ /* Used for relation_has_unique_index_for */
+ List **non_null_expr_per_table = NULL;
+ /* Used for query_is_distinct_for */
+ List **non_null_colno_per_table = NULL;
+ /* Used for both of the above */
+ List **non_null_opids_per_table = NULL;
+ /* Not null info from restrictinfo and catalog */
+ Bitmapset **non_null_varno_per_table = NULL;
+
+ int rt_index;
+ ListCell *lc1, *lc2;
+ RangeTblEntry *rte;
+ RelOptInfo *rel;
+ int max_rt_index = list_length(query->rtable) + 1;
+
+ /* Remove the relids for the right table in semi/anti join */
+ scan_non_semi_anti_relids((Node*)query->jointree, &non_semi_anti_relids);
+
+ non_null_varno_per_table = palloc0(max_rt_index * sizeof(Bitmapset *));
+
+ foreach(lc1, find_nonnullable_vars(query->jointree->quals))
+ {
+ Var *var;
+ if (!IsA(lfirst(lc1), Var))
+ continue;
+ var = lfirst_node(Var, lc1);
+ if (var->varno == INNER_VAR ||
+ var->varno == OUTER_VAR ||
+ var->varno == INDEX_VAR)
+ continue;
+ non_null_varno_per_table[var->varno] = bms_add_member(
+ non_null_varno_per_table[var->varno], var->varattno);
+ }
+
+ /* Add the non null info in catalog */
+ rt_index = -1;
+ while ((rt_index = bms_next_member(non_semi_anti_relids, rt_index)) >= 0 )
+ {
+ non_null_varno_per_table[rt_index] = bms_join(non_null_varno_per_table[rt_index],
+ root->simple_rel_array[rt_index]->not_null_cols_relids);
+ }
+
+ non_null_expr_per_table = palloc0(max_rt_index * sizeof(List *));
+ non_null_opids_per_table = palloc0(max_rt_index * sizeof(List *));
+ non_null_colno_per_table = palloc0(max_rt_index * sizeof(List *));
+
+ /* Filter out the nullable columns and split them per table */
+ forboth(lc1, colnos, lc2, opids)
+ {
+ int colno = lfirst_int(lc1);
+ TargetEntry *tle = get_tle_by_resno(query->targetList, colno);
+ Var *var = NULL;
+ if (!IsA(tle->expr, Var))
+ continue;
+ var = (Var *)tle->expr;
+ if (!bms_is_member(var->varattno, non_null_varno_per_table[var->varno]))
+ continue;
+ non_null_expr_per_table[var->varno] = lappend(
+ non_null_expr_per_table[var->varno], tle->expr);
+ non_null_opids_per_table[var->varno] = lappend_oid(
+ non_null_opids_per_table[var->varno], lfirst_oid(lc2));
+ non_null_colno_per_table[var->varno] = lappend_int(
+ non_null_colno_per_table[var->varno],
+ colno);
+ }
+
+ /* Check if every relation yields a unique result; if any one doesn't, return false */
+ rt_index = -1;
+ while ((rt_index = bms_next_member(non_semi_anti_relids, rt_index)) >= 0 )
+ {
+ rte = root->simple_rte_array[rt_index];
+ rel = root->simple_rel_array[rt_index];
+ if (rte->rtekind == RTE_RELATION &&
+ relation_has_unique_index_for(root, rel, NIL,
+ non_null_expr_per_table[rt_index],
+ non_null_opids_per_table[rt_index]))
+ continue;
+ if (rte->rtekind == RTE_SUBQUERY &&
+ query_supports_distinctness(rte->subquery))
+ {
+ List *subquery_colnos = NIL;
+ List *subquery_opids = NIL;
+ transform_colno_for_subquery(root->parse,
+ non_null_colno_per_table[rt_index],
+ non_null_opids_per_table[rt_index],
+ &subquery_colnos,
+ &subquery_opids);
+ if (query_is_distinct_for(rte->subquery, subquery_colnos, subquery_opids))
+ continue;
+ return false;
+ }
+ return false;
+ }
+ return true;
+}
+
/*
* distinct_col_search - subroutine for query_is_distinct_for
*
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index d6f2153593..9d56e6c88e 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -4737,6 +4737,33 @@ create_distinct_paths(PlannerInfo *root,
Path *path;
ListCell *lc;
+ if (enable_distinct_elimination)
+ {
+ List *colnos = NIL;
+ List *opnos = NIL;
+ ListCell *lc;
+
+ Assert(parse->distinctClause != NIL);
+
+ foreach(lc, parse->distinctClause)
+ {
+ SortGroupClause *sgc = lfirst_node(SortGroupClause, lc);
+ int idx = sgc->tleSortGroupRef;
+ TargetEntry *tle = get_tle_by_resno(parse->targetList, idx);
+ if (tle->resjunk)
+ continue;
+ /* even if column x is not null, f(x) may be null, so ignore it */
+ if (!IsA(tle->expr, Var))
+ continue;
+ colnos = lappend_int(colnos, idx);
+ opnos = lappend_oid(opnos, sgc->eqop);
+ }
+
+ if ((query_supports_distinctness(parse)
+ && query_is_distinct_agg(parse, colnos, opnos)) ||
+ query_distinct_through_join(root, colnos, opnos))
+ return input_rel;
+ }
/* For now, do all work in the (DISTINCT, NULL) upperrel */
distinct_rel = fetch_upper_rel(root, UPPERREL_DISTINCT, NULL);
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index d82fc5ab8b..e57b456d9b 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -117,6 +117,7 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
Relation relation;
bool hasindex;
List *indexinfos = NIL;
+ int i;
/*
* We need not lock the relation since it was already locked, either by
@@ -460,6 +461,14 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
if (inhparent && relation->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
set_relation_partition_info(root, rel, relation);
+ Assert(rel->not_null_cols_relids == NULL);
+ for(i = 0; i < relation->rd_att->natts; i++)
+ {
+ if (!relation->rd_att->attrs[i].attnotnull)
+ continue;
+ rel->not_null_cols_relids = bms_add_member(rel->not_null_cols_relids, i+1);
+ }
+
table_close(relation, NoLock);
/*
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index e44f71e991..fa798dd564 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -1064,6 +1064,16 @@ static struct config_bool ConfigureNamesBool[] =
true,
NULL, NULL, NULL
},
+ {
+ {"enable_distinct_elimination", PGC_USERSET, QUERY_TUNING_METHOD,
+ gettext_noop("Enables plan-time and run-time unique elimination."),
+ gettext_noop("Allows the query planner to remove the unnecessary distinct clause."),
+ GUC_EXPLAIN
+ },
+ &enable_distinct_elimination,
+ true,
+ NULL, NULL, NULL
+ },
{
{"geqo", PGC_USERSET, QUERY_TUNING_GEQO,
gettext_noop("Enables genetic query optimization."),
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 3d3be197e0..51db013f5d 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -687,6 +687,7 @@ typedef struct RelOptInfo
PlannerInfo *subroot; /* if subquery */
List *subplan_params; /* if subquery */
int rel_parallel_workers; /* wanted number of parallel workers */
+ Relids not_null_cols_relids; /* not null cols by catalog, starts at 1 */
/* Information about foreign tables and foreign joins */
Oid serverid; /* identifies server for the table or join */
diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h
index cb012ba198..4fa5d32df6 100644
--- a/src/include/optimizer/cost.h
+++ b/src/include/optimizer/cost.h
@@ -64,6 +64,7 @@ extern PGDLLIMPORT bool enable_partitionwise_aggregate;
extern PGDLLIMPORT bool enable_parallel_append;
extern PGDLLIMPORT bool enable_parallel_hash;
extern PGDLLIMPORT bool enable_partition_pruning;
+extern PGDLLIMPORT bool enable_distinct_elimination;
extern PGDLLIMPORT int constraint_exclusion;
extern double index_pages_fetched(double tuples_fetched, BlockNumber pages,
diff --git a/src/include/optimizer/planmain.h b/src/include/optimizer/planmain.h
index eab486a621..ebd4f24577 100644
--- a/src/include/optimizer/planmain.h
+++ b/src/include/optimizer/planmain.h
@@ -100,6 +100,8 @@ extern List *remove_useless_joins(PlannerInfo *root, List *joinlist);
extern void reduce_unique_semijoins(PlannerInfo *root);
extern bool query_supports_distinctness(Query *query);
extern bool query_is_distinct_for(Query *query, List *colnos, List *opids);
+extern bool query_is_distinct_agg(Query *query, List *colnos, List *opids);
+extern bool query_distinct_through_join(PlannerInfo *root, List *colnos, List *opids);
extern bool innerrel_is_unique(PlannerInfo *root,
Relids joinrelids, Relids outerrelids, RelOptInfo *innerrel,
JoinType jointype, List *restrictlist, bool force_cache);
diff --git a/src/test/regress/expected/aggregates.out b/src/test/regress/expected/aggregates.out
index f457b5b150..6712571578 100644
--- a/src/test/regress/expected/aggregates.out
+++ b/src/test/regress/expected/aggregates.out
@@ -870,14 +870,12 @@ explain (costs off)
select distinct max(unique2) from tenk1;
QUERY PLAN
---------------------------------------------------------------------
- HashAggregate
- Group Key: $0
+ Result
InitPlan 1 (returns $0)
-> Limit
-> Index Only Scan Backward using tenk1_unique2 on tenk1
Index Cond: (unique2 IS NOT NULL)
- -> Result
-(7 rows)
+(5 rows)
select distinct max(unique2) from tenk1;
max
@@ -1036,7 +1034,7 @@ explain (costs off)
select distinct min(f1), max(f1) from minmaxtest;
QUERY PLAN
---------------------------------------------------------------------------------------------
- Unique
+ Result
InitPlan 1 (returns $0)
-> Limit
-> Merge Append
@@ -1059,10 +1057,7 @@ explain (costs off)
-> Index Only Scan using minmaxtest2i on minmaxtest2 minmaxtest_8
Index Cond: (f1 IS NOT NULL)
-> Index Only Scan Backward using minmaxtest3i on minmaxtest3 minmaxtest_9
- -> Sort
- Sort Key: ($0), ($1)
- -> Result
-(26 rows)
+(23 rows)
select distinct min(f1), max(f1) from minmaxtest;
min | max
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index 761376b007..3f6595d53b 100644
--- a/src/test/regress/expected/join.out
+++ b/src/test/regress/expected/join.out
@@ -4433,17 +4433,17 @@ select d.* from d left join (select * from b group by b.id, b.c_id) s
explain (costs off)
select d.* from d left join (select distinct * from b) s
on d.a = s.id;
- QUERY PLAN
---------------------------------------
- Merge Right Join
- Merge Cond: (b.id = d.a)
- -> Unique
- -> Sort
- Sort Key: b.id, b.c_id
- -> Seq Scan on b
+ QUERY PLAN
+---------------------------------
+ Merge Left Join
+ Merge Cond: (d.a = s.id)
-> Sort
Sort Key: d.a
-> Seq Scan on d
+ -> Sort
+ Sort Key: s.id
+ -> Subquery Scan on s
+ -> Seq Scan on b
(9 rows)
-- check join removal works when uniqueness of the join condition is enforced
diff --git a/src/test/regress/expected/select_distinct.out b/src/test/regress/expected/select_distinct.out
index f3696c6d1d..c27e7d4b67 100644
--- a/src/test/regress/expected/select_distinct.out
+++ b/src/test/regress/expected/select_distinct.out
@@ -244,3 +244,279 @@ SELECT null IS NOT DISTINCT FROM null as "yes";
t
(1 row)
+create table select_distinct_a(pk1 int, pk2 char(20), uk1 char(20) not null, uk2 int, e int, primary key(pk1, pk2));
+create unique index select_distinct_a_uk on select_distinct_a(uk1, uk2);
+create table select_distinct_b(a int, b char(20), pk1 char(20), pk2 int, e int, primary key(pk1, pk2));
+-- distinct erased since (pk1, pk2)
+explain (costs off) select distinct * from select_distinct_a;
+ QUERY PLAN
+-------------------------------
+ Seq Scan on select_distinct_a
+(1 row)
+
+-- distinct can't be erased since we require that all the uk columns be not null
+explain (costs off) select distinct uk1, uk2 from select_distinct_a;
+ QUERY PLAN
+-------------------------------------
+ HashAggregate
+ Group Key: uk1, uk2
+ -> Seq Scan on select_distinct_a
+(3 rows)
+
+-- distinct erased since uk + not null
+explain (costs off) select distinct uk1, uk2 from select_distinct_a where uk2 is not null;
+ QUERY PLAN
+-------------------------------
+ Seq Scan on select_distinct_a
+ Filter: (uk2 IS NOT NULL)
+(2 rows)
+
+explain (costs off) select distinct uk1, uk2 from select_distinct_a where uk2 > 1;
+ QUERY PLAN
+-------------------------------
+ Seq Scan on select_distinct_a
+ Filter: (uk2 > 1)
+(2 rows)
+
+-- distinct erased due to group by
+explain (costs off) select distinct e from select_distinct_a group by e;
+ QUERY PLAN
+-------------------------------------
+ HashAggregate
+ Group Key: e
+ -> Seq Scan on select_distinct_a
+(3 rows)
+
+-- distinct erased due to the restrictinfo
+explain (costs off) select distinct uk1 from select_distinct_a where pk1 = 1 and pk2 = 'c';
+ QUERY PLAN
+--------------------------------------------------------------
+ Index Scan using select_distinct_a_pkey on select_distinct_a
+ Index Cond: ((pk1 = 1) AND (pk2 = 'c'::bpchar))
+(2 rows)
+
+-- test join
+set enable_mergejoin to off;
+set enable_hashjoin to off;
+insert into select_distinct_a values(1, 'a', 'a', 0, 1), (1, 'b', 'A', 0, 2), (3, 'c', 'c', 0, 3);
+insert into select_distinct_b values(1, 'a', 'a', 0, 1), (4, 'd', 'd', 0, 4), (1, 'e', 'e', 0, 5);
+-- Cartesian join
+explain (costs off) select distinct a.uk1, a.uk2, b.pk1, b.pk2 from select_distinct_a a, select_distinct_b b where a.uk2 is not null;
+ QUERY PLAN
+---------------------------------------------
+ Nested Loop
+ -> Seq Scan on select_distinct_b b
+ -> Materialize
+ -> Seq Scan on select_distinct_a a
+ Filter: (uk2 IS NOT NULL)
+(5 rows)
+
+select distinct a.uk1, a.uk2, b.pk1, b.pk2 from select_distinct_a a, select_distinct_b b where a.uk2 is not null order by 1, 2, 3, 4;
+ uk1 | uk2 | pk1 | pk2
+----------------------+-----+----------------------+-----
+ a | 0 | a | 0
+ a | 0 | d | 0
+ a | 0 | e | 0
+ A | 0 | a | 0
+ A | 0 | d | 0
+ A | 0 | e | 0
+ c | 0 | a | 0
+ c | 0 | d | 0
+ c | 0 | e | 0
+(9 rows)
+
+-- left join
+explain (costs off) select distinct a.pk1, a.pk2, b.pk1, b.pk2 from select_distinct_a a left join select_distinct_b b on (a.pk1 = b.a);
+ QUERY PLAN
+---------------------------------------------
+ Nested Loop Left Join
+ Join Filter: (a.pk1 = b.a)
+ -> Seq Scan on select_distinct_a a
+ -> Materialize
+ -> Seq Scan on select_distinct_b b
+(5 rows)
+
+select distinct a.pk1, a.pk2, b.pk1, b.pk2 from select_distinct_a a left join select_distinct_b b on (a.pk1 = b.a) order by 1, 2, 3, 4;
+ pk1 | pk2 | pk1 | pk2
+-----+----------------------+----------------------+-----
+ 1 | a | a | 0
+ 1 | a | e | 0
+ 1 | b | a | 0
+ 1 | b | e | 0
+ 3 | c | |
+(5 rows)
+
+-- right join
+explain (costs off) select distinct a.pk1, a.pk2, b.pk1, b.pk2 from select_distinct_a a right join select_distinct_b b on (a.pk1 = b.a);
+ QUERY PLAN
+---------------------------------------------------------------------------
+ Nested Loop Left Join
+ -> Seq Scan on select_distinct_b b
+ -> Index Only Scan using select_distinct_a_pkey on select_distinct_a a
+ Index Cond: (pk1 = b.a)
+(4 rows)
+
+select distinct a.pk1, a.pk2, b.pk1, b.pk2 from select_distinct_a a right join select_distinct_b b on (a.pk1 = b.a) order by 1, 2, 3, 4;
+ pk1 | pk2 | pk1 | pk2
+-----+----------------------+----------------------+-----
+ 1 | a | a | 0
+ 1 | a | e | 0
+ 1 | b | a | 0
+ 1 | b | e | 0
+ | | d | 0
+(5 rows)
+
+-- full join
+explain (costs off) select distinct a.pk1, a.pk2, b.pk1, b.pk2 from select_distinct_a a full outer join select_distinct_b b on (a.pk1 = b.a);
+ QUERY PLAN
+---------------------------------------------
+ Hash Full Join
+ Hash Cond: (a.pk1 = b.a)
+ -> Seq Scan on select_distinct_a a
+ -> Hash
+ -> Seq Scan on select_distinct_b b
+(5 rows)
+
+select distinct a.pk1, a.pk2, b.pk1, b.pk2 from select_distinct_a a full outer join select_distinct_b b on (a.pk1 = b.a) order by 1, 2, 3, 4;
+ pk1 | pk2 | pk1 | pk2
+-----+----------------------+----------------------+-----
+ 1 | a | a | 0
+ 1 | a | e | 0
+ 1 | b | a | 0
+ 1 | b | e | 0
+ 3 | c | |
+ | | d | 0
+(6 rows)
+
+-- distinct can't be erased since b.pk2 is missing
+explain (costs off) select distinct a.pk1, a.pk2, b.pk1 from select_distinct_a a full outer join select_distinct_b b on (a.pk1 = b.a);
+ QUERY PLAN
+---------------------------------------------------------
+ Unique
+ -> Sort
+ Sort Key: a.pk1, a.pk2, b.pk1
+ -> Hash Full Join
+ Hash Cond: (a.pk1 = b.a)
+ -> Seq Scan on select_distinct_a a
+ -> Hash
+ -> Seq Scan on select_distinct_b b
+(8 rows)
+
+-- Semi/anti join
+explain (costs off) select distinct pk1, pk2 from select_distinct_a where pk1 in (select a from select_distinct_b);
+ QUERY PLAN
+-------------------------------------------------------------------------
+ Nested Loop
+ -> HashAggregate
+ Group Key: select_distinct_b.a
+ -> Seq Scan on select_distinct_b
+ -> Index Only Scan using select_distinct_a_pkey on select_distinct_a
+ Index Cond: (pk1 = select_distinct_b.a)
+(6 rows)
+
+explain (costs off) select distinct pk1, pk2 from select_distinct_a where pk1 not in (select a from select_distinct_b);
+ QUERY PLAN
+---------------------------------------
+ Seq Scan on select_distinct_a
+ Filter: (NOT (hashed SubPlan 1))
+ SubPlan 1
+ -> Seq Scan on select_distinct_b
+(4 rows)
+
+-- we can also handle some limited subqueries
+explain (costs off) select distinct * from select_distinct_a a, (select a from select_distinct_b group by a) b where a.pk1 = b.a;
+ QUERY PLAN
+----------------------------------------------------------------------
+ Nested Loop
+ -> HashAggregate
+ Group Key: select_distinct_b.a
+ -> Seq Scan on select_distinct_b
+ -> Index Scan using select_distinct_a_pkey on select_distinct_a a
+ Index Cond: (pk1 = select_distinct_b.a)
+(6 rows)
+
+select distinct * from select_distinct_a a, (select a from select_distinct_b group by a) b where a.pk1 = b.a order by 1, 2, 3;
+ pk1 | pk2 | uk1 | uk2 | e | a
+-----+----------------------+----------------------+-----+---+---
+ 1 | a | a | 0 | 1 | 1
+ 1 | b | A | 0 | 2 | 1
+(2 rows)
+
+explain (costs off) select distinct * from select_distinct_a a, (select distinct a from select_distinct_b) b where a.pk1 = b.a;
+ QUERY PLAN
+----------------------------------------------------------------------
+ Nested Loop
+ -> HashAggregate
+ Group Key: select_distinct_b.a
+ -> Seq Scan on select_distinct_b
+ -> Index Scan using select_distinct_a_pkey on select_distinct_a a
+ Index Cond: (pk1 = select_distinct_b.a)
+(6 rows)
+
+select distinct * from select_distinct_a a, (select distinct a from select_distinct_b) b where a.pk1 = b.a order by 1, 2, 3;
+ pk1 | pk2 | uk1 | uk2 | e | a
+-----+----------------------+----------------------+-----+---+---
+ 1 | a | a | 0 | 1 | 1
+ 1 | b | A | 0 | 2 | 1
+(2 rows)
+
+-- Distinct On
+-- can't erase since pk2 is missing
+explain (costs off) select distinct on(pk1) pk1, pk2 from select_distinct_a;
+ QUERY PLAN
+-------------------------------------------
+ Unique
+ -> Sort
+ Sort Key: pk1
+ -> Seq Scan on select_distinct_a
+(4 rows)
+
+-- ok to erase
+explain (costs off) select distinct on(pk1, pk2) pk1, pk2 from select_distinct_a;
+ QUERY PLAN
+-------------------------------
+ Seq Scan on select_distinct_a
+(1 row)
+
+-- test some view.
+create view distinct_v1 as select distinct uk1, uk2 from select_distinct_a where uk2 is not null;
+explain (costs off) select * from distinct_v1;
+ QUERY PLAN
+-------------------------------
+ Seq Scan on select_distinct_a
+ Filter: (uk2 IS NOT NULL)
+(2 rows)
+
+alter table select_distinct_a alter column uk1 drop not null;
+explain (costs off) select * from distinct_v1;
+ QUERY PLAN
+-----------------------------------------------------------
+ HashAggregate
+ Group Key: select_distinct_a.uk1, select_distinct_a.uk2
+ -> Seq Scan on select_distinct_a
+ Filter: (uk2 IS NOT NULL)
+(4 rows)
+
+alter table select_distinct_a alter column uk1 set not null;
+-- test generic plan
+prepare pt as select * from distinct_v1;
+explain (costs off) execute pt;
+ QUERY PLAN
+-------------------------------
+ Seq Scan on select_distinct_a
+ Filter: (uk2 IS NOT NULL)
+(2 rows)
+
+alter table select_distinct_a alter column uk1 drop not null;
+explain (costs off) execute pt;
+ QUERY PLAN
+-----------------------------------------------------------
+ HashAggregate
+ Group Key: select_distinct_a.uk1, select_distinct_a.uk2
+ -> Seq Scan on select_distinct_a
+ Filter: (uk2 IS NOT NULL)
+(4 rows)
+
+drop view distinct_v1;
+drop table select_distinct_a;
+drop table select_distinct_b;
diff --git a/src/test/regress/expected/sysviews.out b/src/test/regress/expected/sysviews.out
index a1c90eb905..e053214f9d 100644
--- a/src/test/regress/expected/sysviews.out
+++ b/src/test/regress/expected/sysviews.out
@@ -73,6 +73,7 @@ select name, setting from pg_settings where name like 'enable%';
name | setting
--------------------------------+---------
enable_bitmapscan | on
+ enable_distinct_elimination | on
enable_gathermerge | on
enable_hashagg | on
enable_hashjoin | on
@@ -89,7 +90,7 @@ select name, setting from pg_settings where name like 'enable%';
enable_seqscan | on
enable_sort | on
enable_tidscan | on
-(17 rows)
+(18 rows)
-- Test that the pg_timezone_names and pg_timezone_abbrevs views are
-- more-or-less working. We can't test their contents in any great detail
diff --git a/src/test/regress/sql/select_distinct.sql b/src/test/regress/sql/select_distinct.sql
index a605e86449..282ca58cf9 100644
--- a/src/test/regress/sql/select_distinct.sql
+++ b/src/test/regress/sql/select_distinct.sql
@@ -73,3 +73,87 @@ SELECT 1 IS NOT DISTINCT FROM 2 as "no";
SELECT 2 IS NOT DISTINCT FROM 2 as "yes";
SELECT 2 IS NOT DISTINCT FROM null as "no";
SELECT null IS NOT DISTINCT FROM null as "yes";
+create table select_distinct_a(pk1 int, pk2 char(20), uk1 char(20) not null, uk2 int, e int, primary key(pk1, pk2));
+create unique index select_distinct_a_uk on select_distinct_a(uk1, uk2);
+create table select_distinct_b(a int, b char(20), pk1 char(20), pk2 int, e int, primary key(pk1, pk2));
+
+-- distinct erased since (pk1, pk2)
+explain (costs off) select distinct * from select_distinct_a;
+
+-- distinct can't be erased since we require that all the uk columns be not null
+explain (costs off) select distinct uk1, uk2 from select_distinct_a;
+
+-- distinct erased since uk + not null
+explain (costs off) select distinct uk1, uk2 from select_distinct_a where uk2 is not null;
+explain (costs off) select distinct uk1, uk2 from select_distinct_a where uk2 > 1;
+
+-- distinct erased due to group by
+explain (costs off) select distinct e from select_distinct_a group by e;
+
+-- distinct erased due to the restrictinfo
+explain (costs off) select distinct uk1 from select_distinct_a where pk1 = 1 and pk2 = 'c';
+
+-- test join
+set enable_mergejoin to off;
+set enable_hashjoin to off;
+
+insert into select_distinct_a values(1, 'a', 'a', 0, 1), (1, 'b', 'A', 0, 2), (3, 'c', 'c', 0, 3);
+insert into select_distinct_b values(1, 'a', 'a', 0, 1), (4, 'd', 'd', 0, 4), (1, 'e', 'e', 0, 5);
+
+-- Cartesian join
+explain (costs off) select distinct a.uk1, a.uk2, b.pk1, b.pk2 from select_distinct_a a, select_distinct_b b where a.uk2 is not null;
+select distinct a.uk1, a.uk2, b.pk1, b.pk2 from select_distinct_a a, select_distinct_b b where a.uk2 is not null order by 1, 2, 3, 4;
+
+
+-- left join
+explain (costs off) select distinct a.pk1, a.pk2, b.pk1, b.pk2 from select_distinct_a a left join select_distinct_b b on (a.pk1 = b.a);
+select distinct a.pk1, a.pk2, b.pk1, b.pk2 from select_distinct_a a left join select_distinct_b b on (a.pk1 = b.a) order by 1, 2, 3, 4;
+
+-- right join
+explain (costs off) select distinct a.pk1, a.pk2, b.pk1, b.pk2 from select_distinct_a a right join select_distinct_b b on (a.pk1 = b.a);
+select distinct a.pk1, a.pk2, b.pk1, b.pk2 from select_distinct_a a right join select_distinct_b b on (a.pk1 = b.a) order by 1, 2, 3, 4;
+
+-- full join
+explain (costs off) select distinct a.pk1, a.pk2, b.pk1, b.pk2 from select_distinct_a a full outer join select_distinct_b b on (a.pk1 = b.a);
+select distinct a.pk1, a.pk2, b.pk1, b.pk2 from select_distinct_a a full outer join select_distinct_b b on (a.pk1 = b.a) order by 1, 2, 3, 4;
+
+-- distinct can't be erased since b.pk2 is missing
+explain (costs off) select distinct a.pk1, a.pk2, b.pk1 from select_distinct_a a full outer join select_distinct_b b on (a.pk1 = b.a);
+
+
+-- Semi/anti join
+explain (costs off) select distinct pk1, pk2 from select_distinct_a where pk1 in (select a from select_distinct_b);
+explain (costs off) select distinct pk1, pk2 from select_distinct_a where pk1 not in (select a from select_distinct_b);
+
+-- we can also handle some limited subqueries
+explain (costs off) select distinct * from select_distinct_a a, (select a from select_distinct_b group by a) b where a.pk1 = b.a;
+select distinct * from select_distinct_a a, (select a from select_distinct_b group by a) b where a.pk1 = b.a order by 1, 2, 3;
+
+explain (costs off) select distinct * from select_distinct_a a, (select distinct a from select_distinct_b) b where a.pk1 = b.a;
+select distinct * from select_distinct_a a, (select distinct a from select_distinct_b) b where a.pk1 = b.a order by 1, 2, 3;
+
+-- Distinct On
+-- can't erase since pk2 is missing
+explain (costs off) select distinct on(pk1) pk1, pk2 from select_distinct_a;
+-- ok to erase
+explain (costs off) select distinct on(pk1, pk2) pk1, pk2 from select_distinct_a;
+
+
+-- test some view.
+create view distinct_v1 as select distinct uk1, uk2 from select_distinct_a where uk2 is not null;
+explain (costs off) select * from distinct_v1;
+
+alter table select_distinct_a alter column uk1 drop not null;
+explain (costs off) select * from distinct_v1;
+
+alter table select_distinct_a alter column uk1 set not null;
+
+-- test generic plan
+prepare pt as select * from distinct_v1;
+explain (costs off) execute pt;
+alter table select_distinct_a alter column uk1 drop not null;
+explain (costs off) execute pt;
+
+drop view distinct_v1;
+drop table select_distinct_a;
+drop table select_distinct_b;
--
2.21.0
On Sat, 7 Mar 2020 at 00:47, Andy Fan <zhihui.fan1213@gmail.com> wrote:
Upload the newest patch so that the cfbot can pass. The last patch failed
because some EXPLAINs were run without (costs off).
I've only really glanced at this patch, but I think we need to do this
in a completely different way.
I've been mentioning UniqueKeys around this mailing list for quite a
while now [1]. To summarise the idea:
1. Add a new List field to RelOptInfo named unique_keys
2. During get_relation_info() process the base relation's unique
indexes and add to the RelOptInfo's unique_keys list the indexed
expressions from each unique index (this may need to be delayed until
check_index_predicates() since predOK is only set there)
3. Perhaps in add_paths_to_joinrel(), or maybe when creating the join
rel itself (I've not looked for the best location in detail),
determine if the join can cause rows to be duplicated. If it can't,
then add the UniqueKeys from that rel. For example: SELECT * FROM t1
INNER JOIN t2 ON t1.unique = t2.not_unique; would have the joinrel for
{t1,t2} only take the unique keys from t2 (t1 can't duplicate t2 rows
since it's an equijoin and t1.unique has a unique index). If the
condition was t1.unique = t2.unique then we could take the unique keys
from both sides of the join, and with t1.non_unique = t2.non_unique,
we can take neither.
4. When creating the GROUP BY paths (when there are no aggregates),
don't bother doing anything if the input rel's unique keys are a
subset of the GROUP BY clause. Otherwise, create the group by paths
and tag the new unique keys onto the GROUP BY rel.
5. When creating the DISTINCT paths, don't bother if the input rel's
unique keys are a subset of the distinct clause.
4 and 5 will mean that: SELECT DISTINCT non_unique FROM t1 GROUP BY
non_unique will just uniquify once for the GROUP BY and not for the
distinct. SELECT DISTINCT unique FROM t1 GROUP BY unique; won't do
anything to uniquify the results.
Because both 4 and 5 require that the uniquekeys are a subset of the
distinct/group by clause, an empty uniquekey set would mean that the
RelOptInfo returns no more than 1 row. That would allow your:
SELECT DISTINCT max(non_unique) FROM t1; to skip doing the DISTINCT part.
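As a sketch of that subset rule (assuming a hypothetical t1 with a
NOT NULL unique column u):

-- unique keys {u} are a subset of the DISTINCT clause {u, b}: no-op
select distinct u, b from t1;
-- unique keys {u} are not a subset of {b}: must still uniquify
select distinct b from t1;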
There's a separate effort in
https://commitfest.postgresql.org/27/1741/ to implement some parts of
the uniquekeys idea. However the implementation currently only covers
adding the unique keys to Paths, not to RelOptInfos.
I also believe that the existing code in analyzejoins.c should be
edited to make use of unique keys rather than how it looks at unique
indexes and group by / distinct clauses.
[1]: https://www.postgresql.org/search/?m=1&ln=pgsql-hackers&q=uniquekeys
Hi David:
3. Perhaps in add_paths_to_joinrel(), or maybe when creating the join
rel itself (I've not looked for the best location in detail),
determine if the join can cause rows to be duplicated. If it can't,
then add the UniqueKeys from that rel.
I have some concerns about this method; maybe I misunderstand
something, and if so, please advise.
In my current implementation, it calculates the uniqueness for each
BaseRel only, but in your way it looks like we need to calculate the
UniquePathKey for both BaseRel and JoinRel. This makes more of a
difference for multi-table joins. Another concern is that UniquePathKey
is designed for a general purpose, so we need to maintain it whether or
not there is a distinctClause/groupbyClause.
For example: SELECT * FROM t1
INNER JOIN t2 ON t1.unique = t2.not_unique; would have the joinrel for
{t1,t2} only take the unique keys from t2 (t1 can't duplicate t2 rows
since it's an equijoin and t1.unique has a unique index).
Thanks for raising this. My current rule requires that *every* relation
yields a unique result, regardless of the join method. Actually I want to
make the rule less strict; for example, we may need just 1 table to yield
a unique result and the final result will be unique as well under some
join types.
As for t1 INNER JOIN t2 ON t1.unique = t2.not_unique, it looks like we
can't remove the distinct based on this:
create table m1(a int primary key, b int);
create table m2(a int primary key, b int);
insert into m1 values(1, 1), (2, 1);
insert into m2 values(1, 1), (2, 1);
select distinct m1.a from m1, m2 where m1.a = m2.b;
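Without the DISTINCT, m1.a = 1 matches both m2 rows (m2.b is 1 in each),
so the join itself yields duplicate values of m1.a:

select m1.a from m1, m2 where m1.a = m2.b;
 a
---
 1
 1
(2 rows)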
SELECT DISTINCT max(non_unique) FROM t1; to skip doing the DISTINCT part.
Actually I want to keep the distinct for this case now. One reason is
that only 1 row is returned, so erasing the distinct can't help much. The
more important reason is that Query->hasAggs is true for "select distinct
(select count(*) filter (where t2.c2 = 6 and t2.c1 < 10) from ft1 t1 where
t1.c1 = 6) from ft2 t2 where t2.c2 % 6 = 0 order by 1;"
(this SQL came from postgres_fdw.sql).
There's a separate effort in
https://commitfest.postgresql.org/27/1741/ to implement some parts of
the uniquekeys idea. However the implementation currently only covers
adding the unique keys to Paths, not to RelOptInfos
Thanks for this info. I guess this patch is not merged so far, but it
looks like the cfbot for my patch [1] failed due to this :( (search for
"explain (costs off) select distinct on(pk1) pk1, pk2 from
select_distinct_a;")
I also believe that the existing code in analyzejoins.c should be
edited to make use of unique keys rather than how it looks at unique
indexes and group by / distinct clauses.
I can do this after we have agreement on the UniquePath.
For my cfbot failure, another strange thing is that "A" appears ahead of
"a" after the ORDER BY. I still haven't found out why.
[1]: https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.83298
Regards
Andy Fan
On Tue, Mar 10, 2020 at 3:51 AM David Rowley <dgrowleyml@gmail.com> wrote:
On Sat, 7 Mar 2020 at 00:47, Andy Fan <zhihui.fan1213@gmail.com> wrote:
Upload the newest patch so that the cfbot can pass. The last patch failed
because some EXPLAINs were run without (costs off).
I've only really glanced at this patch, but I think we need to do this
in a completely different way.
I've been mentioning UniqueKeys around this mailing list for quite a
while now [1]. To summarise the idea:
1. Add a new List field to RelOptInfo named unique_keys
2. During get_relation_info() process the base relation's unique
indexes and add to the RelOptInfo's unique_keys list the indexed
expressions from each unique index (this may need to be delayed until
check_index_predicates() since predOK is only set there)
3. Perhaps in add_paths_to_joinrel(), or maybe when creating the join
rel itself (I've not looked for the best location in detail),
build_*_join_rel() will be a good place for this. The paths created might
take advantage of this information for costing.
determine if the join can cause rows to be duplicated. If it can't,
then add the UniqueKeys from that rel. For example: SELECT * FROM t1
INNER JOIN t2 ON t1.unique = t2.not_unique; would have the joinrel for
{t1,t2} only take the unique keys from t2 (t1 can't duplicate t2 rows
since it's an equijoin and t1.unique has a unique index).
This is interesting.
If the
condition was t1.unique = t2.unique then we could take the unique keys
from both sides of the join, and with t1.non_unique = t2.non_unique,
we can take neither.
4. When creating the GROUP BY paths (when there are no aggregates),
don't bother doing anything if the input rel's unique keys are a
subset of the GROUP BY clause. Otherwise, create the group by paths
and tag the new unique keys onto the GROUP BY rel.
5. When creating the DISTINCT paths, don't bother if the input rel's
unique keys are a subset of the distinct clause.
Thanks for laying this out in more detail. Two more cases can be added to
this:
6. When creating RelOptInfo for a grouped/aggregated result, if all the
columns of the group by clause are part of the result, i.e. the targetlist,
the columns in the group by clause serve as the unique keys of the result. So the
corresponding RelOptInfo can be marked as such.
7. The result of DISTINCT is unique for the columns contained in the
DISTINCT clause. Hence we can add those columns to the unique_key of the
RelOptInfo representing the result of the distinct clause.
8. If each partition of a partitioned table has a unique key with the same
columns in it and that happens to be a superset of the partition key, then
the whole partitioned table gets that unique key as well.
With this we could actually pass the uniqueness information through
Subquery scans as well, and the overall query will benefit from that.
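For instance, with points 6 and 7 (a sketch; any table t with a
non-unique column a would do):

select distinct a from (select a from t group by a) s;

The inner GROUP BY makes s.a unique, and carrying that through the
subquery scan would let the outer DISTINCT be elided.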
4 and 5 will mean that: SELECT DISTINCT non_unique FROM t1 GROUP BY
non_unique will just uniquify once for the GROUP BY and not for the
distinct. SELECT DISTINCT unique FROM t1 GROUP BY unique; won't do
anything to uniquify the results.
Because both 4 and 5 require that the uniquekeys are a subset of the
distinct/group by clause, an empty uniquekey set would mean that the
RelOptInfo returns no more than 1 row. That would allow your:
SELECT DISTINCT max(non_unique) FROM t1; to skip doing the DISTINCT part.
There's a separate effort in
https://commitfest.postgresql.org/27/1741/ to implement some parts of
the uniquekeys idea. However the implementation currently only covers
adding the unique keys to Paths, not to RelOptInfos.
I haven't looked at that patch, but as discussed upthread, in this case we
want the uniqueness associated with the RelOptInfo and not the path.
I also believe that the existing code in analyzejoins.c should be
edited to make use of unique keys rather than how it looks at unique
indexes and group by / distinct clauses.
+1.
--
Best Wishes,
Ashutosh Bapat
Hi Andy,
On Tue, Mar 10, 2020 at 1:49 PM Andy Fan <zhihui.fan1213@gmail.com> wrote:
Hi David:
3. Perhaps in add_paths_to_joinrel(), or maybe when creating the join
rel itself (I've not looked for the best location in detail),
determine if the join can cause rows to be duplicated. If it can't,
then add the UniqueKeys from that rel.
I have some concerns about this method, maybe I misunderstand
something, if so, please advise.
In my current implementation, it calculates the uniqueness for each
BaseRel only, but in your way, looks we need to calculate the
UniquePathKey for both BaseRel and JoinRel. This makes more
difference for multi table join.
I didn't understand this concern. I think it would be better to do it
for all kinds of relation types: base, join, etc. This way we are sure
that one method works across the planner to eliminate the need for
Distinct or grouping. If we just implement something for base
relations right now and don't do that for joins, there is a chance
that that method may not work for joins when we come to implement it.
Another concern is UniquePathKey
is designed for a general purpose, we need to maintain it no matter
distinctClause/groupbyClause.
This should be ok. The time spent in annotating a RelOptInfo about
uniqueness is not going to be a lot. But doing so would help generic
elimination of Distinct/Group/Unique operations. What is
UniquePathKey? I didn't find it in your patch or in the code.
For example: SELECT * FROM t1
INNER JOIN t2 ON t1.unique = t2.not_unique; would have the joinrel for
{t1,t2} only take the unique keys from t2 (t1 can't duplicate t2 rows
since it's an equijoin and t1.unique has a unique index).

Thanks for raising this. My current rule requires that *every* relation
yields a unique result, regardless of the join method. Actually I want to
make the rule less strict; for example, we may need just 1 table to yield
a unique result and the final result will be unique as well under some
join types.
That is desirable.
As for the t1 INNER JOIN t2 ON t1.unique = t2.not_unique; looks we can't
remove the distinct based on this.
create table m1(a int primary key, b int);
create table m2(a int primary key, b int);
insert into m1 values(1, 1), (2, 1);
insert into m2 values(1, 1), (2, 1);
select distinct m1.a from m1, m2 where m1.a = m2.b;
IIUC, David's rule is the other way round. "select distinct m2.a from m1,
m2 where m1.a = m2.b" won't need a DISTINCT node, since the result of
joining m1 and m2 has a unique value of m2.a for each row. In your
example the join will produce two rows (m1.a, m1.b, m2.a, m2.b): (1, 1,
1, 1) and (1, 1, 2, 1), where m2.a is the unique key.
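With the same data (m2.b = 1 in both rows, each matching only the m1 row
with a = 1):

select m2.a from m1, m2 where m1.a = m2.b;
 a
---
 1
 2
(2 rows)

Each m2 row joins exactly one m1 row, so m2.a stays unique without a
Distinct node.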
--
Best Wishes,
Ashutosh Bapat
Hi Tom & David & Bapat:
Thanks for your review so far. I want to summarize the current issues to
help
our following discussion.
1. Shall we bypass the AggNode as well with the same logic?
I think yes, since the rules for bypassing an AggNode and a UniqueNode are
exactly the same. The difficulty of bypassing AggNode is that the current
aggregate function call is closely coupled with AggNode. In the past few
days, I have made the aggregate call run without AggNode (at least I
tested sum (without a finalize fn) and avg (with a finalize fn)). But
there are a few things still to do, like the ACL check, the anynull check,
and maybe more; there is also some MemoryContext mess to fix. I still need
some time for this goal, so I think its complexity deserves another thread
to discuss it; any thoughts?
2. Shall we use the UniquePath as David suggested?
Actually I am leaning that way now. David, can you share more insights
about the benefits of UniquePath? Costing should be one of them; another
may be changing a semi join to a normal join, as the current
innerrel_is_unique does. Any others?
3. Can we make the rule more general?
Currently it requires that every relation yields a unique result. David &
Bapat provided another example: select m2.pk from m1, m2 where m1.pk =
m2.non_unique_key. That's interesting and not easy to handle in my current
framework. This is another reason I want to take the UniquePath framework.
Do we have any other rules to think about before implementing it?
Thanks for your feedback.
This should be ok. The time spent in annotating a RelOptInfo about
uniqueness is not going to be a lot. But doing so would help generic
elimination of Distinct/Group/Unique operations. What is
UniquePathKey; I didn't find this in your patch or in the code.
This is a proposal from David, so it is not in the current patch/code :)
Regards
Andy Fan
On Wed, 11 Mar 2020 at 02:50, Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
On Tue, Mar 10, 2020 at 1:49 PM Andy Fan <zhihui.fan1213@gmail.com> wrote:
In my current implementation, it calculates the uniqueness for each
BaseRel only, but in your way, looks we need to calculate the
UniquePathKey for both BaseRel and JoinRel. This makes more
difference for multi table join.
I didn't understand this concern. I think, it would be better to do it
for all kinds of relation types base, join etc. This way we are sure
that one method works across the planner to eliminate the need for
Distinct or grouping. If we just implement something for base
relations right now and don't do that for joins, there is a chance
that that method may not work for joins when we come to implement it.
Yeah, it seems to me that we're seeing more and more features that
require knowledge of uniqueness of a RelOptInfo. The skip scans patch
needs to know if a join will cause row duplication so it knows if the
skip scan path can be joined to without messing up the uniqueness of
the skip scan. Adding more and more places that loop over the rel's
indexlist just does not seem the right way to do it, especially so
when you have to dissect the join rel down to its base rel components
to check which indexes there are. Having the knowledge on-hand at the
RelOptInfo level means we no longer have to look at indexes for unique
proofs.
Another concern is UniquePathKey
is designed for a general purpose, we need to maintain it no matter
distinctClause/groupbyClause.
This should be ok. The time spent in annotating a RelOptInfo about
uniqueness is not going to be a lot. But doing so would help generic
elimination of Distinct/Group/Unique operations. What is
UniquePathKey; I didn't find this in your patch or in the code.
Possibly a misinterpretation. There is some overlap between this patch
and the skip scan patch, both would like to skip doing explicit work
to implement DISTINCT. Skip scans just go about it by adding new path
types that scan the index and only gather up unique values. In that
case, the RelOptInfo won't contain the unique keys, but the skip scan
path will. How I imagine both of these patches working together in
create_distinct_paths() is that we first check if the DISTINCT clause
is a subset of the set of the RelOptInfo's unique keys (this patch),
else we check if there are any paths with uniquekeys that we can use
to perform a no-op on the DISTINCT clause (the skip scan patch), if
none of those apply, we create the required paths to uniquify the
results.
On Tue, 10 Mar 2020 at 21:19, Andy Fan <zhihui.fan1213@gmail.com> wrote:
SELECT DISTINCT max(non_unique) FROM t1; to skip doing the DISTINCT part.
Actually I want to keep the distinct for this case now. One reason is there are only 1
row returned, so the distinct erasing can't help much. The more important reason is
Query->hasAggs is true for "select distinct (select count(*) filter (where t2.c2 = 6
and t2.c1 < 10) from ft1 t1 where t1.c1 = 6) from ft2 t2 where t2.c2 % 6 = 0 order by 1;"
(this sql came from postgres_fdw.sql).
I think that sort of view is part of the problem here. If you want to
invent some new way to detect uniqueness that does not count that case
then we have more code with more possible places to have bugs.
FWIW, query_is_distinct_for() does detect that case with:
    /*
     * If we have no GROUP BY, but do have aggregates or HAVING, then the
     * result is at most one row so it's surely unique, for any operators.
     */
    if (query->hasAggs || query->havingQual)
        return true;
which can be seen by the fact that the following finds the unique join on t2.
postgres=# explain verbose select * from t1 inner join (select
count(*) c from t1) t2 on t1.a=t2.c;
QUERY PLAN
------------------------------------------------------------------------------------
Hash Join (cost=41.91..84.25 rows=13 width=12)
Output: t1.a, (count(*))
Inner Unique: true
Hash Cond: (t1.a = (count(*)))
-> Seq Scan on public.t1 (cost=0.00..35.50 rows=2550 width=4)
Output: t1.a
-> Hash (cost=41.89..41.89 rows=1 width=8)
Output: (count(*))
-> Aggregate (cost=41.88..41.88 rows=1 width=8)
Output: count(*)
-> Seq Scan on public.t1 t1_1 (cost=0.00..35.50 rows=2550 width=0)
Output: t1_1.a
(12 rows)
It will be very simple to add an empty List of UniqueKeys to the GROUP
BY's RelOptInfo to indicate that all expressions are unique. That way,
any code that checks whether some of the RelOptInfo's unique keys are a
subset of the expressions it would like to prove unique will get a
match.
It does not really matter how much effort is saved in your example
above. The UniqueKey infrastructure won't know how much effort
properly adding all the uniquekeys will save. It should just add all
the keys it can and let whichever code cares about that reap the
benefits.
On Wed, Mar 11, 2020 at 6:49 AM David Rowley <dgrowleyml@gmail.com> wrote:
[...]
Now I am convinced that we should maintain UniquePath on RelOptInfo.
I will see how to make it work with the "Index Skip Scan" patch.
On Tue, Mar 10, 2020 at 9:12 PM Andy Fan <zhihui.fan1213@gmail.com> wrote:
Hi Tom & David & Bapat:
Thanks for your review so far. I want to summarize the current issues to help
our following discussion.
1. Shall we bypass the AggNode as well with the same logic.
I think yes, since the rules to bypass a AggNode and UniqueNode is exactly same.
The difficulty of bypassing AggNode is the current aggregation function call is closely
coupled with AggNode. In the past few days, I have make the aggregation call can
run without AggNode (at least I tested sum(without finalized fn), avg (with finalized fn)).
But there are a few things to do, like acl check, anynull check and maybe more check.
also there are some MemoryContext mess up need to fix.
I still need some time for this goal, so I think the complex of it deserves another thread
to discuss it, any thought?
I think if the relation underlying an Agg node is known to be unique
for the given groupByClause, we could safely use the AGG_SORTED strategy.
Though the input is not ordered, it is effectively sorted (every group is
a single row), so for every row the Agg node can combine/finalize the
aggregate result.
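For example, with the tables from the v3 patch tests (a sketch): since
(pk1, pk2) is the primary key of select_distinct_a, every group below is a
single row, so a sorted Agg could finalize each row as it arrives without
any Sort beneath it:

select pk1, pk2, sum(e) from select_distinct_a group by pk1, pk2;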
--
Best Wishes,
Ashutosh Bapat
On Wed, Mar 11, 2020 at 4:19 AM David Rowley <dgrowleyml@gmail.com> wrote:
On Wed, 11 Mar 2020 at 02:50, Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
[...]
Yeah, it seems to me that we're seeing more and more features that
require knowledge of uniqueness of a RelOptInfo. The skip scans patch
needs to know if a join will cause row duplication so it knows if the
skip scan path can be joined to without messing up the uniqueness of
the skip scan. Adding more and more places that loop over the rel's
indexlist just does not seem the right way to do it, especially so
when you have to dissect the join rel down to its base rel components
to check which indexes there are. Having the knowledge on-hand at the
RelOptInfo level means we no longer have to look at indexes for unique
proofs.
+1. Yep. When we break a join down to its base relations, a partitioned
relation poses another challenge: the partitioned relation may not
have an index on it per se, but each partition may have one, and the
index key may happen to be part of the partition key. That case would be
easy to track through RelOptInfo instead of breaking a base rel down
into its child rels.
[...]
Possibly a misinterpretation. There is some overlap between this patch
and the skip scan patch, both would like to skip doing explicit work
to implement DISTINCT. Skip scans just go about it by adding new path
types that scan the index and only gathers up unique values. In that
case, the RelOptInfo won't contain the unique keys, but the skip scan
path will. How I imagine both of these patches working together in
create_distinct_paths() is that we first check if the DISTINCT clause
is a subset of one of the RelOptInfo's unique key sets (this patch),
else we check if there are any paths with uniquekeys that we can use
to perform a no-op on the DISTINCT clause (the skip scan patch), if
none of those apply, we create the required paths to uniquify the
results.
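A rough sketch of that ordering (find_path_with_uniquekeys is a hypothetical
helper here, not something from either patch):

/* Sketch of the check order in create_distinct_paths(); hypothetical. */
List       *distinct_exprs = get_sortgrouplist_exprs(parse->distinctClause,
                                                     parse->targetList);
Path       *path;

if (relation_has_uniquekeys_for(root, input_rel, distinct_exprs, false))
{
    /* this patch: the input is already unique, DISTINCT is a no-op */
    add_path(distinct_rel, (Path *) cheapest_input_path);
}
else if ((path = find_path_with_uniquekeys(input_rel, distinct_exprs)) != NULL)
{
    /* skip scan patch: some path already emits only distinct values */
    add_path(distinct_rel, path);
}
else
{
    /* fall back: build the usual Sort+Unique and HashAggregate paths */
}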
Looks good to me, though I have not looked at the index skip scan patch.
--
Best Wishes,
Ashutosh Bapat
On Wed, 11 Mar 2020 at 17:23, Andy Fan <zhihui.fan1213@gmail.com> wrote:
Now I am convinced that we should maintain UniquePath on RelOptInfo;
I will look into how it can work with the "Index Skip Scan" patch.
I've attached a very early proof of concept patch for unique keys.
The NULL detection stuff is not yet hooked up, so it'll currently do
the wrong thing for NULLable columns. I've left some code in there
with my current idea of how to handle that, but I'll need to add more
code both to look at the catalogue tables to see if there's a NOT NULL
constraint and also to check for strict quals that filter out NULLs.
Additionally, I've not hooked up the collation checking stuff yet. I
just wanted to see if it would work ok for non-collatable types first.
I've added a couple of lines to create_distinct_paths() to check if
the input_rel has the required UniqueKeys to skip doing the DISTINCT.
It seems to work, but my tests so far are limited to:
create table t1(a int primary key, b int);
create table t2(a int primary key, b int);
postgres=# -- t2 could duplicate t1, don't remove DISTINCT
postgres=# explain (costs off) select distinct t1.a from t1 inner join
t2 on t1.a = t2.b;
QUERY PLAN
----------------------------------
HashAggregate
Group Key: t1.a
-> Hash Join
Hash Cond: (t2.b = t1.a)
-> Seq Scan on t2
-> Hash
-> Seq Scan on t1
(7 rows)
postgres=# -- neither rel can duplicate the other due to join on PK. Remove DISTINCT
postgres=# explain (costs off) select distinct t1.a from t1 inner join
t2 on t1.a = t2.a;
QUERY PLAN
----------------------------
Hash Join
Hash Cond: (t1.a = t2.a)
-> Seq Scan on t1
-> Hash
-> Seq Scan on t2
(5 rows)
postgres=# -- t2.a cannot duplicate t1 and t1.a is unique. Remove DISTINCT
postgres=# explain (costs off) select distinct t1.a from t1 inner join
t2 on t1.b = t2.a;
QUERY PLAN
----------------------------
Hash Join
Hash Cond: (t1.b = t2.a)
-> Seq Scan on t1
-> Hash
-> Seq Scan on t2
(5 rows)
postgres=# -- t1.b can duplicate t2.a. Don't remove DISTINCT
postgres=# explain (costs off) select distinct t2.a from t1 inner join
t2 on t1.b = t2.a;
QUERY PLAN
----------------------------------
HashAggregate
Group Key: t2.a
-> Hash Join
Hash Cond: (t1.b = t2.a)
-> Seq Scan on t1
-> Hash
-> Seq Scan on t2
(7 rows)
postgres=# -- t1.a cannot duplicate t2.a. Remove DISTINCT.
postgres=# explain (costs off) select distinct t2.a from t1 inner join
t2 on t1.a = t2.b;
QUERY PLAN
----------------------------
Hash Join
Hash Cond: (t2.b = t1.a)
-> Seq Scan on t2
-> Hash
-> Seq Scan on t1
(5 rows)
I've also left a bunch of XXX comments for things that I know need more thought.
I believe we can propagate the joinrel's unique keys where the patch
is currently doing it. I understand that in
populate_joinrel_with_paths() we do things like swapping LEFT JOINs
for RIGHT JOINs and switching the input rels around, but we do so only
because it's equivalent, so I don't currently see why we can't take
the jointype from the SpecialJoinInfo. I need to know it, as I'll need
to ignore pushed-down RestrictInfos for outer joins.
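For reference, once the jointype is available the check would presumably
look something like this (a sketch; both macros already exist in core):

/* Skip pushed-down clauses for outer joins: they are filters applied
 * above the join rather than join clauses, so they prove nothing about
 * whether the other side can duplicate our rows. */
if (IS_OUTER_JOIN(jointype) &&
    RINFO_IS_PUSHED_DOWN(restrictinfo, joinrel->relids))
    continue;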
I'm posting now as I know I've been mentioning this UniqueKeys idea
for quite a while and if it's not something that's going to get off
the ground, then it's better to figure that out now.
Attachments:
bearly_poc_uniquekeys_v0.patchapplication/octet-stream; name=bearly_poc_uniquekeys_v0.patchDownload
diff --git a/src/backend/optimizer/path/Makefile b/src/backend/optimizer/path/Makefile
index 1e199ff66f..7b9820c25f 100644
--- a/src/backend/optimizer/path/Makefile
+++ b/src/backend/optimizer/path/Makefile
@@ -21,6 +21,7 @@ OBJS = \
joinpath.o \
joinrels.o \
pathkeys.o \
- tidpath.o
+ tidpath.o \
+ uniquekeys.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 905bbe77d8..10bcc0e4fa 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -579,6 +579,12 @@ set_plain_rel_size(PlannerInfo *root, RelOptInfo *rel, RangeTblEntry *rte)
*/
check_index_predicates(root, rel);
+ /*
+ * Now that we've marked which partial indexes are suitable, we can now
+ * build the relation's unique keys.
+ */
+ populate_baserel_uniquekeys(root, rel);
+
/* Mark rel with estimated output rows, width, etc */
set_baserel_size_estimates(root, rel);
}
diff --git a/src/backend/optimizer/path/joinrels.c b/src/backend/optimizer/path/joinrels.c
index a21c295b99..3dd060f926 100644
--- a/src/backend/optimizer/path/joinrels.c
+++ b/src/backend/optimizer/path/joinrels.c
@@ -752,6 +752,8 @@ make_join_rel(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2)
return joinrel;
}
+ propagate_unique_keys_to_joinrel(root, joinrel, rel1, rel2, restrictlist);
+
/* Add paths to the join relation. */
populate_joinrel_with_paths(root, rel1, rel2, joinrel, sjinfo,
restrictlist);
diff --git a/src/backend/optimizer/path/uniquekeys.c b/src/backend/optimizer/path/uniquekeys.c
new file mode 100644
index 0000000000..40261db1b3
--- /dev/null
+++ b/src/backend/optimizer/path/uniquekeys.c
@@ -0,0 +1,295 @@
+/*-------------------------------------------------------------------------
+ *
+ * uniquekeys.c
+ * Utilities for matching and building unique keys
+ *
+ * Portions Copyright (c) 2020, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/backend/optimizer/path/uniquekeys.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "nodes/makefuncs.h"
+#include "nodes/nodeFuncs.h"
+#include "optimizer/pathnode.h"
+
+/*
+ * populate_baserel_uniquekeys
+ * Populate 'baserel' uniquekeys list by looking at the rel's unique indexes
+ */
+void
+populate_baserel_uniquekeys(PlannerInfo *root, RelOptInfo *baserel)
+{
+ ListCell *lc;
+
+ Assert(baserel->rtekind == RTE_RELATION);
+
+ foreach(lc, baserel->indexlist)
+ {
+ IndexOptInfo *ind = (IndexOptInfo *) lfirst(lc);
+ UniqueKeySet *keyset;
+ List *keys;
+ int c;
+ int exprno;
+
+ /*
+ * If the index is not unique, or not immediately enforced, or if it's
+ * a partial index that doesn't match the query, it's useless here.
+ */
+ if (!ind->unique || !ind->immediate ||
+ (ind->indpred != NIL && !ind->predOK))
+ continue;
+
+ keys = NIL;
+ exprno = 0;
+
+ for (c = 0; c < ind->nkeycolumns; c++)
+ {
+ UniqueKey *key = makeNode(UniqueKey);
+
+ key->uk_collation = ind->indexcollations[c];
+ key->uk_opfamily = ind->opfamily[c];
+ /* XXX is this too lazy? Should I be building my own Var here from indexkeys[c]? */
+ key->uk_expr = copyObject(((TargetEntry *) list_nth(ind->indextlist, c))->expr);
+
+ keys = lappend(keys, key);
+ }
+
+ keyset = makeNode(UniqueKeySet);
+ /* XXX check and update the non_null_keys for NOT NULL Vars */
+ keyset->non_null_keys = NULL;
+ keyset->keys = keys;
+
+ baserel->uniquekeys = lappend(baserel->uniquekeys, keyset);
+ }
+}
+
+static bool
+relation_is_unique_for_keys(PlannerInfo *root, UniqueKeySet *keyset, List *exprs)
+{
+ ListCell *lc;
+
+ foreach(lc, keyset->keys)
+ {
+ UniqueKey *key = (UniqueKey *) lfirst(lc);
+ ListCell *lc2;
+ bool found = false;
+
+ foreach(lc2, exprs)
+ {
+ Expr *expr = (Expr *) lfirst(lc2);
+
+ /* XXX check collation */
+ if (equal(key->uk_expr, expr))
+ {
+ found = true;
+ break;
+ }
+ }
+
+ if (!found)
+ return false;
+ }
+
+ return true;
+}
+
+/*
+ * relation_has_uniquekeys_for
+ * Returns true if we have proofs that 'rel' cannot return multiple rows with
+ * the same values in each of 'exprs'. Otherwise returns false.
+ */
+bool
+relation_has_uniquekeys_for(PlannerInfo *root, RelOptInfo *rel, List *exprs,
+ bool req_nonnull)
+{
+ ListCell *lc;
+
+ foreach(lc, rel->uniquekeys)
+ {
+ UniqueKeySet *keyset = (UniqueKeySet *) lfirst(lc);
+
+ /*
+ * When we require the keys cannot produce NULL values, skip over sets where
+ * not all keys are marked as non-null.
+ */
+ if (req_nonnull && bms_num_members(keyset->non_null_keys) < list_length(keyset->keys))
+ continue;
+
+ if (relation_is_unique_for_keys(root, keyset, exprs))
+ return true;
+ }
+
+ return false;
+}
+
+/*
+ * clause_sides_match_join
+ * Determine whether a join clause is of the right form to use in this join.
+ *
+ * We already know that the clause is a binary opclause referencing only the
+ * rels in the current join. The point here is to check whether it has the
+ * form "outerrel_expr op innerrel_expr" or "innerrel_expr op outerrel_expr",
+ * rather than mixing outer and inner vars on either side. If it matches,
+ * we set the transient flag outer_is_left to identify which side is which.
+ */
+static inline bool
+clause_sides_match_join(RestrictInfo *rinfo, Relids outerrelids,
+ Relids innerrelids)
+{
+ if (bms_is_subset(rinfo->left_relids, outerrelids) &&
+ bms_is_subset(rinfo->right_relids, innerrelids))
+ {
+ /* lefthand side is outer */
+ rinfo->outer_is_left = true;
+ return true;
+ }
+ else if (bms_is_subset(rinfo->left_relids, innerrelids) &&
+ bms_is_subset(rinfo->right_relids, outerrelids))
+ {
+ /* righthand side is outer */
+ rinfo->outer_is_left = false;
+ return true;
+ }
+ return false; /* no good for these input relations */
+}
+
+static bool
+clauselist_matches_uniquekeys(List *clause_list, UniqueKeySet *keyset, bool outer_side)
+{
+ ListCell *lc;
+
+ foreach(lc, keyset->keys)
+ {
+ UniqueKey *key = (UniqueKey *)lfirst(lc);
+ ListCell *lc2;
+ bool matched_expr = false;
+
+ foreach(lc2, clause_list)
+ {
+ RestrictInfo *rinfo = (RestrictInfo *)lfirst(lc2);
+ Node *rexpr;
+
+ /*
+ * The condition's equality operator must be a member of the
+ * index opfamily, else it is not asserting the right kind of
+ * equality behavior for this index. We check this first
+ * since it's probably cheaper than match_index_to_operand().
+ */
+ if (!list_member_oid(rinfo->mergeopfamilies, key->uk_opfamily))
+ continue;
+
+ /*
+ * XXX at some point we may need to check collations here too.
+ * For the moment we assume all collations reduce to the same
+ * notion of equality.
+ */
+
+ /* OK, see if the condition operand matches the index key */
+ if (rinfo->outer_is_left != outer_side)
+ rexpr = get_rightop(rinfo->clause);
+ else
+ rexpr = get_leftop(rinfo->clause);
+
+ if (IsA(rexpr, RelabelType))
+ rexpr = (Node *)((RelabelType *)rexpr)->arg;
+
+ if (equal(rexpr, key->uk_expr))
+ {
+ matched_expr = true;
+ break;
+ }
+ }
+
+ if (!matched_expr)
+ return false;
+ }
+
+ return true;
+}
+
+/*
+ * propagate_unique_keys_to_joinrel
+ * Using 'restrictlist' determine if rel2 can duplicate rows in rel1 and
+ * vice-versa. If the relation at the other side of the join cannot
+ * cause row duplication, then tag the uniquekeys for the relation onto
+ * 'joinrel's uniquekey list.
+ */
+void
+propagate_unique_keys_to_joinrel(PlannerInfo *root, RelOptInfo *joinrel,
+ RelOptInfo *rel1, RelOptInfo *rel2,
+ List *restrictlist)
+{
+ ListCell *lc;
+ List *clause_list = NIL;
+ bool matched;
+
+ /*
+ * XXX what about base quals being compared to Consts? We're not looking
+ * at those here at all. We'd need to split the joinrel into base rel
+ * components and tag on the base quals to clause_list, as, of course, a
+ * join rel does not contain any base quals.
+ */
+ foreach(lc, restrictlist)
+ {
+ RestrictInfo *restrictinfo = (RestrictInfo *)lfirst(lc);
+
+ /* XXX what do we do about these? We don't know the join type yet */
+ //if (RINFO_IS_PUSHED_DOWN(restrictinfo, joinrel->relids))
+ //{
+ // continue;
+ //}
+
+ /* Ignore if it's not a mergejoinable clause */
+ if (!restrictinfo->can_join ||
+ restrictinfo->mergeopfamilies == NIL)
+ continue; /* not mergejoinable */
+
+ /*
+ * Check if clause has the form "outer op inner" or "inner op outer",
+ * and if so mark which side is inner.
+ */
+ if (!clause_sides_match_join(restrictinfo, rel1->relids, rel2->relids))
+ continue; /* no good for these input relations */
+
+ /* OK, add to list */
+ clause_list = lappend(clause_list, restrictinfo);
+ }
+
+ matched = false;
+ foreach(lc, rel1->uniquekeys)
+ {
+ UniqueKeySet *keys = (UniqueKeySet *) lfirst(lc);
+
+ /* XXX need to think about how to update the not-null bits here */
+ if (clauselist_matches_uniquekeys(clause_list, keys, true))
+ {
+ matched = true;
+ break;
+ }
+ }
+
+ /* If we get a match then propagate the unique keys of rel2 onto the join rel */
+ if (matched)
+ joinrel->uniquekeys = list_concat(joinrel->uniquekeys, rel2->uniquekeys);
+
+ matched = false;
+ foreach(lc, rel2->uniquekeys)
+ {
+ UniqueKeySet *keys = (UniqueKeySet *)lfirst(lc);
+
+ /* XXX need to think about how to update the not-null bits here */
+ if (clauselist_matches_uniquekeys(clause_list, keys, false))
+ {
+ matched = true;
+ break;
+ }
+ }
+
+ /* If we get a match then propagate the unique keys of rel1 onto the join rel */
+ if (matched)
+ joinrel->uniquekeys = list_concat(joinrel->uniquekeys, rel1->uniquekeys);
+}
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index b44efd6314..06846b1b83 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -4757,6 +4757,21 @@ create_distinct_paths(PlannerInfo *root,
distinct_rel->useridiscurrent = input_rel->useridiscurrent;
distinct_rel->fdwroutine = input_rel->fdwroutine;
+ /* XXX just doing this in a really hacky way to see if it works... */
+ if (relation_has_uniquekeys_for(root, input_rel, get_sortgrouplist_exprs(parse->distinctClause, parse->targetList), false))
+ {
+
+ add_path(distinct_rel, (Path *)cheapest_input_path);
+
+ /* XXX yeah yeah, need to call the hooks etc. */
+
+ /* Now choose the best path(s) */
+ set_cheapest(distinct_rel);
+
+ return distinct_rel;
+ }
+
+
/* Estimate number of distinct rows there will be */
if (parse->groupClause || parse->groupingSets || parse->hasAggs ||
root->hasHavingQual)
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 8a76afe8cc..4db76fdb28 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -260,6 +260,8 @@ typedef enum NodeTag
T_EquivalenceClass,
T_EquivalenceMember,
T_PathKey,
+ T_UniqueKeySet,
+ T_UniqueKey,
T_PathTarget,
T_RestrictInfo,
T_IndexClause,
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 0ceb809644..5fc725e2e7 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -706,6 +706,7 @@ typedef struct RelOptInfo
QualCost baserestrictcost; /* cost of evaluating the above */
Index baserestrict_min_security; /* min security_level found in
* baserestrictinfo */
+ List *uniquekeys; /* List of UniqueKeysets */
List *joininfo; /* RestrictInfo structures for join clauses
* involving this rel */
bool has_eclass_joins; /* T means joininfo is incomplete */
@@ -1016,6 +1017,31 @@ typedef struct PathKey
bool pk_nulls_first; /* do NULLs come before normal values? */
} PathKey;
+/* UniqueKeySet
+ *
+ * Represents a set of unique keys
+ */
+typedef struct UniqueKeySet
+{
+ NodeTag type;
+
+ Bitmapset *non_null_keys; /* indexes of 'keys' proved non-null */
+ List *keys; /* list of UniqueKeys */
+} UniqueKeySet;
+
+/*
+ * UniqueKey
+ *
+ * Represents the unique properties held by a RelOptInfo or a Path
+ */
+typedef struct UniqueKey
+{
+ NodeTag type;
+
+ Oid uk_collation; /* collation, if datatypes are collatable */
+ Oid uk_opfamily; /* btree opfamily defining the ordering */
+ Expr *uk_expr; /* unique key expression */
+} UniqueKey;
/*
* PathTarget
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index 9ab73bd20c..16c1faa41e 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -240,4 +240,18 @@ extern PathKey *make_canonical_pathkey(PlannerInfo *root,
extern void add_paths_to_append_rel(PlannerInfo *root, RelOptInfo *rel,
List *live_childrels);
+/*
+ * uniquekeys.c
+ * Utilities for matching and building unique keys
+ */
+extern void populate_baserel_uniquekeys(PlannerInfo *root,
+ RelOptInfo *baserel);
+extern bool relation_has_uniquekeys_for(PlannerInfo *root, RelOptInfo *rel,
+ List *exprs, bool req_nonnull);
+extern void propagate_unique_keys_to_joinrel(PlannerInfo *root,
+ RelOptInfo *joinrel,
+ RelOptInfo *rel1,
+ RelOptInfo *rel2,
+ List *restrictlist);
+
#endif /* PATHS_H */
Hi David:
On Thu, Mar 12, 2020 at 3:51 PM David Rowley <dgrowleyml@gmail.com> wrote:
On Wed, 11 Mar 2020 at 17:23, Andy Fan <zhihui.fan1213@gmail.com> wrote:
Now I am convinced that we should maintain UniquePath on RelOptInfo;
I will look into how it can work with the "Index Skip Scan" patch.

I've attached a very early proof of concept patch for unique keys.
Thanks for the code! Here are some points from me.
1. For populate_baserel_uniquekeys, we need to handle the "pk = Const" case
as well (relation_has_unique_index_for has similar logic); currently the
following distinct path is still there.
postgres=# explain select distinct b from t100 where pk = 1;
QUERY PLAN
----------------------------------------------------------------------------------
Unique (cost=8.18..8.19 rows=1 width=4)
-> Sort (cost=8.18..8.19 rows=1 width=4)
Sort Key: b
-> Index Scan using t100_pkey on t100 (cost=0.15..8.17 rows=1 width=4)
Index Cond: (pk = 1)
(5 rows)
I think in this case we can add both (pk) and (b) as unique keys. If so,
we get more opportunities to reach our goal.
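A minimal sketch of the idea, assuming populate_baserel_uniquekeys() has
already computed a return_one_row flag from the baserestrictinfo:

/* A provably single-row baserel makes every expression in its
 * target list trivially unique on its own. */
if (return_one_row)
{
    ListCell   *lc;

    foreach(lc, baserel->reltarget->exprs)
        baserel->uniquekeys = lappend(baserel->uniquekeys,
                                      list_make1(lfirst(lc)));
}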
2. As for propagate_unique_keys_to_joinrel, we can add one more unique key,
(rel1_unique_key, rel2_unique_key), if the current rules don't apply;
otherwise the following case can't be handled.
postgres=# explain select distinct t100.pk, t101.pk from t100, t101;
QUERY PLAN
--------------------------------------------------------------------------------
Unique (cost=772674.11..810981.11 rows=5107600 width=8)
-> Sort (cost=772674.11..785443.11 rows=5107600 width=8)
Sort Key: t100.pk, t101.pk
-> Nested Loop (cost=0.00..63915.85 rows=5107600 width=8)
-> Seq Scan on t100 (cost=0.00..32.60 rows=2260 width=4)
-> Materialize (cost=0.00..43.90 rows=2260 width=4)
-> Seq Scan on t101 (cost=0.00..32.60 rows=2260 width=4)
(7 rows)
But if we add such a rule, the unique key list probably becomes much longer,
so we need a strategy to tell whether a unique key is useful for our query;
if not, we can ignore it. rel->reltarget may be good information for such an
optimization. I think we can take this into consideration for
populate_baserel_uniquekeys as well.
For the non-null info, Tom suggested maintaining such info in RelOptInfo.
I have done that for the not-null info from the relation catalog; I think
we can maintain the same flag for joinrels, plus the not-null info from
find_nonnullable_vars, as well, but I haven't found a good place to add
that so far.
A small question about the following code:
+ if (relation_has_uniquekeys_for(root, input_rel,
get_sortgrouplist_exprs(parse->distinctClause, parse->targetList), false))
+ {
+
+ add_path(distinct_rel, (Path *)cheapest_input_path);
+
+ /* XXX yeah yeah, need to call the hooks etc. */
+
+ /* Now choose the best path(s) */
+ set_cheapest(distinct_rel);
+
+ return distinct_rel;
+ }
Since we don't create a new RelOptInfo/Path, do we need to call add_path and
set_cheapest?
Best Regards
Andy Fan
On Fri, 13 Mar 2020 at 14:47, Andy Fan <zhihui.fan1213@gmail.com> wrote:
1. For populate_baserel_uniquekeys, we need to handle the "pk = Const" case as well
(relation_has_unique_index_for has similar logic); currently the following
distinct path is still there.
Yeah, I left a comment in propagate_unique_keys_to_joinrel() to
mention that still needs to be done.
postgres=# explain select distinct b from t100 where pk = 1;
QUERY PLAN
----------------------------------------------------------------------------------
Unique (cost=8.18..8.19 rows=1 width=4)
-> Sort (cost=8.18..8.19 rows=1 width=4)
Sort Key: b
-> Index Scan using t100_pkey on t100 (cost=0.15..8.17 rows=1 width=4)
Index Cond: (pk = 1)
(5 rows)

I think in this case we can add both (pk) and (b) as unique keys. If so,
we get more opportunities to reach our goal.
The UniqueKeySet containing "b" could only be added in the
distinct_rel in the upper planner. It must not change the input_rel
for the distinct.
It's likely best to steer clear of calling UniqueKeys UniquePaths as
it might confuse people. The term "path" is used in PostgreSQL as a
lightweight representation containing all the information required to
build a plan node in createplan.c. More details in
src/backend/optimizer/README.
2. As for propagate_unique_keys_to_joinrel, we can add one more unique key,
(rel1_unique_key, rel2_unique_key), if the current rules don't apply;
otherwise the following case can't be handled.

postgres=# explain select distinct t100.pk, t101.pk from t100, t101;
QUERY PLAN
--------------------------------------------------------------------------------
Unique (cost=772674.11..810981.11 rows=5107600 width=8)
-> Sort (cost=772674.11..785443.11 rows=5107600 width=8)
Sort Key: t100.pk, t101.pk
-> Nested Loop (cost=0.00..63915.85 rows=5107600 width=8)
-> Seq Scan on t100 (cost=0.00..32.60 rows=2260 width=4)
-> Materialize (cost=0.00..43.90 rows=2260 width=4)
-> Seq Scan on t101 (cost=0.00..32.60 rows=2260 width=4)
(7 rows)
I don't really follow what you mean here. It seems to me there's no
way we can skip doing DISTINCT in the case above. If you've just
missed out the join clause and you meant to have "WHERE t100.pk =
t101.pk", then we can likely fix that later with some sort of
functional dependency tracking. Likely we can just add a Relids field
to UniqueKeySet to track which relids are functionally dependent on a
row from the UniqueKeySet's uk_exprs. That might be as simple as running
pull_varnos on the non-matched exprs and checking to ensure the
result is a subset of the functionally dependent rels. I'd need to give
that more thought.
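A sketch of what that could look like (fd_relids is the proposed new field;
none of the code below exists in the posted patches):

typedef struct UniqueKeySet
{
    NodeTag     type;

    Bitmapset  *non_null_keys;  /* indexes of 'keys' proved non-null */
    List       *keys;           /* list of UniqueKeys */
    Relids      fd_relids;      /* rels functionally dependent on 'keys' */
} UniqueKeySet;

/*
 * Hypothetical check: exprs not covered by the key set are still OK if
 * every rel they reference is functionally dependent on the keys.
 */
static bool
exprs_covered_by_fd(UniqueKeySet *keyset, List *unmatched_exprs)
{
    return bms_is_subset(pull_varnos((Node *) unmatched_exprs),
                         keyset->fd_relids);
}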
Was this a case you had working in your patch?
But if we add such a rule, the unique key list probably becomes much longer,
so we need a strategy to tell whether a unique key is useful for our query;
if not, we can ignore it. rel->reltarget may be good information for such an
optimization. I think we can take this into consideration for
populate_baserel_uniquekeys as well.
I don't really think the number of unique indexes in a base rel will
really ever get out of hand for legitimate cases.
propagate_unique_keys_to_joinrel is just concatenating baserel
UniqueKeySets to the joinrel. They're not copied, so it's just tagging
pointers onto the end of an array, which is at best a memcpy(), or at
worst a realloc() then memcpy(). That's not so costly.
For the non-null info, Tom suggested maintaining such info in RelOptInfo.
I have done that for the not-null info from the relation catalog; I think
we can maintain the same flag for joinrels, plus the not-null info from
find_nonnullable_vars, as well, but I haven't found a good place to add
that so far.
I'd considered just adding a get_notnull() function to lsyscache.c.
Just below get_attname() looks like a good spot. I imagined just
setting the bit in the UniqueKeySet's non_null_keys field
corresponding to the column position from the index. I could see the
benefit of having a field in RelOptInfo if there was some way to
determine the not-null properties of all columns in the table at once,
but there's not, so we're likely best just looking at the ones that
there are unique indexes on.
A small question about the following code:
+ if (relation_has_uniquekeys_for(root, input_rel, get_sortgrouplist_exprs(parse->distinctClause, parse->targetList), false))
+ {
+
+ add_path(distinct_rel, (Path *)cheapest_input_path);
+
+ /* XXX yeah yeah, need to call the hooks etc. */
+
+ /* Now choose the best path(s) */
+ set_cheapest(distinct_rel);
+
+ return distinct_rel;
+ }

Since we don't create a new RelOptInfo/Path, do we need to call add_path and set_cheapest?
The distinct_rel already exists. add_path() is the standard way we
have of adding paths to the rel's pathlist. Why would you want to
bypass that? set_cheapest() is our standard way of looking at the
pathlist and figuring out the least costly one. It's not a very hard
job to do when there's just 1 path. Not sure why you'd want to do it
another way.
On Fri, Mar 13, 2020 at 11:46 AM David Rowley <dgrowleyml@gmail.com> wrote:
On Fri, 13 Mar 2020 at 14:47, Andy Fan <zhihui.fan1213@gmail.com> wrote:
1. For populate_baserel_uniquekeys, we need to handle the "pk = Const" case
as well (relation_has_unique_index_for has similar logic); currently the
following distinct path is still there.
Yeah, I left a comment in propagate_unique_keys_to_joinrel() to
mention that still needs to be done.
postgres=# explain select distinct b from t100 where pk = 1;
QUERY PLAN
----------------------------------------------------------------------------------
Unique (cost=8.18..8.19 rows=1 width=4)
-> Sort (cost=8.18..8.19 rows=1 width=4)
Sort Key: b
-> Index Scan using t100_pkey on t100 (cost=0.15..8.17 rows=1 width=4)
Index Cond: (pk = 1)
(5 rows)

I think in this case we can add both (pk) and (b) as unique keys. If so,
we get more opportunities to reach our goal.
The UniqueKeySet containing "b" could only be added in the
distinct_rel in the upper planner. It must not change the input_rel
for the distinct.

I think we maintain UniqueKeys even without a distinct_rel, so at this
stage, can we say b is unique (the answer may well be no)? If yes, we
probably need to set that information without considering the distinct
clause.
It's likely best to steer clear of calling UniqueKeys UniquePaths as
it might confuse people. The term "path" is used in PostgreSQL as a
lightweight representation containing all the information required to
build a plan node in createplan.c. More details in
src/backend/optimizer/README.
OK.
2. As for propagate_unique_keys_to_joinrel, we can add one more unique key,
(rel1_unique_key, rel2_unique_key), if the current rules don't apply;
otherwise the following case can't be handled.

postgres=# explain select distinct t100.pk, t101.pk from t100, t101;
QUERY PLAN
--------------------------------------------------------------------------------
Unique (cost=772674.11..810981.11 rows=5107600 width=8)
-> Sort (cost=772674.11..785443.11 rows=5107600 width=8)
Sort Key: t100.pk, t101.pk
-> Nested Loop (cost=0.00..63915.85 rows=5107600 width=8)
-> Seq Scan on t100 (cost=0.00..32.60 rows=2260 width=4)
-> Materialize (cost=0.00..43.90 rows=2260 width=4)
-> Seq Scan on t101 (cost=0.00..32.60 rows=2260 width=4)
(7 rows)
I don't really follow what you mean here. It seems to me there's no
way we can skip doing DISTINCT in the case above. If you've just
missed out the join clause and you meant to have "WHERE t100.pk =
t101.pk", then we can likely fix that later with some sort of
functional dependency tracking.
In the above case the result should be unique. The knowledge behind that is:
if we join two unique result sets using any join method, the result is unique
as well, since each output row pairs one distinct row from each side. In the
above example, the final unique key is (t100.pk, t101.pk).
Likely we can just add a Relids field
to UniqueKeySet to track which relids are functionally dependent on a
row from the UniqueKeySet's uk_exprs. That might be as simple as running
pull_varnos on the non-matched exprs and checking to ensure the
result is a subset of the functionally dependent rels. I'd need to give
that more thought.

Was this a case you had working in your patch?
I think we can do that after I absorb your UniqueKey idea; so no, my
previous patch is not as smart as yours. :)
But if we add such a rule, the unique key list probably becomes much longer,
so we need a strategy to tell whether a unique key is useful for our query;
if not, we can ignore it. rel->reltarget may be good information for such an
optimization. I think we can take this into consideration for
populate_baserel_uniquekeys as well.
I don't really think the number of unique indexes in a base rel will
really ever get out of hand for legitimate cases.
propagate_unique_keys_to_joinrel is just concatenating baserel
UniqueKeySets to the joinrel. They're not copied, so it's just tagging
pointers onto the end of an array, which is at best a memcpy(), or at
worst a realloc() then memcpy(). That's not so costly.
The memcpy is not the key concern here. My main point is that we need
to focus on the length of RelOptInfo->uniquekeys. For example: t has 3
unique keys, (uk1), (uk2), (uk3), and the query is
"select b from t where m = 1;". In that case there is no need to add these
3 to uniquekeys, so we can keep rel->uniquekeys shorter.

The length of rel->uniquekeys may be a concern if we add the rule I
suggested above, the (t100.pk, t101.pk) case. Think about this example:
1. select .. from t1, t2, t3, t4...;
2. suppose each table has 2 unique keys, named (t{m}_uk{n});
3. following my rule above, (t1_uk1, t2_uk1) is a unique key for the joinrel;
4. suppose we join in the order (t1 vs t2 vs t3 vs t4).
For (t1 vs t2), we need to add 4 more unique keys to this joinrel:
(t1_uk1, t2_uk1), (t1_uk1, t2_uk2), (t1_uk2, t2_uk1), (t1_uk2, t2_uk2).
By the time we come to join the last table, joinrel->uniquekeys will hold
up to 2^4 = 16 such composite keys, which makes scanning it less efficient.
But this will not be an issue if my rule above should not be considered,
so we need to talk about that first.
For the non-null info, Tom suggested maintaining such info in RelOptInfo.
I have done that for the not-null info from the relation catalog; I think
we can maintain the same flag for joinrels, plus the not-null info from
find_nonnullable_vars, as well, but I haven't found a good place to add
that so far.
I'd considered just adding a get_notnull() function to lsyscache.c.
Just below get_attname() looks like a good spot. I imagined just
setting the bit in the UniqueKeySet's non_null_keys field
corresponding to the column position from the index. I could see the
benefit of having a field in RelOptInfo if there was some way to
determine the not-null properties of all columns in the table at once,
but there's not, so we're likely best just looking at the ones that
there are unique indexes on.

Do you mean getting the non-null properties from the catalog or from
restrictinfos? If you mean the catalog, get_relation_info may be a good
place for that.
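For reference, a get_notnull() along the lines David describes would
presumably follow the usual lsyscache.c pattern (a sketch modelled on
get_attname(); not in any posted patch):

/*
 * get_notnull
 *      Given the relation id and the attribute number,
 *      return whether the column is marked NOT NULL.
 */
bool
get_notnull(Oid relid, AttrNumber attnum)
{
    HeapTuple   tp;

    tp = SearchSysCache2(ATTNUM,
                         ObjectIdGetDatum(relid),
                         Int16GetDatum(attnum));
    if (HeapTupleIsValid(tp))
    {
        Form_pg_attribute att_tup = (Form_pg_attribute) GETSTRUCT(tp);
        bool        result = att_tup->attnotnull;

        ReleaseSysCache(tp);
        return result;
    }
    return false;               /* not found, assume nullable */
}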
A small question about the following code:

+ if (relation_has_uniquekeys_for(root, input_rel,
get_sortgrouplist_exprs(parse->distinctClause, parse->targetList), false))
+ {
+
+ add_path(distinct_rel, (Path *)cheapest_input_path);
+
+ /* XXX yeah yeah, need to call the hooks etc. */
+
+ /* Now choose the best path(s) */
+ set_cheapest(distinct_rel);
+
+ return distinct_rel;
+ }

Since we don't create a new RelOptInfo/Path, do we need to call add_path
and set_cheapest?
The distinct_rel already exists. add_path() is the standard way we
have of adding paths to the rel's pathlist. Why would you want to
bypass that? set_cheapest() is our standard way of looking at the
pathlist and figuring out the least costly one. It's not a very hard
job to do when there's just 1 path. Not sure why you'd want to do it
another way.
I get the point now. In this case you create a new RelOptInfo
named distinct_rel, so we *must* set it. Can we just return the input_rel
in this case? If we can, we don't need that.
Hi All:
I have re-implemented the patch based on David's suggestion/code, and it
looks like it works well. The updated patch mainly includes:
1. Maintain not_null_cols in RelOptInfo, which includes the not-null
attributes from the catalog and the not-null Vars derived from quals.
2. Add the restrictinfo check in populate_baserel_uniquekeys. If we are
sure that only 1 row is returned, add each expr in rel->reltarget->exprs
as a unique key; for (select a, b, c from t where pk = 1), the UKs will
be ( (a), (b), (c) ).
3. Postpone the propagate_unique_keys_to_joinrel call to
populate_joinrel_with_paths, since we know the jointype at that time,
so we can handle semi/anti joins specially.
4. Add the rule I suggested above: if both of the 2 relations yield a
unique result, the join result will be unique as well; the UK can be
( (rel1_uk1, rel2_uk1) .. ).
5. If a unique key cannot possibly be referenced by upper levels, we can
safely ignore it, in order to keep (join)rel->uniquekeys short.
6. I only do the not-null check/opfamily check for the unique keys which
come from unique indexes. I think that should be correct.
7. I defined each unique key as a List of Exprs, so I didn't introduce a
new node type.
8. Check the uniquekeys information before create_distinct_paths and
create_grouping_paths: skip creating the new paths if the sort/group
clauses are unique already, or else create them and add the new unique
key to the distinct_rel/grouped_rel.
There are some things still in progress, like:
1. Partition tables.
2. UNION / UNION ALL.
3. Maybe refactor innerrel_is_unique/query_is_distinct_for to use
UniqueKeys.
4. If we are sure the group-by clause is unique and we have aggregation
calls, maybe we should try Bapat's suggestion: we can use sort rather
than hash. The strategy sounds great, but I haven't checked the details
so far.
5. A clearer commit message.
6. Anything more?
Any feedback is welcome. Thanks for your ideas, suggestions, and demo code!
Best Regards
Andy Fan
Attachments:
v4-0001-Patch-Bypass-distinctClause-groupbyClause-if-the-.patchapplication/octet-stream; name=v4-0001-Patch-Bypass-distinctClause-groupbyClause-if-the-.patchDownload
From 60c6662b6782d5d4ad4bba0c57fd5b9fecee7364 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=E4=B8=80=E6=8C=83?= <yizhi.fzh@alibaba-inc.com>
Date: Mon, 16 Mar 2020 00:48:13 +0800
Subject: [PATCH v4] [Patch] Bypass distinctClause & groupbyClause if the exprs
is unique already
---
src/backend/nodes/list.c | 27 +
src/backend/optimizer/path/Makefile | 3 +-
src/backend/optimizer/path/allpaths.c | 7 +-
src/backend/optimizer/path/joinrels.c | 2 +
src/backend/optimizer/path/uniquekeys.c | 473 ++++++++++++++++++
src/backend/optimizer/plan/initsplan.c | 9 +-
src/backend/optimizer/plan/planner.c | 43 ++
src/backend/optimizer/util/plancat.c | 8 +
src/include/nodes/pathnodes.h | 27 +
src/include/nodes/pg_list.h | 2 +
src/include/optimizer/paths.h | 17 +
src/test/regress/expected/aggregates.out | 38 +-
src/test/regress/expected/join.out | 26 +-
src/test/regress/expected/select_distinct.out | 276 ++++++++++
src/test/regress/sql/select_distinct.sql | 84 ++++
15 files changed, 1003 insertions(+), 39 deletions(-)
create mode 100644 src/backend/optimizer/path/uniquekeys.c
diff --git a/src/backend/nodes/list.c b/src/backend/nodes/list.c
index bd0c58cd81..a54c8e66bb 100644
--- a/src/backend/nodes/list.c
+++ b/src/backend/nodes/list.c
@@ -688,6 +688,33 @@ list_member_oid(const List *list, Oid datum)
return false;
}
+/*
+ * Return true iff there is an equal member in target for every
+ * member in members
+ */
+bool
+list_all_members_in(const List *members, const List *target)
+{
+ const ListCell *lc1, *lc2;
+ if (target == NIL && members != NIL)
+ return false;
+ foreach(lc1, members)
+ {
+ bool found = false;
+ foreach(lc2, target)
+ {
+ if (equal(lfirst(lc1), lfirst(lc2)))
+ {
+ found = true;
+ break;
+ }
+ }
+ if (!found)
+ return false;
+ }
+ return true;
+}
+
/*
* Delete the n'th cell (counting from 0) in list.
*
diff --git a/src/backend/optimizer/path/Makefile b/src/backend/optimizer/path/Makefile
index 1e199ff66f..7b9820c25f 100644
--- a/src/backend/optimizer/path/Makefile
+++ b/src/backend/optimizer/path/Makefile
@@ -21,6 +21,7 @@ OBJS = \
joinpath.o \
joinrels.o \
pathkeys.o \
- tidpath.o
+ tidpath.o \
+ uniquekeys.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 8286d9cf34..2c65d3715b 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -158,6 +158,7 @@ make_one_rel(PlannerInfo *root, List *joinlist)
/*
* Construct the all_baserels Relids set.
*/
+
root->all_baserels = NULL;
for (rti = 1; rti < root->simple_rel_array_size; rti++)
{
@@ -578,7 +579,9 @@ set_plain_rel_size(PlannerInfo *root, RelOptInfo *rel, RangeTblEntry *rte)
* first since partial unique indexes can affect size estimates.
*/
check_index_predicates(root, rel);
-
+
+ populate_baserel_uniquekeys(root, rel);
+
/* Mark rel with estimated output rows, width, etc */
set_baserel_size_estimates(root, rel);
}
@@ -2321,6 +2324,8 @@ set_subquery_pathlist(PlannerInfo *root, RelOptInfo *rel,
return;
}
+ rel->uniquekeys = sub_final_rel->uniquekeys;
+
/*
* Mark rel with estimated output rows, width, etc. Note that we have to
* do this before generating outer-query paths, else cost_subqueryscan is
diff --git a/src/backend/optimizer/path/joinrels.c b/src/backend/optimizer/path/joinrels.c
index a21c295b99..f1243f31e7 100644
--- a/src/backend/optimizer/path/joinrels.c
+++ b/src/backend/optimizer/path/joinrels.c
@@ -920,6 +920,8 @@ populate_joinrel_with_paths(PlannerInfo *root, RelOptInfo *rel1,
/* Apply partitionwise join technique, if possible. */
try_partitionwise_join(root, rel1, rel2, joinrel, sjinfo, restrictlist);
+
+ propagate_unique_keys_to_joinrel(root, joinrel, rel1, rel2, restrictlist, sjinfo->jointype);
}
diff --git a/src/backend/optimizer/path/uniquekeys.c b/src/backend/optimizer/path/uniquekeys.c
new file mode 100644
index 0000000000..1be2c4f0db
--- /dev/null
+++ b/src/backend/optimizer/path/uniquekeys.c
@@ -0,0 +1,473 @@
+/*-------------------------------------------------------------------------
+ *
+ * uniquekeys.c
+ * Utilities for matching and building unique keys
+ *
+ * Portions Copyright (c) 2020, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/backend/optimizer/path/uniquekeys.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "nodes/makefuncs.h"
+#include "nodes/nodeFuncs.h"
+#include "optimizer/pathnode.h"
+#include "optimizer/paths.h"
+
+/*
+ * Examine the rel's restriction clauses for usable var = const clauses
+ */
+static List*
+get_mergeable_const_restrictlist(RelOptInfo *rel)
+{
+ List *restrictlist = NIL;
+ ListCell *lc;
+ foreach(lc, rel->baserestrictinfo)
+ {
+ RestrictInfo *restrictinfo = (RestrictInfo *) lfirst(lc);
+
+ /*
+ * Note: can_join won't be set for a restriction clause, but
+ * mergeopfamilies will be if it has a mergejoinable operator and
+ * doesn't contain volatile functions.
+ */
+ if (restrictinfo->mergeopfamilies == NIL)
+ continue; /* not mergejoinable */
+
+ /* XXX can't we check if it is a Const? */
+
+ /*
+ * The clause certainly doesn't refer to anything but the given rel.
+ * If either side is pseudoconstant then we can use it.
+ */
+ if (bms_is_empty(restrictinfo->left_relids))
+ {
+ /* righthand side is inner */
+ restrictinfo->outer_is_left = true;
+ }
+ else if (bms_is_empty(restrictinfo->right_relids))
+ {
+ /* lefthand side is inner */
+ restrictinfo->outer_is_left = false;
+ }
+ else
+ continue;
+
+ /* OK, add to list */
+ restrictlist = lappend(restrictlist, restrictinfo);
+ }
+
+ return restrictlist;
+
+}
+
+
+/*
+ * Return true if uk = Const in the restrictlist
+ */
+static bool
+match_index_to_restrictinfo(IndexOptInfo *unique_ind, List *restrictlist)
+{
+ int c = 0;
+
+ if (restrictlist == NIL)
+ return false;
+
+ for(c = 0; c < unique_ind->nkeycolumns; c++)
+ {
+ ListCell *lc;
+ bool found = false;
+
+ foreach(lc, restrictlist)
+ {
+ RestrictInfo *rinfo = (RestrictInfo *) lfirst(lc);
+ Node *rexpr;
+
+ /*
+ * The condition's equality operator must be a member of the
+ * index opfamily, else it is not asserting the right kind of
+ * equality behavior for this index. We check this first
+ * since it's probably cheaper than match_index_to_operand().
+ */
+ if (!list_member_oid(rinfo->mergeopfamilies, unique_ind->opfamily[c]))
+ continue;
+
+ /*
+ * XXX at some point we may need to check collations here too.
+ * For the moment we assume all collations reduce to the same
+ * notion of equality.
+ */
+
+ /* OK, see if the non-Const operand matches this index key */
+ if (rinfo->outer_is_left)
+ rexpr = get_rightop(rinfo->clause);
+ else
+ rexpr = get_leftop(rinfo->clause);
+
+ if (match_index_to_operand(rexpr, c, unique_ind))
+ {
+ found = true;
+ break;
+ }
+ }
+
+ /* No "col = Const" clause found for this key column */
+ if (!found)
+ return false;
+ }
+ return true;
+}
+
+/*
+ * add_uniquekey_from_index
+ * We only add the Index Vars whose expr exists in rel->reltarget
+ */
+static void
+add_uniquekey_from_index(RelOptInfo *rel, IndexOptInfo *unique_index)
+{
+ int c = 0;
+ List *exprs = NIL;
+
+ /* We only add the index which exists in rel->reltarget */
+ for(c = 0; c < unique_index->nkeycolumns; c++)
+ {
+ ListCell *lc;
+ bool find_in_exprs = false;
+ foreach(lc, rel->reltarget->exprs)
+ {
+ Var *var;
+ /* We never know whether a FuncExpr is nullable, so we only handle Vars for now */
+ if (!IsA(lfirst(lc), Var))
+ continue;
+ var = lfirst_node(Var, lc);
+ if (var->varattno < InvalidAttrNumber)
+ /* System column */
+ continue;
+ /* Must check not-null for a unique index */
+ if (!bms_is_member(var->varattno, rel->not_null_cols))
+ continue;
+
+ /* To keep the uniquekey short, we only add it if it exists in rel->reltarget->exprs */
+ if (match_index_to_operand((Node *)lfirst(lc), c, unique_index))
+ {
+ find_in_exprs = true;
+ exprs = lappend(exprs, lfirst(lc));
+ break;
+ }
+ }
+ if (!find_in_exprs)
+ return;
+ }
+ rel->uniquekeys = lappend(rel->uniquekeys, exprs);
+}
+
+/*
+ * populate_baserel_uniquekeys
+ * Populate 'baserel' uniquekeys list by looking at the rel's unique indexes
+ * and baserestrictinfo
+ */
+void
+populate_baserel_uniquekeys(PlannerInfo *root, RelOptInfo *baserel)
+{
+ ListCell *lc;
+ List *restrictlist = get_mergeable_const_restrictlist(baserel);
+ bool return_one_row = false;
+ List *matched_uk_indexes = NIL;
+
+ Assert(baserel->rtekind == RTE_RELATION);
+
+ foreach(lc, baserel->indexlist)
+ {
+ IndexOptInfo *ind = (IndexOptInfo *) lfirst(lc);
+ /*
+ * If the index is not unique, or not immediately enforced, or if it's
+ * a partial index that doesn't match the query, it's useless here.
+ */
+ if (!ind->unique || !ind->immediate ||
+ (ind->indpred != NIL && !ind->predOK))
+ continue;
+
+ if (match_index_to_restrictinfo(ind, restrictlist))
+ {
+ return_one_row = true;
+ break;
+ }
+ matched_uk_indexes = lappend(matched_uk_indexes, ind);
+ }
+
+ if (return_one_row)
+ {
+ foreach(lc, baserel->reltarget->exprs)
+ {
+ /* Every column in this relation is unique since only 1 row is returned.
+ * No need to check whether it is a Var or not, and the nullable check
+ * isn't needed either.
+ */
+ baserel->uniquekeys = lappend(baserel->uniquekeys,
+ list_make1(lfirst(lc)));
+ }
+ }
+ else
+ {
+ foreach(lc, matched_uk_indexes)
+ add_uniquekey_from_index(baserel, lfirst_node(IndexOptInfo, lc));
+ }
+}
+
+
+/*
+ * relation_has_uniquekeys_for
+ * Returns true if we have proofs that 'rel' cannot return multiple rows with
+ * the same values in each of 'exprs'. Otherwise returns false.
+ */
+bool
+relation_has_uniquekeys_for(PlannerInfo *root, RelOptInfo *rel, List *exprs)
+{
+ ListCell *lc;
+
+ foreach(lc, rel->uniquekeys)
+ {
+ List *unique_exprs = lfirst_node(List, lc);
+ if (unique_exprs == NIL)
+ continue;
+ if (list_all_members_in(unique_exprs, exprs))
+ return true;
+ }
+ return false;
+}
+
+/*
+ * clause_sides_match_join
+ * Determine whether a join clause is of the right form to use in this join.
+ *
+ * We already know that the clause is a binary opclause referencing only the
+ * rels in the current join. The point here is to check whether it has the
+ * form "outerrel_expr op innerrel_expr" or "innerrel_expr op outerrel_expr",
+ * rather than mixing outer and inner vars on either side. If it matches,
+ * we set the transient flag outer_is_left to identify which side is which.
+ */
+static inline bool
+clause_sides_match_join(RestrictInfo *rinfo, Relids outerrelids,
+ Relids innerrelids)
+{
+ if (bms_is_subset(rinfo->left_relids, outerrelids) &&
+ bms_is_subset(rinfo->right_relids, innerrelids))
+ {
+ /* lefthand side is outer */
+ rinfo->outer_is_left = true;
+ return true;
+ }
+ else if (bms_is_subset(rinfo->left_relids, innerrelids) &&
+ bms_is_subset(rinfo->right_relids, outerrelids))
+ {
+ /* righthand side is outer */
+ rinfo->outer_is_left = false;
+ return true;
+ }
+ return false; /* no good for these input relations */
+}
+
+/*
+ * clauselist_matches_uniquekeys
+ * Detect the pattern rel1.uk_expr = rel2.normal_expr in clause_list;
+ * if found, the UniqueKey of rel2 remains a unique key in the joinrel.
+ */
+static bool
+clauselist_matches_uniquekeys(List *clause_list, List *uniquekey, bool outer_side)
+{
+ ListCell *lc;
+
+ if (uniquekey == NIL)
+ return false;
+
+ foreach(lc, uniquekey)
+ {
+ Node *expr = (Node *)lfirst(lc);
+ ListCell *lc2;
+ bool matched_expr = false;
+
+ foreach(lc2, clause_list)
+ {
+ RestrictInfo *rinfo = (RestrictInfo *)lfirst(lc2);
+ Node *rexpr;
+
+ /*
+ * The condition's equality operator must be a member of the
+ * index opfamily, else it is not asserting the right kind of
+ * equality behavior for this index. We check this first
+ * since it's probably cheaper than match_index_to_operand().
+ */
+ /* XXX do we need this? uk_opfamily is a concept tied to indexes;
+ * for a UniqueKey it looks like we don't need it.
+ */
+ /* if (!list_member_oid(rinfo->mergeopfamilies, key->uk_opfamily)) */
+ /* continue; */
+
+ /*
+ * XXX at some point we may need to check collations here too.
+ * For the moment we assume all collations reduce to the same
+ * notion of equality.
+ */
+
+ /* OK, see if the condition operand matches the index key */
+ if (rinfo->outer_is_left != outer_side)
+ rexpr = get_rightop(rinfo->clause);
+ else
+ rexpr = get_leftop(rinfo->clause);
+
+ if (IsA(rexpr, RelabelType))
+ rexpr = (Node *)((RelabelType *)rexpr)->arg;
+
+ if (equal(rexpr, expr))
+ {
+ matched_expr = true;
+ break;
+ }
+ }
+
+ if (!matched_expr)
+ return false;
+ }
+
+ return true;
+}
+
+/*
+ * Used to record whether a uniquekey has been added to the joinrel; if so, we
+ * don't need to add any superset of this uniquekey to the joinrel.
+ */
+typedef struct UniqueKeyContextData
+{
+ List *uniquekey;
+ /* Set to true if the unique key has been added to joinrel->uniquekeys */
+ bool added_to_joinrel;
+} *UniqueKeyContext;
+
+
+/*
+ * initialize_uniquecontext_for_joinrel
+ * Return a List of UniqueKeyContexts for an input rel; we also filter out
+ * all the uniquekeys which cannot possibly be used later.
+ */
+static List *
+initialize_uniquecontext_for_joinrel(RelOptInfo *joinrel, RelOptInfo *inputrel)
+{
+ List *res = NIL;
+ ListCell *lc;
+ foreach(lc, inputrel->uniquekeys)
+ {
+ UniqueKeyContext context;
+ /* If it isn't in joinrel->reltarget->exprs, it will not be referenced by upper levels */
+ if (!list_all_members_in(lfirst_node(List, lc), joinrel->reltarget->exprs))
+ continue;
+ context = palloc(sizeof(struct UniqueKeyContextData));
+ context->uniquekey = lfirst_node(List, lc);
+ context->added_to_joinrel = false;
+ res = lappend(res, context);
+ }
+ return res;
+
+}
+
+/*
+ * propagate_unique_keys_to_joinrel
+ * Using 'restrictlist' determine if rel2 can duplicate rows in rel1 and
+ * vice-versa. If the relation at the other side of the join cannot
+ * cause row duplication, then tag the uniquekeys for the relation onto
+ * 'joinrel's uniquekey list.
+ */
+void
+propagate_unique_keys_to_joinrel(PlannerInfo *root, RelOptInfo *joinrel,
+ RelOptInfo *rel1, RelOptInfo *rel2,
+ List *restrictlist, JoinType jointype)
+{
+ ListCell *lc, *lc2;
+ List *clause_list = NIL;
+ List *rel1_uniquekey_context;
+ List *rel2_uniquekey_context;
+
+ /* Care about the left relation only for SEMI/ANTI join */
+ if (jointype == JOIN_SEMI || jointype == JOIN_ANTI)
+ {
+ foreach(lc, rel1->uniquekeys)
+ {
+ List *uniquekey = lfirst_node(List, lc);
+ if (list_all_members_in(uniquekey, joinrel->reltarget->exprs))
+ joinrel->uniquekeys = lappend(joinrel->uniquekeys, uniquekey);
+ }
+ return;
+ }
+
+ rel1_uniquekey_context = initialize_uniquecontext_for_joinrel(joinrel, rel1);
+ rel2_uniquekey_context = initialize_uniquecontext_for_joinrel(joinrel, rel2);
+
+ if (rel1_uniquekey_context == NIL || rel2_uniquekey_context == NIL)
+ return;
+
+ foreach(lc, restrictlist)
+ {
+ RestrictInfo *restrictinfo = (RestrictInfo *)lfirst(lc);
+
+
+ if (IS_OUTER_JOIN(jointype) &&
+ RINFO_IS_PUSHED_DOWN(restrictinfo, joinrel->relids))
+ continue;
+
+ /* Ignore if it's not a mergejoinable clause */
+ if (!restrictinfo->can_join ||
+ restrictinfo->mergeopfamilies == NIL)
+ continue; /* not mergejoinable */
+
+ /*
+ * Check if clause has the form "outer op inner" or "inner op outer",
+ * and if so mark which side is inner.
+ */
+ if (!clause_sides_match_join(restrictinfo, rel1->relids, rel2->relids))
+ continue; /* no good for these input relations */
+
+ /* OK, add to list */
+ clause_list = lappend(clause_list, restrictinfo);
+ }
+
+ foreach(lc, rel1_uniquekey_context)
+ {
+ List *uniquekey = ((UniqueKeyContext)lfirst(lc))->uniquekey;
+ if (clauselist_matches_uniquekeys(clause_list, uniquekey, true))
+ {
+ foreach(lc2, rel2_uniquekey_context)
+ {
+ UniqueKeyContext ctx = (UniqueKeyContext)lfirst(lc2);
+ joinrel->uniquekeys = lappend(joinrel->uniquekeys, ctx->uniquekey);
+ ctx->added_to_joinrel = true;
+ }
+ break;
+ }
+ }
+
+ foreach(lc, rel2_uniquekey_context)
+ {
+ List *uniquekey = ((UniqueKeyContext)lfirst(lc))->uniquekey;
+ if (clauselist_matches_uniquekeys(clause_list, uniquekey, false))
+ {
+ foreach(lc2, rel1_uniquekey_context)
+ {
+ UniqueKeyContext ctx = (UniqueKeyContext)lfirst(lc2);
+ joinrel->uniquekeys = lappend(joinrel->uniquekeys, ctx->uniquekey);
+ ctx->added_to_joinrel = true;
+ }
+ break;
+ }
+ }
+
+ foreach(lc, rel1_uniquekey_context)
+ {
+ UniqueKeyContext context1 = (UniqueKeyContext) lfirst(lc);
+ if (context1->added_to_joinrel)
+ continue;
+ foreach(lc2, rel2_uniquekey_context)
+ {
+ UniqueKeyContext context2 = (UniqueKeyContext) lfirst(lc2);
+ List *uniquekey = NIL;
+ if (context2->added_to_joinrel)
+ continue;
+ uniquekey = list_copy(context1->uniquekey);
+ uniquekey = list_concat(uniquekey, context2->uniquekey);
+ joinrel->uniquekeys = lappend(joinrel->uniquekeys, uniquekey);
+ }
+ }
+}
diff --git a/src/backend/optimizer/plan/initsplan.c b/src/backend/optimizer/plan/initsplan.c
index e978b491f6..a674e271e7 100644
--- a/src/backend/optimizer/plan/initsplan.c
+++ b/src/backend/optimizer/plan/initsplan.c
@@ -829,7 +829,14 @@ deconstruct_recurse(PlannerInfo *root, Node *jtnode, bool below_outer_join,
foreach(l, (List *) f->quals)
{
Node *qual = (Node *) lfirst(l);
-
+ ListCell *lc;
+ foreach(lc, find_nonnullable_vars(qual))
+ {
+ Var *var = lfirst_node(Var, lc);
+ RelOptInfo *rel = root->simple_rel_array[var->varno];
+ if (var->varattno > InvalidAttrNumber)
+ rel->not_null_cols = bms_add_member(rel->not_null_cols, var->varattno);
+ }
distribute_qual_to_rels(root, qual,
false, below_outer_join, JOIN_INNER,
root->qual_security_level,
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index d6f2153593..e300dad442 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -2389,6 +2389,9 @@ grouping_planner(PlannerInfo *root, bool inheritance_update,
add_path(final_rel, path);
}
+ /* Copy the uniquekeys to final_rel */
+ final_rel->uniquekeys = current_rel->uniquekeys;
+
/*
* Generate partial paths for final_rel, too, if outer query levels might
* be able to make use of them.
@@ -3813,6 +3816,20 @@ create_grouping_paths(PlannerInfo *root,
Query *parse = root->parse;
RelOptInfo *grouped_rel;
RelOptInfo *partially_grouped_rel;
+ ListCell *lc;
+
+ List *required_unique_keys = get_sortgrouplist_exprs(parse->groupClause,
+ parse->targetList);
+ /*
+ * If the group-by clauses are unique already, a grouping node is not
+ * necessary if there are no aggregation functions.
+ */
+ if (required_unique_keys != NIL &&
+ !parse->hasAggs &&
+ !parse->hasWindowFuncs &&
+ parse->havingQual == NULL &&
+ relation_has_uniquekeys_for(root, input_rel, required_unique_keys))
+ return input_rel;
/*
* Create grouping relation to hold fully aggregated grouping and/or
@@ -3901,6 +3918,19 @@ create_grouping_paths(PlannerInfo *root,
}
set_cheapest(grouped_rel);
+
+ /* Copy the input rel's uniquekeys to grouped_rel and add the one created
+ * by the groupBy clause
+ */
+ foreach(lc, input_rel->uniquekeys)
+ {
+ List *uniquekey = lfirst_node(List, lc);
+ if (list_all_members_in(uniquekey, grouped_rel->reltarget->exprs))
+ grouped_rel->uniquekeys = lappend(grouped_rel->uniquekeys, uniquekey);
+ }
+ if (required_unique_keys != NIL)
+ grouped_rel->uniquekeys = lappend(grouped_rel->uniquekeys,
+ required_unique_keys);
return grouped_rel;
}
@@ -4736,6 +4766,12 @@ create_distinct_paths(PlannerInfo *root,
bool allow_hash;
Path *path;
ListCell *lc;
+ List *required_unique_keys = get_sortgrouplist_exprs(parse->distinctClause,
+ parse->targetList);
+
+ /* If the result is unique already, we just return the input_rel directly */
+ if (relation_has_uniquekeys_for(root, input_rel, required_unique_keys))
+ return input_rel;
/* For now, do all work in the (DISTINCT, NULL) upperrel */
distinct_rel = fetch_upper_rel(root, UPPERREL_DISTINCT, NULL);
@@ -4920,6 +4956,10 @@ create_distinct_paths(PlannerInfo *root,
/* Now choose the best path(s) */
set_cheapest(distinct_rel);
+ /* All the UKs valid before distinct are still valid, and we can add required_unique_keys as one more */
+ distinct_rel->uniquekeys = list_copy(input_rel->uniquekeys);
+ distinct_rel->uniquekeys = lappend(distinct_rel->uniquekeys,
+ required_unique_keys);
return distinct_rel;
}
@@ -5066,6 +5106,9 @@ create_ordered_paths(PlannerInfo *root,
* need us to do it.
*/
Assert(ordered_rel->pathlist != NIL);
+
+ /* Copy the unique keys */
+ ordered_rel->uniquekeys = input_rel->uniquekeys;
return ordered_rel;
}
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index d82fc5ab8b..34d30b181c 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -117,6 +117,7 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
Relation relation;
bool hasindex;
List *indexinfos = NIL;
+ int i;
/*
* We need not lock the relation since it was already locked, either by
@@ -460,6 +461,13 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
if (inhparent && relation->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
set_relation_partition_info(root, rel, relation);
+ Assert(rel->not_null_cols == NULL);
+ for(i = 0; i < relation->rd_att->natts; i++)
+ {
+ if (relation->rd_att->attrs[i].attnotnull)
+ rel->not_null_cols = bms_add_member(rel->not_null_cols, i+1);
+ }
+
table_close(relation, NoLock);
/*
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 3d3be197e0..ee26cfdb5c 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -687,6 +687,7 @@ typedef struct RelOptInfo
PlannerInfo *subroot; /* if subquery */
List *subplan_params; /* if subquery */
int rel_parallel_workers; /* wanted number of parallel workers */
+ Relids not_null_cols; /* the non-null columns for this relation, starting from 1 */
/* Information about foreign tables and foreign joins */
Oid serverid; /* identifies server for the table or join */
@@ -706,6 +707,7 @@ typedef struct RelOptInfo
QualCost baserestrictcost; /* cost of evaluating the above */
Index baserestrict_min_security; /* min security_level found in
* baserestrictinfo */
+ List *uniquekeys; /* List of Var groups */
List *joininfo; /* RestrictInfo structures for join clauses
* involving this rel */
bool has_eclass_joins; /* T means joininfo is incomplete */
@@ -1016,6 +1018,31 @@ typedef struct PathKey
bool pk_nulls_first; /* do NULLs come before normal values? */
} PathKey;
+/* UniqueKeySet
+ *
+ * Represents a set of unique keys
+ */
+typedef struct UniqueKeySet
+{
+ NodeTag type;
+
+ Bitmapset *non_null_keys; /* indexes of 'keys' proved non-null */
+ List *keys; /* list of UniqueKeys */
+} UniqueKeySet;
+
+/*
+ * UniqueKey
+ *
+ * Represents the unique properties held by a RelOptInfo or a Path
+ */
+typedef struct UniqueKey
+{
+ NodeTag type;
+
+ Oid uk_collation; /* collation, if datatypes are collatable */
+ Oid uk_opfamily; /* btree opfamily defining the ordering */
+ Expr *uk_expr; /* unique key expression */
+} UniqueKey;
/*
* PathTarget
diff --git a/src/include/nodes/pg_list.h b/src/include/nodes/pg_list.h
index 14ea2766ad..5dfb93895c 100644
--- a/src/include/nodes/pg_list.h
+++ b/src/include/nodes/pg_list.h
@@ -528,6 +528,8 @@ extern bool list_member_ptr(const List *list, const void *datum);
extern bool list_member_int(const List *list, int datum);
extern bool list_member_oid(const List *list, Oid datum);
+extern bool list_all_members_in(const List *members, const List *target);
+
extern List *list_delete(List *list, void *datum);
extern List *list_delete_ptr(List *list, void *datum);
extern List *list_delete_int(List *list, int datum);
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index 9ab73bd20c..f0cc2c4245 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -240,4 +240,21 @@ extern PathKey *make_canonical_pathkey(PlannerInfo *root,
extern void add_paths_to_append_rel(PlannerInfo *root, RelOptInfo *rel,
List *live_childrels);
+/*
+ * uniquekeys.c
+ * Utilities for matching and building unique keys
+ */
+extern void populate_baserel_uniquekeys(PlannerInfo *root,
+ RelOptInfo *baserel);
+extern bool relation_has_uniquekeys_for(PlannerInfo *root,
+ RelOptInfo *rel,
+ List *exprs);
+
+extern void propagate_unique_keys_to_joinrel(PlannerInfo *root,
+ RelOptInfo *joinrel,
+ RelOptInfo *rel1,
+ RelOptInfo *rel2,
+ List *restrictlist,
+ JoinType jointype);
+
#endif /* PATHS_H */
diff --git a/src/test/regress/expected/aggregates.out b/src/test/regress/expected/aggregates.out
index f457b5b150..07dab0ed5d 100644
--- a/src/test/regress/expected/aggregates.out
+++ b/src/test/regress/expected/aggregates.out
@@ -1092,12 +1092,10 @@ create temp table t2 (x int, y int, z int, primary key (x, y));
create temp table t3 (a int, b int, c int, primary key(a, b) deferrable);
-- Non-primary-key columns can be removed from GROUP BY
explain (costs off) select * from t1 group by a,b,c,d;
- QUERY PLAN
-----------------------
- HashAggregate
- Group Key: a, b
- -> Seq Scan on t1
-(3 rows)
+ QUERY PLAN
+----------------
+ Seq Scan on t1
+(1 row)
-- No removal can happen if the complete PK is not present in GROUP BY
explain (costs off) select a,c from t1 group by a,c,d;
@@ -1112,16 +1110,14 @@ explain (costs off) select a,c from t1 group by a,c,d;
explain (costs off) select *
from t1 inner join t2 on t1.a = t2.x and t1.b = t2.y
group by t1.a,t1.b,t1.c,t1.d,t2.x,t2.y,t2.z;
- QUERY PLAN
-------------------------------------------------------
- HashAggregate
- Group Key: t1.a, t1.b, t2.x, t2.y
- -> Hash Join
- Hash Cond: ((t2.x = t1.a) AND (t2.y = t1.b))
- -> Seq Scan on t2
- -> Hash
- -> Seq Scan on t1
-(7 rows)
+ QUERY PLAN
+------------------------------------------------
+ Hash Join
+ Hash Cond: ((t2.x = t1.a) AND (t2.y = t1.b))
+ -> Seq Scan on t2
+ -> Hash
+ -> Seq Scan on t1
+(5 rows)
-- Test case where t1 can be optimized but not t2
explain (costs off) select t1.*,t2.x,t2.z
@@ -1161,12 +1157,10 @@ explain (costs off) select * from t1 group by a,b,c,d;
-- Okay to remove columns if we're only querying the parent.
explain (costs off) select * from only t1 group by a,b,c,d;
- QUERY PLAN
-----------------------
- HashAggregate
- Group Key: a, b
- -> Seq Scan on t1
-(3 rows)
+ QUERY PLAN
+----------------
+ Seq Scan on t1
+(1 row)
create temp table p_t1 (
a int,
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index 761376b007..a5d98cf421 100644
--- a/src/test/regress/expected/join.out
+++ b/src/test/regress/expected/join.out
@@ -4417,33 +4417,31 @@ select d.* from d left join (select distinct * from b) s
explain (costs off)
select d.* from d left join (select * from b group by b.id, b.c_id) s
on d.a = s.id;
- QUERY PLAN
-------------------------------------------
+ QUERY PLAN
+------------------------------------
Merge Right Join
Merge Cond: (b.id = d.a)
- -> Group
- Group Key: b.id
- -> Index Scan using b_pkey on b
+ -> Index Scan using b_pkey on b
-> Sort
Sort Key: d.a
-> Seq Scan on d
-(8 rows)
+(6 rows)
-- similarly, but keying off a DISTINCT clause
explain (costs off)
select d.* from d left join (select distinct * from b) s
on d.a = s.id;
- QUERY PLAN
---------------------------------------
- Merge Right Join
- Merge Cond: (b.id = d.a)
- -> Unique
- -> Sort
- Sort Key: b.id, b.c_id
- -> Seq Scan on b
+ QUERY PLAN
+---------------------------------
+ Merge Left Join
+ Merge Cond: (d.a = s.id)
-> Sort
Sort Key: d.a
-> Seq Scan on d
+ -> Sort
+ Sort Key: s.id
+ -> Subquery Scan on s
+ -> Seq Scan on b
(9 rows)
-- check join removal works when uniqueness of the join condition is enforced
diff --git a/src/test/regress/expected/select_distinct.out b/src/test/regress/expected/select_distinct.out
index f3696c6d1d..c27e7d4b67 100644
--- a/src/test/regress/expected/select_distinct.out
+++ b/src/test/regress/expected/select_distinct.out
@@ -244,3 +244,279 @@ SELECT null IS NOT DISTINCT FROM null as "yes";
t
(1 row)
+create table select_distinct_a(pk1 int, pk2 char(20), uk1 char(20) not null, uk2 int, e int, primary key(pk1, pk2));
+create unique index select_distinct_a_uk on select_distinct_a(uk1, uk2);
+create table select_distinct_b(a int, b char(20), pk1 char(20), pk2 int, e int, primary key(pk1, pk2));
+-- distinct erased since (pk1, pk2)
+explain (costs off) select distinct * from select_distinct_a;
+ QUERY PLAN
+-------------------------------
+ Seq Scan on select_distinct_a
+(1 row)
+
+-- distinct can't be erased since we require all the uk columns to be not null
+explain (costs off) select distinct uk1, uk2 from select_distinct_a;
+ QUERY PLAN
+-------------------------------------
+ HashAggregate
+ Group Key: uk1, uk2
+ -> Seq Scan on select_distinct_a
+(3 rows)
+
+-- distinct erased since uk + not null
+explain (costs off) select distinct uk1, uk2 from select_distinct_a where uk2 is not null;
+ QUERY PLAN
+-------------------------------
+ Seq Scan on select_distinct_a
+ Filter: (uk2 IS NOT NULL)
+(2 rows)
+
+explain (costs off) select distinct uk1, uk2 from select_distinct_a where uk2 > 1;
+ QUERY PLAN
+-------------------------------
+ Seq Scan on select_distinct_a
+ Filter: (uk2 > 1)
+(2 rows)
+
+-- distinct erased due to group by
+explain (costs off) select distinct e from select_distinct_a group by e;
+ QUERY PLAN
+-------------------------------------
+ HashAggregate
+ Group Key: e
+ -> Seq Scan on select_distinct_a
+(3 rows)
+
+-- distinct erased due to the restrictinfo
+explain (costs off) select distinct uk1 from select_distinct_a where pk1 = 1 and pk2 = 'c';
+ QUERY PLAN
+--------------------------------------------------------------
+ Index Scan using select_distinct_a_pkey on select_distinct_a
+ Index Cond: ((pk1 = 1) AND (pk2 = 'c'::bpchar))
+(2 rows)
+
+-- test join
+set enable_mergejoin to off;
+set enable_hashjoin to off;
+insert into select_distinct_a values(1, 'a', 'a', 0, 1), (1, 'b', 'A', 0, 2), (3, 'c', 'c', 0, 3);
+insert into select_distinct_b values(1, 'a', 'a', 0, 1), (4, 'd', 'd', 0, 4), (1, 'e', 'e', 0, 5);
+-- Cartesian join
+explain (costs off) select distinct a.uk1, a.uk2, b.pk1, b.pk2 from select_distinct_a a, select_distinct_b b where a.uk2 is not null;
+ QUERY PLAN
+---------------------------------------------
+ Nested Loop
+ -> Seq Scan on select_distinct_b b
+ -> Materialize
+ -> Seq Scan on select_distinct_a a
+ Filter: (uk2 IS NOT NULL)
+(5 rows)
+
+select distinct a.uk1, a.uk2, b.pk1, b.pk2 from select_distinct_a a, select_distinct_b b where a.uk2 is not null order by 1, 2, 3, 4;
+ uk1 | uk2 | pk1 | pk2
+----------------------+-----+----------------------+-----
+ a | 0 | a | 0
+ a | 0 | d | 0
+ a | 0 | e | 0
+ A | 0 | a | 0
+ A | 0 | d | 0
+ A | 0 | e | 0
+ c | 0 | a | 0
+ c | 0 | d | 0
+ c | 0 | e | 0
+(9 rows)
+
+-- left join
+explain (costs off) select distinct a.pk1, a.pk2, b.pk1, b.pk2 from select_distinct_a a left join select_distinct_b b on (a.pk1 = b.a);
+ QUERY PLAN
+---------------------------------------------
+ Nested Loop Left Join
+ Join Filter: (a.pk1 = b.a)
+ -> Seq Scan on select_distinct_a a
+ -> Materialize
+ -> Seq Scan on select_distinct_b b
+(5 rows)
+
+select distinct a.pk1, a.pk2, b.pk1, b.pk2 from select_distinct_a a left join select_distinct_b b on (a.pk1 = b.a) order by 1, 2, 3, 4;
+ pk1 | pk2 | pk1 | pk2
+-----+----------------------+----------------------+-----
+ 1 | a | a | 0
+ 1 | a | e | 0
+ 1 | b | a | 0
+ 1 | b | e | 0
+ 3 | c | |
+(5 rows)
+
+-- right join
+explain (costs off) select distinct a.pk1, a.pk2, b.pk1, b.pk2 from select_distinct_a a right join select_distinct_b b on (a.pk1 = b.a);
+ QUERY PLAN
+---------------------------------------------------------------------------
+ Nested Loop Left Join
+ -> Seq Scan on select_distinct_b b
+ -> Index Only Scan using select_distinct_a_pkey on select_distinct_a a
+ Index Cond: (pk1 = b.a)
+(4 rows)
+
+select distinct a.pk1, a.pk2, b.pk1, b.pk2 from select_distinct_a a right join select_distinct_b b on (a.pk1 = b.a) order by 1, 2, 3, 4;
+ pk1 | pk2 | pk1 | pk2
+-----+----------------------+----------------------+-----
+ 1 | a | a | 0
+ 1 | a | e | 0
+ 1 | b | a | 0
+ 1 | b | e | 0
+ | | d | 0
+(5 rows)
+
+-- full join
+explain (costs off) select distinct a.pk1, a.pk2, b.pk1, b.pk2 from select_distinct_a a full outer join select_distinct_b b on (a.pk1 = b.a);
+ QUERY PLAN
+---------------------------------------------
+ Hash Full Join
+ Hash Cond: (a.pk1 = b.a)
+ -> Seq Scan on select_distinct_a a
+ -> Hash
+ -> Seq Scan on select_distinct_b b
+(5 rows)
+
+select distinct a.pk1, a.pk2, b.pk1, b.pk2 from select_distinct_a a full outer join select_distinct_b b on (a.pk1 = b.a) order by 1, 2, 3, 4;
+ pk1 | pk2 | pk1 | pk2
+-----+----------------------+----------------------+-----
+ 1 | a | a | 0
+ 1 | a | e | 0
+ 1 | b | a | 0
+ 1 | b | e | 0
+ 3 | c | |
+ | | d | 0
+(6 rows)
+
+-- distinct can't be erased since b.pk2 is missing
+explain (costs off) select distinct a.pk1, a.pk2, b.pk1 from select_distinct_a a full outer join select_distinct_b b on (a.pk1 = b.a);
+ QUERY PLAN
+---------------------------------------------------------
+ Unique
+ -> Sort
+ Sort Key: a.pk1, a.pk2, b.pk1
+ -> Hash Full Join
+ Hash Cond: (a.pk1 = b.a)
+ -> Seq Scan on select_distinct_a a
+ -> Hash
+ -> Seq Scan on select_distinct_b b
+(8 rows)
+
+-- Semi/anti join
+explain (costs off) select distinct pk1, pk2 from select_distinct_a where pk1 in (select a from select_distinct_b);
+ QUERY PLAN
+-------------------------------------------------------------------------
+ Nested Loop
+ -> HashAggregate
+ Group Key: select_distinct_b.a
+ -> Seq Scan on select_distinct_b
+ -> Index Only Scan using select_distinct_a_pkey on select_distinct_a
+ Index Cond: (pk1 = select_distinct_b.a)
+(6 rows)
+
+explain (costs off) select distinct pk1, pk2 from select_distinct_a where pk1 not in (select a from select_distinct_b);
+ QUERY PLAN
+---------------------------------------
+ Seq Scan on select_distinct_a
+ Filter: (NOT (hashed SubPlan 1))
+ SubPlan 1
+ -> Seq Scan on select_distinct_b
+(4 rows)
+
+-- we can also handle some limited subqueries
+explain (costs off) select distinct * from select_distinct_a a, (select a from select_distinct_b group by a) b where a.pk1 = b.a;
+ QUERY PLAN
+----------------------------------------------------------------------
+ Nested Loop
+ -> HashAggregate
+ Group Key: select_distinct_b.a
+ -> Seq Scan on select_distinct_b
+ -> Index Scan using select_distinct_a_pkey on select_distinct_a a
+ Index Cond: (pk1 = select_distinct_b.a)
+(6 rows)
+
+select distinct * from select_distinct_a a, (select a from select_distinct_b group by a) b where a.pk1 = b.a order by 1, 2, 3;
+ pk1 | pk2 | uk1 | uk2 | e | a
+-----+----------------------+----------------------+-----+---+---
+ 1 | a | a | 0 | 1 | 1
+ 1 | b | A | 0 | 2 | 1
+(2 rows)
+
+explain (costs off) select distinct * from select_distinct_a a, (select distinct a from select_distinct_b) b where a.pk1 = b.a;
+ QUERY PLAN
+----------------------------------------------------------------------
+ Nested Loop
+ -> HashAggregate
+ Group Key: select_distinct_b.a
+ -> Seq Scan on select_distinct_b
+ -> Index Scan using select_distinct_a_pkey on select_distinct_a a
+ Index Cond: (pk1 = select_distinct_b.a)
+(6 rows)
+
+select distinct * from select_distinct_a a, (select distinct a from select_distinct_b) b where a.pk1 = b.a order by 1, 2, 3;
+ pk1 | pk2 | uk1 | uk2 | e | a
+-----+----------------------+----------------------+-----+---+---
+ 1 | a | a | 0 | 1 | 1
+ 1 | b | A | 0 | 2 | 1
+(2 rows)
+
+-- Distinct On
+-- can't erase since pk2 is missing
+explain (costs off) select distinct on(pk1) pk1, pk2 from select_distinct_a;
+ QUERY PLAN
+-------------------------------------------
+ Unique
+ -> Sort
+ Sort Key: pk1
+ -> Seq Scan on select_distinct_a
+(4 rows)
+
+-- ok to erase
+explain (costs off) select distinct on(pk1, pk2) pk1, pk2 from select_distinct_a;
+ QUERY PLAN
+-------------------------------
+ Seq Scan on select_distinct_a
+(1 row)
+
+-- test some view.
+create view distinct_v1 as select distinct uk1, uk2 from select_distinct_a where uk2 is not null;
+explain (costs off) select * from distinct_v1;
+ QUERY PLAN
+-------------------------------
+ Seq Scan on select_distinct_a
+ Filter: (uk2 IS NOT NULL)
+(2 rows)
+
+alter table select_distinct_a alter column uk1 drop not null;
+explain (costs off) select * from distinct_v1;
+ QUERY PLAN
+-----------------------------------------------------------
+ HashAggregate
+ Group Key: select_distinct_a.uk1, select_distinct_a.uk2
+ -> Seq Scan on select_distinct_a
+ Filter: (uk2 IS NOT NULL)
+(4 rows)
+
+alter table select_distinct_a alter column uk1 set not null;
+-- test generic plan
+prepare pt as select * from distinct_v1;
+explain (costs off) execute pt;
+ QUERY PLAN
+-------------------------------
+ Seq Scan on select_distinct_a
+ Filter: (uk2 IS NOT NULL)
+(2 rows)
+
+alter table select_distinct_a alter column uk1 drop not null;
+explain (costs off) execute pt;
+ QUERY PLAN
+-----------------------------------------------------------
+ HashAggregate
+ Group Key: select_distinct_a.uk1, select_distinct_a.uk2
+ -> Seq Scan on select_distinct_a
+ Filter: (uk2 IS NOT NULL)
+(4 rows)
+
+drop view distinct_v1;
+drop table select_distinct_a;
+drop table select_distinct_b;
diff --git a/src/test/regress/sql/select_distinct.sql b/src/test/regress/sql/select_distinct.sql
index a605e86449..282ca58cf9 100644
--- a/src/test/regress/sql/select_distinct.sql
+++ b/src/test/regress/sql/select_distinct.sql
@@ -73,3 +73,87 @@ SELECT 1 IS NOT DISTINCT FROM 2 as "no";
SELECT 2 IS NOT DISTINCT FROM 2 as "yes";
SELECT 2 IS NOT DISTINCT FROM null as "no";
SELECT null IS NOT DISTINCT FROM null as "yes";
+create table select_distinct_a(pk1 int, pk2 char(20), uk1 char(20) not null, uk2 int, e int, primary key(pk1, pk2));
+create unique index select_distinct_a_uk on select_distinct_a(uk1, uk2);
+create table select_distinct_b(a int, b char(20), pk1 char(20), pk2 int, e int, primary key(pk1, pk2));
+
+-- distinct erased since (pk1, pk2)
+explain (costs off) select distinct * from select_distinct_a;
+
+-- distinct can't be erased since we require all the uk columns to be not null
+explain (costs off) select distinct uk1, uk2 from select_distinct_a;
+
+-- distinct erased since uk + not null
+explain (costs off) select distinct uk1, uk2 from select_distinct_a where uk2 is not null;
+explain (costs off) select distinct uk1, uk2 from select_distinct_a where uk2 > 1;
+
+-- distinct erased due to group by
+explain (costs off) select distinct e from select_distinct_a group by e;
+
+-- distinct erased due to the restrictinfo
+explain (costs off) select distinct uk1 from select_distinct_a where pk1 = 1 and pk2 = 'c';
+
+-- test join
+set enable_mergejoin to off;
+set enable_hashjoin to off;
+
+insert into select_distinct_a values(1, 'a', 'a', 0, 1), (1, 'b', 'A', 0, 2), (3, 'c', 'c', 0, 3);
+insert into select_distinct_b values(1, 'a', 'a', 0, 1), (4, 'd', 'd', 0, 4), (1, 'e', 'e', 0, 5);
+
+-- Cartesian join
+explain (costs off) select distinct a.uk1, a.uk2, b.pk1, b.pk2 from select_distinct_a a, select_distinct_b b where a.uk2 is not null;
+select distinct a.uk1, a.uk2, b.pk1, b.pk2 from select_distinct_a a, select_distinct_b b where a.uk2 is not null order by 1, 2, 3, 4;
+
+
+-- left join
+explain (costs off) select distinct a.pk1, a.pk2, b.pk1, b.pk2 from select_distinct_a a left join select_distinct_b b on (a.pk1 = b.a);
+select distinct a.pk1, a.pk2, b.pk1, b.pk2 from select_distinct_a a left join select_distinct_b b on (a.pk1 = b.a) order by 1, 2, 3, 4;
+
+-- right join
+explain (costs off) select distinct a.pk1, a.pk2, b.pk1, b.pk2 from select_distinct_a a right join select_distinct_b b on (a.pk1 = b.a);
+select distinct a.pk1, a.pk2, b.pk1, b.pk2 from select_distinct_a a right join select_distinct_b b on (a.pk1 = b.a) order by 1, 2, 3, 4;
+
+-- full join
+explain (costs off) select distinct a.pk1, a.pk2, b.pk1, b.pk2 from select_distinct_a a full outer join select_distinct_b b on (a.pk1 = b.a);
+select distinct a.pk1, a.pk2, b.pk1, b.pk2 from select_distinct_a a full outer join select_distinct_b b on (a.pk1 = b.a) order by 1, 2, 3, 4;
+
+-- distinct can't be erased since b.pk2 is missing
+explain (costs off) select distinct a.pk1, a.pk2, b.pk1 from select_distinct_a a full outer join select_distinct_b b on (a.pk1 = b.a);
+
+
+-- Semi/anti join
+explain (costs off) select distinct pk1, pk2 from select_distinct_a where pk1 in (select a from select_distinct_b);
+explain (costs off) select distinct pk1, pk2 from select_distinct_a where pk1 not in (select a from select_distinct_b);
+
+-- we can also handle some limited subqueries
+explain (costs off) select distinct * from select_distinct_a a, (select a from select_distinct_b group by a) b where a.pk1 = b.a;
+select distinct * from select_distinct_a a, (select a from select_distinct_b group by a) b where a.pk1 = b.a order by 1, 2, 3;
+
+explain (costs off) select distinct * from select_distinct_a a, (select distinct a from select_distinct_b) b where a.pk1 = b.a;
+select distinct * from select_distinct_a a, (select distinct a from select_distinct_b) b where a.pk1 = b.a order by 1, 2, 3;
+
+-- Distinct On
+-- can't erase since pk2 is missing
+explain (costs off) select distinct on(pk1) pk1, pk2 from select_distinct_a;
+-- ok to erase
+explain (costs off) select distinct on(pk1, pk2) pk1, pk2 from select_distinct_a;
+
+
+-- test some view.
+create view distinct_v1 as select distinct uk1, uk2 from select_distinct_a where uk2 is not null;
+explain (costs off) select * from distinct_v1;
+
+alter table select_distinct_a alter column uk1 drop not null;
+explain (costs off) select * from distinct_v1;
+
+alter table select_distinct_a alter column uk1 set not null;
+
+-- test generic plan
+prepare pt as select * from distinct_v1;
+explain (costs off) execute pt;
+alter table select_distinct_a alter column uk1 drop not null;
+explain (costs off) execute pt;
+
+drop view distinct_v1;
+drop table select_distinct_a;
+drop table select_distinct_b;
--
2.21.0
On Mon, 16 Mar 2020 at 06:01, Andy Fan <zhihui.fan1213@gmail.com> wrote:
Hi All:
I have re-implemented the patch based on David's suggestion/code, and it looks
like it works well. The updated patch mainly includes:
1. Maintain the not_null_colno in RelOptInfo, which includes the not null from
the catalog and the not null from vars.
What about non-nullability that we can derive from means other than
NOT NULL constraints? Where will you track that now that you've
removed the UniqueKeySet type?
Traditionally we use attno or attnum rather than colno for variable
names containing attribute numbers.
3. Postpone the propagate_unique_keys_to_joinrel call to populate_joinrel_with_paths,
since we know the jointype at that time, so we can handle the semi/anti join specially.
ok, but the join type was known already where I was calling the
function from. It just wasn't passed to the function.
4. Add the rule I suggested above: if both of the two relations yield a unique result,
the join result will be unique as well. The UK can be ( (rel1_uk1, rel1_uk2).. )
I see. So basically you're saying that the joinrel's uniquekeys should
be the cartesian product of the unique rels from either side of the
join. I wonder if that's a special case we need to worry about too
much. Surely it only applies for clauseless joins.
5. If the unique key cannot be referenced by others, we can safely ignore
it in order to keep the (join)rel->uniquekeys short.
You could probably have an equivalent of has_useful_pathkeys() and
pathkeys_useful_for_ordering().
6. I only consider the not null check/opfamily check for the uniquekey that comes
from UniqueIndex. I think that should be correct.
7. I defined each uniquekey as a List of Expr, so I didn't introduce a new node type.
Where will you store the collation Oid? I left comments to mention
that needed to be checked but just didn't wire it up.
Hi David:
Thanks for your time.
On Wed, Mar 18, 2020 at 9:56 AM David Rowley <dgrowleyml@gmail.com> wrote:
On Mon, 16 Mar 2020 at 06:01, Andy Fan <zhihui.fan1213@gmail.com> wrote:
Hi All:
I have re-implemented the patch based on David's suggestion/code, and it looks
like it works well. The updated patch mainly includes:
1. Maintain the not_null_colno in RelOptInfo, which includes the not null from
the catalog and the not null from vars.
What about non-nullability that we can derive from means other than
NOT NULL constraints? Where will you track that now that you've
removed the UniqueKeySet type?
I tracked it in 'deconstruct_recurse', just before
the distribute_qual_to_rels call.
+ ListCell *lc;
+ foreach(lc, find_nonnullable_vars(qual))
+ {
+ Var *var = lfirst_node(Var, lc);
+ RelOptInfo *rel = root->simple_rel_array[var->varno];
+ if (var->varattno > InvalidAttrNumber)
+ rel->not_null_cols = bms_add_member(rel->not_null_cols, var->varattno);
+ }
Traditionally we use attno or attnum rather than colno for variable
names containing attribute numbers
Currently I use a list of Vars for a UniqueKey; I guess that is OK?
3. Postpone the propagate_unique_keys_to_joinrel call to
populate_joinrel_with_paths,
since we know the jointype at that time, so we can handle the semi/anti
join specially.
ok, but the join type was known already where I was calling the
function from. It just wasn't passed to the function.
4. Add the rule I suggested above: if both of the two relations yield
a unique result,
the join result will be unique as well. The UK can be ( (rel1_uk1,
rel1_uk2).. )
I see. So basically you're saying that the joinrel's uniquekeys should
be the cartesian product of the unique rels from either side of the
join. I wonder if that's a special case we need to worry about too
much. Surely it only applies for clauseless joins.
There are some other cases where we may need this as well :), like select m1.pk, m2.pk
from m1, m2
where m1.b = m2.b;
The cartesian product of the unique rels will make the unique keys too
long, so I maintain
the UniqueKeyContext to keep it short. The idea is that if (UK1) is unique
already, there is no need
to add another UK as (UK1, UK2), which is just a superset of it.
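As a minimal sketch of that pruning idea (hypothetical illustration only, not
code from the posted patch; it assumes PostgreSQL's List API from
nodes/pg_list.h and that each uniquekey is a List of Exprs):

/* Return true iff every member of 'subset' also appears in 'superset'. */
static bool
uniquekey_contains(List *superset, List *subset)
{
    ListCell   *lc;

    foreach(lc, subset)
    {
        if (!list_member(superset, lfirst(lc)))
            return false;
    }
    return true;
}

/*
 * Add 'candidate' to a rel's uniquekey list unless some existing
 * (shorter) key is already a subset of it; the shorter key proves
 * uniqueness on its own, so keeping the superset adds nothing.
 */
static List *
add_uniquekey_pruned(List *uniquekeys, List *candidate)
{
    ListCell   *lc;

    foreach(lc, uniquekeys)
    {
        List       *existing = lfirst_node(List, lc);

        if (uniquekey_contains(candidate, existing))
            return uniquekeys;  /* candidate is a superset; skip it */
    }
    return lappend(uniquekeys, candidate);
}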
5. If the unique key cannot be referenced by others, we can
safely ignore it in order to keep the (join)rel->uniquekeys short.
You could probably have an equivalent of has_useful_pathkeys() and
pathkeys_useful_for_ordering().
Thanks for the suggestion, I will do so in the v5-xx.patch.
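For illustration, such a helper might look like the minimal sketch below, in
the spirit of pathkeys_useful_for_ordering(); the function name and the
visible_exprs parameter are hypothetical, not from the posted patch:

/*
 * A uniquekey can only ever be referenced (e.g. to erase a DISTINCT)
 * if every expression in it is visible to the upper levels, so keys
 * failing this test can be dropped early to keep rel->uniquekeys short.
 */
static bool
uniquekeys_useful_for_distinct(List *uniquekey, List *visible_exprs)
{
    ListCell   *lc;

    foreach(lc, uniquekey)
    {
        Expr       *expr = (Expr *) lfirst(lc);

        if (!list_member(visible_exprs, expr))
            return false;
    }
    return true;
}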
6. I only consider the not null check/opfamily check for the uniquekey
which comes
from UniqueIndex. I think that should be correct.
7. I defined each uniquekey as a List of Expr, so I didn't introduce a new node type.
Where will you store the collation Oid? I left comments to mention
that needed to be checked but just didn't wire it up.
This is embarrassing; I am not sure if it is safe to ignore it. I
removed it for
the following reasons (sorry that I didn't explain them carefully in the
last email).
1. When we decide if a UK is usable, we have the chance to compare the
collation info
for the restrictinfo (where uk = 1) or the target list (select uk from t) with
the indexinfo's collation;
the targetlist one is more troublesome, since we need to figure out the default
collation for it.
However, relation_has_unique_index_for is in the same situation as us, and it
ignores it as well.
See the comment /* XXX at some point we may need to check collations here too.
*/. I think
there may be some reasons we can ignore that.
2. What we expect from a UK is:
a). Where m1.uniquekey = m2.b: m2.uk will not be duplicated by this
joinclause. Here
if m1.uk has a different collation, it will raise a runtime error.
b). Where m1.uniquekey collate 'xxxx' = m2.b: we can't depend on
the run-time error this time. But if we are sure that *if uk at the
default collation is unique,
then (uk collate 'other-collation') is unique as well*, then we may safely
ignore it as well.
c). select uniquekey from t / select uniquekey collate 'xxxx' from t.
This has the same
requirement as item b).
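(To make b) concrete with a hypothetical example: if uk is unique under its
default collation, the question is whether a query such as
select distinct uk collate "C" from t may still treat uk as unique, or whether
the changed collation could make previously distinct values compare equal.)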
3). It looks like maintaining the collation for our case is a big effort overall,
and users rarely use it. If
my expectation for 2.b is not true, I prefer to detect such a case (user uses
a different collation),
and we can just ignore the UK for that.
But after all, I think this should be an open question for now.
---
Lastly, I am very grateful for your suggestions/feedback; they are really
instructive and constructive.
Thanks also to Tom for the quick review and the suggestion to add new fields
to RelOptInfo;
without it I don't think I could have added a new field to such an important struct.
And also thanks to Bapat, who
explained things in more detail. I'm now writing the code for the partition
index stuff, which
is a bit boring, since every partition may have a different unique index.
I am expecting that
I can finish it in the following 2 days, and I hope you can have another
round of review then.
Thanks for your feedback!
Best Regards
Andy Fan
On Wed, 18 Mar 2020 at 15:57, Andy Fan <zhihui.fan1213@gmail.com> wrote:
I'm now writing the code for the partition index stuff, which
is a bit boring, since every partition may have a different unique index.
Why is that case so different?
For a partitioned table to have a valid unique index, a unique index
must exist on each partition having columns that are a superset of the
partition key columns. An IndexOptInfo will exist on the partitioned
table's RelOptInfo, in this case.
At the leaf partition level, wouldn't you just add the uniquekeys the
same as we do for base rels? Maybe only do it if
enable_partitionwise_aggregation is on. Otherwise, I don't think we'll
currently have a need for them. Currently, we don't do unique joins
for partition-wise joins. Perhaps uniquekeys will be a good way to fix
that omission in the future.
Hi David:
On Wed, Mar 18, 2020 at 12:13 PM David Rowley <dgrowleyml@gmail.com> wrote:
On Wed, 18 Mar 2020 at 15:57, Andy Fan <zhihui.fan1213@gmail.com> wrote:
I'm now writing the code for the partition index stuff, which
is a bit boring, since every partition may have a different unique index.
Why is that case so different?
For a partitioned table to have a valid unique index, a unique index
must exist on each partition having columns that are a superset of the
partition key columns. An IndexOptInfo will exist on the partitioned
table's RelOptInfo, in this case.
The main differences are caused by:
1. We can create a unique index on some of the partitions only.
create table q100 (a int, b int, c int) partition by range (b);
create table q100_1 partition of q100 for values from (1) to (10);
create table q100_2 partition of q100 for values from (11) to (20);
create unique index q100_1_c on q100_1(c); -- the user may create this index
on q100_1 only
2. The unique index may not contain the partition key, as above.
For the above case, even with the same index on all the partitions, we
still can't
use it, since it is unique on the local partition only.
3. So a unique index on a partitioned table can be used only if it
contains the partition key
AND exists on all the partitions.
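(For example, a unique index created on the parent itself, say
create unique index q100_b_c on q100 (b, c);, satisfies both conditions:
PostgreSQL requires such an index to include the partition key columns and
cascades it to every partition. The index name here is just for illustration.)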
4. When we apply uniquekey_is_useful_for_rel, I compare the
information between ind->indextlist
and rel->reltarget, but the indextlist has a wrong varno, so we have to
change the varno with
ChangeVarNodes for the indextlist from the childrel, since the varno is for
the childrel.
5. When we detect the uk = 1 case, the uk is also present with
parentrel->relid information, which
may require ChangeVarNodes on childrel->indexinfo->indextlist as
well.
Even though the rules look long, the run time should be very short, since
usually we don't have
many unique indexes on a partitioned table.
At the leaf partition level, wouldn't you just add the uniquekeys the
same as we do for base rels?
Yes, but since the uk of a childrel may not be useful for the parent rel (the
uk may exist
in one partition only), I think we can bypass it in the child rel case?
Best Regards
Andy Fan
I have started the new thread [1] to continue talking about this.
Mr. cfbot is happy now.
[1]: /messages/by-id/CAKU4AWrwZMAL=uaFUDMf4WGOVkEL3ONbatqju9nSXTUucpp_pw@mail.gmail.com
Thanks