Bloom filter Pushdown Optimization for Merge Join

Started by Zheng Liover 3 years ago11 messages

zhengli10@gmail.com

over 3 years ago

9 attachment(s)

Hello,

A bloom filter provides early filtering of rows that cannot be joined
before they would reach the join operator, the optimization is also
called a semi join filter (SJF) pushdown. Such a filter can be created
when one child of the join operator must materialize its derived table
before the other child is evaluated.

For example, a bloom filter can be created using the the join keys for
the build side/inner side of a hash join or the outer side of a merge
join, the bloom filter can then be used to pre-filter rows on the
other side of the join operator during the scan of the base relation.
The thread about “Hash Joins vs. Bloom Filters / take 2” [1]/messages/by-id/c902844d-837f-5f63-ced3-9f7fd222f175@2ndquadrant.com is good
discussion on using such optimization for hash join without going into
the pushdown of the filter where its performance gain could be further
increased.

We worked on prototyping bloom filter pushdown for both hash join and
merge join. Attached is a patch set for bloom filter pushdown for
merge join. We also plan to send the patch for hash join once we have
it rebased.

Here is a summary of the patch set:
1. Bloom Filter Pushdown optimizes Merge Join by filtering rows early
during the table scan instead of later on.
-The bloom filter is pushed down along the execution tree to
the target SeqScan nodes.
-Experiments show that this optimization can speed up Merge
Join by up to 36%.

2. The planner makes the decision to use the bloom filter based on the
estimated filtering rate and the expected performance gain.
-The planner accomplishes this by estimating four numbers per
variable - the total number of rows of the relation, the number of
distinct values for a given variable, and the minimum and maximum
value of the variable (when applicable). Using these numbers, the
planner estimates a filtering rate of a potential filter.
-Because actually creating and implementing the filter adds
more operations, there is a minimum threshold of filtering where the
filter would actually be useful. Based on testing, we query to see if
the estimated filtering rate is higher than 35%, and that informs our
decision to use a filter or not.

3. If using a bloom filter, the planner also adjusts the expected cost
of Merge Join based on expected performance gain.

4. Capability to build the bloom filter in parallel in case of
parallel SeqScan. This is done efficiently by populating a local bloom
filter for each parallel worker and then taking a bitwise OR over all
the local bloom filters to form a shared bloom filter at the end of
the parallel SeqScan.

5. The optimization is GUC controlled, with settings of
enable_mergejoin_semijoin_filter and force_mergejoin_semijoin_filter.

We found in experiments that there is a significant improvement
when using the bloom filter during Merge Join. One experiment involved
joining two large tables while varying the theoretical filtering rate
(TFR) between the two tables, the TFR is defined as the percentage
that the two datasets are disjoint. Both tables in the merge join were
the same size. We tested changing the TFR to see the change in
filtering optimization.

For example, let’s imagine t0 has 10 million rows, which contain the
numbers 1 through 10 million randomly shuffled. Also, t1 has the
numbers 4 million through 14 million randomly shuffled. Then the TFR
for a join of these two tables is 40%, since 40% of the tables are
disjoint from the other table (1 through 4 million for t0, 10 million
through 14 million for t4).

Here is the performance test result joining two tables:
TFR: theoretical filtering rate
EFR: estimated filtering rate
AFR: actual filtering rate
HJ: hash join
MJ Default: default merge join
MJ Filter: merge join with bloom filter optimization enabled
MJ Filter Forced: merge join with bloom filter optimization forced

TFR EFR AFR HJ MJ Default MJ Filter MJ Filter Forced
-------------------------------------------------------------------------------------
10 33.46 7.41 6529 22638 21949 23160
20 37.27 14.85 6483 22290 21928 21930
30 41.32 22.25 6395 22374 20718 20794
40 45.67 29.7 6272 21969 19449 19410
50 50.41 37.1 6210 21412 18222 18224
60 55.64 44.51 6052 21108 17060 17018
70 61.59 51.98 5947 21020 15682 15737
80 68.64 59.36 5761 20812 14411 14437
90 77.83 66.86 5701 20585 13171 13200
Table. Execution Time (ms) vs Filtering Rate (%) for Joining Two
Tables of 10M Rows.

Attached you can find figures of the same performance test and a SQL script
to reproduce the performance test.

The first thing to notice is that Hash Join generally is the most
efficient join strategy. This is because Hash Join is better at
dealing with small tables, and our size of 10 million is still small
enough where Hash Join outperforms the other join strategies. Future
experiments can investigate using much larger tables.

However, comparing just within the different Merge Join variants, we
see that using the bloom filter greatly improves performance.
Intuitively, all of these execution times follow linear paths.
Comparing forced filtering versus default, we can see that the default
Merge Join outperforms Merge Join with filtering at low filter rates,
but after about 20% TFR, the Merge Join with filtering outperforms
default Merge Join. This makes intuitive sense, as there are some
fixed costs associated with building and checking with the bloom
filter. In the worst case, at only 10% TFR, the bloom filter makes
Merge Join less than 5% slower. However, in the best case, at 90% TFR,
the bloom filter improves Merge Join by 36%.

Based on the results of the above experiments, we came up with a
linear equation for the performance ratio for using the filter
pushdown from the actual filtering rate. Based on the numbers
presented in the figure, this is the equation:

T_filter / T_no_filter = 1 / (0.83 * estimated filtering rate + 0.863)

For example, this means that with an estimated filtering rate of 0.4,
the execution time of merge join is estimated to be improved by 16.3%.
Note that the estimated filtering rate is used in the equation, not
the theoretical filtering rate or the actual filtering rate because it
is what we have during planning. In practice the estimated filtering
rate isn’t usually accurate. In fact, the estimated filtering rate can
differ from the theoretical filtering rate by as much as 17% in our
experiments. One way to mitigate the power loss of bloom filter caused
by inaccurate estimated filtering rate is to adaptively turn it off at
execution time, this is yet to be implemented.

Here is a list of tasks we plan to work on in order to improve this patch:
1. More regression testing to guarantee correctness.
2. More performance testing involving larger tables and complicated query plans.
3. Improve the cost model.
4. Explore runtime tuning such as making the bloom filter checking adaptive.
5. Currently, only the best single join key is used for building the
Bloom filter. However, if there are several keys and we know that
their distributions are somewhat disjoint, we could leverage this fact
and use multiple keys for the bloom filter.
6. Currently, Bloom filter pushdown is only implemented for SeqScan
nodes. However, it would be possible to allow push down to other types
of scan nodes.
7. Explore if the Bloom filter could be pushed down through a foreign
scan when the foreign server is capable of handling it – which could
be made true for postgres_fdw.
8. Better explain command on the usage of bloom filters.

This patch set is prepared by Marcus Ma, Lyu Pan and myself. Feedback
is appreciated.

With Regards,
Zheng Li
Amazon RDS/Aurora for PostgreSQL

[1]: /messages/by-id/c902844d-837f-5f63-ced3-9f7fd222f175@2ndquadrant.com

Attachments:

0001-Support-semijoin-filter-in-the-planner-optimizer.patchapplication/octet-stream; name=0001-Support-semijoin-filter-in-the-planner-optimizer.patchDownload

From 43c31315278890febb411eccd0652ab109215010 Mon Sep 17 00:00:00 2001
From: Lyu Pan <lyup@amazon.com>
Date: Thu, 15 Sep 2022 22:16:23 +0000
Subject: [PATCH 1/5] Support semijoin filter in the planner/optimizer.

1. Introduces two GUCs enable_mergejoin_semijoin_filter and force_mergejoin_semijoin_filter. enable_mergejoin_semijoin_filter enables the use of bloom filter during a merge-join, if such a filter is available, the planner will adjust the merge-join cost and use it only if it meets certain threshold (estimated filtering rate has to be higher than a certain value);
force_mergejoin_semijoin_filter forces the use of bloom filter during a merge-join if a valid filter is available.

2. In this prototype, only a single join clause where both sides maps to a base column will be considered as the key to build the bloom filter. For example:
bloom filter may be used in this query:
SELECT * FROM a JOIN b ON a.col1 = b.col1;
bloom filter will not be used in the following query (the left hand side of the join clause is an expression):
SELECT * FROM a JOIN b ON a.col1 + a.col2 = b.col1;

3. In this prototype, the cost model is based on an assumption that there is a linear relationship between the performance gain from using a semijoin filter and the estimated filtering rate:
% improvement to Merge Join cost = 0.83 * estimated filtering rate - 0.137.
---
 src/backend/optimizer/path/costsize.c         | 1298 ++++++++++++++++-
 src/backend/optimizer/plan/createplan.c       |  611 ++++++++
 src/backend/utils/adt/selfuncs.c              |   24 +-
 src/backend/utils/misc/guc_tables.c           |   20 +
 src/backend/utils/misc/postgresql.conf.sample |    2 +
 src/include/nodes/pathnodes.h                 |   13 +
 src/include/nodes/plannodes.h                 |    8 +
 src/include/optimizer/cost.h                  |   47 +
 src/include/utils/selfuncs.h                  |    4 +-
 src/test/regress/expected/sysviews.out        |   49 +-
 10 files changed, 2037 insertions(+), 39 deletions(-)

diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index f486d42441..d1663e5a37 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -192,6 +192,30 @@ static double relation_byte_size(double tuples, int width);
 static double page_size(double tuples, int width);
 static double get_parallel_divisor(Path *path);
 
+/*
+ *  Local functions and option variables to support
+ *  semijoin pushdowns from join nodes
+ */
+static double evaluate_semijoin_filtering_rate(JoinPath *join_path,
+											   const List *hash_equijoins,
+											   const PlannerInfo *root,
+											   JoinCostWorkspace *workspace,
+											   int *best_clause,
+											   int *rows_filtered);
+static bool verify_valid_pushdown(const Path *p,
+								  const Index pushdown_target_key_no,
+								  const PlannerInfo *root);
+static TargetEntry *get_nth_targetentry(int posn,
+										const List *targetlist);
+static bool is_fk_pk(const Var *outer_var,
+					 const Var *inner_var,
+					 Oid op_oid,
+					 const PlannerInfo *root);
+static List *get_switched_clauses(List *clauses, Relids outerrelids);
+
+/* Global variables to store semijoin control options */
+bool		enable_mergejoin_semijoin_filter;
+bool		force_mergejoin_semijoin_filter;
 
 /*
  * clamp_row_est
@@ -3650,6 +3674,10 @@ initial_cost_mergejoin(PlannerInfo *root, JoinCostWorkspace *workspace,
 			innerstartsel = 0.0;
 			innerendsel = 1.0;
 		}
+		workspace->outer_min_val = cache->leftmin;
+		workspace->outer_max_val = cache->leftmax;
+		workspace->inner_min_val = cache->rightmin;
+		workspace->inner_max_val = cache->rightmax;
 	}
 	else
 	{
@@ -3811,6 +3839,10 @@ final_cost_mergejoin(PlannerInfo *root, MergePath *path,
 	double		mergejointuples,
 				rescannedtuples;
 	double		rescanratio;
+	List	   *mergeclauses_for_sjf;
+	double		filteringRate;
+	int			best_filter_clause;
+	int			rows_filtered;
 
 	/* Protect some assumptions below that rowcounts aren't zero */
 	if (inner_path_rows <= 0)
@@ -3863,6 +3895,49 @@ final_cost_mergejoin(PlannerInfo *root, MergePath *path,
 	else
 		path->skip_mark_restore = false;
 
+	if (enable_mergejoin_semijoin_filter)
+	{
+		/*
+		 * determine if merge join should use a semijoin filter. We need to
+		 * rearrange the merge clauses so they match the status of the clauses
+		 * during Plan creation.
+		 */
+		mergeclauses_for_sjf = get_actual_clauses(path->path_mergeclauses);
+		mergeclauses_for_sjf = get_switched_clauses(path->path_mergeclauses,
+													path->jpath.outerjoinpath->parent->relids);
+		filteringRate = evaluate_semijoin_filtering_rate((JoinPath *) path, mergeclauses_for_sjf, root,
+														 workspace, &best_filter_clause, &rows_filtered);
+		if (force_mergejoin_semijoin_filter ||
+			(filteringRate >= 0 && rows_filtered >= 0 && best_filter_clause >= 0))
+		{
+			/* found a valid SJF at the very least */
+			/* want at least 1000 rows_filtered to avoid any nasty edge cases */
+			if (force_mergejoin_semijoin_filter || (filteringRate >= 0.35 && rows_filtered > 1000))
+			{
+				double		improvement;
+
+				path->use_semijoinfilter = true;
+				path->best_mergeclause = best_filter_clause;
+				path->filteringRate = filteringRate;
+
+				/*
+				 * Based on experimental data, we have found that there is a
+				 * linear relationship between the estimated filtering rate
+				 * and improvement to the cost of Merge Join. In fact, this
+				 * improvement can be modeled by this equation: improvement =
+				 * 0.83 * filtering rate - 0.137 i.e., a filtering rate of 0.4
+				 * yields an improvement of 19.5%. This equation also
+				 * concludes thata a 17% filtering rate is the break-even
+				 * point, so we use 35% just be conservative. We use this
+				 * information to adjust the MergeJoin's planned cost.
+				 */
+				improvement = 0.83 * filteringRate - 0.137;
+				run_cost = (1 - improvement) * run_cost;
+				workspace->run_cost = run_cost;
+			}
+		}
+	}
+
 	/*
 	 * Get approx # tuples passing the mergequals.  We use approx_tuple_count
 	 * here because we need an estimate done with JOIN_INNER semantics.
@@ -4044,6 +4119,10 @@ cached_scansel(PlannerInfo *root, RestrictInfo *rinfo, PathKey *pathkey)
 				leftendsel,
 				rightstartsel,
 				rightendsel;
+	Datum		leftmin,
+				leftmax,
+				rightmin,
+				rightmax;
 	MemoryContext oldcontext;
 
 	/* Do we have this result already? */
@@ -4066,7 +4145,11 @@ cached_scansel(PlannerInfo *root, RestrictInfo *rinfo, PathKey *pathkey)
 					 &leftstartsel,
 					 &leftendsel,
 					 &rightstartsel,
-					 &rightendsel);
+					 &rightendsel,
+					 &leftmin,
+					 &leftmax,
+					 &rightmin,
+					 &rightmax);
 
 	/* Cache the result in suitably long-lived workspace */
 	oldcontext = MemoryContextSwitchTo(root->planner_cxt);
@@ -4080,6 +4163,10 @@ cached_scansel(PlannerInfo *root, RestrictInfo *rinfo, PathKey *pathkey)
 	cache->leftendsel = leftendsel;
 	cache->rightstartsel = rightstartsel;
 	cache->rightendsel = rightendsel;
+	cache->leftmin = leftmin;
+	cache->leftmax = leftmax;
+	cache->rightmin = rightmin;
+	cache->rightmax = rightmax;
 
 	rinfo->scansel_cache = lappend(rinfo->scansel_cache, cache);
 
@@ -6552,3 +6639,1212 @@ compute_bitmap_pages(PlannerInfo *root, RelOptInfo *baserel, Path *bitmapqual,
 
 	return pages_fetched;
 }
+
+/* The following code was modified from previous commits on the
+ * HashJoin SemiJoinFilter CR.
+ */
+
+/*
+ *  Conditionally compiled tracing is available for the semijoin
+ *  decision process.  Tracing in only included for debug builds,
+ *  and only if the TRACE_SJPD flag is defined.
+ */
+
+#define TRACE_SJPD 0
+#define DEBUG_BUILD 0
+
+#if defined(DEBUG_BUILD) && defined(TRACE_SJPD) && DEBUG_BUILD && TRACE_SJPD
+#define debug_sj1(x)	   elog(INFO, (x))
+#define debug_sj2(x,y)	   elog(INFO, (x), (y))
+#define debug_sj3(x,y,z)   elog(INFO, (x), (y), (z))
+#define debug_sj4(w,x,y,z) elog(INFO, (w), (x), (y), (z))
+#define debug_sj_md(x,y,z) debug_sj_expr_metadata((x), (y), (z))
+#else
+#define debug_sj1(x)
+#define debug_sj2(x,y)
+#define debug_sj3(x,y,z)
+#define debug_sj4(w,x,y,z)
+#define debug_sj_md(x,y,z)
+#endif
+
+
+static void
+init_expr_metadata(ExprMetadata * md)
+{
+	/* Should only be called by analyze_expr_for_metadata */
+	Assert(md);
+
+	md->is_or_maps_to_constant = false;
+	md->is_or_maps_to_base_column = false;
+	md->local_column_expr = NULL;
+	md->local_relation = NULL;
+	md->est_col_width = 0;
+	md->base_column_expr = NULL;
+	md->base_rel = NULL;
+	md->base_rel_root = NULL;
+	md->base_rel_row_count = 0.0;
+	md->base_rel_filt_row_count = 0.0;
+	md->base_col_distincts = -1.0;
+	md->est_distincts_reliable = false;
+	md->expr_est_distincts = -1.0;
+}
+
+
+#if defined(DEBUG_BUILD) && defined(TRACE_SJPD) && DEBUG_BUILD && TRACE_SJPD
+
+static void
+debug_sj_expr_metadata(const char *side, int ord, const ExprMetadata * md)
+{
+	debug_sj4("SJPD:          %s key [%d]  is constant:   %d",
+			  side, ord, md->is_or_maps_to_constant);
+	debug_sj4("SJPD:          %s key [%d]  is base col:   %d",
+			  side, ord, md->is_or_maps_to_base_column);
+	debug_sj4("SJPD:          %s key [%d]  est reliable:  %d",
+			  side, ord, md->est_distincts_reliable);
+	debug_sj4("SJPD:          %s key [%d]  trows_bf:      %.1lf",
+			  side, ord, md->base_rel_row_count);
+	debug_sj4("SJPD:          %s key [%d]  trows_af:      %.1lf",
+			  side, ord, md->base_rel_filt_row_count);
+	debug_sj4("SJPD:          %s key [%d]  bcol_dist:     %.1lf",
+			  side, ord, md->base_col_distincts);
+	debug_sj4("SJPD:          %s key [%d]  est width:     %d",
+			  side, ord, md->est_col_width);
+
+	if (md->local_relation && md->local_relation != md->base_rel)
+	{
+		debug_sj4("SJPD:          %s key [%d]  logical relid: %d",
+				  side, ord, md->local_relation->relid);
+	}
+	if (md->base_rel)
+	{
+		debug_sj4("SJPD:          %s key [%d]  base relid:    %d",
+				  side, ord, md->base_rel->relid);
+	}
+
+	if (md->base_rel && (md->base_rel->reloptkind == RELOPT_BASEREL ||
+						 md->base_rel->reloptkind == RELOPT_OTHER_MEMBER_REL))
+	{
+		/* include the column name, if we can get it  */
+		const RelOptInfo *cur_relation = md->base_rel;
+		Oid			cur_var_reloid = InvalidOid;
+		const Var  *cur_var = (const Var *) md->base_column_expr;
+
+		Assert(IsA(cur_var, Var));
+		cur_var_reloid = (planner_rt_fetch(cur_relation->relid,
+										   md->base_rel_root))->relid;
+		if (cur_var_reloid != InvalidOid && cur_var->varattno > 0)
+		{
+			const char *base_attribute_name =
+			get_attname(cur_var_reloid, md->base_column_expr->varattno,
+						true);
+			const char *base_rel_name = get_rel_name(cur_var_reloid);
+			char		name_str[260] = "";
+
+			if (base_rel_name && base_attribute_name)
+			{
+				snprintf(name_str, sizeof(name_str), "%s.%s",
+						 base_rel_name, base_attribute_name);
+			}
+			else if (base_attribute_name)
+			{
+				snprintf(name_str, sizeof(name_str), "%s",
+						 base_attribute_name);
+			}
+			if (base_attribute_name)
+			{
+				debug_sj4("SJPD:          %s key [%d]  base col name: %s",
+						  side, ord, name_str);
+			}
+		}
+	}
+	debug_sj4("SJPD:        %s key [%d]    est distincts: %.1lf",
+			  side, ord, md->expr_est_distincts);
+}
+#endif
+
+
+static double
+estimate_distincts_remaining(double original_table_row_count,
+							 double original_distinct_count,
+							 double est_row_count_after_predicates)
+{
+	/*
+	 * Estimates the number of distinct values still present within a column
+	 * after some local filtering has been applied to that table and thereby
+	 * restricted the set of relevant rows.
+	 *
+	 * This method assumes that the original_distinct_count comes from a
+	 * column whose values are uncorrelated with the row restricting
+	 * condition(s) on this table.  Other mechanisms need to be added to more
+	 * accurately handle the cases where the row restrincting condition is
+	 * directly on the current column.
+	 *
+	 * The most probable number of distinct values remaining can be computed
+	 * exactly using Yao's iterative expansion formula from: "Approximating
+	 * block accesses in database organizations", S. B. Yao, CACM, V20, N4,
+	 * April 1977, p. 260-261 However, this formula gets very expensive to
+	 * compute whenever the number of distinct values is large.
+	 *
+	 * This function instead uses a non-iterative approximation of Yao's
+	 * iterative formula from: "Estimating Block Accesses in Database
+	 * Organizations: A Closed Noniterative Formula", Kyu-Young Whang, Gio
+	 * Wiederhold, and Daniel Sagalowicz CACM V26, N11, November 1983, p.
+	 * 945-947 This approximation starts with terms for the first three
+	 * iterations of Yao's formula, and then inserts two adjustment factors
+	 * into the third term which minimize the total error related to the
+	 * missing subsequent terms.
+	 *
+	 * Internally this function uses M, N, P, and K as variables to match the
+	 * notation used in the equation in the paper.
+	 */
+	double		n = original_table_row_count;
+	double		m = original_distinct_count;
+	double		k = est_row_count_after_predicates;
+	double		p = 0.0;		/* avg rows per distinct */
+	double		result;
+
+	/* The three partial probabality terms */
+	double		term_1 = 0.0;
+	double		term_2 = 0.0;
+	double		term_3 = 0.0;
+	double		sum_terms = 0.0;
+
+	/* In debug builds, validate the sanity of the inputs  */
+	Assert(isfinite(original_table_row_count));
+	Assert(isfinite(original_distinct_count));
+	Assert(isfinite(est_row_count_after_predicates));
+	Assert(original_table_row_count >= 0.0);
+	Assert(original_distinct_count >= 0.0);
+	Assert(est_row_count_after_predicates >= 0.0);
+	Assert(original_distinct_count <= original_table_row_count);
+	Assert(est_row_count_after_predicates <= original_table_row_count);
+
+	if (n > 0.0 && m > 0.0)
+	{
+		p = (n / m);
+	}
+	Assert(isfinite(p));
+
+	if (k > (n - p))
+	{							/* All distincts almost guaranteed to still be
+								 * present */
+		result = m;
+	}
+	else if (m < 0.000001)
+	{							/* When all values are NULL, avoid division by
+								 * zero */
+		result = 0.0;
+	}
+	else if (k <= 1.000001)
+	{							/* When only one or zero rows after filtering */
+		result = k;
+	}
+	else
+	{
+		/*
+		 * When this is not a special case, compute the partial probabilities.
+		 * However, if the probability calculation overflows, then revert to
+		 * the estimate we can get from the upper bound analysis.
+		 */
+		result = fmin(original_distinct_count, est_row_count_after_predicates);
+
+		if (isfinite(1.0 / m) && isfinite(pow((1.0 - (1.0 / m)), k)))
+		{
+			term_1 = (1.0 - pow((1.0 - (1.0 / m)), k));
+
+			if (isfinite(term_1))
+			{
+				/*
+				 * As long as we at least have a usable term_1, then proceed
+				 * to the much smaller term_2 and to the even smaller term_3.
+				 *
+				 * If no usable term_1, then just use the hard upper bounds.
+				 */
+				if (isfinite(m * m * p)
+					&& isfinite(pow((1.0 - (1.0 / m)), (k - 1.0))))
+				{
+					term_2 = ((1.0 / (m * m * p))
+							  * ((k * (k - 1.0)) / 2.0)
+							  * pow((1.0 - (1.0 / m)), (k - 1.0))
+						);
+				}
+				if (!isfinite(term_2))
+				{
+					term_2 = 0.0;
+				}
+				if (isfinite(pow(m, 3.0))
+					&& isfinite(pow(p, 4.0))
+					&& isfinite(pow(m, 3.0) * pow(p, 4.0))
+					&& isfinite(k * (k - 1.0) * ((2 * k) - 1.0))
+					&& isfinite(pow((1.0 - (1.0 / m)), (k - 1.0))))
+				{
+					term_3 = ((1.5 / (pow(m, 3.0) * pow(p, 4.0)))
+							  * ((k * (k - 1.0) * ((2 * k) - 1.0)) / 6.0)
+							  * pow((1.0 - (1.0 / m)), (k - 1.0)));
+				}
+				if (!isfinite(term_3))
+				{
+					term_3 = 0.0;
+				}
+				sum_terms = term_1 + term_2 + term_3;
+
+				/* In debug builds, validate the partial probability terms */
+				Assert(term_1 <= 1.0 && term_1 >= 0.0);
+				Assert(term_2 <= 1.0 && term_2 >= 0.0);
+				Assert(term_3 <= 1.0 && term_3 >= 0.0);
+				Assert(term_1 > term_2);
+				Assert(term_2 >= term_3);
+				Assert(isfinite(sum_terms));
+				Assert(sum_terms <= 1.0);
+
+				if (isfinite(m * sum_terms))
+				{
+					result = round(m * sum_terms);
+				}
+
+				/* Ensure hard upper bounds still satisfied  */
+				result = fmin(result,
+							  fmin(original_distinct_count,
+								   est_row_count_after_predicates));
+
+				/* Since empty tables were handled above, must be >= 1 */
+				result = fmax(result, 1.0);
+			}
+		}
+	}
+
+	Assert(result >= 0.0);
+	Assert(result <= original_distinct_count);
+	Assert(result <= est_row_count_after_predicates);
+
+	return result;
+}
+
+
+static void
+gather_base_column_metadata(const Var *base_col,
+							const RelOptInfo *base_rel,
+							const PlannerInfo *root,
+							ExprMetadata * md)
+{
+	/*
+	 * Given a Var for a base column, gather metadata about that column Should
+	 * only be called indirectly under analyze_expr_for_metadata
+	 */
+	VariableStatData base_col_vardata;
+	Oid			base_col_reloid = InvalidOid;
+	bool		is_default;
+
+	/* Oid var_sortop; */
+
+	Assert(md && base_rel && root);
+	Assert(base_col && IsA(base_col, Var));
+	Assert(base_rel->reloptkind == RELOPT_BASEREL ||
+		   base_rel->reloptkind == RELOPT_OTHER_MEMBER_REL);
+	Assert(base_rel->rtekind == RTE_RELATION);
+
+	md->base_column_expr = base_col;
+	md->base_rel = base_rel;
+	md->base_rel_root = root;
+	md->is_or_maps_to_base_column = true;
+
+	examine_variable((PlannerInfo *) root, (Node *) base_col, 0,
+					 &base_col_vardata);
+	Assert(base_col_vardata.rel);
+	Assert(base_col_vardata.rel == base_rel);
+
+	md->base_rel_row_count = md->base_rel->tuples;
+	md->base_rel_filt_row_count = md->base_rel->rows;
+	md->base_col_distincts =
+		fmin(get_variable_numdistinct(&base_col_vardata, &is_default),
+			 md->base_rel_row_count);
+	md->est_distincts_reliable = !is_default;
+
+	/*
+	 * For indirectly filtered columns estimate the effect of the rows
+	 * filtered on the remaining column distinct count.
+	 */
+	md->expr_est_distincts =
+		fmax(1.0,
+			 estimate_distincts_remaining(md->base_rel_row_count,
+										  md->base_col_distincts,
+										  md->base_rel_filt_row_count));
+
+	base_col_reloid = (planner_rt_fetch(base_rel->relid, root))->relid;
+	if (base_col_reloid != InvalidOid && base_col->varattno > 0)
+	{
+		md->est_col_width =
+			get_attavgwidth(base_col_reloid, base_col->varattno);
+	}
+	ReleaseVariableStats(base_col_vardata);
+}
+
+static Expr *
+get_subquery_var_occluded_reference(const Expr *ex, const PlannerInfo *root)
+{
+	/*
+	 * Given a virtual column from an unflattened subquery, return the
+	 * expression it immediately occludes
+	 */
+	Var		   *outside_subq_var = (Var *) ex;
+	RelOptInfo *outside_subq_relation = NULL;
+	RangeTblEntry *outside_subq_rte = NULL;
+	TargetEntry *te = NULL;
+	Expr	   *inside_subq_expr = NULL;
+
+	Assert(ex && root);
+	Assert(IsA(ex, Var));
+	Assert(outside_subq_var->varno < root->simple_rel_array_size);
+
+	outside_subq_relation = root->simple_rel_array[outside_subq_var->varno];
+	outside_subq_rte = root->simple_rte_array[outside_subq_var->varno];
+
+	/*
+	 * If inheritance, subquery has append, leg of append in subquery may not
+	 * have subroot, we may be able to better process it according to
+	 * root->append_rel_list. For now just return the first leg... TODO better
+	 * handling of Union All, we only return statistics of the first leg atm.
+	 * TODO similarly, need better handling of partitioned tables, according
+	 * to outside_subq_relation->part_scheme and part_rels.
+	 */
+	if (outside_subq_rte->inh)
+	{
+		AppendRelInfo *appendRelInfo = NULL;
+
+		Assert(root->append_rel_list);
+
+		/* TODO remove this check once we add better handling of inheritance */
+		if (force_mergejoin_semijoin_filter)
+		{
+			appendRelInfo = list_nth(root->append_rel_list, 0);
+			Assert(appendRelInfo->parent_relid == outside_subq_var->varno);
+
+			Assert(appendRelInfo->translated_vars &&
+				   outside_subq_var->varattno <=
+				   list_length(appendRelInfo->translated_vars));
+			inside_subq_expr = list_nth(appendRelInfo->translated_vars,
+										outside_subq_var->varattno - 1);
+		}
+	}
+
+	/* Subquery without append and partitioned tables */
+	else
+	{
+		Assert(outside_subq_relation && IsA(outside_subq_relation, RelOptInfo));
+		Assert(outside_subq_relation->reloptkind == RELOPT_BASEREL);
+		Assert(outside_subq_relation->rtekind == RTE_SUBQUERY);
+		Assert(outside_subq_relation->subroot->processed_tlist);
+
+		te = get_nth_targetentry(outside_subq_var->varattno,
+								 outside_subq_relation->subroot->processed_tlist);
+		Assert(te && outside_subq_var->varattno == te->resno);
+		inside_subq_expr = te->expr;
+
+		/*
+		 * Strip off any Relabel present, and return the underlying expression
+		 */
+		while (inside_subq_expr && IsA(inside_subq_expr, RelabelType))
+		{
+			inside_subq_expr = ((RelabelType *) inside_subq_expr)->arg;
+		}
+	}
+
+	return inside_subq_expr;
+}
+
+
+static void
+recursively_analyze_expr_metadata(const Expr *ex,
+								  const PlannerInfo *root,
+								  ExprMetadata * md)
+{
+	/* Should only be called by analyze_expr_for_metadata, or itself */
+	Assert(md && ex && root);
+
+	if (IsA(ex, Const))
+	{
+		md->is_or_maps_to_constant = true;
+		md->expr_est_distincts = 1.0;
+		md->est_distincts_reliable = true;
+	}
+	else if (IsA(ex, RelabelType))
+	{
+		recursively_analyze_expr_metadata(((RelabelType *) ex)->arg, root, md);
+	}
+	else if (IsA(ex, Var))
+	{
+		Var		   *local_var = (Var *) ex;
+		RelOptInfo *local_relation = NULL;
+
+		Assert(local_var->varno < root->simple_rel_array_size);
+
+		/* Bail out if varno is invalid */
+		if (local_var->varno == InvalidOid)
+			return;
+
+		local_relation = root->simple_rel_array[local_var->varno];
+		Assert(local_relation && IsA(local_relation, RelOptInfo));
+
+		/*
+		 * For top level call (i.e. not a recursive invocation) cache the
+		 * relation pointer
+		 */
+		if (!md->local_relation
+			&& (local_relation->reloptkind == RELOPT_BASEREL ||
+				local_relation->reloptkind == RELOPT_OTHER_MEMBER_REL))
+		{
+			md->local_relation = local_relation;
+			md->local_column_expr = local_var;
+		}
+
+		if ((local_relation->reloptkind == RELOPT_BASEREL ||
+			 local_relation->reloptkind == RELOPT_OTHER_MEMBER_REL)
+			&& local_relation->rtekind == RTE_RELATION)
+		{
+			/* Found Var is a base column, so gather the metadata we can  */
+			gather_base_column_metadata(local_var, local_relation, root, md);
+		}
+		else if (local_relation->reloptkind == RELOPT_BASEREL
+				 && local_relation->rtekind == RTE_SUBQUERY)
+		{
+			RangeTblEntry *outside_subq_rte =
+			root->simple_rte_array[local_relation->relid];
+
+			/* root doesn't change for inheritance case, e.g. for UNION ALL */
+			const PlannerInfo *new_root = outside_subq_rte->inh ?
+			root : local_relation->subroot;
+
+			/*
+			 * Found that this Var is a subquery SELECT list item, so continue
+			 * to recurse on the occluded expression
+			 */
+			Expr	   *occluded_expr =
+			get_subquery_var_occluded_reference(ex, root);
+
+			if (occluded_expr)
+			{
+				recursively_analyze_expr_metadata(occluded_expr,
+												  new_root, md);
+			}
+		}
+	}
+}
+
+
+void
+analyze_expr_for_metadata(const Expr *ex,
+						  const PlannerInfo *root,
+						  ExprMetadata * md)
+{
+	/*
+	 * Analyze the supplied expression, and if possible, gather metadata about
+	 * it.  Currently handles: base table columns, constants, and virtual
+	 * columns from unflattened subquery blocks.  The metadata collected is
+	 * placed into the supplied ExprMetadata object.
+	 */
+	Assert(md && ex && root);
+
+	init_expr_metadata(md);
+	recursively_analyze_expr_metadata(ex, root, md);
+}
+
+
+/*
+ *  Function:  evaluate_semijoin_filtering_rate
+ *
+ *  Given a merge join path, determine two things.
+ *  First, can a Bloom filter based semijoin be created on the
+ *  outer scan relation and checked on the inner scan relation to
+ *  filter out rows from the inner relation? And second, if this
+ *  is possible, determine the single equijoin condition that is most
+ *  useful as well as the estimated filtering rate of the filter.
+ *
+ *  The output args, inner_semijoin_keys and
+ *  outer_semijoin_keys, will each contain a single key column
+ *  from one of the hash equijoin conditions. probe_semijoin_keys
+ *  contains keys from the target relation to probe the semijoin filter.
+ *
+ *  A potential semijoin will be deemed valid only if all
+ *  of the following are true:
+ *    a) The enable_mergejoin_semijoin_filter option is set true
+ *    b) The equijoin key from the outer side is or maps
+ *       to a base table column
+ *    c) The equijoin key from the inner side is or maps to
+ *       a base column
+ *
+ *  A potential semijoin will be deemed useful only if the
+ *  force_mergejoin_semijoin_filter is set true, or if all of the
+ *  following are true:
+ *    a) The equijoin key base column from the outer side has
+ *       reliable metadata (i.e. ANALYZE was done on it)
+ *    b) The key column(s) from the outer side equijoin keys
+ *       have width metadata available.
+ *    c) The estimated outer side key column width(s) are not
+ *       excessively wide.
+ *    d) The equijoin key from the inner side either:
+ *         1) maps to a base column with reliable metadata, or
+ *         2) is constrained by the incoming estimated tuple
+ *            count to have a distinct count smaller than the
+ *            outer side key column's distinct count.
+ *    e) The semijoin must be estimated to filter at least some of
+ *       the rows from the inner relation. However, the exact filtering
+ *       rate where the semijoin is deemed useful is determined by the
+ *       mergejoin cost model itself, not this function.
+ *
+ *  If there is more than one equijoin condition, we favor the one with the
+ *  higher estimated filtering rate.
+ *
+ *  If this function finds an appropriate semijoin, it will
+ *  allocate a PushdownSemijoinMetadata object to store the
+ *  semijoin metadata, and then attach it to the Join plan node.
+ */
+#define MAX_SEMIJOIN_SINGLE_KEY_WIDTH	  128
+
+static double
+evaluate_semijoin_filtering_rate(JoinPath *join_path,
+								 const List *equijoin_list,
+								 const PlannerInfo *root,
+								 JoinCostWorkspace *workspace,
+								 int *best_clause,
+								 int *rows_filtered)
+{
+	const Path *outer_path;
+	const Path *inner_path;
+	ListCell   *equijoin_lc = NULL;
+	int			equijoin_ordinal = -1;
+	int			best_single_col_sj_ordinal = -1;
+	double		best_sj_selectivity = 1.01;
+	double		best_sj_inner_rows_filtered = -1.0;
+	int			num_md;
+	ExprMetadata *outer_md_array = NULL;
+	ExprMetadata *inner_md_array = NULL;
+
+	Assert(equijoin_list);
+	Assert(list_length(equijoin_list) > 0);
+
+	if (!enable_mergejoin_semijoin_filter && !force_mergejoin_semijoin_filter)
+	{
+		return 0;				/* option setting disabled semijoin insertion  */
+	}
+
+	num_md = list_length(equijoin_list);
+	outer_md_array = alloca(sizeof(ExprMetadata) * num_md);
+	inner_md_array = alloca(sizeof(ExprMetadata) * num_md);
+	if (!outer_md_array || !inner_md_array)
+	{
+		return 0;				/* a stack array allocation failed  */
+	}
+
+	outer_path = join_path->outerjoinpath;
+	inner_path = join_path->innerjoinpath;
+
+	debug_sj1("SJPD:  start evaluate_semijoin_filtering_rate");
+	debug_sj2("SJPD:	  join inner est rows:     %.1lf",
+			  inner_path->rows);
+	debug_sj2("SJPD:	  join outer est rows:     %.1lf",
+			  outer_path->rows);
+
+	/*
+	 * Consider each of the individual equijoin conditions as a possible basis
+	 * for creating a semijoin condition
+	 */
+	foreach(equijoin_lc, equijoin_list)
+	{
+		OpExpr	   *equijoin;
+		Node	   *outer_equijoin_arg = NULL;
+		ExprMetadata *outer_arg_md = NULL;
+		Node	   *inner_equijoin_arg = NULL;
+		ExprMetadata *inner_arg_md = NULL;
+		double		est_sj_selectivity = 1.01;
+		double		est_sj_inner_rows_filtered = -1.0;
+
+		equijoin_ordinal++;
+		equijoin = (OpExpr *) lfirst(equijoin_lc);
+
+		Assert(IsA(equijoin, OpExpr));
+		Assert(list_length(equijoin->args) == 2);
+
+		outer_equijoin_arg = linitial(equijoin->args);
+		outer_arg_md = &(outer_md_array[equijoin_ordinal]);
+		analyze_expr_for_metadata((Expr *) outer_equijoin_arg,
+								  root, outer_arg_md);
+
+		inner_equijoin_arg = llast(equijoin->args);
+		inner_arg_md = &(inner_md_array[equijoin_ordinal]);
+		analyze_expr_for_metadata((Expr *) inner_equijoin_arg,
+								  root, inner_arg_md);
+
+		debug_sj2("SJPD:	  equijoin condition [%d]", equijoin_ordinal);
+		debug_sj_md("outer", equijoin_ordinal, outer_arg_md);
+		debug_sj_md("inner", equijoin_ordinal, inner_arg_md);
+
+		if (outer_arg_md->base_column_expr &&
+			inner_arg_md->base_column_expr)
+		{
+			/*
+			 * If outer key - inner key has FK/PK relationship to each other
+			 * and there is no restriction on the primary key side, the
+			 * semijoin filter will be useless, we should bail out, even if
+			 * the force_semijoin_push_down guc is set. There might be
+			 * exceptions, if the outer key has restrictions on the key
+			 * variable, but we won't be able to tell until the Plan level. We
+			 * will be conservative and assume that an FK/PK relationship will
+			 * yield a useless filter.
+			 */
+			if (
+				is_fk_pk(outer_arg_md->base_column_expr,
+						 inner_arg_md->base_column_expr,
+						 equijoin->opno, root))
+			{
+				debug_sj2("SJPD:        inner and outer equijoin columns %s",
+						  "are PK/FK; semijoin would not be useful");
+				continue;
+			}
+		}
+
+		/* Now see if we can push a semijoin to its source scan node  */
+		if (!outer_arg_md->local_column_expr || !inner_arg_md->local_column_expr)
+		{
+			debug_sj2("SJPD:        could not find a local outer or inner column to%s",
+					  " use as semijoin basis; semijoin is not valid");
+			continue;			/* Continue on to the next equijoin condition  */
+		}
+
+		if (!verify_valid_pushdown((Path *) (join_path->innerjoinpath),
+								   inner_arg_md->local_column_expr->varno, root))
+		{
+			debug_sj2("SJPD:        could not find a place to evaluate %s",
+					  "a semijoin condition; semijoin is not valid");
+			continue;			/* Continue on to the next equijoin condition  */
+		}
+
+		/*
+		 * Adjust cached estimated inner key distinct counts down using the
+		 * inner side tuple count as an upper bound
+		 */
+		inner_arg_md->expr_est_distincts =
+			fmax(1.0, fmin(inner_path->rows,
+						   inner_arg_md->expr_est_distincts));
+
+		/*
+		 * We need to estimate the outer key distinct count as close as
+		 * possible to the where the semijoin filter will actually be applied,
+		 * ignoring the effects of any indirect filtering that would occur
+		 * after the semijoin.
+		 */
+		outer_arg_md->expr_est_distincts =
+			fmax(1.0, fmin(outer_path->rows,
+						   outer_arg_md->expr_est_distincts));
+
+		/* Next, see if this equijoin is valid as a semijoin basis */
+		if (!outer_arg_md->is_or_maps_to_base_column
+			&& !inner_arg_md->is_or_maps_to_constant)
+		{
+			debug_sj2("SJPD:        outer equijoin arg does not map %s",
+					  "to a base column nor a constant; semijoin is not valid");
+			continue;			/* Continue on to the next equijoin condition  */
+		}
+		if (!inner_arg_md->is_or_maps_to_base_column
+			&& !inner_arg_md->is_or_maps_to_constant)
+		{
+			debug_sj2("SJPD:        inner equijoin arg maps to neither %s",
+					  "a base column; semijoin is not valid");
+			continue;			/* Continue on to the next equijoin condition  */
+		}
+
+		/*
+		 * If force_mergejoin_semijoin_filter is used, set the default clause
+		 * as the first valid one.
+		 */
+		if (force_mergejoin_semijoin_filter && best_single_col_sj_ordinal == -1)
+		{
+			best_single_col_sj_ordinal = equijoin_ordinal;
+		}
+		/* Now we know it's valid, see if this potential semijoin is useful */
+		if (!outer_arg_md->est_distincts_reliable)
+		{
+			debug_sj2("SJPD:        outer equijoin column's distinct %s",
+					  "estimates are not reliable; condition rejected");
+			continue;			/* Continue on to the next equijoin condition  */
+		}
+		if (outer_arg_md->est_col_width == 0)
+		{
+			debug_sj2("SJPD:        outer equijoin column's width %s",
+					  "could not be estimated; condition rejected");
+			continue;			/* Continue on to the next equijoin condition  */
+		}
+		if (outer_arg_md->est_col_width > MAX_SEMIJOIN_SINGLE_KEY_WIDTH)
+		{
+			debug_sj2("SJPD:        outer equijoin column's width %s",
+					  "was excessive; condition rejected");
+			continue;			/* Continue on to the next equijoin condition  */
+		}
+		if (!(outer_arg_md->is_or_maps_to_constant
+			  || (inner_arg_md->is_or_maps_to_base_column
+				  && inner_arg_md->est_distincts_reliable)
+			  || (inner_path->rows
+				  < outer_arg_md->expr_est_distincts)))
+		{
+			debug_sj2("SJPD:        inner equijoin arg does not have %s",
+					  "a reliable distinct count; condition rejected");
+			continue;			/* Continue on to the next equijoin condition  */
+		}
+
+		/*
+		 * We now try to estimate the filtering rate (1 minus selectivity) and
+		 * rows filtered of the filter. We first start by finding the ranges
+		 * of both the outer and inner var, and find the overlap between these
+		 * ranges. We assume an equal distribution of variables among this
+		 * range, and we can then calculate the amount of filtering our SJF
+		 * would do.
+		 */
+		if (workspace->inner_min_val > workspace->outer_max_val
+			|| workspace->inner_max_val < workspace->outer_min_val)
+		{
+			/*
+			 * This would mean that the outer and inner tuples are completely
+			 * disjoin from each other. We will not be as optimistic, and just
+			 * assign a filtering rate of 95%.
+			 */
+			est_sj_selectivity = 0.05;	/* selectivity is 1 minus filtering
+										 * rate */
+			est_sj_inner_rows_filtered = 0.95 * inner_arg_md->base_rel_filt_row_count;
+		}
+		else
+		{
+#define APPROACH_1_DAMPENING_FACTOR 0.8
+#define APPROACH_2_DAMPENING_FACTOR 0.66
+			/*
+			 * There are two approaches to estimating the filtering rate. We
+			 * have already outlined the first approach above, finding the
+			 * range and assuming an equal distribution. For the second
+			 * approach, we do not assume anything about the distribution, but
+			 * compare the number of distincts. If, for example, the inner
+			 * relation has 1000 distincts and the outer has 500, then there
+			 * is guaranteed to be at least 500 rows filtered from the inner
+			 * relation, regardless of the data distribution. We make an
+			 * assumption here that the distribution of distinct variables is
+			 * equal to the distribution of all rows so we can multiply by the
+			 * ratio of duplicate values. We then take the geometric mean of
+			 * these two approaches for our final estimated filtering rate. We
+			 * also multiply these values by dampening factors, which we have
+			 * found via experimentation and probably need fine-tuning.
+			 */
+			double		approach_1_selectivity; /* finding selectivity instead
+												 * of filtering rate for
+												 * legacy code reasons */
+			double		approach_2_selectivity;
+			double		inner_overlapping_range = workspace->outer_max_val - workspace->inner_min_val;
+
+			/* we are assuming an equal distribution of val's */
+			double		inner_overlapping_ratio = inner_overlapping_range / inner_arg_md->base_rel_filt_row_count;
+
+			Assert(inner_overlapping_ratio >= 0 && inner_overlapping_ratio <= 1);
+
+			/*
+			 * testing has found that this method is generaly over-optimistic,
+			 * so we multiply by a dampening effect.
+			 */
+			approach_1_selectivity = inner_overlapping_ratio * APPROACH_1_DAMPENING_FACTOR;
+			if (inner_arg_md->expr_est_distincts > outer_arg_md->expr_est_distincts)
+			{
+				int			inner_more_distincts = inner_arg_md->expr_est_distincts - outer_arg_md->expr_est_distincts;
+
+				approach_2_selectivity = 1 - ((double) inner_more_distincts) / inner_arg_md->expr_est_distincts;
+
+				/*
+				 * testing has found that this method is generaly
+				 * over-optimistic, so we multiply by a dampening effect.
+				 */
+				approach_2_selectivity = 1 - ((1 - approach_2_selectivity) * APPROACH_2_DAMPENING_FACTOR);
+			}
+			else
+			{
+				/*
+				 * This means that the outer relation has the same or more
+				 * distincts than the inner relation, which is not good for
+				 * our filtering rate. We will assume a base filtering rate of
+				 * 10% in this case.
+				 */
+				approach_2_selectivity = 0.9;
+			}
+			est_sj_selectivity = sqrt(approach_1_selectivity * approach_2_selectivity);
+			est_sj_inner_rows_filtered = (1 - est_sj_selectivity) * inner_arg_md->base_rel_filt_row_count;
+		}
+		est_sj_selectivity = fmin(1.0, est_sj_selectivity);
+		est_sj_inner_rows_filtered = fmax(1.0, est_sj_inner_rows_filtered);
+
+		debug_sj2("SJPD:        eligible semijoin selectivity:    %.7lf",
+				  est_sj_selectivity);
+		debug_sj2("SJPD:        eligible semijoin rows filtered:  %.7lf",
+				  est_sj_inner_rows_filtered);
+
+		if (est_sj_selectivity < best_sj_selectivity)
+		{
+			debug_sj1("SJPD:        found most useful semijoin seen so far");
+			best_sj_selectivity = est_sj_selectivity;
+			best_sj_inner_rows_filtered = est_sj_inner_rows_filtered;
+			best_single_col_sj_ordinal = equijoin_ordinal;
+		}
+		else
+		{						/* This semijoin was rejected, so explain why  */
+			debug_sj2("SJPD:        found useful single column semijoin; %s",
+					  "not as useful as best found so far, so rejected");
+		}
+	}
+
+	if (best_single_col_sj_ordinal != -1)
+	{
+		debug_sj2("SJPD:      best single column sj selectivity:     %.7lf",
+				  best_sj_selectivity);
+		debug_sj2("SJPD:      best single column rows filtered:      %.7lf",
+				  best_sj_inner_rows_filtered);
+	}
+
+	debug_sj1("SJPD:  finish evaluate_semijoin_filtering_rate");
+	*best_clause = best_single_col_sj_ordinal;
+	*rows_filtered = best_sj_inner_rows_filtered;
+	return 1 - best_sj_selectivity;
+}
+
+/*
+ *  Determine whether a semijoin condition could be pushed from the join
+ *  all the way to the leaf scan node.
+ *
+ *  Parameters:
+ *  node: path node to be considered for semijoin push down.
+ *  target_var:  the inner side join key for a potential semijoin.
+ *  target_relids: relids of all target leaf relations,
+ *  	used only for partitioned table.
+ */
+static bool
+verify_valid_pushdown(const Path *path,
+					  const Index target_var_no,
+					  const PlannerInfo *root)
+{
+	Assert(path);
+	Assert(target_var_no > 0);
+
+	if (path == NULL)
+	{
+		return false;
+	}
+
+	/* Guard against stack overflow due to overly complex plan trees */
+	check_stack_depth();
+
+	switch (path->pathtype)
+	{
+			/* directly push through these paths */
+		case T_Material:
+			{
+				return verify_valid_pushdown(((MaterialPath *) path)->subpath, target_var_no, root);
+			}
+		case T_Gather:
+			{
+				return verify_valid_pushdown(((GatherPath *) path)->subpath, target_var_no, root);
+			}
+		case T_GatherMerge:
+			{
+				return verify_valid_pushdown(((GatherMergePath *) path)->subpath, target_var_no, root);
+			}
+		case T_Sort:
+			{
+				return verify_valid_pushdown(((SortPath *) path)->subpath, target_var_no, root);
+			}
+		case T_Unique:
+			{
+				return verify_valid_pushdown(((UniquePath *) path)->subpath, target_var_no, root);
+			}
+
+		case T_Agg:
+			{					/* We can directly push bloom through GROUP
+								 * BYs and DISTINCTs, as long as there are no
+								 * grouping sets. However, we cannot validate
+								 * this fact until the Plan has been created.
+								 * We will push through for now, but verify
+								 * again during Plan creation. */
+				return verify_valid_pushdown(((AggPath *) path)->subpath, target_var_no, root);
+			}
+
+		case T_Append:
+		case T_SubqueryScan:
+			{
+				/*
+				 * Both append and subquery paths are currently unimplemented,
+				 * so we will just return false, but theoretically there are
+				 * ways to check if a filter can be pushed through them. The
+				 * previous HashJoin CR has implemented these cases, but that
+				 * code is run these after the plan has been created, so code
+				 * will need to be adjusted to do it during Path evaluation.
+				 */
+				return false;
+			}
+
+			/* Leaf nodes */
+		case T_IndexScan:
+		case T_BitmapHeapScan:
+			{
+				/*
+				 * We could definitely implement pushdown filters for Index
+				 * and Bitmap Scans, but currently it is only implemented for
+				 * SeqScan. For now, we return false.
+				 */
+				return false;
+			}
+		case T_SeqScan:
+			{
+				if (path->parent->relid == target_var_no)
+				{
+					/*
+					 * Found source of target var! We know that the pushdown
+					 * is valid now.
+					 */
+					return true;
+				}
+				return false;
+			}
+
+		case T_NestLoop:
+		case T_MergeJoin:
+		case T_HashJoin:
+			{
+				/*
+				 * since this is going to be a sub-join, we can push through
+				 * both sides and don't need to worry about left/right/inner
+				 * joins.
+				 */
+				JoinPath   *join = (JoinPath *) path;
+
+				return verify_valid_pushdown(join->outerjoinpath, target_var_no, root) ||
+					verify_valid_pushdown(join->innerjoinpath, target_var_no, root);
+			}
+
+		default:
+			{
+				return false;
+			}
+	}
+}
+
+static TargetEntry *
+get_nth_targetentry(int n, const List *targetlist)
+{
+	int			i = 1;
+	ListCell   *lc = NULL;
+
+	Assert(n > 0);
+	Assert(targetlist && nodeTag(targetlist) == T_List);
+	Assert(list_length(targetlist) >= n);
+
+	if (targetlist && list_length(targetlist) >= n)
+	{
+		foreach(lc, targetlist)
+		{
+			if (i == n)
+			{
+				TargetEntry *te = lfirst(lc);
+
+				return te;
+			}
+			i++;
+		}
+	}
+	return NULL;
+}
+
+/*
+ * expressions_match_foreign_key
+ *		True if the given con_exprs, ref_exprs and operators will exactlty
+ *      	reflect the expressions referenced by the given foreign key fk.
+ *
+ * Note: This function expects con_exprs and ref_exprs to only contain Var types.
+ *       Expression indexes are not supported by foreign keys.
+ */
+bool
+expressions_match_foreign_key(ForeignKeyOptInfo *fk,
+							  List *con_exprs,
+							  List *ref_exprs,
+							  List *operators)
+{
+	ListCell   *lc;
+	ListCell   *lc2;
+	ListCell   *lc3;
+	int			col;
+	Bitmapset  *all_vars;
+	Bitmapset  *matched_vars;
+	int			idx;
+
+	Assert(list_length(con_exprs) == list_length(ref_exprs));
+	Assert(list_length(con_exprs) == list_length(operators));
+
+	/*
+	 * Fast path out if there's not enough conditions to match each column in
+	 * the foreign key. Note that we cannot check that the number of
+	 * expressions are equal here since it would cause any expressions which
+	 * are duplicated not to match.
+	 */
+	if (list_length(con_exprs) < fk->nkeys)
+		return false;
+
+	/*
+	 * We need to ensure that each item in con_exprs/ref_exprs can be matched
+	 * to a foreign key column in the actual foreign key data fk. We can do
+	 * this by looping over each fk column and checking that we find a
+	 * matching con_expr/ref_expr in con_exprs/ref_exprs. This method does not
+	 * however, allow us to ensure that there are no additional items in
+	 * con_exprs/ref_exprs that have not been matched. To remedy this we will
+	 * create 2 bitmapsets, one which will keep track of all of the vars, the
+	 * other which will keep track of the vars that we have matched. After
+	 * matching is complete, we will ensure that these bitmapsets are equal to
+	 * ensure we have complete mapping in both directions (fk cols to vars and
+	 * vars to fk cols)
+	 */
+	all_vars = NULL;
+	matched_vars = NULL;
+
+	/*
+	 * Build a bitmapset which tracks all vars by their index
+	 */
+	for (idx = 0; idx < list_length(con_exprs); idx++)
+		all_vars = bms_add_member(all_vars, idx);
+
+	for (col = 0; col < fk->nkeys; col++)
+	{
+		bool		matched = false;
+
+		idx = 0;
+
+		forthree(lc, con_exprs, lc2, ref_exprs, lc3, operators)
+		{
+			Var		   *con_expr = (Var *) lfirst(lc);
+			Var		   *ref_expr = (Var *) lfirst(lc2);
+			Oid			opr = lfirst_oid(lc3);
+
+			Assert(IsA(con_expr, Var));
+			Assert(IsA(ref_expr, Var));
+
+			/* Does this join qual match up to the current fkey column? */
+			if (fk->conkey[col] == con_expr->varattno &&
+				fk->confkey[col] == ref_expr->varattno &&
+				equality_ops_are_compatible(opr, fk->conpfeqop[col]))
+			{
+				matched = true;
+
+				/* mark the index of this var as matched */
+				matched_vars = bms_add_member(matched_vars, idx);
+
+				/*
+				 * Don't break here as there may be duplicate expressions that
+				 * match this column that we also need to mark as matched
+				 */
+			}
+			idx++;
+		}
+
+		/*
+		 * can't remove a join if there's no match to fkey column on join
+		 * condition.
+		 */
+		if (!matched)
+			return false;
+	}
+
+	/*
+	 * Ensure that we managed to match every var in con_var/ref_var to a
+	 * foreign key constraint.
+	 */
+	if (!bms_equal(all_vars, matched_vars))
+		return false;
+	return true;
+}
+
+/*
+ * Determine if the given outer and inner Exprs satisfy any fk-pk
+ * relationship.
+ */
+static bool
+is_fk_pk(const Var *outer_var,
+		 const Var *inner_var,
+		 Oid op_oid,
+		 const PlannerInfo *root)
+{
+	ListCell   *lc = NULL;
+	List	   *outer_key_list = list_make1((Var *) outer_var);
+	List	   *inner_key_list = list_make1((Var *) inner_var);
+	List	   *operators = list_make1_oid(op_oid);
+
+	foreach(lc, root->fkey_list)
+	{
+		ForeignKeyOptInfo *fk = (ForeignKeyOptInfo *) lfirst(lc);
+
+		if (expressions_match_foreign_key(fk,
+										  outer_key_list,
+										  inner_key_list,
+										  operators))
+		{
+			return true;
+		}
+	}
+
+	return false;
+}
+
+/*
+ * get_switched_clauses
+ *	  Given a list of merge or hash joinclauses (as RestrictInfo nodes),
+ *	  extract the bare clauses, and rearrange the elements within the
+ *	  clauses, if needed, so the outer join variable is on the left and
+ *	  the inner is on the right.  The original clause data structure is not
+ *	  touched; a modified list is returned.  We do, however, set the transient
+ *	  outer_is_left field in each RestrictInfo to show which side was which.
+ */
+static List *
+get_switched_clauses(List *clauses, Relids outerrelids)
+{
+	List	   *t_list = NIL;
+	ListCell   *l;
+
+	foreach(l, clauses)
+	{
+		RestrictInfo *restrictinfo = (RestrictInfo *) lfirst(l);
+		OpExpr	   *clause = (OpExpr *) restrictinfo->clause;
+
+		Assert(is_opclause(clause));
+
+		/* TODO: handle the case where the operator doesn't hava a commutator */
+		if (bms_is_subset(restrictinfo->right_relids, outerrelids)
+			&& OidIsValid(get_commutator(clause->opno)))
+		{
+			/*
+			 * Duplicate just enough of the structure to allow commuting the
+			 * clause without changing the original list.  Could use
+			 * copyObject, but a complete deep copy is overkill.
+			 */
+			OpExpr	   *temp = makeNode(OpExpr);
+
+			temp->opno = clause->opno;
+			temp->opfuncid = InvalidOid;
+			temp->opresulttype = clause->opresulttype;
+			temp->opretset = clause->opretset;
+			temp->opcollid = clause->opcollid;
+			temp->inputcollid = clause->inputcollid;
+			temp->args = list_copy(clause->args);
+			temp->location = clause->location;
+			/* Commute it --- note this modifies the temp node in-place. */
+			CommuteOpExpr(temp);
+			t_list = lappend(t_list, temp);
+			restrictinfo->outer_is_left = false;
+		}
+		else
+		{
+			/*
+			 * TODO: check if Assert(bms_is_subset(restrictinfo->left_relids,
+			 * outerrelids)) is necessary.
+			 */
+			t_list = lappend(t_list, clause);
+			restrictinfo->outer_is_left = true;
+		}
+	}
+	return t_list;
+}
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index ab4d8e201d..98a673b9b9 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -315,6 +315,36 @@ static ModifyTable *make_modifytable(PlannerInfo *root, Plan *subplan,
 static GatherMerge *create_gather_merge_plan(PlannerInfo *root,
 											 GatherMergePath *best_path);
 
+/*
+ *  Local functions and option variables to support
+ *  semijoin pushdowns from join nodes
+ */
+static int	depth_of_semijoin_target(Plan *pn,
+									 const Var *target_var,
+									 Bitmapset *target_relids,
+									 int cur_depth,
+									 const PlannerInfo *root,
+									 Plan **target_node);
+static bool is_side_of_join_source_of_var(const Plan *pn,
+										  bool testing_outer_side,
+										  const Var *target_var);
+static bool
+			is_table_scan_node_source_of_relids_or_var(const Plan *pn,
+													   const Var *target_var,
+													   Bitmapset *target_relids);
+static int	position_of_var_in_targetlist(const Var *target_var,
+										  const List *targetlist);
+static TargetEntry *get_nth_targetentry(int posn,
+										const List *targetlist);
+static void get_partition_table_relids(RelOptInfo *rel,
+									   Bitmapset **target_relids);
+static int	get_appendrel_occluded_references(const Expr *ex,
+											  Expr **occluded_exprs,
+											  int num_exprs,
+											  const PlannerInfo *root);
+static Expr *get_subquery_var_occluded_reference(const Expr *ex,
+												 const PlannerInfo *root);
+
 
 /*
  * create_plan
@@ -4691,6 +4721,39 @@ create_mergejoin_plan(PlannerInfo *root,
 	/* Costs of sort and material steps are included in path cost already */
 	copy_generic_path_info(&join_plan->join.plan, &best_path->jpath.path);
 
+	/* Check if we should attach a pushdown semijoin to this join */
+	if (best_path->use_semijoinfilter)
+	{
+		if (best_path->best_mergeclause != -1)
+		{
+			ListCell   *clause_cell = list_nth_cell(mergeclauses, best_path->best_mergeclause);
+			OpExpr	   *joinclause = (OpExpr *) lfirst(clause_cell);
+			Node	   *outer_join_arg = linitial(joinclause->args);
+			Node	   *inner_join_arg = llast(joinclause->args);
+			ExprMetadata *outer_arg_md = (ExprMetadata *) palloc0(sizeof(ExprMetadata));
+			ExprMetadata *inner_arg_md = (ExprMetadata *) palloc0(sizeof(ExprMetadata));
+			int			outer_depth;
+			int			inner_depth;
+
+			Assert(IsA(joinclause, OpExpr));
+			Assert(list_length(joinclause->args) == 2);
+			analyze_expr_for_metadata((Expr *) outer_join_arg, root, outer_arg_md);
+			analyze_expr_for_metadata((Expr *) inner_join_arg, root, inner_arg_md);
+			outer_depth = depth_of_semijoin_target((Plan *) join_plan,
+												   outer_arg_md->local_column_expr, NULL, 0, root, &join_plan->buildingNode);
+			inner_depth = depth_of_semijoin_target((Plan *) join_plan,
+												   inner_arg_md->local_column_expr, NULL, 0, root, &join_plan->checkingNode);
+			if (inner_depth > -1 && outer_depth > -1)
+			{
+				join_plan->applySemiJoinFilter = true;
+				join_plan->bestExpr = best_path->best_mergeclause;
+				join_plan->filteringRate = best_path->filteringRate;
+			}
+			pfree(outer_arg_md);
+			pfree(inner_arg_md);
+		}
+	}
+
 	return join_plan;
 }
 
@@ -7216,3 +7279,551 @@ is_projection_capable_plan(Plan *plan)
 	}
 	return true;
 }
+
+/*
+ *  Determine whether a semijoin condition could be pushed from the join
+ *  all the way to the leaf scan node.  If so, determine the number of
+ *  nodes between the join and the scan node (inclusive of the scan node).
+ *  If the search was terminated by a node the semijoin could not be
+ *  pushed through, the function returns -1.
+ *
+ *  Parameters:
+ *  node: plan node to be considered for semijoin push down.
+ *  target_var:  the outer side join key for a potential semijoin.
+ *  target_relids: relids of all target leaf relations,
+ *  	used only for partitioned table.
+ *  cur_depth: current depth from the root hash join plan node.
+ *  target_node: stores the target plan node where filter will be applied
+ */
+static int
+depth_of_semijoin_target(Plan *pn,
+						 const Var *target_var,
+						 Bitmapset *target_relids,
+						 int cur_depth,
+						 const PlannerInfo *root,
+						 Plan **target_node)
+{
+	int			depth = -1;
+
+	Assert(pn);
+	Assert(target_var && IsA(target_var, Var));
+	Assert(target_var->varno > 0);
+
+	if (pn == NULL)
+	{
+		return -1;
+	}
+
+	/* Guard against stack overflow due to overly complex plan trees */
+	check_stack_depth();
+
+	switch (nodeTag(pn))
+	{
+		case T_Hash:
+		case T_Material:
+		case T_Gather:
+		case T_GatherMerge:
+		case T_Sort:
+		case T_Unique:
+			{					/* Directly push bloom through these node
+								 * types  */
+				depth = depth_of_semijoin_target(pn->lefttree, target_var,
+												 target_relids, cur_depth + 1, root, target_node);
+				break;
+			}
+
+		case T_Agg:
+			{					/* Directly push bloom through GROUP BYs and
+								 * DISTINCTs, as long as there are no grouping
+								 * sets */
+				Agg		   *agg_pn = (Agg *) pn;
+
+				if (!agg_pn->groupingSets
+					|| list_length(agg_pn->groupingSets) == 0)
+				{
+					depth = depth_of_semijoin_target(pn->lefttree, target_var,
+													 target_relids, cur_depth + 1,
+													 root, target_node);
+				}
+				break;
+			}
+
+		case T_SubqueryScan:
+			{
+				/*
+				 * Directly push semijoin into subquery if we can, but we need
+				 * to map the target var to the occluded expression within the
+				 * SELECT list of the subquery
+				 */
+				SubqueryScan *subq_scan = (SubqueryScan *) pn;
+				RelOptInfo *rel = NULL;
+				RangeTblEntry *rte = NULL;
+				Var		   *subq_target_var = NULL;
+
+				/*
+				 * To travel into a subquery we need to use the subquery's
+				 * PlannerInfo, the root of subquery's plan tree, and the
+				 * subquery's SELECT list item that was occluded by the Var
+				 * used within this query block
+				 */
+				rte = root->simple_rte_array[subq_scan->scan.scanrelid];
+				Assert(rte);
+				Assert(rte->subquery);
+				Assert(rte->rtekind == RTE_SUBQUERY);
+				Assert(rte->subquery->targetList);
+
+				rel = find_base_rel((PlannerInfo *) root,
+									subq_scan->scan.scanrelid);
+				Assert(rel->rtekind == RTE_SUBQUERY);
+				Assert(rel->subroot);
+
+				if (rel && rel->subroot
+					&& rte && rte->subquery && rte->subquery->targetList)
+				{
+					/* Find the target_var's occluded expression */
+					Expr	   *occluded_expr =
+					get_subquery_var_occluded_reference((Expr *) target_var,
+														root);
+
+					if (occluded_expr && IsA(occluded_expr, Var))
+					{
+						subq_target_var = (Var *) occluded_expr;
+						if (subq_target_var->varno > 0)
+							depth = depth_of_semijoin_target(subq_scan->subplan,
+															 subq_target_var,
+															 target_relids,
+															 cur_depth + 1,
+															 rel->subroot,
+															 target_node);
+					}
+				}
+				break;
+			}
+
+			/* Either from a partitioned table or Union All */
+		case T_Append:
+			{
+				int			max_depth = -1;
+				Append	   *append = (Append *) pn;
+				RelOptInfo *rel = NULL;
+				RangeTblEntry *rte = NULL;
+
+				rte = root->simple_rte_array[target_var->varno];
+				rel = find_base_rel((PlannerInfo *) root, target_var->varno);
+
+				if (rte->inh && append->appendplans)
+				{
+					int			num_exprs = list_length(append->appendplans);
+					Expr	  **occluded_exprs = alloca(num_exprs * sizeof(Expr *));
+					int			idx = 0;
+					ListCell   *lc = NULL;
+
+					/* Partitioned table */
+					if (rel->part_scheme && rel->part_rels)
+					{
+						get_partition_table_relids(rel, &target_relids);
+
+						foreach(lc, append->appendplans)
+						{
+							Plan	   *appendplan = (Plan *) lfirst(lc);
+
+							depth = depth_of_semijoin_target(appendplan,
+															 target_var,
+															 target_relids,
+															 cur_depth + 1,
+															 root,
+															 target_node);
+
+							if (depth > max_depth)
+								max_depth = depth;
+						}
+					}
+					/* Union All, not partitioned table */
+					else if (num_exprs == get_appendrel_occluded_references(
+																			(Expr *) target_var,
+																			occluded_exprs,
+																			num_exprs,
+																			root))
+					{
+						Var		   *subq_target_var = NULL;
+
+						foreach(lc, append->appendplans)
+						{
+							Expr	   *occluded_expr = occluded_exprs[idx++];
+							Plan	   *appendplan = (Plan *) lfirst(lc);
+
+							if (occluded_expr && IsA(occluded_expr, Var))
+							{
+								subq_target_var = (Var *) occluded_expr;
+
+								depth = depth_of_semijoin_target(appendplan,
+																 subq_target_var,
+																 target_relids,
+																 cur_depth + 1,
+																 root,
+																 target_node);
+
+								if (depth > max_depth)
+									max_depth = depth;
+							}
+						}
+					}
+				}
+				depth = max_depth;
+				break;
+			}
+
+			/* Leaf nodes */
+		case T_IndexScan:
+		case T_BitmapHeapScan:
+			{
+				return -1;
+			}
+		case T_SeqScan:
+			{
+				if (is_table_scan_node_source_of_relids_or_var(pn, target_var, target_relids))
+				{
+					/* Found ultimate source of the join key!  */
+					*target_node = pn;
+					depth = cur_depth;
+				}
+				break;
+			}
+
+		case T_NestLoop:
+		case T_MergeJoin:
+		case T_HashJoin:
+			{
+				/*
+				 * pn->path_jointype is not always the same as join->jointype.
+				 * Avoid using pn->path_jointype when you need accurate
+				 * jointype, use join->jointype instead.
+				 */
+				Join	   *join = (Join *) pn;
+
+				/*
+				 * Push bloom filter to outer node if (target relation is
+				 * under the outer plan node, decided by
+				 * is_side_of_join_source_of_var() ) and either the following
+				 * condition satisfies: 1. this is an inner join or semi join
+				 * 2. this is a root right join 3. this is an intermediate
+				 * left join
+				 */
+				if (is_side_of_join_source_of_var(pn, true, target_var))
+				{
+					if (join->jointype == JOIN_INNER
+						|| join->jointype == JOIN_SEMI
+						|| (join->jointype == JOIN_RIGHT && cur_depth == 0)
+						|| (join->jointype == JOIN_LEFT && cur_depth > 0))
+					{
+						depth = depth_of_semijoin_target(pn->lefttree, target_var,
+														 target_relids, cur_depth + 1, root,
+														 target_node);
+					}
+				}
+				else
+				{
+					/*
+					 * Push bloom filter to inner node if (target rel is under
+					 * the inner node, decided by
+					 * is_side_of_join_source_of_var() ), and either the
+					 * following condition satisfies: 1. this is an inner join
+					 * or semi join 2. this is an intermediate right join
+					 */
+					Assert(is_side_of_join_source_of_var(pn, false, target_var));
+					if (join->jointype == JOIN_INNER
+						|| join->jointype == JOIN_SEMI
+						|| (join->jointype == JOIN_RIGHT && cur_depth > 0))
+					{
+						depth = depth_of_semijoin_target(pn->righttree, target_var,
+														 target_relids, cur_depth + 1, root,
+														 target_node);
+					}
+				}
+				break;
+			}
+
+		default:
+			{					/* For all other node types, just bail out and
+								 * apply the semijoin filter somewhere above
+								 * this node. */
+				depth = -1;
+			}
+	}
+	return depth;
+}
+
+static bool
+is_side_of_join_source_of_var(const Plan *pn,
+							  bool testing_outer_side,
+							  const Var *target_var)
+{
+	/* Determine if target_var is from the indicated child of the join  */
+	Plan	   *target_child = NULL;
+
+	Assert(pn);
+	Assert(target_var && nodeTag(target_var) == T_Var);
+	Assert(nodeTag(pn) == T_NestLoop || nodeTag(pn) == T_MergeJoin
+		   || nodeTag(pn) == T_HashJoin);
+
+	if (testing_outer_side)
+	{
+		target_child = pn->lefttree;
+	}
+	else
+	{
+		target_child = pn->righttree;
+	}
+
+	return (position_of_var_in_targetlist(target_var,
+										  target_child->targetlist) >= 0);
+}
+
+/*
+ * Determine if this scan node is the source of the specified relids,
+ * or the source of the specified var if target_relids is not given.
+ */
+static bool
+is_table_scan_node_source_of_relids_or_var(const Plan *pn,
+										   const Var *target_var,
+										   Bitmapset *target_relids)
+{
+	Scan	   *scan_node = (Scan *) pn;
+	Index		scan_node_varno = 0;
+
+	Assert(pn);
+	Assert(target_var && nodeTag(target_var) == T_Var);
+	Assert(nodeTag(pn) == T_SeqScan || nodeTag(pn) == T_IndexScan
+		   || nodeTag(pn) == T_BitmapHeapScan);
+
+	scan_node_varno = scan_node->scanrelid;
+
+	if (target_relids)
+	{
+		return bms_is_member(scan_node_varno, target_relids);
+	}
+	else if (scan_node_varno == target_var->varno)
+	{
+		/*
+		 * This should never be called for a column that is not being
+		 * projected at it's table scan node
+		 */
+		Assert(position_of_var_in_targetlist(target_var, pn->targetlist) >= 0);
+
+		return true;
+	}
+
+	return false;
+}
+
+static int
+position_of_var_in_targetlist(const Var *target_var, const List *targetlist)
+{
+	ListCell   *lc = NULL;
+	int			i = 1;
+
+	Assert(target_var && nodeTag(target_var) == T_Var);
+	Assert(targetlist && nodeTag(targetlist) == T_List);
+
+	if (targetlist && target_var)
+	{
+		foreach(lc, targetlist)
+		{
+			TargetEntry *te = lfirst(lc);
+
+			if (IsA(te->expr, Var))
+			{
+				Var		   *cur_var = (Var *) te->expr;
+
+				if (cur_var->varno == target_var->varno
+					&& cur_var->varattno == target_var->varattno)
+				{
+					return i;
+				}
+			}
+			i++;
+		}
+	}
+	return -1;
+}
+
+/*
+ * Recursively gather all relids of the given partitioned table rel.
+ */
+static void
+get_partition_table_relids(RelOptInfo *rel, Bitmapset **target_relids)
+{
+	int			i;
+
+	Assert(rel->part_scheme && rel->part_rels);
+
+	for (i = 0; i < rel->nparts; i++)
+	{
+		RelOptInfo *part_rel = rel->part_rels[i];
+
+		if (part_rel->part_scheme && part_rel->part_rels)
+		{
+			get_partition_table_relids(part_rel, target_relids);
+		}
+		else
+		{
+			*target_relids = bms_union(*target_relids,
+									   part_rel->relids);
+		}
+	}
+}
+
+/*
+ *  Given a virtual column from an Union ALL subquery,
+ *  return the expression it immediately occludes that satisfy
+ *  the inheritance condition,
+ *  i.e. appendRelInfo->parent_relid == outside_subq_var->varno
+ */
+static int
+get_appendrel_occluded_references(const Expr *ex,
+								  Expr **occluded_exprs,
+								  int num_exprs,
+								  const PlannerInfo *root)
+{
+	Var		   *outside_subq_var = (Var *) ex;
+	RangeTblEntry *outside_subq_rte = NULL;
+	int			idx = 0;
+
+
+	Assert(ex && root);
+	Assert(IsA(ex, Var));
+	Assert(outside_subq_var->varno < root->simple_rel_array_size);
+
+	outside_subq_rte = root->simple_rte_array[outside_subq_var->varno];
+
+	/* System Vars have varattno < 0, don't bother */
+	if (outside_subq_var->varattno <= 0)
+		return 0;
+
+	/*
+	 * If inheritance, subquery has append, leg of append in subquery may not
+	 * have subroot, process it according to root->append_rel_list.
+	 */
+	if (outside_subq_rte->inh)
+	{
+		ListCell   *lc = NULL;
+
+		Assert(root->append_rel_list &&
+			   num_exprs <= list_length(root->append_rel_list));
+
+		foreach(lc, root->append_rel_list)
+		{
+			AppendRelInfo *appendRelInfo = lfirst(lc);
+
+			if (appendRelInfo->parent_relid == outside_subq_var->varno)
+			{
+				Assert(appendRelInfo->translated_vars &&
+					   outside_subq_var->varattno <=
+					   list_length(appendRelInfo->translated_vars));
+
+				occluded_exprs[idx++] =
+					list_nth(appendRelInfo->translated_vars,
+							 outside_subq_var->varattno - 1);
+			}
+		}
+	}
+
+	return idx;
+}
+
+static Expr *
+get_subquery_var_occluded_reference(const Expr *ex, const PlannerInfo *root)
+{
+	/*
+	 * Given a virtual column from an unflattened subquery, return the
+	 * expression it immediately occludes
+	 */
+	Var		   *outside_subq_var = (Var *) ex;
+	RelOptInfo *outside_subq_relation = NULL;
+	RangeTblEntry *outside_subq_rte = NULL;
+	TargetEntry *te = NULL;
+	Expr	   *inside_subq_expr = NULL;
+
+	Assert(ex && root);
+	Assert(IsA(ex, Var));
+	Assert(outside_subq_var->varno < root->simple_rel_array_size);
+
+	outside_subq_relation = root->simple_rel_array[outside_subq_var->varno];
+	outside_subq_rte = root->simple_rte_array[outside_subq_var->varno];
+
+	/*
+	 * If inheritance, subquery has append, leg of append in subquery may not
+	 * have subroot, we may be able to better process it according to
+	 * root->append_rel_list. For now just return the first leg... TODO better
+	 * handling of Union All, we only return statistics of the first leg atm.
+	 * TODO similarly, need better handling of partitioned tables, according
+	 * to outside_subq_relation->part_scheme and part_rels.
+	 */
+	if (outside_subq_rte->inh)
+	{
+		AppendRelInfo *appendRelInfo = NULL;
+
+		Assert(root->append_rel_list);
+
+		/* TODO remove this check once we add better handling of inheritance */
+		appendRelInfo = list_nth(root->append_rel_list, 0);
+		Assert(appendRelInfo->parent_relid == outside_subq_var->varno);
+
+		Assert(appendRelInfo->translated_vars &&
+			   outside_subq_var->varattno <=
+			   list_length(appendRelInfo->translated_vars));
+		inside_subq_expr = list_nth(appendRelInfo->translated_vars,
+									outside_subq_var->varattno - 1);
+	}
+
+	/* Subquery without append and partitioned tables */
+	else
+	{
+		Assert(outside_subq_relation && IsA(outside_subq_relation, RelOptInfo));
+		Assert(outside_subq_relation->reloptkind == RELOPT_BASEREL);
+		Assert(outside_subq_relation->rtekind == RTE_SUBQUERY);
+		Assert(outside_subq_relation->subroot->processed_tlist);
+
+		te = get_nth_targetentry(outside_subq_var->varattno,
+								 outside_subq_relation->subroot->processed_tlist);
+		Assert(te && outside_subq_var->varattno == te->resno);
+		inside_subq_expr = te->expr;
+
+		/*
+		 * Strip off any Relabel present, and return the underlying expression
+		 */
+		while (inside_subq_expr && IsA(inside_subq_expr, RelabelType))
+		{
+			inside_subq_expr = ((RelabelType *) inside_subq_expr)->arg;
+		}
+	}
+
+	return inside_subq_expr;
+}
+
+
+static TargetEntry *
+get_nth_targetentry(int n, const List *targetlist)
+{
+	int			i = 1;
+	ListCell   *lc = NULL;
+
+	Assert(n > 0);
+	Assert(targetlist && nodeTag(targetlist) == T_List);
+	Assert(list_length(targetlist) >= n);
+
+	if (targetlist && list_length(targetlist) >= n)
+	{
+		foreach(lc, targetlist)
+		{
+			if (i == n)
+			{
+				TargetEntry *te = lfirst(lc);
+
+				return te;
+			}
+			i++;
+		}
+	}
+	return NULL;
+}
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 1808388397..b62de16899 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -2905,7 +2905,9 @@ void
 mergejoinscansel(PlannerInfo *root, Node *clause,
 				 Oid opfamily, int strategy, bool nulls_first,
 				 Selectivity *leftstart, Selectivity *leftend,
-				 Selectivity *rightstart, Selectivity *rightend)
+				 Selectivity *rightstart, Selectivity *rightend,
+				 Datum *leftmin, Datum *leftmax,
+				 Datum *rightmin, Datum *rightmax)
 {
 	Node	   *left,
 			   *right;
@@ -2925,10 +2927,6 @@ mergejoinscansel(PlannerInfo *root, Node *clause,
 				revltop,
 				revleop;
 	bool		isgt;
-	Datum		leftmin,
-				leftmax,
-				rightmin,
-				rightmax;
 	double		selec;
 
 	/* Set default results if we can't figure anything out. */
@@ -3075,20 +3073,20 @@ mergejoinscansel(PlannerInfo *root, Node *clause,
 	if (!isgt)
 	{
 		if (!get_variable_range(root, &leftvar, lstatop, collation,
-								&leftmin, &leftmax))
+								leftmin, leftmax))
 			goto fail;			/* no range available from stats */
 		if (!get_variable_range(root, &rightvar, rstatop, collation,
-								&rightmin, &rightmax))
+								rightmin, rightmax))
 			goto fail;			/* no range available from stats */
 	}
 	else
 	{
 		/* need to swap the max and min */
 		if (!get_variable_range(root, &leftvar, lstatop, collation,
-								&leftmax, &leftmin))
+								leftmax, leftmin))
 			goto fail;			/* no range available from stats */
 		if (!get_variable_range(root, &rightvar, rstatop, collation,
-								&rightmax, &rightmin))
+								rightmax, rightmin))
 			goto fail;			/* no range available from stats */
 	}
 
@@ -3098,13 +3096,13 @@ mergejoinscansel(PlannerInfo *root, Node *clause,
 	 * non-default estimates, else stick with our 1.0.
 	 */
 	selec = scalarineqsel(root, leop, isgt, true, collation, &leftvar,
-						  rightmax, op_righttype);
+						  *rightmax, op_righttype);
 	if (selec != DEFAULT_INEQ_SEL)
 		*leftend = selec;
 
 	/* And similarly for the right variable. */
 	selec = scalarineqsel(root, revleop, isgt, true, collation, &rightvar,
-						  leftmax, op_lefttype);
+						  *leftmax, op_lefttype);
 	if (selec != DEFAULT_INEQ_SEL)
 		*rightend = selec;
 
@@ -3128,13 +3126,13 @@ mergejoinscansel(PlannerInfo *root, Node *clause,
 	 * our own default.
 	 */
 	selec = scalarineqsel(root, ltop, isgt, false, collation, &leftvar,
-						  rightmin, op_righttype);
+						  *rightmin, op_righttype);
 	if (selec != DEFAULT_INEQ_SEL)
 		*leftstart = selec;
 
 	/* And similarly for the right variable. */
 	selec = scalarineqsel(root, revltop, isgt, false, collation, &rightvar,
-						  leftmin, op_lefttype);
+						  *leftmin, op_lefttype);
 	if (selec != DEFAULT_INEQ_SEL)
 		*rightstart = selec;
 
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index fda3f9befb..084cfdf11f 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -885,6 +885,26 @@ struct config_bool ConfigureNamesBool[] =
 		true,
 		NULL, NULL, NULL
 	},
+	{
+		{"enable_mergejoin_semijoin_filter", PGC_USERSET, QUERY_TUNING_METHOD,
+			gettext_noop("Enables the planner's use of using Semijoin Bloom filters during merge join."),
+			NULL,
+			GUC_EXPLAIN
+		},
+		&enable_mergejoin_semijoin_filter,
+		false,
+		NULL, NULL, NULL
+	},
+	{
+		{"force_mergejoin_semijoin_filter", PGC_USERSET, QUERY_TUNING_METHOD,
+			gettext_noop("Forces the planner's use of using Semijoin Bloom filters during merge join. Overrides enable_mergejoin_semijoin_filter."),
+			NULL,
+			GUC_EXPLAIN
+		},
+		&force_mergejoin_semijoin_filter,
+		false,
+		NULL, NULL, NULL
+	},
 	{
 		{"enable_hashjoin", PGC_USERSET, QUERY_TUNING_METHOD,
 			gettext_noop("Enables the planner's use of hash join plans."),
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 2ae76e5cfb..1f3d2c772a 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -379,6 +379,8 @@
 #enable_material = on
 #enable_memoize = on
 #enable_mergejoin = on
+#enable_mergejoin_semijoin_filter = on
+#force_mergejoin_semijoin_filter = on
 #enable_nestloop = on
 #enable_parallel_append = on
 #enable_parallel_hash = on
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 294cfe9c47..663069455e 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -2014,6 +2014,9 @@ typedef struct MergePath
 	List	   *innersortkeys;	/* keys for explicit sort, if any */
 	bool		skip_mark_restore;	/* can executor skip mark/restore? */
 	bool		materialize_inner;	/* add Materialize to inner? */
+	bool		use_semijoinfilter; /* should we use a semijoin filter? */
+	double		filteringRate;	/* estimated filtering rate of SJF */
+	int			best_mergeclause;	/* best clause to build SJF on */
 } MergePath;
 
 /*
@@ -2591,6 +2594,12 @@ typedef struct MergeScanSelCache
 	Selectivity leftendsel;		/* last-join fraction for clause left side */
 	Selectivity rightstartsel;	/* first-join fraction for clause right side */
 	Selectivity rightendsel;	/* last-join fraction for clause right side */
+
+	Datum		leftmin;		/* min and max values for left and right
+								 * clauses */
+	Datum		leftmax;
+	Datum		rightmin;
+	Datum		rightmax;
 } MergeScanSelCache;
 
 /*
@@ -3138,6 +3147,10 @@ typedef struct JoinCostWorkspace
 	Cardinality inner_rows;
 	Cardinality outer_skip_rows;
 	Cardinality inner_skip_rows;
+	Datum		outer_min_val;
+	Datum		outer_max_val;
+	Datum		inner_min_val;
+	Datum		inner_max_val;
 
 	/* private for cost_hashjoin code */
 	int			numbuckets;
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 21e642a64c..a418a406af 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -22,6 +22,7 @@
 #include "nodes/lockoptions.h"
 #include "nodes/parsenodes.h"
 #include "nodes/primnodes.h"
+#include "optimizer/pathnode.h"
 
 
 /* ----------------------------------------------------------------
@@ -844,6 +845,13 @@ typedef struct MergeJoin
 
 	/* per-clause nulls ordering */
 	bool	   *mergeNullsFirst pg_node_attr(array_size(mergeclauses));
+
+	/* fields for using a SemiJoinFilter */
+	bool		applySemiJoinFilter;
+	double		filteringRate;
+	int			bestExpr;
+	Plan	   *buildingNode;
+	Plan	   *checkingNode;
 } MergeJoin;
 
 /* ----------------
diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h
index f27d11eaa9..cca624a1d8 100644
--- a/src/include/optimizer/cost.h
+++ b/src/include/optimizer/cost.h
@@ -61,6 +61,8 @@ extern PGDLLIMPORT bool enable_nestloop;
 extern PGDLLIMPORT bool enable_material;
 extern PGDLLIMPORT bool enable_memoize;
 extern PGDLLIMPORT bool enable_mergejoin;
+extern PGDLLIMPORT bool enable_mergejoin_semijoin_filter;
+extern PGDLLIMPORT bool force_mergejoin_semijoin_filter;
 extern PGDLLIMPORT bool enable_hashjoin;
 extern PGDLLIMPORT bool enable_gathermerge;
 extern PGDLLIMPORT bool enable_partitionwise_join;
@@ -213,4 +215,49 @@ extern PathTarget *set_pathtarget_cost_width(PlannerInfo *root, PathTarget *targ
 extern double compute_bitmap_pages(PlannerInfo *root, RelOptInfo *baserel,
 								   Path *bitmapqual, int loop_count, Cost *cost, double *tuple);
 
+/*
+ *  Container for metadata about an expression, used by semijoin decision logic
+ */
+typedef struct expr_metadata
+{
+	bool		is_or_maps_to_constant;
+	bool		is_or_maps_to_base_column;
+
+	/* Var and relation from the current query block, if it is a Var */
+	const Var  *local_column_expr;
+	const RelOptInfo *local_relation;
+
+	int32		est_col_width;
+
+	/*
+	 * The following will be the same as local Var and relation when the local
+	 * relation is a base table (i.e. no occluding query blocks).  Otherwise
+	 * it will be the occluded base column, if the final occluded expression
+	 * is a base column.
+	 */
+	const Var  *base_column_expr;
+	const RelOptInfo *base_rel;
+	const PlannerInfo *base_rel_root;
+	double		base_rel_row_count;
+	double		base_rel_filt_row_count;
+	double		base_col_distincts;
+	Datum		base_col_min_value;
+	Datum		base_col_max_value;
+
+	/* True if the distinct est is based on something meaningful  */
+	bool		est_distincts_reliable;
+	bool		est_minmax_reliable;
+
+	/* Estimated distincts after local filtering, and row count adjustments */
+	double		expr_est_distincts;
+}			ExprMetadata;
+
+extern void analyze_expr_for_metadata(const Expr *ex,
+									  const PlannerInfo *root,
+									  ExprMetadata * md);
+extern bool expressions_match_foreign_key(ForeignKeyOptInfo *fk,
+										  List *con_exprs,
+										  List *ref_exprs,
+										  List *operators);
+
 #endif							/* COST_H */
diff --git a/src/include/utils/selfuncs.h b/src/include/utils/selfuncs.h
index d485b9bfcd..2d67efd295 100644
--- a/src/include/utils/selfuncs.h
+++ b/src/include/utils/selfuncs.h
@@ -208,7 +208,9 @@ extern Selectivity rowcomparesel(PlannerInfo *root,
 extern void mergejoinscansel(PlannerInfo *root, Node *clause,
 							 Oid opfamily, int strategy, bool nulls_first,
 							 Selectivity *leftstart, Selectivity *leftend,
-							 Selectivity *rightstart, Selectivity *rightend);
+							 Selectivity *rightstart, Selectivity *rightend,
+							 Datum *leftmin, Datum *leftmax,
+							 Datum *rightmin, Datum *rightmax);
 
 extern double estimate_num_groups(PlannerInfo *root, List *groupExprs,
 								  double input_rows, List **pgset,
diff --git a/src/test/regress/expected/sysviews.out b/src/test/regress/expected/sysviews.out
index 4e775af175..c2c3946e95 100644
--- a/src/test/regress/expected/sysviews.out
+++ b/src/test/regress/expected/sysviews.out
@@ -109,30 +109,31 @@ select count(*) = 0 as ok from pg_stat_wal_receiver;
 -- This is to record the prevailing planner enable_foo settings during
 -- a regression test run.
 select name, setting from pg_settings where name like 'enable%';
-              name              | setting 
---------------------------------+---------
- enable_async_append            | on
- enable_bitmapscan              | on
- enable_gathermerge             | on
- enable_group_by_reordering     | on
- enable_hashagg                 | on
- enable_hashjoin                | on
- enable_incremental_sort        | on
- enable_indexonlyscan           | on
- enable_indexscan               | on
- enable_material                | on
- enable_memoize                 | on
- enable_mergejoin               | on
- enable_nestloop                | on
- enable_parallel_append         | on
- enable_parallel_hash           | on
- enable_partition_pruning       | on
- enable_partitionwise_aggregate | off
- enable_partitionwise_join      | off
- enable_seqscan                 | on
- enable_sort                    | on
- enable_tidscan                 | on
-(21 rows)
+               name               | setting 
+----------------------------------+---------
+ enable_async_append              | on
+ enable_bitmapscan                | on
+ enable_gathermerge               | on
+ enable_group_by_reordering       | on
+ enable_hashagg                   | on
+ enable_hashjoin                  | on
+ enable_incremental_sort          | on
+ enable_indexonlyscan             | on
+ enable_indexscan                 | on
+ enable_material                  | on
+ enable_memoize                   | on
+ enable_mergejoin                 | on
+ enable_mergejoin_semijoin_filter | off
+ enable_nestloop                  | on
+ enable_parallel_append           | on
+ enable_parallel_hash             | on
+ enable_partition_pruning         | on
+ enable_partitionwise_aggregate   | off
+ enable_partitionwise_join        | off
+ enable_seqscan                   | on
+ enable_sort                      | on
+ enable_tidscan                   | on
+(22 rows)
 
 -- Test that the pg_timezone_names and pg_timezone_abbrevs views are
 -- more-or-less working.  We can't test their contents in any great detail
-- 
2.37.1

0003-Support-semijoin-filter-in-the-executor-for-parallel.patchapplication/octet-stream; name=0003-Support-semijoin-filter-in-the-executor-for-parallel.patchDownload

From 0fb9ec964369c2af246cfc53e3866e3beb2b7ecc Mon Sep 17 00:00:00 2001
From: Lyu Pan <lyup@amazon.com>
Date: Fri, 16 Sep 2022 21:28:52 +0000
Subject: [PATCH 3/5] Support semijoin filter in the executor for parallel
 mergejoin.

Each worker process create its own bloom filter, and updates its own bloom filter during the outer/left plan scan (no lock is needed). After all processes finish the scan, the bloom filtered are merged together (by performing OR operations on the bit arrays) into a shared bloom filter in the Dynamic Shared Memory area (use lock for synchronization). After this is completed, the merged bloom filter will be copied back to all the worker processes, and be used in the inner/right plan scan for filtering.
---
 src/backend/executor/execScan.c      |   3 +-
 src/backend/executor/nodeMergejoin.c | 221 +++++++++++++++++++++++++--
 src/backend/executor/nodeSeqscan.c   | 173 ++++++++++++++++++++-
 src/backend/lib/bloomfilter.c        |  74 +++++++++
 src/include/executor/nodeMergejoin.h |   4 +-
 src/include/lib/bloomfilter.h        |   7 +
 src/include/nodes/execnodes.h        |  31 ++++
 7 files changed, 495 insertions(+), 18 deletions(-)

diff --git a/src/backend/executor/execScan.c b/src/backend/executor/execScan.c
index 4b97c39455..954d99392b 100644
--- a/src/backend/executor/execScan.c
+++ b/src/backend/executor/execScan.c
@@ -215,7 +215,8 @@ ExecScan(ScanState *node,
 			{
 				SemiJoinFilterFinishScan(
 										 ((SeqScanState *) node)->semiJoinFilters,
-										 node->ss_currentRelation->rd_id);
+										 node->ss_currentRelation->rd_id,
+										 node->ps.state->es_query_dsa);
 			}
 			if (projInfo)
 				return ExecClearTuple(projInfo->pi_state.resultslot);
diff --git a/src/backend/executor/nodeMergejoin.c b/src/backend/executor/nodeMergejoin.c
index 81c253aad5..d8fb352313 100644
--- a/src/backend/executor/nodeMergejoin.c
+++ b/src/backend/executor/nodeMergejoin.c
@@ -99,6 +99,10 @@
 #include "executor/nodeMergejoin.h"
 #include "lib/bloomfilter.h"
 #include "miscadmin.h"
+#include "storage/dsm.h"
+#include "storage/lwlock.h"
+#include "storage/shm_toc.h"
+#include "utils/dsa.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 
@@ -659,6 +663,26 @@ ExecMergeJoin(PlanState *pstate)
 				outerTupleSlot = ExecProcNode(outerPlan);
 				node->mj_OuterTupleSlot = outerTupleSlot;
 
+				/*
+				 * Check if outer plan has an SJF and the inner plan does not.
+				 * This case will only arise during parallel execution, when
+				 * the outer plan is initialized with the SJF but the inner
+				 * plan does not because it is not included in the memory
+				 * copied over during worker creation. If this is the case,
+				 * then push down the filter to the inner plan level to
+				 * correct this error and then proceed as normal.
+				 */
+				if (GetSemiJoinFilter(outerPlan, pstate->plan->plan_node_id)
+					&& !GetSemiJoinFilter(innerPlan,
+										  pstate->plan->plan_node_id))
+				{
+					SemiJoinFilter *sjf = GetSemiJoinFilter(outerPlan,
+															pstate->plan->plan_node_id);
+
+					PushDownFilter(innerPlan, NULL, sjf, &sjf->checkingId,
+								   NULL);
+				}
+
 				/* Compute join values and check for unmatchability */
 				switch (MJEvalOuterValues(node))
 				{
@@ -1819,11 +1843,14 @@ PushDownDirection(PlanState *node)
 	}
 }
 
-/* Recursively pushes down the filter until an appropriate SeqScan node is reached. Then, it
- * verifies if that SeqScan node is the one we want to push the filter to, and if it is, then
- * appends the SJF to the node. */
+/*
+ * Recursively pushes down the filter until an appropriate SeqScan node is
+ * reached. Then, it verifies if that SeqScan node is the one we want to push
+ * the filter to, and if it is, then appends the SJF to the node.
+ */
 void
-PushDownFilter(PlanState *node, Plan *plan, SemiJoinFilter * sjf, Oid *relId, int64 *nodeRows)
+PushDownFilter(PlanState *node, Plan *plan, SemiJoinFilter * sjf, Oid *relId,
+			   int64 *nodeRows)
 {
 	if (node == NULL)
 	{
@@ -1838,8 +1865,8 @@ PushDownFilter(PlanState *node, Plan *plan, SemiJoinFilter * sjf, Oid *relId, in
 		Assert(IsA(scan, SeqScanState));
 
 		/*
-		 * found the right Scan node that we want to apply the filter onto via
-		 * matching relId
+		 * Found the right Scan node that we want to apply the filter onto via
+		 * matching relId.
 		 */
 		if (scan->ss.ss_currentRelation->rd_id == *relId)
 		{
@@ -1876,13 +1903,15 @@ PushDownFilter(PlanState *node, Plan *plan, SemiJoinFilter * sjf, Oid *relId, in
 }
 
 /*
- * If this table is the building-side table for the SemiJoinFilter, adds the element to
- * the bloom filter and always returns true. If this table is the checking-side table for the SemiJoinFilter,
- * then checks the element against the bloom filter and returns true if the element is (probably) in the set,
- * and false if the element is not in the bloom filter.
+ * If this table is the building-side table for the SemiJoinFilter, adds the
+ * element to the bloom filter and always returns true. If this table is the
+ * checking-side table for the SemiJoinFilter, then checks the element
+ * against the bloom filter and returns true if the element is (probably)
+ * in the set, and false if the element is not in the bloom filter.
  */
 bool
-SemiJoinFilterExamineSlot(List *semiJoinFilters, TupleTableSlot *slot, Oid tableId)
+SemiJoinFilterExamineSlot(List *semiJoinFilters,
+						  TupleTableSlot *slot, Oid tableId)
 {
 	ListCell   *cell;
 
@@ -1903,7 +1932,8 @@ SemiJoinFilterExamineSlot(List *semiJoinFilters, TupleTableSlot *slot, Oid table
 			 * include multiple join keys.
 			 */
 			val = slot->tts_values[sjf->buildingAttr];
-			bloom_add_element(sjf->filter, (unsigned char *) &val, sizeof(val));
+			bloom_add_element(sjf->filter, (unsigned char *) &val,
+							  sizeof(val));
 		}
 		else if (sjf->doneBuilding && tableId == sjf->checkingId)
 		{
@@ -1912,7 +1942,8 @@ SemiJoinFilterExamineSlot(List *semiJoinFilters, TupleTableSlot *slot, Oid table
 			slot_getsomeattrs(slot, sjf->checkingAttr + 1);
 			sjf->elementsChecked++;
 			val = slot->tts_values[sjf->checkingAttr];
-			if (bloom_lacks_element(sjf->filter, (unsigned char *) &val, sizeof(val)))
+			if (bloom_lacks_element(sjf->filter,
+									(unsigned char *) &val, sizeof(val)))
 			{
 				sjf->elementsFiltered++;
 				return false;
@@ -1923,7 +1954,8 @@ SemiJoinFilterExamineSlot(List *semiJoinFilters, TupleTableSlot *slot, Oid table
 }
 
 void
-SemiJoinFilterFinishScan(List *semiJoinFilters, Oid tableId)
+SemiJoinFilterFinishScan(List *semiJoinFilters, Oid tableId,
+						 dsa_area *parallel_area)
 {
 	ListCell   *cell;
 
@@ -1933,7 +1965,166 @@ SemiJoinFilterFinishScan(List *semiJoinFilters, Oid tableId)
 
 		if (!sjf->doneBuilding && tableId == sjf->buildingId)
 		{
-			sjf->doneBuilding = true;
+			if (!sjf->isParallel)
+			{
+				/*
+				 * Not parallel, so only one process running and that process
+				 * is now complete.
+				 */
+				sjf->doneBuilding = true;
+			}
+			else
+			{
+				/* parallel, so need to sync with the other processes */
+				SemiJoinFilterParallelState *parallelState =
+				(SemiJoinFilterParallelState *) dsa_get_address(
+																parallel_area, sjf->parallelState);
+				bloom_filter *shared_bloom =
+				(bloom_filter *) dsa_get_address(
+												 parallel_area, parallelState->bloom_dsa_address);
+
+				/*
+				 * This process takes control of the lock and updates the
+				 * shared bloom filter. These locks are created by the
+				 * SemiJoinFilterParallelState and are unique to that struct.
+				 */
+				LWLockAcquire(&parallelState->lock, LW_EXCLUSIVE);
+				parallelState->elementsAdded += sjf->elementsAdded;
+				add_to_filter(shared_bloom, sjf->filter);
+				parallelState->workersDone++;
+				LWLockRelease(&parallelState->lock);
+
+				/*
+				 * We need to wait until all threads have had their chance to
+				 * update the shared bloom filter, since our next step is to
+				 * copy the finished bloom filter back into all of the
+				 * separate processes.
+				 */
+				if (parallelState->workersDone == parallelState->numProcesses)
+				{
+					LWLockUpdateVar(&parallelState->secondlock,
+									&parallelState->lockStop, 1);
+				}
+				LWLockWaitForVar(&parallelState->secondlock,
+								 &parallelState->lockStop, 0,
+								 &parallelState->lockStop);
+
+				/*
+				 * Now the shared Bloom filter is fully updated, so each
+				 * individual process copies the finished Bloom filter to the
+				 * local SemiJoinFilter.
+				 */
+				LWLockAcquire(&parallelState->lock, LW_EXCLUSIVE);
+				replace_bitset(sjf->filter, shared_bloom);
+				sjf->elementsAdded = parallelState->elementsAdded;
+				sjf->doneBuilding = true;
+				parallelState->workersDone++;
+				LWLockRelease(&parallelState->lock);
+
+				/*
+				 * Again, we need to wait for all processes to finish copying
+				 * the completed bloom filter because the main process will
+				 * free the shared memory afterwards.
+				 */
+				if (parallelState->workersDone ==
+					2 * parallelState->numProcesses)
+				{
+					LWLockUpdateVar(&parallelState->secondlock,
+									&parallelState->lockStop, 2);
+				}
+				LWLockWaitForVar(&parallelState->secondlock,
+								 &parallelState->lockStop, 1,
+								 &parallelState->lockStop);
+				/* release allocated shared memory in main process */
+				if (!sjf->isWorker)
+				{
+					LWLockRelease(&parallelState->secondlock);
+					bloom_free_in_dsa(parallel_area,
+									  parallelState->bloom_dsa_address);
+					dsa_free(parallel_area, sjf->parallelState);
+				}
+			}
 		}
 	}
 }
+
+dsa_pointer
+CreateFilterParallelState(dsa_area *area, SemiJoinFilter * sjf,
+						  int sjf_num)
+{
+	dsa_pointer bloom_dsa_address =
+	bloom_create_in_dsa(area, sjf->num_elements, sjf->work_mem, sjf->seed);
+	dsa_pointer parallel_address =
+	dsa_allocate0(area, sizeof(SemiJoinFilterParallelState));
+	SemiJoinFilterParallelState *parallelState =
+	(SemiJoinFilterParallelState *) dsa_get_address(area,
+													parallel_address);
+
+	/* copy over information to parallel state */
+	parallelState->doneBuilding = sjf->doneBuilding;
+	parallelState->seed = sjf->seed;
+	parallelState->num_elements = sjf->num_elements;
+	parallelState->work_mem = sjf->work_mem;
+	parallelState->buildingId = sjf->buildingId;
+	parallelState->checkingId = sjf->checkingId;
+	parallelState->buildingAttr = sjf->buildingAttr;
+	parallelState->checkingAttr = sjf->checkingAttr;
+	parallelState->bloom_dsa_address = bloom_dsa_address;
+	parallelState->sjf_num = sjf_num;
+	parallelState->mergejoin_plan_id = sjf->mergejoin_plan_id;
+	/* initialize locks */
+	LWLockInitialize(&parallelState->lock, LWLockNewTrancheId());
+	LWLockInitialize(&parallelState->secondlock, LWLockNewTrancheId());
+	/* should be main process that acquires lock */
+	LWLockAcquire(&parallelState->secondlock, LW_EXCLUSIVE);
+	return parallel_address;
+}
+
+/*
+ * Checks a side of the execution tree and fetches an SJF if its mergejoin
+ * plan ID matches that of the method's mergejoin ID. Used during parallel
+ * execution, where SJF information is lost during information copying to
+ * the worker.
+ */
+SemiJoinFilter *
+GetSemiJoinFilter(PlanState *node, int plan_id)
+{
+	if (node == NULL)
+	{
+		return NULL;
+	}
+	check_stack_depth();
+	if (node->type == T_SeqScanState)
+	{
+		SeqScanState *scan = (SeqScanState *) node;
+
+		Assert(IsA(scan, SeqScanState));
+		if (scan->applySemiJoinFilter)
+		{
+			ListCell   *lc;
+
+			foreach(lc, scan->semiJoinFilters)
+			{
+				SemiJoinFilter *sjf = (SemiJoinFilter *) lfirst(lc);
+
+				if (sjf->mergejoin_plan_id == plan_id)
+				{
+					return sjf;
+				}
+			}
+			return NULL;
+		}
+	}
+	if (PushDownDirection(node) == 1)
+	{
+		/* check both children and return the non-null one */
+		return GetSemiJoinFilter(node->lefttree, plan_id) != NULL ?
+			GetSemiJoinFilter(node->lefttree, plan_id) :
+			GetSemiJoinFilter(node->righttree, plan_id);
+	}
+	if (PushDownDirection(node) == 0)
+	{
+		return GetSemiJoinFilter(node->lefttree, plan_id);
+	}
+	return NULL;
+}
diff --git a/src/backend/executor/nodeSeqscan.c b/src/backend/executor/nodeSeqscan.c
index e43ce3f8d0..c796d0c416 100644
--- a/src/backend/executor/nodeSeqscan.c
+++ b/src/backend/executor/nodeSeqscan.c
@@ -30,9 +30,18 @@
 #include "access/relscan.h"
 #include "access/tableam.h"
 #include "executor/execdebug.h"
+#include "executor/nodeMergejoin.h"
 #include "executor/nodeSeqscan.h"
+#include "storage/lwlock.h"
+#include "storage/shm_toc.h"
 #include "utils/rel.h"
 
+/*
+ * Magic number for location of shared dsa pointer if scan is using a semi-join
+ * filter.
+ */
+#define DSA_LOCATION_KEY_FOR_SJF	UINT64CONST(0xE00000000000FFFF)
+
 static TupleTableSlot *SeqNext(SeqScanState *node);
 
 /* ----------------------------------------------------------------
@@ -157,7 +166,8 @@ ExecInitSeqScan(SeqScan *node, EState *estate, int eflags)
 	/* and create slot with the appropriate rowtype */
 	ExecInitScanTupleSlot(estate, &scanstate->ss,
 						  RelationGetDescr(scanstate->ss.ss_currentRelation),
-						  table_slot_callbacks(scanstate->ss.ss_currentRelation));
+						  table_slot_callbacks(
+											   scanstate->ss.ss_currentRelation));
 
 	/*
 	 * Initialize result type and projection.
@@ -262,6 +272,20 @@ ExecSeqScanEstimate(SeqScanState *node,
 												  estate->es_snapshot);
 	shm_toc_estimate_chunk(&pcxt->estimator, node->pscan_len);
 	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/*
+	 * Estimate space for extra dsa_pointer address for when parallel
+	 * sequential scans use a semi-join filter.
+	 */
+	if (node->ss.ps.plan->parallel_aware && node->applySemiJoinFilter)
+	{
+		shm_toc_estimate_keys(&pcxt->estimator, 1);
+		if (node->semiJoinFilters)
+		{
+			shm_toc_estimate_keys(&pcxt->estimator,
+								  sizeof(dsa_pointer) * list_length(node->semiJoinFilters));
+		}
+	}
 }
 
 /* ----------------------------------------------------------------
@@ -277,6 +301,51 @@ ExecSeqScanInitializeDSM(SeqScanState *node,
 	EState	   *estate = node->ss.ps.state;
 	ParallelTableScanDesc pscan;
 
+	/*
+	 * If scan is using a semi-join filter, then initialize dsa pointer of
+	 * shared sjf.
+	 */
+	if (node->applySemiJoinFilter)
+	{
+		int			sjf_num = list_length(node->semiJoinFilters);
+		dsa_pointer *dsa_pointer_address;	/* an array of size sjf_num */
+		ListCell   *lc;
+		int			i = 0;
+
+		dsa_pointer_address = (dsa_pointer *) shm_toc_allocate(pcxt->toc,
+															   sizeof(dsa_pointer) * sjf_num);
+		foreach(lc, node->semiJoinFilters)
+		{
+			SemiJoinFilter *sjf = (SemiJoinFilter *) (lfirst(lc));
+			SemiJoinFilterParallelState *parallelState;
+			dsa_area   *area = node->ss.ps.state->es_query_dsa;
+
+			sjf->parallelState = CreateFilterParallelState(area, sjf, sjf_num);
+			sjf->isParallel = true;
+			/* check if main process always will run */
+			parallelState = (SemiJoinFilterParallelState *) dsa_get_address(area, sjf->parallelState);
+			parallelState->numProcesses = 1;
+			/* update parallelState with built bloom filter */
+			if (sjf->doneBuilding &&
+				node->ss.ss_currentRelation->rd_id == sjf->checkingId)
+			{
+				bloom_filter *parallel_bloom = (bloom_filter *) dsa_get_address(area, parallelState->bloom_dsa_address);
+
+				replace_bitset(parallel_bloom, sjf->filter);
+				LWLockRelease(&parallelState->secondlock);
+			}
+			dsa_pointer_address[i] = sjf->parallelState;
+			i++;
+		}
+
+		/*
+		 * Add plan_id to magic number so this is also unique for each plan
+		 * node.
+		 */
+		shm_toc_insert(pcxt->toc, DSA_LOCATION_KEY_FOR_SJF +
+					   node->ss.ps.plan->plan_node_id, dsa_pointer_address);
+	}
+
 	pscan = shm_toc_allocate(pcxt->toc, node->pscan_len);
 	table_parallelscan_initialize(node->ss.ss_currentRelation,
 								  pscan,
@@ -317,4 +386,106 @@ ExecSeqScanInitializeWorker(SeqScanState *node,
 	pscan = shm_toc_lookup(pwcxt->toc, node->ss.ps.plan->plan_node_id, false);
 	node->ss.ss_currentScanDesc =
 		table_beginscan_parallel(node->ss.ss_currentRelation, pscan);
+
+	/*
+	 * Create worker's semi-join filter for merge join, if using it. We first
+	 * need to check shm_toc to see if a sjf exists, then create the local
+	 * backend sjf.
+	 */
+	if (shm_toc_lookup(pwcxt->toc,
+					   DSA_LOCATION_KEY_FOR_SJF + node->ss.ps.plan->plan_node_id, 1))
+	{
+		dsa_pointer *parallel_addresses = (dsa_pointer *)
+		shm_toc_lookup(pwcxt->toc,
+					   DSA_LOCATION_KEY_FOR_SJF + node->ss.ps.plan->plan_node_id, 1);
+
+		/*
+		 * we know that there is at least one sjf, we will update accordingly
+		 * if the parallel state says there is more (this avoids using an
+		 * additional shm_toc allocation).
+		 */
+		int			sjf_num = 1;
+
+		/*
+		 * If a copy of any sjf already exists on the backend, we want to free
+		 * it and create a new one.
+		 */
+		if (node->applySemiJoinFilter)
+		{
+			while (list_length(node->semiJoinFilters) > 0)
+			{
+				SemiJoinFilter *sjf = (SemiJoinFilter *)
+				(list_head(node->semiJoinFilters)->ptr_value);
+
+				node->semiJoinFilters =
+					list_delete_nth_cell(node->semiJoinFilters, 0);
+				FreeSemiJoinFilter(sjf);
+			}
+		}
+
+		/*
+		 * Here, we create the process-local SJF's, which will later be
+		 * combined into the single SJF after all parallel work is done.
+		 */
+		for (int i = 0; i < sjf_num; i++)
+		{
+			dsa_pointer parallel_address = parallel_addresses[i];
+			SemiJoinFilterParallelState *parallelState =
+			(SemiJoinFilterParallelState *)
+			dsa_get_address(node->ss.ps.state->es_query_dsa,
+							parallel_address);
+			SemiJoinFilter *sjf;
+			MemoryContext oldContext;
+
+			sjf_num = parallelState->sjf_num;
+			oldContext = MemoryContextSwitchTo(GetMemoryChunkContext(node));
+			sjf = (SemiJoinFilter *) palloc0(sizeof(SemiJoinFilter));
+			sjf->filter = bloom_create(parallelState->num_elements,
+									   parallelState->work_mem,
+									   parallelState->seed);
+			sjf->buildingId = parallelState->buildingId;
+			sjf->checkingId = parallelState->checkingId;
+			sjf->seed = parallelState->seed;
+			sjf->isParallel = true;
+			sjf->isWorker = true;
+			sjf->doneBuilding = parallelState->doneBuilding;
+			sjf->parallelState = parallel_address;
+			node->applySemiJoinFilter = true;
+			sjf->buildingAttr = parallelState->buildingAttr;
+			sjf->checkingAttr = parallelState->checkingAttr;
+			node->semiJoinFilters =
+				lappend(node->semiJoinFilters, (void *) sjf);
+			sjf->mergejoin_plan_id = parallelState->mergejoin_plan_id;
+			/* copy over bloom filter if already built */
+			if (sjf->doneBuilding &&
+				parallelState->checkingId ==
+				node->ss.ss_currentRelation->rd_id)
+			{
+				SemiJoinFilterParallelState *parallelState =
+				(SemiJoinFilterParallelState *)
+				dsa_get_address(node->ss.ps.state->es_query_dsa,
+								sjf->parallelState);
+				bloom_filter *shared_bloom = (bloom_filter *) dsa_get_address(
+																			  node->ss.ps.state->es_query_dsa,
+																			  parallelState->bloom_dsa_address);
+
+				replace_bitset(sjf->filter, shared_bloom);
+			}
+			else if (!sjf->doneBuilding &&
+					 parallelState->buildingId ==
+					 node->ss.ss_currentRelation->rd_id)
+			{
+				/*
+				 * Add this process to number of scan processes, need to use
+				 * lock in case of multiple workers updating at same time. We
+				 * want to avoid using the planned number of workers because
+				 * that can be wrong.
+				 */
+				LWLockAcquire(&parallelState->lock, LW_EXCLUSIVE);
+				parallelState->numProcesses += 1;
+				LWLockRelease(&parallelState->lock);
+			}
+			MemoryContextSwitchTo(oldContext);
+		}
+	}
 }
diff --git a/src/backend/lib/bloomfilter.c b/src/backend/lib/bloomfilter.c
index 3ef67d35ac..0a05ada9b6 100644
--- a/src/backend/lib/bloomfilter.c
+++ b/src/backend/lib/bloomfilter.c
@@ -128,6 +128,56 @@ bloom_free(bloom_filter *filter)
 	pfree(filter);
 }
 
+/*
+ * Create Bloom filter in dsa shared memory
+ */
+dsa_pointer
+bloom_create_in_dsa(dsa_area *area, int64 total_elems, int bloom_work_mem, uint64 seed)
+{
+	dsa_pointer filter_dsa_address;
+	bloom_filter *filter;
+	int			bloom_power;
+	uint64		bitset_bytes;
+	uint64		bitset_bits;
+
+	/*
+	 * Aim for two bytes per element; this is sufficient to get a false
+	 * positive rate below 1%, independent of the size of the bitset or total
+	 * number of elements.  Also, if rounding down the size of the bitset to
+	 * the next lowest power of two turns out to be a significant drop, the
+	 * false positive rate still won't exceed 2% in almost all cases.
+	 */
+	bitset_bytes = Min(bloom_work_mem * UINT64CONST(1024), total_elems * 2);
+	bitset_bytes = Max(1024 * 1024, bitset_bytes);
+
+	/*
+	 * Size in bits should be the highest power of two <= target.  bitset_bits
+	 * is uint64 because PG_UINT32_MAX is 2^32 - 1, not 2^32
+	 */
+	bloom_power = my_bloom_power(bitset_bytes * BITS_PER_BYTE);
+	bitset_bits = UINT64CONST(1) << bloom_power;
+	bitset_bytes = bitset_bits / BITS_PER_BYTE;
+
+	/* Allocate bloom filter with unset bitset */
+	filter_dsa_address = dsa_allocate0(area, offsetof(bloom_filter, bitset) +
+									   sizeof(unsigned char) * bitset_bytes);
+	filter = (bloom_filter *) dsa_get_address(area, filter_dsa_address);
+	filter->k_hash_funcs = optimal_k(bitset_bits, total_elems);
+	filter->seed = seed;
+	filter->m = bitset_bits;
+
+	return filter_dsa_address;
+}
+
+/*
+ * Free Bloom filter in dsa shared memory
+ */
+void
+bloom_free_in_dsa(dsa_area *area, dsa_pointer filter_dsa_address)
+{
+	dsa_free(area, filter_dsa_address);
+}
+
 /*
  * Add element to Bloom filter
  */
@@ -292,3 +342,27 @@ mod_m(uint32 val, uint64 m)
 
 	return val & (m - 1);
 }
+
+/*
+ * Add secondary filter to main filter, essentially "combining" the two filters together.
+ * This happens in-place, with main_filter being the combined filter.
+ * Both filters must have the same seed and size for this to work.
+ */
+void
+add_to_filter(bloom_filter *main_filter, bloom_filter *to_add)
+{
+	Assert(main_filter->seed == to_add->seed);
+	Assert(main_filter->m == to_add->m);
+	/* m is in bits not bytes */
+	for (int i = 0; i < main_filter->m / BITS_PER_BYTE; i++)
+	{
+		main_filter->bitset[i] = main_filter->bitset[i] | to_add->bitset[i];
+	}
+}
+
+void
+replace_bitset(bloom_filter *main_filter, bloom_filter *overriding_filter)
+{
+	Assert(main_filter->m == overriding_filter->m);
+	memcpy(&main_filter->bitset, &overriding_filter->bitset, main_filter->m / BITS_PER_BYTE);
+}
diff --git a/src/include/executor/nodeMergejoin.h b/src/include/executor/nodeMergejoin.h
index c311c7ed80..d4cc439315 100644
--- a/src/include/executor/nodeMergejoin.h
+++ b/src/include/executor/nodeMergejoin.h
@@ -23,6 +23,8 @@ extern void FreeSemiJoinFilter(SemiJoinFilter * sjf);
 extern int	PushDownDirection(PlanState *node);
 extern void PushDownFilter(PlanState *node, Plan *plan, SemiJoinFilter * sjf, Oid *relId, int64 *nodeRows);
 extern bool SemiJoinFilterExamineSlot(List *semiJoinFilters, TupleTableSlot *slot, Oid tableId);
-extern void SemiJoinFilterFinishScan(List *semiJoinFilters, Oid tableId);
+extern void SemiJoinFilterFinishScan(List *semiJoinFilters, Oid tableId, dsa_area *parallel_area);
+extern dsa_pointer CreateFilterParallelState(dsa_area *area, SemiJoinFilter * sjf, int sjf_num);
+extern SemiJoinFilter * GetSemiJoinFilter(PlanState *node, int plan_id);
 
 #endif							/* NODEMERGEJOIN_H */
diff --git a/src/include/lib/bloomfilter.h b/src/include/lib/bloomfilter.h
index 8146d8e7fd..3b5d1821a5 100644
--- a/src/include/lib/bloomfilter.h
+++ b/src/include/lib/bloomfilter.h
@@ -13,15 +13,22 @@
 #ifndef BLOOMFILTER_H
 #define BLOOMFILTER_H
 
+#include "utils/dsa.h"
+
 typedef struct bloom_filter bloom_filter;
 
 extern bloom_filter *bloom_create(int64 total_elems, int bloom_work_mem,
 								  uint64 seed);
 extern void bloom_free(bloom_filter *filter);
+extern dsa_pointer bloom_create_in_dsa(dsa_area *area, int64 total_elems,
+									   int bloom_work_mem, uint64 seed);
+extern void bloom_free_in_dsa(dsa_area *area, dsa_pointer filter_dsa_address);
 extern void bloom_add_element(bloom_filter *filter, unsigned char *elem,
 							  size_t len);
 extern bool bloom_lacks_element(bloom_filter *filter, unsigned char *elem,
 								size_t len);
 extern double bloom_prop_bits_set(bloom_filter *filter);
+extern void add_to_filter(bloom_filter *main_filter, bloom_filter *to_add);
+extern void replace_bitset(bloom_filter *main_filter, bloom_filter *overriding_filter);
 
 #endif							/* BLOOMFILTER_H */
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 6964462720..6ca6de437b 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -40,6 +40,8 @@
 #include "nodes/tidbitmap.h"
 #include "partitioning/partdefs.h"
 #include "storage/condition_variable.h"
+#include "storage/lwlock.h"
+#include "storage/shm_toc.h"
 #include "utils/hsearch.h"
 #include "utils/queryenvironment.h"
 #include "utils/reltrigger.h"
@@ -2026,6 +2028,10 @@ typedef struct SemiJoinFilter
 	Oid			checkingId;
 	int			checkingAttr;
 	bool		doneBuilding;
+	/* Parallel information */
+	bool		isParallel;
+	bool		isWorker;
+	dsa_pointer parallelState;
 	/* metadata */
 	uint64		seed;
 	int64		num_elements;
@@ -2036,6 +2042,31 @@ typedef struct SemiJoinFilter
 	int			mergejoin_plan_id;
 }			SemiJoinFilter;
 
+typedef struct SemiJoinFilterParallelState
+{
+	/* bloom filter information */
+	uint64		seed;
+	int64		num_elements;
+	int			work_mem;
+	dsa_pointer bloom_dsa_address;
+	/* information to copy over to worker processes */
+	int			numAttr;
+	Oid			buildingId;
+	Oid			checkingId;
+	int			buildingAttr;
+	int			checkingAttr;
+	int			elementsAdded;
+	/* information for parallelization and locking */
+	bool		doneBuilding;
+	int			workersDone;
+	int			numProcesses;
+	uint64		lockStop;
+	LWLock		lock;
+	LWLock		secondlock;
+	int			sjf_num;
+	int			mergejoin_plan_id;
+}			SemiJoinFilterParallelState;
+
 typedef struct MergeJoinState
 {
 	JoinState	js;				/* its first field is NodeTag */
-- 
2.37.1

0004-Integrate-EXPLAIN-command-with-semijoin-filter.patchapplication/octet-stream; name=0004-Integrate-EXPLAIN-command-with-semijoin-filter.patchDownload

From e328d8ad83c21eb8a34c8101cd1dee1e4276e50e Mon Sep 17 00:00:00 2001
From: Lyu Pan <lyup@amazon.com>
Date: Fri, 16 Sep 2022 21:40:59 +0000
Subject: [PATCH 4/5] Integrate EXPLAIN command with semijoin filter.

When explaining Merge Join node, if semijoin filter is used in the Merge Join node, related metadata will be displyed, including filter clause, estimated filtering rate and actual filtering rate if EXPLAIN ANALYZE is used.

For example:
Merge Join  (...)
   Merge Cond: (...)
   SemiJoin Filter Created Based on: (...)
   SemiJoin Estimated Filtering Rate: XXX
   SemiJoin Actual Filtering Rate: XXX
---
 src/backend/commands/explain.c | 40 +++++++++++++++++++++++++++++++++-
 1 file changed, 39 insertions(+), 1 deletion(-)

diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index f86983c660..438792e31a 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -153,7 +153,8 @@ static void ExplainIndentText(ExplainState *es);
 static void ExplainJSONLineEnding(ExplainState *es);
 static void ExplainYAMLLineStarting(ExplainState *es);
 static void escape_yaml(StringInfo buf, const char *str);
-
+static void show_semijoin_metadata(List *equijoins, PlanState *planstate,
+								   List *ancestors, ExplainState *es);
 
 
 /*
@@ -1981,6 +1982,11 @@ ExplainNode(PlanState *planstate, List *ancestors,
 							"Merge Cond", planstate, ancestors, es);
 			show_upper_qual(((MergeJoin *) plan)->join.joinqual,
 							"Join Filter", planstate, ancestors, es);
+			if (((MergeJoinState *) planstate)->sjf)
+			{
+				show_semijoin_metadata(((MergeJoin *) plan)->mergeclauses,
+									   planstate, ancestors, es);
+			}
 			if (((MergeJoin *) plan)->join.joinqual)
 				show_instrumentation_count("Rows Removed by Join Filter", 1,
 										   planstate, es);
@@ -5041,3 +5047,35 @@ escape_yaml(StringInfo buf, const char *str)
 {
 	escape_json(buf, str);
 }
+
+static void
+show_semijoin_metadata(List *equijoins, PlanState *planstate,
+					   List *ancestors, ExplainState *es)
+{
+	char		createStr[256];
+	int			clause_ordinal;
+	Node	   *best_equijoin_clause;
+	MergeJoin  *mj = ((MergeJoin *) planstate->plan);
+
+	Assert(planstate);
+	Assert(nodeTag(planstate) == T_MergeJoinState);
+	Assert(planstate->plan);
+	Assert(nodeTag(planstate->plan) == T_MergeJoin);
+
+	snprintf(createStr, sizeof(createStr), "%s",
+			 "SemiJoin Filter Created Based on");
+	clause_ordinal = mj->bestExpr;
+	best_equijoin_clause =
+		(Node *) list_nth_node(OpExpr, equijoins, clause_ordinal);
+	show_expression(best_equijoin_clause, createStr, planstate, ancestors,
+					true, es);
+	ExplainPropertyFloat("SemiJoin Estimated Filtering Rate", NULL,
+						 mj->filteringRate, 4, es);
+	if (es->analyze)
+	{
+		SemiJoinFilter *sjf = ((MergeJoinState *) planstate)->sjf;
+
+		ExplainPropertyFloat("SemiJoin Actual Filtering Rate", NULL,
+							 ((double) sjf->elementsFiltered) / sjf->elementsChecked, 4, es);
+	}
+}
-- 
2.37.1

0002-Support-semijoin-filter-in-the-executor-for-non-para.patchapplication/octet-stream; name=0002-Support-semijoin-filter-in-the-executor-for-non-para.patchDownload

From 071e35b4fc038cd791479beb6fcbe8b0a4c0bee8 Mon Sep 17 00:00:00 2001
From: Lyu Pan <lyup@amazon.com>
Date: Fri, 16 Sep 2022 03:35:06 +0000
Subject: [PATCH 2/5] Support semijoin filter in the executor for non-parallel
 mergejoin.

During MergeJoinState initialization, if a semijoin filter should be used in the MergeJoin node (according to the planner), a SemiJoinFilter struct is initialized, then the relation id and attribute number used to build/check the bloom filter are calculated, related information is pushed down to the scan node (only SeqScan for now). The bloom filter is always built on the outer/left tree, and used in the inner/right tree.
---
 src/backend/executor/execScan.c      |  52 +++++-
 src/backend/executor/nodeMergejoin.c | 259 +++++++++++++++++++++++++++
 src/backend/executor/nodeSeqscan.c   |   6 +
 src/include/executor/nodeMergejoin.h |   5 +
 src/include/nodes/execnodes.h        |  25 +++
 5 files changed, 340 insertions(+), 7 deletions(-)

diff --git a/src/backend/executor/execScan.c b/src/backend/executor/execScan.c
index 043bb83f55..4b97c39455 100644
--- a/src/backend/executor/execScan.c
+++ b/src/backend/executor/execScan.c
@@ -19,8 +19,10 @@
 #include "postgres.h"
 
 #include "executor/executor.h"
+#include "executor/nodeMergejoin.h"
 #include "miscadmin.h"
 #include "utils/memutils.h"
+#include "utils/rel.h"
 
 
 
@@ -173,10 +175,12 @@ ExecScan(ScanState *node,
 	/* interrupt checks are in ExecScanFetch */
 
 	/*
-	 * If we have neither a qual to check nor a projection to do, just skip
-	 * all the overhead and return the raw scan tuple.
+	 * If we have neither a qual to check nor a projection to do, nor a bloom
+	 * filter to check, just skip all the overhead and return the raw scan
+	 * tuple.
 	 */
-	if (!qual && !projInfo)
+	if (!qual && !projInfo && !IsA(node, SeqScanState) &&
+		!((SeqScanState *) node)->applySemiJoinFilter)
 	{
 		ResetExprContext(econtext);
 		return ExecScanFetch(node, accessMtd, recheckMtd);
@@ -206,6 +210,13 @@ ExecScan(ScanState *node,
 		 */
 		if (TupIsNull(slot))
 		{
+			if (IsA(node, SeqScanState) &&
+				((SeqScanState *) node)->applySemiJoinFilter)
+			{
+				SemiJoinFilterFinishScan(
+										 ((SeqScanState *) node)->semiJoinFilters,
+										 node->ss_currentRelation->rd_id);
+			}
 			if (projInfo)
 				return ExecClearTuple(projInfo->pi_state.resultslot);
 			else
@@ -232,16 +243,43 @@ ExecScan(ScanState *node,
 			if (projInfo)
 			{
 				/*
-				 * Form a projection tuple, store it in the result tuple slot
-				 * and return it.
+				 * Form a projection tuple, store it in the result tuple slot,
+				 * check against SemiJoinFilter, then return it.
 				 */
-				return ExecProject(projInfo);
+				TupleTableSlot *projectedSlot = ExecProject(projInfo);
+
+				if (IsA(node, SeqScanState) &&
+					((SeqScanState *) node)->applySemiJoinFilter)
+				{
+					if (!SemiJoinFilterExamineSlot(
+												   ((SeqScanState *) node)->semiJoinFilters,
+												   projectedSlot, node->ss_currentRelation->rd_id))
+					{
+						/* slot did not pass SemiJoinFilter, so skipping it. */
+						ResetExprContext(econtext);
+						continue;
+					}
+				}
+				return projectedSlot;
 			}
 			else
 			{
 				/*
-				 * Here, we aren't projecting, so just return scan tuple.
+				 * Here, we aren't projecting, so check against
+				 * SemiJoinFilter, then return tuple.
 				 */
+				if (IsA(node, SeqScanState) &&
+					((SeqScanState *) node)->applySemiJoinFilter)
+				{
+					if (!SemiJoinFilterExamineSlot(
+												   ((SeqScanState *) node)->semiJoinFilters, slot,
+												   node->ss_currentRelation->rd_id))
+					{
+						/* slot did not pass SemiJoinFilter, so skipping it. */
+						ResetExprContext(econtext);
+						continue;
+					}
+				}
 				return slot;
 			}
 		}
diff --git a/src/backend/executor/nodeMergejoin.c b/src/backend/executor/nodeMergejoin.c
index fed345eae5..81c253aad5 100644
--- a/src/backend/executor/nodeMergejoin.c
+++ b/src/backend/executor/nodeMergejoin.c
@@ -93,8 +93,11 @@
 #include "postgres.h"
 
 #include "access/nbtree.h"
+#include "common/pg_prng.h"
 #include "executor/execdebug.h"
+#include "executor/execExpr.h"
 #include "executor/nodeMergejoin.h"
+#include "lib/bloomfilter.h"
 #include "miscadmin.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
@@ -1603,6 +1606,95 @@ ExecInitMergeJoin(MergeJoin *node, EState *estate, int eflags)
 											node->mergeNullsFirst,
 											(PlanState *) mergestate);
 
+	/*
+	 * initialize SemiJoinFilter, if planner decided to do so
+	 */
+	if (((MergeJoin *) mergestate->js.ps.plan)->applySemiJoinFilter)
+	{
+		SemiJoinFilter *sjf;
+		Plan	   *buildingNode;
+		Plan	   *checkingNode;
+		uint64		seed;
+		MergeJoinClause clause;
+
+		/* create Bloom filter */
+		sjf = (SemiJoinFilter *) palloc0(sizeof(SemiJoinFilter));
+
+		/*
+		 * Push down filter down outer and inner subtrees and apply filter to
+		 * the nodes that correspond to the ones identified during planning.
+		 * We are pushing down first because we need some metadata from the
+		 * scan nodes (i.e. Relation Id's and planner-estimated number of
+		 * rows).
+		 */
+		buildingNode = ((MergeJoin *) mergestate->js.ps.plan)->buildingNode;
+		checkingNode = ((MergeJoin *) mergestate->js.ps.plan)->checkingNode;
+		sjf->buildingId = -1;
+		sjf->checkingId = -1;
+		PushDownFilter(mergestate->js.ps.lefttree, buildingNode, sjf, &sjf->buildingId, &sjf->num_elements);
+		PushDownFilter(mergestate->js.ps.righttree, checkingNode, sjf, &sjf->checkingId, &sjf->num_elements);
+
+		/* Initialize SJF data and create Bloom filter */
+		seed = pg_prng_uint64(&pg_global_prng_state);
+		sjf->filter = bloom_create(sjf->num_elements, work_mem, seed);
+		sjf->work_mem = work_mem;
+		sjf->seed = seed;
+		sjf->doneBuilding = false;
+
+		/*
+		 * From the plan level, we already know which mergeclause to build the
+		 * filter on. However, to implement this on the scan level, we look at
+		 * the expression and figure out which slot we need to examine in
+		 * ExecScan to build/check the filter on.
+		 */
+
+		clause = &mergestate->mj_Clauses[((MergeJoin *) mergestate->js.ps.plan)->bestExpr];
+
+		/*
+		 * Look through expression steps to determine which relation attribute
+		 * slot they are comparing and take note of it. All merge join clauses
+		 * eventually fetch tuples from either the outer or inner slot, so we
+		 * just need to check for those specific ExprEvalOp's.
+		 */
+		for (int j = 0; j < clause->lexpr->steps_len; j++)
+		{
+			ExprEvalOp	leftOp = clause->lexpr->steps[j].opcode;
+			int			leftAttr = clause->lexpr->steps[j].d.var.attnum;
+
+			if (leftOp == EEOP_OUTER_FETCHSOME)
+			{
+				/* attribute numbers are 1-indexed */
+				sjf->buildingAttr = leftAttr - 1;
+				break;
+			}
+			else if (leftOp == EEOP_INNER_FETCHSOME)
+			{
+				sjf->checkingAttr = leftAttr - 1;
+				break;
+			}
+		}
+		/* do it again for right expression */
+		for (int j = 0; j < clause->rexpr->steps_len; j++)
+		{
+			ExprEvalOp	rightOp = clause->rexpr->steps[j].opcode;
+			int			rightAttr = clause->rexpr->steps[j].d.var.attnum;
+
+			if (rightOp == EEOP_OUTER_FETCHSOME)
+			{
+				/* attribute numbers are 1-indexed */
+				sjf->buildingAttr = rightAttr - 1;
+				break;
+			}
+			else if (rightOp == EEOP_INNER_FETCHSOME)
+			{
+				sjf->checkingAttr = rightAttr - 1;
+				break;
+			}
+		}
+		sjf->mergejoin_plan_id = mergestate->js.ps.plan->plan_node_id;
+		mergestate->sjf = sjf;
+	}
+
 	/*
 	 * initialize join state
 	 */
@@ -1645,6 +1737,14 @@ ExecEndMergeJoin(MergeJoinState *node)
 	ExecClearTuple(node->js.ps.ps_ResultTupleSlot);
 	ExecClearTuple(node->mj_MarkedTupleSlot);
 
+	/*
+	 * free SemiJoinFilter
+	 */
+	if (node->sjf)
+	{
+		FreeSemiJoinFilter(node->sjf);
+	}
+
 	/*
 	 * shut down the subplans
 	 */
@@ -1678,3 +1778,162 @@ ExecReScanMergeJoin(MergeJoinState *node)
 	if (innerPlan->chgParam == NULL)
 		ExecReScan(innerPlan);
 }
+
+void
+FreeSemiJoinFilter(SemiJoinFilter * sjf)
+{
+	bloom_free(sjf->filter);
+	pfree(sjf);
+}
+
+/*
+ * Determines the direction that a pushdown filter can be pushed. This is not very robust, but this
+ * is because we've already done the careful calculations at the plan level. If we end up pushing where
+ * we're not supposed to, it's fine because we've done the verifications in the planner.
+ */
+int
+PushDownDirection(PlanState *node)
+{
+	switch (nodeTag(node))
+	{
+		case T_HashState:
+		case T_MaterialState:
+		case T_GatherState:
+		case T_GatherMergeState:
+		case T_SortState:
+		case T_UniqueState:
+		case T_AggState:
+			{
+				return 0;
+			}
+		case T_NestLoopState:
+		case T_MergeJoinState:
+		case T_HashJoinState:
+			{
+				return 1;
+			}
+		default:
+			{
+				return -1;
+			}
+	}
+}
+
+/* Recursively pushes down the filter until an appropriate SeqScan node is reached. Then, it
+ * verifies if that SeqScan node is the one we want to push the filter to, and if it is, then
+ * appends the SJF to the node. */
+void
+PushDownFilter(PlanState *node, Plan *plan, SemiJoinFilter * sjf, Oid *relId, int64 *nodeRows)
+{
+	if (node == NULL)
+	{
+		return;
+	}
+
+	check_stack_depth();
+	if (node->type == T_SeqScanState)
+	{
+		SeqScanState *scan = (SeqScanState *) node;
+
+		Assert(IsA(scan, SeqScanState));
+
+		/*
+		 * found the right Scan node that we want to apply the filter onto via
+		 * matching relId
+		 */
+		if (scan->ss.ss_currentRelation->rd_id == *relId)
+		{
+			scan->applySemiJoinFilter = true;
+			scan->semiJoinFilters = lappend(scan->semiJoinFilters, sjf);
+		}
+
+		/*
+		 * Check if right Scan node, based on matching Plan nodes. This will
+		 * be the most common way of matching Scan nodes to the filter, the
+		 * above use case is only for fringe parallel-execution cases.
+		 */
+		else if (scan->ss.ps.plan == plan)
+		{
+			scan->applySemiJoinFilter = true;
+			scan->semiJoinFilters = lappend(scan->semiJoinFilters, sjf);
+			*relId = scan->ss.ss_currentRelation->rd_id;
+			/* double row estimate to reduce error rate for Bloom filter */
+			*nodeRows = Max(*nodeRows, scan->ss.ps.plan->plan_rows * 2);
+		}
+	}
+	else
+	{
+		if (PushDownDirection(node) == 1)
+		{
+			PushDownFilter(node->lefttree, plan, sjf, relId, nodeRows);
+			PushDownFilter(node->righttree, plan, sjf, relId, nodeRows);
+		}
+		else if (PushDownDirection(node) == 0)
+		{
+			PushDownFilter(node->lefttree, plan, sjf, relId, nodeRows);
+		}
+	}
+}
+
+/*
+ * If this table is the building-side table for the SemiJoinFilter, adds the element to
+ * the bloom filter and always returns true. If this table is the checking-side table for the SemiJoinFilter,
+ * then checks the element against the bloom filter and returns true if the element is (probably) in the set,
+ * and false if the element is not in the bloom filter.
+ */
+bool
+SemiJoinFilterExamineSlot(List *semiJoinFilters, TupleTableSlot *slot, Oid tableId)
+{
+	ListCell   *cell;
+
+	foreach(cell, semiJoinFilters)
+	{
+		SemiJoinFilter *sjf = ((SemiJoinFilter *) (lfirst(cell)));
+
+		/* check if this table's relation ID matches the filter's relation ID */
+		if (!sjf->doneBuilding && tableId == sjf->buildingId)
+		{
+			Datum		val;
+
+			slot_getsomeattrs(slot, sjf->buildingAttr + 1);
+			sjf->elementsAdded++;
+
+			/*
+			 * We are only using one key for now. Later functionality might
+			 * include multiple join keys.
+			 */
+			val = slot->tts_values[sjf->buildingAttr];
+			bloom_add_element(sjf->filter, (unsigned char *) &val, sizeof(val));
+		}
+		else if (sjf->doneBuilding && tableId == sjf->checkingId)
+		{
+			Datum		val;
+
+			slot_getsomeattrs(slot, sjf->checkingAttr + 1);
+			sjf->elementsChecked++;
+			val = slot->tts_values[sjf->checkingAttr];
+			if (bloom_lacks_element(sjf->filter, (unsigned char *) &val, sizeof(val)))
+			{
+				sjf->elementsFiltered++;
+				return false;
+			}
+		}
+	}
+	return true;
+}
+
+void
+SemiJoinFilterFinishScan(List *semiJoinFilters, Oid tableId)
+{
+	ListCell   *cell;
+
+	foreach(cell, semiJoinFilters)
+	{
+		SemiJoinFilter *sjf = ((SemiJoinFilter *) (lfirst(cell)));
+
+		if (!sjf->doneBuilding && tableId == sjf->buildingId)
+		{
+			sjf->doneBuilding = true;
+		}
+	}
+}
diff --git a/src/backend/executor/nodeSeqscan.c b/src/backend/executor/nodeSeqscan.c
index 7b58cd9162..e43ce3f8d0 100644
--- a/src/backend/executor/nodeSeqscan.c
+++ b/src/backend/executor/nodeSeqscan.c
@@ -207,6 +207,12 @@ ExecEndSeqScan(SeqScanState *node)
 	 */
 	if (scanDesc != NULL)
 		table_endscan(scanDesc);
+
+	/*
+	 * clear out semijoinfilter list, if non-null
+	 */
+	if (node->semiJoinFilters)
+		pfree(node->semiJoinFilters);
 }
 
 /* ----------------------------------------------------------------
diff --git a/src/include/executor/nodeMergejoin.h b/src/include/executor/nodeMergejoin.h
index 26ab517508..c311c7ed80 100644
--- a/src/include/executor/nodeMergejoin.h
+++ b/src/include/executor/nodeMergejoin.h
@@ -19,5 +19,10 @@
 extern MergeJoinState *ExecInitMergeJoin(MergeJoin *node, EState *estate, int eflags);
 extern void ExecEndMergeJoin(MergeJoinState *node);
 extern void ExecReScanMergeJoin(MergeJoinState *node);
+extern void FreeSemiJoinFilter(SemiJoinFilter * sjf);
+extern int	PushDownDirection(PlanState *node);
+extern void PushDownFilter(PlanState *node, Plan *plan, SemiJoinFilter * sjf, Oid *relId, int64 *nodeRows);
+extern bool SemiJoinFilterExamineSlot(List *semiJoinFilters, TupleTableSlot *slot, Oid tableId);
+extern void SemiJoinFilterFinishScan(List *semiJoinFilters, Oid tableId);
 
 #endif							/* NODEMERGEJOIN_H */
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 01b1727fc0..6964462720 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -32,6 +32,7 @@
 #include "access/tupconvert.h"
 #include "executor/instrument.h"
 #include "fmgr.h"
+#include "lib/bloomfilter.h"
 #include "lib/ilist.h"
 #include "lib/pairingheap.h"
 #include "nodes/params.h"
@@ -1451,6 +1452,9 @@ typedef struct SeqScanState
 {
 	ScanState	ss;				/* its first field is NodeTag */
 	Size		pscan_len;		/* size of parallel heap scan descriptor */
+	/* for use of SemiJoinFilters during merge join */
+	bool		applySemiJoinFilter;
+	List	   *semiJoinFilters;
 } SeqScanState;
 
 /* ----------------
@@ -2012,6 +2016,26 @@ typedef struct NestLoopState
 /* private in nodeMergejoin.c: */
 typedef struct MergeJoinClauseData *MergeJoinClause;
 
+typedef struct SemiJoinFilter
+{
+	bloom_filter *filter;
+	/* Relation that Bloom Filter is built on */
+	Oid			buildingId;
+	int			buildingAttr;
+	/* Relation that Bloom Filter is checked on */
+	Oid			checkingId;
+	int			checkingAttr;
+	bool		doneBuilding;
+	/* metadata */
+	uint64		seed;
+	int64		num_elements;
+	int			work_mem;
+	int			elementsAdded;
+	int			elementsChecked;
+	int			elementsFiltered;
+	int			mergejoin_plan_id;
+}			SemiJoinFilter;
+
 typedef struct MergeJoinState
 {
 	JoinState	js;				/* its first field is NodeTag */
@@ -2032,6 +2056,7 @@ typedef struct MergeJoinState
 	TupleTableSlot *mj_NullInnerTupleSlot;
 	ExprContext *mj_OuterEContext;
 	ExprContext *mj_InnerEContext;
+	SemiJoinFilter *sjf;
 } MergeJoinState;
 
 /* ----------------
-- 
2.37.1

0005-Add-basic-regress-tests-for-semijoin-filter.patchapplication/octet-stream; name=0005-Add-basic-regress-tests-for-semijoin-filter.patchDownload

From ccc7c86b38f541bf9c819f13f32c548812e216e6 Mon Sep 17 00:00:00 2001
From: Lyu Pan <lyup@amazon.com>
Date: Mon, 19 Sep 2022 23:25:30 +0000
Subject: [PATCH 5/5] Add basic regress tests for semijoin filter.

Because of implementation limits (and bugs) in the prototype, only some basic sqls are tested.

1. Test that when force_mergejoin_semijoin_filter is ON, semijoin filter is used.
2. Test that semijoin filter works in two-table inner merge join.
3. Test that semijoin filter can be pushed throudh SORT node.
4. Test that semijoin filter works in a basic three-table-merge-join. However, due to implementation bugs, it doesn't work in all kinds of three table merge joins.
---
 .../expected/mergejoin_semijoinfilter.out     | 158 ++++++++++++++++++
 src/test/regress/parallel_schedule            |   1 +
 .../regress/sql/mergejoin_semijoinfilter.sql  |  62 +++++++
 3 files changed, 221 insertions(+)
 create mode 100644 src/test/regress/expected/mergejoin_semijoinfilter.out
 create mode 100644 src/test/regress/sql/mergejoin_semijoinfilter.sql

diff --git a/src/test/regress/expected/mergejoin_semijoinfilter.out b/src/test/regress/expected/mergejoin_semijoinfilter.out
new file mode 100644
index 0000000000..07c4acc253
--- /dev/null
+++ b/src/test/regress/expected/mergejoin_semijoinfilter.out
@@ -0,0 +1,158 @@
+SET enable_hashjoin = OFF;
+SET enable_nestloop = OFF;
+SET enable_mergejoin = ON;
+SET enable_mergejoin_semijoin_filter = ON;
+SET force_mergejoin_semijoin_filter = ON;
+CREATE TABLE t1 (
+  i integer,
+  j integer
+);
+CREATE TABLE t2 (
+  i integer,
+  k integer
+);
+CREATE TABLE t3 (
+  i integer,
+  m integer
+);
+INSERT INTO t1 (i, j)
+  SELECT
+    generate_series(1,100000) AS i,
+    generate_series(1,100000) AS j;
+INSERT INTO t2 (i, k)
+  SELECT
+    generate_series(1,100000) AS i,
+    generate_series(1,100000) AS k;
+INSERT INTO t3 (i, m)
+  SELECT
+    generate_series(1,100000) AS i,
+    generate_series(1,100000) AS m;
+-- Semijoin filter is not used when force_mergejoin_semijoin_filter is OFF.
+SET force_mergejoin_semijoin_filter = OFF;
+EXPLAIN (VERBOSE, COSTS OFF) SELECT COUNT(*) FROM t1 JOIN t2 ON t1.i = t2.i;
+               QUERY PLAN                
+-----------------------------------------
+ Aggregate
+   Output: count(*)
+   ->  Merge Join
+         Merge Cond: (t1.i = t2.i)
+         ->  Sort
+               Output: t1.i
+               Sort Key: t1.i
+               ->  Seq Scan on public.t1
+                     Output: t1.i
+         ->  Sort
+               Output: t2.i
+               Sort Key: t2.i
+               ->  Seq Scan on public.t2
+                     Output: t2.i
+(14 rows)
+
+SET force_mergejoin_semijoin_filter = ON;
+-- One level of inner mergejoin: push semi-join filter to outer scan.
+EXPLAIN (VERBOSE, COSTS OFF) SELECT COUNT(*) FROM t1 JOIN t2 ON t1.i = t2.i;
+                       QUERY PLAN                        
+---------------------------------------------------------
+ Aggregate
+   Output: count(*)
+   ->  Merge Join
+         Merge Cond: (t1.i = t2.i)
+         SemiJoin Filter Created Based on: (t1.i = t2.i)
+         SemiJoin Estimated Filtering Rate: -0.0100
+         ->  Sort
+               Output: t1.i
+               Sort Key: t1.i
+               ->  Seq Scan on public.t1
+                     Output: t1.i
+         ->  Sort
+               Output: t2.i
+               Sort Key: t2.i
+               ->  Seq Scan on public.t2
+                     Output: t2.i
+(16 rows)
+
+SELECT COUNT(*) FROM t1 JOIN t2 ON t1.i = t2.i;
+ count  
+--------
+ 100000
+(1 row)
+
+-- Push semijoin filter through SORT node.
+EXPLAIN (VERBOSE, COSTS OFF) SELECT COUNT(*) FROM (SELECT DISTINCT t1.i FROM t1 ORDER BY t1.i) x JOIN t2 ON x.i = t2.i;
+                       QUERY PLAN                        
+---------------------------------------------------------
+ Aggregate
+   Output: count(*)
+   ->  Merge Join
+         Merge Cond: (t1.i = t2.i)
+         SemiJoin Filter Created Based on: (t1.i = t2.i)
+         SemiJoin Estimated Filtering Rate: -0.0100
+         ->  Sort
+               Output: t1.i
+               Sort Key: t1.i
+               ->  HashAggregate
+                     Output: t1.i
+                     Group Key: t1.i
+                     ->  Seq Scan on public.t1
+                           Output: t1.i, t1.j
+         ->  Sort
+               Output: t2.i
+               Sort Key: t2.i
+               ->  Seq Scan on public.t2
+                     Output: t2.i
+(19 rows)
+
+SELECT COUNT(*) FROM (SELECT DISTINCT t1.i FROM t1 ORDER BY t1.i) x JOIN t2 ON x.i = t2.i;
+ count  
+--------
+ 100000
+(1 row)
+
+-- Two levels of MergeJoin
+EXPLAIN (VERBOSE, COSTS OFF) SELECT COUNT(*) FROM t1 JOIN t2 ON t1.i = t2.i JOIN t3 ON t2.i = t3.i;
+                             QUERY PLAN                              
+---------------------------------------------------------------------
+ Aggregate
+   Output: count(*)
+   ->  Merge Join
+         Merge Cond: (t3.i = t1.i)
+         SemiJoin Filter Created Based on: (t3.i = t1.i)
+         SemiJoin Estimated Filtering Rate: -0.0100
+         ->  Sort
+               Output: t3.i
+               Sort Key: t3.i
+               ->  Seq Scan on public.t3
+                     Output: t3.i
+         ->  Materialize
+               Output: t1.i, t2.i
+               ->  Merge Join
+                     Output: t1.i, t2.i
+                     Merge Cond: (t1.i = t2.i)
+                     SemiJoin Filter Created Based on: (t1.i = t2.i)
+                     SemiJoin Estimated Filtering Rate: -0.0100
+                     ->  Sort
+                           Output: t1.i
+                           Sort Key: t1.i
+                           ->  Seq Scan on public.t1
+                                 Output: t1.i
+                     ->  Sort
+                           Output: t2.i
+                           Sort Key: t2.i
+                           ->  Seq Scan on public.t2
+                                 Output: t2.i
+(28 rows)
+
+SELECT COUNT(*) FROM t1 JOIN t2 ON t1.i = t2.i JOIN t3 ON t2.i = t3.i;
+ count  
+--------
+ 100000
+(1 row)
+
+DROP TABLE t1;
+DROP TABLE t2;
+DROP TABLE t3;
+RESET enable_mergejoin;
+RESET enable_memoize;
+RESET enable_mergejoin;
+RESET enable_mergejoin_semijoin_filter;
+RESET force_mergejoin_semijoin_filter;
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index 9f644a0c1b..6bde05b8cb 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -98,6 +98,7 @@ test: rules psql psql_crosstab amutils stats_ext collate.linux.utf8
 test: select_parallel
 test: write_parallel
 test: vacuum_parallel
+test: mergejoin_semijoinfilter
 
 # no relation related tests can be put in this group
 test: publication subscription
diff --git a/src/test/regress/sql/mergejoin_semijoinfilter.sql b/src/test/regress/sql/mergejoin_semijoinfilter.sql
new file mode 100644
index 0000000000..6c6418f71a
--- /dev/null
+++ b/src/test/regress/sql/mergejoin_semijoinfilter.sql
@@ -0,0 +1,62 @@
+SET enable_hashjoin = OFF;
+SET enable_nestloop = OFF;
+SET enable_mergejoin = ON;
+SET enable_mergejoin_semijoin_filter = ON;
+SET force_mergejoin_semijoin_filter = ON;
+
+CREATE TABLE t1 (
+  i integer,
+  j integer
+);
+
+CREATE TABLE t2 (
+  i integer,
+  k integer
+);
+
+CREATE TABLE t3 (
+  i integer,
+  m integer
+);
+
+INSERT INTO t1 (i, j)
+  SELECT
+    generate_series(1,100000) AS i,
+    generate_series(1,100000) AS j;
+
+INSERT INTO t2 (i, k)
+  SELECT
+    generate_series(1,100000) AS i,
+    generate_series(1,100000) AS k;
+
+INSERT INTO t3 (i, m)
+  SELECT
+    generate_series(1,100000) AS i,
+    generate_series(1,100000) AS m;
+
+-- Semijoin filter is not used when force_mergejoin_semijoin_filter is OFF.
+SET force_mergejoin_semijoin_filter = OFF;
+EXPLAIN (VERBOSE, COSTS OFF) SELECT COUNT(*) FROM t1 JOIN t2 ON t1.i = t2.i;
+SET force_mergejoin_semijoin_filter = ON;
+
+-- One level of inner mergejoin: push semi-join filter to outer scan.
+EXPLAIN (VERBOSE, COSTS OFF) SELECT COUNT(*) FROM t1 JOIN t2 ON t1.i = t2.i;
+SELECT COUNT(*) FROM t1 JOIN t2 ON t1.i = t2.i;
+
+-- Push semijoin filter through SORT node.
+EXPLAIN (VERBOSE, COSTS OFF) SELECT COUNT(*) FROM (SELECT DISTINCT t1.i FROM t1 ORDER BY t1.i) x JOIN t2 ON x.i = t2.i;
+SELECT COUNT(*) FROM (SELECT DISTINCT t1.i FROM t1 ORDER BY t1.i) x JOIN t2 ON x.i = t2.i;
+
+-- Two levels of MergeJoin
+EXPLAIN (VERBOSE, COSTS OFF) SELECT COUNT(*) FROM t1 JOIN t2 ON t1.i = t2.i JOIN t3 ON t2.i = t3.i;
+SELECT COUNT(*) FROM t1 JOIN t2 ON t1.i = t2.i JOIN t3 ON t2.i = t3.i;
+
+DROP TABLE t1;
+DROP TABLE t2;
+DROP TABLE t3;
+
+RESET enable_mergejoin;
+RESET enable_memoize;
+RESET enable_mergejoin;
+RESET enable_mergejoin_semijoin_filter;
+RESET force_mergejoin_semijoin_filter;
-- 
2.37.1

semijoin_filter_performance.sqlapplication/octet-stream; name=semijoin_filter_performance.sqlDownload

theoretical_filtering_rate.pngimage/png; name=theoretical_filtering_rate.pngDownload

�PNG


IHDR�,���xsRGB���xeXIfMM*>F(�iN������,��g	pHYsg��R@IDATx�������A����`�XK��1j�~c��{5��c����DSL�4s�G�1�K4��Q�X�`Q�����7��u��{�����<�a����g����3����a�-Y � � � � � � ��:��i��nv	�@@@@@@���wjY�G��
��� � � � � � �@S
L�0!�n�|=z4�4�`� � � � � � �t��wH��tW�#� � � � � � �G� Z #� � � � � �4�A���s�@@@@@@ �A�<@F@@@@@h>�h���\1 � � � � � �@�hy�8� � � � � � �|�����b@@@@@@�<��q@@@@@@���5_�s� � � � � � �y���0 � � � � � �@�	Dk�>��@@@@@@�D��a@@@@@@�� ��|}�#� � � � � �� ���� � � � � � ��'@����+F@@@@@�#@-�@@@@@@�O� Z��9W� � � � � � �G� Z #� � � � � �4�A���s�@@@@@@ �A�<@F@@@@@h>�h���\1 � � � � � �@�hy�8� � � � � � �|�����b@@@@@@�<��q@@@@@@���5_�s� � � � � � �y���0 � � � � � �@�	Dk�>��@@@@@@�D��a@@@@@@�� ��|}�#� � � � � �� ���� � � � � � ��'@����+F@@@@@�#@-�@@@@@@�O� Z��9W� � � � � � �G� Z #� � � � � �4�A���s�@@@@@@ �A�<@F@@@@@h>�h���\1 � � � � � �@�hy�8� � � � � � �|�����b@@@@@@�<��q@@@@@@���5_�s� � � �@���/w}�Q���X?��w�����!Z� � P�����h � �	K�,q�4Cj�k������{�)���N>�d7g��zk^]����m-�.�5��-in�l�OSe@@��@��q\ � �@�
|�����ot�-�y	�������[�������7�����n��C7v�X��[��e������{�-�}C�u;���k�k-�"�P�T�+V�V�Z����J*L���R�������������%�5�����K������^���M��]�������N;����wVjB@�W� Z��=W� ����o��x����r�!���6������i��n���n���o���������W`������ ��:y������s����A�F��\�S�c������|���M�.�wY�|Z���j!�I�,��R�_�v��+_�<����S�N����sp@���:^�TT��� ��A� Z�@@ ��A��Yg����f�R#�t���/����A�o�������x�;����6m�S�I��9m���n�m�m�����G#�����R#]k��������%�]��w4	8�]�����'��j�����|W1�NT��R�_�&%V����b��7v�~������l��r�Jw����`��#�I���)M�)� ��"@�=�"� ��U`��ww�{lY�l����-�������#G������>���d��s�E�������??k��Z�^D
��_�������i������v�=�,�w��k6K[&�Vz_)��t�����n�e���������j��/�"��z�)�����lY�:�@�w����@�� � �@��n��~�IS��/�k��y(����6}��'����u�R��k�y����G}Xl�Wo������&j�S�{"�a)�����g��m��-��-�7E�YK�,��oP�us�I����������?��@�N���Bl�^{m���~�i��������y�r������r6�� � �@��V%hN� � P>��/��O[��kWw�g���_?����z��t�M�!��������o�����kn����c��n�
6��"}��n�5����+o�����_���|�M��{��Q`*�Qz�L�.]|������������G�a5~��?���C���|���h�\r�o���R��y�=�\�����v��o?��#��;�p�;wv?�����Q��V�^�;���;����O�>~��=��������"n����^M9z�h����:�_�_���/���e�]�G���TP���5��'�|�=��C�N��]w]���|�
>��L���}���/�m�wN��lu$���KX������'�u�u�Q��h����������F������M_~�e�/���.]�{zU��y5&����?����>o��O!�DZ�R�a�p������	=��w��.�m��+�mjK���������g'L��t/,X��7_�M�T�^�:���&��XS���>����D�{"W��:��6*�}�]w�g�y���3��������1�������Yjt�����KI���ks?U���\+e@@�jD��2�@@(���#
�(]{������@A�k���-Z���a��[K�\���+��c,�F(8��p*x�Q>�n����@
t����>(��K/E�/{��w���s}�
\%���C�w�y���C��_���?
P(�%�kI��9T��^a�T^W\q�7���?����S�Lq�F�r��u��(���F/(w�m�E�����G�:���3�Uj#����U�S��0��=�Om�Qx��w���(.Q��5k����z��%^����
�)��� �����+u����}�Q��a^�����A�e}_UX��
>�{��[�wO�5,�=�����M�<����k�.X*�m�����r�-��������h���~7.��2�Up��)�w5�]����b����N��'r�?����F�s���>�����M�O�����a��4�-��_�~�{?U��2� � PG���3h
 � �@azX���z���+��I�&������H{P�F�K%�+�������OS]�=���h���(=���d.���ht��7�pC81b��f�m|����#r4����Q�w���6
Rh��FY
G����������e4��F�����n�5�p����Y�kVM�5�!�t�����Q.
�����mS�Q#b��u���j��4�O���{�t~��F{�U#w��=[��>=���[l57��U��~���>�l�2����!=����� �z���G�����4���SNq�l������}k���L�G#��
�'�������5����������bS��e��7�#H7�h#�8n�8�����^�z�c��O����Y8�K���uOb���Q�n�iT��I�����U���~�KI
�i���o��6s={�����������X6@@�: �V�@@@Z�y����phT��^=�T�
(���?�y5JLA
�*z����>���/Gu)8�����g����r����AM��QV
���?��O�!�������u������T����;���}��K#���_#��M
~�uj*�c�=6�B�#�4���(���/}��H6+p�I'eL�� ��V��'��� ����SO��nT����6m�?�`���[��V4u�F�Y�P�0���~����C��FO�A~<�����J���Q)i��{y��v�����w�g�?6b���?��i
�y+�~�����I��r}G��M�y<��h:W��/|���~<���>k4m�����p�i5"2))H��Q�T��^�>KjW�{")o�}�~��V@_i���r��RRpK��
������oq�I�aP<�hW�m��F����T��T�~���m@@�V����5�@@����_��?qNO%$=,����4�L�>�4�J������+=`����������?zHm��V���S�N��5%��[��i�w��x������+��QyJ����-_LR��]�����q��mi�%�
��t�	'd���>|��F#�~���=������I{�����-�� ��*L6BF������=��{���(����b�������XO
��m��KMi~Sv�a�64k�����)+�ljMM�h4\5R��,�=Q�5�=����dJ
z�I��������HT�r+6M�>�izF����t_�;���uV�~*W?���6 � PKF��R�s#� �dh�Dm��u}O
��!����DK���L�-�wQi*�x��mK�^*�,i���l_�-�n��Qa&l������k��Ak<iZB��������`��+�g4a|�C�KN}��z������OZ�����S�H������=uQ�����{�����S�G��1c����r����'���>��#�c���m���f;����M�o�F.)����oX���)�`}5R��,�=Q�5�=�9i���\��%����kx��u�&k������u*���j4s!���S����v�@��A�jIs@@���o�yQSZ��?_��W���^�����
�Mo�}z��>��M��S������N_�%{O���#����[M��zOz��+i���
FX�Q��~/�~�u.�s�=�`���� Z8����I��]r�%���e��_����O#'5�[�G��m���{"�a9�a.�R��%�U���������}��Q4��=���ia��Qj�����d�{����|��@���}�v�8<U@_�bT0]i�����$�7���}������{P�y����6�-�sT�~*W?Y;Y"� �� @�z�6 � ���s�W���U8u�o���~��i��L*Y0M���P;r�p$B����r������/��4���
\�=I��4�D5%���JS����g��;�p�=����S��^-m_p��z�6T�xZ�j��i�V�4�)�>�����K���
��wL���4����Z�Y��
��~��>P�~��=}f����Vs�1���Oy�Yg����~t����7?%o��=3���������� � D0XE@h,Ma����I����?�o���n����������n�9���-���N���7��#[�u�Y':����%�C�Rhc��}�s��s����C�������\���J�{��^#%M��>��i���G����p��s��R��7�pw�uWY�����LkX�{8m��aS�o��[��_���4zIS����>~�>�g��i~��r�����e�����i���O�7j$��&���wj�]��H�m���jw�^v�aQ������" V@@���	��� � �@C\w�u�{`N=���]R���/D�>�S����Kz�MR=�;��Y
��Af�����������o��FAY*��9���a�s�b� ����';},��r�������9r�=z�S@NIA�zLi
�q�m[�9�w4<w!�)����uv��)<�MSog#mg3
���������f?����wW�o~����/t���/}���m�;A-)��<.�������X"� �I��T�� � P�z���O?����v��������?>j���f�5��M��Q?zwL!I�F4m���!�T��W^i�N6�;�h������\A4�CG�����x��w���O�>���f���dOz7����n�l����,M���~)n�
7tm���:u�T��C�uM�6h���*4�1���n��~*?m[0����2�a5���m3�R�����,\��N����6ZVy����]
�����>�L8��}���G��h�������J��M������y��{���K���k?�K@@ I� Z�
�@@j"�����>����2{����K���_���)������{����v�m���s���H��|�k_�U�z�9{�l?���7��������~��>�����Q#,]r�%����w3f���+������?��R����M��Qz����������]S�r%��M@�����Gy����;���c:��G���o~����N~�&L����e��z���&R[�4�K�Z����3M����������{K�{���~�������4��MW�����o���4�������pK����bW�[���z��q�~�.���h�@k[�.1-��j�����T���s�u�F�r��r����H���~��o�+���^O�c�z��z?��~�
	@�&�;����@@����1��1����;��R7�t���>�`7`��(��'���������W�>O�?x�`��^{�I�&��`7�p��')Y�I�F��^�4X
`��=X���}���h���U��_����2���
�%%�������;}�N:�$�H���O�2�=����<��v��'����p�d���Krm�{��=�����l[K�t/�j�h�YXF�$9���H]��5��=��m.�;Z�o�z���4r��g��S�^v�e}��[�hJ��
�Q�i)}V+��@�����'6A���F4kjV���I������4}��^k��\��S{��r�S ��_F������@@��P,|�X�F�/�T��z@y���gQ�����W�4�;����{�����_�w�q�A��������AM�g��v���lzGU��~�<0c�.�]�M�hSKv���o�4�`��������]Q�`�:t]g�y��{w�&��u��3�p��0����Y�xY;^����2[9;����5��x�0ox.����4�KXw�u=�V`����-���6���(�Ckkc@2d�;�����?L��zm��;�������I�+�a��akw���l��Q9�7����S�B��������/��G�.��+�����5g�������|O���:�
1U�4}����W)W�\�ZK^>���A��������~��h�s�9�}�+_��W>��{���������p=,��
�)�w#��������d?�~ � P�z��$j�����a�����	@@��+J���,z���4���]:6�b�O��`QA9�p���|I9���Q0��i�8�4�j��W���+W�w]�m��/[���Ou�<�kI�-Y��?�U;r%Y�:�^�d{��:
�����j�S�S}�� _�
i��)&�l�V�>�Q�{I�H{���&-����;��22�}d�j�~���g��,��o�-��HkX�{��.���\��BS�����F�'�]V������|?B���������g��������2�i�O���Y����/�1kW��`��1~��@K�o���@�w��]��.�X�Y�����Y��	��a�{?Y9-+�Oa��#� ����Mh�����j��@@@��������5�F�%�k��6zo�Fn��&I��� � PE�0��t�U��T � � � �����;��k���M�:�-\�����5k�������Yg��]I � ��%��%��>Z� � � �4��a�����;M�����O�E���g�uV�i]���@@���*o�@@@@����n;7v�Xw����)S���/�����Zk�w�x��N#�H � ��'@����!� � � �@;�l������4�|�l�2��sg���k3���/�� ��_� Z��c�@@@j,��O�	@@�q:6NSi) � � � � � �� �Vg�� � � � � � �@���h* � � � � � �@u�U��� � � � � � �4�A��,�� � � � � � P�h�q�, � � � � � �
$@��:��"� � � � � �TG� Zu�9 � � P������>�(U]��Mu�2R�g���>���2��|����\��O�+F@@�s�N���� � ��"�d���=z�K�������r�����u�]��g���x�v4�]Z�{����t�M�C���K/u���/��R�|�2g|����5�\�hV�����N>�d�dd��s�}�c�:[��iYE@@��
Dk�]��!� ������n��F�h�����n���������|����������c��-��"g�Z�kZ�xq�M�3g�����V�\�8�7d����H*Pm�W_}��{��IM��o���n��w��'���+��U�V9�[��R�s�r��;w��������~�Q������]�������w�y������}��y����2e��6m��>}����O���o�?�?�Sh�5(����������!���t�RP)v�w�y�����n��/�u����\�r���u@@��A����\ � �@]�������M�gA����>�%o��v���^T�
&<���>����
�U�N��<yr�+M>��Zk�
�%��>��w�}Qm��N#�|]�tiQ<a���r�:�����?/��"?�2<�{O<��{����������������q�2�+xu���f���q�������?:�{���r���Z���~%���
 � �
 ���=4�E�D@@�U@�0K�{�v�����j�h��;��G�(�2p��h?+��m�A�n�m��i���K:�r��
 +������N���6�h�����
�]}��NSB�����.��B��{��b�2s�vpk���{����~��Q�?�����|�k[<�?�u�QyG��=p�O�o��U���D���`@@�z �V��C�@@����{�Y���C=��9�O����p1��S�,)Hv�9��Q�
���U���w�>��_��k��7��z������R5L�|���G@�i�����z��a���ga��=��Sn�}��We�q��{��'�~r�I�&9�rQ��4jR���_�:���>@@�U�c�6�v!� � �����!��A|����L�JuS�����(!5V��S,Y�$����v��Z�-�-��$[}�_L?��(YlZ�`��)���|�O��e�$���w��F����;�N<����L+6�A��~8oqM��k���\��{?W]v��:��_�?\����w�8-���5����&�\,�^�#� ��`$Z:7J!� � P1�Gy��q��s���?�����J�_~����]��3�8�����m����n��b�?�������7��~�����^s3g�t;vtl���.����vk��fF]����O��z��S�G�Q�A���X��-.\����N�o�������m���n�=�p{��w�sg�S��'zWy�=�=�����V����|���e�]�G�}�T^�/������4���>J�s]�F�-�����������W���q��n���+��x���ME��b�Q����'>���dJ�{�LI>��e������������M����3%9�����i��rK��Y�g/�����;YhZo���6�l�^y��w����.[]���Y~S�L|��7�P�e�{�M�wRg1�k���w1�S�����g���`�~�.�����6�����~���c��?�c�DK�N��os-���}�����Q��T����:��#���
�o��W#��y��p@��@@��D��9gD@@ �����|,��������T��k����"=x���k��E��;��wn)�v��Wf���H=����g��}�2�����v7�|���/�@_�
��Q���^x�]q��z�b
����S�L�#��`Q6;��w\��x�b����������z��3�~��q�-������mj�M;x�e���e)��-��h��/�_�4�a��=�o�?�}���`G����X��0�^U�X�}��W/�L\j����C���i+�I��S�����D�&c�	������� ��5-��}�l�j��������\�~�2���������U��������M�<�N����m�����n�9�N����G}4���4�
��IQ��R8X��[&�W�_��n��AY���a@@�2�*�J� � ��]@=��m�b��s{���?��]��X#�4�������~�����
� ^ul���~[��4�D�v�����E]M]�@N@�{��G�(x��A�Jj�/��h�E��j��wvk���{��W��1=�V;�:��c�j�����%�������}PDAE��D)�dY2[�h��7��R�Qz�����q�|;tm#F��0R�v�m7����z��QzX���������y����>�Uo�~�}�Q��_�B������/�SiTY�Z�6�;f�!����o��aEA�p;k�������S@P�c��
��I~���t�k�[=�4�o%����h5�?>��?U����v��O���Q�vOj��Q�M��{��6A4����T�t��h�SN9�m��&�4U�~���MB@��A���sf@@�&P�I���EzS����z������g�����j:8{����wHQ@B��3����F�?[o���i4��Q����s�=�-cD������
������+_��;��c������~�>M��<��w��������2�r�I'eL���"5��O<Q� ���<�@�~���8z����i����II#����oF�T����i�4zK�e��c���>�Uo�~<���|�2PM#1��zr���cUf)�h�l��|�&)��f
zj���aR@TA%}��-�������wQLp����9]�v��Q#H��A4�Q����+��~
� WR0t��7���g��~G���������}����$!� ��N�c�N��@@h.�BLSw�?��5��*z��F���?��QIA�SO=5z���>�����F�Y�����r,i����ij?%��9r�_���bw�y�Ud���+���K#3�:�(;m���e��H�:ujt�����;.#��r�eI�U*�v�a�64;��o������N8!#��|[m�Ut��������'���V?�jC�cUf)����NAK
��I��
��Y�vh�?��4����"���,�����.*��@+��zG��8T�L#�,����
�i�P� V���~�����z7���a"�j�� ��F����_��sV@@hP�������p?~�w��j$C�^3`��l�����p�.����;�i5����w�C`%M��� W��d��S��6�}@JC��x�V���,��i�.d� V<i����#���*��@��C��@N��v�m��������R��$W����\mHsL�Ka����2�b/��a�r-��t�J�n0�.jd���
�Z�%W}�<V��
�W��R��Gy�U��,����
r�j�#RI�`�F8}c���5K�2��
U�m��1~D���Z>� � ��`^���sf@@�&�OM����N��~����k����a��,L���n���g�;������q��y��u�Y'Z���r[����.��o�W�Qc
�i�j<��;�4e�F!����:��k�CK����>�s$-k��Im*d_���CR9�h���gO[-z9|����Xyv��'�:4����z����_������|�`h�Vj�D���{}���Q����H3%�,X�k8�(jA��]�:M��Q�F�K.��-[���]����/�;-5JXS��
���`@@ �A��v�D@@�f� �F,����Z
�+�������6KMO������z�k)����Ur�;�����W�M��[�/���h�����4G���O}�bm��'����Xh-������C�:��P��N�4�s�1NAM�LS����R��R��o���t����c6
M�aM�����M���o�8M���Qw��5MO������q��=��t��O��\pA��z���@h���{s} � ��N@S��r�-��Q3�������h�~����v��}��_��s����':���L��7�����v���-=��C�_��W>����4=�F�����M����m��[
G����2�+�>��n7j?�A4�#p���/+Z�����?�E�h���}�Q?R�8��X������zX6j���n�Q�N;����~��7�w�m��f>��)p��J%������� �}�v�v�i��c�������{��� �]w�����2l#� �TG�w�U��� � � P6������VR���z��T�JO<��{�����	�#z[h
�h�{��)�`P���w�����GUj��l�F�������-[]���m���8�O�f�b�w�%���Iy�/�}��>�o�~�H#K�=f���H#{���o���}a�\���
�+x��c���
7��m�����zZi��-G��@�3�<�l$���p�
6p���=d�c�����6r�H7z���=x
��@@j'@�v��@@��������~���C[�S�������{�,@��m
�����l���g+��n����[�g��FZX�������%���p���)�g�B�9s������k4q�D�[��@F#%�Y
�R�>M������������7���Y�=�W�����|���z���'��l�~T���,�<yr��G����^m��c�m����f��~��WlUU����[�v�����v��w�i?h���_���0��G���Y#��|�r���=���T�hG � �@�����3#� � P����K���_��h���?��q�=�����=���u,}�k_�U������w�}N�1M/�)5J����v��~��e�����v&Lp�f�����?��tv�?�r��q~t�(+H�mK���Fk}��v�)Pt��w:M������wY���C��wFE����FE{e��SO����3���_�x���x�=����n��f�-M	�����+}1��=��U����O��[�����=g��������i��*L{��g�y�W����F�h������C����]Q�$��z421lG��+U��7�����z����z����K��A��Z�*�c��{��{�������UAt����W��Jo��_� � �@mx'Zm�9+ � �E�t�M��v���Du�|��������)@5t�P����4Zf��I�a�
7���IJ�@X��P_��������*���P�h�B�i����B(&���v�m�]��S�Lqz?�*�v�m����ec��ww����F���J
�hZ��.�,�4�,��2�`��{��Q�a�	��^�H�)�}��~����GM7z�Eem�%�\����������)h���
�%}_�Zk-w�y��<���S�^���[���}
��>��T��-���V@,��s���v�:}�xEA6�>�S���@Z�����}�PU���F�r�!��@@��D�:�D@h�P������M�c�`VZ����+xr���g�h.{��Qf�ei��Q�;����2.ij0��:����)UN�$�����0_�����~��h��������Tg������9���&��x��g��N;-���2���U��8��6A�$;+��X<O�}e��K+o��q�����C������/v
�(�m���%��n���Q�F0�����z���������,�}bu'-K�G�gv�L:G�}���z���~�,
����8������h������GU���a�l�a[��G����.����KJV�����Q^;��x�V�X)���=j���I�j����������9Ki����s��x�L�l4�������fuh�����aM�:w5%	@@�v�\[�O|������� � ����{���G��*���%K����~����)����II��T^��)O*���S�
���[�2
�)���Y���^�W;u�\��>����5Y(?���/u�j��?�C�lv�3�1;���wzMV.�T S}��]�������M��O��Q>M��>���8o�<_V#���m*���L����f[���C��F��3�v�r�W?��S�+)R�yt-�/�}j��t�;��}����{��[y�����W�������o��X}�~������;,g��w�o��c�������T��K������_�6}��u@@��
���>�����*kM� � � � � � �
"���ij�\�D@@@@@@�R�*%K� � � � � � �
+@�a���#� � � � � �TJ� Z�d�@@@@@@�a�5l��p@@@@@@�J	D��,�"� � � � � �4�A���:� � � � � � P)�h���^@@@@@@�����-�B�W�X�{�1��3����g��K���}�������
80�K�,q7�t�[�lY�q��:��v[��X.Z��M�2�M�6�M�>�}��'n��7���C��^�ze�����T���v~� � � � � � �@{ ���_z�%�������,
4���kn��In�����c�	�u����f|G����h�������������Qd�����'�p>��;��s}@/�����(�l��@@@@@@�� �t�Yz��{��h���s����2d����[T����v/��B�m+�~������[�m��������:�g+���w^xa@S��K_��;�������9s���c��~h�\�r����QXA@@�,�@IDAT@@@��#��tf�.]|��c�u�v:t�95b�G?����0��z�m���~=����:�����I�����p����?��N;���>;:�i
������>pO=���g�}|���T������� � � � � ��3F�e��Q�F�K.������QKY�^{mw�GD��~��h�+�&R�{��Nm����)�w��'j�'���R�r*_JY;?K@@@@@@�=	D���
Xu����,������P��i��E#���v[��G�6�l����w��~����|�r���*)�l���@@@@@h'�Q�vrq���DU���+Z/ue��YQ
�%%�L�h����U�V�����TI)e���>@@@@@@�=�N��8e�����[o�'�<����g��~T[��}������`�;wn�}��E�>������Z�x�K[N��R6j+ � � � � � �@; �Vd�����n���Q��C�F�I+<�@��z���G���k��w�iT��n���j�exl��%~4�e
��>[��TN)�9�N� � � � � � �@{ �Vd�^��n������E��f���A���w��4�LK%M�t�R��@����������C=���?Ufi��W��6��]m}�Q�r�8�9�4��;&L�P���@@@@@U`��ak:A�"h5���r����;���K��eW_}u�c3g�t��~{4��/���y���l���S<._��My���j����������b��L�s����;v�#�V
i�� � � � � �@c	�?�U*�F��{����w��� ����m�����l����������C���/�O>������Qm�5�������������~�o-m9�/���W3U�KP�k�\ � � � � � P�@5�D+��f���~��_�U�V����z�4hP%�f�T�{���SMi��YQ�BZ��R�^�
���������Ws9z���E��y�@@@@@J>|x�g��XZ�iM������-[��_��G���{��.�k������������G;�~�m�C�.D+��*I{N��A@@@@@��A���{���/��-^���:�������Da�4�����fi����U��s�E�����s�;���wm����K�..m9URJ��]�#� � � � � ��I� Z���?�������J#F�p'�pB�������O�]���o|�����o�y���g��c����1[���[l����^~=m9.�l�V@@@@@@�v&�;��t���^�V[m5�c=�Pbn�6�tSL�)���?ttC��A1M��h�"7u�T7n�8�b�
�w��aQ9�x�=�t��������+|o�m�q�������O<������
2�����TA)e��� � � � � ��#�h	��|��(����|������r���q�������
��Li�����7�Q����Njsd���w��q���n�����nh�g���r��w�����-�������D@@@@@��A����{�4�k��	G��Zc�5���v�1��I�&��3gF��������;�0?��na��;��N;�����~�������j��[n��;�8��o�h�V��+�lF#�@@@@@@��A���T@JS)�I*;r�H�Y�d��7o������m����=��~��:�(�Y�t�S M�u�Y'[�?m9.�l�Fq@@@@@h@�h��=z8}JI��ww[l�E�U�-��R���R@@@@@�C��u�&�� � � � � � �@M�����#� � � � � ���A�z��� � � � � � PS�h5��� � � � � � ��(@�{�6!� � � � � ��T� ZM�99 � � � � � �@=
D��^�M � � � � � �5 �VS~N� � � � � � P����Wh � � � � � �@M�����#� � � � � ���A�z��� � � � � � PS�h5��� � � � � � ��(@�{�6!� � � � � ��T� ZM�99 � � � � � �@=
D��^�M � � � � � �5 �VS~N� � � � � � P����Wh � � � � � �@M�����#� � � � � ���A�z��� � � � � � PS�h5��� � � � � � ��(@�{�6!� � � � � ��T� ZM�99 � � � � � �@=
D��^�M � � � � � �5 �VS~N� � � � � � P����Wh � � � � � �@M�����#� � � � � ���A�z��� � � � � � PS�h5��� � � � � � ��(@�{�6!� � � � � ��T� ZM�99 � � � � � �@=
D��^�M � � � � � �5 �VS~N� � � � � � P����Wh � � � � � �@M�����#� � � � � ���A�z��� � � � � � PS�h5��� � � � � � ��(@�{�6!� � � � � ��T� ZM�99 � � � � � �@=
D��^�M � � � � � �5 �VS~N� � � � � � P����Wh � � � � � �@M�����#P��{~�s;|��1�
�� � � � � ���A���Z�@�,SM��0�9�5���a@@@@@�& ��D���6��h
��JI�\�9� � � � � �@; ��N;��B��@��(4��g� � � � � �@; ��;�KB���4�!a � � � � ��:�:�1h'���wb�zs��>�K��@@@@@h"�hM��\j(V�4�
��c�,�&F�X�	@@@@@��
Dk��e!�!P����Tq<�FP-��
@@@@@�� ���}� P����?~V".�YPM%t%k��� � � � � �����h4)6�����S4f+�u����v���P���iz����@B@@@@@��5@'�D�&��x�f�I��d��f:���6"�.F���� � � � � �@��U	�� P7
�������;������|����\K����j��8� � � � � P�h5@����@<p��Qa`My�f����,��h�|=�q@@@@@�

D� .U#���A5]lR0,-��l�h����C@@@@H!@-E@ �@X�i g�i��h�,h�F@@@@�G�h��+�	��$`�4��j�Vzvn� � � � � �(@�@(�!�@�����f�i����l���e��A@@@@h���o�re$
|���|�UNKK����S�d�4k@9G�Y]I�V#�f�,@@@@@b�b l"�������O����r�7?\��0�e�cZ�Q��j:����l�� � � � � ����������@Y�+_|�i�ni����[6�L�l�Y�L��Ium<��t�+�>�!� � � � �4�A���B.���F�%�\�H[�r >"��&;gd�o��gd��F����1�g*u��j���-u�^���o�k��_@@@@@�v(@�v*��@\ |�Y�X��x��v�r�?��n����:�^��[��Rka]Xc�Z�]D@@@@@�Q�5JO�NJX9���KkZ9Rt��;��n�#����?�f,VjPM�Y0���V3e� � � � � �@�Dk�.��/�����a�-\/�u��Z��_�%w�s=[k���o�+��,@��,��h�V[�E@@@@L� Z�u�E �@<����-�5-�y*y,t�o�>�N��6��N�f�l]�<�;{��L�e8Z-i�\��8� � � � � P�hUa�$�^���o�e������-��7x�O	����{�:�����A�v���6�����,q�Z>�S�h5�TK�JA�A@���'\�����S}H � � � ��'@����#�J@q���
��G����5"<O<�����0o�zku|���S/{��
�u���ot�����e���F�}�|=�������5�}����?����~���
(���H!�l��'_�������@@@@��D�
3'A�>����[����l����B�
ka�������v�����{8}�,�V��j�u���iM��I�\v���[l�e�X�����2����3*c2���|��=n���d��o,��4G -��6 � � � P����he�`Y�0�)a���B@�kS�0����_�9��X0���4������o�^_�7��kF����������7�������k]�gz�riV��b���%��gx,�z�r�����jS�cKZiz�d������0_��9� � � � P~�h�7�FB�=?��_[|;_����������+9ZM��`Z7�:�d8Z�za{
Y�%\/�����a0������v<\f5i�eTOGs��k�?�2m�R���M����b����9\�UM�}	�G*n��r��1@@@@������L |��>O
fh���-��|V������9mV���|��V��{=��p������f�O���YL��u���Z�9����#<�k=�����c�C���niSx�������ThS�p������ � � � �@�	Dk��z@����p����x��O��+�vV<��)�-O%F�e;W9���>ny�G�81u �R �H���JU��S�,���w+��l��+�^� � � � ��&@��z��"�@�h}(=$�[F�9}�N�|�uOQ���h�|
���D��s��
d����RF�UK6��p=��C�0��2�v�/W}C@@@�A� Z=�m@r	���>a
ko�
�������n�����f�OVm�>u[Fu��!�p=*\�J���+�9d+P mp$�x*�-m9N��l����e;n�������	�;�wJ���0��Q�����z������p;���>�!� � � �� �VnQ�C�!�6�z��-K���$���-��N���z���N��
�����><��%����Z���h��@Bh�o=m�R���M���c������v��G)t���,��u��#e��|Y����a[��|
4��}n[�|uq@@@�%@-��@�F�`�������t���F�YPM�
z���p���k>D_|�i��F�V�3�u���h;�J��=_���b�'^���-���W�l#��ZZ����s�~��@����a�\���1k�-iWh���0O!u�@@@��A����\!4�@�r`�1�U���jQPM�%��|�g+�����
H�K
����|y9�@�,Xw��������;\���o[��@�����'^�^��������Y�����a>;�@@@��A����\ P�@<�U��jj��6���0�W@K���\�-`Q@udA���T��������T6��Ro-��m
�s�)4m*n��r��1@@@��A���sv�.0g����/:--
4��CjB�0�e�cZ!J��Z,�fK�O�VS���n�������9!;@ �@
�sU��@����a�\�������z���^a�Me��0_���9�6.�pU�,�
��@@@��Dk�=��!L�:��O��������l7�@<��V+5��e�����o��u��S����Q/��6�u���6+ ��aP(\�u5a`*��L���U_=���k����6+c�,������g������u�� � � �@{ ��^{��B &`�������~�H� aC�h5�`X�A5�g�4[~6Zm��q���)�u���.�-��s��ot�������B�v-u��|m��0��2����WW������v���f�O���Yd�u�9i9�8� � � ������D$�4-I��g�q�N:�>�m����(��u������2t����~����D���p���<LF����08�T���<��Y�<��w���s��~lF�5�����l � � � �`���h.i����+?�|�i$����"Z���������e�oP�;��������s}����+ ?Y@�B�@Q���|��@����|v�Z�U-'�
��z�U�;M��+��M!���n�r�� � � ��,@��{��!P&��s���&�A�p�9+�q�(M]|�_Z�v�������+~�����ro??�W���.ni�!�{�����X+�� �J ���*ja�Me��0_���+4��zp���]����<,��|�n���e? � � �� �V+y��@���W��q*�X0�-n�S��j�?~������=��X���M����@�:H�z�&��0��2�v���+f$��)���b��:��-��<�n����@@@�J	D��,�"PG�������>H��{��s�u��y������-��9��4��=n�V+�� Ps�0���k��_wZ����F�2���
�d��e�"v�`S>[�c���@@@�b��E^X`��An���y��G�>��7#���n����W���������t"k��b�#0u����:S���t���{@t6�-_�+*P�;�-��F�k�5�c�fa�Y"� � � �$@-I�}�C��n�9s�d�:����@��>9�5���!��������<����4
���Ok@��0����� �@
�) �+�<��KV������X�:*u\����YpMm�u;V��Q/ � � �@�D��>���M`��~4Z��4hec��2���
7���O������c����p���k\�p�������
9���6���j�ZrVoF`m�1�TC@�n�<�*?m��+�����
Z���m*k�a�,�Y�|�<�W�������e�g6��u;��,�@@@W� Z��-G ����i�3-%�*,�g�sn����8"-�����P�f�������)n���n��OU$���{�.}�o��k$@��,X�*�:m�����l��e+[��v~[&�����k�c�v,��@@@�W� Z��
-C���*�K�ehi�2*��;��lz��me_�l�)-�iM
/�9������+%��|�x�H5}��i��u�f�H�h�z	�X;l���Y`S>[�c��Vr������e�g6��u;��,�@@@�+@����
@ ��g����j��5�ijSM��a�D�d�7�����u���6�@vpPk��_�d_*@,�d�$dYpMyl��%���>;�-�����l�g�v,[Y�#� � ��G� Zy�@�4��s���X���s����~_�i �ly���g#��i ����H �@Y,�d�l�[ �l�g�v,[�J����2�|v}\S[�cI��� � � P�A����� ���:����W��Z��f�V�i ejSA*��k�V�M��.�@��h�e��)�e�5���\.�W�������e�g6��u;��,�@@@�f ������#���<�v��������j�4�������,X�x�?'��Z����M��M�
�m�n�j�~;�-��a�g�5��u;�T�} � � �@{ ���{�kC(���A��i X���]����s���jQ0��j���Z���
@�rX������@������X����o��e�s��Y�M�l��e+[�~�k���2F�u�
�	@@��A�J�R/ �@�d��r�������x`-����5�=��"�@�
X���IWb�,�)�����r��g��e�s��Y�M�l��e+��l��N�x��k�|U�����l#� � ��D+D�< �!�yPM�4b-3��f�����_-q�0��I���	�@�!,�d�l��@������X����o��e����,��<�n�,P�T��-��4G �4X"� � �@9��S��@�"�Xt��5u�T7�e-�V���Y`��jYz�� �@�	X�����@�,�Y������Z��mY�y4�c>�R��, � �4�A���w�@�
�(5��G��my��ks����uy�-k`���yS�A�I@��|&dYpM>�n��m����
<���@�?��;~�6�-#S�F|�\<?� � � �<�����R@�:���`��j/��4u����������-m��jf�Z����4���l���P������V�����T�n�oO|;�?�v�%x�����r��% � �4�A���'Z� ����V�7g���jS����5[��,�h5����H �1
�2v�oZ���k�i�v,�\|_1#��e��ow|�������7�+�?^���D@@��	D��-5#� Pv�p����w?�Q�w����$�d�`�SE�6R��Xd��jm�_[���[k��������=�zj����Zk����������]�g[Z�����O+*8j2���� @����|&��{'�;3���y���;��������Z^�?�x�v�	�tH�V��mP���YF:D(l������ �lz���c�T���^�h~������=����Y�������������@@z���� �'��ZM� m��V�`m�S�V;�11d�*���wc������@���A�]{�G����~�[
%@��9�=�������������������@@�b D+���3 � �$��ZML�����j�E{k9�_M/F��
, ��^�S��*�s��^�ml�[�y^�����/�m�s�����^��;|�������s]�E@@ ��h��?@�P�o�Z|n5��G"7�;N�X�I��UjU�������?@����L5�WE�_������~/�t�;P�
��t�~w��.�z�����
�����7������@���s�YO4<$����w5��@@������^@�k�T�j1�Z��}�`�����5rs ����Dr(1�!��Z������g�v�������ng�+���s{�������7��k������� ��5h_~�`? � ���	�
�5WB@ ������=��j�$W��l~5[�F��1� �@�@1��gp�y����a�;x�=��������������x����k[�e�W�WJ����R^����� ���TOf�.@�R��(_+� �@��U�E�Qs��k�W�<���X�� �@��;lqO����7��{���=o8��>��'���}�^�w���m���@���.`�$�-�~�'�2|Gn�h����� �d ��Z-��jm����5�W��� ���;�pO�6���;x�}������
a��X��hP�����q��n
����
r��mj��T���I��TBC(vB�b�< �9��V��[-�T�E��`~5����[�k@R������{�
x����p��j����~K'@�EZv�&�{�n��J������i�6�����,��W9�s@(:B��{�< �C+�]�sn*��a �.��>nn6����Uf��k�@
G�+L��7�'�
R���������}�|�Ds��W` �s^O�N�t~g������%^�4�R�3�_���x�~}o� �dY�-��t� �������P�J�X��T�y
��`����k`� �����y�}Eq:��@������������D ��@6�'S���:���K�vSszM������^i� P��h����{@
R���P�V��w�`-'��� �ZA�q� �����G/��dx�QW�$zn���a���.��
�6���so��Z �J��=
��T�C��er��������9~�S�i��FZB����� � ��C5E���V�E�s���_�vT����0�r���jU�M�2����6��^-�k�h8�0�V�5 �E-����J��_��E���A�0>h��x��lS�O��O��6���z�lK�M���m�/��i�c��������k�:���m��k�>��P������������ D��1W@@��`��j��W�6���5��4�sj������q��|G@�����L5�WE�������A�e~��z�������Q=E�4��_Y�������l���?/��9�n��~}�{����^��j���g�Z���#�@�
�����@(*w����E"���_'W��l~5�R�r�^n�������-%H��a7 ��&`���?��u��-��y���i�s�%������>��'O���`/��T�&_���
��������~o��������<�����^xs��E{@ !@���`@
M 9X���3��N������W�Z�6P4�H���@��@�?<dAd�z2��z�c��
tRszM������^i�Z �{�c~��x�>�p>^���crX�n�u��w[�#�@q��{�)@@�[��i�]�=�_�+X���j�������=|G@@`�l�Yr`��!A��Ku,�dw?~�����X�>����uN��,�y��rI��?�0N�����O�{�
d&@����@@ �RW���!!�a!���H�
�i�:���9�Ud�����jq|Me��a� � 0���}���'�����>���
���T���L}'C{4�M��\�|���{�~�k~a���w���t���(vB�b�< ��H]���j5� k������������ � ���B�����*�I���RO��@n3[�h�V���g��?����~��qz
�c~��'� 0��hC���@@ /��V��	�jG5J��
�xHB����&@@��B�T�2qK�$s:���=V�v�
�\N��{'~��{I�����+�������������hxH�����p������Y � 0���V�! �0�i���j�l#� � �@�
A������&���kI2��'�u��u���_?��|��2����C�#��{�_������c�K���W��Y���L�Oum�eW�-���� �E*�U�����Kd�
�h���F�'�9�[��\��b� � ���ch�
Z����,�����?� ���w����u�~~~�����^�76���wU{k[�e!H����{����pO � P���e��]���]f�4���H���P�vH}8�s���jb�$@@(.
l����������W���/���B �0N��w�����6��y�y�K��C=f��#��8\�-�-@@�a#��	�"m7�ev��=��1c�H�X���\�WMB5��@@@�W��e�a�p
!���o�/d�������q~���I������*)���D�6�_>�� ��h���$�����3���c�X����xI��F�f�X#� � �������;�g��?]!�p�/���������=e��t�|��lZK_�h��&y@2�e�������4����������
�t[���PMEX@@@�I�/�������)�0N��u���T�D%Z*��<F�V����F@�<�a��H������%9T��j:<�.�X�D�s�6��a� � � �)�9��);�9��� N��������8'�!Z07Z!� ��	�<&��a��E�f%X#� � � 0�^X���i�3.��� D���@@��p�j=C@��a$���T�w��@@@P��K�b����V�Bz�~X�_�-��w� ��pI����/ � � ����/<$Z��U�F�6��&����MF@@ �����]|�:g^��h�F�Z~�3�@@@��lX�\��<�cq?}�<!Z��K�@HxI��b@@@�A 8�^�-���'��g���_~Y�{�=ioo����2e�\}��2k���EZ[[e��-�}�v��c��9sF�;�<�Y�h�TUU���6h;��L��>@@ /������C@��{��5*���5q � � � �G%��lt>K6n�(K�,��[�[y��������L��w'�]w��x����>,��s�8p����A��w�iB9�	A�m���������}������ ��f[��@�-RC@��j�p����qk{����%"U�Et�@@@@ �@����sKK�D�y1����Z8��3g�J2
�:::L�������.�H.���^�477����m9x��������s����T��~����W���}�{f�� h����������@
_�{��	��w�d%T;� �����`�j���� � � � P��h>/���\�9�������K.��-��Wl�������$����O�	��y��D�6o�<Y�jU��'?�I����K�9"/���,^��m��%h����������@�O�{��y�G"�j���y"@@@H������o�]���^���K��8q��X�"q��������7o6��PH���N
�n��{��LK|q6��
�N��I��{g@�[@C5�V���E�e7����z��p�i�p�.���0���Jm�R�M%r��E"M�5t? � � � 0�T��`k���L�:5q���3����oOT��P�c���u\�\p�2~�x9z�����'N������m���3����eA@z�L��jb+M����j���s�~�O|��S���f9s���� � � �9 D@���]�����Y���?�]C4�E+��M�fB���.iii1!Z��A�i�I[�gc �ow�����b1�H���?&����Hh��"T��v<= � � �9 D��e��D���g'�u���5����&����������&�=h���2�f����`� ���������P-���S���x����
�@@@�
��)w��ay���-r��HZ���.�G���}����;f�m��^4��}� � �@?�PMO�! �R���z)k��P7nm?=y�
��j���^�V�n�^@@@@��!Z��;~�a9y��9����������@���2j�(��g=b����S�N���m����f�6������!�aA@��8����d��������w���dT���XF��!�R���M?�J�����J�������$O� � � �Y DK�����;���1c����|��eee�}'N�Hl�7F���UZ
A�m�7�I����V�����t5.� �C)
�DB�9�p��K?
m_���UF�m6�4Tsz��G������|H�v���IW��������6 � � ��!p���4�v*!���m�&Z���`_���e��q}Z�;6�/���lz��'h�����������������yI�� �y �������k����gw�!U]��J5��KAC5
���l9�����y��|N��R�j����s � � � 0T�h�ojj�������e�����$uuu�-JiU�]����f��A��E3ik���'N�s�=w�.�u@@ ���M�s�~��e4��}���e��|�AC5
��#���|D��*����&P�! Y@@@@ ?���gb��x����}���w����\\�����#��?#9���s����a	KJJ�+DK��P\���8� ���V����[e�������Kt��9�}�DZW����m�xP��8}l*���I��+e��N��� � � �E-@%Z�������?����CO���>&�_��o _x����nu��Xb��������������lm��^4������ �������V��n�p��i������������5T�'X����(�Ss��j�+�/���t;�k!� � � �@�	P���B�����{�]�����@IDATl�2���>�sv�n����;������^�9������&�]u�U���m���g�6q�l � � ��j��0�j��+Q����-������T�����R��d���(Q��CL� � � � �@�
P����~����`��9R�M�&O?�������93q��+��w�y�|��L7g�����'�|R��ysl���2��D;��6h�L������ ����jZ�_��]OH������d�PM:�`M�)}u�/] ]U���l����A�E@@@��!����'!�>s���Y������}�C��������/���&��i��>}Z~��_&���	&�7����P�v��A��u3ik��5 � P��3�a��c�/Be+T'Xk�]?Gb'���	�
�7��G@@(vB4�7�s�i����G=���UYY�k��#��_��L�8�T����{����\ 7�|����$����m����f���7k@@�����m�����cfR�f�����~��Z8f^�b�E�y@@@
V������g���e��%� �|�:y�i�M�4)�[
�6h;��L���p<y������ ����@@`p�P�}�:��NAC5��FO�+�:C���f�9\WW�>�� � � �{�\e�_x)�h��k
�����t��m������m��� � P���HH?���T�4T�*5S�&N��gN5'T�49s���o�� � � P��h��}@@�<�������T��b�.���6��y���
@@@�� D+�W�!� �������	����W��j5g������������wL�Z��T{�	����9���f� � � ���h�,�D@@ ���Z|��{���Y
��	�8s��B5}
�y�������k�k��y^����@@@N�h��m�� � ��#�����:�}��T����9�a�E=9'�;vL����|p���ljc�Z���H�Y�l����&��� � � ��p D�o�gF@�?��P�����D�����PM��jN�6�������>u���~6��w�\�5��>B�@@@N�h��m�� � ��#��P-5��I���_�vB�;R��Q@@@�X���_.�� � PD9�R��<�cy����! ��0�i��8� � � Pt�hE�Jy @@�a!��P��l��4�����&�
m7� M5]����� � � ��!Z1�U�	@@`�	�(T�-o������iZa��_$��sa�j�� � � �E"@�V$/��@@@���W�v��D��:-����C���9;�|�D^;u��sm
�����@@@ _���Mp � ��R��jY����\�����	���c�+M�����z|G@@�WB�|}3� � ���~�������W�}��#�0��+�j�>m�'T���@@@�< D����� � � �s��"-�D�6�_J�6=O��z���M���?{
y��V;�C@���:C���f�i_WW��"� � � 0�h���5@@�'�����WEZr���l�4{������=ok�j5
�"N�f��kS�6LNA@@��!Z�I�@@���J?6�[�y��[��jz^w�e��~��p��o�x�`��Z�"���lQ�6@NA@@��!Z�(�@@�Hp��c%k�����V����@��V���^#�e�j5�@@@ [�h���@@��UjvO���l�W�E8�@��#c���c�V��2���g� � � D�-�m@@@ =��j5��`-�j����wso���0k���{U�� � �  D�7@@�F 9X�w�To�N\/����{��@zT�uU-
�jkk�M#@@@�� D+�w�"� � P8�Uj��k��j��Wg������j�7���^�#�KH(�����! 3"�1 � � PT�hE�:y@@�L �Z��e�ZM�5��!�xPd�H�u��P;V:�j�"�u�q@@@�t����\@@z��`-Q��w�To��}��%���������v��V�`�����k�!� � ��!Z�,n@@|���B�k;d{{�H�����^�jz�M_���9RR���L����[�H�@@(.B��z�<
 � ��@w�Z(Y��_�}�:�`���|,�m�u8�@���3�S���5W:+
�jq~"� � ��.@�V�o��G@@��	8�ZH?���m��%�I����������m��j���@:�j��F�n�e5��z��f� � � ��!@�V���D@@�\������������%*��8�j{z��B���>po���D@@�7B�|{#� � ��@�j���p�������D(�m��t8l�����l��������@@��!Z�`�@@�D�	���-�~��0������[-�j���CIvW���tHS�6����n�Wg� � � 0x�h�g��@@@�Ht(���j��T�Vv����h<��x�j�C�������3% � � �@��ro�@@@��zU���y��yp��X�9��3HT��v�������@J�b	Q���-�@@@?B4?�#� � �T��^D?fyX����t`��V�`M�'^����a M����~X@@@���! � � 0p���Z����j�����U���b��*O�(����P����Z-�>I���F � � 0|������@@@`������'�h�Z��u&X���qk��@��j5�y�0��)�+����@��@@�h��-�� � � P�j�x����H$"�f�Z�k��a u����U�Z-n�O@@@�[��_@@@ ��������]��h�x����v��Lw�v�r���,��0d�vpk � � �KB�\��7 � � �e�j5�0+i�$���x����9�����;X3s�U-��c��?@@@�������p � � 0��jZ��_M��b��R��Q�&G�|�k���ob6�XcA@@�H���^&�� � ���V���5gH�[M����%W�e%X�a�]k�f��R_ � ��!Z�<n@@�@O���ZM��������Y�M�������j��`�a ��8� � �y'@��w��B@@r/�;X���Z�>~���:+�j^��@��s@@�X�-cB:@@@
_�'Ts���q�@�HDb�Vr�ZV�5;��]'W�y���	3[u��tl������w6@@@�l�e[��@@@�"����u�u�Z-���Zl�c.CJ.��l��N�Oo)1�J�w<(��Ai��V�g^��� � � �-B�lI� � � P������V�E�^4\�W�e�Z�{�P?��{VH�����a@@ @��& � � �q��j5������U���`�}���"T����@@�
��� � � �@�j5�5�ZM�
��
�}�
9���=�����{=�V�s��@@@R���� � � �@�~�jb+M�Z����E�R��{u���Z�����}3��k~"� � � D� � � �dO�W���m4������j�9�jG5���������_�{n5��;X�Z-s_z@@@�������h � � �@!$�j�j-��ir�����
����^�`-7��� � �@
��K��@@@�b�0-���%��`���~f��F��<�l��'��5B5_6 � � P��h���y>@@@�@��= �
�Iu�6�'�u�9V�J�rM����a��������`�;T;U�P�k����
���@@(tB�B��? � � P��K�:i+=+�b�o��N��K�a ��b+����R7�qs<\�(���f;�����9Uj�G����D@@������x � � �@�hE��S������T��^��o=6L�{��e��$��x�Y�Sa��j5s��w�/M��}�@@(XB��}u�8 � � 0�48Kw�����,�L��k�7[�x��-�j5s
[��5�����`@@�^�-�_7� � � �M�T��V���?��V�4X��}�*	�B"T�e��� � �Y D�:)"� � ��@�`����~$�3d���v:�}�C�Fw�Z{�����`�j5c�@@�A�-��� � � �w��W[f��k��KY��f_��Z��E'T���]w�}�����oT����7� � 0|
.D������v�9{�����Jee��{����� � � ��.��i�����tH�e��C�;t5g�j5�`A@@`H�>D��c�l��Uv��%;w����V_�	&��3�g���r���KII���@@@@ �^�@JO���D+����3�g��U���j����+�2y�"� � �@���ZZZd������ ����s�~;�9"�y���e���R]]-,����Z�=�\�f�G@@@ k=���]r��7��;X��6�k�j���c�Ytn����f����5��� � ���&D���2��g�����P��E��4����'�a5p;|��	����z5inn���z�|>���u�]'�g��u_@@@���W�����2�	��;&��V3k�m�k:��.�>M���� � ��������������k`v�������e���2~�x1bD�s���:uJ48��g���O��^zIN�<iN�!!�m�&?���G-�� � � ��@��j�K'W���l2�ZV����m�k���5W:+R�f4�� � �@j��	���9c�t��q&8���+L�X:s�������}:���j������M5�
��DE@@@ ������[�g��3�Zm�����.��_�����7S]�M�C?N�j����R=���}�@@�b���������W�y�����,�����i6P��{��>@@@
A�'X{ q�Z���o��9�Et;k�j{�j�==�jz���w�����8 � �#��	�JKKe��Y9���,������@@@�$
��V+�n�p�~�_-��j���N��������! ������k���Y!� ��@��h�0���{����������s�5�@j(�� � � ��p����S������5�j:�����i�iB;'T��:G��.0�Zt���3�[F��1 � �y%PI��C���?����fw�q��s�9	���7�C=$vN5=�CB�y������@@@@`8	���x�Z��6C?��^m��N��hA���5�����|m��	�N����%B�Z@T�!� ���@A�hO>�����S&L� �&MJ ���?���{hz��w��5k�� -q2 � � �c��0�����w=!���e�c����^�j�s���k��@@`�
"Dklt�v��,]�T���X���������E��n[�l1�`����D6@@@@�
T��^D?f�l��������[$���=5���j�`��0�]U���l���a�{��4@@,���ZZZ���i�.'N��Q��3g�m��&%%%f{���f���{���	16@@@�_ ��{%�|���m���J��8Q�->G[���8�jc���1�Z2>� � �C)P!���c���#r�����3�<#������h�=z�h���
�>��X@@@@ ���%�lIw�U�i�6X[?Gt���")�YN�Z�<+@@��(���s����w�o�[Si�����k�UUU%����%���j�������@@@�\��Z�}�:3d6���0��V��s��'����CS�a��_= � �P� B4���^���w�������:)-�y���~[l�6y��^��@@@r �T���c��W�IS�	��eSFs�i�u��@M���:y�c�V�C��f�� � �E���)��f�+��k�����w���������h�}���o�wO�����u�/ � � � 0H3����&:�Z,&��[2
��R-^��<����j��s��a �)	�T�@@ �@A�h�p7�t�\v�e���(�O�6��d�mmm2u�T�%W�� � � �dA��T������i������j�c�Z�{��)gn�&���U�4��9����, � �D�`B4}�,���l���yn�A@@@����BN��Yl���.���Z�;X����H����H�|�\�j�8=?@@�
T������ � � �E#`C����N\/���K>��v�j��Hr�Z���#���MK@@�(�*D{��W���Y���G>"Y���@@@@`����rW���E��Zr��	���h�Z�I�5�Z�?�h��F���^]�E�9��}�@@�_ �B�GyD����5�i��������!� � � ��t�jz^����l��^�j�T�p�U;����lp��jkk}�W�D"��b�/[���w�; � �@~
�U�6b���*UVVf�?:C@@@�<H
��V�5���z���k�i��{���$zrN"����Q���q��V������p6l�4��@���
�V�\)����jO=�����K��UW]%W^ye����I���Y#mmmr�y�����-)++�u_@@@@`�t�i��V���jN�Z��F	;!�{��6�l8t�3�cm"ts��w@@������s������~[�n�jv/Z�H�����W���x��w�;��#��o�&_����]�@@@��@�j5i�$r�!�J��j�:Y6�;��c�S��:�c � �y"�W!����M����32n�8����k�M�N�2E���/�O~����(���������[�#� � � ��p��VKh�f���VK����`�}��#�K����Ls�@�����z�-�5o�<9rdJ�+��B|�A�i��s��<�� � � � ���a���<(7M��|���bM"���jkkE5B5+�@z���,��w��r����y---���q@@@@�_ ��j:O�.vmC��N]$�z�j���# � ��`D�v�y����{e��m���/��<mN�>-�>�h����j��l � � � ��
xU����E���S��m�&���w$z`�S��|�o����j}��� ��V� B4�������P���/:l�u0j�(�
�;v���>+{��1b:����s�G� � � � ����J����-o�����jn �#� �9(�M���FY�~�tuu�j4�H�[JJJd���������� � � � �@Z������&3DN4����PM���! �s;C���f9s�y��@*P!�>�-��"555��?�A���}�W���g>c��|O� � � � �M������D�6���A�����������)�^C@:��91�P-��@@��
L�6r�H���k�����$�H4�C�����'&L0�7��3G��@@@T��E���+��4�������
�t{ mz���S�������3F�MZE����@@?��	�������y��W� � � � �@�h��~�+�40�oI���.M�fu��z)?���N�GO���<��&X��y�h��U�XBS�1s���/�#� ��A�t8<$�� � � ���@���n�;L+�b���L���jo:�����&8� ��T��B���.��}�<xPZZZ�����������)�G��=� � � � �@��C��! ���%t�����E_�#�KH(�.g�����}�@(d��	�t�5k����}������|>'"� � � �#�=d�����y�2
�tH�����|�T��:{�Z��bp� � �-������+����8q"�����X@@@@�a!������EC5i�$��3�������,������� � �@>D��~��D�6y�dY�|�L�2E��+:l�����M��u�} � � � ����Q/�J���u��l��.{���Rm�Hs�\��\h��Bp��X@@�"(�m����Z����w�#���E@�# � � � ��,�T���b�.���V��;�@,�R]���T�������t� ���@A�h��3j_|1Z~��p7 � � � P���Z�Z��$����2tzK�! ���v}{�Rm�5"�A^!�q� � P��]x���F��w�)����!� � � �C-��1&���	�<�^2	���L��V���y���"U�%D�f<�� ��'P!���^*��E�u|��Wd��y�'�!� � � �E(P=�z�t/6T����|�	���
�BN�7���j�@@��1t���,X W_}�ip�����m��, � � � ��.��Z���GwIt�3T��=5'��h��@m�jg^����%��^����@l���D��y��<y���>}Z���>	��)�JJJd��UR[[��<"� � � �0�R�OM{�)9J4s����-o4�A~�P�������I���jfN5�UBJ@HC� B�^x��h�K��������������p@@@���W���o��9��lu����;T;U�P�k����Q�_"� �^�M�>]b������7q����9� � � � �[�x�v�s�[E��J�l�j���s����.��j2���wB��?@@ �@A�h����E?, � � � ��-���_���x�l������L�i�PM�d������@d������!� � � ����j&X��j'Tu�h*���i�V�8�tT�
�[�2 ��@A�h���fxG;���	�lUTT�[��@@@@�@O�V'r���:-9T��2�WM�6��hGZ�F��, � �!PP!��f�=��<��������q.��B��g?+\pA�c�@@@@(<�P�>� M�uF�j�PM{���1d��� �S��	��n�*��w��9s��U���[R__/�\s��|����q@@@@�0zB5���7�j��c�����$\��d�)��O��m����� � 0l
"Dkmm���������L���'�pXF�!��
����g^��u�L5������A@@@@��(�;T����%+C@�C5�xF��dA@�"(�������������m��&�P���x�����?����i���/��;@@@@����"R/�HD��>n�?�j5��,��!@+P!����k�w������~UJK�o��K/�[o��T�i��������>+k@@@@�a&����i��2��?n�V����J���Wm����� ��(��F����M�C�/@����g:��.MMM�h�5 � � � ������j4��)UFC@�0����j���z��� �y.P!���g
ceee���s����G���|N@@@@�}C���TS�G"7H�B�TC@�
�_@�@ �
"D�6m�8p@�~��~1�jM4]&O������ � � � ���/T���2��vd����! U�@��(�m�����?�Y^}�Uihh�%K�x�:uJ��Y�8����/�� � � � ��
x�ju��SM[��h���F�/���5C@b� ��(���+�����w���!=��l��E�.]j��=z�����[o�%6l�����Y�f������D? � � � �x����?z�j����j�! 	��
D@`P
"D����/~��������������e��Q�r�J���G@@@@ +�*�"N�Z��s�����g����{�@�
"D�_�p�������>*�������}Z�v��7���{@@@@���w�s�T'�I[��9R����� ��
&D�'������K/5Uh{���h4*���2a�	��2c�����c�@@@@�l
�����2�@�U��6=W�v�����k@��@A�h��F�),0�a��xp@@@@���7Ts�j���<�$}����j����j��:�AF����+��K/�����O|�����C�z�)����o�QF��y;@@@@��P�	�����=dV�U�m�Z�w�������p@`D�v������������|�7D;~��<������={V���/�W�#"� � � ��@:���WM�=�! S�i���j��8� P���_��h:��m����:���+W]u�l��Y������J�m�@@@@�X��P-zr��{2'�����iU��J3��k~"� 0
"D{���������D��!������hg�����F���KS��1@@@@
F �PM*+C@j��]�P�����*5*���D(:���Z[[
��)S�}��9r���9@@@@�BH�ED+���=��V���I����#�0��	�2��5 �@^
D�6q�D9t����h�-�h4qJeeeb�
@@@@�]��PM��O�!���iv��i:$�jn)�#��@A�h��M��O��,]�T.��"O��g����$�M�>=�� � � � ��M�+T��s�D���qks�a��t�jz����.=#� �E������Jy�������������h�"Y�p�h�ZEE����HSS�<����g��S]]-�C;f���@@@@(H�=��
�h���F�����0�A�|^5B5�w�@��(�m��Yr����o�[���������+))����2b��S�� � � � �������
��D2��4�)���j��7�
����Ujv��j��q ���
D��$���gD���c�=&����J�'O�o�Qf���{@@@@���b1�Do0!Y����4�3M=G?�h�������WM����VC5�&�!"m�f��jq`~"��@��hZU��r�W��m�d����o�>9y�����555����O.��*�����#� � � ����e�$���
����M��! �1��wm���{�
��Pm����zY�U�+�@ ��Y���J���?l��F@@@@��6,�`�.��Z�C5m���i�����k����y����@ ���9sF8`����:�w�yr�������@@@@(n��������i�<[�f���a�2'��^�! �+`� ���	���=+�6m��k����������������Y#o���\s�5�d���~6@@@@@`p�����6T�m3��3��Y�W-��O@��(��W����[��bGGG�}���f��'�x���_@@@@��P�V��2����P��aA@ �@A�h�/T�UTT���Ke��q�����y�����9x���.�����;@@@@@ c���l�Z����[k���>� �����m�W-}CZ ��H� B��^z���1c��w��]�8q�	��B��3g��Q������g��V����"��_�%{�����*Y�j�����c�D+��Ur�W_}�\t�E�������e���}����Ct.8��M?�-2���0h;�+��^��>@@@@*�\��ml����C@���r2���r� �tD����dnW�9�-�2b������P(��=��� ����)������j��k����N1��?�B4����{�1�����-���^��_/w�y����$6s�i���f�� � � � �YH�l�Z������j��F�-ov�RM[�j��w��b�`��B
X u�'��.������L���2�S�_����~�3y��7t�=����vS�9��!(����g�����,�����p�zP�ca���RYY)�>�lb����W����=�_��.����@@@@���a��Vr���Q�DO�i����j�j��5 P��M�2�TJ�P��-:�
�&M������5�����;9u��9�������S�?�����w�!S�N�:����g�Ih���3�F����s?��O��g�v��#G���/�(�/6�������5� � � � �C ��z��9��C@�,T�C@R�6�\�����v���u�3]�|�I��o�o�]]]��_�"q<��*��{����&@��#���o���*,������%B���~��b4�Y^^.��rK��2�.A�i�L����F@@@@`����4X��M7�d>��8!��i�yd��O�u��'z��`���:����"��������'h��C����D��������8qB���.���>&�p�vtt��A����������:bUUU`d
���������e��Q��I�����E�=�E�J3fL��\p��?^4�{������={��a&�^�o��>7�@@@@�P�]��<�{^�����;���j3��P�K#��+P!�ZZ��Uf'O�����7������������{�������nK�dC��-Z�i�6���O���k���i���M+�ZZZ$h;
�2i�u�C@@@@ ����{Hs��y�2
��C�TK^3�Z��� �@�D�����-�q����~�+9p��/��O7�|��0����_�����#���F&O�lB����>���������-��J���I�v�O&m���� � � � �������6��T��J$���
������
���P�y��[���@��hj�`����K��W_���{��i"i`����
�/�0�\�w��SO�i����O}Jt���9����.ZM��$;v���F��&���:����%�5m��@@@@(6�����9tn5]���j:���[M;���n����, �C PP!���9R���g>C�5�KjY(�6��.:�Y{{��>|�������477�'>�	�Oh h�T��iU�]N�:��������������� � � � ��`
����rv}��A�?��%����ZFC@�PM/�T���9;�*�&�S�i��@��@��h^$Z�����������s����gKi��=�V���g?�s������\^x�s�������K/5��;��x<q�D��v��v���N�������Z�������u9�� � � � ����������7���������w�/�����$T��G��.���1��c�f��.�����m~ ��I����9��K��x�C���C9y���q�r�9�$Zo��Yz�!9s�Lb�y��g����N�����S�����53�o�a����^K�hc��M�f���v�y����>2ik�a��:eee�`]�� � � � � 0`[�f����l���������6��O���^�V������~X@2(���'���;w��	d��I����b�����W���y�Y�f�	�'�����p�B�M�7Tt{���-�6UUU��erMm;���%���, � � � ������������g:����cM����(����f2������$vm�U��~X@�"�������]�ti����['����c�-
��l�b>�/��������������MC��|�#���!
KJJ��h�����^����� � � � �������*����/T��uc7}d2�m�q�T3�
����z�� ��
"Dkii1O�s��E���u�9s��v�m���n�\����w�����l�|X755%n#��.������[�&��7�o��s�r��Kyy�m�}d�V�� � � � � �@��PM{�j�X�	��{^���jz!�M�[������(�M���9"��K<�3�<#������h���f�����)y�C=6���gE�p��������_�8����E����u8J
�^x����+�a�~��G����*���6��m�F�@@@@@���B-y�S����! 3�T��0����]�@IDATm������m���s�9�q����5�f�����6�M��?~�W���j�k�V&����.�N�Jt����������W^I���l��������o}Kf��%:���b�=��$�!(O�<i�]�d�y�DG���W^iB4���Hss���3�[�CU>���������z�����L���� � � � �d]�]��_��7���j}B5;$�Z��-"�@aD��s�iU����;��%{�u�Iii�c����&���&O����t�hxu�=��6���{�.��"�����wRR�;v�����C�����}�,_�\t��M�6��u���/��3a���7�i�r���������g� � � � ��p�jz���ufd+
�"m7���C@f�1�Zn_(�#�@���Oy|�Z�|�����~��^wy����G?��^�~�����c�����?���t�����~:;;�mVQQ�8G�d����|m���K��

�4���'?i��l��|��#��_��L�8�T����{��Z}v���7�l��Lp6���>2i�|l#� � � � ���
�7�;T�hH��#���RM�����, �@Q	h��F��d��������Uf����BK2�� �����n9~��	�/^<���s���h:$dee��'-�!&���E�4
�&M�4��	�N/�I��`'j%bCC���h��� � � � ��\�gH[��d��Us?���E��9�Uv`�un{iAT�Y_
���.[���y�h5�~2YB�P�����v�W&m���} � � � � 04�
q�����j����Z���]��=��)PP!ZA
s� � � � � �@	�G��W���W-g��R��aA�X�-�_�� � � � ��Z �ZM�eC5��D"��y�l��
��b���
, �Gy���G?�w�}7+<%%%�j�*�� � � � � �L 9T�Uk�9=J,;d���M���F�-oX����6��G��! ]�Z��'�9�%���;?c�o��e$���dS �B�;v��#G��|�&D��&!� � � � ��p�a�}~[����! 3
��6L�k'Tk��N�Oo)�w_�;���4O[+�3��}�o �@�y���5K�����#�4�8qb��@@@@@�"�\���PM�7��5����qkulq��P?-����f!H���� ��@^�h���7��}NG@@@@j��P�V����t�\t��|d�
����j>;z�J��|t��A�*D��A@@@@�?��;��j^�Z�C@:�������=�y��/� @�-M@@@@@�H�V��6T�m���jv>5�� �j�^JdF�Y��O��?!�#� � � � �d] 9T�Ukv���! 7����1O�%U����GD���k���jq~"�@��
�~��������&��())�U�V��1 � � � � �@��0��i4�L��,�h�C�v�j��! �k���m��r���������eM��@@@@@���"����%��`�n?3IB#�{��	v�G�N�VcH_6 P�y���5K����5��'f�/:B@@@@|��eHs�sR]�������";�n��3d��F	��a�����M��V���(v��
����o�7�� � � � � ��@�����J������e�����3�J��SM�G"7�u����:\�(���f;�[�f�T�b��"�W!Z��q� � � � � ��+�i"NU��'��y����:�~t�s�E"u��kN��!�Z-�P�j5���U��X�,�� � � � ���g�.6L��l��;T��
��*5���j���@^�hRZZ*eee��"� � � � ��@r����
�t;�C@�j5w���Q�?Y@ O�.D��~ /����;V����YF�)�����O�SZ�%%%�j��D�nZ�9@@@@@`�$�j�j�oHE��Z��j��
�������jT��-��@�]����(]]]���j��)S�HSS����?m������F@@@@@�Y��V��0����zq���+`����]�6k�,��u�TWWK8�r���s&|Lw�8qb�M8@@@@@�>�T��+����O����� ���~����o!,�
	$��(P��5���_JU�/���vP;������	�e��h)��[�((�
��*��y�!cBBI��d!$a���9��n6{_������~����{����y>���s�>�X�V5���)�t!Z�����c��I�n����/�<{���	 @� @� @�@J�V��o������9nYt�}.���n��PX�Vx/�VK�n����h����5{�`�� @� @� @�@=z�VK������j�C���[@&�4]�6�� @� @�h���Za���%'d=��j�P-]�������V������@�B�5k�����c��m�w����M��<��sc��q%�8A� @� @��R�����j�B�Vx/�25n�Za
�()��m������}-��[Wr0�O��1#�;����}&@� @� @�M!���j��E��e���j��������������s��~MRz��� @� @��I��j�A�dB)�R+�[�������Y !��E��ZGGG�w�yq�����	J���9��3��y' @� @� @�@z�VK���- {�V��y�Z�.�8������l���M��F[[� � @� @���@�P��jm����j���Zje�g~�%T����[�\�h��o�& �1�
���� @� @� @�@��0�P����
�~,��\�v�g]�N ���f��/��r,Y�$��@� @� @��-`�Z��]$�"D���K��'��
6���������G��# @� @�4�J��:�-����e�u�_�c���va�Z��j��-�l�@.B�c�=6��������{c���q��Ekii��;�l'	 @� @� @��H��Z��- {?[m���g����Y����-	�����������?���W%����G]��� @� @� @`���
�`�w����p���b��V�����{z�4X !���[��n���x����?��%@� @� @�z��
�Z*����IV�������j���Z��Q����C)��-�<{����.�����O�������;�v�Gyd��> @� @� @�5
��
�X[����j��Z�- k�%�K !��e�{x��	����d��;w���: @� @� @��
aZ�u]�VX�Vx/�25n�Za
��@.B����g�=��3h�4�.C� @� @��!���j�v�4�B�6�3��g��czG�5s���g�v�6lQ�c� @� @� @`8�Z�6���LX�0��^X�&T_�!C.B��N;-}��H�u\�ti���oR$� @� @� @��'�s�Z�j]oYx�Z�P-5��I�VB !�Yg��>�l<��Cq�M7���wj]b| @� @� @�M.�3T+���d�	Y�}�Z!TKW/kn�����t/!��?{����������o�����b---Y����� @� @� @��|	��B���Z�����L�[�V�����������z�89�w�����?����eK�J�8N� @� @��#���j��E��e�n���R������V�	�
�x�E�6k����iS�A����]G @� @� @ �V�
�- Ga�Z���Z-��Wa[���)[8��S��:�����Y�~S	�"D���+"�l @� @� @��R=W��2u�da�Z�Pm���m�����?�NAZy��9���i�t� @� @���@�P��jm�����j���n� ����c���T(�t��h��� @� @� @�C&P�

�u�Z��^���o������|M����3~��_F��t�A�.�y��X�bE�}���~m$@� @� @��)0�����hJ�3��a�n�m��U��/~1Z[[� ������N;-���j���=����k����uN9��?~|��T� @� @���V�u�[c��:�/�����7�]��h&L��`�����3�d���G�����eaZJ�=���2eJ����;v��-[��W^�5k�����3&��F� @� @��V��j����,<[����'|oU����[����
5R��4�A��={v,\�0~��e��={b������Of����<yr��Uj[�n����������w^\z��1j��R�'@� @� @�T��
/YrB,YUio�>�P�{4M����>�����>�}�{��G���^x��m��EzU��9��8��3b���q��W*�< @� @� @�&�t�%��������t�������JC+�T!Za������w�;{���-]�4R�����a��x��7b��}������=;m���q�QGEZ�v��'f����� @� @� @��(�����������}�H�]�������:v�� ��!Z��L�:5�?���U8������+���yj)H� @� @� @�)p���%����It�[V���]o��3Z���w�>D�K0f'N�^}�w� @� @� �(�������]�����o'���A�� @� @� @��H���z�[s�{�'@� @� @� P!ZP]� @� @� @ �B�|��� @� @� @��A@�VT�$@� @� @����-���� @� @� @�u���%	 @� @� @��-0:o���gOl��!6m�]]]1g��8��#�6�%@� @� @�hb���h{����z(������eK����3n�����;��3�{�����K����/�C� @� @� @�Z����������o�}�-
r����u����n����w���q @� @� @�T+��m��%q�}�ec?~|\r�%q��W�9���>;;�y����qc�e$@� @� @� PN !�����1L�4)n���x���g�yf���={v�7.;�f��>�8H� @� @� @��@.B���WgcH�9koo/7�hmm�)S�de�����u� @� @� @�@_���l���}���}���c���zvl��1�s� @� @� @�@%�\�h��O���v��J���,�B�v�a�U,� @� @� @������s������u�z���y��}q�w?��1��o� @� @� @�@���-��r�������?v��������K.�����K;v��+Vd�Z*���/f�O:���<yr#��m @� @� @���
�"DK+�����l���]��{��^����;>���?��	&���^��1 @� @� @�T+��-
��.��>8������aC����q��WG{{{�2N @� @� @�('��-
���O�SN9%�}��x����0���^������9sf��7��x�#@� @� @� PQ W!Z��Q����O�^G� @� @� @��Zk��
 @� @� @��a-���hk�����7��m�b���%'���%�=��7n\�2N @� @� @��K 7!���K�k_�Z�[���q�yl��q�q��y�A @� @� @��r���g����c���������Vw���A @� @� @�����-ZT�:::������?<&L�Prp�v�s��)y�	 @� @� @��r�=���Y��M�.����R�q� @� @� @���rq�����g=��h�r @� @� @��$��m����8�,YRi<� @� @� @��@.B�K/�4�3�6l�����<h @� @� @� PN �D;��c���/�{��'������{w\p�����n�vX�2N @� @� @��K !Z��;������]]]��� {�5���.\G}t�C�	 @� @� @�T����n�7�pC�UQ��v����. @� @� @���r�-�<{����#J�w�>}z������I�P;��#{�� @� @� @�@E�\�h��-�2a����'?s���80 @� @� @��*���9n��=��g�)@�u��#@� @� @��Z !Za���
��� @� @� @�j�E�v�i�e�K�u\�ti�cU� @� @� @�@U���:��8�����t�M�d���� @� @� @��ZF�Ri��<����g���������������l7ZZZb�����Y��� @� @� @�z�"D{�������*�}����~����R;[�l���q� @� @� @��@.B�Y�f��M�J������R�'@� @� @� PR !�W\�e#@� @� @� 0�C��6 @� @� @��I@�����W @� @� @��!h�m�����=$!@� @� @� �[������/|!/^tP|�K_�Q�F�=���?�x�������,�������$@� @� @� �[��B�e����}����+6n���O���W����{����-[��**)@� @� @� �[��B��s���O?S�L�������g��M�6��{������(@� @� @� @��@��h����o�I�&E�%c�.�������> @� @� @���@k=.:�k���3��c-�oH�� @� @� @�(4�J����q�=��������n+��; @� @� @��!h��h����_x2	
 @� @� @��o���� @� @� @��F�=�'@� @� @�h:!Z�M� @� @� @�4Z`t�;P��;v�W���R��:����/:����*D� @� @� @� ��!Zwww<��C�~����w�#�����TW% @� @� @���+0�o�8~���;�FN� @� @� P�@��D�0aB,\�������~����L� @� @� 02�6D3f�ld~'�� @� @� �p�a};���� @� @� @�@.�h��6�&@� @� @��������M� @� @� �K��{&��Y���c��<�\~�t� @� @� 0<�.D;��S#�l @� @� @�%�v����. @� @� @�@�
��vjt� @� @� @�QB�F�k� @� @� @�i�hM;5:F� @� @� �(!Z���K� @� @� ��B���#@� @� @�h���Q��%@� @� @�hZ!Z�N�� @� @� @�4J`t����}������c����m�����.y��������c��	%�8A� @� @� @�/���hK�,�;��36l���8�<6o��8����<�  @� @� @��R����]��zk�����8�<>zt.��g�$@� @� @�h�@.R�E��i���E]��O��:(�m�����3g�u�1 @� @� @�er��Z�*Dggg,\�0�����I @� @� @�hH�������gM�x����B� @� @� @`�"D�7o^6E+W��Se� @� @� @�C%����SO�<�m�z����� @� @� @���E�v�����^�M�����<���.�&@� @� @�
��C��@�x���c��]�ev�����������e[ZZb�����Y�\�'�m����7c���1y������vuu��O>��/�+V��={b��9���s���S�����^��@���� @� @� @��<
�"D{������c
���__�{��-��=�����o|#�o���9zte����7�6l����6m�G}4-Z����b�����Oj�[k���y@� @� @� @���iPr��Y����n������_��7�����=��~�+}x��W�����b���Y�����N����x��G�����������n���Yk�Z��vR��o� @� @� @���$����+����-K�7o��f������X�jU�����������|������%�v�e�e��/����J<��q�y��Yk�Z���R��q; @� @� @��a$�:��2�C��uk���l����[/V�|����M�81����(h����c��k�I���V���j�[k���@����} @� @� @��p��J���������x<��C�g��?�w��?��1#>����i�������:��//�z�����I�&P��c��C9$RP�t����sg���Zw��5C��`Z� @� @� @���
�Rhv��w���>���;�n��y����7RH5�-�;��s�}��������-�L�9sf��ql��-�j�[k��
�n_cs� @� @� 0rs;���~:����:������-M������g?���767]]]���N�Z����������Zv�����K��n�1�L� @� @�.�X�����~���g���}��1q��'GGGG��������h������w�}�j��O?}��)�*+l&L(����\��j�[k����u� @� @� @��T !�O��b�t�g���^'N<`J/^��O���iw�qG4"D+�*K�+�,���7�|3���n��R��[��P����#��F� @� @���@.B�_|1s�1cF|��������������Uki�������-���V���;wvx5jT�Xa<����^��@�0D;[�l��wa��� @� @� ��o��F�{�wU�f��@!89��Jh�+��g���i[�z���htP�+Qnw��U,w��g�����^jt u���t����!jM3 @� @� @��H�E��w��l��	P�����J�t��u���m��TZUU�&O����Z��z����-��������#����C� @� @�4�@5��@��?����X�����������Uk)@K��i�*��=C�U�V��|au]KKK����n#�,90' @� @� @��@ !��Y�2�g�}6|����o��f�y����Gydq�v���Wl����.�����iS�]�6;t��'���c��Z��Z/5:��Y��� @� @� 0r��{����������[n�%/^�Q���i���������b���Y��s��!����:;;c��9Y�),{���h�[��V�����������^jx u��C� @� @�f�x&���S�C�P|��_����_Fz������GJ������K#�n+l���Z��{��x���
����#f��Q��B��+Wf����/����oy�[��L>������f�R�w�i����Z��Zo m��q @� @� @��H !Z�>���c����Vq���K%� �@�����V%Uq"_7�xc��7�|s����������/���X�zu<��C������s��C=4�������k�[k���@���� @� @� 0�r�%�����q���f���3�6n�]]]�������������������]�r�����Lkkk|������l�Y��/�>;��c��/�����Z��z�������	 @� @� @��A������o_����?���?8�����}��l�\
�;��~������K�H�~
����|�;����*�#@� @� @��\�^�A����{g�V��u�'N��sLM���n��R'R��A�D� @� @�h2��&�O��;vTu�f��> @� @� @��_��V�}�_����A_���b��Qq�=����?�/����X�`Atvv���� @� @� @��.D[�lY���/���b���1}��X�zu�_�����e�!Z��T @� @� @�h�m������O��)S���#����g��M��=[������ @� @� @�����-�o��&M�tK��]~����t @� @� @�
���Z[[���
���A� @� @� @�/���6�������ys��J���{c��m�j�Je�'@� @� @� ��@.B�;��3>�������������%K��u�]���l���9 @� @� @�T#��m����X�|���c�7o^V&����]�� @� @� @�z�"D���r�W�ZU<�s���� @� @� @��
����P�{�����@{����7l�?�����Jz^��5k��g�)����(��!@� @� @� P�@S�h���j�v�m�a����^�l�'O���>���� @� @� @��O�)C�����e2�@���?mmm�
� @� @� @��4e��V�]���{��l���wc���q����������7n\z��q�a��@����8A� @� @� 0��2DK�o}�[���<�H�M�2%N:���q; @� @� @��!��!Z���w�y���'�pB���	 @� @� @��E !ZZ��seZ]$\� @� @� @���"DK�C[�zu�'��xG�?���T @� @� @����~�������=S3g����;���T @� @� @���y~kkm��
-��� @� @� @�����J����}q������}{�q���g���_}������$ @� @� @���r�uvvFzU�:::�������^�o~����~�R�	 @� @� @� P�}�Ls8������}o�������7�x�9:� @� @� @��V!Z�?��3�	��o_,[�,W��� @� @� @��!0�B��^{�(�m���� @� @� @��
���h��7��M|�+_)�5kVq� @� @� @��jr�}����'�|������#�:�����C��3g��� @� @� @��R���/_+W�,5���3&>��ODz� @� @� @���@.B�3f����c���%�7i��H�����_|q�_�� @� @� @���"D{����e#@� @� @� 0�C��6 @� @� @��I +��SO=�/��S��{���>�W�X?��O���-�����0aB��$@� @� @� PN !��
��[o���w�[����!�o��<�H6����}�C��� @� @� @��>rq;�E�e��Q���k��s ��I'������?����B5 @� @� @���
�"D�������o{tvv���������={b��ee�:I� @� @� @�/�\�h]]]Y��O����;���+����9 @� @� @�T#�����=Kz6Z�m����"mmm�}; @� @� @���E�6s��l<�?�x<��s%��w��������Y�f�� @� @� @��V`t�Y��s�������{������8��s������Bm����m��X�zu<���f����S�L���vld��M� @� @� �/�\�hs���?��?�������>�`�*E����W���XhWj� @� @� @�4H !Z����+���3�������*�5m����������+Y�	 @� @� @��r��Ue����g�yf<��3�v��X�n]���+���N�Gyd�r�)V���q� @� @� @�*
�&D+����-�:���G� @� @� @�] w!��={b��
�i�����s���#�8b�a\� @� @� @`�
�&D��wo<��Cq�����-[�3���v���?�y����s����^��~�� @� @� @��
�V[�������������R�v���_�F��=+�����~�} @� @� @� P�@.B�%K��}����i���q�%���W^���>�������c���}�q� @� @� @�@9�\�h�/��0i����������y��}�k���1n�����5k�,�  @� @� @��r��V�^��!=������x���5�L������*[�I @� @� @�}	�"D��eK���S��5��������1c�p� @� @� @��r�M�>=���k+�'���
!�a�V�� @� @� @�z�"DK�9K�<����=���}���w�Q�<c���� @� @� @��
���`#���??������sg|����K.�$:::�.���#V�X��k���/��?���b�������	 @� @� @�r*��-�(���k�Uf�v���}�{E������g>S��v&L��^{�~�| @� @� @� P�@.B�4�.� >������b��
%�w�����W_���%�8A� @� @� @��@nB�4��O?=N9��x��g���^����^{-�:;;c���1o��r�u� @� @� @�@E�\�h��c������5jT�|����������={vV�\9� @� @� @��h�}�?�+_�O~���|������iS������n�U�VU,� @� @� @����X����o���k����l\x��q�UWE[[�~c��gO��G?��|�;������{�2> @� @� @��F !�	'�O?�t���/~�����O>����3����b������x���c>������/~�C� @� @� @�Z�\�h��sL�q�����&�n������=��C�|0���[ZZ���.�+�����j��(G� @� @� 0�r��)J!��7�<�@�s�=�k��x������T���`u�Q��� @� @� @�@r��A�������������n�o��~����+��V��w�E��@IDAT @� @� @���U��s�����{�G?�������c�������c�p� @� @� @��
�&D{��'�������W^)�m���1m����/~���Y������q�5��)��R,g� @� @� @�@r��}�����}�8����O��O�[;�5*�}�����;b����i�����[��c�X�?�X� @� @� @��jZ�)��2���o�]8������o�w��]����x���3����ZZZ�c���/c������ @� @� @���@.V���g�'O�|�q��g�9�q���������g���~{�����  @� @� @��������<RHV���>��X�pa<���q�QG��� @� @� @�J�"DK������<�i6 @� @� @��4�3�}��X�`A��?�s-���7�_��_�_��_��-[j��J @� @� @�#[��V����K�~������sf����x��'���.��/���2;v���^{-;��m @� @� @��+�t+�*
`��U�n��X�vm��� @� @� @��I w!ZM�T� @� @� @�@?�h��R� @� @� @`d�F�<% @� @� @�@?�h��R� @� @� @`d�n�avww���+�^WWWv,��u~����q� @� @� @�@�6D��uk����m�������� @� @� @�l������Q�m��y�A�� @� @� @�4�@��D{�;��v���;wo���q����* @� @� @��L����L�W\q���
�&@� @� @�h
����cS�� @� @� @�������7x @� @� @����h}�8F� @� @� 0��h#z�
� @� @� @�/!Z_*� @� @� @��h!���~�'@� @� @��K@����c @� @� @�#Z@�6����	 @� @� @������ @� @� @�������7x @� @� @���F�u�1���3+����b<���� ���q���Y�l� @� @��t+�F�7��G������X��_��%��]���D
�l @� @� @�@��h�F�@
�RXVnK�-y[�4gr�b�2+'��ZC @� @�������p���}��_���3��qH��wC�����
~�w9��M?�:H� @� @ �B�\O���N�?�i\�~{v����<rK�)�;��|�,<q����'6e�r���r���f���K��<�� @� @�4D@��v�Z�g������
m���Z��F�1O}��UZ9u����
$�L�a-[�}��-u���hyg	 @� @�y��m���@
���jo�X��U!0�j
k��(�Z8Aa�����}�8J� @����-��������-��hV�T�U��O�w�^�J���R�W�C��_�z��?aaZ�[����Gk�_���%@� @�����N�.K�����U�F�<d\L;t|C�_�G��tJ��Z���+��+�5���V��'��mD�m���s�%���i�z��>�����5��:�7 @� @�AB��k��P���]���e������"�?�����
���+�v�!�]���B�&Ms�O,|<�=O�t@� �K���s��z]~P?V�	���_s�~q)L� @�#T@�6B'��G��-��-J�h&@��w����w����S_�@�.���������e�D�!�@��+Az����Z�Z����I`�~���e���4#�e��9��j���[�����E�>��#�&���+'} @�j!�P�k�@����������<)�?@��+'k�7Qk������z�
�O����h)��j�-T[��c��oL��/�l����gv5E�e��c��U�
~�w9��U *B� @�@MB���T"�?r���������7��zi4C�6k
>��w�����h����=e��@���j�UZ�~{�i��
�oO��W
K��t�?��t���z��p�f/������=5� @� 0�B���t-T+'�sD^����So��}�J�J���h��H���F���+98�#�&�i=���f�f7�j��U;��'kY9Y�qU;~� @�^B��5�FC��a)���a9��fPCu��j�T[.M@�VVN���?��@��V�3�z\�����r����j��J�+�Q���!@� @��B���k�n?��$V��q����~�������D���d���@?�����J�����j�����`���\'}�*��+���K�� @�@��h��/�%@��T��G�tb*t�������K��=�K?�Vt2.�:�@P�{5���B
!���T:?�L���U�	�"J=�/�j�u�G���s^]� @``B����M��O�c�q�0��~�*�K!���~��
���l��
�s;��>�>�\yR�\����?}h��VNx�*}�*�p�����
F�����N-�)�����/�J���> @���hM>A�G� @`��t�z���<�u�z�D��@�^=B��'kY9Y�q���5��4���}����7��0/�4�f���/�u��]�h��KVt� @`��h�N�� @��_�-H�����z�P?������JAL5a`�U���������*�_���Z_�~{�i�������@���U��g�����R�s|`���9~3T����m���� @��(����D
4X�+'+}�+�O$����RSM���"�K
9�������O�Z��^����x5�>{�AP�[�>��
������=�B��=�FO� @����+'+���&�Y=a^aF��S�����:�C�kW�o��u�Vs���j���f���?�ws> @��! D�l� @���@�?,���A%�J�S�yU�S�o��i5Z�=P��j
�j�W�#=�����P�X�����r�-w���q���m�A]� @��M�������V�U��'����q1���i���^AH��9H`�}g��H��������@��.�[@�����{ @� @`�*������-��We�Q������c�5����{}���-<�������R�s��M�r��w����H����p=�]A�;���m��� @� @�
�����������O^<��zh��r?���j��.1��zt��n�����"�U����;W��K)�����.5T��%;�d'�u��3�����&#��X@�6���#@� @���@Zi�^=T�����2[�:�Z/�{����f��N�\����0��l�&���w����2�,P��P��`4_��K3w��G
)TK�[>������1���<�����y�����j�S� @� PB���%�;L�(P�w����������bG�{���^�6���J�^��D��-w��e}<�R�����r����]5��\��_�u���9�'��[������@��B���5�
��4 @� @�U	��������x�+Vs��r�C�s��ZW��tj
��]�����'P����R=���/D4�s�*}�K������b�r��;q���T+[�el������K����=E�w_���s�g @� @��&����
dP�~�/w.�9����v��|��%�4�z\�R��v����Z�;��d��c5[*w�u��)�L�h
��4 @� @��F
x�_u��~�/w���.��T`R�j�r��N!��dZ�r��R�.2�z\s �Q���h�5zC� @� @`��AC�
f�����[��]� ApW��N`���T�V\i�B��
�O� @� @�������K��wp��1U�6i���3}b5E�e����Bv�B�;�N� @� @���	��z���o_T���?�x�[�6�V�p����Z�]�Q���k�z��,�fg��%D���? @� @���@
��]������Y�4=C����z\3������5��\���`���{ZM?O<���]�	��Y @� @��N�������}�Hk���)U�@=����f�>.��W�
}k���RG�6Rf�8	 @� @� �c���,�z���$�<��DB�&��vE�6@@�	 @� @� @`�gCg�������5SM!Z3��� @� @� @��F@����l�w��� @� @� @������	 @� @� @�r. D���> @� @� @������	 @� @� @�r. D���> @� @� @�����K�����w�uW����,��^�|�e�����'�������+b��=1g���u�9���������Z��z%;� @� @� @ �B�:L\
�~���W�rggg�!��-[��o�
6�w�M�6���>�-�O}�S1u����������p� @� @� 0�����w���U?���={v����;�X�������������]|��q�e�C��7��>��x��7
���Z��Zo��} @� @� @�#+��<����'b��U�����,6o���?���c������}NAZ
�^x��x��W��'����;�x�Z��Z��� @� @� @��0���&����z4q�������Z:8v�����k�=~��G��i�������q @� @� @��H@��D��|��H�=K����&M:�w�{lr�!���K����;��Z��Z���9@� @� @�FB�&�����{�B���tk��3gf������m��k�[k���� @� @� @`�x&Z�gr��Eq��GkkkL�:5�M���`c��9�������T�����Y<��k�E�\k�Z���C�3v @� @� @��D@�V����O~r@�������1����yVXU�*L�0��z�=�������Z��Z����|��cG��� @� @� Po!Z��*��'FZm����u����}{��e����W����j��=������Ue�m��q����������of�����^���|On/���P6�- @� @� @�	�x����J�V�����o���+�[�.�����c�=��������zjq�����x��s��F�U����7�����^�C��V����
a��"@� @� @�F��mg~�����},{~�s�={���_��W����*��\��k��b������Z��z��N�
f!p�f5E� @� @�4��P,����6���kw���>������__��6�J�5,l�'O�vk�[k�B��	 @� @� @��Q@���Y?~|����V�Z�g�t��\�����+D�O���Y��N @� @� @�r. Dk��^���jz~Za�7o^a7�~���~��M�6���k�C'�xb�;6���n��z��> @� @� @`�	�yF���)�*���7.Z��xz��9�����(|N�x�����
;����
�1���~�uk�Wl� @� @� @`
��cj����o����;wn�s�9Y(�n����K�,����/v�����������g���s�=7V�\��������j��-o����x����G��r�!q�i�
J����_| @� @� @�!� OdzNY�V�X��J]>�8��>p���.�(��z����{w|��_?�����_}�V������^�]� @� @� @���& D�M+�����x���c��u�]=l��M��.�,���B���Pkkk|������l��K/�T<�V�{��q��W���S��;����^�]� @� @� @���& D�M�������������/�o��f���EGG����j>�k�_~y���}{� -h�vX_��;Vk�Z���� @� @� @��a" D��DN�4)�k �����c�������^M�T� @� @� ���M�']"@� @� @� �P!ZC�5N� @� @� ��B�f�}"@� @� @�h�����'@� @� @�hF!Z3��> @� @� @�4T@��P~� @� @� @�4���gE� @� @� @�* Dk(��	 @� @� @��Q@�����O @� @� @�
�5�_� @� @� @��( Dk�Y�' @� @� @���
���q @� @� @��f�5��� @� @� @�@C�h
��8 @� @� @�@3
��qV�� @� @� @��B���k� @� @� @��h�8+�D� @� @� �P!ZC�5N� @� @� ��B�f�}"@� @� @�h�����'@� @� @�hF!Z3��> @� @� @�4T@��P~� @� @� @�4���gE� @� @� @�* Dk(��	 @� @� @��Q@�����O @� @� @�
�5�_� @� @� @��( Dk�Y�' @� @� @���
���q @� @� @��f�5��� @� @� @�@C�h
��8 @� @� @�@3
��qV�� @� @� @��B���k� @� @� @��h�8+�D� @� @� �P!ZC�5N� @� @� ��B�f�}"@� @� @�h�����'@� @� @�hF!Z3��> @� @� @�4T@��P~� @� @� @�4���gE� @� @� @�* Dk(��	 @� @� @��Q@�����O @� @� @�
�5�_� @� @� @��( Dk�Y�' @� @� @���
���q @� @� @��f�5��� @� @� @�@C�h
��8 @� @� @�@3
��qV�� @� @� @��B���k� @� @��?{�/Eu���TcE�"*�"* b�1(���Q�����DM��F�=j��Q�`AE;6D,�X@����c�xv�l��;�{?������S��������9@ ���X*�	@@@@@@ V�h���r@@@@@@�$
DKb��'@@@@@@�X�����@@@@@@�(@-��B�@@@@@@b �+?+G@@@@@H�A�$�
yB@@@@@�U� Z���@@@@@@ ���X*�	@@@@@@ V�h���r@@@@@@�$
DKb��'@@@@@@�X�����@@@@@@�(@-��B�@@@@@@b �+?+G@@@@@H�A�$�
yB@@@@@�U� Z���@@@@@@ ���X*�	@@@@@@ V�h���r@@@@@@�$
DKb��'@@@@@@�X�����@@@@@@�(@-��B�@@@@@@b �+?+G@@@@@H�A�$�
yB@@@@@�U� Z���@@@@@@ ���X*�	@@@@@@ V�h���r@@@@@@�$
DKb��'@@@@@@�X�����@@@@@@�(@-��B�@@@@@@b �+?+G@@@@@H�A�$�
yB@@@@@�U� Z���@@@@@@ ���X*�	@@@@@@ V�h���r@@@@@@�$
DKb��'@@@@@@�X�����@@@@@@�(@-��B�@@@@@@b �+?+G@@@@@H�A�$�
yB@@@@@�U� Z���@@@@@@ ���X*�	@@@@@@ V�h���r@@@@@@�$
DKb��'@@@@@@�X�����@@@@@@�(@-��B�@@@@@@b �+?+G@@@@@H�A�$�
yB@@@@@�U� Z���@@@@@@ ���X*�	@@@@@@ V�h���r@@@@@@�$
DKb��'@@@@@@�X�����@@@@@@�(@-��B�@@@@@@b �+?+G@@@@@H�A�$�
yB@@@@@�U� Z���@@@@@@ ���X*�	@@@@@@ V�h���r@@@@@@�$
DKb��'@@@@@@�X�����@@@@@@�(@-��B�@@@@@@b �+?+G@@@@@H�A�$�
yB@@@@@�U� Z���@@@@@@ ���X*�	@@@@@@ V�h���r@@@@@@�$
DKb��'@@@@@@�X�����@@@@@@�(@-��B�@@@@@@b �+?+G@@@@@H�A�$�
yB@@@@@�U� Z���@@@@@@ ���X*�	@@@@@@ V�h���r@@@@@@�$
DKb��'@@@@@@�X�����@@@@@@�(@-��B�@@@@@@b �+?+G@@@@@H�A�$�
yB@@@@@�U� Z���@@@@@@ ���X*�	@@@@@@ V�h���r@@@@@@�$
DKb��'@@@@@@�X�����@@@@@@�(@-��B�@@@@@@b �+?+G@@@@@H�A�$�
yB@@@@@�U�I�kg��x��������[o�e�{�=��z���7��t������+q�%C � � � � � �@9��C�J�����;��#-�K�,1o���7n�9�����C��� � � � � � P��j�T��������3����iS�����.]��>��L�6���������4o������E��Y@@@@@@�� �V=eU���X���w�}v���������l6�t�`}}��1�]v����#��=������J�� � � � � � ��&�r�m�S��s�=g���;��{��@��~���>�4�x�b3g�
�@@@@@@�� �V�E���=��3��}��
���m��&����/� � � � � � ��(@�K��mR�gJ�[�6������C����>�,f@@@@@�E�h�X�n���K�k��V����Yg�`���-�@@@@@@jQ� Z-�j����_��������e��s�����w�
@@@@@@�F���v�Yy
|�������5��M����|�����+�y���Y+A@@@@@~��`�(���7�|���P�k�M4r��-�� � � � � � ���e���o[�jl��v����gm���4YY�0�,�e� � � � � � �@&�h�d���M������B����2n������*D1b���� � � � � �TJ`�J���$W�u��6s�~��Y�tidF.\�o��m0� � � � � � P��j�T���]��9������3#�~������{��@@@@@@jQ� Z-�j�����`���������B{��g����_�t�����a@@@@@@��������
����Q?gj�q��y���.2{���Yg�u���s����n~����]w���0 � � � � � Pe�������UVY��q�����7��-3�f������b�k��|F@@@@@jM��k�D�����;���>�t����l�2X�lo��6l�9����� � � � � � ��,@M�Z.��m�
74�����?�h>��C��w��v���f���$&G@@@@@�[� Zu�_Yr���+�
6��,�f� � � � � � �@5��c5�yD@@@@@��A��r�2@@@@@@�j �V
�D@@@@@@**@����@@@@@@��UC)�G@@@@@@��
D�(7+C@@@@@��h�PJ�@@@@@@��M*�6V��|�����'�0�������Wd4��/�l�|�M��[7��gO��i��i�@�f��mf��aV^ye�n�����b�dad��,Yb�L�b>��#��];������W_=2|���:u�=��������z��1�J����F���m��>}��[%��z$�������j�-[Vo���w7�[�N�����������Y��o�aV�Xa�v�y��� �Bb�f��F�������j��c
b��[/���Y�h�i�������i��U����#aD���������~�����ndN�&�dI������M�81u�GB�ZH���K���;���~�y��O�[s�5�4�N��7��n�����3�R�.������j?6lX��?���:��4>�$p��������i��Zk�U��SO��(um�cwlu���{n�t|@�R��/������wWYe�z��[��H��VZi����;v�|��i�aM��C���Y]�&Y����[o
r�5A@�@����_���c�~�]w��P0��%�\R��E���v�����4iRZ��_�8���[o�U���[����o�_(k\�T@�v���,��c�*��@�
�z��f���3�`X�M���O����j���+s�W�^x��|��f���f�����J)0m�4[���N3>����?�aZ�liR7��������5�` f���G��w��<��C����3�v�Y�x�9����r��������
2&L0���3��9����<�6-���\`����*���,������M�k�O���z��t,u���I���_|a��sO[�\����U��k������#��1
�~����k���r�$6�h#�;��1�t,�}��C�&�5�����������c�=6���5�` F]�z����/��U��o���Gu�Q�3��&p��>1�]j��dr�A ��K]p�����yTM4���1����K��m������$�-�t���T�i�9��3�>x�q���_
b�����r��7��5o��.��}�L_~��wu���N���L5�L?~�x�op��8��@*xV�k��v��N��pM4���(��K ����SN9%����Yy�������gG��q�\d����n��}8��C���1
\w�uv�L�r��cG[k]�y��_�x���+��b��]v�%-g�y��zX���� �'���V]ANr�@���=m������P�w�}���k����T�di���@9���{��-_}L)�4��_��q�ciY���/M�b��1�
L��TC-l��)��p*�aR7��5��H%�j�\v�e�k�����
��&�`a���cRM���N:)��sM���/ �j.����Or�16�` F�Ts�v�/������o���y��w�f�mf��]��1����R����qHUh����������4���a�[ ��n�i?��C{������M���~V ��@O>��]��[o���5�` !
����k��T�}f���A���*m����8
���:��HR�]�T�?i���r<��cf���FM�u��5rc#Y��4H=�k��2���.��:	����	jn�����y���������n:w�lR��^�k�8K�ug�����m��fR�'��>8��cl@�@�:����2?���c�=lSy�kL��MM��Y�3������Xc
��p��������������D��"#��V@�$T(�V[m5;Jm����@�S`�7����������$@`�
6Hk�\5�RM99������o���U�V�< P{O8����>��������/*(��Y3[]����]=���`��4�������j����W�1v������j���T3��x��Pw�a[3]Y�� �c�����J���)�A�������eO<���g�}���>j_�����k�~R�_c(VYO@_�@�����/74�����#U�zAI�����j���FV�d��"�@)T��o���HW-������I@O�6������+��#e-:�-J@��s�9�vj��H�q{�����������[�����Z�����m���/��g���c���=N�T�}&���mzT7�~�ms�!�����<�� k��1
�X���])<��#f��e����6���@IDAT\p�N��c��� �Bb��/_n���j[��oMA3p�������k���S��{H�5e�~����@��
����=N��m��T?~F��z��G�f��w7&L��r��qMg)�n�h��1'5!�6����{"�m�J�	DB�R
6���~�����M8p�i�fM��C�zb����2�����M^���2e�=z���k�?�F������l�X�t�9��s��G51��p�����l

�o	%���,	�]����z�@������n��������)H���Z�(��K���������SO=�s_}�U�j�	�U,�!��r�Q�
x����/�c���p\�����}��7�x�#���~{
��G����k\%�z�:����I�&��������hm����k��Zu|&�V�D.(��jO(�������;�Q�#PV�~��t�Af���FO������>��z$�H��n���Yi��9�=�����.�9�Z&��Y@m�����&�^{�e�K�����Q�Fe�g�&(s!���M�4�5������������	J(����=Y�p��`������8���rMP��`Q	�Xz�%��yN<��z�f�g9���bD�����_��0d�[�G���G%�W�����F�N;�d�Px�>}���_�~6���W�	R��A-�h,��U��[o
6Y5����[KB���(����SMc������p�
��d�dad�Tsw������s��q]�t���H^s�5�E��<X_��A����nvb� P&�M6��<��Sf���i��]����4�����k�[�B`�	��f�L�+u����s���M4�W_���(�j���G}d�~�� V�_5���5A@�@B��ok�l����&o8[c�"|�C���4{������7�~��J����?	PmI5�������C�	XPyd�I�0	T��e�]f�yS�Jj�i�������6S�N���f����g�o��
��i�3�<�>YY��O��H����2jNDIU�O>�� �z�W�-�k@�@��E��W/���;=h������q�lS8��Xa}��F�P��~������j*���'�Mb�5,��;��C�-�M}���%�	��q
�����������Yo��l�j�I�������g)�n_������9����O?�����.��N�f���&��I���G�j�)�cTX�8+j1A�������1�:u2�>�����{��k�mv�yg�����w��m��v����Q���f�27�|�mD��[����� a�V@v&���K5�������$�����n"_���z��/��R]�n��R�[�iSU��R�`�3�@�RM�E��n�}��w�,���$�z"�.��z]*���`D��A��ROI��*�Li]��D0m�I���}��������@�R���5o����8��#aD�R�u�R�K�W]l��fuw�}w�����#aDL���R�:�m�/��|0-7\�q�!F��av_M5-^���L�cl&�WJ �zG�����T-�������bu�?�|Z6�_�8����[n����Z�����{���k�z$�1`�W�VJ|&�^�M��� 	��j����
7�����U$��ZE�U�Y����{��g�f�q�Y�f�4uEh,X`��T�)$�P?��[�h����,�����E��u�j��'w3���5��+)�f�t]���J��5A&�WR����2��i�����8����r��U����7����-[f\%�kF�����x�~��];�MN��rM�M'��h&M���$�y�@@@@@@��A����� � � � � � �@��%�8� � � � � � �@�%�� � � � � � �@��%�8� � � � � � �@�%�� � � � � � �@��%�8� � � � � � �@�%�� � � � � � �@��%�8� � � � � � �@�%�� � � � � � �@��%�8� � � � � � �@�%�� � � � � � �@��%�8� � � � � � �@�%�� � � � � � �@��%�8� � � � � � �@�%�� � � � � � �@��%�8� � � � � � �@�$!�@@�_`��f��W6��5���id[���KJ>Y��������f����C��k��e������%K��=z���_���b� � �� �Vys�� � �@�	<�����;�0+�����������[s�X�
Z�l�]t�V�������t�e��J�#+J�/3�?�,|UA�o�������K�����$������@Y�v�L�-��f��1f��Ef�-�4��~z�w|@@�~�h�_�l � �
x�����q�
^�v�mg��zk��7��y����w�}W�r�w�y�y��l�q��Q�K�.�(��T���v�Y�ti�<w�������v�l�[l>����_f��W���e*�]m���&�lb?����3������[o�&L�`^~�e�����c�A6x�`��7qj��y���g�m~��s��G��v�-m���{��z����+���?7m��M�� � ��-@�����#� � P56m�������k� Z�36�����yS����^;r��������c	�Ef,��
.<��c9�T� Dk�����33
��!�o���=W�<����I�&��C�A��W�e�8��=��ys$��y�u���'�T�L5Y
><���^y��6����N;���
�U�MA4���~�i3d���;@@�_� Z��![� � �@�:u�d6�|�z��3gNP�,�{�G�8��s����������9Q�^��[o�e�T��jJ
���k�`����o��f���lo>����4����)K��������4_��
�����on��&L�e�]2.��/�*+���}�]���g\�`A@�o����;�hk�>���A�k�
6H�T5Q���UVY�����o����:���z�I�I�&Ds0�#� �5"@�F
��@@@�!
�E����?�R�������!�ht�� �{��{���Mw�%W�i���7�rH�Mh��:G��see��!�/Cv�^�6�l�V>��K��f��i�q�]w�����]���3��	�{��`	����5�gJ
�:��Y��>�(����,� �>+���������;���?Z�}��gio��FI�a�Z�@@*/�r�W�@@@����b�H�<�%Sh�M{��*&�Z:��-6o��[H�B��)��$�-[�,���Yf)�)d{K��b�+��Ph���
h�����z���0'�|��a�y��������^�����rwA�UW]�l��n�}w��f@SRM5�"SR��.�ws�5����!��i��������'� � �T�5����@@'0c�3~�x�>���p�
M�>}l�w;�i�����{���sm�l�N��&�`Z�n��~����c��&��d�j��m���o������m'�����{m3��aw����	&����s�Yg�u������r�������$�?��OnQv[��rK�������h���9��3m~�	�;�m���c�}�]��'�q���_�Nkbn��Y�f�k��f����-[
��L�l���B=������>��![��e�1��|R�c���}!�o@�,�6�K/�d>��#�W�V��^~�����M��������y��KI����
��*UV��w�����E}��G�D�?m���\�������nZ�ha�q������U�*'���={��; � �@
D��Bd@@H��a��7��p�B�,��,:����e�����5?HpI5x8�K��SO=���`S�kW]u��
t?��^���^0��7-[�&Q0J/�����+��i���4�_^!yS ��w�	����.-Z��
��?���4B�������#���n��T�x��S�5����om��1����w�����<]r�%f��������
���pv
������[_��C����PP����v����~��h������Q�(������$���n���Yg5����Re��I���y��Ok��V0�XWr5�4���6��G�)S�������a���u�]����u*�����)��k9|� � �,�h�*r� � �@����cM5�T�F}����a�%�����W\q����*�� O��=�g�W��?
.�����oA��w�E�E�v��[om����o�mTH$�\R
�pR��.���W/���j���R�ySSq�z�m�O�Li�U�g��wV��w�`8���.��rK��e�]�/�Kl{��gl
7��s�
w������h���W�������A�Y}���l�~/�����kx	"������h��n��� �6p�@[F
�j�Wm0%����>������.h�����i���B��J�UZfS��u8�i]^�P��O��O?4���kW=z�]�~�����QM4���1�� � P�����@@'p�G��]�TD7���N��DS�I�.��N:)-H��
t��<��R��p����^��C����~���?����4jZr��A�jz��\���C��������W����i>5��>�r�������[���Gm+��]w���nZ�{�a\S�n�n����p�=���P��R��W��������V[-�����4ib
�,���2�i��6��|
����+v1�m�����~g���>�Z���/�5���VU�3��z�������GuT���?�UV.�~
�� ��
�}����K/�M5�����k��T�^�
2����u����_����ff"@@��
DKl��1@@�S@5��W��T���T[�%����g�G�}�d�{�+0� ��j�h�T�DI�����pRN�4M�`����jv�i��df�������z+(b���T~����j ���z��{�b�-����G��hJjv��I�f�������o���J9�T��i($����{P�FIO?5k��t���6Q���Q?]��9s���I������������S����U�~��~�07��5;����
7�`��jrQ�-�<�@3n�8;^�s�1�f�j�*���aU�SA��;�E����������� � �@�
D���#� � �$R ����XTm$�\V0�%t���R��Mm���;����KjRI������
6D%� w�65����o���6�TL�"V�H?����@�In��I�\�=��r�W�o�b����g���e�\�o���O5�������������j���F�����������J�?����<�A4�V�?���TmW%�u�QRMJ�bTR-O��.��r[k��L�Q��<�����o��u����N�hN�w@@���U� � �T�����K�I9}V_izeK_������A}�eJ~���?������p�y/�������������T�L�����\�5��ps�
]^1��f��K�ue����9�7�q����
+�
�(�����y������T>�P�|�����g���������c�5_���?����<�����|��M�`��W^i�1T���E���/45�����1c��S���T6���2� � �@U
D��b#� � �����etI�m���c�w��V0@��M���2%?`�i�L���[��2�5O������_|qP�O}��T����&�j9�Y^
u�~���=��ckW�U{O/�����<����;���<��Sl���}<��C��U���S9��r��_�O�z�5 ]pL����?���{��g,���sF;G�e��jV�K�.>W�����&� � �@U
d�o�*7�L#� � �@5	�O"���f�����}����Zk����Li����W����Y���[�E���_�NM��J�Ms�����j0��+�������z�-3r�H;\��,���*`���S���F�"t5	��6����R���;�m���(X�}��'�4���O��r�O�(+�\R�S�a�+�V��BS�w�}m�3
�~���j������h~MW?/n�@@�S� Zu��F@@�&�������{����Kj����s��_y�����j+$���:�Zs��2}���OVx��s���������A������L��Y!�R�0*_0�����~��m.�]�v�C�A�]������n�hZ��hC�	�(,��T��j��U�������[�f-����.�~-P8�(��v�y���q � ��#Px����m�@@.��]�s��1��Oj��m�����o�O�^o6��6y�d;^����g�i��(6on��6��L��J�&��bTR�TQ.��������4K�,����7���Z���>���N�<Z^q���l�?��T�h}��5�	U��U��}��v�/^l��s
���(+�-�?���)�~��7��T[S�8�}����vr�l�-[,��6��?p�����a@@�� �V}eF�@@�)��>8���Q A5q���j����u�]got������*���_~�;v�Q_E
�M�4��u�YA@i���6��)&onc\���!6z�h��QO=��y����$��}����^hy�3�|P���k�_��� H�7]�����?ok�i����As�ny����gC�+.�=z�nN�0���O2��7Gu��ay�q��s�=��+4�k�\��N�v���Se��cG����V0��:ujPOM��5[5��nh�V�p�IX�v��t�;��\�N5�����e@@�zh��z���"� � P�[m���a���O?m�g���[�^Q��e��v��^x�<��s����w�m�
�~��5qXL*6oZ����uJO<��}i��#�n��s���.��d�i��o����.X�����^�M5Ew�%��M��eK;>md�}���!���k�m���o&N��q�R�C�mw��GFNW�H��~���w��OZs�5MC��8�J�?���~3����������<����y��G��|<������S@Zi����u��v���t
�+�Z���|�#� �$_��h�/#r� � �@l��5��v��2�79����&jY�ys��'�[��J+�[�n�z������R��t�I��c�	�v��Sm}w�	'��:]^4}���e�7��������[�}W
3��K�|(�#F�0{���QS�~j���Q�M���|�����B�n�����w���:��sY�u��7���i��������������+��r�[�U�6]��~����}/�t459z���� �je��0{��W��>�����O>q�������gR�L������@�>�?i�r���������>���p������qT��M�6��5�����v��F����~K�k0���_|������#!� �����;��nt�>`����2�@@,�&u�YA�p��_�n��� ��J�������s.K��V�j�iY����T���+�|
"�k�.-��7������b���a�5��|��B�����h�����<��h�"Q �P�w���Z������r
/3<���m]��s��=��?���-�|��m�l��|E�F�ik��P��}]���@��'�l������=��q�V���7�����|z/GY��?��������=t���&_��S�L1�/Q��N�:�������[���(*)���|J���������q � ������R]��@�sLvY�;@@b�Mc�����
7q�GA�|���O�������R>7���|���b���H����WP>L�+W�C���M������e�Z��UH���l������O��|�+�ug�.�w._Q���?��~��kD�����gp���|r������[��(+�����{����>}zdm��������Lu��FM��^}�U��j�@�Rb � P�4�X�eG�@@@�\@5���{�=s�������%Kl
?5Kx����	&�i�7AK�GAz�<��,X9])G�_F����{���aY � ���&Zi=Y � � �y���>��7����'�x����7onN9��~�H�����O?m'Pm�r�M�f����;��S���
 � �@U
p�]��F�@@@jA�{��f��QF���_T�L����cs��F6�Y�������a�&��b����S����F�N � �@m	�������uR<`����:�@@@�*��������M��Mm ��gUTxd@@�&�P��I��-i��&���@@@���m��H � � ��9�_�@@@@@@ a�V d@@@@@@ ~�h��9@@@@@@H�A���A@@@@@�_� Z�e@@@@@@@&@-aBv@@@@@@� ��@@@@@@��	DKX��@@@@@@����_�@@@@@@ a�V d@@@@@@ ~�h��9@@@@@@H�A���A@@@@@�_� Z�e@@@@@@@&@-aBv@@@@@@� ��@@@@@@��	DKX��@@@@@@����_�@@@@@@ a�V d@@@@@@ ~�h��9@@@@@@H�A���A@@@@@�_� Z�e@@@@@@@&@-aBv@@@@@@� ��@@@@@@��	DKX��@@@@@@����_�@@@@@@ a�V d@@@@@@ ~�h��9@@@@@@H�A���A@@@@@�_� Z�e@@@@@@@&@-aBv@@@@@@� ��@@@@@@��	DKX��@@@@@@����_�@@@@@@ a�V d@@@@@@ ~�h��9@@@@@@H�A���A@@@@@�_� Z�e@@@@@@@&@-aBv@@@@@@� ��(����o���kSWWW���h@�L:�.^���~���L�Ud<���0�	�X��|�����|,�@C�2o����"��p+D�i@����7S!��)p��G��������f���U��������5�X����*U�o2�@-
�{,�T���s������~:�y��U+3b������U���n6�#���e�lt����>�����;�J+�d.��b�����2A��������
?��S��E�L��}�I'���E1/�������q�j�7����#�%D�jM� Z��(��@�tQ6s�L��j��N8���r�
�����l���6��p�e�Y��@S���R�������5��n��Z��H)���K��'�l�/_nz��e�8��R,�e$D��7�0&L0
�4o�����N�k��y�N��)S���Q�u����]�vy�_��{��dg��a�z��z5p���o��t�����}3��o4�'O�9c�!���Ziq��yNH�P>�M@�N$>��S��������i��Y^������{��5C�N���y�\�}��GF�Y�[lz��]�����;�<`�k��Q�K�.%[v9���m��ft=�-u��������$���]��V�;3U�@��&9���L��-c���x7d��l��������,�y� �9���Er���5�c,����4�H��q|���8���L�A���9A�*^}�U3{�l�W]���l�4}�t�Rs�q�2���e�����E����I��R.\hhZ��J�U�� {si������.3m��q�2������.J�	��:����+#B�/�=�������c��f�
�6m���
�d���b�ow�q�]���h�6I���s���C~��_f}H$����|�����>�'��>��C�Q����.��#��O�6�����������
�+)�U� ��o�i�� ��q�D�7o�y���l���Q��~����?��1S���^;�*c�._eN���6�(�|�R�!�9���>�9�'�|���s����G��|BJ/P�;���KD����_}���W_}U��X�4a�'Z��T[G7�r%�`?��.�|��{����
(@������h�Q@W;���s��%E�����7
dk�j�^O)3=q�D[�D�Tm�M6����gY%p���8�������6�~�6��75,R���U��[o�5�\�*~W���f�3��VK~*v;�9���:^������a�c����5��2��3�=	�Z��S}�Y��8�-����sLZK�x�U~��A��S�li�
����n��^�[��6k�UW5W_}���,U��O1�-�n�v���J�tU�kW
 ��|�I��^{���2��&�4��j=�3����{T��~�a�vx���m_e
�~����&�\�3Q���tti���w�U������L��Q5��g?Oz�����3�o�}V����k�8����_����f��!Y�q%]a	�}*�Z��n�;���f������_�%_9D�	|1���5�1}��SQ��WV�����a�9���x���'@���Jo������.0�<������\}����p=������q�
"���E��7�?���b�����S���'���sK(4��6��V����������c�^�N����3�<c�x���L�n�2�V�5�������5G�q&��b����f�&���4���6���j�j�Qs�^x������K��&O���~��\��X[�y������	U���G�F�J$�)���{��'r��q��_���/�^��!�|��t|+��f�>��A�lI��U����W������>:���/������u��7������lH��-�|����-��������o)���p4��]���e�?[��K��k����4��Q�:�0�c��2'�9�>�W���E�9��"���&��B�>
�{�����v���T�yCy��!�	D���5#P��iW�23���Zx!.���@�:�V��_|��/�c���~���P���pZ�|�������������V[����<�j<�������}~�����[��h|����CxU%������������O>���R�@j&Nyn��m�u���������prg�y��3M����{��������*�M�2\e#�}�q5��Yw��J�h�h���j�Ds���l�����?� �~c���[�l��Mi_�z��f�����M�k������{����}9�����Q�VjP}��7��[o��M��E���Th��y:���'A�Kv����>�4��4���[�q���7n���|����G�c��w�e��p1���?����?�
F5����m��f��SN9�l���Z]��}X����sW]uU��y���N�8c�B+W-��.�V�@���
�R�G��`�>�t���_I�3��s����s�����u�����>un�9�O
��{���Z�/��������]7L4�Yg���?*p�����^������U+���c����M��u�/7_)�U�u���F}�������c��oo�M;��c��<��S�E1�g��r�71b�y������o���]�����c��w_���K�{���W�z�A��2T�>��?�)�T���[n|�@�e����O�����u[��f�2�&M2z���W�iSm�������L}[)i��|����htMUh*����]��x�k2�/��y�R3�j��'w��2�8��w.OC>s����9'���}�?_�s������������tn�������/�\{����;�@B ��Q���T-S�-D��-��n�����	>��3��y\����g����n�^t�E�f��^O���p������P�@/�������a�������/tw����F������k0})�����
ey����wzi;�=��z}������N���n*(X�m
��?+H�)���G�4���)w����&�t3Z5"�.]j��i����B]7��v�a��OQ�����E���O��y�09��#��������
�W�����i�j.�y���n��S���b���o�;���z[o���x���
=���3?�QLy���������^J��U�y@�\���Z��}��S��CJ�M�J:���������E��[��+H��D����:i�Q(������a�"�]-4���<7�xc�tn����\s����%���������j�q��x���/���J{m���8��&����n���'5����������������Z��z�{�?���p5^5�<� �o����k����^��&=��m;�m����.����u�,Z�h����S��o�}2me%������3f�}p.��dw�%���
�fJ��+�'�O���?����]���p1e�����t�K�B�8�����
��9���HS��z�1q�1�Fg�sa��9�d���S���}�?_�:W>��#�a=]����un���;������{���9w8a�H�A�d��hd��B7[�3W�,��v�A����}��p���D��5�O-�)Lu�����'�8��&=5z�a���
0@�n��l
��3g�Q��lU�]0J
��Q�B�+t����i�'�K�t5e��H=Y���6�h#��
���wuC����gA��k�z�U���dW��?��`�����#�p��~��k�F4.WN���(�������Y�n�j��}��l����
h]�t�Ok�F��$�������
�����TR�	��P�O��;t7:��=��#l�	�<����VmI�Ls���r�)���c����z*^I7�� Z1�������qYA%��t��[la��=��K�g�=�0�;w�7Z]-*�$SHI��j-�
�e�^��qF��q�uc��)�~��,�n��&��<S�-]_�I�:>)����V������
� ��3t�^�K��W�=����u�]g�������r��y^I�1� ��\�������7��T�D�wv���;N74_n���k�z�IJ�!�|�����~��S����b�����Z�<�%�F�O}�Q{���D�}WI��C=����c���:O���K�{�v������d��[������j8�Zi������f�����j�B���j�k�u��St��(d����V��L�����:RM�f�1A�]�5���Y���|��Y�i��.�rr�8��.����;�9��
��1�sJ����~�9��[n	���G��T�?����]*�b���s�H$K� Z����4�lM1�r�p�Uc-��r��v�1�d
\��i���?�zR\���odm��f��7���MX?��9���]�LM7��z�i >7��T�����������S�
,����-�]7��O�������`�!�_~�}�\kzBW��BS%�����k�lM1�r�
��s���5��
H�������+���.�]0V�#t�)���jc(����=���n@�������EM���������Zf���i5�4���~o�|��v�G7���L�kj�T�ry(���cf-?H���*�����}����B�I�b�����oTY/�N\�H��M7����&x�lM1�2�
��s�PP������f
����W-Z�����Gc����ts���N:)�8�}E@����~��8����Zou
����+v�����������
��G�W��d��A����W��t�r&:Or�!����S��;�<;����m��g��9t-���\RS�zBi����)����.O�]�5�7>�wo~��_�I�����u��&
�R��>�>���q
j�Y�p���s�9Y+�l�����5}������I��q55���o���������j+�����^������\}j�b�@�%���G]�9��>���k�&Z��N)�p+6X����5�9?Ar.t���s�����9���K��d:W�z��AM�(������!+�Fy��
-�s����������j_ ������l�j��W�����i[t�}��w�X�����D�����,W�i�i;���Iv�7�W��4s4}�����R�I%7��w=I���	���<���nv/�uVz���S����)�r������_� TM%�pV~R�
���y���[n��?����&����O�?B�
P�p�	�4M�j��w���������@/E���2��1��X��]^��\�x/N@AB�wY�R*7W��U�^��C���q����s�dR��	��AT�L7�]
7����8@��z2YA2�p_��������NM��A�R��.��?:�\���
��n�*����C�O���jf�4M����[��kJ��_z���Te�m��w;T��^�TK��B�ZV1e�����[�\��������T�z?)p��n����]��n��2�|j�KI5Q��i�����AV���M�G6<�UI�f�����S��s.,��8��?�����9���K��d:W��
wm��0jE ��p�kV�������#,�g�!@M�d��h$3�YZS[��7��
���{����D��+D���.T��_z�GIOm��]Oh���s��#��U�9.�fU��jq��ZQI���P C7�t�/�gx���������k1�"����
`wU���r��K5!�+�7�U��5a���n�����O�(�O
���N�����W���y��[�>+@�R)���U�{5�u�>�l���m�����6�����A��+��@�m�����]��S�N��T�(���U�4w~W@F��lI�D� B��M#������[����~�����1T#�i�[��S�Y�T��+���Q�Q%�8s5��s������^�������
��r���V��(WM���k0j�B����.��
IDAT\�T����O��zp�X��� �_���w��([��T;S����%�����w�z��������/�Q�<DWlh���FI�VQ��:~��������g��Z��`;8��D���2��8�Dc5�sN��Oc}�R�3����u��p��2��������#�*��W� Z�����	�y��6�Xi�l���	k���)��=�t��fX�����s7���u��$����>}�}�Y������S�(���ka����\��3x�T��D���QM7�>��s�F�e���\��k������c��b%���#W�h�V7V������^�Q�����qM��y9����������b�����Q7y�wzW�c.�7����
��n��}e�u��r�*����~����J�J���'3����'�8����_c����b�#��L���2-���	��r)ot����M�����d=�����T��=	�D��S��\�������S)�|�n|g;^�<��RYh�:�����Q�����tmH�2mG���>�?|���?����\�#��
������+�z1�te�2�=��d���f��&F��/�t��VQ���~�,����?���r�-�)����eR��9jv%�%G�h�/���R�t.L�*�O�s~n.9?��NU��.��l�A�i3}��\�}&S�}��)����^�q����x� ��?kod
H)��Tl��|�p[tY�A�=;�1����w�1�n�^p�i7��SA7���OT-�@>���S�
�����6��������[����Wd�L(����|oU$�U�����R������,��W6�&���m:}��^���
������k{����7i��E����\���7W��D�>=�~�u��`��z����D�n���D�\�����{T��	g5�N��������~�q�%�Fb�������+��	(p��A���-di���l�����5����i��gV_���:����n`�IjjP���J����\�Q��)��w�9�����o����C4_|�m&QOU�'t��mp�R�r���]����!��\�(����mq5�qI��������~�!7�&��T�D��z���T��Zl���\���������iJ��J�7�G����d��z�A��#3|�>�s��
9�������
���]*5T���R��ZO%�?P�m�t�k���=���� yG A�Td�q440�>�N�jVV,��1���4�N��=�P@S�0zZ\�����]{�n���iz���FMF���������mk��s��_^9����jx��IA~�M�����
��Op��0�.��h�c��� �����:�h���6�4��R�+(���Z�n��
��n$�{CZ�T�S���N�������?����<�w��s��i1����O�T�|�������������7?�eVe��i����k����=�%��h�i�($0�@��������*��x�b?+����)Sl�z�F�	.���\��~�����%{����Qr�P]�k�tm�>�\�D�����~
�������K~~��J���u�j%����'�f�`���^j�/�Q�f��p-`�)��#G�m���|���-�Yk��Qt��u�<S7OA�3~m�Pm����)���sJ�Z��H�?{��p6�VW�}�?W��pC9w�/H�A��A ?�f���v7�<�0�i����������X�=��Sl_�m}�T����3�4W\q��:u�Q�5���[��	��)�B�hr��&h�|8�	�������a�SFa��|v����l�}_��g^���SPY��nR_v�eA4���k�C����y��T�#���������=��C��[R���B5��2�B��/~�N:��1c�O�����4I���=��:d���oM�lR���m�������(k�}6zAA4%55�j��o��l���ye����ch9�j����j�����n�u���z�(\������y����tMs�a��_����J0�7�d ��_���.{]I�;w���W~\t�z�-�����3�M�o�#C���#t�������Q�D�>�����t�wGE7
�������������2�_�r�=�����p%�S|��A����/�z�sG6Y�C >���[5kF�bl -U�L�45ig�a}Nr
4������0R��O�A�@XT�z�G7`�'��Z
Q�(�857��j��M��|.���_���;vlp���n4�p�2�6S�6�s�k�����\R`JI��m��Vnt�w��p�}������	�oQ��+��#M�b�p�
�i�����
/5�n����>P���|�v�Y^ZG��������9sl�������A
|>�
���pp�yOp�2��r5e5�^�SW��Dp�]O�so!"h?pM8j�t��![�7C7~�x[�FM�����O�4�������p�1������}��LbTc
�N�<�N�������'����J,�<g���s���S������}VS��S9�t�����H�<�9��TKj�3W*��[rM%+PniC�y���D�U���T�L�C��+X��*�_��a�9�y�r�Z<�T�:E����#��u}��o��[\��743��R��,��	P�t�,	��(��W�%�Tq}U�~��f�}�5
D�_��n��6�&��|�	'��K�+J��Z�ha����T.��B�q�x�M��N;������������+�hw��_����S�����w����)��vu����
lm{-5;�;$}X�3���t��o4~���o'��f����'���'�t!�����x��F51tcV�_� Qm�.]��#�8"��"�����y���m0��K.I�Fm����C�~�n��2_����~P�P*w5}������{�T5_�L��C~*ey��u�����[l���8q����`����>���q�M[���1ZO���r��isR}�����OQ�1
�������`�j�M���T��?�ag�>���z��e�	��u|Us��yu�9�d|x �:xT��0a�}�����k������^�|=���v��q[�]
_�h�p������$
L��F�Em��)��s������,M	������=z8E��=�����z@���*e�7�O�>W�3�]�����b��B>���f
�j�~���_x��u�w��������fS�Z�@:>�����nt���-���[n��P+j�Z5c��Z����;�=�O^�4
�����9'�s!������B5�s�U��u��1d���x~����C=<�kC�������:�(��b���sG���tTN� Z��Y�W`��w�O1+0��3������p���'E�����i�q������C�n�-#��n�Y(����
�����.��W]T���'n���9���m@��p�h���m"Q*%��~��|T4��Gita����'�n�F%�����3�S@X7~����V7�$�K7�F�e�'+e��=:����i��)uO�)�f���������sp�������������!�|o���y�`u����(W�L��t�U�%�S7� ����?;�����{��=�����I�i�R^���$��ABJ�H�� !A�D��R�	�	�
"�KGAK�hRD����w�z�Y����s�Y{��������y��gO�>���H��{I��'��B�t�����?O����*�z
��B�w�'!RY(����L���/�����!�6����7)<�=�����}���*9��	�b[�!��w��K�������1�
b�b?��[��[���J����U��{�Z_���R�

s��z�w��{g+^*]�x��=p��>H����tQ���?�����Me��g��' ���g����w��~���E���>%��z��!?����$�����7�Py�>Z_x��y�)��+�<�OHm����7��}����j�������v�Z5�x*����_~�=����~�KK�+W����a-c�k�Dj����,��Hj������y��C!�R�57O�Z��F!�y��Kj����t�������������#P�d����y�[9F��Jq;o;^���,�h�11��8�sZ7���
[����<�
�?�rjV��F
3R(S]�aQ����l�j�:Kbx��'����kv����Q
��������h5�*����,k�]&�t����;w���������g�}�YE�j����52�O:�_�>��'����xj��Ja�	r�L����j��a-'�F���W_}5�q�F�=���>����j���������~�{%�--l>|�U�I���7ov�xk�T�H!@�e�U����{�����Y�P��7�|3���/vZW�L�y�i�����<�y�j�����~��v=��A��t&^����}�i	iq��r4��V����������s
L��)�������3���ME��������{�������Q���s-�{�nw��.#i�=P*��[a��YV�k�����g)\�:�s.�R��4VOC��u����:N�9����=%�R���=����9����f{�������9�A;^��}J���'���[���9�$�w|��a�|#��;ZE��
��bL����KI�@��yy��������A75���Z�������B��J��q�#�m�4��;��ty�.��!���B��:RC(]��%�X�lO��o�����Iof%��9:�������9�z������v�y��7����6�<��#P�Pn��9��S�3�9�m������:�����s��c�>��`ih�S���������*�M������>-y��2!�SX<��kh�|�|�9���_S�C�����+�.��y�U��D�_���ISZ�z��)$:_j]��/�{���g���}��/��C�6�u-�C��{�6n��t��ks��uK�0���zsmO���B��0��;hC����v
�RY�
�eNMW;O;����~I^�B�zOJgl<yPLrn��e��<�y�j�I_��������~�E��Z����}_��0�uLsl����xN��S��.�����{����H��Y^��:�b�O�>(��7��Y���><��u�����4���}P�3��MK�:o*u=:�c�M�����m�e��VV��j������w	���a��������B���{�����,{�;Zu��W �677�����<�7
�F��{*�{�Z�c���Hc��'? ��3�~�W�5�P�L	��\?����|����i4�2h�<��m�C�|��y��7����6�<��#0��C�����yh+N
��3S��.Th[`�w�!T�37������j���;[��_tM�-���&��/�>O9���w�]_����C�N�M���q���r-��9n��H}74����W����mK[�h�::5]Ci�w�G�{��	�������)a���i��<��z�Y��:\$��u�m���5_
s,L�S���uWs8e�S�m���S��%m�z������kmE�}{���4��S�A��0��t#��\�����������)��"�x��5e�I�<����o����\�v�x8c�����x�]��+�:����r��y�������:2.�������P��v������eW����Zf��V�OM����k��/�th������2-c^�x1���g�~�i�j7-��������;�9R�h����K3F���9�6� @���
��J�G����93�;|�����;�G�����li�. @���@��M��tI��o�u����ci���w��v=�������,"��]����z��`7	�,L\ @`���DZM����#���W{68���>}z�����.]��ue�'� ����g�������gy/T�\c�U��s�f_�u���� @��E�9���O���<y2{��y�y����g[��g���666�F @��4�������,�O�81{����E��O /�Ow�yGY
�R�& @�������e @��:
�g����l�y�9���M @������|Lk���@* �# @�������� @��:
����{�6 @� @� @��$�m%>3 @� @� @����B�u���� @� @� @`%�h+��� @� @� @`���^�M @� @� @�+	(D[��� @� @� @��(�m��m"@� @� @�XI@!�J|f&@� @� @�XG�h��Wm @� @� @��J
�V�33 @� @� @��:
(D[��j� @� @� @�VP����	 @� @� @��Q@!�:�U�D� @� @� ���B����L� @� @� ��
��q��& @� @� @������gf @� @� @��uP���{�6 @� @� @��$�m%>3 @� @� @����B�u���� @� @� @`%�������5jH� @� @� @���hk��m}6>hO� @� @� @`W�|
�6�|��~o� @� @� @��+p�_��xy�O&IEND�B`�

estimated_filtering_rate.pngimage/png; name=estimated_filtering_rate.pngDownload

�PNG


IHDR���WNsRGB���xeXIfMM*>F(�iN�������^��	pHYsg��R@IDATx���E���		!!H�K/�J��H	E�HQ@)AP�HQP�I��� �H�J�$�tB(IHB�����[�������������^Ovowvv�=s��~���s���'��!� � � � � � �@�s�Y8e#� � � � � � �@�����R��uI^#� � � � � � ��@�~�|Q���J�r@@@@@@�C�@]XdE@@@@@@ +uYIR � � � � � �u���� � � � � � �d%@�.+I�A@@@@@@�uu`�@@@@@@���e%I9 � � � � � ��!@��,�"� � � � � � �����$)@@@@@@�:���EV@@@@@@� P��$� � � � � � � P���:��� � � � � � �@V����@@@@@@� PWY@@@@@@�J�@]V��� � � � � � �@���"+ � � � � � �Y	��J�r@@@@@@�C�@]XdE@@@@@@ +uYIR � � � � � �u���� � � � � � �d%@�.+I�A@@@@@@�uu`�@@@�f	L�>����{���A@@��k:�D@hO���z�M�<�=+�q������8���n[~��]�~��N;�T��
���7��r�:t�2d�{��g(�9�T�s��]@@(����7� � �$��}w�a���_=y����������.;��/�>�hw�I'�/����~��z����C=�N?������n��1c��"���F���������i�ZJ���G����g_d��y��w|��>��mg�U�-f��S�o������~��������7�N�@@ Pgl@@�x��G�9��SS��8�7j����5�?����k��W\�-��R�����{]�T�?�?�x[�T�+��2��.���>PW��VkO;�k�{�����i-����������@]_���`���7��z����;��5�\��SOu���w2�Z;�S�
�0@@���eNJ� � �@�p����KV�iF�K,aY;v��F�i���-����z�����j���}���[l1������i��k����d�MzT���g���_{���5Ejk�F�@����6��{ki��m�m��#G�t+��Bj�6�p��x_���`����M�	�f���v�����5k��:u�So��N��8�7p�@7v���j���Yc(@@ u��R( � �>;�������9����M�8�����G���L;�N�6�xc��d����=���g�y�_������u��W��;y�mL�6����m��mZ��������s�/}c6�m�G_������;��cY�����3��8�9��Z�(�"Y��6�2)@@�������@@�VhV���3���{����,�V�z��������h�����F���u��U�R�~��o�������W:��x�T^#��~�m���c��n������'On����8��oO}y/�Z�J�j��J��{�Q#����_��o��O������S=�5���k��3�t�o��^|���E��}R���O6�O��M@@�� P�^�Am@@��	h���?�i��j��	&���u�]��_~y��2��n����K/����sO����!C���5�\�������7�,�����/����s�=���T����~/���g�u������ BZ�t�M}9�|�I�&�u�]�i������K�<%����o�9.���/w�.��[e�U��/�wT���6�`7����v�K�/|����.����^z����?�y�O���f�m�,�d��w�q~����w��5Y�k���m���n���w�����������KZRt��A�C~���/�+�F�KX�/����x�_��/��[n9��z�������n���n�t��7�>�f$�}����e���������;�f���zN���fe����}Z�c���.�W�����nr_����>�)���W^�v�a
��.�n����j�Q��
�z���H���wO<���^���e(s��}
R~��N����M������pmo���4�������VZ)�F��T���Y���#�@@�/��D�LB@(�@��������������h	���hY��k��Xi���|\*E_����/�K������g�������g��m��?�y)zPY���(�cYK�s��|������!z�?�k�r��3�8#."
�����N|�v���?�a��v�mW��\�K���:th��?�q)�}�'�v��1e�5�"
���� K���N���vN��������g���d�>���uht�$��-U���������hFf2�=�/�~�����_����1kO�}��'S���`#�A�_K���G}tU���=�nc#�qRo�T�F��I'�T��lc�<��r��>7G9Wz/���ed�R�~�+_�=�}r�]wy���h�\#�}���zIw���m�_��[V���7Q0���E|�(�VVV���f�SY%y� � P8�w.���$H � ��{����+�t�=��7n��n��w�4zv�k����O8�}����l�]w���@��n�?���b�-�l��/��iF���_���Gy$~��f6i��f]i��}���E_��)S��+����s�=����S����
��SO����m���K/��bG�5���U��v�x��f���f�����w�7������'��~����(o��i�����?������>�gpi�O���{���z�����~�#�?�u��V[�[n������>�`7�s�%5�Q�aS�E�OL3�d�YY�/v}����A3�4�FuK����f�)i��f��:
��c��s����:�E���q;`@k��Tk�4����>[c�5�7��
�Y�J�<�����F�y���.
��s�Y��z��������(�������?s��������T�m�ed
W���������N��f]������e}�*�����[���{�������;���5k[����/�������w����:���Zh����}�4���*�@@�B��+\��
#� ��gDK���	���D�\�4k!L�5����D���4������g�DK��4��R��~�?Z���h9����7���y��.� >n;Q����eC3�����0S�t�J�f(_rf�{��7m�bX,�Y�j}��&,W����s�9�������e��z_d9�Nu���_��Wq5�/�K���g��*�f��cB�u\?�2����it������������gf��s�9��(��F������s�k��B�8���F������c;
���%	S�-#[��!L���z��>�������N;��J�|�����������`��L������1��S�\�F�����^Q�|���2j����%���5n��Q�m�����/\v�����-]\�~���i�}�)E�x<D���~��E5�>QAy��� � �@��o^����H�� �t�@�������	��f�%���p�!��g�DA)?�!���������/��:�Y�����(i�4��R������(t��~�����4#"��/������V�~��w�����Y��=����}�E_������%3�v�i���g�4���������jF�^{�WC3���e�Z3l~����cB'�w�lF�^7:^����f�
6����D�U����*�Rcw�]v��z��%���������Y�Z�������(����Kz&`���8�t���F����������I&������s��p��Q`��H�l���f+6#5j���>�m6p�\��������f�*i�q��,��6���8q��1cFx�����O\����E���4SOI�]�>����('��I�B?>�U?U*�� � ��/��5]���"� �#��w�q����
�%��.�������Ok)�hF]���_t3g�����i4�.>g;
PX�2�J
*���@��?��To-I��������;�=�1-E�lz�fp�`��my��Zt`��7�q�h&f|Lc+�L���.��)��o�y/q);�3��������^��h�����+����b���.�}��/���~	�����v�m.�!���W�*��X�{�u�(�
�����N�������~7�q�,���F���8����<- ����Y�~)�J���q-�����Q�n������_����em-i��d��w�O�To��g�F��������PY�-�\M�5�>�TpV�T�|�#� �������#j� ��I@������]s����kz&��EY���ta���������~���S����u
��{z��g�*.��R�~r������L�d�����@���������353L����a��3^zk����5?�S��,-�]}��q�Uy���z�����=5f���K���i�cZ�&����z������-���=�\��s��3���8	�����xz����b��y%�F>��S�*��r5J��x^)��O��e�]���W��%�R�~)@�������g�5���.��Q����l����si��u�<�'*7�~RY$@@�b
�+f�Qk@@ W�d�H�i�O�^6{!\rL��V^y��u�L%}9�4e��8_���g�����j���e������%�>=����������y�i�����s����~���?�SK�>��c������N�ct�����s7n\�X/�8���[�|:�c���L��������h��O���l�vlg_���Q_����Z.T3�4�Y3�5Q?O>������I'����m��&.z��_�Y��z��N�����|������ � ��1@@������&�9����t:���K�Y�|���:+~]m'��Rk \�P�f&-GhIKVJ���z|*\2>��;�����4�O�U��������q�/�x�����g�yz�q��]5�IKg�|��~V�#�<�~����O<��5�:����k��w�K�j��g�����A����7�`�f5���4:�4;�������Q�V�(�z�9��jh\*����6�S3�5V�_�L��g�i&������/�������'E��L�)@@���A�$@@��h�L�u�@�}������?��*>���G���n�/%��?�V@8�!-X���fE���]����A�[o�5<U����Y��m�r�F�Ky)��v�m���f������L.{g�xg�pGq��������d�����n����~����;�mp����� ]K�r�J�jt<�����/I��sz�a-�'���`�F-hB�-���������o�������e�]������m�������8����"�����`V��N���@@�>��3! � ��#p�5����E�m���k���i��7p����3t@��r�������w|��;z����T�����z��w���_�ZA�0��Bm�Q�/���#%-w~�Si__�[p��'�pz�Y2=������/��u�/|��,]����R�����V_}u����ow�������o�yYZf�����X�qb��|3�������=q���Y��{:���W\�o�No�jt<iL���
�9O=���x	�f���[�@]^F�G+����g�q�_�V�������4�-�g��9�����o��/����hm����S��|�"� ��]�@]�
)@hk}�����&�G��lF�����%=����e�]��Tz����	'���������6�����~�i�g�h�2�@;v�[l������?�U�:��#�27f��l�?���SpP����K�<
���4�Au�,��O?���^
�UK
*���5�����]~��N�5zK
��m���F����q�%����[�Y�����r��
����<��x��f�i��g�y�_���l�X93
����Q@��7�t|�{���|�5�X��g���=�H��2�-0�Fj����_�����?��;�d�Pg!������_8��1� ����?~��n�m�F�l���C��UP�}�5����1t�P��?���;B��z��Z���{���/~��x����N���@@�LJQi% � �@�����������~��p��Q@-�����d��YA�����������u����k����]w���-���q~���������*�h��?��+�N/��c��g�}v�������<����q�D_�v�a��|�vF_����0U+��E�k����jh}������V��ju�vN�F� c����g����P�y6�t��/=
I9�KQP5�������3
���I^J�����{���`#u�����e���o���s�3���'��Kjd<EK���z������g��Q������2��E�������3�w�9V;��4�����j�����+}����>�'���/����Z�&
��
����}�Xj�}����'�[@@�b
��m�QI�@@�NXq��B-�k�����g.h�����e�=���kWZi%w����c��d�<�����(�/��f��I�kIB-�����/C�s�������
�9���w�}�=��c��V3�9�7x���x�r����{���e8��E���u�E��X�L��9W����$��1-y)#���h���M�������������r-����v���]o�J�W�S�s*/��������z42^�����,9����q��e���=���f3r�?�o���������#�������}�[_��f������`�=z��yj���~W>��J�:�oy�|��q�;�m-�R�F��>c���N���ZB���	���(��lVe��Wi�c���������9V;�B�4�����j��������:������/}�-������~�t�~���.�~9����y+��6,�R�h&����>4��R��]�g?Y��"� �[��o*f��u���b7��#� ��+0}�t�\&=O'�\����1���$�r��&sZ�R�?��Zj)�/�{Kz&�+�����

*PP)��5�M�0��C2���z�X4�����M��;�s���R��w~�C}��WK�R;e���}!�vM-�~��.��S��ie'�i�Q���Z��R�V�j��<�U�R`#-��^2�6�t]��%��vl���~l�K��1j�d�q�%K�����^����{�H��:��_Z2T�o�-��>�O���5N��+���'�W��T�L��1����j�'�����I���:V������<�j�w�|��Y��>����}���\pA����������~�+_�u�]���n;+������:�����+����]�m���> � �@���%u��;j� � � �@G
hF�����V]u�����F��:%=��f@v$�B@�X���M����a � � � �������.�d�����^{�5?SQ�Qv�a������Y����> � � PTf����7 � � �&��~��_���jM2d��)���)q@@���Q���C�@@@�B���G�g�����=�O�g�i�!���-H��MF@:P�u��4	@@@�N����~�m7h� ��b�����wB�h � ��f��c0 � � � � � � �D����{r+@@@@@@�X�@C@@@@@@��k:�D@@@@@@�@c@@@@@@��k:�D@@@@@@�@c@@@@@@��k:�D@@@@@@�@c@@�B`�������k��}���ft�������������}�T��j��O�#� � �@#�Q�@@��x���������~����/�����v�i���g�;��S���C��!C���>[W1}���e�y��qn�W�m�x����������YEU�����9�T*���h
 � ��U`@_�z@@�x����a��^�����=����>>��G�N:�$�����z��W��V��6���z<��s�;����������v�z.����v=��;���+�����v���3������y����>�{V]_��X�O<���n��7w~�a|�R��}���u�;�������������n���^/}��7�u�]��|��|=�Yg�����]w���1��2,��Y���G��������J+��?�p7h� �Ru;c�w����_|��[k�����E��~���v�Z����a��gm���}�w�
6p[o���o��R@@:M�@]��(�A@:D@A�s�9�����r��{��N	�<���m���QU2���K��[n�9��*P�l;���+�����S�.�hC����:���g�����o������"�2eJ]
����}��f��5�\�^�����)x���O���,���������sw�q�[r�%��Wz��#��SO=������l�Z�1��~����Y���*w����>��pc?�2�Fd���g��L}�~�/	�@@@ [�O�w�m��� � �@�l�4r�H��
+�����n��j���}���[l1��������]��v����&����f`�s�������=��@������-�y�g��g���Zy�*�T;v���_�R%W�S�������?�4�Oi�e�uW��;����+��g�y��=�:�7�|=I	g���.�����mV�����a���fU�)���G�a���?k�^zi����;�d�:u����Y��u���c#��S � ��.@���G�G@
 ��k�As�QG�e���7R}���x���~�IA��{��3�<u��eu���_~�7}���y�H�z��W��z�����|���]-��.���8H��6��e2����m
�i�������{���������v��W��}�kU������f�)iF��Mi)���G�iu����;�X�Y����|����[����5��S � ��(0G76�6#� �t�@�_ ����3g���)�J�-S�����l�J��f�W������j�����T������qR��z���[��z�����K,��a��v�[f�ej*F�1�a��9���i���eKPj�]�i�=��/QP���%���p����V�W��z�������{Q�����?�N����yi������R������Y����5 � ��%@�./Y�E@h���Zno�UVq6�H��,�O��~���	z�K�{�_~y��������p
�5�i)A?�\sM�����7�|�,o����nr_����>�)��+����������r��4i�;���l��T������w�e������2_z���O�*��Or�1��x��:����ry>C��y��N�,��:��R=��u�]������7�x�_U�':�/�w�A�i���ry�<�����G$������8%�L�k�|����)���J+�����N��v�!�M�]-Y��s��LZ�T��d�x�2�JZV�f�%�Uz���Fm�O�s�=�����|{�k����W����i�/��D���]^O��5�x/�=��N�����L��E�1�g��1c�E��\p��<�}��;T}�~�3��e��ee�����I\p�������L���>�,/@@hG=$��7 � �����{��O�o��}�{5U(
��<����5�2t��h����v��X�#F��Q �\��Q�4������.�O��`)Z�/�o;'�tR)�MS�����0��GA����]w�_��G�����s��gr��v����N�������E������m��t/+��Q���/
��^�6'��|}q��Zk��n������Z��_�����p\Y�F��]�������Ul�����ki����LTF������/���YgU���|���*������W������>�`��c}�9���=;�d���h�cY�j���9�����
[}�|/��i�����R������U�oi����������(����OZv?��~Y >~����}�}��'{��@@Z%`�VeF]$AB@�l�c�=���P+����/s�y��.
`��'�p��s���l�]w��?�k���.
P��z��}�������-=�)�G��F�\*������NK��,�L������zG��v�m��h��/�Kw��W��~��N�*]����O����m�����-
v�( �4m�9f��B���gR(�����3��3�4{*
0����/�c���(��N;����G�g����O���55:N���H?j��~���_5��^xa�Z}�����
>��m�t.�e���KW,+\F����Jh�_<��5k������U}b3c�~�C;�F�7Y�f�g���?�d���:���Z(�N�m�������a���|0>t��7���S+i��8��k&����������o}�[~������ � �@�
��HkU���"� � 
��<��rK?�F�%�?�g�f�D��.%g>i����'
X��t��<;l���*i���o|�q��K;o��0���2���(�O����|.������=��#�w���h��R��y���|��]�.<'�h	��:��|���}��w��s���mF�-���g������|'���;La�eKQ�'<]�x��+�O3+�L���M��F�I\x�N_�Q�EK��vDK��^��Zf�r�!�}��`����[�`K�|v"���:�h���^�<H�(0���I���iF]_�7�Y�C�hy�R�-E��P�~	�-7[�7�|��~������������b�-������f?Qp�^���e3��EY����h����?��3)��y� � �@�������$H � ��-]�f���D���*}�g�)�f�h�fYD��3�4���c�Y$W\q�/7Z����?�f��{���/���������|$��������E��p&�������
����s�t�s����E�}=����o�n]���NsQ@���(8���7���w��==�/-��-�����c��gKa�3����?4c�-1�N�ot�����f�c�:4r��7��/�}��`�f_�P��k�k����,�0��9��"�1=�N��k��e���^�~y�EKI:�B�g�=N�-��s�,��-zf����g��=g)���E�g�SO=?Q���2��R�1�z����fF�@@hGu��+�	@@�f-qVk�R���
����_��{��x��.�Q���/:}��4h� ������!q~-�����/�_+����}�[��Y��Z�M)z��wZ�Z:��s�=g�5o-�^���]m��|��Z�0z>���0XT-4z���*m��V��Z�����&);����[����'�����m��h4p����k����f��'�x��w�\s��8q���q�-����x/jYQ���w�2)}f-��re���[�2�����
�@����.u������K.�l���n��)���?�1�\�|l@@h7u��#�@@���f�i�[����o�#�zF��_�h�4�g���?��O�R3��ep�g�]v��L�:�����K��%�X"�o��3�<�f������;�>����JM�b����=���_���9�Y4�2xn��Ww�r�a����%#�k^��x��N���Ry:��~�V�j���y�lh����e�/���{���w������{��W�0Z�1>�.;�������M��>��,G���L���d_��T���)@���O���/h��fN��_�uC�uzVf���B�����g9f���/i�/_���G@@�x�r;�u@@h�@2���W��O/���Zx����J�R~�r[������E�U����3m���6�f*i��"'}������_�k�R|���4���T�M���Z����Z��R�@]��K�N��;�0�|f��:=��/��vJ;���[`��Jm�-R�j��k���i�_}n�z��N��-(���h�����o������W�\��5�W�,�������
+�����n�G��&?���Li-������i��������37n\�g}�*� � ��������i< � �@���������!�?�N�l�4����N�����n�I��,�9x�LZ������j�=��L1�D+R�y
�hf���;��3��){&�<�6�`�����[o���2#��q�$e���h�5��Gq���7RZ����?��+�:b���<���;���S�N�v��W�����z�d;�����I���|���K
��h��}m���6�l�.��2��t�����g�����%�5�ZI3����>��m�L�"�C�]-m|�)�����|��n���~\��'?q'�xb��]@@Z/0G��@
@@h������x��������~�;�gYZf�el�=����~o;���Z6.-���+~�G��Z���X��M�.��z����r��]vY�-����v�t����gA�z�g���?�����0L�4:N���,j?n���q;�1�v�,A�m��N���%���^��
�h��j���g��'�h���{�A��
���0�hK����'w��w�{m��f~����{Q�:[�����[��J����#�8�i��f�)�w�}iY9� � �@K�����#� �4K��k�q7�p���������vZ|�����/����;o���������g������4��SO=��K4�N��i9�F��O?_�HR��fx=��~�Q����g�}�]|����f$�.EJ'N��.;��
\q���j;_��W���a=����n�������Nx:u��q�Z�����
������gW]uU�fs�1�1=c�������[n��|1���w6&�u6�*Z���{A�O%�-X�E[��rK��	*i�R-q�>{T�����9=�Ni��6r�����W���Z�.<�> � �� P���� � ��hyK��R�3��9����.��� ����t�	'���C��Y��lZfQA2=���7�p��6v�X��b����,�~��d�c
�i����V@���`����/�UF\����tX�M��i�'�|�]r�%n�����Gu�_Z��/�����,���z\��?���A�~��-��� �4��7�����5���z�]-��qR��v�G=L���G�S���vL�	&�5�k_�Z�Z�8�g�7�����)�>S�L����.���N�~�|�=�~�y�qa=�s��������{�_��_���g��!C����G����P%��
���@���Z~S)m���O?�B���g��k�|��~yT]��khCB@@�mJQ�J$@@�A Z����T�F��QTS��;����h����(�?��#����f���%$�ym�@Z|>
6����~�@X|��o�]�b�.����Q�F�������J;��s�e����=.���.�������G�R��w�������9+$�Q��}n��F�8_NpI�>
�Vl_\��EA����6D3�|����7��������252N���m_�Q�E3�|���|�_�X�4kE�����nZV��Y�J{��g��_|�R�.�����s �q�#k�<�R�b|�}���G�Q@�,O86�������o��X��}/���>k�����o�9
���g�����-V�h9���4�� ����h�\|o�����<z�bBY9����v��.E�)�q@@h���{�u�	@@����W��2[BM3m��k.�(=����.��Z��c�-k�J+��?�pl������:p�e��?��.
���8��u��W_��p�yKC���Y��6-�hI�K�}����>���:Z������Q�6���dR����j�^�U�-����/�)������rsv>���s�����v���]o����Y��=j�%=�O�r�����(�'m��RK-��`��v*�S��Q�i���Z����j6���8)�A�E_�QE���;Q|������d&�������%`��7r�H?�����������������/n�����4&����d�jF��&���k����q�/�[�>Vgk�m�x�m���g��a���CeD�9?cM�}i��W3f��R��:�W�lV�^��o�%��nY���n����SIu��LQ�C � ��&����g������[� � �@�
h)4-�� W-I���f~�/���%�����9E��;\�����bWy�_�+��\���
��l|���T���/��r��Gn���~yE�zKZ�1���-T�=o����n�Y�]q��`*����Y�O�l���d�%�ZZT}����*�����3��>���O�I��A����,�g����K�Z-#�����~���8�k+m��G�����=�|g-���}�z\����`ZZ`�����������@�@����l�{��]V�]�q~n�|����������k���:��Yk��gh��a�i��W[��� �sU���h��,
��}��eY��:������W��@@�V
��%u���� � � � � � �u�+_���h0 � � � � � �� P�w�� � � � � � ����|�|@@@@@@���k�;wE@@@@@�ru]>h> � � � � � �@k�����"� � � � � �t���.4@@@@@@�5�Z��]@@@@@@�\�@]��� � � � � � �u�q�� � � � � � �].@����G@@@@@h�����sW@@@@@@�. P����#� � � � � ��F�@]k��+ � � � � � �@����@�@@@@@@Z#@��5��@@@@@@���u��� � � � � � �� P�w�X-��@IDAT� � � � � � ����|�|@@@@@@���k�;wE@@@@@�ru]>h> � � � � � �@k�����"� � � � � �t���.4@@@@@@�5�Z��]@@@@@@�\`@�����O�>��s�=n���n��w����:th��S�Lq&L(;���7�[y�������+��ps�9�[w�u��/\v�^����<V[@@@@@@�� P�K_�{����#�t
�YR`����v���g����O�N9�����|���v�>��9k�,���{��/��N��	'��������j�WK��@v@@@@@@�N��/{��.���������/v7�p��t�M���=�����>���>����E]��s�9q>���b�-�w��n��f���K�c�=��x��u�����Pv@@@@@@�J�_P���K%�	w��f�iV\�~���N������R�#F��@�r�;��3��o��cYLe������.�}�y�7�����n��V��V[��v��]u�U5����+j*���?@@@@@@�J��N����[�Yf�8H��z���s����c���z�!7i�$��f��A:����������}����Z����� � � � � � �m+�3������n��)n�%�p����z\����
4�����>�V^ye���W_�[{m����-����������j�WK���h���- � � � � � �@����/�l8�����,6x�`\�_��5��k�����8�?�N����2�/y9c����if�Roe�3�<�m�G@@@@@h#uut�����n��f�����C=�����?�s�1NA������n�����>��uZ��f���v��YuJ�����<Vf�^�����s
y@@@@@@�N�*�y����W
R�-���:� ��"��n���9��e50`����f�}��_v6�n��q�2�Z63�&O�����<[N�Z�Z�����x� � � � � � ��������^���;�@����'�L�.s.����H��#������.}�����/��V\qE��|��)�I_�)�c��@@@@@�E�+2����{��G��[o�s)H���k�^���%�UW]�i����_�-����;�t���j����.����y����Z���'�; � � � � � �m)�/��_`��S����5�-��bn���n�UVq�l��''��
7��)����O���^��k;s�Lw�u��{��������tt�~����~�VZi%��o~�M�4��u�Yn���w
�^J���%�/�X��1�D.E@@@@@�B
�'��	�U����6l�{��wSs�����K.qZ�R;�����}�A��]w����'?q/��?��G�C=�]z��n���N��[k���f�����XK�Z�����$����� � � � � � PH�<�$V6����z/����k�����.���#}eQ�^z�%��B�����b
j�WK��7���
u�@q@@@@@:N �8��M����Mv
�AB�.;SJB@@@@@�b�'����|��Z"� � � � � � PXu��:*� � � � � � PduE�=�� � � � � � PXu��:*� � � � � � PduE�=�� � � � � � PXu��:*� � � � � � PduE�=�� � � � � � PXu��:*� � � � � � PduE�=�� � � � � � PXu��:*� � � � � � PduE�=�� � � � � � PXu��:*� � � � � � PduE�=�� � � � � � PXu��:*� � � � � � PduE�=�� � � � � � PXu��:*� � � � � � PduE�=�� � � � � � PXu��:*� � � � � � PduE�=�� � � � � � PXu��:*� � � � � � PduE�=�� � � � � � PXu��:*� � � � � � PduE�=�� � � � � � PXu��:*� � � � � � PduE�=�� � � � � � PXu��:*� � � � � � PduE�=�� � � � � � PXu��:*� � � � � � PduE�=�� � � � � � PXu��:*� � � � � � PduE�=�� � � � � � PXu��:*� � � � � � PduE�=�� � � � � � PXu��:*� � � � � � PduE�=�� � � � � � PXu��:*� � � � � � PduE�=�� � � � � � PXu��:*� � � � � � PduE�=�� � � � � � PXu��:*� � � � � � Pd�E�<uG��S�����?rn��f�/���[�D@@@@@�������R�o���v�@�~��c"����S�.--y�s��d8� � � � � �@S���X�,}��.�f]/P-H'���� � � � � �/�������h�\f�5n�����o�g?9��]����'��C@@@@@��
�'��4�E��n�g���6�����)hG���GmG@@@@�@u��4�C��sN?J�L;��y�@@@@@@���:|%5�e�������]2&22�,9�eT�/&�-��,K�,@@@@@�V �8��M��k�W�
�AB��w��rh�K�W�}�D��x\� � � � � ��sy�I�lu���6H�U$���f�iV]���V�s_@@@@@�
�'��	�p`4��6H�e(^���%�sNKX*������d��&x' � � � � �=���X��z�s�l��3�����u��t+�w�d8� � � � � �@�y�I�lu�vYgf��@]���x�4�����4�������� � � � � �@y�I�lutT�a��@]�z��]�:��"� � � � ���@�q+�@];�|���	��6���T��]_��@@@@@����X���p`��d$�j+X>�w�0�� � � � � �L�<�$V6��f�h��e��@]�7��G|�����
Xjm�?x�}��Q�%6+x���)��|�>��_� � � � � ��+�g���&P�o�t$����w����'-)X���r�B
�@@@@@�D �8��M��KS#��AB����k������Y�lK��$�"� � � � �t�@�q+�@]��,�e��@]���x�k�T��5����F��k�d��9�����~K���}@
@@@@@2�3Nbe����:� $����Zf���N��]2�g�'�'��|}���3! � � � � ��F �8��M��5}[��� !P�Mw������m*%�K�Th�X���7&xW�� � � � � ��#�g���&P�>��v5�AB�.���v�X7��G�)� �$�w�����c��{������Us���s
����<nv��o��~H � � � � �T�3Nbe����a�l���f4�3�.�;�����9���1pd?����4��]>��� � � � �\ �8��M����$��� !P���f�iV]-i�c\�a#{d������t��f���a�n�(���y�������\k�+���-(�L�>S�wa����Mx��Y7Q@@@@��y�I�luM��b��	������Yu�b��/���f�q��t,y��+����'��fF?J3�u��fVZ�34L.�cZ+P���/���mwG@@@��@�q+�@]_{����AB�.�N.����]�~��)�WoJ������,�z{���"P�3��H��$�"� � � ��%�g���&P�]c���� !PW[M����R��X���TXg� �U1lo�c�k,__�E	�5����iV��Z���F#=��k�z��i��,Z�G��:$�$�g��`m�@@@@���'��	������6H��muM��_��}Y�<��&
���wi���X2�"�Z;��?or���&;���K����K�i
�c��Kz��G�v�<-�l��K��	���y�G@@@h�@�q+�@]���mk`��@]�v�S �H��&��
��/���W]�A��y���'�y�������=r��j���G���hk<����]�������v�- � � � �@vy�I����S��A��:��%� aL�'����sN?J/�h�6�?�)x�1�!��@��o
W����]�S1��a�:oCv@@@@�k���X���f8��P$�����.hB��#����v�;��zo������iwo��@�a�.�K��6z?�C@@@� �g���&PW����:� !P�����hB�����s�m4�F��O�K[�S����kfV���j�j��y[�����.��W�r��zZM������8o�����).��;�q����u2oA�L�@@@@�����X���h@��T$���#?�@��wA
j�M� DX��4�7k�����������0p�U��(�����������u��O�4x�}�~jM��)����Z~��K����(|���)�@@@�[ �8��M����@Kkg��@]K���w�@A�w��X�v��J�������/��Hx����u�����0p�Tn�u����-����J��v����v�- � � � P�@�q+�@]-=��yl����@��+P��]����5;�a�e�@#6�.�
�xl��a�.l�<�����x��0p��M��0��o����{�.t��zf��{�� � � ���3Nbe�kf��^6H�e�q����/���w�����F���F�Xl�6�)��������OOt�N]p���?����p�����X��A��u2�]�-�0pgA0k{�u�������K��D���� � � ��y�I�lu�8r2j�
u�~\�����~���u��d8�Q�n�C�����W-��4x?7|���C�#�@
$waPO�
_'�v G�&���dP/�:�[���d� �e#Xgl@@@�M �8��M������\6H�e����tv�u&��f��&=}U� ���`�I�E�4�0p�,ox,�k��e��%�z��g�v���� � � ���3Nbe�kf��^6H�e�qW\qE���1��kZ}�-���z�s�����q��g7�_�S �Ei���aKo���s[�T�0p�����0�����z��R2��.�@@@�4�<�$V6��4y�y$�������N��@�������Mp#NH���Qf�eJq �g�0p�Tp�u���7n����RJ]�����S�p?��R�@@@�8�<�$V6���6�5�	��lL	�e�H)�
4+x�����m��l���@��a�.�K���]�J��<�kA�0����<�@� � � ��-�g���&P�w/�|$����;������k�F)�(�����9�q3|�
2�
Ytk���8�VQ4 �:�d�.�_'��������va O��k;��)	@@@�l���X������,�	��l��u�8RJk�����"���]���^��47k�S5ai���?5]��L�� ���^�y;�@@@�<���X������*�����*�	c"�~�l:���%�=��1�g-�,�I�&�����S�:�r\��k���7��4������}�g�1������_���.p����R���}� `�k����,�gu��v���E@@@�^�<�$V6��z{���� !P�]��2�n��QN?$�Y`�C�������*�k)?7G�6�����N?���,�g������E��A;��
��/Z��/ � � �@~y�I�lu��_�K�AB�.����#H��5��+P-X7i�~n����
<����<>��KS��'�Z1���g�c��z/
G�M�f�Y�M���X�e *������|���Z��@����|���@@@��y�I�lu�M��
u�s+X�.�������]fGJD ���|��On��qn����:eo���������4#x��wCF~qvM�l&Kg��
�@�|,pg�<�%�����=�R-h�tW{m���	w@@@�Z �8��M�.�^���l����N�)�����7��?��w3f�ps�p���X:��UC����]�����E���������q� � � ���3Nbe�km���m��k�n�rt�@��w��`�
���6������_4�-`A;�Yk�����E�Z���xV{m��8[@@@�����X������)�	����R�@G(x�ec��|����W7}�t7|��\g�����C#@��,hgA<���v��m�,��6�~x�hm�� � � �@+���X��Z��m~o$������V�<�f�����`�#��0hg�<5����Elp��@����yZ�Fs~o��q�o�o���}@@@�K���X���tp��l$�j�"����T�y�8������y��&�D�.?[JF�L��I�����a�6�~M���]���|M������.p�IK
��K�� � ��!�g���&P�G�uH�6H�uH��������������q���{�L�%����R?�d�=@����]�S����/*���@��b��|��U��u�L�- � ��-�g���&P�w/�|$�
��TjhY���w5��@�n������^�y;^���,�g��4����v�7/tV�c� � �d-�g���&P�u�uPy6H�uP���Y \:s��W������>���]�}DF@���� ���k;o�;a� ]2��v%������
 � ��N �8��M��u���w�AB�����
"�@�Z�S��5���
 ��a��yj�����YB��-5X���]2�gE'��]ky�"� � �@�	�'��	�u����E6H�eFJA ��e��I���2��w��4	�F�0hg�<9��a���%���{�����]2������]���p- � �d+�g���&P�m�uTi6H�uT��h�@K�wj������@ )`A�0�����u�.H������][^:�@@@��<�$V6��zz���� !P�eOs@ �d�N�G
�&���y���s�!x�+�"�4&`A�0�7k�+���TS�Z�2��lMui���]Z�/�X��]�H�@@�T �8�����/�OP�KG[J�m�0&Rp8�d ��1���-��6,y�'-Y������ �@�����ji�5��9���%�)�S3s�!�d�.-�����'�m�����{�.�Mo�o�ZAu@@���3Nbe�����k�m����� �@fm�c�]f�IA ��T�)��� �N��%|:�v,�����H��|*5�x������y��|kO� � ��3Nbe�c�U�AB��"'@��	�?��k��IN�<����#x���@�96�.�
�x�k��LZ�.����K����mL 9F�|*9y<y]cw���jA:��E���b� � ��@�q+�@]>}��� !P��I#@�Z��)�Y4	@����]Z�O�K;�������{'x���3y|�%c�T���o^������dW
����ve��h@�����X���z�Uo�
u��8���@�t��T�p���y��������t@��/�������O^ke��N��������]��F��tM^e'�������	T�!�����wFh�@�q+�@]s���w�AB����G�@�X�����A����kr�!��+�"� �L�d�.-���$�'�kf���^�R���,x��
��w^e7jQ-Hg��`�I�E�D�<�$V6��N9��	���@)h#�0x>�NU5�_S�wm�aT@�H��>kv�x�:��[�o������^��1x��mo��j�
(6���@�{���X���g<��R$����@����w���M	�-y�lC���	@�A �K������u��-��@�@^��ZQ�|�r�K�� �4[ �8��M����Z��� !PW�N�� �@�m�c���{��@@��@Z�.������;������������3U3s�6�����y���]��T*��e� �z�<�$Vv��A�,:A��w|���	c�]z�z ��!P-x�6u�]V��)�f���#���h��R?���O@@�N�Z��5��c��5��Yr��a��z����UvZ@���qE�|��,�2�������T+�����}��7��x����HB*	�'��	�U����AB���� ��@[�T�zx/��9�����$`�&�1@�E�Z�N_
��p/���/�3(�������I^e�eQ���������W;_�\�>�-��yS>�)�g���&P��c'�V� !P�	'� �])0~�x��I�&9��)�}7j�k���}�������3	� � P���V
�/�6����:�B�.Ppo�%ckj��/|�
Xd�7�iW;O@1M�;�U�\��N��Vn� �	�K&�B�<�$V6��P��2$��Xx� �G�z�w�Mp#N��3�|��f����(�@@@�(������w����-J_Q�l�/���c�����J�A
fSJA�v�3Nbe�k��n�z� !P���C�@����@^��p�]K�w����}@@�.����H��x��6D���V�]����x�~jq�s��*�-@�}���X�����[^3$�Z�T�J�0xW������S���!!� � �%,��������SM���Vn�[�~w������VM5����S���U*����	��3Nbe���A����	�]U�/���2+P��)@	�4��-�*�(O���N��nE�~*��^�������["�S5#J+`�6(���������HBx�������[U�T�[�w�������w���������;y��j��E������;��P��� @���@_A^_`�B_��U���7r�V��@��������}�[��^S3���<=���.����V�@�9I�-�+g�FE�<Iu�b8u��V���.u���g��.�Lx�l @��J =5�+L������=�����-��-�KW��w������*��v��e�$���n��4j+�I"��C�c���w)�{��.���2��U� @����n����O����{��b������
�"��F=�x�~��{���Z��Z����s4��Cs_�����$��
�h6�y�$�5���d�J����m��m�������r���y��Fa @�x^`$�����t��}�[��{�]�	���������j�)g\����|�z��IrmA]#����<Iu#4NK��
����k�������'G��S�n�����h����e�]m=U#@� @�@}�u�aRo!^o�_"����j��jW
���\O�e�$�����F���%OA]�
��!@���	���G�Ia]�R��%�O�UO�������;���Y�@� @��Z	��.m��������*�{�*=�,��w����x�.�V��Uu���������������'��nT�N @��s9��	���-�O�R��uttDZ���V�Yu���� @� @�:��B��B�T���z�U���O��<[��XQW%T����IrmA]������$�5���3�K�2��=?�����5����g�U6�\uW���YuWI�= @��@o^o�_.�������������e�$�����c6���I"�U��3 P�@��Uw%�+M� @��L�����/_zO�mZ�&�Yy{n��k��e����6e�$����6c5*��I"����S 0L}�wV�
� 8
 @�hp���������b5]o2��_fN�k�?>���<Iu�~�u��Y�����Uw���L���� @�h���:!]9�XfN�k���QQ5OA��N� @��:�-��Uw��:�.� @�t
�Uu������w��eI���$���4x��l�$���0��@��(�[xW�������;�I5s^#��f @� @�@���IrmA]��k���DP7��V� @�qz
��\u�v�sb��I���t����8�� @��@�9I�-���4�y���g���S`��.[-�o+:����h��f[;���i�v�wV���U! @� @����IrmA]���O����X�re��1#?���2eJ�G>���q�����q�b�����N;m���6��j�U�f��rG�$��*�4#@�u$P����G^uW��.u�c�����]_�����\
 @� @`�e�$������������G#�uyK��~��8�����x��g���|g|�������|�S�������j�������Mq�!���DP7D� @��:��:���k�]���5����n���v�w\P��� @� @��K���$���3�tP�����w���o�������kc����e��u���},>�����^��8��sc���q��g�=��,�����QM�T��v�����A���DP7H@� @����b�����Zu������x��'��� @� P�e�$�����p�]w��Y�"�=������Rq�V�����q���u�]#Zw�ygL�:��z�UW�	'�o|��;��N�m.�������k�~��,���J 0�rx7i�����+���z����&U�~v���?S�*D�!@� @�^���$��
b����/��{,:::b�w������
����^r�d�=��I�b���1��~�������u�5OA�`G�F�@�Z�}5��2V����m�����|��Ys�9z��� @�hH�2s�\{lC���E�|��EH�����;�P\�_����u����ve�l�M���q��wG5mR VM�j��Zy��]� @��A������D�s����GZu��������=�l~��������X�~v<9��xv����{���6�+w @� @�~u������{�������.mS�L)^+�u�����Y�vmU���5y���K�� PS���,�g��~K�]Zu��\3q���:~����y����l�6o�~1;:����8��A�:� @��#A���_�r���?��>8>��v�W��_��k_~�V����6�m5��i����c(�w�q�Pw, �D&L��&�S����7�;�<��O7�c���}&\6h��u��w�O.�����X��3�v�e�A��� @� @����U�����,���1v�y�X�`A�7���|����-�G}��[����n���n�Y�Z[^�� @������kb}��8�1�x����S6�!����x?�_��ns~qx�]f�������]w<�
F�1 @� @�@���*���������6�W^ye�����:���n�T��^�:V�X����s�t��M:���*�e(���g���X @����\�}�c�����c������u���O�����3�k_}r<}{D����e���b---]����� @� @���
�������}�k�V)�{�K^��GqD��E/�k��&������R�o|����N;����M:��v��I�l @����~\��=������}���YwC
��������Xu�|����gk_sJt>k/Bx�<�w @� @������4����>[�T�n���<�L���n�r��8����/y7�c�9&R��s�=7.����={v�q����]tQL�>=R���T���������,<sb��#@�j+�|^����������J����V��zz���'$� @� @�@�	�������>&��O?/|�c��u=�z�;�_�����6m�����K/��<��s��)V��[_���6����Vq�A���DP7H@� @��
l^ui����1�����\����:�u�V������+F� @��^���$���x�P��{��w�1&N��c�j���iWM�/���y��s� @�
!�y�]��?����u��Z��r�]�+����: @� @����IrmA]��{�\Q�$���B�T��Uw�a+���}��i�]kkk���t�g����K @� P�e�$��������]`�$���g&@�Jxn�]z�]�uf����.?�.���JX @� 0�e�$���nx�����'������� @��Uw��]�<��@I{ @� 0�e�$���n���a��'���a��� @���Uw�kN�v���n> @� @��(3'��u#>��{y���w�\ 0L%��k_}r��-W���
�V�ZU�Z�dI�������}~ @� @�(3'��u��j�'�����]g	 @��j*���u����ti��w����q����8�������}���:�]�� @��Z���$��U=��0OA]��� @����]���t%}��K�O�<9�|�7�m���q��8~�<a�V2v @� @�����IrmA]��M�M�$����
 @�����.����T���U�gw_17����t�H
�Z�~�X���y%@� @�z(3'��u��7�7y���~* @�j%�|^g�V�Us����LL����iKK���4m @� @�i��IrmA]�N��;�'���+- @��(q�]O��V�m�����y���� @� @��f(3'��u�4���<Iu��� 0�W��vb������u��?V==��+�����zlc' @��e�$���n���!�/OA�J���0��K+��&��`9� @�F�@�9I�-�M3��}��DPWcX� @�C(q�]��U�gG�S�+�z����Z^	 @� @`4
��������8sj��<Iu5U� P�@Zqw�����UM��V$v @� @��((3'��u�x
�ky���*�x @�@�i����K>I���w=��K� @��/PfN�k�����<Iu�+L���@_a��E��Q����w~��t���{�v��9��w���fnEb @���@�9I�-���A�����DPWO��Z @�}T>�.7K!������_�1��o���n���� @�FN���$���������$���P�@ @�@m�w��T� @�R���$��5������DP7<��B��{��b/]h�������g�$@� @�@S���������J��d�$����9� �#��n�������---�����^'	 @� @�T�2s�\[PW�6v�<Iu�=��� 0�#�u��?V==�����.
o @� @`e�$���n�,��I"�k��O @�@���q�&@� @��2��IrmA]#7Jj�I"�%� @��z��J|�]�z���+��q"�& @���@�9I�-���q��+��DPW7C�B @��!P�=�8"}.i[�~vt<�y������mfI�� @� @����IrmA]M����<Iu�-�| @�[	�PxW��Nx����A� @��Q-PfN�k�F�Z��$�
��� @�%
,��Y|W�	�JO�	 @� PGe�$�������.%OA]����!@��S�r��<����>G�� @�h�2s�\�@IDAT�\[P�0�a�/4OA���;# @�@�*��X}'���x*G� @��a(3'��u�0��z�<Iu�:��� @�O��b�q���R\^zmmm��R}I� @��/PfN�k��\��y��f�\( @�@-F��w��Z� @���@�9I�-���x��Jy��F��� @��@�q�]������c�������� @��+PfN�k��wL�ly��j�\, @��p	cx�j���xjv��
���q;_�w���l�!@� @���IrmA]3��*��'���J0� @��� @���2s�\[P7j�K�;�'�����* @�4�@~�]����K���w��*L� @�@
�������&�X�v9OA]�b� @������<yr<��9�E�m��JS @��R���$��5�����y������ @���F(�{v���v�a!��*@� @��((3'��GUP�l�����k�����8 �9��?~�(����<Iu���l @��	��q�@� @���(3'��*����{��SO��k���~���{�����[��V�u�Y�~���}s���+��"v�e��}�T/�'���z3-	 @��@e�7��K����n�F�� @���2s�\����s�9'.����m����;��	&�t�=�+���[����t�I1�������@�$����� @� P����~������Z�~v<9���4iR����n�Y��� @���@�9I��PA�a�7�tS��7/�;���qy���_�����iU�[���������t��)��5kVW{o���DPW��V @��R`���'��4&���hmm0���\�y����&U�o����q @��"PfN�k7TP7c���������O,l�m0w�y�X�fM���/�n�!R����?}�`��x��_?��h�c�$�5���4 @��h�������E�������V�	�z��� @��������ck��U�a�����G�:��_\�ri��?��"�K��L�{��g,Y�$������F� @�I`�q�?����9o�������'=���wH����n>~�����T���?�lz[q����w}�t���u���UX�8l @����ZQ��7�9������{�]���u��~z�jn��v�+V������%�b���#.���8��3��{S�@Ns����K+ @��R���.�����7��
7�x7�}~�p��F�:\ @��[���$�n�u��s)�[�lY��3���������w�+B��h������ @��R���w-3+��px���_D�*�[ @�U��V�%�s�=7.���n���vZ\v�e1f������t.����>|��n!^W#o��i�u}2�� @��,�|^�&�^y��������g���V�e� @��T���$�n��.	��r�/�
6���s�����
�������W�9���x�;������$��o� @��l^}�q�������.^��[zi8������'���s��V�� @�4�@�9I���A]3O���{�$���Tw. @�M  �k�A�E @��/PfN�k�����<Iu�+L� @��sk�~*�t|rX<�W��'O�g�k�---���:,�v @�G���$������?�y,[��&#�:���������l�"y���h�u� @��
t���hY�������;) @���(3'���2�;��C��[j���_��Wq�1���`����I"����qM @�F�@_a]�v����_���~8��][�����u<5�Xm�j������V��E� @���(3'���2�;��C��CMF8u4�~�ej��y��n� @�/�����{�����q^�p�������G��"6<��b�p�w���O�pN���:s���� @���(3'���2��������q���/���������[���n�}��8��������?</^&L������$�U�� @�u&���X������Io�In,���;��+�WQ @�#&PfN�k�eP���
7�/{��b��
E@��o|#���U��o�=�<��H��Yg�����������@�$����� @� @�A�9�K*&�v<�j���A&��$@� QfN�k7TP����7����=���<��1v��^��w���8���c��qE`7i��^���g�<Iu=��K� @��(��E
��s�_\����&�%sG7 @�F�@�9I��PA]z���%K���|g|�k_�s�7m��n�m��������������/���DP���= @��r��:��1���/�&]��F��= @���(3'��{_�V�<?�p�Ww���!]j�r��~�k@� @��6f)4�vo�����f��?������K#�tn�7�|:V��Om�Y�Dx��� @��S��V��y��q�%������{��^�x��=�Jz�]Zu�����+�������C������4�����|C� @��X>�@X��/b���������vZy������:����y�W @��&PfN�k7TP���?N;��Bu�m��7��
�����+&N��<�H\����o;n������)S���k����0����$���1�3 @�J�������3KRe	 @��O���$�n��.M�������/���������E�W�u��DP��� @�N������������U�;o�����9%�wU�iB� @��2s�\�����7����������s��Y�������MozS�m|��@�$����|K� @��A�Px��W�7�Qs  @�@������
���_�~}�������+����x���c�w��v�-9�����������x�@��
�Mk @�I�2�{|qD�\�V��.?�NxW�� @�
+PfN�k7lP����@�'������ @� 0:F0�K���9��� @�o�2s�\[P��4��y���z�< @��* ����q] @��D���$�n��.G�]w],_�<V�\O?�t�C>f��8���c��)���E�y��z��� @�u'0��]�hkk�b�|��� @�
"PfN�k7\P��K����?�������W��Us�1U���S OA�A� @�������3x��t @�@�����
��v�mq�G��5k4-~������>�c4���DPg6 @� @`
,�����G�0���}��E�������gw�Ex�E�
 @�@�
�����
��[X~�+_)�k�}�����}��~��;��*m9����tP�W���$�
�Mk @�4��0�w��}�)��x�8| @�A�2s�\�����*��n����+n����6m���?u�$���?�zH� @��^*o����^�
��U�gG�S�+��wC�t< @�@-��Ir��
���{����;#������Tc5���DP��� @� ��#�my���n�]3N>}&@��@�9I��PA�g��^ziv�a������<S�$����&@� @���*���{W��Nx7��� @`�e�$�vCu?������N*����k_���j>�<IuQ�� @���$�K����V\VKKK����!@� ��@�9I��PA]rz���_��Wc��	���W������=T�<IuC�t< @�t	�Qx���]��
 @�@e�$�vCu���������o~��W
�f������RG.\�o����.�I"��"^	 @� @��])�� @�M���$�n���u�{]�[^t[�hQ{��=����I"�k��� @��/0L�]�X��������������[�jUq��%K"�r3m���� @��Z���$�[�[\��w�}�{���::c����� @��/���E��-���:�<�8"�y5�����J~�:�K�[��)�
����������N�at,�]��N�)EX'���^	 @��-�P+��{����9���n���� @� PBx���W��Oun�kN��I�o����������+�f����;�O @���(3'��uu7��sAy���gL\	 @�T)Py���Wy������M�a�!]>C
�Z�~%��< @�@}
�����
�=���q�=�t�s�]w����+&M�T�#�`W�'������ @� @�@���]
o��������g.�q;_<�N`��� @���(3'��2�K���?����w��6m�j\�9�����>Gy�V��Q�@�$�����$@� @����.�����f�3�<������%@�F�@�9I��pA��~��8��c�����9��\pA��4�Y OA]�>� @� @��(��n����5�.+�J������Mf%�� @��a(3'��*�{����-�\����n���5�yM��5+��f���_���_���zk��\q�E�������$�UM�! @��V�a
��*��u��?V=={+���Np�� @��������
�}�S��O~����������c���[�/\�0���7��u���m�������@�$��>�|I� @��*P�]����u"�7 P�@�9I��PA�i���������?��1~��^�/���8��3���3�f���k[_�,�'���g{	 @� @��V�1���jw-w�Z?;:��\m�����	�j�� ��e�$���F�]�dIq��|�+��R��O>9���wF
����?
�i�]+ @�hT���)b�c+�/����F����/�������U��63w����S�o @�@�4TP��3�n��M��o��I1a��x����/�K��5 @� @��D`���2��u{�1���t���=�����{���������S�������� @��ah������;��#n���~������m������ @� @���	��.LA\�rH�>���r��3�*����[���"���]��������� @�D�T�~6�����}����O|��%_r�%]����C�v��uq�����������v�i�f>�#���Z�s��.�� @�4�@ex��
�!��}��E�-�w�;����p @�a��Ir��
��/_�rH<����:����.�<���k��b����j����o~���������?�������	#t�y��Fh�� @��$0B��Uw=
�} @��h(3'��*�K����~7N?����}���q�-���Y��j�Qw�<Iu�]|"@� @�u'�|^�%��yw=�m�����������wV���e 0Z��Ir�������??��_�%n���^���#��/�0��mpy����( @� 0�#�Yu7�#�� @�@���Ir�������c���q�m���w���n���^�����^��G����DP�|c�� @� 0
*o�9�+����sI� @�@������
�5��n�I"��apr @� P��0�w�kN	���N�	 @���
�����
����?���{���G{��o_���b��i���~6�L��c;;���DP���o	 @� @�����V�_J��3������+�YQ @`�e�$�vCu��x`<�����W�"���������8���������W�������@�$����|K� @�F��0�w��O.Sx�������s�F�� @�u'PfN�k7TP��~0���/����c������z�������o~�h��_�V��&�'��n`nZ @� @���X>��������}N+���t�L @�l�2s�\�����s���7�o~�����.����[o-V��F?����u�{]��}��@�$���m�!@� @�z��n���c���a�]ca 0d�2s�\{���r��qi�o���=kj�:�B������� @� @�Q`���*o�Y��w���F����G�5"=��}��b��]_k�V�6����C� P�
������b��X�lY��w�}w���n{�/� @� @���l~�\z�\����n���n��j������1y��xb�s�}V�~ @�u&�P��<����+_�JL�0!��?�3�=��97m�ox��'?�I�}�
���o~�m@y��[_�Mc @� @`�5�z:}Zy��9�K+�Vm�^��zB���&PfN�k7TP����6�>��HA���noy�[���O���.�
��+W��7�]tQ�r�-����Z���f�m�����@�$����� @� @�j,P��8���;���>��oJ�N;6�m3�.�fE	 @�@�
�����
�����G?������������c�9���u��DP���' @� @`�rxW������{6�-&M�O�}���7$�	 @�@������
�=��3q�%���?������u*�������~6N>��^���o�<Iu};�� @�A�������1���j~!���O�=����x|�p��j~	 @���(3'��.��C�z�����~��sK�.��k��.��{��gx�������v�Ykp�y����( @� @`�����^��Oc��k~�>{pl�xTQWxWs^	 @�@������
���(����DP7�Y� @� @�@3<�����o�Yfx�n�9���AU	 @���(3'��6���qc�y��q��w���;wn�����~Rg�$�
��s @� @��"�9�[����]�&m����n���b��g����D��7,�r @�ve�$�v�u�u�^zi����k�{��]�{��W,[�����g��-�s�9'�8�����T/�'���z3-	 @� @����Z�}���tl���yw�v}m���J9�� @�@m��Ir���>����?����[[[c���]�?��������ui��m�y��n� @� @`t<z����_�)3�+V���Nx7:&�^ @���(3'��*������U�zU��h��)q�Yg��;���/�[u7�xc��a�-�"��/~������<Iu���\ @� @�@������-3�=y]�����r�-3c�����"����XA P�@�9I��PA����=���/���o7�|s��1#����H���2���iSL�6-�x���?~�t�I��kS!�'����[ @� @���V���8�������u��-Z�c
�&����BV�
T P�@�9I�=�����&)�K��>��"������;��s����z����#@� @� 0h�����+��������7+�&n�q����
7F������B�����;��o���W�	 @���h������+pg��Y�_������	�j� @� @��Z�����e�O��V��}��"�K�[�]:��n��[o���?i{.�+n������&��t����@Cu���_�{��q����K�b��x��G�v�{��� @� @�� :W�u�w-����i��_\<��m��AT����E�����n��:?���������~ @��	4�3�>����g?���2eJ��w����g�����L����5q�UWL��Rj���\���G���&l]$@� @�F\ �7����"�K���V�����U�-��|��e�T���-U|&@��Q$PfN�k7TP�t��8��Cc��u1y�������s�=��3�����.����Hm���/�����b*�p�	����tM���J�$���3w& @� @�@�@Zu����XuW�[fV��z�V���ry�]_!]>PX�%� @��((3'��*�Kc|���������{����d���1cFU�5�.�'�����O @� @��������� m�VZxWT��������xU4��4�@�9I��pA]���G���|$����^���/y|����9s����}�I"����� @� @������6<������K������~�&@����Ir���������?�y�~���l��x���c��w����q�A��G]��4G�<Iu�1�zI� @����Uww\-�o+:VVx��6����S'�g���I� P�����
��#��I"�+�� @� @�e
�Uw�-�
����/��i�]����:?�M�J���$�UA]ZYw�����O?p@s�11~����z��<Iu�4*�� @� 0t��Uw)�k��Tx7tV @`�	�����
��{��q������k�������]C��o}+�:��X�~}���s��W\���K�>o���DPW��� @� @��F�r��+v�t�]I����y+���V�%PfN�k7TPw�9���^���[�y��1a����{�)V��[�n+��N:)��/i	�Vg];�$���q� @� @�@Eh�|^�vk��.@x���� 0e�$������O�x�����r.�tiG
�rH��{�[�]tQ���G?�Q�}��1k��A��� @� @��I���%�i?��roa]����c��]��l�vi��_:4����K��+��w�V�w3�u������ZQ7c���������O,� �s��w�5k��K_�����"���s���_�`A����%C6|��i�u�g�L @� @��zH���O�������-Y}J�zzv����1y��bwGGG�g�U>��&�]>i����R�{(I���$�n�u;��C�=��#]�_|q�������.��2eJ��%K��}���v� @� @��R����h�9����t?0}��T~�����m���k����B��F��{Zy��y��y�]�| @�@�4TP�����������>s��)B�O|���N;�y�]�� �8qb��� @� @��**�������x`����X%�������;��R�r�]M��|�����Nx���I��#�P�����������Q��_�b��?�c�w������#�(>��W��c�9��;o���.���:/� @� @�����-6�Vy��!?���K��e��w=	�G�e�$�vCu���s��.���)���N��.�,����?=�n���������>���������$�U�� @� @���r`��w��	��f�h�@�9I��pA]�L��/^6l��s�����������W�9���x�;?
M|d�$��&��N� @�a�-���[f�l�4���rz�b�2s�\�!��Q<�u��<Iuu5,.� @� ������+��=��9���@�	 �,e�$����Yf� ��'��nx!@� @�6������������;��+�WU��@�9I�]�A��5k�g�M�0a��������c��e��W�"&N����{��X�ti�}���i�b�����?=3�����q�����i���}�?T���6��@_�$�
TN{ @� @�������3+W����q^g��w#=��C(3'���6�;���c�����;����c���O~�����`@���.�Y�f
����������??.�����qc�x��������I|�#���?�g�}�C:�]w�U���g���|g|������S��T|����WM�j�t��<Iu�t @� @�@]	���M�aq��wWWC�b Pe�$�����i�h����iS���LA����7�|s���z8������o�A�-����zj�y����v�A]���}X�~}�������m���;���<�q)�{��^��{n�c�>����'>tP����/�V���6]'�� @� @��hmm-~E[[[!�������#�y`U�����c�"�O�V����o��>Yy�i�7�T�nW�����}\y�������=��S�&������������>����e�])�����_���z\Q�������/~1V�^S�L�j:��x�/iuZ
��N�Z�������N�7���q���A`�����TUk������ZQ7@8�	 @� @�Z �*�csx�T�2����ZO��	�����u��[;�����E/�|���Q���z�xN��1c����tPw�u���G��~z��-o���&M�������+��v�����V>�`_���n���#@� @�
�����n���@�h(�2s�\�no}��6����o?����������1���{��e/�����8�/�K��?����{��G�}���j�j�U�&�ky���� @� @� 0p��[f��+��|�����������<^	 ��u�5�hz~]���/�*���Ux���_�B��i���n��V��[c�]���v���<yrqN� @� @����`������~=�w{��Y�������Y@PW#������?�Gq�'�x"~����?��?��+_����m������iU]��k^�W�j���[�l;����A� @� @��'0a����3f����f�����W����)�k^yW����8�:�n�������M8<�?��{�	 @���2�;���b��%5�J����c��Y5��[��c�F�I[Z�v��'��+���`,Z�(?�������-�G}�8&���a��mWM���o�s�L� @� @���	�;l��l���.��sx�|��X���b_����
���K��|�5�u����w�4~ @��2������������}��WzP����g�������ikoo��t���E���~���iWM�n'��}��g�N� @� @�@_��m��e=�n��"�[���2����m3;�&@��0�ePw�QG�tE]Zb^���/��5W�]~�����>8�8��x��^�\sM��/��w������F�|���0m����MQ�/ @� @�B���3���j�����(>��EO����f�����JxAE�g�������������?��!�,�?���q��g�n���rH�x����?�9=��Ha[z��aC\q���_��h����.����{n\p�1{��8��3�������b�����������\M�j�<�������3'��( @� @����R��>���������[]��n+;h�2s�\[P��\�����x�{��c�7���q�e�E�}e
��j��7m��v��o����g>���i���}�6m*�Yw������?^<�n��9�V��[_������Mq�!���DP7D� @� @�(Q����mj�"��_��u�t����*H�@�����������u����	&����c����z
������q�c���=�I;�iWM�^O��y�����5 @� @��:������+�\y'����r)��@�9I�-���)0��'��nd���	 @� @�U����m����r�L+��:��'@`��Ir����m$�,YR������Y�jR����I"�k�Q�W @� @��f������R��w�2��I�@5e�$�v]u3f���������,Z�(�=����j��@�$���M�#@� @� 0���*o�9,+�f����F�@�	�����u���Mo�������V�
b��I"��C @� @����[f��w��Q0��@�@Ue�$�v]uU�hT�@�$������ @� @�@����M�_N_�8���`�w�����W�1��;�Y�W�X�J`��������Q2Y��F�$��2t�$@� @� 0z*��|����w#�-�����
�z���@����������f��I"����7 @� @��,�Sx�6��E�ay�]^y�WH�/]X�%� P!PfN�k�uP�f��?~|L�0�������DP7\��C� @� @`t��v�yw����w������� ��@�9I�]�A��'�,�w�1����;vl|������� ��.�Y�f
�8�#�$��
 @� @��%�exWy���V�����;�����o�"@�Y��Ir�����h����iS���w�uW����q��7���K|���w��n�j @� @� @�@�mmm�NRy��%q��b8��m�����:�%�6����@����mPw�QG��W^���kW�6g���������3f��	 @� @� @`dZ[[#�Tn��wmS�W6��m�]����c�k
����Ox7(W ��@�����g���{,^��u��������y��[_�ZV= @� @��*P���gV���Ex��5V�w3����NF�@�9I�]�A�����E�$���GWO� @� @�Y��Rp��>]~��w�;�(3'��u#0��r�<Iu�2b�� @� @��J�"���}Q�����m�^x����N���$�n��.-e���{"�m�Mfo��1c��o}kL�8��&��"�'��� �	 @� @��{����������>9���mS~X��e���:~i9�����*����Ir���������>��zk�������e/{Y��5���DPgF @� @� ��9�K}HA\���>%Z�9-���"?�����X �����W�)PfN�k7TP��������5k�hF\w�uq��G��#�$��
 @� @�4�@e��r��Fkkk���k�1��n��w-�o+�Yy��F�@�9I��PA�{��������b�g��g�qF����1u��^?u�o��oc������E�y��z��� @� @���(�y�y�]���;�]s��m.�2s�\����9s���7����O�p�
1m�������<Iu��t @� @�4����a����@�9I��P��}�����|��n��� @� @� Pk�t��-o�Y�-����J�m�c�"�O�V���:���i�v��9���o�V��V��r�)1���{����;��[��ra9���n���~ @� @� 0R�����6s��:��B��c#@�*�2s�\�������>^�����M�����d��sC�*N�*�'��n�r� @� @� @��]�FZI�2s�\����4�������D1.������~w�c�:��������=�I"����^ @� @��Z@xWkQ�^���$�n��n���q�!�D��j��n�)=��j�k��@�$�:S� @� @�#' �9{gnn�2s�\�����q��b�������������#�$��
 @� @��/�]}����e�$���F����>�-�;���c���������������^�� @� @�h4����H?�[ex���#�y`U�NX-�o+��l�4Z�/�<dp�W��y\~M��8�s���"�����jE��9s���o��S���������F��i�u��s @� @�F\�2��������w=�Tx���}
$PfN�k7TP7s��bE�?��?��_�@C����'���1��U @� @� @�'�]O*��Z���$�n�[_y��EP�l�����!@� @� @��p��~�4 0l
���{��^���o��2^}���������'�i�u�8��L� @� @�@��������i�v�{�]�O���9I��PA]�3�<3.����n��b��q����� ����I"����o @� @��^@xW����/PfN�k7TP��o}+~��_DzM��	b��Y}�t�����m�g�&�2OA]�N�&@� @� @�@��*�4iH�2s�\�������u����d���h��8��������= OA]��='@� @� @��`�w�QsL�	�����c���}]��w�}w_M��.ut��[��� @� @�(G���5�O�V�-���kX��F����f-�.���K+���E����8���3�:��K��ZQW�����r�kE�(d]#@� @� @��
T�w�����2�k�:������:���������Q6ij��<Iu�TU� @� @�������������|W�@�9I�-�+k�FA�<Iu�`0u� @� @�
, �k��k�K/3'��6�[�fM<�����3�<��������_���4iR{�\z�$���WB� @� @���;3�l�2s�\�a����������4~����M7��^��m3f�����%���_��8���#w��c��Z �	����� @� @��O@xWc��WTfN�k�uP��������\rI,]�t�c�����I'��y�{��t�f;0OA]���� @� @� @`��F�XwO��Ir���R(�p��"����+c���[�o��6���;�n��i���i�"��v�}�����?��V�����y�k�#�Hs�1=���y�<Iu��xG� @� @��/����c�:�W�1����3�u��]s�2s�\����t��#�<�fZ��7�1N<���k��b��v����=��S���������~?�����'��j��}��'���u��f+�<Iu[��A� @� @��L������"R@����y��d�����$�����7��M�����%N=��x���T|�<����g?��/��X��v��"��n��S�i��������T�@IDATr%@� @� @��
��	��J��B�|���,Q��2s�\����
6����8����\57X�u���-��Gq�`K4�qy���f�u� @� @���)�k������m�F����T(���1��O�\���Ir����k����y���{�\ @� @���@{{{q�����Dx�v�sb���k�e�$���n��4j+�I"��C�c @� @� P��H�wk������5_�2s�\{l��������?�W��U��������/_�_}�~��5��  @� @� @ 	���m����o���w�<P��wO��=�:��,Pw+�/^�w\�;6�?��x���'�xbL�:u��+V�������
7t&�O<�DL�4i�5�����ZQ���� @� @� 0�\y���{�e�WF����e�$�v�u����C�6�&L�N8!N:��x��_���[���.1n��n���5k��}������O�����m�]�^���>�[�I"���� @� @� P�@ex�-�W����Nu�����9����F������u�%��n�)���/!��
z�Jhmm-B�i�����+��.p�m'N�3�8#>����{Z���S��'��.�x%@� @� @��+�j�����}�6u~�'n_}r��.e'�������u�e��*����v��?(��� ����~x�z������=v�a����m�$�5�4�y @� @�a�b���y��u)��������x#|�
}�2s�\������K�����k����}�-[�,{����iSW�1c�����c�������%1g��������u�]��xS�@�$�����$@� @� @�e��.�n�vi�)��>%Z�9MH�%R�7e�$�v�u=�������k���)S�������70�l)���� @� @� @��n��{�x`���2���5��Ir����}����<IueK�O� @� @���@�9I�=��:�z @� @� @�4����FY	 @� @� @��N@PWwC�� @� @� @��A@P���� @� @� @�u' ���!qA @� @� @��  �k�Q�G @� @� @��������  @� @� @��f�����qc�y��q��wGGGG��;7���F���&@� @� @�h2��[Q��3������x��_�g������q�g����nCw��g�_������� @� @� @�@=4\P���~4�:�����{���Y����	&�m���������� @� @� @�@=4TPw��W��^X�M�2%����9�����G��O?���=��]w��c;	 @� @� @���@Cu,�g�}6��~����[������vZ�v�zhL�<���O�S�m�$@� @� @� 0R
��|�����>���1cF�fc����w��h��C���� @� @� @��[����������9sfUN��_�v�yu6 @� @� @��$�PA�~��W���^���X�"y���Y��^5}O� @� @� @�V
���3���E]K�.�� =��=�yO���g��z�
 @� @� @��zh���mo{[l��v�f���;wn|�������W���������=9��������'�pB������k @� @� @� �%���w���i5Z=o_|q�������N�K�,�3fT�^��/xA����9���}"@� @� @�]���$�n�u��]�zW\q���>��)����<~��_
��T�% @� @� @��H	4����q�������~{,[�,~���}����/~qt�Aq��G��^)��\+�	�0 @� @� @�a��Ir��
�vT���$�5���T @� @� @�&e�$�v����&�� @� @� @�a��#|�A����=����X�jU<��3��3fL���o��'��� @� @� @��[����k��6>������Z�����/{���n�! @� @� @���*�K��N<��X�f��\��f���� @� @� @�@�
�}��_�
�f��g�qF����1u��^����;��^�� @� @� @���h����n(���g�H��M�����(9�:��I!���^B������A�������]Qv�A]Vt���(��,����px�`x$<Cx�y!1��S��8�ytM��tW�������u�������[]3�$@� @� @� �g�A}�a+x�����~��B�tw) @� @� @���(TP7}��T����e�K�� @� @� @��
*�;��sb��A���O��?��dr) @� @� @��+�����d��5}j���6/�������F�w�yq����������y��{,�`��]�h�9�u��%@� @� @�T/P��$��pA��/���M�%K����;wnp���+��@6Iuf @� @� �l��I��u��$��1cF�B�d��^������ @� @� @�4���o_��]z��������;��Cb��I��V[���t#I$��o�Mw{M� @� @� @���
�����k��Q��_�"]]WW='@� @� @� P�@�n}�l����'�x����w @� @� @�@c*�;���R���~�1��� @� @� @�@��
��;�����-0��_W�e� @� @� @���@��������vZ�����G���/�/� @� @� @��R�9u;��k��?��O���|��8��cb���=����%n�����z��A @� @� @��,P����n����'mk������uw�/�u�c? @� @� @�@]
�M�:5-Z�+�dE����{u�� @� @� @�j-�����v��o�$�L�D��_ @� @� @�@	�2'���D��J� @� @� @�au
3B� @� @� �L
��Z�*��Y�L��� @� @� @�M"0�Q�9k������c����_|1�^xa�x���jrr��[n�%&N����&@� @� @� PK��
�~�����
b��%��3��^{���������/��Z�	 @� @� @��h����������1n����m����h��^y$+�����s&@� @� @� Pk��vhM�[[��v����~��X�|yl�����m/��7��xW$@� @� @��U��9IV��F�]�zu,\�0�|��Fm�v @� @� @��Z�a��+��"f���vX��s" @� @� @��Fh��.����
�N� @� @� @�T-��A]�=r" @� @� @���
0H�H� @� @� P>A]��T� @� @� @�
 0����r��8����nfKKK\v�e��6�T]�	 @� @� @���@�uo��V\}��}��G?��������' @� @� @��O������#��L] @� @� @��,��+�F�s�������/��c���w" @� @� @��Z4|P7t���s�=k�wu @� @� @���@S���n�.L� @� @� @�A]70v @� @� @���������&@� @� @� ��@�~G���Sc��������� @� @� @��-����l���O�v�lV������9��3@�	 @� @� @�@�
�2'��v����_zN� @� @� PGA]�]� @� @� @�yu�;�zN� @� @� PGA]�]� @� @� @�yu�;�zN� @� @� PGA]�]� @� @� @�yu�;�zN� @� @� PGA]�]� @� @� @�y������q�=��s�=/��r�Y���n4(f��#G���� @� @� @�Z�pA�/�����>O=�Tn�C9$=����$@� @� @� Pk�Bu�=�X|���U�V��e��7�U��
?�����O�{����r�-�,��?�!�����l������c�v*��LrR�ry�tj� @� @� @�u(TP���m!��{�g�yfL�4)��v�hii�3����)S�<�w��o�_����_�j�[�.x���������_�>N;�����k:���������O��)��S.O�
�� @� @� @��
���;7��}��c��91z���c>���q������c��aiP��E�0.	�f����{n�]�6�n�.� 
?����]�2I��YWWm�� @� @� @������&�imM����.Z��{���������F%�����������_�z\w�u���|����d���q�R�$�5jT��[o�5�<��8����?�a�27�pC�W]}��V)6���k�O� @� @��J��9IV���.������w�MKV�
������Gy$;��n/��g���q�����tI�#�8"�w�yg��R�����+��� @� @� @�1
�{���b�q?��ODt�-���{�zz���K��}���C����.�����Z�)��\�S.O��:� @� @� @����9�j��w�qq�g����?��%����K#G�����������Sf���������Sc� @� @� @�h�!
��
���k#	����5k"	�&N�����=>o�����z�����>�k���T2YU�<��I��)��Lv�N
�b�SO=U�YN!@� @� @��I�PA�
7�?��O����u�?�x���6/^\��n�m�M/�|��NMx���c������a�r��[W���A� @� @� �0�
��N��-�^��l����:�����������v8u���������I�"O���<������>��s�=�X��	 @� @� @�6(TPw��G�O�=f��[o�u�~����K/�;��6����~��t���"O���<���i4#�!@� @� @��(���ek����>������<�H����o�p��7�C=���b��v�i������<��s��_�r����q������K��+�����*|���|�2�]W��*��}�9Q%�� @� @� @���
�2'��.tP�|G���>�v;�q����������A�������'?�e}'�tR\����
6�Yg�����b��1d���>}z$���[_&�<e���[Wz�*��MA]��N#@� @� @�
+P��$���A]�������(
�6�C=4.���8���6=T��I���/�v�m[n�e���S&91O�<e�lD���$I��A� @� @�h&�Z�$Y���~������\�vm��p�9�����XP�.�I"����N @� @� @����I���������~���/����a����L�81��-����;,X�6-n������k�i#�@6Iu��"@� @� @�J$P��$��PA�E]^xa:�'�pB\u�U��V[u�[n�%N<��x��7c��1m�^��v�(�MA]�L @� @� @�%�eN��=�Hn�>�h��}��7~��t�%�9�����+��K�,���{.��/ @� @� @��"P��n��������o�y���f��,�|���z,�  @� @� @���(TP�~���g�����C�M����K�+@� @� @� @` 
�M�2%��3gNE�d��[o����c�=*�W� @� @� @��@
*��:ujjs�����W_����o��g�n;>y���m @� @� @�A��]#Z����������|���b��i�b������9�����>���{�5*�,Yw�uW|�[��'�x"m��w�}w�t�@-������D�85� @� @�(�@-s���Bu����G?��O>9�0�1"~���8qb��
u�&�����W @� @� @�@�j��du������t�I�?��?��^{�8:�������t=*9H� @� @� P/��������[��rK<��c�p��X�ti���Ni0�������3������\+��t @� @� PX�Z�$Y��
�
;�jx6Iu4M%@� @� @��E��9IVw�n}�/�*!@� @� @� Pg���V�Zk���3�� @� @� @���!�_e��8k������c����_|1�^xa�x����@�t0�.��'��<�	 @� @� @��R�a�������a��X�dI<��3��^{��y����������u�Vs @� @� @�@-6�;�����?�i�7�-d�>}z,Z��W���������	 @� @� @��Z���Z���������l�_�>�/_[o�u$a���d��2'^�	 @� @� @��f�eN���>k���Y����MA]#��� @� @� @�!P��$�{�@t����f��x���#y�����.Y��A� @� @� @��
���=;v�u����>U����n��c��������+@� @� @� @` 
��]�6�y��7+z���w�%�������� @� @� @� 0��
�z��D��jo��FoNU� @� @� @�@����
}����^���t���Ok[�pa|��������z+y�����[��O�0�m� @� @� @��Fhi���d;[��n�6_z���q��t��{��x��=zt��i��[Z���4'�q�� @� @�x�Z�$Y�
}�����I}���q�������d @� @� @��Z4��������o�n}y����w��vX�{���z�1"�����K��Yd�}h�Y�kE]#��6 @� @� @�)P��$�����.	�����4�Kn�����/��� @� @� @�@�>�k/����1~���9sf���	 @� @� @�N��o}Y8�58[v���%T]!@� @� @�r	�2'��.���{��'����/+�t���h><��� @� @� @�@�
��������o���hw�ygz���>��O��\+��}&�? @� @�h>�Z�$Y����:x����;b����s @� @� @��Z	jE��E�"���������~z$�p@�����Q�F�t�c�di�u��M� @� @��V��9IVw����#=w��8��#b���q�g��W^��T��	d�DP��& @� @� ���I��K�%�����JW�
4(�-[�G�n�I����&���?U�E� @� @�A��9IVw����7�v�	'��7l�w�qGoNU� @� @� @�@�J����+mxK�,i��A� @� @� @�J�=���q��'��N�6�m� @� @� @��F�����K.�$n���n����[���{��������1e����6 @� @� @�4�@���{��'�����n[l�E�t�M1t����(H� @� @� @` 
�����1o��X�~}�6[o�u$��:���={v�;��� @� @� @���@K��&����S��6�U�����aN4��o @� @� ����I��5/�� @� @� @���@������gq��������n��������������V���� @� @� @��%P���[�pa��5+��Y�y�{�s��\�n��/����.=�a����7��e9;	 @� @� @��K�P+�����4��|���������}�{_�r�)���~���b��n�:@� @� @� @��
������(YU7q����=������k��;����� @� @� @�Z�PA���KS�I�&UtJ������^|���� @� @� @� 0��
�v�y�������h�h��hmmM��=�by @� @� @��@������/������;���i��
q�Yg��6mZ��
 @� @� @�� ���!�nI��,[�����
��w_r�!�q[l�E|����O>9��v#F���_~9���W\qE<�����������>�iu^W�n��s�B&@� @� @�T%P��$��PA]����}..���\�I8����&=��\��(�MA]G� @� @� @����2'��.��/�!���W]uUl���=��=��#n��!]�J @� @� @��K�p+�2��+W����X�`A<����z���q�c��	1y��8�����2���9Ks����i @� @� @�@aj��du6�+�����$�h�4� @� @���Z�$Y�C���u�d��u�p��X�hQ,]�4<���g�}���$@� @� @� �{��}G�����;��N���n���{�QG��zj{��z?{���w�}������� @� @� @�� P���s��\|���^x����U�:�:th<��cq�Eu�� @� @� @��F(TP��_�*�����n#G��O�������]:�|����g�}6�y��.��I� @� @� @�^�
�n���hmm�m��&,X��o������;��b����Gy��2v @� @� @���@���y���Ng�}v�?�G�A��;���y��Wz,�  @� @� @���(TP�x���g�]w���l���\�}u @� @� @�I�PA��I�R������?�|���ki���^�:'@� @� @� �_�
��O�����+����[��{�>��O��{����m @� @� @�h�Bu��rJ6,V�Zx`\p�q�w��+W�����/�����6mZ�z����#�<2��~�F�� @� @� @�m-m[��v���W]uU�~����8j���?~�?>Wy�:
���==}Ntl�W @� @� @����2'��.�������D�t�M���{�(|�a��o~�!]�J @� @� @��K�p+�2�u���m��O>�d<���������;����[L�2%9�����*�4���*�F� @� @�V��9IVw���?�����[��A���3'�O�C�)�D�W��I"����. @� @� P/�Z�$Y���z)������S�N�{������7�}��8��#�]�zW��;��B� @� @� @���@���dE������Z�y���b��Nl�-1/����<yr�z����5k�t*g @� @� @��z
*�;��#��^&�b��7�{��w�x��m~��w_p����~6�P/y�3&��s��26 @� @� @�4�@���.K�s���O��z��������c��w�����aC����q�g��_�G�n+k#�@vT�Q��LI @� @� @���I���%C�~�����+��/�7�x��h��1#�������;�#�@6Iu���$@� @� @��!P��$���A]6�?�������2}>�����_�bd�p��^	d���^�)L� @� @��@��9IVw���.���W�������]m��_~y|�_���z�m�
 @� @� @��&P�u7�tS����},^���r�����{��7���o������}-�����N Ks�����Y @� @� @�@qj��dujE�\�f�j���r�����b����������_�"&L�����E���c��c�=6�x @� @� @�I�PA�C=�fw��G���3��L2$�?s���?~�}��1x��t�����7o^�y6 @� @� @�4�@���a�����c������g?�Y������Uv_��W�����i��u:n @� @� @��F(�w�-Y�$F����e�n���{��[`�='W�MR(�?���k��M @� @� @�M��9IVw���6"�MA��p� @� @� �@��I�������-��{�8�����E��v�m�l�M,^���:�D� @� @� @�V
�=��S��O��5�U�/����<yr\y��]�+V������-��+WvY�N @� @� @��h������sc��1��JE'@� @� @� �p�
�NR� @� @� @��B@P�,E	 @� @� @�������H=�x��i�ko[S&�J�Oy���+P� @� @�(�����C�E��/D�e���9�����!��d<A@�\�Q�F @� @�U����5k��<��o�������������t��-�iH�i{�OX�����(��|:7�(i @� @��,���q��vkk��nw}6/�������������}���_�j�JZZ���2'�`��w��Sw��s�}������)��}�z���~����!]W��B�+�U���F�"@� @��O��9IVw����l���M~��7���TD�Z�$����}x\���:��m��U�����g���E��S����/�k���2��F��������>b�G� @�(�@�>=�~�r�)��o�����&L�=���/U8���$�cq����K%���W��?Ez�
��J&��������Y�6
�6m�[o*�5 @� @�@
4��/�����N [v�����mz�9���*�M�yM��B�Rm�A_#�yoI��3'G��.�$�) @� @�@�2'���5�D����$�U+���J�6:���@�xp4�Vc�14�n�E���K0i�d��n'��JNV�N������������I� @���eN��-��;MX.�$�������s������+7��r�� �yfe��$3������"�h(I��$��v�`=B�d�T���a�����(�#3����_�UB� @�M$P��$�[P�D��]�&����r������,�e�U�7���M��v����%��K�*���0��p��o����E�!__B���Z-�w���#j��@��e�C�;�N� @��j��du��}����l��z@��PO�	��uJ]z��Y��i>g&f}����P2�F��d����/�h\���
�[���5���!���=���6��T���'������E���� @��jj��du���&8/�$�����������?�"2���G�o���	��v��\s_B>�d#��� ���]����~���HX�Ix&@� @����I��u����MA]�Q�	�X@�\c`��\ ��3��}$���!�����*�����b��juh�@3�J�Z���P3 @���eN��-�������l��J0��@��
��5���z�����������N�"�q�������<�aBP)8�t��36E���V����n�$@�J(P��$�[PW���_]�&����D�C��*`uh��L���}p��+X-�[1�	�H@@��R�'8��5`/�1��Q� @�f��I��u5��W�MA]��R @���
$��9W����3����8<W�<��\�t��������^
* |k���]�*��Im���e���r?��g @�@�j��du��6+���$�
 �K @����ph58������"�$�8���+8�t�����u��ww��vG�r����;d�@C	d��s���!@���eN��-���,�%�I"�+�j7 �L=�uVm4�L(n_��:4O��g�f�atq%��Hc�14�n�E�&�]��U�y����K��!�+cy��Q���3�,P��$�[PWy��D6IuM;t�(�@�AZ�xd����'!�X�8l4�@OgMo���R��'4L�S)8�t���s�������7�/�d2#�
&�mo�ga�M/��r��p��A��9IV���T�S�I"�+�H� @�����C�F�G��-��	�l5 [M�
�cA/Rm�������t"j6��@-s��nA]�
W�*�&���|c�G @�Y���F���V)4L��l�(�"
tu�<�c������Et�fh<�Z�$Y������iQ6Iu
3$B� @�������mJ�&��V/��(�b	�y%i�0/Oh��W
+]�X��� �G��9IV��.�H4i�l���t�6 @�T%��-\�
g�s�H���#�j����6vu^�O��L���9|+�n�@OpX�������mqL�8*m�)���~
re^��9IV�����A��MA]����	 @� @��=�u��}������������
&�moo��<���&�h�����@�:���@��*�s������9IV��n`���W�&������� @� Pg��X���g�VsY�d5j�)�@��nS����VO!]��LD���s�z���pm���9IV���6cW�Z�I"�+�p� @� @�W=} �U��m\���yV>�%����@^���
�[�[�b���+V�E�<ae�U��>��XX����,���!�eN��-���q+M��$��fHu� @� �+��>n��+�yyB��RpX�:��U�@
�*l�@�����5k�r��+Js��P���I��u��;�MA��@� @�h^�$�K�,Z����t�����"���	x`�O�[�}�F �@��E���H-s��nA]!�B}�MA]}�]� @� @�@_*��I���o}�B�-�@���+gL.`�����I��u�9�
��l��b84� @� @�@��+8�t��)��
����%����������I�����1���j#@� @� @��	��]j�2I���v�\9?W�3����8<W�<���V�S��e���������3�R��%k���V�
�u�����V�n�4� @� @���H�g�����aR�<�r�V!�@���|����C�#I�^�2'����i��}r6Iu�g�#@� @� Pv���:!]�G���ig=4�3��{�eN������7^j"@� @� @��8��;����E+���`#��6�N�A�N��nu���N������ZE��ZQW��/ @� @�(�@��.y�f�j��du�f,y�l��
9|M� @� @��A��9IV��>��� @� @� @�TrW�2@IDAT) ���i @� @� @��" ����s	 @� @� @�T) ���i @� @� @��" ����s	 @� @� @�T) ���i @� @� @��"0�/';���>�h�����[�'O��#Gv��v���7o^<���������)Sb��6�u���<uu�� @� @� @���hi���d��5}j��f%���W�a\Wv�\sM|�#i�b��9q��G��e������nq����.�����S&)��\��z��������_��Jq @� @� @�@�j��du��aJ�X�"��j�8������O�P�QGc��I����+��r�*��~��1c���������/����?~���E�2Iey�uhL/_d�DP�K8�	 @� @� @����2'��v��~�&�-,O;��nk����c��Uq�%������rp@�s�=q����c�=����*��g�}"O]I9 @� @� @������*g�~�����5kV�}����;��3��I
�-��B^ @� @� @�4��u�8I�v�y���Q�b�������C�m��K/���=���m_���3y������LR6o��� @� @� @�@�	��aL6�|�4hP��;7���
6�����.�����'���.]#G��t�������/_y�$����t1; @� @� @�B@P��0l��X�bE$����=���q�����8��S��H��[�vm�+&���G���LR6o��l_O=�T_�p> @� @� @��&��n�j_�1"
���u��,�����6�l>�`���i��n�m����?��O.��������~��S&)��\�yA� @� @� �0V��h(��~W�k������U��,�G}4�O��v��u��u�]#O��l�rI��>6�N����| @� @� @��+��ad+��Wu����s�='Nl�^���:*-�����h�������dE��Gy�$'�-�v! @� @� @�
%`E]?��?��x��'������c�F��n������t�\rI�N:���������/Oo���>�D�-^�8�;���r�-#O������.n� @� @� @��Z��&�?cv��v�mV���k�������}[����+.���8�����%�����|�#��c��A^r�����K��`�2y����������aNT�� @� @� @����I��u�8C^~��X�jU��n���=��b����N;��m�<e������B��&��� �	 @� @� @���
�2'����v���c�$���R
 @� @� @�@�j��du*�� @� @� @�(������ @� @� @��l�4� @� @� @���r��^ @� @� @�L@PW��\ @� @� @��r��1�zA� @� @� P0A]�Ls	 @� @� @��! �+�8� @� @� @�@�u0�%@� @� @�(������ @� @� @��l�4� @� @� @���r��^ @� @� @�L@PW��\ @� @� @��r��1�zA� @� @� P0A]�Ls	 @� @� @��! �+�8� @� @� @�@�u0�%@� @� @�(������ @� @� @��l�4� @� @� @���r��^ @� @� @�L@PW��\ @� @� @��r��1�zA� @� @� P0A]�Ls	 @� @� @��! �+�8� @� @� @�@�u0�%@� @� @�(������ @� @� @��l�4� @� @� @���r��^ @� @� @�L@PW��\ @� @� @��r��1�zA� @� @� P0A]�Ls	 @� @� @��! �+�8� @� @� @�@�u0�%@� @� @�(������ @� @� @��l�4� @� @� @���r��^ @� @� @�L@PW��\ @� @� @��r��1�zA� @� @� P0A]�Ls	 @� @� @��! �+�8� @� @� @�@�u0�%@� @� @�(������ @� @� @��l�4� @� @� @���r��^ @� @� @�L@PW��\ @� @� @��r��1�zA� @� @� P0A]�Ls	 @� @� @��! �+�8� @� @� @�@�u0�%@� @� @�(������ @� @� @��l�4� @� @� @���r��^ @� @� @�L@PW��\ @� @� @��r��1�zA� @� @� P0A]�Ls	 @� @� @��! �+�8� @� @� @�@�u0�%@� @� @�(������ @� @� @��l�4� @� @� @���r��^ @� @� @�L@PW��\ @� @� @��r��1�zA� @� @� P0A]�Ls	 @� @� @��! �+�8� @� @� @�@�u0�%@� @� @�(������ @� @� @��l�4� @� @� @���r��^ @� @� @�L@PW��\ @� @� @��r��1�zA� @� @� P0A]�Ls	 @� @� @��! �+�8� @� @� @�@�u0�%@� @� @�(������ @� @� @��l�4� @� @� @���r��^ @� @� @�L@PW��\ @� @� @��r��1�zA� @� @� P0A]�Ls	 @� @� @��! �+�8� @� @� @�@�u0�%@� @� @�(������ @� @� @��l�4� @� @� @���r��^ @� @� @�L@PW��\ @� @� @��r��1�zA� @� @� P0A]�Ls	 @� @� @��! �+�8� @� @� @�@�u0�%@� @� @�(������ @� @� @��l�4� @� @� @���r��^ @� @� @�L@PW��\ @� @� @��r��1�zA� @� @� P0A]�Ls	 @� @� @��! �+�8� @� @� @�@�u0�%@� @� @�(������ @� @� @��l�4� @� @� @���r��^ @� @� @�L@PW��\ @� @� @��r��1�zA� @� @� P0A]�Ls	 @� @� @��! �+�8� @� @� @�@�u0�%@� @� @�(������ @� @� @��l�4� @� @� @���r��^ @� @� @�L@PW��\ @� @� @��r��1�zA� @� @� P0A]�Ls	 @� @� @��! �+�8� @� @� @�@�u0�%@� @� @�(������ @� @� @��l�4� @� @� @���r��^ @� @� @�L@PW��\ @� @� @��r��1�zA� @� @� P0A]�Ls	 @� @� @��! �+�8� @� @� @�@�u0�%@� @� @�(������ @� @� @��l�4� @� @� @���r��^ @� @� @�L@PW��\ @� @� @��r��1�zA� @� @� P0A]�Ls	 @� @� @��! �+�8� @� @� @�@�u0�%@� @� @�(���rt�y{��?�!�����l������c�6/�� @� @� @�
$`E]��}S��_��Gc��qq������;��C\|�����&@� @� @�hPA]�L�f���q�5����3�W��U��g?�	&�\?���+��8 @� @� @�@�Z�]�5�nmM�����h���KW�%c�p��5jT��[o�5�<��8����n�s�[Z���D�)U@� @� @�L��9IV�u�Is���K�.��?�-�K�q�1t�����;�� @� @� @�
, �k����i/��Rzh�}��Pd�����.��k��fed/ @� @� @��' �k�1���d5]�9rd���������W��t� @� @� @����8M����}K��]���dU]���;�bGv�*Nu
 @� @� @�@7V�u�����v��y��/����_=F���
�t� @� @� @������q�"wK����}��G;��r��x���c��I�W������S�G� @� @� PA���
@�xx�����[����/��R[������O��}�m�
 @� @� @���hi��t��UT�Dx��s��/�����{����K�.�+��"��j�x��c��vj��k @� @� @�yZZ���u�6l���:+������+b��!1}��HV����/J�� @� @� @����5���o\���^����.��r��'*I� @� @� PA]]�]� @� @� @����nP�C�? @� @� @��z����� @� @� @�M/ �k�)� @� @� @���z��& @� @� @�@���~
 @� @� @��������I� @� @� ������ @� @� @��! ����k @� @� @�4������ @� @� @��z����� @� @� @�M/0���B�������GKKKL�<9�L��e-y�uy�������c����w�#���wv*�l��H������e���w�6�l�����4G_x��tvw~2�����t�&�4hP�s��>����k�m� �N���fRt���q����;���Mkwv���~V�3�����{�5;B�k������>o��F���IF��i��;�������R^W+��9��'��7/�|���w�}��6�l�����g%������W����/���SG����[�SE�����5��'�x"�z���4iR�����������7�����?��v�����@�y�Zrr2�~��x�{�[n�e��e;���[�.��>$�3�1":��3fLV�g��������s�y�T<u�����<�SROR��rQe��_�>x��t����m'�'��-����s���+V�]w�K�,�w�1}_N~����@��&lC������10KV������N:�u��
m��-�vB��u��	�5���%�\�����6$m�}��;���@Wy��?��s,����
7��U��I�|p�sO9�����j����v�(P�}3A���{[7~�����:��[���<���� �N����1�k���B��_���k���]-]o�����������@o���������[w�����[�s�=���=��$��=?�Ej�s�=����~��8qb���:��/6�!���N���~��������r]tQ��v�I �\�������~�u��!�|��arO�����<��S���9��<%�gt5�+^P���3�_|������a�%s.��nc���_���
�/g�{#��:���w��y����-�����|�k_�����?j���;����
d�oE�F	y����>�l\z��q�a���g�yf��G?�N8!f���V��\O��������*s�W�y������������x������*o�M����|�3����w����.H��f��w�t,�q�}��+��9�����������s6~�3g����Qm�]^�Nzz�L��>������6lXn�<?+y�{�*H`�@Os9�k�����~x�|��������~�����cy���W^y%�����o|�1c���������/���������w/���J5��/�@S	$s3y|��_���(Y�����\�s��d�����X�hQ$�3���*�du�~����x������:���@^��s-�3D2.\�������yT�H~oI��/Y=}�����'�>�`�q�q�i����$+9<T�;�?��O�C=���'����d$s9yN>�;��ccc`��������}�KB;s�����o;6.��P:��!��}���\�J��E������:����*���~��#���<�g��Gy��j�@�4��a8�j�H�jw��U����������?����y�������0�u���GqD�������+���pK��8�+��plr���	�e�f����Zw��+W�����gi�������Sw���	d=�ofe������n��N���gE]���j�{�.��T��?@H��Yg����������}�"������_�n���d�g�G��o�?y���Q�g�������X�	��?�C:������[n�:���/���������������bI,(hT�h����(X�5�Qc�c�`���Q�]c/�["&F�(E����'�����e���{�����}��yowggfg�3��w��s��#�i��4BZ/�[4���<���SN9%X�x���
���oC_��QW���P�7�YT]��8��s�9E�@ �@����"�f�u����-\q�����U��[�2�^.�GD��B��Go����%�I����������������)x�v�9�K@�}�7��G�r���kX�O�U4
qy����������G��a��1a���p.�f2�(G��>���
tP�%�#)�o�VI�_]V�SvV^�;7�r�M#��G9?���k.���������+^���@��\�Z�x/�B�<y�u�]!�Y��<����E�n+=+���S/D$�D��np���B��W������.�8�$'@�Zy��4��E!�A�'N�����E��Es�8�uQF:-��}Y�s��?~|������q���w���{9I��z	�h���ns���8p`fqy��yX`���x�����;���
#E�8h��l�\e:$0e�w���:o��y
�`�!�6����<�@x9�����&�����:7b��m��I��W^�m������If�e	T�G�`Vf���4�V�"��_�����*�I�);�@";
�J�M��`�nM�j��J���
�i�'��/��2�-�s�^��&����LF�|?�^����*����{/L,���E�yu�X�����Y����um�;��N:)�d�J���I���DPf�4�������>����������=z����m�����n����2MH�@5��5�)�q=!��g�yB��`�3#��
*��/�T�L�z+U�k�9���������]����|��i�����5���0��[��7]5���s/�����L���YbB�*v�U��t�>�K/�4<+��
[��}��%7�7]<�_�S��m��Z0+I���Vp����{�������>����z?s��i[M�>�
^�\�����~�����S#�����L��>=C?�c�Ry�}V���d�8�@��,����w�m��'u2S�j�~���E��r��Q}�����������e�8-�}V,o��|��f�On���d�/�����+S[/ 	���?�s��;$���C:3
x����cL_�i�_�@���,�����o����t�R����{����+x��?������s�����n���=��� +�-��rA�k% ���JO�������y�o���9�&/��"����7���!��i��s�,��8��SGqD�!������[n)�������:���z��QtRPg��������to��f��.�'O�@��QM��7��3�\������!�~�����o�x����j-;�0";�<��$�j'm�yV���d}8��T���/���-�U���^{���_�:�c�I�L��~?�^�D����Qo"����JJ�&1C�o��P���������I\�$��*�o�>��{���\�������Z��M	,��n�[�-n[p�^�� �����U��������<>��<��`����c����f�m�V� PD������z���/�c�����/s�p��y/5	
"p��g���5����7�
��x+q�a����g������x���#��6
�1��
\	���/�<<;�U.TJ�_����?�[j���&��_~������q���{��!��.(W
�A �@�>�}c�>���;g��<����8���2+�RvVY�wy��I"�#,�F]2�����'����e#�	����/|/h29-T���{9�"q�H�����/���	 �����:����k[��R�;?~}�!`������0*���?SH.��i�z�0�N8!����6�8}��s�=�z/�
��<}-�T5����i�����#��x��",<��������iE�"y����������;�ex�����{w��w/*���o�`��F�m���������q��Qfqy���Z���\~���s"�����?|�y
�C`�]wu���M�<�����*I-�R:���/��7`���2'�`~UP8��[o��f$�1?��pP�r}�O�����t�A�J�}�w��=���n������NK�WK�����}o^w�u
e������
��M7������:�Q�;5-x�&!:����4���C ���U��k,;o�����+���3l���}V�e��Z�E![��������/p������m~�����^y��P��������?�I�y��cy�Z3(�=^�6�`��r��~z�_s�5�8v �E o_;v���D�����Cq����?���z���K�)��r*:"� p��7�o��7��yE�����S�>���~����W_}5�.�h��W4%C`�#�5��0.~gO?���+�&LH�7�7#��!�^����#���2�,���Mq����.�����x5����WXa��x �F o����U���Z����jm�}Y������m���
�F��M�T*;��c���MK�g���y��j�{�:�����/kQP2�����vK,������t�/W���{9I��j���~e{(��|���&W���_�������}��g%*��H��/�,��5�B�-��A�t^S�=��QZ	�$�~���������:�WZ���o���B��q�!����v�%�����4y�ZZ���d_V�<�@�<o�;���h�m�M��%��e��)���KEeha�73�$�SH��<��^.B�Ax��PJ����%�r�g@il>:~^�9�}��v�h��k�48��S��!�v�a���8��1bD�3oR*��<��x�
����;o��y��a��z���a;�����S��JtX5)-%�c�
7t���{����4Kg��B �@�>�7|��p������Jx�%�8o�*�Q?��C���I���OtZ��P����� �wc���e����2o:-DiE���C�o{�(E�e%���T���.l!'PM_�s�=���[o���m�M��Q�F��2=������~��o��%����Li��|��Ac_���[�Ih��4o.;h%�r5�J�w�}�T{���Pg�UV	+������I8���+��a"�7������*�~y�����{����5;�0��7�D���w\�*Mj��YQ���C&���T�k�����w��zk8|������?s��w�^����������k��[l��������>
��-2�z��@�<}Y����B6-�7��Nx?��X��)$���*��w2�e�"4���)� ���u�n�if���%�r�g`��I���9d-����[�����������h���_N�2x�X��P������X� ���'`!O:o*�����������Yg�����	
^�)\_��m��
��`��O>)J������������T?i}���N���=+�������T���-�!���T~��������i}9�����Ws������+�����"���^�z��}-�/�d����Fvj$���Z��2EA�����������c�������dl'�Y����eg%����5�^���ka�=�(���Q����p.�so�2��/�.x��;]~����?��x����/��G�����5�$�����������Xi�������
^(X��$�@�@��,���PQ����GqD�k��"�}����|K�-��8
��!�o
�k�����*�K������0d����{��q/0,x��e�����6���XP�d���Dl!������ /��DR)�Ly![�0�*����s������aX�D+���N���MW���*�Q+B�L����/D��m��li��Je'�s�J��7+����y��J�s�(���0.|H�.m%zV_��+}?�^���H�E�R���;V����H�r}���{V,
[�B����v���n���$����r*���9�[�����{��w�oK�O��Lh��������0��5��f��J�����W���7�����fy	���z_�{D�M�j������:�-���{�H����L�~���aL/��R�r}��3�k*��At
}��2�,����&�w��&�X@� @� @� �0A��i'�� @� @� @��KA]s�R: @� @� @�R	 �K�B$ @� @� @��KA]s�R: @� @� @�R	 �K�B$ @� @� @��KA]s�R: @� @� @�R	 �K�B$ @� @� @��KA]s�R: @� @� @�R	 �K�B$ @� @� @��KA]s�R: @� @� @�R	 �K�B$ @� @� @��KA]s�R: @� @� @�R	 �K�B$ @� @� @��KA]s�R: @� @� @�R	 �K�B$ @� @� @��KA]s�R: @� @� @�R	tI�%� @� P'�/���u���u�������Y	�;�=��3�w��n���k*��o������n�M7uK-�TS�E�� @0h�	�� @�@�%0u�T���w��u��-���n�fp�m�]����6g�u��s�9]�n�����^�Z/�U�Z�#_�|��7n�m�qp�1bDC
~��w��q���_~YR��G���w�����#� @�"�F]��R. @�Z����~���>����w!m)M�K �����3�8#��q��k4�R�M��n��scy��O�>n���
E�k����*������$p���\�;R}��f��fr��3�[{���[l�;R������^��
TR���#��w�����9	�����zR����~����Z����s�\r�2dHQ2	%`���{�{���Zh���@� @��5�*eB� @��	<��S���/�uGq�[a�r�M&�I;��[f�e\��=����c�=�
�B�?~|K���W���cd�6��u��MG`TO����*���N��:�;��s�FX#����t�y��[�c�9� 2~_{��������(	����.'VW^yeo;_���e�]��N���v��NE�v�!������������F� @�@�	 �k6a�� @�@�`�B�=2q��[l��,i��
6��}����_�~����K���'?U��wwk��fj�����:Z��m����^�z�V1�'�������*�&E��f����$L���O�`Jf�u'���2���Q���I�_|1��V[m�f�m���^x�7j��p|��'������.��� `;��#���.����&L��f�yfw�5��g	W^y�����W_u�G�FPg`�B� 4������p@� �>����>h�4�NLa��ku�Qa�=��]Z������m{�������������16�����H_O�q�Z��v�m��F>���h��c�9&7�y���m-u�%�-��e��[<HsN��i���;��8����3���/���=��E�:_t�E��N8����*����u�SN9%��������[t�@� 4����.�� @� ��>����?������
�]$?TYAu�)�j����>��c'�k�
���Sm�FU{�j���z���4�������;�C5A�.��)+�V��jI�N���>r�&MJ&+:��}�*j�:����m<��IkNB:��]��%�\2�O�<9l�o���n�=��=���f]����;F�/���h�@� 4���f��\@� tR�>�������o����s�|g���;�h��������Z��� �i���=�����(M����p���n��+���-��n������z��������L������(���i�_���{�2��$AJZx�����Yd�p��;o����v�mEY��/���[oA���u��-���J+�}���M�2���l���?��7���?��!�r�-��Z����*"��MZ1m�(�/$X;��C#���������^
�+W�xJ|��w�8���/�K'����#!�����.���!���yE���J���0���i��s2t������~�^~�����������s��f��n��!��1��s�������E�b;z'�9��z(v�]@� 4���Px�@� @�������@�;���8�Dx�Oy�~��������'��w�y�5�0$J?~��(^;v�O�G���t|��������������4hPQ�:8���S���
6,�W�})����(�:�.�h������a;g�qF�k����B�(���,����������k+FV/�o�0�am����.^Ge�T�7���{����%�zASAmg�+��~��|Y;���#^�.������rQE<���{�{����$�{l8����9��ZXp�C��_���#/���V��~��{
�J�9@� ����|�y@� @�~2	y����94b���=$-����.������[|���9��L���4�'?[��zk��������I'��d�PZf���w~�?hA���>8��S��&�{�|�����@IDAT���`�R�x���?���]P�
+�����o~����r�=��P��O?������g?�����WU�%�b������4��?�x�����o��&��F��o���7.����wE����Zk���x������{N�:dh+Fv��7�pC8�b�-�RK-8I+.oP^0�o���N}s���ACT�I���<���p�
�"<���v��s�=������g�G�v��_B:iM���[n��eG���9�?)���jn�(*�����J���-��g2HQ���_f?��_m93y�mN�����H�6�����Y�����K @���@X1V���L� @��tE ������ i����"�!&L��PN>�d�����W�o;��d��'�-�dk�M�GQY�-i��5*���u�Y'���K�J�G�)�x�%5w����Z���D������s:�s_^��q�w�����;��3:��v�E�q�>%x��'�4�j����A�FJ��~����m����
h&�x���i�Qu��K�1~Ne\t�E��3�<3j�}��'��f�\s��I3+�_������������N;��pO^�]Xz���zxS��'�|������K*�c�`F�t��7�$�����Fi��h����n[�y/8
q�O���M�F�]z��))�� @�@��u�� @�&��A;H�]�?i��}�������oN���!�g*~���r��o5��3��v�-:�F�5�\S��J�rto���f���x���5l�|�M�������
��U%-@^�v��|���4��A����l��g�GW����N�zi�(l�P+�d���4����><���<^8g��6�"����NAx� ���l�I�����_~?�k���w����6�e�]4��Q}R�V]u��r:b_.�d,b����Q�F������������������
����rT��Ji��x���
��[w��'�g�}6�O���L�%!� @
#������ @� L�d�O�� a����(�G����o��x�����0A�(5�����o��Kh����C�7�a�E	q*z_T�S%�2I��{�9	�i%��6L�4�}���!��7^����x�2�� 3�&�4���+���u�L{���	=�"��(Y7������k����k�'?��� ������{m�h�k�E��c��tj�j��i�^x� 4���PY2��'t��\���u�+��7�tS(�k��Yg�5�{?��`v��!A�.��_}8�v{?�!n�m���mA��`@� ������ @� �R��j����U�����
0�}���������b�-|��_T�`)W�5&�$�O;iJ��
��\���t�����_��p�$a����j����v�+�SA~�,�k��4��:��f���Z%7N2��&�3���� m�%�t���ZhW	a��)!�|�u�]���QWK�/Gk�i�J�O�'o��ys���v���Z�#%x� Av�`m�|�������
>#����HH'-Di�J#��^ptP�f�i���.��;�@� �f@P�L��
@� �NF`�u�u�h����k�u���p��C
&�d������ �����mLJ��?��O�4�$��v[�!n����r�-�\fQ�F�pF!�A���,�EO���#��L�J����s[o�u0�(M��_~9TO}I]���6���z���|�����s�;�����ew���_r_��=g2��'s�1A�gCfng�}��R^��L�)��%�7n�[c�5�������F��� @�@
���,� @�@6iz�5*h�����g��jB��SOu��vZv�rF9�o�}�I����w�yw��W������^{��k����Gy��n;w��F��vZh���|�O�;�2jo�^|����;�J�42{�����k�����'�;�R2�z����1+���e�_�VZ��y���_��3��.�o:���?�IX�|H����UgA�'u������� @�@#	 �k$M�� @�"��;�;��#��_���
A�h������N\S,�^�O>��������8������T������{���r�-�zz�!�-�JH�������[~�>���P�_|�I���+����,�����w�O�Qu�	Gi�*HPw��GF~!;j_.w�q
:i�U
�|�I0*�������_��2�EU4��f�.Q�@� @��fl`Y@� tb2Uh���d���Rd�1,^�2e&�����'GU��`T��y�u�E��v*��\s���|���]��-Y�Lo���+DK�3f���$
m��f-!�k�" �p7�z�y�S��Q7h� '�7�����I[���&M�|*Z��/[����/�|�n�";�)������e���i���YN�:5*A�T2���K2�� @h��_��(�2 @� �����h��N����"�3�����QG4�$��2e�;����M�����El]t�p,�5i�=�������r�Dj�`B�����w��~��Y�g�}�m����i��:��/��0`��`�_�����1�I������{���w�����!C�X�`�p������)��/����;�d�����o����#�J+����p�9�$�������N�����m�92���}9�dbg��W�b��mQdlG� �&���������G��3M�+��2� �sv>Vd��I��v>��}@� 4�@��-E @� ��N���������������^{���(
^�����U���u�EE�N8��(�����x��;�����?��g�}��J���4o�q�9/<��Z�l;��sG��p�(���������Q9Vv����_V��)������o���.xS���l�c���x�v�a�3����;�����������+w��/����@[T����^��o_/�.��u��m�0��O��z��p7*�������/�/x�~�^�]v���i������'!��[n9�Dl�k��]��VYe���k��R��������P���,9O @�E�~�Q�I @� �b���'��#S�f�a�H{H=;����m��BJ�R�m���=���n��.*�4�4���B�.]��b����4I����)O<��U|<X����I�O�7����O�>A��4�����y�Ki��s�M79/\�|�Y�_�3>|x8o���z���N;��dB��|<�=���4��f������>����������i&�r���H�t��Y�������e�]���������n������������K.�$�����������B�Pz�d�BG��V��V���La��	�]2���?����W�5*-Ixv���N����,h��'��c���p2�q�������t%@� @�����a��T��}M�� @��c�����|��1��k�m�[����#s����5�����"��:��x(w��y
�`NR�����7��/��"����0t���^���	"e
SY���r�������K��}���3�KL��K ��K/!��f>�����8J(�7�^�~��)^V9�����h���.���9��X��9����n���n���so��FjJHi���n��m���Vd�6O�X��������������#N
�w�6l����2�+!���]�\f� �#3�����i������=��r����NM��! @�C��"�kOJ� @� ����{�-�������Q���y��H|��O�4p���� X����A�y���5�`���m�%)� @�0A�/;q'��!@� @����� S�2�:f�����ME��:t�9rdH#�I4�����?d�S�^x�$M�#d�TB:��$@� @�-�Q��� @� t
�����G~��n�[�n���C�.������Y�/�d��T����A		'M����V]0 @� �B��(DA� @��!��O���n���%���MZt�rH�-���+�s��>����O�b�-��L�������=z���?!]�is@� #�F��`@� @h0������\��]]����L3���+P @� ��L�A]+�u� @� @� @hY&���e���C� @� @� @��	 �k����� @� @� @�KA]��5� @� @� @ha�Z���: @� @� @�@�@P��mG�!@� @� @�Z���n<�@� @� @� ����n�Qs@� @� @� �&������C� @� @� �.u��v�� @� @� @��	 �k����� @� @� @�KA]��5� @� @� @ha�Z���: @� @� @�@�@P��mG�!@� @� @�Z���n<�@� @� @� ����n�Qs@� @� @� �&������C� @� @� �.u��v�� @� @� @��	 �k����� @� @� @�KA]��5� @� @� @ha�Z���: @� @� @�@�@P��mG�!@� @� @�Z���n<�@� @� @� ����n�Qs@� @� @� �&������C� @� @� �.u��v�� @� @� @��	 �k����� @� @� @�KA]��5� @� @� @ha�Z���: @� @� @�@�@P��mG�!@� @� @�Z���n<�@� @� @� ����n�Qs@� @� @� �&������C� @� @� �.u��v�� @� @� @��	 �k����� @� @� @�KA]��5� @� @� @ha�Z���: @� @� @�@�@P��mG�!@� @� @�Z���n<�@� @� @� ����n�Qs@�������7�4�

@��=������>��\�69���&�� #������_�P����,���'o���r!TC��X5�H@�9�5�+�B-L����rs�9������8qbK����S���Ru���^	�<����<��[f�e�{x���v��3��o��j+��ZyLh��S�����_��;�����?�V���p�}�=��N���{[T��_������UW]��t���v���y��Z�! �#��w��(�>��=6��m��a�h�V����@��|��; �6&M�������6d�7h�����/qg�}��q�������]v����:��W_������5\����_,���;w������a�q��G�3�8��0����wk��F�����%�TGy�{���C���8�����L��}���n��aN����+����;=��y����~��:t�����E�=��]tQ��w��'�L3�T���o������o����I�z��F^����9&TU�E�o��9r���t��gw{���[w�us����k���i��y������z���;������Ns>���o���
7���F��G}��\rI�����-����u�]�(��e�����+�p�4��N����/U��}~������[1����R�<��o��������0O�3��9!t,�9�i��i\��cs���^�G���q���9���l]�Z���9Z��VC��J+�TVP����:M�)�����"�k&���z��{����5RP��c���!M�?A]3���7n�;��s��*S|����9�w�DIo��Fw������Qd'��y����]|���"2�~��_�Q�F��V[-#E}���~$�����?���l���O�+���������-��={6�J]M��w�}Q*i������e�L�<9��q���/�xX���������G
�{���!f#��������o�0|�p7f���U��	6j�O	f����q|���o���7�� /�[����� 5`��	[�L�9��}�2�Lc��8��3Ah���
� ������;v�����[s�5R&�tif����?�Ib�<*��Ki�A�\i��G�����H�=�
+����_{��Hx�I���Z�|��AP����/��B(N���;���5*�`Gl������]�~��A����n���4%G��K�x��%���:#�f��*���\p�`������Y��{t���K]#���^8�F���q�w��](���{�������Q��iA�%�������[��wq{�U���������h������ �K�B �$p�QG��2}I��	����7��v�e��7+;���S�u�Y�L�O��������H[S��^|��0)���3[Z�+�q{��wCa�����<���$�n���(�&Jl[GQdm#�[�������a�V�\��}�N�wh5���|����{a���k��)��B�����+]����z���x�W��.�I�1��tSv����\��Rs���Q��#@�O������~� P��s?uCG������-��V�<���j5��a�\������=�������������<�;j����>�|���������VA~�,\z��������K���[�U�j]�]�Y�!������Pk���NZ���<�2�,Nf��CT~�,H9e�;��V�^���c�92����u���~���P�4�?iQ�&e[�D��������-*�s� �F
2�Zm�������!�c-��Z�U�}���fI�o�j�k��v��W�}�s/����������\�Z�V���k�\�Y��1]�3�U��ki=��n"�@��G�{n�io����� 0����>���a�)%��S�$SM����~��RcJ��R��cG��8�@�&���c���@C	�%��s�}�]���N�"�����;��Yd���}�y�u��]}���m����K�S���|�h��V�����]t�E�y��5K�����������p�	M�����<�����n���
�(���m�����+�jE[iN-�������3�Kp�W�%�X������w�����������[�IXg&�k+a���djOA��'L��������K�I&�*�m�0P��7�|IH#��iB�x���W���g�y�q?��O�K/�TtIM�~��n���s��6[x�#j��Wv2;c���2W[/���-��e�r��a��#��SO=5q5��>!s�j�����<������:�M�2\x����?�y��Z��6�x�������N��Oh�Q�����Z�����{��p�"���u�g��������Qs���Z����i����N;E���H�o������s��Zh��!-�������-����x���O�OY���z�o��&�������.�lQ�3ad<o�����w�O���f�m�3�1B���[ma�}����o���?>y*�5fz�h�����^`����{�$_G�(w�������/q����f��,3���}�}2Y�z�����m��r���3����U���>������T�~7�;\�)�|nW��v��6P��C�=���[D�� �$`:	��o�I�����u�9��<ca����1'cN:�Z�L|�������66i\���w�v�m��+<-�2n���4��A��|��`�L���oRa�C+���K^h������~�;���*{���>;J{��w�=����s*+��O"����W_��������Ey�.J�?�
~r::�<���E�����QY���������gQ��5m���.�����z�?>����T8��S
~]j���S������Y�r�Je�>�WS���3�8��O{_a�������_���)����__���^���7��Or���|����v�a�����>���P���r[/X�l�Z�e.���8
N��UzO{T��~1��B~�����x�����>��:��Z�?�]�K����^^t����f�V/H�������WTNg?�k�������l&/�
m�'�^�����c/�*��������p^����(-���'���$J��7��o��|��/D*x�[Q�d�`��m�B��:��+*����g���%)�6R���4�6�U]o������f�m��<��/�L-�:��_K�6+�������r_&�>��
�z������[k�[yy�d�c����m��3�|��e��/�����]����Z������<��if���Zkhl���2��������VI�:s�'
��9�r�n�B��s�����y�1g%�czw1�
��h�wJ����w�)�5�;���v�����F�[���;"��@�����:O��������5W)M;o�������j+�����a5�V����i
i��V�[�i%�DU�j��n��M�8����`�J�����%/�*�����42:� '��?�����G	������o;���{���VR�p�
�?�V�+�	-�U������:?b��=U[������*����*����4%��� MJ��H����q�!Z}j�L&):V�_�����0�^�����Y�m��BZ�����+�����@}K�Ta�-�p^��V]u���"�?i���.7n������fH�����'
i^(�[�PH��x��m����{��W^)JTK{I���s�
�4�����w��.����9��2U�A�z�
-���?�/�xTg��
�Z�cZs�����*����L>�a������S�4��:�����W{�JKT������V�����/D�nI��L���_�0i��]/-j�y�q��C/H�4Z�-3�^8a���g��F��[���v�8��p���:�x��N�D�\K[VA���
��9�q5�F�4����o4YJ�F���J�j���i5������������_l�a[o�W��Ek�A-m�o|i�i,��o�T�q'M	i\*��ei�������<e�����r#G��:'��z����*�Z�@�_�*�,f��'�|�[~��C=^��J�����-'��J���*���<cNuca�cN=�J�v�1�}��X�o:��[f�e����~7kK�u��{�����u�P�Fv ���g�[�v�4�_2[)AZ��Po��0x�E����,B�z�_�����
�����!��6�|��$m�Q������kR�&0�����2h��t
yeJi���E������B���>����
7���'�N&t���&,l���a�W����$�����jC[�S�ukVz��� ��A��'���>N�����Y�������.EY��N�
i����� ��O�-��2J�gE���G�w�uW0%��hIe%���2����$�V��.	��i0�!_P2�i�Q������<���`�G�a��&D?����B�ki/��w����\��_	���q YF#�5n�.
���K���Z�F\�#�!��������u�	����\9Z��g\�I��4��}��l�5�\��;K��^�2�HM���5�X����}_�l�Zk��S���I'�T����gk����c�Lq��P	���HAfXe��c�	�2�g�U�Pp������2���8X�����;��{h���m�
j-���N�7��Mx�_��'O��X��kVJ�E(2���kf�����/�O}�LDK�����j�r}2y��c�mQC<��.�mV��me�^}H���J��������������Y��J/���)h��=��G�:}c�{�R��
$��o&-fQ?�������{$�T��>/���5;�:��'8����%0��`�5�XXf�I���y��F����R�En���P5�~��|��T�X��@ZT�o9�Z�
�e�Z�u��~����F\[��S�{*fM��|������|2$�t��s:~��7-:�����2����$�'���/����[X��H��&��� �*���d�s	U���LR��-��be�(�4��BH���Q]���R��M��&rmrT�f��t�����,�VTZx��'m�h�|iB:%��sM|+��r8��4�����CM������z������X���T*����U�8P�N8���4��BH��.����y�}�7�&j�>�����������#!��������������-~?���dN�j���n�-DR���B#�eeU�������q!���f��7y	HM�����7�	��W�����B#����J[	v�%��S|[�F�}�>��^��P����Vv��Z�Vki���tV��+M�*�+�y������N�z��EY��jm-��o&}��N�x�?��P�t�<�m�t��'��
�>��@�km���;y��F��rc�xX�b��o(��� +���P������@�@k�����������H$�	�R������jkM��	Z+����F��
��V6����V���V,ir�LXZ��m�~�J��R]X��n�������R&��Vb�22�e�Z����=��k�	�Z)�*�y��D>��3ae��#��-��4t�,��D�8L�4)�LU�K�,�?W��i���J�q2�aA+	��&�����7�^�k�9n��Q�����^���V1����i>�t����!�����y��t�$�j��wl��3C)
�>}�D�i������[)��c2���5����H�(�mb+����o�Q}+h���X��u��oD���ne���N;�d�E[Y8�Is�e�h,�/o�8��M���g�(*O�u����o���N�n�r�>����q�O�t�8�����'��S��f�u�����{#�V�.i~{?�a��i���i���u��J#��TYy������Sif��F�ida�
y�Vv��������C��x�5����9��:��SJ���$�F��rce|�[Y�S|<}������r�AA]k����z/9w�h���d�2+x�E+R��d���N��2�v���	�����%���7M,���>�dzH+Te:��	!+�uI
=;_�6�Q�`J�%84�����{jT��-g��Wm�:�k%���C~i�3�p�If��W�3� ��L�T
2�aA�_�gP��B��m�$���O=iS���|Mb���V ����6���F����f�j��&A�&��<���
�4���[o���4�A[�W-�@5mO�R��v��l��<�/U3M�h%���=�	x�ii�������21W��fw��IS��&��"�dHj�%����jQ����[Fz':4:����^Q�U��{T�%�YA�5�I)�+-��7�-y���~��8ym	EdN��'��K3�|n��o�����y�CZ���Q�m%��w�������k�t��=!��W��)jmc��*��_�Z���y6l�:����
�1�P�XX�*_c���@5)U���xu*}�����;7�����q�7D|�������j��;����!P���u���B`�&`��r�j����kF��\��Z�	5i��#��w�M��V�?�������JC1r�+s�<���j���*�>����P����7.��Zu%7>���]z���`�����&�����3�pZh�U�{?���V,�LM*��E�f^ZFMX�j�K&��u��X���F�����i������]/�m��QZ	&L��B�����G����&�i������h�Z��#��Xja����6�4�����~������9��?<Z��I����re����o2����[Wm��]�����=���E;��B�6���l�ze���G� ��fwG��Q�Hc��m�{�B�,+�tm�=���X�_�@����a�}����FZ����U3KY)_�m`��U�iV��tw�_94����%T�_���S������iG�9�X��^������]�-��v(�����h[G��8I�!����f�Q�NF�����kH}�/���$�����!���mTS�G��fI�C�\���r
���kg^N+�e�����
���]�+���N=�Tw�i�e�m���Z��d��LF*��K����$��5����z.SU^Mj�VB��S�����h!�_%��yX�yr���a����p����4i�gY&k�l��H�\�}~%l�ze�`�=���5j����qaj\�����g��:6�K�`�Y����L��z��R�$��5��'k�V5���c��@@������;F�0e6V�t��3*��<�$��Os���!��4�#����� L�E�LmQ�����z�J0�I'M��w\Z���
��wOK��8���	����m�^����t��q����{G����x :�vG�Z
����M�t���i��������t��&�~��_���$��uz�s�P�Z9O2��UC2���50K���V������hJ����J�C��������.@t83v�Q!@�)��WN�M�Ls�)���'�|�}��W��&�w�}#!]5Ek"FN|�qd�c���������-�4���w�}����O2%Hh��+'��9��k��p��� A��F����Zk�����D�SO=U6m�'���d��/��A9����+�LJ�n���**����y,���@�/����V����f�W=�@�'E�BidK��_�ZO9�k^i�����i��%����W�	�v����L�������k�v�m���w�9L��z��n��A�N~IF�]" lv��=*AaR�'�]�v	7��?=��>�l8��T��xZ��	d�{�����i��G}4T@�H�l�B��5��9{������w����$�km�EY$**�y���(#;?��,���s���F�s���S�F�������N����zc�i�Z�
����F�8�u��^�5�V����;����t�0y���jqs��=��LJk�����"m$���7k?��89a�+���d��7�\R
�(�D�����f���!�9L[N���;�+N�t���������_�-��R���9���f�Y>�L3����V��^����R	��w^XM(u�
r���j��f���������u�Y'��<���������^�����0�~����JPS�����X�xDH'`�r����8i����KiN��}Z��?�v�1#��yC|a����&��������2��z��)��&�l�������f�4�X`������[/�i����(���|��A3Pq�f$�#`��������O�^G��qB��d���s�=W��z�v�i���.�X�8,���jm���s�%���OF��.���<U��iL[N�����t�����-���@3�������1'��m�wJ��p��P���=�Xw������u�P���0v4��S�����g�!��v��X������t���!��E���JV5Z�-�fR/y2%�QG�^y��0�:e�w��S�J���+&�5�x���ab[����-�`��%0���M�,�L1I�HZ,k��F$��=�qHl��m��;�i�?	��^AZq�:�C�X�8���9>%�.2C)A����`vIZ�{��w0K������3S�Yi�����B��w���h����=.�hd�x�����W\qEx�6,�%i%��7�q�gA����>���F�W��Z���IU3Sv����w��G5F��i=�	�.�h��&������O;���}A(%`B9����t�k���_�*Z�c��/8�3Wn�o	�$H��:�"�3#!��qM����AT���s��� ��9���,H�U>���n����vVH#�%S�Z�-MZ������q]+�M�8�%]f�e���������\JKH���@A��$�'�#`���;��n��s)��m���	ZPw�5���5R���	�����W_}5<������]�G-��7����X�?23��oY��jm�����_�����h+A��������1=O������'���)0��s����\�����[H�j�������������}.�,����-$�u��
2v�63�hY����oZ ���*����w���)[��-����;��z�AtNe����{����|��'�Wz/+:��������k��N�����E����Ii����R�~2.*+^_o�+5�P�Vj�x~�����Ee��w�sVH�{��l��@����J.�M���r�o��g��4���-���$������o�}?�����,����U�vX����^E�x���F���xK�w���-�������i/��"�/�H-��q�
�?�3��pz���{�:�����_(�Z����A�����N^��Z/����/(xM��t^k9J�����f|��j�&������XQ^;�'g�����eAu��Y[���O�Z����o��k{�F2y8��R�'���3�7�$��I�R��
�����^8XRv3"���n^�=�%��%+^{�o����N���������������e�<�����PT����v���>�����,�����)��,����[�m�;����}f��:��^��y/~�V�7��3B�"`��1��7�=K�&��d2�$�[K��B�V��N���������-�9��
���P�����F�-Z����h�y@��f�m����(2�?�W����i�X��7��������rJ/_.V�i�Hi�m�	+�e�*�*]Ai�e��Xy�����K�H���a���v��sO<*��5e�R��dS�t�Z��s�c/���s���w�sV���m-�m���SM��G�h��
�'��I���P(1Q����U�K/�t�y�EiUI+M�-X��=/J+-��.�,d���_����LQI[O+��y���7��m������<�q���z���d��~���6�)���&L��|&}l����^�G�_���~-���!��C9�����V����H#������d�qi^I�!���&V�������T�[��1
:�O�d}Ie����_�����N�������N��d�Q&����\Z��N�/��L&)����/�4���$�43��'7����nY]���
�q�/�
�\�x/���^@���7�pC'��[n�e���N�iJ3P���~N�r\��S9�����5y,
c������u/Y�*O��-��g���M��"km�r���+me����ri���J�.���X�����,m}��d��]#~]����:�M�>�;��;�H�������Z�@�5��Rmj��%�V��������~��~��'T3����0�b�9I"���9������x�d�\����N�]���[n�%�\�<�����a�B�5	L��|���7���Pk@�C���/�M�T
�`�d�L�����k������IDAT3��.�l4#S��K��Lgh�Y��4! ��������S�N
?����<9R��r�*�k.a����q�v����U��A��Y?���\����kWsO��m���t���2wjm�������&z������%�����'s~J����g���I���E������c�_�|��%��L�(hr��������W<O|��1N�������G}�����q��V��z����3�}"���k������X��y��&��'�v����=Y��z��w�M�'k^���LT��`�����G4�/W������Y����/�0K&��L�z��H���2c���+-�����M��B�|	D�����gI�E��Yr[`�Lg����;g����~�3���X�16^��{����S�	T�{����&��*_���s�x=������V\�O�E��|@��2�/���� =���w��SBf���/`}�d�<u��y���k�:kq��7���5�h,k�>�k���cN��NY��t2��s~�S+�8�F~������[S��5n�������I��2n���$I�!�q	��;����F�� 0���tM�y��a�+���	h���y�����L�>q�:*i�z���z����d�D�&}��c���
���4C�*`��D��� ��0A�/��� @� `��d�Rf�������i+-��C�F&�4�,��@����Cut���3��?�|x�j��L�m���AH�4��@��cN���@��uu#�@�*x�����o&����9*��_.� t&��0`�����+��|�]s�5��� �F�1'�
q� ��@��9\)� �}��	>�\��N>��Ew�!���^z	��)���:7�'����.��c���������n����mg�l!@�`����!@�@��~��@�@������>s]�vu��ww3�4S�%��@�%��W_���'-��Z��=���w@M%���T�@���-�DP��;�@� @� @� ��LP7c�^��A� @� @� @"���~@� @� @� �v ����sI@� @� @�  ��@� @� @� @� �k�\� @� @� @��� @� @� @h��:�� @� @� @��:� @� @� @�����v��%!@� @� @� ���>@� @� @� �v ����sI@� @� @�  ��@� @� @� @� �k�\� @� @� @��� @� @� @h��:�� @� @� @��:� @� @� @�����v��%!@� @� @� ���>@� @� @� �v ����sI@� @� @�  ��@� @� @� @���f!��. @� @� @� �Dh�5.EC� @� @� @ ����\B�+e IEND�B`�

actual_filtering_rate.pngimage/png; name=actual_filtering_rate.pngDownload

�PNG


IHDR���jkxsRGB���xeXIfMM*>F(�iN���������/	pHYsg��R@IDATx��	�������*��D�m���HQ%E-�%��(U;A5U�������Q[mU�\ZI�Eb�-�D�����;7�����/�l�y��=�23�<�y��wg� � � � � � ���@�W�jW�B � � � � � �4�@��M�C��;G@@@@@@�U��/3t��p�m@@@@@@(�@SSSTKR�n��1� ]d� � � � � � �%6l��A��%��J@@@@@@@ 't91�	@@@@@@���+�#� � � � � � � ������� � � � � � �@i����Z@@@@@@�I�]NLdB@@@@@@�4�J�H- � � � � � ��$@�.'&2!� � � � � � Pt�q�@@@@@@r @��@@@@@@(���8R � � � � � �9	����L � � � � � ��F�]i�@@@@@@�����D&@@@@@@J#@��4��� � � � � � �@N�rb" � � � � � �� @WGjA@@@@@@ 't91�	@@@@@@���+�#� � � � � � � ������� � � � � � �@i����Z@@@@@@�I�]NLdB@@������%K*�S-��8
@@@���LGA@@�������9s�+T���^k��j��^x���������y��w���t�pk���M��B�� � ��V�}i��6@@�F���/�)��bf����v��oo���?�/�3fl��g�y�9���M�6m������n[�w������c�����:��1�����}�.p��'��_~��m�����?6�vX�[�z�_�h�������E���������G����.��'ej�=��o��kgVYe3d����������Z � �����P� � �@�	�����+����N?�t3p�����j�����f��f�M61���Ky����M"�p?S�L����z�m����L�Zk-���{�t?�v��W_M
���9r����CI����\����\?�x��b��j�+��e��������W_mVZi%s�E�����Ie����~*�(� � P��j��h= � P�|Zw�u��4��o��>k������f���f�]w5=�P���m����O?m�^{m��v���S-��rK3l����4iR4r)���������{mq�U|�/�KR�f����7�Z,e��y.���]W�y��O������~�����<���WL�#�V��f���3�
2��-3���Q�n���F���;�8��cGs�QG��U��O%�*B@�It5�m4@������hM�FN�Kf�����3�0�G�vS\�:_M��j�O����W^1��u3O<�D�t�_K�5��7�l�?�E+�+u��?����Ek���a�����g����Jp��J�W��w�}�����>r#�=�u�Yn_��"�������:@@�G�m>��� � �@�4�b���y7c��%FkWU2�����4���B�W_}�F�Z��r�����~���/�^V��x������(�yIW_!����g��b�SO=e>��W������k��������|R5�����<g����?�6i;�g8^>��B�U?�,~�|�_�g����}oS>�m|�]w��s*��S�N�XEk<K��S��s@@�!�5d�s� � �@�ht����o6�l3�����h��w�m6�xc���{��7���v��J�k��n�[la�9�����&�
w4���?���n�.]����
R�t�����j�<
R��v�����A�9s�<x����,pE������?�����[n��l�������������������/~a�bV^yew����{�����������M7�����;�c<���c�=����jd��3f��6.*T��L�o����w���;�UVY�t�����w���G-�:Q�:�S�N�C~I��@�T����������<�/]Ra��6r}��}z��'�����j��fz�������+���g���������m[s����Gn_���w����l��|��B�3a#�q��lZ�Pk��y��L���YP���N��|�|�>���G��N:������W/��k���3�?�C����P@���N3�n��{������oM��?����#���,�U�����9��y�Z���md@@�L����	���@@�,`QZ����z���3�8k�\��������X�w����
*%����y�e}�~���������N��U���]vY��Y���
���	��Q�/�v��n�6�u��a������_/����
����@Nt�o<����A����^{%�����{��v��=����6aG�D����j�AL*W��
���m %cq�&]7~�������v��K/�4a��i��^S����%^�]+a�~m�4aa�,n���y������
�E�����o��F����#6������3�<]Cm��ry/��<���m<��3���.��o~��=z
���seJ�Q��U�k�:����Y�������
��p��O���4n����1k���>s��J�m��)z#�P���A���#��z6��TA��Rk�SR#�A@@ ��!+�
=�tV�� � P~���>����������}��6c�m�s�=��/���� :�����s��7�2�>���t�z������u���w��~�/���M��i��>�hs����?���f���w�TO1������.2��w�i����j���s�:�P����=��32��?��O���n3'�|�������Wo�J���o~��_9h��F8��F#�����~���l%�����s����w��\y��F�>�p�	��?���>O#�\����F�������_3|�F���F=����i��gV#��4bH#@5��
^�cUt�w�w�y�=�zm���o��R��;0tu���?v�i��}Jz?�J�r}/��yN�������7��� �[�Pk�`�^����������_����?�|P��F�������D����������Ga��y�6�w������|�UW����;:V��F���
�R�f�m�����TU��Rk�SRc�A@@ ��{�tA�M@@����;-�Q�Q����2��a�(1��S?v:��F�M�81�e�HKh��O�rH��N{�G�v����]�+:��.�D���������S�%��~��tQa��e�']7]��V�/>���������+!g�����:�LX���)�s������G��t.��R��S[5�������a��O����_�^��(����(A��)����F��KR%��
F�����L��
*F������[�c6����]����l~���:m`/�=e�����?������^P%�<���������V1e{���=��"L���������F�j�c�}���Z���|.~�>����H���
a��UxNu���a��	�&U���L�4���[c�5���6���S�&�6�{:��#6�=A��s�����gI���Z4� � ��`��?
 � �@i��nD�FU�?���n���l��������F�hT��w�9�Qc~4�FO�?�5������A�
������nSkr=���n[#�4�#����n�N�x%�-Zd�^��F����Z4�;����o���qy���g�����_#���A9��D�Tx������B����H������/�3�c�;��H�����R�~���7={�t�4�M}��o�������pJZG�'X5Z',LA�O��~��F)iT�O����S<U�{Ak*�6U�g�OZ�������#S�x�R_3^_��U}&j�l1)����4�TI�t��5"�(�O>��,\�0<����c
c��4m�����[���	���x*��T�~���}@@�@{��+ � �����2���o��:t���OjJ;}nG�;z*:�����h���f���n__��t�9��`�O�FPI�A����S"�U�K����J�S��*���>Q O��*��A-��'.�$o?�h�L:���;���y���r���r���={v����%� ��]��=�
(�� ���rN�6���?n{�����r������T�����y��G��
H�,jC�>r������|������z��O��^P q���FS.�}���=��5~��:�zV�H�����g��}��?�~_��9U>��hx*�m��������CTA���F�TG)��R�SN
' � �@C
�k�n��@@��	h]�c�=6�
���F�Ki��Ok���[c������_�;a���L����S0����o�5�_�~�v|��������x�T��]|4W���>���}�)�Qj
��@���{>�K��=����h�S�������(��<a��H�_�gRkd�1��������
^����qI�j��sJ��B�j{/<��3n�2���\�.�����/����%)�K1����J�>3@�p�
�>��n����BAt������5Gk�����Q��~���g��w�����jt��S�g�T�T�e@@C�]c�3w� �T�@<P��2,HuN���I�n�i�{��$}��4w��(_��	F[qC�&�2�W�����������]k��>������M��B
�)u���������W�����������������?������MMMI�zR�`G�+'M��krE��6n��V��#����`��sf~���U�^�=�~����S�^�+���@�T#�4�W#~5�P?o���������XtS�
f��FAi�����.2e�X�g������@@�V @�*�\@@���w�3&N<��h�9��4�>i���/���f|�H%�r���4TN���L��'M��.��5+:Nl��B��ll�w�����_|�F
)x�g
Rt���E5z5��Fki��.���b|�����~�;s�y��(?����K��i7�q��-�����T�
}��p���w�!j$���^j=�P��u�:d��T���c�zV�
�i'���Jw�����G@j��s���+IS���F�*��?�7n\�;��,�j?��J@@ZE�m�\�� � � �B@�a�������J��5�|�`�
��y����l�����IJU>��*H�@�FAeK���������z(<����}�4s�d�B���ZR�t�A�	�����;��5��[FWl�������Ow������O?��r��Q�JSY��9��3�����^�*����B�3�=��Y�h���q�g�9��(8^������Y-��>O�
������������7�����(�=����M8��������u�`���z���
@@�� @W�]D@@��������E�#��u�����/�8�����gn��hA?�����O<������4%��������z�)��� ��|	��@�q�3��6���5�@�I����| ���^3Z�,�4��
7���Z�����Y~���%�o}�[f��A.���>j���?�mM��G��z4-�_����e���@]x>��g��O�khT����GAo_����B�j�og������'�|�j8����3~���|�nd�+��ZN�pD�%�\��i�t?���F��cM<�q��f����:u�	�P��g�^����u#� ��V�]i=�
@h8}��Q;�~4��4��F�(i�5?U�`4e�����k�=�\��_���=���M���[o�5�4��F�u�Qf���6}����5�F����#F��1�����`�����M7��Q��}����5JCm����c��'�Y��@���U#�^|�Es�-��2��X	�Y�QS!jD��7�h��v[��-��g�l�6��B��\��H9M��g�~�?7�=3
����|�Mt���O�	'�`>��3�m��7�����#�t�LI�SYIS��s�=n����
��<����]G�f�VM_������o������r�+��Z.7��v���W�z�-�p��{w�gM��VY��C�Lm>��#���\sM�]��T���� � P3lK&LH�@@�&`�:K�������o�:H���Z�%���������������?��O�r��g�YQ��b����m���Q~mv�ai����L���������r�����e����(��1c�<S�L��k�~���g�}�����j��0e�����h��m`�*��~!���k�e,��M���R;r,r�����:6 �����N-�����$�N�`j�6��3�<�"�
�&����A�����\x�v
IW����#���)��:w�uw�m�(O!��\�g]����
�Gm�?�=z���������F�g')s�~^�z�����������j����g5���e����/��3��#���'�w���C_U��\����:ur���I�E�
}�T������+ � �q��C��Se��� � �@~�l��Y}���j���q��eZ�HI�9�}��Ie`N;�4wL#��t�:���4��
�E�����_S�i�L���~���G:�������9��5j�������W��;��M�����6eF������m����~im4=����FV�kM�MJ��C�4�������)�����o3����^�������}���k��������m|��u��T�(�y���j_��4��'=w6�w�W����u�m~$��g���6��"M�>}���64T#D�TFS"fK6�k��*i�O���a��ry�U���.���m�������]�����YP>����u__���B��t��J���Z����\�
yV�]��P�|���������F;�@�����������.�p�F3�3����W_�������������b%��42��B�%�/g?���� � �@&�
��i��g�F�2�� � �	h��%K���Y|Z@_�����ZHS?������
�)��_?��[�����M����
*�.-[�������
���*p����������MkhG����:�������R�����t�2P0�����\����/��y�2�=U��c_}��Q)`�iJ��Lm�t�����Z����?����HFzN|�$eX�S����.l��9��P� �3���\���&U�IY�������� ��y���E����0_>��L�sXg������{^�K�f��~������+����'�:�<�v�ke:���������z�L�2���O����~�~�i��j����{/�g��^�]w������6{����2�k.}�����>��)�g),_�~
�g@@�P@��755��0t�� � � � �B@#?�O�n���o;
q��:�58�>�����e�E��_ � �4�@�k�g�
��m"� � � �@n���)-5����>jf���F&j��SN9�\s�5.�F78�m�@@R�O}�� � � � �@��Yg�������
��`\�GS�jz�p��x�@@@�F�� � � � �@F��������#[�?���4j��O4���S[f��$ � �����I@@@@��[o�����[\>�G7o�<��S'���k3b.�@@H @��� � � �dXs�5�~H � � P�S\�F)@@@@@@
 @W�@@@@@@(L�]an�B@@@@@@� t�Q@@@@@@����F)@@@@@@
 @W�@@@@@@(L�]an�B@@@@@@� t�Q@@jI`��f��%5���]�D������f��y%���������5�w� � ��
��W�� � �U+��g��9s�Tm����7���i�����~U��Z�+��/4��w7]�v5���N^�S6��0sSS��d�M�=�y[e�U���[�+�WU���L}���t��Y_r�
 � �#�����E@@�_|��9��S��Y�2V?|�ps��G�<g�y�9���]�k���f�m��X�R'�=��|������������/�4�w��s�=�)�6ok�=���f���i����AdF���t���-r��/_��(�b��mPO���{f��w6_}�Ut�D"a^x��h?��3f=��������������={���������&Mr��f�m������hz�����a��ef�����?t����N;�t���g���p�B�����L�:���r�-]��P�g?S�g:�����������]�v.�;d�������jh/m@@@����Sor/ � �@�(�p��Wf�}I�t�������2eJ����T�|��y��] K�km;]���n�p��O���Z�R�V�G/���(8w��G����gn����s���n�������VZ)k��>��(h��[o%�������o����6�=��Yw�u����y����E]�tZ�Jj�Z.����2�]vY����o7'�t�i����
.��_�:��(�F��Z���^��? !� � �@���?���IM � �%� �O}��1�����I��o�}���n����~�����f�������]���4�i��a-�W~��T���j�e8�,0y�dw@�y^r�%n��f�m��)���iGu�y��'2�jyj���f�v0�����=W+���?~�;���o�g��=z��$v$�O]��nT\�QtZwOy�����S9��r������g�z��g
d4r���?w�.5
Q��4b�c����(���@@�F @�����#� �5 �i
s1s�g�����)?���:t��O<)8��+��n��� ���������u�Y'�� ���i���y~�FM���s��\F��7.
����n:L�^�H6�4ZK�Uj
�C=�_6�W��v��w��>8c�[n��h���F�)��*���/G���^�c���o�g�FDj��C=�.�~����R_��@@h4���v��/ � �@�
���c}��t����4�RI�cr�~�v��.��Z������QQ�^_�}0*�2����d�7�~��� g���O>q}��uA�Gy�l��9U���R��=�����s��ISMjD]������(�-ijF%��F�eJa[3���\>u�����P���F#�u�]n������z~���c��K�z9�Q��FaRe@@(��r�R' � �@�((�i����o?�H������_��z���[�K�y6�xc����{��k�7h4e��[l��9��c���~��7���������}��o|���t�M�z^�/��e{�������2d��^Pm��������7����������nr~��P��0!����:��3f���e(����hT�����(�C�8x�`72+���w�}nTM���{w7����o����K�y
}NZT�������T���4Q����������0��_��'���O��������w�u�45��t���Q��*i�L?�-�/����;���;�����1]^�������]�v���`b����r8���������L%��]�t1z.|R��)�{�:��L1bD�*������c]7���2�st���vu��7�I���'�t� ���jnU}Fhd�W\���@@�I`�mLb��	��H � �TV������s?��zjN��A����)S�2v�����V1:�
;�*��wow��6������*�������v
����/��7�?���=�����8:n�
�H��������g�����?����k��6��T6��2�s6����o������Iu�c�}��l�-eq�s�~����e�=(�N;����J��5+�|X6|�|�B�_>�k��h�}K{O�g���.���LT�
t��{�
7D��������A�(�]�.m>"������sT��N��Z�������C��]/��uv�cR�L�[�s�x�����S��bh�����@K��n�q�9���{�������9���������i���@t>������o��F��8� � �@k�{��N�:+AB@�_���>����655�%nt��G�66p���=�\�u���������������)�������z*Z;Kkii=&��yWF�������F�i4����H��_~9�[�
�g�=���G�����?��n���|��F�*�s�=�������]��_����
r�1u��m��Rh��|���J�o�����K�H,�����h���B?2H��W�o����p����Qx:���:[��B��L���zn����U���k������]v����W��u.U��z���+�.3��thT�
�����d��Y�'~$��r��TH����Z��e����3�*m��6f��W��S����?�����0i4�3�<��?�m�����XI��w�qn[#������;������>�M���}{w�_ � �T��� F��v���!� ��Gu����n��FG�4�'?�D����t�(��
T��s'N�F�m���	����C���/�����N{���%F�m�':�e�tN�W�t�zht�T#a���	�sy�j�t����.<'?;U\TN\pAt�Q�F%��w'��|���IU�������0��`�;L��Nx:��M��tI9w���|X6�\��D���(�U�������0E���e��'���)�^$�j�,i�����F�)��e�k�� �l@>:o�Cu��i]1�>��z/��v��
�&�+$�$����e�=�=�\w������F����.Iu��T��lP�}^��������k����9w�Ni�Q��)>r2~�}@@ZK�t�_�$@@������h�~l�&���_�Ql���>q�Q��n��F��u�4jd����^;�	G{��������_Z�m��zvJZ�H#���5��~�
��K��h�"���*����L:���|�����pz��G��W�����
�%��A�h�����xC��im�TI}������7[�j����0i�6v�.\h�T������>')+[q���1S
97{�����6�hK*t=��?<�Qk��i���n���i�:��WM���[�����c��4u��D����[:�u�|*�^�&���S��qz���\�8��Y���oF����N'�������*&��&!� � PmLqYm=B{@@r�Tf�&M���gO�����3�������/~a�����S�}����S'7]�����(��QS�����/�DJ��u���l�6�M��d��s�NU�����w���9�� VX����\�t���rnk�C�����0H�)�3|���M�m�����}N����h�~Lq���}�Sl���k��(4`���[l������k����7w�u����O���>xe��F)���EM����+��$����6�(�X������#��.��}�N���%��2��g�O�������>��#FA��������+ � ��$@���z�� � ���h7�tK����;���������da�S��=������FR�K����������|���o��o��vt�~��E��
����k��R%���Z)U�a=Z��'?��)�Q3v�;3h� c�
���m������5+�N�Q�s��>�t?fj[�sa��N�6�F]�T��xGq����7.��������v���x�l�F��^�����}�i�F�y������b�%����S`��7�p��?&��_%�������u���h-�0i]B������1b����3�]���F@@����\��� � ��& i����F��#��Xc
7�$]5e���V�7o^����m��������d��ik9������;��]��*��/�e�Q���|m
}N2��V�QS���)@��	�it�)���������>���Q#���������&U�k-�������n4��>7/��"���>[��h����O?�4
�=��c������k���HB#s_|�E����SO�<��6�Y���FFk������F#l���t�In���������y0h � �@�
��7
���#� �TZ@���&N<��h�9����i{���3�_~�6�&���I���f��n>M�>�o�x
G�i�Y-%M����F�h��K/�����k�M�4�2��[���������,�9�.�b�V��mtK�?��9��CR��1S�Lq�X���w�<��T�����QpN�v��w�u�|Y�/Y����Wk���G?r�.�1�F+���b�E��{�an��f��s/��r��������*X#��4rN����K����t;�/�w5���^��������9s�s������w�yAn6@@��@��^��#� � �:�{l4�M_�������j�f�Ol���4/��B��m#�"Q���J��Ms�:R���X8j����ra����J[����i�
7��5���s�?E�q�g�9���s�����K�~���p|��6
}Nb�$��j?n����}��Xt�nh�@��e�]��'-&~��Qqc4bJi��6s����U�Q��{���G�
���.Kq/�4�Z/���w���w��^��/*@��-�P�}���v�����O?�hT�F�)=�����r@@��	��=F@h-������{���r��W_�_|����������t`��W����8q�[�)��aCS�i
6�7�|3
��hT��e��k����z+*�@Rp���z������(���w�y��p�
nO#���ZJ�|�I��pzIT�`�����L?������a��u����C��m��&<�r���$ee+�j?*X3x�`wz�n����y�YgE���`�i�v0m���f����?�:zN
����{AAO%�}����������iJ5��R���F�)iM9�1����;}��I���
�+����@]x�m@@*)@����\@@����R#���f��W^��8����v�����������E�I��i:E�������F�u�Qf���v����
�����S�N���E�	�$�V���VaPQ_<��/�����4���|��7��7�h��v�(�q�g�)�|�Zx�A�U��s�����Kn�C����E����F���/q��BSI}�5�rI�>'����~��_���F����������'����+�����?�����q�3%�\�k�����n�i�������u3a;�s��]
�[��G���k�v)��k��f����>���Q��T�>@� ���TJ5����c]T�k}�*���N8�M��r�o��^H � �T��������	@@�J�i����~�D95g��1Q;�^T�����G���k��6K��"�y��Zt��r�-�^m,*3o���
��-���'���Q Q�tvz��u}�[�jQ�~!��g�}R��m�	�EwR�tv��������r��_��C�� ���ZR������g�j�9J*��i�]����#��
�+������M�ZL?�>;����]�+U���)X����v�i����-[�8���2�_g�u6��T.�N�9`GX��j�[L���k�5�E�6��'|6��[�s��o�������������%�H7�ls	���\1���k����O��
��S��������A;�/)�v�$$��v��z]c�5v��8� � �@k:��[u(#����I � ��%�O��:c~�4��Yi���
i������mk
���>;�F`N;�4wl�����:p��7�;�����Z���/�k4��h�y��w���Q�6M����C9r�y��g�Q��>O�WM���<]�L��6�y���~u����]�vn�NY��o�����S��r�����_��|_}y�/�������Ik�i��/�hl�����~���>�o����T#y�g��t��W�U��L6�P�s�t��N1�����ow����*�k9��I#�����~�+7�k�������r|����i)��������
�����{���!=Z�0U�e5�V������������L��m���_��t�a�t��={�t��a�rn������W1b�	��p]:�W��(:����o���LJ�\8�����_�TR��hd��C � ��$���}�m�P���F���}�@@�A4����Sp+�d���Me�/�5��O�G��i";��Nz�4��BWy�_�+��V��
��nz���T�����
�z�H#3g�7���K���^�#{�����)e+��V�u�M6�$c�%������o�������|����U_((�`J��{{���]����O��?3�Mw�>�A�i�����.4������}�B�_6�k>��:��hOM��������W?��S-U@%��u/z.$J�|��H`�{��s�]��u����
�/�9]��O�fj�����������_[�������^|{����>H���k��h=��4��o�.M��5��Y�����|�"� � P)�a�����&]~�J��E@@@@@h�0@�<�G�p� � � � � � �TF�]e��* � � � � � �@�
�k����@@@@@@*#@��2�\@@@@@@�A�5h�s� � � � � � �� @Ww�� � � � � � ������m@@@@@@�����;WE@@@@@hPt
���6 � � � � � �@e�U���"� � � � � �4���xn@@@@@@�2�*��U@@@@@@T�]�v<�� � � � � � Pt�q�� � � � � � �
*@��A;�����@IDATF@@@@@�����sU@@@@@@� @���m#� � � � � �TF�]e��* � � � � � �@�
�k����@@@@@@*#@��2�\@@@@@@�A�5h�s� � � � � � �� @Ww�� � � � � � ������m@@@@@@�����;WE@@@@@hPt
���6 � � � � � �@e�U���"� � � � � �4���xn@@@@@@�2�*��U@@@@@@T�]�v<�� � � � � � Pt�q�� � � � � � �
*@��A;��F@@@@@��@��\�q��h�"3~�xs����7�|���3�����0�}��f���K��|��v��?~�����:t��Mz�={����������I��W_}e��f3x�`s�����{'��;��S�b����� � � � � ���@�73���0aB��N=�t����'�p��3f�lB�6m�����y�����<`��c����<�L����.~�|��Gf�������jqN6�pC��c��u�]7�|��TI1e�� � � � � � P���
3MMM��a��k�����K������f���2K�.5
���7�$	�����Iw�e���-_�<��������[�nl�������>}��a��{������m����+��F����o�mTv����G�E�S�B��.�/@@@@@@����b�v���(���?������9�?����}�]��5��x���X���{�1�l�Ix(���q����F��w�}�_����r��^x��CS`z����B��p1e���� � � � � ���@�:������n������f��Q�L
]g�u��1c�6k[)�-������������`����e�]]N���Th9�/���>� � � � � � �@=
�k�^U0�]�v)�8`����_|m�1i�$�G�inS��i��!�O�>����\�`�)��*)�l�m�#� � � � � ��&@��JzTk����W/�Y��o��1t��h;�����6��Z�l�['��r����a��F@@@@@�Q�5���W�����%;��C��j����6�������_?��F��J+��"���3�c���^���`�
�C�~��)���)�l�6@@@@@@�: @W�����n�-j�Am�����+Z����9��s���#����1cF�w��W����9�
-�z�)o� � � � � � �@�	���=������]K8���V[�h�F�i�8�c��*�� ��i���9s\�>��r�!F�F���={v���K�h;�������.^��ZN�S6��r�755��Z�D@@@@@�*H�X��K��R�+�{�UW���������j���.K�����>���9����_�:���_����?0�n�����S���F��:D{;v4��S%���Q��s�9��+#0U#� � � � � Pe&L0��#@W����6=�����������I�l��?~�[�M���K�c�=����oD���;7��o,Z�(:��W/Sh9URL����QMo�V�].� � � � � �@�	T��tz���������,[����O���y��j������
�)iT�O��S�����I�B���b��6����1c�*Z���5@@@@@h$�a��U��z_/<�H�Q�{Um�]w5���s-9��s��q�4�[�n)��������ytp��)�\�v�L����F��SN�zM�~!� � � � � �u.@���;����7���w��Y���O<�D��_���V���KQ�
����~�h��������{/
����.�s����r����a��F@@@@@�Q�]+�����]p���?vW5j�;vl�,_��(��.���������No��6�����o���y�s���G���Yg��7�O~��]h9.�l�6@@@@@@�:`
�V������(H���C��n�^{m�(����[�s�br�m�5������ltN�H��9�<������/6.ty?�����Xe�}�Y����i���kK�,1�_~����[��5�\����^���V!�TA���.� � � � � � P��Z�c@{��g��-]��s�1�~|c��w7���?��6m���I�&��K
�)�O�{���7�p����/�����u�Y�<����S�N��B���b�F
`@@@@@�C��l�NU�K#�rM=z���n��V����7�n����:�m�������[nq��.]��S���\w�u�W�������O�������z�l��FQmZ���I�`@@@@@�3���Y�L��5t��	n��:�����9s�����,^��(���$�z��f���k�z�-��k��}��R��)��
S6���q��a�������F6@@@@@jT��b�-�rS\����gO��b�*��b�w�����)�wC)� � � � � �T�S\Vq��4@@@@@@�� @W}�!� � � � � �T��*��� � � � � � P���O�#@@@@@@�* @W��C�@@@@@@�O�]��)w� � � � � � P����sh � � � � � �@�	���>��@@@@@@�X�]wMC@@@@@�?t���� � � � � � �@�����i � � � � � ��'@�����;B@@@@@�btU�94
@@@@@@����_�rG � � � � � �U,@���;��!� � � � � ������S�@@@@@@���Uq��4@@@@@@�� @W}�!� � � � � �T��*��� � � � � � P���O�#@@@@@@�* @W��C�@@@@@@�O�]��)w� � � � � � P����sh � � � � � �@�	���>��@@@@@@�X�]wMC@@@@@�?t���� � � � � � �@�����i � � � � � ��'@�����;B@@@@@�btU�94
@@@@@@����_�rG � � � � � �U,@���;��!� � � � � ������S�@@@@@@���Uq��4@@@@@@�� @W}�!� � � � � �T��*��� � � � � � P���O�#ZW`n�1/3��6_�����mWC@@@@@����Pg�T�N@�8����s��u��a�@@@@@hPt
���6E(8�@\��*x�)?�@@@@@ht
���"e����5���+ � � � � �u.@���;��C�,�L]����S`��QT� � � � � �@m���f�J�*�|t����u�4o�2���@@@@@h0t
���.%P`���+}9���+I�P	 � � � � �@m���~��T�@�G��`������>�5 � � � � �% @W2J*B��@j��}_�U��J�L} � � � � � @Wt.�@]���1�?iL���z����\�|���`�U9�v���@@@@@�J�UIG�jR`�c��u��t
�)�&�|@��](�6 � � � � �@	�k���V(���p���7]$�u:?G�Nb$@@@@@� @�@���"PV�x�-��U2h�kkN%I � � � � ����V@� ��@k���{~];�vyv�@@@@@� @���@�2�*��U@@@@@J.@����T�@c	,}�y��i���O���2�){"hWvb.� � � � � Pzt�7�FF`��k�~�����q��z[�O�w��]y}�@@@@@�htER�)��p��~��G�����"ha@@@@@���*����e�T#�R���K'�#����*8����o����
�� � � � � �@����8�(�7�������s�/�z�|�X�����J�.��@@@@@���+�+�"P�_��B������P����^|?Kqw:������X7��Gs�A�v�fn�q?���������;�s�9�u(`�c����OB@@@@�yt5����/�*����(����^|?����|���v����k6�j��-����K������Y@@@@@�F��h��l*)}��-m��r���t��:����.y�_���������tXe�;��nk��D��hB*@������i�	?c:e�CB@@@�t�����.�/�O�6�u;��T�pN5Uf�X�?���,�Ws��(-~��q��[n66���.��m������	�&d�*���T���x���3���T� � � ����Va�"���t
�e
���~�n�#�f<����c���\��y[+h�)65��R{
c�`�.��{�G�C���E�iL���eTO5'=�J��j����L�������?#���2�2q@@@�	t5�M4�����~TG����s�<���C��\��A�|��X<��S���
�i���f����5]�n��o���:hW���}����ca�RX��#�s���H'���}����=�k�)��.����B^� ��:8|��_�~���kR@@@@���J�Im4����_T����6��s�����?���c�2���*h��������jM��A;��K���xx�v�'�v�a������<��������2*�PS�o�Y���/�������:�U7
v�7���v�z��M�=������y@@@@����nEN� ��_���9���+:�B=U@$<��-�2~M;UT�A��fs�M����la_�Z��{~m:n>"�����s�smS����v�	���s��m������[a O���a�l�r@@@h4t����/T�@���~.��_��K���G��m���j:�GWl*�H�b���|h�k�c��in��lus�r:��m�T�o)O�������m�@������:GB@@@�Q�]=�*��
%�2�yTK���|��'�����|�Gj!h�kpN�^�QIy��A��Y��k�K�u�=�������3��^a Oe��0_��8� � � �T��j�����@�_��J+����������_�%��-Y3jG�_�Gs���QI9�NCd+&�kr�*�������:��\�KT�N�F���RM����������v������p?���N�#� � � �� @W*I�AjQ��v�bSSm*���i�������a?0F���)��{`p����x�O������QI���Dh�i��2��b���h�W��|gz��G?J�q���[x��{*���Ro5�	�ngkch�������< � � � 
�5�F0��vn�M?���s��m����vJ+�v�~�~��i\��G�m�p������������O�/��B�+U]�/��?��B�h�x
�s�s����`�������T.��e�����m���3tL���f��� � � � P����_�+@�����v��`�]����������+e�#[�8�@6�<������V����v�����K����ca^�^�v����=�y*���r��5��^������>�@@@@�V��j��n�D`��f��)F�>
8���T���vWC�������v	�J���������U;�e���m������N��@Tc0�P��^��l����0��������|���m���?�y*��������t�&����-���i'y@@@�Ft��+�	��<y��O<����7�{���f���Qn>��*�V�A����f>{����[\u�5����3s��ez��w��@��
���p;�V���0����~�/�z�)O��p;[C�0��r��0oX�������L�.��> � � ���x@� ��T���7[n�����g�le;Gp�l��U\�����0��z>c�{~�C3��� ]F%N"P[a`)��va�+��\���Vg���ngkg��k9M��v
�#� � � �(����OJ,�j�\�K�����s,&P+��^�z�Z^��j��>��Wm~���9�mE �����������v]s-8�cE����[�0�ng��0`�|��X�����W���ml����7�:������q���dC@@@���U}�@�O \o��ZW�-���������������+�nzu|�������+%�����j����1�GB��L�v�*|�K�� �/����x�58��-_<�,�ys.��vaO��}^�H � � ����Z�=��@�f��Y�+s�F��������G��+h���}�g_��c�7'1�,�2�]�����s�B�R��p;����0����~�/�z���g]����}[�k�|���|�������?�+ � � ��$@���z�� P#����Fn�fV�@k�z�y����G�����\�a�����k����k��7TH @������@����a���|F�)�z�H�}�5�5B�S�p;���.�!� � � P
t�P�L ���:v�hz��Y��N�X�[��i2�J=5�����Yl�w6�s�B�Q�v�����%M�L���l��s����x��p;]~���������s@@@�l��	qR
8�L�<9���`������2�J0���7h��7�t4�;�-sy6U�NU*pgz��F��u�X��Hi�#�@%������xT� ]�a���V82O���s	�U��}��k�6�.>����v�'S]�C@@ht����)%P�M��L�%�!8�{>�s��<�j���w?��iz��)+�OD{s�zs_���:,��1_T �
7�n�m7��z�+E�.G<�!�@�t��8�x���'�X���S.�0�x*n�y|��~
�n�k��x�����t�9� � � P����*"0|�p7�.�H:�s�.�F������g�1�����:���L��W�]������0��z�,\��t��9����Iy
�I�S]
�u����j5���v��F�����00Uh�),n���x�����t�+u����fj�w��;������8� � � ��#@��v���"P�>����QVU�K�J(�����y�������mzE{_o�g��a��~�S9�v���M������v��I���-P�@����$�|O���0O��Z��o��t�����?��d��s � � �� @W~c��@C�k�n���TP���*h�z4�T��6�?i,X`zuz����bA;]S�qv\}gm6��c�]��@�4a`*�N�=Z[/�)���%`���r�n���wx�����t�9� � � P����(� ����,�
�G�i�])�v��h��3S�����v��9���B���k�:}0��|^����������_3�+t�<���<��� � � �@�:�@J,���Sh*P7s�����K�s���u=vl�C��+qOS `�L��L&a0���?��d��5��m
����[�<����|��G@@A�]#�2�� Pq�����z���v���s7=���(�mg�5{�he�00n�k���<����|���:���_3��;�������y��W�eI��h�R��y�(�Sh��C@@� @W�e@(B S�N�.������iw��N��*h���s��B�E���k�v��0�S�p;�����<���_3];t�<���<����5��'��O�E#!�y�G@@�%@��\��� �@_�TX��Nr�L�<���[e��*�w��?�@�jS�v�v��X�S^����+_��a���tm�������9��eq���x�!H���� � � Pte@�J@J-0p��uJ_���::���M���i�0��1�@ �����Y�i!}�������|��|��]l�.4.�����R���yyE@@���"�#� P#I��V���i��c�]�t"�Dj@���k�&�A1�S�p;����b�)����J�m���Sm�-U`/��x�|�I^@���;�����?���$P!P	�QGZ�b�_����^�U�]���joo�5n�����^�"�b[��B�U�
�*��v�,��%�$����39����d�d��y�f��<�9���7�� � ��t���h! �Q�v�������w:�%���&�B@����s;\v����Z{�>�|����s��s����\�3��Rg�i�`/���r���k@@@ 5t�q�V@�F 0��fMgh�6�������T�@ ������9�-;����m�5�����4�6���M���������P����rO�E@@��t�-�B@ g�	�4�+����v9�S��"��-�����Z���kn�h�m��^t�����	���ciO����:\.�{r- � ��.@@��o��#� �$�����b��c6����]����p.�]���P�����������6[ �������[�~��a�8}��u�^k/�a����}M�c���k�}�����<ph*�u,�\,�L�k������#�g����N��- � ����oGI@rB��v�xN{�L=��3��OYo;�}�������9Q�h���n��@����E
�B��aO����n�;T��Xp�Vo�%?w�~,���B{�������\�ar�@�>����J{�-���, � �@h��.E@���vv�����vZ������m��`@��4����[�
���It?8P	������*T��Xp�X����?w�~,���B{���w�p?���u�VB:�� � �@f�e���� ����3����|=����w����a� �vv�h�
N�����)������p�
��
��:\�t��������<}�{�`/T��Pu���OD���C�
{,�~a/�1�?��"� B��.
�@@ 9��v��j���m��`@ �Q�-��O_��VpX*�u,�\�w��+����
:��|��{6���k�%���pm�1���OE��0������p=>����l�7�}@2M��.���E@ �B�v�H���v�����w�#����WM�@rD 8����!8�
��:\.�{�����s����nl��S���:��+�f$��N��$;���Z{W"w�zzv�;�y���#� `	��c� �i!`�e�l�s�L�9}�]{���(�@H�@p@����C�P�^�c��b�g<�j�KH�=(����KE��h��cm-P���wk�b�������\� �@[
���6�B@��������y�z�m&#� �c������#8���:\.�=c	����B�P���Hv����H�y�p�����t>\�vY� �� �'�q@HKghg70Th'��v%��Kq�����]�KI�z�hb���E�����}k���uE|����9�� �@�>x?�G4��O��Puv��O:���T�c��� !K���Ar�&����H�����M���TgL������[>7���]'�"�aA�[������G@��"�vb�kW"��2�}�����vz�P�]�������������V� �d�@�/��}��+RH�N�P���x_O�����,��zS���6��v�{���s&�]�~V"�K�
��S�����E:�N�l����=)�	^��=�}.�$���m*@@����@�J ��N����.Th�7tw��D����E;��X��w�2�!� �5��b�����N���-��OE��������&�e��#�k��=����#����q,q�p?G��'z�H��F
�����G�3���������a!�S�mhk���~ � �n�B;m���1m�z�R]med����]k2�O"�k��� �d���9���_Tg0IN5=�:uF�u3��!�?��/��`1��Hu����D2�t.�����v~���j���'"���)���D��� �t���yJ@� �v�[�==��x-C;-���v-Z�8���E.��8�& �d�@�_>g�s�<�'�!��c��;hn��"��G:�Z����kw����+��pu�eY'_ ��}<��XO��(��.����
��#��Hu���c ��t��^y*@HP�������]����g�a0W���N���E��� � �)���X�)��"����u��B�������G�3�}r�X��[���5�km�d�\k�F:�g2\�g��|�:���@������;"� �@�
8C;��	�������y�t���|vv>�@@R"`��s����%u$�H�Z���;q�
���D:�N�l��c�A�nm���<�q�\��H?��B?�_��������C ������< � ����v�f^�ps��:��A��&����E��i����� � �@��|�(�VQ�w�x�7�E
�����G�3�����N���]4���k"�G:��3�������t>\�v�t^����s���"��M� �K��� � �@�P��^������7<f`h��������;=d�v��b@@��D:��=�F��O��]��w�^��C�~Q]�,�v���O����1��c��O��|�:cio��'E?��}\�N�����~�
���?wG@����s>���u�V���&��E�S!�kO�W��!�Eu����=,���]T�\� � ��'.�>~�W_��E�;)R`�C��;��P�p,q�H���%r���I�.g�M��?��t�^v�	���}�4 � �avo���Kq��RR�>�xO�����sN�h������.,
'@@@ s4��{ ?E������v.�������~v������.�:�������2��b���K_��}7�@rH@{�-��+�/��Ig�s�i�;�����NC��vjn����v/;]��@@�;�s"�r�kc����
�Op]�B?�6��Hu�#W�u��X�L�U�l~n�l~�< �d��t:7��z��%�����A�^�&3pN;	����}���3�\���e�i���$�9@@H�T[i�hIiB$�H��y�/\�g�+��pu�e�mK8�i��n����t}3�@rN`������~�P�F���3��}�Y�!�.v�gv�������;������	��||#� � ����;��(OD
�"�~Z}������Y���F)��!������!� �@�	�A��m���ZF����9�u�vV�������������=,�6�^v�W�6 � � �@D
�"�K���<g���$F�������kR'@@�:[jF@��6�k�����q��:�z�%mXL�\C;g`G/��^�@@@2\ \�|<���S�������B�p@�b�P�����J�y��D�;�p.�_�a1���$ � � �@V
t�t�B3\o;}h�F?,�'@@�}��'B@b�{��O��s���t�a1m� � � �$G�����E�'�\r����6~3uuu�p�B����,7n���)--���2���[e��1[��~Y�x��\�RV�Z%'N��Q�F����e��i8�x��[N$��!8� �)p���o��L�;z��4�@@@�[���=������r
��@^s��Y����-��'�mc3�ao���	���[����<��w�+������|���2y�d��iS��C��%K���Z���l�����m�I<0i�$Y�|����DT�B�Y�����=L�b�{�`L��.X����y��e�G@@@���	�m��9�t!^V���g?;�4h��9RE����KSS������	I�����f���G.��2��m�9�a�UW]%����G���y��?��r����_>�����'R��p6@@ ��=��a2}A�ohLO�&�t^/����;{.����l��N�J+|��F@@@�\���
_P�n�d�����#���?�i�s������r[�n5����/%8��;w�?���/�����=��c��w�y���C`�t�M�.���l������l � �@�
�A���^vn�������������p�}i�k;���vh7�>_=����� � � � �&q�&������KAA�t����]�y�����q���\3x�`��W�^�e��v^���o��q�����ry������-o9�q"e�
O�����!.SL� �D!jh�T
���m��)���^vQ�.A@@@ {�)��b�2�e[��u��5��������?����U�V�pN���iH��;V���'�w�6Cf=zT�w�wY�����V���� � �@n���z�x�v^�`���"��eWP���]g}t�����o@@@hW��lW��7�9�������4�
6��'N���vn�p�^x�	�N�<i����+����K��t�7�6 ��'*���ggv�Y"�p�:��.	
���A]��6���MaXL�aA@@@���8��_|�E���]v�[7�/��e����f��+i/}��f2�����6$R�n?k@@@���l�I�[�#��9�����.���`}�h��J+�����
��h��\.������ � � � Z��.�K���s�<��s�{^���m��9Z������l�v���N�x��[.�{��� �Dp���]v:��?i�nz>)��L`��
��[�6�v��3M`��?���.B;9� � � �D ����F����oJmm���u�]'#G�������!���[7{���C��c���f;������&R��m��F�S��[ � �@�����?T2�{�.����|t�����Cf�h/���'D�?���6��t��8Rz��-���1��B@@@� ��F)��<����x�bs�3�8C~�����[�.]��� ��������+((0�������4���h�
�e��Y@@ ;����h���<����e}�HA�J�q�I4��~b��Gv�t���|t�-G:\d����RW0�l�� � � ��'PWW�~7�pg�8�>��k�����E��g�}V�������W��r��A{����CV\\l��[6�rz�D��x�����D����[S5 �����!C����d���U����H��Ur��1q.��egt����xf�h�Y��p�Y��>N������� � � �9/@@�N?���?�_���<y��������������u�)���c�}��]�e�-�
H����`C�y�Yg���� ��������j����:����k8K�N����������>,R�4B��5���':� � � ��N ];����,u�N�A6l�O}�Sr���[ky����o�����G��5�0Ja��k��3;v���'�����$R6��q@h
��n���#��_���I��{ek����u�xOK������N?%�Y�'���(���d���a^Bw�0 � � �d�=���-UUU�W\!���3w����%��{o�V�?���W_�o;7�m�&v@w��W���K�e�-��=M��B@�L��r+��*o� �"��������������q�z`�_�I@IDAT�gD��W���HuC��u%ME�6�	���� � � � ����k���g���������[n�������`��A2z�hs��-[����oQ��{�������������8������ ������n��-z��0��,�ih7��W�}�B�~�<��y��e������ � � ��-@�6|����D6]���e�����O�l��Q����.������$�W�6�����[&N�(����{L�}�Ys�o��r�5����F�e�-��=� � �!�^vv�k�-��}o�]�2�E;�CO{�i=�|��I������t�a���W�r � � ���@^�-�Y����-3�O�7#�����/Q�]�N�*��_���:uJ���������&9���e���2t�P9���l���g@�S�3i�$Y�|����"`�E@�������B��5�j�{]�'tXL��2���G{�e�k/?@@@r] �2�-�;�D�6������h�6�2����(��:��y��������{N6m�$MMM��~�����c��G�����r�o�x�%r��� � �E����y.�&���I�e�u��������u��R��Sn�	�_ � � ���=�����������3��s�9'�J�-o9m\"ecz�(.�jz�E��% � ��V`W��e��.��.T��^v����x��%�J�c � � ��$�N�����t��C��gO=zt\���l��������!)� �d�@�����i��TU��N�HhXL�\v���e�����������,��N�XB���jY�v���^��=�l
� � � ��`���.E@@ =J+Z�)��v:����q��t`�Uhhg
�yx�x�D�Z���

��.T���x���� ������S5Mz����������
@@@8-@@w��-@@�_�
�4<���{��^���.��_I�����
���k���^����2�TY����&p�����Jq������WJ���,}�^�q����a@@@����
@@2T��e���y�=�����
)�[i�����e���=,���z).<�k.T��{?$���etw�:�1@@@rZ��.�_?� � �����
�;���������]4���-�.#����k@@@rK��.��7O� � �����yt;����IZaz���%��!Y�p�����N��dn�ThS' � � �@&	�e���� � �$K�������4�	�����u���2�����=���Jd�5��.�v��/@@@� �����##� � �B �����c�|v�8C��
�I��2B�/� � � ��(@@��o�gB@@ �z�%���3�!�K�+�@@@�� ����D+@@hg/�$�t�Eh,�> � � �@6
�e�[��@@H����;�BD���M���|M��A�	��@�E@@�x���< � �� 0b�HU�H��t���^4������=�L+�P����&�*�U�sd:s����i@@@�� �kw�� � �@�h�;|�'�a0�������vZ���7���
�������X5�+))���b�����@@@h#�6��6 � �d�@�P���
�i��
_�$���������z��+O�����<,����F@@H�]��� � �D+�=��go�PCj_��3��^w	�]p\� � � ���0R	 � ��L�������]�z����
R1 � � �@�]� � � ����nn�vb�k�9<��1mw� � � �1��LF@@HK�6
��������15��n�9�����,aN���)�Q � � �@Z���k� � � ��P������w�$
����"3(�kt]*�}��}�n���|#� � ���]N�v@@�(�h	�"s��-��q��.4� � � �9 @@�/�GD@@�(������5�{C
�VFQQ�Kb	�t������r@@@ ��2���`@@h3GhW�������k=�sxLB�6{��@@H�]�h�@@�R�C;��TM3�]����h��1�]V���P � � ��tY�ry4@@h#�6
��i���N?��g�v��2�+B��4l!� � ��,@@��o��!� � �@�
�
��i�*�3%kN;�,`^��]S���%��3�|!� � ��.@@���� � � �@N	�V����i�{��Omm����6��
����iGh��0�@@@��r>�;q��8p@<(uuu��={�4�^�zI�����$ � � �@so;���_���Nwjw�"��J���	���( � � �D���4�{��d��U��;���u����1$i�.]��v�'>�	��'?)W_}��q�!��  � � �T����^wvO;�O*C;9 �Y=��iGO���U*C@@�@NtMMM���o����e��Er����^�^WYYi>s��5��&N�(3f�����^:w�U=\� � �$E��B;m��p�����cjhW��2��4J\.�Tw���vIy�T� � ��(����~(O?���������������P�:�,9����_�~��� Zn����k�.9~������'e����s�=��w�!��~�h, � � �@��
��!U�9��i����"��^���		��v�X\��JII��7_ � � ���:���W�*������u>�I�&��������cM0��G���
�y��~��}�������������o.��w�hH���x�����#� � ��+PZa��S�����z����\��W@h�Um�iGh��/�@@@ ;�:�;q�����������_��|���7C�D�*�l�>}����/��}�{�q�F���`�:����.�z�@@h���v��z�9�&8B;��5�j��O���.<�"� � �@Ndu@7g�3��]��e���r������u��wOV��� � ���@@h�|{B���@@rJ ��a����ejX�� � �d�@;�v:D��S7���2�M�cN�����@@@ ��x_�������| �{�6Cd80��(� � �d�@�����V������P;<#R�L��	3�]M�i�6��5����Xd���+�7��V��:S�G7��|+V � � �@[	�l@�����~�;�2e����?�{9rD��������X~~�<��cr������� � ���@sh��������d�v&\�o�B�v�}�����9�����%��V��x�[�����(d����C@@�+������������7� z��w�sz���Qf��%c��������� � � $PZ!-B;k^��]/����cAE��
�vX!�gO�������+���c��!]D$N"� � �@�r2�[�j�	��u�&7�t�u���2�|�_\\,���/����r�m�������~&,�_� � � �@�Vo;�~���CK:B;��5I"3 ���i:����}mq�C@@b���n����i����!��<���������7O���jsj���������/e� � � ��@@h�\Y
B������pk�p@@H�@Nt�{�6�:�eSS����I}}�<����xYY�?��^x�9�s�N��@@@ E��u�&��7I�s����IIII��j@@@ �r2�>|�y��=���r�������}���6�g���3��z����]�� � � ��N Thg��v��fN�d
��0D�k^�
�����)�^���wM� � ���@Nt��{��7N�z�-�i8g/�����o���5��^z���p�@@@��p��H��v��P�m�4�{�(���$,����"�
e�W4��~���2�������R � �d�@Nt�>�|�I�:u�TUU�_o�^���g���7m��x��w���^3���, � � ��+`z�5�p��IVH��%�E�������������(��)5�s����j�� � �-r6�;���������a�8p�	������q�F6l��s�9���}-�; � � ��+P<z��,_��2L36��&���%��h��N4�k���1��Y&�v�
r � ��#�������"�����m��1C��� � � �y�&�'���B���v�)��������y��_9���ie#Th���yO������v�V;�nw+5q@@�I ��lz�< � � �@h_g������
�Mi�u��V`7�yD�
�jw�,����:�F\������s.����v:W�
��vvo;
����"3N_�!� � ��Y���������tX�I�&%�>*B@@h;3?]���B;�~�����;��]@o;
�0Df;� � �d�@Vt+V���o�=i/���/'�K�&!� � ��+�:���������4����x����<���X�"��h����b��$XS � �m)��]���j���},��Q � � �@	�9��`)"SC;_pg���3�>�o�Iph��Y���� � �d�@Vt����d�ijjj��<(�^{�9��W/�;w����3�:�W�������/� ������� � � �@D�PCd�Qh�CdzVO�iGh�mq@@�6����K�.2y�������g����"y������/nq�W\!��u��\n������s�=��u@@@@ j�0��"����5�j����H��N7������{��}�9OpI�s � �$_ ��p\�����K/��<�@�p�.��������e��u����F~�a�k@@@�#`�vb}�K�������Z�C+���y�y���e�	����.�]�� � �@r2�{�������O�o������|��_������o���c � � ��F��B\�5;��,�"�l\���=��]\�B@@ ����0^k�����;����m�9�}���8� � � �@�
�9�wk+�K���B;�i��R��Sn�Oo;�� � �Q�d@7z�h?��w�-�������������o�h�o���#G:��� � � �@��"�9�K����X�_��HuC�x���EAh�&?4@@ �r2�;�������7��-]�T.��B�������.�������-����E.\(��3/r���i�Bi  � � �F�9�"��B;���j�	��w�.G{�e�DO;�� � �����w�7f������������0n�8���;"^�I@@@�Z Bh�������+���B�h4����Uu��v���6K4�����0�� � �C9�
6L�-[&���/JSSS�����/�\s���;W�v��"� � � �������UZq�1�|�
��������'���	�Q@@ �r6���8b�Y�x�l��It�9�r����cG�����s�92e����OV�t@@@�V���gh�"��v�UDsAk����<)�S=���@@�L����~A��N?, � � � A `��G|6�v����v-C�����L�U�����vb��!2}��o@@�� ���wF�@@@H����vm��C"D<��Y���ZB�����% � ��
�|@�g�������wKMMM���T����\u�U��r � � �9-�F���5S���9{�y�L��m�a��f*V � ��.��]CC��z������V�z�_~9]TR\� � � �@�@��N/���]xh����K��"�xVO;gh��I@�
@@��r6��5k���W��	�s��1]�� � � � ��@iE�Cd���[����3������k�iwL��c��8H)� � ��@Nt^�W�~�i���S'����$S�N��}�J��]�r�y@@@H�������#�k�����L���v���"3ai*@@�����w��;<���r�m����@@@@�4p�v!B;I��-C;kx���v������i�s����MB@�\ '�����������d� � � ��$����v��Keh���TM3��!2���phW]]-k��]����������F@�|������s���d��)�}6@@@@ ��(�S!w��e�[�mgo;
�t���^���1]{<�qs�<y�	���� � ��9�
0@F�!�������?��/�\�t���o�V#� � � �@d�P�������K���Z�s����N�/�t��y��Q��m��t^/�`��ku��&+�[�t���L�F@�A ':}q�?��\r�%����U���y���s����Ny@@@�F����U�!2����|G������]uC�x��Lmu
�����z�����O?K��k
YB/:�5 � ��9���������G�r��Ay���d���RXX�u�P�Z�@@@�T�������]�z��
�"�N����{��u��H�q@@ Cr2���u�<��S�h��]��HKIII���C@@@�lp�v��� ����x�1Y��L��v�������q@@ �r2�;��3��.�S�N��zF���\� � � �Y*��]���2���R�����)��S��hX����6k�@@�[ '�������I�7C�@@@@ �B�v�U��H����C;O����t��NC;z����F@�M '�t{	�@@@�X������Cd��^��|�G�s������i!{������ � �@
��Q�^�l��Et]XX(C��i����v�D@@@��p��KB@���D4�{H�����p����.��@@��
�t@�s�=��3r�}���;ZH��w�)������r�8�@@@@ a���l�?�j�� 8�3��UY��Y�]�J~�rS
�]4�\� ��/�����'��+�������;t��	��x�	y���d���a�� � � � �@i�Hk��i����RUaV���]��94��ig���n������m�����."#'@@��r6�������r�e�I��=���Q�n�*����;vL���+�]w��������32@@@@��#����P=����Vbuv��|v��Z�g�����-�1+�k8O��}��&����)@@����4�{��
�]�h�"���+[P�|t��r���O���J�7o���9��u@@@@��M?����U�������6e��5�����]<U�|v��shL��m��5 � �@���>��9~��a�;wn�pNO�_�-\�P.��"��a����'�����)@@@H�@4�\k�i-��S��:���g�<4��SW��R�?��n���$�$� � �9��]�������*�v���]���)SL@��{����c � � � �@�8C;���KU�Y%}h�#VO;k�n�{���T4��v������ ��-�����'�[����� ��|�c3������K9� � � �d���9[��5�{C
�V:���0���gL��7�������g]u�����Y� � �
9�]x�����8qB�}�]���K"��5k���C��x'@@@@��h�
���1�]RC�zkz��'�9�D<���?���]���� � �@Nt�����;�y��������+�G�!�/^,�����.� �5D@@@�V�14fphW[[+���~t�|v���d���fz��t������f�@H#���th�[n�E{�1����=�\���;d��	��9v��l��U.\(�=���CbN�>=�^MA@@@�I�9�s��?�kDsO���v��Nk�`�g��2��4J\.�hh7���}��@�@����=���?�+V����/�w���������f��)�����$ � � � ���B��
�Q��q5V&D����X;�[��n(���������w	��� � �@	�l@�C\���kRQQ!�����.���u��w���*�1@@@@ �J+�������e'�V$'�����Z6<az��4���nc�e�S�@�T g:}:v�O<a�%K�����e��-RPP �
2��~����U����,@@@�0��^v�mo��^��N�I8��������}���V/;]�]g�u~�r�5�sf�/@@�=r:����*�aA@@@hc������q_+������|v��V��;��v�������a_�6� � �m(����#G��������Y���v�`?y���r�-r��1���?/�\sM�k8� � � � �d+0s��T;�WyU�$�����}_X�?Jj;],:����TB�$�R�C@�@��
���?�C���o4f�������C�x<�z�jy��d��
2`����r@@@@ ������U���M/�d��j��
�YoU��y;�+..���� �$K '�����W��U?~|H���<��o~#\p��E7g����~�Z"� � � ������G�{�5�����y��_�R�|t	4��m�*�~��I����]�E@#����U�D�������2�p_��
��~����?�Q�,Y�2�#� � � ����5,f�������X��a�����"�S�����&�����@@@$'�}���w�CX:������2s��~���\� � � � �>�~ND?f��g�������3<���$����N����	���X���$|*@@�l�����s�1����S�m�6<xp�w�e�s���(�u�D@@@H�A�l5R?�,���j�k����r))X�;���)��cU�\o@hWZg�C@�l�����v���7��My���c���c����z�o��>���;O�� � � � �%%%"%��u��8{�u;Q������X)�S��a�=�X@@ gr2�����L�2E^y�y��������v�m2j�(9��3���^v��!�?��,X�@����_�3?<( � � � ������������S{�wY�phWP���a��������bhL� �d�@Nt�:�z�)������F<��z�����>�)���["^�I@@@@������
�Y����;%�2��
9z���%�����[���7�W/��%�K ���@�t������J����e���a�G�N���n��������:N � � � ����o�J�7�����z�RR�����J����n�g�KsO;���z��NX@�8��
��M
<X-Z$k���U�V����/���2�8PJKK�P���� � � � �@$�egz����������3N{�%uhLm�=<�uE|�+����F@������~+_|���@@@@�d
���j5��X���Cc&#�����|�.vh7�>sOz���F@ ������~���L����|��g>#���h � � � �@�������q��_��y�d�g����Nk&����"@ r:���y���?�������������o����7�:���3�w����������
@@@@��_/�
Sk�C;�����g���4����h�����k@Z���n��r�%���Z 9r��o���q�FY�v�����o���
@@@@�-p:�����]`h'Uz")������nzog�<��K��g�21Cj��Y#� �@H��
������7N�M�f��+W@i����ry��Wd��%t:� � � � ���@@h�~��RG~�Z[�^v)��NC;��������$B��B�E@��d@�s����[�;�����G?2�����tzb���&���?�i��@@@@��p��w�.�{�y���4��nz^��e�&-�3�E��=�4�cA@��9����{�[�n������O����l����>�@@@@���e���^vk��N���C;w�����d����������@2_ '��;w�7w�YgI�N�|��G���y@@@@��8��N}�TWW����H�hx����1IJ/�f������t�K]4�+�`6�����@��t*��=�\�T��m���:��t��R����G��s � � � �������/��a�:S{�i����N{��G�RB;�� �9-����#�C����(���7����
�C�s���7�1�����^�	@@@@2Q ���X!�d��NC��V/�������I3Th7�>��w���?F�@ F������
7� ����e��y�~�z���;��r��������K/�$O?���<y�zZ�@@@@�lh��Nd��U��/���.��]g��������G������������~[>����m{y��GE?������3�y�m@@@@rB�����,\8]Z���{�L��O�A^�^v����"'�yH@������{������U***L/�'N�|�EEE�����r��w�<�A@@@@rA@��\����Q*���&�#�E{�-�5��0�].�d�� ���l@�X�O<��|��������u�t��U,�
���r9��3��� � � � �Y#����O�t_g�Mg/�kN��9�]������z��5�.�z�1�]�l\� ��9���C��� � � � � �@x���'���1;�����x�����.k��o��j)������c�z���|����k����g�$�"��P��.�w�������z��-��������C � � � �����UWW�\��hg;���-���7��]�^v�����nk@�L g�|P~�����)S��?�����#fX������c�����c������?� � � � ��,K(��Sph�!�����1�]k��G�$���8 =��444�7�����u��w�3������2k�,3f�|����@@@@H��shL�f�|v:<�X=���1���Vb� �)9��Z���s��u��n����v��-���7������_�R�=*��v�h�����g�`����n�_}�{��u�������^
[eMM�|�;�1�{�uB{�M�81�%���������+E
t(�Q�F����e��fL�P�-�u%R6T[8� � � � ��-C��f8L�i�}��C�T�g��m��)������D�H�@Nt{��5|C�
������6��t��yr��W�S�7o�x�L~k_��Z�����K4x��s����PM���2x�����~(:���M����e������9s���%Kd����-��$R6�� � � � � B Rh��
�0���v��vu+E�[]���[3����o@��r2�����A��.���$//O������7�������������;w�u"_�}��_�7�|3�jN�:��~�����{w��sc����]��g�����d��mf��!r�UWIaa�,\���R����EEE	��������/@@@@�h-���1�]��CH�@Ntp�����?,�^{�|���7]�����3u�_�	i����'c��!45����3�������R<C-"�����s������s����?�i3������s�=&�{��w���i�o9�7��Z�@@@@�d	8C;��mU;#`xL�����.Y��� �@������4�7n�y*
�t���������|��Ol�
�=�YthM
��9����(��������^�y�s�^�z�n����������?�7A{��K���|"e���F@@@@ U�iX������/H��9�=�9Yzb�,��xO3���A������DV�IC��"U��5' �@n	�d:}�O>��L�:U����o\�g�}V\.����,{�����^���Cg��w��k��&`��D�l���U�d�����I�&�>c�2v�X����h?z����f��x������3�����e@@@@�T8{��L6��!1�j>��nc��O���J+R���� �&9��w�y��{���7l� 4�]��}^���e��a��������s��ho�3f�Z,�����e����f�Z{������<y�t�����l��
h; � � � ����oHL�-C���9�dq��RR���0�uA�J�����E{�Y��a��:��
��M������/U���"6�:9g�����t��QJKKe���&`���s����yzBC�p�s���>�H�-��$R6\�8� � � � �@��v����wb^���������MSS�����7
X��"�aA�8���2�m57���E����v>��	�s���Kaa���b�<�]����D��h@@@@Hs���1�Vk�4�#�?d����~���"��j�{�����z#L �KX�
@��������A�?.=z�h;�$�I{���q:Lf��=EC8������i��r��7�c���w�w��_����t����l������X}}��[N+I���m����@@@@�-�����!C��g��}���knu�����~��N4�s5VZ����U]shw���r��8�5�����/@�����n��Ur�W����G��o�]���#���r�������)Sd�l��S�a�n�hqg�����B�{�9s�?��|�3��cO�.]���~������7�����


�.���{O�hC{
�������m@@@@_/7�p}����9rD�����k�k~����1��X#rP?"�v=l�?(7���|��wOs�/@ ��������>��������6��<[�n]�/D����r����������-���FY�d�?�������Z�kf���tqq��[N�'RV�����#�*l�vp@@@@����O�����E���g�o��CR��i�Yfp�����D�Zk9������7KQQ���I�1�BH�@�t�G���;�K/������'\pA�=�F����
tx����N4��E{���g�aoJ��N���
��-�u$R�nC[���g�uV[��{ � � � � ����Y��3o7�����c�i��Y��z�KthL���S�Et����:u>�ZkxL�������3|!��)��u�>�3f��P��;w����o_�x<�������wy���m��5f��P�]���0S�b'�rZI"eC��c � � � ����������-r�����g��7$��ms,��N@3��v�����4���	��9K+�}�BHL �:��.�������k�>��#��;~�x�������vnl��M����+��1�rZo"e��b@@@@���VXg]�i�������r��QI���&��_)�����q�HT�C�P>��N�
��-7n���z�(�/Z��A������-[����o>szu�=��wn���o9-�HYC�@@@@@��4����I�}�RR�/�|���������
�9<M��b�7���ig��&�[�'b}��gI���E.��}@� ���A��������������o���� ������������?�:4�%�\"���'�K_��h��Hz�1��x�
��O~"�����W�����������Y�z�9�����e���r��1y�����g�5�t��k���l�W��-�o � � � � ���������
����.��
��.��1����O���>5��{>;��%��S��������:�[�b���93i�z��)2{��\��t��=r����,���$S�Z�#���uK�.5{yy�_�X��U�������O�����ns���r��q����|��}���~]�t�����V�HY�@@@@@���g���C���KVhgz�Y��
��$��F� �@�du@�y�fY�~}���o�����K?�s��E���^F�)?����g�1��a��t��A�*:L��x��<�����y���Yg�%�=��l��I�:���'c���GyDJKK���.��=� � � � ���@������#xu;Q)%���5 �[���]��on��/���%��j�D~Q�>��E{���5+i�s�UW%��.
������*3LfQQ�<��~��}��A��y��r�9�D]4�rz�D�F��(/�4i�,_�\���=�1��@IDAT���2@@@@���1kw�,�O��.��>��.Tc�]}#��Cc��D-�N����������	d��uQ��L��W�^��D��={����c�"�rz�D���P
 � � � � �@J�(c%:���R�|���.i��i��[��YKM�i�6�lG�\.��~���������V�{�@�v�����T�1 � � � � �#m���{��������tA���1�*|��)����N?#�A�zNt�`h[����n � � � � �@��
�<�$s>;hWoq:����&qYCpF\��DH�� �j�TS? � � � � ����nKaN���w����jkk�������s5V�i���{�`���.� �2���V�X!3g�&M���)Sd������:@@@@@��t�o>;'�3������n�8�Yj�-�^?��z��F ����7�����q���-;�+�@@@@@ ���n�TWWK����=���u���n;����������#�O9�]</�2 �@VtC����;/&�H�92�i�!� � � � �$Y�9���j_h���~��_�R�|t	����	_
;�5�]��G���:��0a��[��5�#� � � � �d��/������R{��u���=	�]�(���:�zVv@@@@@�T@������5�.\�������H��)�7#�@��e���I@@@@@J+�z�z))X��{�Lj�GI���[A�	h�~�0��y{N;B� ,v@��D@��� ���=z��@@@@@�X��v�R�/�{xN��t���D�O���q�f������v.�K\������g�@�
d}@�j�*���+L@��������.{�����r9u�TL�}��)2{����p1 � � � � �@�
L�<Y<�bY��.%�O����z�i0���e:4����l}w�t?�Fjkk����������g��v�vx6����*++��������~�t�w��u��������1�� � � � � �@[�A\uu����KJ���m��f���^�*���7�h�	���d�@�t�G���;�7w����u�~���.�����#��'�'B@@@@�X �/�C;
�jw����v��J���~���~�v�����G�I�����#}��t��Y�t�b^r��}�����|�<4 � � � � �@rL�W����.9���@�d}@�/���(��#�� � � � � ��vi�h �]��� � � � � ������M���Vd�2��@f	�|@WSS#�������K�=�����2~����p@@@@@��Z�t���.���`}kU�z��i����N��v�v>�h��
����o�����R__�����e��%Q]�E � � � � ��"����:��#��E�kHo��
�������bz;;v��z.F@@@@@ �H�]��fd�T��k�6V
�����1y��E @ '��{���9
�>�����W^)�/�N��������� � � � � �@[8C;���Gt(�d�vu+�a1��.��c�v='���b�F�TMf]�z�j����Wn��V�> � � � � �d����������M'��S��@�	�d@WSS��^{��
@@@@@�E 8��9�<^/=����-���E]�i���'�_��
@@������kU���q%.���*����H���Ef6��dVS���'���|�X�L���N�VZZ��]f*�K^bu04L�2"  �w����p8�{����={�����^k��>��<=���� @��Q y4f�x�z+��vy�ic�����������H���3g���c����a��� @� @�������(]�@�
�%BW_}uL�0!������G7�|s���n�����X�I @� @� ��j!���#��W������H����a������.� �.]�vX��z�	'����&��@� @� @��$P)�K������A���=o������$�_���O��_�7�
�f����s���u��5��� @� @� @ �o�voq����{�]�#����X�
�%� ���/��v�u��2eJ:4z��Q:��P�*��m� @� @� @���4�-X� ���|�������J��|#���@!����?6n����v�i1{�����W9#�	 @� @� @�����������,��G�XH�Bt�V�*M�7��M�\IC� @� @�����������
�q��YY�pa2���@� @� @���@��.y]U�XL��l��p��(d@7j��=zt<��q�E�o~�����g�k� @� @� P��v��K[B���T����y����b���q�m���'���{]'���" @� @�V@hW�������^�:.������_���+�
��#G���fx��iq��W�� @� @� ��M�v,����A=���j\��[��5wE$?������s����G�YN�m�2�[�xq\u�U��<�����T���6 @� @� @����vu���{<���r�Rh�z�/��o`�GJ������_^�W��@!�!C����cc������C=�Y�U&@� @� @�jC�C��/�o����.�
�pE����{��`���r!P��n��wO��\��A @� @� @�����u�v���#Yu�����D�>�7�:�tM0u����:�L�4 @� @� ��u����6����a9y��Ut
U:�����O� @� @� @�m)Ps�][���E ��=����w�	&D�.
��B���7����{��gp�-��� @� @� @�@gh�%��\�|y:��|<��.�t&}�R ���^w�uW���^��}(>���!�R%����Pn��9q��7�/��X�zu�����o~s��� @� @� @����>��A�����C�����j>(����-oyK:��>�l�%A�~����u'N�=��#�8p��0n��%^|��X�ti��/�����q�-���\������* @� @� @�R�����F�:�*����S����T���������~�]vY<����4<��Sq�Em7%;��S�
6,6n��=�\,[�,����v���?��?�)��R��� @� @� @����B����6\����v6�4����ur�\t}���O��q�9��o���������b��5�M����c�����v'����.����?�O��{l��v	 @� @� @��	y����Q�s�����Q��?�\tu���k�x�{���$�����������|0����>�r��
u���>}����C#Y)��q�q�QGE��� ��� @� @� ��'������J:�\�{���6���#]��p�k����_�[�n��W�g���4O�B� @� @��P��+��XM����
��T�S��@��r���w�t�\��� @� @� @��J 	�l��Z��) @� @� @����u�� @� @� @�@�t�lC%@� @� @��x]��� @� @� @�H@@W��6T @� @� @����u�� @� @� @�@�t�lC%@� @� @��x��������/�O=�T<��3��G�8���g���1= @� @� @���@�W�=���1}��4hPL�81-�r�)�����4�,�q����)Sb��u��
 @� @� @��6�{��g����n�!�m������kK�C��'�|2��������J� @� @� @�d(l@���|$V�Z��M�4)���o���"����1u�����9s��O� @� @� @�Y���w��{��)��?�����{�����#F4�7y��������F�;H� @� @� @�Z�Bt���O}v�i�������������Y�bE�uU @� @� @� PI�����KS�=��#�w�^�'=�r��&��@� @� @� @��Bt�G�Nm�,Y���o����N��5���* @� @� @��$P�������]���������l%��]u�]w]Zg��1�:I� @� @� @�)�Bt���N?���f���1i�����~u��|��'��[o�s�=7�N�[�lI��6M�:O� @� @� @��@�/`+�����w���w_<��S�gR��.���H~�o���gb��	�) @� @� @�h�@!W�%J}���{��'f������)������7b��Y���� @� @� @�@C���T��9�<xp\q����}���n�����O�9rd�1"}��.����� @� @� @�@G:���w�}#�� @� @� @�hk��>���a]� @� @� @�@c�_A�z���?~<�������6fT:v���GY�W @� @� @� �\��t�6m�s�=7~����
�r;��cc��9U�U� @� @� @�@c�
��8�����~��I�c��u+{�	 @� @� @��2�{��K�\�}�����;.���'�g�}�1U� @� @� @�@Y��iT�&���<P�w����1cFi_� @� @� @�@[
tm�����W�^]��)��R*+ @� @� @�hk�Bt�rH��w��]��@� @� @� @��
��7.���w��3g��%K����� @� @� @�H
��d�W_}uL�0!������G7�|s���n�����W��$ @� @� @��J�
��
g�}v\p��t��8���*9��N8������d= @� @� @��(�#.�Y�f��\9����[����� @� @� @��Z��+�^z������\B�u�]c��)1t�����G�x�B5����O� @� @� @��@!����?6n��:�v�i1{�����W}e @� @� @�m"P�G\�Z�����o~S8W�P @� @� @�hk�BtGqD�u����� @� @� @���(d@7j��=ztj{�E��M�����	 @� @� @�������t�M��_��������O.����� @� @� @�@[
to�����W�^�^zi����+����&F���W���i����/�T�9 @� @� @�
�-^�8�����`���H~*m��t�9 @� @� @�M
2�2dH�;6�n��$P�
�zh�]e @� @� @��(d@��������� @� @� @���
tm��'@� @� @� @�z]�Vj @� @� @�h�@!�M�6���k[�� @� @� @�Z*��w���;7���w�����K.�s�=7^x���:ujl���Y~��M��/��YmT&@� @� @� P_ ���y�b���������4�[�lY<��c��*<��z* @� @� @�('���n�����[�t��|�;��a�����c�����C-��8 @� @� @���r�M�0!V�\�z����{�(C��T� @� @� @��������k-]�4y����.k���E� @� @� ��Bto��#YIw�u��F�� @� @� @��
�m��-���[���d� @� @� @�@�
���� @� @� @���:�L�' @� @� @�@.t��F� @� @� @��,�;KG[���]w]��7/���<����'>���� @� @� @�
���;7����3�<#���� @� @� @�@*����"�����U%@� @� @� ��@�V���93N?���<����VYS5 @� @� @��*��}��c���o\�Q @� @� @�� ����� @� @� @��tu>	 @� @� @��������� @� @� @�@�@!�A����/V�Xc����O @� @� @�"P����[n�\7%@� @� @� �P�#.��'@� @� @� ���6�ui @� @� @�
t
E� @� @� @�hC]��4 @� @� @�����"�	 @� @� @���@�6��K7!��K/����x���b��A��_���/��r����?���1w��x��������������c���e���m�vIGZ���@� @� @� @�tb]M�5�\��~�^�:�A�^����s�=�{l,\�p���-�����q�����9sb�����|���m�v-���w� @� @� �#];Of��q�q�w4��/��Bu�Q�d��������y�{�_�~q������~��8��#c�����������.�qK��:�@� @� @���@��$H�7o^,[�,]��m����|����X�
M�H�q��������S�NM��f��&ZF\y���p��NH���K�����3�`�����K�������?^�f��Y�%7nI�R� @� @� @�9(l@�i���1cF���?���7W5��sL��_|1
���k�����'�|r����$Xkj���k�*��\�%����^ziL�4)������em��]����M�? @� @� @���
t������g�?�����V���J73fL��'?��<
�*��n����x�����)S"	�n'N�a������g����i9k�������m�i�!@� @� @��X��+��/_W_}u:���w��N;-�������C��h��;9��-Y�6}��f_��'�(��<yr�\����K��<Fs��-��K/E��}#k���Zr���� @� @� �w�Bt�)y�%��9��S����u������;|�9�tl�����gm��]K�Y����( @� @� @�r&P��n��U�i<��SK�Z-$����~���w��.	��-k���Zr������
6D�c#@� @� @���@!�I�&�����'�M�V�����/�\��N;�T*7,t���+����m��K����
����I��t�����k @� @� @�(�~���{�[2��{������Gy$f���sL�����R����u����M�=J�z��������.�iK���������v��[ @� @� @�(d@����7��~x�s�=q�I'�/�����WM~v�u�R���YS*7,�O�
����6k���-i�pLm���.��{����pm @� @� @�ju�N!���W��_oy�["	�~������#��;���$��L����Iu[����g���u]��Y�%hI�� @� @� @�9x��e9`c�Z�xq\u�U�����������C=�����X�i>�`��=�����n�������r��Y�%7mI����C� @� @���@!W�
2$��[�nm��z�����Z��<����~������K�,����������j���m�v-�g�a�!@� @� @��\������,�4S;b��?~|<���h����y���]�g��Y�?���K��m��Kn�����+ @� @� @�r*P���#���;��
6���r����i��������������1cJ���vZ�%�n��e1y��x������.�n�!�;t��8���J��B��Y�����u� @� @� @ ���?���/OW�%��������/��v]�ti�i����c�i�z��m������sI��o����9�������k���7�y��W:WW�s�=�6�{��;�~fm��]r������v @� @� @�9����4k8�;�~����{����'����p,	�_��Wc��u��n��Ip�0<+Ww��w��T��]c���q����Q���
������N�������m�v��[��a�� @� @� @��$P�<��759y�b���"l[�l�w��]q�]w59�!C�����#G�l�n{UX�fM,\�0�����^��m��Y�%�kI�f
���S�LI��H��*XT!@� @� @����L��/o O)�#.g���]8w��QG���7�����w�%�x{����SO����/z��Q_����������m��K:����� @� @� @�F
�%�����tJ�GI�|��q�q��0E����<�����[b��y�#&�:���9@� @� @� @�Z�B��������7�FW^ye��\rr��Aq�������������? @� @� @��
2�{��GK^�y�{J��
}���i��������X� @� @� @�T-P��n��-)P����o��Mb���oM����M�U� @� @� @�@%�Btx`j������?\�'=��������_�uU @� @� @� PI����Q��W�^���>��X�vmY�_����_�:=?v����� @� @� @� P�@!����g�yf����n���1k���;wn<����p�����~��~z|����Gb�|�����C� @� @� @��@��gr~�������C<��#�l�����*�������S�V��$ @� @� @���
��.AIq����>f�����r��;�_����[��V�*� @� @� @��Z�|2U�%:o����W\���s�����~:-Z={��#F�?���#�g#@� @� @� ������o�H~l @� @� @��Z����lkX�'@� @� @� ��@!�M�6���k�c @� @� @��U ����;wn��]���7�%�\��{n���1u����uk���M�_|q���L� @� @� @��@��y�������1�w�}i@�l��x����;TU<xpU�T"@� @� @� PN ������[�n�����w����
��c�6{���Z��q @� @� @�U	�>��0aB�\�2z���{�NQ�,�
H% @� @� @��)���.����k�� @� @� @���]3���
_z��X�|yU�X�fM$��n�ZU}� @� @� @��(d@�t��2dH<8�{��r6���~��������K� @� @� @�d(d@�y����k��V*�+}�����o��\�	 @� @� @�T%P���*�z�����t��W_�wT� @� @� @�@��7�I�l��c����s������A�x��1h���~�B�(�|0���?���^��2 @� @� @��f&�������+��h���;+w�}�{_�S� @� @� @��J�0]�n��i�R�=�#�H|�s�k��c @� @� @��(L@w���'���$��<��3���W_��
k,	���_���>���F�8H� @� @� @�9�	���s�H~�m���%���:*��w��� @� @� @���(L@W1Y1���~5�GW����O) @� @� @�hS�Bt�{��/}�Km
�� @� @� @�(d@�i��H�=�u���L�;��b���e�;A� @� @� @�)�Bt�������n�f���s��n @� @� @��#��9��R�[�n����o�L�4"@� @� @� P'P�t��*o���&qy�u����^�Z���?��N:���' @� @� @��L���v���*����$!��>��8��c���o��� @� @� @��
��e5�H��k��&��c���������+��K� @� @� @�y�&��������MZ��;�h��� @� @� @�*�*��gW�X�~���KU�V� @� @� @�@y�B���<��g��_������nHO���o���= @� @� @��(d@�����>[�nm�k��m�r��X�lYl���T����/� @� @� @�d(d@��s����s��u��'��?��f�Q� @� @� @�@C�Bt�
����o�:�!L��=c��!���{��>���6mZt���a5� @� @� @��%P��n�=����7Je @� @� @��!��5.� @� @� @�T'P��n����O��Oq�Yg�����eK|�S���N;-n���F�8H� @� @� @�9�|�e����q�5��V��Oo��k��ix������O<������ @� @� @��B��+�V�Z?���R�d���G�(U�.]�����=z�k���_~y��$@� @� @� P�@!��s�F���d;���+Z��������?�3g���u�$@� @� @� ��@!�+V�.�#,��o���b��1i���{���* @� @� @��$P��n���JM�n�K�,����[�hQ����&��@� @� @� @��@!�q���L>����wY:X�p����M7��y���^��" @� @� @���2��m��b��i��������������e��E�b��;����:+�;����iSZ��c�i�� @� @� @��	t�W.T������c�����c��1c����?�����3��X�I @� @� @�M	r]�2l���7o^|��h��{��������O~]�t�X�I @� @� @�M	v]3r����������s������x��gb��w�����>���>
3)� @� @� @���@��:�w����� @� @� @������
��_~9�z��t�\�=��O��={���� @� @� @�P����K�����������A�b���i��SN���������b��q1e��X�n]�� @� @� @��,�
��}��8�����n�m��mg�v������C��'�����+n����q @� @� @�Y
�}�#�U�V�f�&M�o���*������kL�:5=<g����� @� @� @�h�@!���s��{o
���>����8���b����M�<9=��?���� @� @� @��V������S��v�)����7i�����uV�X�d] @� @� @�T(d@�t���d�=�����W�I��\���:* @� @� @��F������S�%K������tz����:�F�j��
 @� @� @�*	2�;����k���y����g?[�'}W�u�]��3fL��N @� @� @�hJ��]����O?=��={vL�4)~���E��/�|�����[��s���S���-[�@��MS�� @� @� @�('�������������}��O=�T�����K.�$����g>���0aB�C� @� @� @��-P�t�R��}��{��3fD���s�����7����5kV�q5 @� @� @� �P�|2��f�W\qE|�s�+��[�xq���'F�#F�Hq��.��p��D� @� @� ��
�^}��t]������?�&`���z�����+Us� @� @� @�@E�B>�����������/|!6l�Ph��m���?
����� @� @� @��D������c��������1v�����[5|��Gc��I�{���Y7nl��� @� @� @��(d@��_.y�e�%��;������N�������Uu_����C�?������?GuT��@� @� @� @ �@!���������g����O~��3fL\x��1n������t�]r2y�\���+_�Jcm @� @� @��
�%����w|�k_�Gy$�>��d��U��E���}����.�(
��N��� @� @� @�-����yh;z��������#���z�4�~���������#J� @� @� @��T��+�������8������s���J|�C�y���U�I� @� @� @���
�V�X���'��[>��c)d�=����d6,���c��	q��gF��K @� @� @���
2�[�|y$������mK
���w����={v����3fD�.]b���q��W������rKK��'@� @� @�(�@!��K��V�
0 ����G���8 �:���?��������� /9�r����w�[���� @� @� @��T��]�>}R�SO=5�x��8��3��r
1'M�����/�0z�����} @� @� @�����9h0f��X�xq>������+����D����KM�W�@�Y�q��K�����4���g�~�^�} @� @� @�M�Bt����	��d��Q��{�l�)p���M��7����v���g���F�ox�>�&w�� @� @� ��@!���>:V�X���GuT#��}�����8>��O�����W��,����������N#��\�p�����F]�7*�}&����  @� @��\
"����{c�����3���%�N>�����{�q����;	�?�|�]�6�{��\~	�@�$��f���K"�US��u:K�x���kE��#������?��a����e�*����!`�g��'�$@� @��D�]%���r?�pZe��u�t��;G����m�+�YL;K?��N[��Y�����I0�e���,���q�=wq� @� �7��ty�P�!��<����������a���f�vA�5�Bn?[����������Y_D� @� �9t�s���@�
$T����Q7'�ars����5L��n����N{����^MV{����eG���e/� @� @�@��Z�����4@IDAT	�_�9����{�����C�3���=<����Y�dmW������d%]ko���^��VB��(^9��'�r���=n�8�����a� @� @��f`�J������Y���#��#��leKm�G��7k��?-z����i��jm+���^��G���h���Y��Y�u��	![W��cY��}��xk�z� @��K�P����c��y������K�Iy���������K��>	����3�O�_���1������H���4.��9�Y�Y7��;X��uu�>�*L�^�$����n|M}f�gS�u�@-	d��gm�c��{��+!��?@$+=�:ix�V�g�GX�g�H�[����Lo	 @��]�P�W���H~�m�|�S� ����g�-~�s�C::L���������_{nY�!d�Y��F�6ZE��G���@��Y�55������}m���rcln�Y�:
�8�_�C-�o����S�x�r�g�=�q�� @��
����#6n�����{�g���m�>��$�G�S�{E�#P�'�P�������|�v�h�BV�")���b�[{'�����w8��������fg����ol�mzV�������=���� @��(D@w�5���?���z��NB��}�c��kH ��������&w�9��d���vI�x��v�����
VBv�7���j�N0I�����F�;��������-Y�Y��������������1 @�:V�����#�� @����e��\����F�v�<�q%d���wB6�O ����k+�j_7'HL�*Ll�K�'������F��U�Bt���_ @���V{6i��j�8���Y�%��k������o��1r���TM�tt�QuGU��@V{V����zm�%���n��D�Kl�o�k @�jO@@W{s�G @�����������in�m���+<�����������s�y�-��<��������t��\[���{:O�%�~g�����4�����M��Y��X����~e @����u��; @����J�?-~�����\����u[\3W{��l�e�����E��e�gk����r����T[�)���/�����>k*t,w��x��Fmx��}mx�-H����rGW @��b��=�FO� @ �@G�31S�5ju���Cyk_/p[��lN����������\07Y�GY�u\��1B��g��Xwq�4W@@�\1�	 @�H�3����z�gs�����m����0�9Ab2���v�WP��%��h����?K��������3���� ���c�����S&@�y��uf�� @�@;	��[;A�M&�=��V���j��M�0�N�'���(fmWi��<������B�r�I�t�J�u�t����#��� @��M�j�v�n�U��j��u����,�K�W�g[�,u��I�-*}g+�kI_����&})w���S[(�����l� @�(��?��+���~������K�4��AHs�Z��>�~�����
�c�j�q���C���T������`�k���yl���u�.������a#�?�� ������'@� @�
-���K������Y�%}i������5|BVRu��J��J���v�:��mf
����f�>���dur�5��c���g���ck8&�tN]��7�&@� @�r$�}�m7�Y�������w��3k��� ���c���d<�l����v�7[U�w�5�t�7�\�T.��������Q�P������:rs]���) @� @`{�K���^����i���c�5�{Lj�k6OS�j*�W:W���W�;[?ll�\c����K���o�T�+7�6����@�@�j����V�����I]� @� �]�m��i����8�J��J��f]�X.0�&})w����m���;^�����d%���@S�&+�n���C���k�;8�_�:oy�S�*V{V����t�{��� @� @�@�	t������Cm�@0�n�P0KU7��lk��C;t��]g���UHV�tV�*j���t�i��� @� @�@xkuP)��t.�zk��Y��/y=��� �Q���w_ @� @��X������v����;^7����f
�5��FG������a�r����.�Z�*�T!@� @� @���*��%Wi�P�~ ���}���!o�����i����SW����u�� @� @� @�@����'���W<�V%u������,ab����Qu|�j�������s?c��]���" @� @�t���g��k�l\w��z!��������g��|�����)��������<pD����	�r4��B� @� @�HV�%?�W�u��\�[ �~�i�+V{�{����k�3N @� @� @��N.��ru?�|(��C�d�g�����j��T�s�
����� @� @� @��L�j�N6a��]]+A� @� @� @ ��G�f����<��s��^ @� @� @�tR+�:���^�:>������k+�x��1y��F�������_�"���?���s���_��;,����O���7�.9��m�ve;� @� @� @���:�&����������#
��{��8��cc����]c��E����4.����3gN������Ov����n�8@� @� @�������L���[K=5jT�����_�p��G��M�/��Bu�Q�d��t�}�����=��_�������O?�t$m,X���/]#k���J7V @� @� @��T@@�	'��[n���GW��+�����p�	��_�*�t����9sf�=��C�t�������K���6k��� @� @� @�9���qV=�k��6�8p`$��p.9��O����KK��u���m���oe @� @� @�@ty��zc�;wn,^�8=2e��HB�����c��a��;��#^}�����m�v
�e� @� @� �G]g����x�������K���dE�����eK���Ki9k�����I� @� @� �W����3{����n����u�}��'��o�4`�����Y�|y�����K����#G��\�2���m�����Rg @� @� @�9�u�	��w��C���k�������G�{�\�j��A�~�vhWw���$�K��m�����O @� @� @�@�t�dv��r������HC�e�������<������},�c_��J�z���K��v��TnX������n��!=��m�v
�����8�����r} @� @� @�@����N��?���V����������/��~�����^'�xb��moK�{��]j�n��R�a�G��C={�L�Y�fmW�@;���K�.m���
 @� @� �����o�[Vu?]UL�[i���q���������;c���1g��R@�����:�f��R�a��t��A���m��k����OV#VZU���w} @� @� @����{�x�SO=5��.��Uuu�.��RW�J]�����.���6k�����g��=����n�> @� @� @�@;��B�7_<�� n��}��m���W�=�����I>�����n�������r��Y��7� @� @� @�@�t9������Fr�A��Gyd������T�_X�dI)�;���"y�c�em��]�>) @� @� @��* ��3�u��HB�r��O>����J�;��Ry��1~��t��Eq��7���f��YW��O?�T��6k��� @� @� @�9��N0���)?��8��#���N�$�K#�|�������[��V�[�.�'?��8��C�U���H�%��e�b�����k��e�]7�pCzn���q�I'�J���s��!@� @� @��L@@�	&�K�.i/����O�-	�����v�9�D��k��&6n���w^�*���{�����{���\��Y�mws; @� @� @��
x�e'��C=4f��o{���.���v��]c��Qq������N;�Tw�����={v\x��i���6lX|������;��o�R��B��Y����' @� @� @ �]�q��|��w��>��`�0���W�3�<6l�������#wX���8��Y.�!C��^{��T���gm���v7o��)S��]w�����2 @� @� @�Fj)���TS<��F�0��5p��H~Z�
0 ����Y�fm��� @� @� @�jX�#.kxrt� @� @� @ ���� @� @� @�������'G� @� @� @��' ���� @� @� @�@
�jxrt� @� @� @ ���� @� @� @�������'G� @� @� @��' ���� @� @� @�@
�jxrt� @� @� @ ���� @� @� @�������'G� @� @� @��' ���� @� @� @�@
�jxrt� @� @� @ ���� @� @� @�������'G� @� @� @��' ���� @� @� @�@
�jxrt� @� @� @ ���� @� @� @�������'G� @� @� @��' ���� @� @� @�@
�jxrt� @� @� @ ���� @� @� @�������'G� @� @� @��' ���� @� @� @�@
�jxrt� @� @� @ ���� @� @� @�������'G� @� @� @��' ���� @� @� @�@
�jxrt� @� @� @ ���� @� @� @�������'G� @� @� @��' ���� @� @� @�@
�jxrt� @� @� @ ���� @� @� @�������'G� @� @� @��' ���� @� @� @�@
�jxrt� @� @� @ ���� @� @� @�������'G� @� @� @��' ���� @� @� @�@
�jxrt� @� @� @ ���� @� @� @�������'G� @� @� @��' ���� @� @� @�@
�jxrt� @� @� @ ���� @� @� @�������'G� @� @� @��' ���� @� @� @�@
�jxrt� @� @� @ ���� @� @� @�������'G� @� @� @��' ���� @� @� @�@
�jxrt� @� @� @ ���� @� @� @�������'G� @� @� @��' ���� @� @� @�@
�jxrt� @� @� @ ���� @� @� @�������'G� @� @� @��' ���� @� @� @�@
�jxrt� @� @� @ ���� @� @� @�������'G� @� @� @��' ���� @� @� @�@
�jxrt� @� @� @ ���� @� @� @�������'G� @� @� @��' ���� @� @� @�@
�jxrt� @� @� @ ���� @� @� @�������'G� @� @� @��' ���� @� @� @�@
�jxrt� @� @� @ ���� @� @� @�������'G� @� @� @��' ���� @� @� @�@
�jxrt� @� @� @ ���� @� @� @�������'G� @� @� @��' ���� @� @� @�@
�jxrt� @� @� @ ���� @� @� @�������'G� @� @� @��' ���� @� @� @�@
�jxrt� @� @� @ ���� @� @� @�������'G� @� @� @��' ���� @� @� @�@
�jxrt� @� @� @ ���� @� @� @�������'G� @� @� @��' ���� @� @� @�@
�jxrt�����}�MM�DEA,Q�^Q<AP������`����gogAE��b���
�H�W,�������	���n����>�y��g�)��;�l6�� � � � � �������l � � � � � �@
����C�@@@@@@�O�]��S�@@@@@@�������h � � � � � ��'@����)[� � � � � � P��jx�P4@@@@@@�� @W��-B@@@@@�at5�s( � � � � � �@�	���}�!� � � � � �����9
@@@@@@�����>e�@@@@@@jX�]
��� � � � � � P��o��E � � � � � �5,@���wEC@@@@@�?t��O�"@@@@@@� @W�;��!� � � � � �������l � � � � � �@
����C�@@@@@@�O�]��S�@@@@@@�������h � � � � � ��'@����)[� � � � � � P��jx�P4@@@@@@�� @W��-B@@@@@�at5�s( � � � � � �@�	���}�!� � � � � ���@�.E��o���>�����f��1��?�0���������m��L��m�`+�@@@@@@ ���V�Y���i��&�lb>����%'L�`���s��W���~�t��!k:o@@@@@@�z @W�{�����?7�z�2�&Mr�����0`�i����������?����������f��n���j@@@@@@��������������rKW���+�0g�}�y��w��k��<�N����lV8l, � � � � �4kt�z�Wo�o����"�,b4<�s+k����������QGB@@@@@������p;��c&N�����O� ]n�`�
L�����#F���9w�#� � � � � �u)@��.wk�n�����w���px@5�V[m57��?�4_~�ex2� � � � � � �@�
���]�x��W_+���S0�;��r������`�@@@@@@������m�m��k��U�R���+��@@@@@�3�u�=lN
|��7A)���`8w`�9g��g���;����}����� � � � � � ���rEx�X`�������_�������;5�<��
5p���7��X � � � � � �(HK�M�6AV?��C0�;0c��`�K,W{�w���^�#� � � � � �E��aB��-�X�hT�n��)�|
�<x��	@@@@@h����5��Yg]
�k��;��6�?�M�k��L��m���@@@@@@��������m���gP��{,L�4��]�~�L��-��F@@@@@�V�]��������;�����L�0��u�]y�9��S�q{��g0� � � � � � P��AW�{���o�����W^qkW���>3�{�6�������+������-���f�m�i�R�Z@@@@@@���5�y�X��A���o�i�j~��7s�QG�m�2�,c�y�3�|��Mc � � � � � P�4qY�{���k�9�47�p�9���L��]�s���}��f�v0�=��Y~���� � � � � � ��A�t�a/7�6*(w�g��~��|����]�vf�e�m��Z@@@@@@���5�>h%h������{��V6@@@@@����(�!� � � � � � ����A�@@@@@@�(tQ:LC@@@@@@ et)�� � � � � � �Q��t�� � � � � � �@��R%;@@@@@@��E�0
@@@@@@����Jv � � � � � �D	���a � � � � � �)�K��@@@@@@� @��4@@@@@@R @�2(�!� � � � � � %@�.J�i � � � � � ��,@�.eP�C@@@@@@ J�]��@@@@@@HY�]��d� � � � � � �@��(�!� � � � � � ��@���#;�Q�����q����_��m��\~����-4��3��Gi&N�h��c���k�.����Y����Ks��'�w�y�,������������;�5�\cf��eN<�D���[���=�����k����j\pAs����u�]��<4��an��&��SO��>��L�6��i������f���3�����s���[T����?r}����������)��o�a��������7�;&�������:��l��h��,[4S;��|9V�T�{���o��6���|`���{��cG��[7s�A���_?���_w������{z���7]�t1����8p`�sc����X9_�/���;�^z�%�����y���,���f�
60�s�Yh���
�u���#G��/6b��A�g���&G�Oz��9kZ���9��$�z���?�x��O?�Q_u^���w�i~$���D}��y���9N���b^y�3n�8��_���#8���p�=P�o�b�1�6��I��(�J�=r�����x��w����o�O���'�x�y�G������$��I�-����1�2*'�������1�!!�@}
�1"��y�����;c�yc��H��]t��y�x���EY$Xn�y�-�34}�)S�d��r�`��=�����0{�7��QW\qEYyhf{�����kVYr?3��w����/yy���:���|����eQ��_���k�����s�;�6"��y��F$��c5����y���}����l���t�IE������g�����?���n��F��D���l���Be`\m
�����@L�cM�^�v�2�!��
���W�r���Yg���G�i\��Z�k[���9~��,��x���c���G�H�W~{di�#���K�g�����E�Y����
�'�mV0CF��@�c5�1U ��G8O]/��������7!�$�6�����87�]�B�d\��}���+zS���J#!�T���o��Vtsg��i>��C7]�AT���4a�s���^T�p������36}����e��p�'�Km�����I���4{����U�|��oo��b��z���t����������[���N��+��B��,������.8���)��|o��vW8��P����^��?��`���g�}\���y3��QI��X����i�^z�QM	%T6k����7#�w��?�ht�G�������:t�����-Z�p��V[m5�������?�>�`�}�YW������Z�����.)V����ZQ����\�_|q�������S�>���Nc�����{����������������PZ`��f����uKV��iR���	o\�e����q�����#���p������B�����J~�T��{[m���V��*���Z$�5���>jT[I������z�]�������������XMrLEI$�������Y��������%�\�]3?��������6!�$�6�����87�]�B�d\e#�b���G9	��,p�����}����������CK-���s��+����~���Q��}3�[�v�P�� Y���M����/��7���������mr�-kcAW^yeYVz_���SMN�Q���9���H�N�����Zk����y����1��
��I26 �9��C3����
9��s�cF�����5=��Y��I�/�jh3y��N;�������zR^�@��o�m���M�5�t�}��'�����������������'y��uI�2�l����260������je�����c��o�)���9�L�\:�'M�{���[b��kR���9��d�p^:��s�{���Tr��^%���I���q��]\��~��_���;�L��cd�'�m��ojN��c5�1����c��1�=	������1�		$��I�-F���X�2++���5��;��Wz$h~��a�t�O������#����[T}��������������{���g�[n��E�V'�������a�V�����������?��3���/��6}R_J�}���Y����O>	�����Z���K�3m���&��6R}%�&�O��+��,�'w�Z������%p��7�~��${���2�,cll�q����5���4c�	�����&�q��U�]�5C�o��.�p�,���g�z��G��������}R�.��{���L�<�-��ym�qS�-q��|�)P����V$Y6�O�a~{%����q�����{
��W���E�$�#7��m�`�$9�|�^�^{������^�����qMP ��M�l1����j��XY_�]|+�D���x�	���6��#cV_}���W�
�'��r
�{��%����a������~��K���#��u����n2K,�DE�e�����f�5�*�G�0�~��������zj�p�������)����0[����>�t�MI���+��Z����p}	�W����b�r�c5+��&�l��^m?����_=N2��uI���l�
���Z�*���\rI0���q����&'P����I��y���o��$�/�j���5���[i����l�f��G7^f���<�6(	$9�B��=u���w������<u���p��H�o�,����X�2E��i���sb.�^�6#lc�7,���C��(��8�k��e��v3��
��D@�.�R���~0������X����cV�?���I59�Mc�Q��8?��f.���O���,��Y��Z�Z��C ��������,�]��f�`
�O2��uI���lm
�����SR�q]�t�]���G��_~����l�2����I�[�Z3��@%�s�F$Y����+������|�q���;�b��X���f�w���EI��J�_�����^s�:+��K��$�6��Q�I���*STy�O�E����Y`��q����v���~��z��7w���������b�-��HTS���N3�q�����`���J���m[s�q����>��6�|ss�I'�����|�����s��G�)��Rto���9��s]����2V^ye�f�H�%`��2��Js�=w�p��L�lT�����%���=������+�@������R�6m����O�+H����u�\�
��������B�i��S��J�=w�������I~�$Y6(@���W_m���jH�w��;W[m�����E�T��r��VG����i���?�l
������5DZ��|����@��j�c��8���� k������#Fu���\�%��k�iz��i���$��I��Jrn�V������F��3#G���H ������sf�|E�ok�<l
���E�a��p�[l�`\��DH�W�y�^?=�Pp��j�fo�f��7�����k�D��A�`|��n|����
��wc��O�>�/���`v�����~�3�_���H/�,#����9[n�eY�d�����c5J��M�A�����c���Bx���i6��������m�8���h��^��4�K�]'�����W6c��~\2:��]��W��ou�=a����)���7�33g���l���n)����@��9I�-�����������3���2������G�i���5^��~�����������/�d��n�`�[o�5k�$���2�M�	Tz�&=�JATr�a[�
�a<	���e
o���]����@�}�d�RB���Y�Refz��m���/zS���%I4gu�;dVSA��G�z:����r���9�?GB���z����?M���]��i�K�����<77����S��������4i��o��2j�S����5���bcp�'�T3����2�����ZWR�95�7%Y6jq��X�Rl~��5co��
W_rj6%��7����o^z�%�f��4�j���3f�y��g�E��`���4�K*Y/�����O������o������y��F��ulp�f������iI~�$Y6(@���S+
�A�ka]�~��g����ws�6�^{������d��:��k���>:�6@�����[f��7v������{�k��
�����6�y�Z���I��R�\{���OA��a^7��d5���:��U�/�`�C�����.�/t�5!�d�&Y���Wzn�f�J����F���A��du/�'�����!�R����e�|�[�6���Sp�D
���Qq�����1�[K.i
:T���Y��U���	��s2`����?��)���~�-3t�����,����Lg�i
���3+��B�Oms��7$��Q+I�/�j�n�N������X5������k��y�B�����k�#����%q��|�/p�
7d�9��:(�Z��fWp���[o��2�FC�
�A��mV�-k��|���%��3C���8y1Om
$���d�J5l3���w�9���x��wJfG
��DMz�J���:���������
�����u����2�M�	�=V�S�6��k������W^�����f<3�W��6lX�t���@�}�d�R"���Y�Refz�@��=?�4��'@�o��Z�����
9}�u�l��){{m-��B����[�]	#���h�.�,��R����kd��hY�%	�)o�D~p<�Y-�4�[��!C2�Z�
����c7��	i}������f��7MK������S�?7�t����II���?�|9V���k���?4���O>�d������g�����jBM��Gg��g�
��d�M"�����uIT�Lk���|���o��?���9x��A%7��.���DK�w�8�-q�b��H�;'��I4�9	_����%�#@W��.f(�/�s���>ck�g6�`���K.����3�o�}��+�pb�����>���o�`%��@�c��c����s��c���e�����?��GYpF��@�}�d�(�$��j�R?9@IDAT)��L+.����=C�h��o�������l��L��]�������e�������������c�9�
��@5i@B�\�;����'��bj��������oy��w�&'��d��0J���:�57����gc��s���������c�?*��ab��Gco��&�L����I.5?dk:�<l����1Cm	�K+������
����4���,��,�w��r�Fi��45�j�ErMk����z��o��
T����~z�y�t�o�J��JS�%����j_��!gv�ec��7��BW`}���Eok1{8�no�� PJ ���$��*W�t}N�\��V�������>�c�W(����&����,��m�����s�VrLU��q�=�U�O����_u�'u�AjI�m�e�t���U���2-��xN��@�	���e�]l����$�/.|�]��Q��~���"@WH�q����+��b�W,=��sF��3�9����`[����&)���j�#1����}
?�)� ]�����k��c�	
�<y�-�����C��G5�fe�-I�l�
��/�j�x��������7�F����S_EI�m�*X|�u�
��H����u2o��f�m�������~�
%=�3m�47I})v����l�C��@�����)4c��9I�-��r��=^�V�����
[s�<��Sn'����������y�������Q�T��k���^:�:|�F�[/x���i$��I���Irn�V�����x��91u'��#��>��m��+�hl�Um�V[m��(����������&���:�u���*4+�()`��1�8*4�n�M�8�MR�5������m����-���E�i�;w��E���k��k��P{.�F'�~�������m���f���8)��Q�W#_��(��?��O>1�YV���_����:��$��j��P����YQvi]�T�rjR����m�2����_�A�JF�y������I�lRi��Q�E�Z�A&��@-^����]�"V��p��o��z�=�$�j�1�dK��=l7A�?��kq%��o��v0J��HMC ��M�l�N�sc��U^�� @����;��
��?�d��_~qM�M�<�����[�n��?�QHU�m;�F��^}��b�1�X0�+�lJ�f���t�I��C��UM����L�6m\S��	�.]��,�������Yy����5jT0n��V
���nL�M�&��Ss��.	V�@����9�����s�����)�8)����U���|9V�����GO�*8��	>������n��8p�k�Wy�	k5�N����h������y_?��<�\s���[���O
>��q?����k������k]��JQ��R�2�����I��$��;���~�u��� ������L��3������������3��SN9���xr�Ae��$���2�M�	T�X-uL	������C5�[�lif��il����C��]��������n�a0�@m$��I��J�c5��1i�j{o5���k����#P��N��_5	���{��������R���,�L�T]������S��P�I?D}`�����
��|��q���/T���{�xS�q��
2��
:���js��m�����5�9������EY��n��� H���Z�p�=z��c�=������~���1b�+������#J��Zx����	'�k���s�Z�;7�yT=���C)��Z����$_�U�6�����5yu>U`����+������v�}w��
�-�������S�/����CA������������1�n���Z��C���5����C5
J���;��3�V�-��A�C9��hQ��o�����X�-�����-P��9v�X��z��=z�$t���C}$��@�}��������u.�b�������/�9���S��������q�fU���g�u���P%���BE8%�m����Hr�&9�$Q�XMr������w���~�������{PB�~��G����g������x��ko�P��I�m�eU�b�j�sc�2Dbd�#mn{�f�K�@�����}��w������3�8#X�^\�ZF3��n9[���2�}�Y��/S�W{!S0F6M{a{����cm���i�#�[����7��yl�#���a���M���o/���4�����\�o2��zk�r��}�#����i��`��,u�&��c5�=�j�6T�qlR���w�y#���d�	Y��7q�*�.���Z6�y�����%_t�u<�|\>�������O�:�j�j\��u��4J�����R�;w���c}>�{�����o�<������s\���>�y��>�3�w\�}���Y�Y3I��$�T������>$����c����k��
���~� ����J��:V��+-S<-�*G�w�������)�/<�M@�/���a�k�{������e=��%J�5we_lZ���}D��L��t����8)|,����j2K�����.j�\sM?��i�������i�I5������P�p���v;��c^�n�G���	859�Z(��%`h�l�o���p��I��:V������V�z��p�%����:���yj�'7����:����j�Jq�*�.)�.������^j�>�lS���U�N5�|����ZMY��b��W���|�qQ�]�W�����nz�Z����~���l�s�j����FM��9�t�jo�[n��u1P��Y�+��p^��W�Hm�Oz�W�9.����B�Z�����UjA%�lqX8�o�p>��@�c5�1u�&��P?��lw���[^���^V�����k����%�H�o+]6�XMzn��LQFLK.���F��z���K�+9 �@M�
lu����.����������]3����������eb>rt���\��T}��/&](�FF�d�zqMM�2����K[C�d3�Z��	��j�p�e�1���+�
�!PR ��Z2�"3p��atQ}�+���?�f\�+��fn&��*�.�]��O��|3S�N5:F���������w�}���
iU��[h}�k�I~��Y���S�?���U�n�-��r��oM[��G	4�5^�s��_~i&M�d�K-���E���J���q�#��X���*u�J'�����1c��RC�
w���( B��$���e��I�����>�b�lE�>}\`������/�@@@@@@�N���U�t��,@@@@@@[�]c��� � � � � � ���5����"� � � � � �4�����@@@@@@�Y	�kV���E@@@@@hlt��X? � � � � � �@� @��v7� � � � � � ���{�~@@@@@@�f%@��Y�n6@@@@@@���5�`� � � � � � ��J�]���l, � � � � � �@c�k�=��@@@@@@���f���X@@@@@@�� @��{��#� � � � � �4+t�jw�� � � � � � ��-@�����G@@@@@hV����fc@@@@@@[�]c��� � � � � � ���5����"� � � � � �4�@��.�G@@HG���73g�4-���c�9����\j��EjD��_~1��
3�L�����f��Z�&O�ly���uk��n�Um=d� � �@X�ta
�@@@�&,��������U��?���L�6����y�j�r��<���������(p�u��<�t�A���>K�&��x�
�������/�A��@�k���7� � �TC�t�P%O@@h�W_}��~����_5�.�����k�b�-�(ei������j���x�	s��7��J���s�9��W^�<�����K3}�t��[��~8k�J���I��*W]v����jG�7�|�c��f�-�4k��V���,{�J,���W\��Zy���
+�����	�\`�������:��#�4��u��/�F5������m[��_�'�cx�y�1�f�2C�5k��v�t� � � P
t�P%O@@h�/��<�����7�`s�QG��PP��W^1-Z�0�z�2s�5W�6j_|�y���J���c�u�q������p9������mMc�R�F�=z��o��t��!�U�<s?
>���I�]v�������7m�������(�����;����5�����x?p�-������w�y~t���-���oz�!s�m��+JB@@��4qYM]�F@@�P@A7����[o
�Me��o4���3o��Q��zH�����={�>}���m����}��n6�0j����w�y��[o���Z~I�k#S�)�oT[Q��m�YV��1c������i���O������y���a
�q�.8��sg�d�����������'�|b;�0��v�mW��W����}����N� � � P
j�UC�<@@@���|]n�B5k�{������-h8`.7�Z����7��?d���������Ts����Yx��<�����I��d����+k��9���>��f�4i������`T��h���	'����?���y��'�lj�s���Q9|����j��������r~����f���2?����
2�O�{����nZ�li�<�\�$&	@@�j
P������ � �
&�k�����/�<Xo�6N02b@A�3fD����`��2I���\R9o*IZ���~�[�������V��q�\���OV@���^����
q,U��T�Gy$��Zm~����`?�o:�V]uU7��o���&o�{�97MA���(\pA��f1bD���[� � �TA�]P�@@V`��)��g�u+�~��]����>�JH����<���j���Y`��"�,bv�qG����{�Q��]t��W���c7N��M=*�����5�w������~��K��kW�����'�o��6(���R�/�i�e�1x�������o�7�������N;-����g�Za}��5m��1-��Yr�%���>���jX���[n9s�����_~�(x��T+���f�*8�TKM!q_|q����z+�+w@5	��K�ZY��3��������8M#)?�
�K
��|�s_}@M���g8��5+�6ox�9���o������!���Y�9#6�|s7F��gGB@@��4qYM]�F@@��v�uW�QP���s5a��y��W��e���O]p���L���f3H9r�����p�0aB�6<�ZS���s�>���`�������_F���VZ�M���/�*����t��5<m�4�f��n��}�������`������&D�$���S~�SL�>��Y��G�p�OM^u�UY��x�
�^F�KLM*���pY����x��g\��������n�T��N������'�e��
'���FR�2�:u���k%�RC�������?�:tM1.8�Z�
��IOo��g����yP0T������2�y^��$@@�%@��Z��� � �
&���l��}�SR
�)���b:���A�W\�t�A.���]w���J��#�<����:�f��#k���s����nK#)h�ZO���`�j����j	�u�Y��w�1_��9��s��_��*$�r�N=�� 8������N:�t���>�?�x�����o��_?���r��a��(�xSMGD�
h^s�5.����P�GH�~�m����=�����8%�jju�e�uA^-���� ^�o57��
�<8X2��Y%�RC����v@�L_3�u����hx��s�mV[m57���j!�����s���g\����j���������g���,�< � �T"@��5�A@@��Pm'��W�i�����S�f�m��y��RM8���w�y�4���_4w�u�{�l/���k�R#��QG�.��w7
���j����S��i&5�������]� k�g�
7tA"�3fL0-��FT���I�&�������}��GA�e���s���]IYt���������3�(����5z�h��RK�mEIM[��J��T�J�5�\S�������O<�����x��5ht��y�SN9��W���n��.���K�1����������S@�7�����Xj����+S�JK/����������n^z�%WSV:r��j=�>(�TP=n
'
��@@���j��7 � �T]�����v�e�`}
�l��V.��*z�!��;�5����O8�� 8�G���
7���6���}���\�
$�F�')5�������?w�{��
+�b�u\ ���o�=<�y��.(�N}�i��	K��R�F���I��|
]���k=j�T��|Rv
�]z��n��������q
����/�^u�����5-�#s�j��~������������<���O�B��w��b:�+(�`���
�)x���j�S}A�	LqU�N���W��
�*��{�h����Z�*�����@@�!@����� � �
"��
(u�����U8�f��!�@^n�.(Q����
jQ���	���������Pen��MC����������N����]yt(��d�j�)�$����Y�9�����)�\�V��h�w�������/�����J��`����\s�Z�����%���J����4�S��qt�_M��O�Y}|�Y5C���������+��-�����z���z�����-����*�>_���u���@@����
�< � �4����������7�d��`�0
��i�Y�f���t�=���RT��z��������WS�Q��)�P��_~q5�j}������
8��,Y�~�����z�������������O����t���YP��I�c�=6������~�T���4'�x�Q?l�_;�����X(�Z�c)��vP@��B���i�5\N�����7���.3�����>�jU����j�iR���A;�|���W@@� @��&y!� � �@�
���T��K
��y�����fQ��J��+�9�>�T�G����AjFQ�6N�4�������UA'�|?n�}��\�/]-$��T�GA15���#��a7u�T�S�os��,����w��W��>���.TJM>�/��},����7�O�~��,�*�A���������k<|��W��+���(��Z��?��y���]���t2�)N����+ � ��
�+W��@@@�&Pd�$��:�nR������X`�r�+:�orO3��6,�@h��^�6�h#s�����W_=������'������S�A����;��c-o�%�\��s*������S����?�E]��&( ����>���j����O�>���O�[��EwMUo"�l�\lo��m�����p�F5������Mj.���U�0N���y�@@��s����@@@�Z����]PMe�o����o�Y�O���f-�/�����P���^x��~US���n�����
�����w����=%���m�����p1?5O���ul*i����������*��f(U�n���f�������6���
f��.+�}�>������O�-��/��u��Ps�����7��.s���j�K_|����Q��e)6?�@@H"@�.��"� � �@�	����
�}��@��Y�k���Q��>�����QU�ANj���SO5O<�D0:\�G��BI5oT�Gi��1y�u�;�<s�UWZ�(H����f���5�jc���[Y����8~=z�6��s���/����2�5����Kf�u�1;����t�M�2�,S��*8����^��k�c�1�������/���d���<�t�In[:� ��[����K��f+���K,L��Zu�����+ � �U @WV2E@@�j
(��f��:w�l���B�U�&��
��>��c����#G��H
�)v�
7�UW]��s�9YA�p�E�
�_;�k��Q���f�������|��]-���n���O>9<k���OMM����/-5�w�-�����oF������)������]��N�rP�j����Q�b����9��#j�c�m�qeQ�n���3jBU�7��w^������J7����������k�5�K����Zk-G���<�O�����_~ys��g��q����7�����y����_Ox�p�n�5�Ob@@��	��9g�R{
LB@@j[��K/��7��N<�������26�,���/��q��^0��~������m�(��RK��o�����{���������|mu�m��6��-fk�el`!�.���4[�)k}z�����iQ#l
�`Y�_�����uEM���}dkT��y�����.A�4���2��/j��a���zl3���m�2c�&
��|r_Yd��
2��"_���
�y���%9�����B�������w���o�6���<��#���� ���^;��];��K36��
�<��~�����@@@ �@����5joj��+o � �4-�7��2��q��BMN�������_=V��NMa��Y0^�Es����]w�5��P6�g:v�����+���^�v��-k��������W�>�V��.�����^U�O��|3��e��~�jk��l �������]���R���u�������n��j�*�Hj���\���Q��;_�<Z�ld���$�
���8������Sm?�jS_t����������8)��Q3���5����X��~�e�}U�@����.?����e/��"7NM��X)�d��N;�cZ�@5oU�N�2m�=o�������y�>}��3@@�4����F���j��F�����@@@���l���[`����Y����`2
��&������=��o�2w>��ON�I�&��?�����VX�`����o�5&LpM�Z<��������/��cf�(��S����_V�N����O���S�m
8���hS��{OA�\0/�`�9Q�����)�O���C���t���CA�8�V~Q�EM��J�R@��;5���o_7M��j�T�J����<�+����"���_f�g�������c�gP�����}���4���>�_��;����s�&em�Us���<�e���>����S����j�S��
��@@H[@���1���H;s�C@@�-��s
����XTR������S�w�+�H�_n�����z����_n*���Q@�����W��r=��5���?-�O�j=���k�uk�������(N'N�v��5����]�������2Wr,��Uk?��s_�>�Ax�e�� �mj���k8�����GE�F�������;F��4@@R�n�%�,�@@@��%n�����6��:3n�83k�,W�R�������$��V[5/�2�v��AAs���YY�=�o�R5U��� � �@��AWma�G@@�{�
6������<��f��)f���*���7q�}�-:��OP��
��s�=���3?���k~�Z./����/P�q�A~�Z�"� �  j�q � � � �P@MW>�\~��f�UW���u���W�^F5�T��-�>�������}F/Q�T�i�>
�Zk-s��V�K"� � P���}�]����#M����X�Y@@@@ W`����/�pM\*���`���G@@������5j�6�M\��>f�@@@Q`���7�;wn��j@@�e�����C�@@@@@@�N�]��R6@@@@@@�������l � � � � � �u'@���v)� � � � � � P��jy�P6@@@@@@�� @Ww��
B@@@@@�et��w( � � � � � �@�	���]�!� � � � � ����Z�;�
@@@@@@�����.e�@@@@@@jY�]-��� � � � � � Pw��n��A � � � � � ��,@����eC@@@@@�;tu�K� @@@@@@�Z @W�{��!� � � � � �������l � � � � � �@-����C�@@@@@@�N�]��R6@@@@@@�������l � � � � � �u'@���v)� � � � � � P��jy�P6@@@@@@�� @Ww��
B@@@@@�et��w( � � � � � �@�	���]�!� � � � � ����Z�;�
@@@@@@�����.e�@@@@@@jY�]-��� � � � � � Pw��n��A � � � � � ��,@����eC@@@@@�;tu�K� @@@@@@�Z @W�{��!� � � � � �������l � � � � � �@-����C�@@@@@@�N�]��R6@@@@@@�������l � � � � � �u'@���v)� � � � � � P��jy�P6@������_M��"�$�y���>2?��c�lRY���T��������/�����V*��;������@�48���I^
)@��!�Y �
$p���Z�,��f��	
��tV��w�����>���	�y��o��Qf�Wt��VX��n��4���RX�)'���d�@�	����f��i��?�h�24���m��/l����{7D1��:�x'Y�R�[n���h���7�|��7��4�C��(v.�7�b����z�m
�S��-�<eD@��&O�l=�PW#���6�m�]�"=�����/4s�9����KL�n���[�	3f�pY���K�������Y��a�f��z��6���O6����c�9��/�h�[o���&����O<�L�:��_}.�?�xwS&N�~��s��g}��Zk-�|�e�y>����o��f�=�X���_�i�t.^f�e������8�k����I���I�L��}�n����k����j�����	e��	<�����k�5:��?��f�}�1n�a�<�|�Is���}�=����SO5]�v��|��x����g�y�,��b��;�t�Qil�7�|c�t�b�O�n��b�����mU�x��'��7�l�j���z��W:/&j;g����������xh����c�F��$�I��t��@��g����T[|�Tg��3�5�9����^�����<���w��q��� �	��������kF��|�M��`J�rH��nn��S�L1�=���G�4t/����)�C���'@�'����c���.�(��jrO5{��SN9�\~����w�u�9��������f8��q�NW�����=���j��fn�������#���.�,�t�AFu�&�?��,�*.��_~������c�*���������?d�`�Zh��M0���_|����M_w���=�Pl��2^�l�>����a����e�>n�� �����.��b��SO�,��P@S���T@��W^q���z����%7��q=��Z-��l��H�����J.�w�l���X�sf�i(��=�wMU�]S�s�@�:0`�=z�Yz������_�[�|7�P�^7�x��9����������� �j���'��>��:t0����;FT����?�f����Gs��G���DM3����.;51|����Z�

�r�h���O?�d������27����GU3r����j_s�5y5�s�k�������^��k��5����L�,s�<��g����r����bT���^?��c\>c�����������yZ(����@�	�~G��S�{��sqcnU��w������1��NO�]z��� �	N:�$����$����7���{���#7V���������k��!��M����N;e��T�~�����<r�H�����U�pp@���~���O�j�
>�����$�5AV,�@:�T����7�w\d�������8�6���Q_8�������kT[SM��J�ng�|.����~�}���h���^�)�wN5u����sq:kO��9��q=���]SH�q���@�@���[��5���fjC���@�Q��Tc-��
�x���H�7�k��x��<�Ok���I�&�4d�?X�����v��]vYS�S�:��9+7��E�)�Sn��\��������8�<�5O�>B���O
B~����m��r��?�-���E�O��{ ���#�fA���I7J��~{�Z�U�9*)��@���]-7�$*Y.�5���h*-W��Qh~�$�{�V�u�?�Zo-�+w;�lK��^��9&��-j�J����}_�Q�����|�������}�u�D�P��Q�����9�Wr���������;'�$<��c&����|W�;����J?C����]�Q�{@�����)��1���W�w���7���3���w��,��R�I�E]��Fn��������n��G�d�M\�zBNO�������+�,������Xd�E��]Ssk���������~��f�
60�Z�r���+7�l3�����-�U5��t�bV^ye�k��n��f��sg�~J�y����[�5�6��4���#�3�������������
�TkIM�)����{��jG�LWM'5�U*�_6����
}��Q����1�cK_�m�eZ�nm�Xb	����f�R7��?�x��J+��-[�����Zk�������/J�p���������/E5gy��g�}�1��y�9�$.g��P�����}���OMi�c�o�[�M�J����
��r���ZvJo��;Oz���&g��1�����`V?���z}��G����_j����n3������|��
���~������s����o7��v�u�`����������n��2K.����P��BI�?���{�y����q��TK�0�&��7�tSw��yTA�n��e>^��r��M:|�w�m���}����m�������C}]�����N�����f:���G�?�{��w�o��V�r�8"j;s��~�t~	�K������Xs����r���r&y^w%�V�*��w�k�Y�}vu��f��ht�.���a��~7�:\�)[m�U��v��M/w(m����:��[D�t ���tc��������?
�O�(������.Lsw��S\����6�3���R�o���������%��w�qG����J�7�Os���#w���
��@�]�c�~;���/������Rx{�2��J����"�q���>���Y��|���4�_���y���������\�����Z�����S������g�g-���|��B�3m��
����_��f�����Z<\�����fe�>���}j�`��lS������C��|��5���qmU�����??���}�]��A�����%ck�d�{K�)���l�;ck�
-cone�����q�ck��w�[6|����_�<��r�aP	�SI���#�<��	c���6��g���L����5i�� �b���i�r��{����a��ye���x�t��l��j�A~��l�����7�����6lX�?�V+����g��}�����Z�{����W��6p��A�`���y���� 	�)tl4(o9?�})cnY������G��p��TZ� ������������>��R0��7������y��yv�a���l3c������s�=��l�F�cL����b���v��v��V?�
Fe���}���sLf�0�&�G%�V�:�-�X��
�fl.T���6/����O`�<��N{�������J�����LE�a������V^����7{g2�l�b�WM��|��u���k���8;����J��)�9�L�i\�����6mZ�>H�������h�"c�
vn�����|w����{���=�H �MC@5���8�I��>9����|��W�k�a��O���� U�1�f��J
7��&������~������	��a����S����M/�O�+��P
�#�8����C=dF�����7u�T����6��������;]�OzR^���2z����fz�D��vk�m*���:�j��j�2�<il�jF����jN����d�x���1��]��Y��k��v����6�
�}��Y��?���������l�I���S�-������Yg�u\-M7��S�P��������j��Tn�����j����R�r�L*����0\�~����|�5S%�K�.��wN�}6�x�9N�_~��)�U�=�U���t����`Q���O�N��2k�}�"���G6�!����mQ�y�-[�x5���V3�7�tS��}���yK}����j�jX����B������>����d��f�����j��\�Z�>�/������
 �k=���fmP��f����n�!x��\AF�p�	n����x���5���j�*�6�j���8��r�]s�5���jl�M-#������^��Y����U���}������������k�}_���ZY�o*����W�7}������V��/�hT3B5,�t�����}������:.��s�������&&u.��2W*U�����B�2T���:����*�'N,���=]5�J5i9�������wNy��I���I���ls��I��)�]�k:_Kn�W4�]v�����X�M��u���z�*��P���c���O��^@�$���?d*�S����s�i�eW�:���n~�f�jn���E����rk�=��c�4cn����l����T�'��_0�������B����%������{�+��6��5���R5vl $��P-=Y�������2\n_��OOK{��:������f*�S����s�i�eW�:���Z�?C�=�d�����M��d0�6#�&G���k�����M���2�fv0=���=�
}���j�� .��={f�ooH�)y,�IDAT�l`&kZ������>��a��X�s��9��M���jF��t��._��Yf�V�=���A.�mv����0����/��"S��j�E��OS-�J�^���Z��l��=���D���9*��O��������j�}����VV~���^^-e��]0�6�^�[F���C�Y���/���`^^�����`�
(���$�
2����`�6�f>3zN�q�f�O��P�$��Wx��Z��@-�["P���H�S�T�J���������m���b���m�2����<5"�����k�1Yp��?2��i�����k��^w��V5!l����������W�=z���x����6������y�����}�s�a��@������������y�R��~��#3�J�T3nTD���4���d�e��3{��|��6�E}f�X�
�9�a�1���9�5��z�u��}P��r����w�jE���f�D;��Q�@If��1n|��Z�9}w�k���� �i
{���5��X�j���iWj={�[��������Si'��s�X������O>�$���8�������5�T�.*��a&�?"�_�'��T[$��'�UJI����������v�5��������M��{���T��1^}����F�����i��@����'�}���4p��`����������������D���R�7������������C��W_}����;������6W{�J��N����OI�*�Mvw�P��i�+����|���g�O��M5�bU�=���k2�W�\�5�����t=�A�kQ+����|��]m6{����W_�>��W��`��s��~���WI�����/M}X�<�Zu��������O���.��n���co��!���7&�f��s�)��q���(�����o�109��c\?��<���is����U�P5��Ir��	w���S_G����t�V��_6�W�6/����[���|i�Kc��:&���X?�Q��[�������j���[�R�V��W�U_���9}�T����v��
�f+�*���R�	%]���O�����z�Z����>�J��Kc������0����G�����x�|FBFL�;'��~���N��������;���n�8���T������w:�} �@��4OY�B��5��z�����-��T��������jNO7Z�$�0�OE5���7��(p��}����1��y��']sJj�H7�|S�~�B��������ua���Io"�3W�<jNSIM�)hQ(��+���y����B�o�q
��dZ����������:E�T��j��7�0���kI7�����^����jZ'*M�<9�����7�^&����Z(�'��v��|�O*on�7�}����V�r��}S�<���B����}���V[-�f����.
��5U�pc��8�AA�X�������9��j���9���z��779d��� |��n���	�B�Q�S0[��UV��6�uR�����VP�}zh�'����o��Q.��r_�S��"���k0��h5]��2=�Q(p�?wX��z�)7u��1�ZM���.�e�`9?�F�i��Q�:�*���K�q�E��;��y�G�i�[�����g�=���$_��J�����yl�A�����>�W�>���O��K�o5[��ONO4��
���2[���/b}|�������}FFN�;'���~��K�5&�#�c&��2�=��J����^O6��t��*�#��"0n�_�NR���3��R�'P��6���Mh��b�9�
��O��R_n�����ta�����'RU;.�	����4�S���R�b,|c)7���S9}`�e��Vc���������k
��Y�������u�����W�gDI�6�l�����V|R��T�<�������7?]}����_7��W��8T����B��)�r���ymj�G��,v���OL5@��S�Lq�
���AC��J������/����tCEO.�a����:�U[����w����{G���K���h��s_�pQn��u�;}�u�
F��9]��O��c�=6�V�<Zi������Qm���%]�(@���jZ����)`�s�����,�]��!��^n�!w���j-����;���7�1w[�L����U�M�A���`�(nY��S�[�	����*G��*��;VX�����.�M�.7�����Z��0�xc���?T����|.N�����Q�q���:f���]t
��*V~?���{��\�;�;�� �����L��d���Y��s1��E��l����m=�i����{�?C��������59�|U�B/��W���5�����N�(Qyz��T��|��y��J=wT�9�����MY������?�����&�N;�d]tQ?{��p��j���+��nTk6����q{���Q-;����z�Q�F�>����\~}����yT-��fx��UVY%Nc@5w}-5��SC��J�|y�L@�P�ZH��^��J���Z���b8������5w���9���j!�I��?����u���V�����W�|=��������<�N������/�\�����������]�5�'�!��m~��L��j����s�7�����1bDP5{�����Z�(����K�c�7?Yb��f��9/�&�����Wj]u5]MF&mv��AT��D�;'�����\f��;g�Ec5����8f�@�R�����J��\�;��� �@�&�`��tJ�����S��I�j�[�.I��1�`�o�H5�tS�?�k;�5l�A�U�	s��}���������;v�9��s����[t���~b\O�Kj��'5
E*-�`������tq���X�*����e��O�t�P7l��N�v����Oq��R3�>������+��oS{��O�n�����i}���5��������!��,g��}�
�����T�����`n��W���p���i�TQ�<3�|�n������X����;��Q��i����{0@7����z��.��^�yX��'��Wm%�f��pD�|��\r�%.8�q�������N0]�%�pj�r���yT�l�M0�
W(��}7,����f��8�/]�Sj�}�X������j��4\}������w5y_i����j�)�R*U�|�S��t���Z�J
v)H�Tim������
R�����.�$�9a�t���I����*=����4�o�cF��������P���86������i5�� �����l��k�m�����y(8w������\a_}�U3c���� :��C�Xz�����y���n��&����g�P�!���V���f8�����5YW(�yO��d )�@��W�������Mv��[h�!���=Q�Y(@�����P�3z��ku����V��@U����}��w\Q|�������+�h�MdV�\|cY3�������nr����A�j��$��>'�D�n���K5�U�"I
?��$�Z]���W�mP�NMGF��k�|Bn���?0��rSH��(Q�vo$�OH��|W���/��W������/��\���0`��0�����o�����g�}�{�s�����V�����Zk���{����<���\_�e]�e"mzL�r�-Uh�]�@Y������{��������*_}�U�ld���i3���r�waS��q
\	�]Xh����}�]>��T3�f|�g	t}{V]��%��3�}v�4"���,dZk����f[d��:�E<}�/Z��~z�,f��.5C�?�f���x��Wttw���p�����Y�U�����Y�e�����1�)�m>�������,����y������$ 	H`3@���Qui�mf�:T�,��9��
S�m��a�����G�h/5+�8��/7�P1�#0	���o������j>O������8�X
��ui����������;���>;G@{#���L,�C,4�����z�yw�k��Z����O����������/�����������z�����^Z/z��������k�~����R��������fQ]�%�R�?2��[����q=�����3B/2,��h���o��C��w�������W\�*�#�U�Mh4���zj�O]v�e������V�����OdM@���hF �;��_[Xe�����E?������6=x���b�n�z����\�x���E��|w�������/�X=���C�p��	�[��\\�B�9�A��Y��2���}N��]�8��?�9>�l���>Z���?W���7��s,�
������$ 	�F ����:Lff�\��x�j���WN�Y�`�U����y?���B����Wh�������x��l�x��~D_�-8&������V\D"S&�������71���.�V.���Z�3
q4��]?���%�F^y<��\<
Zh��+C�J���O�o��$���'�WB��������X]���6I��x���.� �������C�{��,l/��,����Z����k��o���?��&�����*�(�M~��Gf\f}5_� &S��K/����h��G��t=����d0���|�M���ma�&B8����*����+Kv��7��v�<�������-cfh����0��_�q&�1�{����%�y
��u,n@s�h��#
t��{o6��,�\��de7����������p�j�i8��Q4�	h�R�����F+?��L�!�7#�b�M{��7s{�E\�_f���z�������L���'0
�?��~�)����d��]r0&������Zc�y	����0k���
[�������!�'�ph�a2�<Ys����a}�}N�j6v�h�s��c��Y�X(��������8���`a<��O, Y�������}�I���	H@����*�����4���<u�L6�q?����kIhP_��b�m���Z,����N}��I6u���4�U�g��������S�� �C�q�p/N�n�$\�V����;Z�'��$
�Z�)�O+�'I@9�F_���E"C�)���@�%�_3'a�4�\�����{&'����y�����;�4����O���C�VDN���OB��%o)�TG|��>W��f}������6�&M:[O}���$AGk����P������`cr�����I�.n��ym<-iM7	c���`c	$!m��$�j-Hl���
L�f�L���\�I�df�����dN��S�;�~���7��I�����_�uB�"~�6-�����E���?����;	4���1�$������Y�6s�����	%�L�g
������([�f�E��t�'�$����2��=��S��]��#�6�S��<"��m�0�J�����n�O���P��+%m��<9�8�b��b�����{����2/ZI���,iq�������*�����s��;Tn�}a���s�D������>g2Y����)em�����;��o���GX����c����kW���]jA	H@��<'�tR���O;��r�(�~�$H�?n��Fc!Vv�~�����H+4s�:����
n�X��U���i�q"�����}��0�V�K.������S�>yb���W^��.9��s�B�ki��W��5�}���i���6���xQ���~�>�=h']>���a���������9��3s���Zh\�e�{_�����/��o���/�Y6ZK��B;�����a���)W�[n}K��h�`��X��0�sO>�d���?f?�M��}���h����t�_��4�2��g��;��)�g��n��0���t�q4��f(��h^�>$"-�G`^�BC&4����I�������]��.���l��qG��'&�0��)�2D�kK3�1~�\�F"����[%�]��C	M�0��&5���z*n�-W�����j�G`�g��$�hF��i������__����?4Y5�����K�5���n���/{��r�:/��g�:Oz\c,�}����Cc1N.Z�}yG��������/n�]����n�P��h�hF{���!	�GY.��@_��������~�+�k{��,i�w�:
�l���o4��|�_>��K �i[��d�9�}a���s�D���s��cU���t�#�����9p�����;��r��h����b�3f�����$��j��$ 	H`��_��}�03/0��$���i�����\�y��WO�`����<0���2����!psB �J���K~�v�!=$���o��4��RBp�-m�G�][�3������#�����w-�^�3�=n�#�St�Mv�5�:fB ~�5K�����u_[�>�%�f�x�l{'#�����0#��_Tiu�Y���L}��
sn���C�U�S��>�4V�����~h��������8�xX�B|�������6N���r����D�n7�����oO�d6��G��w��k�`�2Lc���3��'��t���	��.���\�0��fb6iS�BT`��--W����o�/i|!]en����P?a���U�xq����Z�O}3~��#D��������z�����0� u�4�����}C�~H�e9��?$�yu�)\h����rb�3��:�����'�eL�����KWR��whD|���EM�oQ>����������0�@|�s�w�������6f�s���<J��������5��M����0Y�i�/�o��V�;0C} ��Oa�:0$ 	H@��$0�+��(K�]��W[t4�����~�R��m�{N��f%�h2_����~F��	&{X�q�
7d��f�% 	H@����#�u	H`#	�:M\ndM��$ 	H@��QC �na�s�}�Q��o�e�Z�����[��b���	H@8L ��=�����������������W^��s������$ 	H@k&`��fd� 	l5�6��J@��$ 	]>�����k����}%��T�����5	H@�������j���sx�������$ �6�9mT<'	lj�m�����$ 	H@G
��/�<������g|����={�T?����-��Z����X�N����g���M?c����sg��[oe����e>��$����l�:��$�U	�A�Uk����$ 	H@X)��������'�X�q�������L\��V"���W��Z���vZ�m����x>�$ 	l"�9��2,�$P�t'�C��$ 	H@X;&��3H@���	�|���Yg����C��$�F�9kft	H`4����I@��$ 	H@��$ 	H@��$ 	H@�:�$ 	H@��$ 	H@��$ 	H@��F$��nD�f%	H@��$ 	H@��$ 	H@��$ t�	H@��$ 	H@��$ 	H@��$ 	�H@����J��$ 	H@��$ 	H@��$ 	H@
�l��$ 	H@��$ 	H@��$ 	H@���a��$ 	H@��$ 	H@��$ 	H@����$ 	H@��$ 	H@��$ 	H@��$0"t#�6+	H@��$ 	H@��$ 	H@��$ 	(��
H@��$ 	H@��$ 	H@��$ 	H`D
�F�mV��$ 	H@��$ 	H@��$ 	H@P@g��$ 	H@��$ 	H@��$ 	H@�������$ 	H@��$ 	H@��$ 	H@��$���6 	H@��$ 	H@��$ 	H@��$ �	(��YI@��$ 	H@��$ 	H@��$ 	H�����C�% 	H@��$ 	H@��$ 	H@��$ �%����>M�]KN��$ 	H@��$ 	H@��$ 	H@��$ �i�C�����I@��$ 	H@��$ 	H@��$ 	H@� �R	=�:kuIEND�B`�

Zhihong Yu

zyu@yugabyte.com

over 3 years ago

In reply to: Zheng Li (#1)

Re: Bloom filter Pushdown Optimization for Merge Join

On Fri, Sep 30, 2022 at 3:44 PM Zheng Li <zhengli10@gmail.com> wrote:

Hello,

A bloom filter provides early filtering of rows that cannot be joined
before they would reach the join operator, the optimization is also
called a semi join filter (SJF) pushdown. Such a filter can be created
when one child of the join operator must materialize its derived table
before the other child is evaluated.

For example, a bloom filter can be created using the the join keys for
the build side/inner side of a hash join or the outer side of a merge
join, the bloom filter can then be used to pre-filter rows on the
other side of the join operator during the scan of the base relation.
The thread about “Hash Joins vs. Bloom Filters / take 2” [1] is good
discussion on using such optimization for hash join without going into
the pushdown of the filter where its performance gain could be further
increased.

We worked on prototyping bloom filter pushdown for both hash join and
merge join. Attached is a patch set for bloom filter pushdown for
merge join. We also plan to send the patch for hash join once we have
it rebased.

Here is a summary of the patch set:
1. Bloom Filter Pushdown optimizes Merge Join by filtering rows early
during the table scan instead of later on.
-The bloom filter is pushed down along the execution tree to
the target SeqScan nodes.
-Experiments show that this optimization can speed up Merge
Join by up to 36%.

2. The planner makes the decision to use the bloom filter based on the
estimated filtering rate and the expected performance gain.
-The planner accomplishes this by estimating four numbers per
variable - the total number of rows of the relation, the number of
distinct values for a given variable, and the minimum and maximum
value of the variable (when applicable). Using these numbers, the
planner estimates a filtering rate of a potential filter.
-Because actually creating and implementing the filter adds
more operations, there is a minimum threshold of filtering where the
filter would actually be useful. Based on testing, we query to see if
the estimated filtering rate is higher than 35%, and that informs our
decision to use a filter or not.

3. If using a bloom filter, the planner also adjusts the expected cost
of Merge Join based on expected performance gain.

4. Capability to build the bloom filter in parallel in case of
parallel SeqScan. This is done efficiently by populating a local bloom
filter for each parallel worker and then taking a bitwise OR over all
the local bloom filters to form a shared bloom filter at the end of
the parallel SeqScan.

5. The optimization is GUC controlled, with settings of
enable_mergejoin_semijoin_filter and force_mergejoin_semijoin_filter.

We found in experiments that there is a significant improvement
when using the bloom filter during Merge Join. One experiment involved
joining two large tables while varying the theoretical filtering rate
(TFR) between the two tables, the TFR is defined as the percentage
that the two datasets are disjoint. Both tables in the merge join were
the same size. We tested changing the TFR to see the change in
filtering optimization.

For example, let’s imagine t0 has 10 million rows, which contain the
numbers 1 through 10 million randomly shuffled. Also, t1 has the
numbers 4 million through 14 million randomly shuffled. Then the TFR
for a join of these two tables is 40%, since 40% of the tables are
disjoint from the other table (1 through 4 million for t0, 10 million
through 14 million for t4).

Here is the performance test result joining two tables:
TFR: theoretical filtering rate
EFR: estimated filtering rate
AFR: actual filtering rate
HJ: hash join
MJ Default: default merge join
MJ Filter: merge join with bloom filter optimization enabled
MJ Filter Forced: merge join with bloom filter optimization forced

TFR EFR AFR HJ MJ Default MJ Filter MJ Filter Forced

-------------------------------------------------------------------------------------
10 33.46 7.41 6529 22638 21949 23160
20 37.27 14.85 6483 22290 21928 21930
30 41.32 22.25 6395 22374 20718 20794
40 45.67 29.7 6272 21969 19449 19410
50 50.41 37.1 6210 21412 18222 18224
60 55.64 44.51 6052 21108 17060 17018
70 61.59 51.98 5947 21020 15682 15737
80 68.64 59.36 5761 20812 14411 14437
90 77.83 66.86 5701 20585 13171 13200
Table. Execution Time (ms) vs Filtering Rate (%) for Joining Two
Tables of 10M Rows.

Attached you can find figures of the same performance test and a SQL script
to reproduce the performance test.

The first thing to notice is that Hash Join generally is the most
efficient join strategy. This is because Hash Join is better at
dealing with small tables, and our size of 10 million is still small
enough where Hash Join outperforms the other join strategies. Future
experiments can investigate using much larger tables.

However, comparing just within the different Merge Join variants, we
see that using the bloom filter greatly improves performance.
Intuitively, all of these execution times follow linear paths.
Comparing forced filtering versus default, we can see that the default
Merge Join outperforms Merge Join with filtering at low filter rates,
but after about 20% TFR, the Merge Join with filtering outperforms
default Merge Join. This makes intuitive sense, as there are some
fixed costs associated with building and checking with the bloom
filter. In the worst case, at only 10% TFR, the bloom filter makes
Merge Join less than 5% slower. However, in the best case, at 90% TFR,
the bloom filter improves Merge Join by 36%.

Based on the results of the above experiments, we came up with a
linear equation for the performance ratio for using the filter
pushdown from the actual filtering rate. Based on the numbers
presented in the figure, this is the equation:

T_filter / T_no_filter = 1 / (0.83 * estimated filtering rate + 0.863)

For example, this means that with an estimated filtering rate of 0.4,
the execution time of merge join is estimated to be improved by 16.3%.
Note that the estimated filtering rate is used in the equation, not
the theoretical filtering rate or the actual filtering rate because it
is what we have during planning. In practice the estimated filtering
rate isn’t usually accurate. In fact, the estimated filtering rate can
differ from the theoretical filtering rate by as much as 17% in our
experiments. One way to mitigate the power loss of bloom filter caused
by inaccurate estimated filtering rate is to adaptively turn it off at
execution time, this is yet to be implemented.

Here is a list of tasks we plan to work on in order to improve this patch:
1. More regression testing to guarantee correctness.
2. More performance testing involving larger tables and complicated query
plans.
3. Improve the cost model.
4. Explore runtime tuning such as making the bloom filter checking
adaptive.
5. Currently, only the best single join key is used for building the
Bloom filter. However, if there are several keys and we know that
their distributions are somewhat disjoint, we could leverage this fact
and use multiple keys for the bloom filter.
6. Currently, Bloom filter pushdown is only implemented for SeqScan
nodes. However, it would be possible to allow push down to other types
of scan nodes.
7. Explore if the Bloom filter could be pushed down through a foreign
scan when the foreign server is capable of handling it – which could
be made true for postgres_fdw.
8. Better explain command on the usage of bloom filters.

This patch set is prepared by Marcus Ma, Lyu Pan and myself. Feedback
is appreciated.

With Regards,
Zheng Li
Amazon RDS/Aurora for PostgreSQL

[1]
/messages/by-id/c902844d-837f-5f63-ced3-9f7fd222f175@2ndquadrant.com

Hi,
In the header of patch 1:

In this prototype, the cost model is based on an assumption that there is a
linear relationship between the performance gain from using a semijoin
filter and the estimated filtering rate:
% improvement to Merge Join cost = 0.83 * estimated filtering rate - 0.137.

How were the coefficients (0.83 and 0.137) determined ?
I guess they were based on the results of running certain workload.

Cheers

Zhihong Yu

zyu@yugabyte.com

over 3 years ago

In reply to: Zhihong Yu (#2)

Re: Bloom filter Pushdown Optimization for Merge Join

On Fri, Sep 30, 2022 at 8:40 PM Zhihong Yu <zyu@yugabyte.com> wrote:

On Fri, Sep 30, 2022 at 3:44 PM Zheng Li <zhengli10@gmail.com> wrote:

Hello,

A bloom filter provides early filtering of rows that cannot be joined
before they would reach the join operator, the optimization is also
called a semi join filter (SJF) pushdown. Such a filter can be created
when one child of the join operator must materialize its derived table
before the other child is evaluated.

For example, a bloom filter can be created using the the join keys for
the build side/inner side of a hash join or the outer side of a merge
join, the bloom filter can then be used to pre-filter rows on the
other side of the join operator during the scan of the base relation.
The thread about “Hash Joins vs. Bloom Filters / take 2” [1] is good
discussion on using such optimization for hash join without going into
the pushdown of the filter where its performance gain could be further
increased.

We worked on prototyping bloom filter pushdown for both hash join and
merge join. Attached is a patch set for bloom filter pushdown for
merge join. We also plan to send the patch for hash join once we have
it rebased.

Here is a summary of the patch set:
1. Bloom Filter Pushdown optimizes Merge Join by filtering rows early
during the table scan instead of later on.
-The bloom filter is pushed down along the execution tree to
the target SeqScan nodes.
-Experiments show that this optimization can speed up Merge
Join by up to 36%.

2. The planner makes the decision to use the bloom filter based on the
estimated filtering rate and the expected performance gain.
-The planner accomplishes this by estimating four numbers per
variable - the total number of rows of the relation, the number of
distinct values for a given variable, and the minimum and maximum
value of the variable (when applicable). Using these numbers, the
planner estimates a filtering rate of a potential filter.
-Because actually creating and implementing the filter adds
more operations, there is a minimum threshold of filtering where the
filter would actually be useful. Based on testing, we query to see if
the estimated filtering rate is higher than 35%, and that informs our
decision to use a filter or not.

3. If using a bloom filter, the planner also adjusts the expected cost
of Merge Join based on expected performance gain.

4. Capability to build the bloom filter in parallel in case of
parallel SeqScan. This is done efficiently by populating a local bloom
filter for each parallel worker and then taking a bitwise OR over all
the local bloom filters to form a shared bloom filter at the end of
the parallel SeqScan.

5. The optimization is GUC controlled, with settings of
enable_mergejoin_semijoin_filter and force_mergejoin_semijoin_filter.

We found in experiments that there is a significant improvement
when using the bloom filter during Merge Join. One experiment involved
joining two large tables while varying the theoretical filtering rate
(TFR) between the two tables, the TFR is defined as the percentage
that the two datasets are disjoint. Both tables in the merge join were
the same size. We tested changing the TFR to see the change in
filtering optimization.

For example, let’s imagine t0 has 10 million rows, which contain the
numbers 1 through 10 million randomly shuffled. Also, t1 has the
numbers 4 million through 14 million randomly shuffled. Then the TFR
for a join of these two tables is 40%, since 40% of the tables are
disjoint from the other table (1 through 4 million for t0, 10 million
through 14 million for t4).

Here is the performance test result joining two tables:
TFR: theoretical filtering rate
EFR: estimated filtering rate
AFR: actual filtering rate
HJ: hash join
MJ Default: default merge join
MJ Filter: merge join with bloom filter optimization enabled
MJ Filter Forced: merge join with bloom filter optimization forced

TFR EFR AFR HJ MJ Default MJ Filter MJ Filter Forced

-------------------------------------------------------------------------------------
10 33.46 7.41 6529 22638 21949 23160
20 37.27 14.85 6483 22290 21928 21930
30 41.32 22.25 6395 22374 20718 20794
40 45.67 29.7 6272 21969 19449 19410
50 50.41 37.1 6210 21412 18222 18224
60 55.64 44.51 6052 21108 17060 17018
70 61.59 51.98 5947 21020 15682 15737
80 68.64 59.36 5761 20812 14411 14437
90 77.83 66.86 5701 20585 13171 13200
Table. Execution Time (ms) vs Filtering Rate (%) for Joining Two
Tables of 10M Rows.

Attached you can find figures of the same performance test and a SQL
script
to reproduce the performance test.

The first thing to notice is that Hash Join generally is the most
efficient join strategy. This is because Hash Join is better at
dealing with small tables, and our size of 10 million is still small
enough where Hash Join outperforms the other join strategies. Future
experiments can investigate using much larger tables.

However, comparing just within the different Merge Join variants, we
see that using the bloom filter greatly improves performance.
Intuitively, all of these execution times follow linear paths.
Comparing forced filtering versus default, we can see that the default
Merge Join outperforms Merge Join with filtering at low filter rates,
but after about 20% TFR, the Merge Join with filtering outperforms
default Merge Join. This makes intuitive sense, as there are some
fixed costs associated with building and checking with the bloom
filter. In the worst case, at only 10% TFR, the bloom filter makes
Merge Join less than 5% slower. However, in the best case, at 90% TFR,
the bloom filter improves Merge Join by 36%.

Based on the results of the above experiments, we came up with a
linear equation for the performance ratio for using the filter
pushdown from the actual filtering rate. Based on the numbers
presented in the figure, this is the equation:

T_filter / T_no_filter = 1 / (0.83 * estimated filtering rate + 0.863)

For example, this means that with an estimated filtering rate of 0.4,
the execution time of merge join is estimated to be improved by 16.3%.
Note that the estimated filtering rate is used in the equation, not
the theoretical filtering rate or the actual filtering rate because it
is what we have during planning. In practice the estimated filtering
rate isn’t usually accurate. In fact, the estimated filtering rate can
differ from the theoretical filtering rate by as much as 17% in our
experiments. One way to mitigate the power loss of bloom filter caused
by inaccurate estimated filtering rate is to adaptively turn it off at
execution time, this is yet to be implemented.

Here is a list of tasks we plan to work on in order to improve this patch:
1. More regression testing to guarantee correctness.
2. More performance testing involving larger tables and complicated query
plans.
3. Improve the cost model.
4. Explore runtime tuning such as making the bloom filter checking
adaptive.
5. Currently, only the best single join key is used for building the
Bloom filter. However, if there are several keys and we know that
their distributions are somewhat disjoint, we could leverage this fact
and use multiple keys for the bloom filter.
6. Currently, Bloom filter pushdown is only implemented for SeqScan
nodes. However, it would be possible to allow push down to other types
of scan nodes.
7. Explore if the Bloom filter could be pushed down through a foreign
scan when the foreign server is capable of handling it – which could
be made true for postgres_fdw.
8. Better explain command on the usage of bloom filters.

This patch set is prepared by Marcus Ma, Lyu Pan and myself. Feedback
is appreciated.

With Regards,
Zheng Li
Amazon RDS/Aurora for PostgreSQL

[1]
/messages/by-id/c902844d-837f-5f63-ced3-9f7fd222f175@2ndquadrant.com

Hi,
In the header of patch 1:

In this prototype, the cost model is based on an assumption that there is
a linear relationship between the performance gain from using a semijoin
filter and the estimated filtering rate:
% improvement to Merge Join cost = 0.83 * estimated filtering rate - 0.137.

How were the coefficients (0.83 and 0.137) determined ?
I guess they were based on the results of running certain workload.

Cheers

Hi,
For patch 1:

+bool       enable_mergejoin_semijoin_filter;
+bool       force_mergejoin_semijoin_filter;

How would (enable_mergejoin_semijoin_filter = off,
force_mergejoin_semijoin_filter = on) be interpreted ?
Have you considered using one GUC which has three values: off, enabled,
forced ?

+       mergeclauses_for_sjf = get_actual_clauses(path->path_mergeclauses);
+       mergeclauses_for_sjf = get_switched_clauses(path->path_mergeclauses,
+
path->jpath.outerjoinpath->parent->relids);

mergeclauses_for_sjf is assigned twice and I don't see mergeclauses_for_sjf
being reference in the call to get_switched_clauses().
Is this intentional ?

+           /* want at least 1000 rows_filtered to avoid any nasty edge
cases */
+           if (force_mergejoin_semijoin_filter || (filteringRate >= 0.35
&& rows_filtered > 1000))

The above condition is narrower compared to the enclosing condition.
Since there is no else block for the second if block, please merge the two
if statements.

+ int best_filter_clause;

Normally I would think `clause` is represented by List*. But
best_filter_clause is an int. Please use another variable name so that
there is less chance of confusion.

For evaluate_semijoin_filtering_rate():

+ double best_sj_selectivity = 1.01;

How was 1.01 determined ?

+ debug_sj1("SJPD: start evaluate_semijoin_filtering_rate");

There are debug statements in the methods.
It would be better to remove them in the next patch set.

Cheers

Zhihong Yu

zyu@yugabyte.com

over 3 years ago

In reply to: Zhihong Yu (#3)

Re: Bloom filter Pushdown Optimization for Merge Join

On Fri, Sep 30, 2022 at 9:20 PM Zhihong Yu <zyu@yugabyte.com> wrote:

On Fri, Sep 30, 2022 at 8:40 PM Zhihong Yu <zyu@yugabyte.com> wrote:

On Fri, Sep 30, 2022 at 3:44 PM Zheng Li <zhengli10@gmail.com> wrote:

Hello,

A bloom filter provides early filtering of rows that cannot be joined
before they would reach the join operator, the optimization is also
called a semi join filter (SJF) pushdown. Such a filter can be created
when one child of the join operator must materialize its derived table
before the other child is evaluated.

For example, a bloom filter can be created using the the join keys for
the build side/inner side of a hash join or the outer side of a merge
join, the bloom filter can then be used to pre-filter rows on the
other side of the join operator during the scan of the base relation.
The thread about “Hash Joins vs. Bloom Filters / take 2” [1] is good
discussion on using such optimization for hash join without going into
the pushdown of the filter where its performance gain could be further
increased.

We worked on prototyping bloom filter pushdown for both hash join and
merge join. Attached is a patch set for bloom filter pushdown for
merge join. We also plan to send the patch for hash join once we have
it rebased.

Here is a summary of the patch set:
1. Bloom Filter Pushdown optimizes Merge Join by filtering rows early
during the table scan instead of later on.
-The bloom filter is pushed down along the execution tree to
the target SeqScan nodes.
-Experiments show that this optimization can speed up Merge
Join by up to 36%.

2. The planner makes the decision to use the bloom filter based on the
estimated filtering rate and the expected performance gain.
-The planner accomplishes this by estimating four numbers per
variable - the total number of rows of the relation, the number of
distinct values for a given variable, and the minimum and maximum
value of the variable (when applicable). Using these numbers, the
planner estimates a filtering rate of a potential filter.
-Because actually creating and implementing the filter adds
more operations, there is a minimum threshold of filtering where the
filter would actually be useful. Based on testing, we query to see if
the estimated filtering rate is higher than 35%, and that informs our
decision to use a filter or not.

3. If using a bloom filter, the planner also adjusts the expected cost
of Merge Join based on expected performance gain.

4. Capability to build the bloom filter in parallel in case of
parallel SeqScan. This is done efficiently by populating a local bloom
filter for each parallel worker and then taking a bitwise OR over all
the local bloom filters to form a shared bloom filter at the end of
the parallel SeqScan.

5. The optimization is GUC controlled, with settings of
enable_mergejoin_semijoin_filter and force_mergejoin_semijoin_filter.

We found in experiments that there is a significant improvement
when using the bloom filter during Merge Join. One experiment involved
joining two large tables while varying the theoretical filtering rate
(TFR) between the two tables, the TFR is defined as the percentage
that the two datasets are disjoint. Both tables in the merge join were
the same size. We tested changing the TFR to see the change in
filtering optimization.

For example, let’s imagine t0 has 10 million rows, which contain the
numbers 1 through 10 million randomly shuffled. Also, t1 has the
numbers 4 million through 14 million randomly shuffled. Then the TFR
for a join of these two tables is 40%, since 40% of the tables are
disjoint from the other table (1 through 4 million for t0, 10 million
through 14 million for t4).

Here is the performance test result joining two tables:
TFR: theoretical filtering rate
EFR: estimated filtering rate
AFR: actual filtering rate
HJ: hash join
MJ Default: default merge join
MJ Filter: merge join with bloom filter optimization enabled
MJ Filter Forced: merge join with bloom filter optimization forced

TFR EFR AFR HJ MJ Default MJ Filter MJ Filter Forced

-------------------------------------------------------------------------------------
10 33.46 7.41 6529 22638 21949 23160
20 37.27 14.85 6483 22290 21928 21930
30 41.32 22.25 6395 22374 20718 20794
40 45.67 29.7 6272 21969 19449 19410
50 50.41 37.1 6210 21412 18222 18224
60 55.64 44.51 6052 21108 17060 17018
70 61.59 51.98 5947 21020 15682 15737
80 68.64 59.36 5761 20812 14411 14437
90 77.83 66.86 5701 20585 13171 13200
Table. Execution Time (ms) vs Filtering Rate (%) for Joining Two
Tables of 10M Rows.

Attached you can find figures of the same performance test and a SQL
script
to reproduce the performance test.

The first thing to notice is that Hash Join generally is the most
efficient join strategy. This is because Hash Join is better at
dealing with small tables, and our size of 10 million is still small
enough where Hash Join outperforms the other join strategies. Future
experiments can investigate using much larger tables.

However, comparing just within the different Merge Join variants, we
see that using the bloom filter greatly improves performance.
Intuitively, all of these execution times follow linear paths.
Comparing forced filtering versus default, we can see that the default
Merge Join outperforms Merge Join with filtering at low filter rates,
but after about 20% TFR, the Merge Join with filtering outperforms
default Merge Join. This makes intuitive sense, as there are some
fixed costs associated with building and checking with the bloom
filter. In the worst case, at only 10% TFR, the bloom filter makes
Merge Join less than 5% slower. However, in the best case, at 90% TFR,
the bloom filter improves Merge Join by 36%.

Based on the results of the above experiments, we came up with a
linear equation for the performance ratio for using the filter
pushdown from the actual filtering rate. Based on the numbers
presented in the figure, this is the equation:

T_filter / T_no_filter = 1 / (0.83 * estimated filtering rate + 0.863)

For example, this means that with an estimated filtering rate of 0.4,
the execution time of merge join is estimated to be improved by 16.3%.
Note that the estimated filtering rate is used in the equation, not
the theoretical filtering rate or the actual filtering rate because it
is what we have during planning. In practice the estimated filtering
rate isn’t usually accurate. In fact, the estimated filtering rate can
differ from the theoretical filtering rate by as much as 17% in our
experiments. One way to mitigate the power loss of bloom filter caused
by inaccurate estimated filtering rate is to adaptively turn it off at
execution time, this is yet to be implemented.

Here is a list of tasks we plan to work on in order to improve this
patch:
1. More regression testing to guarantee correctness.
2. More performance testing involving larger tables and complicated
query plans.
3. Improve the cost model.
4. Explore runtime tuning such as making the bloom filter checking
adaptive.
5. Currently, only the best single join key is used for building the
Bloom filter. However, if there are several keys and we know that
their distributions are somewhat disjoint, we could leverage this fact
and use multiple keys for the bloom filter.
6. Currently, Bloom filter pushdown is only implemented for SeqScan
nodes. However, it would be possible to allow push down to other types
of scan nodes.
7. Explore if the Bloom filter could be pushed down through a foreign
scan when the foreign server is capable of handling it – which could
be made true for postgres_fdw.
8. Better explain command on the usage of bloom filters.

This patch set is prepared by Marcus Ma, Lyu Pan and myself. Feedback
is appreciated.

With Regards,
Zheng Li
Amazon RDS/Aurora for PostgreSQL

[1]
/messages/by-id/c902844d-837f-5f63-ced3-9f7fd222f175@2ndquadrant.com

Hi,
In the header of patch 1:

In this prototype, the cost model is based on an assumption that there is
a linear relationship between the performance gain from using a semijoin
filter and the estimated filtering rate:
% improvement to Merge Join cost = 0.83 * estimated filtering rate -
0.137.

How were the coefficients (0.83 and 0.137) determined ?
I guess they were based on the results of running certain workload.

Cheers

Hi,
For patch 1:
+bool       enable_mergejoin_semijoin_filter;
+bool       force_mergejoin_semijoin_filter;
How would (enable_mergejoin_semijoin_filter = off,
force_mergejoin_semijoin_filter = on) be interpreted ?
Have you considered using one GUC which has three values: off, enabled,
forced ?
+       mergeclauses_for_sjf = get_actual_clauses(path->path_mergeclauses);
+       mergeclauses_for_sjf =
get_switched_clauses(path->path_mergeclauses,
+
path->jpath.outerjoinpath->parent->relids);
mergeclauses_for_sjf is assigned twice and I don't
see mergeclauses_for_sjf being reference in the call
to get_switched_clauses().
Is this intentional ?
+           /* want at least 1000 rows_filtered to avoid any nasty edge
cases */
+           if (force_mergejoin_semijoin_filter || (filteringRate >= 0.35
&& rows_filtered > 1000))
The above condition is narrower compared to the enclosing condition.
Since there is no else block for the second if block, please merge the two
if statements.

+ int best_filter_clause;

Normally I would think `clause` is represented by List*. But
best_filter_clause is an int. Please use another variable name so that
there is less chance of confusion.

For evaluate_semijoin_filtering_rate():

+ double best_sj_selectivity = 1.01;

How was 1.01 determined ?

+ debug_sj1("SJPD: start evaluate_semijoin_filtering_rate");

There are debug statements in the methods.
It would be better to remove them in the next patch set.

Cheers

Hi,
Still patch 1.

+       if (!outer_arg_md->is_or_maps_to_base_column
+           && !inner_arg_md->is_or_maps_to_constant)
+       {
+           debug_sj2("SJPD:        outer equijoin arg does not map %s",
+                     "to a base column nor a constant; semijoin is not
valid");

Looks like there is a typo: inner_arg_md->is_or_maps_to_constant should be
outer_arg_md->is_or_maps_to_constant

+       if (outer_arg_md->est_col_width > MAX_SEMIJOIN_SINGLE_KEY_WIDTH)
+       {
+           debug_sj2("SJPD:        outer equijoin column's width %s",
+                     "was excessive; condition rejected");

How is the value of MAX_SEMIJOIN_SINGLE_KEY_WIDTH determined ?

For verify_valid_pushdown():

+   Assert(path);
+   Assert(target_var_no > 0);
+
+   if (path == NULL)
+   {
+       return false;

I don't understand the first assertion. Does it mean path would always be
non-NULL ? Then the if statement should be dropped.

+               if (path->parent->relid == target_var_no)
+               {
+                   /*
+                    * Found source of target var! We know that the pushdown
+                    * is valid now.
+                    */
+                   return true;
+               }
+               return false;

The above can be simplified as: return path->parent->relid == target_var_no;

+ * True if the given con_exprs, ref_exprs and operators will exactlty

Typo: exactlty -> exactly

+   if (!bms_equal(all_vars, matched_vars))
+       return false;
+   return true;

The above can be simplified as: return bms_equal(all_vars, matched_vars);

Cheers

Zhihong Yu

zyu@yugabyte.com

over 3 years ago

In reply to: Zhihong Yu (#4)

Re: Bloom filter Pushdown Optimization for Merge Join

On Sat, Oct 1, 2022 at 12:45 AM Zhihong Yu <zyu@yugabyte.com> wrote:

On Fri, Sep 30, 2022 at 9:20 PM Zhihong Yu <zyu@yugabyte.com> wrote:
On Fri, Sep 30, 2022 at 8:40 PM Zhihong Yu <zyu@yugabyte.com> wrote:

On Fri, Sep 30, 2022 at 3:44 PM Zheng Li <zhengli10@gmail.com> wrote:

Hello,

A bloom filter provides early filtering of rows that cannot be joined
before they would reach the join operator, the optimization is also
called a semi join filter (SJF) pushdown. Such a filter can be created
when one child of the join operator must materialize its derived table
before the other child is evaluated.

For example, a bloom filter can be created using the the join keys for
the build side/inner side of a hash join or the outer side of a merge
join, the bloom filter can then be used to pre-filter rows on the
other side of the join operator during the scan of the base relation.
The thread about “Hash Joins vs. Bloom Filters / take 2” [1] is good
discussion on using such optimization for hash join without going into
the pushdown of the filter where its performance gain could be further
increased.

We worked on prototyping bloom filter pushdown for both hash join and
merge join. Attached is a patch set for bloom filter pushdown for
merge join. We also plan to send the patch for hash join once we have
it rebased.

Here is a summary of the patch set:
1. Bloom Filter Pushdown optimizes Merge Join by filtering rows early
during the table scan instead of later on.
-The bloom filter is pushed down along the execution tree to
the target SeqScan nodes.
-Experiments show that this optimization can speed up Merge
Join by up to 36%.

2. The planner makes the decision to use the bloom filter based on the
estimated filtering rate and the expected performance gain.
-The planner accomplishes this by estimating four numbers per
variable - the total number of rows of the relation, the number of
distinct values for a given variable, and the minimum and maximum
value of the variable (when applicable). Using these numbers, the
planner estimates a filtering rate of a potential filter.
-Because actually creating and implementing the filter adds
more operations, there is a minimum threshold of filtering where the
filter would actually be useful. Based on testing, we query to see if
the estimated filtering rate is higher than 35%, and that informs our
decision to use a filter or not.

3. If using a bloom filter, the planner also adjusts the expected cost
of Merge Join based on expected performance gain.

4. Capability to build the bloom filter in parallel in case of
parallel SeqScan. This is done efficiently by populating a local bloom
filter for each parallel worker and then taking a bitwise OR over all
the local bloom filters to form a shared bloom filter at the end of
the parallel SeqScan.

5. The optimization is GUC controlled, with settings of
enable_mergejoin_semijoin_filter and force_mergejoin_semijoin_filter.

We found in experiments that there is a significant improvement
when using the bloom filter during Merge Join. One experiment involved
joining two large tables while varying the theoretical filtering rate
(TFR) between the two tables, the TFR is defined as the percentage
that the two datasets are disjoint. Both tables in the merge join were
the same size. We tested changing the TFR to see the change in
filtering optimization.

For example, let’s imagine t0 has 10 million rows, which contain the
numbers 1 through 10 million randomly shuffled. Also, t1 has the
numbers 4 million through 14 million randomly shuffled. Then the TFR
for a join of these two tables is 40%, since 40% of the tables are
disjoint from the other table (1 through 4 million for t0, 10 million
through 14 million for t4).

Here is the performance test result joining two tables:
TFR: theoretical filtering rate
EFR: estimated filtering rate
AFR: actual filtering rate
HJ: hash join
MJ Default: default merge join
MJ Filter: merge join with bloom filter optimization enabled
MJ Filter Forced: merge join with bloom filter optimization forced

TFR EFR AFR HJ MJ Default MJ Filter MJ Filter Forced

-------------------------------------------------------------------------------------
10 33.46 7.41 6529 22638 21949 23160
20 37.27 14.85 6483 22290 21928 21930
30 41.32 22.25 6395 22374 20718 20794
40 45.67 29.7 6272 21969 19449 19410
50 50.41 37.1 6210 21412 18222 18224
60 55.64 44.51 6052 21108 17060 17018
70 61.59 51.98 5947 21020 15682 15737
80 68.64 59.36 5761 20812 14411 14437
90 77.83 66.86 5701 20585 13171 13200
Table. Execution Time (ms) vs Filtering Rate (%) for Joining Two
Tables of 10M Rows.

Attached you can find figures of the same performance test and a SQL
script
to reproduce the performance test.

The first thing to notice is that Hash Join generally is the most
efficient join strategy. This is because Hash Join is better at
dealing with small tables, and our size of 10 million is still small
enough where Hash Join outperforms the other join strategies. Future
experiments can investigate using much larger tables.

However, comparing just within the different Merge Join variants, we
see that using the bloom filter greatly improves performance.
Intuitively, all of these execution times follow linear paths.
Comparing forced filtering versus default, we can see that the default
Merge Join outperforms Merge Join with filtering at low filter rates,
but after about 20% TFR, the Merge Join with filtering outperforms
default Merge Join. This makes intuitive sense, as there are some
fixed costs associated with building and checking with the bloom
filter. In the worst case, at only 10% TFR, the bloom filter makes
Merge Join less than 5% slower. However, in the best case, at 90% TFR,
the bloom filter improves Merge Join by 36%.

Based on the results of the above experiments, we came up with a
linear equation for the performance ratio for using the filter
pushdown from the actual filtering rate. Based on the numbers
presented in the figure, this is the equation:

T_filter / T_no_filter = 1 / (0.83 * estimated filtering rate + 0.863)

For example, this means that with an estimated filtering rate of 0.4,
the execution time of merge join is estimated to be improved by 16.3%.
Note that the estimated filtering rate is used in the equation, not
the theoretical filtering rate or the actual filtering rate because it
is what we have during planning. In practice the estimated filtering
rate isn’t usually accurate. In fact, the estimated filtering rate can
differ from the theoretical filtering rate by as much as 17% in our
experiments. One way to mitigate the power loss of bloom filter caused
by inaccurate estimated filtering rate is to adaptively turn it off at
execution time, this is yet to be implemented.

Here is a list of tasks we plan to work on in order to improve this
patch:
1. More regression testing to guarantee correctness.
2. More performance testing involving larger tables and complicated
query plans.
3. Improve the cost model.
4. Explore runtime tuning such as making the bloom filter checking
adaptive.
5. Currently, only the best single join key is used for building the
Bloom filter. However, if there are several keys and we know that
their distributions are somewhat disjoint, we could leverage this fact
and use multiple keys for the bloom filter.
6. Currently, Bloom filter pushdown is only implemented for SeqScan
nodes. However, it would be possible to allow push down to other types
of scan nodes.
7. Explore if the Bloom filter could be pushed down through a foreign
scan when the foreign server is capable of handling it – which could
be made true for postgres_fdw.
8. Better explain command on the usage of bloom filters.

This patch set is prepared by Marcus Ma, Lyu Pan and myself. Feedback
is appreciated.

With Regards,
Zheng Li
Amazon RDS/Aurora for PostgreSQL

[1]
/messages/by-id/c902844d-837f-5f63-ced3-9f7fd222f175@2ndquadrant.com

Hi,
In the header of patch 1:

In this prototype, the cost model is based on an assumption that there
is a linear relationship between the performance gain from using a semijoin
filter and the estimated filtering rate:
% improvement to Merge Join cost = 0.83 * estimated filtering rate -
0.137.

How were the coefficients (0.83 and 0.137) determined ?
I guess they were based on the results of running certain workload.

Cheers

Hi,
For patch 1:
+bool       enable_mergejoin_semijoin_filter;
+bool       force_mergejoin_semijoin_filter;
How would (enable_mergejoin_semijoin_filter = off,
force_mergejoin_semijoin_filter = on) be interpreted ?
Have you considered using one GUC which has three values: off, enabled,
forced ?

+ mergeclauses_for_sjf =
get_actual_clauses(path->path_mergeclauses);
+ mergeclauses_for_sjf =
get_switched_clauses(path->path_mergeclauses,
+
path->jpath.outerjoinpath->parent->relids);

mergeclauses_for_sjf is assigned twice and I don't
see mergeclauses_for_sjf being reference in the call
to get_switched_clauses().
Is this intentional ?
+           /* want at least 1000 rows_filtered to avoid any nasty edge
cases */
+           if (force_mergejoin_semijoin_filter || (filteringRate >= 0.35
&& rows_filtered > 1000))
The above condition is narrower compared to the enclosing condition.
Since there is no else block for the second if block, please merge the
two if statements.

+ int best_filter_clause;

Normally I would think `clause` is represented by List*. But
best_filter_clause is an int. Please use another variable name so that
there is less chance of confusion.

For evaluate_semijoin_filtering_rate():

+ double best_sj_selectivity = 1.01;

How was 1.01 determined ?

+ debug_sj1("SJPD: start evaluate_semijoin_filtering_rate");

There are debug statements in the methods.
It would be better to remove them in the next patch set.

Cheers
Hi,
Still patch 1.
+       if (!outer_arg_md->is_or_maps_to_base_column
+           && !inner_arg_md->is_or_maps_to_constant)
+       {
+           debug_sj2("SJPD:        outer equijoin arg does not map %s",
+                     "to a base column nor a constant; semijoin is not
valid");
Looks like there is a typo: inner_arg_md->is_or_maps_to_constant should
be outer_arg_md->is_or_maps_to_constant
+       if (outer_arg_md->est_col_width > MAX_SEMIJOIN_SINGLE_KEY_WIDTH)
+       {
+           debug_sj2("SJPD:        outer equijoin column's width %s",
+                     "was excessive; condition rejected");
How is the value of MAX_SEMIJOIN_SINGLE_KEY_WIDTH determined ?

For verify_valid_pushdown():
+   Assert(path);
+   Assert(target_var_no > 0);
+
+   if (path == NULL)
+   {
+       return false;
I don't understand the first assertion. Does it mean path would always be
non-NULL ? Then the if statement should be dropped.
+               if (path->parent->relid == target_var_no)
+               {
+                   /*
+                    * Found source of target var! We know that the
pushdown
+                    * is valid now.
+                    */
+                   return true;
+               }
+               return false;
The above can be simplified as: return path->parent->relid ==
target_var_no;

+ * True if the given con_exprs, ref_exprs and operators will exactlty

Typo: exactlty -> exactly
+   if (!bms_equal(all_vars, matched_vars))
+       return false;
+   return true;
The above can be simplified as: return bms_equal(all_vars, matched_vars);

Cheers

Hi,
Still in patch 1 :-)

+   if (best_path->use_semijoinfilter)
+   {
+       if (best_path->best_mergeclause != -1)

Since there is no else block, the two conditions can be combined.

+ ListCell *clause_cell = list_nth_cell(mergeclauses,
best_path->best_mergeclause);

As shown in the above code, best_mergeclause is the position of the best
merge clause in mergeclauses.
I think best_mergeclause_pos (or similar name) is more appropriate for the
fieldname.

For depth_of_semijoin_target():

+ *  Parameters:
+ *  node: plan node to be considered for semijoin push down.

The name of the parameter is pn - please align the comment with code.

For T_SubqueryScan case in depth_of_semijoin_target():

+               Assert(rte->subquery->targetList);
...
+               if (rel && rel->subroot
+                   && rte && rte->subquery && rte->subquery->targetList)

It seems the condition can be simplified since rte->subquery->targetList
has passed the assertion.

For is_table_scan_node_source_of_relids_or_var(), the else block can be
simplified to returning scan_node_varno == target_var->varno directly.

For get_appendrel_occluded_references():

+ *  Given a virtual column from an Union ALL subquery,
+ *  return the expression it immediately occludes that satisfy

Since the index is returned from the func, it would be better to clarify
the comment by saying `return the last index of expression ...`

+ /* Subquery without append and partitioned tables */

append and partitioned tables -> append or partitioned tables

More reviews for subsequent patches to follow.

Zhihong Yu

zyu@yugabyte.com

over 3 years ago

In reply to: Zhihong Yu (#5)

Re: Bloom filter Pushdown Optimization for Merge Join

On Sun, Oct 2, 2022 at 6:40 AM Zhihong Yu <zyu@yugabyte.com> wrote:

On Sat, Oct 1, 2022 at 12:45 AM Zhihong Yu <zyu@yugabyte.com> wrote:
On Fri, Sep 30, 2022 at 9:20 PM Zhihong Yu <zyu@yugabyte.com> wrote:
On Fri, Sep 30, 2022 at 8:40 PM Zhihong Yu <zyu@yugabyte.com> wrote:

On Fri, Sep 30, 2022 at 3:44 PM Zheng Li <zhengli10@gmail.com> wrote:

Hello,

A bloom filter provides early filtering of rows that cannot be joined
before they would reach the join operator, the optimization is also
called a semi join filter (SJF) pushdown. Such a filter can be created
when one child of the join operator must materialize its derived table
before the other child is evaluated.

For example, a bloom filter can be created using the the join keys for
the build side/inner side of a hash join or the outer side of a merge
join, the bloom filter can then be used to pre-filter rows on the
other side of the join operator during the scan of the base relation.
The thread about “Hash Joins vs. Bloom Filters / take 2” [1] is good
discussion on using such optimization for hash join without going into
the pushdown of the filter where its performance gain could be further
increased.

We worked on prototyping bloom filter pushdown for both hash join and
merge join. Attached is a patch set for bloom filter pushdown for
merge join. We also plan to send the patch for hash join once we have
it rebased.

Here is a summary of the patch set:
1. Bloom Filter Pushdown optimizes Merge Join by filtering rows early
during the table scan instead of later on.
-The bloom filter is pushed down along the execution tree to
the target SeqScan nodes.
-Experiments show that this optimization can speed up Merge
Join by up to 36%.

2. The planner makes the decision to use the bloom filter based on the
estimated filtering rate and the expected performance gain.
-The planner accomplishes this by estimating four numbers per
variable - the total number of rows of the relation, the number of
distinct values for a given variable, and the minimum and maximum
value of the variable (when applicable). Using these numbers, the
planner estimates a filtering rate of a potential filter.
-Because actually creating and implementing the filter adds
more operations, there is a minimum threshold of filtering where the
filter would actually be useful. Based on testing, we query to see if
the estimated filtering rate is higher than 35%, and that informs our
decision to use a filter or not.

3. If using a bloom filter, the planner also adjusts the expected cost
of Merge Join based on expected performance gain.

4. Capability to build the bloom filter in parallel in case of
parallel SeqScan. This is done efficiently by populating a local bloom
filter for each parallel worker and then taking a bitwise OR over all
the local bloom filters to form a shared bloom filter at the end of
the parallel SeqScan.

5. The optimization is GUC controlled, with settings of
enable_mergejoin_semijoin_filter and force_mergejoin_semijoin_filter.

We found in experiments that there is a significant improvement
when using the bloom filter during Merge Join. One experiment involved
joining two large tables while varying the theoretical filtering rate
(TFR) between the two tables, the TFR is defined as the percentage
that the two datasets are disjoint. Both tables in the merge join were
the same size. We tested changing the TFR to see the change in
filtering optimization.

For example, let’s imagine t0 has 10 million rows, which contain the
numbers 1 through 10 million randomly shuffled. Also, t1 has the
numbers 4 million through 14 million randomly shuffled. Then the TFR
for a join of these two tables is 40%, since 40% of the tables are
disjoint from the other table (1 through 4 million for t0, 10 million
through 14 million for t4).

Here is the performance test result joining two tables:
TFR: theoretical filtering rate
EFR: estimated filtering rate
AFR: actual filtering rate
HJ: hash join
MJ Default: default merge join
MJ Filter: merge join with bloom filter optimization enabled
MJ Filter Forced: merge join with bloom filter optimization forced

TFR EFR AFR HJ MJ Default MJ Filter MJ Filter Forced

-------------------------------------------------------------------------------------
10 33.46 7.41 6529 22638 21949 23160
20 37.27 14.85 6483 22290 21928 21930
30 41.32 22.25 6395 22374 20718 20794
40 45.67 29.7 6272 21969 19449 19410
50 50.41 37.1 6210 21412 18222 18224
60 55.64 44.51 6052 21108 17060 17018
70 61.59 51.98 5947 21020 15682 15737
80 68.64 59.36 5761 20812 14411 14437
90 77.83 66.86 5701 20585 13171 13200
Table. Execution Time (ms) vs Filtering Rate (%) for Joining Two
Tables of 10M Rows.

Attached you can find figures of the same performance test and a SQL
script
to reproduce the performance test.

The first thing to notice is that Hash Join generally is the most
efficient join strategy. This is because Hash Join is better at
dealing with small tables, and our size of 10 million is still small
enough where Hash Join outperforms the other join strategies. Future
experiments can investigate using much larger tables.

However, comparing just within the different Merge Join variants, we
see that using the bloom filter greatly improves performance.
Intuitively, all of these execution times follow linear paths.
Comparing forced filtering versus default, we can see that the default
Merge Join outperforms Merge Join with filtering at low filter rates,
but after about 20% TFR, the Merge Join with filtering outperforms
default Merge Join. This makes intuitive sense, as there are some
fixed costs associated with building and checking with the bloom
filter. In the worst case, at only 10% TFR, the bloom filter makes
Merge Join less than 5% slower. However, in the best case, at 90% TFR,
the bloom filter improves Merge Join by 36%.

Based on the results of the above experiments, we came up with a
linear equation for the performance ratio for using the filter
pushdown from the actual filtering rate. Based on the numbers
presented in the figure, this is the equation:

T_filter / T_no_filter = 1 / (0.83 * estimated filtering rate + 0.863)

For example, this means that with an estimated filtering rate of 0.4,
the execution time of merge join is estimated to be improved by 16.3%.
Note that the estimated filtering rate is used in the equation, not
the theoretical filtering rate or the actual filtering rate because it
is what we have during planning. In practice the estimated filtering
rate isn’t usually accurate. In fact, the estimated filtering rate can
differ from the theoretical filtering rate by as much as 17% in our
experiments. One way to mitigate the power loss of bloom filter caused
by inaccurate estimated filtering rate is to adaptively turn it off at
execution time, this is yet to be implemented.

Here is a list of tasks we plan to work on in order to improve this
patch:
1. More regression testing to guarantee correctness.
2. More performance testing involving larger tables and complicated
query plans.
3. Improve the cost model.
4. Explore runtime tuning such as making the bloom filter checking
adaptive.
5. Currently, only the best single join key is used for building the
Bloom filter. However, if there are several keys and we know that
their distributions are somewhat disjoint, we could leverage this fact
and use multiple keys for the bloom filter.
6. Currently, Bloom filter pushdown is only implemented for SeqScan
nodes. However, it would be possible to allow push down to other types
of scan nodes.
7. Explore if the Bloom filter could be pushed down through a foreign
scan when the foreign server is capable of handling it – which could
be made true for postgres_fdw.
8. Better explain command on the usage of bloom filters.

This patch set is prepared by Marcus Ma, Lyu Pan and myself. Feedback
is appreciated.

With Regards,
Zheng Li
Amazon RDS/Aurora for PostgreSQL

[1]
/messages/by-id/c902844d-837f-5f63-ced3-9f7fd222f175@2ndquadrant.com

Hi,
In the header of patch 1:

In this prototype, the cost model is based on an assumption that there
is a linear relationship between the performance gain from using a semijoin
filter and the estimated filtering rate:
% improvement to Merge Join cost = 0.83 * estimated filtering rate -
0.137.

How were the coefficients (0.83 and 0.137) determined ?
I guess they were based on the results of running certain workload.

Cheers

Hi,
For patch 1:
+bool       enable_mergejoin_semijoin_filter;
+bool       force_mergejoin_semijoin_filter;
How would (enable_mergejoin_semijoin_filter = off,
force_mergejoin_semijoin_filter = on) be interpreted ?
Have you considered using one GUC which has three values: off, enabled,
forced ?

+ mergeclauses_for_sjf =
get_actual_clauses(path->path_mergeclauses);
+ mergeclauses_for_sjf =
get_switched_clauses(path->path_mergeclauses,
+
path->jpath.outerjoinpath->parent->relids);

mergeclauses_for_sjf is assigned twice and I don't
see mergeclauses_for_sjf being reference in the call
to get_switched_clauses().
Is this intentional ?
+           /* want at least 1000 rows_filtered to avoid any nasty edge
cases */
+           if (force_mergejoin_semijoin_filter || (filteringRate >=
0.35 && rows_filtered > 1000))
The above condition is narrower compared to the enclosing condition.
Since there is no else block for the second if block, please merge the
two if statements.

+ int best_filter_clause;

Normally I would think `clause` is represented by List*. But
best_filter_clause is an int. Please use another variable name so that
there is less chance of confusion.

For evaluate_semijoin_filtering_rate():

+ double best_sj_selectivity = 1.01;

How was 1.01 determined ?

+ debug_sj1("SJPD: start evaluate_semijoin_filtering_rate");

There are debug statements in the methods.
It would be better to remove them in the next patch set.

Cheers
Hi,
Still patch 1.
+       if (!outer_arg_md->is_or_maps_to_base_column
+           && !inner_arg_md->is_or_maps_to_constant)
+       {
+           debug_sj2("SJPD:        outer equijoin arg does not map %s",
+                     "to a base column nor a constant; semijoin is not
valid");
Looks like there is a typo: inner_arg_md->is_or_maps_to_constant should
be outer_arg_md->is_or_maps_to_constant
+       if (outer_arg_md->est_col_width > MAX_SEMIJOIN_SINGLE_KEY_WIDTH)
+       {
+           debug_sj2("SJPD:        outer equijoin column's width %s",
+                     "was excessive; condition rejected");
How is the value of MAX_SEMIJOIN_SINGLE_KEY_WIDTH determined ?

For verify_valid_pushdown():
+   Assert(path);
+   Assert(target_var_no > 0);
+
+   if (path == NULL)
+   {
+       return false;
I don't understand the first assertion. Does it mean path would always be
non-NULL ? Then the if statement should be dropped.
+               if (path->parent->relid == target_var_no)
+               {
+                   /*
+                    * Found source of target var! We know that the
pushdown
+                    * is valid now.
+                    */
+                   return true;
+               }
+               return false;
The above can be simplified as: return path->parent->relid ==
target_var_no;

+ * True if the given con_exprs, ref_exprs and operators will exactlty

Typo: exactlty -> exactly
+   if (!bms_equal(all_vars, matched_vars))
+       return false;
+   return true;
The above can be simplified as: return bms_equal(all_vars, matched_vars);

Cheers
Hi,
Still in patch 1 :-)
+   if (best_path->use_semijoinfilter)
+   {
+       if (best_path->best_mergeclause != -1)
Since there is no else block, the two conditions can be combined.

+ ListCell *clause_cell = list_nth_cell(mergeclauses,
best_path->best_mergeclause);

As shown in the above code, best_mergeclause is the position of the best
merge clause in mergeclauses.
I think best_mergeclause_pos (or similar name) is more appropriate for the
fieldname.

For depth_of_semijoin_target():
+ *  Parameters:
+ *  node: plan node to be considered for semijoin push down.
The name of the parameter is pn - please align the comment with code.

For T_SubqueryScan case in depth_of_semijoin_target():
+               Assert(rte->subquery->targetList);
...
+               if (rel && rel->subroot
+                   && rte && rte->subquery && rte->subquery->targetList)
It seems the condition can be simplified since rte->subquery->targetList
has passed the assertion.

For is_table_scan_node_source_of_relids_or_var(), the else block can be
simplified to returning scan_node_varno == target_var->varno directly.

For get_appendrel_occluded_references():
+ *  Given a virtual column from an Union ALL subquery,
+ *  return the expression it immediately occludes that satisfy
Since the index is returned from the func, it would be better to clarify
the comment by saying `return the last index of expression ...`

+ /* Subquery without append and partitioned tables */

append and partitioned tables -> append or partitioned tables

More reviews for subsequent patches to follow.

Hi,
For 0002-Support-semijoin-filter-in-the-executor-for-non-para.patch ,

+   if (!qual && !projInfo && !IsA(node, SeqScanState) &&
+       !((SeqScanState *) node)->applySemiJoinFilter)

I am confused by the last two clauses in the condition. If !IsA(node,
SeqScanState) is true, why does the last clause cast node to SeqScanState *
?
I think you forgot to put the last two clauses in a pair of parentheses.

+ /* slot did not pass SemiJoinFilter, so skipping
it. */

skipping it -> skip it

+           /* double row estimate to reduce error rate for Bloom filter */
+           *nodeRows = Max(*nodeRows, scan->ss.ps.plan->plan_rows * 2);

Probably add more comment above about why the row count is doubled and how
the error rate is reduced.

+SemiJoinFilterExamineSlot(List *semiJoinFilters, TupleTableSlot *slot, Oid
tableId)

SemiJoinFilterExamineSlot -> ExamineSlotUsingSemiJoinFilter

Cheers

Tomas Vondra

tomas.vondra@enterprisedb.com

over 3 years ago

In reply to: Zheng Li (#1)

10 attachment(s)

Re: Bloom filter Pushdown Optimization for Merge Join

Hello Zheng Li,

Great to see someone is working on this! Some initial comments/review:

On 10/1/22 00:44, Zheng Li wrote:

Hello,

A bloom filter provides early filtering of rows that cannot be joined
before they would reach the join operator, the optimization is also
called a semi join filter (SJF) pushdown. Such a filter can be created
when one child of the join operator must materialize its derived table
before the other child is evaluated.

For example, a bloom filter can be created using the the join keys for
the build side/inner side of a hash join or the outer side of a merge
join, the bloom filter can then be used to pre-filter rows on the
other side of the join operator during the scan of the base relation.
The thread about “Hash Joins vs. Bloom Filters / take 2” [1] is good
discussion on using such optimization for hash join without going into
the pushdown of the filter where its performance gain could be further
increased.

Agreed. That patch was beneficial for hashjoins with batching, but I
think the pushdown makes this much more interesting.

We worked on prototyping bloom filter pushdown for both hash join and
merge join. Attached is a patch set for bloom filter pushdown for
merge join. We also plan to send the patch for hash join once we have
it rebased.

Here is a summary of the patch set:
1. Bloom Filter Pushdown optimizes Merge Join by filtering rows early
during the table scan instead of later on.
-The bloom filter is pushed down along the execution tree to
the target SeqScan nodes.
-Experiments show that this optimization can speed up Merge
Join by up to 36%.

Right, although I think the speedup very much depends on the data sets
used for the tests, and can be made arbitrarily large with "appropriate"
data set.

2. The planner makes the decision to use the bloom filter based on the
estimated filtering rate and the expected performance gain.
-The planner accomplishes this by estimating four numbers per
variable - the total number of rows of the relation, the number of
distinct values for a given variable, and the minimum and maximum
value of the variable (when applicable). Using these numbers, the
planner estimates a filtering rate of a potential filter.
-Because actually creating and implementing the filter adds
more operations, there is a minimum threshold of filtering where the
filter would actually be useful. Based on testing, we query to see if
the estimated filtering rate is higher than 35%, and that informs our
decision to use a filter or not.

I agree, in principle, although I think the current logic / formula is a
bit too crude and fitted to the simple data used in the test. I think
this needs to be formulated as a regular costing issue, considering
stuff like cost of the hash functions, and so on.

I think this needs to do two things:

1) estimate the cost of building the bloom filter - This shall depend on
the number of rows in the inner relation, number/cost of the hash
functions (which may be higher for some data types), etc.

2) estimate improvement for the probing branch - Essentially, we need to
estimate how much we save by filtering some of the rows, but this also
neeeds to include the cost of probing the bloom filter.

This will probably require some improvements to the lib/bloomfilter, in
order to estimate the false positive rate - this may matter a lot for
large data sets and small work_mem values. The bloomfilter library
simply reduces the size of the bloom filter, which increases the false
positive rate. At some point it'll start reducing the benefit.

3. If using a bloom filter, the planner also adjusts the expected cost
of Merge Join based on expected performance gain.

I think this is going to be a weak point of the costing, because we're
adjusting the cost of the whole subtree after it was costed.

We're doing something similar when costing LIMIT, and that can already
causes a lot of strange stuff with non-uniform data distributions, etc.

And in this case it's probably worse, because we're eliminating rows at
the scan level, without changing the cost of any of the intermediate
nodes. It's certainly going to be confusing in EXPLAIN, because of the
discrepancy between estimated and actual row counts ...

4. Capability to build the bloom filter in parallel in case of
parallel SeqScan. This is done efficiently by populating a local bloom
filter for each parallel worker and then taking a bitwise OR over all
the local bloom filters to form a shared bloom filter at the end of
the parallel SeqScan.

OK. Could also build the bloom filter in shared memory?

5. The optimization is GUC controlled, with settings of
enable_mergejoin_semijoin_filter and force_mergejoin_semijoin_filter.

We found in experiments that there is a significant improvement
when using the bloom filter during Merge Join. One experiment involved
joining two large tables while varying the theoretical filtering rate
(TFR) between the two tables, the TFR is defined as the percentage
that the two datasets are disjoint. Both tables in the merge join were
the same size. We tested changing the TFR to see the change in
filtering optimization.

For example, let’s imagine t0 has 10 million rows, which contain the
numbers 1 through 10 million randomly shuffled. Also, t1 has the
numbers 4 million through 14 million randomly shuffled. Then the TFR
for a join of these two tables is 40%, since 40% of the tables are
disjoint from the other table (1 through 4 million for t0, 10 million
through 14 million for t4).

Here is the performance test result joining two tables:
TFR: theoretical filtering rate
EFR: estimated filtering rate
AFR: actual filtering rate
HJ: hash join
MJ Default: default merge join
MJ Filter: merge join with bloom filter optimization enabled
MJ Filter Forced: merge join with bloom filter optimization forced

TFR EFR AFR HJ MJ Default MJ Filter MJ Filter Forced
-------------------------------------------------------------------------------------
10 33.46 7.41 6529 22638 21949 23160
20 37.27 14.85 6483 22290 21928 21930
30 41.32 22.25 6395 22374 20718 20794
40 45.67 29.7 6272 21969 19449 19410
50 50.41 37.1 6210 21412 18222 18224
60 55.64 44.51 6052 21108 17060 17018
70 61.59 51.98 5947 21020 15682 15737
80 68.64 59.36 5761 20812 14411 14437
90 77.83 66.86 5701 20585 13171 13200
Table. Execution Time (ms) vs Filtering Rate (%) for Joining Two
Tables of 10M Rows.

Attached you can find figures of the same performance test and a SQL script
to reproduce the performance test.

The first thing to notice is that Hash Join generally is the most
efficient join strategy. This is because Hash Join is better at
dealing with small tables, and our size of 10 million is still small
enough where Hash Join outperforms the other join strategies. Future
experiments can investigate using much larger tables.

However, comparing just within the different Merge Join variants, we
see that using the bloom filter greatly improves performance.
Intuitively, all of these execution times follow linear paths.
Comparing forced filtering versus default, we can see that the default
Merge Join outperforms Merge Join with filtering at low filter rates,
but after about 20% TFR, the Merge Join with filtering outperforms
default Merge Join. This makes intuitive sense, as there are some
fixed costs associated with building and checking with the bloom
filter. In the worst case, at only 10% TFR, the bloom filter makes
Merge Join less than 5% slower. However, in the best case, at 90% TFR,
the bloom filter improves Merge Join by 36%.

Based on the results of the above experiments, we came up with a
linear equation for the performance ratio for using the filter
pushdown from the actual filtering rate. Based on the numbers
presented in the figure, this is the equation:

T_filter / T_no_filter = 1 / (0.83 * estimated filtering rate + 0.863)

For example, this means that with an estimated filtering rate of 0.4,
the execution time of merge join is estimated to be improved by 16.3%.
Note that the estimated filtering rate is used in the equation, not
the theoretical filtering rate or the actual filtering rate because it
is what we have during planning. In practice the estimated filtering
rate isn’t usually accurate. In fact, the estimated filtering rate can
differ from the theoretical filtering rate by as much as 17% in our
experiments. One way to mitigate the power loss of bloom filter caused
by inaccurate estimated filtering rate is to adaptively turn it off at
execution time, this is yet to be implemented.

IMHO we shouldn't make too many conclusions from these examples. Yes, it
shows merge join can be improved, but for cases where a hashjoin works
better so we wouldn't use merge join anyway.

I think we should try constructing examples where either merge join wins
already (and gets further improved by the bloom filter), or would lose
to hash join and the bloom filter improves it enough to win.

AFAICS that requires a join of two large tables - large enough that hash
join would need to be batched, or pre-sorted inputs (which eliminates
the explicit Sort, which is the main cost in most cases).

The current patch only works with sequential scans, which eliminates the
second (pre-sorted) option. So let's try the first one - can we invent
an example with a join of two large tables where a merge join would win?

Can we find such example in existing benchmarks like TPC-H/TPC-DS.

Here is a list of tasks we plan to work on in order to improve this patch:
1. More regression testing to guarantee correctness.
2. More performance testing involving larger tables and complicated query plans.
3. Improve the cost model.

4. Explore runtime tuning such as making the bloom filter checking adaptive.

I think this is tricky, I'd leave it out from the patch for now until
the other bits are polished. It can be added later.

5. Currently, only the best single join key is used for building the
Bloom filter. However, if there are several keys and we know that
their distributions are somewhat disjoint, we could leverage this fact
and use multiple keys for the bloom filter.

True, and I guess it wouldn't be hard.

6. Currently, Bloom filter pushdown is only implemented for SeqScan
nodes. However, it would be possible to allow push down to other types
of scan nodes.

I think pushing down the bloom filter to other types of scans is not the
hard part, really. It's populating the bloom filter early enough.

Invariably, all the examples end up with plans like this:

-> Merge Join
Merge Cond: (t0.c1 = t1.c1)
SemiJoin Filter Created Based on: (t0.c1 = t1.c1)
SemiJoin Estimated Filtering Rate: 1.0000
-> Sort
Sort Key: t0.c1
-> Seq Scan on t0
-> Materialize
-> Sort
Sort Key: t1.c1
-> Seq Scan on t1

The bloom filter is built by the first seqscan (on t0), and then used by
the second seqscan (on t1). But this only works because we always run
the t0 scan to completion (because we're feeding it into Sort) before we
start scanning t1.

But when the scan on t1 switches to an index scan, it's over - we'd be
building the filter without being able to probe it, and when we finish
building it we no longer need it. So this seems pretty futile.

It might still improve plans like

-> Merge Join
Merge Cond: (t0.c1 = t1.c1)
SemiJoin Filter Created Based on: (t0.c1 = t1.c1)
SemiJoin Estimated Filtering Rate: 1.0000
-> Sort
Sort Key: t0.c1
-> Seq Scan on t0
-> Index Scan on t1

But I don't know how common/likely that actually is. I'd expect to have
an index on both sides, but perhaps I'm wrong.

This is why hashjoin seems like a more natural fit for the bloom filter,
BTW, because there we have a guarantee the inner relation is processed
first (so we know the bloom filter is fine and can be probed).

7. Explore if the Bloom filter could be pushed down through a foreign
scan when the foreign server is capable of handling it – which could
be made true for postgres_fdw.

Neat idea, but I suggest to leave this out of scope of this patch.

8. Better explain command on the usage of bloom filters.

I don't know what improvements you have in mind exactly, but I think
it'd be good to show which node is building/using a bloom filter, and
then also some basic stats (size, number of hash functions, FPR, number
of probes, ...). This may require improvements to lib/bloomfilter, which
currently does not expose some of the details.

This patch set is prepared by Marcus Ma, Lyu Pan and myself. Feedback
is appreciated.

Attached is a patch series with two "review" parts (0002 and 0004). I
already mentioned some of the stuff above, but a couple more points:

1) Don't allocate memory directly through alloca() etc. Use palloc, i.e.
rely on our memory context.

2) It's customary to have "PlannerInfo *root" as the first parameter.

3) For the "debug" logging, I'd suggest to do it the way TRACE_SORT
(instead of inventing a bunch of dbg routines).

4) I find the naming inconsistent, e.g. with respect to the surrounding
code (say, when everything around starts with Exec, maybe the new
functions should too?). Also, various functions/variables say "semijoin"
but then we apply that to "inner joins" too.

5) Do we really need estimate_distincts_remaining() to implement yet
another formula for estimating number of distinct groups, different from
estimate_num_groups() does? Why?

6) A number of new functions miss comments explaining the purpose, and
it's not quite clear what the "contract" is. Also, some functions have
new parameters but the comment was not updated to reflect it.

7) SemiJoinFilterExamineSlot is matching the relations by OID, but
that's wrong - if you do a self-join, both sides have the same OID. It
needs to match RT index (I believe scanrelid in Scan node is what this
should be looking at).

There's a couple more review comments in the patches, but those are
minor and not worth discussing here - feel free to ask, if anything is
not clear enough (or if you disagree).

I did a bunch of testing, after tweaking your SQL script.

I changed the data generation a bit not to be so slow (instead of
relying on unnest of multiple large sets, I use one sequence and random
to generate data). And I run the tests with different parameters (step,
work_mem, ...) driven by the attached shell script.

And it quickly fails (on assert-enabled-build). I see two backtraces:

1) bogus overlapping estimate (ratio > 1.0)
...
#4 0x0000000000c9d56b in ExceptionalCondition (conditionName=0xe43724
"inner_overlapping_ratio >= 0 && inner_overlapping_ratio <= 1",
errorType=0xd33069 "FailedAssertion", fileName=0xe42bdb "costsize.c",
lineNumber=7442) at assert.c:69
#5 0x00000000008ed767 in evaluate_semijoin_filtering_rate
(join_path=0x2fe79f0, equijoin_list=0x2fea6c0, root=0x2fe6b68,
workspace=0x7ffd87588a78, best_clause=0x7ffd875888cc,
rows_filtered=0x7ffd875888c8) at costsize.c:7442

Seems it's doing the math wrong, or does not expect some corner case.

2) stuck spinlock in SemiJoinFilterFinishScan
...
#5 0x0000000000a85cb0 in s_lock_stuck (file=0xe68c8c "lwlock.c",
line=907, func=0xe690a1 "LWLockWaitListLock") at s_lock.c:83
#6 0x0000000000a85a8d in perform_spin_delay (status=0x7ffd8758b8e8) at
s_lock.c:134
#7 0x0000000000a771c3 in LWLockWaitListLock (lock=0x7e40a597c060) at
lwlock.c:911
#8 0x0000000000a76e93 in LWLockConflictsWithVar (lock=0x7e40a597c060,
valptr=0x7e40a597c048, oldval=1, newval=0x7e40a597c048,
result=0x7ffd8758b983) at lwlock.c:1580
#9 0x0000000000a76ce9 in LWLockWaitForVar (lock=0x7e40a597c060,
valptr=0x7e40a597c048, oldval=1, newval=0x7e40a597c048) at lwlock.c:1638
#10 0x000000000080aa55 in SemiJoinFilterFinishScan
(semiJoinFilters=0x2e349b0, tableId=1253696, parallel_area=0x2e17388) at
nodeMergejoin.c:2035

This only happens in parallel plans, I haven't looked at the details.

I do recall parallel hash join was quite tricky exactly because there
are issues with coordinating building the hash table (e.g. workers might
get stuck due to waiting on shmem queues etc.), I wonder if this might
be something similar due to building the filter. But maybe it's
something trivial.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachments:

0007-Add-basic-regress-tests-for-semijoin-filter-20221003.patchtext/x-patch; charset=UTF-8; name=0007-Add-basic-regress-tests-for-semijoin-filter-20221003.patchDownload

From cbf7743bc61190a1484b2f9fdcad6c5a0bd59d9a Mon Sep 17 00:00:00 2001
From: Lyu Pan <lyup@amazon.com>
Date: Mon, 19 Sep 2022 23:25:30 +0000
Subject: [PATCH 7/7] Add basic regress tests for semijoin filter.

Because of implementation limits (and bugs) in the prototype, only some basic sqls are tested.

1. Test that when force_mergejoin_semijoin_filter is ON, semijoin filter is used.
2. Test that semijoin filter works in two-table inner merge join.
3. Test that semijoin filter can be pushed throudh SORT node.
4. Test that semijoin filter works in a basic three-table-merge-join. However, due to implementation bugs, it doesn't work in all kinds of three table merge joins.
---
 .../expected/mergejoin_semijoinfilter.out     | 158 ++++++++++++++++++
 src/test/regress/parallel_schedule            |   1 +
 .../regress/sql/mergejoin_semijoinfilter.sql  |  62 +++++++
 3 files changed, 221 insertions(+)
 create mode 100644 src/test/regress/expected/mergejoin_semijoinfilter.out
 create mode 100644 src/test/regress/sql/mergejoin_semijoinfilter.sql

diff --git a/src/test/regress/expected/mergejoin_semijoinfilter.out b/src/test/regress/expected/mergejoin_semijoinfilter.out
new file mode 100644
index 00000000000..07c4acc2533
--- /dev/null
+++ b/src/test/regress/expected/mergejoin_semijoinfilter.out
@@ -0,0 +1,158 @@
+SET enable_hashjoin = OFF;
+SET enable_nestloop = OFF;
+SET enable_mergejoin = ON;
+SET enable_mergejoin_semijoin_filter = ON;
+SET force_mergejoin_semijoin_filter = ON;
+CREATE TABLE t1 (
+  i integer,
+  j integer
+);
+CREATE TABLE t2 (
+  i integer,
+  k integer
+);
+CREATE TABLE t3 (
+  i integer,
+  m integer
+);
+INSERT INTO t1 (i, j)
+  SELECT
+    generate_series(1,100000) AS i,
+    generate_series(1,100000) AS j;
+INSERT INTO t2 (i, k)
+  SELECT
+    generate_series(1,100000) AS i,
+    generate_series(1,100000) AS k;
+INSERT INTO t3 (i, m)
+  SELECT
+    generate_series(1,100000) AS i,
+    generate_series(1,100000) AS m;
+-- Semijoin filter is not used when force_mergejoin_semijoin_filter is OFF.
+SET force_mergejoin_semijoin_filter = OFF;
+EXPLAIN (VERBOSE, COSTS OFF) SELECT COUNT(*) FROM t1 JOIN t2 ON t1.i = t2.i;
+               QUERY PLAN                
+-----------------------------------------
+ Aggregate
+   Output: count(*)
+   ->  Merge Join
+         Merge Cond: (t1.i = t2.i)
+         ->  Sort
+               Output: t1.i
+               Sort Key: t1.i
+               ->  Seq Scan on public.t1
+                     Output: t1.i
+         ->  Sort
+               Output: t2.i
+               Sort Key: t2.i
+               ->  Seq Scan on public.t2
+                     Output: t2.i
+(14 rows)
+
+SET force_mergejoin_semijoin_filter = ON;
+-- One level of inner mergejoin: push semi-join filter to outer scan.
+EXPLAIN (VERBOSE, COSTS OFF) SELECT COUNT(*) FROM t1 JOIN t2 ON t1.i = t2.i;
+                       QUERY PLAN                        
+---------------------------------------------------------
+ Aggregate
+   Output: count(*)
+   ->  Merge Join
+         Merge Cond: (t1.i = t2.i)
+         SemiJoin Filter Created Based on: (t1.i = t2.i)
+         SemiJoin Estimated Filtering Rate: -0.0100
+         ->  Sort
+               Output: t1.i
+               Sort Key: t1.i
+               ->  Seq Scan on public.t1
+                     Output: t1.i
+         ->  Sort
+               Output: t2.i
+               Sort Key: t2.i
+               ->  Seq Scan on public.t2
+                     Output: t2.i
+(16 rows)
+
+SELECT COUNT(*) FROM t1 JOIN t2 ON t1.i = t2.i;
+ count  
+--------
+ 100000
+(1 row)
+
+-- Push semijoin filter through SORT node.
+EXPLAIN (VERBOSE, COSTS OFF) SELECT COUNT(*) FROM (SELECT DISTINCT t1.i FROM t1 ORDER BY t1.i) x JOIN t2 ON x.i = t2.i;
+                       QUERY PLAN                        
+---------------------------------------------------------
+ Aggregate
+   Output: count(*)
+   ->  Merge Join
+         Merge Cond: (t1.i = t2.i)
+         SemiJoin Filter Created Based on: (t1.i = t2.i)
+         SemiJoin Estimated Filtering Rate: -0.0100
+         ->  Sort
+               Output: t1.i
+               Sort Key: t1.i
+               ->  HashAggregate
+                     Output: t1.i
+                     Group Key: t1.i
+                     ->  Seq Scan on public.t1
+                           Output: t1.i, t1.j
+         ->  Sort
+               Output: t2.i
+               Sort Key: t2.i
+               ->  Seq Scan on public.t2
+                     Output: t2.i
+(19 rows)
+
+SELECT COUNT(*) FROM (SELECT DISTINCT t1.i FROM t1 ORDER BY t1.i) x JOIN t2 ON x.i = t2.i;
+ count  
+--------
+ 100000
+(1 row)
+
+-- Two levels of MergeJoin
+EXPLAIN (VERBOSE, COSTS OFF) SELECT COUNT(*) FROM t1 JOIN t2 ON t1.i = t2.i JOIN t3 ON t2.i = t3.i;
+                             QUERY PLAN                              
+---------------------------------------------------------------------
+ Aggregate
+   Output: count(*)
+   ->  Merge Join
+         Merge Cond: (t3.i = t1.i)
+         SemiJoin Filter Created Based on: (t3.i = t1.i)
+         SemiJoin Estimated Filtering Rate: -0.0100
+         ->  Sort
+               Output: t3.i
+               Sort Key: t3.i
+               ->  Seq Scan on public.t3
+                     Output: t3.i
+         ->  Materialize
+               Output: t1.i, t2.i
+               ->  Merge Join
+                     Output: t1.i, t2.i
+                     Merge Cond: (t1.i = t2.i)
+                     SemiJoin Filter Created Based on: (t1.i = t2.i)
+                     SemiJoin Estimated Filtering Rate: -0.0100
+                     ->  Sort
+                           Output: t1.i
+                           Sort Key: t1.i
+                           ->  Seq Scan on public.t1
+                                 Output: t1.i
+                     ->  Sort
+                           Output: t2.i
+                           Sort Key: t2.i
+                           ->  Seq Scan on public.t2
+                                 Output: t2.i
+(28 rows)
+
+SELECT COUNT(*) FROM t1 JOIN t2 ON t1.i = t2.i JOIN t3 ON t2.i = t3.i;
+ count  
+--------
+ 100000
+(1 row)
+
+DROP TABLE t1;
+DROP TABLE t2;
+DROP TABLE t3;
+RESET enable_mergejoin;
+RESET enable_memoize;
+RESET enable_mergejoin;
+RESET enable_mergejoin_semijoin_filter;
+RESET force_mergejoin_semijoin_filter;
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index 9f644a0c1b2..6bde05b8cbd 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -98,6 +98,7 @@ test: rules psql psql_crosstab amutils stats_ext collate.linux.utf8
 test: select_parallel
 test: write_parallel
 test: vacuum_parallel
+test: mergejoin_semijoinfilter
 
 # no relation related tests can be put in this group
 test: publication subscription
diff --git a/src/test/regress/sql/mergejoin_semijoinfilter.sql b/src/test/regress/sql/mergejoin_semijoinfilter.sql
new file mode 100644
index 00000000000..6c6418f71a4
--- /dev/null
+++ b/src/test/regress/sql/mergejoin_semijoinfilter.sql
@@ -0,0 +1,62 @@
+SET enable_hashjoin = OFF;
+SET enable_nestloop = OFF;
+SET enable_mergejoin = ON;
+SET enable_mergejoin_semijoin_filter = ON;
+SET force_mergejoin_semijoin_filter = ON;
+
+CREATE TABLE t1 (
+  i integer,
+  j integer
+);
+
+CREATE TABLE t2 (
+  i integer,
+  k integer
+);
+
+CREATE TABLE t3 (
+  i integer,
+  m integer
+);
+
+INSERT INTO t1 (i, j)
+  SELECT
+    generate_series(1,100000) AS i,
+    generate_series(1,100000) AS j;
+
+INSERT INTO t2 (i, k)
+  SELECT
+    generate_series(1,100000) AS i,
+    generate_series(1,100000) AS k;
+
+INSERT INTO t3 (i, m)
+  SELECT
+    generate_series(1,100000) AS i,
+    generate_series(1,100000) AS m;
+
+-- Semijoin filter is not used when force_mergejoin_semijoin_filter is OFF.
+SET force_mergejoin_semijoin_filter = OFF;
+EXPLAIN (VERBOSE, COSTS OFF) SELECT COUNT(*) FROM t1 JOIN t2 ON t1.i = t2.i;
+SET force_mergejoin_semijoin_filter = ON;
+
+-- One level of inner mergejoin: push semi-join filter to outer scan.
+EXPLAIN (VERBOSE, COSTS OFF) SELECT COUNT(*) FROM t1 JOIN t2 ON t1.i = t2.i;
+SELECT COUNT(*) FROM t1 JOIN t2 ON t1.i = t2.i;
+
+-- Push semijoin filter through SORT node.
+EXPLAIN (VERBOSE, COSTS OFF) SELECT COUNT(*) FROM (SELECT DISTINCT t1.i FROM t1 ORDER BY t1.i) x JOIN t2 ON x.i = t2.i;
+SELECT COUNT(*) FROM (SELECT DISTINCT t1.i FROM t1 ORDER BY t1.i) x JOIN t2 ON x.i = t2.i;
+
+-- Two levels of MergeJoin
+EXPLAIN (VERBOSE, COSTS OFF) SELECT COUNT(*) FROM t1 JOIN t2 ON t1.i = t2.i JOIN t3 ON t2.i = t3.i;
+SELECT COUNT(*) FROM t1 JOIN t2 ON t1.i = t2.i JOIN t3 ON t2.i = t3.i;
+
+DROP TABLE t1;
+DROP TABLE t2;
+DROP TABLE t3;
+
+RESET enable_mergejoin;
+RESET enable_memoize;
+RESET enable_mergejoin;
+RESET enable_mergejoin_semijoin_filter;
+RESET force_mergejoin_semijoin_filter;
-- 
2.37.3

bloom-bt.txttext/plain; charset=UTF-8; name=bloom-bt.txtDownload

0001-Support-semijoin-filter-in-the-planner-opti-20221003.patchtext/x-patch; charset=UTF-8; name=0001-Support-semijoin-filter-in-the-planner-opti-20221003.patchDownload

From 82b9d9bcd683bd9030de334422bdd25403602bc8 Mon Sep 17 00:00:00 2001
From: Lyu Pan <lyup@amazon.com>
Date: Thu, 15 Sep 2022 22:16:23 +0000
Subject: [PATCH 1/7] Support semijoin filter in the planner/optimizer.

1. Introduces two GUCs enable_mergejoin_semijoin_filter and force_mergejoin_semijoin_filter. enable_mergejoin_semijoin_filter enables the use of bloom filter during a merge-join, if such a filter is available, the planner will adjust the merge-join cost and use it only if it meets certain threshold (estimated filtering rate has to be higher than a certain value);
force_mergejoin_semijoin_filter forces the use of bloom filter during a merge-join if a valid filter is available.

2. In this prototype, only a single join clause where both sides maps to a base column will be considered as the key to build the bloom filter. For example:
bloom filter may be used in this query:
SELECT * FROM a JOIN b ON a.col1 = b.col1;
bloom filter will not be used in the following query (the left hand side of the join clause is an expression):
SELECT * FROM a JOIN b ON a.col1 + a.col2 = b.col1;

3. In this prototype, the cost model is based on an assumption that there is a linear relationship between the performance gain from using a semijoin filter and the estimated filtering rate:
% improvement to Merge Join cost = 0.83 * estimated filtering rate - 0.137.
---
 src/backend/optimizer/path/costsize.c         | 1298 ++++++++++++++++-
 src/backend/optimizer/plan/createplan.c       |  611 ++++++++
 src/backend/utils/adt/selfuncs.c              |   24 +-
 src/backend/utils/misc/guc_tables.c           |   20 +
 src/backend/utils/misc/postgresql.conf.sample |    2 +
 src/include/nodes/pathnodes.h                 |   13 +
 src/include/nodes/plannodes.h                 |    8 +
 src/include/optimizer/cost.h                  |   47 +
 src/include/utils/selfuncs.h                  |    4 +-
 src/test/regress/expected/sysviews.out        |   49 +-
 10 files changed, 2037 insertions(+), 39 deletions(-)

diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index f486d42441b..d1663e5a379 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -192,6 +192,30 @@ static double relation_byte_size(double tuples, int width);
 static double page_size(double tuples, int width);
 static double get_parallel_divisor(Path *path);
 
+/*
+ *  Local functions and option variables to support
+ *  semijoin pushdowns from join nodes
+ */
+static double evaluate_semijoin_filtering_rate(JoinPath *join_path,
+											   const List *hash_equijoins,
+											   const PlannerInfo *root,
+											   JoinCostWorkspace *workspace,
+											   int *best_clause,
+											   int *rows_filtered);
+static bool verify_valid_pushdown(const Path *p,
+								  const Index pushdown_target_key_no,
+								  const PlannerInfo *root);
+static TargetEntry *get_nth_targetentry(int posn,
+										const List *targetlist);
+static bool is_fk_pk(const Var *outer_var,
+					 const Var *inner_var,
+					 Oid op_oid,
+					 const PlannerInfo *root);
+static List *get_switched_clauses(List *clauses, Relids outerrelids);
+
+/* Global variables to store semijoin control options */
+bool		enable_mergejoin_semijoin_filter;
+bool		force_mergejoin_semijoin_filter;
 
 /*
  * clamp_row_est
@@ -3650,6 +3674,10 @@ initial_cost_mergejoin(PlannerInfo *root, JoinCostWorkspace *workspace,
 			innerstartsel = 0.0;
 			innerendsel = 1.0;
 		}
+		workspace->outer_min_val = cache->leftmin;
+		workspace->outer_max_val = cache->leftmax;
+		workspace->inner_min_val = cache->rightmin;
+		workspace->inner_max_val = cache->rightmax;
 	}
 	else
 	{
@@ -3811,6 +3839,10 @@ final_cost_mergejoin(PlannerInfo *root, MergePath *path,
 	double		mergejointuples,
 				rescannedtuples;
 	double		rescanratio;
+	List	   *mergeclauses_for_sjf;
+	double		filteringRate;
+	int			best_filter_clause;
+	int			rows_filtered;
 
 	/* Protect some assumptions below that rowcounts aren't zero */
 	if (inner_path_rows <= 0)
@@ -3863,6 +3895,49 @@ final_cost_mergejoin(PlannerInfo *root, MergePath *path,
 	else
 		path->skip_mark_restore = false;
 
+	if (enable_mergejoin_semijoin_filter)
+	{
+		/*
+		 * determine if merge join should use a semijoin filter. We need to
+		 * rearrange the merge clauses so they match the status of the clauses
+		 * during Plan creation.
+		 */
+		mergeclauses_for_sjf = get_actual_clauses(path->path_mergeclauses);
+		mergeclauses_for_sjf = get_switched_clauses(path->path_mergeclauses,
+													path->jpath.outerjoinpath->parent->relids);
+		filteringRate = evaluate_semijoin_filtering_rate((JoinPath *) path, mergeclauses_for_sjf, root,
+														 workspace, &best_filter_clause, &rows_filtered);
+		if (force_mergejoin_semijoin_filter ||
+			(filteringRate >= 0 && rows_filtered >= 0 && best_filter_clause >= 0))
+		{
+			/* found a valid SJF at the very least */
+			/* want at least 1000 rows_filtered to avoid any nasty edge cases */
+			if (force_mergejoin_semijoin_filter || (filteringRate >= 0.35 && rows_filtered > 1000))
+			{
+				double		improvement;
+
+				path->use_semijoinfilter = true;
+				path->best_mergeclause = best_filter_clause;
+				path->filteringRate = filteringRate;
+
+				/*
+				 * Based on experimental data, we have found that there is a
+				 * linear relationship between the estimated filtering rate
+				 * and improvement to the cost of Merge Join. In fact, this
+				 * improvement can be modeled by this equation: improvement =
+				 * 0.83 * filtering rate - 0.137 i.e., a filtering rate of 0.4
+				 * yields an improvement of 19.5%. This equation also
+				 * concludes thata a 17% filtering rate is the break-even
+				 * point, so we use 35% just be conservative. We use this
+				 * information to adjust the MergeJoin's planned cost.
+				 */
+				improvement = 0.83 * filteringRate - 0.137;
+				run_cost = (1 - improvement) * run_cost;
+				workspace->run_cost = run_cost;
+			}
+		}
+	}
+
 	/*
 	 * Get approx # tuples passing the mergequals.  We use approx_tuple_count
 	 * here because we need an estimate done with JOIN_INNER semantics.
@@ -4044,6 +4119,10 @@ cached_scansel(PlannerInfo *root, RestrictInfo *rinfo, PathKey *pathkey)
 				leftendsel,
 				rightstartsel,
 				rightendsel;
+	Datum		leftmin,
+				leftmax,
+				rightmin,
+				rightmax;
 	MemoryContext oldcontext;
 
 	/* Do we have this result already? */
@@ -4066,7 +4145,11 @@ cached_scansel(PlannerInfo *root, RestrictInfo *rinfo, PathKey *pathkey)
 					 &leftstartsel,
 					 &leftendsel,
 					 &rightstartsel,
-					 &rightendsel);
+					 &rightendsel,
+					 &leftmin,
+					 &leftmax,
+					 &rightmin,
+					 &rightmax);
 
 	/* Cache the result in suitably long-lived workspace */
 	oldcontext = MemoryContextSwitchTo(root->planner_cxt);
@@ -4080,6 +4163,10 @@ cached_scansel(PlannerInfo *root, RestrictInfo *rinfo, PathKey *pathkey)
 	cache->leftendsel = leftendsel;
 	cache->rightstartsel = rightstartsel;
 	cache->rightendsel = rightendsel;
+	cache->leftmin = leftmin;
+	cache->leftmax = leftmax;
+	cache->rightmin = rightmin;
+	cache->rightmax = rightmax;
 
 	rinfo->scansel_cache = lappend(rinfo->scansel_cache, cache);
 
@@ -6552,3 +6639,1212 @@ compute_bitmap_pages(PlannerInfo *root, RelOptInfo *baserel, Path *bitmapqual,
 
 	return pages_fetched;
 }
+
+/* The following code was modified from previous commits on the
+ * HashJoin SemiJoinFilter CR.
+ */
+
+/*
+ *  Conditionally compiled tracing is available for the semijoin
+ *  decision process.  Tracing in only included for debug builds,
+ *  and only if the TRACE_SJPD flag is defined.
+ */
+
+#define TRACE_SJPD 0
+#define DEBUG_BUILD 0
+
+#if defined(DEBUG_BUILD) && defined(TRACE_SJPD) && DEBUG_BUILD && TRACE_SJPD
+#define debug_sj1(x)	   elog(INFO, (x))
+#define debug_sj2(x,y)	   elog(INFO, (x), (y))
+#define debug_sj3(x,y,z)   elog(INFO, (x), (y), (z))
+#define debug_sj4(w,x,y,z) elog(INFO, (w), (x), (y), (z))
+#define debug_sj_md(x,y,z) debug_sj_expr_metadata((x), (y), (z))
+#else
+#define debug_sj1(x)
+#define debug_sj2(x,y)
+#define debug_sj3(x,y,z)
+#define debug_sj4(w,x,y,z)
+#define debug_sj_md(x,y,z)
+#endif
+
+
+static void
+init_expr_metadata(ExprMetadata * md)
+{
+	/* Should only be called by analyze_expr_for_metadata */
+	Assert(md);
+
+	md->is_or_maps_to_constant = false;
+	md->is_or_maps_to_base_column = false;
+	md->local_column_expr = NULL;
+	md->local_relation = NULL;
+	md->est_col_width = 0;
+	md->base_column_expr = NULL;
+	md->base_rel = NULL;
+	md->base_rel_root = NULL;
+	md->base_rel_row_count = 0.0;
+	md->base_rel_filt_row_count = 0.0;
+	md->base_col_distincts = -1.0;
+	md->est_distincts_reliable = false;
+	md->expr_est_distincts = -1.0;
+}
+
+
+#if defined(DEBUG_BUILD) && defined(TRACE_SJPD) && DEBUG_BUILD && TRACE_SJPD
+
+static void
+debug_sj_expr_metadata(const char *side, int ord, const ExprMetadata * md)
+{
+	debug_sj4("SJPD:          %s key [%d]  is constant:   %d",
+			  side, ord, md->is_or_maps_to_constant);
+	debug_sj4("SJPD:          %s key [%d]  is base col:   %d",
+			  side, ord, md->is_or_maps_to_base_column);
+	debug_sj4("SJPD:          %s key [%d]  est reliable:  %d",
+			  side, ord, md->est_distincts_reliable);
+	debug_sj4("SJPD:          %s key [%d]  trows_bf:      %.1lf",
+			  side, ord, md->base_rel_row_count);
+	debug_sj4("SJPD:          %s key [%d]  trows_af:      %.1lf",
+			  side, ord, md->base_rel_filt_row_count);
+	debug_sj4("SJPD:          %s key [%d]  bcol_dist:     %.1lf",
+			  side, ord, md->base_col_distincts);
+	debug_sj4("SJPD:          %s key [%d]  est width:     %d",
+			  side, ord, md->est_col_width);
+
+	if (md->local_relation && md->local_relation != md->base_rel)
+	{
+		debug_sj4("SJPD:          %s key [%d]  logical relid: %d",
+				  side, ord, md->local_relation->relid);
+	}
+	if (md->base_rel)
+	{
+		debug_sj4("SJPD:          %s key [%d]  base relid:    %d",
+				  side, ord, md->base_rel->relid);
+	}
+
+	if (md->base_rel && (md->base_rel->reloptkind == RELOPT_BASEREL ||
+						 md->base_rel->reloptkind == RELOPT_OTHER_MEMBER_REL))
+	{
+		/* include the column name, if we can get it  */
+		const RelOptInfo *cur_relation = md->base_rel;
+		Oid			cur_var_reloid = InvalidOid;
+		const Var  *cur_var = (const Var *) md->base_column_expr;
+
+		Assert(IsA(cur_var, Var));
+		cur_var_reloid = (planner_rt_fetch(cur_relation->relid,
+										   md->base_rel_root))->relid;
+		if (cur_var_reloid != InvalidOid && cur_var->varattno > 0)
+		{
+			const char *base_attribute_name =
+			get_attname(cur_var_reloid, md->base_column_expr->varattno,
+						true);
+			const char *base_rel_name = get_rel_name(cur_var_reloid);
+			char		name_str[260] = "";
+
+			if (base_rel_name && base_attribute_name)
+			{
+				snprintf(name_str, sizeof(name_str), "%s.%s",
+						 base_rel_name, base_attribute_name);
+			}
+			else if (base_attribute_name)
+			{
+				snprintf(name_str, sizeof(name_str), "%s",
+						 base_attribute_name);
+			}
+			if (base_attribute_name)
+			{
+				debug_sj4("SJPD:          %s key [%d]  base col name: %s",
+						  side, ord, name_str);
+			}
+		}
+	}
+	debug_sj4("SJPD:        %s key [%d]    est distincts: %.1lf",
+			  side, ord, md->expr_est_distincts);
+}
+#endif
+
+
+static double
+estimate_distincts_remaining(double original_table_row_count,
+							 double original_distinct_count,
+							 double est_row_count_after_predicates)
+{
+	/*
+	 * Estimates the number of distinct values still present within a column
+	 * after some local filtering has been applied to that table and thereby
+	 * restricted the set of relevant rows.
+	 *
+	 * This method assumes that the original_distinct_count comes from a
+	 * column whose values are uncorrelated with the row restricting
+	 * condition(s) on this table.  Other mechanisms need to be added to more
+	 * accurately handle the cases where the row restrincting condition is
+	 * directly on the current column.
+	 *
+	 * The most probable number of distinct values remaining can be computed
+	 * exactly using Yao's iterative expansion formula from: "Approximating
+	 * block accesses in database organizations", S. B. Yao, CACM, V20, N4,
+	 * April 1977, p. 260-261 However, this formula gets very expensive to
+	 * compute whenever the number of distinct values is large.
+	 *
+	 * This function instead uses a non-iterative approximation of Yao's
+	 * iterative formula from: "Estimating Block Accesses in Database
+	 * Organizations: A Closed Noniterative Formula", Kyu-Young Whang, Gio
+	 * Wiederhold, and Daniel Sagalowicz CACM V26, N11, November 1983, p.
+	 * 945-947 This approximation starts with terms for the first three
+	 * iterations of Yao's formula, and then inserts two adjustment factors
+	 * into the third term which minimize the total error related to the
+	 * missing subsequent terms.
+	 *
+	 * Internally this function uses M, N, P, and K as variables to match the
+	 * notation used in the equation in the paper.
+	 */
+	double		n = original_table_row_count;
+	double		m = original_distinct_count;
+	double		k = est_row_count_after_predicates;
+	double		p = 0.0;		/* avg rows per distinct */
+	double		result;
+
+	/* The three partial probabality terms */
+	double		term_1 = 0.0;
+	double		term_2 = 0.0;
+	double		term_3 = 0.0;
+	double		sum_terms = 0.0;
+
+	/* In debug builds, validate the sanity of the inputs  */
+	Assert(isfinite(original_table_row_count));
+	Assert(isfinite(original_distinct_count));
+	Assert(isfinite(est_row_count_after_predicates));
+	Assert(original_table_row_count >= 0.0);
+	Assert(original_distinct_count >= 0.0);
+	Assert(est_row_count_after_predicates >= 0.0);
+	Assert(original_distinct_count <= original_table_row_count);
+	Assert(est_row_count_after_predicates <= original_table_row_count);
+
+	if (n > 0.0 && m > 0.0)
+	{
+		p = (n / m);
+	}
+	Assert(isfinite(p));
+
+	if (k > (n - p))
+	{							/* All distincts almost guaranteed to still be
+								 * present */
+		result = m;
+	}
+	else if (m < 0.000001)
+	{							/* When all values are NULL, avoid division by
+								 * zero */
+		result = 0.0;
+	}
+	else if (k <= 1.000001)
+	{							/* When only one or zero rows after filtering */
+		result = k;
+	}
+	else
+	{
+		/*
+		 * When this is not a special case, compute the partial probabilities.
+		 * However, if the probability calculation overflows, then revert to
+		 * the estimate we can get from the upper bound analysis.
+		 */
+		result = fmin(original_distinct_count, est_row_count_after_predicates);
+
+		if (isfinite(1.0 / m) && isfinite(pow((1.0 - (1.0 / m)), k)))
+		{
+			term_1 = (1.0 - pow((1.0 - (1.0 / m)), k));
+
+			if (isfinite(term_1))
+			{
+				/*
+				 * As long as we at least have a usable term_1, then proceed
+				 * to the much smaller term_2 and to the even smaller term_3.
+				 *
+				 * If no usable term_1, then just use the hard upper bounds.
+				 */
+				if (isfinite(m * m * p)
+					&& isfinite(pow((1.0 - (1.0 / m)), (k - 1.0))))
+				{
+					term_2 = ((1.0 / (m * m * p))
+							  * ((k * (k - 1.0)) / 2.0)
+							  * pow((1.0 - (1.0 / m)), (k - 1.0))
+						);
+				}
+				if (!isfinite(term_2))
+				{
+					term_2 = 0.0;
+				}
+				if (isfinite(pow(m, 3.0))
+					&& isfinite(pow(p, 4.0))
+					&& isfinite(pow(m, 3.0) * pow(p, 4.0))
+					&& isfinite(k * (k - 1.0) * ((2 * k) - 1.0))
+					&& isfinite(pow((1.0 - (1.0 / m)), (k - 1.0))))
+				{
+					term_3 = ((1.5 / (pow(m, 3.0) * pow(p, 4.0)))
+							  * ((k * (k - 1.0) * ((2 * k) - 1.0)) / 6.0)
+							  * pow((1.0 - (1.0 / m)), (k - 1.0)));
+				}
+				if (!isfinite(term_3))
+				{
+					term_3 = 0.0;
+				}
+				sum_terms = term_1 + term_2 + term_3;
+
+				/* In debug builds, validate the partial probability terms */
+				Assert(term_1 <= 1.0 && term_1 >= 0.0);
+				Assert(term_2 <= 1.0 && term_2 >= 0.0);
+				Assert(term_3 <= 1.0 && term_3 >= 0.0);
+				Assert(term_1 > term_2);
+				Assert(term_2 >= term_3);
+				Assert(isfinite(sum_terms));
+				Assert(sum_terms <= 1.0);
+
+				if (isfinite(m * sum_terms))
+				{
+					result = round(m * sum_terms);
+				}
+
+				/* Ensure hard upper bounds still satisfied  */
+				result = fmin(result,
+							  fmin(original_distinct_count,
+								   est_row_count_after_predicates));
+
+				/* Since empty tables were handled above, must be >= 1 */
+				result = fmax(result, 1.0);
+			}
+		}
+	}
+
+	Assert(result >= 0.0);
+	Assert(result <= original_distinct_count);
+	Assert(result <= est_row_count_after_predicates);
+
+	return result;
+}
+
+
+static void
+gather_base_column_metadata(const Var *base_col,
+							const RelOptInfo *base_rel,
+							const PlannerInfo *root,
+							ExprMetadata * md)
+{
+	/*
+	 * Given a Var for a base column, gather metadata about that column Should
+	 * only be called indirectly under analyze_expr_for_metadata
+	 */
+	VariableStatData base_col_vardata;
+	Oid			base_col_reloid = InvalidOid;
+	bool		is_default;
+
+	/* Oid var_sortop; */
+
+	Assert(md && base_rel && root);
+	Assert(base_col && IsA(base_col, Var));
+	Assert(base_rel->reloptkind == RELOPT_BASEREL ||
+		   base_rel->reloptkind == RELOPT_OTHER_MEMBER_REL);
+	Assert(base_rel->rtekind == RTE_RELATION);
+
+	md->base_column_expr = base_col;
+	md->base_rel = base_rel;
+	md->base_rel_root = root;
+	md->is_or_maps_to_base_column = true;
+
+	examine_variable((PlannerInfo *) root, (Node *) base_col, 0,
+					 &base_col_vardata);
+	Assert(base_col_vardata.rel);
+	Assert(base_col_vardata.rel == base_rel);
+
+	md->base_rel_row_count = md->base_rel->tuples;
+	md->base_rel_filt_row_count = md->base_rel->rows;
+	md->base_col_distincts =
+		fmin(get_variable_numdistinct(&base_col_vardata, &is_default),
+			 md->base_rel_row_count);
+	md->est_distincts_reliable = !is_default;
+
+	/*
+	 * For indirectly filtered columns estimate the effect of the rows
+	 * filtered on the remaining column distinct count.
+	 */
+	md->expr_est_distincts =
+		fmax(1.0,
+			 estimate_distincts_remaining(md->base_rel_row_count,
+										  md->base_col_distincts,
+										  md->base_rel_filt_row_count));
+
+	base_col_reloid = (planner_rt_fetch(base_rel->relid, root))->relid;
+	if (base_col_reloid != InvalidOid && base_col->varattno > 0)
+	{
+		md->est_col_width =
+			get_attavgwidth(base_col_reloid, base_col->varattno);
+	}
+	ReleaseVariableStats(base_col_vardata);
+}
+
+static Expr *
+get_subquery_var_occluded_reference(const Expr *ex, const PlannerInfo *root)
+{
+	/*
+	 * Given a virtual column from an unflattened subquery, return the
+	 * expression it immediately occludes
+	 */
+	Var		   *outside_subq_var = (Var *) ex;
+	RelOptInfo *outside_subq_relation = NULL;
+	RangeTblEntry *outside_subq_rte = NULL;
+	TargetEntry *te = NULL;
+	Expr	   *inside_subq_expr = NULL;
+
+	Assert(ex && root);
+	Assert(IsA(ex, Var));
+	Assert(outside_subq_var->varno < root->simple_rel_array_size);
+
+	outside_subq_relation = root->simple_rel_array[outside_subq_var->varno];
+	outside_subq_rte = root->simple_rte_array[outside_subq_var->varno];
+
+	/*
+	 * If inheritance, subquery has append, leg of append in subquery may not
+	 * have subroot, we may be able to better process it according to
+	 * root->append_rel_list. For now just return the first leg... TODO better
+	 * handling of Union All, we only return statistics of the first leg atm.
+	 * TODO similarly, need better handling of partitioned tables, according
+	 * to outside_subq_relation->part_scheme and part_rels.
+	 */
+	if (outside_subq_rte->inh)
+	{
+		AppendRelInfo *appendRelInfo = NULL;
+
+		Assert(root->append_rel_list);
+
+		/* TODO remove this check once we add better handling of inheritance */
+		if (force_mergejoin_semijoin_filter)
+		{
+			appendRelInfo = list_nth(root->append_rel_list, 0);
+			Assert(appendRelInfo->parent_relid == outside_subq_var->varno);
+
+			Assert(appendRelInfo->translated_vars &&
+				   outside_subq_var->varattno <=
+				   list_length(appendRelInfo->translated_vars));
+			inside_subq_expr = list_nth(appendRelInfo->translated_vars,
+										outside_subq_var->varattno - 1);
+		}
+	}
+
+	/* Subquery without append and partitioned tables */
+	else
+	{
+		Assert(outside_subq_relation && IsA(outside_subq_relation, RelOptInfo));
+		Assert(outside_subq_relation->reloptkind == RELOPT_BASEREL);
+		Assert(outside_subq_relation->rtekind == RTE_SUBQUERY);
+		Assert(outside_subq_relation->subroot->processed_tlist);
+
+		te = get_nth_targetentry(outside_subq_var->varattno,
+								 outside_subq_relation->subroot->processed_tlist);
+		Assert(te && outside_subq_var->varattno == te->resno);
+		inside_subq_expr = te->expr;
+
+		/*
+		 * Strip off any Relabel present, and return the underlying expression
+		 */
+		while (inside_subq_expr && IsA(inside_subq_expr, RelabelType))
+		{
+			inside_subq_expr = ((RelabelType *) inside_subq_expr)->arg;
+		}
+	}
+
+	return inside_subq_expr;
+}
+
+
+static void
+recursively_analyze_expr_metadata(const Expr *ex,
+								  const PlannerInfo *root,
+								  ExprMetadata * md)
+{
+	/* Should only be called by analyze_expr_for_metadata, or itself */
+	Assert(md && ex && root);
+
+	if (IsA(ex, Const))
+	{
+		md->is_or_maps_to_constant = true;
+		md->expr_est_distincts = 1.0;
+		md->est_distincts_reliable = true;
+	}
+	else if (IsA(ex, RelabelType))
+	{
+		recursively_analyze_expr_metadata(((RelabelType *) ex)->arg, root, md);
+	}
+	else if (IsA(ex, Var))
+	{
+		Var		   *local_var = (Var *) ex;
+		RelOptInfo *local_relation = NULL;
+
+		Assert(local_var->varno < root->simple_rel_array_size);
+
+		/* Bail out if varno is invalid */
+		if (local_var->varno == InvalidOid)
+			return;
+
+		local_relation = root->simple_rel_array[local_var->varno];
+		Assert(local_relation && IsA(local_relation, RelOptInfo));
+
+		/*
+		 * For top level call (i.e. not a recursive invocation) cache the
+		 * relation pointer
+		 */
+		if (!md->local_relation
+			&& (local_relation->reloptkind == RELOPT_BASEREL ||
+				local_relation->reloptkind == RELOPT_OTHER_MEMBER_REL))
+		{
+			md->local_relation = local_relation;
+			md->local_column_expr = local_var;
+		}
+
+		if ((local_relation->reloptkind == RELOPT_BASEREL ||
+			 local_relation->reloptkind == RELOPT_OTHER_MEMBER_REL)
+			&& local_relation->rtekind == RTE_RELATION)
+		{
+			/* Found Var is a base column, so gather the metadata we can  */
+			gather_base_column_metadata(local_var, local_relation, root, md);
+		}
+		else if (local_relation->reloptkind == RELOPT_BASEREL
+				 && local_relation->rtekind == RTE_SUBQUERY)
+		{
+			RangeTblEntry *outside_subq_rte =
+			root->simple_rte_array[local_relation->relid];
+
+			/* root doesn't change for inheritance case, e.g. for UNION ALL */
+			const PlannerInfo *new_root = outside_subq_rte->inh ?
+			root : local_relation->subroot;
+
+			/*
+			 * Found that this Var is a subquery SELECT list item, so continue
+			 * to recurse on the occluded expression
+			 */
+			Expr	   *occluded_expr =
+			get_subquery_var_occluded_reference(ex, root);
+
+			if (occluded_expr)
+			{
+				recursively_analyze_expr_metadata(occluded_expr,
+												  new_root, md);
+			}
+		}
+	}
+}
+
+
+void
+analyze_expr_for_metadata(const Expr *ex,
+						  const PlannerInfo *root,
+						  ExprMetadata * md)
+{
+	/*
+	 * Analyze the supplied expression, and if possible, gather metadata about
+	 * it.  Currently handles: base table columns, constants, and virtual
+	 * columns from unflattened subquery blocks.  The metadata collected is
+	 * placed into the supplied ExprMetadata object.
+	 */
+	Assert(md && ex && root);
+
+	init_expr_metadata(md);
+	recursively_analyze_expr_metadata(ex, root, md);
+}
+
+
+/*
+ *  Function:  evaluate_semijoin_filtering_rate
+ *
+ *  Given a merge join path, determine two things.
+ *  First, can a Bloom filter based semijoin be created on the
+ *  outer scan relation and checked on the inner scan relation to
+ *  filter out rows from the inner relation? And second, if this
+ *  is possible, determine the single equijoin condition that is most
+ *  useful as well as the estimated filtering rate of the filter.
+ *
+ *  The output args, inner_semijoin_keys and
+ *  outer_semijoin_keys, will each contain a single key column
+ *  from one of the hash equijoin conditions. probe_semijoin_keys
+ *  contains keys from the target relation to probe the semijoin filter.
+ *
+ *  A potential semijoin will be deemed valid only if all
+ *  of the following are true:
+ *    a) The enable_mergejoin_semijoin_filter option is set true
+ *    b) The equijoin key from the outer side is or maps
+ *       to a base table column
+ *    c) The equijoin key from the inner side is or maps to
+ *       a base column
+ *
+ *  A potential semijoin will be deemed useful only if the
+ *  force_mergejoin_semijoin_filter is set true, or if all of the
+ *  following are true:
+ *    a) The equijoin key base column from the outer side has
+ *       reliable metadata (i.e. ANALYZE was done on it)
+ *    b) The key column(s) from the outer side equijoin keys
+ *       have width metadata available.
+ *    c) The estimated outer side key column width(s) are not
+ *       excessively wide.
+ *    d) The equijoin key from the inner side either:
+ *         1) maps to a base column with reliable metadata, or
+ *         2) is constrained by the incoming estimated tuple
+ *            count to have a distinct count smaller than the
+ *            outer side key column's distinct count.
+ *    e) The semijoin must be estimated to filter at least some of
+ *       the rows from the inner relation. However, the exact filtering
+ *       rate where the semijoin is deemed useful is determined by the
+ *       mergejoin cost model itself, not this function.
+ *
+ *  If there is more than one equijoin condition, we favor the one with the
+ *  higher estimated filtering rate.
+ *
+ *  If this function finds an appropriate semijoin, it will
+ *  allocate a PushdownSemijoinMetadata object to store the
+ *  semijoin metadata, and then attach it to the Join plan node.
+ */
+#define MAX_SEMIJOIN_SINGLE_KEY_WIDTH	  128
+
+static double
+evaluate_semijoin_filtering_rate(JoinPath *join_path,
+								 const List *equijoin_list,
+								 const PlannerInfo *root,
+								 JoinCostWorkspace *workspace,
+								 int *best_clause,
+								 int *rows_filtered)
+{
+	const Path *outer_path;
+	const Path *inner_path;
+	ListCell   *equijoin_lc = NULL;
+	int			equijoin_ordinal = -1;
+	int			best_single_col_sj_ordinal = -1;
+	double		best_sj_selectivity = 1.01;
+	double		best_sj_inner_rows_filtered = -1.0;
+	int			num_md;
+	ExprMetadata *outer_md_array = NULL;
+	ExprMetadata *inner_md_array = NULL;
+
+	Assert(equijoin_list);
+	Assert(list_length(equijoin_list) > 0);
+
+	if (!enable_mergejoin_semijoin_filter && !force_mergejoin_semijoin_filter)
+	{
+		return 0;				/* option setting disabled semijoin insertion  */
+	}
+
+	num_md = list_length(equijoin_list);
+	outer_md_array = alloca(sizeof(ExprMetadata) * num_md);
+	inner_md_array = alloca(sizeof(ExprMetadata) * num_md);
+	if (!outer_md_array || !inner_md_array)
+	{
+		return 0;				/* a stack array allocation failed  */
+	}
+
+	outer_path = join_path->outerjoinpath;
+	inner_path = join_path->innerjoinpath;
+
+	debug_sj1("SJPD:  start evaluate_semijoin_filtering_rate");
+	debug_sj2("SJPD:	  join inner est rows:     %.1lf",
+			  inner_path->rows);
+	debug_sj2("SJPD:	  join outer est rows:     %.1lf",
+			  outer_path->rows);
+
+	/*
+	 * Consider each of the individual equijoin conditions as a possible basis
+	 * for creating a semijoin condition
+	 */
+	foreach(equijoin_lc, equijoin_list)
+	{
+		OpExpr	   *equijoin;
+		Node	   *outer_equijoin_arg = NULL;
+		ExprMetadata *outer_arg_md = NULL;
+		Node	   *inner_equijoin_arg = NULL;
+		ExprMetadata *inner_arg_md = NULL;
+		double		est_sj_selectivity = 1.01;
+		double		est_sj_inner_rows_filtered = -1.0;
+
+		equijoin_ordinal++;
+		equijoin = (OpExpr *) lfirst(equijoin_lc);
+
+		Assert(IsA(equijoin, OpExpr));
+		Assert(list_length(equijoin->args) == 2);
+
+		outer_equijoin_arg = linitial(equijoin->args);
+		outer_arg_md = &(outer_md_array[equijoin_ordinal]);
+		analyze_expr_for_metadata((Expr *) outer_equijoin_arg,
+								  root, outer_arg_md);
+
+		inner_equijoin_arg = llast(equijoin->args);
+		inner_arg_md = &(inner_md_array[equijoin_ordinal]);
+		analyze_expr_for_metadata((Expr *) inner_equijoin_arg,
+								  root, inner_arg_md);
+
+		debug_sj2("SJPD:	  equijoin condition [%d]", equijoin_ordinal);
+		debug_sj_md("outer", equijoin_ordinal, outer_arg_md);
+		debug_sj_md("inner", equijoin_ordinal, inner_arg_md);
+
+		if (outer_arg_md->base_column_expr &&
+			inner_arg_md->base_column_expr)
+		{
+			/*
+			 * If outer key - inner key has FK/PK relationship to each other
+			 * and there is no restriction on the primary key side, the
+			 * semijoin filter will be useless, we should bail out, even if
+			 * the force_semijoin_push_down guc is set. There might be
+			 * exceptions, if the outer key has restrictions on the key
+			 * variable, but we won't be able to tell until the Plan level. We
+			 * will be conservative and assume that an FK/PK relationship will
+			 * yield a useless filter.
+			 */
+			if (
+				is_fk_pk(outer_arg_md->base_column_expr,
+						 inner_arg_md->base_column_expr,
+						 equijoin->opno, root))
+			{
+				debug_sj2("SJPD:        inner and outer equijoin columns %s",
+						  "are PK/FK; semijoin would not be useful");
+				continue;
+			}
+		}
+
+		/* Now see if we can push a semijoin to its source scan node  */
+		if (!outer_arg_md->local_column_expr || !inner_arg_md->local_column_expr)
+		{
+			debug_sj2("SJPD:        could not find a local outer or inner column to%s",
+					  " use as semijoin basis; semijoin is not valid");
+			continue;			/* Continue on to the next equijoin condition  */
+		}
+
+		if (!verify_valid_pushdown((Path *) (join_path->innerjoinpath),
+								   inner_arg_md->local_column_expr->varno, root))
+		{
+			debug_sj2("SJPD:        could not find a place to evaluate %s",
+					  "a semijoin condition; semijoin is not valid");
+			continue;			/* Continue on to the next equijoin condition  */
+		}
+
+		/*
+		 * Adjust cached estimated inner key distinct counts down using the
+		 * inner side tuple count as an upper bound
+		 */
+		inner_arg_md->expr_est_distincts =
+			fmax(1.0, fmin(inner_path->rows,
+						   inner_arg_md->expr_est_distincts));
+
+		/*
+		 * We need to estimate the outer key distinct count as close as
+		 * possible to the where the semijoin filter will actually be applied,
+		 * ignoring the effects of any indirect filtering that would occur
+		 * after the semijoin.
+		 */
+		outer_arg_md->expr_est_distincts =
+			fmax(1.0, fmin(outer_path->rows,
+						   outer_arg_md->expr_est_distincts));
+
+		/* Next, see if this equijoin is valid as a semijoin basis */
+		if (!outer_arg_md->is_or_maps_to_base_column
+			&& !inner_arg_md->is_or_maps_to_constant)
+		{
+			debug_sj2("SJPD:        outer equijoin arg does not map %s",
+					  "to a base column nor a constant; semijoin is not valid");
+			continue;			/* Continue on to the next equijoin condition  */
+		}
+		if (!inner_arg_md->is_or_maps_to_base_column
+			&& !inner_arg_md->is_or_maps_to_constant)
+		{
+			debug_sj2("SJPD:        inner equijoin arg maps to neither %s",
+					  "a base column; semijoin is not valid");
+			continue;			/* Continue on to the next equijoin condition  */
+		}
+
+		/*
+		 * If force_mergejoin_semijoin_filter is used, set the default clause
+		 * as the first valid one.
+		 */
+		if (force_mergejoin_semijoin_filter && best_single_col_sj_ordinal == -1)
+		{
+			best_single_col_sj_ordinal = equijoin_ordinal;
+		}
+		/* Now we know it's valid, see if this potential semijoin is useful */
+		if (!outer_arg_md->est_distincts_reliable)
+		{
+			debug_sj2("SJPD:        outer equijoin column's distinct %s",
+					  "estimates are not reliable; condition rejected");
+			continue;			/* Continue on to the next equijoin condition  */
+		}
+		if (outer_arg_md->est_col_width == 0)
+		{
+			debug_sj2("SJPD:        outer equijoin column's width %s",
+					  "could not be estimated; condition rejected");
+			continue;			/* Continue on to the next equijoin condition  */
+		}
+		if (outer_arg_md->est_col_width > MAX_SEMIJOIN_SINGLE_KEY_WIDTH)
+		{
+			debug_sj2("SJPD:        outer equijoin column's width %s",
+					  "was excessive; condition rejected");
+			continue;			/* Continue on to the next equijoin condition  */
+		}
+		if (!(outer_arg_md->is_or_maps_to_constant
+			  || (inner_arg_md->is_or_maps_to_base_column
+				  && inner_arg_md->est_distincts_reliable)
+			  || (inner_path->rows
+				  < outer_arg_md->expr_est_distincts)))
+		{
+			debug_sj2("SJPD:        inner equijoin arg does not have %s",
+					  "a reliable distinct count; condition rejected");
+			continue;			/* Continue on to the next equijoin condition  */
+		}
+
+		/*
+		 * We now try to estimate the filtering rate (1 minus selectivity) and
+		 * rows filtered of the filter. We first start by finding the ranges
+		 * of both the outer and inner var, and find the overlap between these
+		 * ranges. We assume an equal distribution of variables among this
+		 * range, and we can then calculate the amount of filtering our SJF
+		 * would do.
+		 */
+		if (workspace->inner_min_val > workspace->outer_max_val
+			|| workspace->inner_max_val < workspace->outer_min_val)
+		{
+			/*
+			 * This would mean that the outer and inner tuples are completely
+			 * disjoin from each other. We will not be as optimistic, and just
+			 * assign a filtering rate of 95%.
+			 */
+			est_sj_selectivity = 0.05;	/* selectivity is 1 minus filtering
+										 * rate */
+			est_sj_inner_rows_filtered = 0.95 * inner_arg_md->base_rel_filt_row_count;
+		}
+		else
+		{
+#define APPROACH_1_DAMPENING_FACTOR 0.8
+#define APPROACH_2_DAMPENING_FACTOR 0.66
+			/*
+			 * There are two approaches to estimating the filtering rate. We
+			 * have already outlined the first approach above, finding the
+			 * range and assuming an equal distribution. For the second
+			 * approach, we do not assume anything about the distribution, but
+			 * compare the number of distincts. If, for example, the inner
+			 * relation has 1000 distincts and the outer has 500, then there
+			 * is guaranteed to be at least 500 rows filtered from the inner
+			 * relation, regardless of the data distribution. We make an
+			 * assumption here that the distribution of distinct variables is
+			 * equal to the distribution of all rows so we can multiply by the
+			 * ratio of duplicate values. We then take the geometric mean of
+			 * these two approaches for our final estimated filtering rate. We
+			 * also multiply these values by dampening factors, which we have
+			 * found via experimentation and probably need fine-tuning.
+			 */
+			double		approach_1_selectivity; /* finding selectivity instead
+												 * of filtering rate for
+												 * legacy code reasons */
+			double		approach_2_selectivity;
+			double		inner_overlapping_range = workspace->outer_max_val - workspace->inner_min_val;
+
+			/* we are assuming an equal distribution of val's */
+			double		inner_overlapping_ratio = inner_overlapping_range / inner_arg_md->base_rel_filt_row_count;
+
+			Assert(inner_overlapping_ratio >= 0 && inner_overlapping_ratio <= 1);
+
+			/*
+			 * testing has found that this method is generaly over-optimistic,
+			 * so we multiply by a dampening effect.
+			 */
+			approach_1_selectivity = inner_overlapping_ratio * APPROACH_1_DAMPENING_FACTOR;
+			if (inner_arg_md->expr_est_distincts > outer_arg_md->expr_est_distincts)
+			{
+				int			inner_more_distincts = inner_arg_md->expr_est_distincts - outer_arg_md->expr_est_distincts;
+
+				approach_2_selectivity = 1 - ((double) inner_more_distincts) / inner_arg_md->expr_est_distincts;
+
+				/*
+				 * testing has found that this method is generaly
+				 * over-optimistic, so we multiply by a dampening effect.
+				 */
+				approach_2_selectivity = 1 - ((1 - approach_2_selectivity) * APPROACH_2_DAMPENING_FACTOR);
+			}
+			else
+			{
+				/*
+				 * This means that the outer relation has the same or more
+				 * distincts than the inner relation, which is not good for
+				 * our filtering rate. We will assume a base filtering rate of
+				 * 10% in this case.
+				 */
+				approach_2_selectivity = 0.9;
+			}
+			est_sj_selectivity = sqrt(approach_1_selectivity * approach_2_selectivity);
+			est_sj_inner_rows_filtered = (1 - est_sj_selectivity) * inner_arg_md->base_rel_filt_row_count;
+		}
+		est_sj_selectivity = fmin(1.0, est_sj_selectivity);
+		est_sj_inner_rows_filtered = fmax(1.0, est_sj_inner_rows_filtered);
+
+		debug_sj2("SJPD:        eligible semijoin selectivity:    %.7lf",
+				  est_sj_selectivity);
+		debug_sj2("SJPD:        eligible semijoin rows filtered:  %.7lf",
+				  est_sj_inner_rows_filtered);
+
+		if (est_sj_selectivity < best_sj_selectivity)
+		{
+			debug_sj1("SJPD:        found most useful semijoin seen so far");
+			best_sj_selectivity = est_sj_selectivity;
+			best_sj_inner_rows_filtered = est_sj_inner_rows_filtered;
+			best_single_col_sj_ordinal = equijoin_ordinal;
+		}
+		else
+		{						/* This semijoin was rejected, so explain why  */
+			debug_sj2("SJPD:        found useful single column semijoin; %s",
+					  "not as useful as best found so far, so rejected");
+		}
+	}
+
+	if (best_single_col_sj_ordinal != -1)
+	{
+		debug_sj2("SJPD:      best single column sj selectivity:     %.7lf",
+				  best_sj_selectivity);
+		debug_sj2("SJPD:      best single column rows filtered:      %.7lf",
+				  best_sj_inner_rows_filtered);
+	}
+
+	debug_sj1("SJPD:  finish evaluate_semijoin_filtering_rate");
+	*best_clause = best_single_col_sj_ordinal;
+	*rows_filtered = best_sj_inner_rows_filtered;
+	return 1 - best_sj_selectivity;
+}
+
+/*
+ *  Determine whether a semijoin condition could be pushed from the join
+ *  all the way to the leaf scan node.
+ *
+ *  Parameters:
+ *  node: path node to be considered for semijoin push down.
+ *  target_var:  the inner side join key for a potential semijoin.
+ *  target_relids: relids of all target leaf relations,
+ *  	used only for partitioned table.
+ */
+static bool
+verify_valid_pushdown(const Path *path,
+					  const Index target_var_no,
+					  const PlannerInfo *root)
+{
+	Assert(path);
+	Assert(target_var_no > 0);
+
+	if (path == NULL)
+	{
+		return false;
+	}
+
+	/* Guard against stack overflow due to overly complex plan trees */
+	check_stack_depth();
+
+	switch (path->pathtype)
+	{
+			/* directly push through these paths */
+		case T_Material:
+			{
+				return verify_valid_pushdown(((MaterialPath *) path)->subpath, target_var_no, root);
+			}
+		case T_Gather:
+			{
+				return verify_valid_pushdown(((GatherPath *) path)->subpath, target_var_no, root);
+			}
+		case T_GatherMerge:
+			{
+				return verify_valid_pushdown(((GatherMergePath *) path)->subpath, target_var_no, root);
+			}
+		case T_Sort:
+			{
+				return verify_valid_pushdown(((SortPath *) path)->subpath, target_var_no, root);
+			}
+		case T_Unique:
+			{
+				return verify_valid_pushdown(((UniquePath *) path)->subpath, target_var_no, root);
+			}
+
+		case T_Agg:
+			{					/* We can directly push bloom through GROUP
+								 * BYs and DISTINCTs, as long as there are no
+								 * grouping sets. However, we cannot validate
+								 * this fact until the Plan has been created.
+								 * We will push through for now, but verify
+								 * again during Plan creation. */
+				return verify_valid_pushdown(((AggPath *) path)->subpath, target_var_no, root);
+			}
+
+		case T_Append:
+		case T_SubqueryScan:
+			{
+				/*
+				 * Both append and subquery paths are currently unimplemented,
+				 * so we will just return false, but theoretically there are
+				 * ways to check if a filter can be pushed through them. The
+				 * previous HashJoin CR has implemented these cases, but that
+				 * code is run these after the plan has been created, so code
+				 * will need to be adjusted to do it during Path evaluation.
+				 */
+				return false;
+			}
+
+			/* Leaf nodes */
+		case T_IndexScan:
+		case T_BitmapHeapScan:
+			{
+				/*
+				 * We could definitely implement pushdown filters for Index
+				 * and Bitmap Scans, but currently it is only implemented for
+				 * SeqScan. For now, we return false.
+				 */
+				return false;
+			}
+		case T_SeqScan:
+			{
+				if (path->parent->relid == target_var_no)
+				{
+					/*
+					 * Found source of target var! We know that the pushdown
+					 * is valid now.
+					 */
+					return true;
+				}
+				return false;
+			}
+
+		case T_NestLoop:
+		case T_MergeJoin:
+		case T_HashJoin:
+			{
+				/*
+				 * since this is going to be a sub-join, we can push through
+				 * both sides and don't need to worry about left/right/inner
+				 * joins.
+				 */
+				JoinPath   *join = (JoinPath *) path;
+
+				return verify_valid_pushdown(join->outerjoinpath, target_var_no, root) ||
+					verify_valid_pushdown(join->innerjoinpath, target_var_no, root);
+			}
+
+		default:
+			{
+				return false;
+			}
+	}
+}
+
+static TargetEntry *
+get_nth_targetentry(int n, const List *targetlist)
+{
+	int			i = 1;
+	ListCell   *lc = NULL;
+
+	Assert(n > 0);
+	Assert(targetlist && nodeTag(targetlist) == T_List);
+	Assert(list_length(targetlist) >= n);
+
+	if (targetlist && list_length(targetlist) >= n)
+	{
+		foreach(lc, targetlist)
+		{
+			if (i == n)
+			{
+				TargetEntry *te = lfirst(lc);
+
+				return te;
+			}
+			i++;
+		}
+	}
+	return NULL;
+}
+
+/*
+ * expressions_match_foreign_key
+ *		True if the given con_exprs, ref_exprs and operators will exactlty
+ *      	reflect the expressions referenced by the given foreign key fk.
+ *
+ * Note: This function expects con_exprs and ref_exprs to only contain Var types.
+ *       Expression indexes are not supported by foreign keys.
+ */
+bool
+expressions_match_foreign_key(ForeignKeyOptInfo *fk,
+							  List *con_exprs,
+							  List *ref_exprs,
+							  List *operators)
+{
+	ListCell   *lc;
+	ListCell   *lc2;
+	ListCell   *lc3;
+	int			col;
+	Bitmapset  *all_vars;
+	Bitmapset  *matched_vars;
+	int			idx;
+
+	Assert(list_length(con_exprs) == list_length(ref_exprs));
+	Assert(list_length(con_exprs) == list_length(operators));
+
+	/*
+	 * Fast path out if there's not enough conditions to match each column in
+	 * the foreign key. Note that we cannot check that the number of
+	 * expressions are equal here since it would cause any expressions which
+	 * are duplicated not to match.
+	 */
+	if (list_length(con_exprs) < fk->nkeys)
+		return false;
+
+	/*
+	 * We need to ensure that each item in con_exprs/ref_exprs can be matched
+	 * to a foreign key column in the actual foreign key data fk. We can do
+	 * this by looping over each fk column and checking that we find a
+	 * matching con_expr/ref_expr in con_exprs/ref_exprs. This method does not
+	 * however, allow us to ensure that there are no additional items in
+	 * con_exprs/ref_exprs that have not been matched. To remedy this we will
+	 * create 2 bitmapsets, one which will keep track of all of the vars, the
+	 * other which will keep track of the vars that we have matched. After
+	 * matching is complete, we will ensure that these bitmapsets are equal to
+	 * ensure we have complete mapping in both directions (fk cols to vars and
+	 * vars to fk cols)
+	 */
+	all_vars = NULL;
+	matched_vars = NULL;
+
+	/*
+	 * Build a bitmapset which tracks all vars by their index
+	 */
+	for (idx = 0; idx < list_length(con_exprs); idx++)
+		all_vars = bms_add_member(all_vars, idx);
+
+	for (col = 0; col < fk->nkeys; col++)
+	{
+		bool		matched = false;
+
+		idx = 0;
+
+		forthree(lc, con_exprs, lc2, ref_exprs, lc3, operators)
+		{
+			Var		   *con_expr = (Var *) lfirst(lc);
+			Var		   *ref_expr = (Var *) lfirst(lc2);
+			Oid			opr = lfirst_oid(lc3);
+
+			Assert(IsA(con_expr, Var));
+			Assert(IsA(ref_expr, Var));
+
+			/* Does this join qual match up to the current fkey column? */
+			if (fk->conkey[col] == con_expr->varattno &&
+				fk->confkey[col] == ref_expr->varattno &&
+				equality_ops_are_compatible(opr, fk->conpfeqop[col]))
+			{
+				matched = true;
+
+				/* mark the index of this var as matched */
+				matched_vars = bms_add_member(matched_vars, idx);
+
+				/*
+				 * Don't break here as there may be duplicate expressions that
+				 * match this column that we also need to mark as matched
+				 */
+			}
+			idx++;
+		}
+
+		/*
+		 * can't remove a join if there's no match to fkey column on join
+		 * condition.
+		 */
+		if (!matched)
+			return false;
+	}
+
+	/*
+	 * Ensure that we managed to match every var in con_var/ref_var to a
+	 * foreign key constraint.
+	 */
+	if (!bms_equal(all_vars, matched_vars))
+		return false;
+	return true;
+}
+
+/*
+ * Determine if the given outer and inner Exprs satisfy any fk-pk
+ * relationship.
+ */
+static bool
+is_fk_pk(const Var *outer_var,
+		 const Var *inner_var,
+		 Oid op_oid,
+		 const PlannerInfo *root)
+{
+	ListCell   *lc = NULL;
+	List	   *outer_key_list = list_make1((Var *) outer_var);
+	List	   *inner_key_list = list_make1((Var *) inner_var);
+	List	   *operators = list_make1_oid(op_oid);
+
+	foreach(lc, root->fkey_list)
+	{
+		ForeignKeyOptInfo *fk = (ForeignKeyOptInfo *) lfirst(lc);
+
+		if (expressions_match_foreign_key(fk,
+										  outer_key_list,
+										  inner_key_list,
+										  operators))
+		{
+			return true;
+		}
+	}
+
+	return false;
+}
+
+/*
+ * get_switched_clauses
+ *	  Given a list of merge or hash joinclauses (as RestrictInfo nodes),
+ *	  extract the bare clauses, and rearrange the elements within the
+ *	  clauses, if needed, so the outer join variable is on the left and
+ *	  the inner is on the right.  The original clause data structure is not
+ *	  touched; a modified list is returned.  We do, however, set the transient
+ *	  outer_is_left field in each RestrictInfo to show which side was which.
+ */
+static List *
+get_switched_clauses(List *clauses, Relids outerrelids)
+{
+	List	   *t_list = NIL;
+	ListCell   *l;
+
+	foreach(l, clauses)
+	{
+		RestrictInfo *restrictinfo = (RestrictInfo *) lfirst(l);
+		OpExpr	   *clause = (OpExpr *) restrictinfo->clause;
+
+		Assert(is_opclause(clause));
+
+		/* TODO: handle the case where the operator doesn't hava a commutator */
+		if (bms_is_subset(restrictinfo->right_relids, outerrelids)
+			&& OidIsValid(get_commutator(clause->opno)))
+		{
+			/*
+			 * Duplicate just enough of the structure to allow commuting the
+			 * clause without changing the original list.  Could use
+			 * copyObject, but a complete deep copy is overkill.
+			 */
+			OpExpr	   *temp = makeNode(OpExpr);
+
+			temp->opno = clause->opno;
+			temp->opfuncid = InvalidOid;
+			temp->opresulttype = clause->opresulttype;
+			temp->opretset = clause->opretset;
+			temp->opcollid = clause->opcollid;
+			temp->inputcollid = clause->inputcollid;
+			temp->args = list_copy(clause->args);
+			temp->location = clause->location;
+			/* Commute it --- note this modifies the temp node in-place. */
+			CommuteOpExpr(temp);
+			t_list = lappend(t_list, temp);
+			restrictinfo->outer_is_left = false;
+		}
+		else
+		{
+			/*
+			 * TODO: check if Assert(bms_is_subset(restrictinfo->left_relids,
+			 * outerrelids)) is necessary.
+			 */
+			t_list = lappend(t_list, clause);
+			restrictinfo->outer_is_left = true;
+		}
+	}
+	return t_list;
+}
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index ab4d8e201df..98a673b9b94 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -315,6 +315,36 @@ static ModifyTable *make_modifytable(PlannerInfo *root, Plan *subplan,
 static GatherMerge *create_gather_merge_plan(PlannerInfo *root,
 											 GatherMergePath *best_path);
 
+/*
+ *  Local functions and option variables to support
+ *  semijoin pushdowns from join nodes
+ */
+static int	depth_of_semijoin_target(Plan *pn,
+									 const Var *target_var,
+									 Bitmapset *target_relids,
+									 int cur_depth,
+									 const PlannerInfo *root,
+									 Plan **target_node);
+static bool is_side_of_join_source_of_var(const Plan *pn,
+										  bool testing_outer_side,
+										  const Var *target_var);
+static bool
+			is_table_scan_node_source_of_relids_or_var(const Plan *pn,
+													   const Var *target_var,
+													   Bitmapset *target_relids);
+static int	position_of_var_in_targetlist(const Var *target_var,
+										  const List *targetlist);
+static TargetEntry *get_nth_targetentry(int posn,
+										const List *targetlist);
+static void get_partition_table_relids(RelOptInfo *rel,
+									   Bitmapset **target_relids);
+static int	get_appendrel_occluded_references(const Expr *ex,
+											  Expr **occluded_exprs,
+											  int num_exprs,
+											  const PlannerInfo *root);
+static Expr *get_subquery_var_occluded_reference(const Expr *ex,
+												 const PlannerInfo *root);
+
 
 /*
  * create_plan
@@ -4691,6 +4721,39 @@ create_mergejoin_plan(PlannerInfo *root,
 	/* Costs of sort and material steps are included in path cost already */
 	copy_generic_path_info(&join_plan->join.plan, &best_path->jpath.path);
 
+	/* Check if we should attach a pushdown semijoin to this join */
+	if (best_path->use_semijoinfilter)
+	{
+		if (best_path->best_mergeclause != -1)
+		{
+			ListCell   *clause_cell = list_nth_cell(mergeclauses, best_path->best_mergeclause);
+			OpExpr	   *joinclause = (OpExpr *) lfirst(clause_cell);
+			Node	   *outer_join_arg = linitial(joinclause->args);
+			Node	   *inner_join_arg = llast(joinclause->args);
+			ExprMetadata *outer_arg_md = (ExprMetadata *) palloc0(sizeof(ExprMetadata));
+			ExprMetadata *inner_arg_md = (ExprMetadata *) palloc0(sizeof(ExprMetadata));
+			int			outer_depth;
+			int			inner_depth;
+
+			Assert(IsA(joinclause, OpExpr));
+			Assert(list_length(joinclause->args) == 2);
+			analyze_expr_for_metadata((Expr *) outer_join_arg, root, outer_arg_md);
+			analyze_expr_for_metadata((Expr *) inner_join_arg, root, inner_arg_md);
+			outer_depth = depth_of_semijoin_target((Plan *) join_plan,
+												   outer_arg_md->local_column_expr, NULL, 0, root, &join_plan->buildingNode);
+			inner_depth = depth_of_semijoin_target((Plan *) join_plan,
+												   inner_arg_md->local_column_expr, NULL, 0, root, &join_plan->checkingNode);
+			if (inner_depth > -1 && outer_depth > -1)
+			{
+				join_plan->applySemiJoinFilter = true;
+				join_plan->bestExpr = best_path->best_mergeclause;
+				join_plan->filteringRate = best_path->filteringRate;
+			}
+			pfree(outer_arg_md);
+			pfree(inner_arg_md);
+		}
+	}
+
 	return join_plan;
 }
 
@@ -7216,3 +7279,551 @@ is_projection_capable_plan(Plan *plan)
 	}
 	return true;
 }
+
+/*
+ *  Determine whether a semijoin condition could be pushed from the join
+ *  all the way to the leaf scan node.  If so, determine the number of
+ *  nodes between the join and the scan node (inclusive of the scan node).
+ *  If the search was terminated by a node the semijoin could not be
+ *  pushed through, the function returns -1.
+ *
+ *  Parameters:
+ *  node: plan node to be considered for semijoin push down.
+ *  target_var:  the outer side join key for a potential semijoin.
+ *  target_relids: relids of all target leaf relations,
+ *  	used only for partitioned table.
+ *  cur_depth: current depth from the root hash join plan node.
+ *  target_node: stores the target plan node where filter will be applied
+ */
+static int
+depth_of_semijoin_target(Plan *pn,
+						 const Var *target_var,
+						 Bitmapset *target_relids,
+						 int cur_depth,
+						 const PlannerInfo *root,
+						 Plan **target_node)
+{
+	int			depth = -1;
+
+	Assert(pn);
+	Assert(target_var && IsA(target_var, Var));
+	Assert(target_var->varno > 0);
+
+	if (pn == NULL)
+	{
+		return -1;
+	}
+
+	/* Guard against stack overflow due to overly complex plan trees */
+	check_stack_depth();
+
+	switch (nodeTag(pn))
+	{
+		case T_Hash:
+		case T_Material:
+		case T_Gather:
+		case T_GatherMerge:
+		case T_Sort:
+		case T_Unique:
+			{					/* Directly push bloom through these node
+								 * types  */
+				depth = depth_of_semijoin_target(pn->lefttree, target_var,
+												 target_relids, cur_depth + 1, root, target_node);
+				break;
+			}
+
+		case T_Agg:
+			{					/* Directly push bloom through GROUP BYs and
+								 * DISTINCTs, as long as there are no grouping
+								 * sets */
+				Agg		   *agg_pn = (Agg *) pn;
+
+				if (!agg_pn->groupingSets
+					|| list_length(agg_pn->groupingSets) == 0)
+				{
+					depth = depth_of_semijoin_target(pn->lefttree, target_var,
+													 target_relids, cur_depth + 1,
+													 root, target_node);
+				}
+				break;
+			}
+
+		case T_SubqueryScan:
+			{
+				/*
+				 * Directly push semijoin into subquery if we can, but we need
+				 * to map the target var to the occluded expression within the
+				 * SELECT list of the subquery
+				 */
+				SubqueryScan *subq_scan = (SubqueryScan *) pn;
+				RelOptInfo *rel = NULL;
+				RangeTblEntry *rte = NULL;
+				Var		   *subq_target_var = NULL;
+
+				/*
+				 * To travel into a subquery we need to use the subquery's
+				 * PlannerInfo, the root of subquery's plan tree, and the
+				 * subquery's SELECT list item that was occluded by the Var
+				 * used within this query block
+				 */
+				rte = root->simple_rte_array[subq_scan->scan.scanrelid];
+				Assert(rte);
+				Assert(rte->subquery);
+				Assert(rte->rtekind == RTE_SUBQUERY);
+				Assert(rte->subquery->targetList);
+
+				rel = find_base_rel((PlannerInfo *) root,
+									subq_scan->scan.scanrelid);
+				Assert(rel->rtekind == RTE_SUBQUERY);
+				Assert(rel->subroot);
+
+				if (rel && rel->subroot
+					&& rte && rte->subquery && rte->subquery->targetList)
+				{
+					/* Find the target_var's occluded expression */
+					Expr	   *occluded_expr =
+					get_subquery_var_occluded_reference((Expr *) target_var,
+														root);
+
+					if (occluded_expr && IsA(occluded_expr, Var))
+					{
+						subq_target_var = (Var *) occluded_expr;
+						if (subq_target_var->varno > 0)
+							depth = depth_of_semijoin_target(subq_scan->subplan,
+															 subq_target_var,
+															 target_relids,
+															 cur_depth + 1,
+															 rel->subroot,
+															 target_node);
+					}
+				}
+				break;
+			}
+
+			/* Either from a partitioned table or Union All */
+		case T_Append:
+			{
+				int			max_depth = -1;
+				Append	   *append = (Append *) pn;
+				RelOptInfo *rel = NULL;
+				RangeTblEntry *rte = NULL;
+
+				rte = root->simple_rte_array[target_var->varno];
+				rel = find_base_rel((PlannerInfo *) root, target_var->varno);
+
+				if (rte->inh && append->appendplans)
+				{
+					int			num_exprs = list_length(append->appendplans);
+					Expr	  **occluded_exprs = alloca(num_exprs * sizeof(Expr *));
+					int			idx = 0;
+					ListCell   *lc = NULL;
+
+					/* Partitioned table */
+					if (rel->part_scheme && rel->part_rels)
+					{
+						get_partition_table_relids(rel, &target_relids);
+
+						foreach(lc, append->appendplans)
+						{
+							Plan	   *appendplan = (Plan *) lfirst(lc);
+
+							depth = depth_of_semijoin_target(appendplan,
+															 target_var,
+															 target_relids,
+															 cur_depth + 1,
+															 root,
+															 target_node);
+
+							if (depth > max_depth)
+								max_depth = depth;
+						}
+					}
+					/* Union All, not partitioned table */
+					else if (num_exprs == get_appendrel_occluded_references(
+																			(Expr *) target_var,
+																			occluded_exprs,
+																			num_exprs,
+																			root))
+					{
+						Var		   *subq_target_var = NULL;
+
+						foreach(lc, append->appendplans)
+						{
+							Expr	   *occluded_expr = occluded_exprs[idx++];
+							Plan	   *appendplan = (Plan *) lfirst(lc);
+
+							if (occluded_expr && IsA(occluded_expr, Var))
+							{
+								subq_target_var = (Var *) occluded_expr;
+
+								depth = depth_of_semijoin_target(appendplan,
+																 subq_target_var,
+																 target_relids,
+																 cur_depth + 1,
+																 root,
+																 target_node);
+
+								if (depth > max_depth)
+									max_depth = depth;
+							}
+						}
+					}
+				}
+				depth = max_depth;
+				break;
+			}
+
+			/* Leaf nodes */
+		case T_IndexScan:
+		case T_BitmapHeapScan:
+			{
+				return -1;
+			}
+		case T_SeqScan:
+			{
+				if (is_table_scan_node_source_of_relids_or_var(pn, target_var, target_relids))
+				{
+					/* Found ultimate source of the join key!  */
+					*target_node = pn;
+					depth = cur_depth;
+				}
+				break;
+			}
+
+		case T_NestLoop:
+		case T_MergeJoin:
+		case T_HashJoin:
+			{
+				/*
+				 * pn->path_jointype is not always the same as join->jointype.
+				 * Avoid using pn->path_jointype when you need accurate
+				 * jointype, use join->jointype instead.
+				 */
+				Join	   *join = (Join *) pn;
+
+				/*
+				 * Push bloom filter to outer node if (target relation is
+				 * under the outer plan node, decided by
+				 * is_side_of_join_source_of_var() ) and either the following
+				 * condition satisfies: 1. this is an inner join or semi join
+				 * 2. this is a root right join 3. this is an intermediate
+				 * left join
+				 */
+				if (is_side_of_join_source_of_var(pn, true, target_var))
+				{
+					if (join->jointype == JOIN_INNER
+						|| join->jointype == JOIN_SEMI
+						|| (join->jointype == JOIN_RIGHT && cur_depth == 0)
+						|| (join->jointype == JOIN_LEFT && cur_depth > 0))
+					{
+						depth = depth_of_semijoin_target(pn->lefttree, target_var,
+														 target_relids, cur_depth + 1, root,
+														 target_node);
+					}
+				}
+				else
+				{
+					/*
+					 * Push bloom filter to inner node if (target rel is under
+					 * the inner node, decided by
+					 * is_side_of_join_source_of_var() ), and either the
+					 * following condition satisfies: 1. this is an inner join
+					 * or semi join 2. this is an intermediate right join
+					 */
+					Assert(is_side_of_join_source_of_var(pn, false, target_var));
+					if (join->jointype == JOIN_INNER
+						|| join->jointype == JOIN_SEMI
+						|| (join->jointype == JOIN_RIGHT && cur_depth > 0))
+					{
+						depth = depth_of_semijoin_target(pn->righttree, target_var,
+														 target_relids, cur_depth + 1, root,
+														 target_node);
+					}
+				}
+				break;
+			}
+
+		default:
+			{					/* For all other node types, just bail out and
+								 * apply the semijoin filter somewhere above
+								 * this node. */
+				depth = -1;
+			}
+	}
+	return depth;
+}
+
+static bool
+is_side_of_join_source_of_var(const Plan *pn,
+							  bool testing_outer_side,
+							  const Var *target_var)
+{
+	/* Determine if target_var is from the indicated child of the join  */
+	Plan	   *target_child = NULL;
+
+	Assert(pn);
+	Assert(target_var && nodeTag(target_var) == T_Var);
+	Assert(nodeTag(pn) == T_NestLoop || nodeTag(pn) == T_MergeJoin
+		   || nodeTag(pn) == T_HashJoin);
+
+	if (testing_outer_side)
+	{
+		target_child = pn->lefttree;
+	}
+	else
+	{
+		target_child = pn->righttree;
+	}
+
+	return (position_of_var_in_targetlist(target_var,
+										  target_child->targetlist) >= 0);
+}
+
+/*
+ * Determine if this scan node is the source of the specified relids,
+ * or the source of the specified var if target_relids is not given.
+ */
+static bool
+is_table_scan_node_source_of_relids_or_var(const Plan *pn,
+										   const Var *target_var,
+										   Bitmapset *target_relids)
+{
+	Scan	   *scan_node = (Scan *) pn;
+	Index		scan_node_varno = 0;
+
+	Assert(pn);
+	Assert(target_var && nodeTag(target_var) == T_Var);
+	Assert(nodeTag(pn) == T_SeqScan || nodeTag(pn) == T_IndexScan
+		   || nodeTag(pn) == T_BitmapHeapScan);
+
+	scan_node_varno = scan_node->scanrelid;
+
+	if (target_relids)
+	{
+		return bms_is_member(scan_node_varno, target_relids);
+	}
+	else if (scan_node_varno == target_var->varno)
+	{
+		/*
+		 * This should never be called for a column that is not being
+		 * projected at it's table scan node
+		 */
+		Assert(position_of_var_in_targetlist(target_var, pn->targetlist) >= 0);
+
+		return true;
+	}
+
+	return false;
+}
+
+static int
+position_of_var_in_targetlist(const Var *target_var, const List *targetlist)
+{
+	ListCell   *lc = NULL;
+	int			i = 1;
+
+	Assert(target_var && nodeTag(target_var) == T_Var);
+	Assert(targetlist && nodeTag(targetlist) == T_List);
+
+	if (targetlist && target_var)
+	{
+		foreach(lc, targetlist)
+		{
+			TargetEntry *te = lfirst(lc);
+
+			if (IsA(te->expr, Var))
+			{
+				Var		   *cur_var = (Var *) te->expr;
+
+				if (cur_var->varno == target_var->varno
+					&& cur_var->varattno == target_var->varattno)
+				{
+					return i;
+				}
+			}
+			i++;
+		}
+	}
+	return -1;
+}
+
+/*
+ * Recursively gather all relids of the given partitioned table rel.
+ */
+static void
+get_partition_table_relids(RelOptInfo *rel, Bitmapset **target_relids)
+{
+	int			i;
+
+	Assert(rel->part_scheme && rel->part_rels);
+
+	for (i = 0; i < rel->nparts; i++)
+	{
+		RelOptInfo *part_rel = rel->part_rels[i];
+
+		if (part_rel->part_scheme && part_rel->part_rels)
+		{
+			get_partition_table_relids(part_rel, target_relids);
+		}
+		else
+		{
+			*target_relids = bms_union(*target_relids,
+									   part_rel->relids);
+		}
+	}
+}
+
+/*
+ *  Given a virtual column from an Union ALL subquery,
+ *  return the expression it immediately occludes that satisfy
+ *  the inheritance condition,
+ *  i.e. appendRelInfo->parent_relid == outside_subq_var->varno
+ */
+static int
+get_appendrel_occluded_references(const Expr *ex,
+								  Expr **occluded_exprs,
+								  int num_exprs,
+								  const PlannerInfo *root)
+{
+	Var		   *outside_subq_var = (Var *) ex;
+	RangeTblEntry *outside_subq_rte = NULL;
+	int			idx = 0;
+
+
+	Assert(ex && root);
+	Assert(IsA(ex, Var));
+	Assert(outside_subq_var->varno < root->simple_rel_array_size);
+
+	outside_subq_rte = root->simple_rte_array[outside_subq_var->varno];
+
+	/* System Vars have varattno < 0, don't bother */
+	if (outside_subq_var->varattno <= 0)
+		return 0;
+
+	/*
+	 * If inheritance, subquery has append, leg of append in subquery may not
+	 * have subroot, process it according to root->append_rel_list.
+	 */
+	if (outside_subq_rte->inh)
+	{
+		ListCell   *lc = NULL;
+
+		Assert(root->append_rel_list &&
+			   num_exprs <= list_length(root->append_rel_list));
+
+		foreach(lc, root->append_rel_list)
+		{
+			AppendRelInfo *appendRelInfo = lfirst(lc);
+
+			if (appendRelInfo->parent_relid == outside_subq_var->varno)
+			{
+				Assert(appendRelInfo->translated_vars &&
+					   outside_subq_var->varattno <=
+					   list_length(appendRelInfo->translated_vars));
+
+				occluded_exprs[idx++] =
+					list_nth(appendRelInfo->translated_vars,
+							 outside_subq_var->varattno - 1);
+			}
+		}
+	}
+
+	return idx;
+}
+
+static Expr *
+get_subquery_var_occluded_reference(const Expr *ex, const PlannerInfo *root)
+{
+	/*
+	 * Given a virtual column from an unflattened subquery, return the
+	 * expression it immediately occludes
+	 */
+	Var		   *outside_subq_var = (Var *) ex;
+	RelOptInfo *outside_subq_relation = NULL;
+	RangeTblEntry *outside_subq_rte = NULL;
+	TargetEntry *te = NULL;
+	Expr	   *inside_subq_expr = NULL;
+
+	Assert(ex && root);
+	Assert(IsA(ex, Var));
+	Assert(outside_subq_var->varno < root->simple_rel_array_size);
+
+	outside_subq_relation = root->simple_rel_array[outside_subq_var->varno];
+	outside_subq_rte = root->simple_rte_array[outside_subq_var->varno];
+
+	/*
+	 * If inheritance, subquery has append, leg of append in subquery may not
+	 * have subroot, we may be able to better process it according to
+	 * root->append_rel_list. For now just return the first leg... TODO better
+	 * handling of Union All, we only return statistics of the first leg atm.
+	 * TODO similarly, need better handling of partitioned tables, according
+	 * to outside_subq_relation->part_scheme and part_rels.
+	 */
+	if (outside_subq_rte->inh)
+	{
+		AppendRelInfo *appendRelInfo = NULL;
+
+		Assert(root->append_rel_list);
+
+		/* TODO remove this check once we add better handling of inheritance */
+		appendRelInfo = list_nth(root->append_rel_list, 0);
+		Assert(appendRelInfo->parent_relid == outside_subq_var->varno);
+
+		Assert(appendRelInfo->translated_vars &&
+			   outside_subq_var->varattno <=
+			   list_length(appendRelInfo->translated_vars));
+		inside_subq_expr = list_nth(appendRelInfo->translated_vars,
+									outside_subq_var->varattno - 1);
+	}
+
+	/* Subquery without append and partitioned tables */
+	else
+	{
+		Assert(outside_subq_relation && IsA(outside_subq_relation, RelOptInfo));
+		Assert(outside_subq_relation->reloptkind == RELOPT_BASEREL);
+		Assert(outside_subq_relation->rtekind == RTE_SUBQUERY);
+		Assert(outside_subq_relation->subroot->processed_tlist);
+
+		te = get_nth_targetentry(outside_subq_var->varattno,
+								 outside_subq_relation->subroot->processed_tlist);
+		Assert(te && outside_subq_var->varattno == te->resno);
+		inside_subq_expr = te->expr;
+
+		/*
+		 * Strip off any Relabel present, and return the underlying expression
+		 */
+		while (inside_subq_expr && IsA(inside_subq_expr, RelabelType))
+		{
+			inside_subq_expr = ((RelabelType *) inside_subq_expr)->arg;
+		}
+	}
+
+	return inside_subq_expr;
+}
+
+
+static TargetEntry *
+get_nth_targetentry(int n, const List *targetlist)
+{
+	int			i = 1;
+	ListCell   *lc = NULL;
+
+	Assert(n > 0);
+	Assert(targetlist && nodeTag(targetlist) == T_List);
+	Assert(list_length(targetlist) >= n);
+
+	if (targetlist && list_length(targetlist) >= n)
+	{
+		foreach(lc, targetlist)
+		{
+			if (i == n)
+			{
+				TargetEntry *te = lfirst(lc);
+
+				return te;
+			}
+			i++;
+		}
+	}
+	return NULL;
+}
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 1808388397a..b62de16899c 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -2905,7 +2905,9 @@ void
 mergejoinscansel(PlannerInfo *root, Node *clause,
 				 Oid opfamily, int strategy, bool nulls_first,
 				 Selectivity *leftstart, Selectivity *leftend,
-				 Selectivity *rightstart, Selectivity *rightend)
+				 Selectivity *rightstart, Selectivity *rightend,
+				 Datum *leftmin, Datum *leftmax,
+				 Datum *rightmin, Datum *rightmax)
 {
 	Node	   *left,
 			   *right;
@@ -2925,10 +2927,6 @@ mergejoinscansel(PlannerInfo *root, Node *clause,
 				revltop,
 				revleop;
 	bool		isgt;
-	Datum		leftmin,
-				leftmax,
-				rightmin,
-				rightmax;
 	double		selec;
 
 	/* Set default results if we can't figure anything out. */
@@ -3075,20 +3073,20 @@ mergejoinscansel(PlannerInfo *root, Node *clause,
 	if (!isgt)
 	{
 		if (!get_variable_range(root, &leftvar, lstatop, collation,
-								&leftmin, &leftmax))
+								leftmin, leftmax))
 			goto fail;			/* no range available from stats */
 		if (!get_variable_range(root, &rightvar, rstatop, collation,
-								&rightmin, &rightmax))
+								rightmin, rightmax))
 			goto fail;			/* no range available from stats */
 	}
 	else
 	{
 		/* need to swap the max and min */
 		if (!get_variable_range(root, &leftvar, lstatop, collation,
-								&leftmax, &leftmin))
+								leftmax, leftmin))
 			goto fail;			/* no range available from stats */
 		if (!get_variable_range(root, &rightvar, rstatop, collation,
-								&rightmax, &rightmin))
+								rightmax, rightmin))
 			goto fail;			/* no range available from stats */
 	}
 
@@ -3098,13 +3096,13 @@ mergejoinscansel(PlannerInfo *root, Node *clause,
 	 * non-default estimates, else stick with our 1.0.
 	 */
 	selec = scalarineqsel(root, leop, isgt, true, collation, &leftvar,
-						  rightmax, op_righttype);
+						  *rightmax, op_righttype);
 	if (selec != DEFAULT_INEQ_SEL)
 		*leftend = selec;
 
 	/* And similarly for the right variable. */
 	selec = scalarineqsel(root, revleop, isgt, true, collation, &rightvar,
-						  leftmax, op_lefttype);
+						  *leftmax, op_lefttype);
 	if (selec != DEFAULT_INEQ_SEL)
 		*rightend = selec;
 
@@ -3128,13 +3126,13 @@ mergejoinscansel(PlannerInfo *root, Node *clause,
 	 * our own default.
 	 */
 	selec = scalarineqsel(root, ltop, isgt, false, collation, &leftvar,
-						  rightmin, op_righttype);
+						  *rightmin, op_righttype);
 	if (selec != DEFAULT_INEQ_SEL)
 		*leftstart = selec;
 
 	/* And similarly for the right variable. */
 	selec = scalarineqsel(root, revltop, isgt, false, collation, &rightvar,
-						  leftmin, op_lefttype);
+						  *leftmin, op_lefttype);
 	if (selec != DEFAULT_INEQ_SEL)
 		*rightstart = selec;
 
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index fda3f9befb0..084cfdf11f9 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -885,6 +885,26 @@ struct config_bool ConfigureNamesBool[] =
 		true,
 		NULL, NULL, NULL
 	},
+	{
+		{"enable_mergejoin_semijoin_filter", PGC_USERSET, QUERY_TUNING_METHOD,
+			gettext_noop("Enables the planner's use of using Semijoin Bloom filters during merge join."),
+			NULL,
+			GUC_EXPLAIN
+		},
+		&enable_mergejoin_semijoin_filter,
+		false,
+		NULL, NULL, NULL
+	},
+	{
+		{"force_mergejoin_semijoin_filter", PGC_USERSET, QUERY_TUNING_METHOD,
+			gettext_noop("Forces the planner's use of using Semijoin Bloom filters during merge join. Overrides enable_mergejoin_semijoin_filter."),
+			NULL,
+			GUC_EXPLAIN
+		},
+		&force_mergejoin_semijoin_filter,
+		false,
+		NULL, NULL, NULL
+	},
 	{
 		{"enable_hashjoin", PGC_USERSET, QUERY_TUNING_METHOD,
 			gettext_noop("Enables the planner's use of hash join plans."),
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 2ae76e5cfb7..1f3d2c772a2 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -379,6 +379,8 @@
 #enable_material = on
 #enable_memoize = on
 #enable_mergejoin = on
+#enable_mergejoin_semijoin_filter = on
+#force_mergejoin_semijoin_filter = on
 #enable_nestloop = on
 #enable_parallel_append = on
 #enable_parallel_hash = on
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 294cfe9c47c..663069455ef 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -2014,6 +2014,9 @@ typedef struct MergePath
 	List	   *innersortkeys;	/* keys for explicit sort, if any */
 	bool		skip_mark_restore;	/* can executor skip mark/restore? */
 	bool		materialize_inner;	/* add Materialize to inner? */
+	bool		use_semijoinfilter; /* should we use a semijoin filter? */
+	double		filteringRate;	/* estimated filtering rate of SJF */
+	int			best_mergeclause;	/* best clause to build SJF on */
 } MergePath;
 
 /*
@@ -2591,6 +2594,12 @@ typedef struct MergeScanSelCache
 	Selectivity leftendsel;		/* last-join fraction for clause left side */
 	Selectivity rightstartsel;	/* first-join fraction for clause right side */
 	Selectivity rightendsel;	/* last-join fraction for clause right side */
+
+	Datum		leftmin;		/* min and max values for left and right
+								 * clauses */
+	Datum		leftmax;
+	Datum		rightmin;
+	Datum		rightmax;
 } MergeScanSelCache;
 
 /*
@@ -3138,6 +3147,10 @@ typedef struct JoinCostWorkspace
 	Cardinality inner_rows;
 	Cardinality outer_skip_rows;
 	Cardinality inner_skip_rows;
+	Datum		outer_min_val;
+	Datum		outer_max_val;
+	Datum		inner_min_val;
+	Datum		inner_max_val;
 
 	/* private for cost_hashjoin code */
 	int			numbuckets;
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 21e642a64c4..a418a406aff 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -22,6 +22,7 @@
 #include "nodes/lockoptions.h"
 #include "nodes/parsenodes.h"
 #include "nodes/primnodes.h"
+#include "optimizer/pathnode.h"
 
 
 /* ----------------------------------------------------------------
@@ -844,6 +845,13 @@ typedef struct MergeJoin
 
 	/* per-clause nulls ordering */
 	bool	   *mergeNullsFirst pg_node_attr(array_size(mergeclauses));
+
+	/* fields for using a SemiJoinFilter */
+	bool		applySemiJoinFilter;
+	double		filteringRate;
+	int			bestExpr;
+	Plan	   *buildingNode;
+	Plan	   *checkingNode;
 } MergeJoin;
 
 /* ----------------
diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h
index f27d11eaa9f..cca624a1d86 100644
--- a/src/include/optimizer/cost.h
+++ b/src/include/optimizer/cost.h
@@ -61,6 +61,8 @@ extern PGDLLIMPORT bool enable_nestloop;
 extern PGDLLIMPORT bool enable_material;
 extern PGDLLIMPORT bool enable_memoize;
 extern PGDLLIMPORT bool enable_mergejoin;
+extern PGDLLIMPORT bool enable_mergejoin_semijoin_filter;
+extern PGDLLIMPORT bool force_mergejoin_semijoin_filter;
 extern PGDLLIMPORT bool enable_hashjoin;
 extern PGDLLIMPORT bool enable_gathermerge;
 extern PGDLLIMPORT bool enable_partitionwise_join;
@@ -213,4 +215,49 @@ extern PathTarget *set_pathtarget_cost_width(PlannerInfo *root, PathTarget *targ
 extern double compute_bitmap_pages(PlannerInfo *root, RelOptInfo *baserel,
 								   Path *bitmapqual, int loop_count, Cost *cost, double *tuple);
 
+/*
+ *  Container for metadata about an expression, used by semijoin decision logic
+ */
+typedef struct expr_metadata
+{
+	bool		is_or_maps_to_constant;
+	bool		is_or_maps_to_base_column;
+
+	/* Var and relation from the current query block, if it is a Var */
+	const Var  *local_column_expr;
+	const RelOptInfo *local_relation;
+
+	int32		est_col_width;
+
+	/*
+	 * The following will be the same as local Var and relation when the local
+	 * relation is a base table (i.e. no occluding query blocks).  Otherwise
+	 * it will be the occluded base column, if the final occluded expression
+	 * is a base column.
+	 */
+	const Var  *base_column_expr;
+	const RelOptInfo *base_rel;
+	const PlannerInfo *base_rel_root;
+	double		base_rel_row_count;
+	double		base_rel_filt_row_count;
+	double		base_col_distincts;
+	Datum		base_col_min_value;
+	Datum		base_col_max_value;
+
+	/* True if the distinct est is based on something meaningful  */
+	bool		est_distincts_reliable;
+	bool		est_minmax_reliable;
+
+	/* Estimated distincts after local filtering, and row count adjustments */
+	double		expr_est_distincts;
+}			ExprMetadata;
+
+extern void analyze_expr_for_metadata(const Expr *ex,
+									  const PlannerInfo *root,
+									  ExprMetadata * md);
+extern bool expressions_match_foreign_key(ForeignKeyOptInfo *fk,
+										  List *con_exprs,
+										  List *ref_exprs,
+										  List *operators);
+
 #endif							/* COST_H */
diff --git a/src/include/utils/selfuncs.h b/src/include/utils/selfuncs.h
index d485b9bfcd9..2d67efd2950 100644
--- a/src/include/utils/selfuncs.h
+++ b/src/include/utils/selfuncs.h
@@ -208,7 +208,9 @@ extern Selectivity rowcomparesel(PlannerInfo *root,
 extern void mergejoinscansel(PlannerInfo *root, Node *clause,
 							 Oid opfamily, int strategy, bool nulls_first,
 							 Selectivity *leftstart, Selectivity *leftend,
-							 Selectivity *rightstart, Selectivity *rightend);
+							 Selectivity *rightstart, Selectivity *rightend,
+							 Datum *leftmin, Datum *leftmax,
+							 Datum *rightmin, Datum *rightmax);
 
 extern double estimate_num_groups(PlannerInfo *root, List *groupExprs,
 								  double input_rows, List **pgset,
diff --git a/src/test/regress/expected/sysviews.out b/src/test/regress/expected/sysviews.out
index 4e775af1758..c2c3946e95f 100644
--- a/src/test/regress/expected/sysviews.out
+++ b/src/test/regress/expected/sysviews.out
@@ -109,30 +109,31 @@ select count(*) = 0 as ok from pg_stat_wal_receiver;
 -- This is to record the prevailing planner enable_foo settings during
 -- a regression test run.
 select name, setting from pg_settings where name like 'enable%';
-              name              | setting 
---------------------------------+---------
- enable_async_append            | on
- enable_bitmapscan              | on
- enable_gathermerge             | on
- enable_group_by_reordering     | on
- enable_hashagg                 | on
- enable_hashjoin                | on
- enable_incremental_sort        | on
- enable_indexonlyscan           | on
- enable_indexscan               | on
- enable_material                | on
- enable_memoize                 | on
- enable_mergejoin               | on
- enable_nestloop                | on
- enable_parallel_append         | on
- enable_parallel_hash           | on
- enable_partition_pruning       | on
- enable_partitionwise_aggregate | off
- enable_partitionwise_join      | off
- enable_seqscan                 | on
- enable_sort                    | on
- enable_tidscan                 | on
-(21 rows)
+               name               | setting 
+----------------------------------+---------
+ enable_async_append              | on
+ enable_bitmapscan                | on
+ enable_gathermerge               | on
+ enable_group_by_reordering       | on
+ enable_hashagg                   | on
+ enable_hashjoin                  | on
+ enable_incremental_sort          | on
+ enable_indexonlyscan             | on
+ enable_indexscan                 | on
+ enable_material                  | on
+ enable_memoize                   | on
+ enable_mergejoin                 | on
+ enable_mergejoin_semijoin_filter | off
+ enable_nestloop                  | on
+ enable_parallel_append           | on
+ enable_parallel_hash             | on
+ enable_partition_pruning         | on
+ enable_partitionwise_aggregate   | off
+ enable_partitionwise_join        | off
+ enable_seqscan                   | on
+ enable_sort                      | on
+ enable_tidscan                   | on
+(22 rows)
 
 -- Test that the pg_timezone_names and pg_timezone_abbrevs views are
 -- more-or-less working.  We can't test their contents in any great detail
-- 
2.37.3

0002-review-20221003.patchtext/x-patch; charset=UTF-8; name=0002-review-20221003.patchDownload

From c51c7f7cfa700d3ec93a2828d3be119ef1839406 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas.vondra@postgresql.org>
Date: Mon, 3 Oct 2022 00:12:37 +0200
Subject: [PATCH 2/7] review

---
 src/backend/optimizer/path/costsize.c   | 74 ++++++++++++++++++++++++-
 src/backend/optimizer/plan/createplan.c |  5 ++
 src/backend/utils/adt/selfuncs.c        |  2 +
 src/include/optimizer/cost.h            |  9 +++
 4 files changed, 89 insertions(+), 1 deletion(-)

diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index d1663e5a379..da17da599d7 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -195,6 +195,8 @@ static double get_parallel_divisor(Path *path);
 /*
  *  Local functions and option variables to support
  *  semijoin pushdowns from join nodes
+ *
+ * XXX It's customary to keep the "root" as the first parameter.
  */
 static double evaluate_semijoin_filtering_rate(JoinPath *join_path,
 											   const List *hash_equijoins,
@@ -3901,6 +3903,12 @@ final_cost_mergejoin(PlannerInfo *root, MergePath *path,
 		 * determine if merge join should use a semijoin filter. We need to
 		 * rearrange the merge clauses so they match the status of the clauses
 		 * during Plan creation.
+		 *
+		 * XXX Not sure what "match the status" means? Perhaps ordering of clauses?
+		 * Ah, it means we get the clauses in outer/inner is first/second?
+		 *
+		 * XXX Why do we do the call get_actual_clauses() at all? Clearly the result
+		 * is just discarded right away ...
 		 */
 		mergeclauses_for_sjf = get_actual_clauses(path->path_mergeclauses);
 		mergeclauses_for_sjf = get_switched_clauses(path->path_mergeclauses,
@@ -3912,6 +3920,16 @@ final_cost_mergejoin(PlannerInfo *root, MergePath *path,
 		{
 			/* found a valid SJF at the very least */
 			/* want at least 1000 rows_filtered to avoid any nasty edge cases */
+			/*
+			 * XXX Not sure hard-coded limits will be acceptable. We do that for
+			 * JIT, but that's because we don't have any reasonable cost model
+			 * for the LLVM stuff etc. Can't we construct one here? Surely we
+			 * have at least some rough idea regarding how expensive it is to
+			 * build and query the bloom filter, no?
+			 *
+			 * XXX It's very weird this does not consider if the bloom filter
+			 * fits into work_mem and/or false positive rate.
+			 */
 			if (force_mergejoin_semijoin_filter || (filteringRate >= 0.35 && rows_filtered > 1000))
 			{
 				double		improvement;
@@ -3930,6 +3948,16 @@ final_cost_mergejoin(PlannerInfo *root, MergePath *path,
 				 * concludes thata a 17% filtering rate is the break-even
 				 * point, so we use 35% just be conservative. We use this
 				 * information to adjust the MergeJoin's planned cost.
+				 *
+				 * XXX I'd bet this will very much depend on what the path
+				 * (with the bloom filter) does.
+				 *
+				 * XXX Also, if we're only tweaking costs for the mergejoin
+				 * node, won't that be somewhat confusing? I mean, if we filter
+				 * out most of the rows, then it'll look like the row estimates
+				 * are wrong. I wonder if it may cause some more serious issues.
+				 * Of course, when planning the scans, we don't know there'll be
+				 * a join (not to mention bloom filter pushdown) later.
 				 */
 				improvement = 0.83 * filteringRate - 0.137;
 				run_cost = (1 - improvement) * run_cost;
@@ -6650,6 +6678,11 @@ compute_bitmap_pages(PlannerInfo *root, RelOptInfo *baserel, Path *bitmapqual,
  *  and only if the TRACE_SJPD flag is defined.
  */
 
+/*
+ * XXX If we want some debug logging, perhaps this should do something more like
+ * we have for TRACE_SORT. Inventing new "debug_" macros seems unnecessary.
+ */
+
 #define TRACE_SJPD 0
 #define DEBUG_BUILD 0
 
@@ -6762,7 +6795,10 @@ debug_sj_expr_metadata(const char *side, int ord, const ExprMetadata * md)
 }
 #endif
 
-
+/*
+ * XXX No sure why this function (and various others) have the comment inside,
+ * and not before the function?
+ */
 static double
 estimate_distincts_remaining(double original_table_row_count,
 							 double original_distinct_count,
@@ -6796,6 +6832,11 @@ estimate_distincts_remaining(double original_table_row_count,
 	 *
 	 * Internally this function uses M, N, P, and K as variables to match the
 	 * notation used in the equation in the paper.
+	 *
+	 * XXX We already do similar estimated in estimate_num_groups_incremental,
+	 * i.e. we estimate num of groups when we select just some rows (that's
+	 * at selfuncs.c:3670 - roughly). There's reference to a paper by Alberto
+	 * Dell'Era. Why should this use a different logic? Might be confusing.
 	 */
 	double		n = original_table_row_count;
 	double		m = original_distinct_count;
@@ -7159,6 +7200,8 @@ analyze_expr_for_metadata(const Expr *ex,
  *  is possible, determine the single equijoin condition that is most
  *  useful as well as the estimated filtering rate of the filter.
  *
+ * XXX Why not to allow bloom filter on multiple conditions?
+ *
  *  The output args, inner_semijoin_keys and
  *  outer_semijoin_keys, will each contain a single key column
  *  from one of the hash equijoin conditions. probe_semijoin_keys
@@ -7224,9 +7267,19 @@ evaluate_semijoin_filtering_rate(JoinPath *join_path,
 
 	if (!enable_mergejoin_semijoin_filter && !force_mergejoin_semijoin_filter)
 	{
+		/*
+		 * XXX AFAICS we shouldn't ever get here, because this gets only called
+		 * with enable_mergejoin_semijoin_filter=true, no?
+		 */
 		return 0;				/* option setting disabled semijoin insertion  */
 	}
 
+	/*
+	 * XXX use the memory context through palloc, not alloca() or anything else
+	 * directly. Also, it's probably pointless to check if we got NULL - palloc
+	 * will check that, and in some cases we'll only find later on access anyway
+	 * because of overcommit.
+	 */
 	num_md = list_length(equijoin_list);
 	outer_md_array = alloca(sizeof(ExprMetadata) * num_md);
 	inner_md_array = alloca(sizeof(ExprMetadata) * num_md);
@@ -7564,6 +7617,14 @@ verify_valid_pushdown(const Path *path,
 								 * this fact until the Plan has been created.
 								 * We will push through for now, but verify
 								 * again during Plan creation. */
+
+				/*
+				 * XXX How come we can't validate that? Surely we consider if
+				 * a path hash grouping sets when constructing the AggPath, see
+				 * for example consider_groupingsets_paths. However, do we
+				 * even allow AggPath to have grouping sets? We have a separate
+				 * node type GroupingSetsPath for that, no?
+				 */
 				return verify_valid_pushdown(((AggPath *) path)->subpath, target_var_no, root);
 			}
 
@@ -7577,6 +7638,9 @@ verify_valid_pushdown(const Path *path,
 				 * previous HashJoin CR has implemented these cases, but that
 				 * code is run these after the plan has been created, so code
 				 * will need to be adjusted to do it during Path evaluation.
+				 *
+				 * XXX I'd say the Append case is something we'll need because
+				 * of partitioning, no?
 				 */
 				return false;
 			}
@@ -7589,6 +7653,9 @@ verify_valid_pushdown(const Path *path,
 				 * We could definitely implement pushdown filters for Index
 				 * and Bitmap Scans, but currently it is only implemented for
 				 * SeqScan. For now, we return false.
+				 *
+				 * XXX I think this would be very valuable, as it might
+				 * eliminate a lot of the random I/O with index scans.
 				 */
 				return false;
 			}
@@ -7627,6 +7694,11 @@ verify_valid_pushdown(const Path *path,
 	}
 }
 
+/*
+ * XXX Why we need this, when we have list_nth? In fact, that should be
+ * faster, as it may not need to iterate. Also, why do we ignore the case
+ * when the list is shorter? Shouldn't that be an error?
+ */
 static TargetEntry *
 get_nth_targetentry(int n, const List *targetlist)
 {
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index 98a673b9b94..dd912ab76ce 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -7294,6 +7294,11 @@ is_projection_capable_plan(Plan *plan)
  *  	used only for partitioned table.
  *  cur_depth: current depth from the root hash join plan node.
  *  target_node: stores the target plan node where filter will be applied
+ *
+ * XXX I don't understand why this returns int and not bool.
+ * XXX Could we consider pushing the bloom filter only partially, e.g.
+ * as deep as possible (in case there's some sort of node preventing
+ * pushdown further down)?
  */
 static int
 depth_of_semijoin_target(Plan *pn,
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index b62de16899c..bf2d461bbe3 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -2900,6 +2900,8 @@ scalargejoinsel(PG_FUNCTION_ARGS)
  *		*leftend is set to the fraction of the left-hand variable expected
  *		 to be scanned before the join terminates (0 to 1).
  *		*rightstart, *rightend similarly for the right-hand variable.
+ *
+ * XXX Should explain what the new parameters do etc.
  */
 void
 mergejoinscansel(PlannerInfo *root, Node *clause,
diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h
index cca624a1d86..05df9434fee 100644
--- a/src/include/optimizer/cost.h
+++ b/src/include/optimizer/cost.h
@@ -61,6 +61,10 @@ extern PGDLLIMPORT bool enable_nestloop;
 extern PGDLLIMPORT bool enable_material;
 extern PGDLLIMPORT bool enable_memoize;
 extern PGDLLIMPORT bool enable_mergejoin;
+/*
+ * XXX These names sound a bit strange - it's not clear the semijoin
+ * filter is a bloom filter, or that it's pushing down stuff.
+ */
 extern PGDLLIMPORT bool enable_mergejoin_semijoin_filter;
 extern PGDLLIMPORT bool force_mergejoin_semijoin_filter;
 extern PGDLLIMPORT bool enable_hashjoin;
@@ -217,6 +221,11 @@ extern double compute_bitmap_pages(PlannerInfo *root, RelOptInfo *baserel,
 
 /*
  *  Container for metadata about an expression, used by semijoin decision logic
+ *
+ * XXX Probably should be private in costsize.c, why expose it here?
+ *
+ * XXX The name is a bit generic, maybe make sure it's associated to merge
+ * joins and/or bloom filters?
  */
 typedef struct expr_metadata
 {
-- 
2.37.3

0003-Support-semijoin-filter-in-the-executor-for-20221003.patchtext/x-patch; charset=UTF-8; name=0003-Support-semijoin-filter-in-the-executor-for-20221003.patchDownload

From 29ab36bc9cbecf7b5c16845ffe843ef9352a89d7 Mon Sep 17 00:00:00 2001
From: Lyu Pan <lyup@amazon.com>
Date: Fri, 16 Sep 2022 03:35:06 +0000
Subject: [PATCH 3/7] Support semijoin filter in the executor for non-parallel
 mergejoin.

During MergeJoinState initialization, if a semijoin filter should be used in the MergeJoin node (according to the planner), a SemiJoinFilter struct is initialized, then the relation id and attribute number used to build/check the bloom filter are calculated, related information is pushed down to the scan node (only SeqScan for now). The bloom filter is always built on the outer/left tree, and used in the inner/right tree.
---
 src/backend/executor/execScan.c      |  52 +++++-
 src/backend/executor/nodeMergejoin.c | 259 +++++++++++++++++++++++++++
 src/backend/executor/nodeSeqscan.c   |   6 +
 src/include/executor/nodeMergejoin.h |   5 +
 src/include/nodes/execnodes.h        |  25 +++
 5 files changed, 340 insertions(+), 7 deletions(-)

diff --git a/src/backend/executor/execScan.c b/src/backend/executor/execScan.c
index 043bb83f558..4b97c394550 100644
--- a/src/backend/executor/execScan.c
+++ b/src/backend/executor/execScan.c
@@ -19,8 +19,10 @@
 #include "postgres.h"
 
 #include "executor/executor.h"
+#include "executor/nodeMergejoin.h"
 #include "miscadmin.h"
 #include "utils/memutils.h"
+#include "utils/rel.h"
 
 
 
@@ -173,10 +175,12 @@ ExecScan(ScanState *node,
 	/* interrupt checks are in ExecScanFetch */
 
 	/*
-	 * If we have neither a qual to check nor a projection to do, just skip
-	 * all the overhead and return the raw scan tuple.
+	 * If we have neither a qual to check nor a projection to do, nor a bloom
+	 * filter to check, just skip all the overhead and return the raw scan
+	 * tuple.
 	 */
-	if (!qual && !projInfo)
+	if (!qual && !projInfo && !IsA(node, SeqScanState) &&
+		!((SeqScanState *) node)->applySemiJoinFilter)
 	{
 		ResetExprContext(econtext);
 		return ExecScanFetch(node, accessMtd, recheckMtd);
@@ -206,6 +210,13 @@ ExecScan(ScanState *node,
 		 */
 		if (TupIsNull(slot))
 		{
+			if (IsA(node, SeqScanState) &&
+				((SeqScanState *) node)->applySemiJoinFilter)
+			{
+				SemiJoinFilterFinishScan(
+										 ((SeqScanState *) node)->semiJoinFilters,
+										 node->ss_currentRelation->rd_id);
+			}
 			if (projInfo)
 				return ExecClearTuple(projInfo->pi_state.resultslot);
 			else
@@ -232,16 +243,43 @@ ExecScan(ScanState *node,
 			if (projInfo)
 			{
 				/*
-				 * Form a projection tuple, store it in the result tuple slot
-				 * and return it.
+				 * Form a projection tuple, store it in the result tuple slot,
+				 * check against SemiJoinFilter, then return it.
 				 */
-				return ExecProject(projInfo);
+				TupleTableSlot *projectedSlot = ExecProject(projInfo);
+
+				if (IsA(node, SeqScanState) &&
+					((SeqScanState *) node)->applySemiJoinFilter)
+				{
+					if (!SemiJoinFilterExamineSlot(
+												   ((SeqScanState *) node)->semiJoinFilters,
+												   projectedSlot, node->ss_currentRelation->rd_id))
+					{
+						/* slot did not pass SemiJoinFilter, so skipping it. */
+						ResetExprContext(econtext);
+						continue;
+					}
+				}
+				return projectedSlot;
 			}
 			else
 			{
 				/*
-				 * Here, we aren't projecting, so just return scan tuple.
+				 * Here, we aren't projecting, so check against
+				 * SemiJoinFilter, then return tuple.
 				 */
+				if (IsA(node, SeqScanState) &&
+					((SeqScanState *) node)->applySemiJoinFilter)
+				{
+					if (!SemiJoinFilterExamineSlot(
+												   ((SeqScanState *) node)->semiJoinFilters, slot,
+												   node->ss_currentRelation->rd_id))
+					{
+						/* slot did not pass SemiJoinFilter, so skipping it. */
+						ResetExprContext(econtext);
+						continue;
+					}
+				}
 				return slot;
 			}
 		}
diff --git a/src/backend/executor/nodeMergejoin.c b/src/backend/executor/nodeMergejoin.c
index fed345eae54..81c253aad58 100644
--- a/src/backend/executor/nodeMergejoin.c
+++ b/src/backend/executor/nodeMergejoin.c
@@ -93,8 +93,11 @@
 #include "postgres.h"
 
 #include "access/nbtree.h"
+#include "common/pg_prng.h"
 #include "executor/execdebug.h"
+#include "executor/execExpr.h"
 #include "executor/nodeMergejoin.h"
+#include "lib/bloomfilter.h"
 #include "miscadmin.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
@@ -1603,6 +1606,95 @@ ExecInitMergeJoin(MergeJoin *node, EState *estate, int eflags)
 											node->mergeNullsFirst,
 											(PlanState *) mergestate);
 
+	/*
+	 * initialize SemiJoinFilter, if planner decided to do so
+	 */
+	if (((MergeJoin *) mergestate->js.ps.plan)->applySemiJoinFilter)
+	{
+		SemiJoinFilter *sjf;
+		Plan	   *buildingNode;
+		Plan	   *checkingNode;
+		uint64		seed;
+		MergeJoinClause clause;
+
+		/* create Bloom filter */
+		sjf = (SemiJoinFilter *) palloc0(sizeof(SemiJoinFilter));
+
+		/*
+		 * Push down filter down outer and inner subtrees and apply filter to
+		 * the nodes that correspond to the ones identified during planning.
+		 * We are pushing down first because we need some metadata from the
+		 * scan nodes (i.e. Relation Id's and planner-estimated number of
+		 * rows).
+		 */
+		buildingNode = ((MergeJoin *) mergestate->js.ps.plan)->buildingNode;
+		checkingNode = ((MergeJoin *) mergestate->js.ps.plan)->checkingNode;
+		sjf->buildingId = -1;
+		sjf->checkingId = -1;
+		PushDownFilter(mergestate->js.ps.lefttree, buildingNode, sjf, &sjf->buildingId, &sjf->num_elements);
+		PushDownFilter(mergestate->js.ps.righttree, checkingNode, sjf, &sjf->checkingId, &sjf->num_elements);
+
+		/* Initialize SJF data and create Bloom filter */
+		seed = pg_prng_uint64(&pg_global_prng_state);
+		sjf->filter = bloom_create(sjf->num_elements, work_mem, seed);
+		sjf->work_mem = work_mem;
+		sjf->seed = seed;
+		sjf->doneBuilding = false;
+
+		/*
+		 * From the plan level, we already know which mergeclause to build the
+		 * filter on. However, to implement this on the scan level, we look at
+		 * the expression and figure out which slot we need to examine in
+		 * ExecScan to build/check the filter on.
+		 */
+
+		clause = &mergestate->mj_Clauses[((MergeJoin *) mergestate->js.ps.plan)->bestExpr];
+
+		/*
+		 * Look through expression steps to determine which relation attribute
+		 * slot they are comparing and take note of it. All merge join clauses
+		 * eventually fetch tuples from either the outer or inner slot, so we
+		 * just need to check for those specific ExprEvalOp's.
+		 */
+		for (int j = 0; j < clause->lexpr->steps_len; j++)
+		{
+			ExprEvalOp	leftOp = clause->lexpr->steps[j].opcode;
+			int			leftAttr = clause->lexpr->steps[j].d.var.attnum;
+
+			if (leftOp == EEOP_OUTER_FETCHSOME)
+			{
+				/* attribute numbers are 1-indexed */
+				sjf->buildingAttr = leftAttr - 1;
+				break;
+			}
+			else if (leftOp == EEOP_INNER_FETCHSOME)
+			{
+				sjf->checkingAttr = leftAttr - 1;
+				break;
+			}
+		}
+		/* do it again for right expression */
+		for (int j = 0; j < clause->rexpr->steps_len; j++)
+		{
+			ExprEvalOp	rightOp = clause->rexpr->steps[j].opcode;
+			int			rightAttr = clause->rexpr->steps[j].d.var.attnum;
+
+			if (rightOp == EEOP_OUTER_FETCHSOME)
+			{
+				/* attribute numbers are 1-indexed */
+				sjf->buildingAttr = rightAttr - 1;
+				break;
+			}
+			else if (rightOp == EEOP_INNER_FETCHSOME)
+			{
+				sjf->checkingAttr = rightAttr - 1;
+				break;
+			}
+		}
+		sjf->mergejoin_plan_id = mergestate->js.ps.plan->plan_node_id;
+		mergestate->sjf = sjf;
+	}
+
 	/*
 	 * initialize join state
 	 */
@@ -1645,6 +1737,14 @@ ExecEndMergeJoin(MergeJoinState *node)
 	ExecClearTuple(node->js.ps.ps_ResultTupleSlot);
 	ExecClearTuple(node->mj_MarkedTupleSlot);
 
+	/*
+	 * free SemiJoinFilter
+	 */
+	if (node->sjf)
+	{
+		FreeSemiJoinFilter(node->sjf);
+	}
+
 	/*
 	 * shut down the subplans
 	 */
@@ -1678,3 +1778,162 @@ ExecReScanMergeJoin(MergeJoinState *node)
 	if (innerPlan->chgParam == NULL)
 		ExecReScan(innerPlan);
 }
+
+void
+FreeSemiJoinFilter(SemiJoinFilter * sjf)
+{
+	bloom_free(sjf->filter);
+	pfree(sjf);
+}
+
+/*
+ * Determines the direction that a pushdown filter can be pushed. This is not very robust, but this
+ * is because we've already done the careful calculations at the plan level. If we end up pushing where
+ * we're not supposed to, it's fine because we've done the verifications in the planner.
+ */
+int
+PushDownDirection(PlanState *node)
+{
+	switch (nodeTag(node))
+	{
+		case T_HashState:
+		case T_MaterialState:
+		case T_GatherState:
+		case T_GatherMergeState:
+		case T_SortState:
+		case T_UniqueState:
+		case T_AggState:
+			{
+				return 0;
+			}
+		case T_NestLoopState:
+		case T_MergeJoinState:
+		case T_HashJoinState:
+			{
+				return 1;
+			}
+		default:
+			{
+				return -1;
+			}
+	}
+}
+
+/* Recursively pushes down the filter until an appropriate SeqScan node is reached. Then, it
+ * verifies if that SeqScan node is the one we want to push the filter to, and if it is, then
+ * appends the SJF to the node. */
+void
+PushDownFilter(PlanState *node, Plan *plan, SemiJoinFilter * sjf, Oid *relId, int64 *nodeRows)
+{
+	if (node == NULL)
+	{
+		return;
+	}
+
+	check_stack_depth();
+	if (node->type == T_SeqScanState)
+	{
+		SeqScanState *scan = (SeqScanState *) node;
+
+		Assert(IsA(scan, SeqScanState));
+
+		/*
+		 * found the right Scan node that we want to apply the filter onto via
+		 * matching relId
+		 */
+		if (scan->ss.ss_currentRelation->rd_id == *relId)
+		{
+			scan->applySemiJoinFilter = true;
+			scan->semiJoinFilters = lappend(scan->semiJoinFilters, sjf);
+		}
+
+		/*
+		 * Check if right Scan node, based on matching Plan nodes. This will
+		 * be the most common way of matching Scan nodes to the filter, the
+		 * above use case is only for fringe parallel-execution cases.
+		 */
+		else if (scan->ss.ps.plan == plan)
+		{
+			scan->applySemiJoinFilter = true;
+			scan->semiJoinFilters = lappend(scan->semiJoinFilters, sjf);
+			*relId = scan->ss.ss_currentRelation->rd_id;
+			/* double row estimate to reduce error rate for Bloom filter */
+			*nodeRows = Max(*nodeRows, scan->ss.ps.plan->plan_rows * 2);
+		}
+	}
+	else
+	{
+		if (PushDownDirection(node) == 1)
+		{
+			PushDownFilter(node->lefttree, plan, sjf, relId, nodeRows);
+			PushDownFilter(node->righttree, plan, sjf, relId, nodeRows);
+		}
+		else if (PushDownDirection(node) == 0)
+		{
+			PushDownFilter(node->lefttree, plan, sjf, relId, nodeRows);
+		}
+	}
+}
+
+/*
+ * If this table is the building-side table for the SemiJoinFilter, adds the element to
+ * the bloom filter and always returns true. If this table is the checking-side table for the SemiJoinFilter,
+ * then checks the element against the bloom filter and returns true if the element is (probably) in the set,
+ * and false if the element is not in the bloom filter.
+ */
+bool
+SemiJoinFilterExamineSlot(List *semiJoinFilters, TupleTableSlot *slot, Oid tableId)
+{
+	ListCell   *cell;
+
+	foreach(cell, semiJoinFilters)
+	{
+		SemiJoinFilter *sjf = ((SemiJoinFilter *) (lfirst(cell)));
+
+		/* check if this table's relation ID matches the filter's relation ID */
+		if (!sjf->doneBuilding && tableId == sjf->buildingId)
+		{
+			Datum		val;
+
+			slot_getsomeattrs(slot, sjf->buildingAttr + 1);
+			sjf->elementsAdded++;
+
+			/*
+			 * We are only using one key for now. Later functionality might
+			 * include multiple join keys.
+			 */
+			val = slot->tts_values[sjf->buildingAttr];
+			bloom_add_element(sjf->filter, (unsigned char *) &val, sizeof(val));
+		}
+		else if (sjf->doneBuilding && tableId == sjf->checkingId)
+		{
+			Datum		val;
+
+			slot_getsomeattrs(slot, sjf->checkingAttr + 1);
+			sjf->elementsChecked++;
+			val = slot->tts_values[sjf->checkingAttr];
+			if (bloom_lacks_element(sjf->filter, (unsigned char *) &val, sizeof(val)))
+			{
+				sjf->elementsFiltered++;
+				return false;
+			}
+		}
+	}
+	return true;
+}
+
+void
+SemiJoinFilterFinishScan(List *semiJoinFilters, Oid tableId)
+{
+	ListCell   *cell;
+
+	foreach(cell, semiJoinFilters)
+	{
+		SemiJoinFilter *sjf = ((SemiJoinFilter *) (lfirst(cell)));
+
+		if (!sjf->doneBuilding && tableId == sjf->buildingId)
+		{
+			sjf->doneBuilding = true;
+		}
+	}
+}
diff --git a/src/backend/executor/nodeSeqscan.c b/src/backend/executor/nodeSeqscan.c
index 7b58cd9162e..e43ce3f8d0f 100644
--- a/src/backend/executor/nodeSeqscan.c
+++ b/src/backend/executor/nodeSeqscan.c
@@ -207,6 +207,12 @@ ExecEndSeqScan(SeqScanState *node)
 	 */
 	if (scanDesc != NULL)
 		table_endscan(scanDesc);
+
+	/*
+	 * clear out semijoinfilter list, if non-null
+	 */
+	if (node->semiJoinFilters)
+		pfree(node->semiJoinFilters);
 }
 
 /* ----------------------------------------------------------------
diff --git a/src/include/executor/nodeMergejoin.h b/src/include/executor/nodeMergejoin.h
index 26ab5175081..c311c7ed80e 100644
--- a/src/include/executor/nodeMergejoin.h
+++ b/src/include/executor/nodeMergejoin.h
@@ -19,5 +19,10 @@
 extern MergeJoinState *ExecInitMergeJoin(MergeJoin *node, EState *estate, int eflags);
 extern void ExecEndMergeJoin(MergeJoinState *node);
 extern void ExecReScanMergeJoin(MergeJoinState *node);
+extern void FreeSemiJoinFilter(SemiJoinFilter * sjf);
+extern int	PushDownDirection(PlanState *node);
+extern void PushDownFilter(PlanState *node, Plan *plan, SemiJoinFilter * sjf, Oid *relId, int64 *nodeRows);
+extern bool SemiJoinFilterExamineSlot(List *semiJoinFilters, TupleTableSlot *slot, Oid tableId);
+extern void SemiJoinFilterFinishScan(List *semiJoinFilters, Oid tableId);
 
 #endif							/* NODEMERGEJOIN_H */
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 01b1727fc09..69644627204 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -32,6 +32,7 @@
 #include "access/tupconvert.h"
 #include "executor/instrument.h"
 #include "fmgr.h"
+#include "lib/bloomfilter.h"
 #include "lib/ilist.h"
 #include "lib/pairingheap.h"
 #include "nodes/params.h"
@@ -1451,6 +1452,9 @@ typedef struct SeqScanState
 {
 	ScanState	ss;				/* its first field is NodeTag */
 	Size		pscan_len;		/* size of parallel heap scan descriptor */
+	/* for use of SemiJoinFilters during merge join */
+	bool		applySemiJoinFilter;
+	List	   *semiJoinFilters;
 } SeqScanState;
 
 /* ----------------
@@ -2012,6 +2016,26 @@ typedef struct NestLoopState
 /* private in nodeMergejoin.c: */
 typedef struct MergeJoinClauseData *MergeJoinClause;
 
+typedef struct SemiJoinFilter
+{
+	bloom_filter *filter;
+	/* Relation that Bloom Filter is built on */
+	Oid			buildingId;
+	int			buildingAttr;
+	/* Relation that Bloom Filter is checked on */
+	Oid			checkingId;
+	int			checkingAttr;
+	bool		doneBuilding;
+	/* metadata */
+	uint64		seed;
+	int64		num_elements;
+	int			work_mem;
+	int			elementsAdded;
+	int			elementsChecked;
+	int			elementsFiltered;
+	int			mergejoin_plan_id;
+}			SemiJoinFilter;
+
 typedef struct MergeJoinState
 {
 	JoinState	js;				/* its first field is NodeTag */
@@ -2032,6 +2056,7 @@ typedef struct MergeJoinState
 	TupleTableSlot *mj_NullInnerTupleSlot;
 	ExprContext *mj_OuterEContext;
 	ExprContext *mj_InnerEContext;
+	SemiJoinFilter *sjf;
 } MergeJoinState;
 
 /* ----------------
-- 
2.37.3

0004-review-20221003.patchtext/x-patch; charset=UTF-8; name=0004-review-20221003.patchDownload

From 7ba4272a429ab4b6afe49c2014f3c35e2809fb3a Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas.vondra@postgresql.org>
Date: Mon, 3 Oct 2022 13:39:05 +0200
Subject: [PATCH 4/7] review

---
 src/backend/executor/execScan.c      |  6 ++++++
 src/backend/executor/nodeMergejoin.c | 32 ++++++++++++++++++++++++++++
 src/backend/executor/nodeSeqscan.c   |  3 +++
 3 files changed, 41 insertions(+)

diff --git a/src/backend/executor/execScan.c b/src/backend/executor/execScan.c
index 4b97c394550..e87fb5c6eba 100644
--- a/src/backend/executor/execScan.c
+++ b/src/backend/executor/execScan.c
@@ -210,6 +210,12 @@ ExecScan(ScanState *node,
 		 */
 		if (TupIsNull(slot))
 		{
+			/*
+			 * XXX I'm confused. Are we building or applying the filter for this
+			 * node? Because applySemiJoinFilter kinda suggests we're checking
+			 * the filter (to filter rows), but SemiJoinFilterFinishScan sets
+			 * doneBuilding=true, which suggests we're building it.
+			 */
 			if (IsA(node, SeqScanState) &&
 				((SeqScanState *) node)->applySemiJoinFilter)
 			{
diff --git a/src/backend/executor/nodeMergejoin.c b/src/backend/executor/nodeMergejoin.c
index 81c253aad58..9a1c0894060 100644
--- a/src/backend/executor/nodeMergejoin.c
+++ b/src/backend/executor/nodeMergejoin.c
@@ -1608,6 +1608,9 @@ ExecInitMergeJoin(MergeJoin *node, EState *estate, int eflags)
 
 	/*
 	 * initialize SemiJoinFilter, if planner decided to do so
+	 *
+	 * XXX I suggest we move this into a separate function, called e.g.
+	 * ExecInitMergeBloomFilter() or something like that.
 	 */
 	if (((MergeJoin *) mergestate->js.ps.plan)->applySemiJoinFilter)
 	{
@@ -1739,6 +1742,9 @@ ExecEndMergeJoin(MergeJoinState *node)
 
 	/*
 	 * free SemiJoinFilter
+	 *
+	 * XXX I suggest we do the "if" in the function. And maybe call it
+	 * ExecFreeSemiJoinFilter() to keep the naming more consistent.
 	 */
 	if (node->sjf)
 	{
@@ -1790,6 +1796,13 @@ FreeSemiJoinFilter(SemiJoinFilter * sjf)
  * Determines the direction that a pushdown filter can be pushed. This is not very robust, but this
  * is because we've already done the careful calculations at the plan level. If we end up pushing where
  * we're not supposed to, it's fine because we've done the verifications in the planner.
+ *
+ * XXX Not sure I understand the last sentence. If we did the verification in the planner,
+ * we should not end pushing where we're not supposed to, no? Or what happens?
+ *
+ * XXX Also, either this is specific to merge join, and then maybe call it ExecMergePush... or
+ * it's generic function to be used from other places, and then maybe it should be in some
+ * other file?
  */
 int
 PushDownDirection(PlanState *node)
@@ -1880,6 +1893,12 @@ PushDownFilter(PlanState *node, Plan *plan, SemiJoinFilter * sjf, Oid *relId, in
  * the bloom filter and always returns true. If this table is the checking-side table for the SemiJoinFilter,
  * then checks the element against the bloom filter and returns true if the element is (probably) in the set,
  * and false if the element is not in the bloom filter.
+ *
+ * XXX So we can be building and examining the filter at the same time? That is, we can
+ * be examining the filter before it's fully built? Seems a bit strange, and it can
+ * make the benefits pretty arbitrary, depending on when we finish building. For example
+ * if we have non-overlapping data (a,b) and (c,d) where b<c, it'll be great, but if we
+ * have d<a then we'll never filter anything.
  */
 bool
 SemiJoinFilterExamineSlot(List *semiJoinFilters, TupleTableSlot *slot, Oid tableId)
@@ -1890,6 +1909,19 @@ SemiJoinFilterExamineSlot(List *semiJoinFilters, TupleTableSlot *slot, Oid table
 	{
 		SemiJoinFilter *sjf = ((SemiJoinFilter *) (lfirst(cell)));
 
+		/*
+		 * XXX Not sure this is correct expectation, but can we assume we
+		 * don't start probing until it's built? It might be true with sort
+		 * on top of seqscan, but what about other scans, that can produce
+		 * rows incrementally?
+		 */
+		Assert(!(!sjf->doneBuilding && (tableId == sjf->checkingId)));
+
+		/*
+		 * XXX comparing OIDs is wrong, and will fail when the same table
+		 * is self-joined. Needs to use rtindex or something like that.
+		 */
+
 		/* check if this table's relation ID matches the filter's relation ID */
 		if (!sjf->doneBuilding && tableId == sjf->buildingId)
 		{
diff --git a/src/backend/executor/nodeSeqscan.c b/src/backend/executor/nodeSeqscan.c
index e43ce3f8d0f..f53e215d9be 100644
--- a/src/backend/executor/nodeSeqscan.c
+++ b/src/backend/executor/nodeSeqscan.c
@@ -210,6 +210,9 @@ ExecEndSeqScan(SeqScanState *node)
 
 	/*
 	 * clear out semijoinfilter list, if non-null
+	 *
+	 * XXX This only frees the list, not the nodes. But does it matter? If not
+	 * why free the list at all?
 	 */
 	if (node->semiJoinFilters)
 		pfree(node->semiJoinFilters);
-- 
2.37.3

0005-Support-semijoin-filter-in-the-executor-for-20221003.patchtext/x-patch; charset=UTF-8; name=0005-Support-semijoin-filter-in-the-executor-for-20221003.patchDownload

From a09fdaa0bd3e7bf3e0ebe2dc2a6d3ce793ebfa0e Mon Sep 17 00:00:00 2001
From: Lyu Pan <lyup@amazon.com>
Date: Fri, 16 Sep 2022 21:28:52 +0000
Subject: [PATCH 5/7] Support semijoin filter in the executor for parallel
 mergejoin.

Each worker process create its own bloom filter, and updates its own bloom filter during the outer/left plan scan (no lock is needed). After all processes finish the scan, the bloom filtered are merged together (by performing OR operations on the bit arrays) into a shared bloom filter in the Dynamic Shared Memory area (use lock for synchronization). After this is completed, the merged bloom filter will be copied back to all the worker processes, and be used in the inner/right plan scan for filtering.
---
 src/backend/executor/execScan.c      |   3 +-
 src/backend/executor/nodeMergejoin.c | 221 +++++++++++++++++++++++++--
 src/backend/executor/nodeSeqscan.c   | 173 ++++++++++++++++++++-
 src/backend/lib/bloomfilter.c        |  74 +++++++++
 src/include/executor/nodeMergejoin.h |   4 +-
 src/include/lib/bloomfilter.h        |   7 +
 src/include/nodes/execnodes.h        |  31 ++++
 7 files changed, 495 insertions(+), 18 deletions(-)

diff --git a/src/backend/executor/execScan.c b/src/backend/executor/execScan.c
index e87fb5c6eba..d54479ea3af 100644
--- a/src/backend/executor/execScan.c
+++ b/src/backend/executor/execScan.c
@@ -221,7 +221,8 @@ ExecScan(ScanState *node,
 			{
 				SemiJoinFilterFinishScan(
 										 ((SeqScanState *) node)->semiJoinFilters,
-										 node->ss_currentRelation->rd_id);
+										 node->ss_currentRelation->rd_id,
+										 node->ps.state->es_query_dsa);
 			}
 			if (projInfo)
 				return ExecClearTuple(projInfo->pi_state.resultslot);
diff --git a/src/backend/executor/nodeMergejoin.c b/src/backend/executor/nodeMergejoin.c
index 9a1c0894060..e28d0639567 100644
--- a/src/backend/executor/nodeMergejoin.c
+++ b/src/backend/executor/nodeMergejoin.c
@@ -99,6 +99,10 @@
 #include "executor/nodeMergejoin.h"
 #include "lib/bloomfilter.h"
 #include "miscadmin.h"
+#include "storage/dsm.h"
+#include "storage/lwlock.h"
+#include "storage/shm_toc.h"
+#include "utils/dsa.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 
@@ -659,6 +663,26 @@ ExecMergeJoin(PlanState *pstate)
 				outerTupleSlot = ExecProcNode(outerPlan);
 				node->mj_OuterTupleSlot = outerTupleSlot;
 
+				/*
+				 * Check if outer plan has an SJF and the inner plan does not.
+				 * This case will only arise during parallel execution, when
+				 * the outer plan is initialized with the SJF but the inner
+				 * plan does not because it is not included in the memory
+				 * copied over during worker creation. If this is the case,
+				 * then push down the filter to the inner plan level to
+				 * correct this error and then proceed as normal.
+				 */
+				if (GetSemiJoinFilter(outerPlan, pstate->plan->plan_node_id)
+					&& !GetSemiJoinFilter(innerPlan,
+										  pstate->plan->plan_node_id))
+				{
+					SemiJoinFilter *sjf = GetSemiJoinFilter(outerPlan,
+															pstate->plan->plan_node_id);
+
+					PushDownFilter(innerPlan, NULL, sjf, &sjf->checkingId,
+								   NULL);
+				}
+
 				/* Compute join values and check for unmatchability */
 				switch (MJEvalOuterValues(node))
 				{
@@ -1832,11 +1856,14 @@ PushDownDirection(PlanState *node)
 	}
 }
 
-/* Recursively pushes down the filter until an appropriate SeqScan node is reached. Then, it
- * verifies if that SeqScan node is the one we want to push the filter to, and if it is, then
- * appends the SJF to the node. */
+/*
+ * Recursively pushes down the filter until an appropriate SeqScan node is
+ * reached. Then, it verifies if that SeqScan node is the one we want to push
+ * the filter to, and if it is, then appends the SJF to the node.
+ */
 void
-PushDownFilter(PlanState *node, Plan *plan, SemiJoinFilter * sjf, Oid *relId, int64 *nodeRows)
+PushDownFilter(PlanState *node, Plan *plan, SemiJoinFilter * sjf, Oid *relId,
+			   int64 *nodeRows)
 {
 	if (node == NULL)
 	{
@@ -1851,8 +1878,8 @@ PushDownFilter(PlanState *node, Plan *plan, SemiJoinFilter * sjf, Oid *relId, in
 		Assert(IsA(scan, SeqScanState));
 
 		/*
-		 * found the right Scan node that we want to apply the filter onto via
-		 * matching relId
+		 * Found the right Scan node that we want to apply the filter onto via
+		 * matching relId.
 		 */
 		if (scan->ss.ss_currentRelation->rd_id == *relId)
 		{
@@ -1889,10 +1916,11 @@ PushDownFilter(PlanState *node, Plan *plan, SemiJoinFilter * sjf, Oid *relId, in
 }
 
 /*
- * If this table is the building-side table for the SemiJoinFilter, adds the element to
- * the bloom filter and always returns true. If this table is the checking-side table for the SemiJoinFilter,
- * then checks the element against the bloom filter and returns true if the element is (probably) in the set,
- * and false if the element is not in the bloom filter.
+ * If this table is the building-side table for the SemiJoinFilter, adds the
+ * element to the bloom filter and always returns true. If this table is the
+ * checking-side table for the SemiJoinFilter, then checks the element
+ * against the bloom filter and returns true if the element is (probably)
+ * in the set, and false if the element is not in the bloom filter.
  *
  * XXX So we can be building and examining the filter at the same time? That is, we can
  * be examining the filter before it's fully built? Seems a bit strange, and it can
@@ -1901,7 +1929,8 @@ PushDownFilter(PlanState *node, Plan *plan, SemiJoinFilter * sjf, Oid *relId, in
  * have d<a then we'll never filter anything.
  */
 bool
-SemiJoinFilterExamineSlot(List *semiJoinFilters, TupleTableSlot *slot, Oid tableId)
+SemiJoinFilterExamineSlot(List *semiJoinFilters,
+						  TupleTableSlot *slot, Oid tableId)
 {
 	ListCell   *cell;
 
@@ -1935,7 +1964,8 @@ SemiJoinFilterExamineSlot(List *semiJoinFilters, TupleTableSlot *slot, Oid table
 			 * include multiple join keys.
 			 */
 			val = slot->tts_values[sjf->buildingAttr];
-			bloom_add_element(sjf->filter, (unsigned char *) &val, sizeof(val));
+			bloom_add_element(sjf->filter, (unsigned char *) &val,
+							  sizeof(val));
 		}
 		else if (sjf->doneBuilding && tableId == sjf->checkingId)
 		{
@@ -1944,7 +1974,8 @@ SemiJoinFilterExamineSlot(List *semiJoinFilters, TupleTableSlot *slot, Oid table
 			slot_getsomeattrs(slot, sjf->checkingAttr + 1);
 			sjf->elementsChecked++;
 			val = slot->tts_values[sjf->checkingAttr];
-			if (bloom_lacks_element(sjf->filter, (unsigned char *) &val, sizeof(val)))
+			if (bloom_lacks_element(sjf->filter,
+									(unsigned char *) &val, sizeof(val)))
 			{
 				sjf->elementsFiltered++;
 				return false;
@@ -1955,7 +1986,8 @@ SemiJoinFilterExamineSlot(List *semiJoinFilters, TupleTableSlot *slot, Oid table
 }
 
 void
-SemiJoinFilterFinishScan(List *semiJoinFilters, Oid tableId)
+SemiJoinFilterFinishScan(List *semiJoinFilters, Oid tableId,
+						 dsa_area *parallel_area)
 {
 	ListCell   *cell;
 
@@ -1965,7 +1997,166 @@ SemiJoinFilterFinishScan(List *semiJoinFilters, Oid tableId)
 
 		if (!sjf->doneBuilding && tableId == sjf->buildingId)
 		{
-			sjf->doneBuilding = true;
+			if (!sjf->isParallel)
+			{
+				/*
+				 * Not parallel, so only one process running and that process
+				 * is now complete.
+				 */
+				sjf->doneBuilding = true;
+			}
+			else
+			{
+				/* parallel, so need to sync with the other processes */
+				SemiJoinFilterParallelState *parallelState =
+				(SemiJoinFilterParallelState *) dsa_get_address(
+																parallel_area, sjf->parallelState);
+				bloom_filter *shared_bloom =
+				(bloom_filter *) dsa_get_address(
+												 parallel_area, parallelState->bloom_dsa_address);
+
+				/*
+				 * This process takes control of the lock and updates the
+				 * shared bloom filter. These locks are created by the
+				 * SemiJoinFilterParallelState and are unique to that struct.
+				 */
+				LWLockAcquire(&parallelState->lock, LW_EXCLUSIVE);
+				parallelState->elementsAdded += sjf->elementsAdded;
+				add_to_filter(shared_bloom, sjf->filter);
+				parallelState->workersDone++;
+				LWLockRelease(&parallelState->lock);
+
+				/*
+				 * We need to wait until all threads have had their chance to
+				 * update the shared bloom filter, since our next step is to
+				 * copy the finished bloom filter back into all of the
+				 * separate processes.
+				 */
+				if (parallelState->workersDone == parallelState->numProcesses)
+				{
+					LWLockUpdateVar(&parallelState->secondlock,
+									&parallelState->lockStop, 1);
+				}
+				LWLockWaitForVar(&parallelState->secondlock,
+								 &parallelState->lockStop, 0,
+								 &parallelState->lockStop);
+
+				/*
+				 * Now the shared Bloom filter is fully updated, so each
+				 * individual process copies the finished Bloom filter to the
+				 * local SemiJoinFilter.
+				 */
+				LWLockAcquire(&parallelState->lock, LW_EXCLUSIVE);
+				replace_bitset(sjf->filter, shared_bloom);
+				sjf->elementsAdded = parallelState->elementsAdded;
+				sjf->doneBuilding = true;
+				parallelState->workersDone++;
+				LWLockRelease(&parallelState->lock);
+
+				/*
+				 * Again, we need to wait for all processes to finish copying
+				 * the completed bloom filter because the main process will
+				 * free the shared memory afterwards.
+				 */
+				if (parallelState->workersDone ==
+					2 * parallelState->numProcesses)
+				{
+					LWLockUpdateVar(&parallelState->secondlock,
+									&parallelState->lockStop, 2);
+				}
+				LWLockWaitForVar(&parallelState->secondlock,
+								 &parallelState->lockStop, 1,
+								 &parallelState->lockStop);
+				/* release allocated shared memory in main process */
+				if (!sjf->isWorker)
+				{
+					LWLockRelease(&parallelState->secondlock);
+					bloom_free_in_dsa(parallel_area,
+									  parallelState->bloom_dsa_address);
+					dsa_free(parallel_area, sjf->parallelState);
+				}
+			}
 		}
 	}
 }
+
+dsa_pointer
+CreateFilterParallelState(dsa_area *area, SemiJoinFilter * sjf,
+						  int sjf_num)
+{
+	dsa_pointer bloom_dsa_address =
+	bloom_create_in_dsa(area, sjf->num_elements, sjf->work_mem, sjf->seed);
+	dsa_pointer parallel_address =
+	dsa_allocate0(area, sizeof(SemiJoinFilterParallelState));
+	SemiJoinFilterParallelState *parallelState =
+	(SemiJoinFilterParallelState *) dsa_get_address(area,
+													parallel_address);
+
+	/* copy over information to parallel state */
+	parallelState->doneBuilding = sjf->doneBuilding;
+	parallelState->seed = sjf->seed;
+	parallelState->num_elements = sjf->num_elements;
+	parallelState->work_mem = sjf->work_mem;
+	parallelState->buildingId = sjf->buildingId;
+	parallelState->checkingId = sjf->checkingId;
+	parallelState->buildingAttr = sjf->buildingAttr;
+	parallelState->checkingAttr = sjf->checkingAttr;
+	parallelState->bloom_dsa_address = bloom_dsa_address;
+	parallelState->sjf_num = sjf_num;
+	parallelState->mergejoin_plan_id = sjf->mergejoin_plan_id;
+	/* initialize locks */
+	LWLockInitialize(&parallelState->lock, LWLockNewTrancheId());
+	LWLockInitialize(&parallelState->secondlock, LWLockNewTrancheId());
+	/* should be main process that acquires lock */
+	LWLockAcquire(&parallelState->secondlock, LW_EXCLUSIVE);
+	return parallel_address;
+}
+
+/*
+ * Checks a side of the execution tree and fetches an SJF if its mergejoin
+ * plan ID matches that of the method's mergejoin ID. Used during parallel
+ * execution, where SJF information is lost during information copying to
+ * the worker.
+ */
+SemiJoinFilter *
+GetSemiJoinFilter(PlanState *node, int plan_id)
+{
+	if (node == NULL)
+	{
+		return NULL;
+	}
+	check_stack_depth();
+	if (node->type == T_SeqScanState)
+	{
+		SeqScanState *scan = (SeqScanState *) node;
+
+		Assert(IsA(scan, SeqScanState));
+		if (scan->applySemiJoinFilter)
+		{
+			ListCell   *lc;
+
+			foreach(lc, scan->semiJoinFilters)
+			{
+				SemiJoinFilter *sjf = (SemiJoinFilter *) lfirst(lc);
+
+				if (sjf->mergejoin_plan_id == plan_id)
+				{
+					return sjf;
+				}
+			}
+			return NULL;
+		}
+	}
+	if (PushDownDirection(node) == 1)
+	{
+		/* check both children and return the non-null one */
+		return GetSemiJoinFilter(node->lefttree, plan_id) != NULL ?
+			GetSemiJoinFilter(node->lefttree, plan_id) :
+			GetSemiJoinFilter(node->righttree, plan_id);
+	}
+	if (PushDownDirection(node) == 0)
+	{
+		return GetSemiJoinFilter(node->lefttree, plan_id);
+	}
+	return NULL;
+}
diff --git a/src/backend/executor/nodeSeqscan.c b/src/backend/executor/nodeSeqscan.c
index f53e215d9be..9352ea649e9 100644
--- a/src/backend/executor/nodeSeqscan.c
+++ b/src/backend/executor/nodeSeqscan.c
@@ -30,9 +30,18 @@
 #include "access/relscan.h"
 #include "access/tableam.h"
 #include "executor/execdebug.h"
+#include "executor/nodeMergejoin.h"
 #include "executor/nodeSeqscan.h"
+#include "storage/lwlock.h"
+#include "storage/shm_toc.h"
 #include "utils/rel.h"
 
+/*
+ * Magic number for location of shared dsa pointer if scan is using a semi-join
+ * filter.
+ */
+#define DSA_LOCATION_KEY_FOR_SJF	UINT64CONST(0xE00000000000FFFF)
+
 static TupleTableSlot *SeqNext(SeqScanState *node);
 
 /* ----------------------------------------------------------------
@@ -157,7 +166,8 @@ ExecInitSeqScan(SeqScan *node, EState *estate, int eflags)
 	/* and create slot with the appropriate rowtype */
 	ExecInitScanTupleSlot(estate, &scanstate->ss,
 						  RelationGetDescr(scanstate->ss.ss_currentRelation),
-						  table_slot_callbacks(scanstate->ss.ss_currentRelation));
+						  table_slot_callbacks(
+											   scanstate->ss.ss_currentRelation));
 
 	/*
 	 * Initialize result type and projection.
@@ -265,6 +275,20 @@ ExecSeqScanEstimate(SeqScanState *node,
 												  estate->es_snapshot);
 	shm_toc_estimate_chunk(&pcxt->estimator, node->pscan_len);
 	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/*
+	 * Estimate space for extra dsa_pointer address for when parallel
+	 * sequential scans use a semi-join filter.
+	 */
+	if (node->ss.ps.plan->parallel_aware && node->applySemiJoinFilter)
+	{
+		shm_toc_estimate_keys(&pcxt->estimator, 1);
+		if (node->semiJoinFilters)
+		{
+			shm_toc_estimate_keys(&pcxt->estimator,
+								  sizeof(dsa_pointer) * list_length(node->semiJoinFilters));
+		}
+	}
 }
 
 /* ----------------------------------------------------------------
@@ -280,6 +304,51 @@ ExecSeqScanInitializeDSM(SeqScanState *node,
 	EState	   *estate = node->ss.ps.state;
 	ParallelTableScanDesc pscan;
 
+	/*
+	 * If scan is using a semi-join filter, then initialize dsa pointer of
+	 * shared sjf.
+	 */
+	if (node->applySemiJoinFilter)
+	{
+		int			sjf_num = list_length(node->semiJoinFilters);
+		dsa_pointer *dsa_pointer_address;	/* an array of size sjf_num */
+		ListCell   *lc;
+		int			i = 0;
+
+		dsa_pointer_address = (dsa_pointer *) shm_toc_allocate(pcxt->toc,
+															   sizeof(dsa_pointer) * sjf_num);
+		foreach(lc, node->semiJoinFilters)
+		{
+			SemiJoinFilter *sjf = (SemiJoinFilter *) (lfirst(lc));
+			SemiJoinFilterParallelState *parallelState;
+			dsa_area   *area = node->ss.ps.state->es_query_dsa;
+
+			sjf->parallelState = CreateFilterParallelState(area, sjf, sjf_num);
+			sjf->isParallel = true;
+			/* check if main process always will run */
+			parallelState = (SemiJoinFilterParallelState *) dsa_get_address(area, sjf->parallelState);
+			parallelState->numProcesses = 1;
+			/* update parallelState with built bloom filter */
+			if (sjf->doneBuilding &&
+				node->ss.ss_currentRelation->rd_id == sjf->checkingId)
+			{
+				bloom_filter *parallel_bloom = (bloom_filter *) dsa_get_address(area, parallelState->bloom_dsa_address);
+
+				replace_bitset(parallel_bloom, sjf->filter);
+				LWLockRelease(&parallelState->secondlock);
+			}
+			dsa_pointer_address[i] = sjf->parallelState;
+			i++;
+		}
+
+		/*
+		 * Add plan_id to magic number so this is also unique for each plan
+		 * node.
+		 */
+		shm_toc_insert(pcxt->toc, DSA_LOCATION_KEY_FOR_SJF +
+					   node->ss.ps.plan->plan_node_id, dsa_pointer_address);
+	}
+
 	pscan = shm_toc_allocate(pcxt->toc, node->pscan_len);
 	table_parallelscan_initialize(node->ss.ss_currentRelation,
 								  pscan,
@@ -320,4 +389,106 @@ ExecSeqScanInitializeWorker(SeqScanState *node,
 	pscan = shm_toc_lookup(pwcxt->toc, node->ss.ps.plan->plan_node_id, false);
 	node->ss.ss_currentScanDesc =
 		table_beginscan_parallel(node->ss.ss_currentRelation, pscan);
+
+	/*
+	 * Create worker's semi-join filter for merge join, if using it. We first
+	 * need to check shm_toc to see if a sjf exists, then create the local
+	 * backend sjf.
+	 */
+	if (shm_toc_lookup(pwcxt->toc,
+					   DSA_LOCATION_KEY_FOR_SJF + node->ss.ps.plan->plan_node_id, 1))
+	{
+		dsa_pointer *parallel_addresses = (dsa_pointer *)
+		shm_toc_lookup(pwcxt->toc,
+					   DSA_LOCATION_KEY_FOR_SJF + node->ss.ps.plan->plan_node_id, 1);
+
+		/*
+		 * we know that there is at least one sjf, we will update accordingly
+		 * if the parallel state says there is more (this avoids using an
+		 * additional shm_toc allocation).
+		 */
+		int			sjf_num = 1;
+
+		/*
+		 * If a copy of any sjf already exists on the backend, we want to free
+		 * it and create a new one.
+		 */
+		if (node->applySemiJoinFilter)
+		{
+			while (list_length(node->semiJoinFilters) > 0)
+			{
+				SemiJoinFilter *sjf = (SemiJoinFilter *)
+				(list_head(node->semiJoinFilters)->ptr_value);
+
+				node->semiJoinFilters =
+					list_delete_nth_cell(node->semiJoinFilters, 0);
+				FreeSemiJoinFilter(sjf);
+			}
+		}
+
+		/*
+		 * Here, we create the process-local SJF's, which will later be
+		 * combined into the single SJF after all parallel work is done.
+		 */
+		for (int i = 0; i < sjf_num; i++)
+		{
+			dsa_pointer parallel_address = parallel_addresses[i];
+			SemiJoinFilterParallelState *parallelState =
+			(SemiJoinFilterParallelState *)
+			dsa_get_address(node->ss.ps.state->es_query_dsa,
+							parallel_address);
+			SemiJoinFilter *sjf;
+			MemoryContext oldContext;
+
+			sjf_num = parallelState->sjf_num;
+			oldContext = MemoryContextSwitchTo(GetMemoryChunkContext(node));
+			sjf = (SemiJoinFilter *) palloc0(sizeof(SemiJoinFilter));
+			sjf->filter = bloom_create(parallelState->num_elements,
+									   parallelState->work_mem,
+									   parallelState->seed);
+			sjf->buildingId = parallelState->buildingId;
+			sjf->checkingId = parallelState->checkingId;
+			sjf->seed = parallelState->seed;
+			sjf->isParallel = true;
+			sjf->isWorker = true;
+			sjf->doneBuilding = parallelState->doneBuilding;
+			sjf->parallelState = parallel_address;
+			node->applySemiJoinFilter = true;
+			sjf->buildingAttr = parallelState->buildingAttr;
+			sjf->checkingAttr = parallelState->checkingAttr;
+			node->semiJoinFilters =
+				lappend(node->semiJoinFilters, (void *) sjf);
+			sjf->mergejoin_plan_id = parallelState->mergejoin_plan_id;
+			/* copy over bloom filter if already built */
+			if (sjf->doneBuilding &&
+				parallelState->checkingId ==
+				node->ss.ss_currentRelation->rd_id)
+			{
+				SemiJoinFilterParallelState *parallelState =
+				(SemiJoinFilterParallelState *)
+				dsa_get_address(node->ss.ps.state->es_query_dsa,
+								sjf->parallelState);
+				bloom_filter *shared_bloom = (bloom_filter *) dsa_get_address(
+																			  node->ss.ps.state->es_query_dsa,
+																			  parallelState->bloom_dsa_address);
+
+				replace_bitset(sjf->filter, shared_bloom);
+			}
+			else if (!sjf->doneBuilding &&
+					 parallelState->buildingId ==
+					 node->ss.ss_currentRelation->rd_id)
+			{
+				/*
+				 * Add this process to number of scan processes, need to use
+				 * lock in case of multiple workers updating at same time. We
+				 * want to avoid using the planned number of workers because
+				 * that can be wrong.
+				 */
+				LWLockAcquire(&parallelState->lock, LW_EXCLUSIVE);
+				parallelState->numProcesses += 1;
+				LWLockRelease(&parallelState->lock);
+			}
+			MemoryContextSwitchTo(oldContext);
+		}
+	}
 }
diff --git a/src/backend/lib/bloomfilter.c b/src/backend/lib/bloomfilter.c
index 3ef67d35acb..0a05ada9b6c 100644
--- a/src/backend/lib/bloomfilter.c
+++ b/src/backend/lib/bloomfilter.c
@@ -128,6 +128,56 @@ bloom_free(bloom_filter *filter)
 	pfree(filter);
 }
 
+/*
+ * Create Bloom filter in dsa shared memory
+ */
+dsa_pointer
+bloom_create_in_dsa(dsa_area *area, int64 total_elems, int bloom_work_mem, uint64 seed)
+{
+	dsa_pointer filter_dsa_address;
+	bloom_filter *filter;
+	int			bloom_power;
+	uint64		bitset_bytes;
+	uint64		bitset_bits;
+
+	/*
+	 * Aim for two bytes per element; this is sufficient to get a false
+	 * positive rate below 1%, independent of the size of the bitset or total
+	 * number of elements.  Also, if rounding down the size of the bitset to
+	 * the next lowest power of two turns out to be a significant drop, the
+	 * false positive rate still won't exceed 2% in almost all cases.
+	 */
+	bitset_bytes = Min(bloom_work_mem * UINT64CONST(1024), total_elems * 2);
+	bitset_bytes = Max(1024 * 1024, bitset_bytes);
+
+	/*
+	 * Size in bits should be the highest power of two <= target.  bitset_bits
+	 * is uint64 because PG_UINT32_MAX is 2^32 - 1, not 2^32
+	 */
+	bloom_power = my_bloom_power(bitset_bytes * BITS_PER_BYTE);
+	bitset_bits = UINT64CONST(1) << bloom_power;
+	bitset_bytes = bitset_bits / BITS_PER_BYTE;
+
+	/* Allocate bloom filter with unset bitset */
+	filter_dsa_address = dsa_allocate0(area, offsetof(bloom_filter, bitset) +
+									   sizeof(unsigned char) * bitset_bytes);
+	filter = (bloom_filter *) dsa_get_address(area, filter_dsa_address);
+	filter->k_hash_funcs = optimal_k(bitset_bits, total_elems);
+	filter->seed = seed;
+	filter->m = bitset_bits;
+
+	return filter_dsa_address;
+}
+
+/*
+ * Free Bloom filter in dsa shared memory
+ */
+void
+bloom_free_in_dsa(dsa_area *area, dsa_pointer filter_dsa_address)
+{
+	dsa_free(area, filter_dsa_address);
+}
+
 /*
  * Add element to Bloom filter
  */
@@ -292,3 +342,27 @@ mod_m(uint32 val, uint64 m)
 
 	return val & (m - 1);
 }
+
+/*
+ * Add secondary filter to main filter, essentially "combining" the two filters together.
+ * This happens in-place, with main_filter being the combined filter.
+ * Both filters must have the same seed and size for this to work.
+ */
+void
+add_to_filter(bloom_filter *main_filter, bloom_filter *to_add)
+{
+	Assert(main_filter->seed == to_add->seed);
+	Assert(main_filter->m == to_add->m);
+	/* m is in bits not bytes */
+	for (int i = 0; i < main_filter->m / BITS_PER_BYTE; i++)
+	{
+		main_filter->bitset[i] = main_filter->bitset[i] | to_add->bitset[i];
+	}
+}
+
+void
+replace_bitset(bloom_filter *main_filter, bloom_filter *overriding_filter)
+{
+	Assert(main_filter->m == overriding_filter->m);
+	memcpy(&main_filter->bitset, &overriding_filter->bitset, main_filter->m / BITS_PER_BYTE);
+}
diff --git a/src/include/executor/nodeMergejoin.h b/src/include/executor/nodeMergejoin.h
index c311c7ed80e..d4cc4393153 100644
--- a/src/include/executor/nodeMergejoin.h
+++ b/src/include/executor/nodeMergejoin.h
@@ -23,6 +23,8 @@ extern void FreeSemiJoinFilter(SemiJoinFilter * sjf);
 extern int	PushDownDirection(PlanState *node);
 extern void PushDownFilter(PlanState *node, Plan *plan, SemiJoinFilter * sjf, Oid *relId, int64 *nodeRows);
 extern bool SemiJoinFilterExamineSlot(List *semiJoinFilters, TupleTableSlot *slot, Oid tableId);
-extern void SemiJoinFilterFinishScan(List *semiJoinFilters, Oid tableId);
+extern void SemiJoinFilterFinishScan(List *semiJoinFilters, Oid tableId, dsa_area *parallel_area);
+extern dsa_pointer CreateFilterParallelState(dsa_area *area, SemiJoinFilter * sjf, int sjf_num);
+extern SemiJoinFilter * GetSemiJoinFilter(PlanState *node, int plan_id);
 
 #endif							/* NODEMERGEJOIN_H */
diff --git a/src/include/lib/bloomfilter.h b/src/include/lib/bloomfilter.h
index 8146d8e7fdc..3b5d1821a5d 100644
--- a/src/include/lib/bloomfilter.h
+++ b/src/include/lib/bloomfilter.h
@@ -13,15 +13,22 @@
 #ifndef BLOOMFILTER_H
 #define BLOOMFILTER_H
 
+#include "utils/dsa.h"
+
 typedef struct bloom_filter bloom_filter;
 
 extern bloom_filter *bloom_create(int64 total_elems, int bloom_work_mem,
 								  uint64 seed);
 extern void bloom_free(bloom_filter *filter);
+extern dsa_pointer bloom_create_in_dsa(dsa_area *area, int64 total_elems,
+									   int bloom_work_mem, uint64 seed);
+extern void bloom_free_in_dsa(dsa_area *area, dsa_pointer filter_dsa_address);
 extern void bloom_add_element(bloom_filter *filter, unsigned char *elem,
 							  size_t len);
 extern bool bloom_lacks_element(bloom_filter *filter, unsigned char *elem,
 								size_t len);
 extern double bloom_prop_bits_set(bloom_filter *filter);
+extern void add_to_filter(bloom_filter *main_filter, bloom_filter *to_add);
+extern void replace_bitset(bloom_filter *main_filter, bloom_filter *overriding_filter);
 
 #endif							/* BLOOMFILTER_H */
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 69644627204..6ca6de437b5 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -40,6 +40,8 @@
 #include "nodes/tidbitmap.h"
 #include "partitioning/partdefs.h"
 #include "storage/condition_variable.h"
+#include "storage/lwlock.h"
+#include "storage/shm_toc.h"
 #include "utils/hsearch.h"
 #include "utils/queryenvironment.h"
 #include "utils/reltrigger.h"
@@ -2026,6 +2028,10 @@ typedef struct SemiJoinFilter
 	Oid			checkingId;
 	int			checkingAttr;
 	bool		doneBuilding;
+	/* Parallel information */
+	bool		isParallel;
+	bool		isWorker;
+	dsa_pointer parallelState;
 	/* metadata */
 	uint64		seed;
 	int64		num_elements;
@@ -2036,6 +2042,31 @@ typedef struct SemiJoinFilter
 	int			mergejoin_plan_id;
 }			SemiJoinFilter;
 
+typedef struct SemiJoinFilterParallelState
+{
+	/* bloom filter information */
+	uint64		seed;
+	int64		num_elements;
+	int			work_mem;
+	dsa_pointer bloom_dsa_address;
+	/* information to copy over to worker processes */
+	int			numAttr;
+	Oid			buildingId;
+	Oid			checkingId;
+	int			buildingAttr;
+	int			checkingAttr;
+	int			elementsAdded;
+	/* information for parallelization and locking */
+	bool		doneBuilding;
+	int			workersDone;
+	int			numProcesses;
+	uint64		lockStop;
+	LWLock		lock;
+	LWLock		secondlock;
+	int			sjf_num;
+	int			mergejoin_plan_id;
+}			SemiJoinFilterParallelState;
+
 typedef struct MergeJoinState
 {
 	JoinState	js;				/* its first field is NodeTag */
-- 
2.37.3

0006-Integrate-EXPLAIN-command-with-semijoin-fil-20221003.patchtext/x-patch; charset=UTF-8; name=0006-Integrate-EXPLAIN-command-with-semijoin-fil-20221003.patchDownload

From 914032be56e6d6601e9638770a5bb926edb240e2 Mon Sep 17 00:00:00 2001
From: Lyu Pan <lyup@amazon.com>
Date: Fri, 16 Sep 2022 21:40:59 +0000
Subject: [PATCH 6/7] Integrate EXPLAIN command with semijoin filter.

When explaining Merge Join node, if semijoin filter is used in the Merge Join node, related metadata will be displyed, including filter clause, estimated filtering rate and actual filtering rate if EXPLAIN ANALYZE is used.

For example:
Merge Join  (...)
   Merge Cond: (...)
   SemiJoin Filter Created Based on: (...)
   SemiJoin Estimated Filtering Rate: XXX
   SemiJoin Actual Filtering Rate: XXX
---
 src/backend/commands/explain.c | 40 +++++++++++++++++++++++++++++++++-
 1 file changed, 39 insertions(+), 1 deletion(-)

diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index f86983c6601..438792e31ab 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -153,7 +153,8 @@ static void ExplainIndentText(ExplainState *es);
 static void ExplainJSONLineEnding(ExplainState *es);
 static void ExplainYAMLLineStarting(ExplainState *es);
 static void escape_yaml(StringInfo buf, const char *str);
-
+static void show_semijoin_metadata(List *equijoins, PlanState *planstate,
+								   List *ancestors, ExplainState *es);
 
 
 /*
@@ -1981,6 +1982,11 @@ ExplainNode(PlanState *planstate, List *ancestors,
 							"Merge Cond", planstate, ancestors, es);
 			show_upper_qual(((MergeJoin *) plan)->join.joinqual,
 							"Join Filter", planstate, ancestors, es);
+			if (((MergeJoinState *) planstate)->sjf)
+			{
+				show_semijoin_metadata(((MergeJoin *) plan)->mergeclauses,
+									   planstate, ancestors, es);
+			}
 			if (((MergeJoin *) plan)->join.joinqual)
 				show_instrumentation_count("Rows Removed by Join Filter", 1,
 										   planstate, es);
@@ -5041,3 +5047,35 @@ escape_yaml(StringInfo buf, const char *str)
 {
 	escape_json(buf, str);
 }
+
+static void
+show_semijoin_metadata(List *equijoins, PlanState *planstate,
+					   List *ancestors, ExplainState *es)
+{
+	char		createStr[256];
+	int			clause_ordinal;
+	Node	   *best_equijoin_clause;
+	MergeJoin  *mj = ((MergeJoin *) planstate->plan);
+
+	Assert(planstate);
+	Assert(nodeTag(planstate) == T_MergeJoinState);
+	Assert(planstate->plan);
+	Assert(nodeTag(planstate->plan) == T_MergeJoin);
+
+	snprintf(createStr, sizeof(createStr), "%s",
+			 "SemiJoin Filter Created Based on");
+	clause_ordinal = mj->bestExpr;
+	best_equijoin_clause =
+		(Node *) list_nth_node(OpExpr, equijoins, clause_ordinal);
+	show_expression(best_equijoin_clause, createStr, planstate, ancestors,
+					true, es);
+	ExplainPropertyFloat("SemiJoin Estimated Filtering Rate", NULL,
+						 mj->filteringRate, 4, es);
+	if (es->analyze)
+	{
+		SemiJoinFilter *sjf = ((MergeJoinState *) planstate)->sjf;
+
+		ExplainPropertyFloat("SemiJoin Actual Filtering Rate", NULL,
+							 ((double) sjf->elementsFiltered) / sjf->elementsChecked, 4, es);
+	}
+}
-- 
2.37.3

run.shapplication/x-shellscript; name=run.shDownload

script.templatetext/plain; charset=UTF-8; name=script.templateDownload

Lyu Pan

lyu.steve.pan@gmail.com

over 3 years ago

In reply to: Tomas Vondra (#7)

5 attachment(s)

Re: Bloom filter Pushdown Optimization for Merge Join

Hello Zhihong Yu & Tomas Vondra,

Thank you so much for your review and feedback!

We made some updates based on previous feedback and attached the new
patch set. Due to time constraints, we didn't get to resolve all the
comments, and we'll continue to improve this patch.

In this prototype, the cost model is based on an assumption that there is
a linear relationship between the performance gain from using a semijoin
filter and the estimated filtering rate:
% improvement to Merge Join cost = 0.83 * estimated filtering rate - 0.137.

How were the coefficients (0.83 and 0.137) determined ?
I guess they were based on the results of running certain workload.

Right, the coefficients (0.83 and 0.137) determined are based on some
preliminary testings. The current costing model is pretty naive and
we'll work on a more robust costing model in future work.

I agree, in principle, although I think the current logic / formula is a
bit too crude and fitted to the simple data used in the test. I think
this needs to be formulated as a regular costing issue, considering
stuff like cost of the hash functions, and so on.

I think this needs to do two things:

1) estimate the cost of building the bloom filter - This shall depend on
the number of rows in the inner relation, number/cost of the hash
functions (which may be higher for some data types), etc.

2) estimate improvement for the probing branch - Essentially, we need to
estimate how much we save by filtering some of the rows, but this also
neeeds to include the cost of probing the bloom filter.

This will probably require some improvements to the lib/bloomfilter, in
order to estimate the false positive rate - this may matter a lot for
large data sets and small work_mem values. The bloomfilter library
simply reduces the size of the bloom filter, which increases the false
positive rate. At some point it'll start reducing the benefit.

These suggestions make a lot of sense. The current costing model is
definitely not good enough, and we plan to work on a more robust
costing model as we continue to improve the patch.

OK. Could also build the bloom filter in shared memory?

We thought about this approach but didn't prefer this one because if
all worker processes share the same bloom filter in shared memory, we
need to frequently lock and unlock the bloom filter to avoid race
conditions. So we decided to have each worker process create its own
bloom filter.

IMHO we shouldn't make too many conclusions from these examples. Yes, it
shows merge join can be improved, but for cases where a hashjoin works
better so we wouldn't use merge join anyway.

I think we should try constructing examples where either merge join wins
already (and gets further improved by the bloom filter), or would lose
to hash join and the bloom filter improves it enough to win.

AFAICS that requires a join of two large tables - large enough that hash
join would need to be batched, or pre-sorted inputs (which eliminates
the explicit Sort, which is the main cost in most cases).

The current patch only works with sequential scans, which eliminates the
second (pre-sorted) option. So let's try the first one - can we invent
an example with a join of two large tables where a merge join would win?

Can we find such example in existing benchmarks like TPC-H/TPC-DS.

Agreed. The current examples are only intended to show us that using
bloom filters in merge join could improve the merge join performance
in some cases. We are working on testing more examples that merge join
with bloom filter could out-perform hash join, which should be more
persuasive.

The bloom filter is built by the first seqscan (on t0), and then used by
the second seqscan (on t1). But this only works because we always run
the t0 scan to completion (because we're feeding it into Sort) before we
start scanning t1.

But when the scan on t1 switches to an index scan, it's over - we'd be
building the filter without being able to probe it, and when we finish
building it we no longer need it. So this seems pretty futile.

It might still improve plans like

-> Merge Join
Merge Cond: (t0.c1 = t1.c1)
SemiJoin Filter Created Based on: (t0.c1 = t1.c1)
SemiJoin Estimated Filtering Rate: 1.0000
-> Sort
Sort Key: t0.c1
-> Seq Scan on t0
-> Index Scan on t1

But I don't know how common/likely that actually is. I'd expect to have
an index on both sides, but perhaps I'm wrong.

This is why hashjoin seems like a more natural fit for the bloom filter,
BTW, because there we have a guarantee the inner relation is processed
first (so we know the bloom filter is fine and can be probed).

Great observation. The bloom filter only works if the first SeqScan
always runs to completion before the second SeqScan starts.
I guess one possible way to avoid futile bloom filter might be
enforcing that the bloom filter only be used if both the outer/inner
plans of the MergeJoin are Sort nodes, to guarantee the bloom filter
is ready to use after processing one side of the join, but this may be
too restrictive.

I don't know what improvements you have in mind exactly, but I think
it'd be good to show which node is building/using a bloom filter, and
then also some basic stats (size, number of hash functions, FPR, number
of probes, ...). This may require improvements to lib/bloomfilter, which
currently does not expose some of the details.

Along with the new patch set, we have added information to display
which node is building/using a bloom filter (as well as the
corresponding expressions), and some bloom filter basic stats. We'll
add more related information (e.g. FPR) as we modify lib/bloomfilter
implementation in future work.

Thanks again for the great reviews and they are really useful! More
feedback is always welcome and appreciated!

Regards,
Lyu Pan
Amazon RDS/Aurora for PostgreSQL

Show quoted text

On Mon, 3 Oct 2022 at 09:14, Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:

Hello Zheng Li,

Great to see someone is working on this! Some initial comments/review:

On 10/1/22 00:44, Zheng Li wrote:

Hello,

A bloom filter provides early filtering of rows that cannot be joined
before they would reach the join operator, the optimization is also
called a semi join filter (SJF) pushdown. Such a filter can be created
when one child of the join operator must materialize its derived table
before the other child is evaluated.

For example, a bloom filter can be created using the the join keys for
the build side/inner side of a hash join or the outer side of a merge
join, the bloom filter can then be used to pre-filter rows on the
other side of the join operator during the scan of the base relation.
The thread about “Hash Joins vs. Bloom Filters / take 2” [1] is good
discussion on using such optimization for hash join without going into
the pushdown of the filter where its performance gain could be further
increased.

Agreed. That patch was beneficial for hashjoins with batching, but I
think the pushdown makes this much more interesting.

We worked on prototyping bloom filter pushdown for both hash join and
merge join. Attached is a patch set for bloom filter pushdown for
merge join. We also plan to send the patch for hash join once we have
it rebased.

Here is a summary of the patch set:
1. Bloom Filter Pushdown optimizes Merge Join by filtering rows early
during the table scan instead of later on.
-The bloom filter is pushed down along the execution tree to
the target SeqScan nodes.
-Experiments show that this optimization can speed up Merge
Join by up to 36%.

Right, although I think the speedup very much depends on the data sets
used for the tests, and can be made arbitrarily large with "appropriate"
data set.

2. The planner makes the decision to use the bloom filter based on the
estimated filtering rate and the expected performance gain.
-The planner accomplishes this by estimating four numbers per
variable - the total number of rows of the relation, the number of
distinct values for a given variable, and the minimum and maximum
value of the variable (when applicable). Using these numbers, the
planner estimates a filtering rate of a potential filter.
-Because actually creating and implementing the filter adds
more operations, there is a minimum threshold of filtering where the
filter would actually be useful. Based on testing, we query to see if
the estimated filtering rate is higher than 35%, and that informs our
decision to use a filter or not.

I agree, in principle, although I think the current logic / formula is a
bit too crude and fitted to the simple data used in the test. I think
this needs to be formulated as a regular costing issue, considering
stuff like cost of the hash functions, and so on.

I think this needs to do two things:

1) estimate the cost of building the bloom filter - This shall depend on
the number of rows in the inner relation, number/cost of the hash
functions (which may be higher for some data types), etc.

2) estimate improvement for the probing branch - Essentially, we need to
estimate how much we save by filtering some of the rows, but this also
neeeds to include the cost of probing the bloom filter.

This will probably require some improvements to the lib/bloomfilter, in
order to estimate the false positive rate - this may matter a lot for
large data sets and small work_mem values. The bloomfilter library
simply reduces the size of the bloom filter, which increases the false
positive rate. At some point it'll start reducing the benefit.

3. If using a bloom filter, the planner also adjusts the expected cost
of Merge Join based on expected performance gain.

I think this is going to be a weak point of the costing, because we're
adjusting the cost of the whole subtree after it was costed.

We're doing something similar when costing LIMIT, and that can already
causes a lot of strange stuff with non-uniform data distributions, etc.

And in this case it's probably worse, because we're eliminating rows at
the scan level, without changing the cost of any of the intermediate
nodes. It's certainly going to be confusing in EXPLAIN, because of the
discrepancy between estimated and actual row counts ...

4. Capability to build the bloom filter in parallel in case of
parallel SeqScan. This is done efficiently by populating a local bloom
filter for each parallel worker and then taking a bitwise OR over all
the local bloom filters to form a shared bloom filter at the end of
the parallel SeqScan.

OK. Could also build the bloom filter in shared memory?

5. The optimization is GUC controlled, with settings of
enable_mergejoin_semijoin_filter and force_mergejoin_semijoin_filter.

We found in experiments that there is a significant improvement
when using the bloom filter during Merge Join. One experiment involved
joining two large tables while varying the theoretical filtering rate
(TFR) between the two tables, the TFR is defined as the percentage
that the two datasets are disjoint. Both tables in the merge join were
the same size. We tested changing the TFR to see the change in
filtering optimization.

For example, let’s imagine t0 has 10 million rows, which contain the
numbers 1 through 10 million randomly shuffled. Also, t1 has the
numbers 4 million through 14 million randomly shuffled. Then the TFR
for a join of these two tables is 40%, since 40% of the tables are
disjoint from the other table (1 through 4 million for t0, 10 million
through 14 million for t4).

Here is the performance test result joining two tables:
TFR: theoretical filtering rate
EFR: estimated filtering rate
AFR: actual filtering rate
HJ: hash join
MJ Default: default merge join
MJ Filter: merge join with bloom filter optimization enabled
MJ Filter Forced: merge join with bloom filter optimization forced

TFR EFR AFR HJ MJ Default MJ Filter MJ Filter Forced
-------------------------------------------------------------------------------------
10 33.46 7.41 6529 22638 21949 23160
20 37.27 14.85 6483 22290 21928 21930
30 41.32 22.25 6395 22374 20718 20794
40 45.67 29.7 6272 21969 19449 19410
50 50.41 37.1 6210 21412 18222 18224
60 55.64 44.51 6052 21108 17060 17018
70 61.59 51.98 5947 21020 15682 15737
80 68.64 59.36 5761 20812 14411 14437
90 77.83 66.86 5701 20585 13171 13200
Table. Execution Time (ms) vs Filtering Rate (%) for Joining Two
Tables of 10M Rows.

Attached you can find figures of the same performance test and a SQL script
to reproduce the performance test.

The first thing to notice is that Hash Join generally is the most
efficient join strategy. This is because Hash Join is better at
dealing with small tables, and our size of 10 million is still small
enough where Hash Join outperforms the other join strategies. Future
experiments can investigate using much larger tables.

However, comparing just within the different Merge Join variants, we
see that using the bloom filter greatly improves performance.
Intuitively, all of these execution times follow linear paths.
Comparing forced filtering versus default, we can see that the default
Merge Join outperforms Merge Join with filtering at low filter rates,
but after about 20% TFR, the Merge Join with filtering outperforms
default Merge Join. This makes intuitive sense, as there are some
fixed costs associated with building and checking with the bloom
filter. In the worst case, at only 10% TFR, the bloom filter makes
Merge Join less than 5% slower. However, in the best case, at 90% TFR,
the bloom filter improves Merge Join by 36%.

Based on the results of the above experiments, we came up with a
linear equation for the performance ratio for using the filter
pushdown from the actual filtering rate. Based on the numbers
presented in the figure, this is the equation:

T_filter / T_no_filter = 1 / (0.83 * estimated filtering rate + 0.863)

For example, this means that with an estimated filtering rate of 0.4,
the execution time of merge join is estimated to be improved by 16.3%.
Note that the estimated filtering rate is used in the equation, not
the theoretical filtering rate or the actual filtering rate because it
is what we have during planning. In practice the estimated filtering
rate isn’t usually accurate. In fact, the estimated filtering rate can
differ from the theoretical filtering rate by as much as 17% in our
experiments. One way to mitigate the power loss of bloom filter caused
by inaccurate estimated filtering rate is to adaptively turn it off at
execution time, this is yet to be implemented.

IMHO we shouldn't make too many conclusions from these examples. Yes, it
shows merge join can be improved, but for cases where a hashjoin works
better so we wouldn't use merge join anyway.

I think we should try constructing examples where either merge join wins
already (and gets further improved by the bloom filter), or would lose
to hash join and the bloom filter improves it enough to win.

AFAICS that requires a join of two large tables - large enough that hash
join would need to be batched, or pre-sorted inputs (which eliminates
the explicit Sort, which is the main cost in most cases).

The current patch only works with sequential scans, which eliminates the
second (pre-sorted) option. So let's try the first one - can we invent
an example with a join of two large tables where a merge join would win?

Can we find such example in existing benchmarks like TPC-H/TPC-DS.

Here is a list of tasks we plan to work on in order to improve this patch:
1. More regression testing to guarantee correctness.
2. More performance testing involving larger tables and complicated query plans.
3. Improve the cost model.

+1

4. Explore runtime tuning such as making the bloom filter checking adaptive.

I think this is tricky, I'd leave it out from the patch for now until
the other bits are polished. It can be added later.

5. Currently, only the best single join key is used for building the
Bloom filter. However, if there are several keys and we know that
their distributions are somewhat disjoint, we could leverage this fact
and use multiple keys for the bloom filter.

True, and I guess it wouldn't be hard.

6. Currently, Bloom filter pushdown is only implemented for SeqScan
nodes. However, it would be possible to allow push down to other types
of scan nodes.

I think pushing down the bloom filter to other types of scans is not the
hard part, really. It's populating the bloom filter early enough.

Invariably, all the examples end up with plans like this:

-> Merge Join
Merge Cond: (t0.c1 = t1.c1)
SemiJoin Filter Created Based on: (t0.c1 = t1.c1)
SemiJoin Estimated Filtering Rate: 1.0000
-> Sort
Sort Key: t0.c1
-> Seq Scan on t0
-> Materialize
-> Sort
Sort Key: t1.c1
-> Seq Scan on t1

The bloom filter is built by the first seqscan (on t0), and then used by
the second seqscan (on t1). But this only works because we always run
the t0 scan to completion (because we're feeding it into Sort) before we
start scanning t1.

But when the scan on t1 switches to an index scan, it's over - we'd be
building the filter without being able to probe it, and when we finish
building it we no longer need it. So this seems pretty futile.

It might still improve plans like

-> Merge Join
Merge Cond: (t0.c1 = t1.c1)
SemiJoin Filter Created Based on: (t0.c1 = t1.c1)
SemiJoin Estimated Filtering Rate: 1.0000
-> Sort
Sort Key: t0.c1
-> Seq Scan on t0
-> Index Scan on t1

But I don't know how common/likely that actually is. I'd expect to have
an index on both sides, but perhaps I'm wrong.

This is why hashjoin seems like a more natural fit for the bloom filter,
BTW, because there we have a guarantee the inner relation is processed
first (so we know the bloom filter is fine and can be probed).

7. Explore if the Bloom filter could be pushed down through a foreign
scan when the foreign server is capable of handling it – which could
be made true for postgres_fdw.

Neat idea, but I suggest to leave this out of scope of this patch.

8. Better explain command on the usage of bloom filters.

I don't know what improvements you have in mind exactly, but I think
it'd be good to show which node is building/using a bloom filter, and
then also some basic stats (size, number of hash functions, FPR, number
of probes, ...). This may require improvements to lib/bloomfilter, which
currently does not expose some of the details.

This patch set is prepared by Marcus Ma, Lyu Pan and myself. Feedback
is appreciated.

Attached is a patch series with two "review" parts (0002 and 0004). I
already mentioned some of the stuff above, but a couple more points:

1) Don't allocate memory directly through alloca() etc. Use palloc, i.e.
rely on our memory context.

2) It's customary to have "PlannerInfo *root" as the first parameter.

3) For the "debug" logging, I'd suggest to do it the way TRACE_SORT
(instead of inventing a bunch of dbg routines).

4) I find the naming inconsistent, e.g. with respect to the surrounding
code (say, when everything around starts with Exec, maybe the new
functions should too?). Also, various functions/variables say "semijoin"
but then we apply that to "inner joins" too.

5) Do we really need estimate_distincts_remaining() to implement yet
another formula for estimating number of distinct groups, different from
estimate_num_groups() does? Why?

6) A number of new functions miss comments explaining the purpose, and
it's not quite clear what the "contract" is. Also, some functions have
new parameters but the comment was not updated to reflect it.

7) SemiJoinFilterExamineSlot is matching the relations by OID, but
that's wrong - if you do a self-join, both sides have the same OID. It
needs to match RT index (I believe scanrelid in Scan node is what this
should be looking at).

There's a couple more review comments in the patches, but those are
minor and not worth discussing here - feel free to ask, if anything is
not clear enough (or if you disagree).

I did a bunch of testing, after tweaking your SQL script.

I changed the data generation a bit not to be so slow (instead of
relying on unnest of multiple large sets, I use one sequence and random
to generate data). And I run the tests with different parameters (step,
work_mem, ...) driven by the attached shell script.

And it quickly fails (on assert-enabled-build). I see two backtraces:

1) bogus overlapping estimate (ratio > 1.0)
...
#4 0x0000000000c9d56b in ExceptionalCondition (conditionName=0xe43724
"inner_overlapping_ratio >= 0 && inner_overlapping_ratio <= 1",
errorType=0xd33069 "FailedAssertion", fileName=0xe42bdb "costsize.c",
lineNumber=7442) at assert.c:69
#5 0x00000000008ed767 in evaluate_semijoin_filtering_rate
(join_path=0x2fe79f0, equijoin_list=0x2fea6c0, root=0x2fe6b68,
workspace=0x7ffd87588a78, best_clause=0x7ffd875888cc,
rows_filtered=0x7ffd875888c8) at costsize.c:7442

Seems it's doing the math wrong, or does not expect some corner case.

2) stuck spinlock in SemiJoinFilterFinishScan
...
#5 0x0000000000a85cb0 in s_lock_stuck (file=0xe68c8c "lwlock.c",
line=907, func=0xe690a1 "LWLockWaitListLock") at s_lock.c:83
#6 0x0000000000a85a8d in perform_spin_delay (status=0x7ffd8758b8e8) at
s_lock.c:134
#7 0x0000000000a771c3 in LWLockWaitListLock (lock=0x7e40a597c060) at
lwlock.c:911
#8 0x0000000000a76e93 in LWLockConflictsWithVar (lock=0x7e40a597c060,
valptr=0x7e40a597c048, oldval=1, newval=0x7e40a597c048,
result=0x7ffd8758b983) at lwlock.c:1580
#9 0x0000000000a76ce9 in LWLockWaitForVar (lock=0x7e40a597c060,
valptr=0x7e40a597c048, oldval=1, newval=0x7e40a597c048) at lwlock.c:1638
#10 0x000000000080aa55 in SemiJoinFilterFinishScan
(semiJoinFilters=0x2e349b0, tableId=1253696, parallel_area=0x2e17388) at
nodeMergejoin.c:2035

This only happens in parallel plans, I haven't looked at the details.

I do recall parallel hash join was quite tricky exactly because there
are issues with coordinating building the hash table (e.g. workers might
get stuck due to waiting on shmem queues etc.), I wonder if this might
be something similar due to building the filter. But maybe it's
something trivial.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachments:

v1-0001-Support-semijoin-filter-in-the-planner-optimizer.patchapplication/octet-stream; name=v1-0001-Support-semijoin-filter-in-the-planner-optimizer.patchDownload

From 89369492e2715d9d08f2fe5ac0d75f4d2cec6b8c Mon Sep 17 00:00:00 2001
From: Lyu Pan <lyu.steve.pan@gmail.com>
Date: Wed, 12 Oct 2022 05:00:25 +0000
Subject: [PATCH 1/5] Support semijoin filter in the planner/optimizer.

1. Introduces two GUCs enable_mergejoin_semijoin_filter and force_mergejoin_semijoin_filter. enable_mergejoin_semijoin_filter enables the use of bloom filter during a merge-join, if such a filter is available, the planner will adjust the merge-join cost and use it only if it meets certain threshold (estimated filtering rate has to be higher than a certain value);
force_mergejoin_semijoin_filter forces the use of bloom filter during a merge-join if a valid filter is available.

2. In this prototype, only a single join clause where both sides maps to a base column will be considered as the key to build the bloom filter. For example:
bloom filter may be used in this query:
SELECT * FROM a JOIN b ON a.col1 = b.col1;
bloom filter will not be used in the following query (the left hand side of the join clause is an expression):
SELECT * FROM a JOIN b ON a.col1 + a.col2 = b.col1;

3. In this prototype, the cost model is based on an assumption that there is a linear relationship between the performance gain from using a semijoin filter and the estimated filtering rate:
% improvement to Merge Join cost = 0.83 * estimated filtering rate - 0.137. Future work will improve the cost model.
---
 src/backend/optimizer/path/costsize.c         | 1211 ++++++++++++++++-
 src/backend/optimizer/plan/createplan.c       |  617 ++++++++-
 src/backend/utils/adt/selfuncs.c              |   28 +-
 src/backend/utils/misc/guc_tables.c           |   20 +
 src/backend/utils/misc/postgresql.conf.sample |    2 +
 src/include/nodes/pathnodes.h                 |   45 +
 src/include/nodes/plannodes.h                 |    8 +-
 src/include/optimizer/cost.h                  |   46 +
 src/include/utils/selfuncs.h                  |    4 +-
 src/test/regress/expected/sysviews.out        |   47 +-
 10 files changed, 1988 insertions(+), 40 deletions(-)

diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index 4c6b1d1f55..9b9ab4dbe9 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -95,6 +95,7 @@
 #include "optimizer/restrictinfo.h"
 #include "parser/parsetree.h"
 #include "utils/lsyscache.h"
+#include "utils/guc.h"
 #include "utils/selfuncs.h"
 #include "utils/spccache.h"
 #include "utils/tuplesort.h"
@@ -192,6 +193,27 @@ static double relation_byte_size(double tuples, int width);
 static double page_size(double tuples, int width);
 static double get_parallel_divisor(Path *path);
 
+/*
+ *  Local functions and option variables to support
+ *  semijoin pushdowns from join nodes
+ */
+static double evaluate_semijoin_filtering_rate(PlannerInfo *root,
+											   JoinPath *join_path,
+											   List *hash_equijoins,
+											   JoinCostWorkspace *workspace,
+											   int *best_clause,
+											   int *rows_filtered);
+static bool verify_valid_pushdown(PlannerInfo *root, Path *p,
+								  Index pushdown_target_key_no);
+static TargetEntry *get_nth_targetentry(int posn,
+										List *targetlist);
+static bool is_fk_pk(PlannerInfo *root, Var *outer_var,
+					 Var *inner_var, Oid op_oid);
+static List *get_switched_clauses(List *clauses, Relids outerrelids);
+
+/* Global variables to store semijoin control options */
+bool		enable_mergejoin_semijoin_filter;
+bool		force_mergejoin_semijoin_filter;
 
 /*
  * clamp_row_est
@@ -3333,6 +3355,11 @@ initial_cost_mergejoin(PlannerInfo *root, JoinCostWorkspace *workspace,
 			innerstartsel = 0.0;
 			innerendsel = 1.0;
 		}
+
+		workspace->outer_min_val = cache->leftmin;
+		workspace->outer_max_val = cache->leftmax;
+		workspace->inner_min_val = cache->rightmin;
+		workspace->inner_max_val = cache->rightmax;
 	}
 	else
 	{
@@ -3494,6 +3521,10 @@ final_cost_mergejoin(PlannerInfo *root, MergePath *path,
 	double		mergejointuples,
 				rescannedtuples;
 	double		rescanratio;
+	List	   *mergeclauses_for_sjf;
+	double		filtering_rate;
+	int			best_filter_clause;
+	int			rows_filtered;
 
 	/* Protect some assumptions below that rowcounts aren't zero */
 	if (inner_path_rows <= 0)
@@ -3546,6 +3577,54 @@ final_cost_mergejoin(PlannerInfo *root, MergePath *path,
 	else
 		path->skip_mark_restore = false;
 
+	if (enable_mergejoin_semijoin_filter)
+	{
+		/*
+		 * Rearrange mergeclauses if needed, so that the outer variable is
+		 * always on the left;
+		 */
+		mergeclauses_for_sjf = get_switched_clauses(path->path_mergeclauses,
+													path->jpath.outerjoinpath->parent->relids);
+
+		/*
+		 * TODO: 1. Update the costing model; the current costing model is too
+		 * naive and doesn't fit in complex queries; 2. Update the hard-coded
+		 * numbers; these numbers only serves as a temporary solution; 3.
+		 * Adjust other properties of the mergejoin node (in addition to
+		 * cost), such as estimated rows etc; 4. Check the bloom filter size
+		 * against work_mem, false positive rate etc.
+		 */
+		filtering_rate = evaluate_semijoin_filtering_rate(root, (JoinPath *) path, mergeclauses_for_sjf,
+														  workspace, &best_filter_clause, &rows_filtered);
+
+		/* want at least 1000 rows_filtered to avoid any nasty edge cases */
+		if (force_mergejoin_semijoin_filter ||
+			(filtering_rate >= 0.35 && rows_filtered > 1000 && best_filter_clause >= 0))
+		{
+			double		improvement;
+			SemijoinFilterJoinData *result = makeNode(SemijoinFilterJoinData);
+
+			result->best_mergeclause_pos = best_filter_clause;
+			result->filtering_rate = filtering_rate;
+			path->sj_metadata = result;
+
+			/*
+			 * Based on experimental data, we have found that there is a
+			 * linear relationship between the estimated filtering rate and
+			 * improvement to the cost of Merge Join. In fact, this
+			 * improvement can be modeled by this equation: improvement = 0.83 *
+			 * filtering rate - 0.137 i.e., a filtering rate of 0.4 yields an
+			 * improvement of 19.5%. This equation also concludes thata a 17%
+			 * filtering rate is the break-even point, so we use 35% just be
+			 * conservative. We use this information to adjust the MergeJoin's
+			 * planned cost.
+			 */
+			improvement = 0.83 * filtering_rate - 0.137;
+			run_cost = (1 - improvement) * run_cost;
+			workspace->run_cost = run_cost;
+		}
+	}
+
 	/*
 	 * Get approx # tuples passing the mergequals.  We use approx_tuple_count
 	 * here because we need an estimate done with JOIN_INNER semantics.
@@ -3727,6 +3806,10 @@ cached_scansel(PlannerInfo *root, RestrictInfo *rinfo, PathKey *pathkey)
 				leftendsel,
 				rightstartsel,
 				rightendsel;
+	Datum		leftmin,
+				leftmax,
+				rightmin,
+				rightmax;
 	MemoryContext oldcontext;
 
 	/* Do we have this result already? */
@@ -3749,7 +3832,11 @@ cached_scansel(PlannerInfo *root, RestrictInfo *rinfo, PathKey *pathkey)
 					 &leftstartsel,
 					 &leftendsel,
 					 &rightstartsel,
-					 &rightendsel);
+					 &rightendsel,
+					 &leftmin,
+					 &leftmax,
+					 &rightmin,
+					 &rightmax);
 
 	/* Cache the result in suitably long-lived workspace */
 	oldcontext = MemoryContextSwitchTo(root->planner_cxt);
@@ -3763,6 +3850,10 @@ cached_scansel(PlannerInfo *root, RestrictInfo *rinfo, PathKey *pathkey)
 	cache->leftendsel = leftendsel;
 	cache->rightstartsel = rightstartsel;
 	cache->rightendsel = rightendsel;
+	cache->leftmin = leftmin;
+	cache->leftmax = leftmax;
+	cache->rightmin = rightmin;
+	cache->rightmax = rightmax;
 
 	rinfo->scansel_cache = lappend(rinfo->scansel_cache, cache);
 
@@ -6235,3 +6326,1121 @@ compute_bitmap_pages(PlannerInfo *root, RelOptInfo *baserel, Path *bitmapqual,
 
 	return pages_fetched;
 }
+
+/*
+ * Initialize SemiJoinFilterExprMetadata struct.
+ */
+static void
+init_semijoin_filter_expr_metadata(SemiJoinFilterExprMetadata * md)
+{
+	/* Should only be called by analyze_expr_for_metadata */
+	Assert(md);
+
+	md->is_or_maps_to_constant = false;
+	md->is_or_maps_to_base_column = false;
+	md->local_column_expr = NULL;
+	md->local_relation = NULL;
+	md->est_col_width = 0;
+	md->base_column_expr = NULL;
+	md->base_rel = NULL;
+	md->base_rel_root = NULL;
+	md->base_rel_row_count = 0.0;
+	md->base_rel_filt_row_count = 0.0;
+	md->base_col_distincts = -1.0;
+	md->est_distincts_reliable = false;
+	md->expr_est_distincts = -1.0;
+}
+
+/*
+	* Estimates the number of distinct values still present within a column
+	* after some local filtering has been applied to that table and thereby
+	* restricted the set of relevant rows.
+	*
+	* This method assumes that the original_distinct_count comes from a
+	* column whose values are uncorrelated with the row restricting
+	* condition(s) on this table.  Other mechanisms need to be added to more
+	* accurately handle the cases where the row restrincting condition is
+	* directly on the current column.
+	*
+	* The most probable number of distinct values remaining can be computed
+	* exactly using Yao's iterative expansion formula from: "Approximating
+	* block accesses in database organizations", S. B. Yao, CACM, V20, N4,
+	* April 1977, p. 260-261 However, this formula gets very expensive to
+	* compute whenever the number of distinct values is large.
+	*
+	* This function instead uses a non-iterative approximation of Yao's
+	* iterative formula from: "Estimating Block Accesses in Database
+	* Organizations: A Closed Noniterative Formula", Kyu-Young Whang, Gio
+	* Wiederhold, and Daniel Sagalowicz CACM V26, N11, November 1983, p.
+	* 945-947 This approximation starts with terms for the first three
+	* iterations of Yao's formula, and then inserts two adjustment factors
+	* into the third term which minimize the total error related to the
+	* missing subsequent terms.
+	*
+	* Internally this function uses M, N, P, and K as variables to match the
+	* notation used in the equation in the paper.
+	*/
+static double
+estimate_distincts_remaining(double original_table_row_count,
+							 double original_distinct_count,
+							 double est_row_count_after_predicates)
+{
+	double		n = original_table_row_count;
+	double		m = original_distinct_count;
+	double		k = est_row_count_after_predicates;
+	double		p = 0.0;		/* avg rows per distinct */
+	double		result;
+
+	/* The three partial probabality terms */
+	double		term_1 = 0.0;
+	double		term_2 = 0.0;
+	double		term_3 = 0.0;
+	double		sum_terms = 0.0;
+
+	/* In debug builds, validate the sanity of the inputs  */
+	Assert(isfinite(original_table_row_count));
+	Assert(isfinite(original_distinct_count));
+	Assert(isfinite(est_row_count_after_predicates));
+	Assert(original_table_row_count >= 0.0);
+	Assert(original_distinct_count >= 0.0);
+	Assert(est_row_count_after_predicates >= 0.0);
+	Assert(original_distinct_count <= original_table_row_count);
+	Assert(est_row_count_after_predicates <= original_table_row_count);
+
+	if (n > 0.0 && m > 0.0)
+	{
+		p = (n / m);
+	}
+	Assert(isfinite(p));
+
+	if (k > (n - p))
+	{							/* All distincts almost guaranteed to still be
+								 * present */
+		result = m;
+	}
+	else if (m < 0.000001)
+	{							/* When all values are NULL, avoid division by
+								 * zero */
+		result = 0.0;
+	}
+	else if (k <= 1.000001)
+	{							/* When only one or zero rows after filtering */
+		result = k;
+	}
+	else
+	{
+		/*
+		 * When this is not a special case, compute the partial probabilities.
+		 * However, if the probability calculation overflows, then revert to
+		 * the estimate we can get from the upper bound analysis.
+		 */
+		result = fmin(original_distinct_count, est_row_count_after_predicates);
+
+		if (isfinite(1.0 / m) && isfinite(pow((1.0 - (1.0 / m)), k)))
+		{
+			term_1 = (1.0 - pow((1.0 - (1.0 / m)), k));
+
+			if (isfinite(term_1))
+			{
+				/*
+				 * As long as we at least have a usable term_1, then proceed
+				 * to the much smaller term_2 and to the even smaller term_3.
+				 *
+				 * If no usable term_1, then just use the hard upper bounds.
+				 */
+				if (isfinite(m * m * p)
+					&& isfinite(pow((1.0 - (1.0 / m)), (k - 1.0))))
+				{
+					term_2 = ((1.0 / (m * m * p))
+							  * ((k * (k - 1.0)) / 2.0)
+							  * pow((1.0 - (1.0 / m)), (k - 1.0))
+						);
+				}
+				if (!isfinite(term_2))
+				{
+					term_2 = 0.0;
+				}
+				if (isfinite(pow(m, 3.0))
+					&& isfinite(pow(p, 4.0))
+					&& isfinite(pow(m, 3.0) * pow(p, 4.0))
+					&& isfinite(k * (k - 1.0) * ((2 * k) - 1.0))
+					&& isfinite(pow((1.0 - (1.0 / m)), (k - 1.0))))
+				{
+					term_3 = ((1.5 / (pow(m, 3.0) * pow(p, 4.0)))
+							  * ((k * (k - 1.0) * ((2 * k) - 1.0)) / 6.0)
+							  * pow((1.0 - (1.0 / m)), (k - 1.0)));
+				}
+				if (!isfinite(term_3))
+				{
+					term_3 = 0.0;
+				}
+				sum_terms = term_1 + term_2 + term_3;
+
+				/* In debug builds, validate the partial probability terms */
+				Assert(term_1 <= 1.0 && term_1 >= 0.0);
+				Assert(term_2 <= 1.0 && term_2 >= 0.0);
+				Assert(term_3 <= 1.0 && term_3 >= 0.0);
+				Assert(term_1 > term_2);
+				Assert(term_2 >= term_3);
+				Assert(isfinite(sum_terms));
+				Assert(sum_terms <= 1.0);
+
+				if (isfinite(m * sum_terms))
+				{
+					result = round(m * sum_terms);
+				}
+
+				/* Ensure hard upper bounds still satisfied  */
+				result = fmin(result,
+							  fmin(original_distinct_count,
+								   est_row_count_after_predicates));
+
+				/* Since empty tables were handled above, must be >= 1 */
+				result = fmax(result, 1.0);
+			}
+		}
+	}
+
+	Assert(result >= 0.0);
+	Assert(result <= original_distinct_count);
+	Assert(result <= est_row_count_after_predicates);
+
+	return result;
+}
+
+/*
+ * Gather metadata for the given column `base_col` in the given relation `base_rel`, and return the metadata using `md`.
+ */
+static void
+gather_base_column_metadata(PlannerInfo *root,
+							Var *base_col,
+							RelOptInfo *base_rel,
+							SemiJoinFilterExprMetadata * md)
+{
+	/*
+	 * Given a Var for a base column, gather metadata about that column Should
+	 * only be called indirectly under analyze_expr_for_metadata
+	 */
+	VariableStatData base_col_vardata;
+	Oid			base_col_reloid = InvalidOid;
+	bool		is_default;
+
+	Assert(md && base_rel && root);
+	Assert(base_col && IsA(base_col, Var));
+	Assert(base_rel->reloptkind == RELOPT_BASEREL ||
+		   base_rel->reloptkind == RELOPT_OTHER_MEMBER_REL);
+	Assert(base_rel->rtekind == RTE_RELATION);
+
+	md->base_column_expr = base_col;
+	md->base_rel = base_rel;
+	md->base_rel_root = root;
+	md->is_or_maps_to_base_column = true;
+
+	examine_variable((PlannerInfo *) root, (Node *) base_col, 0,
+					 &base_col_vardata);
+	Assert(base_col_vardata.rel);
+	Assert(base_col_vardata.rel == base_rel);
+
+	md->base_rel_row_count = md->base_rel->tuples;
+	md->base_rel_filt_row_count = md->base_rel->rows;
+	md->base_col_distincts =
+		fmin(get_variable_numdistinct(&base_col_vardata, &is_default),
+			 md->base_rel_row_count);
+	md->est_distincts_reliable = !is_default;
+
+	/*
+	 * For indirectly filtered columns estimate the effect of the rows
+	 * filtered on the remaining column distinct count.
+	 */
+	md->expr_est_distincts =
+		fmax(1.0,
+			 estimate_distincts_remaining(md->base_rel_row_count,
+										  md->base_col_distincts,
+										  md->base_rel_filt_row_count));
+
+	base_col_reloid = (planner_rt_fetch(base_rel->relid, root))->relid;
+	if (base_col_reloid != InvalidOid && base_col->varattno > 0)
+	{
+		md->est_col_width =
+			get_attavgwidth(base_col_reloid, base_col->varattno);
+	}
+	ReleaseVariableStats(base_col_vardata);
+}
+
+static Expr *
+get_subquery_var_occluded_reference(PlannerInfo *root, Expr *ex)
+{
+	/*
+	 * Given a virtual column from an unflattened subquery, return the
+	 * expression it immediately occludes
+	 */
+	Var		   *outside_subq_var = (Var *) ex;
+	RelOptInfo *outside_subq_relation = NULL;
+	RangeTblEntry *outside_subq_rte = NULL;
+	TargetEntry *te = NULL;
+	Expr	   *inside_subq_expr = NULL;
+
+	Assert(ex && root);
+	Assert(IsA(ex, Var));
+	Assert(outside_subq_var->varno < root->simple_rel_array_size);
+
+	outside_subq_relation = root->simple_rel_array[outside_subq_var->varno];
+	outside_subq_rte = root->simple_rte_array[outside_subq_var->varno];
+
+	/*
+	 * If inheritance, subquery has append, leg of append in subquery may not
+	 * have subroot, we may be able to better process it according to
+	 * root->append_rel_list. For now just return the first leg... TODO better
+	 * handling of Union All, we only return statistics of the first leg atm.
+	 * TODO similarly, need better handling of partitioned tables, according
+	 * to outside_subq_relation->part_scheme and part_rels.
+	 */
+	if (outside_subq_rte->inh)
+	{
+		AppendRelInfo *appendRelInfo = NULL;
+
+		Assert(root->append_rel_list);
+
+		/* TODO remove this check once we add better handling of inheritance */
+		if (force_mergejoin_semijoin_filter)
+		{
+			appendRelInfo = list_nth(root->append_rel_list, 0);
+			Assert(appendRelInfo->parent_relid == outside_subq_var->varno);
+
+			Assert(appendRelInfo->translated_vars &&
+				   outside_subq_var->varattno <=
+				   list_length(appendRelInfo->translated_vars));
+			inside_subq_expr = list_nth(appendRelInfo->translated_vars,
+										outside_subq_var->varattno - 1);
+		}
+	}
+
+	/* Subquery without append and partitioned tables */
+	else
+	{
+		Assert(outside_subq_relation && IsA(outside_subq_relation, RelOptInfo));
+		Assert(outside_subq_relation->reloptkind == RELOPT_BASEREL);
+		Assert(outside_subq_relation->rtekind == RTE_SUBQUERY);
+		Assert(outside_subq_relation->subroot->processed_tlist);
+
+		te = get_nth_targetentry(outside_subq_var->varattno,
+								 outside_subq_relation->subroot->processed_tlist);
+		Assert(te && outside_subq_var->varattno == te->resno);
+		inside_subq_expr = te->expr;
+
+		/*
+		 * Strip off any Relabel present, and return the underlying expression
+		 */
+		while (inside_subq_expr && IsA(inside_subq_expr, RelabelType))
+		{
+			inside_subq_expr = ((RelabelType *) inside_subq_expr)->arg;
+		}
+	}
+
+	return inside_subq_expr;
+}
+
+/*
+ * Analyze the supplied expression, and if possible, gather metadata about
+ * it.
+ * Should only be called by analyze_expr_for_metadata, or itself.
+ */
+static void
+recursively_analyze_expr_metadata(PlannerInfo *root,
+								  Expr *ex,
+								  SemiJoinFilterExprMetadata * md)
+{
+	Assert(md && ex && root);
+
+	if (IsA(ex, Const))
+	{
+		md->is_or_maps_to_constant = true;
+		md->expr_est_distincts = 1.0;
+		md->est_distincts_reliable = true;
+	}
+	else if (IsA(ex, RelabelType))
+	{
+		recursively_analyze_expr_metadata(root, ((RelabelType *) ex)->arg, md);
+	}
+	else if (IsA(ex, Var))
+	{
+		Var		   *local_var = (Var *) ex;
+		RelOptInfo *local_relation = NULL;
+
+		Assert(local_var->varno < root->simple_rel_array_size);
+
+		/* Bail out if varno is invalid */
+		if (local_var->varno == InvalidOid)
+			return;
+
+		local_relation = root->simple_rel_array[local_var->varno];
+		Assert(local_relation && IsA(local_relation, RelOptInfo));
+
+		/*
+		 * For top level call (i.e. not a recursive invocation) cache the
+		 * relation pointer
+		 */
+		if (!md->local_relation
+			&& (local_relation->reloptkind == RELOPT_BASEREL ||
+				local_relation->reloptkind == RELOPT_OTHER_MEMBER_REL))
+		{
+			md->local_relation = local_relation;
+			md->local_column_expr = local_var;
+		}
+
+		if ((local_relation->reloptkind == RELOPT_BASEREL ||
+			 local_relation->reloptkind == RELOPT_OTHER_MEMBER_REL)
+			&& local_relation->rtekind == RTE_RELATION)
+		{
+			/* Found Var is a base column, so gather the metadata we can  */
+			gather_base_column_metadata(root, local_var, local_relation, md);
+		}
+		else if (local_relation->reloptkind == RELOPT_BASEREL
+				 && local_relation->rtekind == RTE_SUBQUERY)
+		{
+			RangeTblEntry *outside_subq_rte =
+			root->simple_rte_array[local_relation->relid];
+
+			/* root doesn't change for inheritance case, e.g. for UNION ALL */
+			PlannerInfo *new_root = outside_subq_rte->inh ?
+			root : local_relation->subroot;
+
+			/*
+			 * Found that this Var is a subquery SELECT list item, so continue
+			 * to recurse on the occluded expression
+			 */
+			Expr	   *occluded_expr =
+			get_subquery_var_occluded_reference(root, ex);
+
+			if (occluded_expr)
+			{
+				recursively_analyze_expr_metadata(new_root, occluded_expr,
+												  md);
+			}
+		}
+	}
+}
+
+/*
+ * Analyze the supplied expression, and if possible, gather metadata about
+ * it.
+ * Currently handles: base table columns, constants, and virtual
+ * columns from unflattened subquery blocks.  The metadata collected is
+ * placed into the supplied SemiJoinFilterExprMetadata object.
+ */
+void
+analyze_expr_for_metadata(PlannerInfo *root, Expr *ex,
+						  SemiJoinFilterExprMetadata * md)
+{
+	Assert(md && ex && root);
+
+	init_semijoin_filter_expr_metadata(md);
+	recursively_analyze_expr_metadata(root, ex, md);
+}
+
+
+/*
+ *  Function:  evaluate_semijoin_filtering_rate
+ *
+ *  Given a merge join path, determine two things.
+ *  First, can a Bloom filter based semijoin be created on the
+ *  outer scan relation and checked on the inner scan relation to
+ *  filter out rows from the inner relation? And second, if this
+ *  is possible, determine the single equijoin condition that is most
+ *  useful as well as the estimated filtering rate of the filter.
+ *
+ *  The output args, inner_semijoin_keys and
+ *  outer_semijoin_keys, will each contain a single key column
+ *  from one of the hash equijoin conditions. probe_semijoin_keys
+ *  contains keys from the target relation to probe the semijoin filter.
+ *
+ *  A potential semijoin will be deemed valid only if all
+ *  of the following are true:
+ *    a) The enable_mergejoin_semijoin_filter option is set true
+ *    b) The equijoin key from the outer side is or maps
+ *       to a base table column
+ *    c) The equijoin key from the inner side is or maps to
+ *       a base column
+ *
+ *  A potential semijoin will be deemed useful only if the
+ *  force_mergejoin_semijoin_filter is set true, or if all of the
+ *  following are true:
+ *    a) The equijoin key base column from the outer side has
+ *       reliable metadata (i.e. ANALYZE was done on it)
+ *    b) The key column(s) from the outer side equijoin keys
+ *       have width metadata available.
+ *    c) The estimated outer side key column width(s) are not
+ *       excessively wide.
+ *    d) The equijoin key from the inner side either:
+ *         1) maps to a base column with reliable metadata, or
+ *         2) is constrained by the incoming estimated tuple
+ *            count to have a distinct count smaller than the
+ *            outer side key column's distinct count.
+ *    e) The semijoin must be estimated to filter at least some of
+ *       the rows from the inner relation. However, the exact filtering
+ *       rate where the semijoin is deemed useful is determined by the
+ *       mergejoin cost model itself, not this function.
+ *
+ *  If there is more than one equijoin condition, we favor the one with the
+ *  higher estimated filtering rate.
+ *
+ *  If this function finds an appropriate semijoin, it will
+ *  allocate a SemijoinFilterJoinData object to store the
+ *  semijoin metadata, and then attach it to the Join plan node.
+ */
+#define MAX_SEMIJOIN_SINGLE_KEY_WIDTH	  128
+
+static double
+evaluate_semijoin_filtering_rate(PlannerInfo *root,
+								 JoinPath *join_path,
+								 List *equijoin_list,
+								 JoinCostWorkspace *workspace,
+								 int *best_clause,
+								 int *rows_filtered)
+{
+	const Path *outer_path;
+	const Path *inner_path;
+	ListCell   *equijoin_lc = NULL;
+	int			equijoin_ordinal = -1;
+	int			best_single_col_sj_ordinal = -1;
+	double		best_sj_selectivity = 1.01;
+	double		best_sj_inner_rows_filtered = -1.0;
+	int			num_md;
+	SemiJoinFilterExprMetadata *outer_md_array = NULL;
+	SemiJoinFilterExprMetadata *inner_md_array = NULL;
+
+	Assert(equijoin_list);
+	Assert(list_length(equijoin_list) > 0);
+
+	if (!enable_mergejoin_semijoin_filter && !force_mergejoin_semijoin_filter)
+	{
+		return 0;				/* option setting disabled semijoin insertion  */
+	}
+
+	num_md = list_length(equijoin_list);
+	outer_md_array = palloc(sizeof(SemiJoinFilterExprMetadata) * num_md);
+	inner_md_array = palloc(sizeof(SemiJoinFilterExprMetadata) * num_md);
+	if (!outer_md_array || !inner_md_array)
+	{
+		return 0;				/* a stack array allocation failed  */
+	}
+
+	outer_path = join_path->outerjoinpath;
+	inner_path = join_path->innerjoinpath;
+
+#ifdef TRACE_SORT
+	if (trace_sort)
+		elog(LOG,
+			 "evaluate_semijoin_filtering_rate: inner path est rows: %.1lf, outer path est rows: %.1lf."
+			 ,inner_path->rows, outer_path->rows);
+#endif
+
+	/*
+	 * Consider each of the individual equijoin conditions as a possible basis
+	 * for creating a semijoin condition
+	 */
+	foreach(equijoin_lc, equijoin_list)
+	{
+		OpExpr	   *equijoin;
+		Node	   *outer_equijoin_arg = NULL;
+		SemiJoinFilterExprMetadata *outer_arg_md = NULL;
+		Node	   *inner_equijoin_arg = NULL;
+		SemiJoinFilterExprMetadata *inner_arg_md = NULL;
+		double		est_sj_selectivity = 1.01;
+		double		est_sj_inner_rows_filtered = -1.0;
+
+		equijoin_ordinal++;
+		equijoin = (OpExpr *) lfirst(equijoin_lc);
+
+		Assert(IsA(equijoin, OpExpr));
+		Assert(list_length(equijoin->args) == 2);
+
+		outer_equijoin_arg = linitial(equijoin->args);
+		outer_arg_md = &(outer_md_array[equijoin_ordinal]);
+		analyze_expr_for_metadata(root, (Expr *) outer_equijoin_arg,
+								  outer_arg_md);
+
+		inner_equijoin_arg = llast(equijoin->args);
+		inner_arg_md = &(inner_md_array[equijoin_ordinal]);
+		analyze_expr_for_metadata(root, (Expr *) inner_equijoin_arg,
+								  inner_arg_md);
+
+		/*
+		 * If outer key - inner key has FK/PK relationship to each other and
+		 * there is no restriction on the primary key side, the semijoin
+		 * filter will be useless, we should bail out, even if the
+		 * force_semijoin_push_down guc is set. There might be exceptions, if
+		 * the outer key has restrictions on the key variable, but we won't be
+		 * able to tell until the Plan level. We will be conservative and
+		 * assume that an FK/PK relationship will yield a useless filter.
+		 */
+		if (outer_arg_md->base_column_expr &&
+			inner_arg_md->base_column_expr && is_fk_pk(root, outer_arg_md->base_column_expr,
+													   inner_arg_md->base_column_expr,
+													   equijoin->opno))
+		{
+
+#ifdef TRACE_SORT
+			if (trace_sort)
+				elog(LOG,
+					 "evaluate_semijoin_filtering_rate: inner and outer equijoin columns are PK/FK; semijoin would not be useful.");
+#endif
+			continue;
+		}
+
+		/* Now see if we can push a semijoin to its source scan node  */
+		if (!outer_arg_md->local_column_expr || !inner_arg_md->local_column_expr)
+		{
+#ifdef TRACE_SORT
+			if (trace_sort)
+				elog(LOG,
+					 "evaluate_semijoin_filtering_rate:  could not find a local outer or inner column to use as semijoin basis; semijoin is not valid.");
+#endif
+			continue;
+		}
+
+		if (!verify_valid_pushdown(root, (Path *) (join_path->innerjoinpath),
+								   inner_arg_md->local_column_expr->varno))
+		{
+#ifdef TRACE_SORT
+			if (trace_sort)
+				elog(LOG,
+					 "evaluate_semijoin_filtering_rate: could not find a place to evaluate a semijoin condition; semijoin is not valid.");
+#endif
+			continue;
+		}
+
+		/*
+		 * Adjust cached estimated inner key distinct counts down using the
+		 * inner side tuple count as an upper bound
+		 */
+		inner_arg_md->expr_est_distincts =
+			fmax(1.0, fmin(inner_path->rows,
+						   inner_arg_md->expr_est_distincts));
+
+		/*
+		 * We need to estimate the outer key distinct count as close as
+		 * possible to the where the semijoin filter will actually be applied,
+		 * ignoring the effects of any indirect filtering that would occur
+		 * after the semijoin.
+		 */
+		outer_arg_md->expr_est_distincts =
+			fmax(1.0, fmin(outer_path->rows,
+						   outer_arg_md->expr_est_distincts));
+
+		/*
+		 * If force_mergejoin_semijoin_filter is used, set the default clause
+		 * as the first valid one.
+		 */
+		if (force_mergejoin_semijoin_filter && best_single_col_sj_ordinal == -1)
+		{
+			best_single_col_sj_ordinal = equijoin_ordinal;
+		}
+
+		/* Next, see if this equijoin is valid as a semijoin basis */
+		if (!outer_arg_md->is_or_maps_to_base_column
+			&& !outer_arg_md->is_or_maps_to_constant)
+		{
+#ifdef TRACE_SORT
+			if (trace_sort)
+				elog(LOG,
+					 "evaluate_semijoin_filtering_rate: outer equijoin arg does not map to a base column nor a constant; semijoin is not valid.");
+#endif
+
+			continue;
+		}
+		if (!inner_arg_md->is_or_maps_to_base_column
+			&& !inner_arg_md->is_or_maps_to_constant)
+		{
+#ifdef TRACE_SORT
+			if (trace_sort)
+				elog(LOG,
+					 "evaluate_semijoin_filtering_rate: inner equijoin arg does not map to a base column nor a constant; semijoin is not valid.");
+#endif
+			continue;
+		}
+
+		/* Now we know it's valid, see if this potential semijoin is useful */
+		if (!outer_arg_md->est_distincts_reliable)
+		{
+#ifdef TRACE_SORT
+			if (trace_sort)
+				elog(LOG,
+					 "evaluate_semijoin_filtering_rate:  outer equijoin column's distinct estimates are not reliable; condition rejected.");
+#endif
+			continue;
+		}
+		if (outer_arg_md->est_col_width == 0)
+		{
+#ifdef TRACE_SORT
+			if (trace_sort)
+				elog(LOG,
+					 "evaluate_semijoin_filtering_rate:  outer equijoin column's width estimates are not reliable; condition rejected.");
+#endif
+			continue;
+		}
+		if (outer_arg_md->est_col_width > MAX_SEMIJOIN_SINGLE_KEY_WIDTH)
+		{
+#ifdef TRACE_SORT
+			if (trace_sort)
+				elog(LOG,
+					 "evaluate_semijoin_filtering_rate:  outer equijoin column's width estimates are was excessive; condition rejected.");
+#endif
+			continue;
+		}
+		if (!(outer_arg_md->is_or_maps_to_constant
+			  || (inner_arg_md->is_or_maps_to_base_column
+				  && inner_arg_md->est_distincts_reliable)
+			  || (inner_path->rows
+				  < outer_arg_md->expr_est_distincts)))
+		{
+#ifdef TRACE_SORT
+			if (trace_sort)
+				elog(LOG,
+					 "evaluate_semijoin_filtering_rate: inner equijoin arg does not have a reliable distinct count; condition rejected.");
+#endif
+			continue;
+		}
+
+		/*
+		 * We now try to estimate the filtering rate (1 minus selectivity) and
+		 * rows filtered of the filter. We first start by finding the ranges
+		 * of both the outer and inner var, and find the overlap between these
+		 * ranges. We assume an equal distribution of variables among this
+		 * range, and we can then calculate the amount of filtering our SJF
+		 * would do.
+		 */
+		if (workspace->inner_min_val > workspace->outer_max_val
+			|| workspace->inner_max_val < workspace->outer_min_val)
+		{
+			/*
+			 * This would mean that the outer and inner tuples are completely
+			 * disjoin from each other. We will not be as optimistic, and just
+			 * assign a filtering rate of 95%.
+			 */
+			est_sj_selectivity = 0.05;	/* selectivity is 1 minus filtering
+										 * rate */
+			est_sj_inner_rows_filtered = 0.95 * inner_arg_md->base_rel_filt_row_count;
+		}
+		else
+		{
+#define APPROACH_1_DAMPENING_FACTOR 0.8
+#define APPROACH_2_DAMPENING_FACTOR 0.66
+			/*
+			 * There are two approaches to estimating the filtering rate. We
+			 * have already outlined the first approach above, finding the
+			 * range and assuming an equal distribution. For the second
+			 * approach, we do not assume anything about the distribution, but
+			 * compare the number of distincts. If, for example, the inner
+			 * relation has 1000 distincts and the outer has 500, then there
+			 * is guaranteed to be at least 500 rows filtered from the inner
+			 * relation, regardless of the data distribution. We make an
+			 * assumption here that the distribution of distinct variables is
+			 * equal to the distribution of all rows so we can multiply by the
+			 * ratio of duplicate values. We then take the geometric mean of
+			 * these two approaches for our final estimated filtering rate. We
+			 * also multiply these values by dampening factors, which we have
+			 * found via experimentation and probably need fine-tuning.
+			 */
+			double		approach_1_selectivity; /* finding selectivity instead
+												 * of filtering rate for
+												 * legacy code reasons */
+			double		approach_2_selectivity;
+			double		inner_overlapping_range = Max(0, workspace->outer_max_val - workspace->inner_min_val);
+
+			/* we are assuming an equal distribution of val's */
+			double		inner_overlapping_ratio = Min(1, inner_overlapping_range / inner_arg_md->base_rel_filt_row_count);
+
+			/*
+			 * testing has found that this method is generaly over-optimistic,
+			 * so we multiply by a dampening effect.
+			 */
+			approach_1_selectivity = inner_overlapping_ratio * APPROACH_1_DAMPENING_FACTOR;
+			if (inner_arg_md->expr_est_distincts > outer_arg_md->expr_est_distincts)
+			{
+				int			inner_more_distincts = inner_arg_md->expr_est_distincts - outer_arg_md->expr_est_distincts;
+
+				approach_2_selectivity = 1 - ((double) inner_more_distincts) / inner_arg_md->expr_est_distincts;
+
+				/*
+				 * testing has found that this method is generaly
+				 * over-optimistic, so we multiply by a dampening effect.
+				 */
+				approach_2_selectivity = 1 - ((1 - approach_2_selectivity) * APPROACH_2_DAMPENING_FACTOR);
+			}
+			else
+			{
+				/*
+				 * This means that the outer relation has the same or more
+				 * distincts than the inner relation, which is not good for
+				 * our filtering rate. We will assume a base filtering rate of
+				 * 10% in this case.
+				 */
+				approach_2_selectivity = 0.9;
+			}
+			est_sj_selectivity = sqrt(approach_1_selectivity * approach_2_selectivity);
+			est_sj_inner_rows_filtered = (1 - est_sj_selectivity) * inner_arg_md->base_rel_filt_row_count;
+		}
+		est_sj_selectivity = fmin(1.0, est_sj_selectivity);
+		est_sj_inner_rows_filtered = fmax(1.0, est_sj_inner_rows_filtered);
+
+#ifdef TRACE_SORT
+		if (trace_sort)
+			elog(LOG,
+				 "evaluate_semijoin_filtering_rate: eligible semijoin selectivity: %.7lf; eligible semijoin rows filtered: %.7lf", est_sj_selectivity, est_sj_inner_rows_filtered);
+#endif
+
+		if (est_sj_selectivity < best_sj_selectivity)
+		{
+			best_sj_selectivity = est_sj_selectivity;
+			best_sj_inner_rows_filtered = est_sj_inner_rows_filtered;
+			best_single_col_sj_ordinal = equijoin_ordinal;
+		}
+	}
+
+	if (best_single_col_sj_ordinal != -1)
+	{
+#ifdef TRACE_SORT
+		if (trace_sort)
+			elog(LOG,
+				 "evaluate_semijoin_filtering_rate: best single column sj selectivity: %.7lf; best single column rows filtered: %.7lf", best_sj_selectivity, best_sj_inner_rows_filtered);
+#endif
+	}
+
+	*best_clause = best_single_col_sj_ordinal;
+	*rows_filtered = best_sj_inner_rows_filtered;
+	return 1 - best_sj_selectivity;
+}
+
+/*
+ *  Determine whether a semijoin condition could be pushed from the join
+ *  all the way to the leaf scan node.
+ *
+ *  Parameters:
+ *  node: path node to be considered for semijoin push down.
+ *  target_var:  the inner side join key for a potential semijoin.
+ *  target_relids: relids of all target leaf relations,
+ *  	used only for partitioned table.
+ */
+static bool
+verify_valid_pushdown(PlannerInfo *root, Path *path,
+					  Index target_var_no)
+{
+	Assert(path);
+	Assert(target_var_no > 0);
+
+	/* Guard against stack overflow due to overly complex plan trees */
+	check_stack_depth();
+
+	switch (path->pathtype)
+	{
+			/* directly push through these paths */
+		case T_Material:
+			{
+				return verify_valid_pushdown(root, ((MaterialPath *) path)->subpath, target_var_no);
+			}
+		case T_Gather:
+			{
+				return verify_valid_pushdown(root, ((GatherPath *) path)->subpath, target_var_no);
+			}
+		case T_GatherMerge:
+			{
+				return verify_valid_pushdown(root, ((GatherMergePath *) path)->subpath, target_var_no);
+			}
+		case T_Sort:
+			{
+				return verify_valid_pushdown(root, ((SortPath *) path)->subpath, target_var_no);
+			}
+		case T_Unique:
+			{
+				return verify_valid_pushdown(root, ((UniquePath *) path)->subpath, target_var_no);
+			}
+
+		case T_Agg:
+			{					/* We can directly push bloom through GROUP
+								 * BYs and DISTINCTs, as long as there are no
+								 * grouping sets. However, we cannot validate
+								 * this fact until the Plan has been created.
+								 * We will push through for now, but verify
+								 * again during Plan creation. */
+				return verify_valid_pushdown(root, ((AggPath *) path)->subpath, target_var_no);
+			}
+
+		case T_Append:
+		case T_SubqueryScan:
+			{
+				/*
+				 * Both append and subquery paths are currently unimplemented,
+				 * so we will just return false, but theoretically there are
+				 * ways to check if a filter can be pushed through them. The
+				 * previous HashJoin CR has implemented these cases, but that
+				 * code is run these after the plan has been created, so code
+				 * will need to be adjusted to do it during Path evaluation.
+				 */
+				return false;
+			}
+
+			/* Leaf nodes */
+		case T_IndexScan:
+		case T_BitmapHeapScan:
+			{
+				/*
+				 * We could definitely implement pushdown filters for Index
+				 * and Bitmap Scans, but currently it is only implemented for
+				 * SeqScan. For now, we return false.
+				 */
+				return false;
+			}
+		case T_SeqScan:
+			{
+				/*
+				 * Found source of target var! We know that the pushdown is
+				 * valid now.
+				 */
+				return path->parent->relid == target_var_no;
+			}
+
+		case T_NestLoop:
+		case T_MergeJoin:
+		case T_HashJoin:
+			{
+				/*
+				 * since this is going to be a sub-join, we can push through
+				 * both sides and don't need to worry about left/right/inner
+				 * joins.
+				 */
+				JoinPath   *join = (JoinPath *) path;
+
+				return verify_valid_pushdown(root, join->outerjoinpath, target_var_no) ||
+					verify_valid_pushdown(root, join->innerjoinpath, target_var_no);
+			}
+
+		default:
+			{
+				return false;
+			}
+	}
+}
+
+/*
+ *  Return the nth (starting from 1) entry from the `targetlist`.
+ *
+ * 	If n is larger than the length of `targetlist`, an assertion will fail.
+ */
+static TargetEntry *
+get_nth_targetentry(int n, List *targetlist)
+{
+	int			i = 1;
+	ListCell   *lc = NULL;
+
+	Assert(n > 0);
+	Assert(targetlist && nodeTag(targetlist) == T_List);
+	Assert(list_length(targetlist) >= n);
+
+	if (targetlist && list_length(targetlist) >= n)
+	{
+		foreach(lc, targetlist)
+		{
+			if (i == n)
+			{
+				TargetEntry *te = lfirst(lc);
+
+				return te;
+			}
+			i++;
+		}
+	}
+	return NULL;
+}
+
+/*
+ * expressions_match_foreign_key
+ *		True if the given con_exprs, ref_exprs and operators will exactly
+ *      	reflect the expressions referenced by the given foreign key fk.
+ *
+ * Note: This function expects con_exprs and ref_exprs to only contain Var types.
+ *       Expression indexes are not supported by foreign keys.
+ */
+bool
+expressions_match_foreign_key(ForeignKeyOptInfo *fk,
+							  List *con_exprs,
+							  List *ref_exprs,
+							  List *operators)
+{
+	ListCell   *lc;
+	ListCell   *lc2;
+	ListCell   *lc3;
+	int			col;
+	Bitmapset  *all_vars;
+	Bitmapset  *matched_vars;
+	int			idx;
+
+	Assert(list_length(con_exprs) == list_length(ref_exprs));
+	Assert(list_length(con_exprs) == list_length(operators));
+
+	/*
+	 * Fast path out if there's not enough conditions to match each column in
+	 * the foreign key. Note that we cannot check that the number of
+	 * expressions are equal here since it would cause any expressions which
+	 * are duplicated not to match.
+	 */
+	if (list_length(con_exprs) < fk->nkeys)
+		return false;
+
+	/*
+	 * We need to ensure that each item in con_exprs/ref_exprs can be matched
+	 * to a foreign key column in the actual foreign key data fk. We can do
+	 * this by looping over each fk column and checking that we find a
+	 * matching con_expr/ref_expr in con_exprs/ref_exprs. This method does not
+	 * however, allow us to ensure that there are no additional items in
+	 * con_exprs/ref_exprs that have not been matched. To remedy this we will
+	 * create 2 bitmapsets, one which will keep track of all of the vars, the
+	 * other which will keep track of the vars that we have matched. After
+	 * matching is complete, we will ensure that these bitmapsets are equal to
+	 * ensure we have complete mapping in both directions (fk cols to vars and
+	 * vars to fk cols)
+	 */
+	all_vars = NULL;
+	matched_vars = NULL;
+
+	/*
+	 * Build a bitmapset which tracks all vars by their index
+	 */
+	for (idx = 0; idx < list_length(con_exprs); idx++)
+		all_vars = bms_add_member(all_vars, idx);
+
+	for (col = 0; col < fk->nkeys; col++)
+	{
+		bool		matched = false;
+
+		idx = 0;
+
+		forthree(lc, con_exprs, lc2, ref_exprs, lc3, operators)
+		{
+			Var		   *con_expr = (Var *) lfirst(lc);
+			Var		   *ref_expr = (Var *) lfirst(lc2);
+			Oid			opr = lfirst_oid(lc3);
+
+			Assert(IsA(con_expr, Var));
+			Assert(IsA(ref_expr, Var));
+
+			/* Does this join qual match up to the current fkey column? */
+			if (fk->conkey[col] == con_expr->varattno &&
+				fk->confkey[col] == ref_expr->varattno &&
+				equality_ops_are_compatible(opr, fk->conpfeqop[col]))
+			{
+				matched = true;
+
+				/* mark the index of this var as matched */
+				matched_vars = bms_add_member(matched_vars, idx);
+
+				/*
+				 * Don't break here as there may be duplicate expressions that
+				 * match this column that we also need to mark as matched
+				 */
+			}
+			idx++;
+		}
+
+		/*
+		 * can't remove a join if there's no match to fkey column on join
+		 * condition.
+		 */
+		if (!matched)
+			return false;
+	}
+
+	/*
+	 * Ensure that we managed to match every var in con_var/ref_var to a
+	 * foreign key constraint.
+	 */
+	return bms_equal(all_vars, matched_vars);
+}
+
+/*
+ * Determine if the given outer and inner Exprs satisfy any fk-pk
+ * relationship.
+ */
+static bool
+is_fk_pk(PlannerInfo *root, Var *outer_var,
+		 Var *inner_var, Oid op_oid)
+{
+	ListCell   *lc = NULL;
+	List	   *outer_key_list = list_make1((Var *) outer_var);
+	List	   *inner_key_list = list_make1((Var *) inner_var);
+	List	   *operators = list_make1_oid(op_oid);
+
+	foreach(lc, root->fkey_list)
+	{
+		ForeignKeyOptInfo *fk = (ForeignKeyOptInfo *) lfirst(lc);
+
+		if (expressions_match_foreign_key(fk,
+										  outer_key_list,
+										  inner_key_list,
+										  operators))
+		{
+			return true;
+		}
+	}
+
+	return false;
+}
+
+/*
+ * get_switched_clauses
+ *	  Given a list of merge or hash joinclauses (as RestrictInfo nodes),
+ *	  extract the bare clauses, and rearrange the elements within the
+ *	  clauses, if needed, so the outer join variable is on the left and
+ *	  the inner is on the right.  The original clause data structure is not
+ *	  touched; a modified list is returned.  We do, however, set the transient
+ *	  outer_is_left field in each RestrictInfo to show which side was which.
+ */
+static List *
+get_switched_clauses(List *clauses, Relids outerrelids)
+{
+	List	   *t_list = NIL;
+	ListCell   *l;
+
+	foreach(l, clauses)
+	{
+		RestrictInfo *restrictinfo = (RestrictInfo *) lfirst(l);
+		OpExpr	   *clause = (OpExpr *) restrictinfo->clause;
+
+		Assert(is_opclause(clause));
+
+		/* TODO: handle the case where the operator doesn't hava a commutator */
+		if (bms_is_subset(restrictinfo->right_relids, outerrelids)
+			&& OidIsValid(get_commutator(clause->opno)))
+		{
+			/*
+			 * Duplicate just enough of the structure to allow commuting the
+			 * clause without changing the original list.  Could use
+			 * copyObject, but a complete deep copy is overkill.
+			 */
+			OpExpr	   *temp = makeNode(OpExpr);
+
+			temp->opno = clause->opno;
+			temp->opfuncid = InvalidOid;
+			temp->opresulttype = clause->opresulttype;
+			temp->opretset = clause->opretset;
+			temp->opcollid = clause->opcollid;
+			temp->inputcollid = clause->inputcollid;
+			temp->args = list_copy(clause->args);
+			temp->location = clause->location;
+			/* Commute it --- note this modifies the temp node in-place. */
+			CommuteOpExpr(temp);
+			t_list = lappend(t_list, temp);
+			restrictinfo->outer_is_left = false;
+		}
+		else
+		{
+			/*
+			 * TODO: check if Assert(bms_is_subset(restrictinfo->left_relids,
+			 * outerrelids)) is necessary.
+			 */
+			t_list = lappend(t_list, clause);
+			restrictinfo->outer_is_left = true;
+		}
+	}
+	return t_list;
+}
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index ab4d8e201d..a537a580ff 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -30,6 +30,7 @@
 #include "optimizer/optimizer.h"
 #include "optimizer/paramassign.h"
 #include "optimizer/paths.h"
+#include "optimizer/pathnode.h"
 #include "optimizer/placeholder.h"
 #include "optimizer/plancat.h"
 #include "optimizer/planmain.h"
@@ -314,7 +315,31 @@ static ModifyTable *make_modifytable(PlannerInfo *root, Plan *subplan,
 									 List *mergeActionLists, int epqParam);
 static GatherMerge *create_gather_merge_plan(PlannerInfo *root,
 											 GatherMergePath *best_path);
-
+static int	depth_of_semijoin_target(Plan *plan,
+									 const Var *target_var,
+									 Bitmapset *target_relids,
+									 int cur_depth,
+									 const PlannerInfo *root,
+									 Plan **target_node);
+static bool is_side_of_join_source_of_var(const Plan *plan,
+										  bool testing_outer_side,
+										  const Var *target_var);
+static bool
+			is_table_scan_node_source_of_relids_or_var(const Plan *plan,
+													   const Var *target_var,
+													   Bitmapset *target_relids);
+static int	position_of_var_in_targetlist(const Var *target_var,
+										  const List *targetlist);
+static TargetEntry *get_nth_targetentry(int posn,
+										const List *targetlist);
+static void get_partition_table_relids(RelOptInfo *rel,
+									   Bitmapset **target_relids);
+static int	get_appendrel_occluded_references(const Expr *ex,
+											  Expr **occluded_exprs,
+											  int num_exprs,
+											  const PlannerInfo *root);
+static Expr *get_subquery_var_occluded_reference(const Expr *ex,
+												 const PlannerInfo *root);
 
 /*
  * create_plan
@@ -4691,6 +4716,57 @@ create_mergejoin_plan(PlannerInfo *root,
 	/* Costs of sort and material steps are included in path cost already */
 	copy_generic_path_info(&join_plan->join.plan, &best_path->jpath.path);
 
+	/* Check if we should attach a pushdown semijoin to this join */
+	if (best_path->sj_metadata && best_path->sj_metadata->best_mergeclause_pos >= 0)
+	{
+		ListCell   *clause_cell = list_nth_cell(mergeclauses, best_path->sj_metadata->best_mergeclause_pos);
+		OpExpr	   *joinclause = (OpExpr *) lfirst(clause_cell);
+		Node	   *outer_join_arg = linitial(joinclause->args);
+		Node	   *inner_join_arg = llast(joinclause->args);
+		SemiJoinFilterExprMetadata *outer_arg_md = (SemiJoinFilterExprMetadata *) palloc0(sizeof(SemiJoinFilterExprMetadata));
+		SemiJoinFilterExprMetadata *inner_arg_md = (SemiJoinFilterExprMetadata *) palloc0(sizeof(SemiJoinFilterExprMetadata));
+		int			outer_depth;
+		int			inner_depth;
+		Plan	   *building_node;
+		Plan	   *checking_node;
+
+		join_plan->sj_metadata = best_path->sj_metadata;
+
+		Assert(IsA(joinclause, OpExpr));
+		Assert(list_length(joinclause->args) == 2);
+		analyze_expr_for_metadata(root, (Expr *) outer_join_arg, outer_arg_md);
+		analyze_expr_for_metadata(root, (Expr *) inner_join_arg, inner_arg_md);
+
+		outer_depth = depth_of_semijoin_target((Plan *) join_plan,
+											   outer_arg_md->local_column_expr, NULL, 0, root, &building_node);
+		inner_depth = depth_of_semijoin_target((Plan *) join_plan,
+											   inner_arg_md->local_column_expr, NULL, 0, root, &checking_node);
+		if (inner_depth > -1 && outer_depth > -1)
+		{
+			SemijoinFilterScanData *building_node_data = makeNode(SemijoinFilterScanData);
+			SemijoinFilterScanData *checking_node_data = makeNode(SemijoinFilterScanData);
+
+			building_node_data->semijoin_keys =
+				lappend(building_node_data->semijoin_keys, (Node *) outer_arg_md->local_column_expr);
+			building_node_data->is_building_node = true;
+
+			building_node->sj_md_list =
+				lappend(building_node->sj_md_list, building_node_data);
+
+			checking_node_data->semijoin_keys =
+				lappend(checking_node_data->semijoin_keys, (Node *) inner_arg_md->local_column_expr);
+			checking_node_data->is_building_node = false;
+
+			checking_node->sj_md_list =
+				lappend(checking_node->sj_md_list, checking_node_data);
+
+			join_plan->sj_metadata->apply_semijoin_filter = true;
+			join_plan->sj_metadata->filtering_rate = best_path->sj_metadata->filtering_rate;
+		}
+		pfree(outer_arg_md);
+		pfree(inner_arg_md);
+	}
+
 	return join_plan;
 }
 
@@ -7216,3 +7292,542 @@ is_projection_capable_plan(Plan *plan)
 	}
 	return true;
 }
+
+/*
+ *  Determine whether a semijoin condition could be pushed from the join
+ *  all the way to the leaf scan node.  If so, determine the number of
+ *  nodes between the join and the scan node (inclusive of the scan node).
+ *  If the search was terminated by a node the semijoin could not be
+ *  pushed through, the function returns -1.
+ *
+ *  Parameters:
+ *  plan: plan node to be considered for semijoin push down.
+ *  target_var:  the outer side join key for a potential semijoin.
+ *  target_relids: relids of all target leaf relations,
+ *  	used only for partitioned table.
+ *  cur_depth: current depth from the root hash join plan node.
+ *  target_node: stores the target plan node where filter will be applied
+ */
+static int
+depth_of_semijoin_target(Plan *plan,
+						 const Var *target_var,
+						 Bitmapset *target_relids,
+						 int cur_depth,
+						 const PlannerInfo *root,
+						 Plan **target_node)
+{
+	int			depth = -1;
+
+	Assert(plan);
+	Assert(target_var && IsA(target_var, Var));
+	Assert(target_var->varno > 0);
+
+	/* Guard against stack overflow due to overly complex plan trees */
+	check_stack_depth();
+
+	switch (nodeTag(plan))
+	{
+		case T_Hash:
+		case T_Material:
+		case T_Gather:
+		case T_GatherMerge:
+		case T_Sort:
+		case T_Unique:
+			{					/* Directly push bloom through these node
+								 * types  */
+				depth = depth_of_semijoin_target(plan->lefttree, target_var,
+												 target_relids, cur_depth + 1, root, target_node);
+				break;
+			}
+
+		case T_Agg:
+			{					/* Directly push bloom through GROUP BYs and
+								 * DISTINCTs, as long as there are no grouping
+								 * sets */
+				Agg		   *agg_pn = (Agg *) plan;
+
+				if (!agg_pn->groupingSets
+					|| list_length(agg_pn->groupingSets) == 0)
+				{
+					depth = depth_of_semijoin_target(plan->lefttree, target_var,
+													 target_relids, cur_depth + 1,
+													 root, target_node);
+				}
+				break;
+			}
+
+		case T_SubqueryScan:
+			{
+				/*
+				 * Directly push semijoin into subquery if we can, but we need
+				 * to map the target var to the occluded expression within the
+				 * SELECT list of the subquery
+				 */
+				SubqueryScan *subq_scan = (SubqueryScan *) plan;
+				RelOptInfo *rel = NULL;
+				RangeTblEntry *rte = NULL;
+				Var		   *subq_target_var = NULL;
+
+				/*
+				 * To travel into a subquery we need to use the subquery's
+				 * PlannerInfo, the root of subquery's plan tree, and the
+				 * subquery's SELECT list item that was occluded by the Var
+				 * used within this query block
+				 */
+				rte = root->simple_rte_array[subq_scan->scan.scanrelid];
+				Assert(rte);
+				Assert(rte->subquery);
+				Assert(rte->rtekind == RTE_SUBQUERY);
+				Assert(rte->subquery->targetList);
+
+				rel = find_base_rel((PlannerInfo *) root,
+									subq_scan->scan.scanrelid);
+				Assert(rel->rtekind == RTE_SUBQUERY);
+				Assert(rel->subroot);
+
+				if (rel && rel->subroot
+					&& rte && rte->subquery && rte->subquery->targetList)
+				{
+					/* Find the target_var's occluded expression */
+					Expr	   *occluded_expr =
+					get_subquery_var_occluded_reference((Expr *) target_var,
+														root);
+
+					if (occluded_expr && IsA(occluded_expr, Var))
+					{
+						subq_target_var = (Var *) occluded_expr;
+						if (subq_target_var->varno > 0)
+							depth = depth_of_semijoin_target(subq_scan->subplan,
+															 subq_target_var,
+															 target_relids,
+															 cur_depth + 1,
+															 rel->subroot,
+															 target_node);
+					}
+				}
+				break;
+			}
+
+			/* Either from a partitioned table or Union All */
+		case T_Append:
+			{
+				int			max_depth = -1;
+				Append	   *append = (Append *) plan;
+				RelOptInfo *rel = NULL;
+				RangeTblEntry *rte = NULL;
+
+				rte = root->simple_rte_array[target_var->varno];
+				rel = find_base_rel((PlannerInfo *) root, target_var->varno);
+
+				if (rte->inh && append->appendplans)
+				{
+					int			num_exprs = list_length(append->appendplans);
+					Expr	  **occluded_exprs = alloca(num_exprs * sizeof(Expr *));
+					int			idx = 0;
+					ListCell   *lc = NULL;
+
+					/* Partitioned table */
+					if (rel->part_scheme && rel->part_rels)
+					{
+						get_partition_table_relids(rel, &target_relids);
+
+						foreach(lc, append->appendplans)
+						{
+							Plan	   *appendplan = (Plan *) lfirst(lc);
+
+							depth = depth_of_semijoin_target(appendplan,
+															 target_var,
+															 target_relids,
+															 cur_depth + 1,
+															 root,
+															 target_node);
+
+							if (depth > max_depth)
+								max_depth = depth;
+						}
+					}
+					/* Union All, not partitioned table */
+					else if (num_exprs == get_appendrel_occluded_references(
+																			(Expr *) target_var,
+																			occluded_exprs,
+																			num_exprs,
+																			root))
+					{
+						Var		   *subq_target_var = NULL;
+
+						foreach(lc, append->appendplans)
+						{
+							Expr	   *occluded_expr = occluded_exprs[idx++];
+							Plan	   *appendplan = (Plan *) lfirst(lc);
+
+							if (occluded_expr && IsA(occluded_expr, Var))
+							{
+								subq_target_var = (Var *) occluded_expr;
+
+								depth = depth_of_semijoin_target(appendplan,
+																 subq_target_var,
+																 target_relids,
+																 cur_depth + 1,
+																 root,
+																 target_node);
+
+								if (depth > max_depth)
+									max_depth = depth;
+							}
+						}
+					}
+				}
+				depth = max_depth;
+				break;
+			}
+
+			/* Leaf nodes */
+		case T_IndexScan:
+		case T_BitmapHeapScan:
+			{
+				return -1;
+			}
+		case T_SeqScan:
+			{
+				if (is_table_scan_node_source_of_relids_or_var(plan, target_var, target_relids))
+				{
+					/* Found ultimate source of the join key!  */
+					*target_node = plan;
+					depth = cur_depth;
+				}
+				break;
+			}
+
+		case T_NestLoop:
+		case T_MergeJoin:
+		case T_HashJoin:
+			{
+				/*
+				 * plan->path_jointype is not always the same as
+				 * join->jointype. Avoid using plan->path_jointype when you
+				 * need accurate jointype, use join->jointype instead.
+				 */
+				Join	   *join = (Join *) plan;
+
+				/*
+				 * Push bloom filter to outer node if (target relation is
+				 * under the outer plan node, decided by
+				 * is_side_of_join_source_of_var() ) and either the following
+				 * condition satisfies: 1. this is an inner join or semi join
+				 * 2. this is a root right join 3. this is an intermediate
+				 * left join
+				 */
+				if (is_side_of_join_source_of_var(plan, true, target_var))
+				{
+					if (join->jointype == JOIN_INNER
+						|| join->jointype == JOIN_SEMI
+						|| (join->jointype == JOIN_RIGHT && cur_depth == 0)
+						|| (join->jointype == JOIN_LEFT && cur_depth > 0))
+					{
+						depth = depth_of_semijoin_target(plan->lefttree, target_var,
+														 target_relids, cur_depth + 1, root,
+														 target_node);
+					}
+				}
+				else
+				{
+					/*
+					 * Push bloom filter to inner node if (target rel is under
+					 * the inner node, decided by
+					 * is_side_of_join_source_of_var() ), and either the
+					 * following condition satisfies: 1. this is an inner join
+					 * or semi join 2. this is an intermediate right join
+					 */
+					Assert(is_side_of_join_source_of_var(plan, false, target_var));
+					if (join->jointype == JOIN_INNER
+						|| join->jointype == JOIN_SEMI
+						|| (join->jointype == JOIN_RIGHT && cur_depth > 0))
+					{
+						depth = depth_of_semijoin_target(plan->righttree, target_var,
+														 target_relids, cur_depth + 1, root,
+														 target_node);
+					}
+				}
+				break;
+			}
+
+		default:
+			{					/* For all other node types, just bail out and
+								 * apply the semijoin filter somewhere above
+								 * this node. */
+				depth = -1;
+			}
+	}
+	return depth;
+}
+
+static bool
+is_side_of_join_source_of_var(const Plan *plan,
+							  bool testing_outer_side,
+							  const Var *target_var)
+{
+	/* Determine if target_var is from the indicated child of the join  */
+	Plan	   *target_child = NULL;
+
+	Assert(plan);
+	Assert(target_var && nodeTag(target_var) == T_Var);
+	Assert(nodeTag(plan) == T_NestLoop || nodeTag(plan) == T_MergeJoin
+		   || nodeTag(plan) == T_HashJoin);
+
+	if (testing_outer_side)
+	{
+		target_child = plan->lefttree;
+	}
+	else
+	{
+		target_child = plan->righttree;
+	}
+
+	return (position_of_var_in_targetlist(target_var,
+										  target_child->targetlist) >= 0);
+}
+
+/*
+ * Determine if this scan node is the source of the specified relids,
+ * or the source of the specified var if target_relids is not given.
+ */
+static bool
+is_table_scan_node_source_of_relids_or_var(const Plan *plan,
+										   const Var *target_var,
+										   Bitmapset *target_relids)
+{
+	Scan	   *scan_node = (Scan *) plan;
+	Index		scan_node_varno = 0;
+
+	Assert(plan);
+	Assert(target_var && nodeTag(target_var) == T_Var);
+	Assert(nodeTag(plan) == T_SeqScan || nodeTag(plan) == T_IndexScan
+		   || nodeTag(plan) == T_BitmapHeapScan);
+
+	scan_node_varno = scan_node->scanrelid;
+
+	if (target_relids)
+	{
+		return bms_is_member(scan_node_varno, target_relids);
+	}
+	return scan_node_varno == target_var->varno;
+}
+
+/*
+ *  Return the index of a given var `target_var` from the `targetlist`, -1 if not found.
+ */
+static int
+position_of_var_in_targetlist(const Var *target_var, const List *targetlist)
+{
+	ListCell   *lc = NULL;
+	int			i = 1;
+
+	Assert(target_var && nodeTag(target_var) == T_Var);
+	Assert(targetlist && nodeTag(targetlist) == T_List);
+
+	if (targetlist && target_var)
+	{
+		foreach(lc, targetlist)
+		{
+			TargetEntry *te = lfirst(lc);
+
+			if (IsA(te->expr, Var))
+			{
+				Var		   *cur_var = (Var *) te->expr;
+
+				if (cur_var->varno == target_var->varno
+					&& cur_var->varattno == target_var->varattno)
+				{
+					return i;
+				}
+			}
+			i++;
+		}
+	}
+	return -1;
+}
+
+/*
+ * Recursively gather all relids of the given partitioned table rel.
+ */
+static void
+get_partition_table_relids(RelOptInfo *rel, Bitmapset **target_relids)
+{
+	int			i;
+
+	Assert(rel->part_scheme && rel->part_rels);
+
+	for (i = 0; i < rel->nparts; i++)
+	{
+		RelOptInfo *part_rel = rel->part_rels[i];
+
+		if (part_rel->part_scheme && part_rel->part_rels)
+		{
+			get_partition_table_relids(part_rel, target_relids);
+		}
+		else
+		{
+			*target_relids = bms_union(*target_relids,
+									   part_rel->relids);
+		}
+	}
+}
+
+/*
+ *  Given a virtual column from an Union ALL subquery,
+ *  returns the last index of expression it immediately occludes that satisfy
+ *  the inheritance condition,
+ *  i.e. appendRelInfo->parent_relid == outside_subq_var->varno
+ */
+static int
+get_appendrel_occluded_references(const Expr *ex,
+								  Expr **occluded_exprs,
+								  int num_exprs,
+								  const PlannerInfo *root)
+{
+	Var		   *outside_subq_var = (Var *) ex;
+	RangeTblEntry *outside_subq_rte = NULL;
+	int			idx = 0;
+
+
+	Assert(ex && root);
+	Assert(IsA(ex, Var));
+	Assert(outside_subq_var->varno < root->simple_rel_array_size);
+
+	outside_subq_rte = root->simple_rte_array[outside_subq_var->varno];
+
+	/* System Vars have varattno < 0, don't bother */
+	if (outside_subq_var->varattno <= 0)
+		return 0;
+
+	/*
+	 * If inheritance, subquery has append, leg of append in subquery may not
+	 * have subroot, process it according to root->append_rel_list.
+	 */
+	if (outside_subq_rte->inh)
+	{
+		ListCell   *lc = NULL;
+
+		Assert(root->append_rel_list &&
+			   num_exprs <= list_length(root->append_rel_list));
+
+		foreach(lc, root->append_rel_list)
+		{
+			AppendRelInfo *appendRelInfo = lfirst(lc);
+
+			if (appendRelInfo->parent_relid == outside_subq_var->varno)
+			{
+				Assert(appendRelInfo->translated_vars &&
+					   outside_subq_var->varattno <=
+					   list_length(appendRelInfo->translated_vars));
+
+				occluded_exprs[idx++] =
+					list_nth(appendRelInfo->translated_vars,
+							 outside_subq_var->varattno - 1);
+			}
+		}
+	}
+
+	return idx;
+}
+
+static Expr *
+get_subquery_var_occluded_reference(const Expr *ex, const PlannerInfo *root)
+{
+	/*
+	 * Given a virtual column from an unflattened subquery, return the
+	 * expression it immediately occludes
+	 */
+	Var		   *outside_subq_var = (Var *) ex;
+	RelOptInfo *outside_subq_relation = NULL;
+	RangeTblEntry *outside_subq_rte = NULL;
+	TargetEntry *te = NULL;
+	Expr	   *inside_subq_expr = NULL;
+
+	Assert(ex && root);
+	Assert(IsA(ex, Var));
+	Assert(outside_subq_var->varno < root->simple_rel_array_size);
+
+	outside_subq_relation = root->simple_rel_array[outside_subq_var->varno];
+	outside_subq_rte = root->simple_rte_array[outside_subq_var->varno];
+
+	/*
+	 * If inheritance, subquery has append, leg of append in subquery may not
+	 * have subroot, we may be able to better process it according to
+	 * root->append_rel_list. For now just return the first leg... TODO better
+	 * handling of Union All, we only return statistics of the first leg atm.
+	 * TODO similarly, need better handling of partitioned tables, according
+	 * to outside_subq_relation->part_scheme and part_rels.
+	 */
+	if (outside_subq_rte->inh)
+	{
+		AppendRelInfo *appendRelInfo = NULL;
+
+		Assert(root->append_rel_list);
+
+		/* TODO remove this check once we add better handling of inheritance */
+		appendRelInfo = list_nth(root->append_rel_list, 0);
+		Assert(appendRelInfo->parent_relid == outside_subq_var->varno);
+
+		Assert(appendRelInfo->translated_vars &&
+			   outside_subq_var->varattno <=
+			   list_length(appendRelInfo->translated_vars));
+		inside_subq_expr = list_nth(appendRelInfo->translated_vars,
+									outside_subq_var->varattno - 1);
+	}
+
+	/* Subquery without append or partitioned tables */
+	else
+	{
+		Assert(outside_subq_relation && IsA(outside_subq_relation, RelOptInfo));
+		Assert(outside_subq_relation->reloptkind == RELOPT_BASEREL);
+		Assert(outside_subq_relation->rtekind == RTE_SUBQUERY);
+		Assert(outside_subq_relation->subroot->processed_tlist);
+
+		te = get_nth_targetentry(outside_subq_var->varattno,
+								 outside_subq_relation->subroot->processed_tlist);
+		Assert(te && outside_subq_var->varattno == te->resno);
+		inside_subq_expr = te->expr;
+
+		/*
+		 * Strip off any Relabel present, and return the underlying expression
+		 */
+		while (inside_subq_expr && IsA(inside_subq_expr, RelabelType))
+		{
+			inside_subq_expr = ((RelabelType *) inside_subq_expr)->arg;
+		}
+	}
+
+	return inside_subq_expr;
+}
+
+/*
+ *  Return the nth (starting from 1) entry from the `targetlist`.
+ *
+ * 	If n is larger than the length of `targetlist`, an assertion will fail.
+ */
+static TargetEntry *
+get_nth_targetentry(int n, const List *targetlist)
+{
+	int			i = 1;
+	ListCell   *lc = NULL;
+
+	Assert(n > 0);
+	Assert(targetlist && nodeTag(targetlist) == T_List);
+	Assert(list_length(targetlist) >= n);
+
+	if (targetlist && list_length(targetlist) >= n)
+	{
+		foreach(lc, targetlist)
+		{
+			if (i == n)
+			{
+				TargetEntry *te = lfirst(lc);
+
+				return te;
+			}
+			i++;
+		}
+	}
+	return NULL;
+}
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 69e0fb98f5..cbf64b6338 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -2900,12 +2900,18 @@ scalargejoinsel(PG_FUNCTION_ARGS)
  *		*leftend is set to the fraction of the left-hand variable expected
  *		 to be scanned before the join terminates (0 to 1).
  *		*rightstart, *rightend similarly for the right-hand variable.
+ *
+ * 		*leftmin is set to the min value of the left-hand clause.
+ * 		*leftmax is set to the max value of the left-hand clause.
+ * 		*rightmin, *rightmax similarly for the right-hand clause.
  */
 void
 mergejoinscansel(PlannerInfo *root, Node *clause,
 				 Oid opfamily, int strategy, bool nulls_first,
 				 Selectivity *leftstart, Selectivity *leftend,
-				 Selectivity *rightstart, Selectivity *rightend)
+				 Selectivity *rightstart, Selectivity *rightend,
+				 Datum *leftmin, Datum *leftmax,
+				 Datum *rightmin, Datum *rightmax)
 {
 	Node	   *left,
 			   *right;
@@ -2925,10 +2931,6 @@ mergejoinscansel(PlannerInfo *root, Node *clause,
 				revltop,
 				revleop;
 	bool		isgt;
-	Datum		leftmin,
-				leftmax,
-				rightmin,
-				rightmax;
 	double		selec;
 
 	/* Set default results if we can't figure anything out. */
@@ -3075,20 +3077,20 @@ mergejoinscansel(PlannerInfo *root, Node *clause,
 	if (!isgt)
 	{
 		if (!get_variable_range(root, &leftvar, lstatop, collation,
-								&leftmin, &leftmax))
+								leftmin, leftmax))
 			goto fail;			/* no range available from stats */
 		if (!get_variable_range(root, &rightvar, rstatop, collation,
-								&rightmin, &rightmax))
+								rightmin, rightmax))
 			goto fail;			/* no range available from stats */
 	}
 	else
 	{
 		/* need to swap the max and min */
 		if (!get_variable_range(root, &leftvar, lstatop, collation,
-								&leftmax, &leftmin))
+								leftmax, leftmin))
 			goto fail;			/* no range available from stats */
 		if (!get_variable_range(root, &rightvar, rstatop, collation,
-								&rightmax, &rightmin))
+								rightmax, rightmin))
 			goto fail;			/* no range available from stats */
 	}
 
@@ -3098,13 +3100,13 @@ mergejoinscansel(PlannerInfo *root, Node *clause,
 	 * non-default estimates, else stick with our 1.0.
 	 */
 	selec = scalarineqsel(root, leop, isgt, true, collation, &leftvar,
-						  rightmax, op_righttype);
+						  *rightmax, op_righttype);
 	if (selec != DEFAULT_INEQ_SEL)
 		*leftend = selec;
 
 	/* And similarly for the right variable. */
 	selec = scalarineqsel(root, revleop, isgt, true, collation, &rightvar,
-						  leftmax, op_lefttype);
+						  *leftmax, op_lefttype);
 	if (selec != DEFAULT_INEQ_SEL)
 		*rightend = selec;
 
@@ -3128,13 +3130,13 @@ mergejoinscansel(PlannerInfo *root, Node *clause,
 	 * our own default.
 	 */
 	selec = scalarineqsel(root, ltop, isgt, false, collation, &leftvar,
-						  rightmin, op_righttype);
+						  *rightmin, op_righttype);
 	if (selec != DEFAULT_INEQ_SEL)
 		*leftstart = selec;
 
 	/* And similarly for the right variable. */
 	selec = scalarineqsel(root, revltop, isgt, false, collation, &rightvar,
-						  leftmin, op_lefttype);
+						  *leftmin, op_lefttype);
 	if (selec != DEFAULT_INEQ_SEL)
 		*rightstart = selec;
 
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 05ab087934..bcb2797486 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -885,6 +885,26 @@ struct config_bool ConfigureNamesBool[] =
 		true,
 		NULL, NULL, NULL
 	},
+	{
+		{"enable_mergejoin_semijoin_filter", PGC_USERSET, QUERY_TUNING_METHOD,
+			gettext_noop("Enables the planner's use of using Semijoin Bloom filters during merge join."),
+			NULL,
+			GUC_EXPLAIN
+		},
+		&enable_mergejoin_semijoin_filter,
+		false,
+		NULL, NULL, NULL
+	},
+	{
+		{"force_mergejoin_semijoin_filter", PGC_USERSET, QUERY_TUNING_METHOD,
+			gettext_noop("Forces the planner's use of using Semijoin Bloom filters during merge join. Overrides enable_mergejoin_semijoin_filter."),
+			NULL,
+			GUC_EXPLAIN
+		},
+		&force_mergejoin_semijoin_filter,
+		false,
+		NULL, NULL, NULL
+	},
 	{
 		{"enable_hashjoin", PGC_USERSET, QUERY_TUNING_METHOD,
 			gettext_noop("Enables the planner's use of hash join plans."),
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 868d21c351..e5c5d62720 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -379,6 +379,8 @@
 #enable_material = on
 #enable_memoize = on
 #enable_mergejoin = on
+#enable_mergejoin_semijoin_filter = on
+#force_mergejoin_semijoin_filter = on
 #enable_nestloop = on
 #enable_parallel_append = on
 #enable_parallel_hash = on
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 6bda383bea..77bf4930a2 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -1959,6 +1959,41 @@ typedef struct NestPath
 	JoinPath	jpath;
 } NestPath;
 
+/*
+ * Information used in (Merge)Join node to support semijoin filter pushdowns.
+ */
+typedef struct SemijoinFilterJoinData
+{
+	NodeTag		type;
+	double		est_inner_semijoin_keys_distincts;
+	double		est_outer_semijoin_keys_distincts;
+	double		est_semijoin_selectivity;
+	double		est_semijoin_outer_rows_filtered;
+
+	double		filtering_rate; /* estimated filtering rate of SJF */
+	int			best_mergeclause_pos;	/* best clause index to build SJF on */
+	bool		apply_semijoin_filter;	/* whether semijoin filter should be
+										 * applied */
+
+	int			building_node_id;	/* scan node id where the bloom filter is
+									 * built */
+	int			checking_node_id;	/* scan node id where the bloom filter is
+									 * checked */
+}			SemijoinFilterJoinData;
+
+/*
+ * Information used in Scan node to support semijoin filter pushdowns.
+ */
+typedef struct SemijoinFilterScanData
+{
+	NodeTag		type;
+	bool		is_building_node;	/* True if the bloom filter is built on
+									 * this node; otherwise, the bloom filter
+									 * is checked on this node. */
+	List	   *semijoin_keys;	/* List of keys used to build/check the bloom
+								 * filter. */
+}			SemijoinFilterScanData;
+
 /*
  * A mergejoin path has these fields.
  *
@@ -2002,6 +2037,7 @@ typedef struct MergePath
 	List	   *innersortkeys;	/* keys for explicit sort, if any */
 	bool		skip_mark_restore;	/* can executor skip mark/restore? */
 	bool		materialize_inner;	/* add Materialize to inner? */
+	SemijoinFilterJoinData *sj_metadata;
 } MergePath;
 
 /*
@@ -2579,6 +2615,11 @@ typedef struct MergeScanSelCache
 	Selectivity leftendsel;		/* last-join fraction for clause left side */
 	Selectivity rightstartsel;	/* first-join fraction for clause right side */
 	Selectivity rightendsel;	/* last-join fraction for clause right side */
+
+	Datum		leftmin;		/* min value in clause left side */
+	Datum		leftmax;		/* max value in clause left side */
+	Datum		rightmin;		/* min value in clause right side */
+	Datum		rightmax;		/* max value in clause right side */
 } MergeScanSelCache;
 
 /*
@@ -3126,6 +3167,10 @@ typedef struct JoinCostWorkspace
 	Cardinality inner_rows;
 	Cardinality outer_skip_rows;
 	Cardinality inner_skip_rows;
+	Datum		outer_min_val;
+	Datum		outer_max_val;
+	Datum		inner_min_val;
+	Datum		inner_max_val;
 
 	/* private for cost_hashjoin code */
 	int			numbuckets;
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 21e642a64c..e9825dbd5d 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -21,9 +21,9 @@
 #include "nodes/bitmapset.h"
 #include "nodes/lockoptions.h"
 #include "nodes/parsenodes.h"
+#include "nodes/pathnodes.h"
 #include "nodes/primnodes.h"
 
-
 /* ----------------------------------------------------------------
  *						node definitions
  * ----------------------------------------------------------------
@@ -167,6 +167,9 @@ typedef struct Plan
 	 */
 	Bitmapset  *extParam;
 	Bitmapset  *allParam;
+
+	/* List of SemijoinFilterScanData that gets pushed to this node */
+	List	   *sj_md_list;
 } Plan;
 
 /* ----------------
@@ -844,6 +847,9 @@ typedef struct MergeJoin
 
 	/* per-clause nulls ordering */
 	bool	   *mergeNullsFirst pg_node_attr(array_size(mergeclauses));
+
+	/* Semijoin filter metadata */
+	SemijoinFilterJoinData *sj_metadata;
 } MergeJoin;
 
 /* ----------------
diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h
index 204e94b6d1..42cfb600fa 100644
--- a/src/include/optimizer/cost.h
+++ b/src/include/optimizer/cost.h
@@ -61,6 +61,8 @@ extern PGDLLIMPORT bool enable_nestloop;
 extern PGDLLIMPORT bool enable_material;
 extern PGDLLIMPORT bool enable_memoize;
 extern PGDLLIMPORT bool enable_mergejoin;
+extern PGDLLIMPORT bool enable_mergejoin_semijoin_filter;
+extern PGDLLIMPORT bool force_mergejoin_semijoin_filter;
 extern PGDLLIMPORT bool enable_hashjoin;
 extern PGDLLIMPORT bool enable_gathermerge;
 extern PGDLLIMPORT bool enable_partitionwise_join;
@@ -211,4 +213,48 @@ extern PathTarget *set_pathtarget_cost_width(PlannerInfo *root, PathTarget *targ
 extern double compute_bitmap_pages(PlannerInfo *root, RelOptInfo *baserel,
 								   Path *bitmapqual, int loop_count, Cost *cost, double *tuple);
 
+/*
+ *  Container for metadata about an expression, used by semijoin decision logic
+ */
+typedef struct SemiJoinFilterExprMetadata
+{
+	bool		is_or_maps_to_constant;
+	bool		is_or_maps_to_base_column;
+
+	/* Var and relation from the current query block, if it is a Var */
+	Var		   *local_column_expr;
+	RelOptInfo *local_relation;
+
+	int32		est_col_width;
+
+	/*
+	 * The following will be the same as local Var and relation when the local
+	 * relation is a base table (i.e. no occluding query blocks).  Otherwise
+	 * it will be the occluded base column, if the final occluded expression
+	 * is a base column.
+	 */
+	Var		   *base_column_expr;
+	RelOptInfo *base_rel;
+	PlannerInfo *base_rel_root;
+	double		base_rel_row_count;
+	double		base_rel_filt_row_count;
+	double		base_col_distincts;
+	Datum		base_col_min_value;
+	Datum		base_col_max_value;
+
+	/* True if the distinct est is based on something meaningful  */
+	bool		est_distincts_reliable;
+	bool		est_minmax_reliable;
+
+	/* Estimated distincts after local filtering, and row count adjustments */
+	double		expr_est_distincts;
+}			SemiJoinFilterExprMetadata;
+
+extern void analyze_expr_for_metadata(PlannerInfo *root, Expr *ex,
+									  SemiJoinFilterExprMetadata * md);
+extern bool expressions_match_foreign_key(ForeignKeyOptInfo *fk,
+										  List *con_exprs,
+										  List *ref_exprs,
+										  List *operators);
+
 #endif							/* COST_H */
diff --git a/src/include/utils/selfuncs.h b/src/include/utils/selfuncs.h
index 8f3d73edfb..b8e13ec5b0 100644
--- a/src/include/utils/selfuncs.h
+++ b/src/include/utils/selfuncs.h
@@ -208,7 +208,9 @@ extern Selectivity rowcomparesel(PlannerInfo *root,
 extern void mergejoinscansel(PlannerInfo *root, Node *clause,
 							 Oid opfamily, int strategy, bool nulls_first,
 							 Selectivity *leftstart, Selectivity *leftend,
-							 Selectivity *rightstart, Selectivity *rightend);
+							 Selectivity *rightstart, Selectivity *rightend,
+							 Datum *leftmin, Datum *leftmax,
+							 Datum *rightmin, Datum *rightmax);
 
 extern double estimate_num_groups(PlannerInfo *root, List *groupExprs,
 								  double input_rows, List **pgset,
diff --git a/src/test/regress/expected/sysviews.out b/src/test/regress/expected/sysviews.out
index 579b861d84..22601c83f8 100644
--- a/src/test/regress/expected/sysviews.out
+++ b/src/test/regress/expected/sysviews.out
@@ -109,29 +109,30 @@ select count(*) = 0 as ok from pg_stat_wal_receiver;
 -- This is to record the prevailing planner enable_foo settings during
 -- a regression test run.
 select name, setting from pg_settings where name like 'enable%';
-              name              | setting 
---------------------------------+---------
- enable_async_append            | on
- enable_bitmapscan              | on
- enable_gathermerge             | on
- enable_hashagg                 | on
- enable_hashjoin                | on
- enable_incremental_sort        | on
- enable_indexonlyscan           | on
- enable_indexscan               | on
- enable_material                | on
- enable_memoize                 | on
- enable_mergejoin               | on
- enable_nestloop                | on
- enable_parallel_append         | on
- enable_parallel_hash           | on
- enable_partition_pruning       | on
- enable_partitionwise_aggregate | off
- enable_partitionwise_join      | off
- enable_seqscan                 | on
- enable_sort                    | on
- enable_tidscan                 | on
-(20 rows)
+               name               | setting 
+----------------------------------+---------
+ enable_async_append              | on
+ enable_bitmapscan                | on
+ enable_gathermerge               | on
+ enable_hashagg                   | on
+ enable_hashjoin                  | on
+ enable_incremental_sort          | on
+ enable_indexonlyscan             | on
+ enable_indexscan                 | on
+ enable_material                  | on
+ enable_memoize                   | on
+ enable_mergejoin                 | on
+ enable_mergejoin_semijoin_filter | off
+ enable_nestloop                  | on
+ enable_parallel_append           | on
+ enable_parallel_hash             | on
+ enable_partition_pruning         | on
+ enable_partitionwise_aggregate   | off
+ enable_partitionwise_join        | off
+ enable_seqscan                   | on
+ enable_sort                      | on
+ enable_tidscan                   | on
+(21 rows)
 
 -- Test that the pg_timezone_names and pg_timezone_abbrevs views are
 -- more-or-less working.  We can't test their contents in any great detail
-- 
2.37.1

v1-0002-Support-semijoin-filter-in-the-executor-for-non-para.patchapplication/octet-stream; name=v1-0002-Support-semijoin-filter-in-the-executor-for-non-para.patchDownload

From f67551ba46d62b308fbf237a059c3819f4c5766e Mon Sep 17 00:00:00 2001
From: Lyu Pan <lyu.steve.pan@gmail.com>
Date: Wed, 12 Oct 2022 16:35:22 +0000
Subject: [PATCH 2/5] Support semijoin filter in the executor for non-parallel
 mergejoin.

During MergeJoinState initialization, if a semijoin filter should be used in the MergeJoin node (according to the planner), a SemiJoinFilterJoinNodeState struct is initialized and pushed down to the scan node (only SeqScan for now). The bloom filter is always built on the outer/left tree, and used in the inner/right tree.
---
 src/backend/executor/execScan.c      | 116 ++++++++++++++++++++++--
 src/backend/executor/nodeMergejoin.c | 129 ++++++++++++++++++++++++++-
 src/backend/executor/nodeSeqscan.c   |  19 ++++
 src/include/executor/nodeMergejoin.h |   3 +
 src/include/nodes/execnodes.h        |  39 ++++++++
 5 files changed, 298 insertions(+), 8 deletions(-)

diff --git a/src/backend/executor/execScan.c b/src/backend/executor/execScan.c
index 043bb83f55..7dd7b07a51 100644
--- a/src/backend/executor/execScan.c
+++ b/src/backend/executor/execScan.c
@@ -20,9 +20,13 @@
 
 #include "executor/executor.h"
 #include "miscadmin.h"
+#include "nodes/execnodes.h"
 #include "utils/memutils.h"
 
 
+bool		ExecScanUsingSemiJoinFilter(SeqScanState *node, ExprContext *econtext);
+void		ExecSemiJoinFilterFinishScan(List *semijoin_filters, List *semijoin_scan);
+
 
 /*
  * ExecScanFetch -- check interrupts & fetch next potential tuple
@@ -173,10 +177,11 @@ ExecScan(ScanState *node,
 	/* interrupt checks are in ExecScanFetch */
 
 	/*
-	 * If we have neither a qual to check nor a projection to do, just skip
-	 * all the overhead and return the raw scan tuple.
+	 * If we have neither a qual to check nor a projection to do nor a Bloom
+	 * filter to check, just skip all the overhead and return the raw scan
+	 * tuple.
 	 */
-	if (!qual && !projInfo)
+	if (!qual && !projInfo && !(IsA(node, SeqScanState) && ((SeqScanState *) node)->apply_semijoin_filter))
 	{
 		ResetExprContext(econtext);
 		return ExecScanFetch(node, accessMtd, recheckMtd);
@@ -206,6 +211,10 @@ ExecScan(ScanState *node,
 		 */
 		if (TupIsNull(slot))
 		{
+			if (IsA(node, SeqScanState) && ((SeqScanState *) node)->apply_semijoin_filter)
+			{
+				ExecSemiJoinFilterFinishScan(((SeqScanState *) node)->semijoin_filters, ((SeqScanState *) node)->sj_scan_data);
+			}
 			if (projInfo)
 				return ExecClearTuple(projInfo->pi_state.resultslot);
 			else
@@ -232,16 +241,33 @@ ExecScan(ScanState *node,
 			if (projInfo)
 			{
 				/*
-				 * Form a projection tuple, store it in the result tuple slot
-				 * and return it.
+				 * Form a projection tuple, store it in the result tuple slot,
+				 * check against SemiJoinFilter, then return it.
 				 */
-				return ExecProject(projInfo);
+				TupleTableSlot *projectedSlot = ExecProject(projInfo);
+
+				if (IsA(node, SeqScanState) && ((SeqScanState *) node)->apply_semijoin_filter
+					&& !ExecScanUsingSemiJoinFilter((SeqScanState *) node, econtext))
+				{
+					/* slot did not pass SemiJoinFilter, so skipping it. */
+					ResetExprContext(econtext);
+					continue;
+				}
+				return projectedSlot;
 			}
 			else
 			{
 				/*
-				 * Here, we aren't projecting, so just return scan tuple.
+				 * Here, we aren't projecting, so check against
+				 * SemiJoinFilter, then return tuple.
 				 */
+				if (IsA(node, SeqScanState) && ((SeqScanState *) node)->apply_semijoin_filter
+					&& !ExecScanUsingSemiJoinFilter((SeqScanState *) node, econtext))
+				{
+					/* slot did not pass SemiJoinFilter, so skipping it. */
+					ResetExprContext(econtext);
+					continue;
+				}
 				return slot;
 			}
 		}
@@ -340,3 +366,79 @@ ExecScanReScan(ScanState *node)
 		}
 	}
 }
+
+/*
+ *	ExecScanUsingSemiJoinFilter
+ *
+ *	This must be called within ExecScan function of SeqScanState node. This
+ *	function builds/checks the semijoin filter (bloom filter) in the node. If
+ *	this node is the building node of the bloom filter, it always returns true;
+ *	if this node is the checking node of the bloom filter, it returns true if
+ *	the value is in the bloom filter and returns false if not.
+ */
+bool
+ExecScanUsingSemiJoinFilter(SeqScanState *node, ExprContext *econtext)
+{
+	List	   *sj_scan_data;
+	List	   *sj_filters;
+	int			i;
+	ListCell   *lc;
+	SemiJoinFilterJoinNodeState *sjf;
+	SemiJoinFilterScanNodeState *sj_scan;
+	Datum		data;
+	bool		is_null;
+
+	sj_scan_data = node->sj_scan_data;
+	sj_filters = node->semijoin_filters;
+
+	Assert(list_length(sj_scan_data) == list_length(sj_filters));
+
+	i = 0;
+	foreach(lc, sj_scan_data)
+	{
+		is_null = false;
+		sjf = ((SemiJoinFilterJoinNodeState *) (lfirst(list_nth_cell(sj_filters, i))));
+		sj_scan = (SemiJoinFilterScanNodeState *) lfirst(lc);
+
+		data = ExecEvalExpr(sj_scan->expr_state, econtext, &is_null);
+
+		if (!sjf->done_building && sj_scan->is_building_side && !is_null)
+		{
+			bloom_add_element(sjf->filter, (unsigned char *) &data, sizeof(data));
+			++(sjf->elements_added);
+		}
+		else if (sjf->done_building && !sj_scan->is_building_side && !is_null &&
+				 bloom_lacks_element(sjf->filter, (unsigned char *) &data, sizeof(data)))
+		{
+			++(sjf->elements_filtered);
+			return false;
+		}
+		++i;
+	}
+	return true;
+}
+
+/*
+ *	ExecSemiJoinFilterFinishScan
+ *
+ *	This must be called within ExecScan function of SeqScanState node, and
+ *	represents the steps after the bloom filter is done building/checking.
+ */
+void
+ExecSemiJoinFilterFinishScan(List *semijoin_filters, List *semijoin_scan)
+{
+	ListCell   *cell;
+	int			i = 0;
+
+	foreach(cell, semijoin_filters)
+	{
+		SemiJoinFilterJoinNodeState *sjf = ((SemiJoinFilterJoinNodeState *) (lfirst(cell)));
+		SemiJoinFilterScanNodeState *sj_scan = ((SemiJoinFilterScanNodeState *) (lfirst(list_nth_cell(semijoin_scan, i))));;
+
+		++i;
+		if (!sjf->done_building && sj_scan->is_building_side)
+		{
+			sjf->done_building = true;
+		}
+	}
+}
diff --git a/src/backend/executor/nodeMergejoin.c b/src/backend/executor/nodeMergejoin.c
index fed345eae5..4d84ede243 100644
--- a/src/backend/executor/nodeMergejoin.c
+++ b/src/backend/executor/nodeMergejoin.c
@@ -93,13 +93,14 @@
 #include "postgres.h"
 
 #include "access/nbtree.h"
+#include "common/pg_prng.h"
 #include "executor/execdebug.h"
 #include "executor/nodeMergejoin.h"
+#include "lib/bloomfilter.h"
 #include "miscadmin.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 
-
 /*
  * States of the ExecMergeJoin state machine
  */
@@ -1603,6 +1604,40 @@ ExecInitMergeJoin(MergeJoin *node, EState *estate, int eflags)
 											node->mergeNullsFirst,
 											(PlanState *) mergestate);
 
+	/*
+	 * initialize SemiJoinFilter, if planner decided to do so
+	 */
+	if (node->sj_metadata && node->sj_metadata->apply_semijoin_filter)
+	{
+		SemiJoinFilterJoinNodeState *sjf;
+		uint64		seed;
+
+		/* create Bloom filter */
+		sjf = (SemiJoinFilterJoinNodeState *) palloc0(sizeof(SemiJoinFilterJoinNodeState));
+
+		/*
+		 * Push down filter down outer and inner subtrees and apply filter to
+		 * the nodes that correspond to the ones identified during planning.
+		 * We are pushing down first because we need some metadata from the
+		 * scan nodes (i.e. Relation Id's and planner-estimated number of
+		 * rows).
+		 */
+		sjf->building_node_id = ((MergeJoin *) mergestate->js.ps.plan)->sj_metadata->building_node_id;
+		sjf->checking_node_id = ((MergeJoin *) mergestate->js.ps.plan)->sj_metadata->checking_node_id;
+		PushDownFilter(mergestate->js.ps.lefttree, sjf, sjf->building_node_id, &sjf->num_elements);
+		PushDownFilter(mergestate->js.ps.righttree, sjf, sjf->checking_node_id, &sjf->num_elements);
+
+		/* Initialize SJF data and create Bloom filter */
+		seed = pg_prng_uint64(&pg_global_prng_state);
+		sjf->filter = bloom_create(sjf->num_elements, work_mem, seed);
+		sjf->work_mem = work_mem;
+		sjf->seed = seed;
+		sjf->done_building = false;
+
+		sjf->mergejoin_plan_id = mergestate->js.ps.plan->plan_node_id;
+		mergestate->sjf = sjf;
+	}
+
 	/*
 	 * initialize join state
 	 */
@@ -1645,6 +1680,14 @@ ExecEndMergeJoin(MergeJoinState *node)
 	ExecClearTuple(node->js.ps.ps_ResultTupleSlot);
 	ExecClearTuple(node->mj_MarkedTupleSlot);
 
+	/*
+	 * free SemiJoinFilterJoinNodeState
+	 */
+	if (node->sjf)
+	{
+		FreeSemiJoinFilter(node->sjf);
+	}
+
 	/*
 	 * shut down the subplans
 	 */
@@ -1678,3 +1721,87 @@ ExecReScanMergeJoin(MergeJoinState *node)
 	if (innerPlan->chgParam == NULL)
 		ExecReScan(innerPlan);
 }
+
+void
+FreeSemiJoinFilter(SemiJoinFilterJoinNodeState * sjf)
+{
+	bloom_free(sjf->filter);
+	pfree(sjf);
+}
+
+/*
+ * Determines the direction that a pushdown filter can be pushed. This is not very robust, but this
+ * is because we've already done the careful calculations at the plan level. If we end up pushing where
+ * we're not supposed to, it's fine because we've done the verifications in the planner.
+ */
+int
+PushDownDirection(PlanState *node)
+{
+	switch (nodeTag(node))
+	{
+		case T_HashState:
+		case T_MaterialState:
+		case T_GatherState:
+		case T_GatherMergeState:
+		case T_SortState:
+		case T_UniqueState:
+		case T_AggState:
+			{
+				return 0;
+			}
+		case T_NestLoopState:
+		case T_MergeJoinState:
+		case T_HashJoinState:
+			{
+				return 1;
+			}
+		default:
+			{
+				return -1;
+			}
+	}
+}
+
+/* Recursively pushes down the filter until an appropriate SeqScan node is reached. Then, it
+ * verifies if that SeqScan node is the one we want to push the filter to, and if it is, then
+ * appends the SJF to the node. */
+void
+PushDownFilter(PlanState *node, SemiJoinFilterJoinNodeState * sjf, int target_node_id, int64 *plan_rows)
+{
+	if (node == NULL)
+	{
+		return;
+	}
+
+	check_stack_depth();
+	if (node->type == T_SeqScanState)
+	{
+		SeqScanState *scan = (SeqScanState *) node;
+
+		/*
+		 * found the right Scan node that we want to apply the filter onto via
+		 * matching plan id.
+		 */
+		if (scan->ss.ps.plan->plan_node_id == target_node_id)
+		{
+			scan->apply_semijoin_filter = true;
+			scan->semijoin_filters = lappend(scan->semijoin_filters, sjf);
+			/* double row estimate to reduce error rate for Bloom filter */
+			if (plan_rows)
+			{
+				*plan_rows = Max(*plan_rows, scan->ss.ps.plan->plan_rows * 2);
+			}
+		}
+		return;
+	}
+	if (PushDownDirection(node) == 0)
+	{
+		PushDownFilter(node->lefttree, sjf, target_node_id, plan_rows);
+		return;
+	}
+	if (PushDownDirection(node) == 1)
+	{
+		PushDownFilter(node->lefttree, sjf, target_node_id, plan_rows);
+		PushDownFilter(node->righttree, sjf, target_node_id, plan_rows);
+	}
+}
diff --git a/src/backend/executor/nodeSeqscan.c b/src/backend/executor/nodeSeqscan.c
index 7b58cd9162..8f8cc36bc9 100644
--- a/src/backend/executor/nodeSeqscan.c
+++ b/src/backend/executor/nodeSeqscan.c
@@ -171,6 +171,25 @@ ExecInitSeqScan(SeqScan *node, EState *estate, int eflags)
 	scanstate->ss.ps.qual =
 		ExecInitQual(node->scan.plan.qual, (PlanState *) scanstate);
 
+	/*
+	 * Initialize semijoin filter expressions.
+	 */
+	if (node->scan.plan.sj_md_list)
+	{
+		ListCell   *lc;
+
+		scanstate->sj_scan_data = NIL;
+		foreach(lc, node->scan.plan.sj_md_list)
+		{
+			SemijoinFilterScanData *md = (SemijoinFilterScanData *) lfirst(lc);
+			SemiJoinFilterScanNodeState *sj_filter = (SemiJoinFilterScanNodeState *) palloc0(sizeof(SemiJoinFilterScanNodeState));
+
+			sj_filter->is_building_side = md->is_building_node;
+			sj_filter->expr_state = ExecInitExpr(lfirst(list_nth_cell(md->semijoin_keys, 0)), (PlanState *) scanstate);
+			scanstate->sj_scan_data = lappend(scanstate->sj_scan_data, sj_filter);
+		}
+	}
+
 	return scanstate;
 }
 
diff --git a/src/include/executor/nodeMergejoin.h b/src/include/executor/nodeMergejoin.h
index 26ab517508..c922fbe297 100644
--- a/src/include/executor/nodeMergejoin.h
+++ b/src/include/executor/nodeMergejoin.h
@@ -19,5 +19,8 @@
 extern MergeJoinState *ExecInitMergeJoin(MergeJoin *node, EState *estate, int eflags);
 extern void ExecEndMergeJoin(MergeJoinState *node);
 extern void ExecReScanMergeJoin(MergeJoinState *node);
+extern void FreeSemiJoinFilter(SemiJoinFilterJoinNodeState * sjf);
+extern int	PushDownDirection(PlanState *node);
+extern void PushDownFilter(PlanState *node, SemiJoinFilterJoinNodeState * sjf, int target_node_id, int64 *plan_rows);
 
 #endif							/* NODEMERGEJOIN_H */
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 01b1727fc0..9a4c3a7792 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -32,6 +32,7 @@
 #include "access/tupconvert.h"
 #include "executor/instrument.h"
 #include "fmgr.h"
+#include "lib/bloomfilter.h"
 #include "lib/ilist.h"
 #include "lib/pairingheap.h"
 #include "nodes/params.h"
@@ -1451,6 +1452,10 @@ typedef struct SeqScanState
 {
 	ScanState	ss;				/* its first field is NodeTag */
 	Size		pscan_len;		/* size of parallel heap scan descriptor */
+	/* for use of SemiJoinFilter during merge join */
+	bool		apply_semijoin_filter;
+	List	   *semijoin_filters;	/* SemiJoinFilterJoinNodeState	*/
+	List	   *sj_scan_data;	/* SemiJoinFilterScanNodeState */
 } SeqScanState;
 
 /* ----------------
@@ -1987,6 +1992,39 @@ typedef struct NestLoopState
 	TupleTableSlot *nl_NullInnerTupleSlot;
 } NestLoopState;
 
+/* ----------------
+ *	 SemiJoinFilterScanNodeState information
+ *
+ *		SemiJoinFilter information used in Scan node.
+ * ----------------
+ */
+typedef struct SemiJoinFilterScanNodeState
+{
+	bool		is_building_side;
+	ExprState  *expr_state;
+}			SemiJoinFilterScanNodeState;
+
+/* ----------------
+ *	 SemiJoinFilterJoinNodeState information
+ *
+ *		SemiJoinFilter information used in Scan node.
+ * ----------------
+ */
+typedef struct SemiJoinFilterJoinNodeState
+{
+	bloom_filter *filter;
+	int			building_node_id;
+	int			checking_node_id;
+	bool		done_building;
+	uint64		seed;
+	int64		num_elements;
+	int			work_mem;
+	int			elements_added;
+	int			elements_checked;
+	int			elements_filtered;
+	int			mergejoin_plan_id;
+}			SemiJoinFilterJoinNodeState;
+
 /* ----------------
  *	 MergeJoinState information
  *
@@ -2032,6 +2070,7 @@ typedef struct MergeJoinState
 	TupleTableSlot *mj_NullInnerTupleSlot;
 	ExprContext *mj_OuterEContext;
 	ExprContext *mj_InnerEContext;
+	SemiJoinFilterJoinNodeState *sjf;
 } MergeJoinState;
 
 /* ----------------
-- 
2.37.1

v1-0004-Integrate-EXPLAIN-command-with-semijoin-filter.patchapplication/octet-stream; name=v1-0004-Integrate-EXPLAIN-command-with-semijoin-filter.patchDownload

From 30b1543a8505c8270ac406a4db090d778d3f6658 Mon Sep 17 00:00:00 2001
From: Lyu Pan <lyu.steve.pan@gmail.com>
Date: Wed, 12 Oct 2022 18:49:44 +0000
Subject: [PATCH 4/5] Integrate EXPLAIN command with semijoin filter.

Semijoin filter info will be displayed in the MergeJoin node and the SeqScan node. For example:
EXPLAIN (VERBOSE, COSTS OFF) SELECT COUNT(*) FROM t1 JOIN t2 ON t1.i = t2.i;
                           QUERY PLAN
-----------------------------------------------------------------
 Aggregate
   Output: count(*)
   ->  Merge Join
         Merge Cond: (t1.i = t2.i)
         SemiJoin Filter Created Based on: (t1.i = t2.i)
         SemiJoin Estimated Filtering Rate: XXX
         ->  Sort
               Output: t1.i
               Sort Key: t1.i
               ->  Seq Scan on public.t1
                     Semijoin Filter building based on: { t1.i }
                     Output: t1.i
         ->  Sort
               Output: t2.i
               Sort Key: t2.i
               ->  Seq Scan on public.t2
                     Semijoin Filter checking based on: { t2.i }
                     Output: t2.i
---
 src/backend/commands/explain.c | 135 +++++++++++++++++++++++++++++++++
 src/backend/lib/bloomfilter.c  |  10 ---
 src/include/lib/bloomfilter.h  |  10 ++-
 3 files changed, 144 insertions(+), 11 deletions(-)

diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index f86983c660..bdcb2f3c14 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -21,6 +21,7 @@
 #include "executor/nodeHash.h"
 #include "foreign/fdwapi.h"
 #include "jit/jit.h"
+#include "lib/bloomfilter.h"
 #include "nodes/extensible.h"
 #include "nodes/makefuncs.h"
 #include "nodes/nodeFuncs.h"
@@ -108,6 +109,10 @@ static void show_tablesample(TableSampleClause *tsc, PlanState *planstate,
 static void show_sort_info(SortState *sortstate, ExplainState *es);
 static void show_incremental_sort_info(IncrementalSortState *incrsortstate,
 									   ExplainState *es);
+static void show_semijoin_metadata(List *equijoins, PlanState *planstate,
+								   List *ancestors, ExplainState *es);
+static void show_semijoin_filters_info(Plan *plan, PlanState *planstate,
+									   List *ancestors, ExplainState *es);
 static void show_hash_info(HashState *hashstate, ExplainState *es);
 static void show_memoize_info(MemoizeState *mstate, List *ancestors,
 							  ExplainState *es);
@@ -1687,6 +1692,15 @@ ExplainNode(PlanState *planstate, List *ancestors,
 	if (es->format == EXPLAIN_FORMAT_TEXT)
 		appendStringInfoChar(es->str, '\n');
 
+	switch (nodeTag(plan))
+	{
+		case T_SeqScan:
+			show_semijoin_filters_info((Plan *) plan, planstate, ancestors, es);
+			break;
+		default:
+			break;
+	}
+
 	/* prepare per-worker general execution details */
 	if (es->workers_state && es->verbose)
 	{
@@ -1981,6 +1995,11 @@ ExplainNode(PlanState *planstate, List *ancestors,
 							"Merge Cond", planstate, ancestors, es);
 			show_upper_qual(((MergeJoin *) plan)->join.joinqual,
 							"Join Filter", planstate, ancestors, es);
+			if (((MergeJoinState *) planstate)->sjf)
+			{
+				show_semijoin_metadata(((MergeJoin *) plan)->mergeclauses,
+									   planstate, ancestors, es);
+			}
 			if (((MergeJoin *) plan)->join.joinqual)
 				show_instrumentation_count("Rows Removed by Join Filter", 1,
 										   planstate, es);
@@ -5041,3 +5060,119 @@ escape_yaml(StringInfo buf, const char *str)
 {
 	escape_json(buf, str);
 }
+
+/*
+ * Show a list of expressions (or ExprState's), wrapped in the specified bounding characters
+ */
+static void
+show_expression_list(List *expr_list, const char *qlabel,
+					 const char *list_start, const char *list_separator, const char *list_end,
+					 PlanState *planstate, List *ancestors, ExplainState *es)
+{
+	List	   *context;
+	ListCell   *lc;
+	char	   *exprstr;
+	bool		first = true;
+	Node	   *nth_expr = NULL;
+	Node	   *nth_list_elem = NULL;
+
+	/* Set up deparsing context */
+	context = set_deparse_context_plan(es->deparse_cxt, planstate->plan, ancestors);
+
+	/* Insert header */
+	appendStringInfoSpaces(es->str, es->indent * 2);
+	appendStringInfo(es->str, "%s: ", qlabel);
+	appendStringInfo(es->str, "%s", list_start);
+
+	/* Deparse each expressions, and append it */
+	foreach(lc, expr_list)
+	{
+		if (!first)
+		{
+			appendStringInfo(es->str, "%s", list_separator);
+		}
+		nth_expr = NULL;
+		nth_list_elem = (Node *) lfirst(lc);
+		if (nth_list_elem->type == T_ExprState)
+		{
+			nth_expr = (Node *) (((ExprState *) nth_list_elem)->expr);
+		}
+		else
+		{
+			nth_expr = nth_list_elem;
+		}
+		if (nth_expr)
+		{
+			exprstr = deparse_expression(nth_expr, context, /* useprefix */ true, false);
+			appendStringInfo(es->str, "%s", exprstr);
+		}
+		else
+		{
+			appendStringInfo(es->str, "%s", "<?>");
+		}
+		first = false;
+	}
+
+	/* Finalize with trailer */
+	appendStringInfo(es->str, "%s\n", list_end);
+}
+
+static void
+show_semijoin_metadata(List *equijoins, PlanState *planstate,
+					   List *ancestors, ExplainState *es)
+{
+	char		createStr[256];
+	Node	   *best_equijoin_clause;
+	SemijoinFilterJoinData *sj_metadata = (((MergeJoin *) planstate->plan)->sj_metadata);
+
+	Assert(planstate);
+	Assert(nodeTag(planstate) == T_MergeJoinState);
+	Assert(planstate->plan);
+	Assert(nodeTag(planstate->plan) == T_MergeJoin);
+
+	snprintf(createStr, sizeof(createStr), "%s", "SemiJoin Filter Created Based on");
+	best_equijoin_clause = (Node *) list_nth_node(OpExpr, equijoins, sj_metadata->best_mergeclause_pos);
+	show_expression(best_equijoin_clause, createStr, planstate, ancestors, true, es);
+	ExplainPropertyFloat("SemiJoin Estimated Filtering Rate", NULL, sj_metadata->filtering_rate, 4, es);
+	if (es->analyze)
+	{
+		SemiJoinFilterJoinNodeState *sjf = ((MergeJoinState *) planstate)->sjf;
+
+		ExplainPropertyInteger("SemiJoin Filter Size", "kB", (sjf->filter->m) / 8 / 1024, es);
+		ExplainPropertyInteger("SemiJoin Filter Hashes", NULL, sjf->filter->k_hash_funcs, es);
+		ExplainPropertyFloat("SemiJoin Actual Filtering Rate", NULL, ((double) sjf->elements_filtered) / sjf->elements_checked, 4, es);
+	}
+}
+
+static void
+show_semijoin_filter_info(SemijoinFilterScanData * sj_md, PlanState *planstate, List *ancestors, ExplainState *es)
+{
+	if (sj_md->is_building_node)
+	{
+		show_expression_list(sj_md->semijoin_keys,
+							 "Semijoin Filter building based on", "{ ", ", ", " }",
+							 planstate, ancestors, es);
+	}
+	else
+	{
+		show_expression_list(sj_md->semijoin_keys,
+							 "Semijoin Filter checking based on", "{ ", ", ", " }",
+							 planstate, ancestors, es);
+	}
+}
+
+static void
+show_semijoin_filters_info(Plan *plan, PlanState *planstate, List *ancestors, ExplainState *es)
+{
+	List	   *sj_md_list = plan->sj_md_list;
+
+	if (sj_md_list != NIL)
+	{
+		ListCell   *lc;
+
+		foreach(lc, sj_md_list)
+		{
+			show_semijoin_filter_info((SemijoinFilterScanData *) lfirst(lc), planstate, ancestors, es);
+		}
+	}
+}
diff --git a/src/backend/lib/bloomfilter.c b/src/backend/lib/bloomfilter.c
index 743bd7388a..52a389d70b 100644
--- a/src/backend/lib/bloomfilter.c
+++ b/src/backend/lib/bloomfilter.c
@@ -41,16 +41,6 @@
 
 #define MAX_HASH_FUNCS		10
 
-struct bloom_filter
-{
-	/* K hash functions are used, seeded by caller's seed */
-	int			k_hash_funcs;
-	uint64		seed;
-	/* m is bitset size, in bits.  Must be a power of two <= 2^32.  */
-	uint64		m;
-	unsigned char bitset[FLEXIBLE_ARRAY_MEMBER];
-};
-
 static int	my_bloom_power(uint64 target_bitset_bits);
 static int	optimal_k(uint64 bitset_bits, int64 total_elems);
 static void k_hashes(bloom_filter *filter, uint32 *hashes, unsigned char *elem,
diff --git a/src/include/lib/bloomfilter.h b/src/include/lib/bloomfilter.h
index 3b5d1821a5..68d54a8ba8 100644
--- a/src/include/lib/bloomfilter.h
+++ b/src/include/lib/bloomfilter.h
@@ -15,7 +15,15 @@
 
 #include "utils/dsa.h"
 
-typedef struct bloom_filter bloom_filter;
+typedef struct bloom_filter
+{
+	/* K hash functions are used, seeded by caller's seed */
+	int			k_hash_funcs;
+	uint64		seed;
+	/* m is bitset size, in bits.  Must be a power of two <= 2^32.  */
+	uint64		m;
+	unsigned char bitset[FLEXIBLE_ARRAY_MEMBER];
+} bloom_filter;
 
 extern bloom_filter *bloom_create(int64 total_elems, int bloom_work_mem,
 								  uint64 seed);
-- 
2.37.1

v1-0003-Support-semijoin-filter-in-the-executor-for-parallel.patchapplication/octet-stream; name=v1-0003-Support-semijoin-filter-in-the-executor-for-parallel.patchDownload

From e0ddb12f2f031ec5460bf4039b60dd3e7ca397e2 Mon Sep 17 00:00:00 2001
From: Lyu Pan <lyu.steve.pan@gmail.com>
Date: Wed, 12 Oct 2022 17:03:54 +0000
Subject: [PATCH 3/5] Support semijoin filter in the executor for parallel
 mergejoin.

Each worker process create its own bloom filter, and updates its own bloom filter during the outer/left plan scan (no lock is needed). After all processes finish the scan, the bloom filtered are merged together (by performing OR operations on the bit arrays) into a shared bloom filter in the Dynamic Shared Memory area (use lock for synchronization). After this is completed, the merged bloom filter will be copied back to all the worker processes, and be used in the inner/right plan scan for filtering.
---
 src/backend/executor/execScan.c      |  78 +++++++++++++-
 src/backend/executor/nodeMergejoin.c |  90 ++++++++++++++++
 src/backend/executor/nodeSeqscan.c   | 149 +++++++++++++++++++++++++++
 src/backend/lib/bloomfilter.c        |  74 +++++++++++++
 src/include/executor/nodeMergejoin.h |   2 +
 src/include/lib/bloomfilter.h        |   7 ++
 src/include/nodes/execnodes.h        |  34 ++++++
 7 files changed, 429 insertions(+), 5 deletions(-)

diff --git a/src/backend/executor/execScan.c b/src/backend/executor/execScan.c
index 7dd7b07a51..71bc9cdc20 100644
--- a/src/backend/executor/execScan.c
+++ b/src/backend/executor/execScan.c
@@ -25,7 +25,7 @@
 
 
 bool		ExecScanUsingSemiJoinFilter(SeqScanState *node, ExprContext *econtext);
-void		ExecSemiJoinFilterFinishScan(List *semijoin_filters, List *semijoin_scan);
+void		ExecSemiJoinFilterFinishScan(List *semijoin_filters, List *semijoin_scan, dsa_area *parallel_area);
 
 
 /*
@@ -213,7 +213,8 @@ ExecScan(ScanState *node,
 		{
 			if (IsA(node, SeqScanState) && ((SeqScanState *) node)->apply_semijoin_filter)
 			{
-				ExecSemiJoinFilterFinishScan(((SeqScanState *) node)->semijoin_filters, ((SeqScanState *) node)->sj_scan_data);
+				ExecSemiJoinFilterFinishScan(((SeqScanState *) node)->semijoin_filters, ((SeqScanState *) node)->sj_scan_data,
+											 node->ps.state->es_query_dsa);
 			}
 			if (projInfo)
 				return ExecClearTuple(projInfo->pi_state.resultslot);
@@ -425,7 +426,7 @@ ExecScanUsingSemiJoinFilter(SeqScanState *node, ExprContext *econtext)
  *	represents the steps after the bloom filter is done building/checking.
  */
 void
-ExecSemiJoinFilterFinishScan(List *semijoin_filters, List *semijoin_scan)
+ExecSemiJoinFilterFinishScan(List *semijoin_filters, List *semiJoinScan, dsa_area *parallel_area)
 {
 	ListCell   *cell;
 	int			i = 0;
@@ -433,12 +434,79 @@ ExecSemiJoinFilterFinishScan(List *semijoin_filters, List *semijoin_scan)
 	foreach(cell, semijoin_filters)
 	{
 		SemiJoinFilterJoinNodeState *sjf = ((SemiJoinFilterJoinNodeState *) (lfirst(cell)));
-		SemiJoinFilterScanNodeState *sj_scan = ((SemiJoinFilterScanNodeState *) (lfirst(list_nth_cell(semijoin_scan, i))));;
+		SemiJoinFilterScanNodeState *sj_scan = ((SemiJoinFilterScanNodeState *) (lfirst(list_nth_cell(semiJoinScan, i))));;
 
 		++i;
 		if (!sjf->done_building && sj_scan->is_building_side)
 		{
-			sjf->done_building = true;
+			if (!sjf->is_parallel)
+			{
+				/*
+				 * not parallel, so only one process running and that process
+				 * is now complete
+				 */
+				sjf->done_building = true;
+			}
+			else
+			{
+				/* parallel, so need to sync with the other processes */
+				SemiJoinFilterParallelState *parallel_state =
+				(SemiJoinFilterParallelState *) dsa_get_address(parallel_area, sjf->parallel_state);
+				bloom_filter *shared_bloom = (bloom_filter *) dsa_get_address(parallel_area, parallel_state->bloom_dsa_address);
+
+				/*
+				 * this process takes control of the lock and updates the
+				 * shared bloom filter. These locks are created by the
+				 * SemiJoinFilterParallelState and are unique to that struct.
+				 */
+				LWLockAcquire(&parallel_state->lock, LW_EXCLUSIVE);
+				parallel_state->elements_added += sjf->elements_added;
+				add_to_filter(shared_bloom, sjf->filter);
+				parallel_state->workers_done++;
+				LWLockRelease(&parallel_state->lock);
+
+				/*
+				 * we need to wait until all threads have had their chance to
+				 * update the shared bloom filter, since our next step is to
+				 * copy the finished bloom filter back into all of the
+				 * separate processes
+				 */
+				if (parallel_state->workers_done == parallel_state->num_processes)
+				{
+					LWLockUpdateVar(&parallel_state->secondlock, &parallel_state->lock_stop, 1);
+				}
+				LWLockWaitForVar(&parallel_state->secondlock, &parallel_state->lock_stop, 0, &parallel_state->lock_stop);
+
+				/*
+				 * now the shared Bloom filter is fully updated, so each
+				 * individual process copies the finished Bloom filter to the
+				 * local SemiJoinFilterJoinNodeState
+				 */
+				LWLockAcquire(&parallel_state->lock, LW_EXCLUSIVE);
+				replace_bitset(sjf->filter, shared_bloom);
+				sjf->elements_added = parallel_state->elements_added;
+				sjf->done_building = true;
+				parallel_state->workers_done++;
+				LWLockRelease(&parallel_state->lock);
+
+				/*
+				 * again, we need to wait for all processes to finish copying
+				 * the completed bloom filter because the main process will
+				 * free the shared memory afterwards
+				 */
+				if (parallel_state->workers_done == 2 * parallel_state->num_processes)
+				{
+					LWLockUpdateVar(&parallel_state->secondlock, &parallel_state->lock_stop, 2);
+				}
+				LWLockWaitForVar(&parallel_state->secondlock, &parallel_state->lock_stop, 1, &parallel_state->lock_stop);
+				/* release allocated shared memory in main process */
+				if (!sjf->is_worker)
+				{
+					LWLockRelease(&parallel_state->secondlock);
+					bloom_free_in_dsa(parallel_area, parallel_state->bloom_dsa_address);
+					dsa_free(parallel_area, sjf->parallel_state);
+				}
+			}
 		}
 	}
 }
diff --git a/src/backend/executor/nodeMergejoin.c b/src/backend/executor/nodeMergejoin.c
index 4d84ede243..f96bc2f9b6 100644
--- a/src/backend/executor/nodeMergejoin.c
+++ b/src/backend/executor/nodeMergejoin.c
@@ -98,6 +98,10 @@
 #include "executor/nodeMergejoin.h"
 #include "lib/bloomfilter.h"
 #include "miscadmin.h"
+#include "storage/dsm.h"
+#include "storage/lwlock.h"
+#include "storage/shm_toc.h"
+#include "utils/dsa.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 
@@ -657,6 +661,23 @@ ExecMergeJoin(PlanState *pstate)
 				outerTupleSlot = ExecProcNode(outerPlan);
 				node->mj_OuterTupleSlot = outerTupleSlot;
 
+				/*
+				 * Check if outer plan has an SJF and the inner plan does not.
+				 * This case will only arise during parallel execution, when
+				 * the outer plan is initialized with the SJF but the inner
+				 * plan does not because it is not included in the memory
+				 * copied over during worker creation. If this is the case,
+				 * then push down the filter to the inner plan level to
+				 * correct this error and then proceed as normal.
+				 */
+				if (GetSemiJoinFilter(outerPlan, pstate->plan->plan_node_id)
+					&& !GetSemiJoinFilter(innerPlan, pstate->plan->plan_node_id))
+				{
+					SemiJoinFilterJoinNodeState *sjf = GetSemiJoinFilter(outerPlan, pstate->plan->plan_node_id);
+
+					PushDownFilter(innerPlan, sjf, sjf->checking_node_id, NULL);
+				}
+
 				/* Compute join values and check for unmatchability */
 				switch (MJEvalOuterValues(node))
 				{
@@ -1805,3 +1826,72 @@ PushDownFilter(PlanState *node, SemiJoinFilterJoinNodeState * sjf, int target_no
 		PushDownFilter(node->righttree, sjf, target_node_id, plan_rows);
 	}
 }
+
+dsa_pointer
+CreateFilterParallelState(dsa_area *area, SemiJoinFilterJoinNodeState * sjf, int sjf_num)
+{
+	dsa_pointer bloom_dsa_address = bloom_create_in_dsa(area, sjf->num_elements,
+														sjf->work_mem, sjf->seed);
+	dsa_pointer parallel_address = dsa_allocate0(area, sizeof(SemiJoinFilterParallelState));
+	SemiJoinFilterParallelState *parallel_state = (SemiJoinFilterParallelState *) dsa_get_address(area, parallel_address);
+
+	/* copy over information to parallel state */
+	parallel_state->done_building = sjf->done_building;
+	parallel_state->seed = sjf->seed;
+	parallel_state->num_elements = sjf->num_elements;
+	parallel_state->work_mem = sjf->work_mem;
+	parallel_state->bloom_dsa_address = bloom_dsa_address;
+	parallel_state->sjf_num = sjf_num;
+	parallel_state->mergejoin_plan_id = sjf->mergejoin_plan_id;
+	/* initialize locks */
+	LWLockInitialize(&parallel_state->lock, LWLockNewTrancheId());
+	LWLockInitialize(&parallel_state->secondlock, LWLockNewTrancheId());
+	/* should be main process that acquires lock */
+	LWLockAcquire(&parallel_state->secondlock, LW_EXCLUSIVE);
+	return parallel_address;
+}
+
+/* Checks a side of the execution tree and fetches an SJF if its mergejoin plan ID matches that of the method's mergejoin ID.
+ * Used during parallel execution, where SJF information is lost during information copying to the worker.
+ */
+SemiJoinFilterJoinNodeState *
+GetSemiJoinFilter(PlanState *node, int plan_id)
+{
+	if (node == NULL)
+	{
+		return NULL;
+	}
+	check_stack_depth();
+	if (node->type == T_SeqScanState)
+	{
+		SeqScanState *scan = (SeqScanState *) node;
+
+		Assert(IsA(scan, SeqScanState));
+		if (scan->apply_semijoin_filter)
+		{
+			ListCell   *lc;
+
+			foreach(lc, scan->semijoin_filters)
+			{
+				SemiJoinFilterJoinNodeState *sjf = (SemiJoinFilterJoinNodeState *) lfirst(lc);
+
+				if (sjf->mergejoin_plan_id == plan_id)
+				{
+					return sjf;
+				}
+			}
+			return NULL;
+		}
+	}
+	if (PushDownDirection(node) == 1)
+	{
+		/* check both children and return the non-null one */
+		return GetSemiJoinFilter(node->lefttree, plan_id) != NULL ? GetSemiJoinFilter(node->lefttree, plan_id)
+			: GetSemiJoinFilter(node->righttree, plan_id);
+	}
+	if (PushDownDirection(node) == 0)
+	{
+		return GetSemiJoinFilter(node->lefttree, plan_id);
+	}
+	return NULL;
+}
diff --git a/src/backend/executor/nodeSeqscan.c b/src/backend/executor/nodeSeqscan.c
index 8f8cc36bc9..0a5268b429 100644
--- a/src/backend/executor/nodeSeqscan.c
+++ b/src/backend/executor/nodeSeqscan.c
@@ -30,8 +30,15 @@
 #include "access/relscan.h"
 #include "access/tableam.h"
 #include "executor/execdebug.h"
+#include "executor/nodeMergejoin.h"
 #include "executor/nodeSeqscan.h"
 #include "utils/rel.h"
+#include "storage/lwlock.h"
+#include "storage/shm_toc.h"
+#include <unistd.h>
+
+/* Magic number for location of shared dsa pointer if scan is using a semi-join filter */
+#define DSA_LOCATION_KEY_FOR_SJF	UINT64CONST(0xE00000000000FFFF)
 
 static TupleTableSlot *SeqNext(SeqScanState *node);
 
@@ -275,6 +282,19 @@ ExecSeqScanEstimate(SeqScanState *node,
 												  estate->es_snapshot);
 	shm_toc_estimate_chunk(&pcxt->estimator, node->pscan_len);
 	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
+	/*
+	 * Estimate space for extra dsa_pointer address for when parallel
+	 * sequential scans use a semi-join filter.
+	 */
+	if (node->ss.ps.plan->parallel_aware && node->apply_semijoin_filter)
+	{
+		shm_toc_estimate_keys(&pcxt->estimator, 1);
+		if (node->semijoin_filters)
+		{
+			shm_toc_estimate_keys(&pcxt->estimator, sizeof(dsa_pointer) * list_length(node->semijoin_filters));
+		}
+	}
 }
 
 /* ----------------------------------------------------------------
@@ -290,6 +310,50 @@ ExecSeqScanInitializeDSM(SeqScanState *node,
 	EState	   *estate = node->ss.ps.state;
 	ParallelTableScanDesc pscan;
 
+	/*
+	 * If scan is using a semi-join filter, then initialize dsa pointer of
+	 * shared sjf
+	 */
+	if (node->apply_semijoin_filter)
+	{
+		int			sjf_num = list_length(node->semijoin_filters);
+		dsa_pointer *dsa_pointer_address;	/* actuallly an array of size
+											 * sjf_num */
+		ListCell   *lc;
+		int			i = 0;
+
+		dsa_pointer_address = (dsa_pointer *) shm_toc_allocate(pcxt->toc, sizeof(dsa_pointer) * sjf_num);
+		foreach(lc, node->semijoin_filters)
+		{
+			SemiJoinFilterJoinNodeState *sjf = (SemiJoinFilterJoinNodeState *) (lfirst(lc));
+			SemiJoinFilterParallelState *parallel_state;
+			dsa_area   *area = node->ss.ps.state->es_query_dsa;
+
+			sjf->parallel_state = CreateFilterParallelState(area, sjf, sjf_num);
+			sjf->is_parallel = true;
+			/* check if main process always will run */
+			parallel_state = (SemiJoinFilterParallelState *) dsa_get_address(area, sjf->parallel_state);
+			parallel_state->num_processes = 1;
+			/* update parallel_state with built bloom filter */
+			if (sjf->done_building && node->ss.ps.plan->plan_node_id == sjf->checking_node_id)
+			{
+				bloom_filter *parallel_bloom = (bloom_filter *) dsa_get_address(area, parallel_state->bloom_dsa_address);
+
+				replace_bitset(parallel_bloom, sjf->filter);
+				LWLockRelease(&parallel_state->secondlock);
+			}
+			dsa_pointer_address[i] = sjf->parallel_state;
+			i++;
+		}
+
+		/*
+		 * add plan_id to magic number so this is also unique for each plan
+		 * node
+		 */
+		shm_toc_insert(pcxt->toc, DSA_LOCATION_KEY_FOR_SJF +
+					   node->ss.ps.plan->plan_node_id, dsa_pointer_address);
+	}
+
 	pscan = shm_toc_allocate(pcxt->toc, node->pscan_len);
 	table_parallelscan_initialize(node->ss.ss_currentRelation,
 								  pscan,
@@ -327,6 +391,91 @@ ExecSeqScanInitializeWorker(SeqScanState *node,
 {
 	ParallelTableScanDesc pscan;
 
+	/*
+	 * Create worker's semi-join filter for merge join, if using it. We first
+	 * need to check shm_toc to see if a sjf exists, then create the local
+	 * backend sjf.
+	 */
+	if (shm_toc_lookup(pwcxt->toc, DSA_LOCATION_KEY_FOR_SJF + node->ss.ps.plan->plan_node_id, 1))
+	{
+		dsa_pointer *parallel_addresses = (dsa_pointer *)
+		shm_toc_lookup(pwcxt->toc, DSA_LOCATION_KEY_FOR_SJF + node->ss.ps.plan->plan_node_id, 1);
+
+		/*
+		 * we know that there is at least one sjf, we will update accordingly
+		 * if the parallel state says there is more (this avoids using an
+		 * additional shm_toc allocation)
+		 */
+		int			sjf_num = 1;
+
+		/*
+		 * If a copy of any sjf already exists on the backend, we want to free
+		 * it and create a new one.
+		 */
+		if (node->apply_semijoin_filter)
+		{
+			while (list_length(node->semijoin_filters) > 0)
+			{
+				SemiJoinFilterJoinNodeState *sjf = (SemiJoinFilterJoinNodeState *) (list_head(node->semijoin_filters)->ptr_value);
+
+				node->semijoin_filters = list_delete_nth_cell(node->semijoin_filters, 0);
+				FreeSemiJoinFilter(sjf);
+			}
+		}
+
+		/*
+		 * Here, we create the process-local SJF's, which will later be
+		 * combined into the single SJF after all parallel work is done.
+		 */
+		for (int i = 0; i < sjf_num; i++)
+		{
+			dsa_pointer parallel_address = parallel_addresses[i];
+			SemiJoinFilterParallelState *parallel_state = (SemiJoinFilterParallelState *)
+			dsa_get_address(node->ss.ps.state->es_query_dsa, parallel_address);
+			SemiJoinFilterJoinNodeState *sjf;
+			MemoryContext oldContext;
+
+			sjf_num = parallel_state->sjf_num;
+			oldContext = MemoryContextSwitchTo(GetMemoryChunkContext(node));
+			sjf = (SemiJoinFilterJoinNodeState *) palloc0(sizeof(SemiJoinFilterJoinNodeState));
+			sjf->filter = bloom_create(parallel_state->num_elements, parallel_state->work_mem,
+									   parallel_state->seed);
+			sjf->building_node_id = parallel_state->building_node_id;
+			sjf->checking_node_id = parallel_state->checking_node_id;
+			sjf->seed = parallel_state->seed;
+			sjf->is_parallel = true;
+			sjf->is_worker = true;
+			sjf->done_building = parallel_state->done_building;
+			sjf->parallel_state = parallel_address;
+			node->apply_semijoin_filter = true;
+			node->semijoin_filters = lappend(node->semijoin_filters, (void *) sjf);
+			sjf->mergejoin_plan_id = parallel_state->mergejoin_plan_id;
+			/* copy over bloom filter if already built */
+			if (sjf->done_building && parallel_state->checking_node_id == node->ss.ps.plan->plan_node_id)
+			{
+				SemiJoinFilterParallelState *copy_parallel_state = (SemiJoinFilterParallelState *)
+				dsa_get_address(node->ss.ps.state->es_query_dsa, sjf->parallel_state);
+				bloom_filter *shared_bloom = (bloom_filter *) dsa_get_address(
+																			  node->ss.ps.state->es_query_dsa, copy_parallel_state->bloom_dsa_address);
+
+				replace_bitset(sjf->filter, shared_bloom);
+			}
+			else if (!sjf->done_building && parallel_state->building_node_id == node->ss.ps.plan->plan_node_id)
+			{
+				/*
+				 * Add this process to number of scan processes, need to use
+				 * lock in case of multiple workers updating at same time. We
+				 * want to avoid using the planned number of workers because
+				 * that can be wrong.
+				 */
+				LWLockAcquire(&parallel_state->lock, LW_EXCLUSIVE);
+				parallel_state->num_processes += 1;
+				LWLockRelease(&parallel_state->lock);
+			}
+			MemoryContextSwitchTo(oldContext);
+		}
+	}
+
 	pscan = shm_toc_lookup(pwcxt->toc, node->ss.ps.plan->plan_node_id, false);
 	node->ss.ss_currentScanDesc =
 		table_beginscan_parallel(node->ss.ss_currentRelation, pscan);
diff --git a/src/backend/lib/bloomfilter.c b/src/backend/lib/bloomfilter.c
index 3ef67d35ac..743bd7388a 100644
--- a/src/backend/lib/bloomfilter.c
+++ b/src/backend/lib/bloomfilter.c
@@ -292,3 +292,77 @@ mod_m(uint32 val, uint64 m)
 
 	return val & (m - 1);
 }
+
+/*
+ * Create Bloom filter in dsa shared memory
+ */
+dsa_pointer
+bloom_create_in_dsa(dsa_area *area, int64 total_elems, int bloom_work_mem, uint64 seed)
+{
+	dsa_pointer filter_dsa_address;
+	bloom_filter *filter;
+	int			bloom_power;
+	uint64		bitset_bytes;
+	uint64		bitset_bits;
+
+	/*
+	 * Aim for two bytes per element; this is sufficient to get a false
+	 * positive rate below 1%, independent of the size of the bitset or total
+	 * number of elements.  Also, if rounding down the size of the bitset to
+	 * the next lowest power of two turns out to be a significant drop, the
+	 * false positive rate still won't exceed 2% in almost all cases.
+	 */
+	bitset_bytes = Min(bloom_work_mem * UINT64CONST(1024), total_elems * 2);
+	bitset_bytes = Max(1024 * 1024, bitset_bytes);
+
+	/*
+	 * Size in bits should be the highest power of two <= target.  bitset_bits
+	 * is uint64 because PG_UINT32_MAX is 2^32 - 1, not 2^32
+	 */
+	bloom_power = my_bloom_power(bitset_bytes * BITS_PER_BYTE);
+	bitset_bits = UINT64CONST(1) << bloom_power;
+	bitset_bytes = bitset_bits / BITS_PER_BYTE;
+
+	/* Allocate bloom filter with unset bitset */
+	filter_dsa_address = dsa_allocate0(area, offsetof(bloom_filter, bitset) +
+									   sizeof(unsigned char) * bitset_bytes);
+	filter = (bloom_filter *) dsa_get_address(area, filter_dsa_address);
+	filter->k_hash_funcs = optimal_k(bitset_bits, total_elems);
+	filter->seed = seed;
+	filter->m = bitset_bits;
+
+	return filter_dsa_address;
+}
+
+/*
+ * Free Bloom filter in dsa shared memory
+ */
+void
+bloom_free_in_dsa(dsa_area *area, dsa_pointer filter_dsa_address)
+{
+	dsa_free(area, filter_dsa_address);
+}
+
+/*
+ * Add secondary filter to main filter, essentially "combining" the two filters together.
+ * This happens in-place, with main_filter being the combined filter.
+ * Both filters must have the same seed and size for this to work.
+ */
+void
+add_to_filter(bloom_filter *main_filter, bloom_filter *to_add)
+{
+	Assert(main_filter->seed == to_add->seed);
+	Assert(main_filter->m == to_add->m);
+	/* m is in bits not bytes */
+	for (int i = 0; i < main_filter->m / BITS_PER_BYTE; i++)
+	{
+		main_filter->bitset[i] = main_filter->bitset[i] | to_add->bitset[i];
+	}
+}
+
+void
+replace_bitset(bloom_filter *main_filter, bloom_filter *overriding_filter)
+{
+	Assert(main_filter->m == overriding_filter->m);
+	memcpy(&main_filter->bitset, &overriding_filter->bitset, main_filter->m / BITS_PER_BYTE);
+}
diff --git a/src/include/executor/nodeMergejoin.h b/src/include/executor/nodeMergejoin.h
index c922fbe297..2116cb8fcd 100644
--- a/src/include/executor/nodeMergejoin.h
+++ b/src/include/executor/nodeMergejoin.h
@@ -22,5 +22,7 @@ extern void ExecReScanMergeJoin(MergeJoinState *node);
 extern void FreeSemiJoinFilter(SemiJoinFilterJoinNodeState * sjf);
 extern int	PushDownDirection(PlanState *node);
 extern void PushDownFilter(PlanState *node, SemiJoinFilterJoinNodeState * sjf, int target_node_id, int64 *plan_rows);
+extern dsa_pointer CreateFilterParallelState(dsa_area *area, SemiJoinFilterJoinNodeState * sjf, int sjf_num);
+extern SemiJoinFilterJoinNodeState * GetSemiJoinFilter(PlanState *node, int plan_id);
 
 #endif							/* NODEMERGEJOIN_H */
diff --git a/src/include/lib/bloomfilter.h b/src/include/lib/bloomfilter.h
index 8146d8e7fd..3b5d1821a5 100644
--- a/src/include/lib/bloomfilter.h
+++ b/src/include/lib/bloomfilter.h
@@ -13,15 +13,22 @@
 #ifndef BLOOMFILTER_H
 #define BLOOMFILTER_H
 
+#include "utils/dsa.h"
+
 typedef struct bloom_filter bloom_filter;
 
 extern bloom_filter *bloom_create(int64 total_elems, int bloom_work_mem,
 								  uint64 seed);
 extern void bloom_free(bloom_filter *filter);
+extern dsa_pointer bloom_create_in_dsa(dsa_area *area, int64 total_elems,
+									   int bloom_work_mem, uint64 seed);
+extern void bloom_free_in_dsa(dsa_area *area, dsa_pointer filter_dsa_address);
 extern void bloom_add_element(bloom_filter *filter, unsigned char *elem,
 							  size_t len);
 extern bool bloom_lacks_element(bloom_filter *filter, unsigned char *elem,
 								size_t len);
 extern double bloom_prop_bits_set(bloom_filter *filter);
+extern void add_to_filter(bloom_filter *main_filter, bloom_filter *to_add);
+extern void replace_bitset(bloom_filter *main_filter, bloom_filter *overriding_filter);
 
 #endif							/* BLOOMFILTER_H */
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 9a4c3a7792..6e25dabc0b 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -48,6 +48,8 @@
 #include "utils/sortsupport.h"
 #include "utils/tuplesort.h"
 #include "utils/tuplestore.h"
+#include "storage/shm_toc.h"
+#include "storage/lwlock.h"
 
 struct PlanState;				/* forward references in this file */
 struct ParallelHashJoinState;
@@ -2023,8 +2025,40 @@ typedef struct SemiJoinFilterJoinNodeState
 	int			elements_checked;
 	int			elements_filtered;
 	int			mergejoin_plan_id;
+	/* Parallel information */
+	dsa_pointer parallel_state;
+	bool		is_parallel;
+	bool		is_worker;
 }			SemiJoinFilterJoinNodeState;
 
+/* ----------------
+ *	 SemiJoinFilterParallelState information
+ *
+ *		SemiJoinFilter information used in parallel workers.
+ * ----------------
+ */
+typedef struct SemiJoinFilterParallelState
+{
+	/* bloom filter information */
+	uint64		seed;
+	int64		num_elements;
+	int			work_mem;
+	dsa_pointer bloom_dsa_address;
+	int			elements_added;
+	/* information for parallelization and locking */
+	bool		done_building;
+	int			workers_done;
+	int			num_processes;
+	uint64		lock_stop;
+	LWLock		lock;
+	LWLock		secondlock;
+	int			sjf_num;
+	int			mergejoin_plan_id;
+
+	int			building_node_id;
+	int			checking_node_id;
+}			SemiJoinFilterParallelState;
+
 /* ----------------
  *	 MergeJoinState information
  *
-- 
2.37.1

v1-0005-Add-basic-regression-tests-for-semijoin-filter.patchapplication/octet-stream; name=v1-0005-Add-basic-regression-tests-for-semijoin-filter.patchDownload

From 554372eb50fff8afa5eb7882561d9eed01e1a731 Mon Sep 17 00:00:00 2001
From: Lyu Pan <lyu.steve.pan@gmail.com>
Date: Wed, 12 Oct 2022 18:52:08 +0000
Subject: [PATCH 5/5] Add basic regression tests for semijoin filter.

For now, only some basic sqls are tested. Future work will add more regression test cases.

1. Test that when force_mergejoin_semijoin_filter is ON, semijoin filter is used.
2. Test that semijoin filter works in two-table inner merge join.
3. Test that semijoin filter can be pushed throudh SORT node.
4. Test that semijoin filter works in a basic three-table-merge-join.
5. Test that semijoin filter works in basic parallel queries.
---
 .../expected/mergejoin_semijoinfilter.out     | 480 ++++++++++++++++++
 src/test/regress/parallel_schedule            |   1 +
 .../regress/sql/mergejoin_semijoinfilter.sql  | 111 ++++
 3 files changed, 592 insertions(+)
 create mode 100644 src/test/regress/expected/mergejoin_semijoinfilter.out
 create mode 100644 src/test/regress/sql/mergejoin_semijoinfilter.sql

diff --git a/src/test/regress/expected/mergejoin_semijoinfilter.out b/src/test/regress/expected/mergejoin_semijoinfilter.out
new file mode 100644
index 0000000000..4506720897
--- /dev/null
+++ b/src/test/regress/expected/mergejoin_semijoinfilter.out
@@ -0,0 +1,480 @@
+SET enable_hashjoin = OFF;
+SET enable_nestloop = OFF;
+SET enable_mergejoin = ON;
+SET enable_mergejoin_semijoin_filter = ON;
+SET force_mergejoin_semijoin_filter = ON;
+CREATE TABLE t1 (
+  i integer,
+  j integer
+);
+CREATE TABLE t2 (
+  i integer,
+  k integer
+);
+CREATE TABLE t3 (
+  i integer,
+  m integer
+);
+insert into t1 (i, j)
+  select
+    generate_series(1,100000) as i,
+    generate_series(1,100000) as j;
+insert into t2 (i, k)
+  select
+    generate_series(1,100000) as i,
+    generate_series(1,100000) as k;
+insert into t3 (i, m)
+  select
+    generate_series(1,100000) as i,
+    generate_series(1,100000) as m;
+-- Semijoin filter is not used when force_mergejoin_semijoin_filter is OFF.
+SET force_mergejoin_semijoin_filter = OFF;
+EXPLAIN (VERBOSE, COSTS OFF) SELECT COUNT(*) FROM t1 JOIN t2 ON t1.i = t2.i;
+               QUERY PLAN                
+-----------------------------------------
+ Aggregate
+   Output: count(*)
+   ->  Merge Join
+         Merge Cond: (t1.i = t2.i)
+         ->  Sort
+               Output: t1.i
+               Sort Key: t1.i
+               ->  Seq Scan on public.t1
+                     Output: t1.i
+         ->  Sort
+               Output: t2.i
+               Sort Key: t2.i
+               ->  Seq Scan on public.t2
+                     Output: t2.i
+(14 rows)
+
+SET force_mergejoin_semijoin_filter = ON;
+-- One level of inner mergejoin: push semi-join filter to outer scan.
+EXPLAIN (VERBOSE, COSTS OFF) SELECT COUNT(*) FROM t1 JOIN t2 ON t1.i = t2.i;
+                           QUERY PLAN                            
+-----------------------------------------------------------------
+ Aggregate
+   Output: count(*)
+   ->  Merge Join
+         Merge Cond: (t1.i = t2.i)
+         SemiJoin Filter Created Based on: (t1.i = t2.i)
+         SemiJoin Estimated Filtering Rate: -0.0100
+         ->  Sort
+               Output: t1.i
+               Sort Key: t1.i
+               ->  Seq Scan on public.t1
+                     Semijoin Filter building based on: { t1.i }
+                     Output: t1.i
+         ->  Sort
+               Output: t2.i
+               Sort Key: t2.i
+               ->  Seq Scan on public.t2
+                     Semijoin Filter checking based on: { t2.i }
+                     Output: t2.i
+(18 rows)
+
+SELECT COUNT(*) FROM t1 JOIN t2 ON t1.i = t2.i;
+ count  
+--------
+ 100000
+(1 row)
+
+-- Push semijoin filter through SORT node.
+EXPLAIN (VERBOSE, COSTS OFF) SELECT COUNT(*) FROM (SELECT DISTINCT t1.i FROM t1 ORDER BY t1.i) x JOIN t2 ON x.i = t2.i;
+                             QUERY PLAN                             
+--------------------------------------------------------------------
+ Aggregate
+   Output: count(*)
+   ->  Merge Join
+         Merge Cond: (t1.i = t2.i)
+         SemiJoin Filter Created Based on: (t1.i = t2.i)
+         SemiJoin Estimated Filtering Rate: -0.0100
+         ->  Sort
+               Output: t1.i
+               Sort Key: t1.i
+               ->  HashAggregate
+                     Output: t1.i
+                     Group Key: t1.i
+                     ->  Seq Scan on public.t1
+                           Semijoin Filter building based on: { i }
+                           Output: t1.i, t1.j
+         ->  Sort
+               Output: t2.i
+               Sort Key: t2.i
+               ->  Seq Scan on public.t2
+                     Semijoin Filter checking based on: { t2.i }
+                     Output: t2.i
+(21 rows)
+
+SELECT COUNT(*) FROM (SELECT DISTINCT t1.i FROM t1 ORDER BY t1.i) x JOIN t2 ON x.i = t2.i;
+ count  
+--------
+ 100000
+(1 row)
+
+-- Push semijoin filter through LIMIT node: not supported.
+EXPLAIN (VERBOSE, COSTS OFF) SELECT COUNT(*) FROM (SELECT DISTINCT t1.i FROM t1 ORDER BY t1.i LIMIT 1) x JOIN t2 ON x.i = t2.i;
+                     QUERY PLAN                      
+-----------------------------------------------------
+ Aggregate
+   Output: count(*)
+   ->  Merge Join
+         Merge Cond: (t1.i = t2.i)
+         ->  Limit
+               Output: t1.i
+               ->  Sort
+                     Output: t1.i
+                     Sort Key: t1.i
+                     ->  HashAggregate
+                           Output: t1.i
+                           Group Key: t1.i
+                           ->  Seq Scan on public.t1
+                                 Output: t1.i, t1.j
+         ->  Sort
+               Output: t2.i
+               Sort Key: t2.i
+               ->  Seq Scan on public.t2
+                     Output: t2.i
+(19 rows)
+
+-- Push SJF through MAX: not supported.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT COUNT(*) FROM (SELECT MAX(t1.i) AS i, t1.j FROM t1 GROUP BY t1.j) x JOIN t2 ON x.i = t2.i;
+                     QUERY PLAN                      
+-----------------------------------------------------
+ Aggregate
+   Output: count(*)
+   ->  Merge Join
+         Merge Cond: (x.i = t2.i)
+         ->  Sort
+               Output: x.i
+               Sort Key: x.i
+               ->  Subquery Scan on x
+                     Output: x.i
+                     ->  HashAggregate
+                           Output: max(t1.i), t1.j
+                           Group Key: t1.j
+                           ->  Seq Scan on public.t1
+                                 Output: t1.i, t1.j
+         ->  Sort
+               Output: t2.i
+               Sort Key: t2.i
+               ->  Seq Scan on public.t2
+                     Output: t2.i
+(19 rows)
+
+-- SJF with UNION: not supported.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT COUNT(*) FROM (SELECT t1.i FROM t1 UNION ALL SELECT t2.i FROM t2) x JOIN t3 ON x.i = t3.i;
+                                    QUERY PLAN                                    
+----------------------------------------------------------------------------------
+ Finalize Aggregate
+   Output: count(*)
+   ->  Gather
+         Output: (PARTIAL count(*))
+         Workers Planned: 2
+         ->  Partial Aggregate
+               Output: PARTIAL count(*)
+               ->  Merge Join
+                     Merge Cond: (t1.i = t3.i)
+                     SemiJoin Filter Created Based on: (t1.i = t3.i)
+                     SemiJoin Estimated Filtering Rate: -0.0100
+                     ->  Sort
+                           Output: t1.i
+                           Sort Key: t1.i
+                           ->  Parallel Append
+                                 ->  Parallel Seq Scan on public.t1
+                                       Output: t1.i
+                                 ->  Parallel Seq Scan on public.t2
+                                       Semijoin Filter building based on: { x.i }
+                                       Output: t2.i
+                     ->  Sort
+                           Output: t3.i
+                           Sort Key: t3.i
+                           ->  Seq Scan on public.t3
+                                 Semijoin Filter checking based on: { t3.i }
+                                 Output: t3.i
+(26 rows)
+
+-- Join clause is an expression: not supported.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT COUNT(*) FROM t1 JOIN t2 ON 1 + t1.i = t2.i;
+                  QUERY PLAN                  
+----------------------------------------------
+ Aggregate
+   Output: count(*)
+   ->  Merge Join
+         Merge Cond: (((1 + t1.i)) = t2.i)
+         ->  Sort
+               Output: t1.i, ((1 + t1.i))
+               Sort Key: ((1 + t1.i))
+               ->  Seq Scan on public.t1
+                     Output: t1.i, (1 + t1.i)
+         ->  Sort
+               Output: t2.i
+               Sort Key: t2.i
+               ->  Seq Scan on public.t2
+                     Output: t2.i
+(14 rows)
+
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT COUNT(*) FROM t1 JOIN t2 ON t1.i = 2 * t2.i;
+                  QUERY PLAN                  
+----------------------------------------------
+ Aggregate
+   Output: count(*)
+   ->  Merge Join
+         Merge Cond: (t1.i = ((2 * t2.i)))
+         ->  Sort
+               Output: t1.i
+               Sort Key: t1.i
+               ->  Seq Scan on public.t1
+                     Output: t1.i
+         ->  Sort
+               Output: t2.i, ((2 * t2.i))
+               Sort Key: ((2 * t2.i))
+               ->  Seq Scan on public.t2
+                     Output: t2.i, (2 * t2.i)
+(14 rows)
+
+-- Two levels of MergeJoin
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT COUNT(*) FROM t1 JOIN t2 ON t1.i = t2.i JOIN t3 ON t2.i = t3.i;
+                                 QUERY PLAN                                  
+-----------------------------------------------------------------------------
+ Aggregate
+   Output: count(*)
+   ->  Merge Join
+         Merge Cond: (t3.i = t1.i)
+         SemiJoin Filter Created Based on: (t3.i = t1.i)
+         SemiJoin Estimated Filtering Rate: -0.0100
+         ->  Sort
+               Output: t3.i
+               Sort Key: t3.i
+               ->  Seq Scan on public.t3
+                     Semijoin Filter building based on: { t3.i }
+                     Output: t3.i
+         ->  Materialize
+               Output: t1.i, t2.i
+               ->  Merge Join
+                     Output: t1.i, t2.i
+                     Merge Cond: (t1.i = t2.i)
+                     SemiJoin Filter Created Based on: (t1.i = t2.i)
+                     SemiJoin Estimated Filtering Rate: -0.0100
+                     ->  Sort
+                           Output: t1.i
+                           Sort Key: t1.i
+                           ->  Seq Scan on public.t1
+                                 Semijoin Filter building based on: { t1.i }
+                                 Semijoin Filter checking based on: { t1.i }
+                                 Output: t1.i
+                     ->  Sort
+                           Output: t2.i
+                           Sort Key: t2.i
+                           ->  Seq Scan on public.t2
+                                 Semijoin Filter checking based on: { t2.i }
+                                 Output: t2.i
+(32 rows)
+
+SELECT COUNT(*) FROM t1 JOIN t2 ON t1.i = t2.i JOIN t3 ON t2.i = t3.i;
+ count  
+--------
+ 100000
+(1 row)
+
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT COUNT(*) FROM t1 JOIN (t2 JOIN t3 ON t2.i = t3.i) ON t1.i = t2.i;
+                                 QUERY PLAN                                  
+-----------------------------------------------------------------------------
+ Aggregate
+   Output: count(*)
+   ->  Merge Join
+         Merge Cond: (t3.i = t2.i)
+         SemiJoin Filter Created Based on: (t3.i = t2.i)
+         SemiJoin Estimated Filtering Rate: -0.0100
+         ->  Sort
+               Output: t3.i
+               Sort Key: t3.i
+               ->  Seq Scan on public.t3
+                     Semijoin Filter building based on: { t3.i }
+                     Output: t3.i
+         ->  Materialize
+               Output: t1.i, t2.i
+               ->  Merge Join
+                     Output: t1.i, t2.i
+                     Merge Cond: (t1.i = t2.i)
+                     SemiJoin Filter Created Based on: (t1.i = t2.i)
+                     SemiJoin Estimated Filtering Rate: -0.0100
+                     ->  Sort
+                           Output: t1.i
+                           Sort Key: t1.i
+                           ->  Seq Scan on public.t1
+                                 Semijoin Filter building based on: { t1.i }
+                                 Output: t1.i
+                     ->  Sort
+                           Output: t2.i
+                           Sort Key: t2.i
+                           ->  Seq Scan on public.t2
+                                 Semijoin Filter checking based on: { t2.i }
+                                 Semijoin Filter checking based on: { t2.i }
+                                 Output: t2.i
+(32 rows)
+
+SELECT COUNT(*) FROM t1 JOIN (t2 JOIN t3 ON t2.i = t3.i) ON t1.i = t2.i;
+ count  
+--------
+ 100000
+(1 row)
+
+EXPLAIN (VERBOSE, COSTS OFF) 
+SELECT COUNT(*) FROM (SELECT * FROM t1 JOIN t2 ON t1.i = t2.i) x JOIN t2 y ON x.k = y.k;
+                                    QUERY PLAN                                     
+-----------------------------------------------------------------------------------
+ Aggregate
+   Output: count(*)
+   ->  Merge Join
+         Merge Cond: (y.k = t2.k)
+         SemiJoin Filter Created Based on: (y.k = t2.k)
+         SemiJoin Estimated Filtering Rate: -0.0100
+         ->  Sort
+               Output: y.k
+               Sort Key: y.k
+               ->  Seq Scan on public.t2 y
+                     Semijoin Filter building based on: { y.k }
+                     Output: y.k
+         ->  Materialize
+               Output: t2.k
+               ->  Sort
+                     Output: t2.k
+                     Sort Key: t2.k
+                     ->  Merge Join
+                           Output: t2.k
+                           Merge Cond: (t1.i = t2.i)
+                           SemiJoin Filter Created Based on: (t1.i = t2.i)
+                           SemiJoin Estimated Filtering Rate: -0.0100
+                           ->  Sort
+                                 Output: t1.i
+                                 Sort Key: t1.i
+                                 ->  Seq Scan on public.t1
+                                       Semijoin Filter building based on: { t1.i }
+                                       Output: t1.i
+                           ->  Sort
+                                 Output: t2.i, t2.k
+                                 Sort Key: t2.i
+                                 ->  Seq Scan on public.t2
+                                       Semijoin Filter checking based on: { t2.i }
+                                       Semijoin Filter checking based on: { t2.k }
+                                       Output: t2.i, t2.k
+(35 rows)
+
+SELECT COUNT(*) FROM (SELECT * FROM t1 JOIN t2 ON t1.i = t2.i) x JOIN t2 y ON x.k = y.k;
+ count  
+--------
+ 100000
+(1 row)
+
+-- Test Parallel Query
+SET max_parallel_workers_per_gather = 8;
+SET max_parallel_workers = 8;
+SET min_parallel_table_scan_size = 0;
+ALTER TABLE t1 SET (parallel_workers = 8);
+ALTER TABLE t2 SET (parallel_workers = 8);
+ALTER TABLE t3 SET (parallel_workers = 8);
+-- one level of merge join
+-- inner join: push bloom filter to outer scan
+EXPLAIN (VERBOSE, COSTS OFF) SELECT COUNT(*) FROM t1 JOIN t2 ON t1.i = t2.i;
+                                 QUERY PLAN                                  
+-----------------------------------------------------------------------------
+ Finalize Aggregate
+   Output: count(*)
+   ->  Gather
+         Output: (PARTIAL count(*))
+         Workers Planned: 8
+         ->  Partial Aggregate
+               Output: PARTIAL count(*)
+               ->  Merge Join
+                     Merge Cond: (t1.i = t2.i)
+                     SemiJoin Filter Created Based on: (t1.i = t2.i)
+                     SemiJoin Estimated Filtering Rate: -0.0100
+                     ->  Sort
+                           Output: t1.i
+                           Sort Key: t1.i
+                           ->  Parallel Seq Scan on public.t1
+                                 Semijoin Filter building based on: { t1.i }
+                                 Output: t1.i
+                     ->  Sort
+                           Output: t2.i
+                           Sort Key: t2.i
+                           ->  Seq Scan on public.t2
+                                 Semijoin Filter checking based on: { t2.i }
+                                 Output: t2.i
+(23 rows)
+
+SELECT COUNT(*) FROM t1 JOIN t2 ON t1.i = t2.i;
+ count  
+--------
+ 100000
+(1 row)
+
+-- Two levels of  merge join
+EXPLAIN (VERBOSE, COSTS OFF) 
+SELECT COUNT(*) FROM (SELECT * FROM t1 JOIN t2 ON t1.i = t2.i) x JOIN t2 y ON x.k = y.k;
+                                       QUERY PLAN                                        
+-----------------------------------------------------------------------------------------
+ Finalize Aggregate
+   Output: count(*)
+   ->  Gather
+         Output: (PARTIAL count(*))
+         Workers Planned: 8
+         ->  Partial Aggregate
+               Output: PARTIAL count(*)
+               ->  Merge Join
+                     Merge Cond: (t2.k = y.k)
+                     SemiJoin Filter Created Based on: (t2.k = y.k)
+                     SemiJoin Estimated Filtering Rate: -0.0100
+                     ->  Sort
+                           Output: t2.k
+                           Sort Key: t2.k
+                           ->  Merge Join
+                                 Output: t2.k
+                                 Merge Cond: (t1.i = t2.i)
+                                 SemiJoin Filter Created Based on: (t1.i = t2.i)
+                                 SemiJoin Estimated Filtering Rate: -0.0100
+                                 ->  Sort
+                                       Output: t1.i
+                                       Sort Key: t1.i
+                                       ->  Parallel Seq Scan on public.t1
+                                             Semijoin Filter building based on: { t1.i }
+                                             Output: t1.i
+                                 ->  Sort
+                                       Output: t2.i, t2.k
+                                       Sort Key: t2.i
+                                       ->  Seq Scan on public.t2
+                                             Semijoin Filter checking based on: { t2.i }
+                                             Semijoin Filter building based on: { t2.k }
+                                             Output: t2.i, t2.k
+                     ->  Sort
+                           Output: y.k
+                           Sort Key: y.k
+                           ->  Seq Scan on public.t2 y
+                                 Semijoin Filter checking based on: { y.k }
+                                 Output: y.k
+(38 rows)
+
+SELECT COUNT(*) FROM (SELECT * FROM t1 JOIN t2 ON t1.i = t2.i) x JOIN t2 y ON x.k = y.k;
+ count  
+--------
+ 100000
+(1 row)
+
+RESET max_parallel_workers_per_gather;
+RESET max_parallel_workers;
+RESET min_parallel_table_scan_size;
+DROP TABLE t1;
+DROP TABLE t2;
+DROP TABLE t3;
+RESET enable_mergejoin;
+RESET enable_memoize;
+RESET enable_mergejoin;
+RESET enable_mergejoin_semijoin_filter;
+RESET force_mergejoin_semijoin_filter;
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index 9f644a0c1b..6bde05b8cb 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -98,6 +98,7 @@ test: rules psql psql_crosstab amutils stats_ext collate.linux.utf8
 test: select_parallel
 test: write_parallel
 test: vacuum_parallel
+test: mergejoin_semijoinfilter
 
 # no relation related tests can be put in this group
 test: publication subscription
diff --git a/src/test/regress/sql/mergejoin_semijoinfilter.sql b/src/test/regress/sql/mergejoin_semijoinfilter.sql
new file mode 100644
index 0000000000..f38f302795
--- /dev/null
+++ b/src/test/regress/sql/mergejoin_semijoinfilter.sql
@@ -0,0 +1,111 @@
+SET enable_hashjoin = OFF;
+SET enable_nestloop = OFF;
+SET enable_mergejoin = ON;
+SET enable_mergejoin_semijoin_filter = ON;
+SET force_mergejoin_semijoin_filter = ON;
+
+CREATE TABLE t1 (
+  i integer,
+  j integer
+);
+
+CREATE TABLE t2 (
+  i integer,
+  k integer
+);
+
+CREATE TABLE t3 (
+  i integer,
+  m integer
+);
+
+insert into t1 (i, j)
+  select
+    generate_series(1,100000) as i,
+    generate_series(1,100000) as j;
+
+insert into t2 (i, k)
+  select
+    generate_series(1,100000) as i,
+    generate_series(1,100000) as k;
+
+insert into t3 (i, m)
+  select
+    generate_series(1,100000) as i,
+    generate_series(1,100000) as m;
+
+-- Semijoin filter is not used when force_mergejoin_semijoin_filter is OFF.
+SET force_mergejoin_semijoin_filter = OFF;
+EXPLAIN (VERBOSE, COSTS OFF) SELECT COUNT(*) FROM t1 JOIN t2 ON t1.i = t2.i;
+SET force_mergejoin_semijoin_filter = ON;
+
+-- One level of inner mergejoin: push semi-join filter to outer scan.
+EXPLAIN (VERBOSE, COSTS OFF) SELECT COUNT(*) FROM t1 JOIN t2 ON t1.i = t2.i;
+SELECT COUNT(*) FROM t1 JOIN t2 ON t1.i = t2.i;
+
+-- Push semijoin filter through SORT node.
+EXPLAIN (VERBOSE, COSTS OFF) SELECT COUNT(*) FROM (SELECT DISTINCT t1.i FROM t1 ORDER BY t1.i) x JOIN t2 ON x.i = t2.i;
+SELECT COUNT(*) FROM (SELECT DISTINCT t1.i FROM t1 ORDER BY t1.i) x JOIN t2 ON x.i = t2.i;
+
+-- Push semijoin filter through LIMIT node: not supported.
+EXPLAIN (VERBOSE, COSTS OFF) SELECT COUNT(*) FROM (SELECT DISTINCT t1.i FROM t1 ORDER BY t1.i LIMIT 1) x JOIN t2 ON x.i = t2.i;
+
+-- Push SJF through MAX: not supported.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT COUNT(*) FROM (SELECT MAX(t1.i) AS i, t1.j FROM t1 GROUP BY t1.j) x JOIN t2 ON x.i = t2.i;
+
+-- SJF with UNION: not supported.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT COUNT(*) FROM (SELECT t1.i FROM t1 UNION ALL SELECT t2.i FROM t2) x JOIN t3 ON x.i = t3.i;
+
+-- Join clause is an expression: not supported.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT COUNT(*) FROM t1 JOIN t2 ON 1 + t1.i = t2.i;
+
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT COUNT(*) FROM t1 JOIN t2 ON t1.i = 2 * t2.i;
+
+-- Two levels of MergeJoin
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT COUNT(*) FROM t1 JOIN t2 ON t1.i = t2.i JOIN t3 ON t2.i = t3.i;
+SELECT COUNT(*) FROM t1 JOIN t2 ON t1.i = t2.i JOIN t3 ON t2.i = t3.i;
+
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT COUNT(*) FROM t1 JOIN (t2 JOIN t3 ON t2.i = t3.i) ON t1.i = t2.i;
+SELECT COUNT(*) FROM t1 JOIN (t2 JOIN t3 ON t2.i = t3.i) ON t1.i = t2.i;
+
+EXPLAIN (VERBOSE, COSTS OFF) 
+SELECT COUNT(*) FROM (SELECT * FROM t1 JOIN t2 ON t1.i = t2.i) x JOIN t2 y ON x.k = y.k;
+SELECT COUNT(*) FROM (SELECT * FROM t1 JOIN t2 ON t1.i = t2.i) x JOIN t2 y ON x.k = y.k;
+
+-- Test Parallel Query
+SET max_parallel_workers_per_gather = 8;
+SET max_parallel_workers = 8;
+SET min_parallel_table_scan_size = 0;
+ALTER TABLE t1 SET (parallel_workers = 8);
+ALTER TABLE t2 SET (parallel_workers = 8);
+ALTER TABLE t3 SET (parallel_workers = 8);
+
+-- one level of merge join
+-- inner join: push bloom filter to outer scan
+EXPLAIN (VERBOSE, COSTS OFF) SELECT COUNT(*) FROM t1 JOIN t2 ON t1.i = t2.i;
+SELECT COUNT(*) FROM t1 JOIN t2 ON t1.i = t2.i;
+
+-- Two levels of  merge join
+EXPLAIN (VERBOSE, COSTS OFF) 
+SELECT COUNT(*) FROM (SELECT * FROM t1 JOIN t2 ON t1.i = t2.i) x JOIN t2 y ON x.k = y.k;
+SELECT COUNT(*) FROM (SELECT * FROM t1 JOIN t2 ON t1.i = t2.i) x JOIN t2 y ON x.k = y.k;
+
+
+RESET max_parallel_workers_per_gather;
+RESET max_parallel_workers;
+RESET min_parallel_table_scan_size;
+
+DROP TABLE t1;
+DROP TABLE t2;
+DROP TABLE t3;
+RESET enable_mergejoin;
+RESET enable_memoize;
+RESET enable_mergejoin;
+RESET enable_mergejoin_semijoin_filter;
+RESET force_mergejoin_semijoin_filter;
-- 
2.37.1

Zhihong Yu

zyu@yugabyte.com

over 3 years ago

In reply to: Lyu Pan (#8)

Re: Bloom filter Pushdown Optimization for Merge Join

On Wed, Oct 12, 2022 at 3:36 PM Lyu Pan <lyu.steve.pan@gmail.com> wrote:

Hello Zhihong Yu & Tomas Vondra,

Thank you so much for your review and feedback!

We made some updates based on previous feedback and attached the new
patch set. Due to time constraints, we didn't get to resolve all the
comments, and we'll continue to improve this patch.

In this prototype, the cost model is based on an assumption that there is
a linear relationship between the performance gain from using a semijoin
filter and the estimated filtering rate:
% improvement to Merge Join cost = 0.83 * estimated filtering rate -

0.137.

How were the coefficients (0.83 and 0.137) determined ?
I guess they were based on the results of running certain workload.

Right, the coefficients (0.83 and 0.137) determined are based on some
preliminary testings. The current costing model is pretty naive and
we'll work on a more robust costing model in future work.

I agree, in principle, although I think the current logic / formula is a
bit too crude and fitted to the simple data used in the test. I think
this needs to be formulated as a regular costing issue, considering
stuff like cost of the hash functions, and so on.

I think this needs to do two things:

1) estimate the cost of building the bloom filter - This shall depend on
the number of rows in the inner relation, number/cost of the hash
functions (which may be higher for some data types), etc.

2) estimate improvement for the probing branch - Essentially, we need to
estimate how much we save by filtering some of the rows, but this also
neeeds to include the cost of probing the bloom filter.

This will probably require some improvements to the lib/bloomfilter, in
order to estimate the false positive rate - this may matter a lot for
large data sets and small work_mem values. The bloomfilter library
simply reduces the size of the bloom filter, which increases the false
positive rate. At some point it'll start reducing the benefit.

These suggestions make a lot of sense. The current costing model is
definitely not good enough, and we plan to work on a more robust
costing model as we continue to improve the patch.

OK. Could also build the bloom filter in shared memory?

We thought about this approach but didn't prefer this one because if
all worker processes share the same bloom filter in shared memory, we
need to frequently lock and unlock the bloom filter to avoid race
conditions. So we decided to have each worker process create its own
bloom filter.

IMHO we shouldn't make too many conclusions from these examples. Yes, it
shows merge join can be improved, but for cases where a hashjoin works
better so we wouldn't use merge join anyway.

I think we should try constructing examples where either merge join wins
already (and gets further improved by the bloom filter), or would lose
to hash join and the bloom filter improves it enough to win.

AFAICS that requires a join of two large tables - large enough that hash
join would need to be batched, or pre-sorted inputs (which eliminates
the explicit Sort, which is the main cost in most cases).

The current patch only works with sequential scans, which eliminates the
second (pre-sorted) option. So let's try the first one - can we invent
an example with a join of two large tables where a merge join would win?

Can we find such example in existing benchmarks like TPC-H/TPC-DS.

Agreed. The current examples are only intended to show us that using
bloom filters in merge join could improve the merge join performance
in some cases. We are working on testing more examples that merge join
with bloom filter could out-perform hash join, which should be more
persuasive.

The bloom filter is built by the first seqscan (on t0), and then used by
the second seqscan (on t1). But this only works because we always run
the t0 scan to completion (because we're feeding it into Sort) before we
start scanning t1.

But when the scan on t1 switches to an index scan, it's over - we'd be
building the filter without being able to probe it, and when we finish
building it we no longer need it. So this seems pretty futile.

It might still improve plans like

-> Merge Join
Merge Cond: (t0.c1 = t1.c1)
SemiJoin Filter Created Based on: (t0.c1 = t1.c1)
SemiJoin Estimated Filtering Rate: 1.0000
-> Sort
Sort Key: t0.c1
-> Seq Scan on t0
-> Index Scan on t1

But I don't know how common/likely that actually is. I'd expect to have
an index on both sides, but perhaps I'm wrong.

This is why hashjoin seems like a more natural fit for the bloom filter,
BTW, because there we have a guarantee the inner relation is processed
first (so we know the bloom filter is fine and can be probed).

Great observation. The bloom filter only works if the first SeqScan
always runs to completion before the second SeqScan starts.
I guess one possible way to avoid futile bloom filter might be
enforcing that the bloom filter only be used if both the outer/inner
plans of the MergeJoin are Sort nodes, to guarantee the bloom filter
is ready to use after processing one side of the join, but this may be
too restrictive.

I don't know what improvements you have in mind exactly, but I think
it'd be good to show which node is building/using a bloom filter, and
then also some basic stats (size, number of hash functions, FPR, number
of probes, ...). This may require improvements to lib/bloomfilter, which
currently does not expose some of the details.

Along with the new patch set, we have added information to display
which node is building/using a bloom filter (as well as the
corresponding expressions), and some bloom filter basic stats. We'll
add more related information (e.g. FPR) as we modify lib/bloomfilter
implementation in future work.

Thanks again for the great reviews and they are really useful! More
feedback is always welcome and appreciated!

Regards,
Lyu Pan
Amazon RDS/Aurora for PostgreSQL

Hi,

For v1-0001-Support-semijoin-filter-in-the-planner-optimizer.patch :

+
+       /* want at least 1000 rows_filtered to avoid any nasty edge cases */
+       if (force_mergejoin_semijoin_filter ||
+           (filtering_rate >= 0.35 && rows_filtered > 1000 &&
best_filter_clause >= 0))

Currently rows_filtered is compared with a constant, should the constant be
made configurable ?

+ * improvement of 19.5%. This equation also concludes thata a
17%

Typo: thata

+   inner_md_array = palloc(sizeof(SemiJoinFilterExprMetadata) * num_md);
+   if (!outer_md_array || !inner_md_array)
+   {
+       return 0;               /* a stack array allocation failed  */

Should the allocated array be freed before returning ?

For verify_valid_pushdown(), the parameters in comment don't match the
actual parameters. Please modify the comment.

For is_fk_pk(), since the outer_key_list is fixed across the iterations, I
think all_vars can be computed outside of expressions_match_foreign_key().

Cheers

#10

Zhihong Yu

zyu@yugabyte.com

about 3 years ago

In reply to: Zhihong Yu (#9)

Re: Bloom filter Pushdown Optimization for Merge Join

On Wed, Oct 12, 2022 at 4:35 PM Zhihong Yu <zyu@yugabyte.com> wrote:

On Wed, Oct 12, 2022 at 3:36 PM Lyu Pan <lyu.steve.pan@gmail.com> wrote:

Hello Zhihong Yu & Tomas Vondra,

Thank you so much for your review and feedback!

We made some updates based on previous feedback and attached the new
patch set. Due to time constraints, we didn't get to resolve all the
comments, and we'll continue to improve this patch.

In this prototype, the cost model is based on an assumption that there

is

a linear relationship between the performance gain from using a semijoin
filter and the estimated filtering rate:
% improvement to Merge Join cost = 0.83 * estimated filtering rate -

0.137.

How were the coefficients (0.83 and 0.137) determined ?
I guess they were based on the results of running certain workload.

Right, the coefficients (0.83 and 0.137) determined are based on some
preliminary testings. The current costing model is pretty naive and
we'll work on a more robust costing model in future work.

I agree, in principle, although I think the current logic / formula is a
bit too crude and fitted to the simple data used in the test. I think
this needs to be formulated as a regular costing issue, considering
stuff like cost of the hash functions, and so on.

I think this needs to do two things:

1) estimate the cost of building the bloom filter - This shall depend on
the number of rows in the inner relation, number/cost of the hash
functions (which may be higher for some data types), etc.

2) estimate improvement for the probing branch - Essentially, we need to
estimate how much we save by filtering some of the rows, but this also
neeeds to include the cost of probing the bloom filter.

This will probably require some improvements to the lib/bloomfilter, in
order to estimate the false positive rate - this may matter a lot for
large data sets and small work_mem values. The bloomfilter library
simply reduces the size of the bloom filter, which increases the false
positive rate. At some point it'll start reducing the benefit.

These suggestions make a lot of sense. The current costing model is
definitely not good enough, and we plan to work on a more robust
costing model as we continue to improve the patch.

OK. Could also build the bloom filter in shared memory?

We thought about this approach but didn't prefer this one because if
all worker processes share the same bloom filter in shared memory, we
need to frequently lock and unlock the bloom filter to avoid race
conditions. So we decided to have each worker process create its own
bloom filter.

IMHO we shouldn't make too many conclusions from these examples. Yes, it
shows merge join can be improved, but for cases where a hashjoin works
better so we wouldn't use merge join anyway.

I think we should try constructing examples where either merge join wins
already (and gets further improved by the bloom filter), or would lose
to hash join and the bloom filter improves it enough to win.

AFAICS that requires a join of two large tables - large enough that hash
join would need to be batched, or pre-sorted inputs (which eliminates
the explicit Sort, which is the main cost in most cases).

The current patch only works with sequential scans, which eliminates the
second (pre-sorted) option. So let's try the first one - can we invent
an example with a join of two large tables where a merge join would win?

Can we find such example in existing benchmarks like TPC-H/TPC-DS.

Agreed. The current examples are only intended to show us that using
bloom filters in merge join could improve the merge join performance
in some cases. We are working on testing more examples that merge join
with bloom filter could out-perform hash join, which should be more
persuasive.

The bloom filter is built by the first seqscan (on t0), and then used by
the second seqscan (on t1). But this only works because we always run
the t0 scan to completion (because we're feeding it into Sort) before we
start scanning t1.

But when the scan on t1 switches to an index scan, it's over - we'd be
building the filter without being able to probe it, and when we finish
building it we no longer need it. So this seems pretty futile.

It might still improve plans like

-> Merge Join
Merge Cond: (t0.c1 = t1.c1)
SemiJoin Filter Created Based on: (t0.c1 = t1.c1)
SemiJoin Estimated Filtering Rate: 1.0000
-> Sort
Sort Key: t0.c1
-> Seq Scan on t0
-> Index Scan on t1

But I don't know how common/likely that actually is. I'd expect to have
an index on both sides, but perhaps I'm wrong.

This is why hashjoin seems like a more natural fit for the bloom filter,
BTW, because there we have a guarantee the inner relation is processed
first (so we know the bloom filter is fine and can be probed).

Great observation. The bloom filter only works if the first SeqScan
always runs to completion before the second SeqScan starts.
I guess one possible way to avoid futile bloom filter might be
enforcing that the bloom filter only be used if both the outer/inner
plans of the MergeJoin are Sort nodes, to guarantee the bloom filter
is ready to use after processing one side of the join, but this may be
too restrictive.

I don't know what improvements you have in mind exactly, but I think
it'd be good to show which node is building/using a bloom filter, and
then also some basic stats (size, number of hash functions, FPR, number
of probes, ...). This may require improvements to lib/bloomfilter, which
currently does not expose some of the details.

Along with the new patch set, we have added information to display
which node is building/using a bloom filter (as well as the
corresponding expressions), and some bloom filter basic stats. We'll
add more related information (e.g. FPR) as we modify lib/bloomfilter
implementation in future work.

Thanks again for the great reviews and they are really useful! More
feedback is always welcome and appreciated!

Regards,
Lyu Pan
Amazon RDS/Aurora for PostgreSQL

Hi,

For v1-0001-Support-semijoin-filter-in-the-planner-optimizer.patch :
+
+       /* want at least 1000 rows_filtered to avoid any nasty edge cases
*/
+       if (force_mergejoin_semijoin_filter ||
+           (filtering_rate >= 0.35 && rows_filtered > 1000 &&
best_filter_clause >= 0))
Currently rows_filtered is compared with a constant, should the constant
be made configurable ?

+ * improvement of 19.5%. This equation also concludes thata a
17%

Typo: thata
+   inner_md_array = palloc(sizeof(SemiJoinFilterExprMetadata) * num_md);
+   if (!outer_md_array || !inner_md_array)
+   {
+       return 0;               /* a stack array allocation failed  */
Should the allocated array be freed before returning ?

For verify_valid_pushdown(), the parameters in comment don't match the
actual parameters. Please modify the comment.

For is_fk_pk(), since the outer_key_list is fixed across the iterations, I
think all_vars can be computed outside of expressions_match_foreign_key().

Cheers

Continuing
with v1-0001-Support-semijoin-filter-in-the-planner-optimizer.patch

For get_switched_clauses(), the returned List contains all the clauses. Yet
the name suggests that only switched clauses are returned.
Please rename the method to adjust_XX or rearrange_YY so that it is less
likely to cause confusion.

For depth_of_semijoin_target(), idx is only used inside the `if (num_exprs
== get_appendrel_occluded_references(` block, you can move its declaration
inside that if block.

+   outside_subq_rte = root->simple_rte_array[outside_subq_var->varno];
+
+   /* System Vars have varattno < 0, don't bother */
+   if (outside_subq_var->varattno <= 0)
+       return 0;

Since the check on outside_subq_var->varattno doesn't depend on
outside_subq_rte, you can move the assignment to outside_subq_rte after the
if check.

Cheers

#11

Zhihong Yu

zyu@yugabyte.com

about 3 years ago

In reply to: Zhihong Yu (#10)

Re: Bloom filter Pushdown Optimization for Merge Join

On Thu, Oct 13, 2022 at 7:30 AM Zhihong Yu <zyu@yugabyte.com> wrote:

On Wed, Oct 12, 2022 at 4:35 PM Zhihong Yu <zyu@yugabyte.com> wrote:
On Wed, Oct 12, 2022 at 3:36 PM Lyu Pan <lyu.steve.pan@gmail.com> wrote:

Hello Zhihong Yu & Tomas Vondra,

Thank you so much for your review and feedback!

We made some updates based on previous feedback and attached the new
patch set. Due to time constraints, we didn't get to resolve all the
comments, and we'll continue to improve this patch.

In this prototype, the cost model is based on an assumption that there

is

a linear relationship between the performance gain from using a

semijoin

filter and the estimated filtering rate:
% improvement to Merge Join cost = 0.83 * estimated filtering rate -

0.137.

How were the coefficients (0.83 and 0.137) determined ?
I guess they were based on the results of running certain workload.

Right, the coefficients (0.83 and 0.137) determined are based on some
preliminary testings. The current costing model is pretty naive and
we'll work on a more robust costing model in future work.

I agree, in principle, although I think the current logic / formula is

a

bit too crude and fitted to the simple data used in the test. I think
this needs to be formulated as a regular costing issue, considering
stuff like cost of the hash functions, and so on.

I think this needs to do two things:

1) estimate the cost of building the bloom filter - This shall depend

on

the number of rows in the inner relation, number/cost of the hash
functions (which may be higher for some data types), etc.

2) estimate improvement for the probing branch - Essentially, we need

to

estimate how much we save by filtering some of the rows, but this also
neeeds to include the cost of probing the bloom filter.

This will probably require some improvements to the lib/bloomfilter, in
order to estimate the false positive rate - this may matter a lot for
large data sets and small work_mem values. The bloomfilter library
simply reduces the size of the bloom filter, which increases the false
positive rate. At some point it'll start reducing the benefit.

These suggestions make a lot of sense. The current costing model is
definitely not good enough, and we plan to work on a more robust
costing model as we continue to improve the patch.

OK. Could also build the bloom filter in shared memory?

We thought about this approach but didn't prefer this one because if
all worker processes share the same bloom filter in shared memory, we
need to frequently lock and unlock the bloom filter to avoid race
conditions. So we decided to have each worker process create its own
bloom filter.

IMHO we shouldn't make too many conclusions from these examples. Yes,

it

shows merge join can be improved, but for cases where a hashjoin works
better so we wouldn't use merge join anyway.

I think we should try constructing examples where either merge join

wins

already (and gets further improved by the bloom filter), or would lose
to hash join and the bloom filter improves it enough to win.

AFAICS that requires a join of two large tables - large enough that

hash

join would need to be batched, or pre-sorted inputs (which eliminates
the explicit Sort, which is the main cost in most cases).

The current patch only works with sequential scans, which eliminates

the

second (pre-sorted) option. So let's try the first one - can we invent
an example with a join of two large tables where a merge join would

win?

Can we find such example in existing benchmarks like TPC-H/TPC-DS.

Agreed. The current examples are only intended to show us that using
bloom filters in merge join could improve the merge join performance
in some cases. We are working on testing more examples that merge join
with bloom filter could out-perform hash join, which should be more
persuasive.

The bloom filter is built by the first seqscan (on t0), and then used

by

the second seqscan (on t1). But this only works because we always run
the t0 scan to completion (because we're feeding it into Sort) before

we

start scanning t1.

But when the scan on t1 switches to an index scan, it's over - we'd be
building the filter without being able to probe it, and when we finish
building it we no longer need it. So this seems pretty futile.

It might still improve plans like

-> Merge Join
Merge Cond: (t0.c1 = t1.c1)
SemiJoin Filter Created Based on: (t0.c1 = t1.c1)
SemiJoin Estimated Filtering Rate: 1.0000
-> Sort
Sort Key: t0.c1
-> Seq Scan on t0
-> Index Scan on t1

But I don't know how common/likely that actually is. I'd expect to have
an index on both sides, but perhaps I'm wrong.

This is why hashjoin seems like a more natural fit for the bloom

filter,

BTW, because there we have a guarantee the inner relation is processed
first (so we know the bloom filter is fine and can be probed).

Great observation. The bloom filter only works if the first SeqScan
always runs to completion before the second SeqScan starts.
I guess one possible way to avoid futile bloom filter might be
enforcing that the bloom filter only be used if both the outer/inner
plans of the MergeJoin are Sort nodes, to guarantee the bloom filter
is ready to use after processing one side of the join, but this may be
too restrictive.

I don't know what improvements you have in mind exactly, but I think
it'd be good to show which node is building/using a bloom filter, and
then also some basic stats (size, number of hash functions, FPR, number
of probes, ...). This may require improvements to lib/bloomfilter,

which

currently does not expose some of the details.

Along with the new patch set, we have added information to display
which node is building/using a bloom filter (as well as the
corresponding expressions), and some bloom filter basic stats. We'll
add more related information (e.g. FPR) as we modify lib/bloomfilter
implementation in future work.

Thanks again for the great reviews and they are really useful! More
feedback is always welcome and appreciated!

Regards,
Lyu Pan
Amazon RDS/Aurora for PostgreSQL

Hi,

For v1-0001-Support-semijoin-filter-in-the-planner-optimizer.patch :
+
+       /* want at least 1000 rows_filtered to avoid any nasty edge cases
*/
+       if (force_mergejoin_semijoin_filter ||
+           (filtering_rate >= 0.35 && rows_filtered > 1000 &&
best_filter_clause >= 0))
Currently rows_filtered is compared with a constant, should the constant
be made configurable ?

+ * improvement of 19.5%. This equation also concludes thata a
17%

Typo: thata
+   inner_md_array = palloc(sizeof(SemiJoinFilterExprMetadata) * num_md);
+   if (!outer_md_array || !inner_md_array)
+   {
+       return 0;               /* a stack array allocation failed  */
Should the allocated array be freed before returning ?

For verify_valid_pushdown(), the parameters in comment don't match the
actual parameters. Please modify the comment.

For is_fk_pk(), since the outer_key_list is fixed across the iterations,
I think all_vars can be computed outside of expressions_match_foreign_key().

Cheers
Continuing
with v1-0001-Support-semijoin-filter-in-the-planner-optimizer.patch

For get_switched_clauses(), the returned List contains all the clauses.
Yet the name suggests that only switched clauses are returned.
Please rename the method to adjust_XX or rearrange_YY so that it is less
likely to cause confusion.

For depth_of_semijoin_target(), idx is only used inside the `if (num_exprs
== get_appendrel_occluded_references(` block, you can move its declaration
inside that if block.
+   outside_subq_rte = root->simple_rte_array[outside_subq_var->varno];
+
+   /* System Vars have varattno < 0, don't bother */
+   if (outside_subq_var->varattno <= 0)
+       return 0;
Since the check on outside_subq_var->varattno doesn't depend on
outside_subq_rte, you can move the assignment to outside_subq_rte after the
if check.

Cheers

For v1-0002-Support-semijoin-filter-in-the-executor-for-non-para.patch, I
have a few comments.

+
+               if (IsA(node, SeqScanState) && ((SeqScanState *)
node)->apply_semijoin_filter
+                   && !ExecScanUsingSemiJoinFilter((SeqScanState *) node,
econtext))
+               {
+                   /* slot did not pass SemiJoinFilter, so skipping it. */
+                   ResetExprContext(econtext);
+                   continue;
+               }
+               return projectedSlot;

Since projectedSlot is only returned if slot passes SemiJoinFilter, you can
move the call to `ExecProject` after the above if block.

+   List       *semijoin_filters;   /* SemiJoinFilterJoinNodeState  */
+   List       *sj_scan_data;   /* SemiJoinFilterScanNodeState */

Better use plural form in the comments.

Cheers