Selectivity estimation for inet operators

Started by Emre Hasegeli over 11 years ago, 25 messages
#1 Emre Hasegeli
emre@hasegeli.com
2 attachment(s)

A new version of the selectivity estimation patch is attached. I am adding
it to CommitFest 2014-06. The previous version was reviewed by
Andreas Karlson in the previous CommitFest, together with the GiST support
patch. The new version includes join selectivity estimation.

Join selectivity is calculated in 4 steps:

* matching first MCV to second MCV
* searching first MCV in the second histogram
* searching second MCV in the first histogram
* searching boundaries of the first histogram in the second histogram
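
As a rough illustration of how the four partial results can be put together, here is a minimal sketch. It follows the shape of the sums in networkjoinsel() in the attached patch, but the JoinParts struct and the combine_join_selec() name are hypothetical, and the rescaling the patch applies for the reduced lists is left out.

```c
/* Hypothetical container for the four partial results; not part of the patch. */
typedef struct JoinParts
{
	double		mcv_mcv;	/* step 1: first MCV matched to second MCV */
	double		mcv1_his2;	/* step 2: first MCV searched in second histogram */
	double		mcv2_his1;	/* step 3: second MCV searched in first histogram */
	double		his_his;	/* step 4: histogram vs histogram, in [0, 1] */
} JoinParts;

/*
 * Sum the partial selectivities. Only the histogram vs histogram part is
 * scaled by the fraction of rows each histogram represents, that is, the
 * rows which are neither NULL nor covered by the MCV list.
 */
static double
combine_join_selec(const JoinParts *p,
				   double nullfrac1, double mcvfrac1,
				   double nullfrac2, double mcvfrac2)
{
	double		his1_frac = 1.0 - nullfrac1 - mcvfrac1;
	double		his2_frac = 1.0 - nullfrac2 - mcvfrac2;
	double		selec;

	selec = p->mcv_mcv + p->mcv1_his2 + p->mcv2_his1 +
		his1_frac * his2_frac * p->his_his;

	/* Keep the result a valid probability. */
	if (selec < 0.0)
		selec = 0.0;
	if (selec > 1.0)
		selec = 1.0;

	return selec;
}
```

The final clamp plays the role of CLAMP_PROBABILITY() in the patch.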

Comparing the lists with each other slows down the function when the
statistics target is set to higher values. To avoid this problem I use only
log(n) values from each list: the first log(n) values for the MCV list,
and evenly spaced values for the histograms. In my tests, this optimization
does not affect the planning time when statistics = 100, but it does
affect the accuracy of the estimation. I can send a version without
this optimization if the slowdown with larger statistics is not a problem
that should be solved in the selectivity estimation function.
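
The sampling described above can be sketched as follows. The patch computes the reduced count as (int) log(n) + 1; here it is simply passed in as red_nvalues. The function names are hypothetical, and the sketch uses integer arithmetic where the patch uses a float gap, so take it as an illustration rather than the patch's exact code.

```c
/* Number of MCV entries to use: the first red_nvalues, capped at the list length. */
static int
mcv_sample_count(int mcv_nvalues, int red_nvalues)
{
	return mcv_nvalues < red_nvalues ? mcv_nvalues : red_nvalues;
}

/*
 * Index of the i-th sampled histogram boundary, for i in 1..red_nvalues.
 * The samples are spread evenly over the boundaries and the last one
 * lands on the final boundary.
 */
static int
his_sample_index(int his_nvalues, int red_nvalues, int i)
{
	return i * (his_nvalues - 1) / red_nvalues;
}
```

With 100 histogram boundaries and red_nvalues = 5, the sampled indexes are 19, 39, 59, 79 and 99.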

I am also attaching the script I used for testing. I left log statements
in the networkjoinsel() function to make testing easier; these statements
should be removed before commit.

Attachments:

inet-selfuncs-v4.patch (application/octet-stream)
diff --git a/src/backend/utils/adt/network_selfuncs.c b/src/backend/utils/adt/network_selfuncs.c
index d0d806f..4e56da9 100644
--- a/src/backend/utils/adt/network_selfuncs.c
+++ b/src/backend/utils/adt/network_selfuncs.c
@@ -1,32 +1,670 @@
 /*-------------------------------------------------------------------------
  *
  * network_selfuncs.c
  *	  Functions for selectivity estimation of inet/cidr operators
  *
- * Currently these are just stubs, but we hope to do better soon.
+ * Estimates are based on null fraction, distinct value count, most common
+ * values, and histogram of inet/cidr datatypes.
  *
  * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
  *
  *
  * IDENTIFICATION
  *	  src/backend/utils/adt/network_selfuncs.c
  *
  *-------------------------------------------------------------------------
  */
 #include "postgres.h"
 
+#include <math.h>
+
+#include "access/htup_details.h"
+#include "catalog/pg_collation.h"
+#include "catalog/pg_operator.h"
+#include "catalog/pg_statistic.h"
+#include "utils/lsyscache.h"
 #include "utils/inet.h"
+#include "utils/selfuncs.h"
 
 
+/* Default selectivity constant for the inet overlap operator */
+#define DEFAULT_OVERLAP_SEL 0.01
+
+/* Default selectivity constant for the other operators */
+#define DEFAULT_INCLUSION_SEL 0.005
+
+/* Default selectivity for given operator */
+#define DEFAULT_SEL(operator) \
+	((operator) == OID_INET_OVERLAP_OP ? \
+			DEFAULT_OVERLAP_SEL : DEFAULT_INCLUSION_SEL)
+
+static short int inet_opr_order(Oid operator, bool reversed);
+static Selectivity inet_his_inclusion_selec(Datum *values, int nvalues,
+					int red_nvalues, Datum *constvalue, double ndistinct,
+					short int opr_order);
+static Selectivity inet_mcv_join_selec(Datum *values1, float4 *numbers1,
+					int nvalues1, Datum *values2, float4 *numbers2,
+					int nvalues2, Oid operator, bool reversed);
+static Selectivity inet_mcv_his_selec(Datum *mcv_values, float4 *mcv_numbers,
+					int mcv_nvalues, Datum *his_values, int his_nvalues,
+					int red_nvalues, short int opr_order);
+static Selectivity inet_his_inclusion_join_selec(Datum *his1_values,
+					int his1_nvalues, int red1_nvalues, Datum *his2_values,
+					int his2_nvalues, int red2_nvalues, short int opr_order);
+static short int inet_inclusion_cmp(inet *left, inet *right,
+									short int opr_order);
+static short int inet_masklen_inclusion_cmp(inet *left, inet *right,
+											short int opr_order);
+static short int inet_his_match_divider(inet *boundary, inet *query,
+										short int opr_order);
+
+/*
+ * Selectivity estimation for the subnet inclusion operators
+ */
 Datum
 networksel(PG_FUNCTION_ARGS)
 {
-	PG_RETURN_FLOAT8(0.001);
+	PlannerInfo	   *root = (PlannerInfo *) PG_GETARG_POINTER(0);
+	Oid				operator = PG_GETARG_OID(1);
+	List		   *args = (List *) PG_GETARG_POINTER(2);
+	int				varRelid = PG_GETARG_INT32(3),
+					his_nvalues;
+	VariableStatData vardata;
+	Node		   *other;
+	bool			varonleft;
+	Selectivity		selec,
+					max_mcv_selec,
+					max_his_selec;
+	Datum			constvalue,
+				   *his_values;
+	Form_pg_statistic stats;
+	FmgrInfo		proc;
+
+	/*
+	 * If expression is not (variable op something) or (something op
+	 * variable), then punt and return a default estimate.
+	 */
+	if (!get_restriction_variable(root, args, varRelid,
+								  &vardata, &other, &varonleft))
+		PG_RETURN_FLOAT8(DEFAULT_SEL(operator));
+
+	/*
+	 * Can't do anything useful if the something is not a constant, either.
+	 */
+	if (!IsA(other, Const))
+	{
+		ReleaseVariableStats(vardata);
+		PG_RETURN_FLOAT8(DEFAULT_SEL(operator));
+	}
+
+	/* All of the subnet inclusion operators are strict. */
+	if (((Const *) other)->constisnull)
+	{
+		ReleaseVariableStats(vardata);
+		PG_RETURN_FLOAT8(0.0);
+	}
+
+	if (!HeapTupleIsValid(vardata.statsTuple))
+	{
+		ReleaseVariableStats(vardata);
+		PG_RETURN_FLOAT8(DEFAULT_SEL(operator));
+	}
+
+	constvalue = ((Const *) other)->constvalue;
+	stats = (Form_pg_statistic) GETSTRUCT(vardata.statsTuple);
+
+	fmgr_info(get_opcode(operator), &proc);
+	selec = mcv_selectivity(&vardata, &proc, constvalue, varonleft,
+							&max_mcv_selec);
+	max_his_selec = 1.0 - stats->stanullfrac - max_mcv_selec;
+
+	if (get_attstatsslot(vardata.statsTuple,
+						 vardata.atttype, vardata.atttypmod,
+						 STATISTIC_KIND_HISTOGRAM, InvalidOid,
+						 NULL,
+						 &his_values, &his_nvalues,
+						 NULL, NULL))
+	{
+		selec += max_his_selec *
+				 inet_his_inclusion_selec(his_values, his_nvalues, his_nvalues,
+										  &constvalue, stats->stadistinct,
+										  inet_opr_order(operator, !varonleft));
+
+		free_attstatsslot(vardata.atttype, his_values, his_nvalues, NULL, 0);
+	}
+	else
+		if (max_mcv_selec > 0)
+			selec = selec / (1.0 - max_his_selec); /* Correct the value. */
+		else
+			selec = DEFAULT_SEL(operator);
+
+	/* Result should be in range, but make sure... */
+	CLAMP_PROBABILITY(selec);
+
+	ReleaseVariableStats(vardata);
+	PG_RETURN_FLOAT8(selec);
 }
 
+/*
+ * Join selectivity estimation for the subnet inclusion operators
+ *
+ * Calculates MCV vs MCV, MCV vs histogram and histogram vs histogram
+ * selectivity for join using the subnet inclusion operators.
+ * Only some of the values in the lists are used to make it faster.
+ */
 Datum
 networkjoinsel(PG_FUNCTION_ARGS)
 {
-	PG_RETURN_FLOAT8(0.001);
+	PlannerInfo	   *root = (PlannerInfo *) PG_GETARG_POINTER(0);
+	Oid				operator = PG_GETARG_OID(1);
+	List		   *args = (List *) PG_GETARG_POINTER(2);
+	SpecialJoinInfo *sjinfo = (SpecialJoinInfo *) PG_GETARG_POINTER(4);
+	VariableStatData vardata1,
+					vardata2;
+	Form_pg_statistic stats1,
+					stats2;
+	Selectivity		selec,
+					mcv1_max_selec,
+					mcv1_red_selec,
+					mcv2_max_selec,
+					mcv2_red_selec;
+	bool			reversed,
+					mcv1_exists,
+					mcv2_exists,
+					his1_exists,
+					his2_exists;
+	short int		opr_order;
+	int				mcv1_nvalues,
+					mcv2_nvalues,
+					mcv1_nnumbers,
+					mcv2_nnumbers,
+					his1_nvalues,
+					his2_nvalues,
+					red1_nvalues,
+					red2_nvalues,
+					i;
+	Datum		   *mcv1_values,
+				   *mcv2_values,
+				   *his1_values,
+				   *his2_values;
+	float4		   *mcv1_numbers,
+				   *mcv2_numbers;
+
+	get_join_variables(root, args, sjinfo, &vardata1, &vardata2, &reversed);
+
+	switch (sjinfo->jointype)
+	{
+		case JOIN_INNER:
+		case JOIN_LEFT:
+		case JOIN_FULL:
+			break;
+		default:
+			ReleaseVariableStats(vardata1);
+			ReleaseVariableStats(vardata2);
+			PG_RETURN_FLOAT8(DEFAULT_SEL(operator));
+	}
+
+	if (!HeapTupleIsValid(vardata1.statsTuple) ||
+		!HeapTupleIsValid(vardata2.statsTuple))
+	{
+		ReleaseVariableStats(vardata1);
+		ReleaseVariableStats(vardata2);
+		PG_RETURN_FLOAT8(DEFAULT_SEL(operator));
+	}
+
+	opr_order = inet_opr_order(operator, reversed);
+	stats1 = (Form_pg_statistic) GETSTRUCT(vardata1.statsTuple);
+	stats2 = (Form_pg_statistic) GETSTRUCT(vardata2.statsTuple);
+	mcv1_exists = get_attstatsslot(vardata1.statsTuple,
+								   vardata1.atttype, vardata1.atttypmod,
+								   STATISTIC_KIND_MCV, InvalidOid,
+								   NULL,
+								   &mcv1_values, &mcv1_nvalues,
+								   &mcv1_numbers, &mcv1_nnumbers);
+	mcv2_exists = get_attstatsslot(vardata2.statsTuple,
+								   vardata2.atttype, vardata2.atttypmod,
+								   STATISTIC_KIND_MCV, InvalidOid,
+								   NULL,
+								   &mcv2_values, &mcv2_nvalues,
+								   &mcv2_numbers, &mcv2_nnumbers);
+	his1_exists = get_attstatsslot(vardata1.statsTuple,
+								   vardata1.atttype, vardata1.atttypmod,
+								   STATISTIC_KIND_HISTOGRAM, InvalidOid,
+								   NULL,
+								   &his1_values, &his1_nvalues,
+								   NULL, NULL);
+	his2_exists = get_attstatsslot(vardata2.statsTuple,
+								   vardata2.atttype, vardata2.atttypmod,
+								   STATISTIC_KIND_HISTOGRAM, InvalidOid,
+								   NULL,
+								   &his2_values, &his2_nvalues,
+								   NULL, NULL);
+
+	red1_nvalues = ((int) log(Max(mcv1_nvalues, his1_nvalues))) + 1;
+	red2_nvalues = ((int) log(Max(mcv2_nvalues, his2_nvalues))) + 1;
+
+	selec = 0.0;
+	mcv1_max_selec = 0.0;
+	mcv1_red_selec = 0.0;
+	mcv2_max_selec = 0.0;
+	mcv2_red_selec = 0.0;
+	if (mcv1_exists)
+		for (i = 0; i < mcv1_nvalues; i++)
+		{
+			mcv1_max_selec += mcv1_numbers[i];
+			if (i < red1_nvalues)
+				mcv1_red_selec += mcv1_numbers[i];
+		}
+	if (mcv2_exists)
+		for (i = 0; i < mcv2_nvalues; i++)
+		{
+			mcv2_max_selec += mcv2_numbers[i];
+			if (i < red2_nvalues)
+				mcv2_red_selec += mcv2_numbers[i];
+		}
+
+	elog(LOG, "\n-----------");
+	if (mcv1_exists && mcv2_exists)
+		selec += (mcv1_max_selec / mcv1_red_selec) *
+				 (mcv2_max_selec / mcv2_red_selec) *
+				 inet_mcv_join_selec(mcv1_values, mcv1_numbers,
+									 Min(mcv1_nvalues, red1_nvalues),
+									 mcv2_values, mcv1_numbers,
+									 Min(mcv2_nvalues, red2_nvalues),
+									 operator, reversed);
+	elog(LOG, "%f", selec);
+	if (mcv1_exists && his2_exists)
+		selec += (mcv1_max_selec / mcv1_red_selec) *
+				 inet_mcv_his_selec(mcv1_values, mcv1_numbers,
+									Min(mcv1_nvalues, red1_nvalues),
+									his2_values, his2_nvalues, red2_nvalues,
+									opr_order);
+	elog(LOG, "%f", selec);
+	if (mcv2_exists && his1_exists)
+		selec += (mcv2_max_selec / mcv2_red_selec) *
+				 inet_mcv_his_selec(mcv2_values, mcv2_numbers,
+									Min(mcv2_nvalues, red2_nvalues),
+									his1_values, his1_nvalues, red1_nvalues,
+									opr_order);
+	elog(LOG, "%f", selec);
+	if (his1_exists && his2_exists)
+		selec += (1.0 - stats1->stanullfrac - mcv1_max_selec) *
+				 (1.0 - stats2->stanullfrac - mcv2_max_selec) *
+				 inet_his_inclusion_join_selec(his1_values, his1_nvalues,
+											   red1_nvalues, his2_values,
+											   his2_nvalues, red2_nvalues,
+											   opr_order);
+	elog(LOG, "%f", selec);
+
+	/* Correct the value. */
+	if (!his1_exists)
+		selec /= stats1->stanullfrac + mcv1_max_selec;
+	if (!his2_exists)
+		selec /= stats2->stanullfrac + mcv2_max_selec;
+
+	if (!mcv1_exists && !mcv2_exists && !his1_exists && his2_exists)
+		selec = DEFAULT_SEL(operator);
+
+	if (mcv1_exists)
+		free_attstatsslot(vardata1.atttype, mcv1_values, mcv1_nvalues,
+						  mcv1_numbers, mcv1_nnumbers);
+	if (mcv2_exists)
+		free_attstatsslot(vardata2.atttype, mcv2_values, mcv2_nvalues,
+						  mcv2_numbers, mcv2_nnumbers);
+	if (his1_exists)
+		free_attstatsslot(vardata1.atttype, his1_values, his1_nvalues, NULL, 0);
+	if (his2_exists)
+		free_attstatsslot(vardata2.atttype, his2_values, his2_nvalues, NULL, 0);
+	ReleaseVariableStats(vardata1);
+	ReleaseVariableStats(vardata2);
+
+	/* Result should be in range, but make sure... */
+	CLAMP_PROBABILITY(selec);
+
+	PG_RETURN_FLOAT8(selec);
+}
+
+/*
+ * Practical comparable numbers for the subnet inclusion operators
+ */
+static short int
+inet_opr_order(Oid operator, bool reversed)
+{
+	short int	order;
+
+	switch (operator)
+	{
+		case OID_INET_SUP_OP:
+			order = -2;
+			break;
+		case OID_INET_SUPEQ_OP:
+			order = -1;
+			break;
+		case OID_INET_OVERLAP_OP:
+			order = 0;
+			break;
+		case OID_INET_SUBEQ_OP:
+			order = 1;
+			break;
+		case OID_INET_SUB_OP:
+			order = 2;
+			break;
+		default:
+			elog(ERROR, "unknown operator for inet inclusion selectivity");
+	}
+
+	return (reversed ? order * -1 : order);
+}
+
+/*
+ * Inet histogram inclusion selectivity estimation
+ *
+ * Calculates histogram selectivity for the subnet inclusion operators of
+ * the inet type. The return value is between 0 and 1. It should be
+ * corrected with the MCV selectivity and null fraction. If the constant
+ * is less than the first element or greater than the last element of
+ * the histogram the return value will be 0.
+ *
+ * The histogram is originally for the basic comparison operators. Only
+ * the common bits of the network part and the length of the network part
+ * (masklen) are appropriate for the subnet inclusion operators. Fortunately,
+ * basic comparison fits in this situation. Even so, the length of the
+ * network part would not really be significant in the histogram. This would
+ * lead to big mistakes for data sets with uneven masklen distribution.
+ * To avoid this problem, comparison with the left and the right side of the
+ * buckets used together.
+ *
+ * Histogram bucket matches are calculated in 3 forms. If the constant
+ * matches both sides the bucket is considered as fully matched. If the
+ * constant matches only the right side the bucket is not considered as
+ * matched at all. In that case the ratio for only one value in the column
+ * is added to the selectivity.
+ *
+ * The ratio for only one value is calculated with the ndistinct variable
+ * if greater than 0. 0 can be given if this behavior is not desired.
+ * This ratio can be big enough to not disregard for addresses with small
+ * masklens. See pg_statistic for more information about it.
+ *
+ * When the constant matches only the right side of the bucket, it will match
+ * the next bucket, unless the bucket is the last one. If these buckets would
+ * be considered as matched it would lead to unfair multiple matches for some
+ * constants.
+ *
+ * The third form is to match the bucket partially. We try to calculate
+ * dividers for both of the boundaries. If the address family of the boundary
+ * does not match the constant or comparison of the length of the network
+ * parts is not true by the operator, the divider for the boundary would not
+ * be taken into account. If both of the dividers can be calculated the greater
+ * one will be used to minimize the mistake in the buckets which have
+ * disparate masklens.
+ *
+ * The divider on the partial bucket match is imagined as the distance
+ * between the decisive bits and the common bits of the addresses. It will be
+ * used as power of two as it is the natural scale for the IP network
+ * inclusion. The partial bucket match divider calculation is an empirical
+ * formula and subject to change with more experiment.
+ *
+ * For partial match with buckets which have different address families
+ * on the left and right sides only the boundary with the same address
+ * family is taken into consideration. This can cause more mistakes for these
+ * buckets if the masklens of their boundaries are also disparate. It can
+ * only be the case for one bucket, if there are addresses with different
+ * families on the column. It seems as a better option than not considering
+ * these buckets.
+ */
+static Selectivity
+inet_his_inclusion_selec(Datum *values, int nvalues, int red_nvalues,
+						 Datum *constvalue, double ndistinct,
+						 short int opr_order)
+{
+	inet		   *query,
+				   *left,
+				   *right;
+	float			gap,
+					match;
+	int				i;
+	short int		left_order,
+					right_order,
+					left_divider,
+					right_divider;
+
+	Assert(nvalues >= red_nvalues);
+
+	gap = ((float) (nvalues - 1)) / ((float) red_nvalues);
+	match = 0.0;
+	query = DatumGetInetP(*constvalue);
+	left = DatumGetInetP(values[0]);
+	left_order = inet_inclusion_cmp(left, query, opr_order);
+
+	if (left_order == 0)
+	{
+		/* Only left boundary match. */
+
+		if (ndistinct > 0)
+			match += 1.0 / ndistinct;
+	}
+
+	for (i = 1; i <= red_nvalues; i++)
+	{
+		right = DatumGetInetP(values[(int) (i * gap)]);
+		right_order = inet_inclusion_cmp(right, query, opr_order);
+
+		if (right_order == 0 && left_order == 0)
+		{
+			/* Full bucket match. */
+
+			match += 1.0;
+		}
+		else if (((right_order > 0 && left_order <= 0) ||
+				  (right_order < 0 && left_order >= 0)) && left)
+		{
+			/* Partial bucket match. */
+
+			left_divider = inet_his_match_divider(left, query, opr_order);
+			right_divider = inet_his_match_divider(right, query, opr_order);
+
+			if (left_divider >= 0 || right_divider >= 0)
+				match += 1.0 / pow(2, Max(left_divider, right_divider));
+		}
+		else if (right_order == 0)
+		{
+			/* Only right boundary match. */
+
+			if (ndistinct > 0)
+				match += 1.0 / ndistinct;
+		}
+
+		/* Shift the variables. */
+		left = right;
+		left_order = right_order;
+	}
+
+	/*
+	 * (1.0 / ndistinct) should be added to (red_nvalues - 1) in case
+	 * the query matches the first value, but it is neglected.
+	 */
+	return match / (red_nvalues - 1);
+}
+
+/*
+ * Inet MCV join selectivity estimation
+ */
+static Selectivity
+inet_mcv_join_selec(Datum *values1, float4 *numbers1, int nvalues1,
+					Datum *values2, float4 *numbers2, int nvalues2,
+					Oid operator, bool reversed)
+{
+	Selectivity		selec;
+	FmgrInfo		proc;
+	int				i,
+					j;
+
+	fmgr_info(get_opcode(operator), &proc);
+	selec = 0.0;
+
+	for (i = 0; i < nvalues1; i++)
+		for (j = 0; j < nvalues2; j++)
+			if (reversed ?
+				DatumGetBool(FunctionCall2Coll(&proc,
+											   DEFAULT_COLLATION_OID,
+											   values1[i],
+											   values2[j])) :
+				DatumGetBool(FunctionCall2Coll(&proc,
+											   DEFAULT_COLLATION_OID,
+											   values2[j],
+											   values1[i])))
+				selec += numbers1[i] * numbers2[j];
+
+	return selec;
+}
+
+/*
+ * Inet MCV vs histogram inclusion join selectivity estimation
+ */
+static Selectivity
+inet_mcv_his_selec(Datum *mcv_values, float4 *mcv_numbers, int mcv_nvalues,
+				   Datum *his_values, int his_nvalues, int red_nvalues,
+				   short int opr_order)
+{
+	Selectivity		selec;
+	int				i;
+
+	selec = 0.0;
+	for (i = 0; i < mcv_nvalues; i++)
+		selec += mcv_numbers[i] *
+				 inet_his_inclusion_selec(his_values, his_nvalues, red_nvalues,
+										  &mcv_values[i], 0, opr_order);
+	return selec;
+}
+
+/*
+ * Inet histogram inclusion join selectivity estimation
+ *
+ * Choose red1_nvalues from his1_values. Do not use the first and the last
+ * values for better sampling.
+ */
+static Selectivity
+inet_his_inclusion_join_selec(Datum *his1_values, int his1_nvalues,
+							  int red1_nvalues, Datum *his2_values,
+							  int his2_nvalues, int red2_nvalues,
+							  short int opr_order)
+{
+	float			match,
+					gap;
+	int				i;
+
+	Assert(his1_nvalues >= red1_nvalues);
+
+	gap = ((float) (his1_nvalues - 2)) / ((float) red1_nvalues);
+	match = 0.0;
+	for (i = 1; i <= red1_nvalues; i++)
+		match += inet_his_inclusion_selec(his2_values, his2_nvalues,
+										  red2_nvalues,
+										  &his1_values[(int) (i * gap)],
+										  0, opr_order);
+
+	return match / red1_nvalues;
+}
+
+/*
+ * Comparison function for the subnet inclusion operators
+ *
+ * Comparison is compatible with the basic comparison function for the inet
+ * type. See network_cmp_internal on network.c for the original. Basic
+ * comparison operators are implemented with the network_cmp_internal
+ * function. It is possible to implement the subnet inclusion operators with
+ * this function.
+ *
+ * Comparison is first on the common bits of the network part, then on
+ * the length of the network part (masklen) as the network_cmp_internal
+ * function. Only the first part is on this function. The second part is
+ * separated to another function for reusability. The difference between
+ * the second part and the original network_cmp_internal is that the operator
+ * is used while comparing the lengths of the network parts. See the second
+ * part on the inet_masklen_inclusion_cmp function below.
+ */
+static short int
+inet_inclusion_cmp(inet *left, inet *right, short int opr_order)
+{
+	if (ip_family(left) == ip_family(right))
+	{
+		short int	 order;
+
+		order = bitncmp(ip_addr(left), ip_addr(right),
+						Min(ip_bits(left), ip_bits(right)));
+
+		if (order != 0)
+			return order;
+
+		return inet_masklen_inclusion_cmp(left, right, opr_order);
+	}
+
+	return ip_family(left) - ip_family(right);
+}
+
+/*
+ * Masklen comparison function for the subnet inclusion operators
+ *
+ * Compares the lengths of network parts of the inputs using the operator.
+ * If the comparison is okay for the operator the return value will be 0.
+ * Otherwise the return value will be less than or greater than 0 with
+ * respect to the operator.
+ */
+static short int
+inet_masklen_inclusion_cmp(inet *left, inet *right, short int opr_order)
+{
+	if (ip_family(left) == ip_family(right))
+	{
+		short int	 order;
+
+		order = ip_bits(left) - ip_bits(right);
+
+		if ((order > 0 && opr_order >= 0) ||
+			(order == 0 && opr_order >= -1 && opr_order <= 1) ||
+			(order < 0 && opr_order <= 0))
+			return 0;
+
+		return opr_order;
+	}
+
+	return ip_family(left) - ip_family(right);
+}
+
+/*
+ * Inet histogram partial match divider calculation
+ *
+ * First the families and the lengths of the network parts are compared
+ * using the subnet inclusion operator. The divider will be calculated
+ * using the masklens and the common bits of the addresses. -1 will be
+ * returned if it cannot be calculated.
+ */
+static short int
+inet_his_match_divider(inet *boundary, inet *query, short int opr_order)
+{
+	if (inet_masklen_inclusion_cmp(boundary, query, opr_order) == 0)
+	{
+		short int	min_bits,
+					decisive_bits;
+
+		min_bits = Min(ip_bits(boundary), ip_bits(query));
+
+		/*
+		 * Set the decisive bits from the one which should contain the other
+		 * according to the operator.
+		 */
+		if (opr_order < 0)
+			decisive_bits = ip_bits(boundary);
+		else if (opr_order > 0)
+			decisive_bits = ip_bits(query);
+		else
+			decisive_bits = min_bits;
+
+		if (min_bits > 0)
+			return decisive_bits - bitncommon(ip_addr(boundary), ip_addr(query),
+											  min_bits);
+		return decisive_bits;
+	}
+
+	return -1;
 }
diff --git a/src/include/catalog/pg_operator.h b/src/include/catalog/pg_operator.h
index f280af4..b8f7aad 100644
--- a/src/include/catalog/pg_operator.h
+++ b/src/include/catalog/pg_operator.h
@@ -1135,32 +1135,33 @@ DESCR("not equal");
 DATA(insert OID = 1203 (  "<"	   PGNSP PGUID b f f 869 869	 16 1205 1206 network_lt scalarltsel scalarltjoinsel ));
 DESCR("less than");
 DATA(insert OID = 1204 (  "<="	   PGNSP PGUID b f f 869 869	 16 1206 1205 network_le scalarltsel scalarltjoinsel ));
 DESCR("less than or equal");
 DATA(insert OID = 1205 (  ">"	   PGNSP PGUID b f f 869 869	 16 1203 1204 network_gt scalargtsel scalargtjoinsel ));
 DESCR("greater than");
 DATA(insert OID = 1206 (  ">="	   PGNSP PGUID b f f 869 869	 16 1204 1203 network_ge scalargtsel scalargtjoinsel ));
 DESCR("greater than or equal");
 DATA(insert OID = 931  (  "<<"	   PGNSP PGUID b f f 869 869	 16 933		0 network_sub networksel networkjoinsel ));
 DESCR("is subnet");
-#define OID_INET_SUB_OP				  931
+#define OID_INET_SUB_OP			931
 DATA(insert OID = 932  (  "<<="    PGNSP PGUID b f f 869 869	 16 934		0 network_subeq networksel networkjoinsel ));
 DESCR("is subnet or equal");
-#define OID_INET_SUBEQ_OP				932
+#define OID_INET_SUBEQ_OP		932
 DATA(insert OID = 933  (  ">>"	   PGNSP PGUID b f f 869 869	 16 931		0 network_sup networksel networkjoinsel ));
 DESCR("is supernet");
-#define OID_INET_SUP_OP				  933
+#define OID_INET_SUP_OP			933
 DATA(insert OID = 934  (  ">>="    PGNSP PGUID b f f 869 869	 16 932		0 network_supeq networksel networkjoinsel ));
 DESCR("is supernet or equal");
-#define OID_INET_SUPEQ_OP				934
+#define OID_INET_SUPEQ_OP		934
 DATA(insert OID = 3552	(  "&&"    PGNSP PGUID b f f 869 869	 16 3552	0 network_overlap networksel networkjoinsel ));
 DESCR("overlaps (is subnet or supernet)");
+#define OID_INET_OVERLAP_OP		3552
 
 DATA(insert OID = 2634 (  "~"	   PGNSP PGUID l f f	  0 869 869 0 0 inetnot - - ));
 DESCR("bitwise not");
 DATA(insert OID = 2635 (  "&"	   PGNSP PGUID b f f	869 869 869 0 0 inetand - - ));
 DESCR("bitwise and");
 DATA(insert OID = 2636 (  "|"	   PGNSP PGUID b f f	869 869 869 0 0 inetor - - ));
 DESCR("bitwise or");
 DATA(insert OID = 2637 (  "+"	   PGNSP PGUID b f f	869  20 869 2638 0 inetpl - - ));
 DESCR("add");
 DATA(insert OID = 2638 (  "+"	   PGNSP PGUID b f f	 20 869 869 2637 0 int8pl_inet - - ));
inet-selfuncs-test.sql (application/octet-stream)
#2 Emre Hasegeli
emre@hasegeli.com
In reply to: Emre Hasegeli (#1)
1 attachment(s)
Re: Selectivity estimation for inet operators

I wanted to check the patch one last time and found a bug affecting the
MCV vs MCV part of the join selectivity. A fixed version is attached.

Emre Hasegeli <emre@hasegeli.com>:

Comparing the lists with each other slows down the function when the
statistics target is set to higher values. To avoid this problem I use only
log(n) values from each list: the first log(n) values for the MCV list,
and evenly spaced values for the histograms. In my tests, this optimization
does not affect the planning time when statistics = 100, but it does
affect the accuracy of the estimation. I can send a version without
this optimization if the slowdown with larger statistics is not a problem
that should be solved in the selectivity estimation function.

Also, I changed this from log(n) to sqrt(n). It seems much better
now.
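
To show why sqrt(n) fits here: when both lists are reduced to about sqrt(n) entries, matching them against each other costs about sqrt(n) * sqrt(n) = n comparisons, so the work stays linear in the statistics target. A minimal sketch with hypothetical helper names follows. It uses an integer square root and may differ from the exact rounding in the patch.

```c
/* Integer square root: the largest r with r * r <= n, for n >= 0. */
static int
isqrt_int(int n)
{
	int			r = 0;

	while ((long) (r + 1) * (r + 1) <= n)
		r++;

	return r;
}

/* Pairwise comparisons needed when both lists are reduced to sqrt(n) entries. */
static long
reduced_pair_count(int n1, int n2)
{
	return (long) isqrt_int(n1) * isqrt_int(n2);
}
```

For two lists of 100 entries each this gives 10 * 10 = 100 comparisons, and for two lists of 10000 entries it gives 100 * 100 = 10000, instead of the full 10^4 and 10^8 pairs.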

I tried to explain the reason for processing only some of the values with
more comments. I hope it is understandable.

Attachments:

inet-selfuncs-v5.patch (application/octet-stream)
diff --git a/src/backend/utils/adt/network_selfuncs.c b/src/backend/utils/adt/network_selfuncs.c
index d0d806f..e3218e9 100644
--- a/src/backend/utils/adt/network_selfuncs.c
+++ b/src/backend/utils/adt/network_selfuncs.c
@@ -1,32 +1,693 @@
 /*-------------------------------------------------------------------------
  *
  * network_selfuncs.c
  *	  Functions for selectivity estimation of inet/cidr operators
  *
- * Currently these are just stubs, but we hope to do better soon.
+ * Estimates are based on null fraction, distinct value count, most common
+ * values, and histogram of inet/cidr datatypes.
  *
  * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
  *
  *
  * IDENTIFICATION
  *	  src/backend/utils/adt/network_selfuncs.c
  *
  *-------------------------------------------------------------------------
  */
 #include "postgres.h"
 
+#include <math.h>
+
+#include "access/htup_details.h"
+#include "catalog/pg_collation.h"
+#include "catalog/pg_operator.h"
+#include "catalog/pg_statistic.h"
+#include "utils/lsyscache.h"
 #include "utils/inet.h"
+#include "utils/selfuncs.h"
 
 
+/* Default selectivity constant for the inet overlap operator */
+#define DEFAULT_OVERLAP_SEL 0.01
+
+/* Default selectivity constant for the other operators */
+#define DEFAULT_INCLUSION_SEL 0.005
+
+/* Default selectivity for given operator */
+#define DEFAULT_SEL(operator) \
+	((operator) == OID_INET_OVERLAP_OP ? \
+			DEFAULT_OVERLAP_SEL : DEFAULT_INCLUSION_SEL)
+
+static short int inet_opr_order(Oid operator, bool reversed);
+static Selectivity inet_his_inclusion_selec(Datum *values, int nvalues,
+					int red_nvalues, Datum *constvalue, double ndistinct,
+					short int opr_order);
+static Selectivity inet_mcv_join_selec(Datum *values1, float4 *numbers1,
+					int nvalues1, Datum *values2, float4 *numbers2,
+					int nvalues2, Oid operator, bool reversed);
+static Selectivity inet_mcv_his_selec(Datum *mcv_values, float4 *mcv_numbers,
+					int mcv_nvalues, Datum *his_values, int his_nvalues,
+					int red_nvalues, short int opr_order);
+static Selectivity inet_his_inclusion_join_selec(Datum *his1_values,
+					int his1_nvalues, int red1_nvalues, Datum *his2_values,
+					int his2_nvalues, int red2_nvalues, short int opr_order);
+static short int inet_inclusion_cmp(inet *left, inet *right,
+									short int opr_order);
+static short int inet_masklen_inclusion_cmp(inet *left, inet *right,
+											short int opr_order);
+static short int inet_his_match_divider(inet *boundary, inet *query,
+										short int opr_order);
+
+/*
+ * Selectivity estimation for the subnet inclusion operators
+ */
 Datum
 networksel(PG_FUNCTION_ARGS)
 {
-	PG_RETURN_FLOAT8(0.001);
+	PlannerInfo	   *root = (PlannerInfo *) PG_GETARG_POINTER(0);
+	Oid				operator = PG_GETARG_OID(1);
+	List		   *args = (List *) PG_GETARG_POINTER(2);
+	int				varRelid = PG_GETARG_INT32(3),
+					his_nvalues;
+	VariableStatData vardata;
+	Node		   *other;
+	bool			varonleft;
+	Selectivity		selec,
+					max_mcv_selec,
+					max_his_selec;
+	Datum			constvalue,
+				   *his_values;
+	Form_pg_statistic stats;
+	FmgrInfo		proc;
+
+	/*
+	 * If expression is not (variable op something) or (something op
+	 * variable), then punt and return a default estimate.
+	 */
+	if (!get_restriction_variable(root, args, varRelid,
+								  &vardata, &other, &varonleft))
+		PG_RETURN_FLOAT8(DEFAULT_SEL(operator));
+
+	/*
+	 * Can't do anything useful if the something is not a constant, either.
+	 */
+	if (!IsA(other, Const))
+	{
+		ReleaseVariableStats(vardata);
+		PG_RETURN_FLOAT8(DEFAULT_SEL(operator));
+	}
+
+	/* All of the subnet inclusion operators are strict. */
+	if (((Const *) other)->constisnull)
+	{
+		ReleaseVariableStats(vardata);
+		PG_RETURN_FLOAT8(0.0);
+	}
+
+	if (!HeapTupleIsValid(vardata.statsTuple))
+	{
+		ReleaseVariableStats(vardata);
+		PG_RETURN_FLOAT8(DEFAULT_SEL(operator));
+	}
+
+	constvalue = ((Const *) other)->constvalue;
+	stats = (Form_pg_statistic) GETSTRUCT(vardata.statsTuple);
+
+	fmgr_info(get_opcode(operator), &proc);
+	selec = mcv_selectivity(&vardata, &proc, constvalue, varonleft,
+							&max_mcv_selec);
+	max_his_selec = 1.0 - stats->stanullfrac - max_mcv_selec;
+
+	if (get_attstatsslot(vardata.statsTuple,
+						 vardata.atttype, vardata.atttypmod,
+						 STATISTIC_KIND_HISTOGRAM, InvalidOid,
+						 NULL,
+						 &his_values, &his_nvalues,
+						 NULL, NULL))
+	{
+		selec += max_his_selec *
+				 inet_his_inclusion_selec(his_values, his_nvalues, his_nvalues,
+										  &constvalue, stats->stadistinct,
+										  inet_opr_order(operator, !varonleft));
+
+		free_attstatsslot(vardata.atttype, his_values, his_nvalues, NULL, 0);
+	}
+	else
+		if (max_mcv_selec > 0)
+			selec = selec / (1.0 - max_his_selec); /* Correct the value. */
+		else
+			selec = DEFAULT_SEL(operator);
+
+	/* Result should be in range, but make sure... */
+	CLAMP_PROBABILITY(selec);
+
+	ReleaseVariableStats(vardata);
+	PG_RETURN_FLOAT8(selec);
 }
 
+/*
+ * Join selectivity estimation for the subnet inclusion operators
+ *
+ * Calculates MCV vs MCV, MCV vs histogram and histogram vs histogram
+ * selectivity for join using the subnet inclusion operators. Unlike the
+ * join selectivity function for the equality operator, eqjoinsel(), 1 to 1
+ * matching of the values is not enough. Network inclusion operators are
+ * likely to match many to many. It requires to loop the MCV and histogram
+ * lists to the end. Also, MCV vs histogram selectivity is not neglected
+ * as in eqjoinsel().
+ *
+ * More processing on this function can become a problem with large
+ * statistics. To avoid it only some of the values in the lists are used.
+ * The reduced amount of the list is calculated by the square root of
+ * the original amount. It fits the situation because the lists will be
+ * matched to each other (sqrt(x) * sqrt(x) == x). MCV's will be reduced
+ * by choosing the first N values. Histogram boundaries will be reduced
+ * by skipping some of them.
+ */
 Datum
 networkjoinsel(PG_FUNCTION_ARGS)
 {
-	PG_RETURN_FLOAT8(0.001);
+	PlannerInfo	   *root = (PlannerInfo *) PG_GETARG_POINTER(0);
+	Oid				operator = PG_GETARG_OID(1);
+	List		   *args = (List *) PG_GETARG_POINTER(2);
+	SpecialJoinInfo *sjinfo = (SpecialJoinInfo *) PG_GETARG_POINTER(4);
+	VariableStatData vardata1,
+					vardata2;
+	Form_pg_statistic stats1,
+					stats2;
+	Selectivity		selec,
+					mcv1_max_selec,
+					mcv1_red_selec,
+					mcv2_max_selec,
+					mcv2_red_selec;
+	bool			reversed,
+					mcv1_exists,
+					mcv2_exists,
+					his1_exists,
+					his2_exists;
+	short int		opr_order;
+	int				mcv1_nvalues,
+					mcv2_nvalues,
+					mcv1_nnumbers,
+					mcv2_nnumbers,
+					his1_nvalues,
+					his2_nvalues,
+					red1_nvalues,
+					red2_nvalues,
+					i;
+	Datum		   *mcv1_values,
+				   *mcv2_values,
+				   *his1_values,
+				   *his2_values;
+	float4		   *mcv1_numbers,
+				   *mcv2_numbers;
+
+	get_join_variables(root, args, sjinfo, &vardata1, &vardata2, &reversed);
+
+	switch (sjinfo->jointype)
+	{
+		case JOIN_INNER:
+		case JOIN_LEFT:
+		case JOIN_FULL:
+			break;
+		default:
+			ReleaseVariableStats(vardata1);
+			ReleaseVariableStats(vardata2);
+			PG_RETURN_FLOAT8(DEFAULT_SEL(operator));
+	}
+
+	if (!HeapTupleIsValid(vardata1.statsTuple) ||
+		!HeapTupleIsValid(vardata2.statsTuple))
+	{
+		ReleaseVariableStats(vardata1);
+		ReleaseVariableStats(vardata2);
+		PG_RETURN_FLOAT8(DEFAULT_SEL(operator));
+	}
+
+	opr_order = inet_opr_order(operator, reversed);
+	stats1 = (Form_pg_statistic) GETSTRUCT(vardata1.statsTuple);
+	stats2 = (Form_pg_statistic) GETSTRUCT(vardata2.statsTuple);
+	mcv1_exists = get_attstatsslot(vardata1.statsTuple,
+								   vardata1.atttype, vardata1.atttypmod,
+								   STATISTIC_KIND_MCV, InvalidOid,
+								   NULL,
+								   &mcv1_values, &mcv1_nvalues,
+								   &mcv1_numbers, &mcv1_nnumbers);
+	mcv2_exists = get_attstatsslot(vardata2.statsTuple,
+								   vardata2.atttype, vardata2.atttypmod,
+								   STATISTIC_KIND_MCV, InvalidOid,
+								   NULL,
+								   &mcv2_values, &mcv2_nvalues,
+								   &mcv2_numbers, &mcv2_nnumbers);
+	his1_exists = get_attstatsslot(vardata1.statsTuple,
+								   vardata1.atttype, vardata1.atttypmod,
+								   STATISTIC_KIND_HISTOGRAM, InvalidOid,
+								   NULL,
+								   &his1_values, &his1_nvalues,
+								   NULL, NULL);
+	his2_exists = get_attstatsslot(vardata2.statsTuple,
+								   vardata2.atttype, vardata2.atttypmod,
+								   STATISTIC_KIND_HISTOGRAM, InvalidOid,
+								   NULL,
+								   &his2_values, &his2_nvalues,
+								   NULL, NULL);
+
+	red1_nvalues = ((int) sqrt(Max(mcv1_nvalues, his1_nvalues))) + 1;
+	red2_nvalues = ((int) sqrt(Max(mcv2_nvalues, his2_nvalues))) + 1;
+
+	selec = 0.0;
+	mcv1_max_selec = 0.0;
+	mcv1_red_selec = 0.0;
+	mcv2_max_selec = 0.0;
+	mcv2_red_selec = 0.0;
+	if (mcv1_exists)
+		for (i = 0; i < mcv1_nvalues; i++)
+		{
+			mcv1_max_selec += mcv1_numbers[i];
+			if (i < red1_nvalues)
+				mcv1_red_selec += mcv1_numbers[i];
+		}
+	if (mcv2_exists)
+		for (i = 0; i < mcv2_nvalues; i++)
+		{
+			mcv2_max_selec += mcv2_numbers[i];
+			if (i < red2_nvalues)
+				mcv2_red_selec += mcv2_numbers[i];
+		}
+
+	elog(LOG, "\n-----------");
+	if (mcv1_exists && mcv2_exists)
+		selec += (mcv1_max_selec / mcv1_red_selec) *
+				 (mcv2_max_selec / mcv2_red_selec) *
+				 inet_mcv_join_selec(mcv1_values, mcv1_numbers,
+									 Min(mcv1_nvalues, red1_nvalues),
+									 mcv2_values, mcv2_numbers,
+									 Min(mcv2_nvalues, red2_nvalues),
+									 operator, reversed);
+	elog(LOG, "%f", selec);
+	if (mcv1_exists && his2_exists)
+		selec += (mcv1_max_selec / mcv1_red_selec) *
+				 inet_mcv_his_selec(mcv1_values, mcv1_numbers,
+									Min(mcv1_nvalues, red1_nvalues),
+									his2_values, his2_nvalues, red2_nvalues,
+									opr_order);
+	elog(LOG, "%f", selec);
+	if (mcv2_exists && his1_exists)
+		selec += (mcv2_max_selec / mcv2_red_selec) *
+				 inet_mcv_his_selec(mcv2_values, mcv2_numbers,
+									Min(mcv2_nvalues, red2_nvalues),
+									his1_values, his1_nvalues, red1_nvalues,
+									opr_order);
+	elog(LOG, "%f", selec);
+	if (his1_exists && his2_exists)
+		selec += (1.0 - stats1->stanullfrac - mcv1_max_selec) *
+				 (1.0 - stats2->stanullfrac - mcv2_max_selec) *
+				 inet_his_inclusion_join_selec(his1_values, his1_nvalues,
+											   red1_nvalues, his2_values,
+											   his2_nvalues, red2_nvalues,
+											   opr_order);
+	elog(LOG, "%f", selec);
+
+	/* Correct the value. */
+	if (!his1_exists)
+		selec /= stats1->stanullfrac + mcv1_max_selec;
+	if (!his2_exists)
+		selec /= stats2->stanullfrac + mcv2_max_selec;
+
+	if (!mcv1_exists && !mcv2_exists && !his1_exists && his2_exists)
+		selec = DEFAULT_SEL(operator);
+
+	if (mcv1_exists)
+		free_attstatsslot(vardata1.atttype, mcv1_values, mcv1_nvalues,
+						  mcv1_numbers, mcv1_nnumbers);
+	if (mcv2_exists)
+		free_attstatsslot(vardata2.atttype, mcv2_values, mcv2_nvalues,
+						  mcv2_numbers, mcv2_nnumbers);
+	if (his1_exists)
+		free_attstatsslot(vardata1.atttype, his1_values, his1_nvalues, NULL, 0);
+	if (his2_exists)
+		free_attstatsslot(vardata2.atttype, his2_values, his2_nvalues, NULL, 0);
+	ReleaseVariableStats(vardata1);
+	ReleaseVariableStats(vardata2);
+
+	/* Result should be in range, but make sure... */
+	CLAMP_PROBABILITY(selec);
+
+	PG_RETURN_FLOAT8(selec);
+}
+
+/*
+ * Practical comparable numbers for the subnet inclusion operators
+ */
+static short int
+inet_opr_order(Oid operator, bool reversed)
+{
+	short int	order;
+
+	switch (operator)
+	{
+		case OID_INET_SUP_OP:
+			order = -2;
+			break;
+		case OID_INET_SUPEQ_OP:
+			order = -1;
+			break;
+		case OID_INET_OVERLAP_OP:
+			order = 0;
+			break;
+		case OID_INET_SUBEQ_OP:
+			order = 1;
+			break;
+		case OID_INET_SUB_OP:
+			order = 2;
+			break;
+		default:
+			elog(ERROR, "unknown operator for inet inclusion selectivity");
+	}
+
+	return (reversed ? order * -1 : order);
+}
+
+/*
+ * Inet histogram inclusion selectivity estimation
+ *
+ * Calculates histogram selectivity for the subnet inclusion operators of
+ * the inet type. The return value is between 0 and 1. It should be
+ * corrected with the MCV selectivity and null fraction. If the constant
+ * is less than the first element or greater than the last element of
+ * the histogram the return value will be 0.
+ *
+ * This function is capable of checking only some of the histogram boundies.
+ * It is used to make join selectivity estimation faster. nvalues should
+ * also be given to red_nvalues to avoid this behavior. It is explained on
+ * inet_mcv_join_selec(), below.
+ *
+ * The histogram is originally for the basic comparison operators. Only
+ * the common bits of the network part and the length of the network part
+ * (masklen) are appropriate for the subnet inclusion operators. Fortunately,
+ * basic comparison fits in this situation. Even so, the length of the
+ * network part would not really be significant in the histogram. This would
+ * lead to big mistakes for data sets with uneven masklen distribution.
+ * To avoid this problem, comparisons with the left and the right side of
+ * the buckets are used together.
+ *
+ * Histogram bucket matches are calculated in 3 forms. If the constant
+ * matches both sides the bucket is considered as fully matched. If the
+ * constant matches only the right side the bucket is not considered as
+ * matched at all. In that case the ratio for only one value in the column
+ * is added to the selectivity.
+ *
+ * The ratio for only one value is calculated with the ndistinct variable
+ * if greater than 0. 0 can be given if this behavior is not desired.
+ * This ratio can be big enough to not disregard for addresses with small
+ * masklens. See pg_statistic for more information about it.
+ *
+ * When the constant matches only the right side of the bucket, it will match
+ * the next bucket, unless the bucket is the last one. If these buckets would
+ * be considered as matched it would lead to unfair multiple matches for some
+ * constants.
+ *
+ * The third form is to match the bucket partially. We try to calculate
+ * dividers for both of the boundaries. If the address family of the boundary
+ * does not match the constant or comparison of the length of the network
+ * parts is not true by the operator, the divider for the boundary would not
+ * be taken into account. If both dividers can be calculated, the greater
+ * one will be used to minimize the mistake in the buckets which have
+ * disparate masklens.
+ *
+ * The divider on the partial bucket match is imagined as the distance
+ * between the decisive bits and the common bits of the addresses. It will be
+ * used as power of two as it is the natural scale for the IP network
+ * inclusion. The partial bucket match divider calculation is an empirical
+ * formula and subject to change with more experiment.
+ *
+ * For partial match with buckets which have different address families
+ * on the left and right sides only the boundary with the same address
+ * family is taken into consideration. This can cause more mistakes for these
+ * buckets if the masklens of their boundaries are also disparate. It can
+ * only be the case for one bucket, if there are addresses with different
+ * families on the column. It seems as a better option than not considering
+ * these buckets.
+ */
+static Selectivity
+inet_his_inclusion_selec(Datum *values, int nvalues, int red_nvalues,
+						 Datum *constvalue, double ndistinct,
+						 short int opr_order)
+{
+	inet		   *query,
+				   *left,
+				   *right;
+	float			gap,
+					match;
+	int				i;
+	short int		left_order,
+					right_order,
+					left_divider,
+					right_divider;
+
+	Assert(nvalues >= red_nvalues);
+
+	gap = ((float) (nvalues - 1)) / ((float) red_nvalues);
+	match = 0.0;
+	query = DatumGetInetP(*constvalue);
+	left = DatumGetInetP(values[0]);
+	left_order = inet_inclusion_cmp(left, query, opr_order);
+
+	if (left_order == 0)
+	{
+		/* Only left boundary match. */
+
+		if (ndistinct > 0)
+			match += 1.0 / ndistinct;
+	}
+
+	for (i = 1; i <= red_nvalues; i++)
+	{
+		right = DatumGetInetP(values[(int) (i * gap)]);
+		right_order = inet_inclusion_cmp(right, query, opr_order);
+
+		if (right_order == 0 && left_order == 0)
+		{
+			/* Full bucket match. */
+
+			match += 1.0;
+		}
+		else if (((right_order > 0 && left_order <= 0) ||
+				  (right_order < 0 && left_order >= 0)) && left)
+		{
+			/* Partial bucket match. */
+
+			left_divider = inet_his_match_divider(left, query, opr_order);
+			right_divider = inet_his_match_divider(right, query, opr_order);
+
+			if (left_divider >= 0 || right_divider >= 0)
+				match += 1.0 / pow(2, Max(left_divider, right_divider));
+		}
+		else if (right_order == 0)
+		{
+			/* Only right boundary match. */
+
+			if (ndistinct > 0)
+				match += 1.0 / ndistinct;
+		}
+
+		/* Shift the variables. */
+		left = right;
+		left_order = right_order;
+	}
+
+	/*
+	 * (1.0 / ndistinct) should be added to (red_nvalues - 1) in case
+	 * the query matches the first value, but it is neglected.
+	 */
+	return match / (red_nvalues - 1);
+}
+
+/*
+ * Inet MCV join selectivity estimation
+ *
+ * The original function of the operator is used in this function, like
+ * mcv_selectivity() in selfuncs.c. Actually this function has nothing
+ * to do with the network data types except its name and location.
+ */
+static Selectivity
+inet_mcv_join_selec(Datum *values1, float4 *numbers1, int nvalues1,
+					Datum *values2, float4 *numbers2, int nvalues2,
+					Oid operator, bool reversed)
+{
+	Selectivity		selec;
+	FmgrInfo		proc;
+	int				i,
+					j;
+
+	fmgr_info(get_opcode(operator), &proc);
+	selec = 0.0;
+
+	for (i = 0; i < nvalues1; i++)
+		for (j = 0; j < nvalues2; j++)
+			if (reversed ?
+				DatumGetBool(FunctionCall2Coll(&proc,
+											   DEFAULT_COLLATION_OID,
+											   values1[i],
+											   values2[j])) :
+				DatumGetBool(FunctionCall2Coll(&proc,
+											   DEFAULT_COLLATION_OID,
+											   values2[j],
+											   values1[i])))
+				selec += numbers1[i] * numbers2[j];
+
+	return selec;
+}
+
+/*
+ * Inet MCV vs histogram inclusion join selectivity estimation
+ */
+static Selectivity
+inet_mcv_his_selec(Datum *mcv_values, float4 *mcv_numbers, int mcv_nvalues,
+				   Datum *his_values, int his_nvalues, int red_nvalues,
+				   short int opr_order)
+{
+	Selectivity		selec;
+	int				i;
+
+	selec = 0.0;
+	for (i = 0; i < mcv_nvalues; i++)
+		selec += mcv_numbers[i] *
+				 inet_his_inclusion_selec(his_values, his_nvalues, red_nvalues,
+										  &mcv_values[i], 0, opr_order);
+	return selec;
+}
+
+/*
+ * Inet histogram inclusion join selectivity estimation
+ *
+ * It is required to choose red1_nvalues from his1_values. The first and
+ * the last values will not be used for better sampling. A gap will be
+ * calculated and used to skip some of the histogram boundaries. It is
+ * important to check exactly given amount of the values.
+ */
+static Selectivity
+inet_his_inclusion_join_selec(Datum *his1_values, int his1_nvalues,
+							  int red1_nvalues, Datum *his2_values,
+							  int his2_nvalues, int red2_nvalues,
+							  short int opr_order)
+{
+	float			match,
+					gap;
+	int				i;
+
+	Assert(his1_nvalues >= red1_nvalues);
+
+	gap = ((float) (his1_nvalues - 2)) / ((float) red1_nvalues);
+	match = 0.0;
+	for (i = 1; i <= red1_nvalues; i++)
+		match += inet_his_inclusion_selec(his2_values, his2_nvalues,
+										  red2_nvalues,
+										  &his1_values[(int) (i * gap)],
+										  0, opr_order);
+
+	return match / red1_nvalues;
+}
+
+/*
+ * Comparison function for the subnet inclusion operators
+ *
+ * Comparison is compatible with the basic comparison function for the inet
+ * type. See network_cmp_internal on network.c for the original. Basic
+ * comparison operators are implemented with the network_cmp_internal
+ * function. It is possible to implement the subnet inclusion operators with
+ * this function.
+ *
+ * Comparison is first on the common bits of the network part, then on
+ * the length of the network part (masklen) as the network_cmp_internal
+ * function. Only the first part is on this function. The second part is
+ * separated to another function for reusability. The difference between
+ * the second part and the original network_cmp_internal is that the operator
+ * is used while comparing the lengths of the network parts. See the second
+ * part on the inet_masklen_inclusion_cmp function below.
+ */
+static short int
+inet_inclusion_cmp(inet *left, inet *right, short int opr_order)
+{
+	if (ip_family(left) == ip_family(right))
+	{
+		short int	 order;
+
+		order = bitncmp(ip_addr(left), ip_addr(right),
+						Min(ip_bits(left), ip_bits(right)));
+
+		if (order != 0)
+			return order;
+
+		return inet_masklen_inclusion_cmp(left, right, opr_order);
+	}
+
+	return ip_family(left) - ip_family(right);
+}
+
+/*
+ * Masklen comparison function for the subnet inclusion operators
+ *
+ * Compares the lengths of network parts of the inputs using the operator.
+ * If the comparison is okay for the operator the return value will be 0.
+ * Otherwise the return value will be less than or greater than 0 with
+ * respect to the operator.
+ */
+static short int
+inet_masklen_inclusion_cmp(inet *left, inet *right, short int opr_order)
+{
+	if (ip_family(left) == ip_family(right))
+	{
+		short int	 order;
+
+		order = ip_bits(left) - ip_bits(right);
+
+		if ((order > 0 && opr_order >= 0) ||
+			(order == 0 && opr_order >= -1 && opr_order <= 1) ||
+			(order < 0 && opr_order <= 0))
+			return 0;
+
+		return opr_order;
+	}
+
+	return ip_family(left) - ip_family(right);
+}
+
+/*
+ * Inet histogram partial match divider calculation
+ *
+ * First the families and the lengths of the network parts are compared
+ * using the subnet inclusion operator. The divider will be calculated
+ * using the masklens and the common bits of the addresses. -1 will be
+ * returned if it cannot be calculated.
+ */
+static short int
+inet_his_match_divider(inet *boundary, inet *query, short int opr_order)
+{
+	if (inet_masklen_inclusion_cmp(boundary, query, opr_order) == 0)
+	{
+		short int	min_bits,
+					decisive_bits;
+
+		min_bits = Min(ip_bits(boundary), ip_bits(query));
+
+		/*
+		 * Set the decisive bits from the one which should contain the other
+		 * according to the operator.
+		 */
+		if (opr_order < 0)
+			decisive_bits = ip_bits(boundary);
+		else if (opr_order > 0)
+			decisive_bits = ip_bits(query);
+		else
+			decisive_bits = min_bits;
+
+		if (min_bits > 0)
+			return decisive_bits - bitncommon(ip_addr(boundary), ip_addr(query),
+											  min_bits);
+		return decisive_bits;
+	}
+
+	return -1;
 }
diff --git a/src/include/catalog/pg_operator.h b/src/include/catalog/pg_operator.h
index 87ee4eb..8afa428 100644
--- a/src/include/catalog/pg_operator.h
+++ b/src/include/catalog/pg_operator.h
@@ -1135,32 +1135,33 @@ DESCR("not equal");
 DATA(insert OID = 1203 (  "<"	   PGNSP PGUID b f f 869 869	 16 1205 1206 network_lt scalarltsel scalarltjoinsel ));
 DESCR("less than");
 DATA(insert OID = 1204 (  "<="	   PGNSP PGUID b f f 869 869	 16 1206 1205 network_le scalarltsel scalarltjoinsel ));
 DESCR("less than or equal");
 DATA(insert OID = 1205 (  ">"	   PGNSP PGUID b f f 869 869	 16 1203 1204 network_gt scalargtsel scalargtjoinsel ));
 DESCR("greater than");
 DATA(insert OID = 1206 (  ">="	   PGNSP PGUID b f f 869 869	 16 1204 1203 network_ge scalargtsel scalargtjoinsel ));
 DESCR("greater than or equal");
 DATA(insert OID = 931  (  "<<"	   PGNSP PGUID b f f 869 869	 16 933		0 network_sub networksel networkjoinsel ));
 DESCR("is subnet");
-#define OID_INET_SUB_OP				  931
+#define OID_INET_SUB_OP			931
 DATA(insert OID = 932  (  "<<="    PGNSP PGUID b f f 869 869	 16 934		0 network_subeq networksel networkjoinsel ));
 DESCR("is subnet or equal");
-#define OID_INET_SUBEQ_OP				932
+#define OID_INET_SUBEQ_OP		932
 DATA(insert OID = 933  (  ">>"	   PGNSP PGUID b f f 869 869	 16 931		0 network_sup networksel networkjoinsel ));
 DESCR("is supernet");
-#define OID_INET_SUP_OP				  933
+#define OID_INET_SUP_OP			933
 DATA(insert OID = 934  (  ">>="    PGNSP PGUID b f f 869 869	 16 932		0 network_supeq networksel networkjoinsel ));
 DESCR("is supernet or equal");
-#define OID_INET_SUPEQ_OP				934
+#define OID_INET_SUPEQ_OP		934
 DATA(insert OID = 3552	(  "&&"    PGNSP PGUID b f f 869 869	 16 3552	0 network_overlap networksel networkjoinsel ));
 DESCR("overlaps (is subnet or supernet)");
+#define OID_INET_OVERLAP_OP		3552
 
 DATA(insert OID = 2634 (  "~"	   PGNSP PGUID l f f	  0 869 869 0 0 inetnot - - ));
 DESCR("bitwise not");
 DATA(insert OID = 2635 (  "&"	   PGNSP PGUID b f f	869 869 869 0 0 inetand - - ));
 DESCR("bitwise and");
 DATA(insert OID = 2636 (  "|"	   PGNSP PGUID b f f	869 869 869 0 0 inetor - - ));
 DESCR("bitwise or");
 DATA(insert OID = 2637 (  "+"	   PGNSP PGUID b f f	869  20 869 2638 0 inetpl - - ));
 DESCR("add");
 DATA(insert OID = 2638 (  "+"	   PGNSP PGUID b f f	 20 869 869 2637 0 int8pl_inet - - ));
#3Dilip kumar
dilip.kumar@huawei.com
In reply to: Emre Hasegeli (#1)
Re: Selectivity estimation for inet operators

On 15 May 2014 14:04, Emre Hasegeli wrote:

> * matching first MCV to second MCV
> * searching first MCV in the second histogram
> * searching second MCV in the first histogram
> * searching boundaries of the first histogram in the second histogram
>
> Comparing the lists with each other slows down the function when
> statistics set to higher values. To avoid this problem I only use
> log(n) values of the lists. It is the first log(n) value for MCV,
> evenly separated values for histograms. In my tests, this optimization
> does not affect the planning time when statistics = 100, but does
> affect accuracy of the estimation. I can send the version without this
> optimization, if slow down with larger statistics is not a problem
> which should be solved on the selectivity estimation function.
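The reduction described in the quoted paragraph can be sketched roughly as follows. Note that the message says log(n), while the comment and code in the attached patch compute the reduced count as a square root plus one and pick histogram boundaries with a fractional gap. This is only an illustrative sketch under stated assumptions: the names isqrt_floor, reduced_count and sampled_index are hypothetical and not part of the patch, an integer square root stands in for the patch's (int) sqrt(n), and double is used where the patch uses float.

```c
#include <assert.h>

/* Hypothetical helper: integer square root, standing in for (int) sqrt(n). */
static int
isqrt_floor(int n)
{
	int			r = 0;

	assert(n >= 0);
	while ((r + 1) * (r + 1) <= n)
		r++;
	return r;
}

/*
 * Hypothetical helper: reduced number of list entries to examine, following
 * the red1_nvalues / red2_nvalues arithmetic in networkjoinsel().
 */
static int
reduced_count(int mcv_nvalues, int his_nvalues)
{
	int			n = mcv_nvalues > his_nvalues ? mcv_nvalues : his_nvalues;

	return isqrt_floor(n) + 1;
}

/*
 * Hypothetical helper: index of the i-th sampled histogram boundary
 * (1 <= i <= red_nvalues), following the gap arithmetic used in
 * inet_his_inclusion_selec().
 */
static int
sampled_index(int nvalues, int red_nvalues, int i)
{
	double		gap;

	assert(red_nvalues > 0 && nvalues >= red_nvalues);
	gap = (double) (nvalues - 1) / (double) red_nvalues;
	return (int) (i * gap);
}
```

With a histogram of 101 boundaries and 100 MCVs, for example, this examines 11 entries per list, so the pairwise comparison cost grows roughly linearly with the statistics target instead of quadratically.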

I have started reviewing this patch. So far I have done a basic review and some testing/debugging.

1. The patch applies to git HEAD.
2. Basic testing works fine.

I have one question.

In the inet_his_inclusion_selec function,
when the constant matches only the right side of the bucket and it is the last bucket, it is never considered as a partial match candidate.
In my opinion, if it is not the last bucket, it becomes the left boundary of the next bucket and is treated as a partial match there, so there is no problem; but in the case of the last bucket it can give a wrong selectivity.

Can't we consider it as a partial bucket match if it is the last bucket?

Apart from that, there is one spelling error you can correct
-- in the inet_his_inclusion_selec comments:
histogram boundies -> histogram boundaries :)

Thanks & Regards,
Dilip Kumar

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Emre Hasegeli
emre@hasegeli.com
In reply to: Dilip kumar (#3)
1 attachment(s)
Re: Selectivity estimation for inet operators

Thank you for looking at it.

> In the inet_his_inclusion_selec function,
> when the constant matches only the right side of the bucket and it is the last bucket, it is never considered as a partial match candidate.
> In my opinion, if it is not the last bucket, it becomes the left boundary of the next bucket and is treated as a partial match there, so there is no problem; but in the case of the last bucket it can give a wrong selectivity.
>
> Can't we consider it as a partial bucket match if it is the last bucket?

Actually, in that case, the ratio for one value in the column is used.
I clarified the comment about it. I do not think it is a common enough
case to justify making the function more complicated.
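To make the case under discussion concrete, the per-bucket decision in inet_his_inclusion_selec() can be restated as a small standalone sketch. It is a simplified reading of the attached patch, not a replacement for it: the enum and the names classify_bucket and bucket_contribution are hypothetical, the always-true `&& left` test from the patch is left out, and repeated halving replaces pow() so the sketch needs no math library.

```c
#include <assert.h>

/* Hypothetical labels for the branches of the bucket loop. */
typedef enum
{
	BUCKET_NO_MATCH,
	BUCKET_FULL_MATCH,
	BUCKET_PARTIAL_MATCH,
	BUCKET_RIGHT_ONLY_MATCH
} BucketMatch;

/*
 * Classify one bucket from the comparison results of its two boundaries
 * against the constant (0 means the boundary matches the constant).
 */
static BucketMatch
classify_bucket(int left_order, int right_order)
{
	if (right_order == 0 && left_order == 0)
		return BUCKET_FULL_MATCH;
	if ((right_order > 0 && left_order <= 0) ||
		(right_order < 0 && left_order >= 0))
		return BUCKET_PARTIAL_MATCH;
	if (right_order == 0)
		return BUCKET_RIGHT_ONLY_MATCH;
	return BUCKET_NO_MATCH;
}

/*
 * Amount added to "match" for one bucket.  A divider of -1 means it could
 * not be calculated for that boundary.
 */
static double
bucket_contribution(BucketMatch kind, double ndistinct,
					int left_divider, int right_divider)
{
	double		share = 1.0;
	int			divider;
	int			i;

	switch (kind)
	{
		case BUCKET_FULL_MATCH:
			return 1.0;
		case BUCKET_PARTIAL_MATCH:
			divider = left_divider > right_divider ?
				left_divider : right_divider;
			if (divider < 0)
				return 0.0;
			/* 1 / 2^divider, computed by repeated halving */
			for (i = 0; i < divider; i++)
				share /= 2.0;
			return share;
		case BUCKET_RIGHT_ONLY_MATCH:
			/* ratio of a single value in the column, if known */
			return ndistinct > 0 ? 1.0 / ndistinct : 0.0;
		default:
			return 0.0;
	}
}
```

In this reading, a constant that matches only the right boundary of the last bucket falls into the right-only branch and contributes 1 / ndistinct, while a constant that matches only the left boundary of a bucket is handled by the partial match branch.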

> Apart from that, there is one spelling error you can correct
> -- in the inet_his_inclusion_selec comments:
> histogram boundies -> histogram boundaries :)

I fixed it. New version attached. The debug log statements are also
removed.

Attachments:

inet-selfuncs-v6.patch (text/plain; charset=utf-8)
diff --git a/src/backend/utils/adt/network_selfuncs.c b/src/backend/utils/adt/network_selfuncs.c
index d0d806f..d8aeae9 100644
--- a/src/backend/utils/adt/network_selfuncs.c
+++ b/src/backend/utils/adt/network_selfuncs.c
@@ -1,32 +1,677 @@
 /*-------------------------------------------------------------------------
  *
  * network_selfuncs.c
  *	  Functions for selectivity estimation of inet/cidr operators
  *
- * Currently these are just stubs, but we hope to do better soon.
+ * Estimates are based on null fraction, distinct value count, most common
+ * values, and histogram of inet/cidr datatypes.
  *
  * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
  *
  *
  * IDENTIFICATION
  *	  src/backend/utils/adt/network_selfuncs.c
  *
  *-------------------------------------------------------------------------
  */
 #include "postgres.h"
 
+#include <math.h>
+
+#include "access/htup_details.h"
+#include "catalog/pg_collation.h"
+#include "catalog/pg_operator.h"
+#include "catalog/pg_statistic.h"
+#include "utils/lsyscache.h"
 #include "utils/inet.h"
+#include "utils/selfuncs.h"
 
 
+/* Default selectivity constant for the inet overlap operator */
+#define DEFAULT_OVERLAP_SEL 0.01
+
+/* Default selectivity constant for the other operators */
+#define DEFAULT_INCLUSION_SEL 0.005
+
+/* Default selectivity for given operator */
+#define DEFAULT_SEL(operator) \
+	((operator) == OID_INET_OVERLAP_OP ? \
+			DEFAULT_OVERLAP_SEL : DEFAULT_INCLUSION_SEL)
+
+static short int inet_opr_order(Oid operator, bool reversed);
+static Selectivity inet_his_inclusion_selec(Datum *values, int nvalues,
+					int red_nvalues, Datum *constvalue, double ndistinct,
+					short int opr_order);
+static Selectivity inet_mcv_join_selec(Datum *values1, float4 *numbers1,
+					int nvalues1, Datum *values2, float4 *numbers2,
+					int nvalues2, Oid operator, bool reversed);
+static Selectivity inet_mcv_his_selec(Datum *mcv_values, float4 *mcv_numbers,
+					int mcv_nvalues, Datum *his_values, int his_nvalues,
+					int red_nvalues, short int opr_order);
+static Selectivity inet_his_inclusion_join_selec(Datum *his1_values,
+					int his1_nvalues, int red1_nvalues, Datum *his2_values,
+					int his2_nvalues, int red2_nvalues, short int opr_order);
+static short int inet_inclusion_cmp(inet *left, inet *right,
+									short int opr_order);
+static short int inet_masklen_inclusion_cmp(inet *left, inet *right,
+											short int opr_order);
+static short int inet_his_match_divider(inet *boundary, inet *query,
+										short int opr_order);
+
+/*
+ * Selectivity estimation for the subnet inclusion operators
+ */
 Datum
 networksel(PG_FUNCTION_ARGS)
 {
-	PG_RETURN_FLOAT8(0.001);
+	PlannerInfo	   *root = (PlannerInfo *) PG_GETARG_POINTER(0);
+	Oid				operator = PG_GETARG_OID(1);
+	List		   *args = (List *) PG_GETARG_POINTER(2);
+	int				varRelid = PG_GETARG_INT32(3),
+					his_nvalues;
+	VariableStatData vardata;
+	Node		   *other;
+	bool			varonleft;
+	Selectivity		selec,
+					max_mcv_selec,
+					max_his_selec;
+	Datum			constvalue,
+				   *his_values;
+	Form_pg_statistic stats;
+	FmgrInfo		proc;
+
+	/*
+	 * If expression is not (variable op something) or (something op
+	 * variable), then punt and return a default estimate.
+	 */
+	if (!get_restriction_variable(root, args, varRelid,
+								  &vardata, &other, &varonleft))
+		PG_RETURN_FLOAT8(DEFAULT_SEL(operator));
+
+	/*
+	 * Can't do anything useful if the something is not a constant, either.
+	 */
+	if (!IsA(other, Const))
+	{
+		ReleaseVariableStats(vardata);
+		PG_RETURN_FLOAT8(DEFAULT_SEL(operator));
+	}
+
+	/* All of the subnet inclusion operators are strict. */
+	if (((Const *) other)->constisnull)
+	{
+		ReleaseVariableStats(vardata);
+		PG_RETURN_FLOAT8(0.0);
+	}
+
+	if (!HeapTupleIsValid(vardata.statsTuple))
+	{
+		ReleaseVariableStats(vardata);
+		PG_RETURN_FLOAT8(DEFAULT_SEL(operator));
+	}
+
+	constvalue = ((Const *) other)->constvalue;
+	stats = (Form_pg_statistic) GETSTRUCT(vardata.statsTuple);
+
+	fmgr_info(get_opcode(operator), &proc);
+	selec = mcv_selectivity(&vardata, &proc, constvalue, varonleft,
+							&max_mcv_selec);
+	max_his_selec = 1.0 - stats->stanullfrac - max_mcv_selec;
+
+	if (get_attstatsslot(vardata.statsTuple,
+						 vardata.atttype, vardata.atttypmod,
+						 STATISTIC_KIND_HISTOGRAM, InvalidOid,
+						 NULL,
+						 &his_values, &his_nvalues,
+						 NULL, NULL))
+	{
+		selec += max_his_selec *
+				 inet_his_inclusion_selec(his_values, his_nvalues, his_nvalues,
+										  &constvalue, stats->stadistinct,
+										  inet_opr_order(operator, !varonleft));
+
+		free_attstatsslot(vardata.atttype, his_values, his_nvalues, NULL, 0);
+	}
+	else
+		if (max_mcv_selec > 0)
+			selec = selec / (1.0 - max_his_selec); /* Correct the value. */
+		else
+			selec = DEFAULT_SEL(operator);
+
+	/* Result should be in range, but make sure... */
+	CLAMP_PROBABILITY(selec);
+
+	ReleaseVariableStats(vardata);
+	PG_RETURN_FLOAT8(selec);
 }
 
+/*
+ * Join selectivity estimation for the subnet inclusion operators
+ *
+ * Calculates MCV vs MCV, MCV vs histogram and histogram vs histogram
+ * selectivity for join using the subnet inclusion operators.  Unlike the
+ * join selectivity function for the equality operator, eqjoinsel(), 1 to 1
+ * matching of the values is not enough.   Network inclusion operators are
+ * likely to match many to many.  It requires looping over the MCV and
+ * histogram lists to the end.  Also, MCV vs histogram selectivity is not
+ * neglected as in eqjoinsel().
+ *
+ * More processing on this function can become a problem with large
+ * statistics.  To avoid it only some of the values in the lists are used.
+ * The reduced amount of the list is calculated by the square root of
+ * the original amount.  It fits the situation because the lists will be
+ * matched to each other (sqrt(x) * sqrt(x) == x).  MCV's will be reduced
+ * by choosing the first N values.  Histogram boundaries will be reduced
+ * by skipping some of them.
+ */
 Datum
 networkjoinsel(PG_FUNCTION_ARGS)
 {
-	PG_RETURN_FLOAT8(0.001);
+	PlannerInfo	   *root = (PlannerInfo *) PG_GETARG_POINTER(0);
+	Oid				operator = PG_GETARG_OID(1);
+	List		   *args = (List *) PG_GETARG_POINTER(2);
+	SpecialJoinInfo *sjinfo = (SpecialJoinInfo *) PG_GETARG_POINTER(4);
+	VariableStatData vardata1,
+					vardata2;
+	Form_pg_statistic stats1,
+					stats2;
+	Selectivity		selec,
+					mcv1_max_selec,
+					mcv1_red_selec,
+					mcv2_max_selec,
+					mcv2_red_selec;
+	bool			reversed,
+					mcv1_exists,
+					mcv2_exists,
+					his1_exists,
+					his2_exists;
+	short int		opr_order;
+	int				mcv1_nvalues,
+					mcv2_nvalues,
+					mcv1_nnumbers,
+					mcv2_nnumbers,
+					his1_nvalues,
+					his2_nvalues,
+					red1_nvalues,
+					red2_nvalues,
+					i;
+	Datum		   *mcv1_values,
+				   *mcv2_values,
+				   *his1_values,
+				   *his2_values;
+	float4		   *mcv1_numbers,
+				   *mcv2_numbers;
+
+	get_join_variables(root, args, sjinfo, &vardata1, &vardata2, &reversed);
+
+	switch (sjinfo->jointype)
+	{
+		case JOIN_INNER:
+		case JOIN_LEFT:
+		case JOIN_FULL:
+			break;
+		default:
+			ReleaseVariableStats(vardata1);
+			ReleaseVariableStats(vardata2);
+			PG_RETURN_FLOAT8(DEFAULT_SEL(operator));
+	}
+
+	if (!HeapTupleIsValid(vardata1.statsTuple) ||
+		!HeapTupleIsValid(vardata2.statsTuple))
+	{
+		ReleaseVariableStats(vardata1);
+		ReleaseVariableStats(vardata2);
+		PG_RETURN_FLOAT8(DEFAULT_SEL(operator));
+	}
+
+	opr_order = inet_opr_order(operator, reversed);
+	stats1 = (Form_pg_statistic) GETSTRUCT(vardata1.statsTuple);
+	stats2 = (Form_pg_statistic) GETSTRUCT(vardata2.statsTuple);
+	mcv1_exists = get_attstatsslot(vardata1.statsTuple,
+								   vardata1.atttype, vardata1.atttypmod,
+								   STATISTIC_KIND_MCV, InvalidOid,
+								   NULL,
+								   &mcv1_values, &mcv1_nvalues,
+								   &mcv1_numbers, &mcv1_nnumbers);
+	mcv2_exists = get_attstatsslot(vardata2.statsTuple,
+								   vardata2.atttype, vardata2.atttypmod,
+								   STATISTIC_KIND_MCV, InvalidOid,
+								   NULL,
+								   &mcv2_values, &mcv2_nvalues,
+								   &mcv2_numbers, &mcv2_nnumbers);
+	his1_exists = get_attstatsslot(vardata1.statsTuple,
+								   vardata1.atttype, vardata1.atttypmod,
+								   STATISTIC_KIND_HISTOGRAM, InvalidOid,
+								   NULL,
+								   &his1_values, &his1_nvalues,
+								   NULL, NULL);
+	his2_exists = get_attstatsslot(vardata2.statsTuple,
+								   vardata2.atttype, vardata2.atttypmod,
+								   STATISTIC_KIND_HISTOGRAM, InvalidOid,
+								   NULL,
+								   &his2_values, &his2_nvalues,
+								   NULL, NULL);
+
+	red1_nvalues = ((int) sqrt(Max(mcv1_nvalues, his1_nvalues))) + 1;
+	red2_nvalues = ((int) sqrt(Max(mcv2_nvalues, his2_nvalues))) + 1;
+
+	selec = 0.0;
+	mcv1_max_selec = 0.0;
+	mcv1_red_selec = 0.0;
+	mcv2_max_selec = 0.0;
+	mcv2_red_selec = 0.0;
+	if (mcv1_exists)
+		for (i = 0; i < mcv1_nvalues; i++)
+		{
+			mcv1_max_selec += mcv1_numbers[i];
+			if (i < red1_nvalues)
+				mcv1_red_selec += mcv1_numbers[i];
+		}
+	if (mcv2_exists)
+		for (i = 0; i < mcv2_nvalues; i++)
+		{
+			mcv2_max_selec += mcv2_numbers[i];
+			if (i < red2_nvalues)
+				mcv2_red_selec += mcv2_numbers[i];
+		}
+
+	if (mcv1_exists && mcv2_exists)
+		selec += (mcv1_max_selec / mcv1_red_selec) *
+				 (mcv2_max_selec / mcv2_red_selec) *
+				 inet_mcv_join_selec(mcv1_values, mcv1_numbers,
+									 Min(mcv1_nvalues, red1_nvalues),
+									 mcv2_values, mcv2_numbers,
+									 Min(mcv2_nvalues, red2_nvalues),
+									 operator, reversed);
+	if (mcv1_exists && his2_exists)
+		selec += (mcv1_max_selec / mcv1_red_selec) *
+				 inet_mcv_his_selec(mcv1_values, mcv1_numbers,
+									Min(mcv1_nvalues, red1_nvalues),
+									his2_values, his2_nvalues, red2_nvalues,
+									opr_order);
+	if (mcv2_exists && his1_exists)
+		selec += (mcv2_max_selec / mcv2_red_selec) *
+				 inet_mcv_his_selec(mcv2_values, mcv2_numbers,
+									Min(mcv2_nvalues, red2_nvalues),
+									his1_values, his1_nvalues, red1_nvalues,
+									opr_order);
+	if (his1_exists && his2_exists)
+		selec += (1.0 - stats1->stanullfrac - mcv1_max_selec) *
+				 (1.0 - stats2->stanullfrac - mcv2_max_selec) *
+				 inet_his_inclusion_join_selec(his1_values, his1_nvalues,
+											   red1_nvalues, his2_values,
+											   his2_nvalues, red2_nvalues,
+											   opr_order);
+
+	/* Correct the value. */
+	if (!his1_exists)
+		selec /= stats1->stanullfrac + mcv1_max_selec;
+	if (!his2_exists)
+		selec /= stats2->stanullfrac + mcv2_max_selec;
+
+	if (!mcv1_exists && !mcv2_exists && !his1_exists && his2_exists)
+		selec = DEFAULT_SEL(operator);
+
+	if (mcv1_exists)
+		free_attstatsslot(vardata1.atttype, mcv1_values, mcv1_nvalues,
+						  mcv1_numbers, mcv1_nnumbers);
+	if (mcv2_exists)
+		free_attstatsslot(vardata2.atttype, mcv2_values, mcv2_nvalues,
+						  mcv2_numbers, mcv2_nnumbers);
+	if (his1_exists)
+		free_attstatsslot(vardata1.atttype, his1_values, his1_nvalues, NULL, 0);
+	if (his2_exists)
+		free_attstatsslot(vardata2.atttype, his2_values, his2_nvalues, NULL, 0);
+	ReleaseVariableStats(vardata1);
+	ReleaseVariableStats(vardata2);
+
+	/* Result should be in range, but make sure... */
+	CLAMP_PROBABILITY(selec);
+
+	PG_RETURN_FLOAT8(selec);
+}
+
+/*
+ * Practical comparable numbers for the subnet inclusion operators
+ */
+static short int
+inet_opr_order(Oid operator, bool reversed)
+{
+	short int	order;
+
+	switch (operator)
+	{
+		case OID_INET_SUP_OP:
+			order = -2;
+			break;
+		case OID_INET_SUPEQ_OP:
+			order = -1;
+			break;
+		case OID_INET_OVERLAP_OP:
+			order = 0;
+			break;
+		case OID_INET_SUBEQ_OP:
+			order = 1;
+			break;
+		case OID_INET_SUB_OP:
+			order = 2;
+			break;
+		default:
+			elog(ERROR, "unknown operator for inet inclusion selectivity");
+	}
+
+	return (reversed ? order * -1 : order);
+}
+
+/*
+ * Inet histogram inclusion selectivity estimation
+ *
+ * Calculates histogram selectivity for the subnet inclusion operators of
+ * the inet type.  The return value is between 0 and 1.  It should be
+ * corrected with the MCV selectivity and null fraction.  If the constant
+ * is less than the first element or greater than the last element of
+ * the histogram the return value will be 0.
+ *
+ * This function is capable of checking only some of the histogram boundaries.
+ * Reduced number of values, red_nvalues, argument is added for that purpose.
+ * Nvalues can also be given to it to avoid this behavior.  This functionality
+ * is used to make join selectivity estimation faster.  It is explained on
+ * inet_mcv_join_selec(), below.
+ *
+ * The histogram is originally for the basic comparison operators.  Only
+ * the common bits of the network part and the length of the network part
+ * (masklen) are appropriate for the subnet inclusion operators.  Fortunately,
+ * basic comparison fits in this situation.  Even so, the length of the
+ * network part would not really be significant in the histogram.  This would
+ * lead to big mistakes for data sets with uneven masklen distribution.
+ * To avoid this problem, comparisons with the left and the right side of the
+ * buckets are used together.
+ *
+ * Histogram bucket matches are calculated in 3 forms.  If the constant
+ * matches both sides the bucket is considered as fully matched.  If the
+ * constant matches only the right side of the bucket, it is considered as
+ * only matched with a single value.  In that case the ratio for only one
+ * value in the column is added to the selectivity.
+ *
+ * The ratio for only one value is calculated with the ndistinct variable
+ * if greater than 0.  0 can be given, if this behavior is not desired.
+ * This ratio can be big enough to not disregard for addresses with small
+ * masklens.  See pg_statistic for more information about it.
+ *
+ * When the constant matches only the right side of the bucket, it will match
+ * the next bucket, unless the bucket is the last one.  If these buckets would
+ * be considered as matched it would lead to unfair multiple matches for some
+ * constants.
+ *
+ * The third form is to match the bucket partially.  We try to calculate
+ * dividers for both of the boundaries.  If the address family of the boundary
+ * does not match the constant or comparison of the length of the network
+ * parts is not true by the operator, the divider for the boundary would not
+ * be taken into account.  If both of the dividers can be calculated the
+ * greater one will be used to minimize the mistake in the buckets which have
+ * disparate masklens.
+ *
+ * The divider on the partial bucket match is imagined as the distance
+ * between the decisive bits and the common bits of the addresses.  It will
+ * be used as a power of two as it is the natural scale for the IP network
+ * inclusion.  The partial bucket match divider calculation is an empirical
+ * formula and subject to change with more experiment.
+ *
+ * For partial match with buckets which have different address families
+ * on the left and right sides only the boundary with the same address
+ * family is taken into consideration.  This can cause more mistakes for these
+ * buckets if the masklens of their boundaries are also disparate.  It can
+ * only be the case for one bucket, if there are addresses with different
+ * families on the column.  It seems a better option than not considering
+ * these buckets.
+ */
+static Selectivity
+inet_his_inclusion_selec(Datum *values, int nvalues, int red_nvalues,
+						 Datum *constvalue, double ndistinct,
+						 short int opr_order)
+{
+	inet		   *query,
+				   *left,
+				   *right;
+	float			gap,
+					match;
+	int				i;
+	short int		left_order,
+					right_order,
+					left_divider,
+					right_divider;
+
+	Assert(nvalues >= red_nvalues);
+
+	gap = ((float) (nvalues - 1)) / ((float) red_nvalues);
+	match = 0.0;
+	query = DatumGetInetP(*constvalue);
+	left = DatumGetInetP(values[0]);
+	left_order = inet_inclusion_cmp(left, query, opr_order);
+
+	for (i = 1; i <= red_nvalues; i++)
+	{
+		right = DatumGetInetP(values[(int) (i * gap)]);
+		right_order = inet_inclusion_cmp(right, query, opr_order);
+
+		if (left_order == 0 && right_order == 0)
+		{
+			/* Full bucket match. */
+
+			match += 1.0;
+		}
+		else if ((left_order <= 0 && right_order > 0) ||
+				 (left_order >= 0 && right_order < 0))
+		{
+			/* Partial bucket match. */
+
+			left_divider = inet_his_match_divider(left, query, opr_order);
+			right_divider = inet_his_match_divider(right, query, opr_order);
+
+			if (left_divider >= 0 || right_divider >= 0)
+				match += 1.0 / pow(2, Max(left_divider, right_divider));
+		}
+		else if (right_order == 0)
+		{
+			/* Only right boundary match. */
+
+			if (ndistinct > 0)
+				match += 1.0 / ndistinct;
+		}
+
+		/* Shift the variables. */
+		left = right;
+		left_order = right_order;
+	}
+
+	return match / (red_nvalues - 1);
+}
+
+/*
+ * Inet MCV join selectivity estimation
+ *
+ * The original function of the operator is used in this function, like
+ * mcv_selectivity() in selfuncs.c.  Actually this function has nothing
+ * to do with the network data types except its name and location.
+ */
+static Selectivity
+inet_mcv_join_selec(Datum *values1, float4 *numbers1, int nvalues1,
+					Datum *values2, float4 *numbers2, int nvalues2,
+					Oid operator, bool reversed)
+{
+	Selectivity		selec;
+	FmgrInfo		proc;
+	int				i,
+					j;
+
+	fmgr_info(get_opcode(operator), &proc);
+	selec = 0.0;
+
+	for (i = 0; i < nvalues1; i++)
+		for (j = 0; j < nvalues2; j++)
+			if (reversed ?
+				DatumGetBool(FunctionCall2Coll(&proc,
+											   DEFAULT_COLLATION_OID,
+											   values1[i],
+											   values2[j])) :
+				DatumGetBool(FunctionCall2Coll(&proc,
+											   DEFAULT_COLLATION_OID,
+											   values2[j],
+											   values1[i])))
+				selec += numbers1[i] * numbers2[j];
+
+	return selec;
+}
+
+/*
+ * Inet MCV vs histogram inclusion join selectivity estimation
+ */
+static Selectivity
+inet_mcv_his_selec(Datum *mcv_values, float4 *mcv_numbers, int mcv_nvalues,
+				   Datum *his_values, int his_nvalues, int red_nvalues,
+				   short int opr_order)
+{
+	Selectivity		selec;
+	int				i;
+
+	selec = 0.0;
+	for (i = 0; i < mcv_nvalues; i++)
+		selec += mcv_numbers[i] *
+				 inet_his_inclusion_selec(his_values, his_nvalues, red_nvalues,
+										  &mcv_values[i], 0, opr_order);
+	return selec;
+}
+
+/*
+ * Inet histogram inclusion join selectivity estimation
+ *
+ * It is required to choose red1_nvalues from his1_values.  The first and
+ * the last values will not be used for better sampling.  A gap will be
+ * calculated and used to skip some of the histogram boundaries.  It is
+ * important to check exactly given amount of the values.
+ */
+static Selectivity
+inet_his_inclusion_join_selec(Datum *his1_values, int his1_nvalues,
+							  int red1_nvalues, Datum *his2_values,
+							  int his2_nvalues, int red2_nvalues,
+							  short int opr_order)
+{
+	float			match,
+					gap;
+	int				i;
+
+	Assert(his1_nvalues >= red1_nvalues);
+
+	gap = ((float) (his1_nvalues - 2)) / ((float) red1_nvalues);
+	match = 0.0;
+	for (i = 1; i <= red1_nvalues; i++)
+		match += inet_his_inclusion_selec(his2_values, his2_nvalues,
+										  red2_nvalues,
+										  &his1_values[(int) (i * gap)],
+										  0, opr_order);
+
+	return match / red1_nvalues;
+}
+
+/*
+ * Comparison function for the subnet inclusion operators
+ *
+ * Comparison is compatible with the basic comparison function for the inet
+ * type.  See network_cmp_internal on network.c for the original.  Basic
+ * comparison operators are implemented with the network_cmp_internal
+ * function.  It is possible to implement the subnet inclusion operators with
+ * this function.
+ *
+ * Comparison is first on the common bits of the network part, then on
+ * the length of the network part (masklen) as the network_cmp_internal
+ * function.  Only the first part is on this function.  The second part is
+ * separated to another function for reusability.  The difference between
+ * the second part and the original network_cmp_internal is that the operator
+ * is used while comparing the lengths of the network parts.  See the second
+ * part on the inet_masklen_inclusion_cmp function below.
+ */
+static short int
+inet_inclusion_cmp(inet *left, inet *right, short int opr_order)
+{
+	if (ip_family(left) == ip_family(right))
+	{
+		short int	 order;
+
+		order = bitncmp(ip_addr(left), ip_addr(right),
+						Min(ip_bits(left), ip_bits(right)));
+
+		if (order != 0)
+			return order;
+
+		return inet_masklen_inclusion_cmp(left, right, opr_order);
+	}
+
+	return ip_family(left) - ip_family(right);
+}
+
+/*
+ * Masklen comparison function for the subnet inclusion operators
+ *
+ * Compares the lengths of network parts of the inputs using the operator.
+ * If the comparison is okay for the operator the return value will be 0.
+ * Otherwise the return value will be less than or greater than 0 with
+ * respect to the operator.
+ */
+static short int
+inet_masklen_inclusion_cmp(inet *left, inet *right, short int opr_order)
+{
+	if (ip_family(left) == ip_family(right))
+	{
+		short int	 order;
+
+		order = ip_bits(left) - ip_bits(right);
+
+		if ((order > 0 && opr_order >= 0) ||
+			(order == 0 && opr_order >= -1 && opr_order <= 1) ||
+			(order < 0 && opr_order <= 0))
+			return 0;
+
+		return opr_order;
+	}
+
+	return ip_family(left) - ip_family(right);
+}
+
+/*
+ * Inet histogram partial match divider calculation
+ *
+ * First the families and the lengths of the network parts are compared
+ * using the subnet inclusion operator.  The divider will be calculated
+ * using the masklens and the common bits of the addresses.  -1 will be
+ * returned if it cannot be calculated.
+ */
+static short int
+inet_his_match_divider(inet *boundary, inet *query, short int opr_order)
+{
+	if (inet_masklen_inclusion_cmp(boundary, query, opr_order) == 0)
+	{
+		short int	min_bits,
+					decisive_bits;
+
+		min_bits = Min(ip_bits(boundary), ip_bits(query));
+
+		/*
+		 * Set the decisive bits from the one which should contain the other
+		 * according to the operator.
+		 */
+		if (opr_order < 0)
+			decisive_bits = ip_bits(boundary);
+		else if (opr_order > 0)
+			decisive_bits = ip_bits(query);
+		else
+			decisive_bits = min_bits;
+
+		if (min_bits > 0)
+			return decisive_bits - bitncommon(ip_addr(boundary), ip_addr(query),
+											  min_bits);
+		return decisive_bits;
+	}
+
+	return -1;
 }
diff --git a/src/include/catalog/pg_operator.h b/src/include/catalog/pg_operator.h
index f8b4a65..fb37337 100644
--- a/src/include/catalog/pg_operator.h
+++ b/src/include/catalog/pg_operator.h
@@ -1135,32 +1135,33 @@ DESCR("not equal");
 DATA(insert OID = 1203 (  "<"	   PGNSP PGUID b f f 869 869	 16 1205 1206 network_lt scalarltsel scalarltjoinsel ));
 DESCR("less than");
 DATA(insert OID = 1204 (  "<="	   PGNSP PGUID b f f 869 869	 16 1206 1205 network_le scalarltsel scalarltjoinsel ));
 DESCR("less than or equal");
 DATA(insert OID = 1205 (  ">"	   PGNSP PGUID b f f 869 869	 16 1203 1204 network_gt scalargtsel scalargtjoinsel ));
 DESCR("greater than");
 DATA(insert OID = 1206 (  ">="	   PGNSP PGUID b f f 869 869	 16 1204 1203 network_ge scalargtsel scalargtjoinsel ));
 DESCR("greater than or equal");
 DATA(insert OID = 931  (  "<<"	   PGNSP PGUID b f f 869 869	 16 933		0 network_sub networksel networkjoinsel ));
 DESCR("is subnet");
-#define OID_INET_SUB_OP				  931
+#define OID_INET_SUB_OP			931
 DATA(insert OID = 932  (  "<<="    PGNSP PGUID b f f 869 869	 16 934		0 network_subeq networksel networkjoinsel ));
 DESCR("is subnet or equal");
-#define OID_INET_SUBEQ_OP				932
+#define OID_INET_SUBEQ_OP		932
 DATA(insert OID = 933  (  ">>"	   PGNSP PGUID b f f 869 869	 16 931		0 network_sup networksel networkjoinsel ));
 DESCR("is supernet");
-#define OID_INET_SUP_OP				  933
+#define OID_INET_SUP_OP			933
 DATA(insert OID = 934  (  ">>="    PGNSP PGUID b f f 869 869	 16 932		0 network_supeq networksel networkjoinsel ));
 DESCR("is supernet or equal");
-#define OID_INET_SUPEQ_OP				934
+#define OID_INET_SUPEQ_OP		934
 DATA(insert OID = 3552	(  "&&"    PGNSP PGUID b f f 869 869	 16 3552	0 network_overlap networksel networkjoinsel ));
 DESCR("overlaps (is subnet or supernet)");
+#define OID_INET_OVERLAP_OP		3552
 
 DATA(insert OID = 2634 (  "~"	   PGNSP PGUID l f f	  0 869 869 0 0 inetnot - - ));
 DESCR("bitwise not");
 DATA(insert OID = 2635 (  "&"	   PGNSP PGUID b f f	869 869 869 0 0 inetand - - ));
 DESCR("bitwise and");
 DATA(insert OID = 2636 (  "|"	   PGNSP PGUID b f f	869 869 869 0 0 inetor - - ));
 DESCR("bitwise or");
 DATA(insert OID = 2637 (  "+"	   PGNSP PGUID b f f	869  20 869 2638 0 inetpl - - ));
 DESCR("add");
 DATA(insert OID = 2638 (  "+"	   PGNSP PGUID b f f	 20 869 869 2637 0 int8pl_inet - - ));
#5Dilip kumar
dilip.kumar@huawei.com
In reply to: Emre Hasegeli (#4)
Re: Selectivity estimation for inet operators

On 06 July 2014 20:33, Emre Hasegeli Wrote,

Apart from that there is one spell check you can correct
-- in inet_his_inclusion_selec comments histogram boundies ->
histogram boundaries :)

I fixed it. New version attached. The debug log statements are also
removed.

I am done with the review; the patch seems fine to me.

I have one last comment; after clarifying it I can move the patch to "ready for committer".
1. In networkjoinsel, to avoid the cost of huge statistics, only some of the values from the MCV lists and histograms are used (the count is calculated using sqrt()).
-- In my opinion, this is fine if both histograms and MCVs exist, but if only MCVs are there, we can match the complete MCV lists, which will give better accuracy.
Other functions such as eqjoinsel also match the complete MCV lists.

Thanks & Regards,
Dilip Kumar


#6Emre Hasegeli
emre@hasegeli.com
In reply to: Dilip kumar (#5)
1 attachment(s)
Re: Selectivity estimation for inet operators

I have one last comment; after clarifying it I can move the patch to "ready for committer".
1. In networkjoinsel, to avoid the cost of huge statistics, only some of the values from the MCV lists and histograms are used (the count is calculated using sqrt()).
-- In my opinion, this is fine if both histograms and MCVs exist, but if only MCVs are there, we can match the complete MCV lists, which will give better accuracy.
Other functions such as eqjoinsel also match the complete MCV lists.

I was not sure about reducing the statistics at all. I could not find any
other selectivity estimation function which does this. After testing
it some more, I reached the conclusion that it would be better to
only reduce the values of the outer loop on histogram match. Now it
matches complete MCV lists to each other. I also switched back to
log2() from sqrt() to make the outer list smaller.
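
For a rough sense of the difference between the two reductions, the sketch below computes how many outer-loop values each formula keeps. The helper names are hypothetical and not part of the patch; integer loops stand in for the patch's sqrt() and log2() calls so that the block compiles without libm, and they should give the same counts for positive inputs.

```c
#include <assert.h>

/*
 * Hypothetical helper (not in the patch): integer equivalent of
 * ((int) sqrt(nvalues)) + 1, the reduction used by the previous version.
 */
static int
reduced_count_sqrt(int nvalues)
{
	int			root = 0;

	assert(nvalues > 0);

	/* Find the largest root with root * root <= nvalues. */
	while ((long long) (root + 1) * (root + 1) <= nvalues)
		root++;

	return root + 1;
}

/*
 * Hypothetical helper (not in the patch): integer equivalent of
 * ((int) log2(nvalues)) + 1, the reduction used by the new version.
 */
static int
reduced_count_log2(int nvalues)
{
	int			exponent = 0;

	assert(nvalues > 0);

	/* Count how many times nvalues can be halved before reaching 1. */
	while (nvalues > 1)
	{
		nvalues /= 2;
		exponent++;
	}

	return exponent + 1;
}
```

For a list of 100 values the sqrt-based count is 11 and the log2-based count is 7; for 10000 values the counts are 101 and 14, so the log2 version grows much more slowly as the statistics target is raised.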

I reconsidered your previous advice to treat a histogram bucket as partially
matched when the constant matches the last boundary, and changed it
that way. It is better than using the selectivity for only one value.
Removing this part also makes the function simpler. The new
version of the patch is attached.
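
As an illustration of what a partial match adds, the sketch below mirrors the `1.0 / pow(2, Max(left_divider, right_divider))` expression from inet_his_inclusion_selec() in the attached patch. The function name is hypothetical, and repeated halving replaces pow() only to keep the block free of libm; -1 stands for a divider that could not be calculated, as returned by inet_his_match_divider().

```c
#include <assert.h>

/*
 * Hypothetical helper (not in the patch): contribution of one partially
 * matched bucket, given the dividers of its two boundaries.
 */
static double
partial_bucket_match(int left_divider, int right_divider)
{
	int			divider;
	int			i;
	double		match = 1.0;

	/* A divider is either -1 (not calculable) or a bit count. */
	assert(left_divider >= -1 && right_divider >= -1);

	/* Use the greater divider, as the patch does with Max(). */
	divider = (left_divider > right_divider) ? left_divider : right_divider;

	/* Neither boundary produced a usable divider: no contribution. */
	if (divider < 0)
		return 0.0;

	/* Compute 1 / 2^divider by halving once per bit. */
	for (i = 0; i < divider; i++)
		match /= 2.0;

	return match;
}
```

A divider of 0 counts the bucket as fully matched, and each additional bit halves the contribution. In the patch the summed matches are then divided by the number of buckets, nvalues - 1.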

While looking at it I found some other small problems and fixed them.
I also realized that I forgot to support join types other than inner
join. Currently, the default estimation is used for anti joins.
I think the patch will need more than a trivial amount of change to
support anti joins. I can work on it later. While doing it, outer
join selectivity estimation can also be improved. I think the patch
is better than nothing in its current state.

Attachments:

inet-selfuncs-v7.patch (text/plain; charset=utf-8)
diff --git a/src/backend/utils/adt/network_selfuncs.c b/src/backend/utils/adt/network_selfuncs.c
index d0d806f..08ec945 100644
--- a/src/backend/utils/adt/network_selfuncs.c
+++ b/src/backend/utils/adt/network_selfuncs.c
@@ -1,32 +1,626 @@
 /*-------------------------------------------------------------------------
  *
  * network_selfuncs.c
  *	  Functions for selectivity estimation of inet/cidr operators
  *
- * Currently these are just stubs, but we hope to do better soon.
+ * Estimates are based on null fraction, distinct value count, most common
+ * values, and histogram of inet/cidr datatypes.
  *
  * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
  *
  *
  * IDENTIFICATION
  *	  src/backend/utils/adt/network_selfuncs.c
  *
  *-------------------------------------------------------------------------
  */
 #include "postgres.h"
 
+#include <math.h>
+
+#include "access/htup_details.h"
+#include "catalog/pg_collation.h"
+#include "catalog/pg_operator.h"
+#include "catalog/pg_statistic.h"
+#include "utils/lsyscache.h"
 #include "utils/inet.h"
+#include "utils/selfuncs.h"
 
 
+/* Default selectivity constant for the inet overlap operator */
+#define DEFAULT_OVERLAP_SEL 0.01
+
+/* Default selectivity constant for the other operators */
+#define DEFAULT_INCLUSION_SEL 0.005
+
+/* Default selectivity for given operator */
+#define DEFAULT_SEL(operator) \
+	((operator) == OID_INET_OVERLAP_OP ? \
+			DEFAULT_OVERLAP_SEL : DEFAULT_INCLUSION_SEL)
+
+static short int inet_opr_order(Oid operator, bool reversed);
+static Selectivity inet_his_inclusion_selec(Datum *values, int nvalues,
+					Datum *constvalue, short int opr_order);
+static Selectivity inet_mcv_join_selec(Datum *values1, float4 *numbers1,
+					int nvalues1, Datum *values2, float4 *numbers2,
+					int nvalues2, Oid operator, bool reversed);
+static Selectivity inet_mcv_his_selec(Datum *mcv_values, float4 *mcv_numbers,
+					int mcv_nvalues, Datum *his_values, int his_nvalues,
+					int red_nvalues, short int opr_order,
+					Selectivity *max_selec_pointer);
+static Selectivity inet_his_inclusion_join_selec(Datum *his1_values,
+					int his1_nvalues, Datum *his2_values, int his2_nvalues,
+					int red_nvalues, short int opr_order);
+static short int inet_inclusion_cmp(inet *left, inet *right,
+									short int opr_order);
+static short int inet_masklen_inclusion_cmp(inet *left, inet *right,
+											short int opr_order);
+static short int inet_his_match_divider(inet *boundary, inet *query,
+										short int opr_order);
+
+/*
+ * Selectivity estimation for the subnet inclusion operators
+ */
 Datum
 networksel(PG_FUNCTION_ARGS)
 {
-	PG_RETURN_FLOAT8(0.001);
+	PlannerInfo	   *root = (PlannerInfo *) PG_GETARG_POINTER(0);
+	Oid				operator = PG_GETARG_OID(1);
+	List		   *args = (List *) PG_GETARG_POINTER(2);
+	int				varRelid = PG_GETARG_INT32(3),
+					his_nvalues;
+	VariableStatData vardata;
+	Node		   *other;
+	bool			varonleft;
+	Selectivity		selec,
+					max_mcv_selec;
+	Datum			constvalue,
+				   *his_values;
+	Form_pg_statistic stats;
+	FmgrInfo		proc;
+
+	/*
+	 * If expression is not (variable op something) or (something op
+	 * variable), then punt and return a default estimate.
+	 */
+	if (!get_restriction_variable(root, args, varRelid,
+								  &vardata, &other, &varonleft))
+		PG_RETURN_FLOAT8(DEFAULT_SEL(operator));
+
+	/*
+	 * Can't do anything useful if the something is not a constant, either.
+	 */
+	if (!IsA(other, Const))
+	{
+		ReleaseVariableStats(vardata);
+		PG_RETURN_FLOAT8(DEFAULT_SEL(operator));
+	}
+
+	/* All of the subnet inclusion operators are strict. */
+	if (((Const *) other)->constisnull)
+	{
+		ReleaseVariableStats(vardata);
+		PG_RETURN_FLOAT8(0.0);
+	}
+
+	if (!HeapTupleIsValid(vardata.statsTuple))
+	{
+		ReleaseVariableStats(vardata);
+		PG_RETURN_FLOAT8(DEFAULT_SEL(operator));
+	}
+
+	constvalue = ((Const *) other)->constvalue;
+	stats = (Form_pg_statistic) GETSTRUCT(vardata.statsTuple);
+
+	fmgr_info(get_opcode(operator), &proc);
+	selec = mcv_selectivity(&vardata, &proc, constvalue, varonleft,
+							&max_mcv_selec);
+
+	if (get_attstatsslot(vardata.statsTuple,
+						 vardata.atttype, vardata.atttypmod,
+						 STATISTIC_KIND_HISTOGRAM, InvalidOid,
+						 NULL,
+						 &his_values, &his_nvalues,
+						 NULL, NULL))
+	{
+		selec += (1.0 - stats->stanullfrac - max_mcv_selec) *
+				 inet_his_inclusion_selec(his_values, his_nvalues, &constvalue,
+										  inet_opr_order(operator, !varonleft));
+
+		free_attstatsslot(vardata.atttype, his_values, his_nvalues, NULL, 0);
+	}
+	else if (max_mcv_selec == 0)
+		selec = DEFAULT_SEL(operator);
+
+	/* Result should be in range, but make sure... */
+	CLAMP_PROBABILITY(selec);
+
+	ReleaseVariableStats(vardata);
+	PG_RETURN_FLOAT8(selec);
 }
 
+/*
+ * Join selectivity estimation for the subnet inclusion operators
+ *
+ * Calculates MCV vs MCV, MCV vs histogram and histogram vs histogram
+ * selectivity for join using the subnet inclusion operators.  Unlike the
+ * join selectivity function for the equality operator, eqjoinsel(), 1 to 1
+ * matching of the values is not enough.   Network inclusion operators are
+ * likely to match many to many.   It requires to loop the MCV and histogram
+ * lists to the end.  Also, MCV vs histogram selectivity is not neglected
+ * as in eqjoinsel().
+ *
+ * To make the function faster, only some of the values from the first MCV
+ * and histogram are matched to the second histogram.  The number of values
+ * is calculated by log2().
+ */
 Datum
 networkjoinsel(PG_FUNCTION_ARGS)
 {
-	PG_RETURN_FLOAT8(0.001);
+	PlannerInfo	   *root = (PlannerInfo *) PG_GETARG_POINTER(0);
+	Oid				operator = PG_GETARG_OID(1);
+	List		   *args = (List *) PG_GETARG_POINTER(2);
+	SpecialJoinInfo *sjinfo = (SpecialJoinInfo *) PG_GETARG_POINTER(4);
+	VariableStatData vardata1,
+					vardata2;
+	Form_pg_statistic stats1,
+					stats2;
+	Selectivity		selec,
+					mcv1_max_selec,
+					mcv2_max_selec;
+	bool			reversed,
+					mcv1_exists,
+					mcv2_exists,
+					his1_exists,
+					his2_exists;
+	short int		opr_order;
+	int				mcv1_nvalues,
+					mcv2_nvalues,
+					mcv1_nnumbers,
+					mcv2_nnumbers,
+					his1_nvalues,
+					his2_nvalues,
+					red1_nvalues,
+					red2_nvalues;
+	Datum		   *mcv1_values,
+				   *mcv2_values,
+				   *his1_values,
+				   *his2_values;
+	float4		   *mcv1_numbers,
+				   *mcv2_numbers;
+
+	get_join_variables(root, args, sjinfo, &vardata1, &vardata2, &reversed);
+
+	switch (sjinfo->jointype)
+	{
+		case JOIN_INNER:
+		case JOIN_LEFT:
+		case JOIN_FULL:
+			break;
+		default:
+			ReleaseVariableStats(vardata1);
+			ReleaseVariableStats(vardata2);
+			PG_RETURN_FLOAT8(DEFAULT_SEL(operator));
+	}
+
+	if (!HeapTupleIsValid(vardata1.statsTuple) ||
+		!HeapTupleIsValid(vardata2.statsTuple))
+	{
+		ReleaseVariableStats(vardata1);
+		ReleaseVariableStats(vardata2);
+		PG_RETURN_FLOAT8(DEFAULT_SEL(operator));
+	}
+
+	opr_order = inet_opr_order(operator, reversed);
+	stats1 = (Form_pg_statistic) GETSTRUCT(vardata1.statsTuple);
+	stats2 = (Form_pg_statistic) GETSTRUCT(vardata2.statsTuple);
+	mcv1_exists = get_attstatsslot(vardata1.statsTuple,
+								   vardata1.atttype, vardata1.atttypmod,
+								   STATISTIC_KIND_MCV, InvalidOid,
+								   NULL,
+								   &mcv1_values, &mcv1_nvalues,
+								   &mcv1_numbers, &mcv1_nnumbers);
+	mcv2_exists = get_attstatsslot(vardata2.statsTuple,
+								   vardata2.atttype, vardata2.atttypmod,
+								   STATISTIC_KIND_MCV, InvalidOid,
+								   NULL,
+								   &mcv2_values, &mcv2_nvalues,
+								   &mcv2_numbers, &mcv2_nnumbers);
+	his1_exists = get_attstatsslot(vardata1.statsTuple,
+								   vardata1.atttype, vardata1.atttypmod,
+								   STATISTIC_KIND_HISTOGRAM, InvalidOid,
+								   NULL,
+								   &his1_values, &his1_nvalues,
+								   NULL, NULL);
+	his2_exists = get_attstatsslot(vardata2.statsTuple,
+								   vardata2.atttype, vardata2.atttypmod,
+								   STATISTIC_KIND_HISTOGRAM, InvalidOid,
+								   NULL,
+								   &his2_values, &his2_nvalues,
+								   NULL, NULL);
+
+	red1_nvalues = ((int) log2(Max(mcv1_nvalues, his1_nvalues))) + 1;
+	red2_nvalues = ((int) log2(Max(mcv2_nvalues, his2_nvalues))) + 1;
+
+	selec = 0.0;
+	mcv1_max_selec = 0.0;
+	mcv2_max_selec = 0.0;
+
+	if (mcv1_exists && mcv2_exists)
+		selec += inet_mcv_join_selec(mcv1_values, mcv1_numbers, mcv1_nvalues,
+									 mcv2_values, mcv2_numbers, mcv2_nvalues,
+									 operator, reversed);
+	if (mcv1_exists && his2_exists)
+		selec += inet_mcv_his_selec(mcv1_values, mcv1_numbers, mcv1_nvalues,
+									his2_values, his2_nvalues,
+									Min(mcv1_nvalues, red1_nvalues),
+									opr_order, &mcv1_max_selec);
+	if (mcv2_exists && his1_exists)
+		selec += inet_mcv_his_selec(mcv2_values, mcv2_numbers, mcv2_nvalues,
+									his1_values, his1_nvalues,
+									Min(mcv2_nvalues, red1_nvalues),
+									opr_order, &mcv2_max_selec);
+	if (his1_exists && his2_exists)
+		selec += (1.0 - stats1->stanullfrac - mcv1_max_selec) *
+				 (1.0 - stats2->stanullfrac - mcv2_max_selec) *
+				 inet_his_inclusion_join_selec(his1_values, his1_nvalues,
+											   his2_values, his2_nvalues,
+											   Min(his1_nvalues, red1_nvalues),
+											   opr_order);
+
+	if ((!mcv1_exists && !his1_exists) || (!mcv2_exists && !his2_exists))
+		selec = DEFAULT_SEL(operator);
+
+	if (mcv1_exists)
+		free_attstatsslot(vardata1.atttype, mcv1_values, mcv1_nvalues,
+						  mcv1_numbers, mcv1_nnumbers);
+	if (mcv2_exists)
+		free_attstatsslot(vardata2.atttype, mcv2_values, mcv2_nvalues,
+						  mcv2_numbers, mcv2_nnumbers);
+	if (his1_exists)
+		free_attstatsslot(vardata1.atttype, his1_values, his1_nvalues, NULL, 0);
+	if (his2_exists)
+		free_attstatsslot(vardata2.atttype, his2_values, his2_nvalues, NULL, 0);
+	ReleaseVariableStats(vardata1);
+	ReleaseVariableStats(vardata2);
+
+	/* Result should be in range, but make sure... */
+	CLAMP_PROBABILITY(selec);
+
+	PG_RETURN_FLOAT8(selec);
+}
+
+/*
+ * Practical comparable numbers for the subnet inclusion operators
+ */
+static short int
+inet_opr_order(Oid operator, bool reversed)
+{
+	short int	order;
+
+	switch (operator)
+	{
+		case OID_INET_SUP_OP:
+			order = -2;
+			break;
+		case OID_INET_SUPEQ_OP:
+			order = -1;
+			break;
+		case OID_INET_OVERLAP_OP:
+			order = 0;
+			break;
+		case OID_INET_SUBEQ_OP:
+			order = 1;
+			break;
+		case OID_INET_SUB_OP:
+			order = 2;
+			break;
+		default:
+			elog(ERROR, "unknown operator for inet inclusion selectivity");
+	}
+
+	return (reversed ? order * -1 : order);
+}
+
+/*
+ * Inet histogram inclusion selectivity estimation
+ *
+ * Calculates histogram selectivity for the subnet inclusion operators of
+ * the inet type.  The return value is between 0 and 1.  It should be
+ * corrected with the MCV selectivity and null fraction.  If the constant
+ * is less than the first element or greater than the last element of
+ * the histogram the return value will be 0.
+ *
+ * The histogram is originally for the basic comparison operators.  Only
+ * the common bits of the network part and the length of the network part
+ * (masklen) are appropriate for the subnet inclusion operators.  Fortunately,
+ * basic comparison fits in this situation.  Even so, the length of the
+ * network part would not really be significant in the histogram.  This would
+ * lead to big mistakes for data sets with uneven masklen distribution.
+ * To avoid this problem, comparisons with the left and the right side of the
+ * buckets are used together.
+ *
+ * Histogram bucket matches are calculated in two forms.  If the constant
+ * matches both sides the bucket is considered as fully matched.  If the
+ * constant matches only the right side of the bucket, it is not considered
+ * as matched unless it is the last bucket, because it will match the next
+ * bucket.  If all of these buckets would be considered as matched, it would
+ * lead to unfair multiple matches for some constants.
+ *
+ * The second form is to match the bucket partially.  We try to calculate
+ * dividers for both of the boundaries.  If the address family of the boundary
+ * does not match the constant or comparison of the length of the network
+ * parts is not true by the operator, the divider for the boundary would not
+ * be taken into account.  If both of the dividers can be calculated the
+ * greater one will be used to minimize the mistake in the buckets which have
+ * disparate masklens.
+ *
+ * The divider on the partial bucket match is imagined as the distance
+ * between the decisive bits and the common bits of the addresses.  It will
+ * be used as a power of two as it is the natural scale for the IP network
+ * inclusion.  The partial bucket match divider calculation is an empirical
+ * formula and subject to change with more experiment.
+ *
+ * For partial match with buckets which have different address families
+ * on the left and right sides only the boundary with the same address
+ * family is taken into consideration.  This can cause more mistakes for these
+ * buckets if the masklens of their boundaries are also disparate.  It can
+ * only be the case for one bucket, if there are addresses with different
+ * families on the column.  It seems a better option than not considering
+ * these buckets.
+ */
+static Selectivity
+inet_his_inclusion_selec(Datum *values, int nvalues, Datum *constvalue,
+						 short int opr_order)
+{
+	inet		   *query,
+				   *left,
+				   *right;
+	float			match;
+	int				i;
+	short int		left_order,
+					right_order,
+					left_divider,
+					right_divider;
+
+	match = 0.0;
+	query = DatumGetInetP(*constvalue);
+	left = DatumGetInetP(values[0]);
+	left_order = inet_inclusion_cmp(left, query, opr_order);
+
+	for (i = 1; i < nvalues; i++)
+	{
+		right = DatumGetInetP(values[i]);
+		right_order = inet_inclusion_cmp(right, query, opr_order);
+
+		if (left_order == 0 && right_order == 0)
+		{
+			/* Full bucket match. */
+
+			match += 1.0;
+		}
+		else if ((left_order <= 0 && right_order > 0) ||
+				 (left_order >= 0 && right_order < 0) ||
+				 (right_order == 0 && i == nvalues - 1))
+		{
+			/* Partial bucket match. */
+
+			left_divider = inet_his_match_divider(left, query, opr_order);
+			right_divider = inet_his_match_divider(right, query, opr_order);
+
+			if (left_divider >= 0 || right_divider >= 0)
+				match += 1.0 / pow(2, Max(left_divider, right_divider));
+		}
+
+		/* Shift the variables. */
+		left = right;
+		left_order = right_order;
+	}
+
+	/* There are nvalues - 1 buckets. */
+	return match / (nvalues - 1);
+}
+
+/*
+ * Inet MCV join selectivity estimation
+ *
+ * The original function of the operator is used in this function, like in
+ * mcv_selectivity() in selfuncs.c.  Actually this function has nothing
+ * to do with the network data types except its name and location.
+ */
+static Selectivity
+inet_mcv_join_selec(Datum *values1, float4 *numbers1, int nvalues1,
+					Datum *values2, float4 *numbers2, int nvalues2,
+					Oid operator, bool reversed)
+{
+	Selectivity		selec;
+	FmgrInfo		proc;
+	int				i,
+					j;
+
+	fmgr_info(get_opcode(operator), &proc);
+	selec = 0.0;
+
+	for (i = 0; i < nvalues1; i++)
+		for (j = 0; j < nvalues2; j++)
+			if (reversed ?
+				DatumGetBool(FunctionCall2Coll(&proc,
+											   DEFAULT_COLLATION_OID,
+											   values1[i],
+											   values2[j])) :
+				DatumGetBool(FunctionCall2Coll(&proc,
+											   DEFAULT_COLLATION_OID,
+											   values2[j],
+											   values1[i])))
+				selec += numbers1[i] * numbers2[j];
+
+	return selec;
+}
+
+/*
+ * Inet MCV vs histogram inclusion join selectivity estimation
+ *
+ * The function result is the selectivity, and the fraction of the total
+ * population of the MCV is returned into *max_selec_pointer.
+ */
+static Selectivity
+inet_mcv_his_selec(Datum *mcv_values, float4 *mcv_numbers, int mcv_nvalues,
+				   Datum *his_values, int his_nvalues, int red_nvalues,
+				   short int opr_order, Selectivity *max_selec_pointer)
+{
+	Selectivity		selec,
+					red_selec,
+					max_selec;
+	int				i;
+
+	selec = 0.0;
+	red_selec = 0.0;
+	max_selec = 0.0;
+
+	for (i = 0; i < mcv_nvalues; i++)
+	{
+		if (i < red_nvalues)
+		{
+			selec += mcv_numbers[i] *
+					 inet_his_inclusion_selec(his_values, his_nvalues,
+											  &mcv_values[i], opr_order);
+
+			red_selec += mcv_numbers[i];
+		}
+
+		max_selec += mcv_numbers[i];
+	}
+
+	*max_selec_pointer = max_selec;
+	return selec * max_selec / red_selec;
+}
+
+/*
+ * Inet histogram inclusion join selectivity estimation
+ *
+ * Selected values from the first histogram will be matched with the second.
+ * red_nvalues of the values will be selected by discarding the same number of
+ * values from the beginning and the end of the list, on the grounds that they
+ * are outliers and hence not very representative.
+ */
+static Selectivity
+inet_his_inclusion_join_selec(Datum *his1_values, int his1_nvalues,
+							  Datum *his2_values, int his2_nvalues,
+							  int red_nvalues, short int opr_order)
+{
+	float			match;
+	int				nskip,
+					i;
+
+	match = 0.0;
+	nskip = (his1_nvalues - red_nvalues) / 2;
+
+	for (i = nskip; i < his1_nvalues - nskip; i++)
+		match += inet_his_inclusion_selec(his2_values, his2_nvalues,
+										  &his1_values[i], opr_order);
+
+	return match / (his1_nvalues - 2 * nskip);
+}
+
+/*
+ * Comparison function for the subnet inclusion operators
+ *
+ * Comparison is compatible with the basic comparison function for the inet
+ * type.  See network_cmp_internal on network.c for the original.  Basic
+ * comparison operators are implemented with the network_cmp_internal
+ * function.  It is possible to implement the subnet inclusion operators with
+ * this function.
+ *
+ * Comparison is first on the common bits of the network part, then on
+ * the length of the network part (masklen) as the network_cmp_internal
+ * function.  Only the first part is on this function.  The second part is
+ * separated to another function for reusability.  The difference between
+ * the second part and the original network_cmp_internal is that the operator
+ * is used while comparing the lengths of the network parts.  See the second
+ * part on the inet_masklen_inclusion_cmp function below.
+ */
+static short int
+inet_inclusion_cmp(inet *left, inet *right, short int opr_order)
+{
+	if (ip_family(left) == ip_family(right))
+	{
+		short int	 order;
+
+		order = bitncmp(ip_addr(left), ip_addr(right),
+						Min(ip_bits(left), ip_bits(right)));
+
+		if (order != 0)
+			return order;
+
+		return inet_masklen_inclusion_cmp(left, right, opr_order);
+	}
+
+	return ip_family(left) - ip_family(right);
+}
+
+/*
+ * Masklen comparison function for the subnet inclusion operators
+ *
+ * Compares the lengths of network parts of the inputs using the operator.
+ * If the comparison is okay for the operator the return value will be 0.
+ * Otherwise the return value will be less than or greater than 0 with
+ * respect to the operator.
+ */
+static short int
+inet_masklen_inclusion_cmp(inet *left, inet *right, short int opr_order)
+{
+	if (ip_family(left) == ip_family(right))
+	{
+		short int	 order;
+
+		order = ip_bits(left) - ip_bits(right);
+
+		if ((order > 0 && opr_order >= 0) ||
+			(order == 0 && opr_order >= -1 && opr_order <= 1) ||
+			(order < 0 && opr_order <= 0))
+			return 0;
+
+		return opr_order;
+	}
+
+	return ip_family(left) - ip_family(right);
+}
+
+/*
+ * Inet histogram partial match divider calculation
+ *
+ * First the families and the lengths of the network parts are compared
+ * using the subnet inclusion operator.  The divider will be calculated
+ * using the masklens and the common bits of the addresses.  -1 will be
+ * returned if it cannot be calculated.
+ */
+static short int
+inet_his_match_divider(inet *boundary, inet *query, short int opr_order)
+{
+	if (inet_masklen_inclusion_cmp(boundary, query, opr_order) == 0)
+	{
+		short int	min_bits,
+					decisive_bits;
+
+		min_bits = Min(ip_bits(boundary), ip_bits(query));
+
+		/*
+		 * Set the decisive bits from the one which should contain the other
+		 * according to the operator.
+		 */
+		if (opr_order < 0)
+			decisive_bits = ip_bits(boundary);
+		else if (opr_order > 0)
+			decisive_bits = ip_bits(query);
+		else
+			decisive_bits = min_bits;
+
+		if (min_bits > 0)
+			return decisive_bits - bitncommon(ip_addr(boundary), ip_addr(query),
+											  min_bits);
+		return decisive_bits;
+	}
+
+	return -1;
 }
diff --git a/src/include/catalog/pg_operator.h b/src/include/catalog/pg_operator.h
index f8b4a65..fb37337 100644
--- a/src/include/catalog/pg_operator.h
+++ b/src/include/catalog/pg_operator.h
@@ -1135,32 +1135,33 @@ DESCR("not equal");
 DATA(insert OID = 1203 (  "<"	   PGNSP PGUID b f f 869 869	 16 1205 1206 network_lt scalarltsel scalarltjoinsel ));
 DESCR("less than");
 DATA(insert OID = 1204 (  "<="	   PGNSP PGUID b f f 869 869	 16 1206 1205 network_le scalarltsel scalarltjoinsel ));
 DESCR("less than or equal");
 DATA(insert OID = 1205 (  ">"	   PGNSP PGUID b f f 869 869	 16 1203 1204 network_gt scalargtsel scalargtjoinsel ));
 DESCR("greater than");
 DATA(insert OID = 1206 (  ">="	   PGNSP PGUID b f f 869 869	 16 1204 1203 network_ge scalargtsel scalargtjoinsel ));
 DESCR("greater than or equal");
 DATA(insert OID = 931  (  "<<"	   PGNSP PGUID b f f 869 869	 16 933		0 network_sub networksel networkjoinsel ));
 DESCR("is subnet");
-#define OID_INET_SUB_OP				  931
+#define OID_INET_SUB_OP			931
 DATA(insert OID = 932  (  "<<="    PGNSP PGUID b f f 869 869	 16 934		0 network_subeq networksel networkjoinsel ));
 DESCR("is subnet or equal");
-#define OID_INET_SUBEQ_OP				932
+#define OID_INET_SUBEQ_OP		932
 DATA(insert OID = 933  (  ">>"	   PGNSP PGUID b f f 869 869	 16 931		0 network_sup networksel networkjoinsel ));
 DESCR("is supernet");
-#define OID_INET_SUP_OP				  933
+#define OID_INET_SUP_OP			933
 DATA(insert OID = 934  (  ">>="    PGNSP PGUID b f f 869 869	 16 932		0 network_supeq networksel networkjoinsel ));
 DESCR("is supernet or equal");
-#define OID_INET_SUPEQ_OP				934
+#define OID_INET_SUPEQ_OP		934
 DATA(insert OID = 3552	(  "&&"    PGNSP PGUID b f f 869 869	 16 3552	0 network_overlap networksel networkjoinsel ));
 DESCR("overlaps (is subnet or supernet)");
+#define OID_INET_OVERLAP_OP		3552
 
 DATA(insert OID = 2634 (  "~"	   PGNSP PGUID l f f	  0 869 869 0 0 inetnot - - ));
 DESCR("bitwise not");
 DATA(insert OID = 2635 (  "&"	   PGNSP PGUID b f f	869 869 869 0 0 inetand - - ));
 DESCR("bitwise and");
 DATA(insert OID = 2636 (  "|"	   PGNSP PGUID b f f	869 869 869 0 0 inetor - - ));
 DESCR("bitwise or");
 DATA(insert OID = 2637 (  "+"	   PGNSP PGUID b f f	869  20 869 2638 0 inetpl - - ));
 DESCR("add");
 DATA(insert OID = 2638 (  "+"	   PGNSP PGUID b f f	 20 869 869 2637 0 int8pl_inet - - ));
#7Dilip kumar
dilip.kumar@huawei.com
In reply to: Emre Hasegeli (#6)
Re: Selectivity estimation for inet operators

On 12 July 2014 23:25, Emre Hasegeli Wrote,

I have one last comment, after clarifying this I can move it to
"ready for committer".

1. In networkjoinsel, to avoid the cost of huge statistics, only
some of the values from the MCV and histograms are used (calculated
using SQRT).

-- In my opinion, if both histograms and MCVs exist then that is
fine, but if only MCVs are there, we can match the complete MCV
lists; it will give better accuracy.

Other functions such as eqjoinsel also match the complete MCV lists.
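
For illustration only (this is not code from the patch): a minimal standalone sketch of what matching the complete MCV lists against each other amounts to. It assumes a simplified IPv4-only type, net4, and a hypothetical helper net4_sub() standing in for the << operator; both names and the sample data in any caller are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical simplified IPv4 network: address plus mask length. */
typedef struct
{
	uint32_t	addr;
	int			bits;			/* masklen, 0..32 */
} net4;

/* True if the first "bits" bits of a and b are equal. */
static int
prefix_equal(uint32_t a, uint32_t b, int bits)
{
	uint32_t	mask;

	assert(bits >= 0 && bits <= 32);
	mask = (bits == 0) ? 0 : (0xFFFFFFFFu << (32 - bits));
	return (a & mask) == (b & mask);
}

/* Stand-in for "a << b": a is strictly contained in b. */
static int
net4_sub(net4 a, net4 b)
{
	return a.bits > b.bits && prefix_equal(a.addr, b.addr, b.bits);
}

/*
 * Compare every MCV of one side with every MCV of the other side and add
 * up the product of the frequencies of the matching pairs.
 */
static double
mcv_join_selec(const net4 *values1, const float *numbers1, int nvalues1,
			   const net4 *values2, const float *numbers2, int nvalues2)
{
	double		selec = 0.0;
	int			i,
				j;

	for (i = 0; i < nvalues1; i++)
		for (j = 0; j < nvalues2; j++)
			if (net4_sub(values1[i], values2[j]))
				selec += (double) numbers1[i] * numbers2[j];

	return selec;
}
```

The cost is nvalues1 * nvalues2 operator calls, which is why the size of the lists matters for planning time.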

I was not sure about reducing the statistics at all. I could not find any
other selectivity estimation function which does this. After testing
it some more, I reached the conclusion that it would be better to
reduce only the values of the outer loop on the histogram match. Now it
matches complete MCV lists to each other. I also switched back to
log2() from sqrt() to make the outer list smaller.
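
As a rough sketch of the reduction described above (not the patch code itself): the number of outer-loop values grows with log2 of the list size, and the histogram values are taken from the middle of the list. The function names here are hypothetical, and an integer loop is used in place of log2(); for n >= 1 it gives the same result as (int) log2(n) + 1.

```c
#include <assert.h>

/* floor(log2(n)) + 1 for n >= 1, computed with integers only. */
static int
reduced_nvalues(int n)
{
	int			red = 0;

	assert(n >= 1);
	while (n > 0)
	{
		red++;
		n >>= 1;
	}
	return red;
}

/*
 * Pick the central part of a histogram with nvalues boundaries by skipping
 * the same number of values at both ends.  The selected range is
 * [*first, *last); the return value is the number of values in it.
 */
static int
central_range(int nvalues, int red_nvalues, int *first, int *last)
{
	int			nskip;

	assert(red_nvalues >= 1 && red_nvalues <= nvalues);
	nskip = (nvalues - red_nvalues) / 2;
	*first = nskip;
	*last = nvalues - nskip;
	return *last - *first;
}
```

Because of the integer division, the range can hold one value more than red_nvalues when nvalues - red_nvalues is odd, which is why the patch divides by his1_nvalues - 2 * nskip rather than by red_nvalues.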

OK

I rethought your previous advice to treat the histogram bucket as
partially matched when the constant matches the last boundary, and
changed it that way. It is better than using the selectivity for only
one value. Removing this part also makes the function simpler. The new
version of the patch is attached.
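
A small standalone sketch of the bucket handling described above, written with plain integers instead of inet values. It follows the conditions in inet_his_inclusion_selec() from the patch, where an order of 0 means the boundary matches the constant under the operator; the enum and function names are hypothetical.

```c
#include <assert.h>

typedef enum
{
	BUCKET_NONE,
	BUCKET_PARTIAL,
	BUCKET_FULL
} bucket_match;

/*
 * Classify one histogram bucket from the comparison results of its left
 * and right boundaries.  A match on only the right boundary counts as
 * partial only for the last bucket; otherwise the next bucket covers it.
 */
static bucket_match
classify_bucket(int left_order, int right_order, int is_last)
{
	if (left_order == 0 && right_order == 0)
		return BUCKET_FULL;
	if ((left_order <= 0 && right_order > 0) ||
		(left_order >= 0 && right_order < 0) ||
		(right_order == 0 && is_last))
		return BUCKET_PARTIAL;
	return BUCKET_NONE;
}

/*
 * Fraction of a partially matched bucket: 1 / 2^max(left, right), or 0
 * when neither divider could be calculated (both are -1).
 */
static double
partial_fraction(int left_divider, int right_divider)
{
	int			divider;
	double		frac = 1.0;

	divider = left_divider > right_divider ? left_divider : right_divider;
	if (divider < 0)
		return 0.0;
	assert(divider <= 128);		/* largest possible masklen (IPv6) */
	while (divider-- > 0)
		frac /= 2.0;
	return frac;
}
```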

This seems good to me.

While looking at it I found some other small problems and fixed them.
I also realized that I forgot to support join types other than inner
join. Currently, the default estimation is used for anti joins.
I think the patch will need more than a trivial amount of change to
support anti joins. I can work on it later. While doing it, outer
join selectivity estimation can also be improved. I think the patch is
better than nothing in its current state.

I agree with you that we can support other join types and anti joins later.
If others don’t have any objection to doing the other parts later, I will mark it as "Ready For Committer".

Regards,
Dilip

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#8Emre Hasegeli
emre@hasegeli.com
In reply to: Dilip kumar (#7)
2 attachment(s)
Re: Selectivity estimation for inet operators

I agree with you that we can support other join types and anti joins later.
If others don’t have any objection to doing the other parts later, I will mark it as "Ready For Committer".

I updated the patch to cover semi and anti joins with eqjoinsel_semi().
I think it is better than returning a constant. The new version is
attached, along with the new version of the test script. Can you please
look at it again and mark it as "ready for committer" if it seems okay
to you?

Attachments:

inet-selfuncs-v8.patch (text/plain; charset=utf-8)
diff --git a/src/backend/utils/adt/network_selfuncs.c b/src/backend/utils/adt/network_selfuncs.c
index d0d806f..eca9e7c 100644
--- a/src/backend/utils/adt/network_selfuncs.c
+++ b/src/backend/utils/adt/network_selfuncs.c
@@ -1,32 +1,669 @@
 /*-------------------------------------------------------------------------
  *
  * network_selfuncs.c
  *	  Functions for selectivity estimation of inet/cidr operators
  *
- * Currently these are just stubs, but we hope to do better soon.
+ * Estimates are based on null fraction, distinct value count, most common
+ * values, and histogram of inet/cidr datatypes.
  *
  * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
  *
  *
  * IDENTIFICATION
  *	  src/backend/utils/adt/network_selfuncs.c
  *
  *-------------------------------------------------------------------------
  */
 #include "postgres.h"
 
+#include <math.h>
+
+#include "access/htup_details.h"
+#include "catalog/pg_collation.h"
+#include "catalog/pg_operator.h"
+#include "catalog/pg_statistic.h"
+#include "utils/lsyscache.h"
 #include "utils/inet.h"
+#include "utils/selfuncs.h"
 
 
+/* Default selectivity constant for the inet overlap operator */
+#define DEFAULT_OVERLAP_SEL 0.01
+
+/* Default selectivity constant for the other operators */
+#define DEFAULT_INCLUSION_SEL 0.005
+
+/* Default selectivity for given operator */
+#define DEFAULT_SEL(operator) \
+	((operator) == OID_INET_OVERLAP_OP ? \
+			DEFAULT_OVERLAP_SEL : DEFAULT_INCLUSION_SEL)
+
+static Selectivity networkjoinsel_inner(Oid operator,
+					 VariableStatData *vardata1, VariableStatData *vardata2);
+extern double eqjoinsel_semi(Oid operator, VariableStatData *vardata1,
+			   VariableStatData *vardata2, RelOptInfo *inner_rel);
+extern RelOptInfo *find_join_input_rel(PlannerInfo *root, Relids relids);
+static short int inet_opr_order(Oid operator);
+static Selectivity inet_his_inclusion_selec(Datum *values, int nvalues,
+						 Datum *constvalue, short int opr_order);
+static Selectivity inet_mcv_join_selec(Datum *values1, float4 *numbers1,
+					int nvalues1, Datum *values2, float4 *numbers2,
+					int nvalues2, Oid operator);
+static Selectivity inet_mcv_his_selec(Datum *mcv_values, float4 *mcv_numbers,
+				   int mcv_nvalues, Datum *his_values, int his_nvalues,
+				   int red_nvalues, short int opr_order,
+				   Selectivity *max_selec_pointer);
+static Selectivity inet_his_inclusion_join_selec(Datum *his1_values,
+					  int his1_nvalues, Datum *his2_values, int his2_nvalues,
+							  int red_nvalues, short int opr_order);
+static short int inet_inclusion_cmp(inet *left, inet *right,
+				   short int opr_order);
+static short int inet_masklen_inclusion_cmp(inet *left, inet *right,
+						   short int opr_order);
+static short int inet_his_match_divider(inet *boundary, inet *query,
+					   short int opr_order);
+
+/*
+ * Selectivity estimation for the subnet inclusion operators
+ */
 Datum
 networksel(PG_FUNCTION_ARGS)
 {
-	PG_RETURN_FLOAT8(0.001);
+	PlannerInfo *root = (PlannerInfo *) PG_GETARG_POINTER(0);
+	Oid			operator = PG_GETARG_OID(1);
+	List	   *args = (List *) PG_GETARG_POINTER(2);
+	int			varRelid = PG_GETARG_INT32(3),
+				his_nvalues;
+	VariableStatData vardata;
+	Node	   *other;
+	bool		varonleft;
+	Selectivity selec,
+				max_mcv_selec;
+	Datum		constvalue,
+			   *his_values;
+	Form_pg_statistic stats;
+	double		nullfrac;
+	FmgrInfo	proc;
+
+	/*
+	 * If expression is not (variable op something) or (something op
+	 * variable), then punt and return a default estimate.
+	 */
+	if (!get_restriction_variable(root, args, varRelid,
+								  &vardata, &other, &varonleft))
+		PG_RETURN_FLOAT8(DEFAULT_SEL(operator));
+
+	/*
+	 * Can't do anything useful if the something is not a constant, either.
+	 */
+	if (!IsA(other, Const))
+	{
+		ReleaseVariableStats(vardata);
+		PG_RETURN_FLOAT8(DEFAULT_SEL(operator));
+	}
+
+	/* All of the subnet inclusion operators are strict. */
+	if (((Const *) other)->constisnull)
+	{
+		ReleaseVariableStats(vardata);
+		PG_RETURN_FLOAT8(0.0);
+	}
+
+	if (!HeapTupleIsValid(vardata.statsTuple))
+	{
+		ReleaseVariableStats(vardata);
+		PG_RETURN_FLOAT8(DEFAULT_SEL(operator));
+	}
+
+	constvalue = ((Const *) other)->constvalue;
+	stats = (Form_pg_statistic) GETSTRUCT(vardata.statsTuple);
+	nullfrac = stats ? stats->stanullfrac : 0.0;
+
+	fmgr_info(get_opcode(operator), &proc);
+	selec = mcv_selectivity(&vardata, &proc, constvalue, varonleft,
+							&max_mcv_selec);
+
+	if (get_attstatsslot(vardata.statsTuple,
+						 vardata.atttype, vardata.atttypmod,
+						 STATISTIC_KIND_HISTOGRAM, InvalidOid,
+						 NULL,
+						 &his_values, &his_nvalues,
+						 NULL, NULL))
+	{
+		selec += (1.0 - nullfrac - max_mcv_selec) *
+			inet_his_inclusion_selec(his_values, his_nvalues, &constvalue,
+									 varonleft ? inet_opr_order(operator) :
+									 inet_opr_order(operator) * -1);
+
+		free_attstatsslot(vardata.atttype, his_values, his_nvalues, NULL, 0);
+	}
+	else if (max_mcv_selec == 0.0)
+		selec = (1.0 - nullfrac) * DEFAULT_SEL(operator);
+
+	/* Result should be in range, but make sure... */
+	CLAMP_PROBABILITY(selec);
+
+	ReleaseVariableStats(vardata);
+	PG_RETURN_FLOAT8(selec);
 }
 
+/*
+ * Join selectivity estimation for the subnet inclusion operators
+ *
+ * This function is a copy of eqjoinsel() in selfuncs.c, except for the
+ * comments and that it calls networkjoinsel_inner() instead of eqjoinsel_inner().
+ */
 Datum
 networkjoinsel(PG_FUNCTION_ARGS)
 {
-	PG_RETURN_FLOAT8(0.001);
+	PlannerInfo *root = (PlannerInfo *) PG_GETARG_POINTER(0);
+	Oid			operator = PG_GETARG_OID(1);
+	List	   *args = (List *) PG_GETARG_POINTER(2);
+#ifdef NOT_USED
+	JoinType	jointype = (JoinType) PG_GETARG_INT16(3);
+#endif
+	SpecialJoinInfo *sjinfo = (SpecialJoinInfo *) PG_GETARG_POINTER(4);
+	double		selec;
+	VariableStatData vardata1;
+	VariableStatData vardata2;
+	bool		join_is_reversed;
+	RelOptInfo *inner_rel;
+
+	get_join_variables(root, args, sjinfo,
+					   &vardata1, &vardata2, &join_is_reversed);
+
+	switch (sjinfo->jointype)
+	{
+		case JOIN_INNER:
+		case JOIN_LEFT:
+		case JOIN_FULL:
+
+			/*
+			 * Selectivity for left join is not exactly the same as for inner
+			 * join, but the difference is neglected.
+			 */
+			if (!join_is_reversed)
+				selec = networkjoinsel_inner(operator, &vardata1, &vardata2);
+			else
+				selec = networkjoinsel_inner(get_commutator(operator),
+											 &vardata2, &vardata1);
+			break;
+		case JOIN_SEMI:
+		case JOIN_ANTI:
+
+			/*
+			 * Selectivity estimation functions of semi and anti joins are not
+			 * implemented for the subnet inclusion operators.
+			 * eqjoinsel_semi() is used to cover them.  It makes small or big mistakes
+			 * based on the join type, the operator and the ratio between the
+			 * row counts.
+			 */
+			inner_rel = find_join_input_rel(root, sjinfo->min_righthand);
+
+			if (!join_is_reversed)
+				selec = eqjoinsel_semi(operator, &vardata1, &vardata2,
+									   inner_rel);
+			else
+				selec = eqjoinsel_semi(get_commutator(operator),
+									   &vardata2, &vardata1,
+									   inner_rel);
+			break;
+		default:
+			/* other values not expected here */
+			elog(ERROR, "unrecognized join type: %d",
+				 (int) sjinfo->jointype);
+			selec = 0;			/* keep compiler quiet */
+			break;
+	}
+
+	ReleaseVariableStats(vardata1);
+	ReleaseVariableStats(vardata2);
+
+	CLAMP_PROBABILITY(selec);
+
+	PG_RETURN_FLOAT8((float8) selec);
+}
+
+/*
+ * Inner join selectivity estimation for the subnet inclusion operators
+ *
+ * Calculates MCV vs MCV, MCV vs histogram and histogram vs histogram
+ * selectivity for join using the subnet inclusion operators.  Unlike the
+ * join selectivity function for the equality operator, eqjoinsel_inner(),
+ * one to one matching of the values is not enough.  Network inclusion
+ * operators are likely to match many to many.  This requires looping over
+ * the MCV and histogram lists to the end.  Also, MCV vs histogram selectivity
+ * is not neglected as in eqjoinsel_inner().
+ *
+ * To make the function faster only some of the values from the first
+ * MCV and histogram are matched to the second histogram.  Their number is
+ * calculated by log2().
+ */
+static Selectivity
+networkjoinsel_inner(Oid operator,
+					 VariableStatData *vardata1, VariableStatData *vardata2)
+{
+	Form_pg_statistic stats;
+	double		nullfrac1 = 0.0,
+				nullfrac2 = 0.0;
+	Selectivity selec = 0.0,
+				mcv1_max_selec = 0.0,
+				mcv2_max_selec = 0.0;
+	bool		mcv1_exists = false,
+				mcv2_exists = false,
+				his1_exists = false,
+				his2_exists = false;
+	int			mcv1_nvalues,
+				mcv2_nvalues,
+				mcv1_nnumbers,
+				mcv2_nnumbers,
+				his1_nvalues,
+				his2_nvalues,
+				red1_nvalues,
+				red2_nvalues;
+	Datum	   *mcv1_values,
+			   *mcv2_values,
+			   *his1_values,
+			   *his2_values;
+	float4	   *mcv1_numbers,
+			   *mcv2_numbers;
+
+	if (HeapTupleIsValid(vardata1->statsTuple))
+	{
+		if ((stats = (Form_pg_statistic) GETSTRUCT(vardata1->statsTuple)))
+			nullfrac1 = stats->stanullfrac;
+
+		mcv1_exists = get_attstatsslot(vardata1->statsTuple,
+									   vardata1->atttype, vardata1->atttypmod,
+									   STATISTIC_KIND_MCV, InvalidOid,
+									   NULL,
+									   &mcv1_values, &mcv1_nvalues,
+									   &mcv1_numbers, &mcv1_nnumbers);
+		his1_exists = get_attstatsslot(vardata1->statsTuple,
+									   vardata1->atttype, vardata1->atttypmod,
+									   STATISTIC_KIND_HISTOGRAM, InvalidOid,
+									   NULL,
+									   &his1_values, &his1_nvalues,
+									   NULL, NULL);
+		red1_nvalues = ((int) log2(Max(mcv1_nvalues, his1_nvalues))) + 1;
+	}
+
+	if (HeapTupleIsValid(vardata2->statsTuple))
+	{
+		if ((stats = (Form_pg_statistic) GETSTRUCT(vardata2->statsTuple)))
+			nullfrac2 = stats->stanullfrac;
+
+		mcv2_exists = get_attstatsslot(vardata2->statsTuple,
+									   vardata2->atttype, vardata2->atttypmod,
+									   STATISTIC_KIND_MCV, InvalidOid,
+									   NULL,
+									   &mcv2_values, &mcv2_nvalues,
+									   &mcv2_numbers, &mcv2_nnumbers);
+		his2_exists = get_attstatsslot(vardata2->statsTuple,
+									   vardata2->atttype, vardata2->atttypmod,
+									   STATISTIC_KIND_HISTOGRAM, InvalidOid,
+									   NULL,
+									   &his2_values, &his2_nvalues,
+									   NULL, NULL);
+		red2_nvalues = ((int) log2(Max(mcv2_nvalues, his2_nvalues))) + 1;
+	}
+
+	if (mcv1_exists && mcv2_exists)
+		selec += inet_mcv_join_selec(mcv1_values, mcv1_numbers, mcv1_nvalues,
+									 mcv2_values, mcv2_numbers, mcv2_nvalues,
+									 operator);
+	if (mcv1_exists && his2_exists)
+		selec += inet_mcv_his_selec(mcv1_values, mcv1_numbers, mcv1_nvalues,
+									his2_values, his2_nvalues,
+									Min(mcv1_nvalues, red1_nvalues),
+								  inet_opr_order(operator), &mcv1_max_selec);
+	if (mcv2_exists && his1_exists)
+		selec += inet_mcv_his_selec(mcv2_values, mcv2_numbers, mcv2_nvalues,
+									his1_values, his1_nvalues,
+									Min(mcv2_nvalues, red2_nvalues),
+								  inet_opr_order(operator), &mcv2_max_selec);
+	if (his1_exists && his2_exists)
+		selec += (1.0 - nullfrac1 - mcv1_max_selec) *
+			(1.0 - nullfrac2 - mcv2_max_selec) *
+			inet_his_inclusion_join_selec(his1_values, his1_nvalues,
+										  his2_values, his2_nvalues,
+										  Min(his1_nvalues, red1_nvalues),
+										  inet_opr_order(operator));
+
+	if ((!mcv1_exists && !his1_exists) || (!mcv2_exists && !his2_exists))
+		selec = (1.0 - nullfrac1) * (1.0 - nullfrac2) * DEFAULT_SEL(operator);
+
+	if (mcv1_exists)
+		free_attstatsslot(vardata1->atttype, mcv1_values, mcv1_nvalues,
+						  mcv1_numbers, mcv1_nnumbers);
+	if (mcv2_exists)
+		free_attstatsslot(vardata2->atttype, mcv2_values, mcv2_nvalues,
+						  mcv2_numbers, mcv2_nnumbers);
+	if (his1_exists)
+		free_attstatsslot(vardata1->atttype, his1_values, his1_nvalues,
+						  NULL, 0);
+	if (his2_exists)
+		free_attstatsslot(vardata2->atttype, his2_values, his2_nvalues,
+						  NULL, 0);
+
+	return selec;
+}
+
+/*
+ * Practical comparable numbers for the subnet inclusion operators
+ */
+static short int
+inet_opr_order(Oid operator)
+{
+	switch (operator)
+	{
+		case OID_INET_SUP_OP:
+			return -2;
+		case OID_INET_SUPEQ_OP:
+			return -1;
+		case OID_INET_OVERLAP_OP:
+			return 0;
+		case OID_INET_SUBEQ_OP:
+			return 1;
+		case OID_INET_SUB_OP:
+			return 2;
+		default:
+			elog(ERROR, "unknown operator for inet inclusion selectivity");
+	}
+}
+
+/*
+ * Inet histogram inclusion selectivity estimation
+ *
+ * Calculates histogram selectivity for the subnet inclusion operators of
+ * the inet type.  The return value is between 0 and 1.  It should be
+ * corrected with the MCV selectivity and null fraction.  If the constant
+ * is less than the first element or greater than the last element of
+ * the histogram the return value will be 0.
+ *
+ * The histogram is originally for the basic comparison operators.  Only
+ * the common bits of the network part and the length of the network part
+ * (masklen) are appropriate for the subnet inclusion operators.  Fortunately,
+ * basic comparison fits in this situation.  Even so, the length of the
+ * network part would not really be significant in the histogram.  This would
+ * lead to big mistakes for data sets with uneven masklen distribution.
+ * To avoid this problem, comparison with the left and the right side of the
+ * buckets are used together.
+ *
+ * Histogram bucket matches are calculated in two forms.  If the constant
+ * matches both sides the bucket is considered as fully matched.  If the
+ * constant matches only the right side of the bucket, it is not considered
+ * as matched unless it is the last bucket, because it will match the next
+ * bucket.  If all of these buckets would be considered as matched, it would
+ * lead to unfair multiple matches for some constants.
+ *
+ * The second form is to match the bucket partially.  We try to calculate
+ * dividers for both of the boundaries.  If the address family of the boundary
+ * does not match the constant or comparison of the length of the network
+ * parts is not true by the operator, the divider for the boundary would not
+ * be taken into account.  If both of the dividers can be calculated the
+ * greater one will be used to minimize the mistake in the buckets which have
+ * disparate masklens.
+ *
+ * The divider on the partial bucket match is imagined as the distance
+ * between the decisive bits and the common bits of the addresses.  It will
+ * be used as a power of two as it is the natural scale for the IP network
+ * inclusion.  The partial bucket match divider calculation is an empirical
+ * formula and subject to change with more experiment.
+ *
+ * For partial match with buckets which have different address families
+ * on the left and right sides only the boundary with the same address
+ * family is taken into consideration.  This can cause more mistakes for these
+ * buckets if the masklens of their boundaries are also disparate.  It can
+ * only be the case for one bucket, if there are addresses with different
+ * families on the column.  It seems a better option than not considering
+ * these buckets.
+ */
+static Selectivity
+inet_his_inclusion_selec(Datum *values, int nvalues, Datum *constvalue,
+						 short int opr_order)
+{
+	inet	   *query,
+			   *left,
+			   *right;
+	float		match;
+	int			i;
+	short int	left_order,
+				right_order,
+				left_divider,
+				right_divider;
+
+	match = 0.0;
+	query = DatumGetInetP(*constvalue);
+	left = DatumGetInetP(values[0]);
+	left_order = inet_inclusion_cmp(left, query, opr_order);
+
+	for (i = 1; i < nvalues; i++)
+	{
+		right = DatumGetInetP(values[i]);
+		right_order = inet_inclusion_cmp(right, query, opr_order);
+
+		if (left_order == 0 && right_order == 0)
+		{
+			/* Full bucket match. */
+
+			match += 1.0;
+		}
+		else if ((left_order <= 0 && right_order > 0) ||
+				 (left_order >= 0 && right_order < 0) ||
+				 (right_order == 0 && i == nvalues - 1))
+		{
+			/* Partial bucket match. */
+
+			left_divider = inet_his_match_divider(left, query, opr_order);
+			right_divider = inet_his_match_divider(right, query, opr_order);
+
+			if (left_divider >= 0 || right_divider >= 0)
+				match += 1.0 / pow(2, Max(left_divider, right_divider));
+		}
+
+		/* Shift the variables. */
+		left = right;
+		left_order = right_order;
+	}
+
+	/* There are nvalues - 1 buckets. */
+	return match / (nvalues - 1);
+}
+
+/*
+ * Inet MCV join selectivity estimation
+ *
+ * The original function of the operator is used in this function, like in
+ * mcv_selectivity() in selfuncs.c.  Actually, this function has nothing
+ * to do with the network data types except its name and location.
+ */
+static Selectivity
+inet_mcv_join_selec(Datum *values1, float4 *numbers1, int nvalues1,
+					Datum *values2, float4 *numbers2, int nvalues2,
+					Oid operator)
+{
+	Selectivity selec;
+	FmgrInfo	proc;
+	int			i,
+				j;
+
+	fmgr_info(get_opcode(operator), &proc);
+	selec = 0.0;
+
+	for (i = 0; i < nvalues1; i++)
+		for (j = 0; j < nvalues2; j++)
+			if (DatumGetBool(FunctionCall2Coll(&proc, DEFAULT_COLLATION_OID,
+											   values1[i], values2[j])))
+				selec += numbers1[i] * numbers2[j];
+
+	return selec;
+}
+
+/*
+ * Inet MCV vs histogram inclusion join selectivity estimation
+ *
+ * The function result is the selectivity, and the fraction of the total
+ * population of the MCV is returned into *max_selec_pointer.
+ */
+static Selectivity
+inet_mcv_his_selec(Datum *mcv_values, float4 *mcv_numbers, int mcv_nvalues,
+				   Datum *his_values, int his_nvalues, int red_nvalues,
+				   short int opr_order, Selectivity *max_selec_pointer)
+{
+	Selectivity selec,
+				red_selec,
+				max_selec;
+	int			i;
+
+	selec = 0.0;
+	red_selec = 0.0;
+	max_selec = 0.0;
+
+	for (i = 0; i < mcv_nvalues; i++)
+	{
+		if (i < red_nvalues)
+		{
+			selec += mcv_numbers[i] *
+				inet_his_inclusion_selec(his_values, his_nvalues,
+										 &mcv_values[i], opr_order);
+
+			red_selec += mcv_numbers[i];
+		}
+
+		max_selec += mcv_numbers[i];
+	}
+
+	*max_selec_pointer = max_selec;
+	return selec * max_selec / red_selec;
+}
+
+/*
+ * Inet histogram inclusion join selectivity estimation
+ *
+ * Selected values from the first histogram will be matched with the second.
+ * red_nvalues of the values will be selected by discarding the same number of
+ * values from the beginning and the end of the list, on the grounds that they
+ * are outliers and hence not very representative.
+ */
+static Selectivity
+inet_his_inclusion_join_selec(Datum *his1_values, int his1_nvalues,
+							  Datum *his2_values, int his2_nvalues,
+							  int red_nvalues, short int opr_order)
+{
+	float		match;
+	int			nskip,
+				i;
+
+	match = 0.0;
+	nskip = (his1_nvalues - red_nvalues) / 2;
+
+	for (i = nskip; i < his1_nvalues - nskip; i++)
+		match += inet_his_inclusion_selec(his2_values, his2_nvalues,
+										  &his1_values[i], opr_order);
+
+	return match / (his1_nvalues - 2 * nskip);
+}
+
+/*
+ * Comparison function for the subnet inclusion operators
+ *
+ * Comparison is compatible with the basic comparison function for the inet
+ * type.  See network_cmp_internal on network.c for the original.  Basic
+ * comparison operators are implemented with the network_cmp_internal
+ * function.  It is possible to implement the subnet inclusion operators with
+ * this function.
+ *
+ * Comparison is first on the common bits of the network part, then on
+ * the length of the network part (masklen) as the network_cmp_internal
+ * function.  Only the first part is on this function.  The second part is
+ * separated to another function for reusability.  The difference between
+ * the second part and the original network_cmp_internal is that the operator
+ * is used while comparing the lengths of the network parts.  See the second
+ * part on the inet_masklen_inclusion_cmp function below.
+ */
+static short int
+inet_inclusion_cmp(inet *left, inet *right, short int opr_order)
+{
+	if (ip_family(left) == ip_family(right))
+	{
+		short int	order;
+
+		order = bitncmp(ip_addr(left), ip_addr(right),
+						Min(ip_bits(left), ip_bits(right)));
+
+		if (order != 0)
+			return order;
+
+		return inet_masklen_inclusion_cmp(left, right, opr_order);
+	}
+
+	return ip_family(left) - ip_family(right);
+}
+
+/*
+ * Masklen comparison function for the subnet inclusion operators
+ *
+ * Compares the lengths of network parts of the inputs using the operator.
+ * If the comparison is okay for the operator the return value will be 0.
+ * Otherwise the return value will be less than or greater than 0 with
+ * respect to the operator.
+ */
+static short int
+inet_masklen_inclusion_cmp(inet *left, inet *right, short int opr_order)
+{
+	if (ip_family(left) == ip_family(right))
+	{
+		short int	order;
+
+		order = ip_bits(left) - ip_bits(right);
+
+		if ((order > 0 && opr_order >= 0) ||
+			(order == 0 && opr_order >= -1 && opr_order <= 1) ||
+			(order < 0 && opr_order <= 0))
+			return 0;
+
+		return opr_order;
+	}
+
+	return ip_family(left) - ip_family(right);
+}
+
+/*
+ * Inet histogram partial match divider calculation
+ *
+ * First the families and the lengths of the network parts are compared
+ * using the subnet inclusion operator.  The divider will be calculated
+ * using the masklens and the common bits of the addresses.  -1 will be
+ * returned if it cannot be calculated.
+ */
+static short int
+inet_his_match_divider(inet *boundary, inet *query, short int opr_order)
+{
+	if (inet_masklen_inclusion_cmp(boundary, query, opr_order) == 0)
+	{
+		short int	min_bits,
+					decisive_bits;
+
+		min_bits = Min(ip_bits(boundary), ip_bits(query));
+
+		/*
+		 * Set the decisive bits from the one which should contain the other
+		 * according to the operator.
+		 */
+		if (opr_order < 0)
+			decisive_bits = ip_bits(boundary);
+		else if (opr_order > 0)
+			decisive_bits = ip_bits(query);
+		else
+			decisive_bits = min_bits;
+
+		if (min_bits > 0)
+			return decisive_bits - bitncommon(ip_addr(boundary), ip_addr(query),
+											  min_bits);
+		return decisive_bits;
+	}
+
+	return -1;
 }
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index e932ccf..8dcda93 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -149,21 +149,21 @@ static double var_eq_const(VariableStatData *vardata, Oid operator,
 			 bool varonleft);
 static double var_eq_non_const(VariableStatData *vardata, Oid operator,
 				 Node *other,
 				 bool varonleft);
 static double ineq_histogram_selectivity(PlannerInfo *root,
 						   VariableStatData *vardata,
 						   FmgrInfo *opproc, bool isgt,
 						   Datum constval, Oid consttype);
 static double eqjoinsel_inner(Oid operator,
 				VariableStatData *vardata1, VariableStatData *vardata2);
-static double eqjoinsel_semi(Oid operator,
+double eqjoinsel_semi(Oid operator,
 			   VariableStatData *vardata1, VariableStatData *vardata2,
 			   RelOptInfo *inner_rel);
 static bool convert_to_scalar(Datum value, Oid valuetypid, double *scaledvalue,
 				  Datum lobound, Datum hibound, Oid boundstypid,
 				  double *scaledlobound, double *scaledhibound);
 static double convert_numeric_to_scalar(Datum value, Oid typid);
 static void convert_string_to_scalar(char *value,
 						 double *scaledvalue,
 						 char *lobound,
 						 double *scaledlobound,
@@ -182,21 +182,21 @@ static double convert_one_bytea_to_scalar(unsigned char *value, int valuelen,
 static char *convert_string_datum(Datum value, Oid typid);
 static double convert_timevalue_to_scalar(Datum value, Oid typid);
 static void examine_simple_variable(PlannerInfo *root, Var *var,
 						VariableStatData *vardata);
 static bool get_variable_range(PlannerInfo *root, VariableStatData *vardata,
 				   Oid sortop, Datum *min, Datum *max);
 static bool get_actual_variable_range(PlannerInfo *root,
 						  VariableStatData *vardata,
 						  Oid sortop,
 						  Datum *min, Datum *max);
-static RelOptInfo *find_join_input_rel(PlannerInfo *root, Relids relids);
+RelOptInfo *find_join_input_rel(PlannerInfo *root, Relids relids);
 static Selectivity prefix_selectivity(PlannerInfo *root,
 				   VariableStatData *vardata,
 				   Oid vartype, Oid opfamily, Const *prefixcon);
 static Selectivity like_selectivity(const char *patt, int pattlen,
 				 bool case_insensitive);
 static Selectivity regex_selectivity(const char *patt, int pattlen,
 				  bool case_insensitive,
 				  int fixed_prefix_len);
 static Datum string_to_datum(const char *str, Oid datatype);
 static Const *string_to_const(const char *str, Oid datatype);
@@ -2418,21 +2418,21 @@ eqjoinsel_inner(Oid operator,
 
 	return selec;
 }
 
 /*
  * eqjoinsel_semi --- eqjoinsel for semi join
  *
  * (Also used for anti join, which we are supposed to estimate the same way.)
  * Caller has ensured that vardata1 is the LHS variable.
  */
-static double
+double
 eqjoinsel_semi(Oid operator,
 			   VariableStatData *vardata1, VariableStatData *vardata2,
 			   RelOptInfo *inner_rel)
 {
 	double		selec;
 	double		nd1;
 	double		nd2;
 	bool		isdefault1;
 	bool		isdefault2;
 	Form_pg_statistic stats1 = NULL;
@@ -5094,21 +5094,21 @@ get_actual_variable_range(PlannerInfo *root, VariableStatData *vardata,
 	return have_data;
 }
 
 /*
  * find_join_input_rel
  *		Look up the input relation for a join.
  *
  * We assume that the input relation's RelOptInfo must have been constructed
  * already.
  */
-static RelOptInfo *
+RelOptInfo *
 find_join_input_rel(PlannerInfo *root, Relids relids)
 {
 	RelOptInfo *rel = NULL;
 
 	switch (bms_membership(relids))
 	{
 		case BMS_EMPTY_SET:
 			/* should not happen */
 			break;
 		case BMS_SINGLETON:
diff --git a/src/include/catalog/pg_operator.h b/src/include/catalog/pg_operator.h
index d7dcd1c..3b827fc 100644
--- a/src/include/catalog/pg_operator.h
+++ b/src/include/catalog/pg_operator.h
@@ -1135,32 +1135,33 @@ DESCR("not equal");
 DATA(insert OID = 1203 (  "<"	   PGNSP PGUID b f f 869 869	 16 1205 1206 network_lt scalarltsel scalarltjoinsel ));
 DESCR("less than");
 DATA(insert OID = 1204 (  "<="	   PGNSP PGUID b f f 869 869	 16 1206 1205 network_le scalarltsel scalarltjoinsel ));
 DESCR("less than or equal");
 DATA(insert OID = 1205 (  ">"	   PGNSP PGUID b f f 869 869	 16 1203 1204 network_gt scalargtsel scalargtjoinsel ));
 DESCR("greater than");
 DATA(insert OID = 1206 (  ">="	   PGNSP PGUID b f f 869 869	 16 1204 1203 network_ge scalargtsel scalargtjoinsel ));
 DESCR("greater than or equal");
 DATA(insert OID = 931  (  "<<"	   PGNSP PGUID b f f 869 869	 16 933		0 network_sub networksel networkjoinsel ));
 DESCR("is subnet");
-#define OID_INET_SUB_OP				  931
+#define OID_INET_SUB_OP			931
 DATA(insert OID = 932  (  "<<="    PGNSP PGUID b f f 869 869	 16 934		0 network_subeq networksel networkjoinsel ));
 DESCR("is subnet or equal");
-#define OID_INET_SUBEQ_OP				932
+#define OID_INET_SUBEQ_OP		932
 DATA(insert OID = 933  (  ">>"	   PGNSP PGUID b f f 869 869	 16 931		0 network_sup networksel networkjoinsel ));
 DESCR("is supernet");
-#define OID_INET_SUP_OP				  933
+#define OID_INET_SUP_OP			933
 DATA(insert OID = 934  (  ">>="    PGNSP PGUID b f f 869 869	 16 932		0 network_supeq networksel networkjoinsel ));
 DESCR("is supernet or equal");
-#define OID_INET_SUPEQ_OP				934
+#define OID_INET_SUPEQ_OP		934
 DATA(insert OID = 3552	(  "&&"    PGNSP PGUID b f f 869 869	 16 3552	0 network_overlap networksel networkjoinsel ));
 DESCR("overlaps (is subnet or supernet)");
+#define OID_INET_OVERLAP_OP		3552
 
 DATA(insert OID = 2634 (  "~"	   PGNSP PGUID l f f	  0 869 869 0 0 inetnot - - ));
 DESCR("bitwise not");
 DATA(insert OID = 2635 (  "&"	   PGNSP PGUID b f f	869 869 869 0 0 inetand - - ));
 DESCR("bitwise and");
 DATA(insert OID = 2636 (  "|"	   PGNSP PGUID b f f	869 869 869 0 0 inetor - - ));
 DESCR("bitwise or");
 DATA(insert OID = 2637 (  "+"	   PGNSP PGUID b f f	869  20 869 2638 0 inetpl - - ));
 DESCR("add");
 DATA(insert OID = 2638 (  "+"	   PGNSP PGUID b f f	 20 869 869 2637 0 int8pl_inet - - ));
inet-selfuncs-test.sql (text/plain; charset=utf-8)
#9Heikki Linnakangas
hlinnakangas@vmware.com
In reply to: Emre Hasegeli (#8)
Re: Selectivity estimation for inet operators

On 08/26/2014 12:44 PM, Emre Hasegeli wrote:

I agree with you that we can support other join types and anti join later.
If others don’t have any objection to doing the other parts later, I will mark it as "Ready For Committer".

I updated the patch to cover semi and anti joins with eqjoinsel_semi().
I think it is better than returning a constant. The new version
attached with the new version of the test script. Can you please
look at it again and mark it as "ready for committer" if it seems okay
to you?

I took a quick look at this. Some questions:

* Isn't "X >> Y" equivalent to "network_scan_first(X) < Y AND
network_scan_last(X) > Y"? Or at least close enough for selectivity
estimation purposes? Pardon my ignorance - I'm not too familiar with the
inet datatype - but how about just calling scalarineqsel for both bounds?

* inet_mcv_join_selec() is O(n^2) where n is the number of entries in
the MCV lists. With the max statistics target of 10000, a worst case
query on my laptop took about 15 seconds to plan. Maybe that's
acceptable, but you went through some trouble to make planning of MCV vs
histogram faster, by the log2 method to compare only some values, so I
wonder why you didn't do the same for the MCV vs MCV case?

* A few typos: lenght -> length.

- Heikki

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: Emre Hasegeli (#8)
Re: Selectivity estimation for inet operators

Emre Hasegeli <emre@hasegeli.com> writes:

I updated the patch to cover semi and anti joins with eqjoinsel_semi().
I think it is better than returning a constant.

What you did there is utterly unacceptable from a modularity standpoint;
and considering that the values will be nowhere near right, the argument
that "it's better than returning a constant" seems pretty weak. I think
you should just take that out again.

regards, tom lane


#11Tom Lane
tgl@sss.pgh.pa.us
In reply to: Heikki Linnakangas (#9)
Re: Selectivity estimation for inet operators

Heikki Linnakangas <hlinnakangas@vmware.com> writes:

* inet_mcv_join_selec() is O(n^2) where n is the number of entries in
the MCV lists. With the max statistics target of 10000, a worst case
query on my laptop took about 15 seconds to plan. Maybe that's
acceptable, but you went through some trouble to make planning of MCV vs
histogram faster, by the log2 method to compare only some values, so I
wonder why you didn't do the same for the MCV vs MCV case?

Actually, what I think needs to be asked is the opposite question: why is
the other code ignoring some of the statistical data? If the user asked
us to collect a lot of stats detail it seems reasonable that he's
expecting us to use it to get more accurate estimates. It's for sure
not obvious why these estimators should take shortcuts that are not being
taken in the much-longer-established code for scalar comparison estimates.

I'm not exactly convinced that the math adds up in this logic, either.
The way in which it combines results from looking at the MCV lists and
at the histograms seems pretty arbitrary.
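
For reference, the combination being questioned can be written out from networkjoinsel_inner() as it appears in the v9 patch attached later in this thread. The symbols below are shorthand introduced here, not names from the patch: n_k is the null fraction of column k, m_k is the summed MCV frequency of column k, and each s term is the partial estimate returned by the corresponding helper. A term is only added when both of the statistics it needs exist, and the sum is clamped to [0, 1] afterwards.

```latex
\begin{aligned}
s &= s_{\mathrm{mcv}_1,\mathrm{mcv}_2}
   + s_{\mathrm{mcv}_1,\mathrm{his}_2}
   + s_{\mathrm{mcv}_2,\mathrm{his}_1} \\
  &\quad + (1 - n_1 - m_1)\,(1 - n_2 - m_2)\; s_{\mathrm{his}_1,\mathrm{his}_2}
\end{aligned}
```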

regards, tom lane


#12Emre Hasegeli
emre@hasegeli.com
In reply to: Heikki Linnakangas (#9)
1 attachment(s)
Re: Selectivity estimation for inet operators

* Isn't "X >> Y" equivalent to "network_scan_first(X) < Y AND
network_scan_last(X) > Y"? Or at least close enough for selectivity
estimation purposes? Pardon my ignorance - I'm not too familiar with the
inet datatype - but how about just calling scalarineqsel for both bounds?

Actually, "X >> Y" is equivalent to

network_scan_first(X) <= network_host(Y) AND
network_scan_last(X) >= network_host(Y) AND
network_masklen(X) < network_masklen(Y)

but we have statistics for neither network_scan_last(X)
nor network_masklen(X). I tried to find a solution based on
the implementation of the operators.
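
The three-part condition above can be illustrated with a minimal IPv4-only sketch. The ipv4_net struct and the ipv4_* helpers are hypothetical stand-ins invented for this example; they are not the inet type or the network_scan_first()/network_scan_last() functions from the PostgreSQL source, only an approximation of what those compute for a 32-bit address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for an inet value: IPv4 only. */
typedef struct
{
	uint32_t	addr;			/* address in host byte order */
	int			bits;			/* length of the network part, 0..32 */
} ipv4_net;

/* Netmask with the top "bits" bits set. */
static uint32_t
ipv4_mask(int bits)
{
	assert(bits >= 0 && bits <= 32);
	return bits == 0 ? 0 : (uint32_t) (~(uint32_t) 0 << (32 - bits));
}

/* Lowest address of the network: host bits cleared. */
static uint32_t
ipv4_first(ipv4_net n)
{
	return n.addr & ipv4_mask(n.bits);
}

/* Highest address of the network: host bits set. */
static uint32_t
ipv4_last(ipv4_net n)
{
	return n.addr | (uint32_t) ~ipv4_mask(n.bits);
}

/*
 * x >> y: x strictly contains y.  The address of y must fall inside the
 * range covered by x, and x must have the shorter network part.
 */
static bool
ipv4_sup(ipv4_net x, ipv4_net y)
{
	return ipv4_first(x) <= y.addr &&
		ipv4_last(x) >= y.addr &&
		x.bits < y.bits;
}
```

With x = 10.0.0.0/8, a value such as 10.1.2.3/32 satisfies all three conditions, while 10.0.0.0/8 itself fails only the masklen test. That last condition is the part a pair of range comparisons on the address alone cannot express.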

* inet_mcv_join_selec() is O(n^2) where n is the number of entries in the
MCV lists. With the max statistics target of 10000, a worst case query on
my laptop took about 15 seconds to plan. Maybe that's acceptable, but you
went through some trouble to make planning of MCV vs histogram faster, by
the log2 method to compare only some values, so I wonder why you didn't do
the same for the MCV vs MCV case?

It was like that in the previous versions. It was causing worse
estimation, but I was trying to reduce both sides of the lists. It
works slightly better when only the left hand side of the list is
reduced. Attached version works like that.
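
The left-side-only reduction can be sketched as follows. This is a simplified illustration modeled on inet_mcv_join_selec() in the attached patch, not a copy of it: values are plain ints instead of Datums, the operator is passed as a hypothetical match_fn callback, and reduced_count() replaces the (int) log2(n) + 1 expression with an equivalent integer loop so that the example needs no math library.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical operator callback: does "left op right" hold? */
typedef bool (*match_fn) (int left, int right);

/* Example operator used for illustration only: plain equality. */
static bool
equal_match(int left, int right)
{
	return left == right;
}

/* floor(log2(n)) + 1 for n >= 1, computed without floating point. */
static int
reduced_count(int n)
{
	int			count = 0;

	assert(n >= 1);
	while (n > 0)
	{
		count++;
		n >>= 1;
	}
	return count;
}

/*
 * Match only the first red_nvalues entries of the first MCV list against
 * the whole second list, then scale the result up by the ratio between the
 * total frequency of the first list and the frequency actually examined.
 */
static double
reduced_mcv_join_selec(const int *values1, const float *numbers1, int nvalues1,
					   const int *values2, const float *numbers2, int nvalues2,
					   int red_nvalues, match_fn matches)
{
	double		selec = 0.0,
				red_selec = 0.0,
				max_selec = 0.0;
	int			i,
				j;

	assert(red_nvalues > 0 && red_nvalues <= nvalues1);

	for (i = 0; i < nvalues1; i++)
	{
		if (i < red_nvalues)
		{
			for (j = 0; j < nvalues2; j++)
				if (matches(values1[i], values2[j]))
					selec += numbers1[i] * numbers2[j];

			red_selec += numbers1[i];
		}

		max_selec += numbers1[i];
	}

	/* Guard against dividing by zero when the examined entries are empty. */
	if (red_selec <= 0.0)
		return 0.0;

	return selec * max_selec / red_selec;
}
```

The inner loop runs red_nvalues * nvalues2 times instead of nvalues1 * nvalues2, so with a statistics target of 10000 the left side shrinks to 14 entries. The price is that the rescaling assumes the skipped entries match at the same rate as the examined ones, which is where the loss of accuracy comes from.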

* A few typos: lenght -> length.

Fixed.

Thank you for looking at it.

Attachments:

inet-selfuncs-v9.patch (text/plain; charset=utf-8)
diff --git a/src/backend/utils/adt/network_selfuncs.c b/src/backend/utils/adt/network_selfuncs.c
index d0d806f..a00706c 100644
--- a/src/backend/utils/adt/network_selfuncs.c
+++ b/src/backend/utils/adt/network_selfuncs.c
@@ -1,32 +1,671 @@
 /*-------------------------------------------------------------------------
  *
  * network_selfuncs.c
  *	  Functions for selectivity estimation of inet/cidr operators
  *
- * Currently these are just stubs, but we hope to do better soon.
+ * Estimates are based on null fraction, distinct value count, most common
+ * values, and histogram of inet/cidr datatypes.
  *
  * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
  *
  *
  * IDENTIFICATION
  *	  src/backend/utils/adt/network_selfuncs.c
  *
  *-------------------------------------------------------------------------
  */
 #include "postgres.h"
 
+#include <math.h>
+
+#include "access/htup_details.h"
+#include "catalog/pg_collation.h"
+#include "catalog/pg_operator.h"
+#include "catalog/pg_statistic.h"
+#include "utils/lsyscache.h"
 #include "utils/inet.h"
+#include "utils/selfuncs.h"
 
 
+/* Default selectivity constant for the inet overlap operator */
+#define DEFAULT_OVERLAP_SEL 0.01
+
+/* Default selectivity constant for the other operators */
+#define DEFAULT_INCLUSION_SEL 0.005
+
+/* Default selectivity for given operator */
+#define DEFAULT_SEL(operator) \
+	((operator) == OID_INET_OVERLAP_OP ? \
+			DEFAULT_OVERLAP_SEL : DEFAULT_INCLUSION_SEL)
+
+static Selectivity networkjoinsel_inner(Oid operator,
+					 VariableStatData *vardata1, VariableStatData *vardata2);
+extern double eqjoinsel_semi(Oid operator, VariableStatData *vardata1,
+			   VariableStatData *vardata2, RelOptInfo *inner_rel);
+extern RelOptInfo *find_join_input_rel(PlannerInfo *root, Relids relids);
+static short int inet_opr_order(Oid operator);
+static Selectivity inet_his_inclusion_selec(Datum *values, int nvalues,
+						 Datum *constvalue, short int opr_order);
+static Selectivity inet_mcv_join_selec(Datum *values1, float4 *numbers1,
+					int nvalues1, Datum *values2, float4 *numbers2,
+					int nvalues2, int red_nvalues, Oid operator);
+static Selectivity inet_mcv_his_selec(Datum *mcv_values, float4 *mcv_numbers,
+				   int mcv_nvalues, Datum *his_values, int his_nvalues,
+				   int red_nvalues, short int opr_order,
+				   Selectivity *max_selec_pointer);
+static Selectivity inet_his_inclusion_join_selec(Datum *his1_values,
+					  int his1_nvalues, Datum *his2_values, int his2_nvalues,
+							  int red_nvalues, short int opr_order);
+static short int inet_inclusion_cmp(inet *left, inet *right,
+				   short int opr_order);
+static short int inet_masklen_inclusion_cmp(inet *left, inet *right,
+						   short int opr_order);
+static short int inet_his_match_divider(inet *boundary, inet *query,
+					   short int opr_order);
+
+/*
+ * Selectivity estimation for the subnet inclusion operators
+ */
 Datum
 networksel(PG_FUNCTION_ARGS)
 {
-	PG_RETURN_FLOAT8(0.001);
+	PlannerInfo *root = (PlannerInfo *) PG_GETARG_POINTER(0);
+	Oid			operator = PG_GETARG_OID(1);
+	List	   *args = (List *) PG_GETARG_POINTER(2);
+	int			varRelid = PG_GETARG_INT32(3),
+				his_nvalues;
+	VariableStatData vardata;
+	Node	   *other;
+	bool		varonleft;
+	Selectivity selec,
+				max_mcv_selec;
+	Datum		constvalue,
+			   *his_values;
+	Form_pg_statistic stats;
+	double		nullfrac;
+	FmgrInfo	proc;
+
+	/*
+	 * If expression is not (variable op something) or (something op
+	 * variable), then punt and return a default estimate.
+	 */
+	if (!get_restriction_variable(root, args, varRelid,
+								  &vardata, &other, &varonleft))
+		PG_RETURN_FLOAT8(DEFAULT_SEL(operator));
+
+	/*
+	 * Can't do anything useful if the something is not a constant, either.
+	 */
+	if (!IsA(other, Const))
+	{
+		ReleaseVariableStats(vardata);
+		PG_RETURN_FLOAT8(DEFAULT_SEL(operator));
+	}
+
+	/* All of the subnet inclusion operators are strict. */
+	if (((Const *) other)->constisnull)
+	{
+		ReleaseVariableStats(vardata);
+		PG_RETURN_FLOAT8(0.0);
+	}
+
+	if (!HeapTupleIsValid(vardata.statsTuple))
+	{
+		ReleaseVariableStats(vardata);
+		PG_RETURN_FLOAT8(DEFAULT_SEL(operator));
+	}
+
+	constvalue = ((Const *) other)->constvalue;
+	stats = (Form_pg_statistic) GETSTRUCT(vardata.statsTuple);
+	nullfrac = stats ? stats->stanullfrac : 0.0;
+
+	fmgr_info(get_opcode(operator), &proc);
+	selec = mcv_selectivity(&vardata, &proc, constvalue, varonleft,
+							&max_mcv_selec);
+
+	if (get_attstatsslot(vardata.statsTuple,
+						 vardata.atttype, vardata.atttypmod,
+						 STATISTIC_KIND_HISTOGRAM, InvalidOid,
+						 NULL,
+						 &his_values, &his_nvalues,
+						 NULL, NULL))
+	{
+		selec += (1.0 - nullfrac - max_mcv_selec) *
+			inet_his_inclusion_selec(his_values, his_nvalues, &constvalue,
+									 varonleft ? inet_opr_order(operator) :
+									 inet_opr_order(operator) * -1);
+
+		free_attstatsslot(vardata.atttype, his_values, his_nvalues, NULL, 0);
+	}
+	else if (max_mcv_selec == 0.0)
+		selec = (1.0 - nullfrac) * DEFAULT_SEL(operator);
+
+	/* Result should be in range, but make sure... */
+	CLAMP_PROBABILITY(selec);
+
+	ReleaseVariableStats(vardata);
+	PG_RETURN_FLOAT8(selec);
 }
 
+/*
+ * Join selectivity estimation for the subnet inclusion operators
+ *
+ * This function is the copy of eqjoinsel() on selfuncs.c except the comments
+ * and that it calls networkjoinsel_inner() instead of eqjoinsel_inner().
+ */
 Datum
 networkjoinsel(PG_FUNCTION_ARGS)
 {
-	PG_RETURN_FLOAT8(0.001);
+	PlannerInfo *root = (PlannerInfo *) PG_GETARG_POINTER(0);
+	Oid			operator = PG_GETARG_OID(1);
+	List	   *args = (List *) PG_GETARG_POINTER(2);
+#ifdef NOT_USED
+	JoinType	jointype = (JoinType) PG_GETARG_INT16(3);
+#endif
+	SpecialJoinInfo *sjinfo = (SpecialJoinInfo *) PG_GETARG_POINTER(4);
+	double		selec;
+	VariableStatData vardata1;
+	VariableStatData vardata2;
+	bool		join_is_reversed;
+	RelOptInfo *inner_rel;
+
+	get_join_variables(root, args, sjinfo,
+					   &vardata1, &vardata2, &join_is_reversed);
+
+	switch (sjinfo->jointype)
+	{
+		case JOIN_INNER:
+		case JOIN_LEFT:
+		case JOIN_FULL:
+
+			/*
+			 * Selectivity for left join is not exactly same as inner join,
+			 * but is neglected.
+			 */
+			if (!join_is_reversed)
+				selec = networkjoinsel_inner(operator, &vardata1, &vardata2);
+			else
+				selec = networkjoinsel_inner(get_commutator(operator),
+											 &vardata2, &vardata1);
+			break;
+		case JOIN_SEMI:
+		case JOIN_ANTI:
+
+			/*
+			 * Selectivity estimation functions of semi and anti joins are not
+			 * implemented for the subnet inclusion operators.
+			 * eqjoinsel_semi() used to cover.  It makes small or big mistakes
+			 * based on the join type, the operator and the ratio between the
+			 * row counts.
+			 */
+			inner_rel = find_join_input_rel(root, sjinfo->min_righthand);
+
+			if (!join_is_reversed)
+				selec = eqjoinsel_semi(operator, &vardata1, &vardata2,
+									   inner_rel);
+			else
+				selec = eqjoinsel_semi(get_commutator(operator),
+									   &vardata2, &vardata1,
+									   inner_rel);
+			break;
+		default:
+			/* other values not expected here */
+			elog(ERROR, "unrecognized join type: %d",
+				 (int) sjinfo->jointype);
+			selec = 0;			/* keep compiler quiet */
+			break;
+	}
+
+	ReleaseVariableStats(vardata1);
+	ReleaseVariableStats(vardata2);
+
+	CLAMP_PROBABILITY(selec);
+
+	PG_RETURN_FLOAT8((float8) selec);
+}
+
+/*
+ * Inner join selectivity estimation for the subnet inclusion operators
+ *
+ * Calculates MCV vs MCV, MCV vs histogram and histogram vs histogram
+ * selectivity for join using the subnet inclusion operators.  Unlike the
+ * join selectivity function for the equality operator, eqjoinsel_inner(),
+ * one to one matching of the values is not enough.   Network inclusion
+ * operators are likely to match many to many.   It requires to loop the MCV
+ * and histogram lists to the end.  Also, MCV vs histogram selectivity is
+ * not neglected as in eqjoinsel_inner().
+ *
+ * To make the function faster, only some of the values from the first MCV
+ * and histogram are matched to the second histogram.  The number of values
+ * used is calculated by log2().
+ */
+static Selectivity
+networkjoinsel_inner(Oid operator,
+					 VariableStatData *vardata1, VariableStatData *vardata2)
+{
+	Form_pg_statistic stats;
+	double		nullfrac1 = 0.0,
+				nullfrac2 = 0.0;
+	Selectivity selec = 0.0,
+				mcv1_max_selec = 0.0,
+				mcv2_max_selec = 0.0;
+	bool		mcv1_exists = false,
+				mcv2_exists = false,
+				his1_exists = false,
+				his2_exists = false;
+	int			mcv1_nvalues,
+				mcv2_nvalues,
+				mcv1_nnumbers,
+				mcv2_nnumbers,
+				his1_nvalues,
+				his2_nvalues,
+				red1_nvalues,
+				red2_nvalues;
+	Datum	   *mcv1_values,
+			   *mcv2_values,
+			   *his1_values,
+			   *his2_values;
+	float4	   *mcv1_numbers,
+			   *mcv2_numbers;
+
+	if (HeapTupleIsValid(vardata1->statsTuple))
+	{
+		if ((stats = (Form_pg_statistic) GETSTRUCT(vardata1->statsTuple)))
+			nullfrac1 = stats->stanullfrac;
+
+		mcv1_exists = get_attstatsslot(vardata1->statsTuple,
+									   vardata1->atttype, vardata1->atttypmod,
+									   STATISTIC_KIND_MCV, InvalidOid,
+									   NULL,
+									   &mcv1_values, &mcv1_nvalues,
+									   &mcv1_numbers, &mcv1_nnumbers);
+		his1_exists = get_attstatsslot(vardata1->statsTuple,
+									   vardata1->atttype, vardata1->atttypmod,
+									   STATISTIC_KIND_HISTOGRAM, InvalidOid,
+									   NULL,
+									   &his1_values, &his1_nvalues,
+									   NULL, NULL);
+		red1_nvalues = ((int) log2(Max(mcv1_nvalues, his1_nvalues))) + 1;
+	}
+
+	if (HeapTupleIsValid(vardata2->statsTuple))
+	{
+		if ((stats = (Form_pg_statistic) GETSTRUCT(vardata2->statsTuple)))
+			nullfrac2 = stats->stanullfrac;
+
+		mcv2_exists = get_attstatsslot(vardata2->statsTuple,
+									   vardata2->atttype, vardata2->atttypmod,
+									   STATISTIC_KIND_MCV, InvalidOid,
+									   NULL,
+									   &mcv2_values, &mcv2_nvalues,
+									   &mcv2_numbers, &mcv2_nnumbers);
+		his2_exists = get_attstatsslot(vardata2->statsTuple,
+									   vardata2->atttype, vardata2->atttypmod,
+									   STATISTIC_KIND_HISTOGRAM, InvalidOid,
+									   NULL,
+									   &his2_values, &his2_nvalues,
+									   NULL, NULL);
+		red2_nvalues = ((int) log2(Max(mcv2_nvalues, his2_nvalues))) + 1;
+	}
+
+	if (mcv1_exists && mcv2_exists)
+		selec += inet_mcv_join_selec(mcv1_values, mcv1_numbers, mcv1_nvalues,
+									 mcv2_values, mcv2_numbers, mcv2_nvalues,
+									 Min(mcv1_nvalues, red1_nvalues), operator);
+	if (mcv1_exists && his2_exists)
+		selec += inet_mcv_his_selec(mcv1_values, mcv1_numbers, mcv1_nvalues,
+									his2_values, his2_nvalues,
+									Min(mcv1_nvalues, red1_nvalues),
+									inet_opr_order(operator), &mcv1_max_selec);
+	if (mcv2_exists && his1_exists)
+		selec += inet_mcv_his_selec(mcv2_values, mcv2_numbers, mcv2_nvalues,
+									his1_values, his1_nvalues,
+									Min(mcv2_nvalues, red2_nvalues),
+									inet_opr_order(operator), &mcv2_max_selec);
+	if (his1_exists && his2_exists)
+		selec += (1.0 - nullfrac1 - mcv1_max_selec) *
+			(1.0 - nullfrac2 - mcv2_max_selec) *
+			inet_his_inclusion_join_selec(his1_values, his1_nvalues,
+										  his2_values, his2_nvalues,
+										  Min(his1_nvalues, red1_nvalues),
+										  inet_opr_order(operator));
+
+	if ((!mcv1_exists && !his1_exists) || (!mcv2_exists && !his2_exists))
+		selec = (1.0 - nullfrac1) * (1.0 - nullfrac2) * DEFAULT_SEL(operator);
+
+	if (mcv1_exists)
+		free_attstatsslot(vardata1->atttype, mcv1_values, mcv1_nvalues,
+						  mcv1_numbers, mcv1_nnumbers);
+	if (mcv2_exists)
+		free_attstatsslot(vardata2->atttype, mcv2_values, mcv2_nvalues,
+						  mcv2_numbers, mcv2_nnumbers);
+	if (his1_exists)
+		free_attstatsslot(vardata1->atttype, his1_values, his1_nvalues,
+						  NULL, 0);
+	if (his2_exists)
+		free_attstatsslot(vardata2->atttype, his2_values, his2_nvalues,
+						  NULL, 0);
+
+	return selec;
+}
+
+/*
+ * Practical comparable numbers for the subnet inclusion operators
+ */
+static short int
+inet_opr_order(Oid operator)
+{
+	switch (operator)
+	{
+		case OID_INET_SUP_OP:
+			return -2;
+		case OID_INET_SUPEQ_OP:
+			return -1;
+		case OID_INET_OVERLAP_OP:
+			return 0;
+		case OID_INET_SUBEQ_OP:
+			return 1;
+		case OID_INET_SUB_OP:
+			return 2;
+		default:
+			elog(ERROR, "unknown operator for inet inclusion selectivity");
+	}
+}
+
+/*
+ * Inet histogram inclusion selectivity estimation
+ *
+ * Calculates histogram selectivity for the subnet inclusion operators of
+ * the inet type.  The return value is between 0 and 1.  It should be
+ * corrected with the MCV selectivity and null fraction.  If the constant
+ * is less than the first element or greater than the last element of
+ * the histogram the return value will be 0.
+ *
+ * The histogram is originally for the basic comparison operators.  Only
+ * the common bits of the network part and the length of the network part
+ * (masklen) are appropriate for the subnet inclusion operators.  Fortunately,
+ * basic comparison fits in this situation.  Even so, the length of the
+ * network part would not really be significant in the histogram.  This would
+ * lead to big mistakes for data sets with uneven masklen distribution.
+ * To avoid this problem, comparisons with the left and the right side of the
+ * buckets are used together.
+ *
+ * Histogram bucket matches are calculated in two forms.  If the constant
+ * matches both sides the bucket is considered as fully matched.  If the
+ * constant matches only the right side of the bucket, it is not considered
+ * as matched unless it is the last bucket, because it will match the next
+ * bucket.  If all of these buckets would be considered as matched, it would
+ * lead to unfair multiple matches for some constants.
+ *
+ * The second form is to match the bucket partially.  We try to calculate
+ * dividers for both of the boundaries.  If the address family of the boundary
+ * does not match the constant or comparison of the length of the network
+ * parts is not true by the operator, the divider for the boundary is not
+ * taken into account.  If both of the dividers can be calculated the greater
+ * one will be used to minimize the mistake in the buckets which have
+ * disparate masklens.
+ *
+ * The divider on the partial bucket match is imagined as the distance
+ * between the decisive bits and the common bits of the addresses.  It will
+ * be used as a power of two as it is the natural scale for the IP network
+ * inclusion.  The partial bucket match divider calculation is an empirical
+ * formula and subject to change with more experiment.
+ *
+ * For partial match with buckets which have different address families
+ * on the left and right sides only the boundary with the same address
+ * family is taken into consideration.  This can cause more mistakes for these
+ * buckets if the masklens of their boundaries are also disparate.  It can
+ * only be the case for one bucket, if there are addresses with different
+ * families on the column.  It seems as a better option than not considering
+ * these buckets.
+ */
+static Selectivity
+inet_his_inclusion_selec(Datum *values, int nvalues, Datum *constvalue,
+						 short int opr_order)
+{
+	inet	   *query,
+			   *left,
+			   *right;
+	float		match = 0.0;
+	int			i;
+	short int	left_order,
+				right_order,
+				left_divider,
+				right_divider;
+
+	query = DatumGetInetP(*constvalue);
+	left = DatumGetInetP(values[0]);
+	left_order = inet_inclusion_cmp(left, query, opr_order);
+
+	for (i = 1; i < nvalues; i++)
+	{
+		right = DatumGetInetP(values[i]);
+		right_order = inet_inclusion_cmp(right, query, opr_order);
+
+		if (left_order == 0 && right_order == 0)
+		{
+			/* Full bucket match. */
+
+			match += 1.0;
+		}
+		else if ((left_order <= 0 && right_order > 0) ||
+				 (left_order >= 0 && right_order < 0) ||
+				 (right_order == 0 && i == nvalues - 1))
+		{
+			/* Partial bucket match. */
+
+			left_divider = inet_his_match_divider(left, query, opr_order);
+			right_divider = inet_his_match_divider(right, query, opr_order);
+
+			if (left_divider >= 0 || right_divider >= 0)
+				match += 1.0 / pow(2, Max(left_divider, right_divider));
+		}
+
+		/* Shift the variables. */
+		left = right;
+		left_order = right_order;
+	}
+
+	/* There are nvalues - 1 buckets. */
+	return match / (nvalues - 1);
+}
+
+/*
+ * Inet MCV join selectivity estimation
+ *
+ * The original function of the operator used in this function, like the
+ * mcv_selectivity() on selfuncs.c.  Actually, this function has nothing
+ * to do with the network data types except its name and location.
+ */
+static Selectivity
+inet_mcv_join_selec(Datum *values1, float4 *numbers1, int nvalues1,
+					Datum *values2, float4 *numbers2, int nvalues2,
+					int red_nvalues, Oid operator)
+{
+	Selectivity selec = 0.0,
+				red_selec = 0.0,
+				max_selec = 0.0;
+	FmgrInfo	proc;
+	int			i,
+				j;
+
+	fmgr_info(get_opcode(operator), &proc);
+
+	for (i = 0; i < nvalues1; i++)
+	{
+		if (i < red_nvalues)
+		{
+			for (j = 0; j < nvalues2; j++)
+				if (DatumGetBool(FunctionCall2Coll(&proc, DEFAULT_COLLATION_OID,
+												   values1[i], values2[j])))
+					selec += numbers1[i] * numbers2[j];
+
+			red_selec += numbers1[i];
+		}
+
+		max_selec += numbers1[i];
+	}
+
+	return selec * max_selec / red_selec;
+}
+
+/*
+ * Inet MCV vs histogram inclusion join selectivity estimation
+ *
+ * The function result is the selectivity, and the fraction of the total
+ * population of the MCV is returned into *max_selec_pointer.
+ */
+static Selectivity
+inet_mcv_his_selec(Datum *mcv_values, float4 *mcv_numbers, int mcv_nvalues,
+				   Datum *his_values, int his_nvalues, int red_nvalues,
+				   short int opr_order, Selectivity *max_selec_pointer)
+{
+	Selectivity selec = 0.0,
+				red_selec = 0.0,
+				max_selec = 0.0;
+	int			i;
+
+	for (i = 0; i < mcv_nvalues; i++)
+	{
+		if (i < red_nvalues)
+		{
+			selec += mcv_numbers[i] *
+				inet_his_inclusion_selec(his_values, his_nvalues,
+										 &mcv_values[i], opr_order);
+
+			red_selec += mcv_numbers[i];
+		}
+
+		max_selec += mcv_numbers[i];
+	}
+
+	*max_selec_pointer = max_selec;
+	return selec * max_selec / red_selec;
+}
+
+/*
+ * Inet histogram inclusion join selectivity estimation
+ *
+ * Selected values from the first histogram will be matched with the second.
+ * red_nvalues of the values are selected by discarding the same number of
+ * values from the beginning and the end of the list, on the grounds that
+ * they are outliers and hence not very representative.
+ */
+static Selectivity
+inet_his_inclusion_join_selec(Datum *his1_values, int his1_nvalues,
+							  Datum *his2_values, int his2_nvalues,
+							  int red_nvalues, short int opr_order)
+{
+	float		match = 0.0;
+	int			nskip = (his1_nvalues - red_nvalues) / 2,
+				i;
+
+	for (i = nskip; i < his1_nvalues - nskip; i++)
+		match += inet_his_inclusion_selec(his2_values, his2_nvalues,
+										  &his1_values[i], opr_order);
+
+	return match / (his1_nvalues - 2 * nskip);
+}
+
+/*
+ * Comparison function for the subnet inclusion operators
+ *
+ * Comparison is compatible with the basic comparison function for the inet
+ * type.  See network_cmp_internal on network.c for the original.  Basic
+ * comparison operators are implemented with the network_cmp_internal
+ * function.  It is possible to implement the subnet inclusion operators with
+ * this function.
+ *
+ * Comparison is first on the common bits of the network part, then on
+ * the length of the network part (masklen) as the network_cmp_internal
+ * function.  Only the first part is on this function.  The second part is
+ * separated to another function for reusability.  The difference between
+ * the second part and the original network_cmp_internal is that the operator
+ * is used while comparing the lengths of the network parts.  See the second
+ * part on the inet_masklen_inclusion_cmp function below.
+ */
+static short int
+inet_inclusion_cmp(inet *left, inet *right, short int opr_order)
+{
+	if (ip_family(left) == ip_family(right))
+	{
+		short int	order;
+
+		order = bitncmp(ip_addr(left), ip_addr(right),
+						Min(ip_bits(left), ip_bits(right)));
+
+		if (order != 0)
+			return order;
+
+		return inet_masklen_inclusion_cmp(left, right, opr_order);
+	}
+
+	return ip_family(left) - ip_family(right);
+}
+
+/*
+ * Masklen comparison function for the subnet inclusion operators
+ *
+ * Compares the lengths of network parts of the inputs using the operator.
+ * If the comparison is okay for the operator the return value will be 0.
+ * Otherwise the return value will be less than or greater than 0 with
+ * respect to the operator.
+ */
+static short int
+inet_masklen_inclusion_cmp(inet *left, inet *right, short int opr_order)
+{
+	if (ip_family(left) == ip_family(right))
+	{
+		short int	order;
+
+		order = ip_bits(left) - ip_bits(right);
+
+		if ((order > 0 && opr_order >= 0) ||
+			(order == 0 && opr_order >= -1 && opr_order <= 1) ||
+			(order < 0 && opr_order <= 0))
+			return 0;
+
+		return opr_order;
+	}
+
+	return ip_family(left) - ip_family(right);
+}
+
+/*
+ * Inet histogram partial match divider calculation
+ *
+ * First the families and the lengths of the network parts are compared
+ * using the subnet inclusion operator.  The divider will be calculated
+ * using the masklens and the common bits of the addresses.  -1 will be
+ * returned if it cannot be calculated.
+ */
+static short int
+inet_his_match_divider(inet *boundary, inet *query, short int opr_order)
+{
+	if (inet_masklen_inclusion_cmp(boundary, query, opr_order) == 0)
+	{
+		short int	min_bits,
+					decisive_bits;
+
+		min_bits = Min(ip_bits(boundary), ip_bits(query));
+
+		/*
+		 * Set the decisive bits from the one which should contain the other
+		 * according to the operator.
+		 */
+		if (opr_order < 0)
+			decisive_bits = ip_bits(boundary);
+		else if (opr_order > 0)
+			decisive_bits = ip_bits(query);
+		else
+			decisive_bits = min_bits;
+
+		if (min_bits > 0)
+			return decisive_bits - bitncommon(ip_addr(boundary), ip_addr(query),
+											  min_bits);
+		return decisive_bits;
+	}
+
+	return -1;
 }
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index e932ccf..8dcda93 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -149,21 +149,21 @@ static double var_eq_const(VariableStatData *vardata, Oid operator,
 			 bool varonleft);
 static double var_eq_non_const(VariableStatData *vardata, Oid operator,
 				 Node *other,
 				 bool varonleft);
 static double ineq_histogram_selectivity(PlannerInfo *root,
 						   VariableStatData *vardata,
 						   FmgrInfo *opproc, bool isgt,
 						   Datum constval, Oid consttype);
 static double eqjoinsel_inner(Oid operator,
 				VariableStatData *vardata1, VariableStatData *vardata2);
-static double eqjoinsel_semi(Oid operator,
+double eqjoinsel_semi(Oid operator,
 			   VariableStatData *vardata1, VariableStatData *vardata2,
 			   RelOptInfo *inner_rel);
 static bool convert_to_scalar(Datum value, Oid valuetypid, double *scaledvalue,
 				  Datum lobound, Datum hibound, Oid boundstypid,
 				  double *scaledlobound, double *scaledhibound);
 static double convert_numeric_to_scalar(Datum value, Oid typid);
 static void convert_string_to_scalar(char *value,
 						 double *scaledvalue,
 						 char *lobound,
 						 double *scaledlobound,
@@ -182,21 +182,21 @@ static double convert_one_bytea_to_scalar(unsigned char *value, int valuelen,
 static char *convert_string_datum(Datum value, Oid typid);
 static double convert_timevalue_to_scalar(Datum value, Oid typid);
 static void examine_simple_variable(PlannerInfo *root, Var *var,
 						VariableStatData *vardata);
 static bool get_variable_range(PlannerInfo *root, VariableStatData *vardata,
 				   Oid sortop, Datum *min, Datum *max);
 static bool get_actual_variable_range(PlannerInfo *root,
 						  VariableStatData *vardata,
 						  Oid sortop,
 						  Datum *min, Datum *max);
-static RelOptInfo *find_join_input_rel(PlannerInfo *root, Relids relids);
+RelOptInfo *find_join_input_rel(PlannerInfo *root, Relids relids);
 static Selectivity prefix_selectivity(PlannerInfo *root,
 				   VariableStatData *vardata,
 				   Oid vartype, Oid opfamily, Const *prefixcon);
 static Selectivity like_selectivity(const char *patt, int pattlen,
 				 bool case_insensitive);
 static Selectivity regex_selectivity(const char *patt, int pattlen,
 				  bool case_insensitive,
 				  int fixed_prefix_len);
 static Datum string_to_datum(const char *str, Oid datatype);
 static Const *string_to_const(const char *str, Oid datatype);
@@ -2418,21 +2418,21 @@ eqjoinsel_inner(Oid operator,
 
 	return selec;
 }
 
 /*
  * eqjoinsel_semi --- eqjoinsel for semi join
  *
  * (Also used for anti join, which we are supposed to estimate the same way.)
  * Caller has ensured that vardata1 is the LHS variable.
  */
-static double
+double
 eqjoinsel_semi(Oid operator,
 			   VariableStatData *vardata1, VariableStatData *vardata2,
 			   RelOptInfo *inner_rel)
 {
 	double		selec;
 	double		nd1;
 	double		nd2;
 	bool		isdefault1;
 	bool		isdefault2;
 	Form_pg_statistic stats1 = NULL;
@@ -5094,21 +5094,21 @@ get_actual_variable_range(PlannerInfo *root, VariableStatData *vardata,
 	return have_data;
 }
 
 /*
  * find_join_input_rel
  *		Look up the input relation for a join.
  *
  * We assume that the input relation's RelOptInfo must have been constructed
  * already.
  */
-static RelOptInfo *
+RelOptInfo *
 find_join_input_rel(PlannerInfo *root, Relids relids)
 {
 	RelOptInfo *rel = NULL;
 
 	switch (bms_membership(relids))
 	{
 		case BMS_EMPTY_SET:
 			/* should not happen */
 			break;
 		case BMS_SINGLETON:
diff --git a/src/include/catalog/pg_operator.h b/src/include/catalog/pg_operator.h
index d7dcd1c..3b827fc 100644
--- a/src/include/catalog/pg_operator.h
+++ b/src/include/catalog/pg_operator.h
@@ -1135,32 +1135,33 @@ DESCR("not equal");
 DATA(insert OID = 1203 (  "<"	   PGNSP PGUID b f f 869 869	 16 1205 1206 network_lt scalarltsel scalarltjoinsel ));
 DESCR("less than");
 DATA(insert OID = 1204 (  "<="	   PGNSP PGUID b f f 869 869	 16 1206 1205 network_le scalarltsel scalarltjoinsel ));
 DESCR("less than or equal");
 DATA(insert OID = 1205 (  ">"	   PGNSP PGUID b f f 869 869	 16 1203 1204 network_gt scalargtsel scalargtjoinsel ));
 DESCR("greater than");
 DATA(insert OID = 1206 (  ">="	   PGNSP PGUID b f f 869 869	 16 1204 1203 network_ge scalargtsel scalargtjoinsel ));
 DESCR("greater than or equal");
 DATA(insert OID = 931  (  "<<"	   PGNSP PGUID b f f 869 869	 16 933		0 network_sub networksel networkjoinsel ));
 DESCR("is subnet");
-#define OID_INET_SUB_OP				  931
+#define OID_INET_SUB_OP			931
 DATA(insert OID = 932  (  "<<="    PGNSP PGUID b f f 869 869	 16 934		0 network_subeq networksel networkjoinsel ));
 DESCR("is subnet or equal");
-#define OID_INET_SUBEQ_OP				932
+#define OID_INET_SUBEQ_OP		932
 DATA(insert OID = 933  (  ">>"	   PGNSP PGUID b f f 869 869	 16 931		0 network_sup networksel networkjoinsel ));
 DESCR("is supernet");
-#define OID_INET_SUP_OP				  933
+#define OID_INET_SUP_OP			933
 DATA(insert OID = 934  (  ">>="    PGNSP PGUID b f f 869 869	 16 932		0 network_supeq networksel networkjoinsel ));
 DESCR("is supernet or equal");
-#define OID_INET_SUPEQ_OP				934
+#define OID_INET_SUPEQ_OP		934
 DATA(insert OID = 3552	(  "&&"    PGNSP PGUID b f f 869 869	 16 3552	0 network_overlap networksel networkjoinsel ));
 DESCR("overlaps (is subnet or supernet)");
+#define OID_INET_OVERLAP_OP		3552
 
 DATA(insert OID = 2634 (  "~"	   PGNSP PGUID l f f	  0 869 869 0 0 inetnot - - ));
 DESCR("bitwise not");
 DATA(insert OID = 2635 (  "&"	   PGNSP PGUID b f f	869 869 869 0 0 inetand - - ));
 DESCR("bitwise and");
 DATA(insert OID = 2636 (  "|"	   PGNSP PGUID b f f	869 869 869 0 0 inetor - - ));
 DESCR("bitwise or");
 DATA(insert OID = 2637 (  "+"	   PGNSP PGUID b f f	869  20 869 2638 0 inetpl - - ));
 DESCR("add");
 DATA(insert OID = 2638 (  "+"	   PGNSP PGUID b f f	 20 869 869 2637 0 int8pl_inet - - ));
#13Emre Hasegeli
emre@hasegeli.com
In reply to: Tom Lane (#11)
1 attachment(s)
Re: Selectivity estimation for inet operators

Heikki Linnakangas <hlinnakangas@vmware.com> writes:

* inet_mcv_join_selec() is O(n^2) where n is the number of entries in
the MCV lists. With the max statistics target of 10000, a worst case
query on my laptop took about 15 seconds to plan. Maybe that's
acceptable, but you went through some trouble to make planning of MCV vs
histogram faster, by the log2 method to compare only some values, so I
wonder why you didn't do the same for the MCV vs MCV case?

Actually, what I think needs to be asked is the opposite question: why is
the other code ignoring some of the statistical data? If the user asked
us to collect a lot of stats detail it seems reasonable that he's
expecting us to use it to get more accurate estimates. It's for sure
not obvious why these estimators should take shortcuts that are not being
taken in the much-longer-established code for scalar comparison estimates.

It would still use more statistical data when statistics_target is
higher. I was not sure that the user wants to spend O(n^2) time
depending on statistics_target. The attached version is without
this optimization. Estimates are better without it, but planning
takes more time.

I'm not exactly convinced that the math adds up in this logic, either.
The way in which it combines results from looking at the MCV lists and
at the histograms seems pretty arbitrary.

I thought the product of the join would be

(left_mcv + left_histogram) * (right_mcv + right_histogram) * selectivity

and tried to calculate it as follows:

(left_mcv * right_mcv * selectivity) +
(right_mcv * left_histogram * selectivity) +
(left_mcv * right_histogram * selectivity) +
(left_histogram * right_histogram * selectivity)

where left_histogram is

1.0 - left_nullfrac - left_mcv

I fixed the calculation for the MCV vs histogram part. The estimates for
inner joins are very close to the actual row counts with
statistics_target = 1000. I think the calculation should be right.
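
For illustration, the four-term sum above can be sketched in C as follows.
This is a minimal sketch, not code from the patch: combine_join_selec() and
its parameter names are hypothetical, and it assumes each partial
selectivity has already been weighted by the MCV frequencies on its own
side, as the loops in inet_mcv_join_selec() and inet_mcv_his_selec() do.

```c
/*
 * Hypothetical helper (not part of the patch): combine the four partial
 * join selectivities into one estimate.
 *
 * mcv_mcv_sel   : sum of freq1 * freq2 over matching MCV pairs
 * mcv2_his1_sel : sum of freq2 * (fraction of histogram 1 matched)
 * mcv1_his2_sel : sum of freq1 * (fraction of histogram 2 matched)
 * his_his_sel   : fraction of histogram 2 matched, averaged over
 *                 the values of histogram 1
 */
double
combine_join_selec(double nullfrac1, double mcv1_frac,
                   double nullfrac2, double mcv2_frac,
                   double mcv_mcv_sel,
                   double mcv2_his1_sel,
                   double mcv1_his2_sel,
                   double his_his_sel)
{
    /* Fraction of each side that is represented by its histogram. */
    double his1_frac = 1.0 - nullfrac1 - mcv1_frac;
    double his2_frac = 1.0 - nullfrac2 - mcv2_frac;
    double selec;

    selec = mcv_mcv_sel
        + his1_frac * mcv2_his1_sel
        + his2_frac * mcv1_his2_sel
        + his1_frac * his2_frac * his_his_sel;

    /* Keep the result in the [0, 1] range. */
    if (selec < 0.0)
        selec = 0.0;
    if (selec > 1.0)
        selec = 1.0;

    return selec;
}
```

With no NULLs and the same selectivity s for every pair of values, the four
terms add up to s again, which is a quick sanity check of the decomposition.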

Attachments:

inet-selfuncs-v10.patch (text/plain; charset=utf-8)
diff --git a/src/backend/utils/adt/network_selfuncs.c b/src/backend/utils/adt/network_selfuncs.c
index d0d806f..4367f0e 100644
--- a/src/backend/utils/adt/network_selfuncs.c
+++ b/src/backend/utils/adt/network_selfuncs.c
@@ -1,32 +1,655 @@
 /*-------------------------------------------------------------------------
  *
  * network_selfuncs.c
  *	  Functions for selectivity estimation of inet/cidr operators
  *
- * Currently these are just stubs, but we hope to do better soon.
+ * Estimates are based on null fraction, distinct value count, most common
+ * values, and histogram of inet/cidr datatypes.
  *
  * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
  *
  *
  * IDENTIFICATION
  *	  src/backend/utils/adt/network_selfuncs.c
  *
  *-------------------------------------------------------------------------
  */
 #include "postgres.h"
 
+#include <math.h>
+
+#include "access/htup_details.h"
+#include "catalog/pg_collation.h"
+#include "catalog/pg_operator.h"
+#include "catalog/pg_statistic.h"
+#include "utils/lsyscache.h"
 #include "utils/inet.h"
+#include "utils/selfuncs.h"
 
 
+/* Default selectivity constant for the inet overlap operator */
+#define DEFAULT_OVERLAP_SEL 0.01
+
+/* Default selectivity constant for the other operators */
+#define DEFAULT_INCLUSION_SEL 0.005
+
+/* Default selectivity for given operator */
+#define DEFAULT_SEL(operator) \
+	((operator) == OID_INET_OVERLAP_OP ? \
+			DEFAULT_OVERLAP_SEL : DEFAULT_INCLUSION_SEL)
+
+static Selectivity networkjoinsel_inner(Oid operator,
+					 VariableStatData *vardata1, VariableStatData *vardata2);
+extern double eqjoinsel_semi(Oid operator, VariableStatData *vardata1,
+			   VariableStatData *vardata2, RelOptInfo *inner_rel);
+extern RelOptInfo *find_join_input_rel(PlannerInfo *root, Relids relids);
+static short int inet_opr_order(Oid operator);
+static Selectivity inet_his_inclusion_selec(Datum *values, int nvalues,
+						 Datum *constvalue, short int opr_order);
+static Selectivity inet_mcv_join_selec(Datum *values1, float4 *numbers1,
+					int nvalues1, Datum *values2, float4 *numbers2,
+					int nvalues2, Oid operator, Selectivity *max_selec_p);
+static Selectivity inet_mcv_his_selec(Datum *mcv_values, float4 *mcv_numbers,
+				   int mcv_nvalues, Datum *his_values, int his_nvalues,
+				   short int opr_order, Selectivity *max_selec_p);
+static Selectivity inet_his_inclusion_join_selec(Datum *his1_values,
+					  int his1_nvalues, Datum *his2_values, int his2_nvalues,
+					  short int opr_order);
+static short int inet_inclusion_cmp(inet *left, inet *right,
+				   short int opr_order);
+static short int inet_masklen_inclusion_cmp(inet *left, inet *right,
+						   short int opr_order);
+static short int inet_his_match_divider(inet *boundary, inet *query,
+					   short int opr_order);
+
+/*
+ * Selectivity estimation for the subnet inclusion operators
+ */
 Datum
 networksel(PG_FUNCTION_ARGS)
 {
-	PG_RETURN_FLOAT8(0.001);
+	PlannerInfo *root = (PlannerInfo *) PG_GETARG_POINTER(0);
+	Oid			operator = PG_GETARG_OID(1);
+	List	   *args = (List *) PG_GETARG_POINTER(2);
+	int			varRelid = PG_GETARG_INT32(3),
+				his_nvalues;
+	VariableStatData vardata;
+	Node	   *other;
+	bool		varonleft;
+	Selectivity selec,
+				max_mcv_selec;
+	Datum		constvalue,
+			   *his_values;
+	Form_pg_statistic stats;
+	double		nullfrac;
+	FmgrInfo	proc;
+
+	/*
+	 * If expression is not (variable op something) or (something op
+	 * variable), then punt and return a default estimate.
+	 */
+	if (!get_restriction_variable(root, args, varRelid,
+								  &vardata, &other, &varonleft))
+		PG_RETURN_FLOAT8(DEFAULT_SEL(operator));
+
+	/*
+	 * Can't do anything useful if the something is not a constant, either.
+	 */
+	if (!IsA(other, Const))
+	{
+		ReleaseVariableStats(vardata);
+		PG_RETURN_FLOAT8(DEFAULT_SEL(operator));
+	}
+
+	/* All of the subnet inclusion operators are strict. */
+	if (((Const *) other)->constisnull)
+	{
+		ReleaseVariableStats(vardata);
+		PG_RETURN_FLOAT8(0.0);
+	}
+
+	if (!HeapTupleIsValid(vardata.statsTuple))
+	{
+		ReleaseVariableStats(vardata);
+		PG_RETURN_FLOAT8(DEFAULT_SEL(operator));
+	}
+
+	constvalue = ((Const *) other)->constvalue;
+	stats = (Form_pg_statistic) GETSTRUCT(vardata.statsTuple);
+	nullfrac = stats ? stats->stanullfrac : 0.0;
+
+	fmgr_info(get_opcode(operator), &proc);
+	selec = mcv_selectivity(&vardata, &proc, constvalue, varonleft,
+							&max_mcv_selec);
+
+	if (get_attstatsslot(vardata.statsTuple,
+						 vardata.atttype, vardata.atttypmod,
+						 STATISTIC_KIND_HISTOGRAM, InvalidOid,
+						 NULL,
+						 &his_values, &his_nvalues,
+						 NULL, NULL))
+	{
+		selec += (1.0 - nullfrac - max_mcv_selec) *
+			inet_his_inclusion_selec(his_values, his_nvalues, &constvalue,
+									 varonleft ? inet_opr_order(operator) :
+									 inet_opr_order(operator) * -1);
+
+		free_attstatsslot(vardata.atttype, his_values, his_nvalues, NULL, 0);
+	}
+	else if (max_mcv_selec == 0.0)
+		selec = (1.0 - nullfrac) * DEFAULT_SEL(operator);
+
+	/* Result should be in range, but make sure... */
+	CLAMP_PROBABILITY(selec);
+
+	ReleaseVariableStats(vardata);
+	PG_RETURN_FLOAT8(selec);
 }
 
+/*
+ * Join selectivity estimation for the subnet inclusion operators
+ *
+ * This function is the copy of eqjoinsel() on selfuncs.c except the comments
+ * and that it calls networkjoinsel_inner() instead of eqjoinsel_inner().
+ */
 Datum
 networkjoinsel(PG_FUNCTION_ARGS)
 {
-	PG_RETURN_FLOAT8(0.001);
+	PlannerInfo *root = (PlannerInfo *) PG_GETARG_POINTER(0);
+	Oid			operator = PG_GETARG_OID(1);
+	List	   *args = (List *) PG_GETARG_POINTER(2);
+#ifdef NOT_USED
+	JoinType	jointype = (JoinType) PG_GETARG_INT16(3);
+#endif
+	SpecialJoinInfo *sjinfo = (SpecialJoinInfo *) PG_GETARG_POINTER(4);
+	double		selec;
+	VariableStatData vardata1;
+	VariableStatData vardata2;
+	bool		join_is_reversed;
+	RelOptInfo *inner_rel;
+
+	get_join_variables(root, args, sjinfo,
+					   &vardata1, &vardata2, &join_is_reversed);
+
+	switch (sjinfo->jointype)
+	{
+		case JOIN_INNER:
+		case JOIN_LEFT:
+		case JOIN_FULL:
+
+			/*
+			 * Selectivity for left join is not exactly same as inner join,
+			 * but is neglected.
+			 */
+			if (!join_is_reversed)
+				selec = networkjoinsel_inner(operator, &vardata1, &vardata2);
+			else
+				selec = networkjoinsel_inner(get_commutator(operator),
+											 &vardata2, &vardata1);
+			break;
+		case JOIN_SEMI:
+		case JOIN_ANTI:
+
+			/*
+			 * Selectivity estimation functions of semi and anti joins are not
+			 * implemented for the subnet inclusion operators.
+			 * eqjoinsel_semi() used to cover.  It makes small or big mistakes
+			 * based on the join type, the operator and the ratio between the
+			 * row counts.
+			 */
+			inner_rel = find_join_input_rel(root, sjinfo->min_righthand);
+
+			if (!join_is_reversed)
+				selec = eqjoinsel_semi(operator, &vardata1, &vardata2,
+									   inner_rel);
+			else
+				selec = eqjoinsel_semi(get_commutator(operator),
+									   &vardata2, &vardata1,
+									   inner_rel);
+			break;
+		default:
+			/* other values not expected here */
+			elog(ERROR, "unrecognized join type: %d",
+				 (int) sjinfo->jointype);
+			selec = 0;			/* keep compiler quiet */
+			break;
+	}
+
+	ReleaseVariableStats(vardata1);
+	ReleaseVariableStats(vardata2);
+
+	CLAMP_PROBABILITY(selec);
+
+	PG_RETURN_FLOAT8((float8) selec);
+}
+
+/*
+ * Inner join selectivity estimation for the subnet inclusion operators
+ *
+ * Calculates MCV vs MCV, MCV vs histogram and histogram vs histogram
+ * selectivity for join using the subnet inclusion operators.  Unlike the
+ * join selectivity function for the equality operator, eqjoinsel_inner(),
+ * one to one matching of the values is not enough.   Network inclusion
+ * operators are likely to match many to many.   This requires looping over
+ * the MCV and histogram lists to the end.  Also, MCV vs histogram selectivity
+ * is not neglected as in eqjoinsel_inner().
+ */
+static Selectivity
+networkjoinsel_inner(Oid operator,
+					 VariableStatData *vardata1, VariableStatData *vardata2)
+{
+	Form_pg_statistic stats;
+	double		nullfrac1 = 0.0,
+				nullfrac2 = 0.0;
+	Selectivity selec = 0.0,
+				mcv1_max_selec = 0.0,
+				mcv2_max_selec = 0.0;
+	bool		mcv1_exists = false,
+				mcv2_exists = false,
+				his1_exists = false,
+				his2_exists = false;
+	short int	opr_order;
+	int			mcv1_nvalues,
+				mcv2_nvalues,
+				mcv1_nnumbers,
+				mcv2_nnumbers,
+				his1_nvalues,
+				his2_nvalues;
+	Datum	   *mcv1_values,
+			   *mcv2_values,
+			   *his1_values,
+			   *his2_values;
+	float4	   *mcv1_numbers,
+			   *mcv2_numbers;
+
+	if (HeapTupleIsValid(vardata1->statsTuple))
+	{
+		if ((stats = (Form_pg_statistic) GETSTRUCT(vardata1->statsTuple)))
+			nullfrac1 = stats->stanullfrac;
+
+		mcv1_exists = get_attstatsslot(vardata1->statsTuple,
+									   vardata1->atttype, vardata1->atttypmod,
+									   STATISTIC_KIND_MCV, InvalidOid,
+									   NULL,
+									   &mcv1_values, &mcv1_nvalues,
+									   &mcv1_numbers, &mcv1_nnumbers);
+		his1_exists = get_attstatsslot(vardata1->statsTuple,
+									   vardata1->atttype, vardata1->atttypmod,
+									   STATISTIC_KIND_HISTOGRAM, InvalidOid,
+									   NULL,
+									   &his1_values, &his1_nvalues,
+									   NULL, NULL);
+	}
+
+	if (HeapTupleIsValid(vardata2->statsTuple))
+	{
+		if ((stats = (Form_pg_statistic) GETSTRUCT(vardata2->statsTuple)))
+			nullfrac2 = stats->stanullfrac;
+
+		mcv2_exists = get_attstatsslot(vardata2->statsTuple,
+									   vardata2->atttype, vardata2->atttypmod,
+									   STATISTIC_KIND_MCV, InvalidOid,
+									   NULL,
+									   &mcv2_values, &mcv2_nvalues,
+									   &mcv2_numbers, &mcv2_nnumbers);
+		his2_exists = get_attstatsslot(vardata2->statsTuple,
+									   vardata2->atttype, vardata2->atttypmod,
+									   STATISTIC_KIND_HISTOGRAM, InvalidOid,
+									   NULL,
+									   &his2_values, &his2_nvalues,
+									   NULL, NULL);
+	}
+
+	opr_order = inet_opr_order(operator);
+
+	if (mcv1_exists && mcv2_exists)
+		selec += inet_mcv_join_selec(mcv1_values, mcv1_numbers, mcv1_nvalues,
+									 mcv2_values, mcv2_numbers, mcv2_nvalues,
+									 operator, &mcv1_max_selec);
+	if (mcv2_exists && his1_exists)
+		selec += (1.0 - nullfrac1 - mcv1_max_selec) *
+			inet_mcv_his_selec(mcv2_values, mcv2_numbers, mcv2_nvalues,
+							   his1_values, his1_nvalues, opr_order,
+							   &mcv2_max_selec);
+	if (mcv1_exists && his2_exists)
+		selec += (1.0 - nullfrac2 - mcv2_max_selec) *
+			inet_mcv_his_selec(mcv1_values, mcv1_numbers, mcv1_nvalues,
+							   his2_values, his2_nvalues, opr_order,
+							   &mcv1_max_selec);
+	if (his1_exists && his2_exists)
+		selec += (1.0 - nullfrac1 - mcv1_max_selec) *
+			(1.0 - nullfrac2 - mcv2_max_selec) *
+			inet_his_inclusion_join_selec(his1_values, his1_nvalues,
+										  his2_values, his2_nvalues,
+										  opr_order);
+
+	if ((!mcv1_exists && !his1_exists) || (!mcv2_exists && !his2_exists))
+		selec = (1.0 - nullfrac1) * (1.0 - nullfrac2) * DEFAULT_SEL(operator);
+
+	if (mcv1_exists)
+		free_attstatsslot(vardata1->atttype, mcv1_values, mcv1_nvalues,
+						  mcv1_numbers, mcv1_nnumbers);
+	if (mcv2_exists)
+		free_attstatsslot(vardata2->atttype, mcv2_values, mcv2_nvalues,
+						  mcv2_numbers, mcv2_nnumbers);
+	if (his1_exists)
+		free_attstatsslot(vardata1->atttype, his1_values, his1_nvalues,
+						  NULL, 0);
+	if (his2_exists)
+		free_attstatsslot(vardata2->atttype, his2_values, his2_nvalues,
+						  NULL, 0);
+
+	return selec;
+}
+
+/*
+ * Practical comparable numbers for the subnet inclusion operators
+ */
+static short int
+inet_opr_order(Oid operator)
+{
+	switch (operator)
+	{
+		case OID_INET_SUP_OP:
+			return -2;
+		case OID_INET_SUPEQ_OP:
+			return -1;
+		case OID_INET_OVERLAP_OP:
+			return 0;
+		case OID_INET_SUBEQ_OP:
+			return 1;
+		case OID_INET_SUB_OP:
+			return 2;
+		default:
+			elog(ERROR, "unknown operator for inet inclusion selectivity");
+	}
+}
+
+/*
+ * Inet histogram inclusion selectivity estimation
+ *
+ * Calculates histogram selectivity for the subnet inclusion operators of
+ * the inet type.  The return value is between 0 and 1.  It should be
+ * corrected with the MCV selectivity and null fraction.  If the constant
+ * is less than the first element or greater than the last element of
+ * the histogram the return value will be 0.
+ *
+ * The histogram is originally for the basic comparison operators.  Only
+ * the common bits of the network part and the length of the network part
+ * (masklen) are appropriate for the subnet inclusion operators.  Fortunately,
+ * basic comparison fits in this situation.  Even so, the length of the
+ * network part would not really be significant in the histogram.  This would
+ * lead to big mistakes for data sets with uneven masklen distribution.
+ * To avoid this problem, comparison with the left and the right side of the
+ * buckets are used together.
+ *
+ * Histogram bucket matches are calculated in two forms.  If the constant
+ * matches both sides the bucket is considered as fully matched.  If the
+ * constant matches only the right side of the bucket, it is not considered
+ * as matched unless it is the last bucket, because it will match the next
+ * bucket.  If all of these buckets would be considered as matched, it would
+ * lead to unfair multiple matches for some constants.
+ *
+ * The second form is to match the bucket partially.  We try to calculate
+ * dividers for both of the boundaries.  If the address family of the boundary
+ * does not match the constant or comparison of the length of the network
+ * parts is not true by the operator, the divider for the boundary would not
+ * be taken into account.  If both of the dividers can be calculated the greater
+ * one will be used to minimize the mistake in the buckets which have
+ * disparate masklens.
+ *
+ * The divider on the partial bucket match is imagined as the distance
+ * between the decisive bits and the common bits of the addresses.  It will
+ * be used as a power of two as it is the natural scale for the IP network
+ * inclusion.  The partial bucket match divider calculation is an empirical
+ * formula and subject to change with more experiment.
+ *
+ * For partial match with buckets which have different address families
+ * on the left and right sides only the boundary with the same address
+ * family is taken into consideration.  This can cause more mistakes for these
+ * buckets if the masklens of their boundaries are also disparate.  It can
+ * only be the case for one bucket, if there are addresses with different
+ * families on the column.  It seems as a better option than not considering
+ * these buckets.
+ */
+static Selectivity
+inet_his_inclusion_selec(Datum *values, int nvalues, Datum *constvalue,
+						 short int opr_order)
+{
+	inet	   *query,
+			   *left,
+			   *right;
+	float		match = 0.0;
+	int			i;
+	short int	left_order,
+				right_order,
+				left_divider,
+				right_divider;
+
+	query = DatumGetInetP(*constvalue);
+	left = DatumGetInetP(values[0]);
+	left_order = inet_inclusion_cmp(left, query, opr_order);
+
+	for (i = 1; i < nvalues; i++)
+	{
+		right = DatumGetInetP(values[i]);
+		right_order = inet_inclusion_cmp(right, query, opr_order);
+
+		if (left_order == 0 && right_order == 0)
+		{
+			/* Full bucket match. */
+
+			match += 1.0;
+		}
+		else if ((left_order <= 0 && right_order > 0) ||
+				 (left_order >= 0 && right_order < 0) ||
+				 (right_order == 0 && i == nvalues - 1))
+		{
+			/* Partial bucket match. */
+
+			left_divider = inet_his_match_divider(left, query, opr_order);
+			right_divider = inet_his_match_divider(right, query, opr_order);
+
+			if (left_divider >= 0 || right_divider >= 0)
+				match += 1.0 / pow(2, Max(left_divider, right_divider));
+		}
+
+		/* Shift the variables. */
+		left = right;
+		left_order = right_order;
+	}
+
+	/* There are nvalues - 1 buckets. */
+	return match / (nvalues - 1);
+}
+
+/*
+ * Inet MCV join selectivity estimation
+ *
+ * The original function of the operator is used in this function, like
+ * mcv_selectivity() in selfuncs.c.  Actually, this function has nothing
+ * to do with the network data types except its name and location.
+ *
+ * The function result is the selectivity, and the fraction of the total
+ * population of the left hand side MCV is returned into *max_selec_p.
+ */
+static Selectivity
+inet_mcv_join_selec(Datum *values1, float4 *numbers1, int nvalues1,
+					Datum *values2, float4 *numbers2, int nvalues2,
+					Oid operator, Selectivity *max_selec_p)
+{
+	Selectivity selec = 0.0,
+				max_selec = 0.0;
+	FmgrInfo	proc;
+	int			i,
+				j;
+
+	fmgr_info(get_opcode(operator), &proc);
+
+	for (i = 0; i < nvalues1; i++)
+	{
+		for (j = 0; j < nvalues2; j++)
+			if (DatumGetBool(FunctionCall2Coll(&proc, DEFAULT_COLLATION_OID,
+											   values1[i], values2[j])))
+				selec += numbers1[i] * numbers2[j];
+
+		max_selec += numbers1[i];
+	}
+
+	*max_selec_p = max_selec;
+	return selec;
+}
+
+/*
+ * Inet MCV vs histogram inclusion join selectivity estimation
+ *
+ * The function result is the selectivity, and the fraction of the total
+ * population of the MCV is returned into *max_selec_p.
+ */
+static Selectivity
+inet_mcv_his_selec(Datum *mcv_values, float4 *mcv_numbers, int mcv_nvalues,
+				   Datum *his_values, int his_nvalues, short int opr_order,
+				   Selectivity *max_selec_p)
+{
+	Selectivity selec = 0.0,
+				max_selec = 0.0;
+	int			i;
+
+	for (i = 0; i < mcv_nvalues; i++)
+	{
+		selec += mcv_numbers[i] *
+			inet_his_inclusion_selec(his_values, his_nvalues, &mcv_values[i],
+									 opr_order);
+
+		max_selec += mcv_numbers[i];
+	}
+
+	*max_selec_p = max_selec;
+	return selec;
+}
+
+/*
+ * Inet histogram inclusion join selectivity estimation
+ *
+ * Values from the first histogram will be matched with the second.  The first
+ * and the last element of the first histogram will not be used, on
+ * the grounds that they are outliers and hence not very representative.
+ */
+static Selectivity
+inet_his_inclusion_join_selec(Datum *his1_values, int his1_nvalues,
+							  Datum *his2_values, int his2_nvalues,
+							  short int opr_order)
+{
+	float		match = 0.0;
+	int			i;
+
+	for (i = 1; i < his1_nvalues - 1; i++)
+		match += inet_his_inclusion_selec(his2_values, his2_nvalues,
+										  &his1_values[i], opr_order);
+
+	/* his1_nvalues - 2 values should be checked. */
+	return match / (his1_nvalues - 2);
+}
+
+/*
+ * Comparison function for the subnet inclusion operators
+ *
+ * Comparison is compatible with the basic comparison function for the inet
+ * type.  See network_cmp_internal on network.c for the original.  Basic
+ * comparison operators are implemented with the network_cmp_internal
+ * function.  It is possible to implement the subnet inclusion operators with
+ * this function.
+ *
+ * Comparison is first on the common bits of the network part, then on
+ * the length of the network part (masklen) as the network_cmp_internal
+ * function.  Only the first part is on this function.  The second part is
+ * separated to another function for reusability.  The difference between
+ * the second part and the original network_cmp_internal is that the operator
+ * is used while comparing the lengths of the network parts.  See the second
+ * part on the inet_masklen_inclusion_cmp function below.
+ */
+static short int
+inet_inclusion_cmp(inet *left, inet *right, short int opr_order)
+{
+	if (ip_family(left) == ip_family(right))
+	{
+		short int	order;
+
+		order = bitncmp(ip_addr(left), ip_addr(right),
+						Min(ip_bits(left), ip_bits(right)));
+
+		if (order != 0)
+			return order;
+
+		return inet_masklen_inclusion_cmp(left, right, opr_order);
+	}
+
+	return ip_family(left) - ip_family(right);
+}
+
+/*
+ * Masklen comparison function for the subnet inclusion operators
+ *
+ * Compares the lengths of network parts of the inputs using the operator.
+ * If the comparison is okay for the operator the return value will be 0.
+ * Otherwise the return value will be less than or greater than 0 with
+ * respect to the operator.
+ */
+static short int
+inet_masklen_inclusion_cmp(inet *left, inet *right, short int opr_order)
+{
+	if (ip_family(left) == ip_family(right))
+	{
+		short int	order;
+
+		order = ip_bits(left) - ip_bits(right);
+
+		if ((order > 0 && opr_order >= 0) ||
+			(order == 0 && opr_order >= -1 && opr_order <= 1) ||
+			(order < 0 && opr_order <= 0))
+			return 0;
+
+		return opr_order;
+	}
+
+	return ip_family(left) - ip_family(right);
+}
+
+/*
+ * Inet histogram partial match divider calculation
+ *
+ * First the families and the lengths of the network parts are compared
+ * using the subnet inclusion operator.  The divider will be calculated
+ * using the masklens and the common bits of the addresses.  -1 will be
+ * returned if it cannot be calculated.
+ */
+static short int
+inet_his_match_divider(inet *boundary, inet *query, short int opr_order)
+{
+	if (inet_masklen_inclusion_cmp(boundary, query, opr_order) == 0)
+	{
+		short int	min_bits,
+					decisive_bits;
+
+		min_bits = Min(ip_bits(boundary), ip_bits(query));
+
+		/*
+		 * Set the decisive bits from the one which should contain the other
+		 * according to the operator.
+		 */
+		if (opr_order < 0)
+			decisive_bits = ip_bits(boundary);
+		else if (opr_order > 0)
+			decisive_bits = ip_bits(query);
+		else
+			decisive_bits = min_bits;
+
+		if (min_bits > 0)
+			return decisive_bits - bitncommon(ip_addr(boundary), ip_addr(query),
+											  min_bits);
+		return decisive_bits;
+	}
+
+	return -1;
 }
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index e932ccf..8dcda93 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -149,21 +149,21 @@ static double var_eq_const(VariableStatData *vardata, Oid operator,
 			 bool varonleft);
 static double var_eq_non_const(VariableStatData *vardata, Oid operator,
 				 Node *other,
 				 bool varonleft);
 static double ineq_histogram_selectivity(PlannerInfo *root,
 						   VariableStatData *vardata,
 						   FmgrInfo *opproc, bool isgt,
 						   Datum constval, Oid consttype);
 static double eqjoinsel_inner(Oid operator,
 				VariableStatData *vardata1, VariableStatData *vardata2);
-static double eqjoinsel_semi(Oid operator,
+double eqjoinsel_semi(Oid operator,
 			   VariableStatData *vardata1, VariableStatData *vardata2,
 			   RelOptInfo *inner_rel);
 static bool convert_to_scalar(Datum value, Oid valuetypid, double *scaledvalue,
 				  Datum lobound, Datum hibound, Oid boundstypid,
 				  double *scaledlobound, double *scaledhibound);
 static double convert_numeric_to_scalar(Datum value, Oid typid);
 static void convert_string_to_scalar(char *value,
 						 double *scaledvalue,
 						 char *lobound,
 						 double *scaledlobound,
@@ -182,21 +182,21 @@ static double convert_one_bytea_to_scalar(unsigned char *value, int valuelen,
 static char *convert_string_datum(Datum value, Oid typid);
 static double convert_timevalue_to_scalar(Datum value, Oid typid);
 static void examine_simple_variable(PlannerInfo *root, Var *var,
 						VariableStatData *vardata);
 static bool get_variable_range(PlannerInfo *root, VariableStatData *vardata,
 				   Oid sortop, Datum *min, Datum *max);
 static bool get_actual_variable_range(PlannerInfo *root,
 						  VariableStatData *vardata,
 						  Oid sortop,
 						  Datum *min, Datum *max);
-static RelOptInfo *find_join_input_rel(PlannerInfo *root, Relids relids);
+RelOptInfo *find_join_input_rel(PlannerInfo *root, Relids relids);
 static Selectivity prefix_selectivity(PlannerInfo *root,
 				   VariableStatData *vardata,
 				   Oid vartype, Oid opfamily, Const *prefixcon);
 static Selectivity like_selectivity(const char *patt, int pattlen,
 				 bool case_insensitive);
 static Selectivity regex_selectivity(const char *patt, int pattlen,
 				  bool case_insensitive,
 				  int fixed_prefix_len);
 static Datum string_to_datum(const char *str, Oid datatype);
 static Const *string_to_const(const char *str, Oid datatype);
@@ -2418,21 +2418,21 @@ eqjoinsel_inner(Oid operator,
 
 	return selec;
 }
 
 /*
  * eqjoinsel_semi --- eqjoinsel for semi join
  *
  * (Also used for anti join, which we are supposed to estimate the same way.)
  * Caller has ensured that vardata1 is the LHS variable.
  */
-static double
+double
 eqjoinsel_semi(Oid operator,
 			   VariableStatData *vardata1, VariableStatData *vardata2,
 			   RelOptInfo *inner_rel)
 {
 	double		selec;
 	double		nd1;
 	double		nd2;
 	bool		isdefault1;
 	bool		isdefault2;
 	Form_pg_statistic stats1 = NULL;
@@ -5094,21 +5094,21 @@ get_actual_variable_range(PlannerInfo *root, VariableStatData *vardata,
 	return have_data;
 }
 
 /*
  * find_join_input_rel
  *		Look up the input relation for a join.
  *
  * We assume that the input relation's RelOptInfo must have been constructed
  * already.
  */
-static RelOptInfo *
+RelOptInfo *
 find_join_input_rel(PlannerInfo *root, Relids relids)
 {
 	RelOptInfo *rel = NULL;
 
 	switch (bms_membership(relids))
 	{
 		case BMS_EMPTY_SET:
 			/* should not happen */
 			break;
 		case BMS_SINGLETON:
diff --git a/src/include/catalog/pg_operator.h b/src/include/catalog/pg_operator.h
index d7dcd1c..3b827fc 100644
--- a/src/include/catalog/pg_operator.h
+++ b/src/include/catalog/pg_operator.h
@@ -1135,32 +1135,33 @@ DESCR("not equal");
 DATA(insert OID = 1203 (  "<"	   PGNSP PGUID b f f 869 869	 16 1205 1206 network_lt scalarltsel scalarltjoinsel ));
 DESCR("less than");
 DATA(insert OID = 1204 (  "<="	   PGNSP PGUID b f f 869 869	 16 1206 1205 network_le scalarltsel scalarltjoinsel ));
 DESCR("less than or equal");
 DATA(insert OID = 1205 (  ">"	   PGNSP PGUID b f f 869 869	 16 1203 1204 network_gt scalargtsel scalargtjoinsel ));
 DESCR("greater than");
 DATA(insert OID = 1206 (  ">="	   PGNSP PGUID b f f 869 869	 16 1204 1203 network_ge scalargtsel scalargtjoinsel ));
 DESCR("greater than or equal");
 DATA(insert OID = 931  (  "<<"	   PGNSP PGUID b f f 869 869	 16 933		0 network_sub networksel networkjoinsel ));
 DESCR("is subnet");
-#define OID_INET_SUB_OP				  931
+#define OID_INET_SUB_OP			931
 DATA(insert OID = 932  (  "<<="    PGNSP PGUID b f f 869 869	 16 934		0 network_subeq networksel networkjoinsel ));
 DESCR("is subnet or equal");
-#define OID_INET_SUBEQ_OP				932
+#define OID_INET_SUBEQ_OP		932
 DATA(insert OID = 933  (  ">>"	   PGNSP PGUID b f f 869 869	 16 931		0 network_sup networksel networkjoinsel ));
 DESCR("is supernet");
-#define OID_INET_SUP_OP				  933
+#define OID_INET_SUP_OP			933
 DATA(insert OID = 934  (  ">>="    PGNSP PGUID b f f 869 869	 16 932		0 network_supeq networksel networkjoinsel ));
 DESCR("is supernet or equal");
-#define OID_INET_SUPEQ_OP				934
+#define OID_INET_SUPEQ_OP		934
 DATA(insert OID = 3552	(  "&&"    PGNSP PGUID b f f 869 869	 16 3552	0 network_overlap networksel networkjoinsel ));
 DESCR("overlaps (is subnet or supernet)");
+#define OID_INET_OVERLAP_OP		3552
 
 DATA(insert OID = 2634 (  "~"	   PGNSP PGUID l f f	  0 869 869 0 0 inetnot - - ));
 DESCR("bitwise not");
 DATA(insert OID = 2635 (  "&"	   PGNSP PGUID b f f	869 869 869 0 0 inetand - - ));
 DESCR("bitwise and");
 DATA(insert OID = 2636 (  "|"	   PGNSP PGUID b f f	869 869 869 0 0 inetor - - ));
 DESCR("bitwise or");
 DATA(insert OID = 2637 (  "+"	   PGNSP PGUID b f f	869  20 869 2638 0 inetpl - - ));
 DESCR("add");
 DATA(insert OID = 2638 (  "+"	   PGNSP PGUID b f f	 20 869 869 2637 0 int8pl_inet - - ));
#14Emre Hasegeli
emre@hasegeli.com
In reply to: Tom Lane (#10)
Re: Selectivity estimation for inet operators

> What you did there is utterly unacceptable from a modularity standpoint;
> and considering that the values will be nowhere near right, the argument
> that "it's better than returning a constant" seems pretty weak. I think
> you should just take that out again.

I will try to come up with a better, data type specific implementation
in a week.

#15Emre Hasegeli
emre@hasegeli.com
In reply to: Emre Hasegeli (#14)
1 attachment(s)
Re: Selectivity estimation for inet operators

> I updated the patch to cover semi and anti joins with eqjoinsel_semi().
> I think it is better than returning a constant.
>
> What you did there is utterly unacceptable from a modularity standpoint;
> and considering that the values will be nowhere near right, the argument
> that "it's better than returning a constant" seems pretty weak. I think
> you should just take that out again.
>
> I will try to come up with a better, data type specific implementation
> in a week.

New version with semi join estimation function attached.
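
As a rough illustration of how the semi join estimate combines its pieces,
here is a stripped-down sketch.  It is not the patch code:
semi_join_value_selec() and semi_join_selec() are hypothetical names, the
MCV match and the histogram selectivity are passed in as plain numbers
instead of being computed from inet values, and subtracting the left side's
MCV fraction from the histogram share is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for the per-value estimate: the chance that one
 * left-hand value finds at least one match on the right-hand side.
 * his_weight is the number of right-hand rows the histogram represents,
 * his_selec is the histogram selectivity for this value.
 */
static double
semi_join_value_selec(bool matches_mcv, double his_weight, double his_selec)
{
	double		expected;

	assert(his_selec >= 0.0 && his_selec <= 1.0);

	/* A single matching MCV on the right side is enough for a semi join. */
	if (matches_mcv)
		return 1.0;

	/* Ignore a histogram whose weight is not above 0.1, or with no match. */
	if (his_weight <= 0.1 || his_selec <= 0.0)
		return 0.0;

	/* Expected number of matching rows, capped at certainty. */
	expected = his_weight * his_selec;
	return expected < 1.0 ? expected : 1.0;
}

/*
 * Hypothetical aggregation over the left-hand side: MCVs are weighted by
 * their frequencies, the remaining non-null fraction is spread evenly over
 * the histogram values that were checked.
 */
static double
semi_join_selec(const double *mcv1_freqs, const double *mcv1_selecs,
				int mcv1_n, const double *his1_selecs, int his1_n,
				double nullfrac1)
{
	double		selec = 0.0,
				mcv1_total = 0.0,
				his1_sum = 0.0;
	int			i;

	for (i = 0; i < mcv1_n; i++)
	{
		selec += mcv1_freqs[i] * mcv1_selecs[i];
		mcv1_total += mcv1_freqs[i];
	}

	for (i = 0; i < his1_n; i++)
		his1_sum += his1_selecs[i];

	if (his1_n > 0)
		selec += (1.0 - nullfrac1 - mcv1_total) * his1_sum / his1_n;

	return selec;
}
```

For example, with two left-hand MCVs of frequency 0.2 and 0.1 where only
the first one matches, a null fraction of 0.1, and two histogram values
with per-value estimates 0.5 and 0.25, the sketch gives
0.2 + 0.6 * 0.375 = 0.425.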

Attachments:

inet-selfuncs-v11.patch (text/x-diff; charset=utf-8)
diff --git a/src/backend/utils/adt/network_selfuncs.c b/src/backend/utils/adt/network_selfuncs.c
index d0d806f..f6cb2f6 100644
--- a/src/backend/utils/adt/network_selfuncs.c
+++ b/src/backend/utils/adt/network_selfuncs.c
@@ -1,32 +1,819 @@
 /*-------------------------------------------------------------------------
  *
  * network_selfuncs.c
  *	  Functions for selectivity estimation of inet/cidr operators
  *
- * Currently these are just stubs, but we hope to do better soon.
+ * Estimates are based on null fraction, distinct value count, most common
+ * values, and histogram of inet/cidr datatypes.
  *
  * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
  *
  *
  * IDENTIFICATION
  *	  src/backend/utils/adt/network_selfuncs.c
  *
  *-------------------------------------------------------------------------
  */
 #include "postgres.h"
 
+#include <math.h>
+
+#include "access/htup_details.h"
+#include "catalog/pg_collation.h"
+#include "catalog/pg_operator.h"
+#include "catalog/pg_statistic.h"
+#include "utils/lsyscache.h"
 #include "utils/inet.h"
+#include "utils/selfuncs.h"
+
+
+/* Default selectivity constant for the inet overlap operator */
+#define DEFAULT_OVERLAP_SEL 0.01
+
+/* Default selectivity constant for the other operators */
+#define DEFAULT_INCLUSION_SEL 0.005
+
+/* Default selectivity for given operator */
+#define DEFAULT_SEL(operator) \
+	((operator) == OID_INET_OVERLAP_OP ? \
+			DEFAULT_OVERLAP_SEL : DEFAULT_INCLUSION_SEL)
 
+static Selectivity networkjoinsel_inner(Oid operator,
+					VariableStatData *vardata1, VariableStatData *vardata2);
+static Selectivity networkjoinsel_semi(Oid operator,
+					VariableStatData *vardata1, VariableStatData *vardata2);
+static short int inet_opr_order(Oid operator);
+static Selectivity inet_his_inclusion_selec(Datum *values, int nvalues,
+					Datum *constvalue, short int opr_order);
+static Selectivity inet_mcv_join_selec(Datum *mcv1_values,
+					float4 *mcv1_numbers, int mcv1_nvalues, Datum *mcv2_values,
+					float4 *mcv2_numbers, int mcv2_nvalues, Oid operator,
+					Selectivity *max_selec_p);
+static Selectivity inet_mcv_his_selec(Datum *mcv_values, float4 *mcv_numbers,
+					int mcv_nvalues, Datum *his_values, int his_nvalues,
+					short int opr_order, Selectivity *max_selec_p);
+static Selectivity inet_his_inclusion_join_selec(Datum *his1_values,
+					int his1_nvalues, Datum *his2_values, int his2_nvalues,
+					short int opr_order);
+static Selectivity inet_semi_join_selec(bool mcv2_exists, Datum *mcv2_values,
+					int mcv2_nvalues, bool his2_exists, Datum *his2_values,
+					int his2_nvalues, double his2_weight, Datum *constvalue,
+					FmgrInfo *proc, short int opr_order);
+static short int inet_inclusion_cmp(inet *left, inet *right,
+					short int opr_order);
+static short int inet_masklen_inclusion_cmp(inet *left, inet *right,
+					short int opr_order);
+static short int inet_his_match_divider(inet *boundary, inet *query,
+					short int opr_order);
 
+/*
+ * Selectivity estimation for the subnet inclusion operators
+ */
 Datum
 networksel(PG_FUNCTION_ARGS)
 {
-	PG_RETURN_FLOAT8(0.001);
+	PlannerInfo *root = (PlannerInfo *) PG_GETARG_POINTER(0);
+	Oid			operator = PG_GETARG_OID(1);
+	List	   *args = (List *) PG_GETARG_POINTER(2);
+	int			varRelid = PG_GETARG_INT32(3),
+				his_nvalues;
+	VariableStatData vardata;
+	Node	   *other;
+	bool		varonleft;
+	Selectivity selec,
+				max_mcv_selec;
+	Datum		constvalue,
+			   *his_values;
+	Form_pg_statistic stats;
+	double		nullfrac;
+	FmgrInfo	proc;
+
+	/*
+	 * If expression is not (variable op something) or (something op
+	 * variable), then punt and return a default estimate.
+	 */
+	if (!get_restriction_variable(root, args, varRelid,
+								  &vardata, &other, &varonleft))
+		PG_RETURN_FLOAT8(DEFAULT_SEL(operator));
+
+	/*
+	 * Can't do anything useful if the something is not a constant, either.
+	 */
+	if (!IsA(other, Const))
+	{
+		ReleaseVariableStats(vardata);
+		PG_RETURN_FLOAT8(DEFAULT_SEL(operator));
+	}
+
+	/* All of the subnet inclusion operators are strict. */
+	if (((Const *) other)->constisnull)
+	{
+		ReleaseVariableStats(vardata);
+		PG_RETURN_FLOAT8(0.0);
+	}
+
+	if (!HeapTupleIsValid(vardata.statsTuple))
+	{
+		ReleaseVariableStats(vardata);
+		PG_RETURN_FLOAT8(DEFAULT_SEL(operator));
+	}
+
+	constvalue = ((Const *) other)->constvalue;
+	stats = (Form_pg_statistic) GETSTRUCT(vardata.statsTuple);
+	nullfrac = stats ? stats->stanullfrac : 0.0;
+
+	fmgr_info(get_opcode(operator), &proc);
+	selec = mcv_selectivity(&vardata, &proc, constvalue, varonleft,
+							&max_mcv_selec);
+
+	if (get_attstatsslot(vardata.statsTuple,
+						 vardata.atttype, vardata.atttypmod,
+						 STATISTIC_KIND_HISTOGRAM, InvalidOid,
+						 NULL,
+						 &his_values, &his_nvalues,
+						 NULL, NULL))
+	{
+		selec += (1.0 - nullfrac - max_mcv_selec) *
+			inet_his_inclusion_selec(his_values, his_nvalues, &constvalue,
+									 varonleft ? inet_opr_order(operator) :
+												 inet_opr_order(operator) * -1);
+
+		free_attstatsslot(vardata.atttype, his_values, his_nvalues, NULL, 0);
+	}
+	else if (max_mcv_selec == 0.0)
+		selec = (1.0 - nullfrac) * DEFAULT_SEL(operator);
+
+	/* Result should be in range, but make sure... */
+	CLAMP_PROBABILITY(selec);
+
+	ReleaseVariableStats(vardata);
+	PG_RETURN_FLOAT8(selec);
 }
 
+/*
+ * Join selectivity estimation for the subnet inclusion operators
+ *
+ * This function has the same structure as eqjoinsel() in selfuncs.c.
+ */
 Datum
 networkjoinsel(PG_FUNCTION_ARGS)
 {
-	PG_RETURN_FLOAT8(0.001);
+	PlannerInfo *root = (PlannerInfo *) PG_GETARG_POINTER(0);
+	Oid			operator = PG_GETARG_OID(1);
+	List	   *args = (List *) PG_GETARG_POINTER(2);
+#ifdef NOT_USED
+	JoinType	jointype = (JoinType) PG_GETARG_INT16(3);
+#endif
+	SpecialJoinInfo *sjinfo = (SpecialJoinInfo *) PG_GETARG_POINTER(4);
+	double		selec;
+	VariableStatData vardata1;
+	VariableStatData vardata2;
+	bool		join_is_reversed;
+
+	get_join_variables(root, args, sjinfo,
+					   &vardata1, &vardata2, &join_is_reversed);
+
+	switch (sjinfo->jointype)
+	{
+		case JOIN_INNER:
+		case JOIN_LEFT:
+		case JOIN_FULL:
+
+			/*
+			 * Selectivity for left join is not exactly the same as for inner
+			 * join, but the difference is neglected.
+			 */
+			if (!join_is_reversed)
+				selec = networkjoinsel_inner(operator, &vardata1, &vardata2);
+			else
+				selec = networkjoinsel_inner(get_commutator(operator),
+											 &vardata2, &vardata1);
+			break;
+		case JOIN_SEMI:
+		case JOIN_ANTI:
+
+			if (!join_is_reversed)
+				selec = networkjoinsel_semi(operator, &vardata1, &vardata2);
+			else
+				selec = networkjoinsel_semi(get_commutator(operator),
+											&vardata2, &vardata1);
+			break;
+		default:
+			/* other values not expected here */
+			elog(ERROR, "unrecognized join type: %d",
+				 (int) sjinfo->jointype);
+			selec = 0;			/* keep compiler quiet */
+			break;
+	}
+
+	ReleaseVariableStats(vardata1);
+	ReleaseVariableStats(vardata2);
+
+	CLAMP_PROBABILITY(selec);
+
+	PG_RETURN_FLOAT8((float8) selec);
+}
+
+/*
+ * Inner join selectivity estimation for the subnet inclusion operators
+ *
+ * Calculates MCV vs MCV, MCV vs histogram and histogram vs histogram
+ * selectivity for join using the subnet inclusion operators.  Unlike the
+ * join selectivity function for the equality operator, eqjoinsel_inner(),
+ * one to one matching of the values is not enough.  Network inclusion
+ * operators are likely to match many to many, so the MCV and histogram
+ * lists have to be looped over to the end.  Also, MCV vs histogram
+ * selectivity is not neglected as it is in eqjoinsel_inner().
+ */
+static Selectivity
+networkjoinsel_inner(Oid operator,
+					 VariableStatData *vardata1, VariableStatData *vardata2)
+{
+	Form_pg_statistic stats;
+	double		nullfrac1 = 0.0,
+				nullfrac2 = 0.0;
+	Selectivity selec = 0.0,
+				mcv1_max_selec = 0.0,
+				mcv2_max_selec = 0.0;
+	bool		mcv1_exists = false,
+				mcv2_exists = false,
+				his1_exists = false,
+				his2_exists = false;
+	short int	opr_order;
+	int			mcv1_nvalues,
+				mcv2_nvalues,
+				mcv1_nnumbers,
+				mcv2_nnumbers,
+				his1_nvalues,
+				his2_nvalues;
+	Datum	   *mcv1_values,
+			   *mcv2_values,
+			   *his1_values,
+			   *his2_values;
+	float4	   *mcv1_numbers,
+			   *mcv2_numbers;
+
+	if (HeapTupleIsValid(vardata1->statsTuple))
+	{
+		if ((stats = (Form_pg_statistic) GETSTRUCT(vardata1->statsTuple)))
+			nullfrac1 = stats->stanullfrac;
+
+		mcv1_exists = get_attstatsslot(vardata1->statsTuple,
+									   vardata1->atttype, vardata1->atttypmod,
+									   STATISTIC_KIND_MCV, InvalidOid,
+									   NULL,
+									   &mcv1_values, &mcv1_nvalues,
+									   &mcv1_numbers, &mcv1_nnumbers);
+		his1_exists = get_attstatsslot(vardata1->statsTuple,
+									   vardata1->atttype, vardata1->atttypmod,
+									   STATISTIC_KIND_HISTOGRAM, InvalidOid,
+									   NULL,
+									   &his1_values, &his1_nvalues,
+									   NULL, NULL);
+	}
+
+	if (HeapTupleIsValid(vardata2->statsTuple))
+	{
+		if ((stats = (Form_pg_statistic) GETSTRUCT(vardata2->statsTuple)))
+			nullfrac2 = stats->stanullfrac;
+
+		mcv2_exists = get_attstatsslot(vardata2->statsTuple,
+									   vardata2->atttype, vardata2->atttypmod,
+									   STATISTIC_KIND_MCV, InvalidOid,
+									   NULL,
+									   &mcv2_values, &mcv2_nvalues,
+									   &mcv2_numbers, &mcv2_nnumbers);
+		his2_exists = get_attstatsslot(vardata2->statsTuple,
+									   vardata2->atttype, vardata2->atttypmod,
+									   STATISTIC_KIND_HISTOGRAM, InvalidOid,
+									   NULL,
+									   &his2_values, &his2_nvalues,
+									   NULL, NULL);
+	}
+
+	opr_order = inet_opr_order(operator);
+
+	if (mcv1_exists && mcv2_exists)
+		selec += inet_mcv_join_selec(mcv1_values, mcv1_numbers, mcv1_nvalues,
+									 mcv2_values, mcv2_numbers, mcv2_nvalues,
+									 operator, &mcv1_max_selec);
+	if (mcv2_exists && his1_exists)
+		selec += (1.0 - nullfrac1 - mcv1_max_selec) *
+			inet_mcv_his_selec(mcv2_values, mcv2_numbers, mcv2_nvalues,
+							   his1_values, his1_nvalues, opr_order,
+							   &mcv2_max_selec);
+	if (mcv1_exists && his2_exists)
+		selec += (1.0 - nullfrac2 - mcv2_max_selec) *
+			inet_mcv_his_selec(mcv1_values, mcv1_numbers, mcv1_nvalues,
+							   his2_values, his2_nvalues, opr_order,
+							   &mcv1_max_selec);
+	if (his1_exists && his2_exists)
+		selec += (1.0 - nullfrac1 - mcv1_max_selec) *
+			(1.0 - nullfrac2 - mcv2_max_selec) *
+			inet_his_inclusion_join_selec(his1_values, his1_nvalues,
+										  his2_values, his2_nvalues,
+										  opr_order);
+
+	if ((!mcv1_exists && !his1_exists) || (!mcv2_exists && !his2_exists))
+		selec = (1.0 - nullfrac1) * (1.0 - nullfrac2) * DEFAULT_SEL(operator);
+
+	if (mcv1_exists)
+		free_attstatsslot(vardata1->atttype, mcv1_values, mcv1_nvalues,
+						  mcv1_numbers, mcv1_nnumbers);
+	if (mcv2_exists)
+		free_attstatsslot(vardata2->atttype, mcv2_values, mcv2_nvalues,
+						  mcv2_numbers, mcv2_nnumbers);
+	if (his1_exists)
+		free_attstatsslot(vardata1->atttype, his1_values, his1_nvalues,
+						  NULL, 0);
+	if (his2_exists)
+		free_attstatsslot(vardata2->atttype, his2_values, his2_nvalues,
+						  NULL, 0);
+
+	return selec;
+}
+
+/*
+ * Semi join selectivity estimation for the subnet inclusion operators
+ *
+ * Calculates MCV vs MCV, MCV vs histogram, histogram vs MCV, and histogram
+ * vs histogram selectivity for semi join using the subnet inclusion
+ * operators.
+ */
+static Selectivity
+networkjoinsel_semi(Oid operator,
+					VariableStatData *vardata1, VariableStatData *vardata2)
+{
+	Form_pg_statistic stats;
+	Selectivity selec = 0.0,
+				mcv1_max_selec = 0.0,
+				mcv2_max_selec = 0.0;
+	double		nullfrac1 = 0.0,
+				nullfrac2 = 0.0,
+				his2_weight = 0.0;
+	bool		mcv1_exists = false,
+				mcv2_exists = false,
+				his1_exists = false,
+				his2_exists = false;
+	short int	opr_order = 0;
+	FmgrInfo	proc;
+	int			i,
+				mcv1_nvalues,
+				mcv2_nvalues,
+				mcv1_nnumbers,
+				mcv2_nnumbers,
+				his1_nvalues,
+				his2_nvalues;
+	Datum	   *mcv1_values,
+			   *mcv2_values,
+			   *his1_values,
+			   *his2_values;
+	float4	   *mcv1_numbers,
+			   *mcv2_numbers;
+
+	if (HeapTupleIsValid(vardata1->statsTuple))
+	{
+		if ((stats = (Form_pg_statistic) GETSTRUCT(vardata1->statsTuple)))
+			nullfrac1 = stats->stanullfrac;
+
+		mcv1_exists = get_attstatsslot(vardata1->statsTuple,
+									   vardata1->atttype, vardata1->atttypmod,
+									   STATISTIC_KIND_MCV, InvalidOid,
+									   NULL,
+									   &mcv1_values, &mcv1_nvalues,
+									   &mcv1_numbers, &mcv1_nnumbers);
+		his1_exists = get_attstatsslot(vardata1->statsTuple,
+									   vardata1->atttype, vardata1->atttypmod,
+									   STATISTIC_KIND_HISTOGRAM, InvalidOid,
+									   NULL,
+									   &his1_values, &his1_nvalues,
+									   NULL, NULL);
+	}
+
+	if (HeapTupleIsValid(vardata2->statsTuple))
+	{
+
+		if ((stats = (Form_pg_statistic) GETSTRUCT(vardata2->statsTuple)))
+			nullfrac2 = stats->stanullfrac;
+
+		mcv2_exists = get_attstatsslot(vardata2->statsTuple,
+									   vardata2->atttype, vardata2->atttypmod,
+									   STATISTIC_KIND_MCV, InvalidOid,
+									   NULL,
+									   &mcv2_values, &mcv2_nvalues,
+									   &mcv2_numbers, &mcv2_nnumbers);
+		his2_exists = get_attstatsslot(vardata2->statsTuple,
+									   vardata2->atttype, vardata2->atttypmod,
+									   STATISTIC_KIND_HISTOGRAM, InvalidOid,
+									   NULL,
+									   &his2_values, &his2_nvalues,
+									   NULL, NULL);
+	}
+
+	if (mcv2_exists)
+	{
+		fmgr_info(get_opcode(operator), &proc);
+
+		for (i = 0; i < mcv2_nvalues; i++)
+			 mcv2_max_selec += mcv2_numbers[i];
+	}
+
+	if (his2_exists)
+	{
+		his2_weight = (1.0 - nullfrac2 - mcv2_max_selec) * vardata2->rel->rows;
+
+		/* Match the histogram on the right hand side using the commutator. */
+		opr_order = inet_opr_order(operator) * -1;
+	}
+
+	if (mcv1_exists && (mcv2_exists || his2_exists))
+		for (i = 0; i < mcv1_nvalues; i++)
+			selec += mcv1_numbers[i] *
+				inet_semi_join_selec(mcv2_exists, mcv2_values, mcv2_nvalues,
+									 his2_exists, his2_values, his2_nvalues,
+									 his2_weight, &mcv1_values[i],
+									 &proc, opr_order);
+
+	if (his1_exists && (mcv2_exists || his2_exists))
+	{
+		double		his1_total_selec = 0.0;
+
+		/*
+		 * The first and the last element of the first histogram will not be
+		 * used, on the grounds that they are outliers and hence not very
+		 * representative.
+		 */
+		for (i = 1; i < his1_nvalues - 1; i++)
+			his1_total_selec +=
+				inet_semi_join_selec(mcv2_exists, mcv2_values, mcv2_nvalues,
+									 his2_exists, his2_values, his2_nvalues,
+									 his2_weight, &his1_values[i],
+									 &proc, opr_order);
+		selec += ((1.0 - nullfrac1 - mcv1_max_selec) / (his1_nvalues - 2)) *
+			his1_total_selec;
+	}
+
+	if ((!mcv1_exists && !his1_exists) || (!mcv2_exists && !his2_exists))
+		selec = (1.0 - nullfrac1) * (1.0 - nullfrac2) * DEFAULT_SEL(operator);
+
+	if (mcv1_exists)
+		free_attstatsslot(vardata1->atttype, mcv1_values, mcv1_nvalues,
+						  mcv1_numbers, mcv1_nnumbers);
+	if (mcv2_exists)
+		free_attstatsslot(vardata2->atttype, mcv2_values, mcv2_nvalues,
+						  mcv2_numbers, mcv2_nnumbers);
+	if (his1_exists)
+		free_attstatsslot(vardata1->atttype, his1_values, his1_nvalues,
+						  NULL, 0);
+	if (his2_exists)
+		free_attstatsslot(vardata2->atttype, his2_values, his2_nvalues,
+						  NULL, 0);
+
+	return selec;
+}
+
+/*
+ * Practical comparable numbers for the subnet inclusion operators
+ */
+static short int
+inet_opr_order(Oid operator)
+{
+	switch (operator)
+	{
+		case OID_INET_SUP_OP:
+			return -2;
+		case OID_INET_SUPEQ_OP:
+			return -1;
+		case OID_INET_OVERLAP_OP:
+			return 0;
+		case OID_INET_SUBEQ_OP:
+			return 1;
+		case OID_INET_SUB_OP:
+			return 2;
+		default:
+			elog(ERROR, "unknown operator for inet inclusion selectivity");
+	}
+}
+
+/*
+ * Inet histogram inclusion selectivity estimation
+ *
+ * Calculates histogram selectivity for the subnet inclusion operators of
+ * the inet type.  The return value is between 0 and 1.  It should be
+ * corrected with the MCV selectivity and null fraction.  If the constant
+ * is less than the first element or greater than the last element of
+ * the histogram the return value will be 0.
+ *
+ * The histogram is originally for the basic comparison operators.  Only
+ * the common bits of the network part and the length of the network part
+ * (masklen) are appropriate for the subnet inclusion operators.  Fortunately,
+ * basic comparison fits in this situation.  Even so, the length of the
+ * network part would not really be significant in the histogram.  This would
+ * lead to big mistakes for data sets with uneven masklen distribution.
+ * To avoid this problem, comparisons with the left and the right sides of
+ * the buckets are used together.
+ *
+ * Histogram bucket matches are calculated in two forms.  If the constant
+ * matches both sides the bucket is considered as fully matched.  If the
+ * constant matches only the right side of the bucket, it is not considered
+ * as matched unless it is the last bucket, because it will match the next
+ * bucket.  If all of these buckets would be considered as matched, it would
+ * lead to unfair multiple matches for some constants.
+ *
+ * The second form is to match the bucket partially.  We try to calculate
+ * dividers for both of the boundaries.  If the address family of the boundary
+ * does not match the constant or comparison of the length of the network
+ * parts is not true by the operator, the divider for the boundary will not
+ * be taken into account.  If both of the dividers can be calculated the greater
+ * one will be used to minimize the mistake in the buckets which have
+ * disparate masklens.
+ *
+ * The divider on the partial bucket match is imagined as the distance
+ * between the decisive bits and the common bits of the addresses.  It will
+ * be used as power of two as it is the natural scale for the IP network
+ * inclusion.  The partial bucket match divider calculation is an empirical
+ * formula and subject to change with more experiment.
+ *
+ * For partial match with buckets which have different address families
+ * on the left and right sides only the boundary with the same address
+ * family is taken into consideration.  This can cause more mistakes for these
+ * buckets if the masklens of their boundaries are also disparate.  It can
+ * only be the case for one bucket, if there are addresses with different
+ * families in the column.  It seems a better option than not considering
+ * these buckets.
+ */
+static Selectivity
+inet_his_inclusion_selec(Datum *values, int nvalues, Datum *constvalue,
+						 short int opr_order)
+{
+	inet	   *query,
+			   *left,
+			   *right;
+	float		match = 0.0;
+	int			i;
+	short int	left_order,
+				right_order,
+				left_divider,
+				right_divider;
+
+	query = DatumGetInetPP(*constvalue);
+	left = DatumGetInetPP(values[0]);
+	left_order = inet_inclusion_cmp(left, query, opr_order);
+
+	for (i = 1; i < nvalues; i++)
+	{
+		right = DatumGetInetP(values[i]);
+		right_order = inet_inclusion_cmp(right, query, opr_order);
+
+		if (left_order == 0 && right_order == 0)
+		{
+			/* Full bucket match. */
+
+			match += 1.0;
+		}
+		else if ((left_order <= 0 && right_order > 0) ||
+				 (left_order >= 0 && right_order < 0) ||
+				 (right_order == 0 && i == nvalues - 1))
+		{
+			/* Partial bucket match. */
+
+			left_divider = inet_his_match_divider(left, query, opr_order);
+			right_divider = inet_his_match_divider(right, query, opr_order);
+
+			if (left_divider >= 0 || right_divider >= 0)
+				match += 1.0 / pow(2, Max(left_divider, right_divider));
+		}
+
+		/* Shift the variables. */
+		left = right;
+		left_order = right_order;
+	}
+
+	/* There are nvalues - 1 buckets. */
+	return match / (nvalues - 1);
+}
+
+/*
+ * Inet MCV inner join selectivity estimation
+ *
+ * The original function of the operator is used in this function, as in
+ * mcv_selectivity() in selfuncs.c.  Actually, this function has nothing
+ * to do with the network data types except its name and location.
+ *
+ * The function result is the selectivity, and the fraction of the total
+ * population of the left hand side MCV is returned into *max_selec_p.
+ */
+static Selectivity
+inet_mcv_join_selec(Datum *mcv1_values, float4 *mcv1_numbers, int mcv1_nvalues,
+					Datum *mcv2_values, float4 *mcv2_numbers, int mcv2_nvalues,
+					Oid operator, Selectivity *max_selec_p)
+{
+	Selectivity selec = 0.0,
+				max_selec = 0.0;
+	FmgrInfo	proc;
+	int			i,
+				j;
+
+	fmgr_info(get_opcode(operator), &proc);
+
+	for (i = 0; i < mcv1_nvalues; i++)
+	{
+		for (j = 0; j < mcv2_nvalues; j++)
+			if (DatumGetBool(FunctionCall2Coll(&proc, DEFAULT_COLLATION_OID,
+											   mcv1_values[i], mcv2_values[j])))
+				selec += mcv1_numbers[i] * mcv2_numbers[j];
+
+		max_selec += mcv1_numbers[i];
+	}
+
+	*max_selec_p = max_selec;
+	return selec;
+}
+
+/*
+ * Inet MCV vs histogram inclusion join selectivity estimation
+ *
+ * The function result is the selectivity, and the fraction of the total
+ * population of the MCV is returned into *max_selec_p.
+ */
+static Selectivity
+inet_mcv_his_selec(Datum *mcv_values, float4 *mcv_numbers, int mcv_nvalues,
+				   Datum *his_values, int his_nvalues, short int opr_order,
+				   Selectivity *max_selec_p)
+{
+	Selectivity selec = 0.0,
+				max_selec = 0.0;
+	int			i;
+
+	for (i = 0; i < mcv_nvalues; i++)
+	{
+		selec += mcv_numbers[i] *
+			inet_his_inclusion_selec(his_values, his_nvalues, &mcv_values[i],
+									 opr_order);
+
+		max_selec += mcv_numbers[i];
+	}
+
+	*max_selec_p = max_selec;
+	return selec;
+}
+
+/*
+ * Inet histogram inclusion join selectivity estimation
+ *
+ * Values from the first histogram will be matched with the second.  The first
+ * and the last element of the first histogram will not be used, on
+ * the grounds that they are outliers and hence not very representative.
+ */
+static Selectivity
+inet_his_inclusion_join_selec(Datum *his1_values, int his1_nvalues,
+							  Datum *his2_values, int his2_nvalues,
+							  short int opr_order)
+{
+	float		match = 0.0;
+	int			i;
+
+	for (i = 1; i < his1_nvalues - 1; i++)
+		match += inet_his_inclusion_selec(his2_values, his2_nvalues,
+										  &his1_values[i], opr_order);
+
+	/* his1_nvalues - 2 values should be checked. */
+	return match / (his1_nvalues - 2);
+}
+
+/*
+ * Inet semi join selectivity estimation
+ */
+static Selectivity
+inet_semi_join_selec(bool mcv2_exists, Datum *mcv2_values, int mcv2_nvalues,
+					 bool his2_exists, Datum *his2_values, int his2_nvalues,
+					 double his2_weight, Datum *constvalue,
+					 FmgrInfo *proc, short int opr_order)
+{
+	if (mcv2_exists)
+	{
+		int			i;
+
+		for (i = 0; i < mcv2_nvalues; i++)
+			if (DatumGetBool(FunctionCall2Coll(proc, DEFAULT_COLLATION_OID,
+											   *constvalue, mcv2_values[i])))
+				return 1.0;
+	}
+
+	/* Do not bother if histogram weight is smaller than 0.1. */
+	if (his2_exists && his2_weight > 0.1)
+	{
+		Selectivity	his_selec;
+
+		his_selec = inet_his_inclusion_selec(his2_values, his2_nvalues,
+											 constvalue, opr_order);
+
+		if (his_selec > 0)
+			return Min(1.0, his2_weight * his_selec);
+	}
+
+	return 0.0;
+}
+
+/*
+ * Comparison function for the subnet inclusion operators
+ *
+ * Comparison is compatible with the basic comparison function for the inet
+ * type.  See network_cmp_internal in network.c for the original.  Basic
+ * comparison operators are implemented with the network_cmp_internal
+ * function.  It is possible to implement the subnet inclusion operators with
+ * this function.
+ *
+ * Comparison is first on the common bits of the network part, then on
+ * the length of the network part (masklen), as in the network_cmp_internal
+ * function.  Only the first part is in this function.  The second part is
+ * separated into another function for reusability.  The difference between
+ * the second part and the original network_cmp_internal is that the operator
+ * is used while comparing the lengths of the network parts.  See the second
+ * part in the inet_masklen_inclusion_cmp function below.
+ */
+static short int
+inet_inclusion_cmp(inet *left, inet *right, short int opr_order)
+{
+	if (ip_family(left) == ip_family(right))
+	{
+		short int	order;
+
+		order = bitncmp(ip_addr(left), ip_addr(right),
+						Min(ip_bits(left), ip_bits(right)));
+
+		if (order != 0)
+			return order;
+
+		return inet_masklen_inclusion_cmp(left, right, opr_order);
+	}
+
+	return ip_family(left) - ip_family(right);
+}
+
+/*
+ * Masklen comparison function for the subnet inclusion operators
+ *
+ * Compares the lengths of network parts of the inputs using the operator.
+ * If the comparison is okay for the operator the return value will be 0.
+ * Otherwise the return value will be less than or greater than 0 with
+ * respect to the operator.
+ */
+static short int
+inet_masklen_inclusion_cmp(inet *left, inet *right, short int opr_order)
+{
+	if (ip_family(left) == ip_family(right))
+	{
+		short int	order;
+
+		order = ip_bits(left) - ip_bits(right);
+
+		if ((order > 0 && opr_order >= 0) ||
+			(order == 0 && opr_order >= -1 && opr_order <= 1) ||
+			(order < 0 && opr_order <= 0))
+			return 0;
+
+		return opr_order;
+	}
+
+	return ip_family(left) - ip_family(right);
+}
+
+/*
+ * Inet histogram partial match divider calculation
+ *
+ * First the families and the lengths of the network parts are compared
+ * using the subnet inclusion operator.  The divider will be calculated
+ * using the masklens and the common bits of the addresses.  -1 will be
+ * returned if it cannot be calculated.
+ */
+static short int
+inet_his_match_divider(inet *boundary, inet *query, short int opr_order)
+{
+	if (inet_masklen_inclusion_cmp(boundary, query, opr_order) == 0)
+	{
+		short int	min_bits,
+					decisive_bits;
+
+		min_bits = Min(ip_bits(boundary), ip_bits(query));
+
+		/*
+		 * Set the decisive bits from the one which should contain the other
+		 * according to the operator.
+		 */
+		if (opr_order < 0)
+			decisive_bits = ip_bits(boundary);
+		else if (opr_order > 0)
+			decisive_bits = ip_bits(query);
+		else
+			decisive_bits = min_bits;
+
+		if (min_bits > 0)
+			return decisive_bits - bitncommon(ip_addr(boundary), ip_addr(query),
+											  min_bits);
+		return decisive_bits;
+	}
+
+	return -1;
 }
diff --git a/src/include/catalog/pg_operator.h b/src/include/catalog/pg_operator.h
index d7dcd1c..3b827fc 100644
--- a/src/include/catalog/pg_operator.h
+++ b/src/include/catalog/pg_operator.h
@@ -1135,32 +1135,33 @@ DESCR("not equal");
 DATA(insert OID = 1203 (  "<"	   PGNSP PGUID b f f 869 869	 16 1205 1206 network_lt scalarltsel scalarltjoinsel ));
 DESCR("less than");
 DATA(insert OID = 1204 (  "<="	   PGNSP PGUID b f f 869 869	 16 1206 1205 network_le scalarltsel scalarltjoinsel ));
 DESCR("less than or equal");
 DATA(insert OID = 1205 (  ">"	   PGNSP PGUID b f f 869 869	 16 1203 1204 network_gt scalargtsel scalargtjoinsel ));
 DESCR("greater than");
 DATA(insert OID = 1206 (  ">="	   PGNSP PGUID b f f 869 869	 16 1204 1203 network_ge scalargtsel scalargtjoinsel ));
 DESCR("greater than or equal");
 DATA(insert OID = 931  (  "<<"	   PGNSP PGUID b f f 869 869	 16 933		0 network_sub networksel networkjoinsel ));
 DESCR("is subnet");
-#define OID_INET_SUB_OP				  931
+#define OID_INET_SUB_OP			931
 DATA(insert OID = 932  (  "<<="    PGNSP PGUID b f f 869 869	 16 934		0 network_subeq networksel networkjoinsel ));
 DESCR("is subnet or equal");
-#define OID_INET_SUBEQ_OP				932
+#define OID_INET_SUBEQ_OP		932
 DATA(insert OID = 933  (  ">>"	   PGNSP PGUID b f f 869 869	 16 931		0 network_sup networksel networkjoinsel ));
 DESCR("is supernet");
-#define OID_INET_SUP_OP				  933
+#define OID_INET_SUP_OP			933
 DATA(insert OID = 934  (  ">>="    PGNSP PGUID b f f 869 869	 16 932		0 network_supeq networksel networkjoinsel ));
 DESCR("is supernet or equal");
-#define OID_INET_SUPEQ_OP				934
+#define OID_INET_SUPEQ_OP		934
 DATA(insert OID = 3552	(  "&&"    PGNSP PGUID b f f 869 869	 16 3552	0 network_overlap networksel networkjoinsel ));
 DESCR("overlaps (is subnet or supernet)");
+#define OID_INET_OVERLAP_OP		3552
 
 DATA(insert OID = 2634 (  "~"	   PGNSP PGUID l f f	  0 869 869 0 0 inetnot - - ));
 DESCR("bitwise not");
 DATA(insert OID = 2635 (  "&"	   PGNSP PGUID b f f	869 869 869 0 0 inetand - - ));
 DESCR("bitwise and");
 DATA(insert OID = 2636 (  "|"	   PGNSP PGUID b f f	869 869 869 0 0 inetor - - ));
 DESCR("bitwise or");
 DATA(insert OID = 2637 (  "+"	   PGNSP PGUID b f f	869  20 869 2638 0 inetpl - - ));
 DESCR("add");
 DATA(insert OID = 2638 (  "+"	   PGNSP PGUID b f f	 20 869 869 2637 0 int8pl_inet - - ));
#16Brightwell, Adam
adam.brightwell@crunchydatasolutions.com
In reply to: Emre Hasegeli (#15)
Re: Selectivity estimation for inet operators

New version with semi join estimation function attached.

I have performed the following initial review:

- Patch format. -- submitted as unified, but not sure it makes it any
easier to read than context format.
- Apply to current master (77e65bf). -- success (though, I do get
"Stripping trailing CR's from patch;" notification)
- check-world -- success
- Whitespace - were the whitespace changes in pg_operator.h necessary?

As for implementation, I'll leave that to those with a better understanding
of the purpose/expectations of the modified functions.

-Adam

--
Adam Brightwell - adam.brightwell@crunchydatasolutions.com
Database Engineer - www.crunchydatasolutions.com

#17Heikki Linnakangas
hlinnakangas@vmware.com
In reply to: Emre Hasegeli (#15)
Re: Selectivity estimation for inet operators

On 09/07/2014 07:09 PM, Emre Hasegeli wrote:

I updated the patch to cover semi and anti joins with eqjoinsel_semi().
I think it is better than returning a constant.

What you did there is utterly unacceptable from a modularity standpoint;
and considering that the values will be nowhere near right, the argument
that "it's better than returning a constant" seems pretty weak. I think
you should just take that out again.

I will try to come up with a better, data type specific implementation
in a week.

New version with semi join estimation function attached.

Thanks. Overall, my impression of this patch is that it works very well.
But damned if I understood *how* it works :-). There's a lot of
statistics involved, and it's not easy to see why something is
multiplied by something else. I'm adding comments as I read through it.

I've gotten to the inet_semi_join_selec function:

/*
 * Inet semi join selectivity estimation.
 */
static Selectivity
inet_semi_join_selec(bool mcv2_exists, Datum *mcv2_values, int mcv2_nvalues,
					 bool his2_exists, Datum *his2_values, int his2_nvalues,
					 double his2_weight, Datum *constvalue,
					 FmgrInfo *proc, short opr_order)
{
	if (mcv2_exists)
	{
		int			i;

		for (i = 0; i < mcv2_nvalues; i++)
		{
			if (DatumGetBool(FunctionCall2Coll(proc, DEFAULT_COLLATION_OID,
											   *constvalue, mcv2_values[i])))
				return 1.0;
		}
	}

	/* Do not bother if histogram weight is smaller than 0.1. */
	if (his2_exists && his2_weight > 0.1)
	{
		Selectivity his_selec;

		his_selec = inet_his_inclusion_selec(his2_values, his2_nvalues,
											 constvalue, opr_order);

		if (his_selec > 0)
			return Min(1.0, his2_weight * his_selec);
	}

	return 0.0;
}

This desperately needs a comment at the top of the function explaining
what it does. Let me try to explain what I think it does:

This function calculates the probability that there is at least one row
in table B that satisfies the "constant op column" qual. The constant
is passed as an argument, and for table B, the MCV list and histogram are
provided. his2_weight is the total number of rows in B that are covered
by the histogram. For example, if the table has 1000 rows, and 10% of
the rows in the table are in the MCV, and another 10% are NULLs,
his2_weight would be 800.
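
As a rough illustration of that arithmetic, here is a minimal sketch of how
the histogram weight could be derived from the row count, the null fraction,
and the total MCV frequency. The function name histogram_weight and its
parameters are made up for this example; the patch computes the same product
inline for his2_weight.

```c
#include <assert.h>

/*
 * Number of rows covered by the histogram: what remains of the table after
 * removing the NULL rows and the rows accounted for by the MCV list.
 * Hypothetical helper, not part of the patch.
 */
static double
histogram_weight(double rel_rows, double nullfrac, double mcv_frac)
{
	/* Both inputs are fractions of the table and cannot exceed it together. */
	assert(nullfrac >= 0.0 && mcv_frac >= 0.0);
	assert(nullfrac + mcv_frac <= 1.0);

	return (1.0 - nullfrac - mcv_frac) * rel_rows;
}
```

With 1000 rows, a null fraction of 0.1, and an MCV fraction of 0.1, this
yields roughly 800 rows, matching the example above.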

First, we check if the constant matches any of the most common values.
If it does, return 1.0, because then there is surely a match.

Next, we use the histogram to estimate the number of rows in the table
that match the qual. If it amounts to more than 1 row, we return 1.0.
If it's between 0.0 and 1.0 rows, we return that number as the probability.

Now, I think that last step is wrong. Firstly, the "Do not bother if
histogram weight is smaller than 0.1" rule seems bogus. The his2_weight
is the total number of rows represented by the histogram, so surely it
can't be less than 1. It can't really be less than the statistics
target. Unless maybe if the histogram was collected when the table was
large, but it has since shrunk to contain only a few rows, but that
seems like a very bizarre corner case. At least it needs more comments
explaining what the test is all about, but I think we should just always
use the histogram (if it's available).

Secondly, if we estimate that there is on average 1.0 matching row in
the table, it does not follow that the probability that at least one row
matches is 1.0. Assuming a gaussian distribution with mean 1.0, the
probability that at least one row matches is 0.5. Assuming a gaussian
distribution here isn't quite right - I guess a Poisson distribution
would be more accurate - but it sure doesn't seem right as it is.

The error isn't very big, and perhaps you don't run into that very
often, so I'm not sure what the best way to fix that would be. My
statistics skills are a bit rusty, but I think the appropriate way would
be to apply the Poisson distribution, with the estimated number of
matched rows as the mean. The probability of at least one match would be
the cumulative distribution function at k=1. It sounds like overkill, if
this case occurs only rarely. But then again, perhaps it's not all
that rare.

That said, I can't immediately find a test case where that error would
matter. I tried this:

create table inettbl1 (a inet);
insert into inettbl1 select '10.0.0.' || (g % 255) from
generate_series(1, 10) g;
analyze inettbl1;
explain analyze select count(*) from inettbl1 where a >>= ANY (SELECT a
from inettbl1);

The estimate for that is pretty accurate, 833 rows estimated vs 1000
actual, with the current patch. I'm afraid if we fixed
inet_semi_join_selec the way I suggest, the estimate would be smaller,
i.e. more wrong. Is there something else in the estimates that
accidentally compensates for this currently?

Thoughts?

- Heikki


#18Emre Hasegeli
emre@hasegeli.com
In reply to: Heikki Linnakangas (#17)
1 attachment(s)
Re: Selectivity estimation for inet operators

> Thanks. Overall, my impression of this patch is that it works very
> well. But damned if I understood *how* it works :-). There's a lot
> of statistics involved, and it's not easy to see why something is
> multiplied by something else. I'm adding comments as I read through
> it.

Thank you for looking at it. I tried to add more comments to
the multiplications. New version attached. It also fixes a bug
caused by the wrong operator order being used in the histogram to
histogram selectivity estimation for inner join.

> I've gotten to the inet_semi_join_selec function:
>
> [function]
>
> This desperately needs a comment at the top of the function explaining
> what it does. Let me try to explain what I think it does:
>
> [explanation]

I used your explanation in the new version.

> Now, I think that last step is wrong. Firstly, the "Do not bother if
> histogram weight is smaller than 0.1" rule seems bogus. The
> his2_weight is the total number of rows represented by the
> histogram, so surely it can't be less than 1. It can't really be
> less than the statistics target. Unless maybe if the histogram was
> collected when the table was large, but it has since shrunk to
> contain only a few rows, but that seems like a very bizarre corner
> case. At least it needs more comments explaining what the test is
> all about, but I think we should just always use the histogram (if
> it's available).

It was an unnecessary check. I replaced it with an assertion.

> Secondly, if we estimate that there is on average 1.0 matching row
> in the table, it does not follow that the probability that at least
> one row matches is 1.0. Assuming a gaussian distribution with mean
> 1.0, the probability that at least one row matches is 0.5. Assuming
> a gaussian distribution here isn't quite right - I guess a Poisson
> distribution would be more accurate - but it sure doesn't seem right
> as it is.
>
> The error isn't very big, and perhaps you don't run into that very
> often, so I'm not sure what the best way to fix that would be. My
> statistics skills are a bit rusty, but I think the appropriate way
> would be to apply the Poisson distribution, with the estimated
> number of matched rows as the mean. The probability of at least one
> match would be the cumulative distribution function at k=1. It
> sounds like overkill, if this case occurs only rarely. But then
> again, perhaps it's not all that rare.

A function of his_weight and his_selec could be a better option
than just multiplying them. I am not sure which function to use, or
whether it is worth the trouble. The join selectivity estimation
function for equality doesn't even bother to look at the histograms.
Others only return constant values.

> That said, I can't immediately find a test case where that error
> would matter. I tried this:
>
> create table inettbl1 (a inet);
> insert into inettbl1 select '10.0.0.' || (g % 255) from
> generate_series(1, 10) g;
> analyze inettbl1;
> explain analyze select count(*) from inettbl1 where a >>= ANY
> (SELECT a from inettbl1);
>
> The estimate for that is pretty accurate, 833 rows estimated vs 1000
> actual, with the current patch. I'm afraid if we fixed
> inet_semi_join_selec the way I suggest, the estimate would be
> smaller, i.e. more wrong. Is there something else in the estimates
> that accidentally compensates for this currently?

The partial bucket match in inet_his_inclusion_selec() causes low
estimates. This also affects non-join estimation, but not as much as
it affects join estimation. If that worked more correctly, the semi
join estimate could be higher than it should be.

network_selfuncs.c:602:

/* Partial bucket match. */

left_divider = inet_his_match_divider(left, query, opr_order);
right_divider = inet_his_match_divider(right, query, opr_order);

if (left_divider >= 0 || right_divider >= 0)
	match += 1.0 / pow(2, Max(left_divider, right_divider));

I think this calculation could benefit from a statistical function
more than the semi join could. Using the count of differing bits as
a power of two is the best I could find. It works quite well in most
cases.
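
The partial match arithmetic can be sketched in isolation as follows. This
is only an illustration under simplifying assumptions: addresses are plain
IPv4 values held in a uint32_t, and common_prefix_bits is a hypothetical
stand-in for the bitncommon() call used by inet_his_match_divider().
partial_match_fraction mirrors the quoted snippet: the larger divider wins,
and the bucket is credited with 1 / 2^divider.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Count the leading bits two IPv4 addresses have in common, looking at no
 * more than max_bits bits.  Hypothetical stand-in for bitncommon().
 */
static int
common_prefix_bits(uint32_t a, uint32_t b, int max_bits)
{
	int			n = 0;

	assert(max_bits >= 0 && max_bits <= 32);

	/* Walk from the most significant bit until the addresses differ. */
	while (n < max_bits && (((a ^ b) >> (31 - n)) & 1) == 0)
		n++;

	return n;
}

/*
 * Fraction of a bucket credited for a partial match.  A negative divider
 * means it could not be calculated for that boundary.
 */
static double
partial_match_fraction(int left_divider, int right_divider)
{
	int			divider = left_divider > right_divider ?
		left_divider : right_divider;
	double		fraction = 1.0;

	if (divider < 0)
		return 0.0;				/* neither boundary yielded a divider */

	/* Halve once per differing bit: 1 / 2^divider. */
	while (divider-- > 0)
		fraction /= 2.0;

	return fraction;
}
```

For example, 10.0.0.0/24 and 10.0.1.0/24 share 23 leading bits, so with 24
decisive bits the divider is 1 and the bucket counts as half a match.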

Attachments:

inet-selfuncs-v12.patchtext/x-diff; charset=utf-8Download
diff --git a/src/backend/utils/adt/network_selfuncs.c b/src/backend/utils/adt/network_selfuncs.c
index d0d806f..e9f9696 100644
--- a/src/backend/utils/adt/network_selfuncs.c
+++ b/src/backend/utils/adt/network_selfuncs.c
@@ -3,7 +3,8 @@
  * network_selfuncs.c
  *	  Functions for selectivity estimation of inet/cidr operators
  *
- * Currently these are just stubs, but we hope to do better soon.
+ * Estimates are based on null fraction, most common values, and
+ * histogram of inet/cidr datatypes.
  *
  * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
@@ -16,17 +17,864 @@
  */
 #include "postgres.h"
 
+#include <math.h>
+
+#include "access/htup_details.h"
+#include "catalog/pg_collation.h"
+#include "catalog/pg_operator.h"
+#include "catalog/pg_statistic.h"
+#include "utils/lsyscache.h"
 #include "utils/inet.h"
+#include "utils/selfuncs.h"
+
+
+/* Default selectivity constant for the inet overlap operator */
+#define DEFAULT_OVERLAP_SEL 0.01
+
+/* Default selectivity constant for the other operators */
+#define DEFAULT_INCLUSION_SEL 0.005
+
+/* Default selectivity for given operator */
+#define DEFAULT_SEL(operator) \
+	((operator) == OID_INET_OVERLAP_OP ? \
+			DEFAULT_OVERLAP_SEL : DEFAULT_INCLUSION_SEL)
 
+static Selectivity networkjoinsel_inner(Oid operator,
+					VariableStatData *vardata1, VariableStatData *vardata2);
+static Selectivity networkjoinsel_semi(Oid operator,
+					VariableStatData *vardata1, VariableStatData *vardata2);
+static short int inet_opr_order(Oid operator);
+static Selectivity inet_his_inclusion_selec(Datum *values, int nvalues,
+					Datum *constvalue, short int opr_order);
+static Selectivity inet_mcv_join_selec(Datum *mcv1_values,
+					float4 *mcv1_numbers, int mcv1_nvalues, Datum *mcv2_values,
+					float4 *mcv2_numbers, int mcv2_nvalues, Oid operator,
+					Selectivity *max_selec_p);
+static Selectivity inet_mcv_his_selec(Datum *mcv_values, float4 *mcv_numbers,
+					int mcv_nvalues, Datum *his_values, int his_nvalues,
+					short int opr_order, Selectivity *max_selec_p);
+static Selectivity inet_his_inclusion_join_selec(Datum *his1_values,
+					int his1_nvalues, Datum *his2_values, int his2_nvalues,
+					short int opr_order);
+static Selectivity inet_semi_join_selec(bool mcv_exists, Datum *mcv_values,
+					int mcv_nvalues, bool his_exists, Datum *his_values,
+					int his_nvalues, double his_weight, Datum *constvalue,
+					FmgrInfo *proc, short int comm_opr_order);
+static short int inet_inclusion_cmp(inet *left, inet *right,
+					short int opr_order);
+static short int inet_masklen_inclusion_cmp(inet *left, inet *right,
+					short int opr_order);
+static short int inet_his_match_divider(inet *boundary, inet *query,
+					short int opr_order);
 
+/*
+ * Selectivity estimation for the subnet inclusion operators
+ */
 Datum
 networksel(PG_FUNCTION_ARGS)
 {
-	PG_RETURN_FLOAT8(0.001);
+	PlannerInfo *root = (PlannerInfo *) PG_GETARG_POINTER(0);
+	Oid			operator = PG_GETARG_OID(1);
+	List	   *args = (List *) PG_GETARG_POINTER(2);
+	int			varRelid = PG_GETARG_INT32(3),
+				his_nvalues;
+	VariableStatData vardata;
+	Node	   *other;
+	bool		varonleft;
+	Selectivity selec,
+				max_mcv_selec;
+	Datum		constvalue,
+			   *his_values;
+	Form_pg_statistic stats;
+	double		nullfrac;
+	FmgrInfo	proc;
+
+	/*
+	 * If expression is not (variable op something) or (something op
+	 * variable), then punt and return a default estimate.
+	 */
+	if (!get_restriction_variable(root, args, varRelid,
+								  &vardata, &other, &varonleft))
+		PG_RETURN_FLOAT8(DEFAULT_SEL(operator));
+
+	/*
+	 * Can't do anything useful if the something is not a constant, either.
+	 */
+	if (!IsA(other, Const))
+	{
+		ReleaseVariableStats(vardata);
+		PG_RETURN_FLOAT8(DEFAULT_SEL(operator));
+	}
+
+	/* All of the subnet inclusion operators are strict. */
+	if (((Const *) other)->constisnull)
+	{
+		ReleaseVariableStats(vardata);
+		PG_RETURN_FLOAT8(0.0);
+	}
+
+	if (!HeapTupleIsValid(vardata.statsTuple))
+	{
+		ReleaseVariableStats(vardata);
+		PG_RETURN_FLOAT8(DEFAULT_SEL(operator));
+	}
+
+	constvalue = ((Const *) other)->constvalue;
+	stats = (Form_pg_statistic) GETSTRUCT(vardata.statsTuple);
+	nullfrac = stats ? stats->stanullfrac : 0.0;
+
+	fmgr_info(get_opcode(operator), &proc);
+	selec = mcv_selectivity(&vardata, &proc, constvalue, varonleft,
+							&max_mcv_selec);
+
+	if (get_attstatsslot(vardata.statsTuple,
+						 vardata.atttype, vardata.atttypmod,
+						 STATISTIC_KIND_HISTOGRAM, InvalidOid,
+						 NULL,
+						 &his_values, &his_nvalues,
+						 NULL, NULL))
+	{
+		/*
+		 * Multiply histogram selectivity with remaining total selectivity
+		 * for histogram.
+		 */
+		selec += (1.0 - nullfrac - max_mcv_selec) *
+			inet_his_inclusion_selec(his_values, his_nvalues, &constvalue,
+									 varonleft ? inet_opr_order(operator) :
+												 inet_opr_order(operator) * -1);
+
+		free_attstatsslot(vardata.atttype, his_values, his_nvalues, NULL, 0);
+	}
+	else if (max_mcv_selec == 0.0)
+		selec = (1.0 - nullfrac) * DEFAULT_SEL(operator);
+
+	/* Result should be in range, but make sure... */
+	CLAMP_PROBABILITY(selec);
+
+	ReleaseVariableStats(vardata);
+	PG_RETURN_FLOAT8(selec);
 }
 
+/*
+ * Join selectivity estimation for the subnet inclusion operators
+ *
+ * This function has the same structure as eqjoinsel() in selfuncs.c.
+ */
 Datum
 networkjoinsel(PG_FUNCTION_ARGS)
 {
-	PG_RETURN_FLOAT8(0.001);
+	PlannerInfo *root = (PlannerInfo *) PG_GETARG_POINTER(0);
+	Oid			operator = PG_GETARG_OID(1);
+	List	   *args = (List *) PG_GETARG_POINTER(2);
+#ifdef NOT_USED
+	JoinType	jointype = (JoinType) PG_GETARG_INT16(3);
+#endif
+	SpecialJoinInfo *sjinfo = (SpecialJoinInfo *) PG_GETARG_POINTER(4);
+	double		selec;
+	VariableStatData vardata1;
+	VariableStatData vardata2;
+	bool		join_is_reversed;
+
+	get_join_variables(root, args, sjinfo,
+					   &vardata1, &vardata2, &join_is_reversed);
+
+	switch (sjinfo->jointype)
+	{
+		case JOIN_INNER:
+		case JOIN_LEFT:
+		case JOIN_FULL:
+
+			/*
+			 * Selectivity for left join is not exactly the same as for inner
+			 * join, but the difference is neglected.
+			 */
+			if (!join_is_reversed)
+				selec = networkjoinsel_inner(operator, &vardata1, &vardata2);
+			else
+				selec = networkjoinsel_inner(get_commutator(operator),
+											 &vardata2, &vardata1);
+			break;
+		case JOIN_SEMI:
+		case JOIN_ANTI:
+
+			if (!join_is_reversed)
+				selec = networkjoinsel_semi(operator, &vardata1, &vardata2);
+			else
+				selec = networkjoinsel_semi(get_commutator(operator),
+											&vardata2, &vardata1);
+			break;
+		default:
+			/* other values not expected here */
+			elog(ERROR, "unrecognized join type: %d",
+				 (int) sjinfo->jointype);
+			selec = 0;			/* keep compiler quiet */
+			break;
+	}
+
+	ReleaseVariableStats(vardata1);
+	ReleaseVariableStats(vardata2);
+
+	CLAMP_PROBABILITY(selec);
+
+	PG_RETURN_FLOAT8((float8) selec);
+}
+
+/*
+ * Inner join selectivity estimation for the subnet inclusion operators
+ *
+ * Calculates MCV vs MCV, MCV vs histogram and histogram vs histogram
+ * selectivity for join using the subnet inclusion operators.  Unlike the
+ * join selectivity function for the equality operator, eqjoinsel_inner(),
+ * one to one matching of the values is not enough.  Network inclusion
+ * operators are likely to match many to many, so the MCV and histogram
+ * lists have to be looped through to the end.  Also, MCV vs histogram
+ * selectivity is not neglected as in eqjoinsel_inner().
+ */
+static Selectivity
+networkjoinsel_inner(Oid operator,
+					 VariableStatData *vardata1, VariableStatData *vardata2)
+{
+	Form_pg_statistic stats;
+	double		nullfrac1 = 0.0,
+				nullfrac2 = 0.0;
+	Selectivity selec = 0.0,
+				mcv1_max_selec = 0.0,
+				mcv2_max_selec = 0.0;
+	bool		mcv1_exists = false,
+				mcv2_exists = false,
+				his1_exists = false,
+				his2_exists = false;
+	short int	opr_order;
+	int			mcv1_nvalues,
+				mcv2_nvalues,
+				mcv1_nnumbers,
+				mcv2_nnumbers,
+				his1_nvalues,
+				his2_nvalues;
+	Datum	   *mcv1_values,
+			   *mcv2_values,
+			   *his1_values,
+			   *his2_values;
+	float4	   *mcv1_numbers,
+			   *mcv2_numbers;
+
+	if (HeapTupleIsValid(vardata1->statsTuple))
+	{
+		if ((stats = (Form_pg_statistic) GETSTRUCT(vardata1->statsTuple)))
+			nullfrac1 = stats->stanullfrac;
+
+		mcv1_exists = get_attstatsslot(vardata1->statsTuple,
+									   vardata1->atttype, vardata1->atttypmod,
+									   STATISTIC_KIND_MCV, InvalidOid,
+									   NULL,
+									   &mcv1_values, &mcv1_nvalues,
+									   &mcv1_numbers, &mcv1_nnumbers);
+		his1_exists = get_attstatsslot(vardata1->statsTuple,
+									   vardata1->atttype, vardata1->atttypmod,
+									   STATISTIC_KIND_HISTOGRAM, InvalidOid,
+									   NULL,
+									   &his1_values, &his1_nvalues,
+									   NULL, NULL);
+	}
+
+	if (HeapTupleIsValid(vardata2->statsTuple))
+	{
+		if ((stats = (Form_pg_statistic) GETSTRUCT(vardata2->statsTuple)))
+			nullfrac2 = stats->stanullfrac;
+
+		mcv2_exists = get_attstatsslot(vardata2->statsTuple,
+									   vardata2->atttype, vardata2->atttypmod,
+									   STATISTIC_KIND_MCV, InvalidOid,
+									   NULL,
+									   &mcv2_values, &mcv2_nvalues,
+									   &mcv2_numbers, &mcv2_nnumbers);
+		his2_exists = get_attstatsslot(vardata2->statsTuple,
+									   vardata2->atttype, vardata2->atttypmod,
+									   STATISTIC_KIND_HISTOGRAM, InvalidOid,
+									   NULL,
+									   &his2_values, &his2_nvalues,
+									   NULL, NULL);
+	}
+
+	opr_order = inet_opr_order(operator);
+
+	if (mcv1_exists && mcv2_exists)
+		selec += inet_mcv_join_selec(mcv1_values, mcv1_numbers, mcv1_nvalues,
+									 mcv2_values, mcv2_numbers, mcv2_nvalues,
+									 operator, &mcv1_max_selec);
+
+	/*
+	 * Multiply selectivities with remaining total selectivity for the used
+	 * histogram.  mcv1_max_selec and mcv2_max_selec will be used from
+	 * the previous operations to loop one less time.
+	 */
+	if (mcv2_exists && his1_exists)
+		selec += (1.0 - nullfrac1 - mcv1_max_selec) *
+			inet_mcv_his_selec(mcv2_values, mcv2_numbers, mcv2_nvalues,
+							   his1_values, his1_nvalues, opr_order,
+							   &mcv2_max_selec);
+	if (mcv1_exists && his2_exists)
+		selec += (1.0 - nullfrac2 - mcv2_max_selec) *
+			inet_mcv_his_selec(mcv1_values, mcv1_numbers, mcv1_nvalues,
+							   his2_values, his2_nvalues, opr_order,
+							   &mcv1_max_selec);
+
+	if (his1_exists && his2_exists && his2_nvalues > 2)
+		selec += (1.0 - nullfrac1 - mcv1_max_selec) *
+			(1.0 - nullfrac2 - mcv2_max_selec) *
+			inet_his_inclusion_join_selec(his1_values, his1_nvalues,
+										  his2_values, his2_nvalues,
+										  opr_order);
+
+	/*
+	 * If useful statistics are not available set the selectivity using
+	 * the default constant with null fractions.
+	 */
+	if ((!mcv1_exists && !his1_exists) || (!mcv2_exists && !his2_exists))
+		selec = (1.0 - nullfrac1) * (1.0 - nullfrac2) * DEFAULT_SEL(operator);
+
+	if (mcv1_exists)
+		free_attstatsslot(vardata1->atttype, mcv1_values, mcv1_nvalues,
+						  mcv1_numbers, mcv1_nnumbers);
+	if (mcv2_exists)
+		free_attstatsslot(vardata2->atttype, mcv2_values, mcv2_nvalues,
+						  mcv2_numbers, mcv2_nnumbers);
+	if (his1_exists)
+		free_attstatsslot(vardata1->atttype, his1_values, his1_nvalues,
+						  NULL, 0);
+	if (his2_exists)
+		free_attstatsslot(vardata2->atttype, his2_values, his2_nvalues,
+						  NULL, 0);
+
+	return selec;
+}
+
+/*
+ * Semi join selectivity estimation for the subnet inclusion operators
+ *
+ * Calculates MCV vs MCV, MCV vs histogram, histogram vs MCV, and histogram
+ * vs histogram selectivity for semi join using the subnet inclusion
+ * operators.
+ */
+static Selectivity
+networkjoinsel_semi(Oid operator,
+					VariableStatData *vardata1, VariableStatData *vardata2)
+{
+	Form_pg_statistic stats;
+	Selectivity selec = 0.0,
+				mcv1_max_selec = 0.0,
+				mcv2_max_selec = 0.0;
+	double		nullfrac1 = 0.0,
+				nullfrac2 = 0.0,
+				his2_weight = 0.0;
+	bool		mcv1_exists = false,
+				mcv2_exists = false,
+				his1_exists = false,
+				his2_exists = false;
+	short int	comm_opr_order = 0;
+	FmgrInfo	proc;
+	int			i,
+				mcv1_nvalues,
+				mcv2_nvalues,
+				mcv1_nnumbers,
+				mcv2_nnumbers,
+				his1_nvalues,
+				his2_nvalues;
+	Datum	   *mcv1_values,
+			   *mcv2_values,
+			   *his1_values,
+			   *his2_values;
+	float4	   *mcv1_numbers,
+			   *mcv2_numbers;
+
+	if (HeapTupleIsValid(vardata1->statsTuple))
+	{
+		if ((stats = (Form_pg_statistic) GETSTRUCT(vardata1->statsTuple)))
+			nullfrac1 = stats->stanullfrac;
+
+		mcv1_exists = get_attstatsslot(vardata1->statsTuple,
+									   vardata1->atttype, vardata1->atttypmod,
+									   STATISTIC_KIND_MCV, InvalidOid,
+									   NULL,
+									   &mcv1_values, &mcv1_nvalues,
+									   &mcv1_numbers, &mcv1_nnumbers);
+		his1_exists = get_attstatsslot(vardata1->statsTuple,
+									   vardata1->atttype, vardata1->atttypmod,
+									   STATISTIC_KIND_HISTOGRAM, InvalidOid,
+									   NULL,
+									   &his1_values, &his1_nvalues,
+									   NULL, NULL);
+	}
+
+	if (HeapTupleIsValid(vardata2->statsTuple))
+	{
+
+		if ((stats = (Form_pg_statistic) GETSTRUCT(vardata2->statsTuple)))
+			nullfrac2 = stats->stanullfrac;
+
+		mcv2_exists = get_attstatsslot(vardata2->statsTuple,
+									   vardata2->atttype, vardata2->atttypmod,
+									   STATISTIC_KIND_MCV, InvalidOid,
+									   NULL,
+									   &mcv2_values, &mcv2_nvalues,
+									   &mcv2_numbers, &mcv2_nnumbers);
+		his2_exists = get_attstatsslot(vardata2->statsTuple,
+									   vardata2->atttype, vardata2->atttypmod,
+									   STATISTIC_KIND_HISTOGRAM, InvalidOid,
+									   NULL,
+									   &his2_values, &his2_nvalues,
+									   NULL, NULL);
+	}
+
+	if (mcv2_exists)
+	{
+		fmgr_info(get_opcode(operator), &proc);
+
+		for (i = 0; i < mcv2_nvalues; i++)
+			 mcv2_max_selec += mcv2_numbers[i];
+	}
+
+	if (his2_exists)
+	{
+		his2_weight = (1.0 - nullfrac2 - mcv2_max_selec) * vardata2->rel->rows;
+		comm_opr_order = inet_opr_order(operator) * -1;
+	}
+
+	if (mcv1_exists && (mcv2_exists || his2_exists))
+		for (i = 0; i < mcv1_nvalues; i++)
+			selec += mcv1_numbers[i] *
+				inet_semi_join_selec(mcv2_exists, mcv2_values, mcv2_nvalues,
+									 his2_exists, his2_values, his2_nvalues,
+									 his2_weight, &mcv1_values[i],
+									 &proc, comm_opr_order);
+
+	if (his1_exists && his1_nvalues > 2 && (mcv2_exists || his2_exists))
+	{
+		double		his_selec_sum = 0.0;
+
+		/*
+		 * The first and the last element of the first histogram will not be
+		 * used, on the grounds that they are outliers and hence not very
+		 * representative.
+		 */
+		for (i = 1; i < his1_nvalues - 1; i++)
+			his_selec_sum +=
+				inet_semi_join_selec(mcv2_exists, mcv2_values, mcv2_nvalues,
+									 his2_exists, his2_values, his2_nvalues,
+									 his2_weight, &his1_values[i],
+									 &proc, comm_opr_order);
+
+		/*
+		 * Multiply the histogram selectivity sum by the remaining total
+		 * selectivity for the histogram, and divide it by the number of
+		 * elements checked.
+		 */
+		selec += (1.0 - nullfrac1 - mcv1_max_selec) *
+			his_selec_sum / (his1_nvalues - 2);
+	}
+
+	/*
+	 * If useful statistics are not available set the selectivity using
+	 * the default constant with null fractions.
+	 */
+	if ((!mcv1_exists && !his1_exists) || (!mcv2_exists && !his2_exists))
+		selec = (1.0 - nullfrac1) * (1.0 - nullfrac2) * DEFAULT_SEL(operator);
+
+	if (mcv1_exists)
+		free_attstatsslot(vardata1->atttype, mcv1_values, mcv1_nvalues,
+						  mcv1_numbers, mcv1_nnumbers);
+	if (mcv2_exists)
+		free_attstatsslot(vardata2->atttype, mcv2_values, mcv2_nvalues,
+						  mcv2_numbers, mcv2_nnumbers);
+	if (his1_exists)
+		free_attstatsslot(vardata1->atttype, his1_values, his1_nvalues,
+						  NULL, 0);
+	if (his2_exists)
+		free_attstatsslot(vardata2->atttype, his2_values, his2_nvalues,
+						  NULL, 0);
+
+	return selec;
+}
+
+/*
+ * Practical comparable numbers for the subnet inclusion operators
+ */
+static short int
+inet_opr_order(Oid operator)
+{
+	switch (operator)
+	{
+		case OID_INET_SUP_OP:
+			return -2;
+		case OID_INET_SUPEQ_OP:
+			return -1;
+		case OID_INET_OVERLAP_OP:
+			return 0;
+		case OID_INET_SUBEQ_OP:
+			return 1;
+		case OID_INET_SUB_OP:
+			return 2;
+		default:
+			elog(ERROR, "unknown operator for inet inclusion selectivity");
+	}
+}
+
+/*
+ * Inet histogram inclusion selectivity estimation
+ *
+ * Calculates histogram selectivity for the subnet inclusion operators of
+ * the inet type.  The return value is between 0 and 1.  It should be
+ * corrected with the MCV selectivity and null fraction.  If the constant
+ * is less than the first element or greater than the last element of
+ * the histogram the return value will be 0.
+ *
+ * The histogram is originally for the basic comparison operators.  Only
+ * the common bits of the network part and the length of the network part
+ * (masklen) are appropriate for the subnet inclusion operators.  Fortunately,
+ * basic comparison fits in this situation.  Even so, the length of the
+ * network part would not really be significant in the histogram.  This would
+ * lead to big mistakes for data sets with uneven masklen distribution.
+ * To avoid this problem, comparisons with the left and the right sides of
+ * the buckets are used together.
+ *
+ * Histogram bucket matches are calculated in two forms.  If the constant
+ * matches both sides the bucket is considered as fully matched.  If the
+ * constant matches only the right side of the bucket, it is not considered
+ * as matched unless it is the last bucket, because it will match the next
+ * bucket.  If all of these buckets would be considered as matched, it would
+ * lead to unfair multiple matches for some constants.
+ *
+ * The second form is to match the bucket partially.  We try to calculate
+ * dividers for both of the boundaries.  If the address family of the boundary
+ * does not match the constant or comparison of the length of the network
+ * parts does not satisfy the operator, the divider for the boundary is not
+ * taken into account.  If both of the dividers can be calculated the greater
+ * one will be used to minimize the mistake in the buckets which have
+ * disparate masklens.
+ *
+ * The divider on the partial bucket match is imagined as the distance
+ * between the decisive bits and the common bits of the addresses.  It will
+ * be used as power of two as it is the natural scale for the IP network
+ * inclusion.  The partial bucket match divider calculation is an empirical
+ * formula and subject to change with more experiment.
+ *
+ * For partial match with buckets which have different address families
+ * on the left and right sides only the boundary with the same address
+ * family is taken into consideration.  This can cause more mistakes for these
+ * buckets if the masklens of their boundaries are also disparate.  It can
+ * only be the case for one bucket, if there are addresses with different
+ * families in the column.  This seems a better option than not considering
+ * these buckets.
+ */
+static Selectivity
+inet_his_inclusion_selec(Datum *values, int nvalues, Datum *constvalue,
+						 short int opr_order)
+{
+	inet	   *query,
+			   *left,
+			   *right;
+	float		match = 0.0;
+	int			i;
+	short int	left_order,
+				right_order,
+				left_divider,
+				right_divider;
+
+	query = DatumGetInetPP(*constvalue);
+	left = DatumGetInetPP(values[0]);
+	left_order = inet_inclusion_cmp(left, query, opr_order);
+
+	for (i = 1; i < nvalues; i++)
+	{
+		right = DatumGetInetP(values[i]);
+		right_order = inet_inclusion_cmp(right, query, opr_order);
+
+		if (left_order == 0 && right_order == 0)
+		{
+			/* Full bucket match. */
+
+			match += 1.0;
+		}
+		else if ((left_order <= 0 && right_order > 0) ||
+				 (left_order >= 0 && right_order < 0) ||
+				 (right_order == 0 && i == nvalues - 1))
+		{
+			/* Partial bucket match. */
+
+			left_divider = inet_his_match_divider(left, query, opr_order);
+			right_divider = inet_his_match_divider(right, query, opr_order);
+
+			if (left_divider >= 0 || right_divider >= 0)
+				match += 1.0 / pow(2, Max(left_divider, right_divider));
+		}
+
+		/* Shift the variables. */
+		left = right;
+		left_order = right_order;
+	}
+
+	/* There are nvalues - 1 buckets. */
+	return match / (nvalues - 1);
+}
+
+/*
+ * Inet MCV inner join selectivity estimation
+ *
+ * The original function of the operator is used in this function, as in
+ * mcv_selectivity() in selfuncs.c.  Actually, this function has nothing
+ * to do with the network data types except its name and location.
+ *
+ * The function result is the selectivity, and the fraction of the total
+ * population of the left hand side MCV is returned into *max_selec_p.
+ */
+static Selectivity
+inet_mcv_join_selec(Datum *mcv1_values, float4 *mcv1_numbers, int mcv1_nvalues,
+					Datum *mcv2_values, float4 *mcv2_numbers, int mcv2_nvalues,
+					Oid operator, Selectivity *max_selec_p)
+{
+	Selectivity selec = 0.0,
+				max_selec = 0.0;
+	FmgrInfo	proc;
+	int			i,
+				j;
+
+	fmgr_info(get_opcode(operator), &proc);
+
+	for (i = 0; i < mcv1_nvalues; i++)
+	{
+		for (j = 0; j < mcv2_nvalues; j++)
+			if (DatumGetBool(FunctionCall2Coll(&proc, DEFAULT_COLLATION_OID,
+											   mcv1_values[i], mcv2_values[j])))
+				selec += mcv1_numbers[i] * mcv2_numbers[j];
+
+		max_selec += mcv1_numbers[i];
+	}
+
+	*max_selec_p = max_selec;
+	return selec;
+}
+
+/*
+ * Inet MCV vs histogram inclusion join selectivity estimation
+ *
+ * The function result is the selectivity, and the fraction of the total
+ * population of the MCV is returned into *max_selec_p.
+ */
+static Selectivity
+inet_mcv_his_selec(Datum *mcv_values, float4 *mcv_numbers, int mcv_nvalues,
+				   Datum *his_values, int his_nvalues, short int opr_order,
+				   Selectivity *max_selec_p)
+{
+	Selectivity selec = 0.0,
+				max_selec = 0.0;
+	int			i;
+
+	for (i = 0; i < mcv_nvalues; i++)
+	{
+		selec += mcv_numbers[i] *
+			inet_his_inclusion_selec(his_values, his_nvalues, &mcv_values[i],
+									 opr_order);
+
+		max_selec += mcv_numbers[i];
+	}
+
+	*max_selec_p = max_selec;
+	return selec;
+}
+
+/*
+ * Inet histogram inclusion join selectivity estimation
+ *
+ * The function calculates histogram to histogram selectivity for inner join
+ * using the elements from the histogram of the right hand side table as
+ * the samples.  Alternatively elements from the histogram of the left hand
+ * side table could be used with the commutator of the operator.
+ *
+ * Values from the first histogram will be matched with the second.  The first
+ * and the last element of the first histogram will not be used, on
+ * the grounds that they are outliers and hence not very representative.
+ */
+static Selectivity
+inet_his_inclusion_join_selec(Datum *his1_values, int his1_nvalues,
+							  Datum *his2_values, int his2_nvalues,
+							  short int opr_order)
+{
+	float		match = 0.0;
+	int			i;
+
+	Assert(his2_nvalues > 2);
+
+	for (i = 1; i < his2_nvalues - 1; i++)
+		match += inet_his_inclusion_selec(his1_values, his1_nvalues,
+										  &his2_values[i], opr_order);
+
+	/* Divide the match sum by the checked element count. */
+	return match / (his2_nvalues - 2);
+}
+
+/*
+ * Inet semi join selectivity estimation for one value
+ *
+ * The function calculates the probability that there is at least one row
+ * in the table which satisfies the "constant op column" condition.  It is
+ * used for semi join estimation to check the samples of the left hand
+ * side table.  For better estimation, it should be called several times
+ * in a loop with different constants, and the average should be used.
+ *
+ * The MCV and histogram from the right hand side table should be provided
+ * as arguments with the constant from the left hand side table for the join.
+ * his_weight is the total number of rows covered by the histogram.  For
+ * example, if the table has 1000 rows, and 10% of the rows are in the MCV,
+ * and another 10% are NULLs, his_weight would be 800.  Finally, the
+ * underlying proc of the join operator to use with the MCV, and opr_order for
+ * the commutator of the semi join operator to use with the histogram should
+ * be passed.
+ *
+ * First, the constant will be matched with the most common values.  If it
+ * matches any of them, 1.0 will be returned, because then there is surely
+ * a match.
+ *
+ * Next, the histogram will be used to estimate the number of rows in
+ * the second table that matches the condition.  If the estimate is greater
+ * than 1.0, 1.0 will be returned, because it means there is a greater chance
+ * that the constant will match more than one row in the table.  If it is
+ * between 0.0 and 1.0, it will be returned as the probability.
+ */
+static Selectivity
+inet_semi_join_selec(bool mcv_exists, Datum *mcv_values, int mcv_nvalues,
+					 bool his_exists, Datum *his_values, int his_nvalues,
+					 double his_weight, Datum *constvalue,
+					 FmgrInfo *proc, short int comm_opr_order)
+{
+	if (mcv_exists)
+	{
+		int			i;
+
+		for (i = 0; i < mcv_nvalues; i++)
+			if (DatumGetBool(FunctionCall2Coll(proc, DEFAULT_COLLATION_OID,
+											   *constvalue, mcv_values[i])))
+				return 1.0;
+	}
+
+	if (his_exists)
+	{
+		Selectivity	his_selec;
+
+		Assert(his_weight > 0);
+
+		/*
+		 * The histogram is used as if it came from the left hand side table,
+		 * which is the opposite of the semi join case.  That is why the
+		 * commutator of the actual operator is required.
+		 */
+		his_selec = inet_his_inclusion_selec(his_values, his_nvalues,
+											 constvalue, comm_opr_order);
+
+		if (his_selec > 0)
+			return Min(1.0, his_weight * his_selec);
+	}
+
+	return 0.0;
+}
+
+/*
+ * Comparison function for the subnet inclusion operators
+ *
+ * Comparison is compatible with the basic comparison function for the inet
+ * type.  See network_cmp_internal on network.c for the original.  Basic
+ * comparison operators are implemented with the network_cmp_internal
+ * function.  It is possible to implement the subnet inclusion operators with
+ * this function.
+ *
+ * Comparison is first on the common bits of the network part, then on
+ * the length of the network part (masklen) as the network_cmp_internal
+ * function.  Only the first part is on this function.  The second part is
+ * separated to another function for reusability.  The difference between
+ * the second part and the original network_cmp_internal is that the operator
+ * is used while comparing the lengths of the network parts.  See the second
+ * part on the inet_masklen_inclusion_cmp function below.
+ */
+static short int
+inet_inclusion_cmp(inet *left, inet *right, short int opr_order)
+{
+	if (ip_family(left) == ip_family(right))
+	{
+		short int	order;
+
+		order = bitncmp(ip_addr(left), ip_addr(right),
+						Min(ip_bits(left), ip_bits(right)));
+
+		if (order != 0)
+			return order;
+
+		return inet_masklen_inclusion_cmp(left, right, opr_order);
+	}
+
+	return ip_family(left) - ip_family(right);
+}
+
+/*
+ * Masklen comparison function for the subnet inclusion operators
+ *
+ * Compares the lengths of network parts of the inputs using the operator.
+ * If the comparison is okay for the operator the return value will be 0.
+ * Otherwise the return value will be less than or greater than 0 with
+ * respect to the operator.
+ */
+static short int
+inet_masklen_inclusion_cmp(inet *left, inet *right, short int opr_order)
+{
+	if (ip_family(left) == ip_family(right))
+	{
+		short int	order;
+
+		order = ip_bits(left) - ip_bits(right);
+
+		if ((order > 0 && opr_order >= 0) ||
+			(order == 0 && opr_order >= -1 && opr_order <= 1) ||
+			(order < 0 && opr_order <= 0))
+			return 0;
+
+		return opr_order;
+	}
+
+	return ip_family(left) - ip_family(right);
+}
+
+/*
+ * Inet histogram partial match divider calculation
+ *
+ * First the families and the lengths of the network parts are compared
+ * using the subnet inclusion operator.  The divider will be calculated
+ * using the masklens and the common bits of the addresses.  -1 will be
+ * returned if it cannot be calculated.
+ */
+static short int
+inet_his_match_divider(inet *boundary, inet *query, short int opr_order)
+{
+	if (inet_masklen_inclusion_cmp(boundary, query, opr_order) == 0)
+	{
+		short int	min_bits,
+					decisive_bits;
+
+		min_bits = Min(ip_bits(boundary), ip_bits(query));
+
+		/*
+		 * Set the decisive bits from the one which should contain the other
+		 * according to the operator.
+		 */
+		if (opr_order < 0)
+			decisive_bits = ip_bits(boundary);
+		else if (opr_order > 0)
+			decisive_bits = ip_bits(query);
+		else
+			decisive_bits = min_bits;
+
+		if (min_bits > 0)
+			return decisive_bits - bitncommon(ip_addr(boundary), ip_addr(query),
+											  min_bits);
+		return decisive_bits;
+	}
+
+	return -1;
 }
diff --git a/src/include/catalog/pg_operator.h b/src/include/catalog/pg_operator.h
index d7dcd1c..3b827fc 100644
--- a/src/include/catalog/pg_operator.h
+++ b/src/include/catalog/pg_operator.h
@@ -1142,18 +1142,19 @@ DATA(insert OID = 1206 (  ">="	   PGNSP PGUID b f f 869 869	 16 1204 1203 networ
 DESCR("greater than or equal");
 DATA(insert OID = 931  (  "<<"	   PGNSP PGUID b f f 869 869	 16 933		0 network_sub networksel networkjoinsel ));
 DESCR("is subnet");
-#define OID_INET_SUB_OP				  931
+#define OID_INET_SUB_OP			931
 DATA(insert OID = 932  (  "<<="    PGNSP PGUID b f f 869 869	 16 934		0 network_subeq networksel networkjoinsel ));
 DESCR("is subnet or equal");
-#define OID_INET_SUBEQ_OP				932
+#define OID_INET_SUBEQ_OP		932
 DATA(insert OID = 933  (  ">>"	   PGNSP PGUID b f f 869 869	 16 931		0 network_sup networksel networkjoinsel ));
 DESCR("is supernet");
-#define OID_INET_SUP_OP				  933
+#define OID_INET_SUP_OP			933
 DATA(insert OID = 934  (  ">>="    PGNSP PGUID b f f 869 869	 16 932		0 network_supeq networksel networkjoinsel ));
 DESCR("is supernet or equal");
-#define OID_INET_SUPEQ_OP				934
+#define OID_INET_SUPEQ_OP		934
 DATA(insert OID = 3552	(  "&&"    PGNSP PGUID b f f 869 869	 16 3552	0 network_overlap networksel networkjoinsel ));
 DESCR("overlaps (is subnet or supernet)");
+#define OID_INET_OVERLAP_OP		3552
 
 DATA(insert OID = 2634 (  "~"	   PGNSP PGUID l f f	  0 869 869 0 0 inetnot - - ));
 DESCR("bitwise not");
#19Tom Lane
tgl@sss.pgh.pa.us
In reply to: Emre Hasegeli (#18)
1 attachment(s)
Re: Selectivity estimation for inet operators

Emre Hasegeli <emre@hasegeli.com> writes:

Thanks. Overall, my impression of this patch is that it works very
well. But damned if I understood *how* it works :-). There's a lot
of statistics involved, and it's not easy to see why something is
multiplied by something else. I'm adding comments as I read through
it.

Thank you for looking at it. I tried to add more comments to
the multiplications. New version attached. It also fixes a bug
caused by wrong operator order used on histogram to histogram
selectivity estimation for inner join.

I spent a fair chunk of the weekend hacking on this patch to make
it more understandable and fix up a lot of what seemed to me pretty
clear arithmetic errors in the "upper layers" of the patch. However,
I couldn't quite convince myself to commit it, because the business
around estimation for partial histogram-bucket matches still doesn't
make any sense to me. Specifically this:

    /* Partial bucket match. */
    left_divider = inet_hist_match_divider(left, query, opr_codenum);
    right_divider = inet_hist_match_divider(right, query, opr_codenum);

    if (left_divider >= 0 || right_divider >= 0)
        match += 1.0 / pow(2.0, Max(left_divider, right_divider));

Now unless I'm missing something pretty basic about the divider
function, it returns larger numbers for inputs that are "further away"
from each other (ie, have more not-in-common significant address bits).
So the above calculation seems exactly backwards to me: if one endpoint
of a bucket is "close" to the query, or even an exact match, and the
other endpoint is further away, we completely ignore the close/exact
match and assign a bucket match fraction based only on the further-away
endpoint. Isn't that exactly backwards?
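
To make the concern above concrete, here is a minimal standalone sketch of
the arithmetic. It is not code from the patch: partial_match_max is a
hypothetical helper that mirrors the 1.0 / pow(2.0, Max(...)) expression,
using a halving loop instead of pow() so that it needs no math library.
Dividers are assumed to be -1 (could not be calculated) or 0..128.

```c
#include <assert.h>

/*
 * Hypothetical helper, not part of the patch: fraction of a bucket credited
 * when the larger of the two endpoint dividers is used, mirroring
 * 1.0 / pow(2.0, Max(left_divider, right_divider)).
 * A divider of -1 means "could not be calculated".
 */
static double
partial_match_max(int left_divider, int right_divider)
{
	int			divider = (left_divider > right_divider) ?
		left_divider : right_divider;
	double		fraction = 1.0;

	/* Assumed range: -1 for invalid, otherwise 0..128 bits. */
	assert(divider >= -1 && divider <= 128);

	if (divider < 0)
		return 0.0;				/* neither endpoint was usable */

	while (divider-- > 0)
		fraction /= 2.0;		/* same as dividing by 2^divider */

	return fraction;
}
```

With an exact-match endpoint (divider 0) and a far endpoint (divider 8),
partial_match_max(0, 8) gives 1/256, the same as partial_match_max(8, 8).
The close endpoint has no effect on the result.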

I experimented with logic like this:

    if (left_divider >= 0 && right_divider >= 0)
        match += 1.0 / pow(2.0, Min(left_divider, right_divider));
    else if (left_divider >= 0 || right_divider >= 0)
        match += 1.0 / pow(2.0, Max(left_divider, right_divider));

ie, consider the closer endpoint if both are valid. But that didn't seem
to work a whole lot better. I think really we need to consider both
endpoints not just one to the exclusion of the other.
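
The experimental rule can be tried in isolation with a small sketch like
the one below. The names inverse_pow2 and partial_match_closer are
hypothetical and exist only for this illustration; the divider range
(-1 for invalid, otherwise 0..128) is likewise an assumption.

```c
#include <assert.h>

/* Halve 1.0 'divider' times, i.e. compute 1 / 2^divider without libm. */
static double
inverse_pow2(int divider)
{
	double		fraction = 1.0;

	assert(divider >= 0 && divider <= 128);
	while (divider-- > 0)
		fraction /= 2.0;
	return fraction;
}

/*
 * Hypothetical helper mirroring the experimental rule: use the closer
 * endpoint (smaller divider) when both are valid, otherwise the only
 * valid one.  Returns 0.0 when neither divider could be calculated.
 */
static double
partial_match_closer(int left_divider, int right_divider)
{
	if (left_divider >= 0 && right_divider >= 0)
		return inverse_pow2(left_divider < right_divider ?
							left_divider : right_divider);
	if (left_divider >= 0 || right_divider >= 0)
		return inverse_pow2(left_divider > right_divider ?
							left_divider : right_divider);
	return 0.0;
}
```

For dividers 0 and 8 this credits the whole bucket (1.0), where the Max
rule credits 1/256. Both are extremes that discard one endpoint, which is
one way to see why a rule looking at both endpoints seems necessary.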

I'm also not exactly convinced by the divider function itself,
specifically about the decision to fail and return -1 if the masklen
comparison comes out wrong. This effectively causes the masklen to be
the most significant part of the value (after the IP family), which seems
totally wrong. ISTM we ought to consider the number of leading bits in
common as the primary indicator of "how far apart" a query and a
histogram endpoint are.
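
For reference, "number of leading bits in common" can be computed along
the following lines. This is a standalone sketch with a hypothetical name
(common_leading_bits), not the bitncommon() function the patch calls.
Addresses are assumed to be byte arrays in network order, and n is the
number of bits to examine.

```c
#include <assert.h>

/*
 * Hypothetical helper: count how many of the first n bits of two
 * addresses are identical, scanning from the most significant bit.
 */
static int
common_leading_bits(const unsigned char *a, const unsigned char *b, int n)
{
	int			nbits = 0;

	assert(n >= 0);

	while (nbits < n)
	{
		int			byte = nbits / 8;
		int			shift = 7 - (nbits % 8);

		/* Stop at the first bit position where the addresses differ. */
		if (((a[byte] >> shift) & 1) != ((b[byte] >> shift) & 1))
			break;
		nbits++;
	}

	return nbits;
}
```

For example, 192.168.1.0 and 192.168.2.0 examined over 24 bits share 22
leading bits: the first two bytes match, and the third bytes (00000001 and
00000010) agree in their first six bits.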

Even if the above aspects of the code are really completely right, the
comments fail to explain why. I spent a lot of time on the comments,
but so far as these points are concerned they still only explain what
is being done and not why it's a useful calculation to make.

Anyway, attached is my updated version of the patch. (I did commit the
added #define in pg_operator.h, so that the patch can be independent of
that file in future.) I've marked this "waiting on author" in the CF app.

regards, tom lane

Attachments:

inet-selfuncs-v13.patch (text/x-diff; charset=us-ascii)
diff --git a/src/backend/utils/adt/network_selfuncs.c b/src/backend/utils/adt/network_selfuncs.c
index d0d806f..f854847 100644
*** a/src/backend/utils/adt/network_selfuncs.c
--- b/src/backend/utils/adt/network_selfuncs.c
***************
*** 3,9 ****
   * network_selfuncs.c
   *	  Functions for selectivity estimation of inet/cidr operators
   *
!  * Currently these are just stubs, but we hope to do better soon.
   *
   * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group
   * Portions Copyright (c) 1994, Regents of the University of California
--- 3,11 ----
   * network_selfuncs.c
   *	  Functions for selectivity estimation of inet/cidr operators
   *
!  * This module provides estimators for the subnet inclusion and overlap
!  * operators.  Estimates are based on null fraction, most common values,
!  * and histogram of inet/cidr columns.
   *
   * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group
   * Portions Copyright (c) 1994, Regents of the University of California
***************
*** 16,32 ****
   */
  #include "postgres.h"
  
  #include "utils/inet.h"
  
  
  Datum
  networksel(PG_FUNCTION_ARGS)
  {
! 	PG_RETURN_FLOAT8(0.001);
  }
  
  Datum
  networkjoinsel(PG_FUNCTION_ARGS)
  {
! 	PG_RETURN_FLOAT8(0.001);
  }
--- 18,949 ----
   */
  #include "postgres.h"
  
+ #include <math.h>
+ 
+ #include "access/htup_details.h"
+ #include "catalog/pg_operator.h"
+ #include "catalog/pg_statistic.h"
  #include "utils/inet.h"
+ #include "utils/lsyscache.h"
+ #include "utils/selfuncs.h"
+ 
+ 
+ /* Default selectivity for the inet overlap operator */
+ #define DEFAULT_OVERLAP_SEL 0.01
  
+ /* Default selectivity for the various inclusion operators */
+ #define DEFAULT_INCLUSION_SEL 0.005
+ 
+ /* Default selectivity for specified operator */
+ #define DEFAULT_SEL(operator) \
+ 	((operator) == OID_INET_OVERLAP_OP ? \
+ 	 DEFAULT_OVERLAP_SEL : DEFAULT_INCLUSION_SEL)
+ 
+ static Selectivity networkjoinsel_inner(Oid operator,
+ 					 VariableStatData *vardata1, VariableStatData *vardata2);
+ static Selectivity networkjoinsel_semi(Oid operator,
+ 					VariableStatData *vardata1, VariableStatData *vardata2);
+ static Selectivity mcv_population(float4 *mcv_numbers, int mcv_nvalues);
+ static Selectivity inet_hist_value_sel(Datum *values, int nvalues,
+ 					Datum constvalue, int opr_codenum);
+ static Selectivity inet_mcv_join_sel(Datum *mcv1_values,
+ 				  float4 *mcv1_numbers, int mcv1_nvalues, Datum *mcv2_values,
+ 				  float4 *mcv2_numbers, int mcv2_nvalues, Oid operator);
+ static Selectivity inet_mcv_hist_sel(Datum *mcv_values, float4 *mcv_numbers,
+ 				  int mcv_nvalues, Datum *hist_values, int hist_nvalues,
+ 				  int opr_codenum);
+ static Selectivity inet_hist_inclusion_join_sel(Datum *hist1_values,
+ 							 int hist1_nvalues,
+ 							 Datum *hist2_values, int hist2_nvalues,
+ 							 int opr_codenum);
+ static Selectivity inet_semi_join_sel(Datum lhs_value,
+ 				   bool mcv_exists, Datum *mcv_values, int mcv_nvalues,
+ 				   bool hist_exists, Datum *hist_values, int hist_nvalues,
+ 				   double hist_weight,
+ 				   FmgrInfo *proc, int opr_codenum);
+ static int	inet_opr_codenum(Oid operator);
+ static int	inet_inclusion_cmp(inet *left, inet *right, int opr_codenum);
+ static int inet_masklen_inclusion_cmp(inet *left, inet *right,
+ 						   int opr_codenum);
+ static int inet_hist_match_divider(inet *boundary, inet *query,
+ 						int opr_codenum);
  
+ /*
+  * Selectivity estimation for the subnet inclusion/overlap operators
+  */
  Datum
  networksel(PG_FUNCTION_ARGS)
  {
! 	PlannerInfo *root = (PlannerInfo *) PG_GETARG_POINTER(0);
! 	Oid			operator = PG_GETARG_OID(1);
! 	List	   *args = (List *) PG_GETARG_POINTER(2);
! 	int			varRelid = PG_GETARG_INT32(3);
! 	VariableStatData vardata;
! 	Node	   *other;
! 	bool		varonleft;
! 	Selectivity selec,
! 				mcv_selec,
! 				non_mcv_selec;
! 	Datum		constvalue,
! 			   *hist_values;
! 	int			hist_nvalues;
! 	Form_pg_statistic stats;
! 	double		sumcommon,
! 				nullfrac;
! 	FmgrInfo	proc;
! 
! 	/*
! 	 * If expression is not (variable op something) or (something op
! 	 * variable), then punt and return a default estimate.
! 	 */
! 	if (!get_restriction_variable(root, args, varRelid,
! 								  &vardata, &other, &varonleft))
! 		PG_RETURN_FLOAT8(DEFAULT_SEL(operator));
! 
! 	/*
! 	 * Can't do anything useful if the something is not a constant, either.
! 	 */
! 	if (!IsA(other, Const))
! 	{
! 		ReleaseVariableStats(vardata);
! 		PG_RETURN_FLOAT8(DEFAULT_SEL(operator));
! 	}
! 
! 	/* All of the operators handled here are strict. */
! 	if (((Const *) other)->constisnull)
! 	{
! 		ReleaseVariableStats(vardata);
! 		PG_RETURN_FLOAT8(0.0);
! 	}
! 	constvalue = ((Const *) other)->constvalue;
! 
! 	/* Otherwise, we need stats in order to produce a non-default estimate. */
! 	if (!HeapTupleIsValid(vardata.statsTuple))
! 	{
! 		ReleaseVariableStats(vardata);
! 		PG_RETURN_FLOAT8(DEFAULT_SEL(operator));
! 	}
! 
! 	stats = (Form_pg_statistic) GETSTRUCT(vardata.statsTuple);
! 	nullfrac = stats->stanullfrac;
! 
! 	/*
! 	 * If we have most-common-values info, add up the fractions of the MCV
! 	 * entries that satisfy MCV OP CONST.  These fractions contribute directly
! 	 * to the result selectivity.  Also add up the total fraction represented
! 	 * by MCV entries.
! 	 */
! 	fmgr_info(get_opcode(operator), &proc);
! 	mcv_selec = mcv_selectivity(&vardata, &proc, constvalue, varonleft,
! 								&sumcommon);
! 
! 	/*
! 	 * If we have a histogram, use it to estimate the proportion of the
! 	 * non-MCV population that satisfies the clause.  If we don't, apply the
! 	 * default selectivity to that population.
! 	 */
! 	if (get_attstatsslot(vardata.statsTuple,
! 						 vardata.atttype, vardata.atttypmod,
! 						 STATISTIC_KIND_HISTOGRAM, InvalidOid,
! 						 NULL,
! 						 &hist_values, &hist_nvalues,
! 						 NULL, NULL))
! 	{
! 		int			opr_codenum = inet_opr_codenum(operator);
! 
! 		/* Commute if needed, so we can consider histogram to be on the left */
! 		if (!varonleft)
! 			opr_codenum = -opr_codenum;
! 		non_mcv_selec = inet_hist_value_sel(hist_values, hist_nvalues,
! 											constvalue, opr_codenum);
! 
! 		free_attstatsslot(vardata.atttype, hist_values, hist_nvalues, NULL, 0);
! 	}
! 	else
! 		non_mcv_selec = DEFAULT_SEL(operator);
! 
! 	/* Combine selectivities for MCV and non-MCV populations */
! 	selec = mcv_selec + (1.0 - nullfrac - sumcommon) * non_mcv_selec;
! 
! 	/* Result should be in range, but make sure... */
! 	CLAMP_PROBABILITY(selec);
! 
! 	ReleaseVariableStats(vardata);
! 
! 	PG_RETURN_FLOAT8(selec);
  }
  
+ /*
+  * Join selectivity estimation for the subnet inclusion/overlap operators
+  *
+  * This function has the same structure as eqjoinsel() in selfuncs.c.
+  */
  Datum
  networkjoinsel(PG_FUNCTION_ARGS)
  {
! 	PlannerInfo *root = (PlannerInfo *) PG_GETARG_POINTER(0);
! 	Oid			operator = PG_GETARG_OID(1);
! 	List	   *args = (List *) PG_GETARG_POINTER(2);
! #ifdef NOT_USED
! 	JoinType	jointype = (JoinType) PG_GETARG_INT16(3);
! #endif
! 	SpecialJoinInfo *sjinfo = (SpecialJoinInfo *) PG_GETARG_POINTER(4);
! 	double		selec;
! 	VariableStatData vardata1;
! 	VariableStatData vardata2;
! 	bool		join_is_reversed;
! 
! 	get_join_variables(root, args, sjinfo,
! 					   &vardata1, &vardata2, &join_is_reversed);
! 
! 	switch (sjinfo->jointype)
! 	{
! 		case JOIN_INNER:
! 		case JOIN_LEFT:
! 		case JOIN_FULL:
! 
! 			/*
! 			 * Selectivity for left/full join is not exactly the same as inner
! 			 * join, but we neglect the difference, as eqjoinsel does.
! 			 */
! 			selec = networkjoinsel_inner(operator, &vardata1, &vardata2);
! 			break;
! 		case JOIN_SEMI:
! 		case JOIN_ANTI:
! 			/* Here, it's important that we pass the outer var on the left. */
! 			if (!join_is_reversed)
! 				selec = networkjoinsel_semi(operator, &vardata1, &vardata2);
! 			else
! 				selec = networkjoinsel_semi(get_commutator(operator),
! 											&vardata2, &vardata1);
! 			break;
! 		default:
! 			/* other values not expected here */
! 			elog(ERROR, "unrecognized join type: %d",
! 				 (int) sjinfo->jointype);
! 			selec = 0;			/* keep compiler quiet */
! 			break;
! 	}
! 
! 	ReleaseVariableStats(vardata1);
! 	ReleaseVariableStats(vardata2);
! 
! 	CLAMP_PROBABILITY(selec);
! 
! 	PG_RETURN_FLOAT8((float8) selec);
! }
! 
! /*
!  * Inner join selectivity estimation for subnet inclusion/overlap operators
!  *
!  * Calculates MCV vs MCV, MCV vs histogram and histogram vs histogram
!  * selectivity for join using the subnet inclusion operators.  Unlike the
!  * join selectivity function for the equality operator, eqjoinsel_inner(),
!  * one to one matching of the values is not enough.  Network inclusion
!  * operators are likely to match many to many, so we must check all pairs.
!  * (Note: it might be possible to exploit understanding of the histogram's
!  * btree ordering to reduce the work needed, but we don't currently try.)
!  * Also, MCV vs histogram selectivity is not neglected as in eqjoinsel_inner().
!  */
! static Selectivity
! networkjoinsel_inner(Oid operator,
! 					 VariableStatData *vardata1, VariableStatData *vardata2)
! {
! 	Form_pg_statistic stats;
! 	double		nullfrac1 = 0.0,
! 				nullfrac2 = 0.0;
! 	Selectivity selec = 0.0,
! 				sumcommon1 = 0.0,
! 				sumcommon2 = 0.0;
! 	bool		mcv1_exists = false,
! 				mcv2_exists = false,
! 				hist1_exists = false,
! 				hist2_exists = false;
! 	int			opr_codenum;
! 	int			mcv1_nvalues,
! 				mcv2_nvalues,
! 				mcv1_nnumbers,
! 				mcv2_nnumbers,
! 				hist1_nvalues,
! 				hist2_nvalues;
! 	Datum	   *mcv1_values,
! 			   *mcv2_values,
! 			   *hist1_values,
! 			   *hist2_values;
! 	float4	   *mcv1_numbers,
! 			   *mcv2_numbers;
! 
! 	if (HeapTupleIsValid(vardata1->statsTuple))
! 	{
! 		stats = (Form_pg_statistic) GETSTRUCT(vardata1->statsTuple);
! 		nullfrac1 = stats->stanullfrac;
! 
! 		mcv1_exists = get_attstatsslot(vardata1->statsTuple,
! 									   vardata1->atttype, vardata1->atttypmod,
! 									   STATISTIC_KIND_MCV, InvalidOid,
! 									   NULL,
! 									   &mcv1_values, &mcv1_nvalues,
! 									   &mcv1_numbers, &mcv1_nnumbers);
! 		hist1_exists = get_attstatsslot(vardata1->statsTuple,
! 									  vardata1->atttype, vardata1->atttypmod,
! 										STATISTIC_KIND_HISTOGRAM, InvalidOid,
! 										NULL,
! 										&hist1_values, &hist1_nvalues,
! 										NULL, NULL);
! 		if (mcv1_exists)
! 			sumcommon1 = mcv_population(mcv1_numbers, mcv1_nnumbers);
! 	}
! 
! 	if (HeapTupleIsValid(vardata2->statsTuple))
! 	{
! 		stats = (Form_pg_statistic) GETSTRUCT(vardata2->statsTuple);
! 		nullfrac2 = stats->stanullfrac;
! 
! 		mcv2_exists = get_attstatsslot(vardata2->statsTuple,
! 									   vardata2->atttype, vardata2->atttypmod,
! 									   STATISTIC_KIND_MCV, InvalidOid,
! 									   NULL,
! 									   &mcv2_values, &mcv2_nvalues,
! 									   &mcv2_numbers, &mcv2_nnumbers);
! 		hist2_exists = get_attstatsslot(vardata2->statsTuple,
! 									  vardata2->atttype, vardata2->atttypmod,
! 										STATISTIC_KIND_HISTOGRAM, InvalidOid,
! 										NULL,
! 										&hist2_values, &hist2_nvalues,
! 										NULL, NULL);
! 		if (mcv2_exists)
! 			sumcommon2 = mcv_population(mcv2_numbers, mcv2_nnumbers);
! 	}
! 
! 	opr_codenum = inet_opr_codenum(operator);
! 
! 	/*
! 	 * Calculate selectivity for MCV vs MCV matches.
! 	 */
! 	if (mcv1_exists && mcv2_exists)
! 		selec += inet_mcv_join_sel(mcv1_values, mcv1_numbers, mcv1_nvalues,
! 								   mcv2_values, mcv2_numbers, mcv2_nvalues,
! 								   operator);
! 
! 	/*
! 	 * Add in selectivities for MCV vs histogram matches, scaling according to
! 	 * the fractions of the populations represented by the histograms. Note
! 	 * that the second case needs to commute the operator.
! 	 */
! 	if (mcv1_exists && hist2_exists)
! 		selec += (1.0 - nullfrac2 - sumcommon2) *
! 			inet_mcv_hist_sel(mcv1_values, mcv1_numbers, mcv1_nvalues,
! 							  hist2_values, hist2_nvalues,
! 							  opr_codenum);
! 	if (mcv2_exists && hist1_exists)
! 		selec += (1.0 - nullfrac1 - sumcommon1) *
! 			inet_mcv_hist_sel(mcv2_values, mcv2_numbers, mcv2_nvalues,
! 							  hist1_values, hist1_nvalues,
! 							  -opr_codenum);
! 
! 	/*
! 	 * Add in selectivity for histogram vs histogram matches, again scaling
! 	 * appropriately.
! 	 */
! 	if (hist1_exists && hist2_exists)
! 		selec += (1.0 - nullfrac1 - sumcommon1) *
! 			(1.0 - nullfrac2 - sumcommon2) *
! 			inet_hist_inclusion_join_sel(hist1_values, hist1_nvalues,
! 										 hist2_values, hist2_nvalues,
! 										 opr_codenum);
! 
! 	/*
! 	 * If useful statistics are not available then use the default estimate.
! 	 * We can apply null fractions if known, though.
! 	 */
! 	if ((!mcv1_exists && !hist1_exists) || (!mcv2_exists && !hist2_exists))
! 		selec = (1.0 - nullfrac1) * (1.0 - nullfrac2) * DEFAULT_SEL(operator);
! 
! 	/* Release stats. */
! 	if (mcv1_exists)
! 		free_attstatsslot(vardata1->atttype, mcv1_values, mcv1_nvalues,
! 						  mcv1_numbers, mcv1_nnumbers);
! 	if (mcv2_exists)
! 		free_attstatsslot(vardata2->atttype, mcv2_values, mcv2_nvalues,
! 						  mcv2_numbers, mcv2_nnumbers);
! 	if (hist1_exists)
! 		free_attstatsslot(vardata1->atttype, hist1_values, hist1_nvalues,
! 						  NULL, 0);
! 	if (hist2_exists)
! 		free_attstatsslot(vardata2->atttype, hist2_values, hist2_nvalues,
! 						  NULL, 0);
! 
! 	return selec;
! }
! 
! /*
!  * Semi join selectivity estimation for subnet inclusion/overlap operators
!  *
!  * Calculates MCV vs MCV, MCV vs histogram, histogram vs MCV, and histogram vs
!  * histogram selectivity for semi/anti join cases.
!  */
! static Selectivity
! networkjoinsel_semi(Oid operator,
! 					VariableStatData *vardata1, VariableStatData *vardata2)
! {
! 	Form_pg_statistic stats;
! 	Selectivity selec = 0.0,
! 				sumcommon1 = 0.0,
! 				sumcommon2 = 0.0;
! 	double		nullfrac1 = 0.0,
! 				nullfrac2 = 0.0,
! 				hist2_weight = 0.0;
! 	bool		mcv1_exists = false,
! 				mcv2_exists = false,
! 				hist1_exists = false,
! 				hist2_exists = false;
! 	int			opr_codenum;
! 	FmgrInfo	proc;
! 	int			i,
! 				mcv1_nvalues,
! 				mcv2_nvalues,
! 				mcv1_nnumbers,
! 				mcv2_nnumbers,
! 				hist1_nvalues,
! 				hist2_nvalues;
! 	Datum	   *mcv1_values,
! 			   *mcv2_values,
! 			   *hist1_values,
! 			   *hist2_values;
! 	float4	   *mcv1_numbers,
! 			   *mcv2_numbers;
! 
! 	if (HeapTupleIsValid(vardata1->statsTuple))
! 	{
! 		stats = (Form_pg_statistic) GETSTRUCT(vardata1->statsTuple);
! 		nullfrac1 = stats->stanullfrac;
! 
! 		mcv1_exists = get_attstatsslot(vardata1->statsTuple,
! 									   vardata1->atttype, vardata1->atttypmod,
! 									   STATISTIC_KIND_MCV, InvalidOid,
! 									   NULL,
! 									   &mcv1_values, &mcv1_nvalues,
! 									   &mcv1_numbers, &mcv1_nnumbers);
! 		hist1_exists = get_attstatsslot(vardata1->statsTuple,
! 									  vardata1->atttype, vardata1->atttypmod,
! 										STATISTIC_KIND_HISTOGRAM, InvalidOid,
! 										NULL,
! 										&hist1_values, &hist1_nvalues,
! 										NULL, NULL);
! 		if (mcv1_exists)
! 			sumcommon1 = mcv_population(mcv1_numbers, mcv1_nnumbers);
! 	}
! 
! 	if (HeapTupleIsValid(vardata2->statsTuple))
! 	{
! 		stats = (Form_pg_statistic) GETSTRUCT(vardata2->statsTuple);
! 		nullfrac2 = stats->stanullfrac;
! 
! 		mcv2_exists = get_attstatsslot(vardata2->statsTuple,
! 									   vardata2->atttype, vardata2->atttypmod,
! 									   STATISTIC_KIND_MCV, InvalidOid,
! 									   NULL,
! 									   &mcv2_values, &mcv2_nvalues,
! 									   &mcv2_numbers, &mcv2_nnumbers);
! 		hist2_exists = get_attstatsslot(vardata2->statsTuple,
! 									  vardata2->atttype, vardata2->atttypmod,
! 										STATISTIC_KIND_HISTOGRAM, InvalidOid,
! 										NULL,
! 										&hist2_values, &hist2_nvalues,
! 										NULL, NULL);
! 		if (mcv2_exists)
! 			sumcommon2 = mcv_population(mcv2_numbers, mcv2_nnumbers);
! 	}
! 
! 	opr_codenum = inet_opr_codenum(operator);
! 	fmgr_info(get_opcode(operator), &proc);
! 
! 	/* Estimate number of input rows represented by RHS histogram. */
! 	if (hist2_exists && vardata2->rel)
! 		hist2_weight = (1.0 - nullfrac2 - sumcommon2) * vardata2->rel->rows;
! 
! 	/*
! 	 * Consider each element of the LHS MCV list, matching it to whatever RHS
! 	 * stats we have.  Scale according to the known frequency of the MCV.
! 	 */
! 	if (mcv1_exists && (mcv2_exists || hist2_exists))
! 	{
! 		for (i = 0; i < mcv1_nvalues; i++)
! 		{
! 			selec += mcv1_numbers[i] *
! 				inet_semi_join_sel(mcv1_values[i],
! 								   mcv2_exists, mcv2_values, mcv2_nvalues,
! 								   hist2_exists, hist2_values, hist2_nvalues,
! 								   hist2_weight,
! 								   &proc, opr_codenum);
! 		}
! 	}
! 
! 	/*
! 	 * Consider each element of the LHS histogram, except for the first and
! 	 * last elements, which we exclude on the grounds that they're outliers
! 	 * and thus not very representative.  Scale on the assumption that each
! 	 * such histogram element represents an equal share of the LHS histogram
! 	 * population (which is a bit bogus, because the members of its bucket may
! 	 * not all act the same with respect to the join clause, but it's hard to
! 	 * do better).
! 	 */
! 	if (hist1_exists && hist1_nvalues > 2 && (mcv2_exists || hist2_exists))
! 	{
! 		double		hist_selec_sum = 0.0;
! 
! 		for (i = 1; i < hist1_nvalues - 1; i++)
! 		{
! 			hist_selec_sum +=
! 				inet_semi_join_sel(hist1_values[i],
! 								   mcv2_exists, mcv2_values, mcv2_nvalues,
! 								   hist2_exists, hist2_values, hist2_nvalues,
! 								   hist2_weight,
! 								   &proc, opr_codenum);
! 		}
! 
! 		selec += (1.0 - nullfrac1 - sumcommon1) *
! 			hist_selec_sum / (hist1_nvalues - 2);
! 	}
! 
! 	/*
! 	 * If useful statistics are not available then use the default estimate.
! 	 * We can apply null fractions if known, though.
! 	 */
! 	if ((!mcv1_exists && !hist1_exists) || (!mcv2_exists && !hist2_exists))
! 		selec = (1.0 - nullfrac1) * (1.0 - nullfrac2) * DEFAULT_SEL(operator);
! 
! 	/* Release stats. */
! 	if (mcv1_exists)
! 		free_attstatsslot(vardata1->atttype, mcv1_values, mcv1_nvalues,
! 						  mcv1_numbers, mcv1_nnumbers);
! 	if (mcv2_exists)
! 		free_attstatsslot(vardata2->atttype, mcv2_values, mcv2_nvalues,
! 						  mcv2_numbers, mcv2_nnumbers);
! 	if (hist1_exists)
! 		free_attstatsslot(vardata1->atttype, hist1_values, hist1_nvalues,
! 						  NULL, 0);
! 	if (hist2_exists)
! 		free_attstatsslot(vardata2->atttype, hist2_values, hist2_nvalues,
! 						  NULL, 0);
! 
! 	return selec;
! }
! 
! /*
!  * Compute the fraction of a relation's population that is represented
!  * by the MCV list.
!  */
! static Selectivity
! mcv_population(float4 *mcv_numbers, int mcv_nvalues)
! {
! 	Selectivity sumcommon = 0.0;
! 	int			i;
! 
! 	for (i = 0; i < mcv_nvalues; i++)
! 	{
! 		sumcommon += mcv_numbers[i];
! 	}
! 
! 	return sumcommon;
! }
! 
! /*
!  * Inet histogram vs single value selectivity estimation
!  *
!  * Estimate the fraction of the histogram population that satisfies
!  * "value OPR CONST".  (The result needs to be scaled to reflect the
!  * proportion of the total population represented by the histogram.)
!  *
!  * The histogram is originally for the inet btree comparison operators.
!  * Only the common bits of the network part and the length of the network part
!  * (masklen) are interesting for the subnet inclusion operators.  Fortunately,
!  * btree comparison treats the network part as the major sort key.  Even so,
!  * the length of the network part would not really be significant in the
!  * histogram.  This would lead to big mistakes for data sets with uneven
!  * masklen distribution.  To reduce this problem, comparisons with the left
!  * and the right sides of the buckets are used together.
!  *
!  * Histogram bucket matches are calculated in two forms.  If the constant
!  * matches both bucket endpoints the bucket is considered as fully matched.
!  * The second form is to match the bucket partially; we recognize this when
!  * the constant matches just one endpoint, or the two endpoints fall on
!  * opposite sides of the constant.  (Note that when the constant matches an
!  * interior histogram element, it gets credit for partial matches to the
!  * buckets on both sides, while a match to a histogram endpoint gets credit
!  * for only one partial match.  This is desirable.)
!  *
!  * For a partial match, we try to calculate dividers for both of the
!  * boundaries.  If the address family of a boundary value does not match the
!  * constant or comparison of the length of the network parts is not correct
!  * for the operator, the divider for that boundary will not be taken into
!  * account.  If both of the dividers are valid, the greater one will be used
!  * to minimize the mistake in buckets that have disparate masklens.
!  *
!  * The divider in the partial bucket match is imagined as the distance
!  * between the decisive bits and the common bits of the addresses.  It will
!  * be used as a power of two as it is the natural scale for the IP network
!  * inclusion.  This partial bucket match divider calculation is an empirical
!  * formula and subject to change with more experiment.
!  *
!  * For partial match in buckets that have different address families on the
!  * left and right sides, only the boundary with the same address family is
!  * taken into consideration.  This can cause more mistakes for these buckets
!  * if the masklens of their boundaries are also disparate.  But this can only
!  * happen in one bucket, since only two address families exist.  It seems a
!  * better option than not considering these buckets at all.
!  */
! static Selectivity
! inet_hist_value_sel(Datum *values, int nvalues, Datum constvalue,
! 					int opr_codenum)
! {
! 	Selectivity match = 0.0;
! 	inet	   *query,
! 			   *left,
! 			   *right;
! 	int			i;
! 	int			left_order,
! 				right_order,
! 				left_divider,
! 				right_divider;
! 
! 	/* guard against zero-divide below */
! 	if (nvalues <= 1)
! 		return 0.0;
! 
! 	query = DatumGetInetPP(constvalue);
! 
! 	/* "left" is the left boundary value of the current bucket ... */
! 	left = DatumGetInetPP(values[0]);
! 	left_order = inet_inclusion_cmp(left, query, opr_codenum);
! 
! 	for (i = 1; i < nvalues; i++)
! 	{
! 		/* ... and "right" is the right boundary value */
! 		right = DatumGetInetPP(values[i]);
! 		right_order = inet_inclusion_cmp(right, query, opr_codenum);
! 
! 		if (left_order == 0 && right_order == 0)
! 		{
! 			/* The whole bucket matches, since both endpoints do. */
! 			match += 1.0;
! 		}
! 		else if ((left_order <= 0 && right_order >= 0) ||
! 				 (left_order >= 0 && right_order <= 0))
! 		{
! 			/* Partial bucket match. */
! 			left_divider = inet_hist_match_divider(left, query, opr_codenum);
! 			right_divider = inet_hist_match_divider(right, query, opr_codenum);
! 
! 			if (left_divider >= 0 || right_divider >= 0)
! 				match += 1.0 / pow(2.0, Max(left_divider, right_divider));
! 		}
! 
! 		/* Shift the variables. */
! 		left = right;
! 		left_order = right_order;
! 	}
! 
! 	/* There are nvalues - 1 buckets. */
! 	return match / (nvalues - 1);
! }
! 
! /*
!  * Inet MCV vs MCV join selectivity estimation
!  *
!  * We simply add up the fractions of the populations that satisfy the clause.
!  * The result is exact and does not need to be scaled further.
!  */
! static Selectivity
! inet_mcv_join_sel(Datum *mcv1_values, float4 *mcv1_numbers, int mcv1_nvalues,
! 				  Datum *mcv2_values, float4 *mcv2_numbers, int mcv2_nvalues,
! 				  Oid operator)
! {
! 	Selectivity selec = 0.0;
! 	FmgrInfo	proc;
! 	int			i,
! 				j;
! 
! 	fmgr_info(get_opcode(operator), &proc);
! 
! 	for (i = 0; i < mcv1_nvalues; i++)
! 	{
! 		for (j = 0; j < mcv2_nvalues; j++)
! 			if (DatumGetBool(FunctionCall2(&proc,
! 										   mcv1_values[i],
! 										   mcv2_values[j])))
! 				selec += mcv1_numbers[i] * mcv2_numbers[j];
! 	}
! 	return selec;
! }
! 
! /*
!  * Inet MCV vs histogram join selectivity estimation
!  *
!  * For each MCV on the lefthand side, estimate the fraction of the righthand's
!  * histogram population that satisfies the join clause, and add those up,
!  * scaling by the MCV's frequency.  The result still needs to be scaled
!  * according to the fraction of the righthand's population represented by
!  * the histogram.
!  */
! static Selectivity
! inet_mcv_hist_sel(Datum *mcv_values, float4 *mcv_numbers, int mcv_nvalues,
! 				  Datum *hist_values, int hist_nvalues,
! 				  int opr_codenum)
! {
! 	Selectivity selec = 0.0;
! 	int			i;
! 
! 	/*
! 	 * We'll call inet_hist_value_sel with the histogram on the left, so we
! 	 * must commute the operator.
! 	 */
! 	opr_codenum = -opr_codenum;
! 
! 	for (i = 0; i < mcv_nvalues; i++)
! 	{
! 		selec += mcv_numbers[i] *
! 			inet_hist_value_sel(hist_values, hist_nvalues, mcv_values[i],
! 								opr_codenum);
! 	}
! 	return selec;
! }
! 
! /*
!  * Inet histogram vs histogram join selectivity estimation
!  *
!  * Here, we take all values listed in the second histogram (except for the
!  * first and last elements, which are excluded on the grounds of possibly
!  * not being very representative) and treat them as a uniform sample of
!  * the non-MCV population for that relation.  For each one, we apply
!  * inet_hist_value_sel to see what fraction of the first histogram
!  * it matches.
!  *
!  * We could alternatively do this the other way around using the operator's
!  * commutator.  XXX would it be worthwhile to do it both ways and take the
!  * average?  That would at least avoid non-commutative estimation results.
!  */
! static Selectivity
! inet_hist_inclusion_join_sel(Datum *hist1_values, int hist1_nvalues,
! 							 Datum *hist2_values, int hist2_nvalues,
! 							 int opr_codenum)
! {
! 	float		match = 0.0;
! 	int			i;
! 
! 	if (hist2_nvalues <= 2)
! 		return 0.0;				/* no interior histogram elements */
! 
! 	for (i = 1; i < hist2_nvalues - 1; i++)
! 		match += inet_hist_value_sel(hist1_values, hist1_nvalues,
! 									 hist2_values[i], opr_codenum);
! 
! 	return match / (hist2_nvalues - 2);
! }
! 
! /*
!  * Inet semi join selectivity estimation for one value
!  *
!  * The function calculates the probability that there is at least one row
!  * in the RHS table that satisfies the "lhs_value op column" condition.
!  * It is used in semi join estimation to check a sample from the left hand
!  * side table.
!  *
!  * The MCV and histogram from the right hand side table should be provided as
!  * arguments with the lhs_value from the left hand side table for the join.
!  * hist_weight is the total number of rows represented by the histogram.
!  * For example, if the table has 1000 rows, and 10% of the rows are in the MCV
!  * list, and another 10% are NULLs, hist_weight would be 800.
!  *
!  * First, the lhs_value will be matched to the most common values.  If it
!  * matches any of them, 1.0 will be returned, because then there is surely
!  * a match.
!  *
!  * Otherwise, the histogram will be used to estimate the number of rows in
!  * the second table that match the condition.  If the estimate is greater
!  * than 1.0, 1.0 will be returned, because it means there is a greater chance
!  * that the lhs_value will match more than one row in the table.  If it is
!  * between 0.0 and 1.0, it will be returned as the probability.
!  */
! static Selectivity
! inet_semi_join_sel(Datum lhs_value,
! 				   bool mcv_exists, Datum *mcv_values, int mcv_nvalues,
! 				   bool hist_exists, Datum *hist_values, int hist_nvalues,
! 				   double hist_weight,
! 				   FmgrInfo *proc, int opr_codenum)
! {
! 	if (mcv_exists)
! 	{
! 		int			i;
! 
! 		for (i = 0; i < mcv_nvalues; i++)
! 		{
! 			if (DatumGetBool(FunctionCall2(proc,
! 										   lhs_value,
! 										   mcv_values[i])))
! 				return 1.0;
! 		}
! 	}
! 
! 	if (hist_exists && hist_weight > 0)
! 	{
! 		Selectivity hist_selec;
! 
! 		/* Commute operator, since we're passing lhs_value on the right */
! 		hist_selec = inet_hist_value_sel(hist_values, hist_nvalues,
! 										 lhs_value, -opr_codenum);
! 
! 		if (hist_selec > 0)
! 			return Min(1.0, hist_weight * hist_selec);
! 	}
! 
! 	return 0.0;
! }
! 
! /*
!  * Assign useful code numbers for the subnet inclusion/overlap operators
!  *
!  * Only inet_masklen_inclusion_cmp() and inet_hist_match_divider() depend
!  * on the exact codes assigned here; but many other places in this file
!  * know that they can negate a code to obtain the code for the commutator
!  * operator.
!  */
! static int
! inet_opr_codenum(Oid operator)
! {
! 	switch (operator)
! 	{
! 		case OID_INET_SUP_OP:
! 			return -2;
! 		case OID_INET_SUPEQ_OP:
! 			return -1;
! 		case OID_INET_OVERLAP_OP:
! 			return 0;
! 		case OID_INET_SUBEQ_OP:
! 			return 1;
! 		case OID_INET_SUB_OP:
! 			return 2;
! 		default:
! 			elog(ERROR, "unrecognized operator %u for inet selectivity",
! 				 operator);
! 	}
! 	return 0;					/* unreached, but keep compiler quiet */
! }
! 
! /*
!  * Comparison function for the subnet inclusion/overlap operators
!  *
!  * If the comparison is okay for the specified inclusion operator, the return
!  * value will be 0.  Otherwise the return value will be less than or greater
!  * than 0 as appropriate for the operator.
!  *
!  * Comparison is compatible with the basic comparison function for the inet
!  * type.  See network_cmp_internal() in network.c for the original.  Basic
!  * comparison operators are implemented with the network_cmp_internal
!  * function.  It is possible to implement the subnet inclusion operators with
!  * this function.
!  *
!  * Comparison is first on the common bits of the network part, then on the
!  * length of the network part (masklen) as in the network_cmp_internal()
!  * function.  Only the first part is in this function.  The second part is
!  * separated to another function for reusability.  The difference between the
!  * second part and the original network_cmp_internal() is that the inclusion
!  * operator is considered while comparing the lengths of the network parts.
!  * See the inet_masklen_inclusion_cmp() function below.
!  */
! static int
! inet_inclusion_cmp(inet *left, inet *right, int opr_codenum)
! {
! 	if (ip_family(left) == ip_family(right))
! 	{
! 		int			order;
! 
! 		order = bitncmp(ip_addr(left), ip_addr(right),
! 						Min(ip_bits(left), ip_bits(right)));
! 
! 		if (order != 0)
! 			return order;
! 
! 		return inet_masklen_inclusion_cmp(left, right, opr_codenum);
! 	}
! 
! 	return ip_family(left) - ip_family(right);
! }
! 
! /*
!  * Masklen comparison function for the subnet inclusion/overlap operators
!  *
!  * Compares the lengths of the network parts of the inputs.  If the comparison
!  * is okay for the specified inclusion operator, the return value will be 0.
!  * Otherwise the return value will be less than or greater than 0 as
!  * appropriate for the operator.
!  */
! static int
! inet_masklen_inclusion_cmp(inet *left, inet *right, int opr_codenum)
! {
! 	int			order;
! 
! 	order = (int) ip_bits(left) - (int) ip_bits(right);
! 
! 	/*
! 	 * Return 0 if the operator would accept this combination of masklens.
! 	 * Note that opr_codenum zero (overlaps) will accept all cases.
! 	 */
! 	if ((order > 0 && opr_codenum >= 0) ||
! 		(order == 0 && opr_codenum >= -1 && opr_codenum <= 1) ||
! 		(order < 0 && opr_codenum <= 0))
! 		return 0;
! 
! 	/*
! 	 * Otherwise, return a negative value for sup/supeq (notionally, the RHS
! 	 * needs to have a larger masklen than it has, which would make it sort
! 	 * later), or a positive value for sub/subeq (vice versa).
! 	 */
! 	return opr_codenum;
! }
! 
! /*
!  * Inet histogram partial match divider calculation
!  *
!  * First the families and the lengths of the network parts are compared using
!  * the subnet inclusion operator.  If those are acceptable for the operator,
!  * the divider will be calculated using the masklens and the common bits of
!  * the addresses.  -1 will be returned if it cannot be calculated.
!  *
!  * See commentary for inet_hist_value_sel() for some rationale for this.
!  */
! static int
! inet_hist_match_divider(inet *boundary, inet *query, int opr_codenum)
! {
! 	if (ip_family(boundary) == ip_family(query) &&
! 		inet_masklen_inclusion_cmp(boundary, query, opr_codenum) == 0)
! 	{
! 		int			min_bits,
! 					decisive_bits;
! 
! 		min_bits = Min(ip_bits(boundary), ip_bits(query));
! 
! 		/*
! 		 * Set decisive_bits to the masklen of the one that should contain the
! 		 * other according to the operator.
! 		 */
! 		if (opr_codenum < 0)
! 			decisive_bits = ip_bits(boundary);
! 		else if (opr_codenum > 0)
! 			decisive_bits = ip_bits(query);
! 		else
! 			decisive_bits = min_bits;
! 
! 		/*
! 		 * Now return the number of non-common decisive bits.  (This will be
! 		 * zero if the boundary and query in fact match, else positive.)
! 		 */
! 		if (min_bits > 0)
! 			return decisive_bits - bitncommon(ip_addr(boundary),
! 											  ip_addr(query),
! 											  min_bits);
! 		return decisive_bits;
! 	}
! 
! 	return -1;
  }
#20 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#19)
Re: Selectivity estimation for inet operators

I wrote:

I spent a fair chunk of the weekend hacking on this patch to make
it more understandable and fix up a lot of what seemed to me pretty
clear arithmetic errors in the "upper layers" of the patch. However,
I couldn't quite convince myself to commit it, because the business
around estimation for partial histogram-bucket matches still doesn't
make any sense to me.

Actually, there's a second large problem with this patch: blindly
iterating through all combinations of MCV and histogram entries makes the
runtime O(N^2) in the statistics target. I made up some test data (by
scanning my mail logs) and observed the following planning times, as
reported by EXPLAIN ANALYZE:

explain analyze select * from relays r1, relays r2 where r1.ip = r2.ip;
explain analyze select * from relays r1, relays r2 where r1.ip && r2.ip;

stats target    eqjoinsel    networkjoinsel

         100      0.27 ms           1.85 ms
        1000      2.56 ms          167.2 ms
       10000      56.6 ms        13987.1 ms

I don't think it's necessary for network selectivity to be quite as
fast as eqjoinsel, but I doubt we can tolerate 14 seconds planning
time for a query that might need just milliseconds to execute :-(
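
As a rough check of the O(N^2) claim, the ratios between adjacent rows of
the timings above can be computed directly. This is only illustrative
arithmetic on the reported numbers; the helper name slowdown() is made up
for this example and is not part of the patch.

```c
#include <assert.h>

/*
 * Hypothetical helper: how many times slower the run with the larger
 * statistics target is, given two planning times in milliseconds.
 */
double
slowdown(double small_ms, double large_ms)
{
    assert(small_ms > 0.0);
    return large_ms / small_ms;
}
```

For networkjoinsel, slowdown(1.85, 167.2) is about 90 and
slowdown(167.2, 13987.1) is about 84, which is close to the factor of 100
that quadratic growth would give for a tenfold larger statistics target.
The same two steps cost eqjoinsel only about 9.5x and 22x.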

It seemed to me that it might be possible to reduce the runtime by
exploiting knowledge about the ordering of the histograms, but
I don't have time to pursue that right now.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#21 Emre Hasegeli
emre@hasegeli.com
In reply to: Tom Lane (#19)
Re: Selectivity estimation for inet operators

I spent a fair chunk of the weekend hacking on this patch to make
it more understandable and fix up a lot of what seemed to me pretty
clear arithmetic errors in the "upper layers" of the patch. However,
I couldn't quite convince myself to commit it, because the business
around estimation for partial histogram-bucket matches still doesn't
make any sense to me. Specifically this:

    /* Partial bucket match. */
    left_divider = inet_hist_match_divider(left, query, opr_codenum);
    right_divider = inet_hist_match_divider(right, query, opr_codenum);

    if (left_divider >= 0 || right_divider >= 0)
        match += 1.0 / pow(2.0, Max(left_divider, right_divider));

Now unless I'm missing something pretty basic about the divider
function, it returns larger numbers for inputs that are "further away"
from each other (ie, have more not-in-common significant address bits).
So the above calculation seems exactly backwards to me: if one endpoint
of a bucket is "close" to the query, or even an exact match, and the
other endpoint is further away, we completely ignore the close/exact
match and assign a bucket match fraction based only on the further-away
endpoint. Isn't that exactly backwards?

You are right that the partial bucket match calculation isn't fair in
some circumstances.

I experimented with logic like this:

    if (left_divider >= 0 && right_divider >= 0)
        match += 1.0 / pow(2.0, Min(left_divider, right_divider));
    else if (left_divider >= 0 || right_divider >= 0)
        match += 1.0 / pow(2.0, Max(left_divider, right_divider));

ie, consider the closer endpoint if both are valid. But that didn't seem
to work a whole lot better. I think really we need to consider both
endpoints not just one to the exclusion of the other.

I have tried many combinations like this, including arithmetic,
geometric, and logarithmic mean functions. I could not get good results
with any of them, so I left it in a basic form.

Max() works pretty well most of the time, because if the query matches
the bucket, it is generally close to both of the endpoints. By using
Max(), we are actually crediting the ones which are close to both
of the endpoints.
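
To make the difference between the two rules concrete, here is a minimal
sketch of both. The function names are hypothetical, and the fraction is
computed by repeated halving instead of pow(2.0, ...) only to keep the
example free of libm; the value is the same 1 / 2^divider. A divider of -1
means "not usable", as in inet_hist_match_divider().

```c
#include <assert.h>

/* 1 / 2^divider, computed by repeated halving so the result is exact. */
double
divider_to_fraction(int divider)
{
    double      fraction = 1.0;

    assert(divider >= 0);
    while (divider-- > 0)
        fraction *= 0.5;
    return fraction;
}

/* Rule in the patch: use the larger (more distant) of the two dividers. */
double
partial_match_max(int left_divider, int right_divider)
{
    int         divider = (left_divider > right_divider) ?
        left_divider : right_divider;

    if (divider < 0)
        return 0.0;             /* neither boundary is usable */
    return divider_to_fraction(divider);
}

/* Alternative from the thread: use the closer endpoint if both are valid. */
double
partial_match_closer(int left_divider, int right_divider)
{
    if (left_divider >= 0 && right_divider >= 0)
        return divider_to_fraction((left_divider < right_divider) ?
                                   left_divider : right_divider);
    return partial_match_max(left_divider, right_divider);
}
```

With an exact match on one endpoint (divider 0) and a divider of 8 on the
other, partial_match_max(0, 8) credits 1/256 of the bucket while
partial_match_closer(0, 8) credits the whole bucket. When only one divider
is valid, both rules agree.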

I'm also not exactly convinced by the divider function itself,
specifically about the decision to fail and return -1 if the masklen
comparison comes out wrong. This effectively causes the masklen to be
the most significant part of the value (after the IP family), which seems
totally wrong. ISTM we ought to consider the number of leading bits in
common as the primary indicator of "how far apart" a query and a
histogram endpoint are.

The partial match calculation with Max() is especially unfair on
the buckets where more significant bits change, for example 63/8 and
64/8. Returning -1 instead of a high divider forces it to use
the divider for the other endpoint. We still consider the number of
leading bits in common as the primary indicator, just for the other
endpoint.
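
The 63/8 and 64/8 case can be worked through with a small sketch. It is
IPv4-only, uses plain 32-bit integers in place of inet values, and assumes
the masklen check has already passed and that the decisive masklen is the
shorter one. common_leading_bits() is a simplified stand-in for
bitncommon(), and match_divider() is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Count the leading bits that two addresses share within the first nbits. */
int
common_leading_bits(uint32_t a, uint32_t b, int nbits)
{
    int         common = 0;

    assert(nbits >= 0 && nbits <= 32);
    while (common < nbits &&
           ((a >> (31 - common)) & 1) == ((b >> (31 - common)) & 1))
        common++;
    return common;
}

/* Number of decisive bits that the boundary and the query do not share. */
int
match_divider(uint32_t boundary, uint32_t query, int decisive_bits)
{
    return decisive_bits - common_leading_bits(boundary, query,
                                               decisive_bits);
}
```

63 is 00111111 and 64 is 01000000 in binary, so the two addresses share
only one leading bit. For a query inside 63/8, the 63/8 boundary gives a
divider of 0, while the 64/8 boundary gives match_divider(0x40000000,
0x3F000000, 8) = 7, so Max() would credit only 1/128 of that bucket.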

I have also experimented with the count of the common bits of
the endpoints of the bucket for better partial match calculation.
I could not find a meaningful equation with it.

Even if the above aspects of the code are really completely right, the
comments fail to explain why. I spent a lot of time on the comments,
but so far as these points are concerned they still only explain what
is being done and not why it's a useful calculation to make.

I couldn't write better comments because I don't have strong arguments
about it. We can say that we don't try to make use of both of
the endpoints, because we don't know how to combine them. We only use
the one with matching family and masklen, and when both of them match
we use the distant one to be on the safer side.

Thank you for looking at it. Comments look much better now.


#22 Emre Hasegeli
emre@hasegeli.com
In reply to: Tom Lane (#20)
1 attachment(s)
Re: Selectivity estimation for inet operators

Actually, there's a second large problem with this patch: blindly
iterating through all combinations of MCV and histogram entries makes the
runtime O(N^2) in the statistics target. I made up some test data (by
scanning my mail logs) and observed the following planning times, as
reported by EXPLAIN ANALYZE:

explain analyze select * from relays r1, relays r2 where r1.ip = r2.ip;
explain analyze select * from relays r1, relays r2 where r1.ip && r2.ip;

stats target    eqjoinsel    networkjoinsel

         100      0.27 ms           1.85 ms
        1000      2.56 ms          167.2 ms
       10000      56.6 ms        13987.1 ms

I don't think it's necessary for network selectivity to be quite as
fast as eqjoinsel, but I doubt we can tolerate 14 seconds planning
time for a query that might need just milliseconds to execute :-(

It seemed to me that it might be possible to reduce the runtime by
exploiting knowledge about the ordering of the histograms, but
I don't have time to pursue that right now.

I made it break out of the loop once we have passed the last possible
match. Patch attached. It reduces the runtime by almost 50% with large
histograms.
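
The idea can be sketched in a simplified form, with sorted integers
standing in for the inet histogram boundaries. The function name and the
integer representation are assumptions made for this illustration; the
attached patch instead signals the condition with a sentinel return value
from inet_inclusion_cmp().

```c
#include <assert.h>

/*
 * "values" are sorted histogram boundaries.  Returns how many buckets were
 * examined before the scan could stop.
 */
int
buckets_examined(const int *values, int nvalues, int query)
{
    int         examined = 0;
    int         i;

    assert(nvalues >= 0);
    for (i = 1; i < nvalues; i++)
    {
        /*
         * The left boundary already sorts after the query, so no later
         * bucket can match either.
         */
        if (values[i - 1] > query)
            break;
        examined++;
    }
    return examined;
}
```

For boundaries 10, 20, 30, 40, 50 and a query of 25, only two of the four
buckets are examined. If queries fall evenly across the sorted range, about
half of the buckets are skipped on average, which would be consistent with
the measured reduction.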

We can also make it use only some elements of the MCV list and histogram
for join estimation. I have experimented with reducing the right and
the left hand sides of the lists in the previous versions. I remember
it was better to reduce only the left hand side. I think it would be
enough to use log(n) elements of the right hand side MCV and histogram.
I can make the change, if you think the selectivity estimation function
is the right place for this optimization.
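
A minimal sketch of picking about log(n) evenly spaced elements from a
list could look like the following. The base-2 logarithm, the inclusion of
the first and last elements, and both function names are assumptions of
this example; the proposal above only says "log(n) elements".

```c
#include <assert.h>

/* floor(log2(n)) for n >= 1, using integer shifts. */
int
ilog2(int n)
{
    int         result = 0;

    assert(n >= 1);
    while (n > 1)
    {
        n >>= 1;
        result++;
    }
    return result;
}

/*
 * Index of the k-th of nsamples evenly spaced elements in a list of
 * nvalues entries, with the first and last entries included.
 */
int
evenly_spaced_index(int nvalues, int nsamples, int k)
{
    assert(nvalues >= 2 && nsamples >= 2 && k >= 0 && k < nsamples);
    return (int) ((long) k * (nvalues - 1) / (nsamples - 1));
}
```

With a list of 10000 entries, ilog2(10000) is 13, so the inner loop would
visit 13 elements instead of 10000. For a histogram of 101 values sampled
6 times, the chosen indexes are 0, 20, 40, 60, 80, and 100.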

Attachments:

inet-selfuncs-break-early.patch (text/x-diff; charset=utf-8)
diff --git a/src/backend/utils/adt/network_selfuncs.c b/src/backend/utils/adt/network_selfuncs.c
index f854847..16f39db 100644
--- a/src/backend/utils/adt/network_selfuncs.c
+++ b/src/backend/utils/adt/network_selfuncs.c
@@ -612,20 +612,23 @@ inet_hist_value_sel(Datum *values, int nvalues, Datum constvalue,
 		return 0.0;
 
 	query = DatumGetInetPP(constvalue);
 
 	/* "left" is the left boundary value of the current bucket ... */
 	left = DatumGetInetPP(values[0]);
 	left_order = inet_inclusion_cmp(left, query, opr_codenum);
 
 	for (i = 1; i < nvalues; i++)
 	{
+		if (left_order == 256)
+			break;
+
 		/* ... and "right" is the right boundary value */
 		right = DatumGetInetPP(values[i]);
 		right_order = inet_inclusion_cmp(right, query, opr_codenum);
 
 		if (left_order == 0 && right_order == 0)
 		{
 			/* The whole bucket matches, since both endpoints do. */
 			match += 1.0;
 		}
 		else if ((left_order <= 0 && right_order >= 0) ||
@@ -854,20 +857,23 @@ inet_opr_codenum(Oid operator)
 static int
 inet_inclusion_cmp(inet *left, inet *right, int opr_codenum)
 {
 	if (ip_family(left) == ip_family(right))
 	{
 		int			order;
 
 		order = bitncmp(ip_addr(left), ip_addr(right),
 						Min(ip_bits(left), ip_bits(right)));
 
+		if (order > 0)
+			return 256;
+
 		if (order != 0)
 			return order;
 
 		return inet_masklen_inclusion_cmp(left, right, opr_codenum);
 	}
 
 	return ip_family(left) - ip_family(right);
 }
 
 /*
#23 Michael Paquier
michael.paquier@gmail.com
In reply to: Emre Hasegeli (#22)
Re: Selectivity estimation for inet operators

On Wed, Dec 3, 2014 at 6:14 AM, Emre Hasegeli <emre@hasegeli.com> wrote:

Actually, there's a second large problem with this patch: blindly
iterating through all combinations of MCV and histogram entries makes the
runtime O(N^2) in the statistics target. I made up some test data (by
scanning my mail logs) and observed the following planning times, as
reported by EXPLAIN ANALYZE:

explain analyze select * from relays r1, relays r2 where r1.ip = r2.ip;
explain analyze select * from relays r1, relays r2 where r1.ip && r2.ip;

stats target    eqjoinsel    networkjoinsel

         100      0.27 ms           1.85 ms
        1000      2.56 ms          167.2 ms
       10000      56.6 ms        13987.1 ms

I don't think it's necessary for network selectivity to be quite as
fast as eqjoinsel, but I doubt we can tolerate 14 seconds planning
time for a query that might need just milliseconds to execute :-(

It seemed to me that it might be possible to reduce the runtime by
exploiting knowledge about the ordering of the histograms, but
I don't have time to pursue that right now.

I made it break out of the loop once we have passed the last possible
match. Patch attached. It reduces the runtime by almost 50% with large
histograms.

We can also make it use only some elements of the MCV list and histogram
for join estimation. I have experimented with reducing the right and
the left hand sides of the lists in the previous versions. I remember
it was better to reduce only the left hand side. I think it would be
enough to use log(n) elements of the right hand side MCV and histogram.
I can make the change, if you think the selectivity estimation function
is the right place for this optimization.

Marking as "Returned with feedback" as more work needs to be done.
--
Michael


#24 Emre Hasegeli
emre@hasegeli.com
In reply to: Emre Hasegeli (#21)
1 attachment(s)
Re: Selectivity estimation for inet operators

New version of the patch attached, with the optimization to break out of
the loop before looking at all of the histogram values. I can reduce
the join selectivity estimation runtime by reducing the values of the
left hand side or of both sides, if there is interest.

Even if the above aspects of the code are really completely right, the
comments fail to explain why. I spent a lot of time on the comments,
but so far as these points are concerned they still only explain what
is being done and not why it's a useful calculation to make.

I couldn't write better comments because I don't have strong arguments
about it. We can say that we don't try to make use of both of
the endpoints, because we don't know how to combine them. We only use
the one with matching family and masklen, and when both of them match
we use the distant one to be on the safer side.

I added two more sentences to explain the calculation.

Attachments:

inet-selfuncs-v14.patch (text/x-diff; charset=utf-8; name=inet-selfuncs-v14.patch)
diff --git a/src/backend/utils/adt/network_selfuncs.c b/src/backend/utils/adt/network_selfuncs.c
index 73fc1ca..51a33c2 100644
--- a/src/backend/utils/adt/network_selfuncs.c
+++ b/src/backend/utils/adt/network_selfuncs.c
@@ -1,32 +1,972 @@
 /*-------------------------------------------------------------------------
  *
  * network_selfuncs.c
  *	  Functions for selectivity estimation of inet/cidr operators
  *
- * Currently these are just stubs, but we hope to do better soon.
+ * This module provides estimators for the subnet inclusion and overlap
+ * operators.  Estimates are based on null fraction, most common values,
+ * and histogram of inet/cidr columns.
  *
  * Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
  *
  *
  * IDENTIFICATION
  *	  src/backend/utils/adt/network_selfuncs.c
  *
  *-------------------------------------------------------------------------
  */
 #include "postgres.h"
 
+#include <math.h>
+
+#include "access/htup_details.h"
+#include "catalog/pg_operator.h"
+#include "catalog/pg_statistic.h"
 #include "utils/inet.h"
+#include "utils/lsyscache.h"
+#include "utils/selfuncs.h"
 
 
+/* Default selectivity for the inet overlap operator */
+#define DEFAULT_OVERLAP_SEL 0.01
+
+/* Default selectivity for the various inclusion operators */
+#define DEFAULT_INCLUSION_SEL 0.005
+
+/* Default selectivity for specified operator */
+#define DEFAULT_SEL(operator) \
+	((operator) == OID_INET_OVERLAP_OP ? \
+	 DEFAULT_OVERLAP_SEL : DEFAULT_INCLUSION_SEL)
+
+static Selectivity networkjoinsel_inner(Oid operator,
+					 VariableStatData *vardata1, VariableStatData *vardata2);
+static Selectivity networkjoinsel_semi(Oid operator,
+					VariableStatData *vardata1, VariableStatData *vardata2);
+static Selectivity mcv_population(float4 *mcv_numbers, int mcv_nvalues);
+static Selectivity inet_hist_value_sel(Datum *values, int nvalues,
+					Datum constvalue, int opr_codenum);
+static Selectivity inet_mcv_join_sel(Datum *mcv1_values,
+				  float4 *mcv1_numbers, int mcv1_nvalues, Datum *mcv2_values,
+				  float4 *mcv2_numbers, int mcv2_nvalues, Oid operator);
+static Selectivity inet_mcv_hist_sel(Datum *mcv_values, float4 *mcv_numbers,
+				  int mcv_nvalues, Datum *hist_values, int hist_nvalues,
+				  int opr_codenum);
+static Selectivity inet_hist_inclusion_join_sel(Datum *hist1_values,
+							 int hist1_nvalues,
+							 Datum *hist2_values, int hist2_nvalues,
+							 int opr_codenum);
+static Selectivity inet_semi_join_sel(Datum lhs_value,
+				   bool mcv_exists, Datum *mcv_values, int mcv_nvalues,
+				   bool hist_exists, Datum *hist_values, int hist_nvalues,
+				   double hist_weight,
+				   FmgrInfo *proc, int opr_codenum);
+static int	inet_opr_codenum(Oid operator);
+static int	inet_inclusion_cmp(inet *left, inet *right, int opr_codenum);
+static int inet_masklen_inclusion_cmp(inet *left, inet *right,
+						   int opr_codenum);
+static int inet_hist_match_divider(inet *boundary, inet *query,
+						int opr_codenum);
+
+/*
+ * Selectivity estimation for the subnet inclusion/overlap operators
+ */
 Datum
 networksel(PG_FUNCTION_ARGS)
 {
-	PG_RETURN_FLOAT8(0.001);
+	PlannerInfo *root = (PlannerInfo *) PG_GETARG_POINTER(0);
+	Oid			operator = PG_GETARG_OID(1);
+	List	   *args = (List *) PG_GETARG_POINTER(2);
+	int			varRelid = PG_GETARG_INT32(3);
+	VariableStatData vardata;
+	Node	   *other;
+	bool		varonleft;
+	Selectivity selec,
+				mcv_selec,
+				non_mcv_selec;
+	Datum		constvalue,
+			   *hist_values;
+	int			hist_nvalues;
+	Form_pg_statistic stats;
+	double		sumcommon,
+				nullfrac;
+	FmgrInfo	proc;
+
+	/*
+	 * If expression is not (variable op something) or (something op
+	 * variable), then punt and return a default estimate.
+	 */
+	if (!get_restriction_variable(root, args, varRelid,
+								  &vardata, &other, &varonleft))
+		PG_RETURN_FLOAT8(DEFAULT_SEL(operator));
+
+	/*
+	 * Can't do anything useful if the something is not a constant, either.
+	 */
+	if (!IsA(other, Const))
+	{
+		ReleaseVariableStats(vardata);
+		PG_RETURN_FLOAT8(DEFAULT_SEL(operator));
+	}
+
+	/* All of the operators handled here are strict. */
+	if (((Const *) other)->constisnull)
+	{
+		ReleaseVariableStats(vardata);
+		PG_RETURN_FLOAT8(0.0);
+	}
+	constvalue = ((Const *) other)->constvalue;
+
+	/* Otherwise, we need stats in order to produce a non-default estimate. */
+	if (!HeapTupleIsValid(vardata.statsTuple))
+	{
+		ReleaseVariableStats(vardata);
+		PG_RETURN_FLOAT8(DEFAULT_SEL(operator));
+	}
+
+	stats = (Form_pg_statistic) GETSTRUCT(vardata.statsTuple);
+	nullfrac = stats->stanullfrac;
+
+	/*
+	 * If we have most-common-values info, add up the fractions of the MCV
+	 * entries that satisfy MCV OP CONST.  These fractions contribute directly
+	 * to the result selectivity.  Also add up the total fraction represented
+	 * by MCV entries.
+	 */
+	fmgr_info(get_opcode(operator), &proc);
+	mcv_selec = mcv_selectivity(&vardata, &proc, constvalue, varonleft,
+								&sumcommon);
+
+	/*
+	 * If we have a histogram, use it to estimate the proportion of the
+	 * non-MCV population that satisfies the clause.  If we don't, apply the
+	 * default selectivity to that population.
+	 */
+	if (get_attstatsslot(vardata.statsTuple,
+						 vardata.atttype, vardata.atttypmod,
+						 STATISTIC_KIND_HISTOGRAM, InvalidOid,
+						 NULL,
+						 &hist_values, &hist_nvalues,
+						 NULL, NULL))
+	{
+		int			opr_codenum = inet_opr_codenum(operator);
+
+		/* Commute if needed, so we can consider histogram to be on the left */
+		if (!varonleft)
+			opr_codenum = -opr_codenum;
+		non_mcv_selec = inet_hist_value_sel(hist_values, hist_nvalues,
+											constvalue, opr_codenum);
+
+		free_attstatsslot(vardata.atttype, hist_values, hist_nvalues, NULL, 0);
+	}
+	else
+		non_mcv_selec = DEFAULT_SEL(operator);
+
+	/* Combine selectivities for MCV and non-MCV populations */
+	selec = mcv_selec + (1.0 - nullfrac - sumcommon) * non_mcv_selec;
+
+	/* Result should be in range, but make sure... */
+	CLAMP_PROBABILITY(selec);
+
+	ReleaseVariableStats(vardata);
+
+	PG_RETURN_FLOAT8(selec);
 }
 
+/*
+ * Join selectivity estimation for the subnet inclusion/overlap operators
+ *
+ * This function has the same structure as eqjoinsel() in selfuncs.c.
+ */
 Datum
 networkjoinsel(PG_FUNCTION_ARGS)
 {
-	PG_RETURN_FLOAT8(0.001);
+	PlannerInfo *root = (PlannerInfo *) PG_GETARG_POINTER(0);
+	Oid			operator = PG_GETARG_OID(1);
+	List	   *args = (List *) PG_GETARG_POINTER(2);
+#ifdef NOT_USED
+	JoinType	jointype = (JoinType) PG_GETARG_INT16(3);
+#endif
+	SpecialJoinInfo *sjinfo = (SpecialJoinInfo *) PG_GETARG_POINTER(4);
+	double		selec;
+	VariableStatData vardata1;
+	VariableStatData vardata2;
+	bool		join_is_reversed;
+
+	get_join_variables(root, args, sjinfo,
+					   &vardata1, &vardata2, &join_is_reversed);
+
+	switch (sjinfo->jointype)
+	{
+		case JOIN_INNER:
+		case JOIN_LEFT:
+		case JOIN_FULL:
+
+			/*
+			 * Selectivity for left/full join is not exactly the same as inner
+			 * join, but we neglect the difference, as eqjoinsel does.
+			 */
+			selec = networkjoinsel_inner(operator, &vardata1, &vardata2);
+			break;
+		case JOIN_SEMI:
+		case JOIN_ANTI:
+			/* Here, it's important that we pass the outer var on the left. */
+			if (!join_is_reversed)
+				selec = networkjoinsel_semi(operator, &vardata1, &vardata2);
+			else
+				selec = networkjoinsel_semi(get_commutator(operator),
+											&vardata2, &vardata1);
+			break;
+		default:
+			/* other values not expected here */
+			elog(ERROR, "unrecognized join type: %d",
+				 (int) sjinfo->jointype);
+			selec = 0;			/* keep compiler quiet */
+			break;
+	}
+
+	ReleaseVariableStats(vardata1);
+	ReleaseVariableStats(vardata2);
+
+	CLAMP_PROBABILITY(selec);
+
+	PG_RETURN_FLOAT8((float8) selec);
+}
+
+/*
+ * Inner join selectivity estimation for subnet inclusion/overlap operators
+ *
+ * Calculates MCV vs MCV, MCV vs histogram and histogram vs histogram
+ * selectivity for join using the subnet inclusion operators.  Unlike the
+ * join selectivity function for the equality operator, eqjoinsel_inner(),
+ * one to one matching of the values is not enough.  Network inclusion
+ * operators are likely to match many to many, so we must check all pairs.
+ * (Note: it might be possible to exploit understanding of the histogram's
+ * btree ordering to reduce the work needed, but we don't currently try.)
+ * Also, MCV vs histogram selectivity is not neglected as in eqjoinsel_inner().
+ */
+static Selectivity
+networkjoinsel_inner(Oid operator,
+					 VariableStatData *vardata1, VariableStatData *vardata2)
+{
+	Form_pg_statistic stats;
+	double		nullfrac1 = 0.0,
+				nullfrac2 = 0.0;
+	Selectivity selec = 0.0,
+				sumcommon1 = 0.0,
+				sumcommon2 = 0.0;
+	bool		mcv1_exists = false,
+				mcv2_exists = false,
+				hist1_exists = false,
+				hist2_exists = false;
+	int			opr_codenum;
+	int			mcv1_nvalues,
+				mcv2_nvalues,
+				mcv1_nnumbers,
+				mcv2_nnumbers,
+				hist1_nvalues,
+				hist2_nvalues;
+	Datum	   *mcv1_values,
+			   *mcv2_values,
+			   *hist1_values,
+			   *hist2_values;
+	float4	   *mcv1_numbers,
+			   *mcv2_numbers;
+
+	if (HeapTupleIsValid(vardata1->statsTuple))
+	{
+		stats = (Form_pg_statistic) GETSTRUCT(vardata1->statsTuple);
+		nullfrac1 = stats->stanullfrac;
+
+		mcv1_exists = get_attstatsslot(vardata1->statsTuple,
+									   vardata1->atttype, vardata1->atttypmod,
+									   STATISTIC_KIND_MCV, InvalidOid,
+									   NULL,
+									   &mcv1_values, &mcv1_nvalues,
+									   &mcv1_numbers, &mcv1_nnumbers);
+		hist1_exists = get_attstatsslot(vardata1->statsTuple,
+									  vardata1->atttype, vardata1->atttypmod,
+										STATISTIC_KIND_HISTOGRAM, InvalidOid,
+										NULL,
+										&hist1_values, &hist1_nvalues,
+										NULL, NULL);
+		if (mcv1_exists)
+			sumcommon1 = mcv_population(mcv1_numbers, mcv1_nnumbers);
+	}
+
+	if (HeapTupleIsValid(vardata2->statsTuple))
+	{
+		stats = (Form_pg_statistic) GETSTRUCT(vardata2->statsTuple);
+		nullfrac2 = stats->stanullfrac;
+
+		mcv2_exists = get_attstatsslot(vardata2->statsTuple,
+									   vardata2->atttype, vardata2->atttypmod,
+									   STATISTIC_KIND_MCV, InvalidOid,
+									   NULL,
+									   &mcv2_values, &mcv2_nvalues,
+									   &mcv2_numbers, &mcv2_nnumbers);
+		hist2_exists = get_attstatsslot(vardata2->statsTuple,
+									  vardata2->atttype, vardata2->atttypmod,
+										STATISTIC_KIND_HISTOGRAM, InvalidOid,
+										NULL,
+										&hist2_values, &hist2_nvalues,
+										NULL, NULL);
+		if (mcv2_exists)
+			sumcommon2 = mcv_population(mcv2_numbers, mcv2_nnumbers);
+	}
+
+	opr_codenum = inet_opr_codenum(operator);
+
+	/*
+	 * Calculate selectivity for MCV vs MCV matches.
+	 */
+	if (mcv1_exists && mcv2_exists)
+		selec += inet_mcv_join_sel(mcv1_values, mcv1_numbers, mcv1_nvalues,
+								   mcv2_values, mcv2_numbers, mcv2_nvalues,
+								   operator);
+
+	/*
+	 * Add in selectivities for MCV vs histogram matches, scaling according to
+	 * the fractions of the populations represented by the histograms. Note
+	 * that the second case needs to commute the operator.
+	 */
+	if (mcv1_exists && hist2_exists)
+		selec += (1.0 - nullfrac2 - sumcommon2) *
+			inet_mcv_hist_sel(mcv1_values, mcv1_numbers, mcv1_nvalues,
+							  hist2_values, hist2_nvalues,
+							  opr_codenum);
+	if (mcv2_exists && hist1_exists)
+		selec += (1.0 - nullfrac1 - sumcommon1) *
+			inet_mcv_hist_sel(mcv2_values, mcv2_numbers, mcv2_nvalues,
+							  hist1_values, hist1_nvalues,
+							  -opr_codenum);
+
+	/*
+	 * Add in selectivity for histogram vs histogram matches, again scaling
+	 * appropriately.
+	 */
+	if (hist1_exists && hist2_exists)
+		selec += (1.0 - nullfrac1 - sumcommon1) *
+			(1.0 - nullfrac2 - sumcommon2) *
+			inet_hist_inclusion_join_sel(hist1_values, hist1_nvalues,
+										 hist2_values, hist2_nvalues,
+										 opr_codenum);
+
+	/*
+	 * If useful statistics are not available then use the default estimate.
+	 * We can apply null fractions if known, though.
+	 */
+	if ((!mcv1_exists && !hist1_exists) || (!mcv2_exists && !hist2_exists))
+		selec = (1.0 - nullfrac1) * (1.0 - nullfrac2) * DEFAULT_SEL(operator);
+
+	/* Release stats. */
+	if (mcv1_exists)
+		free_attstatsslot(vardata1->atttype, mcv1_values, mcv1_nvalues,
+						  mcv1_numbers, mcv1_nnumbers);
+	if (mcv2_exists)
+		free_attstatsslot(vardata2->atttype, mcv2_values, mcv2_nvalues,
+						  mcv2_numbers, mcv2_nnumbers);
+	if (hist1_exists)
+		free_attstatsslot(vardata1->atttype, hist1_values, hist1_nvalues,
+						  NULL, 0);
+	if (hist2_exists)
+		free_attstatsslot(vardata2->atttype, hist2_values, hist2_nvalues,
+						  NULL, 0);
+
+	return selec;
+}
+
+/*
+ * Semi join selectivity estimation for subnet inclusion/overlap operators
+ *
+ * Calculates MCV vs MCV, MCV vs histogram, histogram vs MCV, and histogram vs
+ * histogram selectivity for semi/anti join cases.
+ */
+static Selectivity
+networkjoinsel_semi(Oid operator,
+					VariableStatData *vardata1, VariableStatData *vardata2)
+{
+	Form_pg_statistic stats;
+	Selectivity selec = 0.0,
+				sumcommon1 = 0.0,
+				sumcommon2 = 0.0;
+	double		nullfrac1 = 0.0,
+				nullfrac2 = 0.0,
+				hist2_weight = 0.0;
+	bool		mcv1_exists = false,
+				mcv2_exists = false,
+				hist1_exists = false,
+				hist2_exists = false;
+	int			opr_codenum;
+	FmgrInfo	proc;
+	int			i,
+				mcv1_nvalues,
+				mcv2_nvalues,
+				mcv1_nnumbers,
+				mcv2_nnumbers,
+				hist1_nvalues,
+				hist2_nvalues;
+	Datum	   *mcv1_values,
+			   *mcv2_values,
+			   *hist1_values,
+			   *hist2_values;
+	float4	   *mcv1_numbers,
+			   *mcv2_numbers;
+
+	if (HeapTupleIsValid(vardata1->statsTuple))
+	{
+		stats = (Form_pg_statistic) GETSTRUCT(vardata1->statsTuple);
+		nullfrac1 = stats->stanullfrac;
+
+		mcv1_exists = get_attstatsslot(vardata1->statsTuple,
+									   vardata1->atttype, vardata1->atttypmod,
+									   STATISTIC_KIND_MCV, InvalidOid,
+									   NULL,
+									   &mcv1_values, &mcv1_nvalues,
+									   &mcv1_numbers, &mcv1_nnumbers);
+		hist1_exists = get_attstatsslot(vardata1->statsTuple,
+									  vardata1->atttype, vardata1->atttypmod,
+										STATISTIC_KIND_HISTOGRAM, InvalidOid,
+										NULL,
+										&hist1_values, &hist1_nvalues,
+										NULL, NULL);
+		if (mcv1_exists)
+			sumcommon1 = mcv_population(mcv1_numbers, mcv1_nnumbers);
+	}
+
+	if (HeapTupleIsValid(vardata2->statsTuple))
+	{
+		stats = (Form_pg_statistic) GETSTRUCT(vardata2->statsTuple);
+		nullfrac2 = stats->stanullfrac;
+
+		mcv2_exists = get_attstatsslot(vardata2->statsTuple,
+									   vardata2->atttype, vardata2->atttypmod,
+									   STATISTIC_KIND_MCV, InvalidOid,
+									   NULL,
+									   &mcv2_values, &mcv2_nvalues,
+									   &mcv2_numbers, &mcv2_nnumbers);
+		hist2_exists = get_attstatsslot(vardata2->statsTuple,
+									  vardata2->atttype, vardata2->atttypmod,
+										STATISTIC_KIND_HISTOGRAM, InvalidOid,
+										NULL,
+										&hist2_values, &hist2_nvalues,
+										NULL, NULL);
+		if (mcv2_exists)
+			sumcommon2 = mcv_population(mcv2_numbers, mcv2_nnumbers);
+	}
+
+	opr_codenum = inet_opr_codenum(operator);
+	fmgr_info(get_opcode(operator), &proc);
+
+	/* Estimate number of input rows represented by RHS histogram. */
+	if (hist2_exists && vardata2->rel)
+		hist2_weight = (1.0 - nullfrac2 - sumcommon2) * vardata2->rel->rows;
+
+	/*
+	 * Consider each element of the LHS MCV list, matching it to whatever RHS
+	 * stats we have.  Scale according to the known frequency of the MCV.
+	 */
+	if (mcv1_exists && (mcv2_exists || hist2_exists))
+	{
+		for (i = 0; i < mcv1_nvalues; i++)
+		{
+			selec += mcv1_numbers[i] *
+				inet_semi_join_sel(mcv1_values[i],
+								   mcv2_exists, mcv2_values, mcv2_nvalues,
+								   hist2_exists, hist2_values, hist2_nvalues,
+								   hist2_weight,
+								   &proc, opr_codenum);
+		}
+	}
+
+	/*
+	 * Consider each element of the LHS histogram, except for the first and
+	 * last elements, which we exclude on the grounds that they're outliers
+	 * and thus not very representative.  Scale on the assumption that each
+	 * such histogram element represents an equal share of the LHS histogram
+	 * population (which is a bit bogus, because the members of its bucket may
+	 * not all act the same with respect to the join clause, but it's hard to
+	 * do better).
+	 */
+	if (hist1_exists && hist1_nvalues > 2 && (mcv2_exists || hist2_exists))
+	{
+		double		hist_selec_sum = 0.0;
+
+		for (i = 1; i < hist1_nvalues - 1; i++)
+		{
+			hist_selec_sum +=
+				inet_semi_join_sel(hist1_values[i],
+								   mcv2_exists, mcv2_values, mcv2_nvalues,
+								   hist2_exists, hist2_values, hist2_nvalues,
+								   hist2_weight,
+								   &proc, opr_codenum);
+		}
+
+		selec += (1.0 - nullfrac1 - sumcommon1) *
+			hist_selec_sum / (hist1_nvalues - 2);
+	}
+
+	/*
+	 * If useful statistics are not available then use the default estimate.
+	 * We can apply null fractions if known, though.
+	 */
+	if ((!mcv1_exists && !hist1_exists) || (!mcv2_exists && !hist2_exists))
+		selec = (1.0 - nullfrac1) * (1.0 - nullfrac2) * DEFAULT_SEL(operator);
+
+	/* Release stats. */
+	if (mcv1_exists)
+		free_attstatsslot(vardata1->atttype, mcv1_values, mcv1_nvalues,
+						  mcv1_numbers, mcv1_nnumbers);
+	if (mcv2_exists)
+		free_attstatsslot(vardata2->atttype, mcv2_values, mcv2_nvalues,
+						  mcv2_numbers, mcv2_nnumbers);
+	if (hist1_exists)
+		free_attstatsslot(vardata1->atttype, hist1_values, hist1_nvalues,
+						  NULL, 0);
+	if (hist2_exists)
+		free_attstatsslot(vardata2->atttype, hist2_values, hist2_nvalues,
+						  NULL, 0);
+
+	return selec;
+}
+
+/*
+ * Compute the fraction of a relation's population that is represented
+ * by the MCV list.
+ */
+static Selectivity
+mcv_population(float4 *mcv_numbers, int mcv_nvalues)
+{
+	Selectivity sumcommon = 0.0;
+	int			i;
+
+	for (i = 0; i < mcv_nvalues; i++)
+	{
+		sumcommon += mcv_numbers[i];
+	}
+
+	return sumcommon;
+}
+
+/*
+ * Inet histogram vs single value selectivity estimation
+ *
+ * Estimate the fraction of the histogram population that satisfies
+ * "value OPR CONST".  (The result needs to be scaled to reflect the
+ * proportion of the total population represented by the histogram.)
+ *
+ * The histogram is originally for the inet btree comparison operators.
+ * Only the common bits of the network part and the length of the network part
+ * (masklen) are interesting for the subnet inclusion operators.  Fortunately,
+ * btree comparison treats the network part as the major sort key.  Even so,
+ * the length of the network part would not really be significant in the
+ * histogram.  This would lead to big mistakes for data sets with uneven
+ * masklen distribution.  To reduce this problem, comparisons with the left
+ * and the right sides of the buckets are used together.
+ *
+ * Histogram bucket matches are calculated in two forms.  If the constant
+ * matches both bucket endpoints the bucket is considered as fully matched.
+ * The second form is to match the bucket partially; we recognize this when
+ * the constant matches just one endpoint, or the two endpoints fall on
+ * opposite sides of the constant.  (Note that when the constant matches an
+ * interior histogram element, it gets credit for partial matches to the
+ * buckets on both sides, while a match to a histogram endpoint gets credit
+ * for only one partial match.  This is desirable.)
+ *
+ * For a partial match, we try to calculate dividers for both of the
+ * boundaries.  If the address family of a boundary value does not match the
+ * constant or comparison of the length of the network parts is not correct
+ * for the operator, the divider for that boundary will not be taken into
+ * account.  If both of the dividers are valid, the greater one will be used
+ * to minimize the mistake in buckets that have disparate masklens.  This
+ * calculation is unfair when dividers can be calculated for both of the
+ * boundaries but are far apart; that is not a common situation, since the
+ * boundaries are expected to share most of the significant bits within
+ * their masklens.  The mistake would be greater if we used the minimum
+ * instead of the maximum, and we don't know a sensible way to combine them.
+ *
+ * The divider in the partial bucket match is imagined as the distance
+ * between the decisive bits and the common bits of the addresses.  It will
+ * be used as a power of two as it is the natural scale for the IP network
+ * inclusion.  This partial bucket match divider calculation is an empirical
+ * formula and subject to change with more experiment.
+ *
+ * For partial match in buckets that have different address families on the
+ * left and right sides, only the boundary with the same address family is
+ * taken into consideration.  This can cause more mistakes for these buckets
+ * if the masklens of their boundaries are also disparate.  But this can only
+ * happen in one bucket, since only two address families exist.  It seems a
+ * better option than not considering these buckets at all.
+ */
+static Selectivity
+inet_hist_value_sel(Datum *values, int nvalues, Datum constvalue,
+					int opr_codenum)
+{
+	Selectivity match = 0.0;
+	inet	   *query,
+			   *left,
+			   *right;
+	int			i;
+	int			left_order,
+				right_order,
+				left_divider,
+				right_divider;
+
+	/* guard against zero-divide below */
+	if (nvalues <= 1)
+		return 0.0;
+
+	query = DatumGetInetPP(constvalue);
+
+	/* "left" is the left boundary value of the current bucket ... */
+	left = DatumGetInetPP(values[0]);
+	left_order = inet_inclusion_cmp(left, query, opr_codenum);
+
+	for (i = 1; i < nvalues; i++)
+	{
+		if (left_order == 256)
+		{
+			/*
+			 * 256 is the specific return value which means the subnet of
+			 * the query is certainly passed.  After this one no other bucket
+			 * is going to match.
+			 */
+			break;
+		}
+
+		/* ... and "right" is the right boundary value */
+		right = DatumGetInetPP(values[i]);
+		right_order = inet_inclusion_cmp(right, query, opr_codenum);
+
+		if (left_order == 0 && right_order == 0)
+		{
+			/* The whole bucket matches, since both endpoints do. */
+			match += 1.0;
+		}
+		else if ((left_order <= 0 && right_order >= 0) ||
+				 (left_order >= 0 && right_order <= 0))
+		{
+			/* Partial bucket match. */
+			left_divider = inet_hist_match_divider(left, query, opr_codenum);
+			right_divider = inet_hist_match_divider(right, query, opr_codenum);
+
+			if (left_divider >= 0 || right_divider >= 0)
+				match += 1.0 / pow(2.0, Max(left_divider, right_divider));
+		}
+
+		/* Shift the variables. */
+		left = right;
+		left_order = right_order;
+	}
+
+	/* There are nvalues - 1 buckets. */
+	return match / (nvalues - 1);
+}
+
+/*
+ * Inet MCV vs MCV join selectivity estimation
+ *
+ * We simply add up the fractions of the populations that satisfy the clause.
+ * The result is exact and does not need to be scaled further.
+ */
+static Selectivity
+inet_mcv_join_sel(Datum *mcv1_values, float4 *mcv1_numbers, int mcv1_nvalues,
+				  Datum *mcv2_values, float4 *mcv2_numbers, int mcv2_nvalues,
+				  Oid operator)
+{
+	Selectivity selec = 0.0;
+	FmgrInfo	proc;
+	int			i,
+				j;
+
+	fmgr_info(get_opcode(operator), &proc);
+
+	for (i = 0; i < mcv1_nvalues; i++)
+	{
+		for (j = 0; j < mcv2_nvalues; j++)
+			if (DatumGetBool(FunctionCall2(&proc,
+										   mcv1_values[i],
+										   mcv2_values[j])))
+				selec += mcv1_numbers[i] * mcv2_numbers[j];
+	}
+	return selec;
+}
+
+/*
+ * Inet MCV vs histogram join selectivity estimation
+ *
+ * For each MCV on the lefthand side, estimate the fraction of the righthand's
+ * histogram population that satisfies the join clause, and add those up,
+ * scaling by the MCV's frequency.  The result still needs to be scaled
+ * according to the fraction of the righthand's population represented by
+ * the histogram.
+ */
+static Selectivity
+inet_mcv_hist_sel(Datum *mcv_values, float4 *mcv_numbers, int mcv_nvalues,
+				  Datum *hist_values, int hist_nvalues,
+				  int opr_codenum)
+{
+	Selectivity selec = 0.0;
+	int			i;
+
+	/*
+	 * We'll call inet_hist_value_sel with the histogram on the left, so we
+	 * must commute the operator.
+	 */
+	opr_codenum = -opr_codenum;
+
+	for (i = 0; i < mcv_nvalues; i++)
+	{
+		selec += mcv_numbers[i] *
+			inet_hist_value_sel(hist_values, hist_nvalues, mcv_values[i],
+								opr_codenum);
+	}
+	return selec;
+}
+
+/*
+ * Inet histogram vs histogram join selectivity estimation
+ *
+ * Here, we take all values listed in the second histogram (except for the
+ * first and last elements, which are excluded on the grounds of possibly
+ * not being very representative) and treat them as a uniform sample of
+ * the non-MCV population for that relation.  For each one, we apply
+ * inet_hist_value_sel to see what fraction of the first histogram
+ * it matches.
+ *
+ * We could alternatively do this the other way around using the operator's
+ * commutator.  XXX would it be worthwhile to do it both ways and take the
+ * average?  That would at least avoid non-commutative estimation results.
+ */
+static Selectivity
+inet_hist_inclusion_join_sel(Datum *hist1_values, int hist1_nvalues,
+							 Datum *hist2_values, int hist2_nvalues,
+							 int opr_codenum)
+{
+	double		match = 0.0;
+	int			i;
+
+	if (hist2_nvalues <= 2)
+		return 0.0;				/* no interior histogram elements */
+
+	for (i = 1; i < hist2_nvalues - 1; i++)
+		match += inet_hist_value_sel(hist1_values, hist1_nvalues,
+									 hist2_values[i], opr_codenum);
+
+	return match / (hist2_nvalues - 2);
+}
+
+/*
+ * Inet semi join selectivity estimation for one value
+ *
+ * The function calculates the probability that there is at least one row
+ * in the RHS table that satisfies the "lhs_value op column" condition.
+ * It is used in semi join estimation to check a sample from the left hand
+ * side table.
+ *
+ * The MCV and histogram from the right hand side table should be provided as
+ * arguments with the lhs_value from the left hand side table for the join.
+ * hist_weight is the total number of rows represented by the histogram.
+ * For example, if the table has 1000 rows, and 10% of the rows are in the MCV
+ * list, and another 10% are NULLs, hist_weight would be 800.
+ *
+ * First, the lhs_value will be matched to the most common values.  If it
+ * matches any of them, 1.0 will be returned, because then there is surely
+ * a match.
+ *
+ * Otherwise, the histogram will be used to estimate the number of rows in
+ * the second table that match the condition.  If the estimate is greater
+ * than 1.0, 1.0 will be returned, because it means there is a greater chance
+ * that the lhs_value will match more than one row in the table.  If it is
+ * between 0.0 and 1.0, it will be returned as the probability.
+ */
+static Selectivity
+inet_semi_join_sel(Datum lhs_value,
+				   bool mcv_exists, Datum *mcv_values, int mcv_nvalues,
+				   bool hist_exists, Datum *hist_values, int hist_nvalues,
+				   double hist_weight,
+				   FmgrInfo *proc, int opr_codenum)
+{
+	if (mcv_exists)
+	{
+		int			i;
+
+		for (i = 0; i < mcv_nvalues; i++)
+		{
+			if (DatumGetBool(FunctionCall2(proc,
+										   lhs_value,
+										   mcv_values[i])))
+				return 1.0;
+		}
+	}
+
+	if (hist_exists && hist_weight > 0)
+	{
+		Selectivity hist_selec;
+
+		/* Commute operator, since we're passing lhs_value on the right */
+		hist_selec = inet_hist_value_sel(hist_values, hist_nvalues,
+										 lhs_value, -opr_codenum);
+
+		if (hist_selec > 0)
+			return Min(1.0, hist_weight * hist_selec);
+	}
+
+	return 0.0;
+}
+
+/*
+ * Assign useful code numbers for the subnet inclusion/overlap operators
+ *
+ * Only inet_masklen_inclusion_cmp() and inet_hist_match_divider() depend
+ * on the exact codes assigned here; but many other places in this file
+ * know that they can negate a code to obtain the code for the commutator
+ * operator.
+ */
+static int
+inet_opr_codenum(Oid operator)
+{
+	switch (operator)
+	{
+		case OID_INET_SUP_OP:
+			return -2;
+		case OID_INET_SUPEQ_OP:
+			return -1;
+		case OID_INET_OVERLAP_OP:
+			return 0;
+		case OID_INET_SUBEQ_OP:
+			return 1;
+		case OID_INET_SUB_OP:
+			return 2;
+		default:
+			elog(ERROR, "unrecognized operator %u for inet selectivity",
+				 operator);
+	}
+	return 0;					/* unreached, but keep compiler quiet */
+}
+
+/*
+ * Comparison function for the subnet inclusion/overlap operators
+ *
+ * If the comparison is okay for the specified inclusion operator, the return
+ * value will be 0.  Otherwise the return value will be less than or greater
+ * than 0 as appropriate for the operator.  When the families match but not
+ * the subnets, the return values will be exactly 256 or -255 which are
+ * otherwise not possible.
+ *
+ * Comparison is compatible with the basic comparison function for the inet
+ * type.  See network_cmp_internal() in network.c for the original.  Basic
+ * comparison operators are implemented with the network_cmp_internal()
+ * function.  It is possible to implement the subnet inclusion operators with
+ * this function.
+ *
+ * Comparison is first on the common bits of the network part, then on the
+ * length of the network part (masklen) as in the network_cmp_internal()
+ * function.  Only the first part is in this function.  The second part is
+ * separated to another function for reusability.  The difference between the
+ * second part and the original network_cmp_internal() is that the inclusion
+ * operator is considered while comparing the lengths of the network parts.
+ * See the inet_masklen_inclusion_cmp() function below.
+ */
+static int
+inet_inclusion_cmp(inet *left, inet *right, int opr_codenum)
+{
+	if (ip_family(left) == ip_family(right))
+	{
+		int			order;
+
+		order = bitncmp(ip_addr(left), ip_addr(right),
+						Min(ip_bits(left), ip_bits(right)));
+
+		/*
+		 * bitncmp() can return anything, but we need to return specific
+		 * values to be used by the caller.
+		 */
+		if (order > 0)
+			return 256;
+		if (order < 0)
+			return -255;
+
+		return inet_masklen_inclusion_cmp(left, right, opr_codenum);
+	}
+
+	return ip_family(left) - ip_family(right);
+}
+
+/*
+ * Masklen comparison function for the subnet inclusion/overlap operators
+ *
+ * Compares the lengths of the network parts of the inputs.  If the comparison
+ * is okay for the specified inclusion operator, the return value will be 0.
+ * Otherwise the return value will be less than or greater than 0 as
+ * appropriate for the operator.
+ */
+static int
+inet_masklen_inclusion_cmp(inet *left, inet *right, int opr_codenum)
+{
+	int			order;
+
+	order = (int) ip_bits(left) - (int) ip_bits(right);
+
+	/*
+	 * Return 0 if the operator would accept this combination of masklens.
+	 * Note that opr_codenum zero (overlaps) will accept all cases.
+	 */
+	if ((order > 0 && opr_codenum >= 0) ||
+		(order == 0 && opr_codenum >= -1 && opr_codenum <= 1) ||
+		(order < 0 && opr_codenum <= 0))
+		return 0;
+
+	/*
+	 * Otherwise, return a negative value for sup/supeq (notionally, the RHS
+	 * needs to have a larger masklen than it has, which would make it sort
+	 * later), or a positive value for sub/subeq (vice versa).
+	 */
+	return opr_codenum;
+}
+
+/*
+ * Inet histogram partial match divider calculation
+ *
+ * First the families and the lengths of the network parts are compared using
+ * the subnet inclusion operator.  If those are acceptable for the operator,
+ * the divider will be calculated using the masklens and the common bits of
+ * the addresses.  -1 will be returned if it cannot be calculated.
+ *
+ * See commentary for inet_hist_value_sel() for some rationale for this.
+ */
+static int
+inet_hist_match_divider(inet *boundary, inet *query, int opr_codenum)
+{
+	if (ip_family(boundary) == ip_family(query) &&
+		inet_masklen_inclusion_cmp(boundary, query, opr_codenum) == 0)
+	{
+		int			min_bits,
+					decisive_bits;
+
+		min_bits = Min(ip_bits(boundary), ip_bits(query));
+
+		/*
+		 * Set decisive_bits to the masklen of the one that should contain the
+		 * other according to the operator.
+		 */
+		if (opr_codenum < 0)
+			decisive_bits = ip_bits(boundary);
+		else if (opr_codenum > 0)
+			decisive_bits = ip_bits(query);
+		else
+			decisive_bits = min_bits;
+
+		/*
+		 * Now return the number of non-common decisive bits.  (This will be
+		 * zero if the boundary and query in fact match, else positive.)
+		 */
+		if (min_bits > 0)
+			return decisive_bits - bitncommon(ip_addr(boundary),
+											  ip_addr(query),
+											  min_bits);
+		return decisive_bits;
+	}
+
+	return -1;
 }
#25Tom Lane
tgl@sss.pgh.pa.us
In reply to: Emre Hasegeli (#24)
Re: Selectivity estimation for inet operators

Emre Hasegeli <emre@hasegeli.com> writes:

[ inet-selfuncs-v14.patch ]

After further reflection I concluded that the best way to deal with the
O(N^2) runtime problem for the join selectivity function was to set a
limit on the number of statistics values we'd consider, as was discussed
awhile back IIRC. We can easily consider just the first N values of the
MCV arrays, since those are sorted by decreasing frequency anyway. For
the histogram arrays, taking every K'th value seems like the thing to do.
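
For readers following along, here is a rough sketch of what "first N values" and "every K'th value" could look like. It is not the committed code: the helper names and the round-up stride formula are assumptions made for illustration, and MAX_CONSIDERED_ELEMS is a hypothetical name for the limit.

```c
#include <assert.h>

/* Hypothetical name for the cap on statistics values considered. */
#define MAX_CONSIDERED_ELEMS 1024

/*
 * MCV arrays are sorted by decreasing frequency, so looking only at the
 * leading entries keeps the most significant ones.
 */
static int
mcv_entries_to_consider(int mcv_nvalues, int max_elems)
{
	assert(mcv_nvalues >= 0 && max_elems > 0);
	return (mcv_nvalues < max_elems) ? mcv_nvalues : max_elems;
}

/*
 * Stride K for walking a histogram: 1 means "use every value".  The
 * stride is rounded up so that at most max_elems values are visited.
 */
static int
hist_stride(int hist_nvalues, int max_elems)
{
	assert(hist_nvalues >= 0 && max_elems > 0);
	if (hist_nvalues <= max_elems)
		return 1;
	return (hist_nvalues + max_elems - 1) / max_elems;
}

/* Count how many histogram values a walk over 0, K, 2K, ... touches. */
static int
hist_values_visited(int hist_nvalues, int max_elems)
{
	int			k = hist_stride(hist_nvalues, max_elems);
	int			n = 0;
	int			i;

	for (i = 0; i < hist_nvalues; i += k)
		n++;
	return n;
}
```

With these assumptions, a 10001-entry histogram is walked with a stride of 10, while anything at or under the cap is used in full.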

I made the limit be 1024 elements as that seemed to give an acceptable
maximum runtime (a couple hundred msec on my machine). We could probably
reduce that if anyone feels the max runtime needs to be less.
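
As a back-of-the-envelope check on why a cap bounds the runtime, the sketch below counts value-pair checks for the inner-join case, assuming each MCV list and each histogram is independently capped. The function name and the counting model (MCV vs MCV, MCV vs histogram in both directions, histogram vs histogram) are assumptions for illustration, and the result is only a rough upper bound.

```c
#include <assert.h>

/* Clamp a list length to the cap. */
static long long
capped(long long n, long long cap)
{
	return (n < cap) ? n : cap;
}

/*
 * Rough upper bound on pairwise checks for one inner-join estimate:
 * every considered value on one side is compared against every
 * considered value on the other side.
 */
static long long
worst_case_pair_checks(long long mcv1, long long hist1,
					   long long mcv2, long long hist2,
					   long long cap)
{
	long long	m1 = capped(mcv1, cap);
	long long	h1 = capped(hist1, cap);
	long long	m2 = capped(mcv2, cap);
	long long	h2 = capped(hist2, cap);

	assert(cap > 0);
	return m1 * m2 + m1 * h2 + m2 * h1 + h1 * h2;
}
```

Under this model, statistics targets far above the cap top out at 4 * 1024 * 1024 = 4194304 checks, whereas the uncapped count would keep growing quadratically.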

I had to drop the idea of breaking out of the histogram loop early as that
didn't play nicely with the decimation logic, unfortunately.
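
To make the shape of a decimated scan concrete, here is a toy sketch with no early exit. It is not the patch's logic: a simple integer containment test stands in for inet_inclusion_cmp() and the partial-match divider, and the function name is hypothetical. The point it illustrates is that once boundaries are skipped, the divisor has to be the number of buckets actually examined rather than nvalues - 1.

```c
#include <assert.h>

/*
 * Toy histogram scan over sorted integer boundaries, using only every
 * k'th boundary.  A bucket counts as matched when the query lies within
 * [left, right].  A trailing partial stride is simply ignored here.
 */
static double
decimated_bucket_fraction(const int *values, int nvalues, int k, int query)
{
	double		match = 0.0;
	int			nbuckets = 0;
	int			left;
	int			i;

	assert(k >= 1);
	if (nvalues <= 1)
		return 0.0;

	left = values[0];
	for (i = k; i < nvalues; i += k)
	{
		int			right = values[i];

		if (left <= query && query <= right)
			match += 1.0;

		/* Count the buckets we really looked at. */
		nbuckets++;
		left = right;
	}

	if (nbuckets == 0)
		return 0.0;
	return match / nbuckets;
}
```

With boundaries 0, 10, ..., 80 and a query of 25, a stride of 1 gives 1 of 8 buckets, a stride of 2 gives 1 of 4, and a stride of 4 gives 1 of 2, so coarser strides trade accuracy for speed.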

Anyway, pushed. Thanks for your perseverance on this!

regards, tom lane
