PATCH: multivariate histograms and MCV lists
Hi all,
For PostgreSQL 10 we managed to get the basic CREATE STATISTICS bits in
(grammar, infrastructure, and two simple types of statistics). See:
https://commitfest.postgresql.org/13/852/
This patch presents a rebased version of the remaining parts, adding
more complex statistic types (MCV lists and histograms), and hopefully
some additional improvements.
The code was rebased on top of current master, and I've made various
improvements to match how the committed parts were reworked. So the
basic idea and shape remains the same, the tweaks are mostly small.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachments:
0001-Multivariate-MCV-list-statistics.patch (text/x-patch)
From c66c9cd2d5ec3c3433e6c9a8b3477b274468442a Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@2ndquadrant.com>
Date: Thu, 3 Aug 2017 21:55:10 +0200
Subject: [PATCH 1/3] Multivariate MCV list statistics
---
doc/src/sgml/catalogs.sgml | 10 +
doc/src/sgml/planstats.sgml | 139 ++
doc/src/sgml/ref/create_statistics.sgml | 32 +-
src/backend/commands/statscmds.c | 107 +-
src/backend/optimizer/path/clausesel.c | 10 +
src/backend/optimizer/util/plancat.c | 12 +
src/backend/statistics/Makefile | 2 +-
src/backend/statistics/README.mcv | 137 ++
src/backend/statistics/dependencies.c | 74 +-
src/backend/statistics/extended_stats.c | 215 ++-
src/backend/statistics/mcv.c | 1809 ++++++++++++++++++++++
src/backend/utils/adt/ruleutils.c | 24 +-
src/bin/psql/describe.c | 9 +-
src/include/catalog/pg_cast.h | 5 +
src/include/catalog/pg_proc.h | 12 +
src/include/catalog/pg_statistic_ext.h | 5 +-
src/include/catalog/pg_type.h | 4 +
src/include/statistics/extended_stats_internal.h | 34 +-
src/include/statistics/statistics.h | 47 +
src/test/regress/expected/opr_sanity.out | 3 +-
src/test/regress/expected/stats_ext.out | 219 ++-
src/test/regress/expected/type_sanity.out | 3 +-
src/test/regress/sql/stats_ext.sql | 121 ++
23 files changed, 2957 insertions(+), 76 deletions(-)
create mode 100644 src/backend/statistics/README.mcv
create mode 100644 src/backend/statistics/mcv.c
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index ef7054c..e07fe46 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -6468,6 +6468,16 @@ SCRAM-SHA-256$<replaceable><iteration count></>:<replaceable><salt><
</entry>
</row>
+ <row>
+ <entry><structfield>stxmcv</structfield></entry>
+ <entry><type>pg_mcv_list</type></entry>
+ <entry></entry>
+ <entry>
+       MCV (most-common values) list statistics, serialized as the
+       <structname>pg_mcv_list</> type.
+ </entry>
+ </row>
+
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/planstats.sgml b/doc/src/sgml/planstats.sgml
index 838fcda..1e81d94 100644
--- a/doc/src/sgml/planstats.sgml
+++ b/doc/src/sgml/planstats.sgml
@@ -585,6 +585,145 @@ EXPLAIN (ANALYZE, TIMING OFF) SELECT COUNT(*) FROM t GROUP BY a, b;
</para>
</sect2>
+
+ <sect2 id="mcv-lists">
+ <title>MCV lists</title>
+
+ <para>
+ As explained in the previous section, functional dependencies are a very
+ cheap and efficient type of statistics, but they have limitations due to
+ their global nature (they only track column-level dependencies, not
+ dependencies between the values stored in the columns).
+ </para>
+
+ <para>
+ This section introduces multivariate most-common values (<acronym>MCV</>)
+ lists, a direct generalization of the statistics described in
+ <xref linkend="row-estimation-examples">, which are not subject to this
+ limitation. They are however more expensive, both in terms of storage and
+ planning time.
+ </para>
+
+ <para>
+ Let's look at the example query from the previous section again, creating
+ a multivariate <acronym>MCV</> list on the columns (after dropping the
+ functional dependencies, to make sure the planner uses the newly created
+ <acronym>MCV</> list when computing the estimates).
+
+<programlisting>
+CREATE STATISTICS stts2 (mcv) ON a, b FROM t;
+ANALYZE t;
+EXPLAIN ANALYZE SELECT * FROM t WHERE a = 1 AND b = 1;
+ QUERY PLAN
+-------------------------------------------------------------------------------------------------
+ Seq Scan on t (cost=0.00..195.00 rows=100 width=8) (actual time=0.036..3.011 rows=100 loops=1)
+ Filter: ((a = 1) AND (b = 1))
+ Rows Removed by Filter: 9900
+ Planning time: 0.188 ms
+ Execution time: 3.229 ms
+(5 rows)
+</programlisting>
+
+ The estimate is as accurate as with the functional dependencies, mostly
+ thanks to the table being fairly small and having a simple distribution
+ with a low number of distinct values. Before looking at the second query,
+ which functional dependencies did not handle so well, let's inspect
+ the <acronym>MCV</> list a bit.
+ </para>
+
+ <para>
+ First, let's list the statistics objects defined on the table, using
+ <command>\d</> in <application>psql</>:
+
+<programlisting>
+\d t
+ Table "public.t"
+ Column | Type | Modifiers
+--------+---------+-----------
+ a | integer |
+ b | integer |
+Statistics objects:
+ "public"."stts2" (mcv) ON a, b FROM t
+</programlisting>
+
+ </para>
+
+ <para>
+ Inspecting the contents of the MCV list is possible using the
+ <function>pg_mcv_list_items</> set-returning function.
+
+<programlisting>
+SELECT * FROM pg_mcv_list_items((SELECT oid FROM pg_statistic_ext WHERE stxname = 'stts2'));
+ index | values | nulls | frequency
+-------+---------+-------+-----------
+ 0 | {0,0} | {f,f} | 0.01
+ 1 | {1,1} | {f,f} | 0.01
+ 2 | {2,2} | {f,f} | 0.01
+...
+ 49 | {49,49} | {f,f} | 0.01
+ 50 | {50,0} | {f,f} | 0.01
+...
+ 97 | {97,47} | {f,f} | 0.01
+ 98 | {98,48} | {f,f} | 0.01
+ 99 | {99,49} | {f,f} | 0.01
+(100 rows)
+</programlisting>
+
+ This confirms there are 100 distinct combinations of values in the two
+ columns, all of them equally likely (1% frequency each).
+ Had there been any null values in either of the columns, this would be
+ reflected in the <structfield>nulls</> column.
+ </para>
+
+ <para>
+ When estimating the selectivity, the planner applies the conditions to
+ the items in the <acronym>MCV</> list, and then sums the frequencies
+ of the matching ones. See <function>mcv_clauselist_selectivity</>
+ in <filename>mcv.c</> for details.
+ </para>
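+
+ <para>
+ As a rough cross-check (a simplification of what the planner does
+ internally, which works on the deserialized list in memory), we can
+ compute the estimate ourselves from the frequency of the matching item,
+ scaled by the number of rows in the table (10000 in this example):
+
+<programlisting>
+SELECT 10000 * frequency AS expected_rows
+  FROM pg_mcv_list_items((SELECT oid FROM pg_statistic_ext WHERE stxname = 'stts2'))
+ WHERE "values" = '{1,1}';
+</programlisting>
+ </para>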
+
+ <para>
+ Compared to functional dependencies, <acronym>MCV</> lists have two major
+ advantages. Firstly, the list stores actual values, making it possible to
+ detect "incompatible" combinations.
+
+<programlisting>
+EXPLAIN ANALYZE SELECT * FROM t WHERE a = 1 AND b = 10;
+ QUERY PLAN
+---------------------------------------------------------------------------------------------
+ Seq Scan on t (cost=0.00..195.00 rows=1 width=8) (actual time=2.823..2.823 rows=0 loops=1)
+ Filter: ((a = 1) AND (b = 10))
+ Rows Removed by Filter: 10000
+ Planning time: 0.268 ms
+ Execution time: 2.866 ms
+(5 rows)
+</programlisting>
+
+ Secondly, <acronym>MCV</> lists also handle a wide range of clause types,
+ not just equality clauses like functional dependencies. Consider for
+ example the range query presented earlier:
+
+<programlisting>
+EXPLAIN ANALYZE SELECT * FROM t WHERE a <= 49 AND b > 49;
+ QUERY PLAN
+---------------------------------------------------------------------------------------------
+ Seq Scan on t (cost=0.00..195.00 rows=1 width=8) (actual time=3.349..3.349 rows=0 loops=1)
+ Filter: ((a <= 49) AND (b > 49))
+ Rows Removed by Filter: 10000
+ Planning time: 0.163 ms
+ Execution time: 3.389 ms
+(5 rows)
+</programlisting>
+
+ </para>
+
+ <para>
+ For additional information about multivariate MCV lists, see
+ <filename>src/backend/statistics/README.mcv</>.
+ </para>
+
+ </sect2>
+
</sect1>
<sect1 id="planner-stats-security">
diff --git a/doc/src/sgml/ref/create_statistics.sgml b/doc/src/sgml/ref/create_statistics.sgml
index deda21f..52851da 100644
--- a/doc/src/sgml/ref/create_statistics.sgml
+++ b/doc/src/sgml/ref/create_statistics.sgml
@@ -81,9 +81,10 @@ CREATE STATISTICS [ IF NOT EXISTS ] <replaceable class="PARAMETER">statistics_na
<para>
A statistic type to be computed in this statistics object.
Currently supported types are
- <literal>ndistinct</literal>, which enables n-distinct statistics, and
- <literal>dependencies</literal>, which enables functional
- dependency statistics.
+ <literal>ndistinct</literal>, which enables n-distinct statistics,
+ <literal>dependencies</literal>, which enables functional dependency
+ statistics, and <literal>mcv</literal>, which enables most-common
+ values lists.
If this clause is omitted, all supported statistic types are
included in the statistics object.
For more information, see <xref linkend="planner-stats-extended">
@@ -164,6 +165,31 @@ EXPLAIN ANALYZE SELECT * FROM t1 WHERE (a = 1) AND (b = 0);
conditions are redundant and does not underestimate the rowcount.
</para>
+ <para>
+ Create table <structname>t2</> with two perfectly correlated columns
+ (containing identical data), and an <acronym>MCV</> list on those columns:
+
+<programlisting>
+CREATE TABLE t2 (
+ a int,
+ b int
+);
+
+INSERT INTO t2 SELECT mod(i,100), mod(i,100)
+ FROM generate_series(1,1000000) s(i);
+
+CREATE STATISTICS s2 (mcv) ON a, b FROM t2;
+
+ANALYZE t2;
+
+-- valid combination (found in MCV)
+EXPLAIN ANALYZE SELECT * FROM t2 WHERE (a = 1) AND (b = 1);
+
+-- invalid combination (not found in MCV)
+EXPLAIN ANALYZE SELECT * FROM t2 WHERE (a = 1) AND (b = 2);
+</programlisting>
+ </para>
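+
+ <para>
+ The contents of the newly built <acronym>MCV</> list can then be
+ inspected using the <function>pg_mcv_list_items</> set-returning
+ function, for example:
+
+<programlisting>
+SELECT * FROM pg_mcv_list_items((SELECT oid FROM pg_statistic_ext WHERE stxname = 's2'));
+</programlisting>
+ </para>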
+
</refsect1>
<refsect1>
diff --git a/src/backend/commands/statscmds.c b/src/backend/commands/statscmds.c
index 4765055..0bcea4b 100644
--- a/src/backend/commands/statscmds.c
+++ b/src/backend/commands/statscmds.c
@@ -64,11 +64,12 @@ CreateStatistics(CreateStatsStmt *stmt)
Oid relid;
ObjectAddress parentobject,
myself;
- Datum types[2]; /* one for each possible type of statistic */
+ Datum types[3]; /* one for each possible type of statistic */
int ntypes;
ArrayType *stxkind;
bool build_ndistinct;
bool build_dependencies;
+ bool build_mcv;
bool requested_type = false;
int i;
ListCell *cell;
@@ -246,6 +247,7 @@ CreateStatistics(CreateStatsStmt *stmt)
*/
build_ndistinct = false;
build_dependencies = false;
+ build_mcv = false;
foreach(cell, stmt->stat_types)
{
char *type = strVal((Value *) lfirst(cell));
@@ -260,6 +262,11 @@ CreateStatistics(CreateStatsStmt *stmt)
build_dependencies = true;
requested_type = true;
}
+ else if (strcmp(type, "mcv") == 0)
+ {
+ build_mcv = true;
+ requested_type = true;
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -271,6 +278,7 @@ CreateStatistics(CreateStatsStmt *stmt)
{
build_ndistinct = true;
build_dependencies = true;
+ build_mcv = true;
}
/* construct the char array of enabled statistic types */
@@ -279,6 +287,8 @@ CreateStatistics(CreateStatsStmt *stmt)
types[ntypes++] = CharGetDatum(STATS_EXT_NDISTINCT);
if (build_dependencies)
types[ntypes++] = CharGetDatum(STATS_EXT_DEPENDENCIES);
+ if (build_mcv)
+ types[ntypes++] = CharGetDatum(STATS_EXT_MCV);
Assert(ntypes > 0 && ntypes <= lengthof(types));
stxkind = construct_array(types, ntypes, CHAROID, 1, true, 'c');
@@ -297,6 +307,7 @@ CreateStatistics(CreateStatsStmt *stmt)
/* no statistics built yet */
nulls[Anum_pg_statistic_ext_stxndistinct - 1] = true;
nulls[Anum_pg_statistic_ext_stxdependencies - 1] = true;
+ nulls[Anum_pg_statistic_ext_stxmcv - 1] = true;
/* insert it into pg_statistic_ext */
statrel = heap_open(StatisticExtRelationId, RowExclusiveLock);
@@ -387,21 +398,95 @@ RemoveStatisticsById(Oid statsOid)
* null until the next ANALYZE. (Note that the type change hasn't actually
* happened yet, so one option that's *not* on the table is to recompute
* immediately.)
+ *
+ * For both ndistinct and functional-dependencies stats, the on-disk
+ * representation is independent of the source column data types, and it is
+ * plausible to assume that the old statistic values will still be good for
+ * the new column contents. (Obviously, if the ALTER COLUMN TYPE has a USING
+ * expression that substantially alters the semantic meaning of the column
+ * values, this assumption could fail. But that seems like a corner case
+ * that doesn't justify zapping the stats in common cases.)
+ *
+ * For MCV lists that's not the case, as those statistics store the datums
+ * internally. In this case we simply reset the statistics value to NULL.
*/
void
UpdateStatisticsForTypeChange(Oid statsOid, Oid relationOid, int attnum,
Oid oldColumnType, Oid newColumnType)
{
+ Form_pg_statistic_ext staForm;
+ HeapTuple stup,
+ oldtup;
+ int i;
+
+ /* Do we need to reset anything? */
+ bool attribute_referenced;
+ bool reset_stats = false;
+
+ Relation rel;
+
+ Datum values[Natts_pg_statistic_ext];
+ bool nulls[Natts_pg_statistic_ext];
+ bool replaces[Natts_pg_statistic_ext];
+
+ oldtup = SearchSysCache1(STATEXTOID, ObjectIdGetDatum(statsOid));
+ if (!oldtup)
+ elog(ERROR, "cache lookup failed for statistics object %u", statsOid);
+ staForm = (Form_pg_statistic_ext) GETSTRUCT(oldtup);
+
+ /*
+ * If the modified attribute is not referenced by this statistic, we
+ * can simply leave the statistics alone.
+ */
+ attribute_referenced = false;
+ for (i = 0; i < staForm->stxkeys.dim1; i++)
+ if (attnum == staForm->stxkeys.values[i])
+ attribute_referenced = true;
+
/*
- * Currently, we don't actually need to do anything here. For both
- * ndistinct and functional-dependencies stats, the on-disk representation
- * is independent of the source column data types, and it is plausible to
- * assume that the old statistic values will still be good for the new
- * column contents. (Obviously, if the ALTER COLUMN TYPE has a USING
- * expression that substantially alters the semantic meaning of the column
- * values, this assumption could fail. But that seems like a corner case
- * that doesn't justify zapping the stats in common cases.)
- *
- * Future types of extended stats will likely require us to work harder.
+ * We can leave the record as it is if none of the statistics kinds that
+ * store datum values (currently just MCV lists) is actually built.
*/
+ if (statext_is_kind_built(oldtup, STATS_EXT_MCV))
+ reset_stats = true;
+
+ /*
+ * If we can leave the statistics as they are, just do minimal cleanup
+ * and we're done.
+ */
+ if (!attribute_referenced || !reset_stats)
+ {
+ ReleaseSysCache(oldtup);
+ return;
+ }
+
+ /*
+ * OK, we need to reset some statistics. So let's build the new tuple,
+ * replacing the affected statistics types with NULL.
+ */
+ memset(nulls, 1, Natts_pg_statistic_ext * sizeof(bool));
+ memset(replaces, 0, Natts_pg_statistic_ext * sizeof(bool));
+ memset(values, 0, Natts_pg_statistic_ext * sizeof(Datum));
+
+ if (statext_is_kind_built(oldtup, STATS_EXT_MCV))
+ {
+ replaces[Anum_pg_statistic_ext_stxmcv - 1] = true;
+ nulls[Anum_pg_statistic_ext_stxmcv - 1] = true;
+ }
+
+ rel = heap_open(StatisticExtRelationId, RowExclusiveLock);
+
+ /* replace the old tuple */
+ stup = heap_modify_tuple(oldtup,
+ RelationGetDescr(rel),
+ values,
+ nulls,
+ replaces);
+
+ ReleaseSysCache(oldtup);
+ CatalogTupleUpdate(rel, &stup->t_self, stup);
+
+ heap_freetuple(stup);
+
+ heap_close(rel, RowExclusiveLock);
}
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index 9d34025..28a9321 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -125,6 +125,16 @@ clauselist_selectivity(PlannerInfo *root,
if (rel && rel->rtekind == RTE_RELATION && rel->statlist != NIL)
{
/*
+ * Perform selectivity estimations on any clauses applicable by
+ * mcv_clauselist_selectivity. 'estimatedclauses' will be filled with
+ * the 0-based list positions of clauses used that way, so that we can
+ * ignore them below.
+ */
+ s1 *= mcv_clauselist_selectivity(root, clauses, varRelid,
+ jointype, sjinfo, rel,
+ &estimatedclauses);
+
+ /*
* Perform selectivity estimations on any clauses found applicable by
* dependencies_clauselist_selectivity. 'estimatedclauses' will be
* filled with the 0-based list positions of clauses used that way, so
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index dc0b0b0..ab2c8c2 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -1321,6 +1321,18 @@ get_relation_statistics(RelOptInfo *rel, Relation relation)
stainfos = lcons(info, stainfos);
}
+ if (statext_is_kind_built(htup, STATS_EXT_MCV))
+ {
+ StatisticExtInfo *info = makeNode(StatisticExtInfo);
+
+ info->statOid = statOid;
+ info->rel = rel;
+ info->kind = STATS_EXT_MCV;
+ info->keys = bms_copy(keys);
+
+ stainfos = lcons(info, stainfos);
+ }
+
ReleaseSysCache(htup);
bms_free(keys);
}
diff --git a/src/backend/statistics/Makefile b/src/backend/statistics/Makefile
index 3404e45..d281526 100644
--- a/src/backend/statistics/Makefile
+++ b/src/backend/statistics/Makefile
@@ -12,6 +12,6 @@ subdir = src/backend/statistics
top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
-OBJS = extended_stats.o dependencies.o mvdistinct.o
+OBJS = extended_stats.o dependencies.o mcv.o mvdistinct.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/statistics/README.mcv b/src/backend/statistics/README.mcv
new file mode 100644
index 0000000..22c2b87
--- /dev/null
+++ b/src/backend/statistics/README.mcv
@@ -0,0 +1,137 @@
+MCV lists
+=========
+
+Multivariate MCV (most-common values) lists are a straightforward extension of
+regular per-column MCV lists, tracking the most frequent combinations of values
+for a group of attributes.
+
+This works particularly well for columns with a small number of distinct values,
+as the list may include all the combinations and approximate the distribution
+very accurately.
+
+For columns with a large number of distinct values (e.g. those with continuous
+domains), the list will only track the most frequent combinations. If the
+distribution is mostly uniform (all combinations about equally frequent), the
+MCV list will be empty.
+
+Estimates of some clauses (e.g. equality) based on MCV lists are more accurate
+than when using histograms.
+
+Also, MCV lists don't necessarily require sorting of the values (the fact that
+we use sorting when building them is an implementation detail), but even more
+importantly the ordering is not built into the approximation (while histograms
+are built on ordering). So MCV lists work well even for attributes where the
+ordering of the data type is disconnected from the meaning of the data. For
+example, we know how to sort strings, but the ordering is unlikely to be
+meaningful for city names (or other label-like attributes).
+
+
+Selectivity estimation
+----------------------
+
+The estimation, implemented in mcv_clauselist_selectivity(), is quite simple
+in principle - we need to identify the MCV items matching all the clauses and
+sum the frequencies of those items.
+
+Currently MCV lists support estimation of the following clause types:
+
+ (a) equality clauses WHERE (a = 1) AND (b = 2)
+ (b) inequality clauses WHERE (a < 1) AND (b >= 2)
+ (c) NULL clauses WHERE (a IS NULL) AND (b IS NOT NULL)
+ (d) OR clauses WHERE (a < 1) OR (b >= 2)
+
+It's possible to add support for additional clauses, for example:
+
+ (e) multi-var clauses WHERE (a > b)
+
+and possibly others. These are tasks for the future, not yet implemented.
+
+
+Estimating equality clauses
+---------------------------
+
+When computing selectivity estimate for equality clauses
+
+ (a = 1) AND (b = 2)
+
+we can do this estimate pretty exactly assuming that two conditions are met:
+
+ (1) there's an equality condition on all attributes of the statistic
+
+ (2) we find a matching item in the MCV list
+
+In this case we know the MCV item represents all tuples matching the clauses,
+and the selectivity estimate is complete (i.e. we don't need to perform
+estimation using the histogram). This is what we call 'full match'.
+
+When only (1) holds, but there's no matching MCV item, we don't know whether
+there are no such rows or they are just not frequent enough. We can however use the
+frequency of the least frequent MCV item as an upper bound for the selectivity.
+
+For a combination of equality conditions (not full-match case) we can clamp the
+selectivity by the minimum of selectivities for each condition. For example if
+we know the number of distinct values for each column, we can use 1/ndistinct
+as a per-column estimate. Or rather 1/ndistinct + selectivity derived from the
+MCV list.
+
+We should also probably only use the 'residual ndistinct', by excluding the items
+included in the MCV list (and also residual frequency):
+
+ f = (1.0 - sum(MCV frequencies)) / (ndistinct - ndistinct(MCV list))
+
+but it's worth pointing out the ndistinct values are multi-variate for the
+columns referenced by the equality conditions.
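+
+For example, assuming the MCV items cover 80% of the sampled rows (i.e. the
+item frequencies sum to 0.8) and represent 100 out of an estimated 1000
+distinct groups, the residual frequency of each non-MCV group would be
+
+    f = (1.0 - 0.8) / (1000 - 100) = 0.2 / 900 ~ 0.00022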
+
+Note: Only the "full match" limit is currently implemented.
+
+
+Hashed MCV (not yet implemented)
+--------------------------------
+
+Regular MCV lists have to include actual values for each item, so if those items
+are large the list may be quite large. This is especially true for multi-variate
+MCV lists, although the current implementation partially mitigates this by
+de-duplicating the values before storing them on disk.
+
+It's possible to only store hashes (32-bit values) instead of the actual values,
+significantly reducing the space requirements. Obviously, this would only make
+the MCV lists useful for estimating equality conditions (assuming the 32-bit
+hashes make the collisions rare enough).
+
+This might also complicate matching the columns to available stats.
+
+
+TODO Consider implementing hashed MCV list, storing just 32-bit hashes instead
+ of the actual values. This type of MCV list will be useful only for
+ estimating equality clauses, and will reduce space requirements for large
+ varlena types (in such cases we usually only want equality anyway).
+
+TODO Currently there's no logic to consider building only an MCV list (and not
+     building the histogram at all), except for making this decision manually
+     in CREATE STATISTICS.
+
+
+Inspecting the MCV list
+-----------------------
+
+Inspecting the regular (per-attribute) MCV lists is trivial, as it's enough
+to select the columns from pg_stats - the data is encoded as anyarrays, so we
+simply get the text representation of the arrays.
+
+With multivariate MCV lists it's not that simple, due to the possible mix of
+data types. It might be possible to produce similar array-like representation,
+but that'd unnecessarily complicate further processing and analysis of the MCV
+list. Instead, there's a set-returning function providing values, frequencies etc.
+
+ SELECT * FROM pg_mcv_list_items(oid);
+
+It has a single input parameter:
+
+ oid - OID of the statistics object (the OID of the pg_statistic_ext row)
+
+and produces a table with these columns:
+
+ - item ID (0...nitems-1)
+ - values (string array)
+ - nulls (boolean array)
+ - frequency (double precision)
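+
+For example, assuming a statistics object named 'stts2' with a built MCV
+list, the items may be inspected like this:
+
+    SELECT m.*
+      FROM pg_mcv_list_items((SELECT oid FROM pg_statistic_ext
+                              WHERE stxname = 'stts2')) m;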
diff --git a/src/backend/statistics/dependencies.c b/src/backend/statistics/dependencies.c
index 2e7c0ad..27e096f 100644
--- a/src/backend/statistics/dependencies.c
+++ b/src/backend/statistics/dependencies.c
@@ -201,14 +201,11 @@ static double
dependency_degree(int numrows, HeapTuple *rows, int k, AttrNumber *dependency,
VacAttrStats **stats, Bitmapset *attrs)
{
- int i,
- j;
- int nvalues = numrows * k;
+ int i;
MultiSortSupport mss;
SortItem *items;
- Datum *values;
- bool *isnull;
int *attnums;
+ int *attnums_dep;
/* counters valid within a group */
int group_size = 0;
@@ -223,26 +220,16 @@ dependency_degree(int numrows, HeapTuple *rows, int k, AttrNumber *dependency,
/* sort info for all attributes columns */
mss = multi_sort_init(k);
- /* data for the sort */
- items = (SortItem *) palloc(numrows * sizeof(SortItem));
- values = (Datum *) palloc(sizeof(Datum) * nvalues);
- isnull = (bool *) palloc(sizeof(bool) * nvalues);
-
- /* fix the pointers to values/isnull */
- for (i = 0; i < numrows; i++)
- {
- items[i].values = &values[i * k];
- items[i].isnull = &isnull[i * k];
- }
-
/*
- * Transform the bms into an array, to make accessing i-th member easier.
+ * Transform the bms into an array, to make accessing i-th member easier,
+ * and then construct a filtered version with only attnums referenced
+ * by the dependency we validate.
*/
- attnums = (int *) palloc(sizeof(int) * bms_num_members(attrs));
- i = 0;
- j = -1;
- while ((j = bms_next_member(attrs, j)) >= 0)
- attnums[i++] = j;
+ attnums = build_attnums(attrs);
+
+ attnums_dep = (int *) palloc(k * sizeof(int));
+ for (i = 0; i < k; i++)
+ attnums_dep[i] = attnums[dependency[i]];
/*
* Verify the dependency (a,b,...)->z, using a rather simple algorithm:
@@ -254,7 +241,7 @@ dependency_degree(int numrows, HeapTuple *rows, int k, AttrNumber *dependency,
* (c) for each group count different values in the last column
*/
- /* prepare the sort function for the first dimension, and SortItem array */
+ /* prepare the sort function for the dimensions */
for (i = 0; i < k; i++)
{
VacAttrStats *colstat = stats[dependency[i]];
@@ -267,19 +254,16 @@ dependency_degree(int numrows, HeapTuple *rows, int k, AttrNumber *dependency,
/* prepare the sort function for this dimension */
multi_sort_add_dimension(mss, i, type->lt_opr);
-
- /* accumulate all the data for both columns into an array and sort it */
- for (j = 0; j < numrows; j++)
- {
- items[j].values[i] =
- heap_getattr(rows[j], attnums[dependency[i]],
- stats[i]->tupDesc, &items[j].isnull[i]);
- }
}
- /* sort the items so that we can detect the groups */
- qsort_arg((void *) items, numrows, sizeof(SortItem),
- multi_sort_compare, mss);
+ /*
+ * build an array of SortItem(s) sorted using the multi-sort support
+ *
+ * XXX This relies on all stats entries pointing to the same tuple
+ * descriptor. Not sure that is always guaranteed.
+ */
+ items = build_sorted_items(numrows, rows, stats[0]->tupDesc,
+ mss, k, attnums_dep);
/*
* Walk through the sorted array, split it into rows according to the
@@ -322,9 +306,9 @@ dependency_degree(int numrows, HeapTuple *rows, int k, AttrNumber *dependency,
}
pfree(items);
- pfree(values);
- pfree(isnull);
pfree(mss);
+ pfree(attnums);
+ pfree(attnums_dep);
/* Compute the 'degree of validity' as (supporting/total). */
return (n_supporting_rows * 1.0 / numrows);
@@ -351,7 +335,6 @@ statext_dependencies_build(int numrows, HeapTuple *rows, Bitmapset *attrs,
VacAttrStats **stats)
{
int i,
- j,
k;
int numattrs;
int *attnums;
@@ -364,11 +347,7 @@ statext_dependencies_build(int numrows, HeapTuple *rows, Bitmapset *attrs,
/*
* Transform the bms into an array, to make accessing i-th member easier.
*/
- attnums = palloc(sizeof(int) * bms_num_members(attrs));
- i = 0;
- j = -1;
- while ((j = bms_next_member(attrs, j)) >= 0)
- attnums[i++] = j;
+ attnums = build_attnums(attrs);
Assert(numattrs >= 2);
@@ -938,6 +917,9 @@ dependencies_clauselist_selectivity(PlannerInfo *root,
* the attnums for each clause in a list which we'll reference later so we
* don't need to repeat the same work again. We'll also keep track of all
* attnums seen.
+ *
+ * We also skip clauses that we already estimated using different types of
+ * statistics (we treat them as incompatible).
*/
listidx = 0;
foreach(l, clauses)
@@ -945,7 +927,8 @@ dependencies_clauselist_selectivity(PlannerInfo *root,
Node *clause = (Node *) lfirst(l);
AttrNumber attnum;
- if (dependency_is_compatible_clause(clause, rel->relid, &attnum))
+ if ((dependency_is_compatible_clause(clause, rel->relid, &attnum)) &&
+ (!bms_is_member(listidx, *estimatedclauses)))
{
list_attnums[listidx] = attnum;
clauses_attnums = bms_add_member(clauses_attnums, attnum);
@@ -1015,8 +998,7 @@ dependencies_clauselist_selectivity(PlannerInfo *root,
/*
* Skip incompatible clauses, and ones we've already estimated on.
*/
- if (list_attnums[listidx] == InvalidAttrNumber ||
- bms_is_member(listidx, *estimatedclauses))
+ if (list_attnums[listidx] == InvalidAttrNumber)
continue;
/*
diff --git a/src/backend/statistics/extended_stats.c b/src/backend/statistics/extended_stats.c
index db4987b..ee64214 100644
--- a/src/backend/statistics/extended_stats.c
+++ b/src/backend/statistics/extended_stats.c
@@ -53,7 +53,7 @@ static VacAttrStats **lookup_var_attr_stats(Relation rel, Bitmapset *attrs,
int nvacatts, VacAttrStats **vacatts);
static void statext_store(Relation pg_stext, Oid relid,
MVNDistinct *ndistinct, MVDependencies *dependencies,
- VacAttrStats **stats);
+ MCVList *mcvlist, VacAttrStats **stats);
/*
@@ -86,6 +86,7 @@ BuildRelationExtStatistics(Relation onerel, double totalrows,
StatExtEntry *stat = (StatExtEntry *) lfirst(lc);
MVNDistinct *ndistinct = NULL;
MVDependencies *dependencies = NULL;
+ MCVList *mcv = NULL;
VacAttrStats **stats;
ListCell *lc2;
@@ -122,10 +123,12 @@ BuildRelationExtStatistics(Relation onerel, double totalrows,
else if (t == STATS_EXT_DEPENDENCIES)
dependencies = statext_dependencies_build(numrows, rows,
stat->columns, stats);
+ else if (t == STATS_EXT_MCV)
+ mcv = statext_mcv_build(numrows, rows, stat->columns, stats);
}
/* store the statistics in the catalog */
- statext_store(pg_stext, stat->statOid, ndistinct, dependencies, stats);
+ statext_store(pg_stext, stat->statOid, ndistinct, dependencies, mcv, stats);
}
heap_close(pg_stext, RowExclusiveLock);
@@ -153,6 +156,10 @@ statext_is_kind_built(HeapTuple htup, char type)
attnum = Anum_pg_statistic_ext_stxdependencies;
break;
+ case STATS_EXT_MCV:
+ attnum = Anum_pg_statistic_ext_stxmcv;
+ break;
+
default:
elog(ERROR, "unexpected statistics type requested: %d", type);
}
@@ -217,7 +224,8 @@ fetch_statentries_for_relation(Relation pg_statext, Oid relid)
for (i = 0; i < ARR_DIMS(arr)[0]; i++)
{
Assert((enabled[i] == STATS_EXT_NDISTINCT) ||
- (enabled[i] == STATS_EXT_DEPENDENCIES));
+ (enabled[i] == STATS_EXT_DEPENDENCIES) ||
+ (enabled[i] == STATS_EXT_MCV));
entry->types = lappend_int(entry->types, (int) enabled[i]);
}
@@ -286,13 +294,59 @@ lookup_var_attr_stats(Relation rel, Bitmapset *attrs,
}
/*
+ * Find attnums of MV stats using the mvoid.
+ */
+int2vector *
+find_ext_attnums(Oid mvoid, Oid *relid)
+{
+ ArrayType *arr;
+ Datum adatum;
+ bool isnull;
+ HeapTuple htup;
+ int2vector *keys;
+
+ /* Fetch the pg_statistic_ext tuple for the statistics object. */
+ htup = SearchSysCache1(STATEXTOID,
+ ObjectIdGetDatum(mvoid));
+
+ /* XXX syscache contains OIDs of deleted stats (not invalidated) */
+ if (!HeapTupleIsValid(htup))
+ return NULL;
+
+ /* stxrelid */
+ adatum = SysCacheGetAttr(STATEXTOID, htup,
+ Anum_pg_statistic_ext_stxrelid, &isnull);
+ Assert(!isnull);
+
+ *relid = DatumGetObjectId(adatum);
+
+ /* stxkeys */
+ adatum = SysCacheGetAttr(STATEXTOID, htup,
+ Anum_pg_statistic_ext_stxkeys, &isnull);
+ Assert(!isnull);
+
+ arr = DatumGetArrayTypeP(adatum);
+
+ keys = buildint2vector((int16 *) ARR_DATA_PTR(arr),
+ ARR_DIMS(arr)[0]);
+ ReleaseSysCache(htup);
+
+ /*
+ * TODO maybe save the list into relcache, as in RelationGetIndexList
+ * (which served as an inspiration for this function)?
+ */
+
+ return keys;
+}
+
+/*
* statext_store
* Serializes the statistics and stores them into the pg_statistic_ext tuple.
*/
static void
statext_store(Relation pg_stext, Oid statOid,
MVNDistinct *ndistinct, MVDependencies *dependencies,
- VacAttrStats **stats)
+ MCVList *mcv, VacAttrStats **stats)
{
HeapTuple stup,
oldtup;
@@ -323,9 +377,18 @@ statext_store(Relation pg_stext, Oid statOid,
values[Anum_pg_statistic_ext_stxdependencies - 1] = PointerGetDatum(data);
}
+ if (mcv != NULL)
+ {
+ bytea *data = statext_mcv_serialize(mcv, stats);
+
+ nulls[Anum_pg_statistic_ext_stxmcv - 1] = (data == NULL);
+ values[Anum_pg_statistic_ext_stxmcv - 1] = PointerGetDatum(data);
+ }
+
/* always replace the value (either by bytea or NULL) */
replaces[Anum_pg_statistic_ext_stxndistinct - 1] = true;
replaces[Anum_pg_statistic_ext_stxdependencies - 1] = true;
+ replaces[Anum_pg_statistic_ext_stxmcv - 1] = true;
/* there should already be a pg_statistic_ext tuple */
oldtup = SearchSysCache1(STATEXTOID, ObjectIdGetDatum(statOid));
@@ -432,6 +495,137 @@ multi_sort_compare_dims(int start, int end,
return 0;
}
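+
+/*
+ * compare_scalars_simple
+ *		qsort_arg comparator for Datums, using the SortSupport passed
+ *		through the extra argument (assumes neither value is NULL)
+ */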
+int
+compare_scalars_simple(const void *a, const void *b, void *arg)
+{
+ return compare_datums_simple(*(Datum *) a,
+ *(Datum *) b,
+ (SortSupport) arg);
+}
+
+int
+compare_datums_simple(Datum a, Datum b, SortSupport ssup)
+{
+ return ApplySortComparator(a, false, b, false, ssup);
+}
+
+/* simple counterpart to qsort_arg */
+void *
+bsearch_arg(const void *key, const void *base, size_t nmemb, size_t size,
+ int (*compar) (const void *, const void *, void *),
+ void *arg)
+{
+ size_t l,
+ u,
+ idx;
+ const void *p;
+ int comparison;
+
+ l = 0;
+ u = nmemb;
+ while (l < u)
+ {
+ idx = (l + u) / 2;
+ p = (void *) (((const char *) base) + (idx * size));
+ comparison = (*compar) (key, p, arg);
+
+ if (comparison < 0)
+ u = idx;
+ else if (comparison > 0)
+ l = idx + 1;
+ else
+ return (void *) p;
+ }
+
+ return NULL;
+}
+
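+/*
+ * build_attnums
+ *		Transform a bitmapset into an array of member attnums
+ *
+ * The array is allocated with palloc, so the caller should pfree it when
+ * no longer needed.
+ */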
+int *
+build_attnums(Bitmapset *attrs)
+{
+ int i,
+ j;
+ int numattrs = bms_num_members(attrs);
+ int *attnums;
+
+ /* build attnums from the bitmapset */
+ attnums = (int *) palloc(sizeof(int) * numattrs);
+ i = 0;
+ j = -1;
+ while ((j = bms_next_member(attrs, j)) >= 0)
+ attnums[i++] = j;
+
+ return attnums;
+}
+
+/*
+ * build_sorted_items
+ *		build a sorted array of SortItems with values from rows
+ *
+ * XXX All the memory is allocated in a single chunk, so that the caller
+ * can simply pfree the return value to release all of it.
+ */
+SortItem *
+build_sorted_items(int numrows, HeapTuple *rows, TupleDesc tdesc,
+ MultiSortSupport mss, int numattrs, int *attnums)
+{
+ int i,
+ j,
+ len;
+ int nvalues = numrows * numattrs;
+
+ /*
+ * We won't allocate the arrays for each item independently, but in one
+ * large chunk and then just set the pointers. This allows the caller
+ * to simply pfree the return value to release all the memory.
+ */
+ SortItem *items;
+ Datum *values;
+ bool *isnull;
+ char *ptr;
+
+ /* Compute the total amount of memory we need (both items and values). */
+ len = numrows * sizeof(SortItem) + nvalues * (sizeof(Datum) + sizeof(bool));
+
+ /* Allocate the memory and split it into the pieces. */
+ ptr = palloc0(len);
+
+ /* items to sort */
+ items = (SortItem *) ptr;
+ ptr += numrows * sizeof(SortItem);
+
+ /* values and null flags */
+ values = (Datum *) ptr;
+ ptr += nvalues * sizeof(Datum);
+
+ isnull = (bool *) ptr;
+ ptr += nvalues * sizeof(bool);
+
+ /* make sure we consumed the whole buffer exactly */
+ Assert((ptr - (char *) items) == len);
+
+ /* fix the pointers to Datum and bool arrays */
+ for (i = 0; i < numrows; i++)
+ {
+ items[i].values = &values[i * numattrs];
+ items[i].isnull = &isnull[i * numattrs];
+
+ /* load the values/null flags from sample rows */
+ for (j = 0; j < numattrs; j++)
+ {
+ items[i].values[j] = heap_getattr(rows[i],
+ attnums[j], /* attnum */
+ tdesc,
+ &items[i].isnull[j]); /* isnull */
+ }
+ }
+
+ /* do the sort, using the multi-sort */
+ qsort_arg((void *) items, numrows, sizeof(SortItem),
+ multi_sort_compare, mss);
+
+ return items;
+}
+
/*
* has_stats_of_kind
* Check whether the list contains statistic of a given kind
@@ -512,3 +706,16 @@ choose_best_statistics(List *stats, Bitmapset *attnums, char requiredkind)
return best_match;
}
+
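+/*
+ * bms_member_index
+ *		Return the 0-based position of varattno in the bitmapset keys,
+ *		i.e. the number of members smaller than varattno (assumes the
+ *		attnum is a member of the bitmapset)
+ */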
+int
+bms_member_index(Bitmapset *keys, AttrNumber varattno)
+{
+ int i, j;
+
+ i = -1;
+ j = 0;
+ while (((i = bms_next_member(keys, i)) >= 0) && (i < varattno))
+ j += 1;
+
+ return j;
+}
diff --git a/src/backend/statistics/mcv.c b/src/backend/statistics/mcv.c
new file mode 100644
index 0000000..391ddcb
--- /dev/null
+++ b/src/backend/statistics/mcv.c
@@ -0,0 +1,1809 @@
+/*-------------------------------------------------------------------------
+ *
+ * mcv.c
+ * POSTGRES multivariate MCV lists
+ *
+ *
+ * Portions Copyright (c) 1996-2017, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/backend/statistics/mcv.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/htup_details.h"
+#include "catalog/pg_collation.h"
+#include "catalog/pg_statistic_ext.h"
+#include "fmgr.h"
+#include "funcapi.h"
+#include "optimizer/clauses.h"
+#include "statistics/extended_stats_internal.h"
+#include "statistics/statistics.h"
+#include "utils/builtins.h"
+#include "utils/bytea.h"
+#include "utils/fmgroids.h"
+#include "utils/fmgrprotos.h"
+#include "utils/lsyscache.h"
+#include "utils/syscache.h"
+#include "utils/typcache.h"
+
+/*
+ * Computes the size of a serialized MCV item, depending on the number of
+ * dimensions (columns) the statistic is defined on. The datum values are
+ * stored in a separate array (deduplicated, to minimize the size), and
+ * so the serialized items only store uint16 indexes into that array.
+ *
+ * Each serialized item stores (in this order):
+ *
+ * - indexes to values (ndim * sizeof(uint16))
+ * - null flags (ndim * sizeof(bool))
+ * - frequency (sizeof(double))
+ *
+ * So in total each MCV item requires this many bytes:
+ *
+ * ndim * (sizeof(uint16) + sizeof(bool)) + sizeof(double)
+ */
+#define ITEM_SIZE(ndims) \
+ (ndims * (sizeof(uint16) + sizeof(bool)) + sizeof(double))
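+
+/*
+ * For example, with the usual type sizes (uint16 = 2B, bool = 1B,
+ * double = 8B), an item of a statistic on two columns needs
+ * 2 * (2 + 1) + 8 = 14 bytes.
+ */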
+
+/*
+ * Macros for convenient access to parts of a serialized MCV item.
+ */
+#define ITEM_INDEXES(item) ((uint16*)item)
+#define ITEM_NULLS(item,ndims) ((bool*)(ITEM_INDEXES(item) + ndims))
+#define ITEM_FREQUENCY(item,ndims) ((double*)(ITEM_NULLS(item,ndims) + ndims))
+
+
+static MultiSortSupport build_mss(VacAttrStats **stats, Bitmapset *attrs);
+
+static SortItem *build_distinct_groups(int numrows, SortItem *items,
+ MultiSortSupport mss, int *ndistinct);
+
+static int count_distinct_groups(int numrows, SortItem *items,
+ MultiSortSupport mss);
+
+static bool mcv_is_compatible_clause(Node *clause, Index relid,
+ Bitmapset **attnums);
+
+/*
+ * Builds MCV list from the set of sampled rows.
+ *
+ * The algorithm is quite simple:
+ *
+ * (1) sort the data (default collation, '<' for the data type)
+ *
+ * (2) count distinct groups, decide how many to keep
+ *
+ * (3) build the MCV list using the threshold determined in (2)
+ *
+ * (4) remove rows represented by the MCV from the sample
+ *
+ * FIXME: Single-dimensional MCV is sorted by frequency (descending). We
+ * should do that too, because when walking through the list we want to
+ * check the most frequent items first.
+ *
+ * TODO: We're using Datum (8B) even for smaller data types (e.g. int4 or
+ * float4). Maybe we could save some space here, but the bytea compression
+ * should handle it just fine.
+ *
+ * TODO: This probably should not use ndistinct as computed from the sample
+ * directly, but rather an estimate of the number of distinct values in the
+ * whole table, no?
+ */
+MCVList *
+statext_mcv_build(int numrows, HeapTuple *rows, Bitmapset *attrs,
+ VacAttrStats **stats)
+{
+ int i;
+ int numattrs = bms_num_members(attrs);
+ int ndistinct = 0;
+ int mcv_threshold = 0;
+ int nitems = 0;
+
+ int *attnums = build_attnums(attrs);
+
+ MCVList *mcvlist = NULL;
+
+ /* comparator for all the columns */
+ MultiSortSupport mss = build_mss(stats, attrs);
+
+ /* sort the rows */
+ SortItem *items = build_sorted_items(numrows, rows, stats[0]->tupDesc,
+ mss, numattrs, attnums);
+
+ /* transform the sorted rows into groups (sorted by frequency) */
+ SortItem *groups = build_distinct_groups(numrows, items, mss, &ndistinct);
+
+ /*
+ * Determine the minimum size of a group to be eligible for MCV list, and
+ * check how many groups actually pass that threshold. We use 1.25x the
+ * average group size, just like for regular per-column statistics.
+ *
+ * XXX We also use a minimum number of 4 rows for mcv_threshold, not sure
+ * if that's what per-column statistics do too?
+ *
+ * But if we can fit all the distinct values in the MCV list (i.e. if
+ * there are fewer distinct groups than STATS_MCVLIST_MAX_ITEMS), we'll
+ * require only 2 rows per group.
+ *
+ * XXX Maybe this part (requiring 2 rows per group) is not very reliable?
+ * Perhaps we should instead estimate the number of groups the way we
+ * estimate ndistinct (after all, that's what MCV items are), and base our
+ * decision on that?
+ */
+ mcv_threshold = 1.25 * numrows / ndistinct;
+ mcv_threshold = (mcv_threshold < 4) ? 4 : mcv_threshold;
+
+ if (ndistinct <= STATS_MCVLIST_MAX_ITEMS)
+ mcv_threshold = 2;
+
+ /* Walk through the groups and stop once we fall below the threshold. */
+ nitems = 0;
+ for (i = 0; i < ndistinct; i++)
+ {
+ if (groups[i].count < mcv_threshold)
+ break;
+
+ nitems++;
+ }
+
+ /*
+ * At this point we know the number of items for the MCV list. There might
+ * be none (for uniform distribution with many groups), and in that case
+ * there will be no MCV list. Otherwise construct the MCV list.
+ */
+ if (nitems > 0)
+ {
+ /*
+ * Allocate the MCV list structure, set the global parameters.
+ */
+ mcvlist = (MCVList *) palloc0(sizeof(MCVList));
+
+ mcvlist->magic = STATS_MCV_MAGIC;
+ mcvlist->type = STATS_MCV_TYPE_BASIC;
+ mcvlist->ndimensions = numattrs;
+ mcvlist->nitems = nitems;
+
+ /*
+ * Preallocate the Datum/isnull arrays (not as a single chunk, as we will
+ * pass the result outside and thus it needs to be easy to pfree()).
+ *
+ * XXX On second thought, we're the only ones dealing with MCV lists,
+ * so we might allocate everything as a single chunk without any risk.
+ * Not sure it's worth it, though.
+ */
+ mcvlist->items = (MCVItem **) palloc0(sizeof(MCVItem *) * nitems);
+
+ for (i = 0; i < nitems; i++)
+ {
+ mcvlist->items[i] = (MCVItem *) palloc(sizeof(MCVItem));
+ mcvlist->items[i]->values = (Datum *) palloc(sizeof(Datum) * numattrs);
+ mcvlist->items[i]->isnull = (bool *) palloc(sizeof(bool) * numattrs);
+ }
+
+ /* Copy the first chunk of groups into the result. */
+ for (i = 0; i < nitems; i++)
+ {
+ /* just pointer to the proper place in the list */
+ MCVItem *item = mcvlist->items[i];
+
+ /* copy the values and null flags for this group */
+ memcpy(item->values, groups[i].values, sizeof(Datum) * numattrs);
+ memcpy(item->isnull, groups[i].isnull, sizeof(bool) * numattrs);
+
+ /* make sure basic assumptions on group size are correct */
+ Assert(groups[i].count >= mcv_threshold);
+ Assert(groups[i].count <= numrows);
+
+ /* groups should be sorted by frequency in descending order */
+ Assert((i == 0) || (groups[i-1].count >= groups[i].count));
+
+ /* and finally the group frequency */
+ item->frequency = (double) groups[i].count / numrows;
+ }
+
+ /* make sure the loops are consistent */
+ Assert(nitems == mcvlist->nitems);
+ }
+
+ pfree(items);
+ pfree(groups);
+
+ return mcvlist;
+}
+
+/*
+ * build_mss
+ * build MultiSortSupport for the attributes passed in attrs
+ */
+static MultiSortSupport
+build_mss(VacAttrStats **stats, Bitmapset *attrs)
+{
+ int i, j;
+ int numattrs = bms_num_members(attrs);
+
+ /* Sort by multiple columns (using array of SortSupport) */
+ MultiSortSupport mss = multi_sort_init(numattrs);
+
+ /* prepare the sort functions for all the attributes */
+ i = 0;
+ j = -1;
+ while ((j = bms_next_member(attrs, j)) >= 0)
+ {
+ VacAttrStats *colstat = stats[i];
+ TypeCacheEntry *type;
+
+ type = lookup_type_cache(colstat->attrtypid, TYPECACHE_LT_OPR);
+ if (type->lt_opr == InvalidOid) /* shouldn't happen */
+ elog(ERROR, "cache lookup failed for ordering operator for type %u",
+ colstat->attrtypid);
+
+ multi_sort_add_dimension(mss, i, type->lt_opr);
+ i++;
+ }
+
+ return mss;
+}
+
+/*
+ * count_distinct_groups
+ * count distinct combinations of SortItems in the array
+ *
+ * The array is assumed to be sorted according to the MultiSortSupport.
+ */
+static int
+count_distinct_groups(int numrows, SortItem *items, MultiSortSupport mss)
+{
+ int i;
+ int ndistinct;
+
+ ndistinct = 1;
+ for (i = 1; i < numrows; i++)
+ if (multi_sort_compare(&items[i], &items[i - 1], mss) != 0)
+ ndistinct += 1;
+
+ return ndistinct;
+}
+
+/*
+ * compare_sort_item_count
+ * comparator for sorting items by count (frequencies) in descending order
+ */
+static int
+compare_sort_item_count(const void *a, const void *b)
+{
+ SortItem *ia = (SortItem *) a;
+ SortItem *ib = (SortItem *) b;
+
+ if (ia->count == ib->count)
+ return 0;
+ else if (ia->count > ib->count)
+ return -1;
+
+ return 1;
+}
+
+/*
+ * build_distinct_groups
+ * build array of SortItems for distinct groups and counts matching items
+ *
+ * The input array is assumed to be sorted
+ */
+static SortItem *
+build_distinct_groups(int numrows, SortItem *items, MultiSortSupport mss,
+ int *ndistinct)
+{
+ int i,
+ j;
+ int ngroups = count_distinct_groups(numrows, items, mss);
+
+ SortItem *groups = (SortItem *) palloc0(ngroups * sizeof(SortItem));
+
+ j = 0;
+ groups[0] = items[0];
+ groups[0].count = 1;
+
+ for (i = 1; i < numrows; i++)
+ {
+ /* Assume sorted in ascending order. */
+ Assert(multi_sort_compare(&items[i], &items[i - 1], mss) >= 0);
+
+ /* New distinct group detected. */
+ if (multi_sort_compare(&items[i], &items[i - 1], mss) != 0)
+ groups[++j] = items[i];
+
+ groups[j].count++;
+ }
+
+ /* Sort the distinct groups by frequency (in descending order). */
+ pg_qsort((void *) groups, ngroups, sizeof(SortItem),
+ compare_sort_item_count);
+
+ *ndistinct = ngroups;
+ return groups;
+}
+
+
+/*
+ * statext_mcv_load
+ * Load the MCV list for the indicated pg_statistic_ext tuple
+ */
+MCVList *
+statext_mcv_load(Oid mvoid)
+{
+ bool isnull = false;
+ Datum mcvlist;
+ HeapTuple htup = SearchSysCache1(STATEXTOID, ObjectIdGetDatum(mvoid));
+
+ if (!HeapTupleIsValid(htup))
+ elog(ERROR, "cache lookup failed for statistics object %u", mvoid);
+
+ mcvlist = SysCacheGetAttr(STATEXTOID, htup,
+ Anum_pg_statistic_ext_stxmcv, &isnull);
+
+ Assert(!isnull);
+
+ ReleaseSysCache(htup);
+
+ return statext_mcv_deserialize(DatumGetByteaP(mcvlist));
+}
+
+
+/*
+ * Serialize MCV list into a bytea value.
+ *
+ * The basic algorithm is simple:
+ *
+ * (1) perform deduplication (for each attribute separately)
+ * (a) collect all (non-NULL) attribute values from all MCV items
+ * (b) sort the data (using 'lt' from VacAttrStats)
+ * (c) remove duplicate values from the array
+ *
+ * (2) serialize the arrays into a bytea value
+ *
+ * (3) process all MCV list items
+ * (a) replace values with indexes into the arrays
+ *
+ * Each attribute has to be processed separately, as we may be mixing different
+ * datatypes, with different sort operators, etc.
+ *
+ * We use uint16 values for the indexes in step (3), as we currently don't allow
+ * more than 8k MCV items anyway, although that's a mostly arbitrary limit. We
+ * might increase this to 65k and still fit into uint16. Furthermore, this limit
+ * is on the number of distinct values per column, and we usually have few of
+ * those (and various combinations of them in the MCV list). So uint16 seems fine.
+ *
+ * We don't really expect the serialization to save as much space as for
+ * histograms, as we are not doing any bucket splits (which is the source
+ * of high redundancy in histograms).
+ *
+ * TODO: Consider packing boolean flags (NULL) for each item into a single char
+ * (or a longer type) instead of using an array of bool items.
+ */
+bytea *
+statext_mcv_serialize(MCVList *mcvlist, VacAttrStats **stats)
+{
+ int i;
+ int dim;
+ int ndims = mcvlist->ndimensions;
+ int itemsize = ITEM_SIZE(ndims);
+
+ SortSupport ssup;
+ DimensionInfo *info;
+
+ Size total_length;
+
+ /* allocate the item just once */
+ char *item = palloc0(itemsize);
+
+ /* serialized items (indexes into arrays, etc.) */
+ bytea *output;
+ char *data = NULL;
+
+ /* values per dimension (and number of non-NULL values) */
+ Datum **values = (Datum **) palloc0(sizeof(Datum *) * ndims);
+ int *counts = (int *) palloc0(sizeof(int) * ndims);
+
+ /*
+ * We'll include some rudimentary information about the attributes (type
+ * length, etc.), so that we don't have to look them up while
+ * deserializing the MCV list.
+ *
+ * XXX Maybe this is not a great idea? Or maybe we should actually copy
+ * more fields, e.g. typeid, which would allow us to display the MCV list
+ * using only the serialized representation (currently we have to fetch
+ * this info from the relation).
+ */
+ info = (DimensionInfo *) palloc0(sizeof(DimensionInfo) * ndims);
+
+ /* sort support data for all attributes included in the MCV list */
+ ssup = (SortSupport) palloc0(sizeof(SortSupportData) * ndims);
+
+ /* collect and deduplicate values for each dimension (attribute) */
+ for (dim = 0; dim < ndims; dim++)
+ {
+ int ndistinct;
+ StdAnalyzeData *tmp = (StdAnalyzeData *) stats[dim]->extra_data;
+
+ /* copy important info about the data type (length, by-value) */
+ info[dim].typlen = stats[dim]->attrtype->typlen;
+ info[dim].typbyval = stats[dim]->attrtype->typbyval;
+
+ /* allocate space for values in the attribute and collect them */
+ values[dim] = (Datum *) palloc0(sizeof(Datum) * mcvlist->nitems);
+
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ /* skip NULL values - we don't need to deduplicate those */
+ if (mcvlist->items[i]->isnull[dim])
+ continue;
+
+ values[dim][counts[dim]] = mcvlist->items[i]->values[dim];
+ counts[dim] += 1;
+ }
+
+ /* if there are just NULL values in this dimension, we're done */
+ if (counts[dim] == 0)
+ continue;
+
+ /* sort and deduplicate the data */
+ ssup[dim].ssup_cxt = CurrentMemoryContext;
+ ssup[dim].ssup_collation = DEFAULT_COLLATION_OID;
+ ssup[dim].ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(tmp->ltopr, &ssup[dim]);
+
+ qsort_arg(values[dim], counts[dim], sizeof(Datum),
+ compare_scalars_simple, &ssup[dim]);
+
+ /*
+ * Walk through the array and eliminate duplicate values, but keep the
+ * ordering (so that we can do bsearch later). We know there's at least
+ * one item as (counts[dim] != 0), so we can skip the first element.
+ */
+ ndistinct = 1; /* number of distinct values */
+ for (i = 1; i < counts[dim]; i++)
+ {
+ /* expect sorted array */
+ Assert(compare_datums_simple(values[dim][i - 1], values[dim][i], &ssup[dim]) <= 0);
+
+ /* if the value is the same as the previous one, we can skip it */
+ if (!compare_datums_simple(values[dim][i - 1], values[dim][i], &ssup[dim]))
+ continue;
+
+ values[dim][ndistinct] = values[dim][i];
+ ndistinct += 1;
+ }
+
+ /* we must not exceed UINT16_MAX, as we use uint16 indexes */
+ Assert(ndistinct <= UINT16_MAX);
+
+ /*
+ * Store additional info about the attribute - number of deduplicated
+ * values, and also size of the serialized data. For fixed-length data
+ * types this is trivial to compute, for varwidth types we need to
+ * actually walk the array and sum the sizes.
+ */
+ info[dim].nvalues = ndistinct;
+
+ if (info[dim].typlen > 0) /* fixed-length data types */
+ info[dim].nbytes = info[dim].nvalues * info[dim].typlen;
+ else if (info[dim].typlen == -1) /* varlena */
+ {
+ info[dim].nbytes = 0;
+ for (i = 0; i < info[dim].nvalues; i++)
+ info[dim].nbytes += VARSIZE_ANY(values[dim][i]);
+ }
+ else if (info[dim].typlen == -2) /* cstring */
+ {
+ info[dim].nbytes = 0;
+ for (i = 0; i < info[dim].nvalues; i++)
+ info[dim].nbytes += strlen(DatumGetPointer(values[dim][i])) + 1; /* with terminator */
+ }
+
+ /* we know (count>0) so there must be some data */
+ Assert(info[dim].nbytes > 0);
+ }
+
+ /*
+ * Now we can finally compute how much space we'll actually need for the
+ * whole serialized MCV list, as it contains these fields:
+ *
+ * - length (4B) for varlena
+ * - magic (4B)
+ * - type (4B)
+ * - ndimensions (4B)
+ * - nitems (4B)
+ * - info (ndim * sizeof(DimensionInfo)
+ * - arrays of values for each dimension
+ * - serialized items (nitems * itemsize)
+ *
+ * So the 'header' size is 20B + ndim * sizeof(DimensionInfo) and then we
+ * will place all the data (values + indexes). We'll however use offsetof
+ * and sizeof to compute sizes of the structs.
+ */
+ total_length = (sizeof(int32) + offsetof(MCVList, items)
+ + (ndims * sizeof(DimensionInfo))
+ + mcvlist->nitems * itemsize);
+
+ /* add space for the arrays of deduplicated values */
+ for (i = 0; i < ndims; i++)
+ total_length += info[i].nbytes;
+
+ /*
+ * Enforce arbitrary limit of 1MB on the size of the serialized MCV list.
+ * This is meant as a protection against someone building an MCV list on
+ * long values (e.g. text documents).
+ *
+ * XXX Should we enforce arbitrary limits like this one? Maybe it's not
+ * even necessary, as long values are usually unique and so won't make it
+ * into the MCV list in the first place. In the end, we have a 1GB limit
+ * on bytea values.
+ */
+ if (total_length > (1024 * 1024))
+ elog(ERROR, "serialized MCV list exceeds 1MB (%ld)", total_length);
+
+ /* allocate space for the serialized MCV list, set header fields */
+ output = (bytea *) palloc0(total_length);
+ SET_VARSIZE(output, total_length);
+
+ /* 'data' points to the current position in the output buffer */
+ data = VARDATA(output);
+
+ /* MCV list header (number of items, ...) */
+ memcpy(data, mcvlist, offsetof(MCVList, items));
+ data += offsetof(MCVList, items);
+
+ /* information about the attributes */
+ memcpy(data, info, sizeof(DimensionInfo) * ndims);
+ data += sizeof(DimensionInfo) * ndims;
+
+ /* Copy the deduplicated values for all attributes to the output. */
+ for (dim = 0; dim < ndims; dim++)
+ {
+#ifdef USE_ASSERT_CHECKING
+ /* remember the starting point for Asserts later */
+ char *tmp = data;
+#endif
+ for (i = 0; i < info[dim].nvalues; i++)
+ {
+ Datum v = values[dim][i];
+
+ if (info[dim].typbyval) /* passed by value */
+ {
+ memcpy(data, &v, info[dim].typlen);
+ data += info[dim].typlen;
+ }
+ else if (info[dim].typlen > 0) /* passed by reference */
+ {
+ memcpy(data, DatumGetPointer(v), info[dim].typlen);
+ data += info[dim].typlen;
+ }
+ else if (info[dim].typlen == -1) /* varlena */
+ {
+ memcpy(data, DatumGetPointer(v), VARSIZE_ANY(v));
+ data += VARSIZE_ANY(v);
+ }
+ else if (info[dim].typlen == -2) /* cstring */
+ {
+ memcpy(data, DatumGetPointer(v), strlen(DatumGetPointer(v)) + 1);
+ data += strlen(DatumGetPointer(v)) + 1; /* terminator */
+ }
+ }
+
+ /* check we got exactly the amount of data we expected for this dimension */
+ Assert((data - tmp) == info[dim].nbytes);
+ }
+
+ /* finally serialize the items, with uint16 indexes instead of the values */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ MCVItem *mcvitem = mcvlist->items[i];
+
+ /* don't write beyond the allocated space */
+ Assert(data <= (char *) output + total_length - itemsize);
+
+ /* reset the item (we only allocate it once and reuse it) */
+ memset(item, 0, itemsize);
+
+ for (dim = 0; dim < ndims; dim++)
+ {
+ Datum *v = NULL;
+
+ /* do the lookup only for non-NULL values */
+ if (mcvlist->items[i]->isnull[dim])
+ continue;
+
+ v = (Datum *) bsearch_arg(&mcvitem->values[dim], values[dim],
+ info[dim].nvalues, sizeof(Datum),
+ compare_scalars_simple, &ssup[dim]);
+
+ Assert(v != NULL); /* serialization or deduplication error */
+
+ /* compute index within the array */
+ ITEM_INDEXES(item)[dim] = (v - values[dim]);
+
+ /* check the index is within expected bounds */
+ Assert(ITEM_INDEXES(item)[dim] >= 0);
+ Assert(ITEM_INDEXES(item)[dim] < info[dim].nvalues);
+ }
+
+ /* copy NULL and frequency flags into the item */
+ memcpy(ITEM_NULLS(item, ndims), mcvitem->isnull, sizeof(bool) * ndims);
+ memcpy(ITEM_FREQUENCY(item, ndims), &mcvitem->frequency, sizeof(double));
+
+ /* copy the serialized item into the array */
+ memcpy(data, item, itemsize);
+
+ data += itemsize;
+ }
+
+ /* at this point we expect to match the total_length exactly */
+ Assert((data - (char *) output) == total_length);
+
+ pfree(item);
+ pfree(values);
+ pfree(counts);
+
+ return output;
+}
+
+/*
+ * Reads serialized MCV list into MCVList structure.
+ *
+ * Unlike with histograms, we deserialize the MCV list fully (i.e. we don't
+ * keep the deduplicated arrays and pointers into them), as we don't expect
+ * there to be a lot of duplicate values. But perhaps that's not true and we
+ * should keep the MCV in serialized form too.
+ *
+ * XXX See how much memory we could save by keeping the deduplicated version
+ * (both for typical and corner cases with few distinct values but many items).
+ */
+MCVList *
+statext_mcv_deserialize(bytea *data)
+{
+ int dim,
+ i;
+ Size expected_size;
+ MCVList *mcvlist;
+ char *tmp;
+
+ int ndims,
+ nitems,
+ itemsize;
+ DimensionInfo *info = NULL;
+ Datum **values = NULL;
+
+ /* local allocation buffer (used only for deserialization) */
+ int bufflen;
+ char *buff;
+ char *ptr;
+
+ /* buffer used for the result */
+ int rbufflen;
+ char *rbuff;
+ char *rptr;
+
+ if (data == NULL)
+ return NULL;
+
+ /*
+ * We can't possibly deserialize an MCV list if there's not even a
+ * complete header.
+ */
+ if (VARSIZE_ANY_EXHDR(data) < offsetof(MCVList, items))
+ elog(ERROR, "invalid MCV Size %ld (expected at least %ld)",
+ VARSIZE_ANY_EXHDR(data), offsetof(MCVList, items));
+
+ /* read the MCV list header */
+ mcvlist = (MCVList *) palloc0(sizeof(MCVList));
+
+ /* initialize pointer to the data part (skip the varlena header) */
+ tmp = VARDATA_ANY(data);
+
+ /* get the header and perform further sanity checks */
+ memcpy(mcvlist, tmp, offsetof(MCVList, items));
+ tmp += offsetof(MCVList, items);
+
+ if (mcvlist->magic != STATS_MCV_MAGIC)
+ elog(ERROR, "invalid MCV magic %d (expected %d)",
+ mcvlist->magic, STATS_MCV_MAGIC);
+
+ if (mcvlist->type != STATS_MCV_TYPE_BASIC)
+ elog(ERROR, "invalid MCV type %d (expected %d)",
+ mcvlist->type, STATS_MCV_TYPE_BASIC);
+
+ if (mcvlist->ndimensions == 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_DATA_CORRUPTED),
+ errmsg("invalid zero-length dimension array in MCVList")));
+ else if (mcvlist->ndimensions > STATS_MAX_DIMENSIONS)
+ ereport(ERROR,
+ (errcode(ERRCODE_DATA_CORRUPTED),
+ errmsg("invalid length (%d) dimension array in MCVList",
+ mcvlist->ndimensions)));
+
+ if (mcvlist->nitems == 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_DATA_CORRUPTED),
+ errmsg("invalid zero-length item array in MCVList")));
+ else if (mcvlist->nitems > STATS_MCVLIST_MAX_ITEMS)
+ ereport(ERROR,
+ (errcode(ERRCODE_DATA_CORRUPTED),
+ errmsg("invalid length (%d) item array in MCVList",
+ mcvlist->nitems)));
+
+ nitems = mcvlist->nitems;
+ ndims = mcvlist->ndimensions;
+ itemsize = ITEM_SIZE(ndims);
+
+ /*
+ * Check amount of data including DimensionInfo for all dimensions and
+ * also the serialized items (including uint16 indexes). Also, walk
+ * through the dimension information and add it to the sum.
+ */
+ expected_size = offsetof(MCVList, items) +
+ ndims * sizeof(DimensionInfo) +
+ (nitems * itemsize);
+
+ /*
+ * Check that we have at least the dimension and info records, along
+ * with the items. We don't know the size of the serialized values yet.
+ * We need to do this check first, before accessing the dimension info.
+ */
+ if (VARSIZE_ANY_EXHDR(data) < expected_size)
+ elog(ERROR, "invalid MCV size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ /* Now it's safe to access the dimension info. */
+ info = (DimensionInfo *) (tmp);
+ tmp += ndims * sizeof(DimensionInfo);
+
+ /* account for the value arrays */
+ for (dim = 0; dim < ndims; dim++)
+ {
+ /*
+ * XXX I wonder if we can/should rely on asserts here. Maybe those
+ * checks should be done every time?
+ */
+ Assert(info[dim].nvalues >= 0);
+ Assert(info[dim].nbytes >= 0);
+
+ expected_size += info[dim].nbytes;
+ }
+
+ /*
+ * Now we know the total expected MCV size, including all the pieces
+ * (header, dimension info, items and deduplicated data). So do the
+ * final check on size.
+ */
+ if (VARSIZE_ANY_EXHDR(data) != expected_size)
+ elog(ERROR, "invalid MCV size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ /*
+ * Allocate one large chunk of memory for the intermediate data, needed
+ * only for deserializing the MCV list (and allocate densely to minimize
+ * the palloc overhead).
+ *
+ * Let's see how much space we'll actually need, and also include space
+ * for the array with pointers.
+ *
+ * We need an array of Datum values for each dimension, so that
+ * we can easily translate the uint16 indexes. We also need a top-level
+ * array of pointers to those per-dimension arrays.
+ *
+ * For byval types with size matching sizeof(Datum) we can reuse the
+ * serialized array directly.
+ */
+ bufflen = sizeof(Datum *) * ndims; /* space for top-level pointers */
+
+ for (dim = 0; dim < ndims; dim++)
+ {
+ /* for full-size byval types, we reuse the serialized value */
+ if (!(info[dim].typbyval && info[dim].typlen == sizeof(Datum)))
+ bufflen += (sizeof(Datum) * info[dim].nvalues);
+ }
+
+ buff = palloc0(bufflen);
+ ptr = buff;
+
+ values = (Datum **) buff;
+ ptr += (sizeof(Datum *) * ndims);
+
+ /*
+ * XXX This uses pointers to the original data array (the types not passed
+ * by value), so when someone frees the memory, e.g. by doing something
+ * like this:
+ *
+ * bytea * data = ... fetch the data from catalog ...
+ * MCVList *mcvlist = statext_mcv_deserialize(data);
+ * pfree(data);
+ *
+ * then 'mcvlist' references the freed memory. Should copy the pieces.
+ */
+ for (dim = 0; dim < ndims; dim++)
+ {
+#ifdef USE_ASSERT_CHECKING
+ /* remember where data for this dimension starts */
+ char *start = tmp;
+#endif
+ if (info[dim].typbyval)
+ {
+ /* passed by value / size matches Datum - just reuse the array */
+ if (info[dim].typlen == sizeof(Datum))
+ {
+ values[dim] = (Datum *) tmp;
+ tmp += info[dim].nbytes;
+
+ /* no overflow of input array */
+ Assert(tmp <= start + info[dim].nbytes);
+ }
+ else
+ {
+ values[dim] = (Datum *) ptr;
+ ptr += (sizeof(Datum) * info[dim].nvalues);
+
+ for (i = 0; i < info[dim].nvalues; i++)
+ {
+ /* copy the value from the serialized array */
+ memcpy(&values[dim][i], tmp, info[dim].typlen);
+ tmp += info[dim].typlen;
+
+ /* no overflow of input array */
+ Assert(tmp <= start + info[dim].nbytes);
+ }
+ }
+ }
+ else
+ {
+ /* all the other types need a chunk of the buffer */
+ values[dim] = (Datum *) ptr;
+ ptr += (sizeof(Datum) * info[dim].nvalues);
+
+ /* passed by reference, but fixed length (name, tid, ...) */
+ if (info[dim].typlen > 0)
+ {
+ for (i = 0; i < info[dim].nvalues; i++)
+ {
+ /* just point into the array */
+ values[dim][i] = PointerGetDatum(tmp);
+ tmp += info[dim].typlen;
+
+ /* no overflow of input array */
+ Assert(tmp <= start + info[dim].nbytes);
+ }
+ }
+ else if (info[dim].typlen == -1)
+ {
+ /* varlena */
+ for (i = 0; i < info[dim].nvalues; i++)
+ {
+ /* just point into the array */
+ values[dim][i] = PointerGetDatum(tmp);
+ tmp += VARSIZE_ANY(tmp);
+
+ /* no overflow of input array */
+ Assert(tmp <= start + info[dim].nbytes);
+ }
+ }
+ else if (info[dim].typlen == -2)
+ {
+ /* cstring */
+ for (i = 0; i < info[dim].nvalues; i++)
+ {
+ /* just point into the array */
+ values[dim][i] = PointerGetDatum(tmp);
+ tmp += (strlen(tmp) + 1); /* don't forget the \0 */
+
+ /* no overflow of input array */
+ Assert(tmp <= start + info[dim].nbytes);
+ }
+ }
+ }
+
+ /* check we consumed the serialized data for this dimension exactly */
+ Assert((tmp - start) == info[dim].nbytes);
+ }
+
+ /* we should have exhausted the buffer exactly */
+ Assert((ptr - buff) == bufflen);
+
+ /* allocate space for all the MCV items in a single piece */
+ rbufflen = (sizeof(MCVItem*) + sizeof(MCVItem) +
+ sizeof(Datum) * ndims + sizeof(bool) * ndims) * nitems;
+
+ rbuff = palloc0(rbufflen);
+ rptr = rbuff;
+
+ mcvlist->items = (MCVItem **) rbuff;
+ rptr += (sizeof(MCVItem *) * nitems);
+
+ /* deserialize the MCV items and translate the indexes to Datums */
+ for (i = 0; i < nitems; i++)
+ {
+ uint16 *indexes = NULL;
+ MCVItem *item = (MCVItem *) rptr;
+
+ rptr += (sizeof(MCVItem));
+
+ item->values = (Datum *) rptr;
+ rptr += (sizeof(Datum) * ndims);
+
+ item->isnull = (bool *) rptr;
+ rptr += (sizeof(bool) * ndims);
+
+ /* just point to the right place */
+ indexes = ITEM_INDEXES(tmp);
+
+ memcpy(item->isnull, ITEM_NULLS(tmp, ndims), sizeof(bool) * ndims);
+ memcpy(&item->frequency, ITEM_FREQUENCY(tmp, ndims), sizeof(double));
+
+#ifdef USE_ASSERT_CHECKING
+ /*
+ * XXX This seems rather useless, considering the 'indexes' array is
+ * defined as (uint16*).
+ */
+ for (dim = 0; dim < ndims; dim++)
+ Assert(indexes[dim] <= UINT16_MAX);
+#endif
+
+ /* translate the values */
+ for (dim = 0; dim < ndims; dim++)
+ if (!item->isnull[dim])
+ item->values[dim] = values[dim][indexes[dim]];
+
+ mcvlist->items[i] = item;
+
+ tmp += ITEM_SIZE(ndims);
+
+ /* check we're not overflowing the input */
+ Assert(tmp <= (char *) data + VARSIZE_ANY(data));
+ }
+
+ /* check that we processed all the data */
+ Assert(tmp == (char *) data + VARSIZE_ANY(data));
+
+ /* release the temporary buffer */
+ pfree(buff);
+
+ return mcvlist;
+}
+
+/*
+ * SRF with details about items of an MCV list:
+ *
+ * - item ID (0...nitems)
+ * - values (string array)
+ * - nulls only (boolean array)
+ * - frequency (double precision)
+ *
+ * The input is the OID of the statistics object, and no rows are returned
+ * if the statistics contains no MCV list.
+ */
+PG_FUNCTION_INFO_V1(pg_stats_ext_mcvlist_items);
+
+Datum
+pg_stats_ext_mcvlist_items(PG_FUNCTION_ARGS)
+{
+ FuncCallContext *funcctx;
+ int call_cntr;
+ int max_calls;
+ TupleDesc tupdesc;
+ AttInMetadata *attinmeta;
+
+ /* stuff done only on the first call of the function */
+ if (SRF_IS_FIRSTCALL())
+ {
+ MemoryContext oldcontext;
+ MCVList *mcvlist;
+
+ /* create a function context for cross-call persistence */
+ funcctx = SRF_FIRSTCALL_INIT();
+
+ /* switch to memory context appropriate for multiple function calls */
+ oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
+
+ mcvlist = statext_mcv_load(PG_GETARG_OID(0));
+
+ funcctx->user_fctx = mcvlist;
+
+ /* total number of tuples to be returned */
+ funcctx->max_calls = 0;
+ if (funcctx->user_fctx != NULL)
+ funcctx->max_calls = mcvlist->nitems;
+
+ /* Build a tuple descriptor for our result type */
+ if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("function returning record called in context "
+ "that cannot accept type record")));
+
+ /* build metadata needed later to produce tuples from raw C-strings */
+ attinmeta = TupleDescGetAttInMetadata(tupdesc);
+ funcctx->attinmeta = attinmeta;
+
+ MemoryContextSwitchTo(oldcontext);
+ }
+
+ /* stuff done on every call of the function */
+ funcctx = SRF_PERCALL_SETUP();
+
+ call_cntr = funcctx->call_cntr;
+ max_calls = funcctx->max_calls;
+ attinmeta = funcctx->attinmeta;
+
+ if (call_cntr < max_calls) /* do when there is more left to send */
+ {
+ char **values;
+ HeapTuple tuple;
+ Datum result;
+ int2vector *stakeys;
+ Oid relid;
+
+ char *buff = palloc0(1024);
+ char *format;
+
+ int i;
+
+ Oid *outfuncs;
+ FmgrInfo *fmgrinfo;
+
+ MCVList *mcvlist;
+ MCVItem *item;
+
+ mcvlist = (MCVList *) funcctx->user_fctx;
+
+ Assert(call_cntr < mcvlist->nitems);
+
+ item = mcvlist->items[call_cntr];
+
+ stakeys = find_ext_attnums(PG_GETARG_OID(0), &relid);
+
+ /*
+ * Prepare a values array for building the returned tuple. This should
+ * be an array of C strings which will be processed later by the type
+ * input functions.
+ */
+ values = (char **) palloc(4 * sizeof(char *));
+
+ values[0] = (char *) palloc(64 * sizeof(char));
+
+ /* arrays */
+ values[1] = (char *) palloc0(1024 * sizeof(char));
+ values[2] = (char *) palloc0(1024 * sizeof(char));
+
+ /* frequency */
+ values[3] = (char *) palloc(64 * sizeof(char));
+
+ outfuncs = (Oid *) palloc0(sizeof(Oid) * mcvlist->ndimensions);
+ fmgrinfo = (FmgrInfo *) palloc0(sizeof(FmgrInfo) * mcvlist->ndimensions);
+
+ for (i = 0; i < mcvlist->ndimensions; i++)
+ {
+ bool isvarlena;
+
+ getTypeOutputInfo(get_atttype(relid, stakeys->values[i]),
+ &outfuncs[i], &isvarlena);
+
+ fmgr_info(outfuncs[i], &fmgrinfo[i]);
+ }
+
+ snprintf(values[0], 64, "%d", call_cntr); /* item ID */
+
+ for (i = 0; i < mcvlist->ndimensions; i++)
+ {
+ Datum val,
+ valout;
+
+ format = "%s, %s";
+ if (i == 0)
+ format = "{%s%s";
+ else if (i == mcvlist->ndimensions - 1)
+ format = "%s, %s}";
+
+ if (item->isnull[i])
+ valout = CStringGetDatum("NULL");
+ else
+ {
+ val = item->values[i];
+ valout = FunctionCall1(&fmgrinfo[i], val);
+ }
+
+ snprintf(buff, 1024, format, values[1], DatumGetPointer(valout));
+ strncpy(values[1], buff, 1023);
+ buff[0] = '\0';
+
+ snprintf(buff, 1024, format, values[2], item->isnull[i] ? "t" : "f");
+ strncpy(values[2], buff, 1023);
+ buff[0] = '\0';
+ }
+
+ snprintf(values[3], 64, "%f", item->frequency); /* frequency */
+
+ /* build a tuple */
+ tuple = BuildTupleFromCStrings(attinmeta, values);
+
+ /* make the tuple into a datum */
+ result = HeapTupleGetDatum(tuple);
+
+ /* clean up (this is not really necessary) */
+ pfree(values[0]);
+ pfree(values[1]);
+ pfree(values[2]);
+ pfree(values[3]);
+
+ pfree(values);
+
+ SRF_RETURN_NEXT(funcctx, result);
+ }
+ else /* do when there is no more left */
+ {
+ SRF_RETURN_DONE(funcctx);
+ }
+}
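+
+/*
+ * For illustration, the SRF above is exposed in SQL as pg_mcv_list_items(oid)
+ * and might be used like this (the statistics object name 'stts' is just an
+ * example):
+ *
+ *    SELECT * FROM pg_mcv_list_items(
+ *        (SELECT oid FROM pg_statistic_ext WHERE stxname = 'stts'));
+ *
+ * This returns one row per MCV item, with the values and NULL flags
+ * formatted as text arrays and the frequency as a double precision value.
+ */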
+
+/*
+ * pg_mcv_list_in - input routine for type pg_mcv_list.
+ *
+ * pg_mcv_list is real enough to be a table column, but it has no operations
+ * of its own, and disallows input too
+ */
+Datum
+pg_mcv_list_in(PG_FUNCTION_ARGS)
+{
+ /*
+ * pg_mcv_list stores the data in binary form and parsing text input is
+ * not needed, so disallow this.
+ */
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("cannot accept a value of type %s", "pg_mcv_list")));
+
+ PG_RETURN_VOID(); /* keep compiler quiet */
+}
+
+
+/*
+ * pg_mcv_list_out - output routine for type pg_mcv_list.
+ *
+ * MCV lists are serialized into a bytea value, so we simply call byteaout()
+ * to serialize the value into text. But it'd be nice to serialize that into
+ * a meaningful representation (e.g. for inspection by people).
+ *
+ * XXX This should probably return something meaningful, similar to what
+ * pg_dependencies_out does. Not sure how to deal with the deduplicated
+ * values, though - do we want to expand that or not?
+ */
+Datum
+pg_mcv_list_out(PG_FUNCTION_ARGS)
+{
+ return byteaout(fcinfo);
+}
+
+/*
+ * pg_mcv_list_recv - binary input routine for type pg_mcv_list.
+ */
+Datum
+pg_mcv_list_recv(PG_FUNCTION_ARGS)
+{
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("cannot accept a value of type %s", "pg_mcv_list")));
+
+ PG_RETURN_VOID(); /* keep compiler quiet */
+}
+
+/*
+ * pg_mcv_list_send - binary output routine for type pg_mcv_list.
+ *
+ * MCV lists are serialized in a bytea value (although the type is named
+ * differently), so let's just send that.
+ */
+Datum
+pg_mcv_list_send(PG_FUNCTION_ARGS)
+{
+ return byteasend(fcinfo);
+}
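+
+/*
+ * As the pg_cast entries added by this patch make pg_mcv_list binary
+ * coercible to bytea and text, the raw serialized value may be inspected
+ * directly, e.g. like this (just an illustration):
+ *
+ *    SELECT stxmcv::bytea FROM pg_statistic_ext WHERE stxname = 'stts';
+ */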
+
+/*
+ * mcv_is_compatible_clause_internal
+ * Does the heavy lifting of actually inspecting the clauses for
+ * mcv_is_compatible_clause.
+ */
+static bool
+mcv_is_compatible_clause_internal(Node *clause, Index relid, Bitmapset **attnums)
+{
+ /* We only support plain Vars for now */
+ if (IsA(clause, Var))
+ {
+ Var *var = (Var *) clause;
+
+ /* Ensure var is from the correct relation */
+ if (var->varno != relid)
+ return false;
+
+ /* we also better ensure the Var is from the current level */
+ if (var->varlevelsup > 0)
+ return false;
+
+ /* Also skip system attributes (we don't allow stats on those). */
+ if (!AttrNumberIsForUserDefinedAttr(var->varattno))
+ return false;
+
+ *attnums = bms_add_member(*attnums, var->varattno);
+
+ return true;
+ }
+
+ /* Var = Const */
+ if (is_opclause(clause))
+ {
+ OpExpr *expr = (OpExpr *) clause;
+ Var *var;
+ bool varonleft = true;
+ bool ok;
+
+ /* Only expressions with two arguments are considered compatible. */
+ if (list_length(expr->args) != 2)
+ return false;
+
+ /* see if it actually has the right shape (one Var, one pseudo-constant) */
+ ok = (NumRelids((Node *) expr) == 1) &&
+ (is_pseudo_constant_clause(lsecond(expr->args)) ||
+ (varonleft = false,
+ is_pseudo_constant_clause(linitial(expr->args))));
+
+ /* unsupported structure (two variables or so) */
+ if (!ok)
+ return false;
+
+ /*
+ * If it's not one of the supported operators ("=", "<", ">", etc.),
+ * just ignore the clause, as it's not compatible with MCV lists.
+ *
+ * This uses the function for estimating selectivity, not the operator
+ * directly (a bit awkward, but well ...).
+ */
+ if ((get_oprrest(expr->opno) != F_EQSEL) &&
+ (get_oprrest(expr->opno) != F_SCALARLTSEL) &&
+ (get_oprrest(expr->opno) != F_SCALARGTSEL))
+ return false;
+
+ var = (varonleft) ? linitial(expr->args) : lsecond(expr->args);
+
+ return mcv_is_compatible_clause_internal((Node *)var, relid, attnums);
+ }
+
+ /* NOT clause, clause AND/OR clause */
+ if (or_clause(clause) ||
+ and_clause(clause) ||
+ not_clause(clause))
+ {
+ /*
+ * AND/OR/NOT-clauses are supported if all sub-clauses are supported
+ *
+ * TODO: We might support mixed case, where some of the clauses are
+ * supported and some are not, and treat all supported subclauses as a
+ * single clause, compute its selectivity using mv stats, and compute
+ * the total selectivity using the current algorithm.
+ *
+ * TODO: For RestrictInfo above an OR-clause, we might use the
+ * orclause with nested RestrictInfo - we won't have to call
+ * pull_varnos() for each clause, saving time.
+ */
+ BoolExpr *expr = (BoolExpr *) clause;
+ ListCell *lc;
+ Bitmapset *clause_attnums = NULL;
+
+ foreach(lc, expr->args)
+ {
+ /*
+ * If we find an incompatible clause among the arguments, treat the
+ * whole clause as incompatible.
+ */
+ if (!mcv_is_compatible_clause_internal((Node *) lfirst(lc),
+ relid, &clause_attnums))
+ return false;
+ }
+
+ /*
+ * Otherwise the clause is compatible, and we need to merge the
+ * attnums into the main bitmapset.
+ */
+ *attnums = bms_join(*attnums, clause_attnums);
+
+ return true;
+ }
+
+ /* Var IS NULL */
+ if (IsA(clause, NullTest))
+ {
+ NullTest *nt = (NullTest *) clause;
+
+ /*
+ * Only simple (Var IS NULL) expressions supported for now. Maybe we
+ * could use examine_variable to fix this?
+ */
+ if (!IsA(nt->arg, Var))
+ return false;
+
+ return mcv_is_compatible_clause_internal((Node *) (nt->arg), relid, attnums);
+ }
+
+ return false;
+}
+
+/*
+ * mcv_is_compatible_clause
+ * Determines if the clause is compatible with MCV lists
+ *
+ * Currently we support OpExprs of the form (Var op Const) or (Const op Var)
+ * with an equality or inequality operator, IS [NOT] NULL tests on a plain
+ * Var, and AND/OR/NOT combinations of such clauses. When returning true,
+ * attnums is updated with the attribute numbers of the Vars referenced by
+ * the clause. It may be possible to expand on this later.
+ */
+static bool
+mcv_is_compatible_clause(Node *clause, Index relid, Bitmapset **attnums)
+{
+ RestrictInfo *rinfo = (RestrictInfo *) clause;
+
+ if (!IsA(rinfo, RestrictInfo))
+ return false;
+
+ /* Pseudoconstants are not really interesting here. */
+ if (rinfo->pseudoconstant)
+ return false;
+
+ /* clauses referencing multiple varnos are incompatible */
+ if (bms_membership(rinfo->clause_relids) != BMS_SINGLETON)
+ return false;
+
+ return mcv_is_compatible_clause_internal((Node *)rinfo->clause,
+ relid, attnums);
+}
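+
+/*
+ * For example, with statistics defined on columns (a, b), clauses like
+ *
+ *    WHERE a = 1 AND b < 10
+ *    WHERE (a IS NULL) OR (b = 2)
+ *
+ * are considered compatible, while clauses comparing two Vars (a = b),
+ * referencing other relations or using unsupported operators are not.
+ */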
+
+#define UPDATE_RESULT(m,r,isor) \
+ (m) = (isor) ? (Max(m,r)) : (Min(m,r))
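+
+/*
+ * A short worked example of the macro above: with STATS_MATCH_NONE = 0 and
+ * STATS_MATCH_FULL = 2, merging a sub-clause result 'r' into 'm' takes
+ * Min(m,r) for AND (a single mismatch rules the item out) and Max(m,r) for
+ * OR (a single match is enough), so m = FULL, r = NONE yields NONE under
+ * AND but FULL under OR.
+ */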
+
+/*
+ * mcv_update_match_bitmap
+ * Evaluate clauses using the MCV list, and update the match bitmap.
+ *
+ * A match bitmap keeps match/mismatch status for each MCV item, and we
+ * update it based on additional clauses. We also use it to skip items
+ * that can't possibly match (e.g. item marked as "mismatch" can't change
+ * to "match" when evaluating AND clause list).
+ *
+ * The function returns the number of items currently marked as 'match', and
+ * it also returns two additional pieces of information - a flag indicating
+ * whether there was an equality condition for all attributes, and the
+ * minimum frequency in the MCV list.
+ *
+ * XXX Currently the match bitmap uses a char for each MCV item, which is
+ * somewhat wasteful as we could do with just a single bit, thus reducing
+ * the size to ~1/8. It would also allow us to combine bitmaps simply using
+ * & and |, which should be faster than min/max. The bitmaps are fairly
+ * small, though (as we cap the MCV list size to 8k items).
+ */
+static int
+mcv_update_match_bitmap(PlannerInfo *root, List *clauses,
+ Bitmapset *keys, MCVList *mcvlist,
+ int nmatches, char *matches,
+ Selectivity *lowsel, bool *fullmatch,
+ bool is_or)
+{
+ int i;
+ ListCell *l;
+
+ Bitmapset *eqmatches = NULL; /* attributes with equality matches */
+
+ /* The bitmap may be partially built. */
+ Assert(nmatches >= 0);
+ Assert(nmatches <= mcvlist->nitems);
+ Assert(clauses != NIL);
+ Assert(list_length(clauses) >= 1);
+ Assert(mcvlist != NULL);
+ Assert(mcvlist->nitems > 0);
+
+ /*
+ * Handle cases where either all MCV items are marked as mismatch (AND),
+ * or match (OR). In those cases additional clauses can't possibly change
+ * match status of any items, so don't waste time by trying.
+ */
+ if (((nmatches == 0) && (!is_or)) || /* AND-ed clauses */
+ ((nmatches == mcvlist->nitems) && is_or)) /* OR-ed clauses */
+ return nmatches;
+
+ /*
+ * Find the lowest frequency in the MCV list. The MCV list is sorted by
+ * frequency in descending order, so simply get the frequency of the last
+ * MCV item.
+ */
+ *lowsel = mcvlist->items[mcvlist->nitems-1]->frequency;
+
+ /*
+ * Loop through the list of clauses, and for each of them evaluate all the
+ * MCV items not yet eliminated by the preceding clauses.
+ */
+ foreach(l, clauses)
+ {
+ Node *clause = (Node *) lfirst(l);
+
+ /* if it's a RestrictInfo, then extract the clause */
+ if (IsA(clause, RestrictInfo))
+ clause = (Node *) ((RestrictInfo *) clause)->clause;
+
+ /*
+ * Check it still makes sense to continue evaluating the clauses on the
+ * MCV list, just like we did at the very beginning.
+ */
+ if (((nmatches == 0) && (!is_or)) ||
+ ((nmatches == mcvlist->nitems) && is_or))
+ break;
+
+ /* Handle the various types of clauses - OpClause, NullTest and AND/OR/NOT */
+ if (is_opclause(clause))
+ {
+ OpExpr *expr = (OpExpr *) clause;
+ bool varonleft = true;
+ bool ok;
+ FmgrInfo opproc;
+
+ /* get procedure computing operator selectivity */
+ RegProcedure oprrest = get_oprrest(expr->opno);
+
+ fmgr_info(get_opcode(expr->opno), &opproc);
+
+ ok = (NumRelids(clause) == 1) &&
+ (is_pseudo_constant_clause(lsecond(expr->args)) ||
+ (varonleft = false,
+ is_pseudo_constant_clause(linitial(expr->args))));
+
+ if (ok)
+ {
+
+ FmgrInfo gtproc;
+ Var *var = (varonleft) ? linitial(expr->args) : lsecond(expr->args);
+ Const *cst = (varonleft) ? lsecond(expr->args) : linitial(expr->args);
+ bool isgt = (!varonleft);
+
+ TypeCacheEntry *typecache
+ = lookup_type_cache(var->vartype, TYPECACHE_GT_OPR);
+
+ /* FIXME proper matching attribute to dimension */
+ int idx = bms_member_index(keys, var->varattno);
+
+ fmgr_info(get_opcode(typecache->gt_opr), &gtproc);
+
+ /*
+ * Walk through the MCV items and evaluate the current clause.
+ * We can skip items that were already ruled out, and
+ * terminate if there are no remaining MCV items that might
+ * possibly match.
+ */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ bool mismatch = false;
+ MCVItem *item = mcvlist->items[i];
+
+ /*
+ * If there are no more matches (AND) or no remaining
+ * unmatched items (OR), we can stop processing this
+ * clause.
+ */
+ if (((nmatches == 0) && (!is_or)) ||
+ ((nmatches == mcvlist->nitems) && is_or))
+ break;
+
+ /*
+ * For AND-lists, we can also mark NULL items as 'no
+ * match' (and then skip them). For OR-lists this is not
+ * possible.
+ */
+ if ((!is_or) && item->isnull[idx])
+ matches[i] = STATS_MATCH_NONE;
+
+ /* skip MCV items that were already ruled out */
+ if ((!is_or) && (matches[i] == STATS_MATCH_NONE))
+ continue;
+ else if (is_or && (matches[i] == STATS_MATCH_FULL))
+ continue;
+
+ switch (oprrest)
+ {
+ case F_EQSEL:
+
+ /*
+ * We don't care about isgt in equality, because
+ * it does not matter whether it's (var = const)
+ * or (const = var).
+ */
+ mismatch = !DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ item->values[idx]));
+
+ if (!mismatch)
+ eqmatches = bms_add_member(eqmatches, idx);
+
+ break;
+
+ case F_SCALARLTSEL: /* column < constant */
+ case F_SCALARGTSEL: /* column > constant */
+
+ /*
+ * For inequalities the comparison direction
+ * depends on whether the Var was on the left
+ * or the right side of the operator (isgt).
+ */
+ if (isgt)
+ mismatch = !DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ cst->constvalue,
+ item->values[idx]));
+ else
+ mismatch = !DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ item->values[idx],
+ cst->constvalue));
+
+ break;
+ }
+
+ /*
+ * XXX The conditions on matches[i] are not needed, as we
+ * skip MCV items that can't become true/false, depending
+ * on the current flag. See beginning of the loop over MCV
+ * items.
+ */
+
+ if ((is_or) && (matches[i] == STATS_MATCH_NONE) && (!mismatch))
+ {
+ /* OR - was MATCH_NONE, but will be MATCH_FULL */
+ matches[i] = STATS_MATCH_FULL;
+ ++nmatches;
+ continue;
+ }
+ else if ((!is_or) && (matches[i] == STATS_MATCH_FULL) && mismatch)
+ {
+ /* AND - was MATCH_FULL, but will be MATCH_NONE */
+ matches[i] = STATS_MATCH_NONE;
+ --nmatches;
+ continue;
+ }
+
+ }
+ }
+ }
+ else if (IsA(clause, NullTest))
+ {
+ NullTest *expr = (NullTest *) clause;
+ Var *var = (Var *) (expr->arg);
+
+ /* FIXME proper matching attribute to dimension */
+ int idx = bms_member_index(keys, var->varattno);
+
+ /*
+ * Walk through the MCV items and evaluate the current clause. We
+ * can skip items that were already ruled out, and terminate if
+ * there are no remaining MCV items that might possibly match.
+ */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ MCVItem *item = mcvlist->items[i];
+
+ /*
+ * if there are no more matches, we can stop processing this
+ * clause
+ */
+ if (nmatches == 0)
+ break;
+
+ /* skip MCV items that were already ruled out */
+ if (matches[i] == STATS_MATCH_NONE)
+ continue;
+
+ /* if the clause mismatches the MCV item, set it as MATCH_NONE */
+ if (((expr->nulltesttype == IS_NULL) && (!item->isnull[idx])) ||
+ ((expr->nulltesttype == IS_NOT_NULL) && (item->isnull[idx])))
+ {
+ matches[i] = STATS_MATCH_NONE;
+ --nmatches;
+ }
+ }
+ }
+ else if (or_clause(clause) || and_clause(clause))
+ {
+ /*
+ * AND/OR clause, with all clauses compatible with the selected MV
+ * stat
+ */
+
+ int i;
+ BoolExpr *orclause = ((BoolExpr *) clause);
+ List *orclauses = orclause->args;
+
+ /* match/mismatch bitmap for each MCV item */
+ int or_nmatches = 0;
+ char *or_matches = NULL;
+
+ Assert(orclauses != NIL);
+ Assert(list_length(orclauses) >= 2);
+
+ /* number of matching MCV items */
+ or_nmatches = mcvlist->nitems;
+
+ /* by default none of the MCV items matches the clauses */
+ or_matches = palloc0(sizeof(char) * or_nmatches);
+
+ if (or_clause(clause))
+ {
+ /* OR clauses assume nothing matches, initially */
+ memset(or_matches, STATS_MATCH_NONE, sizeof(char) * or_nmatches);
+ or_nmatches = 0;
+ }
+ else
+ {
+ /* AND clauses assume everything matches, initially */
+ memset(or_matches, STATS_MATCH_FULL, sizeof(char) * or_nmatches);
+ }
+
+ /* build the match bitmap for the OR-clauses */
+ or_nmatches = mcv_update_match_bitmap(root, orclauses, keys,
+ mcvlist, or_nmatches, or_matches,
+ lowsel, fullmatch, or_clause(clause));
+
+ /* merge the bitmap into the existing one */
+ for (i = 0; i < mcvlist->nitems; i++)
+ {
+ /*
+ * Merge the result into the bitmap (Min for AND, Max for OR).
+ *
+ * FIXME this does not decrease the number of matches
+ */
+ UPDATE_RESULT(matches[i], or_matches[i], is_or);
+ }
+
+ pfree(or_matches);
+
+ }
+ else
+ {
+ elog(ERROR, "unknown clause type: %d", clause->type);
+ }
+ }
+
+ /*
+ * If all the columns were matched by equality, it's a full match. In this
+ * case at most one MCV item can match the clause (if two items matched,
+ * they would have to be identical, which deduplication rules out).
+ */
+ *fullmatch = (bms_num_members(eqmatches) == mcvlist->ndimensions);
+
+ /* free the allocated pieces */
+ if (eqmatches)
+ pfree(eqmatches);
+
+ return nmatches;
+}
+
+
+Selectivity
+mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varRelid,
+ JoinType jointype, SpecialJoinInfo *sjinfo,
+ RelOptInfo *rel, Bitmapset **estimatedclauses)
+{
+ int i;
+ ListCell *l;
+ Bitmapset *clauses_attnums = NULL;
+ Bitmapset **list_attnums;
+ int listidx;
+ StatisticExtInfo *stat;
+ MCVList *mcv;
+ List *mcv_clauses;
+
+ /* match/mismatch bitmap for each MCV item */
+ char *matches = NULL;
+ bool fullmatch;
+ Selectivity lowsel;
+ int nmatches = 0;
+ Selectivity s = 0.0;
+
+ /* check if there are any stats that might be useful for us */
+ if (!has_stats_of_kind(rel->statlist, STATS_EXT_MCV))
+ return 1.0;
+
+ list_attnums = (Bitmapset **) palloc(sizeof(Bitmapset *) *
+ list_length(clauses));
+
+ /*
+ * Pre-process the clauses list to extract the attnums seen in each item.
+ * We need to determine if there are any clauses which will be useful for
+ * MCV selectivity estimation. Along the way we'll record all of
+ * the attnums for each clause in a list which we'll reference later so we
+ * don't need to repeat the same work again. We'll also keep track of all
+ * attnums seen.
+ *
+ * FIXME Should skip already estimated clauses (using the estimatedclauses
+ * bitmap).
+ */
+ listidx = 0;
+ foreach(l, clauses)
+ {
+ Node *clause = (Node *) lfirst(l);
+ Bitmapset *attnums = NULL;
+
+ if (mcv_is_compatible_clause(clause, rel->relid, &attnums))
+ {
+ list_attnums[listidx] = attnums;
+ clauses_attnums = bms_add_members(clauses_attnums, attnums);
+ }
+ else
+ list_attnums[listidx] = NULL;
+
+ listidx++;
+ }
+
+ /* We need at least two attributes for MCV lists. */
+ if (bms_num_members(clauses_attnums) < 2)
+ return 1.0;
+
+ /* find the best suited statistics object for these attnums */
+ stat = choose_best_statistics(rel->statlist, clauses_attnums,
+ STATS_EXT_MCV);
+
+ /* if no matching stats could be found then we've nothing to do */
+ if (!stat)
+ return 1.0;
+
+ /* load the MCV list stored in the statistics object */
+ mcv = statext_mcv_load(stat->statOid);
+
+ /* now filter the clauses to be estimated using the selected MCV */
+ mcv_clauses = NIL;
+
+ listidx = 0;
+ foreach (l, clauses)
+ {
+ /*
+ * If the clause is compatible with the selected MCV statistics,
+ * mark it as estimated and add it to the list of MCV clauses.
+ */
+ if ((list_attnums[listidx] != NULL) &&
+ (bms_is_subset(list_attnums[listidx], stat->keys)))
+ {
+ mcv_clauses = lappend(mcv_clauses, (Node *)lfirst(l));
+ *estimatedclauses = bms_add_member(*estimatedclauses, listidx);
+ }
+
+ listidx++;
+ }
+
+ /* by default all the MCV items match the clauses fully */
+ matches = palloc0(sizeof(char) * mcv->nitems);
+ memset(matches, STATS_MATCH_FULL, sizeof(char) * mcv->nitems);
+
+ /* number of matching MCV items */
+ nmatches = mcv->nitems;
+
+ nmatches = mcv_update_match_bitmap(root, mcv_clauses,
+ stat->keys, mcv,
+ nmatches, matches,
+ &lowsel, &fullmatch, false);
+
+ /* sum frequencies for all the matching MCV items */
+ for (i = 0; i < mcv->nitems; i++)
+ {
+ if (matches[i] != STATS_MATCH_NONE)
+ s += mcv->items[i]->frequency;
+ }
+
+ return s;
+}
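+
+/*
+ * A minimal usage sketch (table and statistics names are examples only):
+ *
+ *    CREATE STATISTICS s (mcv) ON a, b FROM t;
+ *    ANALYZE t;
+ *    EXPLAIN SELECT * FROM t WHERE a = 1 AND b = 1;
+ *
+ * The estimate for the AND-ed clauses is then derived from the sum of
+ * frequencies of the matching MCV items, instead of multiplying per-column
+ * selectivities as if the columns were independent.
+ */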
diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index 0faa020..80746da 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -1461,6 +1461,7 @@ pg_get_statisticsobj_worker(Oid statextid, bool missing_ok)
bool isnull;
bool ndistinct_enabled;
bool dependencies_enabled;
+ bool mcv_enabled;
int i;
statexttup = SearchSysCache1(STATEXTOID, ObjectIdGetDatum(statextid));
@@ -1496,6 +1497,7 @@ pg_get_statisticsobj_worker(Oid statextid, bool missing_ok)
ndistinct_enabled = false;
dependencies_enabled = false;
+ mcv_enabled = false;
for (i = 0; i < ARR_DIMS(arr)[0]; i++)
{
@@ -1503,6 +1505,8 @@ pg_get_statisticsobj_worker(Oid statextid, bool missing_ok)
ndistinct_enabled = true;
if (enabled[i] == STATS_EXT_DEPENDENCIES)
dependencies_enabled = true;
+ if (enabled[i] == STATS_EXT_MCV)
+ mcv_enabled = true;
}
/*
@@ -1512,13 +1516,27 @@ pg_get_statisticsobj_worker(Oid statextid, bool missing_ok)
* statistics types on a newer postgres version, if the statistics had all
* options enabled on the original version.
*/
- if (!ndistinct_enabled || !dependencies_enabled)
+ if (!ndistinct_enabled || !dependencies_enabled || !mcv_enabled)
{
+ bool gotone = false;
+
appendStringInfoString(&buf, " (");
+
if (ndistinct_enabled)
+ {
appendStringInfoString(&buf, "ndistinct");
- else if (dependencies_enabled)
- appendStringInfoString(&buf, "dependencies");
+ gotone = true;
+ }
+
+ if (dependencies_enabled)
+ {
+ appendStringInfo(&buf, "%sdependencies", gotone ? ", " : "");
+ gotone = true;
+ }
+
+ if (mcv_enabled)
+ appendStringInfo(&buf, "%smcv", gotone ? ", " : "");
+
appendStringInfoChar(&buf, ')');
}
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 798e710..bedd3db 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -2382,7 +2382,8 @@ describeOneTableDetails(const char *schemaname,
" JOIN pg_catalog.pg_attribute a ON (stxrelid = a.attrelid AND\n"
" a.attnum = s.attnum AND NOT attisdropped)) AS columns,\n"
" (stxkind @> '{d}') AS ndist_enabled,\n"
- " (stxkind @> '{f}') AS deps_enabled\n"
+ " (stxkind @> '{f}') AS deps_enabled,\n"
+ " (stxkind @> '{m}') AS mcv_enabled\n"
"FROM pg_catalog.pg_statistic_ext stat "
"WHERE stxrelid = '%s'\n"
"ORDER BY 1;",
@@ -2419,6 +2420,12 @@ describeOneTableDetails(const char *schemaname,
if (strcmp(PQgetvalue(result, i, 6), "t") == 0)
{
appendPQExpBuffer(&buf, "%sdependencies", gotone ? ", " : "");
+ gotone = true;
+ }
+
+ if (strcmp(PQgetvalue(result, i, 7), "t") == 0)
+ {
+ appendPQExpBuffer(&buf, "%smcv", gotone ? ", " : "");
}
appendPQExpBuffer(&buf, ") ON %s FROM %s",
diff --git a/src/include/catalog/pg_cast.h b/src/include/catalog/pg_cast.h
index 1782753..4881134 100644
--- a/src/include/catalog/pg_cast.h
+++ b/src/include/catalog/pg_cast.h
@@ -262,6 +262,11 @@ DATA(insert ( 3361 25 0 i i ));
DATA(insert ( 3402 17 0 i b ));
DATA(insert ( 3402 25 0 i i ));
+/* pg_mcv_list can be coerced to, but not from, bytea and text */
+DATA(insert ( 441 17 0 i b ));
+DATA(insert ( 441 25 0 i i ));
+
+
/*
* Datetime category
*/
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 8b33b4e..d78ad54 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -2786,6 +2786,18 @@ DESCR("I/O");
DATA(insert OID = 3407 ( pg_dependencies_send PGNSP PGUID 12 1 0 0 0 f f f f t f s s 1 0 17 "3402" _null_ _null_ _null_ _null_ _null_ pg_dependencies_send _null_ _null_ _null_ ));
DESCR("I/O");
+DATA(insert OID = 442 ( pg_mcv_list_in PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 441 "2275" _null_ _null_ _null_ _null_ _null_ pg_mcv_list_in _null_ _null_ _null_ ));
+DESCR("I/O");
+DATA(insert OID = 443 ( pg_mcv_list_out PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 2275 "441" _null_ _null_ _null_ _null_ _null_ pg_mcv_list_out _null_ _null_ _null_ ));
+DESCR("I/O");
+DATA(insert OID = 444 ( pg_mcv_list_recv PGNSP PGUID 12 1 0 0 0 f f f f t f s s 1 0 441 "2281" _null_ _null_ _null_ _null_ _null_ pg_mcv_list_recv _null_ _null_ _null_ ));
+DESCR("I/O");
+DATA(insert OID = 445 ( pg_mcv_list_send PGNSP PGUID 12 1 0 0 0 f f f f t f s s 1 0 17 "441" _null_ _null_ _null_ _null_ _null_ pg_mcv_list_send _null_ _null_ _null_ ));
+DESCR("I/O");
+
+DATA(insert OID = 3410 ( pg_mcv_list_items PGNSP PGUID 12 1 1000 0 0 f f f f t t i s 1 0 2249 "26" "{26,23,1009,1000,701}" "{i,o,o,o,o}" "{oid,index,values,nulls,frequency}" _null_ _null_ pg_stats_ext_mcvlist_items _null_ _null_ _null_ ));
+DESCR("details about MCV list items");
+
DATA(insert OID = 1928 ( pg_stat_get_numscans PGNSP PGUID 12 1 0 0 0 f f f f t f s r 1 0 20 "26" _null_ _null_ _null_ _null_ _null_ pg_stat_get_numscans _null_ _null_ _null_ ));
DESCR("statistics: number of scans done for table/index");
DATA(insert OID = 1929 ( pg_stat_get_tuples_returned PGNSP PGUID 12 1 0 0 0 f f f f t f s r 1 0 20 "26" _null_ _null_ _null_ _null_ _null_ pg_stat_get_tuples_returned _null_ _null_ _null_ ));
diff --git a/src/include/catalog/pg_statistic_ext.h b/src/include/catalog/pg_statistic_ext.h
index 7813802..4752525 100644
--- a/src/include/catalog/pg_statistic_ext.h
+++ b/src/include/catalog/pg_statistic_ext.h
@@ -49,6 +49,7 @@ CATALOG(pg_statistic_ext,3381)
* to build */
pg_ndistinct stxndistinct; /* ndistinct coefficients (serialized) */
pg_dependencies stxdependencies; /* dependencies (serialized) */
+ pg_mcv_list stxmcv; /* MCV (serialized) */
#endif
} FormData_pg_statistic_ext;
@@ -64,7 +65,7 @@ typedef FormData_pg_statistic_ext *Form_pg_statistic_ext;
* compiler constants for pg_statistic_ext
* ----------------
*/
-#define Natts_pg_statistic_ext 8
+#define Natts_pg_statistic_ext 9
#define Anum_pg_statistic_ext_stxrelid 1
#define Anum_pg_statistic_ext_stxname 2
#define Anum_pg_statistic_ext_stxnamespace 3
@@ -73,8 +74,10 @@ typedef FormData_pg_statistic_ext *Form_pg_statistic_ext;
#define Anum_pg_statistic_ext_stxkind 6
#define Anum_pg_statistic_ext_stxndistinct 7
#define Anum_pg_statistic_ext_stxdependencies 8
+#define Anum_pg_statistic_ext_stxmcv 9
#define STATS_EXT_NDISTINCT 'd'
#define STATS_EXT_DEPENDENCIES 'f'
+#define STATS_EXT_MCV 'm'
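+/* e.g. stxkind = '{d,f,m}' enables ndistinct, dependencies and MCV stats */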
#endif /* PG_STATISTIC_EXT_H */
diff --git a/src/include/catalog/pg_type.h b/src/include/catalog/pg_type.h
index ffdb452..b5fcc3d 100644
--- a/src/include/catalog/pg_type.h
+++ b/src/include/catalog/pg_type.h
@@ -372,6 +372,10 @@ DATA(insert OID = 3402 ( pg_dependencies PGNSP PGUID -1 f b S f t \054 0 0 0 pg
DESCR("multivariate dependencies");
#define PGDEPENDENCIESOID 3402
+DATA(insert OID = 441 ( pg_mcv_list PGNSP PGUID -1 f b S f t \054 0 0 0 pg_mcv_list_in pg_mcv_list_out pg_mcv_list_recv pg_mcv_list_send - - - i x f 0 -1 0 100 _null_ _null_ _null_ ));
+DESCR("multivariate MCV list");
+#define PGMCVLISTOID 441
+
DATA(insert OID = 32 ( pg_ddl_command PGNSP PGUID SIZEOF_POINTER t p P f t \054 0 0 0 pg_ddl_command_in pg_ddl_command_out pg_ddl_command_recv pg_ddl_command_send - - - ALIGNOF_POINTER p f 0 -1 0 0 _null_ _null_ _null_ ));
DESCR("internal type for passing CollectedCommand");
#define PGDDLCOMMANDOID 32
diff --git a/src/include/statistics/extended_stats_internal.h b/src/include/statistics/extended_stats_internal.h
index 738ff3f..7a04863 100644
--- a/src/include/statistics/extended_stats_internal.h
+++ b/src/include/statistics/extended_stats_internal.h
@@ -31,6 +31,15 @@ typedef struct
int tupno; /* position index for tuple it came from */
} ScalarItem;
+/* (de)serialization info */
+typedef struct DimensionInfo
+{
+ int nvalues; /* number of deduplicated values */
+ int nbytes; /* number of bytes (serialized) */
+ int typlen; /* pg_type.typlen */
+ bool typbyval; /* pg_type.typbyval */
+} DimensionInfo;
+
/* multi-sort */
typedef struct MultiSortSupportData
{
@@ -44,6 +53,7 @@ typedef struct SortItem
{
Datum *values;
bool *isnull;
+ int count;
} SortItem;
extern MVNDistinct *statext_ndistinct_build(double totalrows,
@@ -57,13 +67,35 @@ extern MVDependencies *statext_dependencies_build(int numrows, HeapTuple *rows,
extern bytea *statext_dependencies_serialize(MVDependencies *dependencies);
extern MVDependencies *statext_dependencies_deserialize(bytea *data);
+extern MCVList *statext_mcv_build(int numrows, HeapTuple *rows,
+ Bitmapset *attrs, VacAttrStats **stats);
+extern bytea *statext_mcv_serialize(MCVList *mcv, VacAttrStats **stats);
+extern MCVList *statext_mcv_deserialize(bytea *data);
+
extern MultiSortSupport multi_sort_init(int ndims);
extern void multi_sort_add_dimension(MultiSortSupport mss, int sortdim,
Oid oper);
-extern int multi_sort_compare(const void *a, const void *b, void *arg);
+extern int multi_sort_compare(const void *a, const void *b, void *arg);
extern int multi_sort_compare_dim(int dim, const SortItem *a,
const SortItem *b, MultiSortSupport mss);
extern int multi_sort_compare_dims(int start, int end, const SortItem *a,
const SortItem *b, MultiSortSupport mss);
+extern int compare_scalars_simple(const void *a, const void *b, void *arg);
+extern int compare_datums_simple(Datum a, Datum b, SortSupport ssup);
+
+extern void *bsearch_arg(const void *key, const void *base,
+ size_t nmemb, size_t size,
+ int (*compar) (const void *, const void *, void *),
+ void *arg);
+
+extern int *build_attnums(Bitmapset *attrs);
+
+extern SortItem * build_sorted_items(int numrows, HeapTuple *rows,
+ TupleDesc tdesc, MultiSortSupport mss,
+ int numattrs, int *attnums);
+
+extern int2vector *find_ext_attnums(Oid mvoid, Oid *relid);
+
+extern int bms_member_index(Bitmapset *keys, AttrNumber varattno);
#endif /* EXTENDED_STATS_INTERNAL_H */
diff --git a/src/include/statistics/statistics.h b/src/include/statistics/statistics.h
index 1d68c39..7b94dde 100644
--- a/src/include/statistics/statistics.h
+++ b/src/include/statistics/statistics.h
@@ -15,6 +15,14 @@
#include "commands/vacuum.h"
#include "nodes/relation.h"
+
+/*
+ * Degree to which an MCV item matches a clause.
+ * This is then considered when computing the selectivity.
+ */
+#define STATS_MATCH_NONE 0 /* no match at all */
+#define STATS_MATCH_PARTIAL 1 /* partial match */
+#define STATS_MATCH_FULL 2 /* full match */
#define STATS_MAX_DIMENSIONS 8 /* max number of attributes */
@@ -78,8 +86,40 @@ typedef struct MVDependencies
/* size of the struct excluding the deps array */
#define SizeOfDependencies (offsetof(MVDependencies, ndeps) + sizeof(uint32))
+
+/* used to flag stats serialized to bytea */
+#define STATS_MCV_MAGIC 0xE1A651C2 /* marks serialized bytea */
+#define STATS_MCV_TYPE_BASIC 1 /* basic MCV list type */
+
+/* max items in MCV list (mostly arbitrary number) */
+#define STATS_MCVLIST_MAX_ITEMS 8192
+
+/*
+ * Multivariate MCV (most-common value) lists
+ *
+ * A straightforward extension of per-column MCV lists, i.e. a list (array) of
+ * combinations of attribute values, together with a frequency and null flags.
+ */
+typedef struct MCVItem
+{
+ double frequency; /* frequency of this combination */
+ bool *isnull; /* flags of NULL values (one per dimension) */
+ Datum *values; /* variable-length (ndimensions) */
+} MCVItem;
+
+/* multivariate MCV list - essentially an array of MCV items */
+typedef struct MCVList
+{
+ uint32 magic; /* magic constant marker */
+ uint32 type; /* type of MCV list (BASIC) */
+ uint32 nitems; /* number of MCV items in the array */
+ AttrNumber ndimensions; /* number of dimensions */
+ MCVItem **items; /* array of MCV items */
+} MCVList;
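+
+/*
+ * For example, with statistics on columns (a, b), an MCVItem with
+ * values = {1, 1}, isnull = {false, false} and frequency = 0.01 says the
+ * combination (a = 1, b = 1) was seen in about 1% of the sampled rows.
+ */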
+
extern MVNDistinct *statext_ndistinct_load(Oid mvoid);
extern MVDependencies *statext_dependencies_load(Oid mvoid);
+extern MCVList *statext_mcv_load(Oid mvoid);
extern void BuildRelationExtStatistics(Relation onerel, double totalrows,
int numrows, HeapTuple *rows,
@@ -92,6 +132,13 @@ extern Selectivity dependencies_clauselist_selectivity(PlannerInfo *root,
SpecialJoinInfo *sjinfo,
RelOptInfo *rel,
Bitmapset **estimatedclauses);
+extern Selectivity mcv_clauselist_selectivity(PlannerInfo *root,
+ List *clauses,
+ int varRelid,
+ JoinType jointype,
+ SpecialJoinInfo *sjinfo,
+ RelOptInfo *rel,
+ Bitmapset **estimatedclauses);
extern bool has_stats_of_kind(List *stats, char requiredkind);
extern StatisticExtInfo *choose_best_statistics(List *stats,
Bitmapset *attnums, char requiredkind);
diff --git a/src/test/regress/expected/opr_sanity.out b/src/test/regress/expected/opr_sanity.out
index fcf8bd7..bdc0889 100644
--- a/src/test/regress/expected/opr_sanity.out
+++ b/src/test/regress/expected/opr_sanity.out
@@ -859,11 +859,12 @@ WHERE c.castmethod = 'b' AND
pg_node_tree | text | 0 | i
pg_ndistinct | bytea | 0 | i
pg_dependencies | bytea | 0 | i
+ pg_mcv_list | bytea | 0 | i
cidr | inet | 0 | i
xml | text | 0 | a
xml | character varying | 0 | a
xml | character | 0 | a
-(9 rows)
+(10 rows)
-- **************** pg_conversion ****************
-- Look for illegal values in pg_conversion fields.
diff --git a/src/test/regress/expected/stats_ext.out b/src/test/regress/expected/stats_ext.out
index 441cfaa..85009d2 100644
--- a/src/test/regress/expected/stats_ext.out
+++ b/src/test/regress/expected/stats_ext.out
@@ -58,7 +58,7 @@ ALTER TABLE ab1 DROP COLUMN a;
b | integer | | |
c | integer | | |
Statistics objects:
- "public"."ab1_b_c_stats" (ndistinct, dependencies) ON b, c FROM ab1
+ "public"."ab1_b_c_stats" (ndistinct, dependencies, mcv) ON b, c FROM ab1
-- Ensure statistics are dropped when table is
SELECT stxname FROM pg_statistic_ext WHERE stxname LIKE 'ab1%';
@@ -206,7 +206,7 @@ SELECT stxkind, stxndistinct
FROM pg_statistic_ext WHERE stxrelid = 'ndistinct'::regclass;
stxkind | stxndistinct
---------+---------------------------------------------------------
- {d,f} | {"3, 4": 301, "3, 6": 301, "4, 6": 301, "3, 4, 6": 301}
+ {d,f,m} | {"3, 4": 301, "3, 6": 301, "4, 6": 301, "3, 4, 6": 301}
(1 row)
-- Hash Aggregate, thanks to estimates improved by the statistic
@@ -272,7 +272,7 @@ SELECT stxkind, stxndistinct
FROM pg_statistic_ext WHERE stxrelid = 'ndistinct'::regclass;
stxkind | stxndistinct
---------+-------------------------------------------------------------
- {d,f} | {"3, 4": 2550, "3, 6": 800, "4, 6": 1632, "3, 4, 6": 10000}
+ {d,f,m} | {"3, 4": 2550, "3, 6": 800, "4, 6": 1632, "3, 4, 6": 10000}
(1 row)
-- plans using Group Aggregate, thanks to using correct esimates
@@ -509,3 +509,216 @@ EXPLAIN (COSTS OFF)
(5 rows)
RESET random_page_cost;
+-- MCV lists
+CREATE TABLE mcv_lists (
+ filler1 TEXT,
+ filler2 NUMERIC,
+ a INT,
+ b TEXT,
+ filler3 DATE,
+ c INT,
+ d TEXT
+);
+SET random_page_cost = 1.2;
+CREATE INDEX mcv_lists_ab_idx ON mcv_lists (a, b);
+CREATE INDEX mcv_lists_abc_idx ON mcv_lists (a, b, c);
+-- random data (no MCV list)
+INSERT INTO mcv_lists (a, b, c, filler1)
+ SELECT mod(i,37), mod(i,41), mod(i,43), mod(i,47) FROM generate_series(1,5000) s(i);
+ANALYZE mcv_lists;
+EXPLAIN (COSTS OFF)
+ SELECT * FROM mcv_lists WHERE a = 1 AND b = '1';
+ QUERY PLAN
+---------------------------------------------------
+ Bitmap Heap Scan on mcv_lists
+ Recheck Cond: ((a = 1) AND (b = '1'::text))
+ -> Bitmap Index Scan on mcv_lists_abc_idx
+ Index Cond: ((a = 1) AND (b = '1'::text))
+(4 rows)
+
+EXPLAIN (COSTS OFF)
+ SELECT * FROM mcv_lists WHERE a = 1 AND b = '1' AND c = 1;
+ QUERY PLAN
+---------------------------------------------------------
+ Index Scan using mcv_lists_abc_idx on mcv_lists
+ Index Cond: ((a = 1) AND (b = '1'::text) AND (c = 1))
+(2 rows)
+
+-- create statistics
+CREATE STATISTICS mcv_lists_stats (mcv) ON a, b, c FROM mcv_lists;
+ANALYZE mcv_lists;
+EXPLAIN (COSTS OFF)
+ SELECT * FROM mcv_lists WHERE a = 1 AND b = '1';
+ QUERY PLAN
+---------------------------------------------------
+ Bitmap Heap Scan on mcv_lists
+ Recheck Cond: ((a = 1) AND (b = '1'::text))
+ -> Bitmap Index Scan on mcv_lists_abc_idx
+ Index Cond: ((a = 1) AND (b = '1'::text))
+(4 rows)
+
+EXPLAIN (COSTS OFF)
+ SELECT * FROM mcv_lists WHERE a = 1 AND b = '1' AND c = 1;
+ QUERY PLAN
+---------------------------------------------------------
+ Index Scan using mcv_lists_abc_idx on mcv_lists
+ Index Cond: ((a = 1) AND (b = '1'::text) AND (c = 1))
+(2 rows)
+
+-- 100 distinct combinations, all in the MCV list
+TRUNCATE mcv_lists;
+DROP STATISTICS mcv_lists_stats;
+INSERT INTO mcv_lists (a, b, c, filler1)
+ SELECT mod(i,100), mod(i,50), mod(i,25), i FROM generate_series(1,5000) s(i);
+ANALYZE mcv_lists;
+EXPLAIN (COSTS OFF)
+ SELECT * FROM mcv_lists WHERE a = 1 AND b = '1';
+ QUERY PLAN
+-------------------------------------------------
+ Index Scan using mcv_lists_abc_idx on mcv_lists
+ Index Cond: ((a = 1) AND (b = '1'::text))
+(2 rows)
+
+EXPLAIN (COSTS OFF)
+ SELECT * FROM mcv_lists WHERE a < 1 AND b < '1';
+ QUERY PLAN
+-------------------------------------------------
+ Index Scan using mcv_lists_abc_idx on mcv_lists
+ Index Cond: ((a < 1) AND (b < '1'::text))
+(2 rows)
+
+EXPLAIN (COSTS OFF)
+ SELECT * FROM mcv_lists WHERE a = 1 AND b = '1' AND c = 1;
+ QUERY PLAN
+---------------------------------------------------------
+ Index Scan using mcv_lists_abc_idx on mcv_lists
+ Index Cond: ((a = 1) AND (b = '1'::text) AND (c = 1))
+(2 rows)
+
+EXPLAIN (COSTS OFF)
+ SELECT * FROM mcv_lists WHERE a < 5 AND b < '1' AND c < 5;
+ QUERY PLAN
+---------------------------------------------------------
+ Index Scan using mcv_lists_abc_idx on mcv_lists
+ Index Cond: ((a < 5) AND (b < '1'::text) AND (c < 5))
+(2 rows)
+
+-- create statistics
+CREATE STATISTICS mcv_lists_stats (mcv) ON a, b, c FROM mcv_lists;
+ANALYZE mcv_lists;
+EXPLAIN (COSTS OFF)
+ SELECT * FROM mcv_lists WHERE a = 1 AND b = '1';
+ QUERY PLAN
+---------------------------------------------------
+ Bitmap Heap Scan on mcv_lists
+ Recheck Cond: ((a = 1) AND (b = '1'::text))
+ -> Bitmap Index Scan on mcv_lists_abc_idx
+ Index Cond: ((a = 1) AND (b = '1'::text))
+(4 rows)
+
+EXPLAIN (COSTS OFF)
+ SELECT * FROM mcv_lists WHERE a < 1 AND b < '1';
+ QUERY PLAN
+---------------------------------------------------
+ Bitmap Heap Scan on mcv_lists
+ Recheck Cond: ((a < 1) AND (b < '1'::text))
+ -> Bitmap Index Scan on mcv_lists_abc_idx
+ Index Cond: ((a < 1) AND (b < '1'::text))
+(4 rows)
+
+EXPLAIN (COSTS OFF)
+ SELECT * FROM mcv_lists WHERE a = 1 AND b = '1' AND c = 1;
+ QUERY PLAN
+---------------------------------------------------
+ Bitmap Heap Scan on mcv_lists
+ Recheck Cond: ((a = 1) AND (b = '1'::text))
+ Filter: (c = 1)
+ -> Bitmap Index Scan on mcv_lists_ab_idx
+ Index Cond: ((a = 1) AND (b = '1'::text))
+(5 rows)
+
+EXPLAIN (COSTS OFF)
+ SELECT * FROM mcv_lists WHERE a < 5 AND b < '1' AND c < 5;
+ QUERY PLAN
+---------------------------------------------------
+ Bitmap Heap Scan on mcv_lists
+ Recheck Cond: ((a < 5) AND (b < '1'::text))
+ Filter: (c < 5)
+ -> Bitmap Index Scan on mcv_lists_ab_idx
+ Index Cond: ((a < 5) AND (b < '1'::text))
+(5 rows)
+
+-- check change of column type resets the MCV statistics
+ALTER TABLE mcv_lists ALTER COLUMN c TYPE numeric;
+EXPLAIN (COSTS OFF)
+ SELECT * FROM mcv_lists WHERE a = 1 AND b = '1';
+ QUERY PLAN
+-------------------------------------------------
+ Index Scan using mcv_lists_abc_idx on mcv_lists
+ Index Cond: ((a = 1) AND (b = '1'::text))
+(2 rows)
+
+ANALYZE mcv_lists;
+EXPLAIN (COSTS OFF)
+ SELECT * FROM mcv_lists WHERE a = 1 AND b = '1';
+ QUERY PLAN
+---------------------------------------------------
+ Bitmap Heap Scan on mcv_lists
+ Recheck Cond: ((a = 1) AND (b = '1'::text))
+ -> Bitmap Index Scan on mcv_lists_abc_idx
+ Index Cond: ((a = 1) AND (b = '1'::text))
+(4 rows)
+
+-- 100 distinct combinations with NULL values, all in the MCV list
+TRUNCATE mcv_lists;
+DROP STATISTICS mcv_lists_stats;
+INSERT INTO mcv_lists (a, b, c, filler1)
+ SELECT
+ (CASE WHEN mod(i,100) = 1 THEN NULL ELSE mod(i,100) END),
+ (CASE WHEN mod(i,50) = 1 THEN NULL ELSE mod(i,50) END),
+ (CASE WHEN mod(i,25) = 1 THEN NULL ELSE mod(i,25) END),
+ i
+ FROM generate_series(1,5000) s(i);
+ANALYZE mcv_lists;
+EXPLAIN (COSTS OFF)
+ SELECT * FROM mcv_lists WHERE a IS NULL AND b IS NULL;
+ QUERY PLAN
+-------------------------------------------------
+ Index Scan using mcv_lists_abc_idx on mcv_lists
+ Index Cond: ((a IS NULL) AND (b IS NULL))
+(2 rows)
+
+EXPLAIN (COSTS OFF)
+ SELECT * FROM mcv_lists WHERE a IS NULL AND b IS NULL AND c IS NULL;
+ QUERY PLAN
+------------------------------------------------
+ Index Scan using mcv_lists_ab_idx on mcv_lists
+ Index Cond: ((a IS NULL) AND (b IS NULL))
+ Filter: (c IS NULL)
+(3 rows)
+
+-- create statistics
+CREATE STATISTICS mcv_lists_stats (mcv) ON a, b, c FROM mcv_lists;
+ANALYZE mcv_lists;
+EXPLAIN (COSTS OFF)
+ SELECT * FROM mcv_lists WHERE a IS NULL AND b IS NULL;
+ QUERY PLAN
+---------------------------------------------------
+ Bitmap Heap Scan on mcv_lists
+ Recheck Cond: ((a IS NULL) AND (b IS NULL))
+ -> Bitmap Index Scan on mcv_lists_abc_idx
+ Index Cond: ((a IS NULL) AND (b IS NULL))
+(4 rows)
+
+EXPLAIN (COSTS OFF)
+ SELECT * FROM mcv_lists WHERE a IS NULL AND b IS NULL AND c IS NULL;
+ QUERY PLAN
+---------------------------------------------------
+ Bitmap Heap Scan on mcv_lists
+ Recheck Cond: ((a IS NULL) AND (b IS NULL))
+ Filter: (c IS NULL)
+ -> Bitmap Index Scan on mcv_lists_ab_idx
+ Index Cond: ((a IS NULL) AND (b IS NULL))
+(5 rows)
+
+RESET random_page_cost;
diff --git a/src/test/regress/expected/type_sanity.out b/src/test/regress/expected/type_sanity.out
index 7b200ba..5a7c570 100644
--- a/src/test/regress/expected/type_sanity.out
+++ b/src/test/regress/expected/type_sanity.out
@@ -72,8 +72,9 @@ WHERE p1.typtype not in ('c','d','p') AND p1.typname NOT LIKE E'\\_%'
194 | pg_node_tree
3361 | pg_ndistinct
3402 | pg_dependencies
+ 441 | pg_mcv_list
210 | smgr
-(4 rows)
+(5 rows)
-- Make sure typarray points to a varlena array type of our own base
SELECT p1.oid, p1.typname as basetype, p2.typname as arraytype,
diff --git a/src/test/regress/sql/stats_ext.sql b/src/test/regress/sql/stats_ext.sql
index 46acaad..e9902ce 100644
--- a/src/test/regress/sql/stats_ext.sql
+++ b/src/test/regress/sql/stats_ext.sql
@@ -282,3 +282,124 @@ EXPLAIN (COSTS OFF)
SELECT * FROM functional_dependencies WHERE a = 1 AND b = '1' AND c = 1;
RESET random_page_cost;
+
+-- MCV lists
+CREATE TABLE mcv_lists (
+ filler1 TEXT,
+ filler2 NUMERIC,
+ a INT,
+ b TEXT,
+ filler3 DATE,
+ c INT,
+ d TEXT
+);
+
+SET random_page_cost = 1.2;
+
+CREATE INDEX mcv_lists_ab_idx ON mcv_lists (a, b);
+CREATE INDEX mcv_lists_abc_idx ON mcv_lists (a, b, c);
+
+-- random data (no MCV list)
+INSERT INTO mcv_lists (a, b, c, filler1)
+ SELECT mod(i,37), mod(i,41), mod(i,43), mod(i,47) FROM generate_series(1,5000) s(i);
+
+ANALYZE mcv_lists;
+
+EXPLAIN (COSTS OFF)
+ SELECT * FROM mcv_lists WHERE a = 1 AND b = '1';
+
+EXPLAIN (COSTS OFF)
+ SELECT * FROM mcv_lists WHERE a = 1 AND b = '1' AND c = 1;
+
+-- create statistics
+CREATE STATISTICS mcv_lists_stats (mcv) ON a, b, c FROM mcv_lists;
+
+ANALYZE mcv_lists;
+
+EXPLAIN (COSTS OFF)
+ SELECT * FROM mcv_lists WHERE a = 1 AND b = '1';
+
+EXPLAIN (COSTS OFF)
+ SELECT * FROM mcv_lists WHERE a = 1 AND b = '1' AND c = 1;
+
+-- 100 distinct combinations, all in the MCV list
+TRUNCATE mcv_lists;
+DROP STATISTICS mcv_lists_stats;
+
+INSERT INTO mcv_lists (a, b, c, filler1)
+ SELECT mod(i,100), mod(i,50), mod(i,25), i FROM generate_series(1,5000) s(i);
+
+ANALYZE mcv_lists;
+
+EXPLAIN (COSTS OFF)
+ SELECT * FROM mcv_lists WHERE a = 1 AND b = '1';
+
+EXPLAIN (COSTS OFF)
+ SELECT * FROM mcv_lists WHERE a < 1 AND b < '1';
+
+EXPLAIN (COSTS OFF)
+ SELECT * FROM mcv_lists WHERE a = 1 AND b = '1' AND c = 1;
+
+EXPLAIN (COSTS OFF)
+ SELECT * FROM mcv_lists WHERE a < 5 AND b < '1' AND c < 5;
+
+-- create statistics
+CREATE STATISTICS mcv_lists_stats (mcv) ON a, b, c FROM mcv_lists;
+
+ANALYZE mcv_lists;
+
+EXPLAIN (COSTS OFF)
+ SELECT * FROM mcv_lists WHERE a = 1 AND b = '1';
+
+EXPLAIN (COSTS OFF)
+ SELECT * FROM mcv_lists WHERE a < 1 AND b < '1';
+
+EXPLAIN (COSTS OFF)
+ SELECT * FROM mcv_lists WHERE a = 1 AND b = '1' AND c = 1;
+
+EXPLAIN (COSTS OFF)
+ SELECT * FROM mcv_lists WHERE a < 5 AND b < '1' AND c < 5;
+
+-- check change of column type resets the MCV statistics
+ALTER TABLE mcv_lists ALTER COLUMN c TYPE numeric;
+
+EXPLAIN (COSTS OFF)
+ SELECT * FROM mcv_lists WHERE a = 1 AND b = '1';
+
+ANALYZE mcv_lists;
+
+EXPLAIN (COSTS OFF)
+ SELECT * FROM mcv_lists WHERE a = 1 AND b = '1';
+
+-- 100 distinct combinations with NULL values, all in the MCV list
+TRUNCATE mcv_lists;
+DROP STATISTICS mcv_lists_stats;
+
+INSERT INTO mcv_lists (a, b, c, filler1)
+ SELECT
+ (CASE WHEN mod(i,100) = 1 THEN NULL ELSE mod(i,100) END),
+ (CASE WHEN mod(i,50) = 1 THEN NULL ELSE mod(i,50) END),
+ (CASE WHEN mod(i,25) = 1 THEN NULL ELSE mod(i,25) END),
+ i
+ FROM generate_series(1,5000) s(i);
+
+ANALYZE mcv_lists;
+
+EXPLAIN (COSTS OFF)
+ SELECT * FROM mcv_lists WHERE a IS NULL AND b IS NULL;
+
+EXPLAIN (COSTS OFF)
+ SELECT * FROM mcv_lists WHERE a IS NULL AND b IS NULL AND c IS NULL;
+
+-- create statistics
+CREATE STATISTICS mcv_lists_stats (mcv) ON a, b, c FROM mcv_lists;
+
+ANALYZE mcv_lists;
+
+EXPLAIN (COSTS OFF)
+ SELECT * FROM mcv_lists WHERE a IS NULL AND b IS NULL;
+
+EXPLAIN (COSTS OFF)
+ SELECT * FROM mcv_lists WHERE a IS NULL AND b IS NULL AND c IS NULL;
+
+RESET random_page_cost;
--
2.9.4
0002-Multivariate-histograms.patchtext/x-patch; name=0002-Multivariate-histograms.patchDownload
From 0f977c45527a4375a2b80a3d560436bd1d1baf0b Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@2ndquadrant.com>
Date: Fri, 4 Aug 2017 01:20:24 +0200
Subject: [PATCH 2/3] Multivariate histograms
---
doc/src/sgml/catalogs.sgml | 9 +
doc/src/sgml/planstats.sgml | 105 +
doc/src/sgml/ref/create_statistics.sgml | 31 +-
src/backend/commands/statscmds.c | 33 +-
src/backend/nodes/outfuncs.c | 2 +-
src/backend/optimizer/path/clausesel.c | 22 +-
src/backend/optimizer/util/plancat.c | 44 +-
src/backend/statistics/Makefile | 2 +-
src/backend/statistics/README.histogram | 299 +++
src/backend/statistics/dependencies.c | 2 +-
src/backend/statistics/extended_stats.c | 374 ++-
src/backend/statistics/histogram.c | 2679 ++++++++++++++++++++++
src/backend/statistics/mcv.c | 349 +--
src/backend/utils/adt/ruleutils.c | 10 +
src/backend/utils/adt/selfuncs.c | 2 +-
src/bin/psql/describe.c | 9 +-
src/include/catalog/pg_cast.h | 3 +
src/include/catalog/pg_proc.h | 12 +
src/include/catalog/pg_statistic_ext.h | 5 +-
src/include/catalog/pg_type.h | 4 +
src/include/nodes/relation.h | 7 +-
src/include/statistics/extended_stats_internal.h | 31 +-
src/include/statistics/statistics.h | 97 +-
src/test/regress/expected/opr_sanity.out | 3 +-
src/test/regress/expected/stats_ext.out | 192 +-
src/test/regress/expected/type_sanity.out | 3 +-
src/test/regress/sql/stats_ext.sql | 110 +
27 files changed, 4108 insertions(+), 331 deletions(-)
create mode 100644 src/backend/statistics/README.histogram
create mode 100644 src/backend/statistics/histogram.c
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index e07fe46..3a86577 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -6478,6 +6478,15 @@ SCRAM-SHA-256$<replaceable><iteration count></>:<replaceable><salt><
</entry>
</row>
+ <row>
+ <entry><structfield>stxhistogram</structfield></entry>
+ <entry><type>pg_histogram</type></entry>
+ <entry></entry>
+ <entry>
+ Histogram, serialized as <structname>pg_histogram</> type.
+ </entry>
+ </row>
+
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/planstats.sgml b/doc/src/sgml/planstats.sgml
index 1e81d94..8857fc7 100644
--- a/doc/src/sgml/planstats.sgml
+++ b/doc/src/sgml/planstats.sgml
@@ -724,6 +724,111 @@ EXPLAIN ANALYZE SELECT * FROM t WHERE a <= 49 AND b > 49;
</sect2>
+ <sect2 id="mv-histograms">
+ <title>Histograms</title>
+
+ <para>
+ <acronym>MCV</> lists, introduced in the previous section, work very well
+ for low-cardinality columns (i.e. columns with only very few distinct
+ values), and for columns with a few very frequent values (and possibly
+ many rare ones). Histograms, a generalization of per-column histograms
+ briefly described in <xref linkend="row-estimation-examples">, are meant
+ to address the other cases, i.e. high-cardinality columns, particularly
+ when there are no frequent values.
+ </para>
+
+ <para>
+ Although the example data we've used so far is not a very good match, we
+ can try creating a histogram instead of the <acronym>MCV</> list. With the
+ histogram in place, you may get a plan like this:
+
+<programlisting>
+CREATE STATISTICS stts3 (histogram) ON a, b FROM t;
+ANALYZE t;
+EXPLAIN ANALYZE SELECT * FROM t WHERE a = 1 AND b = 1;
+ QUERY PLAN
+-------------------------------------------------------------------------------------------------
+ Seq Scan on t (cost=0.00..195.00 rows=100 width=8) (actual time=0.035..2.967 rows=100 loops=1)
+ Filter: ((a = 1) AND (b = 1))
+ Rows Removed by Filter: 9900
+ Planning time: 0.227 ms
+ Execution time: 3.189 ms
+(5 rows)
+</programlisting>
+
+ This seems quite accurate. However, for other combinations of values the
+ results may be much worse, as illustrated by the following query:
+
+<programlisting>
+EXPLAIN ANALYZE SELECT * FROM t WHERE a = 1 AND b = 10;
+ QUERY PLAN
+-----------------------------------------------------------------------------------------------
+ Seq Scan on t (cost=0.00..195.00 rows=100 width=8) (actual time=2.771..2.771 rows=0 loops=1)
+ Filter: ((a = 1) AND (b = 10))
+ Rows Removed by Filter: 10000
+ Planning time: 0.179 ms
+ Execution time: 2.812 ms
+(5 rows)
+</programlisting>
+
+ This is due to histograms tracking ranges of values, not individual values.
+ That means it's only possible to say whether a bucket may contain items
+ matching the conditions, but it's unclear how many such tuples there
+ actually are in the bucket. Moreover, for larger tables only a small subset
+ of rows gets sampled by <command>ANALYZE</>, causing small variations in
+ the shape of buckets.
+ </para>
+
+ <para>
+ Similarly to <acronym>MCV</> lists, we can inspect histogram contents
+ using a function called <function>pg_histogram_buckets</>.
+
+<programlisting>
+test=# SELECT * FROM pg_histogram_buckets((SELECT oid FROM pg_statistic_ext WHERE stxname = 'stts3'), 0);
+ index | minvals | maxvals | nullsonly | mininclusive | maxinclusive | frequency | density | bucket_volume
+-------+---------+---------+-----------+--------------+--------------+-----------+----------+---------------
+ 0 | {0,0} | {3,1} | {f,f} | {t,t} | {f,f} | 0.01 | 1.68 | 0.005952
+ 1 | {50,0} | {51,3} | {f,f} | {t,t} | {f,f} | 0.01 | 1.12 | 0.008929
+ 2 | {0,25} | {26,31} | {f,f} | {t,t} | {f,f} | 0.01 | 0.28 | 0.035714
+...
+ 61 | {60,0} | {99,12} | {f,f} | {t,t} | {t,f} | 0.02 | 0.124444 | 0.160714
+ 62 | {34,35} | {37,49} | {f,f} | {t,t} | {t,t} | 0.02 | 0.96 | 0.020833
+ 63 | {84,35} | {87,49} | {f,f} | {t,t} | {t,t} | 0.02 | 0.96 | 0.020833
+(64 rows)
+</programlisting>
+
+ This confirms there are 64 buckets, with frequencies ranging between 1%
+ and 2%. The <structfield>minvals</> and <structfield>maxvals</> columns
+ show the bucket boundaries, while <structfield>nullsonly</> shows which
+ columns contain only null values (in the given bucket).
+ </para>
+
+ <para>
+ Similarly to <acronym>MCV</> lists, the planner applies all conditions to
+ the buckets, and sums the frequencies of the matching ones. For details,
+ see <function>clauselist_mv_selectivity_histogram</> function in
+ <filename>clausesel.c</>.
+ </para>
+
+ <para>
+ It's also possible to build both an <acronym>MCV</> list and a histogram,
+ in which case <command>ANALYZE</> builds an <acronym>MCV</> list of the
+ most frequent values, and a histogram on the remaining part of the sample.
+
+<programlisting>
+CREATE STATISTICS stts4 (mcv, histogram) ON a, b FROM t;
+</programlisting>
+
+ In this case the <acronym>MCV</> list and the histogram are treated as a
+ single combined statistics object.
+ </para>
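+
+ <para>
+ After <command>ANALYZE</>, it is easy to verify which parts were actually
+ built by inspecting the serialized values directly. A simple sanity check
+ (using the statistics object created above) might look like this:
+
+<programlisting>
+SELECT stxname,
+       stxmcv IS NOT NULL AS has_mcv,
+       stxhistogram IS NOT NULL AS has_histogram
+  FROM pg_statistic_ext WHERE stxname = 'stts4';
+</programlisting>
+ </para>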
+
+ <para>
+ For additional information about multivariate histograms, see
+ <filename>src/backend/statistics/README.histogram</>.
+ </para>
+
+ </sect2>
+
</sect1>
<sect1 id="planner-stats-security">
diff --git a/doc/src/sgml/ref/create_statistics.sgml b/doc/src/sgml/ref/create_statistics.sgml
index 52851da..2968481 100644
--- a/doc/src/sgml/ref/create_statistics.sgml
+++ b/doc/src/sgml/ref/create_statistics.sgml
@@ -83,8 +83,9 @@ CREATE STATISTICS [ IF NOT EXISTS ] <replaceable class="PARAMETER">statistics_na
Currently supported types are
<literal>ndistinct</literal>, which enables n-distinct statistics,
<literal>dependencies</literal>, which enables functional dependency
- statistics, and <literal>mcv</literal> which enables most-common
- values lists.
+ statistics, <literal>mcv</literal> which enables most-common
+ values lists, and <literal>histogram</literal> which enables
+ histograms.
If this clause is omitted, all supported statistic types are
included in the statistics object.
For more information, see <xref linkend="planner-stats-extended">
@@ -190,6 +191,32 @@ EXPLAIN ANALYZE SELECT * FROM t2 WHERE (a = 1) AND (b = 2);
</programlisting>
</para>
+ <para>
+ Create table <structname>t3</> with two strongly correlated columns, and
+ a histogram on those two columns:
+
+<programlisting>
+CREATE TABLE t3 (
+ a float,
+ b float
+);
+
+INSERT INTO t3 SELECT mod(i,1000), mod(i,1000) + 50 * (r - 0.5) FROM (
+ SELECT i, random() r FROM generate_series(1,1000000) s(i)
+ ) foo;
+
+CREATE STATISTICS s3 (histogram) ON a, b FROM t3;
+
+ANALYZE t3;
+
+-- small overlap
+EXPLAIN ANALYZE SELECT * FROM t3 WHERE (a < 500) AND (b > 500);
+
+-- no overlap
+EXPLAIN ANALYZE SELECT * FROM t3 WHERE (a < 400) AND (b > 600);
+</programlisting>
+ </para>
+
</refsect1>
<refsect1>
diff --git a/src/backend/commands/statscmds.c b/src/backend/commands/statscmds.c
index 0bcea4b..3f092a3 100644
--- a/src/backend/commands/statscmds.c
+++ b/src/backend/commands/statscmds.c
@@ -64,12 +64,13 @@ CreateStatistics(CreateStatsStmt *stmt)
Oid relid;
ObjectAddress parentobject,
myself;
- Datum types[3]; /* one for each possible type of statistic */
+ Datum types[4]; /* one for each possible type of statistic */
int ntypes;
ArrayType *stxkind;
bool build_ndistinct;
bool build_dependencies;
bool build_mcv;
+ bool build_histogram;
bool requested_type = false;
int i;
ListCell *cell;
@@ -248,6 +249,7 @@ CreateStatistics(CreateStatsStmt *stmt)
build_ndistinct = false;
build_dependencies = false;
build_mcv = false;
+ build_histogram = false;
foreach(cell, stmt->stat_types)
{
char *type = strVal((Value *) lfirst(cell));
@@ -267,6 +269,11 @@ CreateStatistics(CreateStatsStmt *stmt)
build_mcv = true;
requested_type = true;
}
+ else if (strcmp(type, "histogram") == 0)
+ {
+ build_histogram = true;
+ requested_type = true;
+ }
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
@@ -279,6 +286,7 @@ CreateStatistics(CreateStatsStmt *stmt)
build_ndistinct = true;
build_dependencies = true;
build_mcv = true;
+ build_histogram = true;
}
/* construct the char array of enabled statistic types */
@@ -289,6 +297,8 @@ CreateStatistics(CreateStatsStmt *stmt)
types[ntypes++] = CharGetDatum(STATS_EXT_DEPENDENCIES);
if (build_mcv)
types[ntypes++] = CharGetDatum(STATS_EXT_MCV);
+ if (build_histogram)
+ types[ntypes++] = CharGetDatum(STATS_EXT_HISTOGRAM);
Assert(ntypes > 0 && ntypes <= lengthof(types));
stxkind = construct_array(types, ntypes, CHAROID, 1, true, 'c');
@@ -308,6 +318,7 @@ CreateStatistics(CreateStatsStmt *stmt)
nulls[Anum_pg_statistic_ext_stxndistinct - 1] = true;
nulls[Anum_pg_statistic_ext_stxdependencies - 1] = true;
nulls[Anum_pg_statistic_ext_stxmcv - 1] = true;
+ nulls[Anum_pg_statistic_ext_stxhistogram - 1] = true;
/* insert it into pg_statistic_ext */
statrel = heap_open(StatisticExtRelationId, RowExclusiveLock);
@@ -407,8 +418,9 @@ RemoveStatisticsById(Oid statsOid)
* values, this assumption could fail. But that seems like a corner case
* that doesn't justify zapping the stats in common cases.)
*
- * For MCV lists that's not the case, as those statistics store the datums
- * internally. In this case we simply reset the statistics value to NULL.
+ * For MCV lists and histograms that's not the case, as those statistics
+ * store the datums internally. In those cases we simply reset those
+ * statistics to NULL.
*/
void
UpdateStatisticsForTypeChange(Oid statsOid, Oid relationOid, int attnum,
@@ -445,9 +457,10 @@ UpdateStatisticsForTypeChange(Oid statsOid, Oid relationOid, int attnum,
/*
* We can also leave the record as it is if there are no statistics
- * including the datum values, like for example MCV lists.
+ * that include the datum values, such as MCV lists and histograms.
*/
- if (statext_is_kind_built(oldtup, STATS_EXT_MCV))
+ if (statext_is_kind_built(oldtup, STATS_EXT_MCV) ||
+ statext_is_kind_built(oldtup, STATS_EXT_HISTOGRAM))
reset_stats = true;
/*
@@ -468,11 +481,11 @@ UpdateStatisticsForTypeChange(Oid statsOid, Oid relationOid, int attnum,
memset(replaces, 0, Natts_pg_statistic_ext * sizeof(bool));
memset(values, 0, Natts_pg_statistic_ext * sizeof(Datum));
- if (statext_is_kind_built(oldtup, STATS_EXT_MCV))
- {
- replaces[Anum_pg_statistic_ext_stxmcv - 1] = true;
- nulls[Anum_pg_statistic_ext_stxmcv - 1] = true;
- }
+ replaces[Anum_pg_statistic_ext_stxmcv - 1] = true;
+ replaces[Anum_pg_statistic_ext_stxhistogram - 1] = true;
+
+ nulls[Anum_pg_statistic_ext_stxmcv - 1] = true;
+ nulls[Anum_pg_statistic_ext_stxhistogram - 1] = true;
rel = heap_open(StatisticExtRelationId, RowExclusiveLock);
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 379d92a..fe98fea 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -2351,7 +2351,7 @@ _outStatisticExtInfo(StringInfo str, const StatisticExtInfo *node)
/* NB: this isn't a complete set of fields */
WRITE_OID_FIELD(statOid);
/* don't write rel, leads to infinite recursion in plan tree dump */
- WRITE_CHAR_FIELD(kind);
+ WRITE_INT_FIELD(kinds);
WRITE_BITMAPSET_FIELD(keys);
}
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index 28a9321..2260b99 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -125,14 +125,17 @@ clauselist_selectivity(PlannerInfo *root,
if (rel && rel->rtekind == RTE_RELATION && rel->statlist != NIL)
{
/*
- * Perform selectivity estimations on any clauses applicable by
- * mcv_clauselist_selectivity. 'estimatedclauses' will be filled with
- * the 0-based list positions of clauses used that way, so that we can
- * ignore them below.
+ * Estimate the selectivity of any clauses applicable using MCV lists and
+ * histograms first, and then using functional dependencies. This order is
+ * chosen because MCV lists and histograms contain actual attribute values,
+ * and may therefore be considered more reliable.
+ *
+ * 'estimatedclauses' will be filled with the 0-based list positions of
+ * clauses used that way, so that we can ignore them below.
*/
- s1 *= mcv_clauselist_selectivity(root, clauses, varRelid,
- jointype, sjinfo, rel,
- &estimatedclauses);
+ s1 *= statext_clauselist_selectivity(root, clauses, varRelid,
+ jointype, sjinfo, rel,
+ &estimatedclauses);
/*
* Perform selectivity estimations on any clauses found applicable by
@@ -143,11 +146,6 @@ clauselist_selectivity(PlannerInfo *root,
s1 *= dependencies_clauselist_selectivity(root, clauses, varRelid,
jointype, sjinfo, rel,
&estimatedclauses);
-
- /*
- * This would be the place to apply any other types of extended
- * statistics selectivity estimations for remaining clauses.
- */
}
/*
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index ab2c8c2..be5e6ab 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -1282,6 +1282,9 @@ get_relation_statistics(RelOptInfo *rel, Relation relation)
HeapTuple htup;
Bitmapset *keys = NULL;
int i;
+ int kind = 0;
+
+ StatisticExtInfo *info = makeNode(StatisticExtInfo);
htup = SearchSysCache1(STATEXTOID, ObjectIdGetDatum(statOid));
if (!htup)
@@ -1296,42 +1299,25 @@ get_relation_statistics(RelOptInfo *rel, Relation relation)
for (i = 0; i < staForm->stxkeys.dim1; i++)
keys = bms_add_member(keys, staForm->stxkeys.values[i]);
- /* add one StatisticExtInfo for each kind built */
+ /* now build the bitmask of statistics kinds */
if (statext_is_kind_built(htup, STATS_EXT_NDISTINCT))
- {
- StatisticExtInfo *info = makeNode(StatisticExtInfo);
-
- info->statOid = statOid;
- info->rel = rel;
- info->kind = STATS_EXT_NDISTINCT;
- info->keys = bms_copy(keys);
-
- stainfos = lcons(info, stainfos);
- }
+ kind |= STATS_EXT_INFO_NDISTINCT;
if (statext_is_kind_built(htup, STATS_EXT_DEPENDENCIES))
- {
- StatisticExtInfo *info = makeNode(StatisticExtInfo);
-
- info->statOid = statOid;
- info->rel = rel;
- info->kind = STATS_EXT_DEPENDENCIES;
- info->keys = bms_copy(keys);
-
- stainfos = lcons(info, stainfos);
- }
+ kind |= STATS_EXT_INFO_DEPENDENCIES;
if (statext_is_kind_built(htup, STATS_EXT_MCV))
- {
- StatisticExtInfo *info = makeNode(StatisticExtInfo);
+ kind |= STATS_EXT_INFO_MCV;
- info->statOid = statOid;
- info->rel = rel;
- info->kind = STATS_EXT_MCV;
- info->keys = bms_copy(keys);
+ if (statext_is_kind_built(htup, STATS_EXT_HISTOGRAM))
+ kind |= STATS_EXT_INFO_HISTOGRAM;
- stainfos = lcons(info, stainfos);
- }
+ info->statOid = statOid;
+ info->rel = rel;
+ info->kinds = kind;
+ info->keys = bms_copy(keys);
+
+ stainfos = lcons(info, stainfos);
ReleaseSysCache(htup);
bms_free(keys);
diff --git a/src/backend/statistics/Makefile b/src/backend/statistics/Makefile
index d281526..3e5ad45 100644
--- a/src/backend/statistics/Makefile
+++ b/src/backend/statistics/Makefile
@@ -12,6 +12,6 @@ subdir = src/backend/statistics
top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
-OBJS = extended_stats.o dependencies.o mcv.o mvdistinct.o
+OBJS = extended_stats.o dependencies.o histogram.o mcv.o mvdistinct.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/statistics/README.histogram b/src/backend/statistics/README.histogram
new file mode 100644
index 0000000..a4c7e3d
--- /dev/null
+++ b/src/backend/statistics/README.histogram
@@ -0,0 +1,299 @@
+Multivariate histograms
+=======================
+
+Histograms on individual attributes consist of buckets represented by ranges,
+covering the domain of the attribute. That is, each bucket is a [min,max]
+interval, and contains all values in this range. The histogram is built in such
+a way that all buckets have about the same frequency.
+
+Multivariate histograms are an extension into n-dimensional space - the buckets
+are n-dimensional intervals (i.e. n-dimensional rectangles), covering the domain
+of the combination of attributes. That is, each bucket has a vector of lower
+and upper boundaries, denoted min[i] and max[i] (where i = 1..n).
+
+In addition to the boundaries, each bucket tracks additional info:
+
+ * frequency (fraction of tuples in the bucket)
+ * whether the boundaries are inclusive or exclusive
+ * whether the dimension contains only NULL values
+ * number of distinct values in each dimension (for building only)
+
+It's possible that in the future we'll have multiple histogram types, with
+different features. We do however expect all the types to share the same
+representation (buckets as ranges) and to differ only in how we build them.
+
+The current implementation builds non-overlapping buckets, but that may not be
+true for other histogram types, so the code should not rely on this assumption.
+There are interesting types of histograms (or algorithms) with overlapping buckets.
+
+When used on low-cardinality data, histograms usually perform considerably worse
+than MCV lists (which are a good fit for this kind of data). This is especially
+true on label-like values, where the ordering of the values is mostly unrelated
+to the meaning of the data, as proper ordering is crucial for histograms.
+
+On high-cardinality data the histograms are usually a better choice, because MCV
+lists can't represent the distribution accurately enough.
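+
+As a rough illustration (the object and column names here are made up), the
+choice between the two might look like this:
+
+    -- label-like, low-cardinality columns
+    CREATE STATISTICS stts_mcv (mcv) ON a, b FROM t;
+
+    -- high-cardinality columns with no frequent values
+    CREATE STATISTICS stts_hist (histogram) ON c, d FROM t;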
+
+
+Selectivity estimation
+----------------------
+
+The estimation is implemented in clauselist_mv_selectivity_histogram(), and
+works very similarly to clauselist_mv_selectivity_mcvlist().
+
+The main difference is that while MCV lists support exact matches, histograms
+often result in approximate matches - e.g. with equality we can only say if
+the constant would be part of the bucket, but not whether it really is there
+or what fraction of the bucket it corresponds to. In this case we rely on
+some defaults just like in the per-column histograms.
+
+The current implementation uses histograms to estimate these types of clauses
+(think of WHERE conditions):
+
+ (a) equality clauses WHERE (a = 1) AND (b = 2)
+ (b) inequality clauses WHERE (a < 1) AND (b >= 2)
+ (c) NULL clauses WHERE (a IS NULL) AND (b IS NOT NULL)
+ (d) OR-clauses WHERE (a = 1) OR (b = 2)
+
+Similarly to MCV lists, it's possible to add support for additional types of
+clauses, for example:
+
+ (e) multi-var clauses WHERE (a > b)
+
+and so on. These are tasks for the future, not yet implemented.
+
+
+When evaluating a clause on a bucket, we may get one of three results:
+
+ (a) FULL_MATCH - The bucket definitely matches the clause.
+
+ (b) PARTIAL_MATCH - The bucket matches the clause, but not necessarily all
+ the tuples it represents.
+
+ (c) NO_MATCH - The bucket definitely does not match the clause.
+
+This may be illustrated using a range [1, 5], which is essentially a 1-D bucket.
+With clause
+
+ WHERE (a < 10) => FULL_MATCH (all range values are below
+ 10, so the whole bucket matches)
+
+ WHERE (a < 3) => PARTIAL_MATCH (there may be values matching
+ the clause, but we don't know how many)
+
+ WHERE (a < 0) => NO_MATCH (the whole range is above 0, so
+ no values from the bucket can match)
+
+Some clauses may produce only some of those results - for example equality
+clauses may never produce FULL_MATCH as we always hit only part of the bucket
+(we can't match both boundaries at the same time). This results in less accurate
+estimates compared to MCV lists, where we can hit an MCV item exactly (there's
+no PARTIAL match in MCV).
+
+There are also clauses that may not produce any PARTIAL_MATCH results. A nice
+example of that is the 'IS [NOT] NULL' clause, which either matches the bucket
+completely (FULL_MATCH) or not at all (NO_MATCH), thanks to how the NULL-buckets
+are constructed.
+
+Computing the total selectivity estimate is trivial - simply sum selectivities
+from all the FULL_MATCH and PARTIAL_MATCH buckets (but for buckets marked with
+PARTIAL_MATCH, multiply the frequency by 0.5 to minimize the average error).
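+
+As a worked example of this rule: if the clauses fully match two buckets with
+frequencies 0.02 and 0.03, and partially match one bucket with frequency 0.04,
+the resulting estimate is
+
+    0.02 + 0.03 + 0.5 * 0.04 = 0.07
+
+i.e. roughly 7% of the table.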
+
+
+Building a histogram
+---------------------
+
+The algorithm of building a histogram in general is quite simple:
+
+ (a) create an initial bucket (containing all sample rows)
+
+ (b) create NULL buckets (by splitting the initial bucket)
+
+ (c) repeat
+
+ (1) choose bucket to split next
+
+ (2) terminate if no bucket that might be split is found, or if we've
+ reached the maximum number of buckets (16384)
+
+ (3) choose dimension to partition the bucket by
+
+ (4) partition the bucket by the selected dimension
+
+The main complexity is hidden in steps (c.1) and (c.3), i.e. how we choose the
+bucket and dimension for the split, as discussed in the next section.
+
+
+Partitioning criteria
+---------------------
+
+Similarly to one-dimensional histograms, we want to produce buckets with roughly
+the same frequency.
+
+We also need to produce "regular" buckets, because buckets with one dimension
+much longer than the others are very likely to match a lot of conditions (which
+increases error, even if the bucket frequency is very low).
+
+This is especially important when handling OR-clauses, because in that case each
+clause may add buckets independently. With AND-clauses all the clauses have to
+match each bucket, which makes this issue somewhat less concerning.
+
+To achieve this, we choose the largest bucket (containing the most sample rows),
+but we only choose buckets that can actually be split (have at least 3 different
+combinations of values).
+
+Then we choose the "longest" dimension of the bucket, which is computed by using
+the distinct values in the sample as a measure.
+
+For details see functions select_bucket_to_partition() and partition_bucket(),
+which also includes further discussion.
+
+
+The current limit on number of buckets (16384) is mostly arbitrary, but chosen
+so that it guarantees we don't exceed the number of distinct values indexable by
+uint16 in any of the dimensions. In practice we could handle more buckets as we
+index each dimension separately and the splits should use the dimensions evenly.
+
+Also, histograms this large (with 16k values in multiple dimensions) would be
+quite expensive to build and process, so the 16k limit is rather reasonable.
+
+The actual number of buckets is also related to statistics target, because we
+require MIN_BUCKET_ROWS (10) tuples per bucket before a split, so we can't have
+more than (2 * 300 * target / 10) buckets. For the default target (100) this
+evaluates to ~6k.
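+
+For example, with statistics target 1000 the same formula gives
+
+    2 * 300 * 1000 / 10 = 60000
+
+buckets, which is then capped by the 16384 (STATS_HIST_MAX_BUCKETS) limit.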
+
+
+NULL handling (create_null_buckets)
+-----------------------------------
+
+When building histograms on a single attribute, we first filter out NULL values.
+In the multivariate case, we can't really do that because the rows may contain
+a mix of NULL and non-NULL values in different columns (so we can't simply
+filter all of them out).
+
+For this reason, the histograms are built so that in each bucket, each dimension
+contains either only NULL or only non-NULL values. Building the NULL-buckets
+happens as the first step in the build, by the create_null_buckets() function.
+The number of NULL buckets, as produced by this function, has a clear upper
+boundary (2^N) where N is the number of dimensions (attributes the histogram is
+built on). Or rather 2^K where K is the number of attributes that are not marked
+as not-NULL.
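+
+For example, with two nullable columns (a,b) the initial bucket may be split
+into up to 2^2 = 4 NULL-buckets:
+
+    (a non-NULL, b non-NULL)
+    (a non-NULL, b NULL-only)
+    (a NULL-only, b non-NULL)
+    (a NULL-only, b NULL-only)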
+
+The buckets with NULL dimensions are then subject to the same build algorithm
+(i.e. may be split into smaller buckets) just like any other bucket, but may
+only be split by non-NULL dimension.
+
+
+Serialization
+-------------
+
+To store the histogram in the pg_statistic_ext catalog, it is serialized into a
+more efficient form. We also use this representation during estimation, i.e. we
+don't fully deserialize the histogram.
+
+For example the boundary values are deduplicated to minimize the required space.
+How much redundancy is there, actually? Let's assume there are no NULL values,
+so we start with a single bucket - in that case we have 2*N boundaries. Each
+time we split a bucket we introduce one new value (in the "middle" of one of
+the dimensions), and keep boundaries for all the other dimensions. So after K
+splits, we have up to
+
+ 2*N + K
+
+unique boundary values (we may have fewer values, if the same value is used for
+several splits). But after K splits we do have (K+1) buckets, so
+
+ (K+1) * 2 * N
+
+boundary values. Using e.g. N=4 and K=999, we arrive at these numbers:
+
+ 2*N + K = 1007
+ (K+1) * 2 * N = 8000
+
+which means a lot of redundancy. It's somewhat counter-intuitive that the number
+of distinct values does not really depend on the number of dimensions (except
+for the initial bucket, but that's negligible compared to the total).
+
+By deduplicating the values and replacing them with 16-bit indexes (uint16), we
+reduce the required space to
+
+ 1007 * 8 + 8000 * 2 ~= 24kB
+
+which is significantly less than 64kB required for the 'raw' histogram (assuming
+the values are 8B).
+
+While the bytea compression (pglz) might achieve the same reduction of space,
+the deduplicated representation is used to optimize the estimation by caching
+results of function calls for already visited values. This significantly
+reduces the number of calls to (often quite expensive) operators.
+
+Note: Of course, this reasoning only holds for histograms built by the algorithm
+that simply splits the buckets in half. Other histograms types (e.g. containing
+overlapping buckets) may behave differently and require different serialization.
+
+Serialized histograms are marked with 'magic' constant, to make it easier to
+check the bytea value really is a serialized histogram.
+
+
+varlena compression
+-------------------
+
+This serialization may however defeat the automatic varlena compression: the
+array of unique values is placed at the beginning of the serialized form, which
+is exactly the chunk used by pglz to check if the data is compressible, and it
+will probably decide it's not very compressible. This is similar to the issue
+we had with JSONB initially.
+
+Maybe storing buckets first would make it work, as the buckets may be better
+compressible.
+
+On the other hand the serialization is actually a context-aware compression,
+usually compressing to ~30% (or even less, with large data types). So the lack
+of additional pglz compression may be acceptable.
+
+
+Deserialization
+---------------
+
+The deserialization is not a perfect inverse of the serialization, as we keep
+the deduplicated arrays. This reduces the amount of memory and also allows
+optimizations during estimation (e.g. we can cache results for the distinct
+values, saving expensive function calls).
+
+
+Inspecting the histogram
+------------------------
+
+Inspecting the regular (per-attribute) histograms is trivial, as it's enough
+to select the columns from pg_stats - the data is encoded as anyarray, so we
+simply get the text representation of the array.
+
+With multivariate histograms it's not that simple due to the possible mix of
+data types in the histogram. It might be possible to produce similar array-like
+text representation, but that'd unnecessarily complicate further processing
+and analysis of the histogram. Instead, there's a set-returning function that
+allows access to lower/upper boundaries, frequencies etc.
+
+ SELECT * FROM pg_histogram_buckets();
+
+It has two input parameters:
+
+ oid - OID of the statistics object (the pg_statistic_ext row)
+ otype - type of output
+
+and produces a table with these columns:
+
+ - bucket ID (0...nbuckets-1)
+ - lower bucket boundaries (string array)
+ - upper bucket boundaries (string array)
+ - nulls only dimensions (boolean array)
+ - lower boundary inclusive (boolean array)
+ - upper boundary inclusive (boolean array)
+ - frequency (double precision)
+
+The 'otype' accepts three values, determining what will be returned in the
+lower/upper boundary arrays:
+
+ - 0 - values stored in the histogram, encoded as text
+ - 1 - indexes into the deduplicated arrays
+ - 2 - indexes into the deduplicated arrays, scaled to [0,1]
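+
+A minimal usage sketch (assuming a statistics object named 'stts3' whose
+histogram has already been built by ANALYZE):
+
+    SELECT index, minvals, maxvals, frequency
+      FROM pg_histogram_buckets(
+             (SELECT oid FROM pg_statistic_ext WHERE stxname = 'stts3'), 0);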
diff --git a/src/backend/statistics/dependencies.c b/src/backend/statistics/dependencies.c
index 27e096f..a306cc0 100644
--- a/src/backend/statistics/dependencies.c
+++ b/src/backend/statistics/dependencies.c
@@ -904,7 +904,7 @@ dependencies_clauselist_selectivity(PlannerInfo *root,
int listidx;
/* check if there's any stats that might be useful for us. */
- if (!has_stats_of_kind(rel->statlist, STATS_EXT_DEPENDENCIES))
+ if (!has_stats_of_kind(rel->statlist, STATS_EXT_INFO_DEPENDENCIES))
return 1.0;
list_attnums = (AttrNumber *) palloc(sizeof(AttrNumber) *
diff --git a/src/backend/statistics/extended_stats.c b/src/backend/statistics/extended_stats.c
index ee64214..4dcfa02 100644
--- a/src/backend/statistics/extended_stats.c
+++ b/src/backend/statistics/extended_stats.c
@@ -23,6 +23,7 @@
#include "catalog/pg_collation.h"
#include "catalog/pg_statistic_ext.h"
#include "nodes/relation.h"
+#include "optimizer/clauses.h"
#include "postmaster/autovacuum.h"
#include "statistics/extended_stats_internal.h"
#include "statistics/statistics.h"
@@ -33,7 +34,6 @@
#include "utils/rel.h"
#include "utils/syscache.h"
-
/*
* Used internally to refer to an individual statistics object, i.e.,
* a pg_statistic_ext entry.
@@ -53,7 +53,7 @@ static VacAttrStats **lookup_var_attr_stats(Relation rel, Bitmapset *attrs,
int nvacatts, VacAttrStats **vacatts);
static void statext_store(Relation pg_stext, Oid relid,
MVNDistinct *ndistinct, MVDependencies *dependencies,
- MCVList *mcvlist, VacAttrStats **stats);
+ MCVList *mcvlist, MVHistogram *histogram, VacAttrStats **stats);
/*
@@ -86,10 +86,14 @@ BuildRelationExtStatistics(Relation onerel, double totalrows,
StatExtEntry *stat = (StatExtEntry *) lfirst(lc);
MVNDistinct *ndistinct = NULL;
MVDependencies *dependencies = NULL;
+ MVHistogram *histogram = NULL;
MCVList *mcv = NULL;
VacAttrStats **stats;
ListCell *lc2;
+ bool build_mcv = false;
+ bool build_histogram = false;
+
/*
* Check if we can build these stats based on the column analyzed. If
* not, report this fact (except in autovacuum) and move on.
@@ -124,11 +128,45 @@ BuildRelationExtStatistics(Relation onerel, double totalrows,
dependencies = statext_dependencies_build(numrows, rows,
stat->columns, stats);
else if (t == STATS_EXT_MCV)
- mcv = statext_mcv_build(numrows, rows, stat->columns, stats);
+ build_mcv = true;
+ else if (t == STATS_EXT_HISTOGRAM)
+ build_histogram = true;
}
+ /*
+ * If asked to build both MCV and histogram, first build the MCV part
+ * and then histogram on the remaining rows.
+ */
+ if (build_mcv && build_histogram)
+ {
+ HeapTuple *rows_filtered = NULL;
+ int numrows_filtered;
+
+ mcv = statext_mcv_build(numrows, rows, stat->columns, stats,
+ &rows_filtered, &numrows_filtered);
+
+ /* Only build the histogram when there are rows not covered by MCV. */
+ if (rows_filtered)
+ {
+ Assert(numrows_filtered > 0);
+
+ histogram = statext_histogram_build(numrows_filtered, rows_filtered,
+ stat->columns, stats, numrows);
+
+ /* free this immediately, as we may be building many stats */
+ pfree(rows_filtered);
+ }
+ }
+ else if (build_mcv)
+ mcv = statext_mcv_build(numrows, rows, stat->columns, stats,
+ NULL, NULL);
+ else if (build_histogram)
+ histogram = statext_histogram_build(numrows, rows, stat->columns,
+ stats, numrows);
+
/* store the statistics in the catalog */
- statext_store(pg_stext, stat->statOid, ndistinct, dependencies, mcv, stats);
+ statext_store(pg_stext, stat->statOid, ndistinct, dependencies, mcv,
+ histogram, stats);
}
heap_close(pg_stext, RowExclusiveLock);
@@ -160,6 +198,10 @@ statext_is_kind_built(HeapTuple htup, char type)
attnum = Anum_pg_statistic_ext_stxmcv;
break;
+ case STATS_EXT_HISTOGRAM:
+ attnum = Anum_pg_statistic_ext_stxhistogram;
+ break;
+
default:
elog(ERROR, "unexpected statistics type requested: %d", type);
}
@@ -225,7 +267,8 @@ fetch_statentries_for_relation(Relation pg_statext, Oid relid)
{
Assert((enabled[i] == STATS_EXT_NDISTINCT) ||
(enabled[i] == STATS_EXT_DEPENDENCIES) ||
- (enabled[i] == STATS_EXT_MCV));
+ (enabled[i] == STATS_EXT_MCV) ||
+ (enabled[i] == STATS_EXT_HISTOGRAM));
entry->types = lappend_int(entry->types, (int) enabled[i]);
}
@@ -346,7 +389,7 @@ find_ext_attnums(Oid mvoid, Oid *relid)
static void
statext_store(Relation pg_stext, Oid statOid,
MVNDistinct *ndistinct, MVDependencies *dependencies,
- MCVList *mcv, VacAttrStats **stats)
+ MCVList *mcv, MVHistogram *histogram, VacAttrStats **stats)
{
HeapTuple stup,
oldtup;
@@ -385,10 +428,19 @@ statext_store(Relation pg_stext, Oid statOid,
values[Anum_pg_statistic_ext_stxmcv - 1] = PointerGetDatum(data);
}
+ if (histogram != NULL)
+ {
+ bytea *data = statext_histogram_serialize(histogram, stats);
+
+ nulls[Anum_pg_statistic_ext_stxhistogram - 1] = (data == NULL);
+ values[Anum_pg_statistic_ext_stxhistogram - 1] = PointerGetDatum(data);
+ }
+
/* always replace the value (either by bytea or NULL) */
replaces[Anum_pg_statistic_ext_stxndistinct - 1] = true;
replaces[Anum_pg_statistic_ext_stxdependencies - 1] = true;
replaces[Anum_pg_statistic_ext_stxmcv - 1] = true;
+ replaces[Anum_pg_statistic_ext_stxhistogram - 1] = true;
/* there should already be a pg_statistic_ext tuple */
oldtup = SearchSysCache1(STATEXTOID, ObjectIdGetDatum(statOid));
@@ -503,6 +555,19 @@ compare_scalars_simple(const void *a, const void *b, void *arg)
(SortSupport) arg);
}
+/*
+ * qsort_arg comparator for sorting data when partitioning a MV bucket
+ */
+int
+compare_scalars_partition(const void *a, const void *b, void *arg)
+{
+ Datum da = ((ScalarItem *) a)->value;
+ Datum db = ((ScalarItem *) b)->value;
+ SortSupport ssup = (SortSupport) arg;
+
+ return ApplySortComparator(da, false, db, false, ssup);
+}
+
int
compare_datums_simple(Datum a, Datum b, SortSupport ssup)
{
@@ -628,10 +693,11 @@ build_sorted_items(int numrows, HeapTuple *rows, TupleDesc tdesc,
/*
* has_stats_of_kind
- * Check whether the list contains statistic of a given kind
+ * Check whether the list contains statistic of a given kind (at least
+ * one of those specified statistics types).
*/
bool
-has_stats_of_kind(List *stats, char requiredkind)
+has_stats_of_kind(List *stats, int requiredkinds)
{
ListCell *l;
@@ -639,7 +705,7 @@ has_stats_of_kind(List *stats, char requiredkind)
{
StatisticExtInfo *stat = (StatisticExtInfo *) lfirst(l);
- if (stat->kind == requiredkind)
+ if (stat->kinds & requiredkinds)
return true;
}
@@ -661,7 +727,7 @@ has_stats_of_kind(List *stats, char requiredkind)
* further tiebreakers are needed.
*/
StatisticExtInfo *
-choose_best_statistics(List *stats, Bitmapset *attnums, char requiredkind)
+choose_best_statistics(List *stats, Bitmapset *attnums, int requiredkinds)
{
ListCell *lc;
StatisticExtInfo *best_match = NULL;
@@ -675,8 +741,8 @@ choose_best_statistics(List *stats, Bitmapset *attnums, char requiredkind)
int numkeys;
Bitmapset *matched;
- /* skip statistics that are not of the correct type */
- if (info->kind != requiredkind)
+ /* skip statistics that do not match any of the requested types */
+ if ((info->kinds & requiredkinds) == 0)
continue;
/* determine how many attributes of these stats can be matched to */
@@ -719,3 +785,287 @@ bms_member_index(Bitmapset *keys, AttrNumber varattno)
return j;
}
+
+/*
+ * statext_is_compatible_clause_internal
+ * Does the heavy lifting of actually inspecting the clauses for
+ * statext_is_compatible_clause.
+ */
+static bool
+statext_is_compatible_clause_internal(Node *clause, Index relid, Bitmapset **attnums)
+{
+ /* We only support plain Vars for now */
+ if (IsA(clause, Var))
+ {
+ Var *var = (Var *) clause;
+
+ /* Ensure var is from the correct relation */
+ if (var->varno != relid)
+ return false;
+
+ /* we also better ensure the Var is from the current level */
+ if (var->varlevelsup > 0)
+ return false;
+
+ /* Also skip system attributes (we don't allow stats on those). */
+ if (!AttrNumberIsForUserDefinedAttr(var->varattno))
+ return false;
+
+ *attnums = bms_add_member(*attnums, var->varattno);
+
+ return true;
+ }
+
+ /* Var = Const */
+ if (is_opclause(clause))
+ {
+ OpExpr *expr = (OpExpr *) clause;
+ Var *var;
+ bool varonleft = true;
+ bool ok;
+
+ /* Only expressions with two arguments are considered compatible. */
+ if (list_length(expr->args) != 2)
+ return false;
+
+ /* see if it actually has the right shape (one Var, one Const) */
+ ok = (NumRelids((Node *) expr) == 1) &&
+ (is_pseudo_constant_clause(lsecond(expr->args)) ||
+ (varonleft = false,
+ is_pseudo_constant_clause(linitial(expr->args))));
+
+ /* unsupported structure (two variables or so) */
+ if (!ok)
+ return false;
+
+ /*
+ * If it's not one of the supported operators ("=", "<", ">", etc.),
+ * just ignore the clause, as it's not compatible with MCV lists.
+ *
+ * This uses the function for estimating selectivity, not the operator
+ * directly (a bit awkward, but well ...).
+ */
+ if ((get_oprrest(expr->opno) != F_EQSEL) &&
+ (get_oprrest(expr->opno) != F_SCALARLTSEL) &&
+ (get_oprrest(expr->opno) != F_SCALARGTSEL))
+ return false;
+
+ var = (varonleft) ? linitial(expr->args) : lsecond(expr->args);
+
+ return statext_is_compatible_clause_internal((Node *)var, relid, attnums);
+ }
+
+ /* NOT clause, clause AND/OR clause */
+ if (or_clause(clause) ||
+ and_clause(clause) ||
+ not_clause(clause))
+ {
+ /*
+ * AND/OR/NOT-clauses are supported if all sub-clauses are supported
+ *
+ * TODO: We might support mixed case, where some of the clauses are
+ * supported and some are not, and treat all supported subclauses as a
+ * single clause, compute its selectivity using mv stats, and compute
+ * the total selectivity using the current algorithm.
+ *
+ * TODO: For RestrictInfo above an OR-clause, we might use the
+ * orclause with nested RestrictInfo - we won't have to call
+ * pull_varnos() for each clause, saving time.
+ */
+ BoolExpr *expr = (BoolExpr *) clause;
+ ListCell *lc;
+ Bitmapset *clause_attnums = NULL;
+
+ foreach(lc, expr->args)
+ {
+ /*
+ * Had we found incompatible clause in the arguments, treat the
+ * whole clause as incompatible.
+ */
+ if (!statext_is_compatible_clause_internal((Node *) lfirst(lc),
+ relid, &clause_attnums))
+ return false;
+ }
+
+ /*
+ * Otherwise the clause is compatible, and we need to merge the
+ * attnums into the main bitmapset.
+ */
+ *attnums = bms_join(*attnums, clause_attnums);
+
+ return true;
+ }
+
+ /* Var IS NULL */
+ if (IsA(clause, NullTest))
+ {
+ NullTest *nt = (NullTest *) clause;
+
+ /*
+ * Only simple (Var IS NULL) expressions supported for now. Maybe we
+ * could use examine_variable to fix this?
+ */
+ if (!IsA(nt->arg, Var))
+ return false;
+
+ return statext_is_compatible_clause_internal((Node *) (nt->arg), relid, attnums);
+ }
+
+ return false;
+}
+
+/*
+ * statext_is_compatible_clause
+ * Determines if the clause is compatible with MCV lists and histograms
+ *
+ * Currently we support (Var op Const) and (Const op Var) clauses with the
+ * equality and inequality operators, IS [NOT] NULL tests on plain Vars, and
+ * AND/OR/NOT combinations of such clauses. When returning true, the attnums
+ * of all Vars referenced by the clause are added to the attnums bitmapset.
+ * It may be possible to expand on this later.
+ */
+static bool
+statext_is_compatible_clause(Node *clause, Index relid, Bitmapset **attnums)
+{
+ RestrictInfo *rinfo = (RestrictInfo *) clause;
+
+ if (!IsA(rinfo, RestrictInfo))
+ return false;
+
+ /* Pseudoconstants are not really interesting here. */
+ if (rinfo->pseudoconstant)
+ return false;
+
+ /* clauses referencing multiple varnos are incompatible */
+ if (bms_membership(rinfo->clause_relids) != BMS_SINGLETON)
+ return false;
+
+ return statext_is_compatible_clause_internal((Node *)rinfo->clause,
+ relid, attnums);
+}
+
+Selectivity
+statext_clauselist_selectivity(PlannerInfo *root, List *clauses, int varRelid,
+ JoinType jointype, SpecialJoinInfo *sjinfo,
+ RelOptInfo *rel, Bitmapset **estimatedclauses)
+{
+ ListCell *l;
+ Bitmapset *clauses_attnums = NULL;
+ Bitmapset **list_attnums;
+ int listidx;
+ StatisticExtInfo *stat;
+ List *stat_clauses;
+
+ /* selectivities for the MCV and histogram parts (no match by default) */
+ Selectivity s1 = 0.0, s2 = 0.0;
+
+ /* we're interested in MCV lists and/or histograms */
+ int types = (STATS_EXT_INFO_MCV | STATS_EXT_INFO_HISTOGRAM);
+
+ /* additional information for MCV matching */
+ bool fullmatch = false;
+ Selectivity lowsel = 1.0;
+ Selectivity max_selectivity = 1.0;
+
+ /* check if there's any stats that might be useful for us. */
+ if (!has_stats_of_kind(rel->statlist, types))
+ return (Selectivity)1.0;
+
+ list_attnums = (Bitmapset **) palloc(sizeof(Bitmapset *) *
+ list_length(clauses));
+
+ /*
+ * Pre-process the clauses list to extract the attnums seen in each item.
+ * We need to determine if there's any clauses which will be useful for
+ * dependency selectivity estimations. Along the way we'll record all of
+ * the attnums for each clause in a list which we'll reference later so we
+ * don't need to repeat the same work again. We'll also keep track of all
+ * attnums seen.
+ *
+ * FIXME Should skip already estimated clauses (using the estimatedclauses
+ * bitmap).
+ */
+ listidx = 0;
+ foreach(l, clauses)
+ {
+ Node *clause = (Node *) lfirst(l);
+ Bitmapset *attnums = NULL;
+
+ if (statext_is_compatible_clause(clause, rel->relid, &attnums))
+ {
+ list_attnums[listidx] = attnums;
+ clauses_attnums = bms_add_members(clauses_attnums, attnums);
+ }
+ else
+ list_attnums[listidx] = NULL;
+
+ listidx++;
+ }
+
+ /* We need at least two attributes for multivariate statistics. */
+ if (bms_num_members(clauses_attnums) < 2)
+ return 1.0;
+
+ /* find the best suited statistics object for these attnums */
+ stat = choose_best_statistics(rel->statlist, clauses_attnums, types);
+
+ /* if no matching stats could be found then we've nothing to do */
+ if (!stat)
+ return (Selectivity)1.0;
+
+ /* now filter the clauses to be estimated using the selected statistics */
+ stat_clauses = NIL;
+
+ listidx = 0;
+ foreach (l, clauses)
+ {
+ /*
+ * If the clause is compatible with the selected statistics,
+ * mark it as estimated and add it to the list to estimate.
+ */
+ if ((list_attnums[listidx] != NULL) &&
+ (bms_is_subset(list_attnums[listidx], stat->keys)))
+ {
+ stat_clauses = lappend(stat_clauses, (Node *)lfirst(l));
+ *estimatedclauses = bms_add_member(*estimatedclauses, listidx);
+ }
+
+ listidx++;
+ }
+
+ /*
+ * Evaluate the MCV selectivity. See if we got a full match and the
+ * minimal selectivity.
+ */
+ if (stat->kinds & STATS_EXT_INFO_MCV)
+ {
+ s1 = mcv_clauselist_selectivity(root, stat, clauses, varRelid,
+ jointype, sjinfo, rel,
+ &fullmatch, &lowsel);
+ }
+
+ /*
+ * If we got a full equality match on the MCV list, we're done (and the
+ * estimate is likely pretty good).
+ */
+ if (fullmatch && (s1 > 0.0))
+ return s1;
+
+ /*
+ * If it's a full match (equalities on all columns) but we haven't
+ * found it in the MCV, then we limit the selectivity by frequency
+ * of the last MCV item.
+ */
+ if (fullmatch)
+ max_selectivity = lowsel;
+
+ /* Now estimate the selectivity from a histogram. */
+ if (stat->kinds & STATS_EXT_INFO_HISTOGRAM)
+ {
+ s2 = histogram_clauselist_selectivity(root, stat, clauses, varRelid,
+ jointype, sjinfo, rel);
+ }
+
+ return Min(s1 + s2, max_selectivity);
+}
diff --git a/src/backend/statistics/histogram.c b/src/backend/statistics/histogram.c
new file mode 100644
index 0000000..e5a8f78
--- /dev/null
+++ b/src/backend/statistics/histogram.c
@@ -0,0 +1,2679 @@
+/*-------------------------------------------------------------------------
+ *
+ * histogram.c
+ * POSTGRES multivariate histograms
+ *
+ * Portions Copyright (c) 1996-2017, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ * src/backend/statistics/histogram.c
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include <math.h>
+
+#include "access/htup_details.h"
+#include "catalog/pg_collation.h"
+#include "catalog/pg_statistic_ext.h"
+#include "fmgr.h"
+#include "funcapi.h"
+#include "optimizer/clauses.h"
+#include "statistics/extended_stats_internal.h"
+#include "statistics/statistics.h"
+#include "utils/builtins.h"
+#include "utils/bytea.h"
+#include "utils/fmgroids.h"
+#include "utils/lsyscache.h"
+#include "utils/syscache.h"
+#include "utils/typcache.h"
+
+
+static MVBucket *create_initial_ext_bucket(int numrows, HeapTuple *rows,
+ Bitmapset *attrs, VacAttrStats **stats);
+
+static MVBucket *select_bucket_to_partition(int nbuckets, MVBucket **buckets);
+
+static MVBucket *partition_bucket(MVBucket *bucket, Bitmapset *attrs,
+ VacAttrStats **stats,
+ int *ndistvalues, Datum **distvalues);
+
+static MVBucket *copy_ext_bucket(MVBucket *bucket, uint32 ndimensions);
+
+static void update_bucket_ndistinct(MVBucket *bucket, Bitmapset *attrs,
+ VacAttrStats **stats);
+
+static void update_dimension_ndistinct(MVBucket *bucket, int dimension,
+ Bitmapset *attrs, VacAttrStats **stats,
+ bool update_boundaries);
+
+static void create_null_buckets(MVHistogram *histogram, int bucket_idx,
+ Bitmapset *attrs, VacAttrStats **stats);
+
+static Datum *build_ndistinct(int numrows, HeapTuple *rows, Bitmapset *attrs,
+ VacAttrStats **stats, int i, int *nvals);
+
+/*
+ * Computes the size of a serialized histogram bucket, depending on the number
+ * of dimensions (columns) the statistic is defined on. The datum values
+ * are stored in a separate array (deduplicated, to minimize the size), and
+ * so the serialized buckets only store uint16 indexes into that array.
+ *
+ * Each serialized bucket needs to store (in this order):
+ *
+ * - bucket frequency (float)
+ * - min inclusive flags (ndim * sizeof(bool))
+ * - max inclusive flags (ndim * sizeof(bool))
+ * - null dimension flags (ndim * sizeof(bool))
+ * - min boundary indexes (2 * ndim * sizeof(uint16))
+ * - max boundary indexes (2 * ndim * sizeof(uint16))
+ *
+ * So in total:
+ *
+ * ndim * (4 * sizeof(uint16) + 3 * sizeof(bool)) + sizeof(float)
+ *
+ * XXX We might save a bit more space by using proper bitmaps instead of
+ * boolean arrays.
+ */
+#define BUCKET_SIZE(ndims) \
+ (ndims * (4 * sizeof(uint16) + 3 * sizeof(bool)) + sizeof(float))
+
+/*
+ * Macros for convenient access to parts of a serialized bucket.
+ */
+#define BUCKET_FREQUENCY(b) (*(float*)b)
+#define BUCKET_MIN_INCL(b,n) ((bool*)(b + sizeof(float)))
+#define BUCKET_MAX_INCL(b,n) (BUCKET_MIN_INCL(b,n) + n)
+#define BUCKET_NULLS_ONLY(b,n) (BUCKET_MAX_INCL(b,n) + n)
+#define BUCKET_MIN_INDEXES(b,n) ((uint16*)(BUCKET_NULLS_ONLY(b,n) + n))
+#define BUCKET_MAX_INDEXES(b,n) ((BUCKET_MIN_INDEXES(b,n) + n))
+
+/*
+ * Minimal number of rows per bucket (can't split smaller buckets).
+ */
+#define MIN_BUCKET_ROWS 10
+
+/*
+ * Data used while building the histogram (rows for a particular bucket).
+ */
+typedef struct HistogramBuild
+{
+ uint32 ndistinct; /* number of distinct combinations of values */
+
+ HeapTuple *rows; /* array of sample rows (for this bucket) */
+ uint32 numrows; /* number of sample rows (array size) */
+
+ /*
+ * Number of distinct values in each dimension. This is used when building
+ * the histogram (and is not serialized/deserialized).
+ */
+ uint32 *ndistincts;
+
+} HistogramBuild;
+
+/*
+ * Builds a multivariate histogram from the set of sampled rows.
+ *
+ * The build algorithm is iterative - initially a single bucket containing all
+ * sample rows is formed, and then repeatedly split into smaller buckets. In
+ * each round the largest bucket is split into two smaller ones.
+ *
+ * The criteria for selecting the largest bucket (and the dimension for the
+ * split) need to be elaborate enough to produce buckets of roughly the same
+ * size, and also regular shape (not very narrow in just one dimension).
+ *
+ * The current algorithm works like this:
+ *
+ * a) build NULL-buckets (create_null_buckets)
+ *
+ * b) while [maximum number of buckets not reached]
+ *
+ * c) choose bucket to partition (largest bucket)
+ *
+ * c.1) if no bucket eligible to split, terminate the build
+ *
+ * c.2) choose bucket dimension to partition (largest dimension)
+ *
+ * c.3) split the bucket into two buckets
+ *
+ * See the discussion at select_bucket_to_partition and partition_bucket for
+ * more details about the algorithm.
+ */
+MVHistogram *
+statext_histogram_build(int numrows, HeapTuple *rows, Bitmapset *attrs,
+ VacAttrStats **stats, int numrows_total)
+{
+ int i;
+ int numattrs = bms_num_members(attrs);
+
+ int *ndistvalues;
+ Datum **distvalues;
+
+ MVHistogram *histogram;
+ HeapTuple *rows_copy;
+
+ /* not supposed to build on too few or too many columns */
+ Assert((numattrs >= 2) && (numattrs <= STATS_MAX_DIMENSIONS));
+
+ /* we need to make a copy of the row array, as we'll modify it */
+ rows_copy = (HeapTuple *) palloc0(numrows * sizeof(HeapTuple));
+ memcpy(rows_copy, rows, sizeof(HeapTuple) * numrows);
+
+ /* build the histogram header */
+
+ histogram = (MVHistogram *) palloc0(sizeof(MVHistogram));
+
+ histogram->magic = STATS_HIST_MAGIC;
+ histogram->type = STATS_HIST_TYPE_BASIC;
+ histogram->ndimensions = numattrs;
+ histogram->nbuckets = 1; /* initially just a single bucket */
+
+ /*
+ * Allocate space for maximum number of buckets (better than repeatedly
+ * doing repalloc for short-lived objects).
+ */
+ histogram->buckets
+ = (MVBucket **) palloc0(STATS_HIST_MAX_BUCKETS * sizeof(MVBucket));
+
+ /* Create the initial bucket, covering all sampled rows */
+ histogram->buckets[0]
+ = create_initial_ext_bucket(numrows, rows_copy, attrs, stats);
+
+ /*
+ * Collect info on distinct values in each dimension (used later to pick
+ * dimension to partition).
+ */
+ ndistvalues = (int *) palloc0(sizeof(int) * numattrs);
+ distvalues = (Datum **) palloc0(sizeof(Datum *) * numattrs);
+
+ for (i = 0; i < numattrs; i++)
+ distvalues[i] = build_ndistinct(numrows, rows, attrs, stats, i,
+ &ndistvalues[i]);
+
+ /*
+ * Split the initial bucket into buckets that don't mix NULL and non-NULL
+ * values in a single dimension.
+ *
+ * XXX Maybe this should be happening before the build_ndistinct()?
+ */
+ create_null_buckets(histogram, 0, attrs, stats);
+
+ /*
+ * Split the buckets into smaller and smaller buckets. The loop will end
+ * when either all buckets are too small (MIN_BUCKET_ROWS), or there are
+ * too many buckets in total (STATS_HIST_MAX_BUCKETS).
+ */
+ while (histogram->nbuckets < STATS_HIST_MAX_BUCKETS)
+ {
+ MVBucket *bucket = select_bucket_to_partition(histogram->nbuckets,
+ histogram->buckets);
+
+ /* no bucket eligible for partitioning */
+ if (bucket == NULL)
+ break;
+
+ /* we modify the bucket in-place and add one new bucket */
+ histogram->buckets[histogram->nbuckets++]
+ = partition_bucket(bucket, attrs, stats, ndistvalues, distvalues);
+ }
+
+ /* Finalize the histogram build - compute bucket frequencies etc. */
+ for (i = 0; i < histogram->nbuckets; i++)
+ {
+ HistogramBuild *build_data
+ = ((HistogramBuild *) histogram->buckets[i]->build_data);
+
+ /*
+ * The frequency has to be computed from the whole sample, in case
+ * some of the rows were filtered out in the MCV build.
+ */
+ histogram->buckets[i]->frequency
+ = (build_data->numrows * 1.0) / numrows_total;
+ }
+
+ return histogram;
+}
+
+/*
+ * build_ndistinct
+ * build a sorted array of the distinct values in a particular column,
+ * and return their count in *nvals
+ */
+static Datum *
+build_ndistinct(int numrows, HeapTuple *rows, Bitmapset *attrs,
+ VacAttrStats **stats, int i, int *nvals)
+{
+ int j;
+ int nvalues,
+ ndistinct;
+ Datum *values,
+ *distvalues;
+ int *attnums;
+
+ SortSupportData ssup;
+ StdAnalyzeData *mystats = (StdAnalyzeData *) stats[i]->extra_data;
+
+ /* initialize sort support, etc. */
+ memset(&ssup, 0, sizeof(ssup));
+ ssup.ssup_cxt = CurrentMemoryContext;
+
+ /* We always use the default collation for statistics */
+ ssup.ssup_collation = DEFAULT_COLLATION_OID;
+ ssup.ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(mystats->ltopr, &ssup);
+
+ nvalues = 0;
+ values = (Datum *) palloc0(sizeof(Datum) * numrows);
+
+ attnums = build_attnums(attrs);
+
+ /* collect values from the sample rows, ignore NULLs */
+ for (j = 0; j < numrows; j++)
+ {
+ Datum value;
+ bool isnull;
+
+ /*
+ * extract the value of this dimension's attribute from the sample row
+ * (NULL values are skipped just below)
+ */
+ value = heap_getattr(rows[j], attnums[i],
+ stats[i]->tupDesc, &isnull);
+
+ if (isnull)
+ continue;
+
+ values[nvalues++] = value;
+ }
+
+ /* if no non-NULL values were found, free the memory and terminate */
+ if (nvalues == 0)
+ {
+ pfree(values);
+ return NULL;
+ }
+
+ /* sort the array of values using the SortSupport */
+ qsort_arg((void *) values, nvalues, sizeof(Datum),
+ compare_scalars_simple, (void *) &ssup);
+
+ /* count the distinct values first, and allocate just enough memory */
+ ndistinct = 1;
+ for (j = 1; j < nvalues; j++)
+ if (compare_scalars_simple(&values[j], &values[j - 1], &ssup) != 0)
+ ndistinct += 1;
+
+ distvalues = (Datum *) palloc0(sizeof(Datum) * ndistinct);
+
+ /* now collect distinct values into the array */
+ distvalues[0] = values[0];
+ ndistinct = 1;
+
+ for (j = 1; j < nvalues; j++)
+ {
+ if (compare_scalars_simple(&values[j], &values[j - 1], &ssup) != 0)
+ {
+ distvalues[ndistinct] = values[j];
+ ndistinct += 1;
+ }
+ }
+
+ pfree(values);
+
+ *nvals = ndistinct;
+ return distvalues;
+}
+
+/*
+ * statext_histogram_load
+ * Load the histogram for the indicated pg_statistic_ext tuple
+ */
+MVSerializedHistogram *
+statext_histogram_load(Oid mvoid)
+{
+ bool isnull = false;
+ Datum histogram;
+ HeapTuple htup = SearchSysCache1(STATEXTOID, ObjectIdGetDatum(mvoid));
+
+ if (!HeapTupleIsValid(htup))
+ elog(ERROR, "cache lookup failed for statistics object %u", mvoid);
+
+ histogram = SysCacheGetAttr(STATEXTOID, htup,
+ Anum_pg_statistic_ext_stxhistogram, &isnull);
+
+ Assert(!isnull);
+
+ ReleaseSysCache(htup);
+
+ return statext_histogram_deserialize(DatumGetByteaP(histogram));
+}
+
+/*
+ * Serialize the MV histogram into a bytea value. The basic algorithm is quite
+ * simple, and mostly mimics the MCV serialization:
+ *
+ * (1) perform deduplication for each attribute (separately)
+ *
+ * (a) collect all (non-NULL) attribute values from all buckets
+ * (b) sort the data (using 'lt' from VacAttrStats)
+ * (c) remove duplicate values from the array
+ *
+ * (2) serialize the arrays into a bytea value
+ *
+ * (3) process all buckets
+ *
+ * (a) replace min/max values with indexes into the arrays
+ *
+ * Each attribute has to be processed separately, as we're mixing different
+ * datatypes, and we need to use the right operators to compare/sort them.
+ * We're also mixing pass-by-value and pass-by-ref types, and so on.
+ *
+ *
+ * FIXME This probably leaks memory, or at least uses it inefficiently
+ * (many small palloc calls instead of a large one).
+ *
+ * TODO Consider packing boolean flags (NULL) for each item into 'char' or
+ * a longer type (instead of using an array of bool items).
+ */
+bytea *
+statext_histogram_serialize(MVHistogram *histogram, VacAttrStats **stats)
+{
+ int dim,
+ i;
+ Size total_length = 0;
+
+ bytea *output = NULL;
+ char *data = NULL;
+
+ DimensionInfo *info;
+ SortSupport ssup;
+
+ int nbuckets = histogram->nbuckets;
+ int ndims = histogram->ndimensions;
+
+ /* allocated for serialized bucket data */
+ int bucketsize = BUCKET_SIZE(ndims);
+ char *bucket = palloc0(bucketsize);
+
+ /* values per dimension (and number of non-NULL values) */
+ Datum **values = (Datum **) palloc0(sizeof(Datum *) * ndims);
+ int *counts = (int *) palloc0(sizeof(int) * ndims);
+
+ /* info about dimensions (for deserialize) */
+ info = (DimensionInfo *) palloc0(sizeof(DimensionInfo) * ndims);
+
+ /* sort support data */
+ ssup = (SortSupport) palloc0(sizeof(SortSupportData) * ndims);
+
+ /* collect and deduplicate values for each dimension separately */
+ for (dim = 0; dim < ndims; dim++)
+ {
+ int b;
+ int count;
+ StdAnalyzeData *tmp = (StdAnalyzeData *) stats[dim]->extra_data;
+
+ /* keep important info about the data type */
+ info[dim].typlen = stats[dim]->attrtype->typlen;
+ info[dim].typbyval = stats[dim]->attrtype->typbyval;
+
+ /*
+ * Allocate space for all min/max values, including NULLs (we won't
+ * use them, but we don't know how many are there), and then collect
+ * all non-NULL values.
+ */
+ values[dim] = (Datum *) palloc0(sizeof(Datum) * nbuckets * 2);
+
+ for (b = 0; b < histogram->nbuckets; b++)
+ {
+ /* skip buckets where this dimension is NULL-only */
+ if (!histogram->buckets[b]->nullsonly[dim])
+ {
+ values[dim][counts[dim]] = histogram->buckets[b]->min[dim];
+ counts[dim] += 1;
+
+ values[dim][counts[dim]] = histogram->buckets[b]->max[dim];
+ counts[dim] += 1;
+ }
+ }
+
+ /* there are just NULL values in this dimension */
+ if (counts[dim] == 0)
+ continue;
+
+ /* sort and deduplicate */
+ ssup[dim].ssup_cxt = CurrentMemoryContext;
+ ssup[dim].ssup_collation = DEFAULT_COLLATION_OID;
+ ssup[dim].ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(tmp->ltopr, &ssup[dim]);
+
+ qsort_arg(values[dim], counts[dim], sizeof(Datum),
+ compare_scalars_simple, &ssup[dim]);
+
+ /*
+ * Walk through the array and eliminate duplicate values, but keep
+ * the ordering (so that we can do bsearch later). We know there's at
+ * least 1 item, so we can skip the first element.
+ */
+ count = 1; /* number of deduplicated items */
+ for (i = 1; i < counts[dim]; i++)
+ {
+ /* if it's different from the previous value, we need to keep it */
+ if (compare_datums_simple(values[dim][i - 1], values[dim][i], &ssup[dim]) != 0)
+ {
+ /* XXX: not needed if (count == i) */
+ values[dim][count] = values[dim][i];
+ count += 1;
+ }
+ }
+
+ /* make sure we fit into uint16 */
+ Assert(count <= UINT16_MAX);
+
+ /* keep info about the deduplicated count */
+ info[dim].nvalues = count;
+
+ /* compute size of the serialized data */
+ if (info[dim].typlen > 0)
+ /* byval or byref, but with fixed length (name, tid, ...) */
+ info[dim].nbytes = info[dim].nvalues * info[dim].typlen;
+ else if (info[dim].typlen == -1)
+ /* varlena, so just use VARSIZE_ANY */
+ for (i = 0; i < info[dim].nvalues; i++)
+ info[dim].nbytes += VARSIZE_ANY(values[dim][i]);
+ else if (info[dim].typlen == -2)
+ /* cstring, so simply strlen */
+ for (i = 0; i < info[dim].nvalues; i++)
+ info[dim].nbytes += strlen(DatumGetPointer(values[dim][i]));
+ else
+ elog(ERROR, "unknown data type typbyval=%d typlen=%d",
+ info[dim].typbyval, info[dim].typlen);
+ }
+
+ /*
+ * Now we finally know how much space we'll need for the serialized
+ * histogram, as it contains these fields:
+ *
+ * - length (4B) for varlena
+ * - magic (4B)
+ * - type (4B)
+ * - ndimensions (4B)
+ * - nbuckets (4B)
+ * - info (ndim * sizeof(DimensionInfo)
+ * - arrays of values for each dimension
+ * - serialized buckets (nbuckets * bucketsize)
+ *
+ * So the 'header' size is 20B + ndim * sizeof(DimensionInfo) and then
+ * we'll place the data (and buckets).
+ */
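+ /*
+ * For example (an illustrative calculation, not exact sizes): with
+ * ndims = 2 and nbuckets = 100, the fixed part is 20B for the header,
+ * plus 2 * sizeof(DimensionInfo) and 100 * BUCKET_SIZE(2); the loop
+ * below then adds the bytes of the two deduplicated value arrays.
+ */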
+ total_length = (sizeof(int32) + offsetof(MVHistogram, buckets)
+ + ndims * sizeof(DimensionInfo)
+ + nbuckets * bucketsize);
+
+ /* account for the deduplicated data */
+ for (dim = 0; dim < ndims; dim++)
+ total_length += info[dim].nbytes;
+
+ /*
+ * Enforce an arbitrary limit of 1MB on the size of the serialized
+ * histogram. This is meant as a protection against someone building a
+ * histogram on long values (e.g. text documents).
+ *
+ * XXX Should we enforce arbitrary limits like this one? Maybe it's not
+ * even necessary, as long values are usually unique and so won't make it
+ * into the statistics in the first place. In the end, we have a 1GB limit
+ * on bytea values.
+ */
+ if (total_length > (1024 * 1024))
+ elog(ERROR, "serialized histogram exceeds 1MB (%zu > %d)",
+ total_length, (1024 * 1024));
+
+ /* allocate space for the serialized histogram list, set header */
+ output = (bytea *) palloc0(total_length);
+ SET_VARSIZE(output, total_length);
+
+ /* we'll use 'data' to keep track of the place to write data */
+ data = VARDATA(output);
+
+ memcpy(data, histogram, offsetof(MVHistogram, buckets));
+ data += offsetof(MVHistogram, buckets);
+
+ memcpy(data, info, sizeof(DimensionInfo) * ndims);
+ data += sizeof(DimensionInfo) * ndims;
+
+ /* serialize the deduplicated values for all attributes */
+ for (dim = 0; dim < ndims; dim++)
+ {
+#ifdef USE_ASSERT_CHECKING
+ char *tmp = data;
+#endif
+ for (i = 0; i < info[dim].nvalues; i++)
+ {
+ Datum v = values[dim][i];
+
+ if (info[dim].typbyval) /* passed by value */
+ {
+ memcpy(data, &v, info[dim].typlen);
+ data += info[dim].typlen;
+ }
+ else if (info[dim].typlen > 0) /* passed by reference */
+ {
+ memcpy(data, DatumGetPointer(v), info[dim].typlen);
+ data += info[dim].typlen;
+ }
+ else if (info[dim].typlen == -1) /* varlena */
+ {
+ memcpy(data, DatumGetPointer(v), VARSIZE_ANY(v));
+ data += VARSIZE_ANY(v);
+ }
+ else if (info[dim].typlen == -2) /* cstring */
+ {
+ memcpy(data, DatumGetPointer(v), strlen(DatumGetPointer(v)) + 1);
+ data += strlen(DatumGetPointer(v)) + 1;
+ }
+ }
+
+ /* make sure we got exactly the amount of data we expected */
+ Assert((data - tmp) == info[dim].nbytes);
+ }
+
+ /* finally serialize the items, with uint16 indexes instead of the values */
+ for (i = 0; i < nbuckets; i++)
+ {
+ /* don't write beyond the allocated space */
+ Assert(data <= (char *) output + total_length - bucketsize);
+
+ /* reset the values for each item */
+ memset(bucket, 0, bucketsize);
+
+ BUCKET_FREQUENCY(bucket) = histogram->buckets[i]->frequency;
+
+ for (dim = 0; dim < ndims; dim++)
+ {
+ /* do the lookup only for non-NULL values */
+ if (!histogram->buckets[i]->nullsonly[dim])
+ {
+ uint16 idx;
+ Datum *v = NULL;
+
+ /* min boundary */
+ v = (Datum *) bsearch_arg(&histogram->buckets[i]->min[dim],
+ values[dim], info[dim].nvalues, sizeof(Datum),
+ compare_scalars_simple, &ssup[dim]);
+
+ Assert(v != NULL); /* serialization or deduplication
+ * error */
+
+ /* compute index within the array */
+ idx = (v - values[dim]);
+
+ Assert((idx >= 0) && (idx < info[dim].nvalues));
+
+ BUCKET_MIN_INDEXES(bucket, ndims)[dim] = idx;
+
+ /* max boundary */
+ v = (Datum *) bsearch_arg(&histogram->buckets[i]->max[dim],
+ values[dim], info[dim].nvalues, sizeof(Datum),
+ compare_scalars_simple, &ssup[dim]);
+
+ Assert(v != NULL); /* serialization or deduplication
+ * error */
+
+ /* compute index within the array */
+ idx = (v - values[dim]);
+
+ Assert((idx >= 0) && (idx < info[dim].nvalues));
+
+ BUCKET_MAX_INDEXES(bucket, ndims)[dim] = idx;
+ }
+ }
+
+ /* copy flags (nulls, min/max inclusive) */
+ memcpy(BUCKET_NULLS_ONLY(bucket, ndims),
+ histogram->buckets[i]->nullsonly, sizeof(bool) * ndims);
+
+ memcpy(BUCKET_MIN_INCL(bucket, ndims),
+ histogram->buckets[i]->min_inclusive, sizeof(bool) * ndims);
+
+ memcpy(BUCKET_MAX_INCL(bucket, ndims),
+ histogram->buckets[i]->max_inclusive, sizeof(bool) * ndims);
+
+ /* copy the item into the array */
+ memcpy(data, bucket, bucketsize);
+
+ data += bucketsize;
+ }
+
+ /* at this point we expect to match the total_length exactly */
+ Assert((data - (char *) output) == total_length);
+
+ /* free the values/counts arrays here */
+ pfree(counts);
+ pfree(info);
+ pfree(ssup);
+
+ for (dim = 0; dim < ndims; dim++)
+ pfree(values[dim]);
+
+ pfree(values);
+
+ return output;
+}
+
+/*
+ * Read a serialized histogram into an MVSerializedHistogram structure.
+ *
+ * Returns the histogram in a partially-serialized form (keeps the boundary
+ * values deduplicated, so that it's possible to optimize the estimation part
+ * by caching function call results across buckets etc.).
+ */
+MVSerializedHistogram *
+statext_histogram_deserialize(bytea *data)
+{
+ int dim,
+ i;
+
+ Size expected_size;
+ char *tmp = NULL;
+
+ MVSerializedHistogram *histogram;
+ DimensionInfo *info;
+
+ int nbuckets;
+ int ndims;
+ int bucketsize;
+
+ /* temporary deserialization buffer */
+ int bufflen;
+ char *buff;
+ char *ptr;
+
+ if (data == NULL)
+ return NULL;
+
+ /*
+ * We can't possibly deserialize a histogram if there's not even a
+ * complete header.
+ */
+ if (VARSIZE_ANY_EXHDR(data) < offsetof(MVSerializedHistogram, buckets))
+ elog(ERROR, "invalid histogram size %ld (expected at least %ld)",
+ VARSIZE_ANY_EXHDR(data), offsetof(MVSerializedHistogram, buckets));
+
+ /* read the histogram header */
+ histogram
+ = (MVSerializedHistogram *) palloc(sizeof(MVSerializedHistogram));
+
+ /* initialize pointer to the data part (skip the varlena header) */
+ tmp = VARDATA_ANY(data);
+
+ /* get the header and perform basic sanity checks */
+ memcpy(histogram, tmp, offsetof(MVSerializedHistogram, buckets));
+ tmp += offsetof(MVSerializedHistogram, buckets);
+
+ if (histogram->magic != STATS_HIST_MAGIC)
+ elog(ERROR, "invalid histogram magic %d (expected %dd)",
+ histogram->magic, STATS_HIST_MAGIC);
+
+ if (histogram->type != STATS_HIST_TYPE_BASIC)
+ elog(ERROR, "invalid histogram type %d (expected %dd)",
+ histogram->type, STATS_HIST_TYPE_BASIC);
+
+ if (histogram->ndimensions == 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_DATA_CORRUPTED),
+ errmsg("invalid zero-length dimension array in histogram")));
+ else if (histogram->ndimensions > STATS_MAX_DIMENSIONS)
+ ereport(ERROR,
+ (errcode(ERRCODE_DATA_CORRUPTED),
+ errmsg("invalid length (%d) dimension array in histogram",
+ histogram->ndimensions)));
+
+ if (histogram->nbuckets == 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_DATA_CORRUPTED),
+ errmsg("invalid zero-length bucket array in histogram")));
+ else if (histogram->nbuckets > STATS_HIST_MAX_BUCKETS)
+ ereport(ERROR,
+ (errcode(ERRCODE_DATA_CORRUPTED),
+ errmsg("invalid length (%d) bucket array in histogram",
+ histogram->nbuckets)));
+
+ nbuckets = histogram->nbuckets;
+ ndims = histogram->ndimensions;
+ bucketsize = BUCKET_SIZE(ndims);
+
+ /*
+ * What size do we expect with those parameters? It's incomplete, as we
+ * have yet to add the value array sizes (from the DimensionInfo records).
+ */
+ expected_size = offsetof(MVSerializedHistogram, buckets) +
+ ndims * sizeof(DimensionInfo) +
+ (nbuckets * bucketsize);
+
+ /* check that we have at least the DimensionInfo records */
+ if (VARSIZE_ANY_EXHDR(data) < expected_size)
+ elog(ERROR, "invalid histogram size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ /* Now it's safe to access the dimension info. */
+ info = (DimensionInfo *) (tmp);
+ tmp += ndims * sizeof(DimensionInfo);
+
+ /* account for the value arrays */
+ for (dim = 0; dim < ndims; dim++)
+ expected_size += info[dim].nbytes;
+
+ if (VARSIZE_ANY_EXHDR(data) != expected_size)
+ elog(ERROR, "invalid histogram size %ld (expected %ld)",
+ VARSIZE_ANY_EXHDR(data), expected_size);
+
+ /* looks OK - the serialized data does not appear to be corrupted */
+
+ /* a single buffer for all the values and counts */
+ bufflen = (sizeof(int) + sizeof(Datum *)) * ndims;
+
+ for (dim = 0; dim < ndims; dim++)
+ /* don't allocate space for byval types, matching Datum */
+ if (!(info[dim].typbyval && (info[dim].typlen == sizeof(Datum))))
+ bufflen += (sizeof(Datum) * info[dim].nvalues);
+
+ /* also, include space for the result, tracking the buckets */
+ bufflen += nbuckets * (sizeof(MVSerializedBucket *) + /* bucket pointer */
+ sizeof(MVSerializedBucket)); /* bucket data */
+
+ buff = palloc0(bufflen);
+ ptr = buff;
+
+ histogram->nvalues = (int *) ptr;
+ ptr += (sizeof(int) * ndims);
+
+ histogram->values = (Datum **) ptr;
+ ptr += (sizeof(Datum *) * ndims);
+
+ /*
+ * XXX This uses pointers to the original data array (the types not passed
+ * by value), so when someone frees the memory, e.g. by doing something
+ * like this:
+ *
+ * bytea *data = ... fetch the data from catalog ...
+ * MVSerializedHistogram *histogram = statext_histogram_deserialize(data);
+ * pfree(data);
+ *
+ * then 'histogram' references the freed memory. Should copy the pieces.
+ */
+ for (dim = 0; dim < ndims; dim++)
+ {
+#ifdef USE_ASSERT_CHECKING
+ /* remember where data for this dimension starts */
+ char *start = tmp;
+#endif
+
+ histogram->nvalues[dim] = info[dim].nvalues;
+
+ if (info[dim].typbyval)
+ {
+ /* passed by value / Datum - simply reuse the array */
+ if (info[dim].typlen == sizeof(Datum))
+ {
+ histogram->values[dim] = (Datum *) tmp;
+ tmp += info[dim].nbytes;
+
+ /* no overflow of input array */
+ Assert(tmp <= start + info[dim].nbytes);
+ }
+ else
+ {
+ histogram->values[dim] = (Datum *) ptr;
+ ptr += (sizeof(Datum) * info[dim].nvalues);
+
+ for (i = 0; i < info[dim].nvalues; i++)
+ {
+ /* copy the value into the Datum array */
+ memcpy(&histogram->values[dim][i], tmp, info[dim].typlen);
+ tmp += info[dim].typlen;
+
+ /* no overflow of input array */
+ Assert(tmp <= start + info[dim].nbytes);
+ }
+ }
+ }
+ else
+ {
+ /* all the other types need a chunk of the buffer */
+ histogram->values[dim] = (Datum *) ptr;
+ ptr += (sizeof(Datum) * info[dim].nvalues);
+
+ if (info[dim].typlen > 0)
+ {
+ /* passed by reference, but fixed length (name, tid, ...) */
+ for (i = 0; i < info[dim].nvalues; i++)
+ {
+ /* just point into the array */
+ histogram->values[dim][i] = PointerGetDatum(tmp);
+ tmp += info[dim].typlen;
+
+ /* no overflow of input array */
+ Assert(tmp <= start + info[dim].nbytes);
+ }
+ }
+ else if (info[dim].typlen == -1)
+ {
+ /* varlena */
+ for (i = 0; i < info[dim].nvalues; i++)
+ {
+ /* just point into the array */
+ histogram->values[dim][i] = PointerGetDatum(tmp);
+ tmp += VARSIZE_ANY(tmp);
+
+ /* no overflow of input array */
+ Assert(tmp <= start + info[dim].nbytes);
+ }
+ }
+ else if (info[dim].typlen == -2)
+ {
+ /* cstring */
+ for (i = 0; i < info[dim].nvalues; i++)
+ {
+ /* just point into the array */
+ histogram->values[dim][i] = PointerGetDatum(tmp);
+ tmp += (strlen(tmp) + 1); /* don't forget the \0 */
+
+ /* no overflow of input array */
+ Assert(tmp <= start + info[dim].nbytes);
+ }
+ }
+ }
+
+ /* check we consumed the serialized data for this dimension exactly */
+ Assert((tmp - start) == info[dim].nbytes);
+ }
+
+ /* now deserialize the buckets and point them into the varlena values */
+ histogram->buckets = (MVSerializedBucket **) ptr;
+ ptr += (sizeof(MVSerializedBucket *) * nbuckets);
+
+ for (i = 0; i < nbuckets; i++)
+ {
+ MVSerializedBucket *bucket = (MVSerializedBucket *) ptr;
+
+ ptr += sizeof(MVSerializedBucket);
+
+ bucket->frequency = BUCKET_FREQUENCY(tmp);
+ bucket->nullsonly = BUCKET_NULLS_ONLY(tmp, ndims);
+ bucket->min_inclusive = BUCKET_MIN_INCL(tmp, ndims);
+ bucket->max_inclusive = BUCKET_MAX_INCL(tmp, ndims);
+
+ bucket->min = BUCKET_MIN_INDEXES(tmp, ndims);
+ bucket->max = BUCKET_MAX_INDEXES(tmp, ndims);
+
+ histogram->buckets[i] = bucket;
+
+ Assert(tmp <= (char *) data + VARSIZE_ANY(data));
+
+ tmp += bucketsize;
+ }
+
+ /* at this point we expect to match the total_length exactly */
+ Assert((tmp - VARDATA_ANY(data)) == expected_size);
+
+ /* we should exhaust the output buffer exactly */
+ Assert((ptr - buff) == bufflen);
+
+ return histogram;
+}
+
+/*
+ * create_initial_ext_bucket
+ * Create an initial bucket, covering all the sampled rows.
+ */
+static MVBucket *
+create_initial_ext_bucket(int numrows, HeapTuple *rows, Bitmapset *attrs,
+ VacAttrStats **stats)
+{
+ int i;
+ int numattrs = bms_num_members(attrs);
+ HistogramBuild *data = NULL;
+
+ /* TODO allocate bucket as a single piece, including all the fields. */
+ MVBucket *bucket = (MVBucket *) palloc0(sizeof(MVBucket));
+
+ Assert(numrows > 0);
+ Assert(rows != NULL);
+ Assert((numattrs >= 2) && (numattrs <= STATS_MAX_DIMENSIONS));
+
+ /* allocate the per-dimension arrays */
+
+ /* flags for null-only dimensions */
+ bucket->nullsonly = (bool *) palloc0(numattrs * sizeof(bool));
+
+ /* inclusiveness boundaries - lower/upper bounds */
+ bucket->min_inclusive = (bool *) palloc0(numattrs * sizeof(bool));
+ bucket->max_inclusive = (bool *) palloc0(numattrs * sizeof(bool));
+
+ /* lower/upper boundaries */
+ bucket->min = (Datum *) palloc0(numattrs * sizeof(Datum));
+ bucket->max = (Datum *) palloc0(numattrs * sizeof(Datum));
+
+ /* build-data */
+ data = (HistogramBuild *) palloc0(sizeof(HistogramBuild));
+
+ /* number of distinct values (per dimension) */
+ data->ndistincts = (uint32 *) palloc0(numattrs * sizeof(uint32));
+
+ /* all the sample rows fall into the initial bucket */
+ data->numrows = numrows;
+ data->rows = rows;
+
+ bucket->build_data = data;
+
+ /*
+ * Update the number of distinct combinations in the bucket (which we use
+ * when selecting the bucket to partition), and then the number of distinct
+ * values for each dimension (which we use when choosing which dimension to
+ * split).
+ */
+ update_bucket_ndistinct(bucket, attrs, stats);
+
+ /* Update ndistinct (and also set min/max) for all dimensions. */
+ for (i = 0; i < numattrs; i++)
+ update_dimension_ndistinct(bucket, i, attrs, stats, true);
+
+ return bucket;
+}
+
+/*
+ * Choose the bucket to partition next.
+ *
+ * The current criterion is rather simple, chosen so that the algorithm produces
+ * buckets with about equal frequency and regular size. We select the bucket
+ * with the highest number of distinct values, and then split it by the longest
+ * dimension.
+ *
+ * The distinct values are uniformly mapped to the [0,1] interval, and this
+ * is used to compute the length of the value range.
+ *
+ * NOTE: This is not the same array used for deduplication, as this contains
+ * values for all the tuples from the sample, not just the boundary values.
+ *
+ * Returns either pointer to the bucket selected to be partitioned, or NULL if
+ * there are no buckets that may be split (e.g. if all buckets are too small
+ * or contain too few distinct values).
+ *
+ *
+ * Tricky example
+ * --------------
+ *
+ * Consider this table:
+ *
+ * CREATE TABLE t AS SELECT i AS a, i AS b
+ * FROM generate_series(1,1000000) s(i);
+ *
+ * CREATE STATISTICS s1 ON t (a,b) WITH (histogram);
+ *
+ * ANALYZE t;
+ *
+ * It's a very specific (and perhaps artificial) example, because every bucket
+ * always has exactly the same number of distinct values in all dimensions,
+ * which makes the partitioning tricky.
+ *
+ * Then:
+ *
+ * SELECT * FROM t WHERE (a < 100) AND (b < 100);
+ *
+ * is estimated to return ~120 rows, while in reality it returns only 99.
+ *
+ * QUERY PLAN
+ * -------------------------------------------------------------
+ * Seq Scan on t (cost=0.00..19425.00 rows=117 width=8)
+ * (actual time=0.129..82.776 rows=99 loops=1)
+ * Filter: ((a < 100) AND (b < 100))
+ * Rows Removed by Filter: 999901
+ * Planning time: 1.286 ms
+ * Execution time: 82.984 ms
+ * (5 rows)
+ *
+ * So this estimate is reasonably close. Let's change the query to OR clause:
+ *
+ * SELECT * FROM t WHERE (a < 100) OR (b < 100);
+ *
+ * QUERY PLAN
+ * -------------------------------------------------------------
+ * Seq Scan on t (cost=0.00..19425.00 rows=8100 width=8)
+ * (actual time=0.145..99.910 rows=99 loops=1)
+ * Filter: ((a < 100) OR (b < 100))
+ * Rows Removed by Filter: 999901
+ * Planning time: 1.578 ms
+ * Execution time: 100.132 ms
+ * (5 rows)
+ *
+ * That's clearly a much worse estimate. This happens because the histogram
+ * contains buckets like this:
+ *
+ * bucket 592 [3 30310] [30134 30593] => [0.000233]
+ *
+ * i.e. the length of "a" dimension is (30310-3)=30307, while the length of "b"
+ * is (30593-30134)=459. So the "b" dimension is much narrower than "a".
+ * Of course, there are also buckets where "b" is the wider dimension.
+ *
+ * This is partially mitigated by selecting the "longest" dimension but that
+ * only happens after we already selected the bucket. So if we never select the
+ * bucket, this optimization does not apply.
+ *
+ * The other reason why this particular example behaves so poorly is due to the
+ * way we actually split the selected bucket. We do attempt to divide the bucket
+ * into two parts containing about the same number of tuples, but that does not
+ * work too well when most of the tuples are squashed on one side of the bucket.
+ *
+ * For example for columns with data on the diagonal (i.e. when a=b), we end up
+ * with a narrow bucket on the diagonal and a huge bucket covering the remaining
+ * part (with much lower density).
+ *
+ * So perhaps we need two partitioning strategies - one aiming to split buckets
+ * with high frequency (number of sampled rows), the other aiming to split
+ * "large" buckets. And alternating between them, somehow.
+ *
+ * TODO Consider using similar lower boundary for row count as for simple
+ * histograms, i.e. 300 tuples per bucket.
+ */
+static MVBucket *
+select_bucket_to_partition(int nbuckets, MVBucket **buckets)
+{
+ int i;
+ int numrows = 0;
+ MVBucket *bucket = NULL;
+
+ for (i = 0; i < nbuckets; i++)
+ {
+ HistogramBuild *data = (HistogramBuild *) buckets[i]->build_data;
+
+ /* if the number of rows is higher, use this bucket */
+ if ((data->ndistinct > 2) &&
+ (data->numrows > numrows) &&
+ (data->numrows >= MIN_BUCKET_ROWS))
+ {
+ bucket = buckets[i];
+ numrows = data->numrows;
+ }
+ }
+
+ /* may be NULL if there are no buckets with (ndistinct > 2) */
+ return bucket;
+}
+
+/*
+ * A simple bucket partitioning implementation - we choose the longest bucket
+ * dimension, measured using the array of distinct values built at the very
+ * beginning of the build.
+ *
+ * We map all the distinct values to a [0,1] interval, uniformly distributed,
+ * and then use this to measure length. It's essentially the number of distinct
+ * values within the range, normalized to [0,1].
+ *
+ * Then we choose a 'middle' value splitting the bucket into two parts with
+ * roughly the same frequency.
+ *
+ * This splits the bucket by tweaking the existing one, and returning the new
+ * bucket (essentially shrinking the existing one in-place and returning the
+ * other "half" as a new bucket). The caller is responsible for adding the new
+ * bucket into the list of buckets.
+ *
+ * There are multiple histogram options, centered around the partitioning
+ * criteria, specifying both how to choose a bucket and the dimension most in
+ * need of a split. For a nice summary and general overview, see "rK-Hist : an
+ * R-Tree based histogram for multi-dimensional selectivity estimation" thesis
+ * by J. A. Lopez, Concordia University, p.34-37 (and possibly p. 32-34 for
+ * explanation of the terms).
+ *
+ * It requires care to prevent splitting only one dimension and not splitting
+ * another one at all (which might happen easily in case of strongly dependent
+ * columns - e.g. y=x). The current algorithm minimizes this, but it may still
+ * happen for perfectly dependent examples (when all the dimensions have equal
+ * length, the first one will be selected).
+ *
+ * TODO Should probably consider statistics target for the columns (e.g.
+ * to split dimensions with higher statistics target more frequently).
+ */
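+/*
+ * A worked example of the dimension selection (illustrative values only):
+ * suppose dimension "a" has ndistvalues = 100 and the bucket boundaries
+ * bsearch to positions 10 and 90 in its distinct array, while dimension "b"
+ * (also with ndistvalues = 100) maps to positions 40 and 50. Then "a" has
+ * normalized length (90 - 10) / 100 = 0.8 and "b" has (50 - 40) / 100 = 0.1,
+ * so the bucket gets split along "a".
+ */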
+static MVBucket *
+partition_bucket(MVBucket *bucket, Bitmapset *attrs,
+ VacAttrStats **stats,
+ int *ndistvalues, Datum **distvalues)
+{
+ int i;
+ int dimension;
+ int numattrs = bms_num_members(attrs);
+
+ Datum split_value;
+ MVBucket *new_bucket;
+ HistogramBuild *new_data;
+
+ /* needed for sort, when looking for the split value */
+ bool isNull;
+ int nvalues = 0;
+ HistogramBuild *data = (HistogramBuild *) bucket->build_data;
+ StdAnalyzeData *mystats = NULL;
+ ScalarItem *values = (ScalarItem *) palloc0(data->numrows * sizeof(ScalarItem));
+ SortSupportData ssup;
+ int *attnums;
+
+ int nrows = 1; /* number of rows below current value */
+ double delta;
+
+ /* needed when splitting the values */
+ HeapTuple *oldrows = data->rows;
+ int oldnrows = data->numrows;
+
+ /*
+ * We can't split buckets with a single distinct value (this also
+ * disqualifies NULL-only dimensions). Also, there have to be multiple
+ * sample rows (otherwise, how could there be multiple distinct values?).
+ */
+ Assert(data->ndistinct > 1);
+ Assert(data->numrows > 1);
+ Assert((numattrs >= 2) && (numattrs <= STATS_MAX_DIMENSIONS));
+
+ /* Look for the next dimension to split. */
+ delta = 0.0;
+ dimension = -1;
+
+ for (i = 0; i < numattrs; i++)
+ {
+ Datum *a,
+ *b;
+
+ mystats = (StdAnalyzeData *) stats[i]->extra_data;
+
+ /* initialize sort support, etc. */
+ memset(&ssup, 0, sizeof(ssup));
+ ssup.ssup_cxt = CurrentMemoryContext;
+
+ /* We always use the default collation for statistics */
+ ssup.ssup_collation = DEFAULT_COLLATION_OID;
+ ssup.ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(mystats->ltopr, &ssup);
+
+ /* can't split NULL-only dimension */
+ if (bucket->nullsonly[i])
+ continue;
+
+ /* can't split dimension with a single ndistinct value */
+ if (data->ndistincts[i] <= 1)
+ continue;
+
+ /* search for min boundary in the distinct list */
+ a = (Datum *) bsearch_arg(&bucket->min[i],
+ distvalues[i], ndistvalues[i],
+ sizeof(Datum), compare_scalars_simple, &ssup);
+
+ b = (Datum *) bsearch_arg(&bucket->max[i],
+ distvalues[i], ndistvalues[i],
+ sizeof(Datum), compare_scalars_simple, &ssup);
+
+ /* if this dimension is 'larger' then partition by it */
+ if (((b - a) * 1.0 / ndistvalues[i]) > delta)
+ {
+ delta = ((b - a) * 1.0 / ndistvalues[i]);
+ dimension = i;
+ }
+ }
+
+ /*
+ * If we haven't found a dimension here, we've done something wrong in
+ * select_bucket_to_partition.
+ */
+ Assert(dimension != -1);
+
+ /*
+ * Walk through the selected dimension, collect and sort the values and
+ * then choose the value to use as the new boundary.
+ */
+ mystats = (StdAnalyzeData *) stats[dimension]->extra_data;
+
+ /* initialize sort support, etc. */
+ memset(&ssup, 0, sizeof(ssup));
+ ssup.ssup_cxt = CurrentMemoryContext;
+
+ /* We always use the default collation for statistics */
+ ssup.ssup_collation = DEFAULT_COLLATION_OID;
+ ssup.ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(mystats->ltopr, &ssup);
+
+ attnums = build_attnums(attrs);
+
+ for (i = 0; i < data->numrows; i++)
+ {
+ /*
+ * remember the index of the sample row, to make the partitioning
+ * simpler
+ */
+ values[nvalues].value = heap_getattr(data->rows[i], attnums[dimension],
+ stats[dimension]->tupDesc, &isNull);
+ values[nvalues].tupno = i;
+
+ /* no NULL values allowed here (we never split null-only dimension) */
+ Assert(!isNull);
+
+ nvalues++;
+ }
+
+ /* sort the array of values */
+ qsort_arg((void *) values, nvalues, sizeof(ScalarItem),
+ compare_scalars_partition, (void *) &ssup);
+
+ /*
+ * We know there are data->ndistincts[dimension] distinct values in this
+ * dimension, and we want to split this into half, so walk through the
+ * array and stop once we see (ndistinct/2) values.
+ *
+ * We always choose the "next" value, i.e. (n/2+1)-th distinct value, and
+ * use it as an exclusive upper boundary (and inclusive lower boundary).
+ *
+ * TODO Maybe we should use "average" of the two middle distinct values
+ * (at least for even distinct counts), but that would require being able
+ * to do an average (which does not work for non-numeric types).
+ *
+ * TODO Another option is to look for a split that'd give about 50% tuples
+ * (not distinct values) in each partition. That might work better when
+ * there are a few very frequent values, and many rare ones.
+ */
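+
+ /*
+ * For example (illustrative values only): with sorted values
+ * {1, 1, 2, 2, 2, 3} and numrows = 6, the loop below considers i = 2
+ * (first occurrence of 2, distance |2 - 6/2| = 1) and i = 5 (first
+ * occurrence of 3, distance |5 - 6/2| = 2), so it picks 2 as the split
+ * value, with nrows = 2 rows staying below the new boundary.
+ */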
+ delta = fabs(data->numrows);
+ split_value = values[0].value;
+
+ for (i = 1; i < data->numrows; i++)
+ {
+ if (values[i].value != values[i - 1].value)
+ {
+ /* are we closer to splitting the bucket in half? */
+ if (fabs(i - data->numrows / 2.0) < delta)
+ {
+ /* let's assume we'll use this value for the split */
+ split_value = values[i].value;
+ delta = fabs(i - data->numrows / 2.0);
+ nrows = i;
+ }
+ }
+ }
+
+ Assert(nrows > 0);
+ Assert(nrows < data->numrows);
+
+ /*
+ * create the new bucket as an (incomplete) copy of the one being
+ * partitioned.
+ */
+ new_bucket = copy_ext_bucket(bucket, numattrs);
+ new_data = (HistogramBuild *) new_bucket->build_data;
+
+ /*
+ * Do the actual split of the chosen dimension, using the split value as
+ * the upper bound for the existing bucket, and lower bound for the new
+ * one.
+ */
+ bucket->max[dimension] = split_value;
+ new_bucket->min[dimension] = split_value;
+
+ /*
+ * Only one side of the new boundary is inclusive: the split value becomes
+ * an exclusive upper bound in the existing bucket and an inclusive lower
+ * bound in the new one. We never set min_inclusive[] to false anywhere,
+ * but we set it to true anyway, for clarity.
+ */
+ bucket->max_inclusive[dimension] = false;
+ new_bucket->min_inclusive[dimension] = true;
+
+ /*
+ * Redistribute the sample tuples using the 'ScalarItem->tupno' index. We
+ * know 'nrows' rows should remain in the original bucket and the rest
+ * goes to the new one.
+ */
+
+ data->rows = (HeapTuple *) palloc0(nrows * sizeof(HeapTuple));
+ new_data->rows = (HeapTuple *) palloc0((oldnrows - nrows) * sizeof(HeapTuple));
+
+ data->numrows = nrows;
+ new_data->numrows = (oldnrows - nrows);
+
+ /*
+ * The first nrows should go to the first bucket, the rest should go to
+ * the new one. Use the tupno field to get the actual HeapTuple row from
+ * the original array of sample rows.
+ */
+ for (i = 0; i < nrows; i++)
+ memcpy(&data->rows[i], &oldrows[values[i].tupno], sizeof(HeapTuple));
+
+ for (i = nrows; i < oldnrows; i++)
+ memcpy(&new_data->rows[i - nrows], &oldrows[values[i].tupno], sizeof(HeapTuple));
+
+ /* update ndistinct values for the buckets (total and per dimension) */
+ update_bucket_ndistinct(bucket, attrs, stats);
+ update_bucket_ndistinct(new_bucket, attrs, stats);
+
+ /*
+ * TODO We don't need to do this for the dimension we used for split,
+ * because we know how many distinct values went to each partition.
+ */
+ for (i = 0; i < numattrs; i++)
+ {
+ update_dimension_ndistinct(bucket, i, attrs, stats, false);
+ update_dimension_ndistinct(new_bucket, i, attrs, stats, false);
+ }
+
+ pfree(oldrows);
+ pfree(values);
+
+ return new_bucket;
+}
+
+/*
+ * Copy a histogram bucket. The copy does not include the build-time data, i.e.
+ * sampled rows etc.
+ */
+static MVBucket *
+copy_ext_bucket(MVBucket *bucket, uint32 ndimensions)
+{
+ /* TODO allocate as a single piece (including all the fields) */
+ MVBucket *new_bucket = (MVBucket *) palloc0(sizeof(MVBucket));
+ HistogramBuild *data = (HistogramBuild *) palloc0(sizeof(HistogramBuild));
+
+ /*
+ * Copy only the attributes that will stay the same after the split;
+ * we'll recompute the rest once the split is done.
+ */
+
+ /* allocate the per-dimension arrays */
+ new_bucket->nullsonly = (bool *) palloc0(ndimensions * sizeof(bool));
+
+ /* inclusiveness boundaries - lower/upper bounds */
+ new_bucket->min_inclusive = (bool *) palloc0(ndimensions * sizeof(bool));
+ new_bucket->max_inclusive = (bool *) palloc0(ndimensions * sizeof(bool));
+
+ /* lower/upper boundaries */
+ new_bucket->min = (Datum *) palloc0(ndimensions * sizeof(Datum));
+ new_bucket->max = (Datum *) palloc0(ndimensions * sizeof(Datum));
+
+ /* copy data */
+ memcpy(new_bucket->nullsonly, bucket->nullsonly, ndimensions * sizeof(bool));
+
+ memcpy(new_bucket->min_inclusive, bucket->min_inclusive, ndimensions * sizeof(bool));
+ memcpy(new_bucket->min, bucket->min, ndimensions * sizeof(Datum));
+
+ memcpy(new_bucket->max_inclusive, bucket->max_inclusive, ndimensions * sizeof(bool));
+ memcpy(new_bucket->max, bucket->max, ndimensions * sizeof(Datum));
+
+ /* allocate and copy the interesting part of the build data */
+ data->ndistincts = (uint32 *) palloc0(ndimensions * sizeof(uint32));
+
+ new_bucket->build_data = data;
+
+ return new_bucket;
+}
+
+/*
+ * Count the number of distinct value combinations in the bucket, by
+ * building an array of SortItems from the sample rows and sorting it
+ * using the per-dimension ordering operators.
+ */
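+/*
+ * For example (illustrative values only): sample rows (1,1), (1,1) and (1,2)
+ * contain two distinct combinations, so after sorting, the loop below counts
+ * one change between neighboring items and ends up with ndistinct = 2.
+ */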
+static void
+update_bucket_ndistinct(MVBucket *bucket, Bitmapset *attrs, VacAttrStats **stats)
+{
+ int i;
+ int numattrs = bms_num_members(attrs);
+
+ HistogramBuild *data = (HistogramBuild *) bucket->build_data;
+ int numrows = data->numrows;
+
+ MultiSortSupport mss = multi_sort_init(numattrs);
+ int *attnums;
+ SortItem *items;
+
+ attnums = build_attnums(attrs);
+
+ /* prepare the sort functions for all dimensions */
+ for (i = 0; i < numattrs; i++)
+ {
+ VacAttrStats *colstat = stats[i];
+ TypeCacheEntry *type;
+
+ type = lookup_type_cache(colstat->attrtypid, TYPECACHE_LT_OPR);
+ if (type->lt_opr == InvalidOid) /* shouldn't happen */
+ elog(ERROR, "cache lookup failed for ordering operator for type %u",
+ colstat->attrtypid);
+
+ multi_sort_add_dimension(mss, i, type->lt_opr);
+ }
+
+ /*
+ * build an array of SortItem(s) sorted using the multi-sort support
+ *
+ * XXX This relies on all stats entries pointing to the same tuple
+ * descriptor. Not sure whether that is always the case.
+ */
+ items = build_sorted_items(numrows, data->rows, stats[0]->tupDesc, mss,
+ numattrs, attnums);
+
+ data->ndistinct = 1;
+
+ for (i = 1; i < numrows; i++)
+ if (multi_sort_compare(&items[i], &items[i - 1], mss) != 0)
+ data->ndistinct += 1;
+
+ pfree(items);
+}
+
+/*
+ * Count distinct values per bucket dimension.
+ */
+static void
+update_dimension_ndistinct(MVBucket *bucket, int dimension, Bitmapset *attrs,
+ VacAttrStats **stats, bool update_boundaries)
+{
+ int j;
+ int nvalues = 0;
+ bool isNull;
+ HistogramBuild *data = (HistogramBuild *) bucket->build_data;
+ Datum *values = (Datum *) palloc0(data->numrows * sizeof(Datum));
+ SortSupportData ssup;
+
+ StdAnalyzeData *mystats = (StdAnalyzeData *) stats[dimension]->extra_data;
+
+ int *attnums;
+
+ /* we may already know this is a NULL-only dimension */
+ if (bucket->nullsonly[dimension])
+ data->ndistincts[dimension] = 1;
+
+ memset(&ssup, 0, sizeof(ssup));
+ ssup.ssup_cxt = CurrentMemoryContext;
+
+ /* We always use the default collation for statistics */
+ ssup.ssup_collation = DEFAULT_COLLATION_OID;
+ ssup.ssup_nulls_first = false;
+
+ PrepareSortSupportFromOrderingOp(mystats->ltopr, &ssup);
+
+ attnums = build_attnums(attrs);
+
+ for (j = 0; j < data->numrows; j++)
+ {
+ values[nvalues] = heap_getattr(data->rows[j], attnums[dimension],
+ stats[dimension]->tupDesc, &isNull);
+
+ /* ignore NULL values */
+ if (!isNull)
+ nvalues++;
+ }
+
+ /* there's always at least 1 distinct value (may be NULL) */
+ data->ndistincts[dimension] = 1;
+
+ /*
+ * if there are only NULL values in the column, mark it so and continue
+ * with the next one
+ */
+ if (nvalues == 0)
+ {
+ pfree(values);
+ bucket->nullsonly[dimension] = true;
+ return;
+ }
+
+ /* sort the array (of pass-by-value datums) */
+ qsort_arg((void *) values, nvalues, sizeof(Datum),
+ compare_scalars_simple, (void *) &ssup);
+
+ /*
+ * Update min/max boundaries to the smallest bounding box. Generally, this
+ * needs to be done only when constructing the initial bucket.
+ */
+ if (update_boundaries)
+ {
+ /* store the min/max values */
+ bucket->min[dimension] = values[0];
+ bucket->min_inclusive[dimension] = true;
+
+ bucket->max[dimension] = values[nvalues - 1];
+ bucket->max_inclusive[dimension] = true;
+ }
+
+ /*
+ * Walk through the array and count distinct values by comparing
+ * succeeding values.
+ *
+ * FIXME This only works for pass-by-value types (i.e. not VARCHARs etc.).
+ * Although thanks to the deduplication it might work even for those types
+ * (equal values will get the same item in the deduplicated array).
+ */
+ for (j = 1; j < nvalues; j++)
+ {
+ if (values[j] != values[j - 1])
+ data->ndistincts[dimension] += 1;
+ }
+
+ pfree(values);
+}
+
+/*
+ * A properly built histogram must not contain buckets mixing NULL and non-NULL
+ * values in a single dimension. Each dimension may either be marked as 'nulls
+ * only', and thus contain only NULL values, or it must not contain any NULL
+ * values.
+ *
+ * Therefore, if the sample contains NULL values in any of the columns, it's
+ * necessary to build those NULL-buckets. This is done in an iterative way
+ * using this algorithm, operating on a single bucket:
+ *
+ * (1) Check that all dimensions are well-formed (not mixing NULL and
+ * non-NULL values).
+ *
+ * (2) If all dimensions are well-formed, terminate.
+ *
+ * (3) If the dimension contains only NULL values, but is not marked as
+ * NULL-only, mark it as NULL-only and run the algorithm again (on
+ * this bucket).
+ *
+ * (4) If the dimension mixes NULL and non-NULL values, split the bucket
+ * into two parts - one with NULL values, one with non-NULL values
+ * (replacing the current one). Then run the algorithm on both buckets.
+ *
+ * This is executed in a recursive manner, but the number of executions should
+ * be quite low - limited by the number of NULL-buckets. Also, in each branch
+ * the number of nested calls is limited by the number of dimensions
+ * (attributes) of the histogram.
+ *
+ * At the end, there should be buckets with no mixed dimensions. The number of
+ * buckets produced by this algorithm is rather limited - with N dimensions,
+ * there may be only 2^N such buckets (each dimension may be either NULL or
+ * non-NULL). So with 8 dimensions (current value of STATS_MAX_DIMENSIONS)
+ * there may be only 256 such buckets.
+ *
+ * After this, a 'regular' bucket-split algorithm shall run, further optimizing
+ * the histogram.
+ */
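+/*
+ * A worked example in two dimensions (illustrative values only): a bucket
+ * with sample rows (1,1), (1,NULL) and (NULL,NULL) is first split on the
+ * second dimension into {(1,1)} and {(1,NULL), (NULL,NULL)}, the latter
+ * marked NULL-only in that dimension. The recursion then splits the
+ * NULL-only bucket on the first dimension, yielding the three well-formed
+ * buckets {(1,1)}, {(1,NULL)} and {(NULL,NULL)}.
+ */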
+static void
+create_null_buckets(MVHistogram *histogram, int bucket_idx,
+ Bitmapset *attrs, VacAttrStats **stats)
+{
+ int i,
+ j;
+ int null_dim = -1;
+ int null_count = 0;
+ bool null_found = false;
+ MVBucket *bucket,
+ *null_bucket;
+ int null_idx,
+ curr_idx;
+ HistogramBuild *data,
+ *null_data;
+ int *attnums;
+
+ /* remember original values from the bucket */
+ int numrows;
+ HeapTuple *oldrows = NULL;
+
+ Assert(bucket_idx < histogram->nbuckets);
+ Assert(histogram->ndimensions == bms_num_members(attrs));
+
+ bucket = histogram->buckets[bucket_idx];
+ data = (HistogramBuild *) bucket->build_data;
+
+ numrows = data->numrows;
+ oldrows = data->rows;
+
+ attnums = build_attnums(attrs);
+
+ /*
+ * Walk through all rows / dimensions, and stop once we find NULL in a
+ * dimension not yet marked as NULL-only.
+ */
+ for (i = 0; i < data->numrows; i++)
+ {
+ /*
+ * FIXME We don't need to start from the first attribute here - we can
+ * start from the last known dimension.
+ */
+ for (j = 0; j < histogram->ndimensions; j++)
+ {
+ /* Is this a NULL-only dimension? If yes, skip. */
+ if (bucket->nullsonly[j])
+ continue;
+
+ /* found a NULL in that dimension? */
+ if (heap_attisnull(data->rows[i], attnums[j]))
+ {
+ null_found = true;
+ null_dim = j;
+ break;
+ }
+ }
+
+ /* terminate if we found attribute with NULL values */
+ if (null_found)
+ break;
+ }
+
+ /* no regular dimension contains NULL values => we're done */
+ if (!null_found)
+ return;
+
+ /* walk through the rows again, count NULL values in 'null_dim' */
+ for (i = 0; i < data->numrows; i++)
+ {
+ if (heap_attisnull(data->rows[i], attnums[null_dim]))
+ null_count += 1;
+ }
+
+ Assert(null_count <= data->numrows);
+
+ /*
+ * If (null_count == numrows) the dimension is already NULL-only, but is
+ * not yet marked as such. It's enough to mark it and repeat the process
+ * recursively (until we run out of dimensions).
+ */
+ if (null_count == data->numrows)
+ {
+ bucket->nullsonly[null_dim] = true;
+ create_null_buckets(histogram, bucket_idx, attrs, stats);
+ return;
+ }
+
+ /*
+ * We have to split the bucket into two - one with NULL values in the
+ * dimension, one with non-NULL values. We don't need to sort the data or
+ * anything, but otherwise it's similar to what partition_bucket() does.
+ */
+
+ /* create bucket with NULL-only dimension 'dim' */
+ null_bucket = copy_ext_bucket(bucket, histogram->ndimensions);
+ null_data = (HistogramBuild *) null_bucket->build_data;
+
+ /* remember the current array info */
+ oldrows = data->rows;
+ numrows = data->numrows;
+
+ /* we'll keep non-NULL values in the current bucket */
+ data->numrows = (numrows - null_count);
+ data->rows
+ = (HeapTuple *) palloc0(data->numrows * sizeof(HeapTuple));
+
+ /* and the NULL values will go to the new one */
+ null_data->numrows = null_count;
+ null_data->rows
+ = (HeapTuple *) palloc0(null_data->numrows * sizeof(HeapTuple));
+
+ /* mark the dimension as NULL-only (in the new bucket) */
+ null_bucket->nullsonly[null_dim] = true;
+
+ /* walk through the sample rows and distribute them accordingly */
+ null_idx = 0;
+ curr_idx = 0;
+ for (i = 0; i < numrows; i++)
+ {
+ if (heap_attisnull(oldrows[i], attnums[null_dim]))
+ /* NULL => copy to the new bucket */
+ memcpy(&null_data->rows[null_idx++], &oldrows[i],
+ sizeof(HeapTuple));
+ else
+ memcpy(&data->rows[curr_idx++], &oldrows[i],
+ sizeof(HeapTuple));
+ }
+
+ /* update ndistinct values for the buckets (total and per dimension) */
+ update_bucket_ndistinct(bucket, attrs, stats);
+ update_bucket_ndistinct(null_bucket, attrs, stats);
+
+ /*
+ * TODO We don't need to do this for the dimension we used for split,
+ * because we know how many distinct values went to each bucket (NULL is
+ * not a value, so NULL buckets get 0, and the other bucket got all the
+ * distinct values).
+ */
+ for (i = 0; i < histogram->ndimensions; i++)
+ {
+ update_dimension_ndistinct(bucket, i, attrs, stats, false);
+ update_dimension_ndistinct(null_bucket, i, attrs, stats, false);
+ }
+
+ pfree(oldrows);
+
+ /* add the NULL bucket to the histogram */
+ histogram->buckets[histogram->nbuckets++] = null_bucket;
+
+ /*
+ * And now run the function recursively on both buckets (the new one
+ * first, because the call may change the number of buckets, and it's used as
+ * an index).
+ */
+ create_null_buckets(histogram, (histogram->nbuckets - 1), attrs, stats);
+ create_null_buckets(histogram, bucket_idx, attrs, stats);
+}
+
+/*
+ * SRF with details about buckets of a histogram:
+ *
+ * - bucket ID (0...nbuckets)
+ * - min values (string array)
+ * - max values (string array)
+ * - nulls only (boolean array)
+ * - min inclusive flags (boolean array)
+ * - max inclusive flags (boolean array)
+ * - frequency (double precision)
+ *
+ * The input is the OID of the statistics object, and no rows are returned
+ * if the statistics object contains no histogram (or if there's no
+ * statistics object with that OID).
+ *
+ * The second parameter (type) determines what values will be returned
+ * in the (minvals,maxvals). There are three possible values:
+ *
+ * 0 (actual values)
+ * -----------------
+ * - prints actual values
+ * - using the output function of the data type (as string)
+ * - handy for investigating the histogram
+ *
+ * 1 (distinct index)
+ * ------------------
+ * - prints index of the distinct value (into the serialized array)
+ * - makes it easier to spot neighbor buckets, etc.
+ * - handy for plotting the histogram
+ *
+ * 2 (normalized distinct index)
+ * -----------------------------
+ * - prints index of the distinct value, but normalized into [0,1]
+ * - similar to 1, but shows how 'long' the bucket range is
+ * - handy for plotting the histogram
+ *
+ * When plotting the histogram, be careful as the (1) and (2) options skew the
+ * lengths by distributing the distinct values uniformly. For data types
+ * without a clear meaning of 'distance' (e.g. strings) that is not a big deal,
+ * but for numbers it may be confusing.
+ */
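+/*
+ * Hypothetical usage (assuming a statistics object "s1" with a histogram
+ * has already been built and analyzed):
+ *
+ * SELECT * FROM pg_histogram_buckets(
+ * (SELECT oid FROM pg_statistic_ext WHERE stxname = 's1'), 0);
+ */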
+PG_FUNCTION_INFO_V1(pg_histogram_buckets);
+
+#define OUTPUT_FORMAT_RAW 0
+#define OUTPUT_FORMAT_INDEXES 1
+#define OUTPUT_FORMAT_DISTINCT 2
+
+Datum
+pg_histogram_buckets(PG_FUNCTION_ARGS)
+{
+ FuncCallContext *funcctx;
+ int call_cntr;
+ int max_calls;
+ TupleDesc tupdesc;
+ AttInMetadata *attinmeta;
+
+ Oid mvoid = PG_GETARG_OID(0);
+ int otype = PG_GETARG_INT32(1);
+
+ if ((otype < 0) || (otype > 2))
+ elog(ERROR, "invalid output type specified");
+
+ /* stuff done only on the first call of the function */
+ if (SRF_IS_FIRSTCALL())
+ {
+ MemoryContext oldcontext;
+ MVSerializedHistogram *histogram;
+
+ /* create a function context for cross-call persistence */
+ funcctx = SRF_FIRSTCALL_INIT();
+
+ /* switch to memory context appropriate for multiple function calls */
+ oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
+
+ histogram = statext_histogram_load(mvoid);
+
+ funcctx->user_fctx = histogram;
+
+ /* total number of tuples to be returned */
+ funcctx->max_calls = 0;
+ if (funcctx->user_fctx != NULL)
+ funcctx->max_calls = histogram->nbuckets;
+
+ /* Build a tuple descriptor for our result type */
+ if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("function returning record called in context "
+ "that cannot accept type record")));
+
+ /*
+ * generate attribute metadata needed later to produce tuples from raw
+ * C strings
+ */
+ attinmeta = TupleDescGetAttInMetadata(tupdesc);
+ funcctx->attinmeta = attinmeta;
+
+ MemoryContextSwitchTo(oldcontext);
+ }
+
+ /* stuff done on every call of the function */
+ funcctx = SRF_PERCALL_SETUP();
+
+ call_cntr = funcctx->call_cntr;
+ max_calls = funcctx->max_calls;
+ attinmeta = funcctx->attinmeta;
+
+ if (call_cntr < max_calls) /* do when there is more left to send */
+ {
+ char **values;
+ HeapTuple tuple;
+ Datum result;
+ int2vector *stakeys;
+ Oid relid;
+ double bucket_volume = 1.0;
+ StringInfo bufs;
+
+ char *format;
+ int i;
+
+ Oid *outfuncs;
+ FmgrInfo *fmgrinfo;
+
+ MVSerializedHistogram *histogram;
+ MVSerializedBucket *bucket;
+
+ histogram = (MVSerializedHistogram *) funcctx->user_fctx;
+
+ Assert(call_cntr < histogram->nbuckets);
+
+ bucket = histogram->buckets[call_cntr];
+
+ stakeys = find_ext_attnums(mvoid, &relid);
+
+ /*
+ * The scalar values will be formatted directly, using snprintf.
+ *
+ * The 'array' values will be formatted through StringInfo.
+ */
+ values = (char **) palloc0(9 * sizeof(char *));
+ bufs = (StringInfo) palloc0(9 * sizeof(StringInfoData));
+
+ values[0] = (char *) palloc(64 * sizeof(char));
+
+ initStringInfo(&bufs[1]); /* lower boundaries */
+ initStringInfo(&bufs[2]); /* upper boundaries */
+ initStringInfo(&bufs[3]); /* nulls-only */
+ initStringInfo(&bufs[4]); /* lower inclusive */
+ initStringInfo(&bufs[5]); /* upper inclusive */
+
+ values[6] = (char *) palloc(64 * sizeof(char));
+ values[7] = (char *) palloc(64 * sizeof(char));
+ values[8] = (char *) palloc(64 * sizeof(char));
+
+ /* we need to do this only when printing the actual values */
+ outfuncs = (Oid *) palloc0(sizeof(Oid) * histogram->ndimensions);
+ fmgrinfo = (FmgrInfo *) palloc0(sizeof(FmgrInfo) * histogram->ndimensions);
+
+ /*
+ * lookup output functions for all histogram dimensions
+ *
+ * XXX This might be done in the first call and stored in user_fctx.
+ */
+ for (i = 0; i < histogram->ndimensions; i++)
+ {
+ bool isvarlena;
+
+ getTypeOutputInfo(get_atttype(relid, stakeys->values[i]),
+ &outfuncs[i], &isvarlena);
+
+ fmgr_info(outfuncs[i], &fmgrinfo[i]);
+ }
+
+ snprintf(values[0], 64, "%d", call_cntr); /* bucket ID */
+
+ /*
+ * for the arrays of lower/upper boundaries, formatted according to
+ * otype
+ */
+ for (i = 0; i < histogram->ndimensions; i++)
+ {
+ Datum *vals = histogram->values[i];
+
+ uint16 minidx = bucket->min[i];
+ uint16 maxidx = bucket->max[i];
+
+ /*
+ * compute bucket volume, using distinct values as a measure
+ *
+ * XXX Not really sure what to do for NULL dimensions here, so
+ * let's simply count them as '1'.
+ */
+ bucket_volume
+ *= (double) (maxidx - minidx + 1) / (histogram->nvalues[i] - 1);
+
+ if (i == 0)
+ format = "{%s"; /* fist dimension */
+ else if (i < (histogram->ndimensions - 1))
+ format = ", %s"; /* medium dimensions */
+ else
+ format = ", %s}"; /* last dimension */
+
+ appendStringInfo(&bufs[3], format, bucket->nullsonly[i] ? "t" : "f");
+ appendStringInfo(&bufs[4], format, bucket->min_inclusive[i] ? "t" : "f");
+ appendStringInfo(&bufs[5], format, bucket->max_inclusive[i] ? "t" : "f");
+
+ /*
+ * for NULL-only dimensions, simply put NULL there and
+ * continue
+ */
+ if (bucket->nullsonly[i])
+ {
+ if (i == 0)
+ format = "{%s";
+ else if (i < (histogram->ndimensions - 1))
+ format = ", %s";
+ else
+ format = ", %s}";
+
+ appendStringInfo(&bufs[1], format, "NULL");
+ appendStringInfo(&bufs[2], format, "NULL");
+
+ continue;
+ }
+
+ /* otherwise we really need to format the value */
+ switch (otype)
+ {
+ case OUTPUT_FORMAT_RAW: /* actual boundary values */
+
+ if (i == 0)
+ format = "{%s";
+ else if (i < (histogram->ndimensions - 1))
+ format = ", %s";
+ else
+ format = ", %s}";
+
+ appendStringInfo(&bufs[1], format,
+ FunctionCall1(&fmgrinfo[i], vals[minidx]));
+
+ appendStringInfo(&bufs[2], format,
+ FunctionCall1(&fmgrinfo[i], vals[maxidx]));
+
+ break;
+
+ case OUTPUT_FORMAT_INDEXES: /* indexes into deduplicated
+ * arrays */
+
+ if (i == 0)
+ format = "{%d";
+ else if (i < (histogram->ndimensions - 1))
+ format = ", %d";
+ else
+ format = ", %d}";
+
+ appendStringInfo(&bufs[1], format, minidx);
+
+ appendStringInfo(&bufs[2], format, maxidx);
+
+ break;
+
+ case OUTPUT_FORMAT_DISTINCT: /* distinct arrays as measure */
+
+ if (i == 0)
+ format = "{%f";
+ else if (i < (histogram->ndimensions - 1))
+ format = ", %f";
+ else
+ format = ", %f}";
+
+ appendStringInfo(&bufs[1], format,
+ (minidx * 1.0 / (histogram->nvalues[i] - 1)));
+
+ appendStringInfo(&bufs[2], format,
+ (maxidx * 1.0 / (histogram->nvalues[i] - 1)));
+
+ break;
+
+ default:
+ elog(ERROR, "unknown output type: %d", otype);
+ }
+ }
+
+ values[1] = bufs[1].data;
+ values[2] = bufs[2].data;
+ values[3] = bufs[3].data;
+ values[4] = bufs[4].data;
+ values[5] = bufs[5].data;
+
+ snprintf(values[6], 64, "%f", bucket->frequency); /* frequency */
+ snprintf(values[7], 64, "%f", bucket->frequency / bucket_volume); /* density */
+ snprintf(values[8], 64, "%f", bucket_volume); /* volume (as a
+ * fraction) */
+
+ /* build a tuple */
+ tuple = BuildTupleFromCStrings(attinmeta, values);
+
+ /* make the tuple into a datum */
+ result = HeapTupleGetDatum(tuple);
+
+ /* clean up (this is not really necessary) */
+ pfree(values[0]);
+ pfree(values[6]);
+ pfree(values[7]);
+ pfree(values[8]);
+
+ resetStringInfo(&bufs[1]);
+ resetStringInfo(&bufs[2]);
+ resetStringInfo(&bufs[3]);
+ resetStringInfo(&bufs[4]);
+ resetStringInfo(&bufs[5]);
+
+ pfree(bufs);
+ pfree(values);
+
+ SRF_RETURN_NEXT(funcctx, result);
+ }
+ else /* do when there is no more left */
+ {
+ SRF_RETURN_DONE(funcctx);
+ }
+}
+
+/*
+ * pg_histogram_in - input routine for type pg_histogram.
+ *
+ * pg_histogram is real enough to be a table column, but it has no operations
+ * of its own, and disallows input too
+ */
+Datum
+pg_histogram_in(PG_FUNCTION_ARGS)
+{
+ /*
+ * pg_histogram stores the data in binary form and parsing text input is
+ * not needed, so disallow this.
+ */
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("cannot accept a value of type %s", "pg_histogram")));
+
+ PG_RETURN_VOID(); /* keep compiler quiet */
+}
+
+/*
+ * pg_histogram_out - output routine for type pg_histogram.
+ *
+ * Histograms are serialized into a bytea value, so we simply call byteaout()
+ * to convert the value into text. But it'd be nice to produce a more
+ * meaningful representation (e.g. for inspection by people).
+ *
+ * XXX This should probably return something meaningful, similar to what
+ * pg_dependencies_out does. Not sure how to deal with the deduplicated
+ * values, though - do we want to expand that or not?
+ */
+Datum
+pg_histogram_out(PG_FUNCTION_ARGS)
+{
+ return byteaout(fcinfo);
+}
+
+/*
+ * pg_histogram_recv - binary input routine for type pg_histogram.
+ */
+Datum
+pg_histogram_recv(PG_FUNCTION_ARGS)
+{
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("cannot accept a value of type %s", "pg_histogram")));
+
+ PG_RETURN_VOID(); /* keep compiler quiet */
+}
+
+/*
+ * pg_histogram_send - binary output routine for type pg_histogram.
+ *
+ * Histograms are serialized in a bytea value (although the type is named
+ * differently), so let's just send that.
+ */
+Datum
+pg_histogram_send(PG_FUNCTION_ARGS)
+{
+ return byteasend(fcinfo);
+}
+
+/*
+ * selectivity estimation
+ */
+
+/*
+ * When evaluating conditions on the histogram, we can leverage the fact that
+ * each bucket boundary value is used by many buckets (each bucket split
+ * introduces a single new value, duplicating all the other values). That
+ * allows us to significantly reduce the number of function calls by caching
+ * the results.
+ *
+ * This is one of the reasons why we keep the histogram in partially serialized
+ * form, with deduplicated values. This allows us to maintain a simple array
+ * of results indexed by uint16 values.
+ *
+ * We only need 2 bits per value, but we allocate a full char as it's more
+ * convenient and there's not much to gain. 0 means 'unknown' as the function
+ * was not executed for this value yet.
+ */
+
+#define HIST_CACHE_FALSE 0x01
+#define HIST_CACHE_TRUE 0x03
+#define HIST_CACHE_MASK 0x02
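+
+/*
+ * For example (illustrative values only): with buckets [1,5] and [5,10]
+ * sharing the deduplicated boundary value 5, a clause like "a < 7" evaluates
+ * the comparison against 5 just once; when the second bucket is processed,
+ * the cache already holds HIST_CACHE_TRUE (0x03) for that value, so the
+ * function call is skipped.
+ */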
+
+/*
+ * bucket_contains_value
+ * Decide if the bucket (a range of values in a particular dimension) may
+ * contain the supplied value.
+ *
+ * The function does not simply return true/false, but a "match level" (none,
+ * partial, full), just like other similar functions. In fact, this function
+ * only returns "partial" or "none" levels, as a range can never exactly match
+ * a single value (we never generate histograms with "collapsed" dimensions).
+ */
+static char
+bucket_contains_value(FmgrInfo ltproc, Datum constvalue,
+ Datum min_value, Datum max_value,
+ int min_index, int max_index,
+ bool min_include, bool max_include,
+ char *callcache)
+{
+ bool a,
+ b;
+
+ char min_cached = callcache[min_index];
+ char max_cached = callcache[max_index];
+
+ /*
+ * First some quick checks on equality - if either boundary equals the
+ * constant, we have a partial match (so no need to call the comparator).
+ */
+ if (((min_value == constvalue) && (min_include)) ||
+ ((max_value == constvalue) && (max_include)))
+ return STATS_MATCH_PARTIAL;
+
+ /* Keep the values 0/1 because of the XOR at the end. */
+ a = ((min_cached & HIST_CACHE_MASK) >> 1);
+ b = ((max_cached & HIST_CACHE_MASK) >> 1);
+
+ /*
+ * If the result for the bucket lower bound is not in the cache, evaluate
+ * the function and store the result in the cache.
+ */
+ if (!min_cached)
+ {
+ a = DatumGetBool(FunctionCall2Coll(<proc,
+ DEFAULT_COLLATION_OID,
+ constvalue, min_value));
+ /* remember the result */
+ callcache[min_index] = (a) ? HIST_CACHE_TRUE : HIST_CACHE_FALSE;
+ }
+
+ /* And do the same for the upper bound. */
+ if (!max_cached)
+ {
+ b = DatumGetBool(FunctionCall2Coll(<proc,
+ DEFAULT_COLLATION_OID,
+ constvalue, max_value));
+ /* remember the result */
+ callcache[max_index] = (b) ? HIST_CACHE_TRUE : HIST_CACHE_FALSE;
+ }
+
+ return (a ^ b) ? STATS_MATCH_PARTIAL : STATS_MATCH_NONE;
+}
+
+/*
+ * bucket_is_smaller_than_value
+ * Decide if the bucket (a range of values in a particular dimension) is
+ * smaller than the supplied value.
+ *
+ * The function does not simply return true/false, but a "match level" (none,
+ * partial, full), just like other similar functions.
+ *
+ * Unlike bucket_contains_value this may return all three match levels, i.e.
+ * "full" (e.g. [10,20] < 30), "partial" (e.g. [10,20] < 15) and "none"
+ * (e.g. [10,20] < 5).
+ */
+static char
+bucket_is_smaller_than_value(FmgrInfo opproc, Datum constvalue,
+ Datum min_value, Datum max_value,
+ int min_index, int max_index,
+ bool min_include, bool max_include,
+ char *callcache, bool isgt)
+{
+ char min_cached = callcache[min_index];
+ char max_cached = callcache[max_index];
+
+ /* Keep the values 0/1 because of the XOR at the end. */
+ bool a = ((min_cached & HIST_CACHE_MASK) >> 1);
+ bool b = ((max_cached & HIST_CACHE_MASK) >> 1);
+
+ if (!min_cached)
+ {
+ a = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ min_value,
+ constvalue));
+ /* remember the result */
+ callcache[min_index] = (a) ? HIST_CACHE_TRUE : HIST_CACHE_FALSE;
+ }
+
+ if (!max_cached)
+ {
+ b = DatumGetBool(FunctionCall2Coll(&opproc,
+ DEFAULT_COLLATION_OID,
+ max_value,
+ constvalue));
+ /* remember the result */
+ callcache[max_index] = (b) ? HIST_CACHE_TRUE : HIST_CACHE_FALSE;
+ }
+
+ /*
+ * Now, we need to combine both results into the final answer, and we need
+ * to be careful about the 'isgt' variable which kinda inverts the
+ * meaning.
+ *
+ * First, we handle the case when each boundary returns different results.
+ * In that case the outcome can only be 'partial' match.
+ */
+ if (a != b)
+ return STATS_MATCH_PARTIAL;
+
+ /*
+ * When the results are the same, then it depends on the 'isgt' value.
+ * There are four options:
+ *
+ * isgt=false, a=b=true => full match
+ * isgt=false, a=b=false => empty
+ * isgt=true, a=b=true => empty
+ * isgt=true, a=b=false => full match
+ *
+ * We'll cheat a bit, because we know that (a=b) so we'll use just one of
+ * them.
+ */
+ if (isgt)
+ return (!a) ? STATS_MATCH_FULL : STATS_MATCH_NONE;
+ else
+ return (a) ? STATS_MATCH_FULL : STATS_MATCH_NONE;
+}
+
+/*
+ * Evaluate clauses using the histogram, and update the match bitmap.
+ *
+ * The bitmap may be already partially set, so this is really a way to
+ * combine results of several clause lists - either when computing
+ * conditional probability P(A|B) or a combination of AND/OR clauses.
+ *
+ * Note: This is not a simple bitmap in the sense that there are more
+ * than two possible values for each item - no match, partial
+ * match and full match. So we need 2 bits per item.
+ *
+ * TODO: This works with 'bitmap' where each item is represented as a
+ * char, which is slightly wasteful. Instead, we could use a bitmap
+ * with 2 bits per item, reducing the size to ~1/4. By using values
+ * 0, 1 and 3 (instead of 0, 1 and 2), the operations (merging etc.)
+ * might be performed just like for simple bitmap by using & and |,
+ * which might be faster than min/max.
+ */
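+/*
+ * For example (a sketch of the packed encoding suggested in the TODO above):
+ * with no match = 0, partial match = 1 and full match = 3, AND-merging two
+ * items is a bitwise "a & b" and OR-merging is "a | b" - e.g. combining
+ * partial (1) with full (3) yields 1 & 3 = 1 (partial) for AND, and
+ * 1 | 3 = 3 (full) for OR.
+ */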
+static int
+histogram_update_match_bitmap(PlannerInfo *root, List *clauses,
+ Bitmapset *stakeys,
+ MVSerializedHistogram *histogram,
+ int nmatches, char *matches,
+ bool is_or)
+{
+ int i;
+ ListCell *l;
+
+ /*
+ * Used for caching function calls, only once per deduplicated value.
+ *
+	 * We may have up to (2 * nbuckets) values per dimension. It's
+	 * probably overkill, but let's allocate that once for all clauses, to
+	 * minimize overhead.
+	 *
+	 * Also, we only need two bits per value, but this allocates a byte
+	 * per value. Might be worth optimizing.
+	 *
+	 * 0x00 - not yet called
+	 * 0x01 - called, result is 'false'
+	 * 0x03 - called, result is 'true'
+ */
+	char	   *callcache = palloc(2 * histogram->nbuckets);
+
+ Assert(histogram != NULL);
+ Assert(histogram->nbuckets > 0);
+ Assert(nmatches >= 0);
+ Assert(nmatches <= histogram->nbuckets);
+
+ Assert(clauses != NIL);
+ Assert(list_length(clauses) >= 1);
+
+ /* loop through the clauses and do the estimation */
+ foreach(l, clauses)
+ {
+ Node *clause = (Node *) lfirst(l);
+
+ /* if it's a RestrictInfo, then extract the clause */
+ if (IsA(clause, RestrictInfo))
+ clause = (Node *) ((RestrictInfo *) clause)->clause;
+
+		/* it's either an OpClause, a NullTest, or a nested AND/OR clause */
+ if (is_opclause(clause))
+ {
+ OpExpr *expr = (OpExpr *) clause;
+ bool varonleft = true;
+ bool ok;
+
+ FmgrInfo opproc; /* operator */
+
+ fmgr_info(get_opcode(expr->opno), &opproc);
+
+ /* reset the cache (per clause) */
+			memset(callcache, 0, 2 * histogram->nbuckets);
+
+ ok = (NumRelids(clause) == 1) &&
+ (is_pseudo_constant_clause(lsecond(expr->args)) ||
+ (varonleft = false,
+ is_pseudo_constant_clause(linitial(expr->args))));
+
+ if (ok)
+ {
+ FmgrInfo ltproc;
+ RegProcedure oprrest = get_oprrest(expr->opno);
+
+ Var *var = (varonleft) ? linitial(expr->args) : lsecond(expr->args);
+ Const *cst = (varonleft) ? lsecond(expr->args) : linitial(expr->args);
+ bool isgt = (!varonleft);
+
+ TypeCacheEntry *typecache
+ = lookup_type_cache(var->vartype, TYPECACHE_LT_OPR);
+
+ /* lookup dimension for the attribute */
+ int idx = bms_member_index(stakeys, var->varattno);
+
+				fmgr_info(get_opcode(typecache->lt_opr), &ltproc);
+
+ /*
+ * Check this for all buckets that still have "true" in the
+ * bitmap
+ *
+ * We already know the clauses use suitable operators (because
+ * that's how we filtered them).
+ */
+ for (i = 0; i < histogram->nbuckets; i++)
+ {
+ char res = STATS_MATCH_NONE;
+
+ MVSerializedBucket *bucket = histogram->buckets[i];
+
+ /* histogram boundaries */
+ Datum minval,
+ maxval;
+ bool mininclude,
+ maxinclude;
+ int minidx,
+ maxidx;
+
+ /*
+ * For AND-lists, we can also mark NULL buckets as 'no
+ * match' (and then skip them). For OR-lists this is not
+ * possible.
+ */
+ if ((!is_or) && bucket->nullsonly[idx])
+ matches[i] = STATS_MATCH_NONE;
+
+ /*
+ * Skip buckets that were already eliminated - this is
+					 * important considering how we update the info (we only
+ * lower the match). We can't really do anything about the
+ * MATCH_PARTIAL buckets.
+ */
+ if ((!is_or) && (matches[i] == STATS_MATCH_NONE))
+ continue;
+ else if (is_or && (matches[i] == STATS_MATCH_FULL))
+ continue;
+
+ /* lookup the values and cache of function calls */
+ minidx = bucket->min[idx];
+ maxidx = bucket->max[idx];
+
+ minval = histogram->values[idx][bucket->min[idx]];
+ maxval = histogram->values[idx][bucket->max[idx]];
+
+ mininclude = bucket->min_inclusive[idx];
+ maxinclude = bucket->max_inclusive[idx];
+
+ /*
+ * TODO Maybe it's possible to add here a similar
+ * optimization as for the MCV lists:
+ *
+ * (nmatches == 0) && AND-list => all eliminated (FALSE)
+ * (nmatches == N) && OR-list => all eliminated (TRUE)
+ *
+ * But it's more complex because of the partial matches.
+ */
+
+ /*
+ * If it's not a "<" or ">" or "=" operator, just ignore
+ * the clause. Otherwise note the relid and attnum for the
+ * variable.
+ *
+					 * TODO I'm really unsure whether the handling of the 'isgt'
+ * (that is, clauses with reverse order of
+ * variable/constant) is correct. I wouldn't be surprised
+ * if there was some mixup. Using the lt/gt operators
+ * instead of messing with the opproc could make it
+ * simpler. It would however be using a different operator
+ * than the query, although it's not any shadier than
+ * using the selectivity function as is done currently.
+ */
+ switch (oprrest)
+ {
+ case F_SCALARLTSEL: /* Var < Const */
+ case F_SCALARGTSEL: /* Var > Const */
+
+ res = bucket_is_smaller_than_value(opproc, cst->constvalue,
+ minval, maxval,
+ minidx, maxidx,
+ mininclude, maxinclude,
+ callcache, isgt);
+ break;
+
+ case F_EQSEL:
+
+ /*
+ * We only check whether the value is within the
+ * bucket, using the lt operator, and we also
+ * check for equality with the boundaries.
+ */
+
+ res = bucket_contains_value(ltproc, cst->constvalue,
+ minval, maxval,
+ minidx, maxidx,
+ mininclude, maxinclude,
+ callcache);
+ break;
+ }
+
+ UPDATE_RESULT(matches[i], res, is_or);
+
+ }
+ }
+ }
+ else if (IsA(clause, NullTest))
+ {
+ NullTest *expr = (NullTest *) clause;
+ Var *var = (Var *) (expr->arg);
+
+			/* FIXME properly match the attribute to the dimension */
+ int idx = bms_member_index(stakeys, var->varattno);
+
+ /*
+ * Walk through the buckets and evaluate the current clause. We
+ * can skip items that were already ruled out, and terminate if
+ * there are no remaining buckets that might possibly match.
+ */
+ for (i = 0; i < histogram->nbuckets; i++)
+ {
+ MVSerializedBucket *bucket = histogram->buckets[i];
+
+ /*
+ * Skip buckets that were already eliminated - this is
+				 * important considering how we update the info (we only lower
+ * the match)
+ */
+ if ((!is_or) && (matches[i] == STATS_MATCH_NONE))
+ continue;
+ else if (is_or && (matches[i] == STATS_MATCH_FULL))
+ continue;
+
+ /* if the clause mismatches the bucket, set it as MATCH_NONE */
+ if ((expr->nulltesttype == IS_NULL)
+ && (!bucket->nullsonly[idx]))
+ UPDATE_RESULT(matches[i], STATS_MATCH_NONE, is_or);
+
+ else if ((expr->nulltesttype == IS_NOT_NULL) &&
+ (bucket->nullsonly[idx]))
+ UPDATE_RESULT(matches[i], STATS_MATCH_NONE, is_or);
+ }
+ }
+ else if (or_clause(clause) || and_clause(clause))
+ {
+ /*
+ * AND/OR clause, with all clauses compatible with the selected MV
+ * stat
+ */
+
+ int i;
+ BoolExpr *orclause = ((BoolExpr *) clause);
+ List *orclauses = orclause->args;
+
+ /* match/mismatch bitmap for each bucket */
+ int or_nmatches = 0;
+ char *or_matches = NULL;
+
+ Assert(orclauses != NIL);
+ Assert(list_length(orclauses) >= 2);
+
+ /* number of matching buckets */
+ or_nmatches = histogram->nbuckets;
+
+			/* match bitmap for the sub-clauses (initialized just below) */
+ or_matches = palloc0(sizeof(char) * or_nmatches);
+
+ if (or_clause(clause))
+ {
+ /* OR clauses assume nothing matches, initially */
+ memset(or_matches, STATS_MATCH_NONE, sizeof(char) * or_nmatches);
+ or_nmatches = 0;
+ }
+ else
+ {
+				/* AND clauses assume everything matches, initially */
+ memset(or_matches, STATS_MATCH_FULL, sizeof(char) * or_nmatches);
+ }
+
+			/* build the match bitmap for the sub-clauses */
+ or_nmatches = histogram_update_match_bitmap(root, orclauses,
+ stakeys, histogram,
+ or_nmatches, or_matches, or_clause(clause));
+
+ /* merge the bitmap into the existing one */
+ for (i = 0; i < histogram->nbuckets; i++)
+ {
+ /*
+ * Merge the result into the bitmap (Min for AND, Max for OR).
+ *
+ * FIXME this does not decrease the number of matches
+ */
+ UPDATE_RESULT(matches[i], or_matches[i], is_or);
+ }
+
+ pfree(or_matches);
+
+ }
+ else
+ elog(ERROR, "unknown clause type: %d", clause->type);
+ }
+
+ /* free the call cache */
+ pfree(callcache);
+
+ return nmatches;
+}
+
+/*
+ * Estimate selectivity of clauses using a histogram.
+ *
+ * If there's no histogram for the stats, the function returns 0.0.
+ *
+ * The general idea of this method is similar to how MCV lists are
+ * processed, except that this introduces the concept of a partial
+ * match (MCV only works with full match / mismatch).
+ *
+ * The algorithm works like this:
+ *
+ * 1) mark all buckets as 'full match'
+ * 2) walk through all the clauses
+ * 3) for a particular clause, walk through all the buckets
+ * 4) skip buckets that are already 'no match'
+ * 5) check clause for buckets that still match (at least partially)
+ * 6) sum frequencies for buckets to get selectivity
+ *
+ * Unlike MCV lists, histograms have a concept of a partial match. In
+ * that case we use 1/2 the bucket, to minimize the average error. The
+ * MV histograms are usually less detailed than the per-column ones,
+ * meaning the sum is often quite high (thanks to combining a lot of
+ * "partially hit" buckets).
+ *
+ * Maybe we could use per-bucket information with number of distinct
+ * values it contains (for each dimension), and then use that to correct
+ * the estimate (so with 10 distinct values, we'd use 1/10 of the bucket
+ * frequency). We might also scale the value depending on the actual
+ * ndistinct estimate (not just the values observed in the sample).
+ *
+ * Another option would be to multiply the selectivities, i.e. if we get
+ * 'partial match' for a bucket for multiple conditions, we might use
+ * 0.5^k (where k is the number of conditions), instead of 0.5. This
+ * probably does not minimize the average error, though.
+ *
+ * TODO: This might use a similar shortcut to MCV lists - count buckets
+ *		marked as partial/full match, and terminate once this drops to 0.
+ * Not sure if it's really worth it - for MCV lists a situation like
+ * this is not uncommon, but for histograms it's not that clear.
+ */
+Selectivity
+histogram_clauselist_selectivity(PlannerInfo *root, StatisticExtInfo *stat,
+ List *clauses, int varRelid,
+ JoinType jointype, SpecialJoinInfo *sjinfo,
+ RelOptInfo *rel)
+{
+ int i;
+ MVSerializedHistogram *histogram;
+	Selectivity s = 0.0;
+
+	/* match/mismatch bitmap for each histogram bucket */
+ char *matches = NULL;
+ int nmatches = 0;
+
+ /* load the histogram stored in the statistics object */
+ histogram = statext_histogram_load(stat->statOid);
+
+ /* by default all the histogram buckets match the clauses fully */
+ matches = palloc0(sizeof(char) * histogram->nbuckets);
+ memset(matches, STATS_MATCH_FULL, sizeof(char) * histogram->nbuckets);
+
+ /* number of matching histogram buckets */
+ nmatches = histogram->nbuckets;
+
+ nmatches = histogram_update_match_bitmap(root, clauses, stat->keys,
+ histogram, nmatches, matches,
+ false);
+
+ /* now, walk through the buckets and sum the selectivities */
+ for (i = 0; i < histogram->nbuckets; i++)
+ {
+ if (matches[i] == STATS_MATCH_FULL)
+ s += histogram->buckets[i]->frequency;
+ else if (matches[i] == STATS_MATCH_PARTIAL)
+ s += 0.5 * histogram->buckets[i]->frequency;
+ }
+
+ return s;
+}
diff --git a/src/backend/statistics/mcv.c b/src/backend/statistics/mcv.c
index 391ddcb..65a8875 100644
--- a/src/backend/statistics/mcv.c
+++ b/src/backend/statistics/mcv.c
@@ -65,9 +65,6 @@ static SortItem *build_distinct_groups(int numrows, SortItem *items,
static int count_distinct_groups(int numrows, SortItem *items,
MultiSortSupport mss);
-static bool mcv_is_compatible_clause(Node *clause, Index relid,
- Bitmapset **attnums);
-
/*
* Builds MCV list from the set of sampled rows.
*
@@ -95,12 +92,14 @@ static bool mcv_is_compatible_clause(Node *clause, Index relid,
*/
MCVList *
statext_mcv_build(int numrows, HeapTuple *rows, Bitmapset *attrs,
- VacAttrStats **stats)
+ VacAttrStats **stats, HeapTuple **rows_filtered,
+ int *numrows_filtered)
{
int i;
int numattrs = bms_num_members(attrs);
int ndistinct = 0;
int mcv_threshold = 0;
+ int numrows_mcv; /* rows covered by the MCV items */
int nitems = 0;
int *attnums = build_attnums(attrs);
@@ -117,6 +116,9 @@ statext_mcv_build(int numrows, HeapTuple *rows, Bitmapset *attrs,
/* transform the sorted rows into groups (sorted by frequency) */
SortItem *groups = build_distinct_groups(numrows, items, mss, &ndistinct);
+ /* Either we have both pointers or none of them. */
+ Assert((rows_filtered && numrows_filtered) || (!rows_filtered && !numrows_filtered));
+
/*
* Determine the minimum size of a group to be eligible for MCV list, and
* check how many groups actually pass that threshold. We use 1.25x the
@@ -142,14 +144,19 @@ statext_mcv_build(int numrows, HeapTuple *rows, Bitmapset *attrs,
/* Walk through the groups and stop once we fall below the threshold. */
nitems = 0;
+ numrows_mcv = 0;
for (i = 0; i < ndistinct; i++)
{
if (groups[i].count < mcv_threshold)
break;
+ numrows_mcv += groups[i].count;
nitems++;
}
+ /* The MCV can't possibly cover more rows than we sampled. */
+ Assert(numrows_mcv <= numrows);
+
/*
* At this point we know the number of items for the MCV list. There might
* be none (for uniform distribution with many groups), and in that case
@@ -209,6 +216,87 @@ statext_mcv_build(int numrows, HeapTuple *rows, Bitmapset *attrs,
Assert(nitems == mcvlist->nitems);
}
+ /* Assume we're not returning any filtered rows by default. */
+ if (numrows_filtered)
+ *numrows_filtered = 0;
+
+ if (rows_filtered)
+ *rows_filtered = NULL;
+
+ /*
+ * Produce an array with only tuples not covered by the MCV list. This
+	 * is needed when building an MCV+histogram pair, where the MCV list
+	 * covers the most common combinations and the histogram covers the rest.
+ *
+ * We will first sort the groups by the keys (not by count) and then use
+ * binary search in the group array to check which rows are covered by
+ * the MCV items.
+ *
+ * Do not modify the array in place, as there may be additional stats on
+ * the table and we need to keep the original array for them.
+ *
+ * We only do this when requested by passing non-NULL rows_filtered,
+	 * and when there are groups not covered by the MCV list (that is, when
+	 * nitems < ndistinct, which also implies numrows_mcv < numrows).
+ */
+ if (rows_filtered && numrows_filtered && (nitems < ndistinct))
+ {
+ int i,
+ j;
+
+ /* used to build the filtered array of tuples */
+ HeapTuple *filtered;
+ int nfiltered;
+
+ /* used for the searches */
+ SortItem key;
+
+ /* We do know how many rows we expect (total - MCV rows). */
+ nfiltered = (numrows - numrows_mcv);
+ filtered = (HeapTuple *) palloc(nfiltered * sizeof(HeapTuple));
+
+		/* we fill this with data from the rows */
+ key.values = (Datum *) palloc0(numattrs * sizeof(Datum));
+ key.isnull = (bool *) palloc0(numattrs * sizeof(bool));
+
+ /*
+		 * Sort the groups for bsearch_arg (but only the items that actually
+ * made it to the MCV list).
+ */
+ qsort_arg((void *) groups, nitems, sizeof(SortItem),
+ multi_sort_compare, mss);
+
+ /* walk through the tuples, compare the values to MCV items */
+ nfiltered = 0;
+ for (i = 0; i < numrows; i++)
+ {
+ /* collect the key values from the row */
+ for (j = 0; j < numattrs; j++)
+ key.values[j]
+ = heap_getattr(rows[i], attnums[j],
+ stats[j]->tupDesc, &key.isnull[j]);
+
+ /* if not included in the MCV list, keep it in the array */
+ if (bsearch_arg(&key, groups, nitems, sizeof(SortItem),
+ multi_sort_compare, mss) == NULL)
+ filtered[nfiltered++] = rows[i];
+
+ /* do not overflow the array */
+ Assert(nfiltered <= (numrows - numrows_mcv));
+ }
+
+ /* expect to get the right number of remaining rows exactly */
+ Assert(nfiltered + numrows_mcv == numrows);
+
+ /* pass the filtered tuples up */
+ *numrows_filtered = nfiltered;
+ *rows_filtered = filtered;
+
+		/* free the lookup key; the filtered array is returned to the caller */
+ pfree(key.values);
+ pfree(key.isnull);
+ }
+
pfree(items);
pfree(groups);
@@ -1211,168 +1299,6 @@ pg_mcv_list_send(PG_FUNCTION_ARGS)
}
/*
- * mcv_is_compatible_clause_internal
- * Does the heavy lifting of actually inspecting the clauses for
- * mcv_is_compatible_clause.
- */
-static bool
-mcv_is_compatible_clause_internal(Node *clause, Index relid, Bitmapset **attnums)
-{
- /* We only support plain Vars for now */
- if (IsA(clause, Var))
- {
- Var *var = (Var *) clause;
-
- /* Ensure var is from the correct relation */
- if (var->varno != relid)
- return false;
-
- /* we also better ensure the Var is from the current level */
- if (var->varlevelsup > 0)
- return false;
-
- /* Also skip system attributes (we don't allow stats on those). */
- if (!AttrNumberIsForUserDefinedAttr(var->varattno))
- return false;
-
- *attnums = bms_add_member(*attnums, var->varattno);
-
- return true;
- }
-
- /* Var = Const */
- if (is_opclause(clause))
- {
- OpExpr *expr = (OpExpr *) clause;
- Var *var;
- bool varonleft = true;
- bool ok;
-
- /* Only expressions with two arguments are considered compatible. */
- if (list_length(expr->args) != 2)
- return false;
-
- /* see if it actually has the right */
- ok = (NumRelids((Node *) expr) == 1) &&
- (is_pseudo_constant_clause(lsecond(expr->args)) ||
- (varonleft = false,
- is_pseudo_constant_clause(linitial(expr->args))));
-
- /* unsupported structure (two variables or so) */
- if (!ok)
- return false;
-
- /*
- * If it's not one of the supported operators ("=", "<", ">", etc.),
- * just ignore the clause, as it's not compatible with MCV lists.
- *
- * This uses the function for estimating selectivity, not the operator
- * directly (a bit awkward, but well ...).
- */
- if ((get_oprrest(expr->opno) != F_EQSEL) &&
- (get_oprrest(expr->opno) != F_SCALARLTSEL) &&
- (get_oprrest(expr->opno) != F_SCALARGTSEL))
- return false;
-
- var = (varonleft) ? linitial(expr->args) : lsecond(expr->args);
-
- return mcv_is_compatible_clause_internal((Node *)var, relid, attnums);
- }
-
- /* NOT clause, clause AND/OR clause */
- if (or_clause(clause) ||
- and_clause(clause) ||
- not_clause(clause))
- {
- /*
- * AND/OR/NOT-clauses are supported if all sub-clauses are supported
- *
- * TODO: We might support mixed case, where some of the clauses are
- * supported and some are not, and treat all supported subclauses as a
- * single clause, compute it's selectivity using mv stats, and compute
- * the total selectivity using the current algorithm.
- *
- * TODO: For RestrictInfo above an OR-clause, we might use the
- * orclause with nested RestrictInfo - we won't have to call
- * pull_varnos() for each clause, saving time.
- */
- BoolExpr *expr = (BoolExpr *) clause;
- ListCell *lc;
- Bitmapset *clause_attnums = NULL;
-
- foreach(lc, expr->args)
- {
- /*
- * Had we found incompatible clause in the arguments, treat the
- * whole clause as incompatible.
- */
- if (!mcv_is_compatible_clause_internal((Node *) lfirst(lc),
- relid, &clause_attnums))
- return false;
- }
-
- /*
- * Otherwise the clause is compatible, and we need to merge the
- * attnums into the main bitmapset.
- */
- *attnums = bms_join(*attnums, clause_attnums);
-
- return true;
- }
-
- /* Var IS NULL */
- if (IsA(clause, NullTest))
- {
- NullTest *nt = (NullTest *) clause;
-
- /*
- * Only simple (Var IS NULL) expressions supported for now. Maybe we
- * could use examine_variable to fix this?
- */
- if (!IsA(nt->arg, Var))
- return false;
-
- return mcv_is_compatible_clause_internal((Node *) (nt->arg), relid, attnums);
- }
-
- return false;
-}
-
-/*
- * mcv_is_compatible_clause
- * Determines if the clause is compatible with MCV lists
- *
- * Only OpExprs with two arguments using an equality operator are supported.
- * When returning True attnum is set to the attribute number of the Var within
- * the supported clause.
- *
- * Currently we only support Var = Const, or Const = Var. It may be possible
- * to expand on this later.
- */
-static bool
-mcv_is_compatible_clause(Node *clause, Index relid, Bitmapset **attnums)
-{
- RestrictInfo *rinfo = (RestrictInfo *) clause;
-
- if (!IsA(rinfo, RestrictInfo))
- return false;
-
- /* Pseudoconstants are not really interesting here. */
- if (rinfo->pseudoconstant)
- return false;
-
- /* clauses referencing multiple varnos are incompatible */
- if (bms_membership(rinfo->clause_relids) != BMS_SINGLETON)
- return false;
-
- return mcv_is_compatible_clause_internal((Node *)rinfo->clause,
- relid, attnums);
-}
-
-#define UPDATE_RESULT(m,r,isor) \
- (m) = (isor) ? (Max(m,r)) : (Min(m,r))
-
-/*
* mcv_update_match_bitmap
* Evaluate clauses using the MCV list, and update the match bitmap.
*
@@ -1694,98 +1620,29 @@ mcv_update_match_bitmap(PlannerInfo *root, List *clauses,
return nmatches;
}
-
+/*
+ * mcv_clauselist_selectivity
+ *		Return the estimated selectivity of the given clauses using the MCV
+ *		list from the given statistics object.
+ */
Selectivity
-mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varRelid,
+mcv_clauselist_selectivity(PlannerInfo *root, StatisticExtInfo *stat,
+ List *clauses, int varRelid,
JoinType jointype, SpecialJoinInfo *sjinfo,
- RelOptInfo *rel, Bitmapset **estimatedclauses)
+ RelOptInfo *rel,
+ bool *fullmatch, Selectivity *lowsel)
{
int i;
- ListCell *l;
- Bitmapset *clauses_attnums = NULL;
- Bitmapset **list_attnums;
- int listidx;
- StatisticExtInfo *stat;
MCVList *mcv;
- List *mcv_clauses;
+ Selectivity s;
/* match/mismatch bitmap for each MCV item */
char *matches = NULL;
- bool fullmatch;
- Selectivity lowsel;
int nmatches = 0;
- Selectivity s;
-
- /* check if there's any stats that might be useful for us. */
- if (!has_stats_of_kind(rel->statlist, STATS_EXT_MCV))
- return 1.0;
-
- list_attnums = (Bitmapset **) palloc(sizeof(Bitmapset *) *
- list_length(clauses));
-
- /*
- * Pre-process the clauses list to extract the attnums seen in each item.
- * We need to determine if there's any clauses which will be useful for
- * dependency selectivity estimations. Along the way we'll record all of
- * the attnums for each clause in a list which we'll reference later so we
- * don't need to repeat the same work again. We'll also keep track of all
- * attnums seen.
- *
- * FIXME Should skip already estimated clauses (using the estimatedclauses
- * bitmap).
- */
- listidx = 0;
- foreach(l, clauses)
- {
- Node *clause = (Node *) lfirst(l);
- Bitmapset *attnums = NULL;
-
- if (mcv_is_compatible_clause(clause, rel->relid, &attnums))
- {
- list_attnums[listidx] = attnums;
- clauses_attnums = bms_add_members(clauses_attnums, attnums);
- }
- else
- list_attnums[listidx] = NULL;
-
- listidx++;
- }
-
- /* We need at least two attributes for MCV lists. */
- if (bms_num_members(clauses_attnums) < 2)
- return 1.0;
-
- /* find the best suited statistics object for these attnums */
- stat = choose_best_statistics(rel->statlist, clauses_attnums,
- STATS_EXT_MCV);
-
- /* if no matching stats could be found then we've nothing to do */
- if (!stat)
- return 1.0;
/* load the MCV list stored in the statistics object */
mcv = statext_mcv_load(stat->statOid);
- /* now filter the clauses to be estimated using the selected MCV */
- mcv_clauses = NIL;
-
- listidx = 0;
- foreach (l, clauses)
- {
- /*
- * If the clause is compatible with the selected MCV statistics,
- * mark it as estimated and add it to the MCV list.
- */
- if ((list_attnums[listidx] != NULL) &&
- (bms_is_subset(list_attnums[listidx], stat->keys)))
- {
- mcv_clauses = lappend(mcv_clauses, (Node *)lfirst(l));
- *estimatedclauses = bms_add_member(*estimatedclauses, listidx);
- }
-
- listidx++;
- }
-
/* by default all the MCV items match the clauses fully */
matches = palloc0(sizeof(char) * mcv->nitems);
memset(matches, STATS_MATCH_FULL, sizeof(char) * mcv->nitems);
@@ -1796,7 +1653,7 @@ mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varRelid,
nmatches = mcv_update_match_bitmap(root, clauses,
stat->keys, mcv,
nmatches, matches,
- &lowsel, &fullmatch, false);
+ lowsel, fullmatch, false);
/* sum frequencies for all the matching MCV items */
for (i = 0; i < mcv->nitems; i++)
diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index 80746da..c7fbbd2 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -1462,6 +1462,7 @@ pg_get_statisticsobj_worker(Oid statextid, bool missing_ok)
bool ndistinct_enabled;
bool dependencies_enabled;
bool mcv_enabled;
+ bool histogram_enabled;
int i;
statexttup = SearchSysCache1(STATEXTOID, ObjectIdGetDatum(statextid));
@@ -1498,6 +1499,7 @@ pg_get_statisticsobj_worker(Oid statextid, bool missing_ok)
ndistinct_enabled = false;
dependencies_enabled = false;
mcv_enabled = false;
+ histogram_enabled = false;
for (i = 0; i < ARR_DIMS(arr)[0]; i++)
{
@@ -1507,6 +1509,8 @@ pg_get_statisticsobj_worker(Oid statextid, bool missing_ok)
dependencies_enabled = true;
if (enabled[i] == STATS_EXT_MCV)
mcv_enabled = true;
+ if (enabled[i] == STATS_EXT_HISTOGRAM)
+ histogram_enabled = true;
}
/*
@@ -1535,7 +1539,13 @@ pg_get_statisticsobj_worker(Oid statextid, bool missing_ok)
}
if (mcv_enabled)
+ {
appendStringInfo(&buf, "%smcv", gotone ? ", " : "");
+ gotone = true;
+ }
+
+ if (histogram_enabled)
+ appendStringInfo(&buf, "%shistogram", gotone ? ", " : "");
appendStringInfoChar(&buf, ')');
}
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index e103f5e..40916ae 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -3747,7 +3747,7 @@ estimate_multivariate_ndistinct(PlannerInfo *root, RelOptInfo *rel,
int nshared;
/* skip statistics of other kinds */
- if (info->kind != STATS_EXT_NDISTINCT)
+ if ((info->kinds & STATS_EXT_INFO_NDISTINCT) == 0)
continue;
/* compute attnums shared by the vars and the statistics object */
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index bedd3db..ed60fb6 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -2383,7 +2383,8 @@ describeOneTableDetails(const char *schemaname,
" a.attnum = s.attnum AND NOT attisdropped)) AS columns,\n"
" (stxkind @> '{d}') AS ndist_enabled,\n"
" (stxkind @> '{f}') AS deps_enabled,\n"
- " (stxkind @> '{m}') AS mcv_enabled\n"
+ " (stxkind @> '{m}') AS mcv_enabled,\n"
+ " (stxkind @> '{h}') AS histogram_enabled\n"
"FROM pg_catalog.pg_statistic_ext stat "
"WHERE stxrelid = '%s'\n"
"ORDER BY 1;",
@@ -2426,6 +2427,12 @@ describeOneTableDetails(const char *schemaname,
if (strcmp(PQgetvalue(result, i, 7), "t") == 0)
{
appendPQExpBuffer(&buf, "%smcv", gotone ? ", " : "");
+ gotone = true;
+ }
+
+ if (strcmp(PQgetvalue(result, i, 8), "t") == 0)
+ {
+ appendPQExpBuffer(&buf, "%shistogram", gotone ? ", " : "");
}
appendPQExpBuffer(&buf, ") ON %s FROM %s",
diff --git a/src/include/catalog/pg_cast.h b/src/include/catalog/pg_cast.h
index 4881134..e63adfe 100644
--- a/src/include/catalog/pg_cast.h
+++ b/src/include/catalog/pg_cast.h
@@ -266,6 +266,9 @@ DATA(insert ( 3402 25 0 i i ));
DATA(insert ( 441 17 0 i b ));
DATA(insert ( 441 25 0 i i ));
+/* pg_histogram can be coerced to, but not from, bytea */
+DATA(insert ( 772 17 0 i b ));
+
/*
* Datetime category
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index d78ad54..dc37133 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -2795,9 +2795,21 @@ DESCR("I/O");
DATA(insert OID = 445 ( pg_mcv_list_send PGNSP PGUID 12 1 0 0 0 f f f f t f s s 1 0 17 "441" _null_ _null_ _null_ _null_ _null_ pg_mcv_list_send _null_ _null_ _null_ ));
DESCR("I/O");
+DATA(insert OID = 779 ( pg_histogram_in PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 772 "2275" _null_ _null_ _null_ _null_ _null_ pg_histogram_in _null_ _null_ _null_ ));
+DESCR("I/O");
+DATA(insert OID = 776 ( pg_histogram_out PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 2275 "772" _null_ _null_ _null_ _null_ _null_ pg_histogram_out _null_ _null_ _null_ ));
+DESCR("I/O");
+DATA(insert OID = 777 ( pg_histogram_recv PGNSP PGUID 12 1 0 0 0 f f f f t f s s 1 0 772 "2281" _null_ _null_ _null_ _null_ _null_ pg_histogram_recv _null_ _null_ _null_ ));
+DESCR("I/O");
+DATA(insert OID = 778 ( pg_histogram_send PGNSP PGUID 12 1 0 0 0 f f f f t f s s 1 0 17 "772" _null_ _null_ _null_ _null_ _null_ pg_histogram_send _null_ _null_ _null_ ));
+DESCR("I/O");
+
DATA(insert OID = 3410 ( pg_mcv_list_items PGNSP PGUID 12 1 1000 0 0 f f f f t t i s 1 0 2249 "26" "{26,23,1009,1000,701}" "{i,o,o,o,o}" "{oid,index,values,nulls,frequency}" _null_ _null_ pg_stats_ext_mcvlist_items _null_ _null_ _null_ ));
DESCR("details about MCV list items");
+DATA(insert OID = 3412 ( pg_histogram_buckets PGNSP PGUID 12 1 1000 0 0 f f f f t t i s 2 0 2249 "26 23" "{26,23,23,1009,1009,1000,1000,1000,701,701,701}" "{i,i,o,o,o,o,o,o,o,o,o}" "{oid,otype,index,minvals,maxvals,nullsonly,mininclusive,maxinclusive,frequency,density,bucket_volume}" _null_ _null_ pg_histogram_buckets _null_ _null_ _null_ ));
+DESCR("details about histogram buckets");
+
DATA(insert OID = 1928 ( pg_stat_get_numscans PGNSP PGUID 12 1 0 0 0 f f f f t f s r 1 0 20 "26" _null_ _null_ _null_ _null_ _null_ pg_stat_get_numscans _null_ _null_ _null_ ));
DESCR("statistics: number of scans done for table/index");
DATA(insert OID = 1929 ( pg_stat_get_tuples_returned PGNSP PGUID 12 1 0 0 0 f f f f t f s r 1 0 20 "26" _null_ _null_ _null_ _null_ _null_ pg_stat_get_tuples_returned _null_ _null_ _null_ ));
diff --git a/src/include/catalog/pg_statistic_ext.h b/src/include/catalog/pg_statistic_ext.h
index 4752525..213512c 100644
--- a/src/include/catalog/pg_statistic_ext.h
+++ b/src/include/catalog/pg_statistic_ext.h
@@ -50,6 +50,7 @@ CATALOG(pg_statistic_ext,3381)
pg_ndistinct stxndistinct; /* ndistinct coefficients (serialized) */
pg_dependencies stxdependencies; /* dependencies (serialized) */
pg_mcv_list stxmcv; /* MCV (serialized) */
+ pg_histogram stxhistogram; /* MV histogram (serialized) */
#endif
} FormData_pg_statistic_ext;
@@ -65,7 +66,7 @@ typedef FormData_pg_statistic_ext *Form_pg_statistic_ext;
* compiler constants for pg_statistic_ext
* ----------------
*/
-#define Natts_pg_statistic_ext 9
+#define Natts_pg_statistic_ext 10
#define Anum_pg_statistic_ext_stxrelid 1
#define Anum_pg_statistic_ext_stxname 2
#define Anum_pg_statistic_ext_stxnamespace 3
@@ -75,9 +76,11 @@ typedef FormData_pg_statistic_ext *Form_pg_statistic_ext;
#define Anum_pg_statistic_ext_stxndistinct 7
#define Anum_pg_statistic_ext_stxdependencies 8
#define Anum_pg_statistic_ext_stxmcv 9
+#define Anum_pg_statistic_ext_stxhistogram 10
#define STATS_EXT_NDISTINCT 'd'
#define STATS_EXT_DEPENDENCIES 'f'
#define STATS_EXT_MCV 'm'
+#define STATS_EXT_HISTOGRAM 'h'
#endif /* PG_STATISTIC_EXT_H */
diff --git a/src/include/catalog/pg_type.h b/src/include/catalog/pg_type.h
index b5fcc3d..edb21a6 100644
--- a/src/include/catalog/pg_type.h
+++ b/src/include/catalog/pg_type.h
@@ -376,6 +376,10 @@ DATA(insert OID = 441 ( pg_mcv_list PGNSP PGUID -1 f b S f t \054 0 0 0 pg_mcv_
DESCR("multivariate MCV list");
#define PGMCVLISTOID 441
+DATA(insert OID = 772 ( pg_histogram PGNSP PGUID -1 f b S f t \054 0 0 0 pg_histogram_in pg_histogram_out pg_histogram_recv pg_histogram_send - - - i x f 0 -1 0 100 _null_ _null_ _null_ ));
+DESCR("multivariate histogram");
+#define PGHISTOGRAMOID 772
+
DATA(insert OID = 32 ( pg_ddl_command PGNSP PGUID SIZEOF_POINTER t p P f t \054 0 0 0 pg_ddl_command_in pg_ddl_command_out pg_ddl_command_recv pg_ddl_command_send - - - ALIGNOF_POINTER p f 0 -1 0 0 _null_ _null_ _null_ ));
DESCR("internal type for passing CollectedCommand");
#define PGDDLCOMMANDOID 32
diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h
index 9bae3c6..cb3ab7c 100644
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -721,10 +721,15 @@ typedef struct StatisticExtInfo
Oid statOid; /* OID of the statistics row */
RelOptInfo *rel; /* back-link to statistic's table */
- char kind; /* statistic kind of this entry */
+ int kinds; /* statistic kinds of this entry */
Bitmapset *keys; /* attnums of the columns covered */
} StatisticExtInfo;
+#define STATS_EXT_INFO_NDISTINCT 1
+#define STATS_EXT_INFO_DEPENDENCIES 2
+#define STATS_EXT_INFO_MCV 4
+#define STATS_EXT_INFO_HISTOGRAM 8
+
/*
* EquivalenceClasses
*
diff --git a/src/include/statistics/extended_stats_internal.h b/src/include/statistics/extended_stats_internal.h
index 7a04863..dbd5886 100644
--- a/src/include/statistics/extended_stats_internal.h
+++ b/src/include/statistics/extended_stats_internal.h
@@ -68,10 +68,18 @@ extern bytea *statext_dependencies_serialize(MVDependencies *dependencies);
extern MVDependencies *statext_dependencies_deserialize(bytea *data);
extern MCVList *statext_mcv_build(int numrows, HeapTuple *rows,
- Bitmapset *attrs, VacAttrStats **stats);
+ Bitmapset *attrs, VacAttrStats **stats,
+ HeapTuple **rows_filtered, int *numrows_filtered);
extern bytea *statext_mcv_serialize(MCVList *mcv, VacAttrStats **stats);
extern MCVList *statext_mcv_deserialize(bytea *data);
+extern MVHistogram *statext_histogram_build(int numrows, HeapTuple *rows,
+ Bitmapset *attrs, VacAttrStats **stats,
+ int numrows_total);
+extern bytea *statext_histogram_serialize(MVHistogram *histogram,
+ VacAttrStats **stats);
+extern MVSerializedHistogram *statext_histogram_deserialize(bytea *data);
+
extern MultiSortSupport multi_sort_init(int ndims);
extern void multi_sort_add_dimension(MultiSortSupport mss, int sortdim,
Oid oper);
@@ -82,6 +90,7 @@ extern int multi_sort_compare_dims(int start, int end, const SortItem *a,
const SortItem *b, MultiSortSupport mss);
extern int compare_scalars_simple(const void *a, const void *b, void *arg);
extern int compare_datums_simple(Datum a, Datum b, SortSupport ssup);
+extern int compare_scalars_partition(const void *a, const void *b, void *arg);
extern void *bsearch_arg(const void *key, const void *base,
size_t nmemb, size_t size,
@@ -98,4 +107,24 @@ extern int2vector *find_ext_attnums(Oid mvoid, Oid *relid);
extern int bms_member_index(Bitmapset *keys, AttrNumber varattno);
+extern Selectivity mcv_clauselist_selectivity(PlannerInfo *root,
+ StatisticExtInfo *stat,
+ List *clauses,
+ int varRelid,
+ JoinType jointype,
+ SpecialJoinInfo *sjinfo,
+ RelOptInfo *rel,
+						   bool *fullmatch,
+ Selectivity *lowsel);
+extern Selectivity histogram_clauselist_selectivity(PlannerInfo *root,
+ StatisticExtInfo *stat,
+ List *clauses,
+ int varRelid,
+ JoinType jointype,
+ SpecialJoinInfo *sjinfo,
+ RelOptInfo *rel);
+
+#define UPDATE_RESULT(m,r,isor) \
+ (m) = (isor) ? (Max(m,r)) : (Min(m,r))
+
#endif /* EXTENDED_STATS_INTERNAL_H */
diff --git a/src/include/statistics/statistics.h b/src/include/statistics/statistics.h
index 7b94dde..90774a1 100644
--- a/src/include/statistics/statistics.h
+++ b/src/include/statistics/statistics.h
@@ -117,9 +117,100 @@ typedef struct MCVList
MCVItem **items; /* array of MCV items */
} MCVList;
+
+/* used to flag stats serialized to bytea */
+#define STATS_HIST_MAGIC 0x7F8C5670 /* marks serialized bytea */
+#define STATS_HIST_TYPE_BASIC 1 /* basic histogram type */
+
+/* max buckets in a histogram (mostly arbitrary number) */
+#define STATS_HIST_MAX_BUCKETS 16384
+
+/*
+ * Multivariate histograms
+ */
+typedef struct MVBucket
+{
+	/* Frequency of this bucket. */
+ float frequency;
+
+ /*
+ * Information about dimensions being NULL-only. Not yet used.
+ */
+ bool *nullsonly;
+
+ /* lower boundaries - values and information about the inequalities */
+ Datum *min;
+ bool *min_inclusive;
+
+ /* upper boundaries - values and information about the inequalities */
+ Datum *max;
+ bool *max_inclusive;
+
+ /* used when building the histogram (not serialized/deserialized) */
+ void *build_data;
+} MVBucket;
+
+typedef struct MVHistogram
+{
+ uint32 magic; /* magic constant marker */
+ uint32 type; /* type of histogram (BASIC) */
+ uint32 nbuckets; /* number of buckets (buckets array) */
+ uint32 ndimensions; /* number of dimensions */
+
+ MVBucket **buckets; /* array of buckets */
+} MVHistogram;
+
+/*
+ * Histogram in a partially serialized form, with deduplicated boundary
+ * values etc.
+ */
+typedef struct MVSerializedBucket
+{
+	/* Frequency of this bucket. */
+ float frequency;
+
+ /*
+ * Information about dimensions being NULL-only. Not yet used.
+ */
+ bool *nullsonly;
+
+ /* lower boundaries - values and information about the inequalities */
+ uint16 *min;
+ bool *min_inclusive;
+
+ /*
+ * indexes of upper boundaries - values and information about the
+ * inequalities (exclusive vs. inclusive)
+ */
+ uint16 *max;
+ bool *max_inclusive;
+} MVSerializedBucket;
+
+typedef struct MVSerializedHistogram
+{
+ uint32 magic; /* magic constant marker */
+ uint32 type; /* type of histogram (BASIC) */
+ uint32 nbuckets; /* number of buckets (buckets array) */
+ uint32 ndimensions; /* number of dimensions */
+
+ /*
+	 * Keep this the same as in MVHistogram, because deserialization relies
+	 * on the buckets array being at the same offset.
+ */
+ MVSerializedBucket **buckets; /* array of buckets */
+
+ /*
+ * serialized boundary values, one array per dimension, deduplicated (the
+ * min/max indexes point into these arrays)
+ */
+ int *nvalues;
+ Datum **values;
+} MVSerializedHistogram;
+
extern MVNDistinct *statext_ndistinct_load(Oid mvoid);
extern MVDependencies *statext_dependencies_load(Oid mvoid);
extern MCVList *statext_mcv_load(Oid mvoid);
+extern MVSerializedHistogram *statext_histogram_load(Oid mvoid);
extern void BuildRelationExtStatistics(Relation onerel, double totalrows,
int numrows, HeapTuple *rows,
@@ -132,15 +223,15 @@ extern Selectivity dependencies_clauselist_selectivity(PlannerInfo *root,
SpecialJoinInfo *sjinfo,
RelOptInfo *rel,
Bitmapset **estimatedclauses);
-extern Selectivity mcv_clauselist_selectivity(PlannerInfo *root,
+extern Selectivity statext_clauselist_selectivity(PlannerInfo *root,
List *clauses,
int varRelid,
JoinType jointype,
SpecialJoinInfo *sjinfo,
RelOptInfo *rel,
Bitmapset **estimatedclauses);
-extern bool has_stats_of_kind(List *stats, char requiredkind);
+extern bool has_stats_of_kind(List *stats, int requiredkinds);
extern StatisticExtInfo *choose_best_statistics(List *stats,
- Bitmapset *attnums, char requiredkind);
+ Bitmapset *attnums, int requiredkinds);
#endif /* STATISTICS_H */
diff --git a/src/test/regress/expected/opr_sanity.out b/src/test/regress/expected/opr_sanity.out
index bdc0889..c2884e3 100644
--- a/src/test/regress/expected/opr_sanity.out
+++ b/src/test/regress/expected/opr_sanity.out
@@ -860,11 +860,12 @@ WHERE c.castmethod = 'b' AND
pg_ndistinct | bytea | 0 | i
pg_dependencies | bytea | 0 | i
pg_mcv_list | bytea | 0 | i
+ pg_histogram | bytea | 0 | i
cidr | inet | 0 | i
xml | text | 0 | a
xml | character varying | 0 | a
xml | character | 0 | a
-(10 rows)
+(11 rows)
-- **************** pg_conversion ****************
-- Look for illegal values in pg_conversion fields.
diff --git a/src/test/regress/expected/stats_ext.out b/src/test/regress/expected/stats_ext.out
index 85009d2..549cccf 100644
--- a/src/test/regress/expected/stats_ext.out
+++ b/src/test/regress/expected/stats_ext.out
@@ -58,7 +58,7 @@ ALTER TABLE ab1 DROP COLUMN a;
b | integer | | |
c | integer | | |
Statistics objects:
- "public"."ab1_b_c_stats" (ndistinct, dependencies, mcv) ON b, c FROM ab1
+ "public"."ab1_b_c_stats" (ndistinct, dependencies, mcv, histogram) ON b, c FROM ab1
-- Ensure statistics are dropped when table is
SELECT stxname FROM pg_statistic_ext WHERE stxname LIKE 'ab1%';
@@ -204,9 +204,9 @@ CREATE STATISTICS s10 ON a, b, c FROM ndistinct;
ANALYZE ndistinct;
SELECT stxkind, stxndistinct
FROM pg_statistic_ext WHERE stxrelid = 'ndistinct'::regclass;
- stxkind | stxndistinct
----------+---------------------------------------------------------
- {d,f,m} | {"3, 4": 301, "3, 6": 301, "4, 6": 301, "3, 4, 6": 301}
+ stxkind | stxndistinct
+-----------+---------------------------------------------------------
+ {d,f,m,h} | {"3, 4": 301, "3, 6": 301, "4, 6": 301, "3, 4, 6": 301}
(1 row)
-- Hash Aggregate, thanks to estimates improved by the statistic
@@ -270,9 +270,9 @@ INSERT INTO ndistinct (a, b, c, filler1)
ANALYZE ndistinct;
SELECT stxkind, stxndistinct
FROM pg_statistic_ext WHERE stxrelid = 'ndistinct'::regclass;
- stxkind | stxndistinct
----------+-------------------------------------------------------------
- {d,f,m} | {"3, 4": 2550, "3, 6": 800, "4, 6": 1632, "3, 4, 6": 10000}
+ stxkind | stxndistinct
+-----------+-------------------------------------------------------------
+ {d,f,m,h} | {"3, 4": 2550, "3, 6": 800, "4, 6": 1632, "3, 4, 6": 10000}
(1 row)
-- plans using Group Aggregate, thanks to using correct esimates
@@ -722,3 +722,181 @@ EXPLAIN (COSTS OFF)
(5 rows)
RESET random_page_cost;
+-- histograms
+CREATE TABLE histograms (
+ filler1 TEXT,
+ filler2 NUMERIC,
+ a INT,
+ b TEXT,
+ filler3 DATE,
+ c INT,
+ d TEXT
+);
+SET random_page_cost = 1.2;
+CREATE INDEX histograms_ab_idx ON histograms (a, b);
+CREATE INDEX histograms_abc_idx ON histograms (a, b, c);
+-- random data (we still build a histogram, but as the columns are not
+-- correlated, the estimates remain about the same)
+INSERT INTO histograms (a, b, c, filler1)
+ SELECT mod(i,37), mod(i,41), mod(i,43), mod(i,47) FROM generate_series(1,5000) s(i);
+ANALYZE histograms;
+EXPLAIN (COSTS OFF)
+ SELECT * FROM histograms WHERE a < 5 AND b < '5';
+ QUERY PLAN
+---------------------------------------------------
+ Bitmap Heap Scan on histograms
+ Recheck Cond: ((a < 5) AND (b < '5'::text))
+ -> Bitmap Index Scan on histograms_abc_idx
+ Index Cond: ((a < 5) AND (b < '5'::text))
+(4 rows)
+
+EXPLAIN (COSTS OFF)
+ SELECT * FROM histograms WHERE a < 5 AND b < '5' AND c < 5;
+ QUERY PLAN
+---------------------------------------------------------------
+ Bitmap Heap Scan on histograms
+ Recheck Cond: ((a < 5) AND (b < '5'::text) AND (c < 5))
+ -> Bitmap Index Scan on histograms_abc_idx
+ Index Cond: ((a < 5) AND (b < '5'::text) AND (c < 5))
+(4 rows)
+
+-- create statistics
+CREATE STATISTICS histograms_stats (histogram) ON a, b, c FROM histograms;
+ANALYZE histograms;
+EXPLAIN (COSTS OFF)
+ SELECT * FROM histograms WHERE a < 5 AND b < '5';
+ QUERY PLAN
+---------------------------------------------------
+ Bitmap Heap Scan on histograms
+ Recheck Cond: ((a < 5) AND (b < '5'::text))
+ -> Bitmap Index Scan on histograms_abc_idx
+ Index Cond: ((a < 5) AND (b < '5'::text))
+(4 rows)
+
+EXPLAIN (COSTS OFF)
+ SELECT * FROM histograms WHERE a < 5 AND b < '5' AND c < 5;
+ QUERY PLAN
+---------------------------------------------------------------
+ Bitmap Heap Scan on histograms
+ Recheck Cond: ((a < 5) AND (b < '5'::text) AND (c < 5))
+ -> Bitmap Index Scan on histograms_abc_idx
+ Index Cond: ((a < 5) AND (b < '5'::text) AND (c < 5))
+(4 rows)
+
+-- values correlated along the diagonal
+TRUNCATE histograms;
+DROP STATISTICS histograms_stats;
+INSERT INTO histograms (a, b, c, filler1)
+ SELECT mod(i,100), mod(i,100) + mod(i,7), mod(i,100) + mod(i,11), i FROM generate_series(1,5000) s(i);
+ANALYZE histograms;
+EXPLAIN (COSTS OFF)
+ SELECT * FROM histograms WHERE a < 3 AND c < 3;
+ QUERY PLAN
+---------------------------------------------------
+ Index Scan using histograms_abc_idx on histograms
+ Index Cond: ((a < 3) AND (c < 3))
+(2 rows)
+
+EXPLAIN (COSTS OFF)
+ SELECT * FROM histograms WHERE a < 3 AND b > '2' AND c < 3;
+ QUERY PLAN
+---------------------------------------------------------
+ Index Scan using histograms_abc_idx on histograms
+ Index Cond: ((a < 3) AND (b > '2'::text) AND (c < 3))
+(2 rows)
+
+-- create statistics
+CREATE STATISTICS histograms_stats (histogram) ON a, b, c FROM histograms;
+ANALYZE histograms;
+EXPLAIN (COSTS OFF)
+ SELECT * FROM histograms WHERE a < 3 AND c < 3;
+ QUERY PLAN
+-----------------------------------------------
+ Bitmap Heap Scan on histograms
+ Recheck Cond: ((a < 3) AND (c < 3))
+ -> Bitmap Index Scan on histograms_abc_idx
+ Index Cond: ((a < 3) AND (c < 3))
+(4 rows)
+
+EXPLAIN (COSTS OFF)
+ SELECT * FROM histograms WHERE a < 3 AND b > '2' AND c < 3;
+ QUERY PLAN
+---------------------------------------------------------------
+ Bitmap Heap Scan on histograms
+ Recheck Cond: ((a < 3) AND (b > '2'::text) AND (c < 3))
+ -> Bitmap Index Scan on histograms_abc_idx
+ Index Cond: ((a < 3) AND (b > '2'::text) AND (c < 3))
+(4 rows)
+
+-- almost 5000 unique combinations with NULL values
+TRUNCATE histograms;
+DROP STATISTICS histograms_stats;
+INSERT INTO histograms (a, b, c, filler1)
+ SELECT
+ (CASE WHEN mod(i,100) = 0 THEN NULL ELSE mod(i,100) END),
+ (CASE WHEN mod(i,100) <= 1 THEN NULL ELSE mod(i,100) + mod(i,7) END),
+ (CASE WHEN mod(i,100) <= 2 THEN NULL ELSE mod(i,100) + mod(i,11) END),
+ i
+ FROM generate_series(1,5000) s(i);
+ANALYZE histograms;
+EXPLAIN (COSTS OFF)
+ SELECT * FROM histograms WHERE a IS NULL AND b IS NULL;
+ QUERY PLAN
+---------------------------------------------------
+ Index Scan using histograms_abc_idx on histograms
+ Index Cond: ((a IS NULL) AND (b IS NULL))
+(2 rows)
+
+EXPLAIN (COSTS OFF)
+ SELECT * FROM histograms WHERE a IS NULL AND b IS NULL AND c IS NULL;
+ QUERY PLAN
+-------------------------------------------------------------
+ Index Scan using histograms_abc_idx on histograms
+ Index Cond: ((a IS NULL) AND (b IS NULL) AND (c IS NULL))
+(2 rows)
+
+-- create statistics
+CREATE STATISTICS histograms_stats (histogram) ON a, b, c FROM histograms;
+ANALYZE histograms;
+EXPLAIN (COSTS OFF)
+ SELECT * FROM histograms WHERE a IS NULL AND b IS NULL;
+ QUERY PLAN
+---------------------------------------------------
+ Bitmap Heap Scan on histograms
+ Recheck Cond: ((a IS NULL) AND (b IS NULL))
+ -> Bitmap Index Scan on histograms_abc_idx
+ Index Cond: ((a IS NULL) AND (b IS NULL))
+(4 rows)
+
+EXPLAIN (COSTS OFF)
+ SELECT * FROM histograms WHERE a IS NULL AND b IS NULL AND c IS NULL;
+ QUERY PLAN
+-------------------------------------------------------------------
+ Bitmap Heap Scan on histograms
+ Recheck Cond: ((a IS NULL) AND (b IS NULL) AND (c IS NULL))
+ -> Bitmap Index Scan on histograms_abc_idx
+ Index Cond: ((a IS NULL) AND (b IS NULL) AND (c IS NULL))
+(4 rows)
+
+-- check change of column type resets the histogram statistics
+ALTER TABLE histograms ALTER COLUMN c TYPE numeric;
+EXPLAIN (COSTS OFF)
+ SELECT * FROM histograms WHERE a IS NULL AND b IS NULL;
+ QUERY PLAN
+---------------------------------------------------
+ Index Scan using histograms_abc_idx on histograms
+ Index Cond: ((a IS NULL) AND (b IS NULL))
+(2 rows)
+
+ANALYZE histograms;
+EXPLAIN (COSTS OFF)
+ SELECT * FROM histograms WHERE a IS NULL AND b IS NULL;
+ QUERY PLAN
+---------------------------------------------------
+ Bitmap Heap Scan on histograms
+ Recheck Cond: ((a IS NULL) AND (b IS NULL))
+ -> Bitmap Index Scan on histograms_abc_idx
+ Index Cond: ((a IS NULL) AND (b IS NULL))
+(4 rows)
+
+RESET random_page_cost;
diff --git a/src/test/regress/expected/type_sanity.out b/src/test/regress/expected/type_sanity.out
index 5a7c570..c7b9a64 100644
--- a/src/test/regress/expected/type_sanity.out
+++ b/src/test/regress/expected/type_sanity.out
@@ -73,8 +73,9 @@ WHERE p1.typtype not in ('c','d','p') AND p1.typname NOT LIKE E'\\_%'
3361 | pg_ndistinct
3402 | pg_dependencies
441 | pg_mcv_list
+ 772 | pg_histogram
210 | smgr
-(5 rows)
+(6 rows)
-- Make sure typarray points to a varlena array type of our own base
SELECT p1.oid, p1.typname as basetype, p2.typname as arraytype,
diff --git a/src/test/regress/sql/stats_ext.sql b/src/test/regress/sql/stats_ext.sql
index e9902ce..2a03878 100644
--- a/src/test/regress/sql/stats_ext.sql
+++ b/src/test/regress/sql/stats_ext.sql
@@ -403,3 +403,113 @@ EXPLAIN (COSTS OFF)
SELECT * FROM mcv_lists WHERE a IS NULL AND b IS NULL AND c IS NULL;
RESET random_page_cost;
+
+-- histograms
+CREATE TABLE histograms (
+ filler1 TEXT,
+ filler2 NUMERIC,
+ a INT,
+ b TEXT,
+ filler3 DATE,
+ c INT,
+ d TEXT
+);
+
+SET random_page_cost = 1.2;
+
+CREATE INDEX histograms_ab_idx ON histograms (a, b);
+CREATE INDEX histograms_abc_idx ON histograms (a, b, c);
+
+-- random data (we still build a histogram, but as the columns are not
+-- correlated, the estimates remain about the same)
+INSERT INTO histograms (a, b, c, filler1)
+ SELECT mod(i,37), mod(i,41), mod(i,43), mod(i,47) FROM generate_series(1,5000) s(i);
+
+ANALYZE histograms;
+
+EXPLAIN (COSTS OFF)
+ SELECT * FROM histograms WHERE a < 5 AND b < '5';
+
+EXPLAIN (COSTS OFF)
+ SELECT * FROM histograms WHERE a < 5 AND b < '5' AND c < 5;
+
+-- create statistics
+CREATE STATISTICS histograms_stats (histogram) ON a, b, c FROM histograms;
+
+ANALYZE histograms;
+
+EXPLAIN (COSTS OFF)
+ SELECT * FROM histograms WHERE a < 5 AND b < '5';
+
+EXPLAIN (COSTS OFF)
+ SELECT * FROM histograms WHERE a < 5 AND b < '5' AND c < 5;
+
+-- values correlated along the diagonal
+TRUNCATE histograms;
+DROP STATISTICS histograms_stats;
+
+INSERT INTO histograms (a, b, c, filler1)
+ SELECT mod(i,100), mod(i,100) + mod(i,7), mod(i,100) + mod(i,11), i FROM generate_series(1,5000) s(i);
+
+ANALYZE histograms;
+
+EXPLAIN (COSTS OFF)
+ SELECT * FROM histograms WHERE a < 3 AND c < 3;
+
+EXPLAIN (COSTS OFF)
+ SELECT * FROM histograms WHERE a < 3 AND b > '2' AND c < 3;
+
+-- create statistics
+CREATE STATISTICS histograms_stats (histogram) ON a, b, c FROM histograms;
+
+ANALYZE histograms;
+
+EXPLAIN (COSTS OFF)
+ SELECT * FROM histograms WHERE a < 3 AND c < 3;
+
+EXPLAIN (COSTS OFF)
+ SELECT * FROM histograms WHERE a < 3 AND b > '2' AND c < 3;
+
+-- almost 5000 unique combinations with NULL values
+TRUNCATE histograms;
+DROP STATISTICS histograms_stats;
+
+INSERT INTO histograms (a, b, c, filler1)
+ SELECT
+ (CASE WHEN mod(i,100) = 0 THEN NULL ELSE mod(i,100) END),
+ (CASE WHEN mod(i,100) <= 1 THEN NULL ELSE mod(i,100) + mod(i,7) END),
+ (CASE WHEN mod(i,100) <= 2 THEN NULL ELSE mod(i,100) + mod(i,11) END),
+ i
+ FROM generate_series(1,5000) s(i);
+
+ANALYZE histograms;
+
+EXPLAIN (COSTS OFF)
+ SELECT * FROM histograms WHERE a IS NULL AND b IS NULL;
+
+EXPLAIN (COSTS OFF)
+ SELECT * FROM histograms WHERE a IS NULL AND b IS NULL AND c IS NULL;
+
+-- create statistics
+CREATE STATISTICS histograms_stats (histogram) ON a, b, c FROM histograms;
+
+ANALYZE histograms;
+
+EXPLAIN (COSTS OFF)
+ SELECT * FROM histograms WHERE a IS NULL AND b IS NULL;
+
+EXPLAIN (COSTS OFF)
+ SELECT * FROM histograms WHERE a IS NULL AND b IS NULL AND c IS NULL;
+
+-- check change of column type resets the histogram statistics
+ALTER TABLE histograms ALTER COLUMN c TYPE numeric;
+
+EXPLAIN (COSTS OFF)
+ SELECT * FROM histograms WHERE a IS NULL AND b IS NULL;
+
+ANALYZE histograms;
+
+EXPLAIN (COSTS OFF)
+ SELECT * FROM histograms WHERE a IS NULL AND b IS NULL;
+
+RESET random_page_cost;
--
2.9.4
On 08/14/2017 12:48 AM, Tomas Vondra wrote:
Hello,
There is no check of "statistics type/kind" in pg_stats_ext_mcvlist_items and
pg_histogram_buckets.
select stxname,stxkind from pg_statistic_ext ;
stxname | stxkind
-----------+---------
stts3 | {h}
stts2 | {m}
So you can call:
SELECT * FROM pg_mcv_list_items((SELECT oid FROM pg_statistic_ext WHERE stxname
= 'stts3'));
SELECT * FROM pg_histogram_buckets((SELECT oid FROM pg_statistic_ext WHERE
stxname = 'stts2'), 0);
Both crash.
Unfortunately, I don't have the knowledge to produce a patch :/
Small fix in documentation, patch attached.
Thanks!
--
Adrien NAYRAT
Attachments:
doc.patchtext/x-patch; name=doc.patchDownload
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 3a86577b0a..a4ab48cc81 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -6445,7 +6445,9 @@ SCRAM-SHA-256$<replaceable><iteration count></>:<replaceable><salt><
An array containing codes for the enabled statistic types;
valid values are:
<literal>d</literal> for n-distinct statistics,
- <literal>f</literal> for functional dependency statistics
+ <literal>f</literal> for functional dependency statistics,
+ <literal>m</literal> for mcv statistics,
+ <literal>h</literal> for histogram statistics
</entry>
</row>
diff --git a/doc/src/sgml/planstats.sgml b/doc/src/sgml/planstats.sgml
index 8857fc7542..9faa7ee393 100644
--- a/doc/src/sgml/planstats.sgml
+++ b/doc/src/sgml/planstats.sgml
@@ -653,7 +653,7 @@ Statistics objects:
<function>pg_mcv_list_items</> set-returning function.
<programlisting>
-SELECT * FROM pg_mcv_list_items((SELECT oid FROM pg_statistic_ext WHERE staname = 'stts2'));
+SELECT * FROM pg_mcv_list_items((SELECT oid FROM pg_statistic_ext WHERE stxname = 'stts2'));
index | values | nulls | frequency
-------+---------+-------+-----------
0 | {0,0} | {f,f} | 0.01
@@ -783,7 +783,7 @@ EXPLAIN ANALYZE SELECT * FROM t WHERE a = 1 AND b = 1;
using a function called <function>pg_histogram_buckets</>.
<programlisting>
-test=# SELECT * FROM pg_histogram_buckets((SELECT oid FROM pg_statistic_ext WHERE staname = 'stts3'), 0);
+test=# SELECT * FROM pg_histogram_buckets((SELECT oid FROM pg_statistic_ext WHERE stxname = 'stts3'), 0);
index | minvals | maxvals | nullsonly | mininclusive | maxinclusive | frequency | density | bucket_volume
-------+---------+---------+-----------+--------------+--------------+-----------+----------+---------------
0 | {0,0} | {3,1} | {f,f} | {t,t} | {f,f} | 0.01 | 1.68 | 0.005952
On 08/17/2017 12:06 PM, Adrien Nayrat wrote:
Hello,
There is no check of "statistics type/kind" in
pg_stats_ext_mcvlist_items and pg_histogram_buckets.

select stxname,stxkind from pg_statistic_ext ;
 stxname | stxkind
-----------+---------
 stts3 | {h}
 stts2 | {m}

So you can call:

SELECT * FROM pg_mcv_list_items((SELECT oid FROM pg_statistic_ext
WHERE stxname = 'stts3'));

SELECT * FROM pg_histogram_buckets((SELECT oid FROM pg_statistic_ext
WHERE stxname = 'stts2'), 0);

Both crash.
Thanks for the report, this is clearly a bug. I don't think we need to
test the stxkind, though - rather, there is a missing check that the
requested statistics type was actually built.
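Something like this, perhaps (an untested sketch written from memory, not
the actual fix - and pg_mcv_list_items would need the same treatment,
checking stxmcv instead):

static void
check_histogram_built(Oid mvoid)
{
	bool		isnull;
	HeapTuple	htup = SearchSysCache1(STATEXTOID, ObjectIdGetDatum(mvoid));

	if (!HeapTupleIsValid(htup))
		elog(ERROR, "cache lookup failed for statistics object %u", mvoid);

	/* if the histogram was not built, the attribute is NULL */
	(void) SysCacheGetAttr(STATEXTOID, htup,
						   Anum_pg_statistic_ext_stxhistogram, &isnull);
	ReleaseSysCache(htup);

	if (isnull)
		ereport(ERROR,
				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
				 errmsg("statistics object %u has no histogram built",
						mvoid)));
}

Calling that at the beginning of pg_histogram_buckets should turn the
crash into a regular error.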
Unfortunately, I don't have the knowledge to produce a patch :/
Small fix in documentation, patch attached.
Thanks, will fix.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi,
Attached is an updated version of the patch, fixing the issues reported
by Adrien Nayrat, and also a bunch of issues pointed out by valgrind.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attached is an updated version of the patch, dealing with fallout of
821fb8cdbf700a8aadbe12d5b46ca4e61be5a8a8 which touched the SGML
documentation for CREATE STATISTICS.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Sep 12, 2017, at 2:06 PM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
Attached is an updated version of the patch, dealing with fallout of
821fb8cdbf700a8aadbe12d5b46ca4e61be5a8a8 which touched the SGML
documentation for CREATE STATISTICS.
Your patches need updating.
Tom's commit 471d55859c11b40059aef7dd82f82b3a0dc338b1 changed
src/bin/psql/describe.c, which breaks your 0001-multivariate-MCV-lists.patch.gz
file.
I reviewed the patch a few months ago, and as I recall, it looked good to me.
I should review it again before approving it, though.
mark
Hi,
Attached is an updated version of the patch, adopting the psql describe
changes introduced by 471d55859c11b.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Nov 18, 2017, at 12:28 PM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
Thanks, Tomas, again for your work on this feature.
Applying just the 0001-multivariate-MCV-lists.patch to the current master, and
then extending the stats_ext.sql test as follows, I am able to trigger an error,
"ERROR: operator 4294934272 is not a valid ordering operator".
diff --git a/src/test/regress/sql/stats_ext.sql b/src/test/regress/sql/stats_ext.sql
index e9902ced5c..5083dc05e6 100644
--- a/src/test/regress/sql/stats_ext.sql
+++ b/src/test/regress/sql/stats_ext.sql
@@ -402,4 +402,22 @@ EXPLAIN (COSTS OFF)
EXPLAIN (COSTS OFF)
SELECT * FROM mcv_lists WHERE a IS NULL AND b IS NULL AND c IS NULL;
-RESET random_page_cost;
+DROP TABLE mcv_lists;
+
+CREATE TABLE mcv_lists (
+ a NUMERIC[],
+ b NUMERIC[]
+);
+CREATE STATISTICS mcv_lists_stats (mcv) ON a, b FROM mcv_lists;
+INSERT INTO mcv_lists (a, b)
+ (SELECT array_agg(gs::numeric) AS a, array_agg(gs::numeric) AS b
+ FROM generate_series(1,1000) gs
+ );
+ANALYZE mcv_lists;
+INSERT INTO mcv_lists (a, b)
+ (SELECT array_agg(gs::numeric) AS a, array_agg(gs::numeric) AS b
+ FROM generate_series(1,1000) gs
+ );
+ANALYZE mcv_lists;
+
+DROP TABLE mcv_lists;
Which gives me the following regression.diffs:
*** /Users/mark/master/postgresql/src/test/regress/expected/stats_ext.out 2017-11-25 08:06:37.000000000 -0800
--- /Users/mark/master/postgresql/src/test/regress/results/stats_ext.out 2017-11-25 08:10:18.000000000 -0800
***************
*** 721,724 ****
Index Cond: ((a IS NULL) AND (b IS NULL))
(5 rows)
! RESET random_page_cost;
--- 721,741 ----
Index Cond: ((a IS NULL) AND (b IS NULL))
(5 rows)
! DROP TABLE mcv_lists;
! CREATE TABLE mcv_lists (
! a NUMERIC[],
! b NUMERIC[]
! );
! CREATE STATISTICS mcv_lists_stats (mcv) ON a, b FROM mcv_lists;
! INSERT INTO mcv_lists (a, b)
! (SELECT array_agg(gs::numeric) AS a, array_agg(gs::numeric) AS b
! FROM generate_series(1,1000) gs
! );
! ANALYZE mcv_lists;
! INSERT INTO mcv_lists (a, b)
! (SELECT array_agg(gs::numeric) AS a, array_agg(gs::numeric) AS b
! FROM generate_series(1,1000) gs
! );
! ANALYZE mcv_lists;
! ERROR: operator 4294934272 is not a valid ordering operator
! DROP TABLE mcv_lists;
======================================================================
On Nov 18, 2017, at 12:28 PM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
Hi,
Attached is an updated version of the patch, adopting the psql describe
changes introduced by 471d55859c11b.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
<0001-multivariate-MCV-lists.patch.gz><0002-multivariate-histograms.patch.gz>
Hello Tomas,
In 0002-multivariate-histograms.patch, src/include/nodes/relation.h,
struct StatisticExtInfo, you change:
- char kind; /* statistic kind of this entry */
+ int kinds; /* statistic kinds of this entry */
to have 'kinds' apparently be a bitmask, based on reading how you use
this in the code. The #defines just below the struct give the four bits
to be used,
#define STATS_EXT_INFO_NDISTINCT 1
#define STATS_EXT_INFO_DEPENDENCIES 2
#define STATS_EXT_INFO_MCV 4
#define STATS_EXT_INFO_HISTOGRAM 8
except that nothing in the file indicates that this is so. Perhaps a comment
could be added here mentioning that 'kinds' is a bitmask, and that these
#defines are related?
mark
Hi,
On 11/25/2017 05:15 PM, Mark Dilger wrote:
On Nov 18, 2017, at 12:28 PM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
Hi,
Attached is an updated version of the patch, adopting the psql describe
changes introduced by 471d55859c11b.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
<0001-multivariate-MCV-lists.patch.gz><0002-multivariate-histograms.patch.gz>
Thanks, Tomas, again for your work on this feature.
Applying just the 0001-multivariate-MCV-lists.patch to the current master, and
then extending the stats_ext.sql test as follows, I am able to trigger an error,
"ERROR: operator 4294934272 is not a valid ordering operator".
Ah, that's a silly bug ...
The code assumes that VacAttrStats->extra_data is always StdAnalyzeData,
and attempts to extract the ltopr from that. But for arrays that's of
course not true (array_typanalyze uses ArrayAnalyzeExtraData instead).
The reason why this only fails after the second INSERT is that we need
at least two occurrences of a value before considering it eligible for
the MCV list. So after the first INSERT we don't even call the serialization code.
Attached is a fix that should resolve this in MCV lists by looking up
the operator using lookup_type_cache() when serializing the MCV.
FWIW histograms have the same issue, but on more places (not just in
serialize, but also when building the histogram).
I'll send a properly updated patch series shortly, with tests checking
correct behavior with arrays.
Thanks for the report.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachments:
0001-MCV-fix.patchtext/x-patch; name=0001-MCV-fix.patchDownload
From 1d546eb3d27507ee51824d5a8c348b86187d1754 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas@2ndquadrant.com>
Date: Sat, 25 Nov 2017 18:44:14 +0100
Subject: [PATCH] MCV fix
---
src/backend/statistics/mcv.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/src/backend/statistics/mcv.c b/src/backend/statistics/mcv.c
index 0586054..af4d894 100644
--- a/src/backend/statistics/mcv.c
+++ b/src/backend/statistics/mcv.c
@@ -515,7 +515,13 @@ statext_mcv_serialize(MCVList *mcvlist, VacAttrStats **stats)
for (dim = 0; dim < ndims; dim++)
{
int ndistinct;
- StdAnalyzeData *tmp = (StdAnalyzeData *) stats[dim]->extra_data;
+ TypeCacheEntry *typentry;
+
+ /*
+ * Lookup the LT operator (can't get it from stats extra_data, as
+ * we don't know how to interpret that - scalar vs. array etc.).
+ */
+ typentry = lookup_type_cache(stats[dim]->attrtypid, TYPECACHE_LT_OPR);
/* copy important info about the data type (length, by-value) */
info[dim].typlen = stats[dim]->attrtype->typlen;
@@ -543,7 +549,7 @@ statext_mcv_serialize(MCVList *mcvlist, VacAttrStats **stats)
ssup[dim].ssup_collation = DEFAULT_COLLATION_OID;
ssup[dim].ssup_nulls_first = false;
- PrepareSortSupportFromOrderingOp(tmp->ltopr, &ssup[dim]);
+ PrepareSortSupportFromOrderingOp(typentry->lt_opr, &ssup[dim]);
qsort_arg(values[dim], counts[dim], sizeof(Datum),
compare_scalars_simple, &ssup[dim]);
--
2.9.5
On Nov 18, 2017, at 12:28 PM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
Hi,
Attached is an updated version of the patch, adopting the psql describe
changes introduced by 471d55859c11b.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
<0001-multivariate-MCV-lists.patch.gz><0002-multivariate-histograms.patch.gz>
Hello Tomas,
After applying both your patches, I get a warning:
histogram.c:1284:10: warning: taking the absolute value of unsigned type 'uint32' (aka 'unsigned int') has no effect [-Wabsolute-value]
delta = fabs(data->numrows);
^
histogram.c:1284:10: note: remove the call to 'fabs' since unsigned values cannot be negative
delta = fabs(data->numrows);
^~~~
1 warning generated.
Looking closer at this section, there is some odd integer vs. floating point arithmetic happening
that is not necessarily wrong, but might be needlessly inefficient:
delta = fabs(data->numrows);
split_value = values[0].value;
for (i = 1; i < data->numrows; i++)
{
if (values[i].value != values[i - 1].value)
{
/* are we closer to splitting the bucket in half? */
if (fabs(i - data->numrows / 2.0) < delta)
{
/* let's assume we'll use this value for the split */
split_value = values[i].value;
delta = fabs(i - data->numrows / 2.0);
nrows = i;
}
}
}
I'm not sure the compiler will be able to optimize out the recomputation of data->numrows / 2.0
each time through the loop, since the compiler might not be able to prove to itself that data->numrows
does not get changed. Perhaps you should compute it just once prior to entering the outer loop,
store it in a variable of integer type, round 'delta' off and store in an integer, and do integer comparisons
within the loop? Just a thought....
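To make the suggestion concrete, here is a standalone sketch of the
integer-only variant (SplitItem and the sample data are made up for
illustration; note that rounding the midpoint to an integer can shift
ties by one row compared to the numrows / 2.0 version):

#include <stdio.h>
#include <stdlib.h>

typedef struct { int value; } SplitItem;

int
main(void)
{
    SplitItem values[] = {{1}, {1}, {2}, {2}, {2}, {3}};
    int     numrows = 6;
    int     half = numrows / 2;     /* computed once, outside the loop */
    int     delta = numrows;        /* worst-case distance */
    int     split_value = values[0].value;
    int     nrows = 0;
    int     i;

    for (i = 1; i < numrows; i++)
    {
        if (values[i].value != values[i - 1].value)
        {
            int     dist = abs(i - half);   /* integer distance to midpoint */

            if (dist < delta)
            {
                split_value = values[i].value;
                delta = dist;
                nrows = i;
            }
        }
    }

    printf("split at row %d, value %d\n", nrows, split_value);
    return 0;
}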
mark
On Nov 18, 2017, at 12:28 PM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
Hi,
Attached is an updated version of the patch, adopting the psql describe
changes introduced by 471d55859c11b.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
<0001-multivariate-MCV-lists.patch.gz><0002-multivariate-histograms.patch.gz>
In src/backend/statistics/mcv.c, you have a few typos:
+ * there bo be a lot of duplicate values. But perhaps that's not true and we
+ /* Now it's safe to access the dimention info. */
+ * Nowe we know the total expected MCV size, including all the pieces
+ /* pased by reference, but fixed length (name, tid, ...) */
In src/include/statistics/statistics.h, there is some extraneous whitespace that needs
removing.
mark
On Nov 18, 2017, at 12:28 PM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
Hi,
Attached is an updated version of the patch, adopting the psql describe
changes introduced by 471d55859c11b.
Hi Tomas,
In src/backend/statistics/dependencies.c, you have introduced a comment:
+ /*
+ * build an array of SortItem(s) sorted using the multi-sort support
+ *
+ * XXX This relies on all stats entries pointing to the same tuple
+ * descriptor. Not sure if that might not be the case.
+ */
Would you mind explaining that a bit more for me? I don't understand exactly what
you mean here, but it sounds like the sort of thing that needs to be clarified/fixed
before it can be committed. Am I misunderstanding this?
In src/backend/statistics/mcv.c, you have comments:
+ * FIXME: Single-dimensional MCV is sorted by frequency (descending). We
+ * should do that too, because when walking through the list we want to
+ * check the most frequent items first.
+ *
+ * TODO: We're using Datum (8B), even for data types (e.g. int4 or float4).
+ * Maybe we could save some space here, but the bytea compression should
+ * handle it just fine.
+ *
+ * TODO: This probably should not use the ndistinct directly (as computed from
+ * the table, but rather estimate the number of distinct values in the
+ * table), no?
Do you intend these to be fixed/implemented prior to committing this patch?
Further down in function statext_mcv_build, you have two loops, the first allocating
memory and the second initializing the memory. There is no clear reason why this
must be done in two loops. I tried combining the two loops into one, and it worked
just fine, but did not look any cleaner to me. Feel free to disregard this paragraph
if you like it better the way you currently have it organized.
Further down in statext_mcv_deserialize, you have some elogs which might need to be
ereports. It is unclear to me whether you consider these deserialize error cases to be
"can't happen" type errors. If so, you might add that fact to the comments rather than
changing the elogs to ereports.
mark
Hi,
On 11/25/2017 09:23 PM, Mark Dilger wrote:
On Nov 18, 2017, at 12:28 PM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
Hi,
Attached is an updated version of the patch, adopting the psql describe
changes introduced by 471d55859c11b.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
<0001-multivariate-MCV-lists.patch.gz><0002-multivariate-histograms.patch.gz>
Hello Tomas,
After applying both your patches, I get a warning:
histogram.c:1284:10: warning: taking the absolute value of unsigned type 'uint32' (aka 'unsigned int') has no effect [-Wabsolute-value]
delta = fabs(data->numrows);
^
histogram.c:1284:10: note: remove the call to 'fabs' since unsigned values cannot be negative
delta = fabs(data->numrows);
^~~~
1 warning generated.
Hmm, yeah. The fabs() call is unnecessary, and probably a remnant from
some previous version where the field was not uint32.
I wonder why you're getting the warning and I don't, though. What
compiler are you using?
Looking closer at this section, there is some odd integer vs. floating point arithmetic happening
that is not necessarily wrong, but might be needlessly inefficient:
delta = fabs(data->numrows);
split_value = values[0].value;
for (i = 1; i < data->numrows; i++)
{
if (values[i].value != values[i - 1].value)
{
/* are we closer to splitting the bucket in half? */
if (fabs(i - data->numrows / 2.0) < delta)
{
/* let's assume we'll use this value for the split */
split_value = values[i].value;
delta = fabs(i - data->numrows / 2.0);
nrows = i;
}
}
}
I'm not sure the compiler will be able to optimize out the recomputation of data->numrows / 2.0
each time through the loop, since the compiler might not be able to prove to itself that data->numrows
does not get changed. Perhaps you should compute it just once prior to entering the outer loop,
store it in a variable of integer type, round 'delta' off and store in an integer, and do integer comparisons
within the loop? Just a thought....
Yeah, that's probably right. But I wonder if the loop is needed at all,
or whether we should start at i=(data->numrows/2.0) instead, and walk to
the closest change of value in both directions. That would probably save
more CPU than computing numrows/2.0 only once.
The other issue in that block of code seems to be that we compare the
values using simple inequality. That probably works for passbyval data
types, but we should use a proper comparator (e.g. compare_datums_simple).
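To sketch the mid-out idea (standalone, with made-up data; the plain
inequality is fine for the int values here, but as noted above a real
implementation would have to use a proper datum comparator):

#include <stdio.h>

typedef struct { int value; } SplitItem;

int
main(void)
{
    SplitItem values[] = {{1}, {1}, {1}, {1}, {2}, {3}};
    int     numrows = 6;
    int     lo = numrows / 2;   /* start at the midpoint ... */
    int     hi = numrows / 2;   /* ... and walk in both directions */
    int     split = -1;         /* first row of the new value */

    while (split < 0 && (lo > 0 || hi < numrows))
    {
        if (lo > 0 && values[lo].value != values[lo - 1].value)
            split = lo;         /* change of value at/below the midpoint */
        else if (hi < numrows && values[hi].value != values[hi - 1].value)
            split = hi;         /* change of value above the midpoint */
        else
        {
            if (lo > 0)
                lo--;
            if (hi < numrows)
                hi++;
        }
    }

    printf("nearest change of value at row %d\n", split);
    return 0;
}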
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 11/25/2017 10:01 PM, Mark Dilger wrote:
On Nov 18, 2017, at 12:28 PM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
Hi,
Attached is an updated version of the patch, adopting the psql describe
changes introduced by 471d55859c11b.
Hi Tomas,
In src/backend/statistics/dependencies.c, you have introduced a comment:
+ /*
+ * build an array of SortItem(s) sorted using the multi-sort support
+ *
+ * XXX This relies on all stats entries pointing to the same tuple
+ * descriptor. Not sure if that might not be the case.
+ */
Would you mind explaining that a bit more for me? I don't understand exactly what
you mean here, but it sounds like the sort of thing that needs to be clarified/fixed
before it can be committed. Am I misunderstanding this?
The call right after that comment is
items = build_sorted_items(numrows, rows, stats[0]->tupDesc,
mss, k, attnums_dep);
That method processes an array of tuples, and the structure is defined
by "tuple descriptor" (essentially a list of attribute info - data type,
length, ...). We get that from stats[0] and assume all the entries point
to the same tuple descriptor. That's a generally safe assumption, I think,
because all the stats entries relate to columns from the same table.
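To make that concrete, an assertion along these lines could pin the
assumption down (a standalone sketch; the structs are stripped to the
one relevant field and assert() stands in for the backend's Assert()):

#include <assert.h>

typedef struct TupleDescData { int natts; } *TupleDesc;
typedef struct VacAttrStats { TupleDesc tupDesc; } VacAttrStats;

/* check that every stats entry points at the same tuple descriptor, so
 * that using stats[0]->tupDesc on behalf of all of them is safe */
static void
check_same_tupdesc(VacAttrStats **stats, int nstats)
{
    int     i;

    for (i = 1; i < nstats; i++)
        assert(stats[i]->tupDesc == stats[0]->tupDesc);
}

int
main(void)
{
    struct TupleDescData td = {2};
    VacAttrStats a = {&td};
    VacAttrStats b = {&td};
    VacAttrStats *stats[] = {&a, &b};

    check_same_tupdesc(stats, 2);   /* passes: both entries share td */
    return 0;
}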
In src/backend/statistics/mcv.c, you have comments:
+ * FIXME: Single-dimensional MCV is sorted by frequency (descending). We
+ * should do that too, because when walking through the list we want to
+ * check the most frequent items first.
+ *
+ * TODO: We're using Datum (8B), even for data types (e.g. int4 or float4).
+ * Maybe we could save some space here, but the bytea compression should
+ * handle it just fine.
+ *
+ * TODO: This probably should not use the ndistinct directly (as computed from
+ * the table, but rather estimate the number of distinct values in the
+ * table), no?
Do you intend these to be fixed/implemented prior to committing this patch?
Actually, the first FIXME is obsolete, as build_distinct_groups returns
the groups sorted by frequency. I'll remove that.
I think the rest is more a subject for discussion, so I'd need to hear
some feedback.
Further down in function statext_mcv_build, you have two loops, the first allocating
memory and the second initializing the memory. There is no clear reason why this
must be done in two loops. I tried combining the two loops into one, and it worked
just fine, but did not look any cleaner to me. Feel free to disregard this paragraph
if you like it better the way you currently have it organized.
I did it this way because of readability. I don't think this is a major
efficiency issue, as the maximum number of items is fairly limited, and
it happens only once at the end of the MCV list build (and the sorts and
comparisons are likely much more CPU expensive).
Further down in statext_mcv_deserialize, you have some elogs which might need to be
ereports. It is unclear to me whether you consider these deserialize error cases to be
"can't happen" type errors. If so, you might add that fact to the comments rather than
changing the elogs to ereports.
I might be missing something, but why would ereport be more appropriate
than elog? Ultimately, there's not much difference between elog(ERROR)
and ereport(ERROR) - both will cause a failure.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Nov 25, 2017, at 3:33 PM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
On 11/25/2017 10:01 PM, Mark Dilger wrote:
On Nov 18, 2017, at 12:28 PM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
Hi,
Attached is an updated version of the patch, adopting the psql describe
changes introduced by 471d55859c11b.
Hi Tomas,
In src/backend/statistics/dependencies.c, you have introduced a comment:
+ /*
+ * build an array of SortItem(s) sorted using the multi-sort support
+ *
+ * XXX This relies on all stats entries pointing to the same tuple
+ * descriptor. Not sure if that might not be the case.
+ */
Would you mind explaining that a bit more for me? I don't understand exactly what
you mean here, but it sounds like the sort of thing that needs to be clarified/fixed
before it can be committed. Am I misunderstanding this?
The call right after that comment is
items = build_sorted_items(numrows, rows, stats[0]->tupDesc,
mss, k, attnums_dep);
That method processes an array of tuples, and the structure is defined
by "tuple descriptor" (essentially a list of attribute info - data type,
length, ...). We get that from stats[0] and assume all the entries point
to the same tuple descriptor. That's a generally safe assumption, I think,
because all the stats entries relate to columns from the same table.
Right, I got that, and tried mocking up some code to test that in an Assert.
I did not pursue that far enough to reach any conclusion, however. You
seem to be indicating in the comment some uncertainty about whether the
assumption is safe. Do we need to dig into that further?
In src/backend/statistics/mcv.c, you have comments:
+ * FIXME: Single-dimensional MCV is sorted by frequency (descending). We
+ * should do that too, because when walking through the list we want to
+ * check the most frequent items first.
+ *
+ * TODO: We're using Datum (8B), even for data types (e.g. int4 or float4).
+ * Maybe we could save some space here, but the bytea compression should
+ * handle it just fine.
+ *
+ * TODO: This probably should not use the ndistinct directly (as computed from
+ * the table, but rather estimate the number of distinct values in the
+ * table), no?
Do you intend these to be fixed/implemented prior to committing this patch?
Actually, the first FIXME is obsolete, as build_distinct_groups returns
the groups sorted by frequency. I'll remove that.
Ok, good. That's the one I understood least.
I think the rest is more a subject for discussion, so I'd need to hear
some feedback.
In terms of storage efficiency, you are using float8 for the frequency, which is consistent
with what other stats work uses, but may be overkill. A float4 seems sufficient to me.
The extra four bytes for a float8 may be pretty small compared to the size of the arrays
being stored, so I'm not sure it matters. Also, this might have been discussed before,
and I am not asking for a reversal of decisions the members of this mailing list may
already have reached.
As for using arrays of something smaller than Datum, you'd need some logic to specify
what the size is in each instance, and that probably complicates the code rather a lot.
Maybe someone else has a technique for doing that cleanly?
Further down in function statext_mcv_build, you have two loops, the first allocating
memory and the second initializing the memory. There is no clear reason why this
must be done in two loops. I tried combining the two loops into one, and it worked
just fine, but did not look any cleaner to me. Feel free to disregard this paragraph
if you like it better the way you currently have it organized.
I did it this way because of readability. I don't think this is a major
efficiency issue, as the maximum number of items is fairly limited, and
it happens only once at the end of the MCV list build (and the sorts and
comparisons are likely much more CPU expensive).
I defer to your judgement here. It seems fine the way you did it.
Further down in statext_mcv_deserialize, you have some elogs which might need to be
ereports. It is unclear to me whether you consider these deserialize error cases to be
"can't happen" type errors. If so, you might add that fact to the comments rather than
changing the elogs to ereports.
I might be missing something, but why would ereport be more appropriate
than elog? Ultimately, there's not much difference between elog(ERROR)
and ereport(ERROR) - both will cause a failure.
I understand project policy to allow elog for error conditions that will be reported
in "can't happen" type situations, similar to how an Assert would be used. For
conditions that can happen through (mis)use by the user, ereport is appropriate.
Not knowing whether you thought these elogs were reporting conditions that a
user could cause, I did not know if you should change them to ereports, or if you
should just add a brief comment along the lines of /* should not be possible */.
I may misunderstand project policy. If so, I'd gratefully accept correction on this
matter.
mark
On 11/26/2017 02:17 AM, Mark Dilger wrote:
On Nov 25, 2017, at 3:33 PM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
On 11/25/2017 10:01 PM, Mark Dilger wrote:
On Nov 18, 2017, at 12:28 PM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
Hi,
Attached is an updated version of the patch, adopting the psql describe
changes introduced by 471d55859c11b.
Hi Tomas,
In src/backend/statistics/dependencies.c, you have introduced a comment:
+ /*
+ * build an array of SortItem(s) sorted using the multi-sort support
+ *
+ * XXX This relies on all stats entries pointing to the same tuple
+ * descriptor. Not sure if that might not be the case.
+ */
Would you mind explaining that a bit more for me? I don't understand exactly what
you mean here, but it sounds like the sort of thing that needs to be clarified/fixed
before it can be committed. Am I misunderstanding this?
The call right after that comment is
items = build_sorted_items(numrows, rows, stats[0]->tupDesc,
mss, k, attnums_dep);
That method processes an array of tuples, and the structure is defined
by "tuple descriptor" (essentially a list of attribute info - data type,
length, ...). We get that from stats[0] and assume all the entries point
to the same tuple descriptor. That's a generally safe assumption, I think,
because all the stats entries relate to columns from the same table.
Right, I got that, and tried mocking up some code to test that in an Assert.
I did not pursue that far enough to reach any conclusion, however. You
seem to be indicating in the comment some uncertainty about whether the
assumption is safe. Do we need to dig into that further?
I don't think it's worth the effort, really. I don't think we can really
get mismatching tuple descriptors here - that could only happen with
columns coming from different tables, or something similarly obscure.
In src/backend/statistics/mcv.c, you have comments:
+ * FIXME: Single-dimensional MCV is sorted by frequency (descending). We
+ * should do that too, because when walking through the list we want to
+ * check the most frequent items first.
+ *
+ * TODO: We're using Datum (8B), even for data types (e.g. int4 or float4).
+ * Maybe we could save some space here, but the bytea compression should
+ * handle it just fine.
+ *
+ * TODO: This probably should not use the ndistinct directly (as computed from
+ * the table, but rather estimate the number of distinct values in the
+ * table), no?
Do you intend these to be fixed/implemented prior to committing this patch?
Actually, the first FIXME is obsolete, as build_distinct_groups returns
the groups sorted by frequency. I'll remove that.
Ok, good. That's the one I understood least.
I think the rest is more a subject for discussion, so I'd need to hear
some feedback.
In terms of storage efficiency, you are using float8 for the frequency, which is consistent
with what other stats work uses, but may be overkill. A float4 seems sufficient to me.
The extra four bytes for a float8 may be pretty small compared to the size of the arrays
being stored, so I'm not sure it matters. Also, this might have been discussed before,
and I am not asking for a reversal of decisions the members of this mailing list may
already have reached.
As for using arrays of something smaller than Datum, you'd need some logic to specify
what the size is in each instance, and that probably complicates the code rather a lot.
Maybe someone else has a technique for doing that cleanly?
Note that this is not about storage efficiency. The comment is before
statext_mcv_build, so it's actually related to in-memory representation.
If you look into statext_mcv_serialize, it does use typlen to only copy
the number of bytes needed for each column.
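To illustrate the typlen-driven copying (a standalone sketch only: the
Datum typedef and DatumGetPointer here are simplified stand-ins, and the
varlena and cstring cases are omitted):

#include <stdio.h>
#include <string.h>
#include <stdint.h>

typedef uintptr_t Datum;
#define DatumGetPointer(d)  ((char *) (d))

/* copy exactly typlen bytes of a fixed-length value into the output
 * buffer: by-value types copy the Datum bytes themselves, by-reference
 * types copy from the pointed-to data */
static void
copy_fixed_datum(char *ptr, Datum value, int typlen, int typbyval)
{
    if (typbyval)
        memcpy(ptr, &value, typlen);
    else
        memcpy(ptr, DatumGetPointer(value), typlen);
}

int
main(void)
{
    char    buf[4] = {0};
    Datum   d = (Datum) 42;     /* an int4-like by-value Datum */

    copy_fixed_datum(buf, d, 4, 1);
    printf("first byte: %d\n", buf[0]);     /* 42 on little-endian */
    return 0;
}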
Further down in function statext_mcv_build, you have two loops, the first allocating
memory and the second initializing the memory. There is no clear reason why this
must be done in two loops. I tried combining the two loops into one, and it worked
just fine, but did not look any cleaner to me. Feel free to disregard this paragraph
if you like it better the way you currently have it organized.
I did it this way because of readability. I don't think this is a major
efficiency issue, as the maximum number of items is fairly limited, and
it happens only once at the end of the MCV list build (and the sorts and
comparisons are likely much more CPU expensive).
I defer to your judgement here. It seems fine the way you did it.
Further down in statext_mcv_deserialize, you have some elogs which might need to be
ereports. It is unclear to me whether you consider these deserialize error cases to be
"can't happen" type errors. If so, you might add that fact to the comments rather than
changing the elogs to ereports.
I might be missing something, but why would ereport be more appropriate
than elog? Ultimately, there's not much difference between elog(ERROR)
and ereport(ERROR) - both will cause a failure.
I understand project policy to allow elog for error conditions that will be reported
in "can't happen" type situations, similar to how an Assert would be used. For
conditions that can happen through (mis)use by the user, ereport is appropriate.
Not knowing whether you thought these elogs were reporting conditions that a
user could cause, I did not know if you should change them to ereports, or if you
should just add a brief comment along the lines of /* should not be possible */.
I may misunderstand project policy. If so, I'd gratefully accept correction on this
matter.
I don't know - I always considered "elog" the old interface, and "ereport"
the new one. In any case, those are "should not happen" cases. It
would mean some sort of data corruption, or so.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Mark Dilger <hornschnorter@gmail.com> writes:
On Nov 25, 2017, at 3:33 PM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
I might be missing something, but why would ereport be more appropriate
than elog? Ultimately, there's not much difference between elog(ERROR)
and ereport(ERROR) - both will cause a failure.
The core technical differences are (1) an ereport message is exposed for
translation, normally, while an elog is not; and (2) with ereport you can
set the errcode, whereas with elog it's always going to be XX000
(ERRCODE_INTERNAL_ERROR).
I understand project policy to allow elog for error conditions that will be reported
in "can't happen" type situations, similar to how an Assert would be used. For
conditions that can happen through (mis)use by the user, ereport is appropriate.
The project policy about this is basically that elog should only be used
for things that are legitimately "internal errors", ie not user-facing.
If there's a deterministic way for a user to trigger the error, or if
it can reasonably be expected to occur during normal operation, it should
definitely have an ereport (and a non-default errcode).
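Concretely, the two forms look like this (a fragment; both messages and
the errcode choice here are made up for illustration):

/* internal "can't happen" check: untranslated, always SQLSTATE XX000 */
elog(ERROR, "invalid MCV list magic number: %u", magic);

/* user-reachable error: translatable message plus an explicit errcode */
ereport(ERROR,
        (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
         errmsg("statistics kind \"%c\" is not supported", kind)));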
regards, tom lane
Mark Dilger wrote:
I understand project policy to allow elog for error conditions that will be reported
in "can't happen" type situations, similar to how an Assert would be used. For
conditions that can happen through (mis)use by the user, ereport is appropriate.
Not knowing whether you thought these elogs were reporting conditions that a
user could cause, I did not know if you should change them to ereports, or if you
should just add a brief comment along the lines of /* should not be possible */.
Two things dictate that policy:
1. messages are translated by default for ereport but not for elog.
Both things can be overridden, but we tend not to do it unless there's
no choice.
2. you can assign SQLSTATE only with ereport.
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi,
Attached is an updated version of the patch series, fixing the issues
reported by Mark Dilger:
1) Fix fabs() issue in histogram.c.
2) Do not rely on extra_data being StdAnalyzeData, and instead lookup
the LT operator explicitly. This also adds a simple regression test to
make sure ANALYZE on arrays works fine, but perhaps we should invent
some simple queries too.
3) I've removed / clarified some of the comments mentioned by Mark.
4) I haven't changed how the statistics kinds are defined in relation.h,
but I agree there should be a comment explaining how STATS_EXT_INFO_*
relate to StatisticExtInfo.kinds.
5) The most significant change happened in histograms. There used to be two
structures for histograms:
- MVHistogram - expanded (no deduplication etc.), result of histogram
build and never used for estimation
- MVSerializedHistogram - deduplicated to save space, produced from
MVHistogram before storing in pg_statistic_ext, and used for
estimation
So there wasn't really any reason to expose the "non-serialized" version
outside histogram.c. It was just confusing and unnecessary, so I've
moved MVHistogram to histogram.c (and renamed it to MVHistogramBuild),
and renamed MVSerializedHistogram to MVHistogram. The same goes for the MVBucket stuff.
So now we only deal with MVHistogram everywhere, except in histogram.c.
6) I've also made MVHistogram include a varlena header directly (and be
packed as a bytea), which allows us to store it without having to
call any serialization functions.
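Roughly, the struct then carries its own varlena header, something like
this (a compilable sketch; the fields after vl_len_ are illustrative,
not the exact layout from the patch):

#include <stdio.h>
#include <stdint.h>

typedef struct MVHistogramSketch
{
    int32_t     vl_len_;    /* varlena header - never set directly, the
                             * backend uses SET_VARSIZE() for that */
    uint32_t    magic;      /* magic constant, as a sanity check */
    uint32_t    type;       /* type of histogram */
    uint32_t    nbuckets;   /* number of buckets */
    /* serialized bucket data follows as a variable-length payload */
} MVHistogramSketch;

int
main(void)
{
    printf("fixed header: %zu bytes\n", sizeof(MVHistogramSketch));
    return 0;
}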
I guess we should do (5) and (6) for the MCV lists too, as it seems more
convenient than the current approach. And perhaps even for the
statistics added in 9.6 (it does not change the storage format).
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachments:
0001-multivariate-MCV-lists.patch.gzapplication/gzip; name=0001-multivariate-MCV-lists.patch.gzDownload